WO2019196281A1 - Epidemic grading and prediction method and apparatus, computer apparatus, and readable storage medium - Google Patents

Epidemic grading and prediction method and apparatus, computer apparatus, and readable storage medium

Info

Publication number
WO2019196281A1
WO2019196281A1 PCT/CN2018/099649 CN2018099649W WO2019196281A1 WO 2019196281 A1 WO2019196281 A1 WO 2019196281A1 CN 2018099649 W CN2018099649 W CN 2018099649W WO 2019196281 A1 WO2019196281 A1 WO 2019196281A1
Authority
WO
WIPO (PCT)
Prior art keywords
epidemic
time point
risk level
period
prediction model
Prior art date
Application number
PCT/CN2018/099649
Other languages
French (fr)
Chinese (zh)
Inventor
阮晓雯
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Priority to JP2019572829A priority Critical patent/JP6893259B2/en
Publication of WO2019196281A1 publication Critical patent/WO2019196281A1/en

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80: ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Definitions

  • The present application relates to the field of disease prediction technologies, and in particular to an epidemic grading prediction method and apparatus, a computer apparatus, and a non-volatile readable storage medium.
  • Epidemic prediction and early warning draws on collected epidemic reports and epidemic monitoring data to comprehensively assess and predict the area and scale of an epidemic, issues event threat warnings in advance within a certain range by appropriate means, and thereby detects outbreaks and epidemics in a timely manner.
  • Epidemic prediction has become an important part of the disease surveillance information system.
  • a first aspect of the present application provides a method for predicting an epidemic grading, the method comprising:
  • a second aspect of the present application provides an epidemic grading prediction apparatus, the apparatus comprising:
  • a training unit configured to train the epidemic prediction model with first training data;
  • a test unit configured to predict on test data using the epidemic prediction model, determine whether the prediction result for the test data satisfies a preset condition, and, if it does not, fine-tune the epidemic prediction model;
  • a determining unit configured to determine, using second training data, the grading time window size used for determining the epidemic risk level based on the epidemic prediction model, such that time points determined to be at the medium risk level or above fall within a real epidemic period, and time points determined to be at the low risk level or below fall within a real non-epidemic period;
  • a prediction unit configured to: predict each time point within the grading time window before the time point to be tested using the epidemic prediction model, dividing that window into epidemic periods and non-epidemic periods; calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within that window; calculate the epidemic risk level division thresholds from that mean and standard deviation; and determine the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  • A third aspect of the present application provides a computer apparatus comprising a memory and a processor, the memory storing at least one computer readable instruction and the processor executing the at least one computer readable instruction to implement the epidemic grading prediction method.
  • A fourth aspect of the present application provides a non-volatile readable storage medium storing at least one computer readable instruction which, when executed by a processor, implements the epidemic grading prediction method.
  • The present application trains and tests the epidemic prediction model to obtain an optimized epidemic prediction model, uses the optimized model to predict on the epidemic monitoring data before the time point to be tested, determines the grading time window size based on the epidemic prediction model, and determines the epidemic risk level of the time point to be tested according to the prediction results at each time point before it and the grading time window size. Because the time window size used for determining the epidemic risk level is itself determined based on the epidemic prediction model, the present application can improve the accuracy of the epidemic risk level determination.
  • FIG. 1 is a flowchart of a method for predicting epidemic grading according to Embodiment 1 of the present application.
  • FIG. 2 is a detailed flowchart of step 105 of FIG. 1.
  • FIG. 3 is a structural diagram of an epidemic grading prediction apparatus according to Embodiment 2 of the present application.
  • FIG. 4 is a schematic diagram of a computer device according to Embodiment 3 of the present application.
  • the epidemic grading prediction method of the present application is applied to one or more computer devices.
  • FIG. 1 is a flowchart of a method for predicting epidemic grading according to Embodiment 1 of the present application.
  • The epidemic grading prediction method predicts the epidemic risk level at a time point to be tested according to the epidemic monitoring data before that time point.
  • The epidemic grading prediction method specifically includes the following steps:
  • Step 101: An epidemic prediction model is established.
  • The epidemic prediction model is used to predict the epidemic periods and non-epidemic periods of an epidemic based on epidemic monitoring data.
  • The epidemic monitoring data is time series data.
  • The epidemic monitoring data may include data such as the number of visits for the epidemic disease, the visit rate, the number of cases, and the incidence rate.
  • For example, the daily number of visits for an epidemic disease (e.g., flu) can be obtained from a medical institution (e.g., a hospital) and used as epidemic monitoring data.
  • As another example, the daily number of student cases of an epidemic disease (e.g., flu) can be obtained from a school and used as epidemic monitoring data.
  • An epidemiological monitoring network composed of a plurality of monitoring points may be established in a preset area (for example, a province, a city, a region), and the epidemiological monitoring data is obtained from the monitoring points.
  • Medical institutions, schools and child care institutions, pharmacies, etc. can be selected as monitoring points to conduct epidemiological surveillance and data collection for the corresponding target population.
  • a place that meets the preset conditions can be selected as the monitoring point.
  • The preset conditions may include a head count, a scale, and the like. For example, schools and child care institutions whose enrolment reaches a preset number may be selected as monitoring points. As another example, pharmacies that reach a preset scale (for example, measured by daily turnover) may be selected as monitoring points. As a further example, hospitals that reach a preset scale (for example, measured by the daily number of patient visits) may be selected as monitoring points.
  • Epidemic data at different time points constitute the epidemic monitoring data (i.e., time series data).
  • For example, epidemic data collected daily may be used as the epidemic monitoring data.
  • Alternatively, epidemic data collected weekly may be used as the epidemic monitoring data.
  • Medical institutions, schools, child care institutions, and pharmacies are the main sources selected for collecting epidemic monitoring data.
  • This choice of data sources does not preclude adding or substituting other key populations or sites as monitored data sources in other embodiments.
  • For example, hotels may be included in the monitoring scope to obtain epidemic monitoring data for hotel guests.
  • Epidemic monitoring data collected by any single type of monitoring point (such as medical institutions) may be used; for example, only the data collected by hospitals.
  • Alternatively, epidemic monitoring data collected from multiple types of monitoring points may be combined.
  • For example, data collected by hospitals may serve as the primary source, supplemented by data from pharmacies.
  • The epidemic prediction model may include a CUSUM (Cumulative Sum) prediction model, an EWMA (Exponentially Weighted Moving Average) prediction model, and a moving percentile prediction model. The three models are introduced separately below.
  • The CUSUM prediction model amplifies small deviations between the actual values (i.e., the epidemic monitoring data) and the reference values by accumulating them, which improves sensitivity to small deviations during prediction.
  • When the accumulated deviation exceeds a threshold, a turning point is considered to have occurred, that is, the epidemic has shifted from the non-epidemic period to the epidemic period.
  • $w$ is the time window size of the CUSUM prediction model.
  • $X$ is the daily epidemic monitoring data.
  • For example, $w$ may be 7, representing 7 days (i.e., one week).
  • $\mu_w$ is the mean of $X$ over the time points $t-w$ to $t-1$ (length $w$).
  • $\sigma_w$ is the standard deviation of $X$ over the time points $t-w$ to $t-1$ (length $w$).
  • $k_1$ is a tunable parameter, generally taken in (0, 3).
  • The threshold is $H_t = h\sigma_t$; when the CUSUM value $C_t$ is greater than $H_t$, the epidemic is considered to have entered the epidemic period.
  • $h$ is an adjustable parameter, generally taken as 1, 2, or 3.
  • $\sigma_t$ is the standard deviation of the historical data at time point $t$.
  • $C_t$ is computed recursively: the value at each time point is generated from the value at the previous time point. From $C_t$ and $H_t$, the start of the epidemic period and of the non-epidemic period can be determined quickly. If the CUSUM value $C_t$ is greater than the threshold $H_t$, the epidemic period is entered. If $C_t$ is less than or equal to $H_t$, the non-epidemic period is entered.
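The CUSUM rule can be sketched in code as follows. This is a minimal illustration rather than the formula of the application: the update rule (a standard one-sided CUSUM that accumulates only the excess above the window mean plus k1 window standard deviations) and the choice of "all data before t" as the historical data for the threshold are assumptions, and the function name `cusum_flags` is hypothetical.

```python
from statistics import mean, stdev

def cusum_flags(x, w=7, k1=1.0, h=2.0):
    """Return one boolean per time point; True marks the epidemic period.

    x  : list of daily monitoring values
    w  : time window size (must be at least 2 so a standard deviation exists)
    k1 : reference-value multiplier, typically in (0, 3)
    h  : threshold multiplier, typically 1, 2 or 3
    """
    flags = [False] * len(x)
    c = 0.0  # running CUSUM value C_t
    for t in range(w, len(x)):
        window = x[t - w:t]  # X over time points t-w .. t-1
        mu_w, sigma_w = mean(window), stdev(window)
        # Assumed update: accumulate only the excess over mu_w + k1 * sigma_w.
        c = max(0.0, c + x[t] - (mu_w + k1 * sigma_w))
        # Assumed "historical data": every observation before t.
        sigma_t = stdev(x[:t])
        flags[t] = c > h * sigma_t  # compare C_t with H_t = h * sigma_t
    return flags
```

For a series that stays near 10 for a week and then jumps to 30 and 35, the first seven points are never evaluated (no full window yet), the eighth stays below the threshold, and both elevated points are flagged.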
  • EWMA is an exponentially weighted moving average.
  • The weight of each value decreases exponentially with its age: more recent data carry greater weight, while older data still retain some weight.
  • When the EWMA value $Z_t$ at time $t$ is greater than the upper control limit UCL, the epidemic is considered to have entered the epidemic period.
  • The constant $\lambda$ is a weight coefficient, generally taken within (0, 1).
  • $k_2$ is a tunable parameter, generally taken in (0, 3).
  • $Z_t$ is computed recursively: the value at each time point is generated from the value at the previous time point. From $Z_t$ and UCL, the start of the epidemic period and of the non-epidemic period can be determined quickly. If $Z_t$ is greater than UCL, the epidemic period is entered. If the EWMA value $Z_t$ is less than or equal to UCL, the non-epidemic period is entered.
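A minimal sketch of the EWMA rule follows. The formulas themselves are not reproduced in the text above, so the recursion and the control limit used here are the textbook EWMA control chart forms, stated as assumptions: Z_t = lam * X_t + (1 - lam) * Z_{t-1} and UCL = mu + k2 * sigma * sqrt(lam / (2 - lam)). The `baseline` argument (data from which mu and sigma are estimated) and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def ewma_flags(x, baseline, lam=0.3, k2=2.0):
    """Return one boolean per value in x; True marks the epidemic period.

    x        : list of monitoring values to classify
    baseline : non-epidemic reference data used to estimate mu and sigma (assumption)
    lam      : weight coefficient in (0, 1)
    k2       : control-limit multiplier, typically in (0, 3)
    """
    mu, sigma = mean(baseline), stdev(baseline)
    # Assumed asymptotic upper control limit of a standard EWMA chart.
    ucl = mu + k2 * sigma * sqrt(lam / (2 - lam))
    z = mu  # start the recursion at the baseline mean
    flags = []
    for value in x:
        z = lam * value + (1 - lam) * z  # newer data weigh more, older data decay
        flags.append(z > ucl)
    return flags
```

Because the recursion carries memory, the flag stays raised for a while after the raw value drops back, which is the intended smoothing behaviour of an EWMA.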
  • The moving percentile prediction model takes as baseline data the epidemic monitoring data from the same week as the observation week in previous years, together with a preset number of weeks before and after it (for example, 2 weeks before and after), calculates specified percentiles (for example, P5, P10, ..., P90, P95, P100) as candidate early warning thresholds, and thereby establishes an early warning model.
  • For example, the P80 value of the influenza ILI index for week 3 of 2014 is the 80th percentile of the 10 weekly incidence rates from weeks 1-5 of 2012 and 2013, and is taken as the early warning threshold for week 3 of 2014.
  • Similarly, the P75 value for week 20 of 2014 is the 75th percentile of the 10 weekly incidence rates from weeks 18-22 of 2012 and 2013, and is taken as the early warning threshold for week 20 of 2014.
  • The epidemic monitoring data are compared with the early warning threshold. If the epidemic monitoring data are greater than the early warning threshold, the epidemic period is entered. If they are less than or equal to the early warning threshold, the non-epidemic period is entered.
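The moving percentile threshold can be illustrated with a short sketch. The data layout (a dict mapping each historical year to a list of weekly values, week 1 first), the linear-interpolation percentile, and the function names are assumptions made for illustration only.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between closest ranks (assumed definition)."""
    s = sorted(values)
    pos = (len(s) - 1) * pct / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def moving_percentile_threshold(history, week, half_width=2, pct=80):
    """Early warning threshold for `week` (1-based).

    history    : dict year -> list of weekly values, index 0 holding week 1 (assumption)
    half_width : number of weeks taken before and after the observation week
    pct        : percentile used as the threshold, e.g. 80 for P80
    """
    baseline = []
    for year_values in history.values():
        first = max(1, week - half_width)
        last = min(len(year_values), week + half_width)
        baseline.extend(year_values[first - 1:last])
    return percentile(baseline, pct)

def in_epidemic_period(value, threshold):
    """Epidemic period only when the observation strictly exceeds the threshold."""
    return value > threshold
```

With two historical years and `half_width=2`, the baseline for week 3 holds the 10 values from weeks 1-5 of both years, matching the week 3 example above.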
  • Step 102 Train the epidemic prediction model with the first training data.
  • the first training data is epidemiological surveillance data.
  • Training the epidemic prediction model with the first training data means predicting on the first training data using the epidemic prediction model and adjusting or selecting the parameters of the model according to the prediction results.
  • For example, the CUSUM prediction model is used to predict on the first training data, and the time window size $w$ and the parameters $k_1$ and $h$ of the CUSUM prediction model are adjusted according to the prediction results.
  • As another example, the EWMA prediction model is used to predict on the first training data, and the weight coefficient $\lambda$ and the parameter $k_2$ of the EWMA prediction model are adjusted according to the prediction results.
  • As a further example, the moving percentile prediction model is used to predict on the first training data, and an appropriate percentile (for example, the 80th percentile) is selected as the early warning threshold of the moving percentile prediction model according to the prediction results.
  • Specifically, the first training data may be predicted using the epidemic prediction model, the prediction result compared with the real division into epidemic and non-epidemic periods, and the parameters of the epidemic prediction model adjusted or selected according to the comparison result.
  • The real epidemic and non-epidemic periods are defined by medical methods.
  • Preset indicators of the prediction result of the epidemic prediction model on the first training data may be calculated, and the parameters of the epidemic prediction model adjusted or selected according to those indicators. For example, three indicators of the prediction result, namely accuracy, specificity, and timeliness, may be calculated, and the parameters adjusted or selected based on them.
  • Accuracy = number of days of effective early warning / total number of days of the real epidemic period × 100%;
  • Timeliness (i.e., lag) = start date of the epidemic period given by effective early warning - start date of the real epidemic period.
  • When the epidemic prediction model predicts epidemic periods on the first training data, a day that is predicted to be in the epidemic period and falls within the real epidemic period is recorded as an effective early warning.
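The accuracy and timeliness indicators can be computed as in the sketch below. It assumes daily boolean series of equal length and a single real epidemic period, and it omits specificity because the text does not give its formula; the function name is hypothetical.

```python
def accuracy_and_lag(predicted, actual):
    """Return (accuracy in percent, lag in days) for daily boolean series.

    predicted[i] : True if day i is predicted to be in the epidemic period
    actual[i]    : True if day i lies in the real epidemic period
    """
    # An effective early warning is a predicted epidemic day inside the real period.
    effective = [p and a for p, a in zip(predicted, actual)]
    true_days = sum(actual)
    accuracy = 100.0 * sum(effective) / true_days if true_days else 0.0
    lag = None  # undefined when there was no effective warning at all
    if any(effective):
        # First effective warning day minus first real epidemic day.
        lag = effective.index(True) - actual.index(True)
    return accuracy, lag
```

For a real epidemic on days 2-5 and warnings on days 1, 3 and 4, two of the four epidemic days are covered (accuracy 50%) and the first effective warning arrives one day late; the warning on day 1 is not effective because it falls outside the real period.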
  • Step 103: Predict on the test data using the epidemic prediction model and determine whether the prediction result for the test data satisfies a preset condition. If it does, step 105 is performed.
  • The test data is epidemic monitoring data.
  • The purpose of predicting on the test data is to verify whether the trained epidemic prediction model meets the requirements.
  • Preset indicators (e.g., accuracy, specificity, timeliness) of the prediction result for the test data may be calculated, and whether the trained epidemic prediction model satisfies the preset condition determined from them. For example, it may be determined whether the accuracy of the prediction result reaches a preset accuracy, and/or whether its specificity reaches a preset specificity, and/or whether its timeliness reaches a preset timeliness.
  • If the accuracy of the prediction result for the test data reaches the preset accuracy, and/or its specificity reaches the preset specificity, and/or its timeliness reaches the preset timeliness, the epidemic prediction model is judged to satisfy the preset condition, and an optimized epidemic prediction model is obtained.
  • Step 104: If the prediction result for the test data does not satisfy the preset condition, fine-tune the epidemic prediction model, and then perform step 105.
  • Fine-tuning means that the parameters of the epidemic prediction model are further adjusted.
  • For example, the time window size $w$ and the parameters $k_1$ and $h$ of the CUSUM prediction model are further adjusted.
  • As another example, the weight coefficient $\lambda$ and the parameter $k_2$ of the EWMA prediction model are further adjusted.
  • As a further example, the 75th percentile is taken as the early warning threshold of the moving percentile prediction model (that is, the threshold is adjusted from the 80th percentile to the 75th percentile).
  • Step 105: Determine, using second training data, the time window size used for determining the epidemic risk level based on the epidemic prediction model (hereinafter, the grading time window size), such that time points determined to be at the medium risk level or above fall within a real epidemic period, and time points determined to be at the low risk level or below fall within a real non-epidemic period.
  • The second training data is epidemic monitoring data.
  • The second training data may be the same as or different from the first training data.
  • The grading time window size is determined in order to ensure the accuracy of the epidemic risk level obtained from the epidemic prediction model.
  • The grading time window size is adjusted so that time points determined, using the epidemic prediction model, to be at the medium risk level or above fall within a real epidemic period, and time points determined to be at the low risk level or below fall within a real non-epidemic period.
  • Determining the grading time window size may specifically include the following steps:
  • Step 201: Predict each time point within the grading time window before a preset time point using the epidemic prediction model, dividing that window into epidemic periods and non-epidemic periods.
  • The initial value of the grading time window size is a preset value, for example 3, and is adjusted to a suitable size, for example 7, through steps 201-206.
  • Step 202: Calculate, from the second training data, the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the preset time point.
  • Step 203: Calculate the epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the preset time point. For details, refer to the description of step 108 below.
  • Step 204: Determine the epidemic risk level of the preset time point according to the epidemic risk level division thresholds corresponding to the second training data. For details, refer to the description of step 109 below.
  • Step 205: If the epidemic risk level of the preset time point is the medium risk level or above, determine whether the preset time point is within a real epidemic period; if the epidemic risk level of the preset time point is the low risk level or below, determine whether the preset time point is within a real non-epidemic period.
  • Step 206: If the epidemic risk level of the preset time point is the medium risk level or above but the preset time point is not within a real epidemic period, or if the epidemic risk level is the low risk level or below but the preset time point is not within a real non-epidemic period, adjust the grading time window size.
  • The grading time window size may be adjusted multiple times in the manner described above, using different second training data, until it reaches an optimal value.
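The window-size search can be sketched as a simple loop. The text does not say how the window is adjusted between attempts, so growing it by one from the initial value is an assumption, as are the integer encoding of the risk levels and the callable `grade_at`, a hypothetical stand-in for the prediction, statistics, threshold and grading steps.

```python
MEDIUM = 2  # assumed encoding: 0 = extremely low, 1 = low, 2 = medium, 3 = high

def is_consistent(level, in_real_epidemic):
    """Medium or above must coincide with a real epidemic period, and vice versa."""
    return (level >= MEDIUM) == in_real_epidemic

def tune_grading_window(grade_at, truth, time_points, start=3, max_window=30):
    """Return the smallest window in [start, max_window] consistent at every checked point.

    grade_at(t, window) : hypothetical callable returning the risk level at t
    truth[t]            : True if t lies in a real epidemic period
    time_points         : the preset time points to check
    """
    for window in range(start, max_window + 1):
        if all(is_consistent(grade_at(t, window), truth[t]) for t in time_points):
            return window
    return None  # no candidate window satisfied the condition
```

Returning `None` when no window qualifies is a design choice of this sketch; a caller could instead fall back to the window with the fewest inconsistent time points.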
  • Step 106: Predict each time point within the grading time window before the time point to be tested using the epidemic prediction model, dividing that window into epidemic periods and non-epidemic periods.
  • For example, the CUSUM prediction model is used to predict each time point within the grading time window before the time point to be tested, dividing that window into epidemic periods and non-epidemic periods.
  • As another example, the moving percentile prediction model is used to predict each time point within the grading time window before the time point to be tested, dividing that window into epidemic periods and non-epidemic periods.
  • Step 107: Calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the time point to be tested.
  • In step 107, all the non-epidemic periods within the grading time window before the time point to be tested are collected, and the mean and standard deviation of the epidemic monitoring data over all of them are calculated. For example, if the grading time window before the time point to be tested contains three non-epidemic periods, the mean and standard deviation of the epidemic monitoring data of those three non-epidemic periods are calculated.
  • That is, a single mean and a single standard deviation are calculated from the epidemic monitoring data of all non-epidemic periods within the grading time window before the time point to be tested.
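Pooling the non-epidemic data inside the window can be sketched as below. The use of the sample standard deviation (rather than the population form) and the function name are assumptions; `flags` is taken to be the per-time-point output of whichever prediction model is in use.

```python
from statistics import mean, stdev

def non_epidemic_stats(x, flags, t, window):
    """Mean and standard deviation of non-epidemic data in the window before t.

    x      : list of monitoring values
    flags  : list of booleans, True where the model predicted the epidemic period
    t      : index of the time point to be tested (its own value is excluded)
    window : grading time window size
    """
    # Pool every non-epidemic time point in [t - window, t), across all such periods.
    values = [x[i] for i in range(max(0, t - window), t) if not flags[i]]
    if len(values) < 2:
        raise ValueError("need at least two non-epidemic points in the window")
    return mean(values), stdev(values)
```

In the usage below the window contains two separate non-epidemic stretches; their five values are pooled into one mean and one standard deviation.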
  • Step 108: Calculate the epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the time point to be tested.
  • The epidemic risk level division thresholds may include a high/medium level division threshold, a medium/low level division threshold, and a low/extremely low level division threshold.
  • The high/medium level division threshold separates the high risk level from the medium risk level, the medium/low level division threshold separates the medium risk level from the low risk level, and the low/extremely low level division threshold separates the low risk level from the extremely low risk level.
  • Let the mean of the epidemic monitoring data during the non-epidemic periods be $\mu_{w'}$ and the standard deviation be $\sigma_{w'}$. The high/medium level division threshold is then $\mu_{w'} + k'_1 \sigma_{w'}$, the medium/low level division threshold is $\mu_{w'} + k'_2 \sigma_{w'}$, and the low/extremely low level division threshold is $\mu_{w'} + k'_3 \sigma_{w'}$, where $k'_3$ lies between 2 and 4.
  • For example, the high/medium level division threshold is $\mu_{w'} + 6\sigma_{w'}$, the medium/low level division threshold is $\mu_{w'} + 4\sigma_{w'}$, and the low/extremely low level division threshold is $\mu_{w'} + 2\sigma_{w'}$.
  • The epidemic risk level division thresholds may include other numbers and types of thresholds.
  • For example, they may include only a high/medium level division threshold and a medium/low level division threshold.
  • As another example, they may include an extremely high/high level division threshold, a high/medium level division threshold, a medium/low level division threshold, and a low/extremely low level division threshold.
  • Step 109: Determine the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  • If the epidemic monitoring data at the time point to be tested is greater than or equal to the high/medium level division threshold, the epidemic risk level of the time point to be tested is determined to be the high risk level. If it is less than the high/medium level division threshold and greater than or equal to the medium/low level division threshold, the epidemic risk level is determined to be the medium risk level. If it is less than the medium/low level division threshold and greater than or equal to the low/extremely low level division threshold, the epidemic risk level is determined to be the low risk level. If it is less than the low/extremely low level division threshold, the epidemic risk level is determined to be the extremely low risk level.
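The threshold calculation and the grading rule can be combined in a few lines. The multipliers 6, 4 and 2 are the example values given above; the function name and the string labels are hypothetical.

```python
def risk_level(value, mu, sigma, k=(6, 4, 2)):
    """Grade one observation against thresholds of the form mu + k_i * sigma.

    value : monitoring data at the time point to be tested
    mu    : mean of the non-epidemic data in the grading time window
    sigma : standard deviation of the same data
    k     : multipliers for the three division thresholds, highest first
    """
    k1, k2, k3 = k
    if value >= mu + k1 * sigma:
        return "high"
    if value >= mu + k2 * sigma:
        return "medium"
    if value >= mu + k3 * sigma:
        return "low"
    return "extremely low"
```

With a mean of 10 and a standard deviation of 1 the thresholds are 16, 14 and 12, so an observation of exactly 14 is graded medium because each lower bound is inclusive.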
  • The CUSUM prediction model, the EWMA prediction model, and the moving percentile prediction model may be combined for epidemic grading prediction. Specifically, following steps 101-109, a first epidemic risk level of the time point to be tested is determined based on the CUSUM prediction model, a second epidemic risk level is determined based on the EWMA prediction model, and a third epidemic risk level is determined based on the moving percentile prediction model; the final epidemic risk level is then obtained from the first, second, and third epidemic risk levels.
  • For example, it is determined whether at least two of the first epidemic risk level, the second epidemic risk level, and the third epidemic risk level are consistent; if so, the consistent epidemic risk level is taken as the final epidemic risk level.
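The agreement rule for the three models can be sketched as a majority vote. The text does not say what happens when all three levels differ, so returning `None` in that case is an assumption of this sketch, and the function name is hypothetical.

```python
from collections import Counter

def final_risk_level(levels):
    """Return the level shared by at least two of the three models, else None.

    levels : the three risk levels from the CUSUM, EWMA and moving percentile models
    """
    level, count = Counter(levels).most_common(1)[0]
    # Assumption: with no agreement at all, the result is left undecided.
    return level if count >= 2 else None
```

For example, `final_risk_level(["high", "medium", "high"])` yields `"high"`, while three different levels yield `None`.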
  • The epidemic grading prediction method of Embodiment 1 trains and tests the epidemic prediction model to obtain an optimized epidemic prediction model, uses the optimized model to predict on the epidemic monitoring data before the time point to be tested, determines the grading time window size based on the epidemic prediction model, and determines the epidemic risk level of the time point to be tested according to the prediction results at each time point before it and the grading time window size. Because the time window size used for determining the epidemic risk level is itself determined based on the epidemic prediction model, Embodiment 1 can improve the accuracy of the epidemic risk level determination.
  • FIG. 3 is a structural diagram of an epidemic grading prediction apparatus according to Embodiment 2 of the present application.
  • the epidemic grading prediction apparatus 10 may include an establishing unit 301, a training unit 302, a testing unit 303, a determining unit 304, and a prediction unit 305.
  • the establishing unit 301 is configured to establish an epidemic prediction model.
  • the epidemiological prediction model is used to predict epidemic epidemics and epidemic non-epidemic periods based on epidemiological surveillance data.
  • the epidemiological monitoring data is time series data.
  • the epidemiological monitoring data may include epidemiological data such as the number of visits to the epidemic, the rate of visits, the number of cases, and the incidence rate.
  • the number of daily visits to an epidemic eg, flu
  • the number of daily visits to an epidemic can be obtained from a medical institution (eg, a hospital), and the number of daily visits to an epidemic (eg, flu) can be used as epidemiological surveillance data.
  • the number of daily epidemics of a student's epidemic (eg, flu) can be obtained from the school, and the number of daily epidemics of an epidemic (eg, flu) can be used as epidemiological surveillance data.
  • the number of daily visits to an epidemic eg, flu
  • the number of daily visits to an epidemic can be obtained from a medical institution (eg, a hospital), and the number of daily visits to an epidemic (eg, flu) can be used as epidemiological surveillance data.
  • An epidemiological monitoring network composed of a plurality of monitoring points may be established in a preset area (for example, a province, a city, a region), and the epidemiological monitoring data is obtained from the monitoring points.
  • Medical institutions, schools and child care institutions, pharmacies, etc. can be selected as monitoring points to conduct epidemiological surveillance and data collection for the corresponding target population.
  • a place that meets the preset conditions can be selected as the monitoring point.
  • the preset condition may include a number of people, a scale, and the like. For example, schools and child care institutions whose headcount reaches a preset number are selected as monitoring points. As another example, pharmacies that reach a preset size (for example, measured by daily turnover) are selected as monitoring points. As a further example, hospitals that reach a preset size (for example, measured by the daily number of patient visits) are selected as monitoring points.
  • Epidemiological data at different time points constitute the epidemiological surveillance data (ie, time series data).
  • epidemiological data collected on a daily basis can be used as epidemiological surveillance data.
  • Alternatively, epidemiological data collected on a weekly basis can be used as epidemiological surveillance data.
  • medical institutions, schools, child care institutions, and pharmacies are mainly selected to collect epidemiological surveillance data.
  • This choice of data sources does not preclude adding or substituting other key populations or sites as monitoring data sources in other embodiments.
  • hotels can be included in the epidemiological surveillance area to obtain epidemiological surveillance data for hotel residents.
  • Epidemiological surveillance data collected by a single type of monitoring point (such as medical institutions) may be used.
  • For example, only the epidemiological surveillance data collected by hospitals may be used.
  • Alternatively, epidemiological surveillance data collected from multiple types of monitoring points may be combined.
  • For example, epidemiological surveillance data collected by hospitals may serve as the primary source, supplemented by epidemiological surveillance data from pharmacies.
  • the epidemic prediction model may be a CUSUM (Cumulative Sum) prediction model, an EWMA (Exponentially Weighted Moving Average) prediction model, or a moving percentile prediction model. The three models are introduced separately below.
  • the CUSUM prediction model amplifies small deviations between the actual values (i.e., the epidemiological monitoring data) and the reference values by accumulating them, which improves sensitivity to small deviations in the prediction process.
  • When the accumulated deviation exceeds the threshold, a turning point is considered to have occurred, that is, the epidemic has shifted from the non-epidemic period to the epidemic period.
  • w is the time window size of the CUSUM prediction model.
  • X is the daily epidemic data.
  • w may be set to 7, representing 7 days (i.e., one week).
  • μ_w is the mean of X over time points t-w to t-1 (a window of length w).
  • σ_w is the standard deviation of X over time points t-w to t-1 (a window of length w).
  • k_1 is a tunable parameter and is generally taken in (0, 3).
  • If C_t > H_t = h·σ_t, the epidemic period is considered to have begun.
  • h is a tunable parameter, generally taken as 1, 2, or 3.
  • σ_t is the standard deviation of the historical data at time point t.
  • C_t is computed recursively: the value at each time point is generated from the value at the previous time point. From C_t and H_t, the start of the epidemic period and of the non-epidemic period can be determined quickly. If the CUSUM value C_t is greater than the threshold H_t, the epidemic period has been entered. If the CUSUM value C_t is less than or equal to the threshold H_t, the non-epidemic period has been entered.
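The CUSUM rule described above can be sketched as follows. The recursion formula itself is not preserved in this text, so the sketch assumes the common one-sided form C_t = max(0, C_{t-1} + X_t - (μ_w + k_1·σ_w)); it also assumes that σ_t is the population standard deviation of all data before t. The function name `cusum_flags` and its defaults are illustrative only.

```python
from statistics import mean, pstdev


def cusum_flags(x, w=7, k1=1.0, h=2.0):
    """Return one flag per time point t >= w: True if C_t > H_t (epidemic period)."""
    flags = []
    c_prev = 0.0
    for t in range(w, len(x)):
        window = x[t - w:t]            # X over time points t-w .. t-1
        mu_w = mean(window)
        sigma_w = pstdev(window)
        # Assumed one-sided CUSUM recursion (not given in the source text).
        c_t = max(0.0, c_prev + x[t] - (mu_w + k1 * sigma_w))
        # Assumption: sigma_t is the standard deviation of all data before t.
        h_t = h * pstdev(x[:t])
        flags.append(c_t > h_t)
        c_prev = c_t
    return flags


if __name__ == "__main__":
    daily_visits = [10] * 7 + [10, 50, 60]
    print(cusum_flags(daily_visits))
```

With a flat baseline followed by a jump, the flag turns on at the first elevated day, which matches the intent of detecting the shift from the non-epidemic period to the epidemic period.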
  • EWMA is an exponentially decreasing weighted moving average.
  • The weight of each value decreases exponentially with time: more recent data carry greater weight, while older data still retain some weight.
  • If the EWMA value Z_t at time t is greater than the upper control limit UCL, the epidemic period is considered to have begun.
  • the constant λ is a weight coefficient and is generally taken within (0, 1).
  • k_2 is a tunable parameter and is generally taken in (0, 3).
  • Z_t is computed recursively: the value at each time point is generated from the value at the previous time point. From Z_t and UCL, the start of the epidemic period and of the non-epidemic period can be determined quickly. If Z_t is greater than UCL, the epidemic period has been entered. If the EWMA value Z_t is less than or equal to UCL, the non-epidemic period has been entered.
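A minimal sketch of the EWMA rule follows. The formulas for Z_t and UCL are not preserved in this text, so the sketch assumes the textbook forms Z_t = λ·X_t + (1 - λ)·Z_{t-1} and UCL = μ + k_2·σ·sqrt(λ / (2 - λ)), with a baseline mean `mu` and standard deviation `sigma` supplied by the caller. These forms, the function name `ewma_flags`, and the default values are assumptions.

```python
from math import sqrt


def ewma_flags(x, mu, sigma, lam=0.3, k2=2.0):
    """Return one flag per value of x: True if Z_t > UCL (epidemic period)."""
    # Assumed asymptotic upper control limit of a standard EWMA chart.
    ucl = mu + k2 * sigma * sqrt(lam / (2 - lam))
    z = mu                               # start the recursion at the baseline mean
    flags = []
    for value in x:
        z = lam * value + (1 - lam) * z  # recent data weigh more, old data decay
        flags.append(z > ucl)
    return flags


if __name__ == "__main__":
    print(ewma_flags([10, 10, 30, 30], mu=10, sigma=2, lam=0.5, k2=2.0))
```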
  • the moving percentile prediction model uses as baseline data the epidemiological monitoring data from the same week as the observation week in previous years, together with a preset number of weeks before and after it (for example, 2 weeks before and after). Specified percentiles of the baseline data (for example, P5, P10, ..., P90, P95, P100) are calculated as candidate early warning thresholds, and an early warning model is established.
  • For example, the P80 value of the influenza ILI index for the 3rd week of 2014 is the 80th percentile of the 10 weekly incidence values for weeks 1-5 of 2012 and 2013, which is taken as the early warning threshold for the 3rd week of 2014.
  • Likewise, the P75 value for the 20th week of 2014 is the 75th percentile of the 10 weekly incidence values for weeks 18-22 of 2012 and 2013, which is taken as the early warning threshold for the 20th week of 2014.
  • The epidemiological surveillance data is compared with the early warning threshold. If the epidemiological surveillance data is greater than the early warning threshold, the epidemic period has been entered. If the epidemiological surveillance data is less than or equal to the early warning threshold, the non-epidemic period has been entered.
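The moving percentile threshold can be illustrated with a short sketch. It assumes the history is held as a hypothetical mapping `{year: {week: value}}`, uses linear interpolation between data points (the text does not say which percentile definition is used), supports P1 to P99 only, and does not handle windows that wrap across a year boundary.

```python
from statistics import quantiles


def percentile_threshold(history, week, p=80, half_width=2):
    """Early warning threshold for `week`: the p-th percentile of the baseline weeks."""
    baseline = [
        weekly[wk]
        for weekly in history.values()
        for wk in range(week - half_width, week + half_width + 1)
        if wk in weekly
    ]
    # quantiles(..., n=100) returns the 99 cut points P1..P99.
    return quantiles(baseline, n=100, method="inclusive")[p - 1]


def in_epidemic_period(value, threshold):
    """The epidemic period is entered only when the data exceed the threshold."""
    return value > threshold


if __name__ == "__main__":
    history = {
        2012: {1: 1, 2: 2, 3: 3, 4: 4, 5: 5},
        2013: {1: 6, 2: 7, 3: 8, 4: 9, 5: 10},
    }
    threshold = percentile_threshold(history, week=3, p=80)
    print(threshold, in_epidemic_period(9, threshold))
```

The example mirrors the week 3 case in the text: ten baseline values from weeks 1-5 of two earlier years feed one P80 threshold.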
  • the training unit 302 is configured to train the epidemic prediction model with the first training data.
  • the first training data is epidemiological surveillance data.
  • Training the epidemic prediction model with the first training data means predicting the first training data with the epidemic prediction model and adjusting or selecting the parameters of the epidemic prediction model according to the prediction result.
  • For example, the first training data is predicted with the CUSUM prediction model, and the time window size w and the parameters k_1 and h of the CUSUM prediction model are adjusted according to the prediction result.
  • As another example, the first training data is predicted with the EWMA prediction model, and the weight coefficient λ and the parameter k_2 of the EWMA prediction model are adjusted according to the prediction result.
  • As a further example, the first training data is predicted with the moving percentile prediction model, and an appropriate percentile (for example, the 80th percentile) is selected as the warning threshold of the moving percentile prediction model according to the prediction result.
  • Specifically, the first training data may be predicted with the epidemic prediction model, the prediction result compared with the real division into epidemic and non-epidemic periods, and the parameters of the epidemic prediction model adjusted or selected according to the comparison result.
  • the real epidemic/non-epidemic period is defined by medical methods.
  • the preset indicator of the prediction result of the epidemic prediction model on the first training data may be calculated, and the parameters of the epidemic prediction model are adjusted or selected according to the preset index. For example, three indicators of accuracy, specificity, and timeliness of the prediction result of the epidemic prediction model for the first training data may be calculated, and parameters of the epidemic prediction model are adjusted or selected based on the three indicators.
  • Accuracy = number of days of effective warning / total number of days of the real epidemic period × 100%;
  • Timeliness (i.e., lag period) = start date of the epidemic period with effective warning - start date of the real epidemic period.
  • When the epidemic prediction model predicts epidemic periods for the first training data, a day that is predicted to be in the epidemic period and that falls within the real epidemic period is recorded as an effective warning.
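The accuracy and timeliness indicators defined above can be computed as in the following sketch. It assumes daily boolean labels and a single real epidemic period in the evaluated range; the specificity formula is not preserved in this text and is therefore not implemented. The name `warning_metrics` is illustrative.

```python
def warning_metrics(predicted, actual):
    """Return (accuracy in percent, lag in days) for daily epidemic-period labels.

    predicted[i] / actual[i] are True if day i is in the predicted / real epidemic period.
    """
    # An effective warning is a predicted epidemic day inside the real epidemic period.
    effective = [p and a for p, a in zip(predicted, actual)]
    true_days = sum(actual)
    accuracy = 100.0 * sum(effective) / true_days if true_days else None
    lag = None
    if any(effective):
        # Start of the effectively warned period minus start of the real period.
        lag = effective.index(True) - actual.index(True)
    return accuracy, lag


if __name__ == "__main__":
    predicted = [False, False, False, True, True, True, False]
    actual = [False, True, True, True, True, False, False]
    print(warning_metrics(predicted, actual))
```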
  • the testing unit 303 is configured to predict test data with the epidemic prediction model, determine whether the prediction result of the test data satisfies a preset condition, and, if the prediction result of the test data does not satisfy the preset condition, fine-tune the epidemic prediction model.
  • the test data is epidemiological surveillance data.
  • The test data is predicted with the epidemic prediction model in order to verify whether the trained epidemic prediction model meets the requirements.
  • Preset indicators (e.g., accuracy, specificity, timeliness) of the prediction result of the test data may be calculated, and whether the trained epidemic prediction model satisfies the preset condition is determined from these indicators. For example, it is determined whether the accuracy of the prediction result of the test data reaches a preset accuracy, and/or whether its specificity reaches a preset specificity, and/or whether its timeliness reaches a preset timeliness.
  • If the accuracy of the prediction result of the test data reaches the preset accuracy, and/or its specificity reaches the preset specificity, and/or its timeliness reaches the preset timeliness, the epidemic prediction model is judged to satisfy the preset condition, and an optimized epidemic prediction model is obtained.
  • Otherwise, the parameters of the epidemic prediction model are further adjusted.
  • For example, the time window size w and the parameters k_1 and h of the CUSUM prediction model are further adjusted.
  • As another example, the weight coefficient λ and the parameter k_2 of the EWMA prediction model are further adjusted.
  • As a further example, the 75th percentile (adjusted from the 80th percentile) is taken as the warning threshold of the moving percentile prediction model.
  • a determining unit 304, configured to determine, using second training data, the time window size used for epidemic risk level determination based on the epidemic prediction model (hereinafter referred to as the grading time window size), such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within the real epidemic period, and time points determined to be at the low risk level or otherwise below the medium risk level fall within the real non-epidemic period.
  • the second training data is epidemiological monitoring data.
  • the second training data may be the same as or different from the first training data.
  • the grading time window size is determined in order to ensure the accuracy of epidemic risk level determination based on the epidemic prediction model.
  • The grading time window size is adjusted so that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within the real epidemic period, and time points determined to be at the low risk level or otherwise below the medium risk level fall within the real non-epidemic period.
  • the determining unit 304 can determine the grading time window size as follows:
  • The initial value of the grading time window size is a preset value, for example 3, which the determining unit 304 adjusts to a suitable size, for example 7.
  • If the epidemic risk level of a preset time point is the medium risk level or above, it is determined whether the preset time point is within a real epidemic period; if the epidemic risk level of the preset time point is the low risk level or otherwise below the medium risk level, it is determined whether the preset time point is within a real non-epidemic period.
  • If the epidemic risk level of the preset time point is the medium risk level or above but the preset time point is not within a real epidemic period, or if the epidemic risk level of the preset time point is below the medium risk level but the preset time point is not within a real non-epidemic period, the grading time window size is adjusted.
  • The grading time window size may be adjusted multiple times in this manner, using different second training data, until it reaches an optimal value.
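One way to picture this adjustment is as a search over candidate window sizes. The sketch below is an assumption, not the procedure of this application: it scores each candidate by the number of time points whose grade disagrees with the real epidemic/non-epidemic label and keeps the smallest candidate with the fewest disagreements. The callable `grade(t, w)`, which returns a risk level for time point `t` under window size `w`, and the level names are hypothetical.

```python
def choose_grading_window(time_points, real_epidemic, grade, candidates=range(3, 15)):
    """Pick the candidate window size whose grades best match the real periods."""

    def mismatches(w):
        count = 0
        for t in time_points:
            at_least_medium = grade(t, w) in ("medium", "high")
            # Medium or above should coincide with the real epidemic period,
            # anything below medium with the real non-epidemic period.
            if at_least_medium != real_epidemic[t]:
                count += 1
        return count

    # min() returns the first (smallest) candidate among ties.
    return min(candidates, key=mismatches)


if __name__ == "__main__":
    real = {0: False, 1: True, 2: True}
    # Toy grading rule: only windows of 7 or more recognise the epidemic days.
    toy_grade = lambda t, w: "high" if (real[t] and w >= 7) else "low"
    print(choose_grading_window([0, 1, 2], real, toy_grade))
```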
  • the prediction unit 305 is configured to predict, with the epidemic prediction model, each time point within the grading time window before the time point to be tested, and to divide that window into epidemic periods and non-epidemic periods.
  • For example, the CUSUM prediction model is used to predict each time point within the grading time window before the time point to be tested, and that window is divided into epidemic periods and non-epidemic periods.
  • As another example, the moving percentile prediction model is used to predict each time point within the grading time window before the time point to be tested, and that window is divided into epidemic periods and non-epidemic periods.
  • the prediction unit 305 is further configured to calculate the mean and standard deviation of the epidemiological monitoring data of the non-epidemic periods within the grading time window before the time point to be tested.
  • For example, if the grading time window before the time point to be tested contains three non-epidemic periods, the mean and standard deviation of the epidemiological monitoring data of those three non-epidemic periods are calculated.
  • That is, the mean and standard deviation are calculated from the epidemiological monitoring data of all non-epidemic periods within the grading time window before the time point to be tested.
  • the prediction unit 305 is further configured to calculate epidemic risk level division thresholds from the mean and standard deviation of the epidemiological monitoring data of the non-epidemic periods within the grading time window before the time point to be tested.
  • the epidemic risk level division thresholds may include a high/medium level division threshold, a medium/low level division threshold, and a low/very low level division threshold.
  • The high/medium level division threshold separates the high risk level from the medium risk level, the medium/low level division threshold separates the medium risk level from the low risk level, and the low/very low level division threshold separates the low risk level from the very low risk level.
  • If the mean of the epidemiological monitoring data during the non-epidemic periods is μ_w′ and the standard deviation is σ_w′, the high/medium level division threshold is μ_w′ + k′_1·σ_w′, the medium/low level division threshold is μ_w′ + k′_2·σ_w′, and the low/very low level division threshold is μ_w′ + k′_3·σ_w′, where k′_3 lies between 2 and 4.
  • For example, the high/medium level division threshold is μ_w′ + 6·σ_w′, the medium/low level division threshold is μ_w′ + 4·σ_w′, and the low/very low level division threshold is μ_w′ + 2·σ_w′.
  • the epidemic risk level division thresholds may be of other numbers and types.
  • For example, the epidemic risk level division thresholds may include only a high/medium level division threshold and a medium/low level division threshold.
  • As another example, the epidemic risk level division thresholds may include a very high/high level division threshold, a high/medium level division threshold, a medium/low level division threshold, and a low/very low level division threshold.
  • the predicting unit 305 is further configured to determine an epidemic risk level of the time point to be tested according to the epidemic risk level dividing threshold.
  • If the epidemiological monitoring data at the time point to be tested is greater than or equal to the high/medium level division threshold, the epidemic risk level of the time point to be tested is determined to be the high risk level. If the data is smaller than the high/medium level division threshold and greater than or equal to the medium/low level division threshold, the epidemic risk level is determined to be the medium risk level. If the data is smaller than the medium/low level division threshold and greater than or equal to the low/very low level division threshold, the epidemic risk level is determined to be the low risk level. If the data is smaller than the low/very low level division threshold, the epidemic risk level is determined to be the very low risk level.
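The threshold calculation and level determination above can be sketched together. The multipliers default to the 6/4/2 example from the text; using the population standard deviation, the function name `risk_level`, and the level labels are assumptions of this sketch.

```python
from statistics import mean, pstdev


def risk_level(x_t, window, epidemic_flags, k=(6.0, 4.0, 2.0)):
    """Grade the value x_t at the time point to be tested.

    window: monitoring data within the grading time window before that time point.
    epidemic_flags: True where the prediction model marked the epidemic period.
    """
    # Only non-epidemic data feed the baseline (raises StatisticsError if empty).
    baseline = [v for v, flag in zip(window, epidemic_flags) if not flag]
    mu = mean(baseline)
    sigma = pstdev(baseline)   # assumption: population standard deviation
    high_medium, medium_low, low_very_low = (mu + ki * sigma for ki in k)
    if x_t >= high_medium:
        return "high"
    if x_t >= medium_low:
        return "medium"
    if x_t >= low_very_low:
        return "low"
    return "very low"


if __name__ == "__main__":
    window = [10, 12, 8, 10, 40, 45, 10]
    flags = [False, False, False, False, True, True, False]
    for value in (20, 16, 13, 11):
        print(value, risk_level(value, window, flags))
```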
  • the epidemic grading prediction apparatus 10 may perform epidemic grading prediction by combining the CUSUM prediction model, the EWMA prediction model, and the moving percentile prediction model. Specifically, the epidemic grading prediction apparatus 10 determines a first epidemic risk level for the time point to be tested based on the CUSUM prediction model, a second epidemic risk level based on the EWMA prediction model, and a third epidemic risk level based on the moving percentile prediction model, and obtains the final epidemic risk level from the first, second, and third epidemic risk levels.
  • For example, it is determined whether at least two of the first epidemic risk level, the second epidemic risk level, and the third epidemic risk level are consistent; if so, the consistent epidemic risk level is used as the final epidemic risk level.
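The two-out-of-three agreement rule can be written in a few lines. The text does not say what happens when all three levels differ, so returning `None` in that case is an assumption of this sketch, as is the function name.

```python
from collections import Counter


def final_risk_level(cusum_level, ewma_level, percentile_level):
    """Return the level shared by at least two models, or None if all three differ."""
    level, count = Counter([cusum_level, ewma_level, percentile_level]).most_common(1)[0]
    return level if count >= 2 else None


if __name__ == "__main__":
    print(final_risk_level("high", "medium", "high"))
    print(final_risk_level("high", "medium", "low"))
```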
  • the epidemic grading prediction apparatus 10 of Embodiment 2 trains and tests the epidemic prediction model to obtain an optimized epidemic prediction model, and uses the optimized epidemic prediction model to predict the epidemiological monitoring data before the time point to be tested.
  • The grading time window size is determined based on the epidemic prediction model, and the epidemic risk level of the time point to be tested is determined from the prediction results at each time point before the time point to be tested and the grading time window size. Because the time window size used for epidemic risk level determination is determined based on the epidemic prediction model, Embodiment 2 can improve the accuracy of epidemic risk level determination.
  • the computer device 1 includes a memory 20, a processor 30, and computer readable instructions 40 stored in the memory 20 and executable on the processor 30, such as an epidemic grading prediction program.
  • the processor 30 executes the computer readable instructions 40 to implement the steps in the above-described epidemiological grading prediction method embodiment, such as steps 101-109 shown in FIG.
  • the processor 30, when executing the computer readable instructions 40, implements the functions of the various modules/units of the apparatus embodiments described above, such as units 301-305 of FIG.
  • the computer readable instructions 40 may be partitioned into one or more modules/units that are stored in the memory 20 and executed by the processor 30 to implement the present application.
  • the one or more modules/units may be a series of computer readable instruction segments capable of performing a particular function for describing the execution of the computer readable instructions 40 in the computer device 1.
  • the computer readable instructions 40 may be divided into the establishing unit 301, the training unit 302, the testing unit 303, the determining unit 304, and the predicting unit 305 in FIG.
  • the computer device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. It will be understood by those skilled in the art that the schematic diagram 4 is merely an example of the computer device 1 and does not constitute a limitation of the computer device 1, which may include more or fewer components than those illustrated, may combine some components, or may have different components. For example, the computer device 1 may also include input and output devices, network access devices, buses, and the like.
  • the processor 30 may be a central processing unit (CPU), or may be other general-purpose processors, a digital signal processor (DSP), an application specific integrated circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor, or the processor 30 may be any conventional processor or the like. The processor 30 is the control center of the computer device 1 and connects the various parts of the entire computer device 1 by using various interfaces and lines.
  • the memory 20 can be used to store the computer readable instructions 40 and/or modules/units. The processor 30 implements the various functions of the computer device 1 by running or executing the computer readable instructions and/or modules/units stored in the memory 20 and by calling data stored in the memory 20.
  • the memory 20 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phone book, etc.) created according to the use of the computer device 1.
  • the memory 20 may include a high-speed random access memory, and may also include a non-volatile memory such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one disk storage device, a flash device, or other volatile solid-state storage device.
  • the modules/units integrated in the computer device 1 can be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand-alone product. Based on this understanding, all or part of the processes in the foregoing method embodiments of the present application may also be implemented by computer readable instructions directing the relevant hardware. The computer readable instructions may be stored in a non-volatile readable storage medium and, when executed by a processor, implement the steps of the various method embodiments described above. The computer readable instructions comprise computer readable instruction code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like.
  • the non-transitory readable medium may include any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read only memory (ROM, Read-Only Memory), Random Access Memory (RAM), electrical carrier signals, telecommunications signals, and software distribution media.
  • the contents of the non-volatile readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, non-volatile readable media do not include electrical carrier signals and telecommunication signals.

Landscapes

  • Public Health (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An epidemic grading and prediction method, the method comprising: training and testing an epidemic prediction model to obtain an optimised epidemic prediction model and using the optimised epidemic prediction model to implement prediction on epidemic monitoring data before a time point to be tested; on the basis of the epidemic prediction model, determining the size of a grading time window; and, on the basis of the prediction results of each time point before the time point to be tested and the size of the grading time window, determining the epidemic risk level of the time point to be tested. Also provided in the present application are an epidemic grading and prediction apparatus, a computer apparatus, and a readable storage medium. The present application can improve the accuracy of determining epidemic risk level.

Description

Epidemic grading prediction method and apparatus, computer apparatus, and readable storage medium
This application claims priority to Chinese Patent Application No. 201810322432.2, entitled "Epidemic grading prediction method and apparatus, computer apparatus, and readable storage medium", filed with the Chinese Patent Office on April 11, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of disease prediction technologies, and in particular to an epidemic grading prediction method and apparatus, a computer apparatus, and a non-volatile readable storage medium.
Background
Epidemic prediction and early warning means comprehensively assessing and predicting the area and scale of an outbreak on the basis of collected epidemic reports and epidemic monitoring data, and then issuing threat warnings in advance, within a certain scope and in an appropriate manner, so that signs of outbreaks and epidemics are detected in time. Epidemic prediction has become an important part of disease surveillance information systems.
However, existing epidemic grading prediction methods do not produce good grading prediction results.
Summary
In view of the above, it is necessary to provide an epidemic grading prediction method and apparatus, a computer apparatus, and a non-volatile readable storage medium that can improve the accuracy of epidemic risk level determination.
A first aspect of the present application provides an epidemic grading prediction method, the method comprising:
(1) establishing an epidemic prediction model;
(2) training the epidemic prediction model with first training data;
(3) predicting test data with the epidemic prediction model and determining whether the prediction result of the test data satisfies a preset condition, and, if the prediction result of the test data satisfies the preset condition, executing (5);
(4) if the prediction result of the test data does not satisfy the preset condition, fine-tuning the epidemic prediction model and then executing (5);
(5) determining, using second training data, a grading time window size for epidemic risk level determination based on the epidemic prediction model, such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within the real epidemic period, and time points determined to be at the low risk level or otherwise below the medium risk level fall within the real non-epidemic period;
(6) predicting, with the epidemic prediction model, each time point within the grading time window before the time point to be tested, and dividing that window into epidemic periods and non-epidemic periods;
(7) calculating the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the time point to be tested;
(8) calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window before the time point to be tested;
(9) determining the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
A second aspect of the present application provides an epidemic grading prediction apparatus, the apparatus comprising:
an establishing unit, configured to establish an epidemic prediction model;
a training unit, configured to train the epidemic prediction model with first training data;
a test unit, configured to predict test data with the epidemic prediction model, determine whether the prediction result for the test data satisfies a preset condition, and fine-tune the epidemic prediction model if the prediction result for the test data does not satisfy the preset condition;
a determining unit, configured to determine, using second training data, a grading time window size for determining epidemic risk levels based on the epidemic prediction model, such that time points determined based on the epidemic prediction model to be at the medium risk level or above fall within a true epidemic period, and time points determined to be at the low risk level or any level below medium risk fall within a true non-epidemic period;
a prediction unit, configured to: use the epidemic prediction model to predict each time point within the grading time window preceding the time point to be measured, thereby dividing that window into epidemic periods and non-epidemic periods; calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within that window; calculate epidemic risk level division thresholds from that mean and standard deviation; and determine the epidemic risk level of the time point to be measured according to the epidemic risk level division thresholds.
A third aspect of the present application provides a computer apparatus comprising a memory and a processor, the memory being configured to store at least one computer-readable instruction, and the processor being configured to execute the at least one computer-readable instruction to implement the above epidemic grading prediction method.
A fourth aspect of the present application provides a non-volatile readable storage medium storing at least one computer-readable instruction which, when executed by a processor, implements the epidemic grading prediction method.
The present application trains and tests an epidemic prediction model to obtain an optimized epidemic prediction model, uses the optimized model to predict the epidemic monitoring data preceding the time point to be measured, determines a grading time window size based on the epidemic prediction model, and determines the epidemic risk level of the time point to be measured from the prediction results for the preceding time points and the grading time window size. Because the time window size used for the risk level determination is itself determined based on the epidemic prediction model, the present application can improve the accuracy of epidemic risk level determination.
Description of the Drawings
FIG. 1 is a flowchart of the epidemic grading prediction method provided in Embodiment 1 of the present application.
FIG. 2 is a detailed flowchart of step 105 in FIG. 1.
FIG. 3 is a structural diagram of the epidemic grading prediction apparatus provided in Embodiment 2 of the present application.
FIG. 4 is a schematic diagram of the computer apparatus provided in Embodiment 3 of the present application.
Detailed Description
Preferably, the epidemic grading prediction method of the present application is applied in one or more computer apparatuses.
Embodiment 1
FIG. 1 is a flowchart of the epidemic grading prediction method provided in Embodiment 1 of the present application. The method can predict the epidemic risk level of a time point to be measured from the epidemic monitoring data preceding that time point.
As shown in FIG. 1, the epidemic grading prediction method specifically includes the following steps:
Step 101: establish an epidemic prediction model.
The epidemic prediction model is used to predict epidemic periods and non-epidemic periods from epidemic monitoring data.
The epidemic monitoring data is time series data. It may include epidemic morbidity data such as the number of visits, the visit rate, the number of cases, and the incidence rate of the epidemic. For example, the daily number of visits for an epidemic (such as influenza) can be obtained from medical institutions (such as hospitals) and used as epidemic monitoring data. As another example, the daily number of cases of an epidemic (such as influenza) among students can be obtained from schools and used as epidemic monitoring data.
An epidemic monitoring network composed of multiple monitoring points can be established in a preset area (such as a province, city, or district), and the epidemic monitoring data is obtained from these monitoring points. Medical institutions, schools and childcare institutions, pharmacies, and the like can be selected as monitoring points, each performing epidemic monitoring and data collection for its own target population. Sites that satisfy preset conditions can be selected as monitoring points. The preset conditions may include headcount, scale, and the like. For example, schools and childcare institutions whose number of students reaches a preset number are selected as monitoring points. As another example, pharmacies whose scale (measured, for example, by daily turnover) reaches a preset scale are selected as monitoring points. As yet another example, hospitals whose scale (measured, for example, by the daily number of patients) reaches a preset scale are selected as monitoring points.
The epidemic morbidity data at different time points constitutes the epidemic monitoring data (that is, the time series data). For example, morbidity data collected on a daily basis can be used as epidemic monitoring data. Alternatively, morbidity data collected on a weekly basis can be used as epidemic monitoring data.
Medical institutions (mainly hospitals) are the sites best placed to capture early signs of an epidemic outbreak and are the first choice for epidemic monitoring. Epidemic monitoring data can be obtained from patient visits.
Some patients with the epidemic disease go to pharmacies on their own to buy medicine to relieve early symptoms. Epidemic monitoring data can therefore be obtained from pharmacy drug sales.
Children and adolescents are a high-risk group for epidemics and an important link in epidemic transmission, so monitoring of this group should also be strengthened. Schools and childcare institutions are good sites for monitoring the incidence of epidemics among children and adolescents. Epidemic monitoring data can be obtained from the sick-leave records of children and adolescents at schools and childcare institutions.
Accordingly, the present application mainly selects three types of sites for collecting epidemic monitoring data: medical institutions, schools and childcare institutions, and pharmacies. This choice of data sources does not prevent other embodiments from adding or substituting other populations or sites of particular concern as monitoring data sources. For example, hotels can be brought into the scope of epidemic monitoring to obtain epidemic monitoring data for hotel guests.
As needed, the epidemic monitoring data collected by only one type of monitoring point (such as medical institutions) can be used. For example, only the epidemic monitoring data collected by hospitals may be used. Alternatively, the epidemic monitoring data collected by several types of monitoring points can be combined. For example, the epidemic monitoring data collected by hospitals can serve as the primary source, supplemented by the epidemic monitoring data collected by pharmacies.
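As a small illustration of turning daily data into a weekly series, the sketch below sums daily counts by week. The function name `weekly_counts` and the use of ISO calendar weeks are assumptions made for this example; the application does not specify how weeks are delimited.

```python
from collections import defaultdict
from datetime import date


def weekly_counts(daily):
    """Sum daily counts into weekly totals.

    daily: dict mapping datetime.date -> count for that day.
    Returns a list of ((iso_year, iso_week), total) sorted by week.
    Assumption: weeks follow the ISO 8601 calendar (Monday to Sunday).
    """
    totals = defaultdict(int)
    for day, count in daily.items():
        iso = day.isocalendar()          # (ISO year, ISO week, ISO weekday)
        totals[(iso[0], iso[1])] += count
    return sorted(totals.items())


if __name__ == "__main__":
    sample = {date(2014, 1, 13): 5, date(2014, 1, 19): 7, date(2014, 1, 20): 3}
    print(weekly_counts(sample))
```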
The epidemic prediction model may include a CUSUM (Cumulative Sum) prediction model, an EWMA (Exponentially Weighted Moving Average) prediction model, and a moving percentile prediction model. These three models are described in turn below.
(1) CUSUM prediction model
The CUSUM prediction model accumulates the small deviations between the actual values (that is, the epidemic monitoring data) and a reference value. This amplifies them and makes the prediction more sensitive to small deviations. When the accumulated deviation exceeds a threshold, a turning point is considered to have occurred, that is, the epidemic has moved from a non-epidemic period into an epidemic period.
Let the epidemic monitoring data $X$ follow a normal distribution, let $w$ be the time window size of the CUSUM prediction model, and let the initial CUSUM value be $C_0 = 0$. The CUSUM value $C_t$ and the threshold $H_t$ at time point $t$ are then:
$$C_t = \max\{0,\; X_t - (\mu_w + k_1 \sigma_w) + C_{t-1}\}$$
$$H_t = h\,\sigma_t$$
Here $w$ is the time window size of the CUSUM prediction model. For example, when the epidemic monitoring data $X$ is daily morbidity data, $w$ can be 7, representing 7 days (one week). $\mu_w$ is the mean of $X$ over time points $t-w$ to $t-1$ (a span of length $w$), and $\sigma_w$ is the standard deviation of $X$ over the same span. $k_1$ is an adjustable parameter, generally taken in $(0, 3]$.
If $C_t$ is greater than the threshold $H_t = h\,\sigma_t$, an epidemic period is considered to have begun. $h$ is an adjustable parameter, generally 1, 2, or 3. $\sigma_t$ is the standard deviation of the historical data at time point $t$.
$C_t$ is generated recursively from the value at the previous time point, so the start of epidemic and non-epidemic periods can be determined quickly from $C_t$ and $H_t$. If the CUSUM value $C_t$ is greater than the threshold $H_t$, an epidemic period begins. If $C_t$ is less than or equal to $H_t$, a non-epidemic period begins.
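The recursion above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `cusum_flags` is hypothetical, population standard deviations are used, time points with fewer than w preceding observations are left unflagged, and the standard deviation of the historical data at time point t is taken to mean the standard deviation of all observations before t, which the text does not spell out.

```python
from statistics import mean, pstdev


def cusum_flags(x, w=7, k1=1.0, h=2.0):
    """Return one boolean per time point: True where C_t > H_t.

    x  : list of epidemic monitoring values (e.g. daily visit counts)
    w  : CUSUM time window size
    k1 : reference-value multiplier, typically in (0, 3]
    h  : threshold multiplier, typically 1, 2 or 3
    """
    if w < 1:
        raise ValueError("w must be at least 1")
    c_prev = 0.0                          # C_0 = 0
    flags = []
    for t in range(len(x)):
        if t < w:                         # assumption: no decision without a full window
            flags.append(False)
            continue
        window = x[t - w:t]               # time points t-w .. t-1
        mu_w = mean(window)
        sigma_w = pstdev(window)
        c_t = max(0.0, x[t] - (mu_w + k1 * sigma_w) + c_prev)
        sigma_t = pstdev(x[:t])           # assumption: std of all data before t
        flags.append(c_t > h * sigma_t)   # compare C_t with H_t = h * sigma_t
        c_prev = c_t
    return flags


if __name__ == "__main__":
    series = [10, 11, 9, 10, 10, 11, 9, 10, 30, 40]
    print(cusum_flags(series))
```

With the sample series, the flat first eight days stay unflagged and the jump on the last two days is flagged.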
(2) EWMA prediction model
EWMA is a moving average with exponentially decreasing weights. The weight of each value decays exponentially with its age: more recent data carries more weight, while older data still retains some weight.
Let the epidemic monitoring data $X$ follow a normal distribution, $X_t \sim N(\mu, \sigma^2)$, with initial value $Z_0 = X_0$. The EWMA value at time point $t$ is then:
$$Z_t = \lambda X_t + (1 - \lambda) Z_{t-1}$$
[Formula image: Figure PCTCN2018099649-appb-000001]
If the EWMA value $Z_t$ at time point $t$ is greater than UCL, an epidemic period is considered to have begun.
The constant $\lambda$ is the weight coefficient, generally taken in $(0, 1)$.
$k_2$ is an adjustable parameter, generally taken in $(0, 3]$.
$Z_t$ is generated recursively from the value at the previous time point, so the start of epidemic and non-epidemic periods can be determined quickly from $Z_t$ and UCL. If $Z_t$ is greater than UCL, an epidemic period begins. If the EWMA value $Z_t$ is less than or equal to UCL, a non-epidemic period begins.
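The EWMA recursion can be sketched as follows. The application gives its UCL formula only as an image that is not reproduced in the text, so this sketch substitutes the textbook asymptotic control limit, UCL = mu + k2 * sigma * sqrt(lambda / (2 - lambda)); that choice is an assumption and may differ from the formula in the application. The function name `ewma_flags` and the use of a separate baseline sample to estimate mu and sigma are also assumptions.

```python
import math
from statistics import mean, pstdev


def ewma_flags(x, baseline, lam=0.3, k2=2.0):
    """Return one boolean per time point: True where Z_t > UCL.

    x        : list of epidemic monitoring values to be judged
    baseline : historical values used to estimate mu and sigma (assumption)
    lam      : weight coefficient lambda, in (0, 1)
    k2       : control-limit multiplier, typically in (0, 3]
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    # Assumption: textbook asymptotic EWMA control limit, used here because
    # the UCL formula of the application is only available as an image.
    ucl = mu + k2 * sigma * math.sqrt(lam / (2.0 - lam))
    z = float(x[0])                        # Z_0 = X_0
    flags = [z > ucl]
    for t in range(1, len(x)):
        z = lam * x[t] + (1.0 - lam) * z   # Z_t = lam * X_t + (1 - lam) * Z_{t-1}
        flags.append(z > ucl)
    return flags


if __name__ == "__main__":
    print(ewma_flags([10, 10, 30, 10], baseline=[10, 11, 9, 10, 10, 11, 9]))
```

In the sample run the spike on the third day pushes the smoothed value above the limit, and it stays above the limit on the fourth day because older data still carries weight.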
(3) Moving percentile prediction model
The moving percentile prediction model takes as baseline data the epidemic monitoring data for the same calendar week as the observation week in previous years, together with a preset number of weeks before and after it (for example, 2 weeks before and after). Specified percentiles (for example P5, P10, ..., P90, P95, P100) are calculated from the baseline data as candidate warning thresholds to build the warning model.
For example, the P80 value of the influenza ILI index for week 3 of 2014 is the 80th percentile of the 10 weekly incidence rates for weeks 1-5 of 2012 and 2013, and this value is taken as the warning threshold for week 3 of 2014. The P75 value for week 20 of 2014 is the 75th percentile of the 10 weekly incidence rates for weeks 18-22 of 2012 and 2013, and this value is taken as the warning threshold for week 20 of 2014.
The epidemic monitoring data is compared with the warning threshold. If the epidemic monitoring data is greater than the warning threshold, an epidemic period begins. If the epidemic monitoring data is less than or equal to the warning threshold, a non-epidemic period begins.
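A minimal sketch of the moving percentile threshold follows. The names `percentile` and `warning_threshold`, the nested-dictionary layout of the historical data, and the linear-interpolation percentile definition are assumptions for illustration; the application does not say which percentile definition is used, and week numbers that wrap around the year boundary are simply skipped here.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    s = sorted(values)
    if not s:
        raise ValueError("no baseline data")
    pos = (len(s) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def warning_threshold(history, years, week, p=80, span=2):
    """Warning threshold for `week` from the same week +/- span in earlier years.

    history: dict year -> dict week_number -> weekly value (hypothetical layout)
    years  : the earlier years to use as baseline, e.g. [2012, 2013]
    """
    baseline = [history[y][wk]
                for y in years
                for wk in range(week - span, week + span + 1)
                if wk in history.get(y, {})]
    return percentile(baseline, p)


if __name__ == "__main__":
    hist = {2012: {1: 1, 2: 2, 3: 3, 4: 4, 5: 5},
            2013: {1: 6, 2: 7, 3: 8, 4: 9, 5: 10}}
    threshold = warning_threshold(hist, [2012, 2013], week=3, p=80)
    print(threshold, 9 > threshold)   # an observed value of 9 would signal an epidemic period
```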
Step 102: train the epidemic prediction model with first training data.
The first training data is epidemic monitoring data. Training the epidemic prediction model with the first training data means predicting the first training data with the model and adjusting or selecting the model's parameters according to the prediction result.
For example, the CUSUM prediction model is used to perform epidemic prediction on the first training data, and the time window size $w$ and the parameters $k_1$ and $h$ of the CUSUM prediction model are adjusted according to the prediction result for the first training data.
As another example, the EWMA prediction model is used to perform epidemic prediction on the first training data, and the weight coefficient $\lambda$ and the parameter $k_2$ of the EWMA prediction model are adjusted according to the prediction result for the first training data.
As yet another example, the moving percentile prediction model is used to perform epidemic prediction on the first training data, and a suitable percentile (for example the 80th percentile) is selected as the warning threshold of the moving percentile prediction model according to the prediction result for the first training data.
Specifically, the epidemic prediction model can be used to predict the first training data, the prediction result can be compared with the true division into epidemic and non-epidemic periods, and the parameters of the epidemic prediction model can be adjusted or selected according to the comparison.
The true epidemic and non-epidemic periods are defined by medical methods. Preset indicators can be calculated for the model's prediction result on the first training data, and the parameters of the epidemic prediction model adjusted or selected according to those indicators. For example, three indicators (accuracy, specificity, and timeliness) can be calculated for the prediction result on the first training data, and the parameters adjusted or selected based on these three indicators.
The three indicators can be calculated as follows:
Accuracy = number of days with an effective warning / total number of days in the true epidemic period × 100%;
Specificity = number of days with no warning / total number of days in the true non-epidemic period × 100%;
Timeliness (that is, the lag) = start date of the epidemic period given by effective warnings − start date of the true epidemic period.
When the epidemic prediction model is used to predict epidemic periods for the first training data, a day that is predicted to be in an epidemic period and that also falls within the true epidemic period is counted as an effective warning.
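The three indicators can be computed directly from two aligned day-by-day label sequences. The sketch below is illustrative only: the function name `evaluate` is hypothetical, "days with no warning" is assumed to be counted within the true non-epidemic period, and the lag is measured in days from the first true epidemic day to the first effective warning, which assumes a single epidemic period in the data.

```python
def evaluate(pred, truth):
    """Return (accuracy %, specificity %, lag in days) for daily labels.

    pred, truth: equal-length lists of booleans, True = epidemic period.
    """
    epidemic_days = sum(truth)
    quiet_days = len(truth) - epidemic_days
    # Effective warning: predicted epidemic AND inside the true epidemic period.
    effective = sum(p and t for p, t in zip(pred, truth))
    # Assumption: "no warning" days are counted inside the true non-epidemic period.
    no_warning = sum((not p) and (not t) for p, t in zip(pred, truth))
    accuracy = 100.0 * effective / epidemic_days if epidemic_days else None
    specificity = 100.0 * no_warning / quiet_days if quiet_days else None
    lag = None
    if effective:
        true_start = truth.index(True)
        first_hit = next(i for i, (p, t) in enumerate(zip(pred, truth)) if p and t)
        lag = first_hit - true_start
    return accuracy, specificity, lag


if __name__ == "__main__":
    truth = [False, False, True, True, True, True, False, False]
    pred = [False, True, False, True, True, True, False, False]
    print(evaluate(pred, truth))
```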
Step 103: predict test data with the epidemic prediction model and determine whether the prediction result for the test data satisfies a preset condition; if it does, perform step 105.
The test data is epidemic monitoring data. The purpose of predicting the test data with the epidemic prediction model is to verify whether the trained model meets the requirements.
Preset indicators (for example accuracy, specificity, and timeliness) can be calculated for the prediction result on the test data, and whether the trained epidemic prediction model satisfies the preset condition can be judged from these indicators. For example, it is determined whether the accuracy of the prediction result reaches a preset accuracy, and/or whether its specificity reaches a preset specificity, and/or whether its timeliness reaches a preset timeliness. If the accuracy reaches the preset accuracy, and/or the specificity reaches the preset specificity, and/or the timeliness reaches the preset timeliness, the epidemic prediction model is judged to satisfy the preset condition, and the optimized epidemic prediction model is obtained.
Step 104: if the prediction result for the test data does not satisfy the preset condition, fine-tune the epidemic prediction model, and then perform step 105.
If the prediction result for the test data does not satisfy the preset condition, the parameters of the epidemic prediction model are adjusted further.
For example, if the prediction result of the CUSUM prediction model for the test data does not satisfy the preset condition, the time window size $w$ and the parameters $k_1$ and $h$ of the CUSUM prediction model are adjusted further.
As another example, if the prediction result of the EWMA prediction model for the test data does not satisfy the preset condition, the weight coefficient $\lambda$ and the parameter $k_2$ of the EWMA prediction model are adjusted further.
As yet another example, if the prediction result of the moving percentile prediction model for the test data does not satisfy the preset condition, the 75th percentile is taken as the warning threshold of the moving percentile prediction model (that is, the 80th percentile is adjusted to the 75th percentile).
Step 105: determine, using second training data, the time window size for determining epidemic risk levels based on the epidemic prediction model (hereinafter the grading time window size), such that time points determined based on the epidemic prediction model to be at the medium risk level or above fall within a true epidemic period, and time points determined to be at the low risk level or any level below medium risk fall within a true non-epidemic period.
The second training data is epidemic monitoring data. The second training data may be the same as or different from the first training data.
The purpose of determining the grading time window size is to ensure the accuracy of the epidemic risk levels determined based on the epidemic prediction model.
In determining the grading time window size, the size is adjusted so that time points determined based on the epidemic prediction model to be at the medium risk level or above fall within a true epidemic period, and time points determined to be at the low risk level or any level below medium risk fall within a true non-epidemic period.
Referring to FIG. 2, determining the grading time window size may specifically include the following steps:
Step 201: use the epidemic prediction model to predict each time point within the grading time window preceding a preset time point, thereby dividing that window into epidemic periods and non-epidemic periods.
The initial value of the grading time window size is a preset value, for example 3, and is adjusted through steps 201-205 to a suitable size, for example 7.
Step 202: calculate, from the second training data, the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point.
Step 203: calculate the epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point. For this step, refer to the description of step 108 below.
Step 204: determine the epidemic risk level of the preset time point according to the epidemic risk level division thresholds corresponding to the second training data. For this step, refer to the description of step 109 below.
Step 205: if the epidemic risk level of the preset time point is the medium risk level or above, determine whether the preset time point falls within a true epidemic period; or, if the epidemic risk level of the preset time point is the low risk level or any level below medium risk, determine whether the preset time point falls within a true non-epidemic period.
Step 206: if the epidemic risk level of the preset time point is the medium risk level or above but the preset time point does not fall within a true epidemic period, or if the epidemic risk level of the preset time point is the low risk level or any level below medium risk but the preset time point does not fall within a true non-epidemic period, adjust the grading time window size.
The grading time window size can be adjusted multiple times in this way using different second training data, so as to bring it to its optimal value.
Step 106: use the epidemic prediction model to predict each time point within the grading time window preceding the time point to be measured, thereby dividing that window into epidemic periods and non-epidemic periods.
For example, the CUSUM prediction model is used to predict each time point within the grading time window preceding the time point to be measured, dividing that window into epidemic periods and non-epidemic periods. As another example, the EWMA prediction model is used to predict each time point within the grading time window preceding the time point to be measured, dividing that window into epidemic periods and non-epidemic periods. As yet another example, the moving percentile prediction model is used to predict each time point within the grading time window preceding the time point to be measured, dividing that window into epidemic periods and non-epidemic periods.
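One way to picture the adjustment of steps 201-206 is as a search over candidate window sizes that counts how often the grade disagrees with the true period labels. The application only says the window size is "adjusted"; the exhaustive search below, the function name `choose_window`, and the `grade` callback are assumptions made for illustration.

```python
def choose_window(grade, truth, candidates):
    """Pick the candidate window size with the fewest grading mismatches.

    grade(t, window): hypothetical callback that returns the risk level of
        time point t ('very low', 'low', 'medium' or 'high') computed from the
        preceding `window` time points, or None if it cannot be graded.
    truth: list of booleans, True where the time point lies in a true
        epidemic period.
    candidates: iterable of window sizes to try.
    Returns (best_window, mismatch_count).
    """
    best, best_errors = None, None
    for window in candidates:
        errors = 0
        for t in range(window, len(truth)):
            level = grade(t, window)
            if level is None:
                continue
            # Medium or above should coincide with a true epidemic period,
            # anything below medium with a true non-epidemic period.
            if (level in ("medium", "high")) != truth[t]:
                errors += 1
        if best_errors is None or errors < best_errors:
            best, best_errors = window, errors
    return best, best_errors


if __name__ == "__main__":
    labels = [False] * 4 + [True] * 2

    def toy_grade(t, window):
        # Toy grader: a window of 2 misgrades time point 3 as high risk.
        return "high" if (window == 2 and t == 3) or labels[t] else "low"

    print(choose_window(toy_grade, labels, [2, 3]))
```

In practice the `grade` callback would wrap the threshold-based grading of steps 202-204 for a given window size.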
Step 107: calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be measured.
In step 107, all non-epidemic periods within the grading time window preceding the time point to be measured are pooled, and the mean and standard deviation of the epidemic monitoring data over all of them are calculated. For example, if the grading time window preceding the time point to be measured contains three non-epidemic periods, the mean and standard deviation of the epidemic monitoring data within those three non-epidemic periods are calculated. The result is a single mean and a single standard deviation computed over the epidemic monitoring data of all non-epidemic periods within the window.
Step 108: calculate the epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be measured.
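The pooling described in step 107 can be sketched as follows. The function name `non_epidemic_baseline` and the use of the population standard deviation are assumptions; the application does not state which estimator is used.

```python
from statistics import mean, pstdev


def non_epidemic_baseline(x, flags, t, window):
    """Pooled mean and standard deviation of non-epidemic data before t.

    x      : list of epidemic monitoring values
    flags  : list of booleans from the prediction model, True = epidemic period
    t      : index of the time point to be measured
    window : grading time window size
    Returns (mean, std) over every non-epidemic time point in
    x[t-window:t], or None if the window contains no such point.
    """
    start = max(0, t - window)
    quiet = [x[i] for i in range(start, t) if not flags[i]]
    if not quiet:
        return None
    return mean(quiet), pstdev(quiet)   # assumption: population standard deviation


if __name__ == "__main__":
    values = [10, 12, 30, 31, 11, 9, 40]
    model_flags = [False, False, True, True, False, False, False]
    print(non_epidemic_baseline(values, model_flags, t=6, window=6))
```

In the sample, the two separate non-epidemic stretches (10, 12 and 11, 9) are pooled into one mean and one standard deviation, and the epidemic values 30 and 31 are excluded.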
所述流行病风险等级划分阈值可以包括高中等级划分阈值、中低等级划分阈值、低/极低等级划分阈值。所述高中等级划分阈值用于划分高风险等级与中风险等级,所述中低等级划分阈值用于划分中风险等级与低风险等级,所述低极等级划分阈值用于划分低风险等级与极低风险等级。The epidemic risk level division threshold may include a high school level division threshold, a medium and low level division threshold, and a low/very low level division threshold. The high school level partitioning threshold is used to divide a high risk level and a medium risk level, wherein the low level ranking threshold is used to divide the medium risk level and the low risk level, and the low level ranking threshold is used to divide the low risk level and the pole. Low risk level.
在一具体实施例中,所述非流行期内流行病监测数据的均值为μ w′,标准差为σ w′,所述高中等级划分阈值为μ w′+k′ 1w′,其中6≤k′ 1≤9,所述中低等级划分阈值为μ w′+k 2′*σ w′,其中4≤k′ 2<6,所述中低等级划分阈值为μ w′+k 3′*σ w′,2≤k′ 3<4。例如,所述高中等级划分阈值为μ w′+6*σ w′,所述中低等级划分阈值为μ w′+4*σ w′,所述中低等级划分阈值为μ w′+2*σ w′In a specific embodiment, the mean value of the epidemiological monitoring data during the non-prevalence period is μ w′ , the standard deviation is σ w′ , and the high school level dividing threshold is μ w′ +k′ 1w′ , Wherein 6 ≤ k' 1 ≤ 9, the middle and low level partitioning threshold is μ w′ + k 2 ′ * σ w′ , wherein 4 ≤ k′ 2 <6, and the middle and low level partitioning threshold is μ w′ + k 3 '*σ w' , 2≤k' 3 <4. For example, the high school level dividing threshold is μ w′ +6*σ w′ , the middle and low level dividing threshold is μ w′ +4*σ w′ , and the middle and low level dividing threshold is μ w′ +2 *σ w' .
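The three thresholds of this embodiment follow directly from the mean and standard deviation. The sketch below uses the example multipliers 6, 4 and 2 as defaults and checks the stated ranges; the function name `division_thresholds` and the dictionary keys are illustrative choices.

```python
def division_thresholds(mu, sigma, k_high=6.0, k_mid=4.0, k_low=2.0):
    """Risk level division thresholds of the form mu + k * sigma.

    mu, sigma : mean and standard deviation of the non-epidemic data
    k_high    : multiplier for the high/medium threshold, 6 <= k_high <= 9
    k_mid     : multiplier for the medium/low threshold, 4 <= k_mid < 6
    k_low     : multiplier for the low/very-low threshold, 2 <= k_low < 4
    """
    if not (6 <= k_high <= 9 and 4 <= k_mid < 6 and 2 <= k_low < 4):
        raise ValueError("multipliers outside the ranges given in the embodiment")
    return {
        "high/medium": mu + k_high * sigma,
        "medium/low": mu + k_mid * sigma,
        "low/very low": mu + k_low * sigma,
    }


if __name__ == "__main__":
    print(division_thresholds(10.0, 1.5))
```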
In other embodiments, the epidemic risk level division thresholds may differ in number and type. For example, they may include only a high/medium division threshold and a medium/low division threshold. As another example, they may include a very-high/high division threshold, a high/medium division threshold, a medium/low division threshold, and a low/very-low division threshold.
Step 109: determine the epidemic risk level of the time point to be measured according to the epidemic risk level division thresholds.
For example, if the epidemic monitoring data at the time point to be measured is greater than or equal to the high/medium division threshold, the epidemic risk level of that time point is determined to be high. If the data is less than the high/medium division threshold and greater than or equal to the medium/low division threshold, the risk level is determined to be medium. If the data is less than the medium/low division threshold and greater than or equal to the low/very-low division threshold, the risk level is determined to be low. If the data is less than the low/very-low division threshold, the risk level is determined to be very low.
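The threshold calculation of step 108 and the level determination of step 109 can be sketched in Python as follows. This is a minimal illustration rather than the application's implementation: the function names `risk_thresholds` and `risk_level` are hypothetical, the default multipliers 6, 4, and 2 are the example values from the specific embodiment, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def risk_thresholds(non_epidemic_values, k1=6.0, k2=4.0, k3=2.0):
    """Return (high/medium, medium/low, low/very-low) division thresholds.

    non_epidemic_values: monitoring data pooled from all non-epidemic
    periods inside the grading time window (hypothetical input format).
    """
    mu = mean(non_epidemic_values)
    sigma = pstdev(non_epidemic_values)  # assumption: population std dev
    return mu + k1 * sigma, mu + k2 * sigma, mu + k3 * sigma


def risk_level(value, thresholds):
    """Map one monitoring value to a risk level using the three thresholds."""
    high_medium, medium_low, low_very_low = thresholds
    if value >= high_medium:
        return "high"
    if value >= medium_low:
        return "medium"
    if value >= low_very_low:
        return "low"
    return "very low"


thresholds = risk_thresholds([8, 10, 12, 10])
level = risk_level(16, thresholds)
```

With a non-epidemic baseline of 8, 10, 12, 10 the mean is 10 and the standard deviation is about 1.41, so the thresholds are roughly 18.5, 15.7, and 12.8, and a value of 16 falls in the medium band.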
In one embodiment, the CUSUM prediction model, the EWMA prediction model, and the moving percentile prediction model may be combined for epidemic grading prediction. Specifically, following steps 101-109, the time point to be measured is assigned a first epidemic risk level based on the CUSUM prediction model, a second epidemic risk level based on the EWMA prediction model, and a third epidemic risk level based on the moving percentile prediction model, and the final epidemic risk level is derived from these three levels. It may be determined whether at least two of the first, second, and third epidemic risk levels agree; if at least two agree, the agreed level is taken as the final epidemic risk level.
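The two-out-of-three agreement rule can be sketched as below. The function name `combine_risk_levels` is hypothetical, and returning `None` when all three models disagree is an assumption: the text does not state what happens in that case.

```python
from collections import Counter


def combine_risk_levels(cusum_level, ewma_level, percentile_level):
    """Return the level shared by at least two models, or None if none is shared."""
    counts = Counter([cusum_level, ewma_level, percentile_level])
    level, votes = counts.most_common(1)[0]
    # Assumption: no final level is produced when all three models differ.
    return level if votes >= 2 else None


final_level = combine_risk_levels("high", "medium", "high")
```

A caller would need its own fallback (for example, deferring to one designated model) for the `None` case.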
The epidemic grading prediction method of Embodiment 1 trains and tests an epidemic prediction model to obtain an optimized model, uses the optimized model to make predictions on the epidemic monitoring data preceding the time point to be measured, determines the grading time window size based on the epidemic prediction model, and determines the epidemic risk level of the time point to be measured from the prediction results at the time points preceding it and the grading time window size. Because the time window size used for risk level determination is itself determined based on the epidemic prediction model, Embodiment 1 can improve the accuracy of epidemic risk level determination.
Embodiment 2
FIG. 3 is a structural diagram of the epidemic grading prediction apparatus provided in Embodiment 2 of the present application. As shown in FIG. 3, the epidemic grading prediction apparatus 10 may include an establishing unit 301, a training unit 302, a testing unit 303, a determining unit 304, and a prediction unit 305.
The establishing unit 301 is configured to establish an epidemic prediction model.
The epidemic prediction model is used to predict epidemic periods and non-epidemic periods from epidemic monitoring data.
The epidemic monitoring data is time series data. It may include epidemic prevalence data such as the number of consultations, the consultation rate, the number of cases, and the incidence rate. For example, the daily number of consultations for an epidemic disease (e.g., influenza) may be obtained from a medical institution (e.g., a hospital) and used as the epidemic monitoring data. As another example, the daily number of cases of an epidemic disease (e.g., influenza) among students may be obtained from a school and used as the epidemic monitoring data.
An epidemic monitoring network composed of multiple monitoring points may be established in a preset area (e.g., a province, city, or district), and the epidemic monitoring data is obtained from those monitoring points. Medical institutions, schools and childcare institutions, pharmacies, and the like may be selected as monitoring points, each carrying out epidemic monitoring and data collection for its own target population. Sites that satisfy preset conditions may be selected as monitoring points. The preset conditions may include headcount, scale, and so on. For example, schools and childcare institutions whose student numbers reach a preset number are selected as monitoring points. As another example, pharmacies whose scale (e.g., measured by daily turnover) reaches a preset scale are selected as monitoring points. As a further example, hospitals whose scale (e.g., measured by the daily number of patients) reaches a preset scale are selected as monitoring points.
The epidemic prevalence data at different time points make up the epidemic monitoring data (i.e., the time series). For example, prevalence data collected daily may be used as the epidemic monitoring data. Alternatively, prevalence data collected weekly may be used.
Medical institutions (mainly hospitals) are the sites best able to capture early signs of an outbreak and are the first choice for epidemic monitoring. Epidemic monitoring data can be obtained from patient consultations.
Some patients buy medicine at a pharmacy on their own to relieve early symptoms, so epidemic monitoring data can also be obtained from pharmacy drug sales.
Children and adolescents are a high-risk group for epidemic diseases and an important link in their transmission, so monitoring of this group should be strengthened. Schools and childcare institutions are well suited to monitoring epidemic incidence among children and adolescents. Epidemic monitoring data can be obtained from the absence records of children and adolescents at schools and childcare institutions.
Therefore, the present application mainly selects three types of sites, namely medical institutions, schools and childcare institutions, and pharmacies, for collecting epidemic monitoring data. This choice of data sources does not preclude other embodiments from adding or substituting other populations or sites of particular concern as monitoring data sources. For example, hotels may be brought within the scope of epidemic monitoring to obtain epidemic monitoring data on hotel guests.
As needed, the epidemic monitoring data collected by only one type of monitoring point (e.g., medical institutions) may be used. For example, only the data collected by hospitals may be used. Alternatively, the data collected by several types of monitoring points may be combined. For example, the data collected by hospitals may serve as the primary source, supplemented by the data collected by pharmacies.
The epidemic prediction model may include a CUSUM (Cumulative Sum) prediction model, an EWMA (Exponentially Weighted Moving Average) prediction model, and a moving percentile prediction model. The three models are described in turn below.
(1) CUSUM prediction model
The CUSUM prediction model accumulates the small deviations between the actual values (i.e., the epidemic monitoring data) and a reference value, which amplifies them and makes the prediction more sensitive to small deviations. When the accumulated deviation exceeds a threshold, a turning point is considered to have occurred, that is, the epidemic has moved from a non-epidemic period into an epidemic period.
Assume the epidemic monitoring data X follows a normal distribution, let w be the time window size of the CUSUM prediction model, and let the initial CUSUM value be $C_0 = 0$. The CUSUM value $C_t$ and the threshold $H_t$ at time point t are then:
$$C_t = \max\{0,\; X_t - (\mu_w + k_1 \sigma_w) + C_{t-1}\}$$
$$H_t = h\,\sigma_t$$
Here w is the time window size of the CUSUM prediction model. For example, when the epidemic monitoring data X is daily prevalence data, w may be 7, i.e., 7 days (one week). $\mu_w$ is the mean of X over time points t-w to t-1 (a span of length w), and $\sigma_w$ is the standard deviation of X over the same span. $k_1$ is a tunable parameter, generally taken in (0, 3].
If $C_t$ is greater than the threshold $H_t = h\,\sigma_t$, the epidemic period is considered to have begun. h is a tunable parameter, generally 1, 2, or 3. $\sigma_t$ is the standard deviation of the historical data at time point t.
$C_t$ at each time point is generated from the value at the previous time point. From $C_t$ and $H_t$, the start of an epidemic period or a non-epidemic period can be determined quickly: if the CUSUM value $C_t$ is greater than the threshold $H_t$, an epidemic period begins; if $C_t$ is less than or equal to $H_t$, a non-epidemic period begins.
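The CUSUM recursion can be sketched in Python as follows. The function name `cusum_flags` and the default parameter values are hypothetical. Two points are assumptions rather than statements from the text: $\sigma_t$ is taken to be the same trailing-window standard deviation as $\sigma_w$, and the first w time points, which lack a full trailing window, are left unflagged.

```python
from statistics import mean, pstdev


def cusum_flags(x, w=7, k1=1.0, h=2.0):
    """Flag each time point as epidemic (True) or non-epidemic (False)."""
    c_prev = 0.0  # C_0 = 0
    flags = []
    for t, value in enumerate(x):
        if t < w:
            flags.append(False)  # assumption: no decision without a full window
            continue
        window = x[t - w:t]  # time points t-w .. t-1
        mu_w, sigma_w = mean(window), pstdev(window)
        c_t = max(0.0, value - (mu_w + k1 * sigma_w) + c_prev)
        # Assumption: sigma_t is approximated by sigma_w.
        flags.append(c_t > h * sigma_w)
        c_prev = c_t
    return flags


flags = cusum_flags([10, 12, 10, 12, 10, 12, 10, 30, 30])
```

In this series the two values of 30 push the cumulative sum well above $h\,\sigma_w$, so only the last two time points are flagged.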
(2) EWMA prediction model
EWMA is a moving average with exponentially decreasing weights. The weight of each value decays exponentially with its age: more recent data carries more weight, but older data still retains some weight.
Assume the epidemic monitoring data X follows a normal distribution, $X_t \sim N(\mu, \sigma^2)$, with initial value $Z_0 = X_0$. The EWMA value at time point t is then:
$$Z_t = \lambda X_t + (1 - \lambda) Z_{t-1}$$
[Formula image PCTCN2018099649-appb-000002]
If the EWMA value $Z_t$ at time point t is greater than UCL, the epidemic period is considered to have begun.
The constant $\lambda$ is a weight coefficient, generally taken in (0, 1).
$k_2$ is a tunable parameter, generally taken in (0, 3].
$Z_t$ at each time point is generated from the value at the previous time point. From $Z_t$ and UCL, the start of an epidemic period or a non-epidemic period can be determined quickly: if $Z_t$ is greater than UCL, an epidemic period begins; if the EWMA value $Z_t$ is less than or equal to UCL, a non-epidemic period begins.
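The EWMA recursion and the comparison against UCL can be sketched as below. The function name `ewma_flags` and the defaults are hypothetical. The UCL definition is not available in the text, so the sketch substitutes the common textbook steady-state limit $\mu + k_2 \sigma \sqrt{\lambda / (2 - \lambda)}$; this is an assumption and may differ from the formula used in the application. The baseline $\mu$ and $\sigma$ are supplied by the caller.

```python
import math


def ewma_flags(x, mu, sigma, lam=0.3, k2=2.0):
    """Flag each time point whose EWMA value exceeds the upper control limit."""
    if not x:
        return []
    # Assumption: textbook steady-state EWMA control limit, not taken from the source.
    ucl = mu + k2 * sigma * math.sqrt(lam / (2.0 - lam))
    z = x[0]  # Z_0 = X_0
    flags = [z > ucl]
    for value in x[1:]:
        z = lam * value + (1.0 - lam) * z  # Z_t = lam * X_t + (1 - lam) * Z_{t-1}
        flags.append(z > ucl)
    return flags


flags = ewma_flags([10, 10, 10, 30, 30], mu=10, sigma=2, lam=0.5, k2=2.0)
```

With these inputs the limit is about 12.3, and the smoothed values 20 and 25 at the last two time points exceed it.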
(3) Moving percentile prediction model
The moving percentile prediction model takes as baseline data the epidemic monitoring data from the same week as the observation week in previous years, together with a preset number of weeks before and after it (e.g., 2 weeks on either side), and calculates specified percentiles (e.g., P5, P10, ..., P90, P95, P100) as candidate warning thresholds to build a warning model.
For example, the P80 value of the influenza ILI index for week 3 of 2014 is the 80th percentile of the incidence rates of the 10 weeks made up of weeks 1-5 of 2012 and 2013, and this value is taken as the warning threshold for week 3 of 2014. The P75 value for week 20 of 2014 is the 75th percentile of the incidence rates of the 10 weeks made up of weeks 18-22 of 2012 and 2013, and this value is taken as the warning threshold for week 20 of 2014.
The epidemic monitoring data is compared with the warning threshold. If the data is greater than the warning threshold, an epidemic period begins; if it is less than or equal to the warning threshold, a non-epidemic period begins.
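The baseline selection and percentile threshold can be sketched as follows. The names `percentile` and `warning_threshold` and the `(year, week)` dictionary layout are hypothetical. The linear-interpolation percentile is one of several common definitions and is an assumption, as the text does not specify one; weeks that would wrap across a year boundary are simply skipped in this sketch.

```python
def percentile(values, p):
    """p-th percentile (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no baseline data")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def warning_threshold(history, year, week, p=80, years_back=2, half_width=2):
    """Threshold for (year, week) from the same weeks of earlier years.

    history: dict mapping (year, week) to a monitoring value (hypothetical layout).
    """
    baseline = [
        history[(y, wk)]
        for y in range(year - years_back, year)
        for wk in range(week - half_width, week + half_width + 1)
        if (y, wk) in history  # weeks outside the recorded range are skipped
    ]
    return percentile(baseline, p)


# Weeks 1-5 of 2012 hold 1..5 and weeks 1-5 of 2013 hold 6..10.
history = {(2012, wk): wk for wk in range(1, 6)}
history.update({(2013, wk): wk + 5 for wk in range(1, 6)})
threshold = warning_threshold(history, 2014, 3, p=80)
```

An observation for week 3 of 2014 would then be compared with `threshold` to decide between epidemic and non-epidemic.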
The training unit 302 is configured to train the epidemic prediction model with first training data.
The first training data is epidemic monitoring data. Training the epidemic prediction model with the first training data means using the model to make predictions on the first training data and adjusting or selecting the model's parameters according to the prediction results.
For example, the CUSUM prediction model is used to make epidemic predictions on the first training data, and its time window size w and parameters $k_1$ and h are adjusted according to the prediction results.
As another example, the EWMA prediction model is used to make epidemic predictions on the first training data, and its weight coefficient $\lambda$ and parameter $k_2$ are adjusted according to the prediction results.
As a further example, the moving percentile prediction model is used to make epidemic predictions on the first training data, and a suitable percentile (e.g., the 80th percentile) is selected as the model's warning threshold according to the prediction results.
Specifically, the epidemic prediction model may be used to make predictions on the first training data, the prediction results compared with the real division into epidemic and non-epidemic periods, and the model's parameters adjusted or selected according to the comparison.
The real epidemic and non-epidemic periods are defined by medical methods. Preset indicators of the model's prediction results on the first training data may be calculated, and the model's parameters adjusted or selected according to those indicators. For example, three indicators of the prediction results, namely accuracy, specificity, and timeliness, may be calculated, and the model's parameters adjusted or selected based on them.
The three indicators may be calculated as follows:
Accuracy = number of days with an effective warning / total number of days in the real epidemic period × 100%;
Specificity = number of days with no warning / total number of days in the real non-epidemic period × 100%;
Timeliness (i.e., lag) = start date of the epidemic period given by effective warnings - start date of the real epidemic period.
When the epidemic prediction model is used to predict epidemic periods on the first training data, a day that is predicted to be in an epidemic period and that also falls within the real epidemic period is counted as an effective warning.
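The three indicators can be sketched for per-day boolean series as below. The function name `evaluate` is hypothetical. The sketch assumes that "days with no warning" are counted within the real non-epidemic period, that the series contains a single real epidemic period, and that an indicator is reported as `None` when its denominator is zero.

```python
def evaluate(predicted, actual):
    """Return (accuracy %, specificity %, lag in days) for per-day boolean series.

    predicted[i] / actual[i] are True when day i is predicted / really epidemic.
    """
    pairs = list(zip(predicted, actual))
    epidemic_days = sum(1 for a in actual if a)
    non_epidemic_days = len(actual) - epidemic_days
    effective = sum(1 for p, a in pairs if p and a)        # effective warnings
    quiet = sum(1 for p, a in pairs if not p and not a)    # no warning, non-epidemic
    accuracy = 100.0 * effective / epidemic_days if epidemic_days else None
    specificity = 100.0 * quiet / non_epidemic_days if non_epidemic_days else None
    first_real = actual.index(True) if epidemic_days else None
    first_hit = next((i for i, (p, a) in enumerate(pairs) if p and a), None)
    lag = first_hit - first_real if first_hit is not None else None
    return accuracy, specificity, lag


actual = [False, False, True, True, True, True, False, False]
predicted = [False, True, False, True, True, True, False, False]
result = evaluate(predicted, actual)
```

Here three of the four real epidemic days carry an effective warning, three of the four non-epidemic days carry no warning, and the first effective warning arrives one day after the real start.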
The testing unit 303 is configured to use the epidemic prediction model to make predictions on test data, determine whether the prediction results satisfy a preset condition, and fine-tune the epidemic prediction model if they do not.
The test data is epidemic monitoring data. The purpose of making predictions on the test data is to verify whether the trained epidemic prediction model meets the requirements.
Preset indicators (e.g., accuracy, specificity, timeliness) of the prediction results on the test data may be calculated, and whether the trained epidemic prediction model satisfies the preset condition is judged from them. For example, it is determined whether the accuracy of the prediction results reaches a preset accuracy, and/or whether their specificity reaches a preset specificity, and/or whether their timeliness reaches a preset timeliness. If the accuracy reaches the preset accuracy, and/or the specificity reaches the preset specificity, and/or the timeliness reaches the preset timeliness, the epidemic prediction model is judged to satisfy the preset condition, and the optimized epidemic prediction model is obtained.
If the prediction results on the test data do not satisfy the preset condition, the parameters of the epidemic prediction model are adjusted further.
For example, if the prediction results of the CUSUM prediction model on the test data do not satisfy the preset condition, its time window size w and parameters $k_1$ and h are adjusted further.
As another example, if the prediction results of the EWMA prediction model on the test data do not satisfy the preset condition, its weight coefficient $\lambda$ and parameter $k_2$ are adjusted further.
As a further example, if the prediction results of the moving percentile prediction model on the test data do not satisfy the preset condition, the 75th percentile is taken as the model's warning threshold (i.e., the 80th percentile is changed to the 75th percentile).
The determining unit 304 is configured to use second training data to determine the time window size used for epidemic risk level determination based on the epidemic prediction model (hereinafter the grading time window size), such that time points judged, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points judged to be at the low risk level or below fall within real non-epidemic periods.
The second training data is epidemic monitoring data. It may be the same as the first training data or different from it.
The purpose of determining the grading time window size is to ensure that the epidemic risk level determined based on the epidemic prediction model is accurate.
In determining the grading time window size, the size is adjusted so that time points judged, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points judged to be at the low risk level or below fall within real non-epidemic periods.
The determining unit 304 may determine the grading time window size as follows:
(1) Based on the second training data, use the epidemic prediction model to make predictions at each time point within the grading time window preceding a preset time point, and divide that window into epidemic periods and non-epidemic periods.
The initial value of the grading time window size is a preset value, for example 3; the determining unit 304 adjusts it to a suitable size, for example 7.
(2) Based on the second training data, calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point.
(3) From that mean and standard deviation, calculate the epidemic risk level division thresholds.
(4) Determine the epidemic risk level of the preset time point according to the epidemic risk level division thresholds.
(5) If the epidemic risk level of the preset time point is medium or above, determine whether the preset time point falls within a real epidemic period; or, if the epidemic risk level of the preset time point is low or below, determine whether the preset time point falls within a real non-epidemic period.
(6) If the epidemic risk level of the preset time point is medium or above but the preset time point does not fall within a real epidemic period, or if the epidemic risk level of the preset time point is low or below but the preset time point does not fall within a real non-epidemic period, adjust the grading time window size.
The grading time window size may be adjusted repeatedly in this manner using different second training data, so as to bring it to an optimal value.
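One way to picture steps (1) to (6) is the simplified search below. It is not the procedure of the application: instead of adjusting the size iteratively, it scores each candidate size by how often the grade at a time point agrees with the real period and keeps the best one. All names (`agreement`, `choose_window_size`, the candidate sizes) are hypothetical, the medium/low multiplier of 4 is the example value from the specific embodiment, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def is_medium_or_above(value, baseline, k2=4.0):
    """True if value reaches the medium/low threshold mu + k2 * sigma."""
    return value >= mean(baseline) + k2 * pstdev(baseline)


def agreement(x, model_flags, truth, size):
    """Fraction of graded time points whose grade matches the real period."""
    hits = total = 0
    for t in range(size, len(x)):
        # Non-epidemic data (per the prediction model) in the preceding window.
        baseline = [x[i] for i in range(t - size, t) if not model_flags[i]]
        if not baseline:
            continue  # assumption: skip points with no non-epidemic baseline
        hits += is_medium_or_above(x[t], baseline) == truth[t]
        total += 1
    return hits / total if total else 0.0


def choose_window_size(x, model_flags, truth, candidates=(3, 5, 7, 14)):
    """Pick the candidate size with the highest agreement."""
    return max(candidates, key=lambda size: agreement(x, model_flags, truth, size))


x = [10, 11, 10, 11, 10, 11, 10, 11, 30, 30, 30, 10, 11]
truth = [False] * 8 + [True] * 3 + [False] * 2
best = choose_window_size(x, truth, truth, candidates=(3, 5))
```

In this toy series a window of 3 leaves too little non-epidemic data right after the peak and misgrades the final point, whereas a window of 5 grades every point consistently, so 5 is selected.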
The prediction unit 305 is configured to use the epidemic prediction model to make predictions at each time point within the grading time window preceding the time point to be measured, and to divide that window into epidemic periods and non-epidemic periods.
For example, the CUSUM prediction model is used to make predictions at each time point within the grading time window preceding the time point to be measured, dividing that window into epidemic periods and non-epidemic periods. As another example, the EWMA prediction model is used in the same way. As a further example, the moving percentile prediction model is used in the same way.
The prediction unit 305 is further configured to calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be measured.
All non-epidemic periods within the grading time window preceding the time point to be measured are pooled, and the mean and standard deviation of the epidemic monitoring data over those non-epidemic periods are calculated. For example, if the grading time window preceding the time point to be measured contains three non-epidemic periods, the mean and standard deviation are calculated over the epidemic monitoring data of all three periods. The result is a single mean and a single standard deviation computed from the epidemic monitoring data of all non-epidemic periods within the grading time window preceding the time point to be measured.
The prediction unit 305 is further configured to calculate epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be measured.
The epidemic risk level division thresholds may include a high/medium division threshold, a medium/low division threshold, and a low/very-low division threshold. The high/medium division threshold separates the high risk level from the medium risk level, the medium/low division threshold separates the medium risk level from the low risk level, and the low/very-low division threshold separates the low risk level from the very low risk level.
In a specific embodiment, the mean of the epidemic monitoring data in the non-epidemic periods is $\mu_{w'}$ and the standard deviation is $\sigma_{w'}$. The high/medium division threshold is $\mu_{w'} + k'_1 \sigma_{w'}$ with $6 \le k'_1 \le 9$; the medium/low division threshold is $\mu_{w'} + k'_2 \sigma_{w'}$ with $4 \le k'_2 < 6$; and the low/very-low division threshold is $\mu_{w'} + k'_3 \sigma_{w'}$ with $2 \le k'_3 < 4$. For example, the high/medium division threshold is $\mu_{w'} + 6\sigma_{w'}$, the medium/low division threshold is $\mu_{w'} + 4\sigma_{w'}$, and the low/very-low division threshold is $\mu_{w'} + 2\sigma_{w'}$.
In other embodiments, the epidemic risk level division thresholds may differ in number and type. For example, they may include only a high/medium threshold and a medium/low threshold. As another example, they may include a very-high/high threshold, a high/medium threshold, a medium/low threshold, and a low/very-low threshold.
The prediction unit 305 is further configured to determine the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
For example, if the epidemic monitoring data at the time point to be tested is greater than or equal to the high/medium threshold, the epidemic risk level of that time point is determined to be high. If the epidemic monitoring data is less than the high/medium threshold and greater than or equal to the medium/low threshold, the risk level is determined to be medium. If the epidemic monitoring data is less than the medium/low threshold and greater than or equal to the low/very-low threshold, the risk level is determined to be low. If the epidemic monitoring data is less than the low/very-low threshold, the risk level is determined to be very low.
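The comparison rules above amount to a cascade of greater-than-or-equal checks from the highest threshold down. A minimal sketch, in which the function name `classify_risk` and the string labels are hypothetical:

```python
def classify_risk(value, high_medium, medium_low, low_very_low):
    """Map a monitoring value to a risk level using the three thresholds.

    Each boundary is inclusive on the upper level, as in the text:
    a value equal to a threshold takes the higher of the two levels.
    """
    if value >= high_medium:
        return "high"
    if value >= medium_low:
        return "medium"
    if value >= low_very_low:
        return "low"
    return "very low"


if __name__ == "__main__":
    # Thresholds 22, 18, 14 correspond to mu=10, sigma=2 with k = 6, 4, 2.
    for v in (25, 22, 19, 15, 13):
        print(v, classify_risk(v, 22.0, 18.0, 14.0))
```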
In an embodiment, the epidemic grading prediction apparatus 10 may combine a CUSUM prediction model, an EWMA prediction model, and a moving percentile prediction model to perform epidemic grading prediction. Specifically, the apparatus 10 determines a first epidemic risk level for the time point to be tested based on the CUSUM prediction model, a second epidemic risk level based on the EWMA prediction model, and a third epidemic risk level based on the moving percentile prediction model, and obtains the final epidemic risk level from the first, second, and third epidemic risk levels. It may be determined whether at least two of the three risk levels agree; if at least two of them agree, the agreed risk level is taken as the final epidemic risk level.
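The agreement rule can be sketched as a two-out-of-three vote. The function name `combine_risk_levels` is hypothetical, and returning `None` when all three levels differ is an assumption: the text does not state what happens in that case.

```python
from collections import Counter


def combine_risk_levels(first, second, third):
    """Return the level shared by at least two of the three models.

    first, second, third -- risk levels from the CUSUM, EWMA, and moving
    percentile models respectively. Returns None when no two levels agree
    (an assumption; the source leaves this case unspecified).
    """
    level, count = Counter([first, second, third]).most_common(1)[0]
    return level if count >= 2 else None


if __name__ == "__main__":
    print(combine_risk_levels("high", "medium", "high"))
    print(combine_risk_levels("high", "medium", "low"))
```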
The epidemic grading prediction apparatus 10 of Embodiment 2 trains and tests an epidemic prediction model to obtain an optimized epidemic prediction model, uses the optimized model to make predictions on the epidemic monitoring data preceding the time point to be tested, determines the grading time window size based on the epidemic prediction model, and determines the epidemic risk level of the time point to be tested from the prediction results at the time points preceding it and the grading time window size. Because the time window size used for epidemic risk level determination is itself determined based on the epidemic prediction model, Embodiment 2 can improve the accuracy of epidemic risk level determination.
Embodiment 3
FIG. 4 is a schematic diagram of the computer device provided in Embodiment 3 of the present application. The computer device 1 includes a memory 20, a processor 30, and computer readable instructions 40, such as an epidemic grading prediction program, that are stored in the memory 20 and executable on the processor 30. When executing the computer readable instructions 40, the processor 30 implements the steps of the epidemic grading prediction method embodiment described above, for example steps 101-109 shown in FIG. 1. Alternatively, when executing the computer readable instructions 40, the processor 30 implements the functions of the modules/units of the apparatus embodiment described above, for example units 301-305 in FIG. 3.
By way of example, the computer readable instructions 40 may be divided into one or more modules/units, which are stored in the memory 20 and executed by the processor 30 to carry out the present application. The one or more modules/units may be a series of computer readable instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer readable instructions 40 in the computer device 1. For example, the computer readable instructions 40 may be divided into the establishing unit 301, the training unit 302, the testing unit 303, the determining unit 304, and the prediction unit 305 of FIG. 3; for the specific functions of each unit, see Embodiment 2.
The computer device 1 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. Those skilled in the art will understand that FIG. 4 is merely an example of the computer device 1 and does not limit it; the computer device 1 may include more or fewer components than illustrated, may combine certain components, or may use different components. For example, the computer device 1 may also include input/output devices, network access devices, buses, and the like.
The processor 30 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 30 may be any conventional processor. The processor 30 is the control center of the computer device 1 and connects the various parts of the entire computer device 1 through various interfaces and lines.
The memory 20 may be used to store the computer readable instructions 40 and/or the modules/units. The processor 30 implements the various functions of the computer device 1 by running or executing the computer readable instructions and/or modules/units stored in the memory 20 and by calling data stored in the memory 20. The memory 20 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the computer device 1 (such as audio data or a phone book). In addition, the memory 20 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the modules/units integrated in the computer device 1 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer readable storage medium. On this understanding, all or part of the processes of the method embodiments described above may also be carried out by computer readable instructions that direct the relevant hardware. The computer readable instructions may be stored in a non-volatile readable storage medium and, when executed by a processor, implement the steps of each of the method embodiments described above. The computer readable instructions comprise computer readable instruction code, which may be in source code form, object code form, an executable file, or some intermediate form. The non-volatile readable medium may include: any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the non-volatile readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the non-volatile readable medium does not include electrical carrier signals and telecommunication signals.
In addition, the word "comprising" evidently does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or computer devices recited in a computer device claim may also be implemented by a single unit or computer device through software or hardware. Words such as "first" and "second" are used as names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or replaced with equivalents without departing from their spirit and scope.

Claims (20)

  1. An epidemic grading prediction method, characterized in that the method comprises:
    (1) establishing an epidemic prediction model;
    (2) training the epidemic prediction model using first training data;
    (3) predicting test data using the epidemic prediction model and determining whether the prediction result for the test data satisfies a preset condition; if the prediction result for the test data satisfies the preset condition, performing (5);
    (4) if the prediction result for the test data does not satisfy the preset condition, fine-tuning the epidemic prediction model and then performing (5);
    (5) using second training data to determine a grading time window size for epidemic risk level determination based on the epidemic prediction model, such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points determined, based on the epidemic prediction model, to be at the low risk level or below fall within real non-epidemic periods;
    (6) using the epidemic prediction model to predict each time point within the grading time window preceding a time point to be tested, and dividing the grading time window preceding the time point to be tested into epidemic periods and non-epidemic periods;
    (7) calculating the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (8) calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (9) determining the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  2. The method according to claim 1, characterized in that step (5) comprises:
    using the epidemic prediction model to predict each time point within the grading time window preceding a preset time point, and dividing the grading time window preceding the preset time point into epidemic periods and non-epidemic periods;
    calculating, from the second training data, the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point;
    calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point;
    determining the epidemic risk level of the preset time point according to the epidemic risk level division thresholds;
    if the epidemic risk level of the preset time point is the medium risk level or above, determining whether the preset time point is within a real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below, determining whether the preset time point is within a real non-epidemic period;
    if the epidemic risk level of the preset time point is the medium risk level or above and the preset time point is within a real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below and the preset time point is within a real non-epidemic period, adjusting the grading time window size.
  3. The method according to claim 1, characterized in that the epidemic prediction model comprises a cumulative sum (CUSUM) prediction model, an exponentially weighted moving average (EWMA) prediction model, and a moving percentile prediction model.
  4. The method according to any one of claims 1 to 3, characterized in that step (2) comprises:
    using the epidemic prediction model to predict the first training data, comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods, and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result.
  5. The method according to claim 4, characterized in that comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result comprises:
    calculating the accuracy, specificity, and timeliness of the epidemic prediction model's prediction result for the first training data, and adjusting or selecting the parameters of the epidemic prediction model based on the accuracy, specificity, and timeliness.
  6. The method according to any one of claims 1 to 3, characterized in that the epidemic monitoring data is obtained from monitoring points by establishing, in a preset area, an epidemic monitoring network composed of a plurality of the monitoring points.
  7. The method according to claim 6, characterized in that the monitoring points comprise medical institutions, schools and childcare institutions, and pharmacies that meet a preset headcount or scale.
  8. An epidemic grading prediction apparatus, characterized in that the apparatus comprises:
    an establishing unit, configured to establish an epidemic prediction model;
    a training unit, configured to train the epidemic prediction model using first training data;
    a testing unit, configured to predict test data using the epidemic prediction model, determine whether the prediction result for the test data satisfies a preset condition, and, if the prediction result for the test data does not satisfy the preset condition, fine-tune the epidemic prediction model;
    a determining unit, configured to use second training data to determine a grading time window size for epidemic risk level determination based on the epidemic prediction model, such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points determined, based on the epidemic prediction model, to be at the low risk level or below fall within real non-epidemic periods;
    a prediction unit, configured to: use the epidemic prediction model to predict each time point within the grading time window preceding a time point to be tested, and divide the grading time window preceding the time point to be tested into epidemic periods and non-epidemic periods; calculate the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested; calculate epidemic risk level division thresholds from that mean and standard deviation; and determine the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  9. A computer device, characterized in that the computer device comprises a memory and a processor, the memory being configured to store at least one computer readable instruction, and the processor being configured to execute the at least one computer readable instruction to implement the following steps:
    (1) establishing an epidemic prediction model;
    (2) training the epidemic prediction model using first training data;
    (3) predicting test data using the epidemic prediction model and determining whether the prediction result for the test data satisfies a preset condition; if the prediction result for the test data satisfies the preset condition, performing (5);
    (4) if the prediction result for the test data does not satisfy the preset condition, fine-tuning the epidemic prediction model and then performing (5);
    (5) using second training data to determine a grading time window size for epidemic risk level determination based on the epidemic prediction model, such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points determined, based on the epidemic prediction model, to be at the low risk level or below fall within real non-epidemic periods;
    (6) using the epidemic prediction model to predict each time point within the grading time window preceding a time point to be tested, and dividing the grading time window preceding the time point to be tested into epidemic periods and non-epidemic periods;
    (7) calculating the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (8) calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (9) determining the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  10. The computer device according to claim 9, characterized in that step (5) comprises:
    using the epidemic prediction model to predict each time point within the grading time window preceding a preset time point, and dividing the grading time window preceding the preset time point into epidemic periods and non-epidemic periods;
    calculating, from the second training data, the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point;
    calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the preset time point;
    determining the epidemic risk level of the preset time point according to the epidemic risk level division thresholds;
    if the epidemic risk level of the preset time point is the medium risk level or above, determining whether the preset time point is within a real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below, determining whether the preset time point is within a real non-epidemic period;
    if the epidemic risk level of the preset time point is the medium risk level or above and the preset time point is within a real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below and the preset time point is within a real non-epidemic period, adjusting the grading time window size.
  11. The computer device according to claim 9, characterized in that the epidemic prediction model comprises a cumulative sum (CUSUM) prediction model, an exponentially weighted moving average (EWMA) prediction model, and a moving percentile prediction model.
  12. The computer device according to any one of claims 9 to 11, characterized in that step (2) comprises:
    using the epidemic prediction model to predict the first training data, comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods, and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result.
  13. The computer device according to claim 12, characterized in that comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result comprises:
    calculating the accuracy, specificity, and timeliness of the epidemic prediction model's prediction result for the first training data, and adjusting or selecting the parameters of the epidemic prediction model based on the accuracy, specificity, and timeliness.
  14. The computer device according to any one of claims 9 to 11, characterized in that the epidemic monitoring data is obtained from monitoring points by establishing, in a preset area, an epidemic monitoring network composed of a plurality of the monitoring points.
  15. A non-volatile readable storage medium, characterized in that the non-volatile readable storage medium stores at least one computer readable instruction which, when executed by a processor, implements the following steps:
    (1) establishing an epidemic prediction model;
    (2) training the epidemic prediction model using first training data;
    (3) predicting test data using the epidemic prediction model and determining whether the prediction result for the test data satisfies a preset condition; if the prediction result for the test data satisfies the preset condition, performing (5);
    (4) if the prediction result for the test data does not satisfy the preset condition, fine-tuning the epidemic prediction model and then performing (5);
    (5) using second training data to determine a grading time window size for epidemic risk level determination based on the epidemic prediction model, such that time points determined, based on the epidemic prediction model, to be at the medium risk level or above fall within real epidemic periods, and time points determined, based on the epidemic prediction model, to be at the low risk level or below fall within real non-epidemic periods;
    (6) using the epidemic prediction model to predict each time point within the grading time window preceding a time point to be tested, and dividing the grading time window preceding the time point to be tested into epidemic periods and non-epidemic periods;
    (7) calculating the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (8) calculating epidemic risk level division thresholds from the mean and standard deviation of the epidemic monitoring data of the non-epidemic periods within the grading time window preceding the time point to be tested;
    (9) determining the epidemic risk level of the time point to be tested according to the epidemic risk level division thresholds.
  16. The storage medium of claim 15, wherein step (5) comprises:
    using the epidemic prediction model to predict each time point within the period of the grading time window size preceding a preset time point, and dividing that period into an epidemic period and a non-epidemic period;
    calculating, from the second training data, the mean and the standard deviation of the epidemic monitoring data of the non-epidemic period within the period of the grading time window size preceding the preset time point;
    calculating epidemic risk level thresholds from the mean and the standard deviation of the epidemic monitoring data of the non-epidemic period within the period of the grading time window size preceding the preset time point;
    determining the epidemic risk level of the preset time point according to the epidemic risk level thresholds;
    if the epidemic risk level of the preset time point is the medium risk level or above, determining whether the preset time point falls within the real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below the medium risk level, determining whether the preset time point falls within the real non-epidemic period;
    if the epidemic risk level of the preset time point is the medium risk level or above and the preset time point falls within the real epidemic period, or, if the epidemic risk level of the preset time point is the low risk level or below the medium risk level and the preset time point falls within the real non-epidemic period, adjusting the grading time window size.
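One way to read the window-size determination of claim 16 is as a search over candidate window sizes, scored by how often the grade of each preset time point agrees with the real epidemic / non-epidemic label. The Python sketch below follows that reading. The candidate list, the single "medium or above" threshold mean + 2 * std, the agreement score, and all names are assumptions and are not taken from the application.

```python
from statistics import mean, pstdev


def is_medium_or_above(values, model_flags, t, window, k_medium=2.0):
    """Grade time point t; return None when the baseline is too small.

    The baseline is the data the model marked as non-epidemic within the
    `window` time points before t. The threshold form is an assumption.
    """
    start = max(0, t - window)
    baseline = [v for v, f in zip(values[start:t], model_flags[start:t]) if not f]
    if len(baseline) < 2:
        return None
    return values[t] > mean(baseline) + k_medium * pstdev(baseline)


def choose_window(values, model_flags, truth, candidates, k_medium=2.0):
    """Return (window, agreement) for the candidate that best matches `truth`.

    `truth[t]` is True when t lies in the real epidemic period. Ties keep
    the earliest candidate in `candidates`.
    """
    best_window, best_score = None, -1.0
    for window in candidates:
        matches = []
        for t in range(window, len(values)):
            graded = is_medium_or_above(values, model_flags, t, window, k_medium)
            if graded is not None:
                matches.append(graded == truth[t])
        if matches:
            score = sum(matches) / len(matches)
            if score > best_score:
                best_window, best_score = window, score
    return best_window, best_score
```

A grid search is only one possible adjustment strategy; an incremental scheme that widens or narrows the window after each disagreement would fit the claim wording equally well.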
  17. The storage medium of claim 15, wherein the epidemic prediction model comprises a cumulative sum (CUSUM) prediction model, an exponentially weighted moving average (EWMA) prediction model, and a moving percentile prediction model.
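The three model types named in claim 17 are commonly used as aberration detectors on surveillance time series. The Python sketch below shows a textbook form of each, flagging time points that look like part of an epidemic period. The parameter names (k, h, lam, L, window, pct), the use of the asymptotic EWMA control limit, and the nearest-rank percentile are assumptions chosen for illustration and are not specified in the application.

```python
import math


def cusum_flags(values, mu, k, h):
    """One-sided CUSUM: S_t = max(0, S_{t-1} + x_t - mu - k); flag when S_t > h."""
    s = 0.0
    flags = []
    for x in values:
        s = max(0.0, s + (x - mu - k))
        flags.append(s > h)
    return flags


def ewma_flags(values, mu, sigma, lam=0.5, L=3.0):
    """EWMA: z_t = lam * x_t + (1 - lam) * z_{t-1}, starting from z_0 = mu.

    A point is flagged when z_t exceeds the asymptotic upper control limit
    mu + L * sigma * sqrt(lam / (2 - lam)).
    """
    limit = mu + L * sigma * math.sqrt(lam / (2.0 - lam))
    z = mu
    flags = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        flags.append(z > limit)
    return flags


def moving_percentile_flags(values, window, pct):
    """Flag x_t when it exceeds the pct-th percentile of the previous `window` values.

    Uses the nearest-rank percentile; points with fewer than `window`
    predecessors are never flagged.
    """
    flags = []
    for t, x in enumerate(values):
        if t < window:
            flags.append(False)
            continue
        history = sorted(values[t - window:t])
        rank = max(1, math.ceil(pct / 100.0 * window))
        flags.append(x > history[rank - 1])
    return flags
```

In each case the baseline parameters (mu, sigma, or the history window) would be estimated from non-epidemic data, and the tunable parameters are the ones that parameter selection on training data would adjust.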
  18. The storage medium of any one of claims 15 to 17, wherein step (2) comprises:
    using the epidemic prediction model to predict the first training data, comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods, and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result.
  19. The storage medium of claim 18, wherein comparing the prediction result for the first training data with the real division into epidemic and non-epidemic periods, and adjusting or selecting the parameters of the epidemic prediction model according to the comparison result, comprises:
    calculating the accuracy, specificity, and timeliness of the prediction result of the epidemic prediction model for the first training data, and adjusting or selecting the parameters of the epidemic prediction model based on the accuracy, specificity, and timeliness.
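The three evaluation measures in claim 19 can be sketched in Python as follows. Accuracy and specificity use their standard definitions. The claim does not define timeliness, so this sketch assumes it is the delay, in time points, between the real epidemic onset and the first time point the model flags at or after that onset. All function names are hypothetical.

```python
def accuracy(pred, truth):
    """Fraction of time points where the predicted label equals the real label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def specificity(pred, truth):
    """True negatives divided by all real non-epidemic time points."""
    negatives = [p for p, t in zip(pred, truth) if not t]
    if not negatives:
        raise ValueError("no non-epidemic time points in truth")
    return sum(not p for p in negatives) / len(negatives)


def detection_delay(pred, truth):
    """Assumed timeliness measure: time points from real onset to first detection.

    Returns None when there is no real epidemic period or the model never
    flags a time point at or after the onset.
    """
    if True not in truth:
        return None
    onset = truth.index(True)
    for t in range(onset, len(pred)):
        if pred[t]:
            return t - onset
    return None
```

Parameter selection could then, for example, keep the parameter set with the highest accuracy among those whose specificity and delay meet preset limits; that selection rule is likewise an assumption.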
  20. The storage medium of any one of claims 15 to 17, wherein the epidemic monitoring data is obtained from the monitoring points of an epidemic monitoring network that is established in a preset area and composed of a plurality of monitoring points.
PCT/CN2018/099649 2018-04-11 2018-08-09 Epidemic grading and prediction method and apparatus, computer apparatus, and readable storage medium WO2019196281A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2019572829A JP6893259B2 (en) 2018-04-11 2018-08-09 Infectious disease classification prediction method by computer device, infectious disease classification prediction device, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810322432.2 2018-04-11
CN201810322432.2A CN108597617B (en) 2018-04-11 2018-04-11 Epidemic disease grading prediction method and device, computer device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2019196281A1 true WO2019196281A1 (en) 2019-10-17

Family

ID=63622046

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/099649 WO2019196281A1 (en) 2018-04-11 2018-08-09 Epidemic grading and prediction method and apparatus, computer apparatus, and readable storage medium

Country Status (3)

Country Link
JP (1) JP6893259B2 (en)
CN (1) CN108597617B (en)
WO (1) WO2019196281A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611297A (en) * 2020-05-21 2020-09-01 中南大学 Propagation model establishing method considering parameter time-varying property and prediction method thereof
CN112117010A (en) * 2020-07-13 2020-12-22 北京大瑞集思技术有限公司 Intelligent infectious disease early warning system and management platform
CN112509706A (en) * 2020-11-16 2021-03-16 鹏城实验室 Infectious disease early warning method, infectious disease early warning device, infectious disease early warning equipment and computer readable storage medium
CN112669984A (en) * 2020-12-30 2021-04-16 华南师范大学 Infectious disease cooperative progressive monitoring and early warning coping method based on big data artificial intelligence
CN113496780A (en) * 2020-03-19 2021-10-12 北京中科闻歌科技股份有限公司 Method, device, server and storage medium for predicting number of confirmed diagnoses of infectious diseases
CN113793689A (en) * 2021-08-06 2021-12-14 兰州理工大学 Epidemic disease monitoring method based on accumulation and control chart
CN114724730A (en) * 2022-02-18 2022-07-08 平安国际智慧城市科技股份有限公司 Infectious disease early warning method and device based on artificial intelligence, electronic equipment and medium
CN115798734A (en) * 2023-01-09 2023-03-14 杭州杏林信息科技有限公司 New emergent infectious disease prevention and control method and device based on big data and storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109979601B (en) * 2018-10-12 2020-10-02 东阳市菊苏科技有限公司 Influenza prediction camera with automatic learning function
CN109473180A (en) * 2018-11-20 2019-03-15 河南省疾病预防控制中心 A kind of Disease Control Agency information system based on B/S framework
CN109817342A (en) * 2019-01-04 2019-05-28 平安科技(深圳)有限公司 Parameter regulation means, device, equipment and the storage medium of popular season prediction model
CN110047593A (en) * 2019-04-12 2019-07-23 平安科技(深圳)有限公司 Disease popularity season grade determination method, apparatus, equipment and readable storage medium storing program for executing
CN111696682A (en) * 2020-05-26 2020-09-22 平安科技(深圳)有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113707336A (en) * 2021-08-26 2021-11-26 平安国际智慧城市科技股份有限公司 Infectious disease prevention and treatment early warning method, device, equipment and medium based on data analysis
CN117095831B (en) * 2023-10-17 2024-01-16 厦门畅享信息技术有限公司 Method, system, medium and electronic equipment for monitoring sudden epidemic trend

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1881227A (en) * 2006-05-16 2006-12-20 中国人民解放军第三军医大学 Intelligent analytical model technology for diagnosing epidemic situation and classifying harmfulness degree of contagious disease
CN101894309A (en) * 2009-11-05 2010-11-24 南京医科大学 Epidemic situation predicting and early warning method of infectious diseases
US20140095417A1 (en) * 2012-10-01 2014-04-03 Frederick S.M. Herz Sdi (sdi for epi-demics)
CN105095614A (en) * 2014-04-18 2015-11-25 国际商业机器公司 Method and device for updating prediction model
CN107680676A (en) * 2017-09-26 2018-02-09 电子科技大学 A kind of gestational diabetes Forecasting Methodology based on electronic health record data-driven

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101794342B (en) * 2009-09-30 2015-09-09 中国人民解放军防化指挥工程学院 A kind of epidemic Forecasting Methodology based on considering quarantine measures
JP2011128935A (en) * 2009-12-18 2011-06-30 Noriaki Aoki Infection disease prediction system
WO2013120199A1 (en) * 2012-02-13 2013-08-22 Kamran Khan Warning system for infectious diseases and method therefor
CN103310083B (en) * 2012-03-09 2016-08-03 李晓松 A kind of infectious disease aggregation detection and early warning system
CN103390091B (en) * 2012-05-08 2016-04-06 中国人民解放军防化学院 A kind of epidemic optimal control method
JP6823379B2 (en) * 2016-04-26 2021-02-03 シスメックス株式会社 Monitoring methods, information processing equipment, information processing systems, and computer programs
CN107871538A (en) * 2016-12-19 2018-04-03 平安科技(深圳)有限公司 Big data Forecasting Methodology and system based on macroscopical factor
CN107220482B (en) * 2017-05-09 2019-09-17 清华大学 Respiratory infectious disease risk evaluating system and appraisal procedure
CN107688872A (en) * 2017-08-20 2018-02-13 平安科技(深圳)有限公司 Forecast model establishes device, method and computer-readable recording medium

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496780A (en) * 2020-03-19 2021-10-12 北京中科闻歌科技股份有限公司 Method, device, server and storage medium for predicting number of confirmed diagnoses of infectious diseases
CN113496780B (en) * 2020-03-19 2023-11-03 北京中科闻歌科技股份有限公司 Method, device, server and storage medium for predicting number of infectious disease diagnostician
CN111611297B (en) * 2020-05-21 2023-09-15 中南大学 Propagation model establishment method considering parameter time variability and prediction method thereof
CN111611297A (en) * 2020-05-21 2020-09-01 中南大学 Propagation model establishing method considering parameter time-varying property and prediction method thereof
CN112117010A (en) * 2020-07-13 2020-12-22 北京大瑞集思技术有限公司 Intelligent infectious disease early warning system and management platform
CN112509706A (en) * 2020-11-16 2021-03-16 鹏城实验室 Infectious disease early warning method, infectious disease early warning device, infectious disease early warning equipment and computer readable storage medium
CN112509706B (en) * 2020-11-16 2023-06-06 鹏城实验室 Infectious disease early warning method, device, equipment and computer readable storage medium
CN112669984A (en) * 2020-12-30 2021-04-16 华南师范大学 Infectious disease cooperative progressive monitoring and early warning coping method based on big data artificial intelligence
CN112669984B (en) * 2020-12-30 2023-09-12 华南师范大学 Infectious disease collaborative progressive monitoring and early warning response method based on big data artificial intelligence
CN113793689A (en) * 2021-08-06 2021-12-14 兰州理工大学 Epidemic disease monitoring method based on accumulation and control chart
CN113793689B (en) * 2021-08-06 2023-10-24 兰州理工大学 Epidemic disease monitoring method based on accumulation and control diagram
CN114724730A (en) * 2022-02-18 2022-07-08 平安国际智慧城市科技股份有限公司 Infectious disease early warning method and device based on artificial intelligence, electronic equipment and medium
CN114724730B (en) * 2022-02-18 2024-04-30 深圳平安智慧医健科技有限公司 Infectious disease early warning method and device based on artificial intelligence, electronic equipment and medium
CN115798734B (en) * 2023-01-09 2023-07-14 杭州杏林信息科技有限公司 New burst infectious disease prevention and control method and device based on big data and storage medium
CN115798734A (en) * 2023-01-09 2023-03-14 杭州杏林信息科技有限公司 New emergent infectious disease prevention and control method and device based on big data and storage medium

Also Published As

Publication number Publication date
CN108597617B (en) 2022-05-20
JP6893259B2 (en) 2021-06-23
CN108597617A (en) 2018-09-28
JP2020527786A (en) 2020-09-10

Similar Documents

Publication Publication Date Title
WO2019196281A1 (en) Epidemic grading and prediction method and apparatus, computer apparatus, and readable storage medium
McGowan et al. Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016
Beyer et al. New spatially continuous indices of redlining and racial bias in mortgage lending: links to survival after breast cancer diagnosis and implications for health disparities research
Aziz et al. Maternal outcomes by race during postpartum readmissions
Chang et al. Assessment of critical exposure and outcome windows in time-to-event analysis with application to air pollution and preterm birth study
WO2019196286A1 (en) Illness prediction method and device, computer device, and readable storage medium
US11276495B2 (en) Systems and methods for predicting multiple health care outcomes
Plotkin et al. Tracking facility-based perinatal deaths in Tanzania: Results from an indicator validation assessment
Neumann et al. ICER’s revised value assessment framework for 2017–2019: a critique
WO2019196283A1 (en) Epidemic disease prediction method, computer device and non-volatile readable storage medium
Szklo et al. Beyond the basics
Guarnizo-Herreño et al. Socioeconomic inequalities in birth outcomes: An 11-year analysis in Colombia
Sania et al. The K nearest neighbor algorithm for imputation of missing longitudinal prenatal alcohol data
Barrera et al. County-level associations between pregnancy-related mortality ratios and contextual sociospatial indicators
Alsolmi Investigating cancer patients characteristics using a newly generated family of distributions
Brembilla et al. Pregnancy vulnerability in urban areas: a pragmatic approach combining behavioral, medico-obstetrical, socio-economic and environmental factors
Ansell et al. A new data integration framework for Covid-19 social media information
WO2020054115A1 (en) Analysis system and analysis method
WO2021159747A1 (en) Regional health construction process evaluation method, apparatus and device, and storage medium
Bueno et al. Spatial analysis of the epidemiological risk of leprosy in the municipalities of Minas Gerais
Cox Jr What is an exposure-response curve?
Liao et al. Development, validation, and evaluation of prediction models to identify individuals at high risk of lung cancer for screening in the English primary care population using the QResearch® database: research protocol and statistical analysis plan
Stone et al. Fetal monitoring from 39 weeks’ gestation to identify South Asian-born women at risk of perinatal compromise: a retrospective cohort study
Dolezalova et al. Feasibility of using intermittent active monitoring of vital signs by smartphone users to predict SARS-CoV-2 PCR positivity
Rokicki et al. Racial and Socioeconomic Disparities in Preconception Health Risk Factors and Access to Care

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18914101

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019572829

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19/01/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18914101

Country of ref document: EP

Kind code of ref document: A1