WO2021218207A1 - Intra-urban dengue fever spatio-temporal forecasting method and system, and electronic device - Google Patents

Info

Publication number
WO2021218207A1
Authority
WO
WIPO (PCT)
Prior art keywords
dengue fever
city
data
prediction
township
Prior art date
Application number
PCT/CN2020/139657
Other languages
French (fr)
Chinese (zh)
Inventor
刘康
尹凌
奚桂锴
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2021218207A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • The invention relates to an intra-urban dengue fever spatio-temporal prediction method, system, and electronic device.
  • In the absence of an effective vaccine, vector control (such as spraying insecticide to kill adult mosquitoes and removing breeding sites of Aedes mosquitoes) is still the main method of dengue fever prevention and control.
  • Accurate prediction and early warning of the number and location of future dengue fever cases has therefore become the key to prevention and control.
  • The present invention provides a method for spatio-temporal prediction of dengue fever within a city.
  • The method includes the following steps: a. collect and preprocess dengue-related data within the city, including dengue fever case data, meteorological data, population distribution data, and township vector files of the studied city; b. construct a graph structure that reflects the spatial relationships among the city's internal regions; c. select the input features for dengue spatio-temporal prediction; d. construct and train a GCN model based on the preprocessed intra-city dengue-related data, the constructed graph structure, and the selected input features.
  • The method further includes step e: evaluating the prediction performance of the GCN model.
  • Step a specifically includes:
  • Preprocessing the collected dengue fever case data: convert the home address of each case to latitude and longitude coordinates; determine the township where each case is located; and, according to the onset date of each case, count the number of cases in each township in each week, forming a W*N case-count matrix, where W is the number of weeks and N is the number of townships;
  • Preprocessing the collected meteorological data: obtain the daily average temperature and rainfall recorded by all meteorological observation stations in the city, and spatially interpolate each of them using the kriging method; aggregate the interpolated data to the township level by week, and compute the average temperature and cumulative rainfall of each township in each week, forming a W*N average temperature matrix and a W*N cumulative rainfall matrix;
  • Preprocessing the collected population distribution data: aggregate the population distribution data to the township level to obtain the total population of each township.
  • the step b specifically includes the following steps:
  • a graph structure is constructed.
  • Said step c specifically includes:
  • the GCN model is composed of one input layer, at least two hidden layers and one output layer; after the at least two hidden layers, the rectified linear function ReLU and the hyperbolic tangent function tanh are respectively used as activation functions.
  • the training of the GCN model in step d includes:
  • the step e specifically includes:
  • The hit rate of the prediction result in week t is defined as the ratio N_{m,t} / N_t, where:
  • N_{m,t} is obtained by ranking all townships in the city from highest to lowest by their predicted number of cases in week t and summing the actual number of cases in the top m% (the high-risk streets and townships); N_t is the total number of actual cases in the city in week t.
  • the present invention provides a space-time prediction system for dengue fever in a city.
  • the system includes a preprocessing unit, a graph structure building unit, a selection unit, and a model building unit.
  • the preprocessing unit is used to collect data related to dengue fever in the city and perform preprocessing.
  • The data related to dengue fever in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files of the studied city. The graph structure construction unit is used to construct a graph structure reflecting the spatial relationships among the city's internal regions; the selection unit is used to select input features for the spatio-temporal prediction of dengue fever; the model construction unit is used to construct and train the GCN model based on the preprocessed intra-city dengue-related data, the constructed graph structure, and the selected input features.
  • The system further includes an evaluation unit for evaluating the prediction performance of the GCN model.
  • the present invention also provides an electronic device, including:
  • at least one processor; and
  • a memory communicatively connected with the at least one processor; wherein,
  • the memory stores instructions that can be executed by the one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the urban interior described in any one of 1 to 8 above.
  • Step a: Collect and preprocess the dengue-related data in the city.
  • The dengue-related data in the city includes: dengue case data, meteorological data, population distribution data, and township vector files of the studied city;
  • Step b: Construct a graph structure that reflects the spatial relationships among the city's internal regions;
  • Step c: Select the input features for the spatio-temporal prediction of dengue fever;
  • Step d: Construct and train a GCN model according to the preprocessed data related to dengue fever in the city, the constructed graph structure, and the selected input features, so as to use the GCN model to perform dengue fever spatio-temporal prediction.
  • The present invention targets each region within the city and achieves prediction at a finer spatial scale.
  • Taking the spatial relationships between regions into account helps capture the characteristics of dengue fever within the city, effectively improves prediction performance, and raises the level of precise prevention and control of dengue fever.
  • FIG. 1 is a flowchart of the method for predicting dengue fever within a city according to the present invention.
  • FIG. 2 is a schematic diagram of the process of constructing the spatial relationships within a city according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a graph convolutional neural network model provided by an embodiment of the present invention.
  • FIG. 4 is a hardware architecture diagram of the intra-city dengue fever spatio-temporal prediction system of the present invention.
  • FIG. 5 is a schematic diagram of the hardware device structure for the method for predicting dengue fever in a city according to an embodiment of the present invention.
  • FIG. 6 is a schematic comparison of dengue fever prediction performance at the township scale in Guangzhou City in Example 1 of the present invention.
  • This embodiment is explained with the prediction of a township scale.
  • the present invention is also applicable to urban internal spatial units divided in other ways, such as administrative districts, traffic analysis districts, grids, and the like.
  • FIG. 1 is a flowchart of a preferred embodiment of the method for predicting dengue fever within a city according to the present invention.
  • Step S1: collect data related to dengue fever in the city and preprocess it. Specifically:
  • the data related to dengue fever in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files (shapefile) of the studied city.
  • the meteorological data includes daily average temperature and rainfall collected by meteorological monitoring stations in the city.
  • the dengue fever case data is obtained from the disease prevention and control center of the country/province/city, and the dengue fever case data includes: the onset date and home address of each case; the meteorological data is obtained from the national/provincial/city meteorological
  • the said population distribution data is obtained from the open source global population data project WorldPop website (https://www.worldpop.org/).
  • The preprocessing of the collected dengue case data includes: first, use geocoding to convert the home address of each case into latitude and longitude coordinates, and import all case points into ArcGIS according to these coordinates to obtain a point-type vector file; then, use the Spatial Join tool in ArcGIS to associate the cases (point-type vector file) with the townships (polygon-type vector file, that is, the township vector file) to determine the township where each case is located; finally, according to the onset date of each case, count the number of cases in each township in each week, forming a W*N case-count matrix, where W is the number of weeks and N is the number of townships.
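The counting step at the end of this preprocessing can be sketched in a few lines. This is a minimal illustration, not the ArcGIS workflow described above: it assumes geocoding and the spatial join have already produced an (onset date, township) pair for each case, and the function name `weekly_case_matrix`, the 0-based week index, and the sample records are all hypothetical.

```python
from datetime import date
import numpy as np

def weekly_case_matrix(cases, townships, study_start, n_weeks):
    """Count cases per (week, township); returns a W x N integer array."""
    col = {name: j for j, name in enumerate(townships)}
    matrix = np.zeros((n_weeks, len(townships)), dtype=int)
    for onset, township in cases:
        week = (onset - study_start).days // 7  # 0-based week index (assumption)
        if 0 <= week < n_weeks and township in col:
            matrix[week, col[township]] += 1
    return matrix

# Hypothetical records: (onset date, township assigned by the spatial join).
cases = [
    (date(2015, 1, 5), "A"),
    (date(2015, 1, 6), "B"),
    (date(2015, 1, 14), "A"),
    (date(2015, 1, 15), "A"),
]
case_matrix = weekly_case_matrix(cases, ["A", "B", "C"], date(2015, 1, 1), n_weeks=3)
```

Townships with no cases keep a zero column, so the matrix always has one column per township.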
  • The preprocessing of the collected meteorological data includes: obtain the daily average temperature and rainfall recorded by all meteorological observation stations in the city, and first spatially interpolate each of them using the kriging method; then aggregate the interpolated data to the township level by week, and compute the average temperature and cumulative rainfall of each township in each week, forming a W*N average temperature matrix and a W*N cumulative rainfall matrix.
  • the spatial interpolation and data aggregation are processed in batches using the ArcPy toolkit of the Python language.
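The weekly aggregation can be illustrated independently of ArcPy. The sketch below assumes that kriging and the township-level averaging have already produced one value per township per day; the function name `aggregate_weekly`, the D x N array layout, and the choice to drop an incomplete trailing week are assumptions made for the example.

```python
import numpy as np

def aggregate_weekly(daily_temp, daily_rain):
    """Turn D x N daily arrays (days x townships) into W x N weekly matrices.

    Temperature is averaged over each week; rainfall is summed.
    """
    n_days, n_towns = daily_temp.shape
    n_weeks = n_days // 7  # an incomplete trailing week is dropped (assumption)
    used = n_weeks * 7
    weekly_temp = daily_temp[:used].reshape(n_weeks, 7, n_towns).mean(axis=1)
    weekly_rain = daily_rain[:used].reshape(n_weeks, 7, n_towns).sum(axis=1)
    return weekly_temp, weekly_rain

# Hypothetical data: 14 days, 2 townships.
daily_temp = np.arange(28, dtype=float).reshape(14, 2)
daily_rain = np.ones((14, 2))
weekly_temp, weekly_rain = aggregate_weekly(daily_temp, daily_rain)
```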
  • The preprocessing of the collected population distribution data includes: in this embodiment, the 2015 population distribution data at 100-meter resolution is downloaded from the WorldPop website and aggregated to the township level using ArcGIS software to obtain the total population of each township.
  • Step S2: construct a graph structure reflecting the spatial relationships among the city's internal regions according to the adjacency relationships between the regions.
  • Step S2 includes:
  • Step 201: Use the Spatial Join function of ArcGIS software to obtain the adjacency relationships between townships from the township vector file.
  • Step 202: Treat each township as a node and each adjacency relationship between townships as an edge, and construct a graph structure. Please refer to FIG. 2 for a schematic diagram of the construction process of structures A and B in this embodiment.
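Steps 201 and 202 amount to turning a list of neighboring township pairs into an undirected graph. A minimal sketch, assuming the pairs have already been exported from the spatial join; the function name `adjacency_matrix` and the township names are hypothetical.

```python
import numpy as np

def adjacency_matrix(townships, neighbor_pairs):
    """Build a symmetric N x N 0/1 adjacency matrix from neighboring township pairs."""
    idx = {name: i for i, name in enumerate(townships)}
    A = np.zeros((len(townships), len(townships)))
    for u, v in neighbor_pairs:
        if u != v:  # self-pairs are ignored; self-loops can be added later if needed
            A[idx[u], idx[v]] = 1.0
            A[idx[v], idx[u]] = 1.0
    return A

# Hypothetical townships: A borders B, B borders C.
A = adjacency_matrix(["A", "B", "C"], [("A", "B"), ("B", "C")])
```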
  • Step S3: select input features for dengue fever prediction. Specifically:
  • This embodiment selects four types of features that are commonly used in the literature and closely related to the spread and outbreak of dengue fever: the number of cases in the current week and the past week, the average temperature, the cumulative rainfall, and the population. As shown in Table 1, there are 13 features in total. The average temperature and the cumulative rainfall are related to the survival suitability of the mosquito vectors; because dengue fever is an infectious disease, the number of future cases is also closely related to the number of past cases and to the population.
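Table 1 is not reproduced in this text, so the exact composition of the 13 features is not known here. Purely for illustration, the sketch below assumes one breakdown that happens to give 13: four weekly values (the current week and the three before it) of case count, average temperature, and cumulative rainfall, plus the township population. The function name `build_features` and the `n_lags` parameter are hypothetical.

```python
import numpy as np

def build_features(cases, temp, rain, population, n_lags=4):
    """Stack lagged W x N series into a (W - n_lags + 1) x N x (3 * n_lags + 1) array.

    ASSUMPTION: 4 lags each of cases, temperature, and rainfall plus population
    gives 13 features; the patent's Table 1 may define them differently.
    """
    n_weeks = cases.shape[0]
    samples = []
    for t in range(n_lags - 1, n_weeks):
        window = slice(t - n_lags + 1, t + 1)  # weeks t-3 .. t when n_lags == 4
        feats = np.concatenate(
            [cases[window].T, temp[window].T, rain[window].T, population[:, None]],
            axis=1,
        )  # N x 13 when n_lags == 4
        samples.append(feats)
    return np.stack(samples)

# Hypothetical inputs: 6 weeks, 3 townships.
cases = np.arange(18, dtype=float).reshape(6, 3)
temp = np.full((6, 3), 25.0)
rain = np.full((6, 3), 10.0)
population = np.array([1000.0, 2000.0, 3000.0])
X = build_features(cases, temp, rain, population)
```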
  • Step S4: construct and train the GCN model according to the preprocessed data related to dengue fever in the city, the constructed graph structure, and the selected input features.
  • the step S4 includes:
  • Step 401 Model construction.
  • The graph convolutional neural network model used in this embodiment was proposed by Thomas N. Kipf and Max Welling in 2016, and its basic structure is shown in FIG. 3.
  • The model consists of an input layer, two hidden layers (more hidden layers can also be set), and an output layer; the rectified linear function ReLU and the hyperbolic tangent function tanh are used, respectively, as the activation functions after the two hidden layers.
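A forward pass through such a network can be sketched with NumPy. The propagation rule used here is the one from Kipf and Welling's GCN, where each layer computes an activation of D^-1/2 (A + I) D^-1/2 H W. The hidden sizes (16 and 8), the random weights, and the choice of a linear graph-convolution output layer are assumptions for the example; the text does not specify them.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(A_norm, X, W1, W2, W_out):
    """Two hidden graph-convolution layers (ReLU, then tanh) and a linear output."""
    H1 = np.maximum(A_norm @ X @ W1, 0.0)  # hidden layer 1 with ReLU
    H2 = np.tanh(A_norm @ H1 @ W2)         # hidden layer 2 with tanh
    return (A_norm @ H2 @ W_out).ravel()   # one predicted value per township

# Hypothetical example: 3 townships in a chain, 13 input features each.
rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X = rng.normal(size=(3, 13))
W1 = rng.normal(size=(13, 16))  # hidden sizes 16 and 8 are arbitrary choices
W2 = rng.normal(size=(16, 8))
W_out = rng.normal(size=(8, 1))
A_norm = normalize_adjacency(A)
pred = gcn_forward(A_norm, X, W1, W2, W_out)
```

Because every layer multiplies by the normalized adjacency matrix, each township's prediction mixes in features from its neighbors, which is how the spatial relationships enter the model.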
  • Step 402: Model training. According to the input and output requirements of the GCN model and the different prediction windows, organize K data sets; each data set is divided into a training set and a validation set in a certain proportion. In this embodiment, the first 75% of the weeks in the data set are used for training and the last 25% are used for validation; the training set under each prediction window is used to train the constructed GCN model.
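Organizing one data set per prediction window and splitting it chronologically might look like the sketch below. It assumes a feature array `X` of shape T x N x F and a case-count array `y` of shape T x N aligned on the same weeks; the function names and the rounding of the 75% cut with `int()` are choices made for the example.

```python
import numpy as np

def make_window_dataset(X, y, k):
    """Pair the features of week t with the case counts of week t + k (k >= 1)."""
    return X[:-k], y[k:]

def chronological_split(X, y, train_fraction=0.75):
    """Earliest weeks go to training, the remaining weeks to validation (no shuffling)."""
    n_train = int(len(X) * train_fraction)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

# Hypothetical data: 20 weeks, 3 townships, 13 features.
X = np.zeros((20, 3, 13))
y = np.arange(60).reshape(20, 3)

datasets = {}
for k in range(1, 9):  # prediction windows k = 1, ..., 8
    X_k, y_k = make_window_dataset(X, y, k)
    datasets[k] = chronological_split(X_k, y_k)
```

Keeping the split chronological means the validation weeks always lie after the training weeks, so the model is never evaluated on weeks that precede its training data.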
  • Step S5: Evaluate the prediction performance of the GCN model. Specifically:
  • The hit rate of the prediction result in week t is defined as the ratio N_{m,t} / N_t, where:
  • N_{m,t} is obtained by ranking all townships in the city from highest to lowest by their predicted number of cases in week t and summing the actual number of cases in the top m% (the high-risk streets and townships); N_t is the total number of actual cases in the city in week t.
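Read from the definitions of N m,t and N t, the hit rate is the share of a week's actual cases that fall in the townships ranked highest by the prediction. A minimal sketch under that reading; the function name `hit_rate`, rounding the top m% up to a whole number of townships, and returning NaN for weeks with no cases are assumptions.

```python
import numpy as np

def hit_rate(predicted, actual, m):
    """Share of actual cases located in the top m% of townships ranked by prediction."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    n_top = max(1, int(np.ceil(len(predicted) * m / 100.0)))  # rounding up is an assumption
    top = np.argsort(-predicted, kind="stable")[:n_top]       # highest predictions first
    total = actual.sum()
    if total == 0:
        return float("nan")  # undefined when the week has no cases (assumption)
    return float(actual[top].sum() / total)

# Hypothetical week with 4 townships: the top 50% by prediction are townships 0 and 2.
rate = hit_rate(predicted=[5, 1, 3, 0], actual=[4, 0, 1, 5], m=50)
```

In this example the two highest-ranked townships hold 5 of the 10 actual cases, so `rate` is 0.5.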
  • FIG. 4 is a hardware architecture diagram of the dengue fever spatiotemporal prediction system 10 in a city of the present invention.
  • the system includes: a preprocessing unit 101, a graph structure construction unit 102, a selection unit 103, a model construction unit 104, and an evaluation unit 105.
  • The preprocessing unit 101 is used to collect and preprocess the data related to dengue fever in the city. Specifically:
  • the data related to dengue fever in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files (shapefile) of the studied city.
  • the meteorological data includes daily average temperature and rainfall collected by meteorological monitoring stations in the city.
  • the dengue fever case data is obtained from the disease prevention and control center of the country/province/city, and the dengue fever case data includes: the onset date and home address of each case; the meteorological data is obtained from the national/provincial/city meteorological
  • the said population distribution data is obtained from the open source global population data project WorldPop website (https://www.worldpop.org/).
  • The preprocessing of the collected dengue case data by the preprocessing unit 101 includes: first, use geocoding to convert the home address of each case into latitude and longitude coordinates, and import all case points into ArcGIS according to these coordinates to obtain a point-type vector file; then, use the Spatial Join tool in ArcGIS to associate the cases (point-type vector file) with the townships (polygon-type vector file, that is, the township vector file) to determine the township where each case is located; finally, according to the onset date of each case, count the number of cases in each township in each week, forming a W*N case-count matrix, where W is the number of weeks and N is the number of townships.
  • The preprocessing of the collected meteorological data by the preprocessing unit 101 includes: obtain the daily average temperature and rainfall recorded by all meteorological observation stations in the city, and first spatially interpolate each of them using the kriging method; then aggregate the interpolated data to the township level by week, and compute the average temperature and cumulative rainfall of each township in each week, forming a W*N average temperature matrix and a W*N cumulative rainfall matrix.
  • The spatial interpolation and data aggregation are processed in batches using the ArcPy toolkit of the Python language.
  • The preprocessing of the collected population distribution data by the preprocessing unit 101 includes: in this embodiment, the 2015 population distribution data at 100-meter resolution is downloaded from the WorldPop website and aggregated to the township level using ArcGIS software to obtain the total population of each township.
  • The graph structure construction unit 102 is used to construct a graph structure reflecting the spatial relationships among the city's internal regions according to the adjacency relationships between the regions. Specifically:
  • The graph structure construction unit 102 uses the Spatial Join function of ArcGIS software to obtain the adjacency relationships between townships from the township vector file.
  • Please refer to FIG. 2 for a schematic diagram of the construction process of structures A and B in this embodiment.
  • The selection unit 103 is used to select input features for dengue fever prediction. Specifically:
  • The selection unit 103 selects four types of features that are commonly used in the literature and closely related to the spread and outbreak of dengue fever: the number of cases in the current week and the past week, the average temperature, the cumulative rainfall, and the population. As shown in Table 1, there are 13 features in total. The average temperature and the cumulative rainfall are related to the survival suitability of the mosquito vectors; because dengue fever is an infectious disease, the number of future cases is also closely related to the number of past cases and to the population.
  • The model construction unit 104 is used to construct and train the GCN model according to the preprocessed intra-city dengue fever related data, the constructed graph structure, and the selected input features. Specifically:
  • The model construction unit 104 performs model construction.
  • The graph convolutional neural network model used in this embodiment was proposed by Thomas N. Kipf and Max Welling in 2016, and its basic structure is shown in FIG. 3.
  • The model consists of an input layer, two hidden layers (more hidden layers can also be set), and an output layer; the rectified linear function ReLU and the hyperbolic tangent function tanh are used, respectively, as the activation functions after the two hidden layers.
  • The model construction unit 104 performs model training. According to the input and output requirements of the GCN model and the different prediction windows, organize K data sets; each data set is divided into a training set and a validation set in a certain proportion. In this embodiment, the first 75% of the weeks in the data set are used for training and the last 25% are used for validation; the training set under each prediction window is used to train the constructed GCN model.
  • The evaluation unit 105 is used to evaluate the prediction performance of the GCN model. Specifically:
  • The evaluation unit 105 inputs the validation set under each prediction window into the corresponding trained GCN model and obtains the predicted value for the k-th week in the future (that is, the number of cases in each township). Since the main purpose of prediction is to identify the high-risk streets and townships among the many in the city so that prevention and control measures can be deployed in a targeted manner, the hit rate is used in this embodiment to evaluate the prediction performance.
  • The hit rate of the prediction result in week t is defined as the ratio N_{m,t} / N_t, where:
  • N_{m,t} is obtained by ranking all townships in the city from highest to lowest by their predicted number of cases in week t and summing the actual number of cases in the top m% (the high-risk streets and townships); N_t is the total number of actual cases in the city in week t.
  • FIG. 5 is a schematic diagram of the hardware device structure of the method for simulating the spread of infectious diseases in a city provided by an embodiment of the present application.
  • the device includes one or more processors and memory. Taking a processor as an example, the device may also include: an input system and an output system.
  • the processor, the memory, the input system, and the output system may be connected by a bus or other methods.
  • the connection by a bus is taken as an example.
  • the memory can be used to store non-transitory software programs, non-transitory computer executable programs, and modules.
  • the processor executes various functional applications and data processing of the electronic device by running non-transitory software programs, instructions, and modules stored in the memory, that is, realizing the processing methods of the foregoing method embodiments.
  • the memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function; the data storage area can store data and the like.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory may optionally include a memory remotely provided with respect to the processor, and these remote memories may be connected to the processing system through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input system can receive input digital or character information, and generate signal input.
  • the output system may include display devices such as a display screen.
  • the one or more modules are stored in the memory, and when executed by the one or more processors, the following operations of any of the foregoing method embodiments are performed:
  • Step a: Collect and preprocess the dengue fever-related data in the city.
  • The dengue fever-related data in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files of the studied city;
  • Step b: Construct a graph structure that reflects the spatial relationships among the city's internal regions;
  • Step c: Select the input features for the spatio-temporal prediction of dengue fever;
  • Step d: Construct and train a GCN model according to the preprocessed data related to dengue fever in the city, the constructed graph structure, and the selected input features, so as to use the GCN model to perform dengue fever spatio-temporal prediction.
  • the embodiments of the present application provide a non-transitory (non-volatile) computer electronic device, the computer electronic device stores computer-executable instructions, and the computer-executable instructions can perform the following operations:
  • Step a: Collect and preprocess the dengue fever-related data in the city.
  • The dengue fever-related data in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files of the studied city;
  • Step b: Construct a graph structure that reflects the spatial relationships among the city's internal regions;
  • Step c: Select the input features for the spatio-temporal prediction of dengue fever;
  • Step d: Construct and train a GCN model according to the preprocessed data related to dengue fever in the city, the constructed graph structure, and the selected input features, so as to use the GCN model to perform dengue fever spatio-temporal prediction.
  • The embodiment of the present application provides a computer program product; the computer program product includes a computer program stored on a non-transitory computer-readable electronic device, and the computer program includes program instructions which, when executed by a computer, cause the computer to perform the following operations:
  • Step a: Collect and preprocess the dengue fever-related data in the city.
  • The dengue fever-related data in the city includes: dengue fever case data, meteorological data, population distribution data, and township vector files of the studied city;
  • Step b: Construct a graph structure that reflects the spatial relationships among the city's internal regions;
  • Step c: Select the input features for the spatio-temporal prediction of dengue fever;
  • Step d: Construct and train a GCN model according to the preprocessed data related to dengue fever in the city, the constructed graph structure, and the selected input features, so as to use the GCN model to perform dengue fever spatio-temporal prediction.
  • Example 1 of this application takes 167 townships in Guangzhou as an example to conduct experiments.
  • the study period is from January 1, 2015 to September 22, 2019, with a total of 247 weeks.
  • the data from week 5 to week 195 is used for model training, and the data from week 196 to week 247 is used for model verification.
  • the prediction window k is 1, 2, ..., 8.
  • the comparison methods are LASSO (least absolute shrinkage and selection operator) and SVM (support vector machine) regression models that are commonly used in current dengue fever prediction research and have been proven to be relatively effective. Use the above two models to make individual predictions for each township.
  • Figure 6 is a comparison diagram of model effects with hit rate as an evaluation index. It can be seen that, compared with the dengue fever prediction method based on the LASSO and SVM regression model, the dengue fever prediction method using GCN provided in the present invention has better overall prediction performance, which fully demonstrates the effectiveness of the present invention.
  • The present invention introduces the deep learning model Graph Convolutional Network (GCN) for the first time; it fully considers the spatial relationships between the city's internal regions to capture the spatial spread of the disease, predicts all regions jointly, and achieves more accurate predictions. It thereby provides decision support for the relevant prevention and control departments, avoids wasting manpower and material resources, and reduces losses of life, health, and property.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An intra-urban dengue fever spatio-temporal forecasting method, comprising: acquiring intra-urban dengue fever related data and performing preprocessing (S1); constructing a graph structure reflecting spatial relationships among intra-urban regions (S2); selecting input features for dengue fever spatio-temporal forecasting (S3); and constructing and training a GCN model according to the preprocessed intra-urban dengue fever related data, the constructed graph structure, and the selected input features (S4), so as to perform intra-urban dengue fever spatio-temporal forecasting by using the GCN model. According to the method, the spatial relationships among the intra-urban regions are sufficiently considered, forecasting on a finer spatial scale is achieved, the forecasting performance is improved, and the precise prevention and control level of dengue fever is improved.

Description

Intra-urban dengue fever spatio-temporal prediction method, system and electronic device
Technical field
The invention relates to an intra-urban dengue fever spatio-temporal prediction method, system, and electronic device.
Background
In recent decades, dengue fever, a mosquito-borne infectious disease, has been prevalent in tropical and subtropical regions, especially in Southeast Asian countries and regions such as Singapore and Malaysia. In China, subtropical Guangdong Province, and Guangzhou City in particular, is an area with a developed economy, active trade, and frequent movement of people, and it is affected by the dengue virus every summer and autumn. Guangzhou recorded more than 37,000 dengue fever cases in 2014, posing a fairly serious threat to the lives and health of residents.
In the current absence of an effective vaccine, vector control (such as spraying insecticide to kill adult mosquitoes and removing breeding sites of Aedes mosquitoes) is still the main method of dengue fever prevention and control. In this context, accurate prediction and early warning of the number and location of future dengue fever cases has become the key to prevention and control.
There are already many studies on dengue fever prediction and early warning, in which researchers mainly use traditional statistical models and machine learning models to predict the future number of dengue fever cases in a study area. However, current studies make an overall time-series prediction of the number of cases in a country, state (province), or city over a future period (such as 1 week, 2 weeks, or 1 month), while prediction at a fine spatial scale within a city (such as township or subdistrict administrative units) is relatively rare. Dengue fever prediction at a fine spatial scale within a city is quite challenging, mainly because cities are densely populated with frequent internal population flows, so the disease spreads more quickly between the city's internal regions; modeling each region separately tends to overlook the spatial relationships between regions and therefore cannot achieve good prediction performance.
Summary of the invention
In view of this, it is necessary to provide an intra-urban dengue fever spatio-temporal forecasting method, system, and electronic device.
The present invention provides an intra-urban dengue fever spatio-temporal forecasting method comprising the following steps: a. collecting intra-urban dengue-related data and preprocessing it, the intra-urban dengue-related data including dengue case data, meteorological data, population distribution data, and a township vector file for the studied city; b. constructing a graph structure that reflects the spatial relationships between areas within the city; c. selecting input features for spatio-temporal dengue prediction; d. constructing and training a GCN model according to the preprocessed intra-urban dengue-related data, the constructed graph structure, and the selected input features.
The method further includes step e: evaluating the prediction performance of the GCN model.
Step a specifically includes:
Preprocessing the collected dengue case data: converting each case's home address into latitude and longitude coordinates; determining the township in which each case is located; and, according to the onset date of each case, counting the number of cases in each township in each week to form a W*N case-count matrix, where W is the number of weeks and N is the number of townships;
Preprocessing the collected meteorological data: obtaining the daily mean temperature and rainfall recorded by all meteorological observation stations in the city and spatially interpolating each with the kriging method; aggregating the interpolated data to the township level by week, and computing the mean temperature and cumulative rainfall of each township in each week to form a W*N mean-temperature matrix and a W*N cumulative-rainfall matrix;
Preprocessing the collected population distribution data: aggregating the population distribution data to the township level to obtain the total population of each township.
Step b specifically includes the following steps:
Obtaining the adjacency relationships between townships;
Treating townships as nodes and the adjacency relationships between townships as edges to construct the graph structure.
Step c specifically includes:
Selecting, as input features, features that are commonly used in the literature and closely related to the spread and outbreak of dengue fever.
The GCN model consists of one input layer, at least two hidden layers, and one output layer; the rectified linear unit (ReLU) function and the hyperbolic tangent (tanh) function are used as the activation functions after the respective hidden layers.
Training the GCN model in step d includes:
Organizing K data sets according to the input and output requirements of the GCN model and the different prediction windows, each data set being divided into a training set and a validation set;
Training the constructed GCN model separately with the training set for each prediction window.
Step e specifically includes:
Inputting the validation set for each prediction window into the corresponding trained GCN model to obtain the prediction for future week t;
Evaluating prediction performance with the hit rate, where the hit rate of the prediction for week t is defined as follows:
Figure PCTCN2020139657-appb-000001: HitRate_{m,t} = N_{m,t} / N_t
where N_{m,t} is the sum of the actual case counts of the high-risk townships ranked in the top m% when all townships in the city are ranked from high to low by their predicted case counts for week t, and N_t is the total actual number of cases in the city in week t.
The present invention provides an intra-urban dengue fever spatio-temporal forecasting system comprising a preprocessing unit, a graph structure construction unit, a selection unit, and a model construction unit, wherein: the preprocessing unit is configured to collect intra-urban dengue-related data and preprocess it, the intra-urban dengue-related data including dengue case data, meteorological data, population distribution data, and a township vector file for the studied city; the graph structure construction unit is configured to construct a graph structure that reflects the spatial relationships between areas within the city; the selection unit is configured to select input features for spatio-temporal dengue prediction; and the model construction unit is configured to construct and train a GCN model according to the preprocessed intra-urban dengue-related data, the constructed graph structure, and the selected input features.
The system further includes an evaluation unit configured to evaluate the prediction performance of the GCN model.
The present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the intra-urban dengue fever spatio-temporal forecasting method according to any one of items 1 to 8 above:
Step a: collecting intra-urban dengue-related data and preprocessing it, the intra-urban dengue-related data including dengue case data, meteorological data, population distribution data, and a township vector file for the studied city;
Step b: constructing a graph structure that reflects the spatial relationships between areas within the city;
Step c: selecting input features for spatio-temporal dengue prediction;
Step d: constructing and training a GCN model according to the preprocessed intra-urban dengue-related data, the constructed graph structure, and the selected input features, so as to perform spatio-temporal dengue prediction with the GCN model.
Compared with the prior art, which produces overall time-series predictions for countries, provinces (states), and cities, the present invention targets the individual areas within a city and achieves prediction at a finer spatial scale. When predicting the future number of dengue cases in each area within the city, it takes full account of the spatial relationships between areas, which helps capture how dengue spreads within the city, effectively improves prediction performance, and raises the level of precise dengue prevention and control.
Description of the drawings
Fig. 1 is a flowchart of the intra-urban dengue fever spatio-temporal forecasting method of the present invention;
Fig. 2 is a schematic diagram of the process of constructing the spatial relationships between areas within a city according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the graph convolutional neural network model provided by an embodiment of the present invention;
Fig. 4 is a hardware architecture diagram of the intra-urban dengue fever spatio-temporal forecasting system of the present invention;
Fig. 5 is a schematic structural diagram of the hardware device for the intra-urban dengue fever spatio-temporal forecasting method provided by an embodiment of the present invention;
Fig. 6 is a schematic comparison of township-scale dengue prediction results for Guangzhou in Embodiment 1 of the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and specific embodiments.
This embodiment is described using township-scale prediction; the invention is equally applicable to intra-urban spatial units delineated in other ways, such as administrative districts, traffic analysis zones, and grids.
Fig. 1 is a flowchart of a preferred embodiment of the intra-urban dengue fever spatio-temporal forecasting method of the present invention.
Step S1: collect intra-urban dengue-related data and preprocess it. Specifically:
The intra-urban dengue-related data include dengue case data, meteorological data, population distribution data, and a township vector file (shapefile) for the studied city. The meteorological data include the daily mean temperature and rainfall collected by meteorological monitoring stations in the city.
The dengue case data are obtained on application from the national/provincial/municipal center for disease control and prevention and include the onset date and home address of each case; the meteorological data are obtained on application from the national/provincial/municipal meteorological bureau; the population distribution data are obtained from the website of WorldPop, an open global population data project (https://www.worldpop.org/).
Preprocessing the collected dengue case data includes: first, using a geocoding method to convert each case's home address into latitude and longitude coordinates and importing all case points into ArcGIS by their coordinates to obtain a point-type vector file; then, using the Spatial Join tool in ArcGIS to associate the cases (the point-type vector file) with the townships (a polygon-type vector file, i.e., the township vector file) and so determine the township in which each case is located; and finally, according to the onset date of each case, counting the number of cases in each township in each week to form a W*N case-count matrix, where W is the number of weeks and N is the number of townships.
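The final counting step can be sketched as follows. This is a minimal illustration, not the embodiment's ArcGIS workflow: it assumes geocoding and the spatial join have already produced (onset date, township) pairs, and the function name `weekly_case_matrix` and the convention that week 0 begins on `start` are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_case_matrix(cases, townships, start, num_weeks):
    """Return a W x N list of lists of weekly case counts.

    cases: iterable of (onset_date, township_id) pairs (assumed to be the
           output of geocoding plus a spatial join, which is not shown here).
    townships: ordered township ids; their order defines the N columns.
    start: first day of week 0 (an assumed convention).
    """
    column = {town: j for j, town in enumerate(townships)}
    counts = Counter()
    for onset, town in cases:
        week = (onset - start).days // 7
        if 0 <= week < num_weeks and town in column:
            counts[(week, column[town])] += 1
    # Counter returns 0 for (week, township) pairs with no cases.
    return [[counts[(w, j)] for j in range(len(townships))]
            for w in range(num_weeks)]


# Made-up records for three townships over two weeks.
cases = [
    (date(2014, 9, 1), "A"),
    (date(2014, 9, 3), "A"),
    (date(2014, 9, 9), "B"),
]
matrix = weekly_case_matrix(cases, ["A", "B", "C"], date(2014, 9, 1), 2)
```

Rows index weeks and columns index townships, matching the W*N layout described above.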
Preprocessing the collected meteorological data includes: obtaining the daily mean temperature and rainfall recorded by all meteorological observation stations in the city and first spatially interpolating each with the kriging method; then aggregating the interpolated data to the township level by week and computing the mean temperature and cumulative rainfall of each township in each week to form a W*N mean-temperature matrix and a W*N cumulative-rainfall matrix. In this embodiment, the spatial interpolation and data aggregation are batch-processed with the ArcPy package for Python.
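The weekly aggregation can be sketched in plain Python. The sketch assumes the kriging interpolation and township-level averaging (done with ArcPy in the embodiment) have already yielded one value per township per day; the function name `aggregate_weekly` and the choice to drop an incomplete trailing week are assumptions.

```python
def aggregate_weekly(daily_temp, daily_rain):
    """Aggregate day x township values into week x township matrices.

    daily_temp, daily_rain: lists of rows, one row per day, one column per
    township (assumed to be already interpolated to township level).
    Returns (mean_temp, cum_rain), each a W x N list of lists.
    """
    num_weeks = len(daily_temp) // 7  # assumption: drop an incomplete last week
    n = len(daily_temp[0])
    mean_temp, cum_rain = [], []
    for w in range(num_weeks):
        days = range(7 * w, 7 * w + 7)
        mean_temp.append([sum(daily_temp[d][j] for d in days) / 7
                          for j in range(n)])
        cum_rain.append([sum(daily_rain[d][j] for d in days)
                         for j in range(n)])
    return mean_temp, cum_rain


# Made-up values: two townships, fourteen days.
daily_temp = [[28.0, 30.0]] * 7 + [[26.0, 27.0]] * 7
daily_rain = [[1.0, 0.0]] * 7 + [[0.0, 2.0]] * 7
mean_temp, cum_rain = aggregate_weekly(daily_temp, daily_rain)
```

Temperature is averaged while rainfall is summed, which matches the "mean temperature" and "cumulative rainfall" matrices named in the text.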
Preprocessing the collected population distribution data includes: in this embodiment, ArcGIS software is used to aggregate the 2015 population distribution data at 100-meter resolution, downloaded from the WorldPop website, to the township level to obtain the total population of each township.
Step S2: construct a graph structure reflecting the spatial relationships between areas within the city according to the adjacency relationships between the areas.
Specifically, step S2 includes:
Step 201: use the Spatial Join function of ArcGIS to obtain the adjacency relationships between townships from the township vector file.
Step 202: treat townships as nodes and the adjacency relationships between townships as edges to construct the graph structure. See Fig. 2 for a schematic diagram of the construction of graph structures A and B in this embodiment.
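Steps 201 and 202 amount to turning a list of neighbouring township pairs into an adjacency matrix. The sketch below assumes the pairs have already been exported from the spatial join; the function name `build_adjacency` and the township ids are hypothetical.

```python
def build_adjacency(townships, neighbor_pairs):
    """Build a symmetric 0/1 adjacency matrix (N x N list of lists).

    townships: ordered township ids (nodes).
    neighbor_pairs: iterable of (id_a, id_b) for townships that share a
                    boundary (assumed to come from the spatial join).
    """
    index = {town: i for i, town in enumerate(townships)}
    n = len(townships)
    adjacency = [[0] * n for _ in range(n)]
    for a, b in neighbor_pairs:
        i, j = index[a], index[b]
        adjacency[i][j] = adjacency[j][i] = 1  # undirected edge
    return adjacency


adjacency = build_adjacency(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

Here township B borders both A and C, so its row contains two ones.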
Step S3: select input features for dengue prediction. Specifically:
This embodiment selects four categories of features that are commonly used in the literature and closely related to the spread and outbreak of dengue fever: the case counts, mean temperature, and cumulative rainfall of the current week and past weeks, and the population. As shown in Table 1, there are 13 features in total. Mean temperature and cumulative rainfall are related to the suitability of conditions for mosquito vector survival; because dengue fever is an infectious disease, the future case count is also closely related to past case counts and population.
It should be noted that the input features are not strictly limited to the 13 used in this embodiment; other reasonable input features and combinations thereof also fall within the scope of protection of the present invention.
Table 1. Input features used for dengue prediction
Figure PCTCN2020139657-appb-000002
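Assembling the per-township feature matrix from the preprocessed matrices might look like the sketch below. Table 1 is only available as an image here, so the exact feature list is not known: the sketch assumes case count, mean temperature, and cumulative rainfall for the current week and three previous weeks (12 values) plus population, which happens to total 13. The function name `feature_matrix` and the `lags` parameter are hypothetical.

```python
def feature_matrix(cases, temp, rain, population, week, lags=4):
    """Build an N x D feature matrix for one current week.

    cases, temp, rain: W x N matrices (weeks x townships).
    population: length-N list of township populations.
    lags: number of weeks used, including the current one (an assumption;
          the patent's Table 1 is not reproduced in this text).
    D = 3 * lags + 1.
    """
    if week < lags - 1:
        raise ValueError("not enough history for the requested lags")
    rows = []
    for j in range(len(population)):
        row = []
        for series in (cases, temp, rain):
            # Current week first, then progressively older weeks.
            row.extend(series[week - lag][j] for lag in range(lags))
        row.append(population[j])
        rows.append(row)
    return rows


# Made-up data: four weeks, two townships.
cases = [[0, 1], [1, 2], [2, 3], [3, 4]]
temp = [[25.0, 26.0]] * 4
rain = [[10.0, 12.0]] * 4
features = feature_matrix(cases, temp, rain, [50000, 80000], week=3)
```

Each row is one township and each column one feature, giving the N*D layout expected by the model input.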
Step S4: construct and train the GCN model according to the preprocessed intra-urban dengue-related data, the constructed graph structure, and the selected input features.
Specifically, step S4 includes:
Step 401: model construction. The graph convolutional network model used in this embodiment was proposed by Thomas N. Kipf and Max Welling in 2016; its basic structure is shown in Fig. 3. The model consists of one input layer, two hidden layers (more hidden layers may be used), and one output layer; the rectified linear unit (ReLU) function and the hyperbolic tangent (tanh) function are used as the activation functions after the first and second hidden layers, respectively.
The input data of the input layer are: 1) the graph structure A constructed in step S2; and 2) an N*D feature matrix X, where N is the number of nodes (i.e., townships) and D is the number of features. The output layer outputs the number of dengue cases at the N nodes (i.e., townships) in future week T+k, where T denotes the current week and k is the prediction window, k = 1, 2, …, K.
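A forward pass through such a network can be sketched in plain Python. The sketch uses the renormalized propagation rule from Kipf and Welling, H' = act(Â H W) with Â = D^(-1/2) (A + I) D^(-1/2); treating the output layer as a third graph convolution with no activation is an assumption, as are the function names and the weight values. Training (fitting the weights) is not shown.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def normalize_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2), adding self-loops first."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(n)] for i in range(n)]


def gcn_forward(adj, x, w0, w1, w2):
    """Two hidden graph-convolution layers (ReLU, then tanh) and an output layer."""
    a = normalize_adjacency(adj)
    h1 = [[max(0.0, v) for v in row] for row in matmul(matmul(a, x), w0)]
    h2 = [[math.tanh(v) for v in row] for row in matmul(matmul(a, h1), w1)]
    return matmul(matmul(a, h2), w2)  # N x 1: one predicted value per township


# Made-up example: 3 townships, 2 features, hidden width 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[1.0, 0.5], [0.0, 1.0], [2.0, 0.0]]
w0 = [[0.1, -0.2], [0.3, 0.4]]
w1 = [[0.5, 0.1], [-0.3, 0.2]]
w2 = [[1.0], [-1.0]]
prediction = gcn_forward(adj, x, w0, w1, w2)
```

Because every layer multiplies by the normalized adjacency matrix, each township's prediction draws on its neighbours' features, which is how the model takes the spatial relationships between areas into account.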
Step 402: model training. K data sets are organized according to the input and output requirements of the GCN model and the different prediction windows; each data set is divided into a training set and a validation set in a fixed proportion. In this embodiment, the data from the first 75% of all weeks in the data set are used for training and the data from the last 25% of weeks are used for validation. The constructed GCN model is trained separately with the training set for each prediction window.
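Organizing one data set per prediction window and splitting it chronologically can be sketched as follows. The sketch pairs the week whose features are used with the week whose case counts are the target, and splits by sample order rather than strictly by calendar week, which is a simplifying assumption; the function names are hypothetical.

```python
def make_dataset(num_weeks, k):
    """Return (input_week, target_week) pairs for prediction window k."""
    return [(t, t + k) for t in range(num_weeks - k)]


def chronological_split(samples, train_fraction=0.75):
    """Split samples in time order: earliest part for training, rest for validation."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# One (train, validation) pair per prediction window k = 1..4 over 20 weeks.
datasets = {k: chronological_split(make_dataset(20, k)) for k in (1, 2, 3, 4)}
```

Splitting in time order rather than at random keeps every validation target later than every training target, so validation mimics forecasting genuinely unseen weeks.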
For constructing and training the GCN model, the following open-source implementations based on mainstream deep learning frameworks may be consulted:
https://github.com/tkipf/gcn
https://github.com/tkipf/pygcn
https://github.com/tkipf/keras-gcn
Step S5: evaluate the prediction performance of the GCN model. Specifically:
The validation set for each prediction window is input into the corresponding trained GCN model to obtain the predicted values for future week k (i.e., the number of cases in each township). Because the main purpose of prediction is to identify the high-risk townships among the many townships in the city so that prevention and control measures can be targeted, this embodiment uses the hit rate to evaluate prediction performance. The hit rate of the prediction for week t is defined as follows:
Figure PCTCN2020139657-appb-000003: HitRate_{m,t} = N_{m,t} / N_t
where N_{m,t} is the sum of the actual case counts of the high-risk townships ranked in the top m% when all townships in the city are ranked from high to low by their predicted case counts for week t, and N_t is the total actual number of cases in the city in week t.
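The hit rate for one week can be computed as in the sketch below. Rounding the top m% up to a whole number of townships (with at least one) and returning `None` for a week with no cases are assumptions, since the text does not specify either; the function name `hit_rate` is hypothetical.

```python
import math


def hit_rate(predicted, actual, m):
    """Share of a week's actual cases that fall in the top m% predicted townships.

    predicted, actual: per-township case counts for the same week.
    m: percentage of townships treated as high risk.
    """
    total = sum(actual)
    if total == 0:
        return None  # assumption: undefined when the city has no cases
    # Assumption: round the top-m% count up, and keep at least one township.
    top = max(1, math.ceil(len(predicted) * m / 100))
    ranked = sorted(range(len(predicted)),
                    key=lambda j: predicted[j], reverse=True)
    return sum(actual[j] for j in ranked[:top]) / total


# Made-up week: four townships, top 50% flagged as high risk.
rate = hit_rate(predicted=[5, 1, 3, 0], actual=[4, 0, 2, 2], m=50)
```

In this example the two townships with the highest predictions account for 6 of the 8 actual cases, so the hit rate is 0.75.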
Fig. 4 is a hardware architecture diagram of the intra-urban dengue fever spatio-temporal forecasting system 10 of the present invention. The system includes a preprocessing unit 101, a graph structure construction unit 102, a selection unit 103, a model construction unit 104, and an evaluation unit 105.
The preprocessing unit 101 is configured to collect intra-urban dengue-related data and preprocess it. Specifically:
The intra-urban dengue-related data include dengue case data, meteorological data, population distribution data, and a township vector file (shapefile) for the studied city. The meteorological data include the daily mean temperature and rainfall collected by meteorological monitoring stations in the city.
The dengue case data are obtained on application from the national/provincial/municipal center for disease control and prevention and include the onset date and home address of each case; the meteorological data are obtained on application from the national/provincial/municipal meteorological bureau; the population distribution data are obtained from the website of WorldPop, an open global population data project (https://www.worldpop.org/).
The preprocessing of the collected dengue case data by the preprocessing unit 101 includes: first, using a geocoding method to convert each case's home address into latitude and longitude coordinates and importing all case points into ArcGIS by their coordinates to obtain a point-type vector file; then, using the Spatial Join tool in ArcGIS to associate the cases (the point-type vector file) with the townships (a polygon-type vector file, i.e., the township vector file) and so determine the township in which each case is located; and finally, according to the onset date of each case, counting the number of cases in each township in each week to form a W*N case-count matrix, where W is the number of weeks and N is the number of townships.
The preprocessing of the collected meteorological data by the preprocessing unit 101 includes: obtaining the daily mean temperature and rainfall recorded by all meteorological observation stations in the city and first spatially interpolating each with the kriging method; then aggregating the interpolated data to the township level by week and computing the mean temperature and cumulative rainfall of each township in each week to form a W*N mean-temperature matrix and a W*N cumulative-rainfall matrix. In this embodiment, the spatial interpolation and data aggregation are batch-processed with the ArcPy package for Python.
The preprocessing of the collected population distribution data by the preprocessing unit 101 includes: in this embodiment, ArcGIS software is used to aggregate the 2015 population distribution data at 100-meter resolution, downloaded from the WorldPop website, to the township level to obtain the total population of each township.
The graph structure construction unit 102 is configured to construct a graph structure reflecting the spatial relationships between areas within the city according to the adjacency relationships between the areas. Specifically:
The graph structure construction unit 102 uses the Spatial Join function of ArcGIS to obtain the adjacency relationships between townships from the township vector file.
Townships are treated as nodes and the adjacency relationships between townships as edges to construct the graph structure. See Fig. 2 for a schematic diagram of the construction of graph structures A and B in this embodiment.
The selection unit 103 is configured to select input features for dengue prediction. Specifically:
In this embodiment, the selection unit 103 selects four categories of features that are commonly used in the literature and closely related to the spread and outbreak of dengue fever: the case counts, mean temperature, and cumulative rainfall of the current week and past weeks, and the population. As shown in Table 1, there are 13 features in total. Mean temperature and cumulative rainfall are related to the suitability of conditions for mosquito vector survival; because dengue fever is an infectious disease, the future case count is also closely related to past case counts and population.
It should be noted that the input features are not strictly limited to the 13 used in this embodiment; other reasonable input features and combinations thereof also fall within the scope of protection of the present invention.
Table 1. Input features used for dengue prediction
Figure PCTCN2020139657-appb-000004
The model construction unit 104 is configured to construct and train the GCN model according to the preprocessed intra-urban dengue-related data, the constructed graph structure, and the selected input features. Specifically:
The model construction unit 104 performs model construction. The graph convolutional network model used in this embodiment was proposed by Thomas N. Kipf and Max Welling in 2016; its basic structure is shown in Fig. 3. The model consists of one input layer, two hidden layers (more hidden layers may be used), and one output layer; the rectified linear unit (ReLU) function and the hyperbolic tangent (tanh) function are used as the activation functions after the first and second hidden layers, respectively.
The input data of the input layer are: 1) the graph structure A constructed in step S2; and 2) an N*D feature matrix X, where N is the number of nodes (i.e., townships) and D is the number of features. The output layer outputs the number of dengue cases at the N nodes (i.e., townships) in future week T+k, where T denotes the current week and k is the prediction window, k = 1, 2, …, K.
The model construction unit 104 performs model training. K data sets are organized according to the input and output requirements of the GCN model and the different prediction windows; each data set is divided into a training set and a validation set in a fixed proportion. In this embodiment, the data from the first 75% of all weeks in the data set are used for training and the data from the last 25% of weeks are used for validation. The constructed GCN model is trained separately with the training set for each prediction window.
For constructing and training the GCN model, the following open-source implementations based on mainstream deep learning frameworks may be consulted:
https://github.com/tkipf/gcn
https://github.com/tkipf/pygcn
https://github.com/tkipf/keras-gcn
The evaluation unit 105 is configured to evaluate the prediction performance of the GCN model. Specifically:
The evaluation unit 105 inputs the validation set for each prediction window into the corresponding trained GCN model to obtain the predicted values for future week k (i.e., the number of cases in each township). Because the main purpose of prediction is to identify the high-risk townships among the many townships in the city so that prevention and control measures can be targeted, this embodiment uses the hit rate to evaluate prediction performance. The hit rate of the prediction for week t is defined as follows:
Figure PCTCN2020139657-appb-000005: HitRate_{m,t} = N_{m,t} / N_t
where N_{m,t} is the sum of the actual case counts of the high-risk townships ranked in the top m% when all townships in the city are ranked from high to low by their predicted case counts for week t, and N_t is the total actual number of cases in the city in week t.
Fig. 5 is a schematic structural diagram of the hardware device for the intra-urban dengue fever spatio-temporal forecasting method provided by an embodiment of the present application. As shown in Fig. 5, the device includes one or more processors and a memory. Taking one processor as an example, the device may further include an input system and an output system.
The processor, the memory, the input system, and the output system may be connected by a bus or by other means; Fig. 5 takes connection by a bus as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules. By running the non-transitory software programs, instructions, and modules stored in the memory, the processor executes the various functional applications and data processing of the electronic device, that is, implements the processing method of the foregoing method embodiments.
The memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store data and the like. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory may optionally include memory located remotely from the processor, and such remote memory may be connected to the processing system through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The input system can receive input numeric or character information and generate signal input. The output system may include a display device such as a display screen.
所述一个或者多个模块存储在所述存储器中,当被所述一个或者多个处理器执行时,执行上述任一方法实施例的以下操作:The one or more modules are stored in the memory, and when executed by the one or more processors, the following operations of any of the foregoing method embodiments are performed:
Step a: collect dengue fever-related data within the city and preprocess it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
Step b: construct a graph structure that reflects the spatial relationships among the regions within the city;
Step c: select the input features for spatiotemporal prediction of dengue fever;
Step d: construct and train a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
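As a rough illustration of the case-data preprocessing in step a, the sketch below tallies geocoded cases into a weekly township count matrix with W rows (weeks) and N columns (townships). The function name `weekly_case_matrix`, the integer township indices, and the convention that weeks are consecutive 7-day blocks counted from the study start date are assumptions made for this example; the application does not specify them.

```python
from datetime import date


def weekly_case_matrix(cases, start, n_weeks, n_townships):
    """Count onset cases per week and township.

    cases: iterable of (onset_date, township_index) pairs, where the
           township index is assumed to come from an earlier geocoding step.
    start: first day of week 0 (assumed week convention: 7-day blocks).
    Returns a list of n_weeks rows, each with n_townships counts.
    """
    matrix = [[0] * n_townships for _ in range(n_weeks)]
    for onset, township in cases:
        week = (onset - start).days // 7
        # Skip cases outside the study period or with an unknown township.
        if 0 <= week < n_weeks and 0 <= township < n_townships:
            matrix[week][township] += 1
    return matrix


# Hypothetical toy data: four cases, two townships, two weeks.
example_cases = [
    (date(2015, 1, 1), 0),
    (date(2015, 1, 7), 0),
    (date(2015, 1, 8), 1),
    (date(2014, 12, 31), 1),  # before the study period, ignored
]
example_matrix = weekly_case_matrix(example_cases, date(2015, 1, 1), 2, 2)
```

The same row-per-week, column-per-township layout could hold the mean-temperature and cumulative-rainfall matrices, so that all inputs share one index.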
The above product can execute the method provided in the embodiments of the present application and has the functional modules and beneficial effects corresponding to that method. For technical details not described exhaustively in this embodiment, refer to the method provided in the embodiments of the present application.
An embodiment of the present application provides a non-transitory (non-volatile) computer electronic device that stores computer-executable instructions, which can perform the following operations:
Step a: collect dengue fever-related data within the city and preprocess it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
Step b: construct a graph structure that reflects the spatial relationships among the regions within the city;
Step c: select the input features for spatiotemporal prediction of dengue fever;
Step d: construct and train a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
An embodiment of the present application provides a computer program product comprising a computer program stored on a non-transitory computer-readable electronic device; the computer program includes program instructions that, when executed by a computer, cause the computer to perform the following operations:
Step a: collect dengue fever-related data within the city and preprocess it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
Step b: construct a graph structure that reflects the spatial relationships among the regions within the city;
Step c: select the input features for spatiotemporal prediction of dengue fever;
Step d: construct and train a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
Experimental results of Embodiment 1 of the present application:
Embodiment 1 of the present application ran experiments on 167 townships in Guangdong Province. The study period runs from January 1, 2015 to September 22, 2019, a total of 247 weeks. Data from weeks 5 through 195 are used for model training, and data from weeks 196 through 247 are used for model validation. The prediction window k takes the values 1, 2, ..., 8.
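The split described above can be sketched as one data set per prediction window k, each with its own training and validation part. The pairing of an input week t with a target week t + k, and the rule that both weeks must fall inside the same split, are assumptions made for this example; the application states only the week ranges and the values of k.

```python
TOTAL_WEEKS = 247
TRAIN_WEEKS = range(5, 196)     # weeks 5..195 inclusive
VALID_WEEKS = range(196, 248)   # weeks 196..247 inclusive
HORIZONS = range(1, 9)          # prediction windows k = 1..8


def build_pairs(weeks, k, last_week):
    """Return (input_week, target_week) pairs with the target k weeks ahead.

    Pairs whose target would fall after last_week are dropped, so that a
    training sample never reaches into the validation period (assumed rule).
    """
    return [(t, t + k) for t in weeks if t + k <= last_week]


# One data set per prediction window, each split into training and validation.
datasets = {
    k: {
        "train": build_pairs(TRAIN_WEEKS, k, last_week=195),
        "valid": build_pairs(VALID_WEEKS, k, last_week=TOTAL_WEEKS),
    }
    for k in HORIZONS
}
```

Under these assumptions a longer window leaves slightly fewer samples, because the last k input weeks of each split have no target inside it.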
The comparison methods are the LASSO (least absolute shrinkage and selection operator) and SVM (support vector machine) regression models, which are commonly used in current dengue fever prediction research and have been shown to perform relatively well. Each of these two models is used to predict each township separately.
Figure 6 compares model performance using the hit rate as the evaluation metric. Compared with the dengue fever prediction methods based on the LASSO and SVM regression models, the GCN-based dengue fever prediction method proposed in the present invention shows better overall prediction performance, which demonstrates the effectiveness of the present invention.
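The hit rate used as the evaluation metric is the share of a week's actual cases that occur in the townships ranked in the top m% by predicted case count. A minimal sketch follows; the function name `hit_rate`, rounding the top-m% cutoff up to a whole number of townships, and returning `None` for a week with no actual cases are assumptions of this example, not details given in the application.

```python
import math


def hit_rate(predicted, actual, m_percent):
    """Fraction of actual cases captured by the top m% predicted townships.

    predicted, actual: per-township case counts for the same week.
    m_percent: size of the high-risk group, as a percentage of townships.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must cover the same townships")
    total = sum(actual)
    if total == 0:
        return None  # undefined when the city had no cases that week (assumed)
    # Assumed rounding: take at least one township, rounding the cutoff up.
    n_top = max(1, math.ceil(len(predicted) * m_percent / 100))
    ranked = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return sum(actual[i] for i in ranked[:n_top]) / total


# Hypothetical week with four townships.
example_predicted = [5, 1, 3, 0]
example_actual = [4, 0, 1, 5]
example_rate = hit_rate(example_predicted, example_actual, 50)
```

In this toy week the two townships with the highest predictions hold 5 of the 10 actual cases, so the top-50% hit rate is 0.5; the township with the most actual cases was ranked last, which is exactly the kind of miss the metric penalizes.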
The present invention is the first to introduce the deep learning model graph convolutional network (Graph Convolutional Network, GCN). It takes full account of the spatial relationships among the regions within the city to capture the spatial spread of the disease, predicts all regions jointly, and achieves more accurate predictions. It thereby provides decision support for the relevant prevention and control departments, avoids wasting manpower and material resources, and reduces losses of life, health, and property.
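To make concrete how a graph convolution lets each township's prediction draw on its neighbors, the sketch below builds a township graph from adjacency pairs and runs two hidden layers, with ReLU after the first and tanh after the second. It assumes the widely used symmetric-normalized propagation rule with self-loops; the application does not state which propagation rule its GCN uses. The weights are fixed hypothetical numbers rather than trained values, and all function names are illustrative.

```python
import math


def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, features, weights, activation):
    """One graph convolution: activation(a_hat @ features @ weights)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[activation(v) for v in row] for row in z]


def relu(v):
    return max(0.0, v)


# Three townships in a row: 0 borders 1, and 1 borders 2.
a_hat = normalized_adjacency(3, [(0, 1), (1, 2)])
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # two input features per township
w1 = [[0.5, -0.5], [0.25, 0.75]]           # hypothetical weights
w2 = [[1.0], [-1.0]]
h1 = gcn_layer(a_hat, x, w1, relu)         # first hidden layer, ReLU
h2 = gcn_layer(a_hat, h1, w2, math.tanh)   # second hidden layer, tanh
```

Townships 0 and 2 are not adjacent, so their entry in `a_hat` is zero and they influence each other only through township 1 after the second layer.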
Although the present invention has been described with reference to the current preferred embodiments, those skilled in the art should understand that the above preferred embodiments serve only to illustrate the present invention and do not limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (11)

  1. A method for spatiotemporal prediction of dengue fever within a city, characterized in that the method comprises the following steps:
    a. collecting dengue fever-related data within the city and preprocessing it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
    b. constructing a graph structure that reflects the spatial relationships among the regions within the city;
    c. selecting the input features for spatiotemporal prediction of dengue fever;
    d. constructing and training a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
  2. The method of claim 1, characterized in that the method further comprises step e:
    evaluating the prediction performance of the GCN model.
  3. The method of claim 1 or 2, characterized in that step a specifically comprises:
    preprocessing the collected dengue fever case data: converting each case's home address into latitude and longitude coordinates; determining the township in which each case is located; and, according to the onset date of each case, counting the number of onset cases in each township for each week to form a W*N case-count matrix, where W is the number of weeks and N is the number of townships;
    preprocessing the collected meteorological data: obtaining the daily mean temperature and rainfall recorded by all meteorological observation stations in the city and spatially interpolating each using kriging; aggregating the interpolated data by week to the township level and computing the mean temperature and cumulative rainfall of each township for each week to form a W*N mean-temperature matrix and a W*N cumulative-rainfall matrix;
    preprocessing the collected population distribution data: aggregating the population distribution data to the township level to obtain the total population of each township.
  4. The method of claim 3, characterized in that step b specifically comprises the following steps:
    obtaining the adjacency relationships between townships;
    treating each township as a node and each adjacency relationship between townships as an edge to construct the graph structure.
  5. The method of claim 4, characterized in that step c specifically comprises:
    selecting, as the input features, features that are commonly used in the literature and closely related to the transmission and outbreak of dengue fever.
  6. The method of claim 5, characterized in that the GCN model consists of one input layer, at least two hidden layers, and one output layer; the rectified linear unit (ReLU) and the hyperbolic tangent function (tanh) are respectively used as the activation functions after the at least two hidden layers.
  7. The method of claim 6, characterized in that training the GCN model in step d comprises:
    organizing K data sets according to the input and output requirements of the GCN model and the different prediction windows, each data set being divided into a training set and a validation set;
    training the constructed GCN model separately with the training set under each prediction window.
  8. The method of claim 7, characterized in that step e specifically comprises:
    inputting the validation set under each prediction window into the corresponding trained GCN model to obtain the prediction results for week t in the future;
    evaluating prediction performance using the hit rate, where the hit rate of the prediction results for week t is defined as follows:
    $$\mathrm{HitRate}_{m,t} = \frac{N_{m,t}}{N_t}$$
    where N_{m,t} is the sum of the actual case counts of the top m% high-risk townships when all townships within the city are ranked from highest to lowest by their predicted case counts for week t, and N_t is the total actual number of cases in the city in week t.
  9. A system for spatiotemporal prediction of dengue fever within a city, characterized in that the system comprises a preprocessing unit, a graph structure construction unit, a selection unit, and a model construction unit, wherein:
    the preprocessing unit is configured to collect dengue fever-related data within the city and preprocess it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
    the graph structure construction unit is configured to construct a graph structure that reflects the spatial relationships among the regions within the city;
    the selection unit is configured to select the input features for spatiotemporal prediction of dengue fever;
    the model construction unit is configured to construct and train a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
  10. The system of claim 9, characterized in that the system further comprises:
    an evaluation unit configured to evaluate the prediction performance of the GCN model.
  11. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the intra-urban dengue fever spatiotemporal prediction method of any one of claims 1 to 8:
    Step a: collect dengue fever-related data within the city and preprocess it, where the data include the dengue fever case data, meteorological data, population distribution data, and township vector files of the city under study;
    Step b: construct a graph structure that reflects the spatial relationships among the regions within the city;
    Step c: select the input features for spatiotemporal prediction of dengue fever;
    Step d: construct and train a GCN model based on the preprocessed intra-urban dengue fever-related data, the constructed graph structure, and the selected input features, so that the GCN model can be used for spatiotemporal prediction of dengue fever.
PCT/CN2020/139657 2020-04-27 2020-12-25 Intra-urban dengue fever spatio-temporal forecasting method and system, and electronic device WO2021218207A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010346736.XA CN111554408B (en) 2020-04-27 2020-04-27 City internal dengue space-time prediction method, system and electronic equipment
CN202010346736.X 2020-04-27

Publications (1)

Publication Number Publication Date
WO2021218207A1 true WO2021218207A1 (en) 2021-11-04

Family

ID=72004089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/139657 WO2021218207A1 (en) 2020-04-27 2020-12-25 Intra-urban dengue fever spatio-temporal forecasting method and system, and electronic device

Country Status (2)

Country Link
CN (1) CN111554408B (en)
WO (1) WO2021218207A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115036040A (en) * 2022-06-28 2022-09-09 福州大学 Epidemic situation space-time early warning method fusing population of fever and population background data
CN118016318A (en) * 2024-04-08 2024-05-10 中国科学院地理科学与资源研究所 Method for constructing zoonosis risk prediction model based on graph neural network

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111554408B (en) * 2020-04-27 2024-04-19 中国科学院深圳先进技术研究院 City internal dengue space-time prediction method, system and electronic equipment
CN112185566B (en) * 2020-10-14 2021-08-13 上海玺翎智能科技有限公司 Method for predicting and early warning sudden increase of hospitalization population of infectious diseases based on machine learning
CN112397205A (en) * 2020-12-08 2021-02-23 中国气象局广州热带海洋气象研究所 Dengue fever infectious disease prediction method based on meteorological model
CN114464329A (en) * 2021-12-31 2022-05-10 中国科学院深圳先进技术研究院 Urban epidemic situation space-time prediction method, system, terminal and storage medium
CN114360739B (en) * 2022-01-05 2023-07-21 中国科学院地理科学与资源研究所 Dengue risk prediction method based on remote sensing cloud computing and deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682188A (en) * 2011-03-15 2012-09-19 中国科学院遥感应用研究所 City-wide infectious disease simulation method and device
CN108172301A (en) * 2018-01-31 2018-06-15 中国科学院软件研究所 Mosquito-borne infectious disease epidemic prediction method and system based on gradient boosted trees
CN109859854A (en) * 2018-12-17 2019-06-07 中国科学院深圳先进技术研究院 Infectious disease prediction method, device, electronic equipment and computer-readable medium
CN111554408A (en) * 2020-04-27 2020-08-18 中国科学院深圳先进技术研究院 Urban interior dengue space-time prediction method and system and electronic equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3995424B2 (en) * 2001-03-16 2007-10-24 株式会社パスコ Infectious disease transmission analysis system and its transmission simulation system
US20170103172A1 (en) * 2015-10-07 2017-04-13 The Arizona Board Of Regents On Behalf Of The University Of Arizona System And Method To Geospatially And Temporally Predict A Propagation Event
CN109545386B (en) * 2018-11-02 2021-07-20 深圳先进技术研究院 Influenza spatiotemporal prediction method and device based on deep learning
CN110136842A (en) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Morbidity prediction technique, device and the computer readable storage medium of acute infectious disease
CN110459329B (en) * 2019-07-11 2022-11-18 广东省公共卫生研究院 Dengue fever risk comprehensive assessment method
CN110610767B (en) * 2019-08-01 2023-06-02 平安科技(深圳)有限公司 Morbidity monitoring method, device, equipment and storage medium
CN110993119B (en) * 2020-03-04 2020-07-07 同盾控股有限公司 Epidemic situation prediction method and device based on population migration, electronic equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Incomplete summary of existing variants of GCN (application in spatio-temporal data mining)", JIANSHU, 4 March 2020 (2020-03-04), XP055865300, Retrieved from the Internet <URL:https://www.jianshu.com/p/da48a2fb4265> [retrieved on 20211124] *
STORM: "Application of GCN in the forecast of shared bicycle traffic", ZHIHU (GRAPH ALGORITHM AND APPLICATION), 9 August 2019 (2019-08-09), XP055865294, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/77209846?from_voters_page=true> [retrieved on 20211124] *

Also Published As

Publication number Publication date
CN111554408B (en) 2024-04-19
CN111554408A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
WO2021218207A1 (en) Intra-urban dengue fever spatio-temporal forecasting method and system, and electronic device
Meyer et al. Power-law models for infectious disease spread
Kraemer et al. Past and future spread of the arbovirus vectors Aedes aegypti and Aedes albopictus
JP2019108798A (en) Estimate of damage prevention with architectural structure renovation
McKee et al. Locally adaptive, spatially explicit projection of US population for 2030 and 2050
CN106021508A (en) Sudden event emergency information mining method based on social media
US12009106B2 (en) Emergency demand prediction device, emergency demand prediction method, and program
WO2023123624A1 (en) Method and system for predicting influenza outbreak trend in city, and terminal and storage medium
CN109242170A (en) A kind of City Road Management System and method based on data mining technology
Jiang et al. Unraveling the dynamic impacts of COVID-19 on metro ridership: An empirical analysis of Beijing and Shanghai, China
CN113496781A (en) Urban internal infectious disease diffusion simulation method and system and electronic equipment
Shi et al. Analysis of trip generation rates in residential commuting based on mobile phone signaling data
Ponce-de-Leon et al. COVID-19 Flow-Maps an open geographic information system on COVID-19 and human mobility for Spain
Xia et al. Synthesis of a high resolution social contact network for Delhi with application to pandemic planning
CN115730763A (en) Method and device for calculating accessibility of facility in workday based on terminal signaling data
Almquist et al. Point process models for household distributions within small areal units
CN116703132B (en) Management method and device for dynamic scheduling of shared vehicles and computer equipment
CN112651574A (en) P median genetic algorithm-based addressing method and device and electronic equipment
CN116452035A (en) Community living material supply toughness assessment method and system based on network survivability
CN113408867B (en) Urban burglary crime risk assessment method based on mobile phone user and POI data
CN115456238A (en) Urban trip demand prediction method based on dynamic multi-view coupling graph convolution
Ganti et al. Spatio-temporal spread of events in social networks: A gas shortage case study
Haddawy et al. Spatiotemporal Bayesian networks for malaria prediction: case study of northern Thailand
Kim et al. Examining the socio-spatial patterns of bus shelters with deep learning analysis of street-view images: A case study of 20 cities in the US
CN110428627A (en) A kind of bus trip potentiality area recognizing method and identifying system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20933882

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20933882

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20/04/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20933882

Country of ref document: EP

Kind code of ref document: A1