WO2023123625A1 - Urban epidemic space-time prediction method and system, terminal and storage medium - Google Patents

Urban epidemic space-time prediction method and system, terminal and storage medium

Info

Publication number
WO2023123625A1
Authority
WO
WIPO (PCT)
Prior art keywords
infectious disease
relationship
similarity
case data
urban
Prior art date
Application number
PCT/CN2022/076295
Other languages
French (fr)
Chinese (zh)
Inventor
李子垠
尹凌
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2023123625A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • The application belongs to the technical field of infectious disease computing, and in particular relates to a method, system, terminal and storage medium for spatio-temporal prediction of urban epidemics.
  • Infectious disease forecasting has gained prominence in recent years due to the increased availability of public health surveillance data and the development of sophisticated methods.
  • Real-time prediction of disease infections remains hampered by a lack of current estimates of mobility and interaction patterns, which are key drivers of disease transmission. Predicting the incidence trend of infectious diseases is an important problem in public health: timely detection, tracking and prediction of key information such as peak intensity and outbreak time are crucial for formulating prevention and control strategies and implementing interventions.
  • Infectious disease models can be divided into two categories: mechanistic models and non-mechanistic models.
  • Existing infectious disease prediction modeling methods include Gaussian process-based models, neural network models based on LSTM (Long Short-Term Memory), and the like.
  • Existing infectious disease prediction modeling methods also include those based on traditional statistical models and those based on traditional machine learning. Methods based on traditional statistical models require the data to satisfy strict assumptions, such as the stationarity of the time series. In the real world, however, these assumptions are not easy to satisfy.
  • The core of mechanistic models is the transmission dynamics model, most commonly the compartment model.
  • The compartment model divides the population (or other hosts) in the same unit into compartments according to infection status, and then simulates the disease dynamics across compartments based on the characteristics of disease transmission. Typical representatives are the susceptible-infected-recovered (SIR) and susceptible-exposed-infected-recovered (SEIR) models.
  • Such models are often used to predict the long-term development trend of a disease and to evaluate the effect of different interventions. However, they are complex to build and slow to update, and uncertainty in the model parameters greatly affects accuracy, making real-time, accurate prediction of short-term infectious disease trends difficult.
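To make the compartment idea concrete, the following is a minimal sketch of an SIR model integrated with a forward-Euler step. It is background illustration only, not part of the claimed method; the function name `simulate_sir`, the parameter values and the step size are assumptions chosen for the example.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of the SIR compartment model.

    beta  : transmission rate per day (hypothetical value in the example)
    gamma : recovery rate per day (hypothetical value in the example)
    Returns one (S, I, R) tuple per day, starting with day 0.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = int(round(1 / dt))
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt   # S -> I
            new_recoveries = gamma * i * dt          # I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Example run: 10 initial infections in a population of 10,000.
history = simulate_sir(9990, 10, 0, beta=0.3, gamma=0.1, days=160)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("day with most infected:", peak_day)
```

The result depends heavily on `beta` and `gamma`, which illustrates the parameter-uncertainty problem described above.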
  • Non-mechanistic models include statistical models and machine learning models.
  • Traditional statistical models require the data to satisfy strict assumptions (e.g., stationarity of the time series). In the real world, however, these assumptions are not easy to satisfy.
  • Although traditional machine learning-based infectious disease prediction methods are data-driven and do not need to meet strict data assumptions, most of them still rely on feature engineering and cannot achieve accurate predictions on difficult prediction problems.
  • In addition, these methods usually ignore features from the spatial dimension and therefore cannot model the spatio-temporal dependencies contained in spatio-temporal data, which limits their predictive performance.
  • Infectious disease prediction methods based on deep learning can automatically learn nonlinear features from spatio-temporal data, which gives them a larger performance advantage.
  • Some studies have designed deep learning prediction frameworks from multiple perspectives to better model the spatio-temporal dependencies contained in spatio-temporal data and thereby improve predictive performance.
  • However, when modeling the spatial dependencies between regions, most of these studies did not take into account the spatial interaction caused by population movement or the spatio-temporal diffusion pattern of infectious diseases within cities, so they cannot accurately predict the epidemic trend of infectious diseases.
  • The present application provides a spatio-temporal prediction method, system, terminal and storage medium for urban epidemics, aiming to solve, at least to some extent, one of the above technical problems in the prior art.
  • A spatio-temporal prediction method for urban epidemics, comprising:
  • inputting the intra-city infectious disease case data into the infectious disease spatio-temporal prediction model, and obtaining the intra-city infectious disease prediction results from that model.
  • The technical solution adopted in the embodiment of the present application also includes: dividing the population movement flow according to the time attribute, and obtaining the population movement relationship under each time attribute based on the proximity relationship of the regions, specifically:
  • The technical solution adopted in the embodiment of the present application also includes: calculating the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtaining the location attention relationship of each region according to the similarity of infectious disease case data, including:
  • The technical solution adopted in the embodiment of the present application also includes: the calculation formula of the location attention relationship is:
  • Here the threshold is the cutoff for establishing a connection under the histogram-based similarity algorithm: when the similarity $w_{i,j}$ between region i and region j is higher than the threshold, an edge is created between region i and region j, which finally yields the adjacency matrix of the location attention relationship for the regions in the city.
  • The technical solution adopted in the embodiment of the present application also includes: constructing the spatio-temporal prediction model of infectious diseases based on the graph neural network and long short-term memory network according to the population movement relationship, proximity relationship and location attention relationship includes:
  • The graph structure is input into the graph neural network, which uses neighborhood aggregation to normalize the directed graph so that the incoming edge weights of each node sum to 1:
  • $H^{i}$ is a matrix containing the node representations of the previous layer; the initial $H^{0}$ is set to the historical data of infectious disease case changes in each region; $W^{i}$ is the trainable parameter matrix of the i-th layer; and f is the nonlinear activation function.
  • The technical solution adopted in the embodiment of the present application also includes: constructing the spatio-temporal prediction model of infectious diseases based on the graph neural network and the long short-term memory network according to the population movement relationship, proximity relationship and location attention relationship further includes:
  • $X_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \ldots, h_{i,t-1})$
  • $X_{i,t}$ represents the predicted infectious disease case data of the i-th region in the t-th time period, and $h_{i,t-1}$ represents the infectious disease case data of the i-th region in time period t-1.
  • The technical solution adopted in the embodiment of the present application also includes: inputting the infectious disease case data in the city into the infectious disease spatio-temporal prediction model and obtaining the urban infectious disease prediction result from that model, specifically:
  • The fusion method is specifically:
  • $W_{adj}$, $W_{od}$ and $W_{at}$ are the parameter matrices to be trained; the three fused inputs are the infectious disease prediction results at time t based on the adjacency matrix, the population movement flow matrix and the location attention matrix, respectively; tanh is the activation function; and the output is the final prediction result of the entire spatio-temporal prediction model at time t.
  • A spatio-temporal prediction system for urban epidemics, including:
  • Data collection module: used to collect individual movement trajectory data and infectious disease case data in the city;
  • Flow calculation module: used to process the individual movement trajectory data through a data-driven method, extract the population movement flow between regions, divide the population movement flow according to the time attribute, and obtain the population movement relationship under each time attribute based on the proximity relationship of the regions;
  • Similarity calculation module: used to calculate the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtain the location attention relationship of each region according to the similarity of the infectious disease case data;
  • Model building module: used to construct a spatio-temporal prediction model of infectious diseases based on a graph neural network and a long short-term memory network according to the population movement relationship, proximity relationship and location attention relationship;
  • Infectious disease prediction module: used to input the urban infectious disease case data into the infectious disease spatio-temporal prediction model, and obtain the urban infectious disease prediction results from that model.
  • A terminal includes a processor and a memory coupled to the processor, wherein:
  • the memory stores program instructions for implementing the spatio-temporal prediction method for urban epidemics;
  • the processor is used to execute the program instructions stored in the memory to control the spatio-temporal prediction of urban epidemics.
  • A storage medium stores program instructions executable by a processor, the program instructions being used to execute the spatio-temporal prediction method for urban epidemics.
  • The beneficial effects of the embodiment of the present application are as follows: the urban epidemic spatio-temporal prediction method, system, terminal and storage medium extract the urban population movement flow from individual movement trajectory data in the city, divide the flow within a given period according to time attributes, obtain the proximity relationship, population movement relationship and location attention relationship between regions from the flows of the different time attributes, and construct from these three relationships a spatio-temporal prediction model of infectious diseases based on a graph neural network and a long short-term memory network, which can make fine-grained predictions of infectious disease trends.
  • FIG. 1 is a flow chart of the urban epidemic spatio-temporal prediction method of the embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of the urban epidemic spatio-temporal prediction system of the embodiment of the present application;
  • FIG. 3 is a schematic diagram of a terminal structure in an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • The urban epidemic spatio-temporal prediction method of the embodiment of this application constructs a spatio-temporal prediction model based on a graph neural network and a long short-term memory network from the proximity relationship, population movement relationship and location attention relationship between the regions of the city, in order to better model the temporal and spatial dependencies in spatio-temporal data. When modeling spatial dependence, it considers multiple spatial relationships between regions and constructs a graph structure for each relationship.
  • When modeling temporal dependence, it makes full use of the local spatial context of each region to improve prediction performance. It also divides the population movement flow into different time attributes according to the movement patterns of groups susceptible to infectious disease, and constructs directed graph structures with different time attributes, improving the spatio-temporal awareness of the model and further improving its prediction performance.
  • FIG. 1 is a flow chart of the spatio-temporal prediction method for urban epidemic situation in the embodiment of the present application.
  • the urban epidemic spatio-temporal prediction method of the embodiment of the present application includes the following steps:
  • S1: Obtain the infectious disease case data in the city, and use mobile devices to collect individual movement trajectory data in the city;
  • The infectious disease case data include, but are not limited to, the number of infected cases, the number of deaths, clinical symptoms, and information on hospitalized patients.
  • Individual movement trajectory data include, but are not limited to, each individual's mobile phone number, signaling timestamp, and mobile phone location data.
  • Mobile devices include, but are not limited to, devices such as cell phones or smart watches.
  • S2: Process the individual movement trajectory data through a data-driven method, extract the population movement flow between regions, and obtain the urban population movement relationship;
  • The data-driven processing of the individual movement trajectory data is as follows: extract the population movement flow between regions from large-scale mobile phone location data, and capture the urban population movement relationship.
  • S3: Divide the population movement flow according to the time attribute, construct a directed graph of the population movement flow for each time attribute, determine whether an edge exists between two nodes of the directed graph based on the proximity of the regions, and generate the adjacency matrix of the population movement relationship for each time attribute;
  • The time attribute includes, but is not limited to, weekdays, weekends or holidays.
  • Given a city, a graph structure is created with day, week or month as the time interval unit, the population movement flow is divided by time attribute into two flows (weekdays and weekends), and the flow under each time attribute is represented as a directed graph.
  • The weight of the edge from vertex v to vertex u represents the total number of people who moved from area v to area u on weekdays of week t; this yields the adjacency matrices of the population movement relationship under the two time attributes, weekdays and weekends, in week t.
  • The embodiment of the present application also considers the spatial relationship between things that are close to each other, that is, the proximity relationship. Based on the adjacency of regions, whether an edge exists between two regions (nodes) is determined by whether the two regions touch each other, which yields the adjacency matrix $A_{adj}$ of the population movement relationship corresponding to the proximity relationship.
  • The weight of the edge from vertex v to vertex u indicates the total number of individuals who move from area v to area u on weekdays.
  • The directed graph G can thus contain the population flow behavior of each pair of adjacent areas in the city.
  • The weekday mobility between areas v and u forms an edge; multiplying its weight by the number of cases in area v during that period gives a relative score indicating how many infected people are likely to flow from area v to area u.
  • The case change patterns of two adjacent areas may nevertheless not be linked by mobility. For example, if areas i and j are adjacent, the flow of area i is fixed and always zero, and the flow of area j is dynamic and non-zero, then there is no correlation between the case change patterns of these two adjacent areas.
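Step S3 can be sketched as follows, under stated assumptions: trips are already aggregated as `(date, origin, destination, people)` records, the proximity relationship is given as a set of touching region pairs, and holidays are ignored. The names `build_flow_matrices` and `relative_scores` and all data are hypothetical and only illustrate the weekday/weekend split and the flow-times-cases score described above.

```python
from datetime import date


def build_flow_matrices(trips, regions, touching):
    """Return {'weekday': A, 'weekend': A}, where A[v][u] is the number of
    people who moved from region v to region u. An edge is kept only when
    the two regions touch each other (proximity relationship)."""
    index = {name: k for k, name in enumerate(regions)}
    n = len(regions)
    flows = {key: [[0] * n for _ in range(n)] for key in ("weekday", "weekend")}
    for day, origin, dest, people in trips:
        if origin == dest or frozenset((origin, dest)) not in touching:
            continue  # no edge between non-adjacent regions
        key = "weekend" if day.weekday() >= 5 else "weekday"
        flows[key][index[origin]][index[dest]] += people
    return flows


def relative_scores(flow, cases):
    """score[v][u] = flow from v to u multiplied by the case count in v."""
    n = len(flow)
    return [[flow[v][u] * cases[v] for u in range(n)] for v in range(n)]


regions = ["A", "B", "C"]
touching = {frozenset(("A", "B")), frozenset(("B", "C"))}
trips = [
    (date(2022, 1, 3), "A", "B", 120),  # Monday
    (date(2022, 1, 4), "A", "B", 80),   # Tuesday
    (date(2022, 1, 5), "A", "C", 50),   # dropped: A and C do not touch
    (date(2022, 1, 8), "B", "A", 30),   # Saturday
]
flows = build_flow_matrices(trips, regions, touching)
scores = relative_scores(flows["weekday"], cases=[5, 2, 0])
print(flows["weekday"], flows["weekend"], scores)
```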
  • S4: Construct histograms of the infectious disease case data, use the histogram-based similarity algorithm to calculate the similarity of the infectious disease case data between regions, and obtain the adjacency matrix representing the location attention relationship between all regions according to that similarity;
  • The correlation between two regions may be affected by geographical distance: adjacent regions may have similar terrain or climate and therefore similar infectious disease outbreak trends.
  • Non-contiguous areas may also have latent dependencies due to population movement and similar geographic features, but it is difficult to model every factor relevant to an outbreak. Therefore, the present invention uses a histogram-based similarity algorithm to calculate the correlation of infectious disease case data between regions. If the similarity of the case data in two regions is relatively high, their case change patterns will be relatively similar. The embodiment of the present application calculates the similarity of the case data in such non-adjacent but highly correlated regions in order to capture the trend of case changes from a more global spatial perspective.
  • The histogram-based similarity algorithm is as follows: first, construct a histogram for the time series of infectious disease case data samples in each region; second, calculate the similarity of the infectious disease case data in two regions. If the similarity is higher than the set threshold, the outbreak trends in the two regions are considered similar and strongly correlated, and an edge is constructed between the two regions in the directed graph, which generates an adjacency matrix representing the location attention relationship between all regions. The calculation is as follows:
  • An edge is created when the similarity $w_{i,j}$ exceeds the threshold for establishing a connection edge under the histogram-based similarity algorithm.
  • The histogram construction algorithm is as follows:
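The page does not reproduce the histogram construction or the similarity formula, so the following is only a sketch under explicit assumptions: equal-width bins over a shared value range, histograms normalized to sum to 1, and histogram intersection as the similarity measure. The function names, the bin count and the threshold value are all hypothetical; the source may use a different construction.

```python
def build_histogram(series, bins, lo, hi):
    """Equal-width, normalized histogram of a case-count time series
    (assumed construction, not taken from the source)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in series:
        k = min(int((value - lo) / width), bins - 1)  # top edge goes in last bin
        counts[k] += 1
    return [c / len(series) for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def attention_adjacency(case_series, threshold, bins=5):
    """adj[i][j] = 1 when the similarity between regions i and j exceeds
    the threshold, else 0 (no self-loops in this sketch)."""
    lo = min(min(s) for s in case_series)
    hi = max(max(s) for s in case_series)
    if hi == lo:
        hi = lo + 1  # avoid zero-width bins when all values are equal
    hists = [build_histogram(s, bins, lo, hi) for s in case_series]
    n = len(hists)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and histogram_intersection(hists[i], hists[j]) > threshold:
                adj[i][j] = 1
    return adj


# Regions 0 and 1 have similar case distributions; region 2 does not.
series = [
    [0, 1, 2, 3, 4, 10],
    [0, 1, 2, 3, 5, 10],
    [10, 9, 10, 8, 9, 10],
]
adj_at = attention_adjacency(series, threshold=0.8)
print(adj_at)
```

Because the edge depends only on case histories, two regions that do not touch can still be connected, which is the point of the location attention relationship.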
  • S5: Use the proximity relationship, population movement relationship and location attention relationship to train the spatio-temporal prediction model of infectious diseases based on Graph Neural Networks (GNN) and Long Short-Term Memory (LSTM);
  • The graph neural network follows a neighborhood aggregation strategy: the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighbor nodes.
  • The framework of the graph neural network is the Message Passing Neural Network (MPNN).
  • The MPNN is a general framework for spatial graph convolution.
  • The graph neural network uses the following neighborhood aggregation method to normalize the input directed graph structure matrix A:
  • $H^{i}$ is a matrix containing the node representations of the previous layer;
  • the initial $H^{0}$ is set to the historical data of infectious disease case changes in each region;
  • $W^{i}$ is the trainable parameter matrix of the i-th layer;
  • f is the nonlinear activation function, such as the ReLU function.
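The aggregation formula itself is missing from this page. A common form for such a layer, assumed here rather than taken from the source, is $H^{i+1} = f(\tilde{A} H^{i} W^{i+1})$ with $\tilde{A}$ normalized so that the incoming edge weights of each node sum to 1. The sketch below implements one such layer in plain Python; the convention that `adj[v][u]` is the weight of the edge from v to u, and all names and numbers, are assumptions.

```python
def normalize_incoming(adj):
    """Scale each column so the weights of edges entering node u sum to 1
    (columns with no incoming edges are left at zero)."""
    n = len(adj)
    col_sums = [sum(adj[v][u] for v in range(n)) for u in range(n)]
    return [
        [adj[v][u] / col_sums[u] if col_sums[u] else 0.0 for u in range(n)]
        for v in range(n)
    ]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(0.0, x) for x in row] for row in m]


def aggregate(adj, h, w):
    """One neighborhood-aggregation layer: each node averages the features
    of the nodes that send flow into it, then applies W and ReLU."""
    norm = normalize_incoming(adj)
    incoming = [list(col) for col in zip(*norm)]  # row u: weights of edges into u
    return relu(matmul(matmul(incoming, h), w))


adj = [[0, 3, 0],
       [1, 0, 0],
       [1, 1, 0]]          # adj[v][u]: flow from region v to region u
h0 = [[4.0], [2.0], [8.0]]  # one feature per region, e.g. recent case count
w1 = [[1.0]]                # hypothetical (untrained) parameter matrix
h1 = aggregate(adj, h0, w1)
print(h1)
```

In a trained model `w1` would be learned and several layers could be stacked; the identity weight here only makes the averaging visible.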
  • The long short-term memory network is a special kind of recurrent neural network (RNN) that can be used to process sequence data.
  • The long short-term memory network is mainly intended to address vanishing and exploding gradients when training on long sequences.
  • One MPNN is applied at each time step to obtain the representation sequence $h_{i,t-n}, h_{i,t-n+1}, \ldots, h_{i,t-1}$.
  • The LSTM calculation formula is expressed as: $X_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \ldots, h_{i,t-1})$
  • $X_{i,t}$ represents the infectious disease case data predicted for the i-th region in the t-th time period;
  • $h_{i,t-1}$ represents the infectious disease case data of the i-th region in time period t-1.
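The recurrence behind the LSTM formula can be sketched with a single-unit cell in plain Python. This is an illustration of the standard LSTM gate equations, not the model of the source: the scalar input and hidden state, the untrained weights in `params`, and the linear read-out `w_out`, `b_out` are all hypothetical. A real implementation would normally use a deep learning library's LSTM layer.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell with scalar input and state.
    params maps each gate name to (input weight, recurrent weight, bias)."""
    pre = {name: w_x * x + w_h * h_prev + b
           for name, (w_x, w_h, b) in params.items()}
    i = sigmoid(pre["i"])    # input gate
    f = sigmoid(pre["f"])    # forget gate
    o = sigmoid(pre["o"])    # output gate
    g = math.tanh(pre["g"])  # candidate cell value
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def lstm_predict(sequence, params, w_out, b_out):
    """Run the cell over h_{t-n}, ..., h_{t-1} and read out a prediction."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return w_out * h + b_out


# Hypothetical untrained weights and a short normalized input sequence.
params = {"i": (0.5, 0.1, 0.0), "f": (0.3, 0.2, 1.0),
          "o": (0.4, 0.1, 0.0), "g": (0.8, 0.3, 0.0)}
sequence = [0.10, 0.15, 0.22, 0.30]
prediction = lstm_predict(sequence, params, w_out=10.0, b_out=0.0)
print(prediction)
```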
  • The present invention adopts a fusion method based on parameter matrices, specifically:
  • $W_{adj}$, $W_{od}$ and $W_{at}$ are the parameter matrices to be trained; the three fused inputs are the infectious disease prediction results at time t based on the adjacency matrix, the population movement flow matrix and the location attention matrix, respectively; tanh is the activation function; and the output is the final prediction result of the entire infectious disease spatio-temporal prediction model at time t.
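The fusion formula is not reproduced on this page. One plausible reading, assumed here, is a weighted sum of the three branch predictions passed through tanh, with the weights applied element-wise per region. The function `fuse` and every number below are hypothetical; the source may instead use full matrix products.

```python
import math


def fuse(pred_adj, pred_od, pred_at, w_adj, w_od, w_at):
    """Element-wise parametric fusion of three per-region predictions:
    tanh(w_adj * pred_adj + w_od * pred_od + w_at * pred_at)."""
    return [
        math.tanh(wa * pa + wo * po + wt * pt)
        for pa, po, pt, wa, wo, wt in zip(pred_adj, pred_od, pred_at,
                                          w_adj, w_od, w_at)
    ]


# Two regions; branch predictions and weights are made-up illustrative values.
fused = fuse(
    pred_adj=[0.2, 0.5], pred_od=[0.1, 0.4], pred_at=[0.3, 0.6],
    w_adj=[0.5, 0.5], w_od=[0.3, 0.3], w_at=[0.2, 0.2],
)
print(fused)
```

Since tanh maps into (-1, 1), this reading implies that case counts are scaled to that range before training and rescaled afterwards, which is also an assumption of the sketch.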
  • The model parameters can be updated by learning from new data after multiple predictions, making the model more intelligent and efficient.
  • The urban epidemic spatio-temporal prediction method of the embodiment of the present application extracts the urban population movement flow from individual movement trajectory data in the city, divides the flow within a given period according to the time attribute, and obtains the proximity relationship, population movement relationship and location attention relationship between regions from the flows of the different time attributes. From these three relationships it constructs a spatio-temporal prediction model of infectious diseases based on a graph neural network and a long short-term memory network, and uses that model to make fine-grained forecasts of infectious disease trends.
  • FIG. 2 is a schematic structural diagram of the spatio-temporal prediction system of the urban epidemic situation in the embodiment of the present application.
  • The spatio-temporal prediction system 40 for urban epidemics in the embodiment of the present application includes:
  • Data collection module 41: used to collect individual movement trajectory data and infectious disease case data in the city;
  • Flow calculation module 42: used to process the individual movement trajectory data through a data-driven method, extract the population movement flow between regions, divide the population movement flow according to time attributes, and obtain the population movement relationship under each time attribute based on the proximity of the regions;
  • Similarity calculation module 43: used to calculate the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtain the location attention relationship of each region according to that similarity;
  • Model building module 44: used to construct a spatio-temporal prediction model of infectious diseases based on a graph neural network and a long short-term memory network according to the population movement relationship, proximity relationship and location attention relationship;
  • Infectious disease prediction module 45: used to input the infectious disease case data in the city into the infectious disease spatio-temporal prediction model, and obtain the urban infectious disease prediction result from that model.
  • FIG. 3 is a schematic diagram of a terminal structure in an embodiment of the present application.
  • The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
  • The memory 52 stores program instructions for implementing the above spatio-temporal prediction method for urban epidemics.
  • The processor 51 is used to execute the program instructions stored in the memory 52 to control the spatio-temporal prediction of urban epidemics.
  • The processor 51 may also be referred to as a CPU (Central Processing Unit).
  • The processor 51 may be an integrated circuit chip with signal processing capabilities.
  • The processor 51 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • FIG. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all the above methods. The program file 61 can be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the various embodiments of the present invention.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk or optical disc, as well as terminal devices such as computers, servers, mobile phones and tablets.

Abstract

The present application relates to an urban epidemic space-time prediction method and system, a terminal and a storage medium. The method comprises: collecting individual movement trajectory data and infectious disease case data in a city; processing the individual movement trajectory data, extracting population movement traffic between regions, dividing the population movement traffic according to time attributes, and obtaining a population movement relationship under time attributes on the basis of a proximity relationship of the regions; calculating the similarity of infectious disease case data between the regions by using a histogram-based similarity algorithm, and obtaining a position attention relationship of the regions according to the similarity of the infectious disease case data; and constructing a graph neural network and long short-term memory network-based infectious disease space-time prediction model according to the population movement relationship, the proximity relationship and the position attention relationship. According to the present application, time dependence and spatial dependence in infectious disease case data can be modeled, and the spatial perception capabilities and prediction performance of the space-time prediction model are improved.

Description

一种城市疫情时空预测方法、系统、终端以及存储介质A spatio-temporal prediction method, system, terminal and storage medium of urban epidemic situation 技术领域technical field
本申请属于传染病计算技术领域,特别涉及一种城市疫情时空预测方法、系统、终端以及存储介质。The application belongs to the technical field of infectious disease computing, and in particular relates to a method, system, terminal and storage medium for temporal and spatial prediction of urban epidemics.
背景技术Background technique
Infectious disease forecasting has gained prominence in recent years owing to the increased availability of public health surveillance data and the development of sophisticated methods. Real-time forecasting of infections nevertheless remains hampered by the lack of current estimates of mobility and interaction patterns, which are key drivers of disease transmission. Forecasting the incidence trend of infectious diseases is an important problem in public health: detecting, tracking and predicting key information such as peak intensity and outbreak time in good time is crucial for formulating prevention and control strategies and implementing interventions. According to the principle and purpose of modeling, infectious disease models fall into two categories: mechanistic models and non-mechanistic models.
Taking seasonal infectious diseases as an example, monitoring and forecasting infectious disease activity and making timely prevention and control preparations are crucial to controlling seasonal infectious diseases and pandemics. Existing infectious disease forecasting methods include models based on Gaussian processes, neural network models based on LSTM (Long Short-Term Memory), and the like. These methods are built either on traditional statistical models or on traditional machine learning. Methods based on traditional statistical models require the data to satisfy strict assumptions, such as stationarity of the time series, which are hard to satisfy in the real world. Most methods based on traditional machine learning still rely on feature engineering and cannot achieve accurate predictions on harder forecasting problems. In addition, these methods usually ignore features from the spatial dimension, so they cannot model the spatio-temporal dependencies contained in spatio-temporal data, which limits their predictive performance.
The core of mechanistic models is the transmission dynamics model, of which the compartmental model is the most widely used. A compartmental model assigns the population or other hosts in the same unit to compartments according to infection status, and then simulates the disease dynamics of each compartment based on the transmission characteristics of the disease. Typical examples are the susceptible-infected-recovered (SIR) and susceptible-exposed-infected-recovered (SEIR) models. Such models are usually used to predict the long-term trend of a disease and to evaluate the effects of different interventions. However, they are complex to build and slow to update, and the uncertainty of their parameters strongly affects their accuracy, which makes real-time, accurate prediction of short-term trends difficult.
Non-mechanistic models include statistical models and machine learning models. Traditional statistical models require the data to satisfy strict assumptions (for example, stationarity of the time series), which are hard to satisfy in the real world. Although forecasting methods based on traditional machine learning are data-driven and need no strict assumptions about the data, most of them still rely on feature engineering and cannot achieve accurate predictions on harder forecasting problems. Moreover, whether based on traditional statistical models or on traditional machine learning models, these methods usually ignore features from the spatial dimension during modeling and therefore cannot model the spatio-temporal dependencies contained in spatio-temporal data, which limits their predictive performance. Forecasting methods based on deep learning can automatically learn nonlinear features from spatio-temporal data and so have a larger performance advantage. Some studies have designed deep learning prediction frameworks from several perspectives to better model these spatio-temporal dependencies and improve predictive performance. However, when modeling the spatial dependencies between regions, most of these studies consider neither the spatial interaction caused by population movement nor the spatio-temporal diffusion pattern of infectious diseases within a city. Because this information is missing, such methods struggle to predict epidemic trends accurately within a city.
Summary of the invention
The present application provides a spatio-temporal prediction method, system, terminal and storage medium for urban epidemics, aiming to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the present application provides the following technical solutions:
A spatio-temporal prediction method for urban epidemics, comprising:
collecting individual movement trajectory data and infectious disease case data in a city;
processing the individual movement trajectory data by a data-driven method, extracting the population movement flow between regions, dividing the population movement flow according to time attributes, and obtaining the population movement relationship under each time attribute based on the proximity relationship of the regions;
calculating the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtaining the position attention relationship of the regions according to that similarity;
constructing an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the position attention relationship;
inputting the infectious disease case data of the city into the infectious disease spatio-temporal prediction model, and obtaining the infectious disease prediction results for the city from the model.
The technical solution adopted in the embodiments of the present application further includes: dividing the population movement flow according to time attributes and obtaining the population movement relationship under each time attribute based on the proximity relationship of the regions is specifically:
the time attributes include weekdays, weekends or holidays; the population movement flow of each time attribute is converted into a weighted directed graph G=(V,E), where V is the set of nodes and E is the set of edges, the vertices represent the regions within the city, and the edges capture movement patterns; and, based on the proximity relationship of the regions, whether two nodes of the directed graph are connected by an edge is determined by whether the two regions border each other, which yields the adjacency matrix of the population movement relationship corresponding to the proximity relationship.
The technical solution adopted in the embodiments of the present application further includes: calculating the similarity of infectious disease case data between regions using the histogram-based similarity algorithm and obtaining the position attention relationship of the regions according to that similarity comprises:
constructing a histogram for the infectious disease case data of each region;
calculating the similarity of infectious disease case data between two regions; if the similarity is higher than a set threshold, the outbreak trends of the two regions are considered similar and correlated, an edge is created between the two regions in the directed graph, and an adjacency matrix representing the position attention relationships between all regions is generated.
The technical solution adopted in the embodiments of the present application further includes: the position attention relationship is calculated as:
Figure PCTCN2022076295-appb-000001
where $\theta$ is the threshold at which the histogram-based similarity algorithm creates an edge; when the similarity $w_{i,j}$ between region i and region j is higher than $\theta$, an edge is created between region i and region j, which finally yields the adjacency matrix of the position attention relationship for the regions of the city.
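The formula itself survives here only as an image reference. Reading from the definitions in the text, one form consistent with the description would be the thresholding rule below. The symbol $A^{at}$ and the binary edge weight are assumptions made for illustration; the original may instead keep $w_{i,j}$ as the edge weight.

$$
A^{at}_{i,j} =
\begin{cases}
1, & w_{i,j} > \theta \\
0, & \text{otherwise}
\end{cases}
$$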
The technical solution adopted in the embodiments of the present application further includes: constructing the infectious disease spatio-temporal prediction model based on the graph neural network and the long short-term memory network according to the population movement relationship, the proximity relationship and the position attention relationship comprises:
inputting the graph structure into the graph neural network, which uses a neighborhood aggregation method and normalizes the directed graph so that the incoming edge weights of each node sum to 1:
Figure PCTCN2022076295-appb-000002
where $H_i$ is a matrix containing the node representations of the previous layer, the initial $H_0$ is set to the historical data describing the change of infectious disease cases in each region, $W_i$ is the trainable parameter matrix of the i-th layer, and f is a nonlinear activation function.
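The layer equation is likewise present only as an image reference. A propagation rule consistent with the symbols defined in the text is sketched below; it is the standard message-passing form and an assumption, not a transcription of the original figure. Here $\tilde{A}$ is a hypothetical name for the normalized graph matrix, oriented so that row u collects the incoming edges of node u.

$$
H_{i+1} = f\left(\tilde{A}\, H_i\, W_i\right), \qquad \sum_{v} \tilde{A}_{u,v} = 1 \ \text{for every node } u
$$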
The technical solution adopted in the embodiments of the present application further includes: constructing the infectious disease spatio-temporal prediction model based on the graph neural network and the long short-term memory network according to the population movement relationship, the proximity relationship and the position attention relationship further comprises:
using a message passing neural network at each time step to obtain a sequence of representations $h_{i,t-n}, h_{i,t-n+1}, \ldots, h_{i,t-1}$, and inputting this sequence into the long short-term memory network to extract its temporal relationships; the long short-term memory network is computed as:
$X_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \ldots, h_{i,t-1})$
where $X_{i,t}$ is the infectious disease case data predicted for the i-th region in the t-th time period, and $h_{i,t-1}$ is the infectious disease case data of the i-th region in the (t-1)-th time period.
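To make the recurrence behind the LSTM term concrete, the following is a minimal sketch of a single-unit LSTM run over a scalar sequence. It uses the standard LSTM gate equations; the function name `lstm_last_hidden`, the scalar simplification, and all weight values are hypothetical and untrained, and a real model would add a trained output layer to map the final hidden state to a case count.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_last_hidden(sequence, params):
    """Run a single-unit LSTM over a scalar sequence and return the final hidden state.

    params maps each gate name ("i", "f", "o", "g") to a tuple (w_x, w_h, b).
    All weights here are illustrative assumptions, not trained values.
    """
    h, c = 0.0, 0.0
    for x in sequence:
        pre = {k: w_x * x + w_h * h + b for k, (w_x, w_h, b) in params.items()}
        i = sigmoid(pre["i"])    # input gate
        f = sigmoid(pre["f"])    # forget gate
        o = sigmoid(pre["o"])    # output gate
        g = math.tanh(pre["g"])  # candidate cell value
        c = f * c + i * g        # update the cell state
        h = o * math.tanh(c)     # expose part of the cell state as hidden state
    return h

params = {"i": (0.5, 0.1, 0.0), "f": (0.3, 0.1, 1.0),
          "o": (0.4, 0.1, 0.0), "g": (0.6, 0.2, 0.0)}
h_seq = [0.2, 0.5, 0.9]  # hypothetical per-step representations of one region
x_pred = lstm_last_hidden(h_seq, params)
```

Because the hidden state is an output gate times a tanh, the returned value always lies strictly between -1 and 1, which is why a separate output layer is needed before it can be read as a case count.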
The technical solution adopted in the embodiments of the present application further includes: inputting the infectious disease case data of the city into the infectious disease spatio-temporal prediction model and obtaining the infectious disease prediction results for the city from the model is specifically:
fusing the proximity relationship, the population movement relationship and the position attention relationship through the infectious disease spatio-temporal prediction model to obtain the urban infectious disease prediction result; the fusion is specifically:
Figure PCTCN2022076295-appb-000003
where $W_{adj}$, $W_{od}$ and $W_{at}$ are the parameter matrices to be trained,
Figure PCTCN2022076295-appb-000004
and
Figure PCTCN2022076295-appb-000005
are the infectious disease prediction results at time t obtained from the adjacency matrix, the population movement flow matrix and the position attention matrix respectively, tanh is the activation function, and
Figure PCTCN2022076295-appb-000006
is the final prediction result of the whole spatio-temporal prediction model at time t.
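The fusion formula is available only as an image reference. One form consistent with the quantities named in the text (three trainable matrices, three per-relationship predictions, and a tanh activation) is the weighted sum below. The symbols $\hat{Y}^{adj}_t$, $\hat{Y}^{od}_t$, $\hat{Y}^{at}_t$ and $\hat{Y}_t$ are hypothetical names, and the additive structure is an assumption rather than a transcription of the original figure.

$$
\hat{Y}_t = \tanh\left(W_{adj}\,\hat{Y}^{adj}_t + W_{od}\,\hat{Y}^{od}_t + W_{at}\,\hat{Y}^{at}_t\right)
$$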
Another technical solution adopted in the embodiments of the present application is: a spatio-temporal prediction system for urban epidemics, comprising:
a data collection module, used to collect individual movement trajectory data and infectious disease case data in a city;
a flow calculation module, used to process the individual movement trajectory data by a data-driven method, extract the population movement flow between regions, divide the population movement flow according to time attributes, and obtain the population movement relationship under each time attribute based on the proximity relationship of the regions;
a similarity calculation module, used to calculate the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and to obtain the position attention relationship of the regions according to that similarity;
a model construction module, used to construct an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the position attention relationship;
an infectious disease prediction module, used to input the infectious disease case data of the city into the infectious disease spatio-temporal prediction model and to obtain the infectious disease prediction results for the city from the model.
Yet another technical solution adopted in the embodiments of the present application is: a terminal comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the spatio-temporal prediction method for urban epidemics;
the processor is used to execute the program instructions stored in the memory to control the spatio-temporal prediction of urban epidemics.
Yet another technical solution adopted in the embodiments of the present application is: a storage medium storing program instructions executable by a processor, the program instructions being used to execute the spatio-temporal prediction method for urban epidemics.
Compared with the prior art, the beneficial effects of the embodiments of the present application are as follows. The spatio-temporal prediction method, system, terminal and storage medium for urban epidemics extract the urban population movement flow from individual movement trajectory data in the city, divide the population movement flow within a given period according to time attributes, obtain the proximity relationship, population movement relationship and position attention relationship between regions from the flows of the different time attributes, construct an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network from these three relationships, and use the model to make fine-grained predictions of infectious disease trends. By dividing the population movement flow by time attribute, the embodiments can capture how population interaction and movement under different time attributes affect the diffusion and transmission of infectious diseases within the city. They make full use of the local spatial context of each region and take multiple spatial relationships between regions into account, so that the temporal and spatial dependencies in the case data are modeled better. This improves the spatial awareness and predictive performance of the spatio-temporal prediction model, enables prediction of infectious disease development within the city at a higher spatial resolution, and supports fine-grained analysis of the epidemic. It helps governments and public health departments gain timely and accurate insight into the development of infectious diseases within the city and carry out targeted prevention and control interventions, so as to protect people's lives and health to the greatest extent.
Description of drawings
Fig. 1 is a flow chart of the spatio-temporal prediction method for urban epidemics according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of the spatio-temporal prediction system for urban epidemics according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of the terminal according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of the storage medium according to an embodiment of the present application.
Detailed description
To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
To address the deficiencies of the prior art, the spatio-temporal prediction method for urban epidemics in the embodiments of the present application builds a spatio-temporal prediction model based on a graph neural network and a long short-term memory network from the proximity relationship, population movement relationship and position attention relationship between the regions of a city, so as to better model the temporal and spatial dependencies in spatio-temporal data. When modeling spatial dependence, multiple spatial relationships between regions are considered and a corresponding relationship graph structure is built for each. When modeling temporal dependence, the local spatial context of each region is fully used to improve predictive performance. In addition, according to the movement patterns of the population susceptible to infectious diseases, the population movement flow is divided by time attribute and a separate directed graph structure is built for each time attribute, which improves the spatio-temporal awareness of the model and further improves its predictive performance.
Specifically, refer to Fig. 1, which is a flow chart of the spatio-temporal prediction method for urban epidemics according to an embodiment of the present application. The method comprises the following steps:
S1: obtaining infectious disease case data in the city, and collecting individual movement trajectory data in the city using mobile devices;
In this step, the infectious disease case data include but are not limited to the number of infections, the number of deaths, clinical symptoms and information on hospitalized persons. The individual movement trajectory data include but are not limited to each individual's mobile phone number, signaling timestamps and mobile phone location data. To better predict the incidence trend of infectious diseases within the city for susceptible groups such as minors, the embodiments of the present application obtain large-scale individual movement trajectory data and infectious disease case data. The mobile devices include but are not limited to mobile phones and smart watches.
S2: processing the individual movement trajectory data by a data-driven method, extracting the population movement flow between regions, and obtaining the urban population movement relationship;
In this step, processing the individual movement trajectory data by a data-driven method is specifically: extracting the population movement flow between regions from large-scale mobile phone location data to capture the urban population movement relationship.
S3: dividing the population movement flow according to time attributes, constructing a directed graph of the population movement flow for each time attribute, determining from the proximity relationship of the regions whether two nodes of the directed graph are connected by an edge, and generating the adjacency matrix of the population movement relationship under each time attribute;
In this step, the time attributes include but are not limited to weekdays, weekends or holidays. Given a city, graph structures are created with the day, week or month as the time interval, and the population movement flow is divided by weekday and weekend into flows of two time attributes. The flow of each time attribute is converted into a weighted directed graph G=(V,E), where V is the set of nodes and E is the set of edges; the vertices represent the regions within the city, and the edges capture movement patterns. For example, the weight from vertex v to vertex u,
Figure PCTCN2022076295-appb-000007
is the total number of people who moved from region v to region u on the weekdays of week t, which yields the adjacency matrices of the population movement relationship of the city under the weekday and weekend time attributes in week t,
Figure PCTCN2022076295-appb-000008
and
Figure PCTCN2022076295-appb-000009
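The division by time attribute can be sketched as follows. This is a minimal illustration that assumes trips have already been extracted from the trajectory data as (date, origin, destination) records; the function name `build_flow_matrices`, the record layout and the sample data are hypothetical, and holidays are not handled.

```python
from datetime import date

def build_flow_matrices(trips, regions):
    """Aggregate origin-destination trips into one weighted adjacency matrix per time attribute.

    trips: iterable of (day, origin, destination), with day a datetime.date.
    Returns {"weekday": matrix, "weekend": matrix}, where matrix[v][u] is the
    number of trips from region v to region u under that time attribute.
    """
    index = {r: k for k, r in enumerate(regions)}
    n = len(regions)
    flows = {attr: [[0] * n for _ in range(n)] for attr in ("weekday", "weekend")}
    for day, origin, dest in trips:
        # date.weekday() returns 0-4 for Monday-Friday and 5-6 for Saturday-Sunday.
        attr = "weekend" if day.weekday() >= 5 else "weekday"
        flows[attr][index[origin]][index[dest]] += 1
    return flows

regions = ["A", "B", "C"]
trips = [
    (date(2022, 1, 3), "A", "B"),  # Monday
    (date(2022, 1, 3), "A", "B"),
    (date(2022, 1, 4), "B", "C"),  # Tuesday
    (date(2022, 1, 8), "C", "A"),  # Saturday
]
flows = build_flow_matrices(trips, regions)
```

Keeping one matrix per time attribute, rather than a single pooled matrix, is what lets the downstream graph model see weekday and weekend movement patterns separately.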
Further, according to the first law of geography, everything is related to everything else, but near things are more related than distant things. The embodiments of the present application therefore consider the spatial relationship between nearby things, that is, the proximity relationship. Based on the proximity relationship of the regions, whether two regions (nodes) are connected by an edge is determined by whether the two regions border each other, which yields the adjacency matrix $A_{adj}$ of the population movement relationship corresponding to the proximity relationship. Specifically, assuming that regions v and u are adjacent, the weight from vertex v to vertex u,
Figure PCTCN2022076295-appb-000010
is the total number of individuals who depart from region v and move to region u on weekdays, so the directed graph G contains the population flow behavior between all adjacent regions of the city. The weekday mobility between regions v and u forms an edge; multiplying it by the number of cases in region v during the same period gives a relative score that indicates how many infected people may have moved from region v to region u. It will be understood that the case patterns of two adjacent regions may also be unrelated by mobility. For example, if regions i and j are adjacent, the flow of region i is fixed and always zero, and the flow of region j is dynamic and non-zero, then the case change patterns of the two regions are uncorrelated.
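The relative score described above is a simple product, which a short sketch makes explicit. The function name `relative_scores` and the sample numbers are hypothetical; the score is a relative indicator, not an estimated count of infected travelers.

```python
def relative_scores(flow, cases):
    """score[v][u] = flow[v][u] * cases[v].

    flow[v][u]: number of people moving from region v to region u in the period.
    cases[v]:   number of cases reported in region v in the same period.
    """
    n = len(flow)
    return [[flow[v][u] * cases[v] for u in range(n)] for v in range(n)]

flow = [[0, 120, 0],
        [30, 0, 10],
        [0, 5, 0]]
cases = [4, 0, 7]
scores = relative_scores(flow, cases)
```

In this example region 1 has no cases, so its outgoing edges score zero regardless of how many people leave it, while the edge from region 0 to region 1 scores highest because it combines heavy flow with non-zero cases.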
S4: constructing histograms of the infectious disease case data, calculating the similarity of case data between regions using a histogram-based similarity algorithm, and obtaining from that similarity the adjacency matrix representing the position attention relationships between all regions;
In this step, the correlation between two regions may be affected by geographical distance: adjacent regions may share terrain or climate characteristics and therefore have similar outbreak trends. Non-adjacent regions may also have latent dependencies because of population movement and similar geographical characteristics, but it is difficult to model all the factors relevant to an outbreak. The present application therefore uses a histogram-based similarity algorithm to calculate the correlation of infectious disease case data between regions. If the case data of two regions are highly similar, their case change patterns will be similar as well. By calculating the similarity of case data for regions that are not adjacent but strongly correlated, the embodiments of the present application capture case trends from a more global spatial perspective.
The histogram-based similarity algorithm is specifically as follows. First, a histogram is constructed for the time series of case data samples of each region. Second, the similarity of the case data of two regions is calculated; if it is higher than a set threshold, the outbreak trends of the two regions are considered similar and strongly correlated, and an edge is created between the two regions in the directed graph, which generates the adjacency matrix representing the position attention relationships between all regions. The calculation formula is as follows:
Figure PCTCN2022076295-appb-000011
where $\theta$ is the threshold at which the histogram-based similarity algorithm creates an edge. When the similarity $w_{i,j}$ between region i and region j is higher than $\theta$, an edge is created between region i and region j, which finally yields the adjacency matrix of the position attention relationship for the regions of the city,
Figure PCTCN2022076295-appb-000012
Specifically, the histogram construction algorithm is as follows:
Histogram construction algorithm
Figure PCTCN2022076295-appb-000013
The histogram-based similarity algorithm is as follows:
Similarity algorithm
Figure PCTCN2022076295-appb-000014
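The two algorithms above are available only as image references, so the following sketch substitutes a common choice rather than reproducing them: equal-width bins over a shared value range, normalized counts, and histogram intersection as the similarity measure. The function names, the bin count, the threshold value and the exclusion of self-loops are all assumptions made for illustration.

```python
def build_histogram(series, lo, hi, bins):
    """Normalized equal-width histogram of a non-empty case-count series over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid a zero bin width when hi == lo
    for x in series:
        k = min(int((x - lo) / width), bins - 1)  # values equal to hi go in the last bin
        counts[k] += 1
    total = len(series)
    return [c / total for c in counts]

def histogram_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def attention_adjacency(case_series, theta, bins=4):
    """Create an edge i-j when the similarity of regions i and j exceeds theta."""
    lo = min(min(s) for s in case_series)
    hi = max(max(s) for s in case_series)
    hists = [build_histogram(s, lo, hi, bins) for s in case_series]
    n = len(hists)
    # Self-loops are excluded here; this is a design assumption.
    return [[1 if i != j and histogram_similarity(hists[i], hists[j]) > theta else 0
             for j in range(n)] for i in range(n)]

case_series = [
    [0, 1, 2, 8, 9],
    [1, 0, 2, 9, 8],
    [9, 9, 8, 9, 8],
]
adj = attention_adjacency(case_series, theta=0.8)
```

Note that a histogram discards the ordering of the time series, so two regions with the same distribution of case counts are treated as similar even if their peaks occur at different times.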
S5: Use the proximity relationship, the population movement relationship and the location attention relationship to train an infectious disease spatio-temporal prediction model based on Graph Neural Networks (GNN) and Long Short-Term Memory (LSTM) networks;
In this step, the graph neural network is a neighborhood aggregation strategy: the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighbor nodes. The framework of the graph neural network is the Message Passing Neural Network (MPNN), a formal framework for spatial-domain graph convolution. In the embodiment of the present application, the graph neural network applies the following neighborhood aggregation method, in which the input directed-graph structure matrix A is normalized:
Figure PCTCN2022076295-appb-000015
where H_i is a matrix containing the node representations of the previous layer; the initial H_0 is set to the historical data representing the changes in infectious disease cases in each region; W_i is the trainable parameter matrix of the i-th layer; and f is a nonlinear activation function such as ReLU. The input directed-graph structure matrix A is normalized so that the weights of the incoming edges of each node sum to 1, giving the normalized directed-graph structure matrix
Figure PCTCN2022076295-appb-000016
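The normalization and one aggregation step can be sketched as below. The exact layer formula is contained in the figure, so the form ReLU(Ã^T H W) used here is an assumption, as is the convention that `adj[u, v]` holds the weight of the edge from region u to region v; the function names are hypothetical.

```python
import numpy as np


def normalize_incoming(adj):
    """Scale each column so that the incoming edge weights of every node sum to 1.

    Convention (assumption): adj[u, v] is the weight of the directed edge u -> v,
    so the incoming edges of node v form column v.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0, keepdims=True)
    safe = np.where(col_sums > 0, col_sums, 1.0)  # nodes without incoming edges stay zero
    return adj / safe


def mpnn_layer(adj_norm, h, w):
    """One neighborhood-aggregation step, assumed form: ReLU(A_norm^T H W)."""
    return np.maximum(adj_norm.T @ h @ w, 0.0)


adj = np.array([
    [0.0, 2.0, 1.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0],
])
adj_norm = normalize_incoming(adj)
h0 = np.array([[1.0], [2.0], [3.0]])  # e.g. one historical case count per region
w1 = np.array([[1.0]])                # stand-in for a trainable parameter matrix
h1 = mpnn_layer(adj_norm, h0, w1)
```

With this convention each node's new representation is a weighted average of the representations of the regions that send flow into it.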
The long short-term memory network is a special kind of recurrent neural network (RNN) used to process sequence data. It is designed mainly to address the vanishing and exploding gradient problems that arise when training on long sequences. An MPNN is applied at each time step to obtain a representation sequence h_{i,t-n}, h_{i,t-n+1}, ..., h_{i,t-1}. This representation sequence is fed into the long short-term memory network to extract its temporal features. The LSTM calculation is expressed as:
X_{i,t} = LSTM(h_{i,t-n}, h_{i,t-n+1}, ..., h_{i,t-1})   (3)
where X_{i,t} denotes the infectious disease case data predicted for the i-th region in the t-th time period, and h_{i,t-1} denotes the representation of the input features, such as infectious disease case data, of the i-th region in the (t-1)-th time period.
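To make formula (3) concrete, the sketch below runs a standard LSTM cell over one region's representation sequence using NumPy only. The weight shapes, the gate ordering (input, forget, candidate, output) and the function names are assumptions for illustration; the source does not specify the LSTM configuration, and a trained model would add a readout layer mapping the final hidden state to X_{i,t}.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_forward(seq, w_x, w_h, b):
    """Run a single-layer LSTM over `seq` and return the final hidden state.

    seq: (steps, input_dim) representations h_{i,t-n} ... h_{i,t-1} of one region
    w_x: (input_dim, 4 * hidden), w_h: (hidden, 4 * hidden), b: (4 * hidden,)
    """
    hidden = w_h.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in seq:
        gates = x @ w_x + h @ w_h + b
        i, f, g, o = np.split(gates, 4)           # input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                          # update the cell state
        h = o * np.tanh(c)                         # expose the gated hidden state
    return h


rng = np.random.default_rng(0)
steps, input_dim, hidden = 7, 3, 4
seq = rng.normal(size=(steps, input_dim))
w_x = rng.normal(size=(input_dim, 4 * hidden))
w_h = rng.normal(size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)
final_state = lstm_forward(seq, w_x, w_h, b)
```

The cell state `c` is updated additively at each step, which is the mechanism that mitigates vanishing gradients on long sequences.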
S6: Fuse the proximity relationship, the population movement relationship and the location attention relationship through the infectious disease spatio-temporal prediction model to obtain the urban infectious disease prediction result;
To account simultaneously for the influence of the proximity relationship, the population movement relationship and the location attention relationship on intra-city infectious disease prediction, the present invention adopts a parameter-matrix-based fusion method, specifically:
Figure PCTCN2022076295-appb-000017
where W_adj, W_od and W_at are the parameter matrices to be trained,
Figure PCTCN2022076295-appb-000018
and
Figure PCTCN2022076295-appb-000019
are the infectious disease prediction results at time t obtained from the adjacency matrix, the population movement flow matrix and the location attention matrix respectively, tanh is the activation function, and
Figure PCTCN2022076295-appb-000020
is the final prediction result of the whole infectious disease spatio-temporal prediction model at time t.
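A parameter-matrix-based fusion of the three branch predictions can be sketched as follows. The fusion formula itself is only available as a figure, so the element-wise (Hadamard) weighting used here is an assumption, and the function and argument names are hypothetical.

```python
import numpy as np


def fuse_predictions(x_adj, x_od, x_at, w_adj, w_od, w_at):
    """Fuse three per-region predictions with learned element-wise weights.

    Assumed form: tanh(W_adj * X_adj + W_od * X_od + W_at * X_at), where `*`
    is the element-wise product and all arrays share the same shape.
    """
    return np.tanh(w_adj * x_adj + w_od * x_od + w_at * x_at)


# Predictions for three regions from the proximity, movement and attention branches.
x_adj = np.array([0.10, 0.20, 0.30])
x_od = np.array([0.05, 0.25, 0.10])
x_at = np.array([0.00, 0.15, 0.40])
ones = np.ones(3)  # stand-in for trained parameter matrices
fused = fuse_predictions(x_adj, x_od, x_at, ones, ones, ones)
```

Since tanh bounds the output to (-1, 1), this form presupposes that case counts are scaled into that range before training and rescaled afterwards.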
Because the embodiment of the present application predicts infectious disease trends by deep learning, the model parameters can be updated by learning from new data after multiple predictions, making the model more intelligent and efficient.
Based on the above, the urban epidemic spatio-temporal prediction method of the embodiment of the present application extracts urban population movement flow from individual movement trajectory data within the city, divides the population movement flow within a given period according to time attributes, and obtains the proximity relationship, population movement relationship and location attention relationship between regions from the population movement flow under the different time attributes. It then constructs an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network from these three relationships, and uses the model to make fine-grained predictions of infectious disease trends. By dividing the population movement flow by time attribute, the embodiment can accurately capture the influence of population interaction and movement under different time attributes on the spread of infectious diseases within the city, makes full use of the local spatial context information of each region, and takes multiple spatial relationships between regions into account. This allows the temporal and spatial dependencies in the infectious disease case data to be modeled better, greatly improving the spatial awareness and prediction performance of the model. The method thereby predicts the development of infectious diseases within the city at a higher spatial resolution and supports fine-grained analysis of epidemics, helping governments and public health departments gain timely and accurate insight into intra-city infectious disease development, carry out targeted prevention and control interventions, and protect people's lives and health to the greatest extent.
Please refer to FIG. 2, which is a schematic structural diagram of the urban epidemic spatio-temporal prediction system of the embodiment of the present application. The urban epidemic spatio-temporal prediction system 40 of the embodiment comprises:
Data collection module 41: used to collect individual movement trajectory data and infectious disease case data within the city;
Flow calculation module 42: used to process the individual movement trajectory data by a data-driven method, extract the population movement flow between regions, divide the population movement flow according to time attributes, and obtain the population movement relationship under each time attribute based on the proximity relationship of the regions;
Similarity calculation module 43: used to calculate the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtain the location attention relationship of each region according to that similarity;
Model construction module 44: used to construct an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the location attention relationship;
Infectious disease prediction module 45: used to input the infectious disease case data within the city into the infectious disease spatio-temporal prediction model and obtain the intra-city infectious disease prediction result through the model.
Please refer to FIG. 3, which is a schematic structural diagram of the terminal of the embodiment of the present application. The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the urban epidemic spatio-temporal prediction method described above.
The processor 51 is used to execute the program instructions stored in the memory 52 to control the urban epidemic spatio-temporal prediction.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
Please refer to FIG. 4, which is a schematic structural diagram of the storage medium of the embodiment of the present application. The storage medium of the embodiment stores a program file 61 capable of implementing all of the methods described above. The program file 61 may be stored in the storage medium in the form of a software product and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The storage medium may be any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc, or a terminal device such as a computer, server, mobile phone or tablet.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The present invention is therefore not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A spatio-temporal prediction method for urban epidemics, characterized by comprising:
    collecting individual movement trajectory data and infectious disease case data within a city;
    processing the individual movement trajectory data by a data-driven method, extracting the population movement flow between regions, dividing the population movement flow according to time attributes, and obtaining the population movement relationship under each time attribute based on the proximity relationship of the regions;
    calculating the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtaining the location attention relationship of each region according to the similarity of the infectious disease case data;
    constructing an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the location attention relationship;
    inputting the infectious disease case data within the city into the infectious disease spatio-temporal prediction model, and obtaining the intra-city infectious disease prediction result through the infectious disease spatio-temporal prediction model.
  2. The urban epidemic spatio-temporal prediction method according to claim 1, characterized in that dividing the population movement flow according to time attributes and obtaining the population movement relationship under each time attribute based on the proximity relationship of the regions is specifically:
    the time attributes include working days, weekends or holidays; the population movement flow of each time attribute is converted into a weighted directed graph G=(V,E), where V denotes the set of nodes and E denotes the set of edges, the vertices represent the regions within the city and the edges capture movement patterns; and, based on the proximity relationship of the regions, whether two nodes in the directed graph are connected by an edge is determined according to whether the two corresponding regions touch each other, giving the population movement relationship adjacency matrix corresponding to the proximity relationship.
  3. The urban epidemic spatio-temporal prediction method according to claim 2, characterized in that calculating the similarity of infectious disease case data between regions using a histogram-based similarity algorithm and obtaining the location attention relationship of each region according to the similarity of the infectious disease case data comprises:
    constructing a corresponding histogram for the infectious disease case data of each region;
    calculating the similarity of the infectious disease case data between two regions and, if the similarity is higher than a set threshold, considering the infectious disease outbreak trends of the two regions to be similar and correlated, constructing an edge between the two regions in the directed graph, and generating an adjacency matrix representing the location attention relationship among all regions.
  4. The urban epidemic spatio-temporal prediction method according to claim 3, characterized in that the calculation formula of the location attention relationship is:
    Figure PCTCN2022076295-appb-100001
    where θ is the threshold at which the histogram-based similarity algorithm establishes an edge; when the similarity w_{i,j} between region i and region j is higher than the threshold θ, an edge is created between region i and region j, finally yielding the adjacency matrix of the location attention relationship for the regions of the city.
  5. The urban epidemic spatio-temporal prediction method according to any one of claims 1 to 4, characterized in that constructing the infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the location attention relationship comprises:
    inputting the graph structure into the graph neural network, the graph neural network using a neighborhood aggregation method with the directed graph normalized so that the weights of the incoming edges of each node sum to 1:
    Figure PCTCN2022076295-appb-100002
    where H_i is a matrix containing the node representations of the previous layer, the initial H_0 is set to the historical data representing the changes in infectious disease cases in each region, W_i is the trainable parameter matrix of the i-th layer, and f is a nonlinear activation function.
  6. The urban epidemic spatio-temporal prediction method according to claim 5, characterized in that constructing the infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the location attention relationship further comprises:
    applying a message passing neural network at each time step to obtain a representation sequence h_{i,t-n}, h_{i,t-n+1}, ..., h_{i,t-1}, and inputting the representation sequence h_{i,t-n}, h_{i,t-n+1}, ..., h_{i,t-1} into the long short-term memory network to extract its time-series relationships; the calculation formula of the long short-term memory network is:
    X_{i,t} = LSTM(h_{i,t-n}, h_{i,t-n+1}, ..., h_{i,t-1})
    where X_{i,t} denotes the infectious disease case data predicted for the i-th region in the t-th time period, and h_{i,t-1} denotes the infectious disease case data of the i-th region in the (t-1)-th time period.
  7. The urban epidemic spatio-temporal prediction method according to claim 6, characterized in that inputting the infectious disease case data within the city into the infectious disease spatio-temporal prediction model and obtaining the intra-city infectious disease prediction result through the infectious disease spatio-temporal prediction model is specifically:
    fusing the proximity relationship, the population movement relationship and the location attention relationship through the infectious disease spatio-temporal prediction model to obtain the urban infectious disease prediction result; the fusion is specifically:
    Figure PCTCN2022076295-appb-100003
    where W_adj, W_od and W_at are the parameter matrices to be trained,
    Figure PCTCN2022076295-appb-100004
    and
    Figure PCTCN2022076295-appb-100005
    are the infectious disease prediction results at time t obtained from the adjacency matrix, the population movement flow matrix and the location attention matrix respectively, tanh is the activation function, and
    Figure PCTCN2022076295-appb-100006
    is the final prediction result of the whole infectious disease spatio-temporal prediction model at time t.
  8. A spatio-temporal prediction system for urban epidemics, characterized by comprising:
    a data collection module, used to collect individual movement trajectory data and infectious disease case data within a city;
    a flow calculation module, used to process the individual movement trajectory data by a data-driven method, extract the population movement flow between regions, divide the population movement flow according to time attributes, and obtain the population movement relationship under each time attribute based on the proximity relationship of the regions;
    a similarity calculation module, used to calculate the similarity of infectious disease case data between regions using a histogram-based similarity algorithm, and obtain the location attention relationship of each region according to the similarity of the infectious disease case data;
    a model construction module, used to construct an infectious disease spatio-temporal prediction model based on a graph neural network and a long short-term memory network according to the population movement relationship, the proximity relationship and the location attention relationship;
    an infectious disease prediction module, used to input the infectious disease case data within the city into the infectious disease spatio-temporal prediction model and obtain the intra-city infectious disease prediction result through the infectious disease spatio-temporal prediction model.
  9. A terminal, characterized in that the terminal includes a processor and a memory coupled to the processor, wherein
    the memory stores program instructions for implementing the urban epidemic spatio-temporal prediction method according to any one of claims 1 to 7;
    the processor is used to execute the program instructions stored in the memory to control the urban epidemic spatio-temporal prediction.
  10. A storage medium, characterized in that it stores program instructions executable by a processor, the program instructions being used to execute the urban epidemic spatio-temporal prediction method according to any one of claims 1 to 7.
PCT/CN2022/076295 2021-12-31 2022-02-15 Urban epidemic space-time prediction method and system, terminal and storage medium WO2023123625A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111675147.7A CN114464329A (en) 2021-12-31 2021-12-31 Urban epidemic situation space-time prediction method, system, terminal and storage medium
CN202111675147.7 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023123625A1 true WO2023123625A1 (en) 2023-07-06

Family

ID=81408260

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076295 WO2023123625A1 (en) 2021-12-31 2022-02-15 Urban epidemic space-time prediction method and system, terminal and storage medium

Country Status (2)

Country Link
CN (1) CN114464329A (en)
WO (1) WO2023123625A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117332120A (en) * 2023-08-29 2024-01-02 泰瑞数创科技(北京)股份有限公司 Geographic entity relation construction and expression method based on space calculation
CN117649028A (en) * 2024-01-26 2024-03-05 南京航空航天大学 Urban function area matching-based inter-urban crowd flow trend prediction method
CN117668743A (en) * 2023-10-26 2024-03-08 长江信达软件技术(武汉)有限责任公司 Time sequence data prediction method of association time-space relation

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238589A (en) * 2022-08-09 2022-10-25 浙江大学 Crowd movement prediction method based on generation of confrontation network
CN117150911B (en) * 2023-09-04 2024-04-26 吉林建筑大学 Coal rock instability fracture prediction method and system based on graph neural network
CN117690601A (en) * 2024-02-02 2024-03-12 江西省胸科医院(江西省第三人民医院) Tuberculosis epidemic trend prediction system based on big data analysis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639787A (en) * 2020-04-28 2020-09-08 北京工商大学 Spatio-temporal data prediction method based on graph convolution network
CN113314231A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Infectious disease propagation prediction system and device integrating spatio-temporal information
WO2021180245A1 (en) * 2020-11-02 2021-09-16 平安科技(深圳)有限公司 Server, data processing method and apparatus, and readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639787A (en) * 2020-04-28 2020-09-08 北京工商大学 Spatio-temporal data prediction method based on graph convolution network
WO2021180245A1 (en) * 2020-11-02 2021-09-16 平安科技(深圳)有限公司 Server, data processing method and apparatus, and readable storage medium
CN113314231A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Infectious disease propagation prediction system and device integrating spatio-temporal information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NATHAN SESTI; JUAN JOSE GARAU-LUIS; EDWARD CRAWLEY; BRUCE CAMERON: "Integrating LSTMs and GNNs for COVID-19 Forecasting", ARXIV.ORG, 14 July 2021 (2021-07-14), XP091018816 *
ZONGHAN WU; SHIRUI PAN; GUODONG LONG; JING JIANG; XIAOJUN CHANG; CHENGQI ZHANG: "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks", ARXIV.ORG, 24 May 2020 (2020-05-24), XP081682687 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117332120A (en) * 2023-08-29 2024-01-02 泰瑞数创科技(北京)股份有限公司 Geographic entity relation construction and expression method based on space calculation
CN117332120B (en) * 2023-08-29 2024-04-30 泰瑞数创科技(北京)股份有限公司 Geographic entity relation construction and expression method based on space calculation
CN117668743A (en) * 2023-10-26 2024-03-08 长江信达软件技术(武汉)有限责任公司 Time sequence data prediction method of association time-space relation
CN117649028A (en) * 2024-01-26 2024-03-05 南京航空航天大学 Urban function area matching-based inter-urban crowd flow trend prediction method
CN117649028B (en) * 2024-01-26 2024-04-02 南京航空航天大学 Urban function area matching-based inter-urban crowd flow trend prediction method

Also Published As

Publication number Publication date
CN114464329A (en) 2022-05-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22912932

Country of ref document: EP

Kind code of ref document: A1