WO2023123624A1 - Method and system for predicting influenza outbreak trend in city, and terminal and storage medium - Google Patents

Info

Publication number
WO2023123624A1
Authority
WO
WIPO (PCT)
Prior art keywords
influenza
data
city
individual
urban
Prior art date
Application number
PCT/CN2022/076294
Other languages
French (fr)
Chinese (zh)
Inventor
李子垠
尹凌
刘康
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Publication of WO2023123624A1 publication Critical patent/WO2023123624A1/en

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/35 Services specially adapted for particular environments, situations or purposes for the management of goods or merchandise

Definitions

  • the application belongs to the technical field of influenza computing, and in particular relates to a method, system, terminal and storage medium for predicting the incidence trend of urban influenza.
  • influenza is an acute respiratory infectious disease caused by influenza viruses. It can break out within a short period of time, seriously threatening public health.
  • influenza prevention and early warning in countries around the world rely mainly on traditional influenza surveillance.
  • influenza surveillance systems can provide a large amount of information and data useful for influenza prevention and control, including the number of infected cases, the number of deaths, clinical symptoms, and hospitalization information.
  • governments and public health departments assess recent influenza activity trends based on the information reported by the influenza surveillance system, and make corresponding public health decisions and emergency response plans.
  • however, the verification and reporting of grassroots monitoring data often takes a long time (weeks or even months), so the monitoring data lag behind the actual situation, and most countries can only achieve early warning at the provincial/state or city level. There is therefore an urgent need for real-time, accurate and rapid prediction of the influenza incidence trend within cities, to help governments and public health departments understand the development of influenza in cities in a timely and accurate manner and to protect people's lives, health and safety as far as possible.
  • the present application provides a method, system, terminal and storage medium for predicting the incidence trend of urban influenza, aiming to solve, at least to a certain extent, one of the above-mentioned technical problems in the prior art.
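  • a method for predicting the incidence trend of urban influenza, including:
  • obtaining the influenza case data in the city and collecting the individual movement trajectory data in the city;
  • processing the individual movement trajectory data through a data-driven method, obtaining the home address of each individual, setting the home address as the starting point of each individual's movement trajectories, and extracting each individual's flow data from the starting point to other regions to obtain the population movement relationship among the regions;
  • based on the population movement relationship, extracting the spatial scale information of the influenza case data using a graph neural network, and extracting the time series relationship of the influenza case data using a long short-term memory network;
  • according to the spatial scale information and the time series relationship of the influenza case data, obtaining the urban influenza incidence trend prediction result.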
  • the collection of individual movement trajectory data in the city includes:
  • the mobile devices include mobile phones or smart watches; the individual movement trajectory data include each individual's mobile phone number, signaling time stamp, and base station latitude and longitude.
  • the technical solution adopted in the embodiment of the present application also includes: the processing of the individual movement trajectory data through the data-driven method is specifically:
  • the technical solution adopted in the embodiment of the present application also includes: the processing of the individual movement trajectory data through the data-driven method further includes:
  • performing network flow extraction on the individual movement trajectory data through a theoretical model method;
  • the theoretical model method includes a gravity model, a radiation model or a spatial proximity relationship model.
  • the technical solution adopted in the embodiment of the present application also includes: the extraction of the spatial scale information of the influenza case data by using the graph neural network is specifically:
  • the framework of the graph neural network is a message passing neural network, which extracts spatial scale information based on spatial graph convolution; an undirected graph $G$ is defined, in which the feature vector of a node $v$ is $x_v$, the feature of the edge connecting nodes $v$ and $w$ is $e_{vw}$, $N(v)$ represents the neighbor nodes of node $v$ in graph $G$, and $t$ is the running time step; the feature $x_v$ of node $v$ is used as the initial value of its hidden state, after which the update of the hidden state by the spatial graph convolution is expressed as:
  • the message passing neural network decomposes the spatial graph convolution into two operations, message passing and state update, which are performed by the message function $M_l$ and the node update function $U_l$, respectively;
  • the message function $M_l$ aggregates the features of the neighboring nodes into a message vector, which is ready to be delivered to the central node;
  • the node update function $U_l$ updates the node representation at the current moment by combining it with the message obtained from the message function, yielding the spatial scale information.
  • the technical solution adopted in the embodiment of the present application also includes: the extraction of the spatial scale information of the influenza case data by using the graph neural network is specifically:
  • the individual movement trajectory data of a set time range are transformed into a weighted directed graph, in which the vertices represent street-level areas and the edges capture movement patterns; at time $t$, the flow between areas $u$ and $v$ forms an edge, and this flow multiplied by the number of cases in area $u$ at time $t$ indicates how many infected people may move from area $u$ to area $v$; each area $u$ has a vector of node attributes containing the number of cases in each of the past $w$ weeks in area $u$; the message passing performed by the message passing neural network computes a feature vector for each area using the composite score from all areas:
  • $A$ represents the adjacency matrix of regional population movement flows
  • $C_t$ is a matrix whose rows contain the attributes of the different regions
  • $x_u \in \mathbb{R}^w$ is a vector combining the number of influenza cases within region $u$ and moving towards region $u$
  • the expression for $x_u$ is:
  • $x_u = (x_u w_{j,u} + x_u w_{i,u} + \dots + x_u w_{v,u}) + x_u w_{u,u}$
  • the technical solution adopted in the embodiment of the present application also includes: the extraction of the time series relationship of the influenza case data using the long short-term memory network is specifically:
  • the long short-term memory network calculates the output $h_t$ of the hidden layer at the current moment based on the input $x_t$ at the current moment and the output $h_{t-1}$ of the hidden layer at the previous moment.
  • the calculation formula of the long short-term memory network is:
  • $y_{i,t} = \mathrm{LSTM}(h_{i,t-n}, \dots, h_{i,t-1})$
  • $h_{i,t-1}$ represents the influenza case data of the $i$-th region in time period $t-1$
  • $y_{i,t}$ represents the predicted influenza case data of the $i$-th region in time period $t$.
  • a system for predicting the incidence trend of urban influenza, including:
  • a data collection module: used to obtain the influenza case data in the city and collect the individual movement trajectory data in the city;
  • a data processing module: used to process the individual movement trajectory data through a data-driven method, obtain the home address of each individual, set the home address as the starting point of each individual's movement trajectories, and extract each individual's flow data from the starting point to other regions to obtain the population movement relationship among the regions;
  • a spatio-temporal feature extraction module: used to extract the spatial scale information of the influenza case data using the graph neural network based on the population movement relationship, and to extract the time series relationship of the influenza case data using the long short-term memory network;
  • an influenza prediction module: used to obtain the urban influenza incidence trend prediction result based on the spatial scale information and the time series relationship of the influenza case data.
  • a terminal includes a processor and a memory coupled to the processor, wherein,
  • the memory is stored with program instructions for realizing the urban influenza incidence trend prediction method
  • the processor is configured to execute the program instructions stored in the memory to control urban influenza incidence trend prediction.
  • a storage medium storing program instructions executable by a processor, and the program instructions are used to execute the method for predicting the incidence trend of urban influenza.
  • the beneficial effects produced by the embodiments of the present application are as follows: the method, system, terminal and storage medium for predicting the incidence trend of urban influenza extract the population movement relationship from the individual movement trajectory data in the city; based on the population movement relationship, they use the graph convolutional neural network to obtain the spatial scale information of the influenza case data and the long short-term memory network to extract the time series relationship of the influenza case data, and make a refined prediction of the influenza trend according to the spatial scale information and the time series relationship.
  • the embodiment of this application predicts the development of influenza within the city at a higher spatial resolution and completes a refined analysis of influenza, helping the government and public health departments to understand the development of influenza in the city in a timely and accurate manner and to carry out targeted epidemic prevention and control interventions, thereby protecting people's lives, health and safety as far as possible.
  • the present application has at least the following beneficial effects:
  • a spatial interaction model based on home-centered flow can be constructed at a higher spatial resolution and at the urban scale.
  • the flows in the graph structure are more concentrated, so multiple strong commuting flows are clearly captured, which effectively improves the prediction accuracy of the deep learning model for influenza within the city.
  • the present invention only needs the location information of mobile phones and does not need multi-dimensional information such as weekly average temperature, air pressure, rainfall, relative humidity, maximum temperature difference and sunshine duration, which reduces the difficulty of data acquisition and processing.
  • Fig. 1 is the flowchart of the method for predicting the incidence of urban influenza in the embodiment of the present application
  • Fig. 2 is the schematic structural diagram of the urban influenza incidence trend prediction system of the embodiment of the present application.
  • FIG. 3 is a schematic diagram of a terminal structure in an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • FIG. 1 is a flow chart of the method for predicting the incidence trend of urban influenza according to the embodiment of the present application.
  • the urban influenza incidence trend prediction method of the embodiment of the present application comprises the following steps:
  • S1: Obtain the influenza case data in the city, and use mobile devices to collect individual movement trajectory data in the city;
  • the mobile devices include, but are not limited to, mobile phones and smart watches.
  • the individual movement trajectory data include, but are not limited to, each individual's mobile phone number, the signaling time stamp, and the latitude and longitude of the identified base station.
  • the individual movement trajectory data set is obtained as follows: based on the China Unicom HDFS + Hive + Spark big data platform, the location information carried by the mobile phone signaling of 5.6 million users is obtained; the location information is deduplicated, and records missing key information are removed, yielding the individual movement trajectory data.
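The cleaning step described above (deduplication and removal of records missing key information) can be sketched in a few lines. This is a minimal illustration rather than the platform's actual pipeline: the `SignalingRecord` type, its field names and the sample values are hypothetical, and a production job would run on the big data platform rather than in plain Python.

```python
from typing import List, NamedTuple, Optional


class SignalingRecord(NamedTuple):
    # Hypothetical field names; the source only lists the kinds of information.
    phone: Optional[str]       # subscriber identifier
    timestamp: Optional[int]   # signaling time stamp (seconds)
    lat: Optional[float]       # base station latitude
    lon: Optional[float]       # base station longitude


def clean_records(records: List[SignalingRecord]) -> List[SignalingRecord]:
    """Drop records with any missing field, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(value is None for value in rec):
            continue  # record is missing key information
        if rec in seen:
            continue  # exact duplicate of an earlier record
        seen.add(rec)
        cleaned.append(rec)
    return cleaned


raw = [
    SignalingRecord("u1", 1000, 22.54, 114.05),
    SignalingRecord("u1", 1000, 22.54, 114.05),  # duplicate
    SignalingRecord("u2", None, 22.60, 114.10),  # missing time stamp
    SignalingRecord("u2", 1600, 22.60, 114.10),
]
trajectory = clean_records(raw)
```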
  • S2: Process the individual movement trajectory data through the data-driven method and the theoretical model methods, obtain the home address of each individual, set the home address as the starting point of all movement trajectories of each individual, and extract each individual's flow data from the starting point to other regions to obtain the population movement relationship between regions;
  • in the data-driven method, each individual's movement trajectory data are divided according to stay time, the location with the longest night-time stay is set as the individual's home address, the home address is set as the starting point of all of the individual's movement trajectories, and the flow data of each individual from the starting point to other areas (such as places of work and entertainment) are then extracted to obtain the population movement relationship between different areas.
  • the processing of the individual movement trajectory data through the data-driven method is as follows: the original Thiessen polygons established from the city's base stations are merged over a 500-meter grid; within a given period of time, the location where each user spends the longest time at night is set as the user's home address, and the origin-destination (OD) flow and the home-centered flow are extracted every half hour over the city's 500-meter grid; the results are then mapped to the street units to obtain the half-hourly OD flow and home-centered flow within that period.
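As a rough illustration of the home-detection and home-centered flow extraction just described, the sketch below infers each user's home as the region with the longest total night-time stay and then counts, per (home, destination) pair, how many users visit the destination. The night-hour window, the `(region, hour, minutes)` stay format and the function names are assumptions made for the example; the source does not specify them.

```python
from collections import Counter, defaultdict

# Assumed definition of "night": 22:00 to 05:59.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}


def infer_home(stays):
    """stays: list of (region, hour_of_day, minutes_stayed).

    Returns the region with the longest total night-time stay, or None.
    """
    night_minutes = Counter()
    for region, hour, minutes in stays:
        if hour in NIGHT_HOURS:
            night_minutes[region] += minutes
    if not night_minutes:
        return None
    return max(night_minutes, key=night_minutes.get)


def home_centered_flows(stays_by_user):
    """Count users moving from their home region to every other region visited."""
    flows = defaultdict(int)
    for stays in stays_by_user.values():
        home = infer_home(stays)
        if home is None:
            continue
        destinations = {region for region, _, _ in stays if region != home}
        for region in destinations:
            flows[(home, region)] += 1
    return dict(flows)


stays_by_user = {
    "u1": [("A", 23, 60), ("A", 2, 240), ("B", 10, 480), ("C", 19, 90)],
    "u2": [("B", 1, 300), ("A", 9, 500)],
}
flows = home_centered_flows(stays_by_user)
```

Anchoring every trip at the home region, instead of counting every consecutive origin-destination pair, is what makes the resulting graph "home-centered".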
  • the processing of the individual movement trajectory data through the theoretical model methods is as follows: according to the population distribution and geographical location of each street in the city taken from census data, network flows are extracted for the individual movement trajectory data set through theoretical models such as the gravity model, the radiation model and the spatial proximity model, where:
  • the gravity model assumes that the intensity of population movement between two regions is directly proportional to their respective population distributions and inversely proportional to the distance between them.
  • the gravity model formula between two urban areas can usually be expressed as:
  • $T_{ij}$ is the population flow intensity between area $i$ and area $j$ in the city
  • $m_i$ and $n_j$ are the population distributions of area $i$ and area $j$, respectively
  • $r_{ij}$ is the distance between area $i$ and area $j$
  • $a$ is a constant, which can take a value of 1.
  • Radiation models view population movement as a stochastic process governed by joint probabilities, depending on the population distribution of origin, destination, and sphere of influence.
  • the radiation model formula is:
  • $s_{ij}$ represents the total population within the circle centered on area $i$ whose radius is the distance between area $i$ and area $j$.
  • $N$ is the total population of area $i$
  • $N_c$ is the commuter population of area $i$.
  • the spatial proximity model is based on the first law of geography: everything is related to everything else, but near things are more related than distant things. The present invention therefore considers the spatial relationship between nearby things, that is, the adjacency relationship. Based on the adjacency of the regions, the present invention determines whether an edge exists between two regions (nodes) according to whether the two regions touch each other, and obtains the adjacency matrix corresponding to the adjacency relationship.
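The gravity and radiation models above can be illustrated with a short sketch. The exact formulas are assumptions: the gravity flow is taken as $T_{ij} = m_i n_j / r_{ij}^{a}$ (treating the constant $a$ as the distance exponent), and the radiation flow uses the commonly cited form $T_{ij} = T_i \, m_i n_j / ((m_i + s_{ij})(m_i + n_j + s_{ij}))$ with $T_i = m_i N_c / N$. The function names and sample numbers are hypothetical.

```python
def gravity_flow(m_i: float, n_j: float, r_ij: float, a: float = 1.0) -> float:
    """Assumed gravity form: proportional to both populations,
    inversely proportional to distance raised to the constant a."""
    return m_i * n_j / (r_ij ** a)


def radiation_flow(m_i: float, n_j: float, s_ij: float,
                   n_commuters: float, n_total: float) -> float:
    """Assumed radiation form.

    s_ij is the population inside the circle of radius r_ij around i;
    in the usual formulation it excludes the populations of i and j themselves.
    """
    t_i = m_i * n_commuters / n_total  # expected number of trips leaving i
    return t_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))


g = gravity_flow(m_i=1000, n_j=2000, r_ij=5.0)
r = radiation_flow(m_i=1000, n_j=2000, s_ij=500, n_commuters=300, n_total=1000)
```

Unlike the gravity model, the radiation form needs no distance exponent to be fitted: distance enters only through the intervening population $s_{ij}$.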
  • S3: Input the influenza case data and the population movement relationship into the graph neural network (GNN); the graph neural network extracts the spatial features of the influenza case data according to the population movement relationship and obtains the spatial scale information of the influenza case data;
  • the graph neural network is used for spatial dependence modeling, which has a wider application range and better generalization ability.
  • the core idea of the graph neural network is to learn a function by which each node aggregates its own features and its neighbors' features to generate a new feature representation of the node. A neural network model containing multiple graph convolutional layers is constructed; through continuous iteration and learning, the topological relationships on the graph are used to learn from the input features of each node and make predictions.
  • the graph neural network follows a neighbor aggregation strategy: the representation vector of a node is computed by iteratively aggregating and passing the representation vectors of its neighbor nodes.
  • the framework of the graph neural network is the message passing neural network (MPNN), a formal framework for spatial graph convolution.
  • an undirected graph $G$ is defined, in which the feature vector of node $v$ is $x_v$, the feature of an edge is $e_{vw}$, $N(v)$ represents the neighbor nodes of node $v$ in graph $G$, and $t$ is the running time step; the update of the hidden state by spatial graph convolution is expressed by the following formula:
  • the message passing neural network decomposes the spatial graph convolution into two operations, message passing and state update, which are performed by the message function $M_l$ and the node update function $U_l$, respectively.
  • the message function $M_l$ aggregates the features of the neighbor nodes into a message vector, which is ready to be transmitted to the central node.
  • the node update function $U_l$ updates the node representation at the current moment by combining it with the message obtained from the message function, yielding the spatial scale information of the influenza case data.
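For reference, a hidden-state update consistent with the definitions above can be written in the standard message passing form. This is an assumed reconstruction based on the usual MPNN formulation, with the layer index written as $l$ to match $M_l$ and $U_l$ and the hidden state denoted $h_v$ (a symbol introduced here for illustration):

```latex
m_v^{(l+1)} = \sum_{w \in N(v)} M_l\!\left(h_v^{(l)},\, h_w^{(l)},\, e_{vw}\right),
\qquad
h_v^{(l+1)} = U_l\!\left(h_v^{(l)},\, m_v^{(l+1)}\right),
\qquad
h_v^{(0)} = x_v
```

The first expression is the message step (aggregation over the neighbors $N(v)$), and the second is the state update step that combines the node's current representation with the incoming message.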
  • the graph neural network extracts the spatial features of the influenza case data according to the population movement relationship and obtains the spatial scale information of the influenza case data as follows: the individual movement trajectory data within a set time range are converted into a weighted directed graph whose vertices represent street-level regions and whose edges capture movement patterns. For example, the weight $w_{v,u}$ of an edge $(v,u)$ from vertex $v$ to vertex $u$ represents the total number of individuals whose home address is in area $v$ who move to area $u$ at time $t$.
  • $A$ represents the adjacency matrix of regional population movement flows
  • $C_t$ is a matrix whose rows contain the attributes of the different regions.
  • $x_u \in \mathbb{R}^w$ is a vector combining the number of influenza cases within area $u$ and moving towards area $u$.
  • the expression for $x_u$ is:
  • $x_u = (x_u w_{j,u} + x_u w_{i,u} + \dots + x_u w_{v,u}) + x_u w_{u,u}$ (7)
  • region $u$ receives people from different regions
  • $X_t$ contains the vectors of past cases in that region.
  • $x_u \in \mathbb{R}^w$ represents an estimate of the number of new potential cases in area $u$, broken down into cases received from other areas and new cases due to mobility within area $u$.
  • $H^i$ is a matrix containing the node representations of the previous layer
  • $H^0 = X$
  • $W^0$ is the matrix of trainable parameters of the first layer
  • $f$ is a nonlinear activation function, such as ReLU.
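The neighborhood aggregation described above can be sketched without any deep learning library. The sketch assumes the propagation rule $H^{i+1} = f(A H^i W^i)$ with ReLU as $f$, and stores the flow from region $v$ into region $u$ at `adj[u][v]`, so that row $u$ of `adj @ x` sums the cases arriving in $u$ plus those staying within $u$. Both the rule and the storage convention are assumptions, and the numbers are made up.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def gnn_layer(adj, h, weight):
    """One assumed propagation step: f(A H W) with f = ReLU."""
    return relu(matmul(matmul(adj, h), weight))


# Three regions; adj[u][v] is the flow from region v into region u (assumed layout).
adj = [[1, 2, 0],
       [0, 1, 1],
       [3, 0, 1]]
# Case counts for the past w = 2 weeks in each region (rows are regions).
x = [[10, 12],
     [0, 4],
     [5, 5]]
# Hypothetical trainable weights of the first layer.
w0 = [[1.0, -1.0],
      [0.5, 0.0]]

aggregated = matmul(adj, x)   # mobility-weighted case counts per region
h1 = gnn_layer(adj, x, w0)    # first-layer node representations
```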
  • S4: Input the influenza case data and the spatial scale information into the long short-term memory (LSTM) network, and extract the time series relationship of the influenza case data through the long short-term memory network;
  • the long short-term memory network is a recurrent neural network (RNN) for processing sequence data, which alleviates the vanishing and exploding gradient problems that arise when training on long sequences.
  • compared with an ordinary RNN, LSTM achieves better performance on longer sequences.
  • LSTM includes three stages of forgetting, selective memory, and output.
  • the forgetting stage is used to selectively forget the input transmitted by the previous node.
  • the memory selection stage is used to selectively "memorize" the input of the current stage.
  • the output stage is used to determine which will be regarded as the output of the current state, and scale the output obtained in the previous stage through a tanh activation function.
  • the LSTM calculation formula can be expressed as:
  • $h_{i,t-1}$ represents the influenza case data of the $i$-th region in time period $t-1$
  • $y_{i,t}$ represents the predicted influenza case data of the $i$-th region in time period $t$.
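To make the input-output relationship concrete, the sketch below builds training samples in which the previous `n` periods of a region's case counts are the input and the count in period `t` is the target. The window length and the sample series are hypothetical.

```python
def make_windows(series, n):
    """Return (history, target) pairs: n past values followed by the next value."""
    samples = []
    for t in range(n, len(series)):
        samples.append((series[t - n:t], series[t]))
    return samples


weekly_cases = [3, 5, 8, 13, 21, 34]  # made-up weekly counts for one region
samples = make_windows(weekly_cases, n=4)
```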
  • the time series relationship of the influenza case data is extracted through the long short-term memory network as follows: the long short-term memory network calculates the output $h_t$ of the hidden layer at the current moment based on the input $x_t$ at the current moment and the output $h_{t-1}$ of the hidden layer at the previous moment, and additionally introduces the input gate $i_t$, the forget gate $f_t$, the output gate $o_t$ and the memory cell $c_t$.
  • the input gate is obtained by applying a linear transformation to the input $x_t$ and the hidden-layer output $h_{t-1}$ of the previous step, followed by an activation function; it controls the extent to which the newly computed state is written to the memory cell.
  • the forget gate and the output gate are calculated in the same way as the input gate; they control how much of the information in the memory cell of the previous step is forgotten and how much the current output depends on the current memory cell, respectively.
  • the input gate, the forget gate and the output gate each have their own parameters $W$ and $b$. The memory cell is mainly controlled by the input gate and the forget gate: the input gate controls which information in the current input sequence needs to be memorized, and the forget gate controls which information in the previous historical memory needs to be forgotten.
  • the output $h_t$ of the hidden layer of the long short-term memory network at time $t$ is finally determined by the output gate and the memory cell.
  • the update formulas of each calculation unit of the long short-term memory network are as follows:
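Assuming the standard LSTM formulation, which matches the gates described above, one update step can be sketched as follows: $i_t, f_t, o_t = \sigma(W[h_{t-1}, x_t] + b)$ with separate parameters per gate, $\tilde{c}_t = \tanh(W_c[h_{t-1}, x_t] + b_c)$, $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$ and $h_t = o_t \odot \tanh(c_t)$. The parameter dictionary keys and the concatenated-input layout are illustrative choices, not taken from the source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weight, bias, vec):
    """Compute W @ vec + b for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One standard LSTM cell update; returns (h_t, c_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]    # input gate
    f_t = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]    # forget gate
    o_t = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]    # output gate
    g_t = [math.tanh(v) for v in affine(params["W_c"], params["b_c"], z)]  # candidate state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hidden size 1, input size 1, all parameters zero (for illustration only).
zero_params = {name: [[0.0, 0.0]] for name in ("W_i", "W_f", "W_o", "W_c")}
zero_params.update({name: [0.0] for name in ("b_i", "b_f", "b_o", "b_c")})
h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=zero_params)
```

With all-zero parameters every gate equals 0.5 and the candidate state is 0, so the memory cell is simply halved; this makes the roles of the forget gate and output gate easy to check by hand.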
  • all regions share the same long short-term memory network model, which helps improve the generalization ability of the model while reducing the number of model parameters and the complexity of the model. While learning the original features in the time series relationship, the model also makes full use of the introduced spatial context information, which improves the prediction model. As the number of network layers increases, the final node features capture more and more global information. To preserve local intermediate information, the present invention concatenates the hidden states $H_1$ and $H_2$ of the last time step of the two LSTM layers with the input historical information, and the formula is as follows:
  • the rows of the matrix $H$ can be viewed as vertex representations that encode multi-scale structural information (including the initial features of the nodes); these vertex representations are then passed to an output layer consisting of a two-layer fully connected network.
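A minimal sketch of the concatenation and output layer described above: the per-region rows of the input history and of the two hidden states are joined column-wise, then passed through two fully connected layers. The ReLU between the two layers, the layer sizes and all weights are assumptions chosen to keep the example checkable by hand.

```python
def concat_rows(*matrices):
    """Column-wise concatenation: join the r-th rows of all matrices."""
    return [sum((list(m[r]) for m in matrices), []) for r in range(len(matrices[0]))]


def dense(h, weight, bias, use_relu=False):
    """Fully connected layer: h @ weight + bias, optionally followed by ReLU."""
    out = []
    for row in h:
        vals = [sum(x * w for x, w in zip(row, col)) + b
                for col, b in zip(zip(*weight), bias)]
        out.append([max(0.0, v) for v in vals] if use_relu else vals)
    return out


# One region: input history (2 values) and two hidden states of size 1 (made-up numbers).
x_hist = [[1.0, 2.0]]
h1 = [[3.0]]
h2 = [[4.0]]
h = concat_rows(x_hist, h1, h2)  # vertex representation, one row per region

w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, -1.0]]
b1 = [0.0, 0.0]
w2 = [[0.5], [1.0]]
b2 = [1.0]
y = dense(dense(h, w1, b1, use_relu=True), w2, b2)  # predicted case count per region
```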
  • the model parameters can be updated by learning new data after multiple predictions, making the model more intelligent and efficient.
  • the urban influenza incidence trend prediction method of the embodiment of the present application extracts the population movement relationship from the individual movement trajectory data in the city; based on the population movement relationship, it uses the graph convolutional neural network to obtain the spatial scale information of the influenza case data and the long short-term memory network to extract the time series relationship of the influenza case data, and makes refined predictions of the influenza trend based on the spatial scale information and the time series relationship.
  • FIG. 2 is a schematic structural diagram of an urban influenza incidence trend prediction system according to an embodiment of the present application.
  • the urban influenza incidence trend prediction system 40 of the embodiment of the present application includes:
  • data collection module 41: used to obtain the influenza case data in the city and collect the individual movement trajectory data in the city;
  • data processing module 42: used to process the individual movement trajectory data set through a data-driven method, obtain the home address of each individual, set the home address as the starting point of all movement trajectories of each individual, and extract each individual's flow data from the starting point to other regions to obtain the population movement relationship between regions;
  • spatio-temporal feature extraction module 43: used to extract the spatial scale information of the influenza case data using the graph neural network based on the population movement relationship, and to extract the time series relationship of the influenza case data using the long short-term memory network;
  • influenza prediction module 44: used to obtain the urban influenza incidence trend prediction result based on the spatial scale information and the time series relationship of the influenza case data.
  • FIG. 3 is a schematic diagram of a terminal structure in an embodiment of the present application.
  • the terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
  • the memory 52 stores program instructions for realizing the above-mentioned method for predicting the incidence trend of urban influenza.
  • the processor 51 is used to execute the program instructions stored in the memory 52 to control the prediction of urban flu incidence trends.
  • the processor 51 may also be referred to as a CPU (central processing unit).
  • the processor 51 may be an integrated circuit chip with signal processing capabilities.
  • the processor 51 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • FIG. 4 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • the storage medium of the embodiment of the present application stores a program file 61 capable of realizing all the above-mentioned methods, wherein the program file 61 can be stored in the above-mentioned storage medium in the form of a software product, and includes several instructions to make a computer device (which can It is a personal computer, a server, or a network device, etc.) or a processor (processor) that executes all or part of the steps of the methods in various embodiments of the present invention.
  • The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc, or a terminal device such as a computer, server, mobile phone or tablet.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Biophysics (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Primary Health Care (AREA)
  • Computational Linguistics (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application relates to a method and system for predicting an influenza outbreak trend in a city, and a terminal and a storage medium. The method comprises: acquiring influenza case data within a city, and collecting individual movement trajectory data within the city; processing the individual movement trajectory data by means of a data-driven method, so as to acquire a population movement relationship; on the basis of the population movement relationship, extracting spatial scale information of the influenza case data by using a graph neural network, and extracting a time sequence relationship of the influenza case data by using a long short-term memory network; and according to the spatial scale information and the time sequence relationship of the influenza case data, obtaining an influenza outbreak trend prediction result for the city. By means of the embodiments of the present application, a higher spatial resolution prediction of the influenza development situation within a city is realized, and a refined analysis of influenza is completed, such that the government and public health departments are assisted in gaining insight into the influenza development situation within the city in a timely and accurate manner, and targeted intervention for epidemic prevention and control is performed, thereby guaranteeing the lives, health and safety of people to the greatest extent.

Description

Method, system, terminal and storage medium for predicting urban influenza incidence trends
Technical Field
The present application belongs to the technical field of influenza computing, and in particular relates to a method, system, terminal and storage medium for predicting urban influenza incidence trends.
Background
From the end of 2019 to early 2021, COVID-19 grew from a local outbreak into a global pandemic, infecting more than 100 million people and causing more than 2.52 million deaths worldwide. The outbreak is a reminder that public health risks remain one of the major social risks facing humanity. Influenza, for example, is an acute respiratory disease caused by the influenza virus; it can cause epidemics within a short period of time and seriously threatens public health.
At present, influenza prevention, control and early warning around the world rely mainly on traditional influenza surveillance. Surveillance systems provide a large amount of information and data useful for prevention and control, including the number of infections, the number of deaths, clinical symptoms, and information on hospitalized patients. Governments and public health departments use the reports from these systems to assess recent trends in influenza activity and to make the corresponding public health decisions and emergency responses. However, verifying and reporting grassroots surveillance data often takes a long time (weeks or even months), so the data lag behind the actual situation; moreover, because most countries lack comprehensive information collection technology, they can issue early warnings only at the province/state or city level. There is therefore an urgent need for real-time, accurate and rapid prediction of influenza incidence trends within cities, to help governments and public health departments gain timely and accurate insight into the development of influenza within cities and to protect people's lives and health to the greatest extent.
Human interaction and movement lead to the transmission and spread of influenza. To better achieve real-time, accurate and rapid prediction of influenza incidence trends within a city, it is necessary to study in depth the spatial interaction that arises during population movement within the city and to build a spatio-temporal prediction model of urban influenza incidence trends that couples this spatial interaction. Existing influenza prediction modeling methods include those based on Gaussian process models and those based on LSTM neural network models. However, most existing methods rely only on the time series features of influenza-like illness statistics and consider neither the spatial dependence between regions nor the spatial interaction between population movement and influenza transmission; because of this missing information, they struggle to predict influenza incidence trends accurately within a city. Other influenza prediction methods do make combined use of the spatio-temporal features of specific locations, but their spatial feature extraction is imperfect and the accuracy of the extracted spatial features needs to be improved.
As human mobility networks within cities grow more complex, traditional theoretical model methods cannot simulate population movement and interaction within a city at fine granularity and no longer meet the needs of modeling spatial interaction within cities. In addition, spatial interaction modeling methods based on origin-destination (OD) traffic flows are usually studied at low spatial resolution, for example at the national, province/state or city level; at the intra-urban scale, the flows in the graph structure built by such methods are too dispersed, which biases the prediction of the spatio-temporal patterns of influenza outbreaks.
Summary of the Invention
The present application provides a method, system, terminal and storage medium for predicting urban influenza incidence trends, aiming to solve, at least to some extent, one of the above-mentioned technical problems in the prior art.
To solve the above problems, the present application provides the following technical solutions:
A method for predicting urban influenza incidence trends, including:
obtaining influenza case data within a city, and collecting individual movement trajectory data within the city;
processing the individual movement trajectory data by a data-driven method to obtain the home address of each individual, setting the home address as the starting point of each individual's movement trajectory, and extracting the flow data of each individual from the starting point to other regions, to obtain the population movement relationship between regions;
based on the population movement relationship, extracting the spatial scale information of the influenza case data using a graph neural network, and extracting the time series relationship of the influenza case data using a long short-term memory network;
obtaining the urban influenza incidence trend prediction result according to the spatial scale information and the time series relationship of the influenza case data.
The technical solution adopted in the embodiment of the present application further includes: collecting the individual movement trajectory data within the city includes:
using mobile devices to collect the individual movement trajectory data within the city; the mobile devices include mobile phones or smart watches; the individual movement trajectory data include each individual's mobile phone number, signaling timestamp, and base station longitude and latitude.
The technical solution adopted in the embodiment of the present application further includes: processing the individual movement trajectory data by the data-driven method is specifically:
merging, for a set time period, the original Thiessen polygons built from the city's base stations within grid cells of a set size; dividing the individual movement trajectory data by stay time and setting the location where each individual stays longest at night as that individual's home address; and extracting the OD flow and the home-centered flow at a set time interval within the grid cells of the set size.
The technical solution adopted in the embodiment of the present application further includes: processing the individual movement trajectory data by the data-driven method further includes:
based on the population distribution and geographic location of each street (subdistrict) of the city in the census data, extracting network flows for the individual movement trajectory data using a theoretical model method; the theoretical model method includes the gravity model, the radiation model or the spatial proximity model.
The technical solution adopted in the embodiment of the present application further includes: extracting the spatial scale information of the influenza case data using the graph neural network is specifically:
The framework of the graph neural network is a message passing neural network (MPNN), which extracts spatial scale information based on spatial-domain graph convolution. Define an undirected graph G in which node v has feature vector $x_v$, the edge connecting nodes v and w has feature $e_{vw}$, N(v) denotes the neighbor nodes of node v in G, and t is the running time step. After the feature $x_v$ of node v is taken as the initial state of its hidden state
Figure PCTCN2022076294-appb-000001
the update of the hidden state by the spatial-domain graph convolution is expressed as:
Figure PCTCN2022076294-appb-000002
Figure PCTCN2022076294-appb-000003
The message passing neural network decomposes the spatial-domain graph convolution into two parts, message passing and state update, which are carried out by the message function $M_l$ and the node update function $U_l$ respectively. The message function $M_l$ aggregates the features of the neighbor nodes into a message vector to be passed to the central node. The node update function $U_l$ updates the node representation at the current step by combining it with the message obtained from the message function, which yields the spatial scale information.
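For orientation, the message passing framework described above is commonly written in the form below. This is the standard textbook form, given here as an assumption because the original formulas survive only as image references; the notation in the omitted figures may differ. The superscript $l$ indexes the propagation step (the text calls it the time step t), and $m_v^{l+1}$ is the aggregated message.

```latex
h_v^{0} = x_v, \qquad
m_v^{l+1} = \sum_{w \in N(v)} M_l\left(h_v^{l},\, h_w^{l},\, e_{vw}\right), \qquad
h_v^{l+1} = U_l\left(h_v^{l},\, m_v^{l+1}\right)
```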
The technical solution adopted in the embodiment of the present application further includes: extracting the spatial scale information of the influenza case data using the graph neural network is specifically:
The individual movement trajectory data for a set time range are converted into a weighted directed graph whose vertices represent street-level regions and whose edges capture movement patterns. At time t, the flow between regions u and v forms an edge; this edge multiplied by the number of cases in region u at time t
Figure PCTCN2022076294-appb-000004
indicates how many infected people may move from region u to region v. Let
Figure PCTCN2022076294-appb-000005
be the vector of node attributes containing the number of cases in region u for each of the past w weeks. The messages passed through the message passing neural network compute a feature vector for each region using the combined score from all regions:
Figure PCTCN2022076294-appb-000006
Here A is the adjacency matrix of regional population movement flows and $C_t$ is a matrix whose rows contain the attributes of the different regions. $x_u \in \mathbb{R}^w$ is a vector that combines the number of influenza cases within region u with the number moving toward region u, and is expressed as:
$x_u = (x_u w_{j,u} + x_u w_{i,u} + \dots + x_u w_{v,u}) + x_u w_{u,u}$
where $x_u \in \mathbb{R}^w$ is an estimate of the number of new potential cases in region u.
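The aggregation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes each term weights the case history of the source region by the flow into region u, so that the result for u is the sum over all regions v (including u itself) of `weights[v][u] * cases[v]`. The names `aggregate_cases`, `weights` and `cases` are hypothetical.

```python
def aggregate_cases(weights, cases):
    """Combine case histories across regions using mobility weights.

    weights[v][u]: flow from region v to region u (assumed layout; the
                   diagonal holds within-region movement).
    cases[v]:      case counts of region v for each of the past w weeks.
    Returns x, where x[u][k] = sum over v of weights[v][u] * cases[v][k].
    """
    n = len(cases)
    w = len(cases[0])
    x = [[0.0] * w for _ in range(n)]
    for u in range(n):
        for v in range(n):
            for k in range(w):
                x[u][k] += weights[v][u] * cases[v][k]
    return x
```

In matrix terms this is the product of the transposed flow matrix with the case matrix; a real model would use a tensor library and normalize the weights.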
The technical solution adopted in the embodiment of the present application further includes: extracting the time series relationship of the influenza case data using the long short-term memory network is specifically:
The long short-term memory network computes the hidden-layer output $h_t$ at the current time from the input $x_t$ at the current time and the hidden-layer output $h_{t-1}$ of the previous time period. The long short-term memory network is computed as:
$y_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \dots, h_{i,t-1})$
where $h_{i,t-1}$ is the representation of the influenza case data of the i-th region in time period t-1, and $y_{i,t}$ is the influenza case data predicted for the i-th region in time period t.
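The formula above feeds the previous n time periods of one region into the network to predict the next period. The sketch below shows only how such input windows and targets could be assembled from a per-region series; it does not implement the LSTM itself. The function name `make_windows` and the list-based representation are assumptions for illustration.

```python
def make_windows(series, n):
    """Build (inputs, targets) pairs for one region.

    series: per-period values for the region, ordered by time.
    n:      number of past periods fed to the sequence model.
    inputs[k] holds n consecutive periods; targets[k] is the period
    that immediately follows them.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    inputs, targets = [], []
    for t in range(n, len(series)):
        inputs.append(series[t - n:t])
        targets.append(series[t])
    return inputs, targets
```

Each `inputs[k]` plays the role of $(h_{i,t-n}, \dots, h_{i,t-1})$ and each `targets[k]` the role of $y_{i,t}$.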
Another technical solution adopted in the embodiment of the present application is: a system for predicting urban influenza incidence trends, including:
a data collection module, used to obtain influenza case data within a city and collect individual movement trajectory data within the city;
a data processing module, used to process the individual movement trajectory data by a data-driven method to obtain the home address of each individual, set the home address as the starting point of each individual's movement trajectory, and extract the flow data of each individual from the starting point to other regions, to obtain the population movement relationship between regions;
a spatio-temporal feature extraction module, used to extract, based on the population movement relationship, the spatial scale information of the influenza case data using a graph neural network, and to extract the time series relationship of the influenza case data using a long short-term memory network;
an influenza prediction module, used to obtain the urban influenza incidence trend prediction result according to the spatial scale information and the time series relationship of the influenza case data.
A further technical solution adopted in the embodiment of the present application is: a terminal including a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the method for predicting urban influenza incidence trends;
the processor is configured to execute the program instructions stored in the memory to control the prediction of urban influenza incidence trends.
A further technical solution adopted in the embodiment of the present application is: a storage medium storing program instructions executable by a processor, the program instructions being used to execute the method for predicting urban influenza incidence trends.
Compared with the prior art, the embodiments of the present application have the following beneficial effects: the method, system, terminal and storage medium for predicting urban influenza incidence trends extract the population movement relationship from individual movement trajectory data within the city; based on that relationship, they use a graph convolutional neural network to obtain the spatial scale information of the influenza case data and a long short-term memory network to extract the time series relationship of the influenza case data, and they make a refined prediction of the influenza trend from the spatial scale information and the time series relationship. The embodiments of the present application achieve higher spatial resolution prediction of the development of influenza within a city and a refined analysis of influenza, helping governments and public health departments gain timely and accurate insight into the development of influenza within the city and carry out targeted epidemic prevention and control interventions, which protects people's lives and health to the greatest extent. Compared with the prior art, the present application has at least the following beneficial effects:
(1) Incorporating large-scale mobile phone location data overcomes the inability of traditional theoretical model methods to simulate population movement and interaction within a city at fine granularity, so that individual mobility is reproduced more realistically.
(2) Compared with other data-driven modeling methods, such as spatial interaction modeling based on OD flows, the spatial interaction modeling method based on home-centered flows works at higher spatial resolution; at the intra-urban scale the flows in the resulting graph structure are more concentrated and clearly capture several strong commuting flows, which effectively improves the prediction accuracy of the deep learning model for influenza within the city.
(3) The present invention only needs mobile phone location information and does not need multi-dimensional information such as weekly average temperature, air pressure, rainfall, relative humidity, maximum temperature difference and sunshine duration, which reduces the difficulty of data acquisition and processing.
Description of the Drawings
FIG. 1 is a flowchart of the method for predicting urban influenza incidence trends according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the system for predicting urban influenza incidence trends according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of the terminal according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of the storage medium according to an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
Please refer to FIG. 1, which is a flowchart of the method for predicting urban influenza incidence trends according to an embodiment of the present application. The method includes the following steps:
S1: Obtain influenza case data within the city, and use mobile devices to collect individual movement trajectory data within the city;
In this step, the mobile devices include but are not limited to mobile phones and smart watches. In the embodiment of the present application, the individual movement trajectory data include but are not limited to each individual's mobile phone number, signaling timestamp, and the longitude and latitude of the identified base station. The individual movement trajectory data set is obtained as follows: on the China Unicom HDFS+Hive+Spark big data platform, the location information carried in the mobile phone signaling of 5.6 million users is obtained, the location information is deduplicated, and records missing key information are removed, which yields the individual movement trajectory data.
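The cleaning described in this step (deduplication and removal of records missing key information) can be sketched as follows. This is a small in-memory illustration, not the Hive/Spark job used in the embodiment; the record layout and the names `SignalingRecord` and `clean_records` are hypothetical.

```python
from typing import NamedTuple, Optional


class SignalingRecord(NamedTuple):
    """One signaling record (field names are assumptions for illustration)."""
    phone: Optional[str]       # phone number or anonymized identifier
    timestamp: Optional[int]   # signaling timestamp, seconds since epoch
    lon: Optional[float]       # base station longitude
    lat: Optional[float]       # base station latitude


def clean_records(records):
    """Drop records with any missing key field, then drop exact duplicates.

    The original order of the surviving records is preserved.
    """
    seen = set()
    cleaned = []
    for rec in records:
        if any(value is None for value in rec):
            continue  # key information missing
        if rec in seen:
            continue  # exact duplicate of an earlier record
        seen.add(rec)
        cleaned.append(rec)
    return cleaned
```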
S2: Process the individual movement trajectory data using the data-driven method and the theoretical model method: obtain the home address of each individual, set the home address as the starting point of all of that individual's movement trajectories, and extract the flow data of each individual from the starting point to other regions, to obtain the population movement relationship between regions;
In this step, the data-driven method divides the individual movement trajectory data by stay time, sets the location with the longest nighttime stay as the individual's home address, and sets the home address as the starting point of all of that individual's movement trajectories. It then extracts the flow data of each individual from the starting point to other regions (such as workplaces and entertainment venues) to obtain the population movement relationship between different regions.
Further, the data-driven processing of the individual movement trajectory data is specifically as follows: the original Thiessen polygons built from the city's base stations are merged within 500-meter grid cells for a given time period; the location where each user stays longest at night is set as that user's home address; the OD flow and the home-centered flow are extracted for every half hour within the city's 500-meter grid cells; and these flows are then mapped to the street units, giving the half-hourly OD flow and home-centered flow of the 500-meter grid cells for the given time period.
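The home detection and flow extraction described above can be sketched as follows. This is a simplified illustration under stated assumptions: "night" is taken to be 22:00 to 06:00, stays are pre-aggregated per grid cell, and the half-hour binning and mapping to street units are omitted. All function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed definition of night hours; the source does not specify one.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}


def infer_home(stays):
    """Return the cell with the longest total nighttime stay, or None.

    stays: iterable of (cell_id, hour_of_day, minutes_stayed).
    """
    night_minutes = Counter()
    for cell, hour, minutes in stays:
        if hour in NIGHT_HOURS:
            night_minutes[cell] += minutes
    if not night_minutes:
        return None
    return max(night_minutes, key=night_minutes.get)


def od_and_home_flows(trips_by_user, home_by_user):
    """Count OD flows and home-centered flows.

    trips_by_user: {user: [(origin_cell, dest_cell), ...]}
    OD flow counts each trip from its actual origin. Home-centered flow
    re-anchors every trip at the user's home cell, so a trip B -> C by a
    user living in A is counted as A -> C.
    """
    od = defaultdict(int)
    home_flow = defaultdict(int)
    for user, trips in trips_by_user.items():
        home = home_by_user.get(user)
        for origin, dest in trips:
            od[(origin, dest)] += 1
            if home is not None and dest != home:
                home_flow[(home, dest)] += 1
    return dict(od), dict(home_flow)
```

Re-anchoring trips at the home cell is what concentrates the flows: chained trips that would scatter across many OD pairs all contribute to edges leaving the home region.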
Next, the theoretical-model processing of the individual movement trajectory data is specifically as follows: based on the population distribution and geographic location of each street (subdistrict) of the city in the census data, network flows are extracted for the individual movement trajectory data set using theoretical models such as the gravity model, the radiation model and the spatial proximity model, where:
The gravity model holds that the intensity of population flow between two regions is directly proportional to their respective populations and inversely proportional to the distance between them. The gravity model between two urban regions can usually be expressed as:
Figure PCTCN2022076294-appb-000007
In the above formula, $T_{ij}$ is the intensity of population flow between region i and region j within the city, $m_i$ and $n_j$ are the populations of region i and region j respectively, $r_{ij}$ is the distance between region i and region j, and a is a constant that can take the value 1.
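A minimal sketch of the gravity model follows. Because the original formula survives only as an image reference, the exact form is an assumption: here a is treated as a scaling constant and the distance enters with exponent 1, matching the verbal description (proportional to both populations, inversely proportional to distance). The function name `gravity_flow` is hypothetical.

```python
def gravity_flow(m_i, n_j, r_ij, a=1.0):
    """Estimated flow intensity between regions i and j.

    m_i, n_j: populations of region i and region j.
    r_ij:     distance between the two regions (must be positive).
    a:        constant; treated here as a scale factor (assumption).
    """
    if r_ij <= 0:
        raise ValueError("distance must be positive")
    return a * m_i * n_j / r_ij
```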
The radiation model treats population movement as a stochastic process governed by a joint probability that depends on the population distribution of the origin, the destination and the area of influence. The radiation model is given by:
Figure PCTCN2022076294-appb-000008
$T_i = m_i (N_c / N)$    (3)
In the above formulas, $s_{ij}$ is the total population within the circle centered on region i whose radius is the distance between region i and region j, N is the total population of region i, and $N_c$ is the commuting population of region i.
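A minimal sketch of the radiation model follows. The first formula above survives only as an image reference, so this sketch assumes the widely used standard form $T_{ij} = T_i \, m_i n_j / ((m_i + s_{ij})(m_i + n_j + s_{ij}))$, which may differ in detail from the omitted figure. The function names are hypothetical.

```python
def commuters(m_i, n_c, n_total):
    """Equation (3): outgoing commuters T_i = m_i * (N_c / N)."""
    return m_i * (n_c / n_total)


def radiation_flow(m_i, n_j, s_ij, t_i):
    """Flow from region i to region j under the standard radiation model.

    m_i, n_j: populations of origin i and destination j.
    s_ij:     population inside the circle around i with radius r_ij
              (assumed here to exclude the origin and destination).
    t_i:      total commuters leaving region i.
    """
    denom = (m_i + s_ij) * (m_i + n_j + s_ij)
    return t_i * m_i * n_j / denom
```

Unlike the gravity model, this form has no free parameter: the flow falls off with the intervening population $s_{ij}$ rather than directly with distance.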
The spatial proximity model is based on the first law of geography: everything is related to everything else, but near things are more related than distant things. The present invention therefore considers the spatial relationship between nearby things, that is, the adjacency relationship. Based on the adjacency of regions, the present invention determines whether an edge exists between two regions (nodes) according to whether the two regions touch each other, and obtains the adjacency matrix corresponding to the adjacency relationship.
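The adjacency matrix described above can be sketched as follows. As an assumption for illustration, each region is represented as a set of grid cells and two regions are considered to touch when any of their cells share a side; a real implementation would test polygon boundaries with a GIS library. The name `contiguity_matrix` is hypothetical.

```python
def contiguity_matrix(regions):
    """Build a symmetric 0/1 adjacency matrix from region contiguity.

    regions: list of sets of (row, col) grid cells, one set per region.
    Two regions are adjacent when a cell of one shares a side with a
    cell of the other.
    """
    def touches(a, b):
        return any((r + dr, c + dc) in b
                   for r, c in a
                   for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))

    n = len(regions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if touches(regions[i], regions[j]):
                adj[i][j] = adj[j][i] = 1
    return adj
```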
S3: Input the influenza case data and the population movement relationship into a graph neural network (GNN); the graph neural network extracts spatial features from the influenza case data according to the population movement relationship to obtain the spatial scale information of the influenza case data;
In this step, a graph neural network is used to model spatial dependence; it has a wider range of application and better generalization ability. The core idea of a graph neural network is to learn a function that lets each node aggregate its own features and those of its neighbors to generate a new feature representation of the node. A neural network model containing multiple graph convolution layers is built; through repeated iteration and learning, it uses the topology of the graph to learn from the input features of each node and make predictions. A graph neural network is a neighbor aggregation strategy: the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighbor nodes. The framework of the graph neural network is the message passing neural network (MPNN), a formal framework for spatial-domain graph convolution. First define an undirected graph G in which node v has feature vector $x_v$, the edge connecting nodes v and w has feature $e_{vw}$, N(v) denotes the neighbor nodes of node v in G, and t is the running time step. After the feature $x_v$ of node v is taken as the initial state of its hidden state
Figure PCTCN2022076294-appb-000009
the update of the hidden state by the spatial-domain graph convolution is expressed by the following formulas:
Figure PCTCN2022076294-appb-000010
Figure PCTCN2022076294-appb-000011
The message passing neural network decomposes spatial-domain graph convolution into two operations, message passing and state update, carried out by the message function $M_l$ and the node update function $U_l$ respectively. The message function $M_l$ aggregates the features of the neighboring nodes into a message vector to be passed to the central node. The node update function $U_l$ updates the node representation at the current step by combining the current node representation with the message obtained from the message function, thereby producing the spatial-scale information of the influenza case data.
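The split into a message function and a node update function can be illustrated with a minimal sketch. It assumes the commonly used MPNN form, in which the messages $M_l(h_v, h_w, e_{vw})$ from all neighbors $w \in N(v)$ are summed and the result is passed to $U_l$ together with the current state of $v$; the summation, the function names, and the toy choices for $M_l$ and $U_l$ are illustrative assumptions, not the formulas of this application.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def mpnn_step(
    hidden: Dict[str, Vector],
    edges: Dict[Tuple[str, str], float],
    message_fn: Callable[[Vector, Vector, float], Vector],
    update_fn: Callable[[Vector, Vector], Vector],
) -> Dict[str, Vector]:
    """One round of message passing: sum neighbor messages, then update each node."""
    dim = len(next(iter(hidden.values())))
    messages = {v: [0.0] * dim for v in hidden}
    for (v, w), e_vw in edges.items():
        # The graph is undirected, so every edge carries a message both ways.
        for target, source in ((v, w), (w, v)):
            msg = message_fn(hidden[target], hidden[source], e_vw)
            messages[target] = [a + b for a, b in zip(messages[target], msg)]
    return {v: update_fn(hidden[v], messages[v]) for v in hidden}


def weighted_neighbor(h_v: Vector, h_w: Vector, e_vw: float) -> Vector:
    """Hypothetical M_l: the neighbor's state scaled by the edge feature."""
    return [e_vw * x for x in h_w]


def average_update(h_v: Vector, m_v: Vector) -> Vector:
    """Hypothetical U_l: the mean of the current state and the incoming message."""
    return [0.5 * (a + b) for a, b in zip(h_v, m_v)]


# Initial hidden states are the node features x_v.
hidden0 = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [4.0, 4.0]}
edges = {("a", "b"): 1.0, ("b", "c"): 0.5}
hidden1 = mpnn_step(hidden0, edges, weighted_neighbor, average_update)
```

In a trained model both functions would carry learnable parameters; the fixed functions here only make the data flow visible.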
Specifically, the graph neural network extracts spatial features from the influenza case data according to the population movement relationships as follows. The individual movement trajectory data for a set time range are converted into a weighted directed graph whose vertices represent street-level areas and whose edges capture movement patterns. For example, the weight $w_{v,u}$ of the edge $(v,u)$ from vertex $v$ to vertex $u$ is the total number of individuals whose home address is in area $v$ and who move to area $u$ at time $t$. At time $t$, the flow between areas $u$ and $v$ forms an edge; multiplying this edge by the number of cases in area $u$ at time $t$,
Figure PCTCN2022076294-appb-000012
gives a relative score indicating how many infected people are likely to have moved from area $u$ to area $v$. Let
Figure PCTCN2022076294-appb-000013
be the vector of node attributes containing the number of cases in area $u$ for each of the past $w$ weeks. The messages passed through the network use the combined scores from all areas to compute a feature vector for each area, as follows:
Figure PCTCN2022076294-appb-000014
where $A$ is the adjacency matrix of population movement flows between areas and $X_t$ is a matrix whose rows contain the attributes of the different areas. $x_u \in \mathbb{R}^w$ is a vector that combines the number of influenza cases within area $u$ with the number moving toward area $u$. $x_u$ is expressed as:
$$x_u = (x_u w_{j,u} + x_u w_{i,u} + \dots + x_u w_{v,u}) + x_u w_{u,u} \tag{7}$$
Here area $u$ receives people from different areas, and $X_t$ contains the vectors of past cases in that area. $x_u \in \mathbb{R}^w$ is an estimate of the number of new potential cases in area $u$, split into cases received from other areas and new cases arising from mobility within area $u$. To update the vertex representations of each input graph, the embodiment of this application uses the following neighborhood aggregation scheme:
Figure PCTCN2022076294-appb-000015
In the above formula, $H_i$ is a matrix containing the node representations of the previous layer, with $H_0 = X$; $W_0$ is the matrix of trainable parameters of the first layer; and $f$ is a nonlinear activation function such as ReLU. The adjacency matrix $A$ is normalized so that the weights of the incoming edges of each node sum to 1. In a model with $K$ neighborhood aggregation layers, the final node representations capture more and more global information as the number of layers increases.
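The normalization and one aggregation layer can be sketched in plain Python. The sketch assumes a layer of the common form $H_{i+1} = f(\tilde{A} H_i W_i)$ with $f$ = ReLU, where $\tilde{A}$ is the normalized adjacency matrix; this form, the row-as-destination storage convention, and all numbers are assumptions made for illustration, not a transcription of the aggregation formula of this application.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_incoming(adj: Matrix) -> Matrix:
    """Scale weights so the incoming-edge weights of every node sum to 1.

    Assumed convention: adj[u][v] is the flow arriving in area u from area v,
    so each row collects the incoming edges of one node.
    """
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([w / total if total > 0 else 0.0 for w in row])
    return normalized


def relu(m: Matrix) -> Matrix:
    return [[max(0.0, x) for x in row] for row in m]


def aggregation_layer(adj_norm: Matrix, h: Matrix, weight: Matrix) -> Matrix:
    """One neighborhood aggregation layer: ReLU(A_norm @ H @ W)."""
    return relu(matmul(matmul(adj_norm, h), weight))


# Toy flows between three areas (diagonal entries are within-area mobility).
flows = [[2.0, 1.0, 1.0], [0.0, 3.0, 1.0], [1.0, 0.0, 1.0]]
# H_0 = X: case counts of each area for the past two weeks.
cases = [[4.0, 8.0], [0.0, 4.0], [8.0, 0.0]]
# Hypothetical trainable parameters of the first layer.
w0 = [[1.0, -1.0], [0.0, 1.0]]

a_norm = normalize_incoming(flows)
h1 = aggregation_layer(a_norm, cases, w0)
```

Stacking $K$ such layers lets information travel $K$ hops along the mobility graph, which is why deeper models see more global context.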
S4: Input the influenza case data and the spatial-scale information into a long short-term memory (LSTM) network, which extracts the time-series relationships in the influenza case data.
In this step, the long short-term memory network is a recurrent neural network (RNN) for processing sequence data that alleviates the vanishing-gradient and exploding-gradient problems that arise when training on long sequences. Compared with an ordinary RNN, an LSTM performs better on longer sequences. An LSTM has three stages: forgetting, selective memory, and output. The forgetting stage selectively forgets the input passed on from the previous node. The selective memory stage selectively "memorizes" the input of the current stage. The output stage decides what will be treated as the output of the current state and scales the result of the previous stage with a tanh activation function. The LSTM computation can be written as:
$$y_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \dots, h_{i,t-1}) \tag{9}$$
where $h_{i,t-1}$ is the representation of the influenza case data of the $i$-th area in time period $t-1$, and $y_{i,t}$ is the predicted influenza case data of the $i$-th area in time period $t$.
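Equation (9) predicts period $t$ from the $n$ preceding periods, so training data for one area can be arranged as sliding windows. The following is a minimal sketch of that arrangement; the function name `make_windows` and the sample numbers are hypothetical, and scalar case counts stand in for the per-period representations $h_{i,t}$.

```python
from typing import List, Tuple


def make_windows(series: List[float], n: int) -> List[Tuple[List[float], float]]:
    """Pair each run of n consecutive values with the value that follows it."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [(series[t - n:t], series[t]) for t in range(n, len(series))]


# Hypothetical weekly case counts for one area.
weekly_cases = [3.0, 5.0, 8.0, 13.0, 21.0]
samples = make_windows(weekly_cases, n=3)
```

Each pair holds the inputs for periods $t-n$ to $t-1$ and the target for period $t$; a series shorter than $n+1$ yields no samples.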
In the embodiment of this application, the long short-term memory network extracts the time-series relationships in the influenza case data as follows. The network computes the hidden-layer output $h_t$ at the current time step from the current input $x_t$ and the hidden-layer output $h_{t-1}$ of the previous time step, and additionally introduces an input gate $i_t$, a forget gate $f_t$, an output gate $o_t$, and a memory cell $C_t$. The input gate is obtained by applying a linear transformation to the input $x_t$ and the previous hidden-layer output $h_{t-1}$, followed by an activation function; it controls how much of the newly computed state is written into the memory cell. The forget gate and the output gate are computed in the same way as the input gate; they control, respectively, how much of the information in the previous memory cell is forgotten and how much the current output depends on the current memory cell. The input, forget, and output gates each have their own parameters $W$ and $b$. The memory cell is governed mainly by the input gate and the forget gate: the input gate controls which information in the current input sequence is memorized, and the forget gate controls which information in the historical memory is forgotten. The hidden-layer output $h_t$ at time $t$ is finally determined by the output gate and the memory cell. The update formulas for the computing units of the long short-term memory network are as follows:
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{10}$$
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{11}$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{12}$$
Figure PCTCN2022076294-appb-000016
Figure PCTCN2022076294-appb-000017
$$h_t = o_t * \tanh(C_t) \tag{15}$$
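A single LSTM time step can be sketched in plain Python to show how the gates interact. The gate computations and the final output follow equations (10) to (12) and (15); the candidate state and the memory-cell update use the standard LSTM form ($\tilde{C}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$ and $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$), which is an assumption here, as are the parameter names `W_c` and `b_c`.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(weight: Matrix, bias: Vector, vec: Vector) -> Vector:
    """Compute weight @ vec + bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def lstm_step(
    x_t: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, list]
) -> Tuple[Vector, Vector]:
    """One LSTM step; returns the new hidden state and the new memory cell."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]  # eq. (10)
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]  # eq. (11)
    o_t = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]  # eq. (12)
    # Assumed standard candidate state and memory-cell update.
    c_hat = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # eq. (15)
    return h_t, c_t


# Toy parameters: hidden size 1, input size 1, all weights and biases zero,
# so every gate equals sigmoid(0) = 0.5 and the candidate state is tanh(0) = 0.
params = {name: [[0.0, 0.0]] for name in ("W_i", "W_f", "W_o", "W_c")}
params.update({name: [0.0] for name in ("b_i", "b_f", "b_o", "b_c")})
h_t, c_t = lstm_step([3.0], [0.0], [2.0], params)
```

With these toy parameters the forget gate keeps half of the previous memory cell, so the cell drops from 2.0 to 1.0 and the output gate passes half of $\tanh(1.0)$ to the hidden state.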
Based on the above, for the information $X_t$ at each input time step, the graph neural network first extracts the spatial-scale information of each area; the resulting input sequence of each area is then fed to the long short-term memory network, which learns the hidden features of each area through training. All areas share the same long short-term memory network model, which improves the generalization ability of the model while reducing the number of parameters and hence the model complexity. While learning the original features in the time-series relationships, the model also makes full use of the introduced spatial context information, which improves the predictions. As the network becomes deeper, the final node features capture more and more global information. To preserve local intermediate information, however, the present invention concatenates the hidden states $H_1$ and $H_2$ of the last time step of the two LSTM layers with the input historical information, as follows:
$$H = \mathrm{CONCAT}(X_{t-n}, C_{t-n+1}, \dots, C_{t-1}, H_1, H_2) \tag{16}$$
The rows of the matrix $H$ can be regarded as vertex representations that encode multi-scale structural information, including the initial features of the nodes. These vertex representations are then passed to an output layer consisting of a two-layer fully connected network.
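The concatenation followed by a two-layer fully connected output can be sketched for a single area as follows. The ReLU between the two layers, the layer sizes, and all parameter values are assumptions chosen for illustration; the text above fixes only the concatenation and the use of two fully connected layers.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense(weight: Matrix, bias: Vector, vec: Vector) -> Vector:
    """Fully connected layer without activation: weight @ vec + bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def predict_area(
    history: Vector, h1: Vector, h2: Vector,
    w_hidden: Matrix, b_hidden: Vector, w_out: Matrix, b_out: Vector,
) -> Vector:
    """Concatenate the history with both LSTM states, then apply two dense layers."""
    features = history + h1 + h2  # CONCAT of the row for one area
    hidden = [max(0.0, x) for x in dense(w_hidden, b_hidden, features)]  # assumed ReLU
    return dense(w_out, b_out, hidden)


# Hypothetical inputs: two historical values and one-dimensional LSTM states.
history = [1.0, 2.0]
h1_last, h2_last = [0.5], [-0.5]
w_hidden = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
b_hidden = [0.0, -2.0]
w_out = [[2.0, 5.0]]
b_out = [1.0]

prediction = predict_area(history, h1_last, h2_last, w_hidden, b_hidden, w_out, b_out)
```

Because the raw history is part of the concatenated row, the output layer can still use local information that deeper layers may have smoothed away.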
S5: Obtain the urban influenza incidence trend prediction result from the spatial-scale information and the time-series relationships of the influenza case data.
Because the embodiment of this application predicts the influenza incidence trend with deep learning, the model parameters can be updated by learning from new data after multiple rounds of prediction, making the model more intelligent and efficient.
Based on the above, the urban influenza incidence trend prediction method of the embodiment of this application extracts population movement relationships from individual movement trajectory data within the city. Based on these relationships, it uses a graph convolutional neural network to obtain the spatial-scale information of the influenza case data and a long short-term memory network to extract the time-series relationships in the influenza case data, and makes fine-grained predictions of the influenza trend from the spatial-scale information and the time-series relationships. The embodiment predicts the development of influenza within a city at a higher spatial resolution and supports fine-grained analysis of influenza, helping government and public health departments gain timely and accurate insight into the situation within the city and carry out targeted epidemic prevention and control, so as to protect people's lives and health as far as possible. Compared with the prior art, this application has at least the following beneficial effects:
(1) Incorporating large-scale mobile phone location data overcomes the inability of traditional theoretical model methods to simulate population movement and interaction within a city at fine granularity, so that individual mobility is reproduced more realistically.
(2) The spatial interaction modeling method based on home-centered flows is used. Compared with other data-driven modeling methods, such as spatial interaction modeling based on OD (origin-destination) flows, the graph structure built at the city scale and at high spatial resolution has more concentrated flows and clearly captures several strong commuting flows, which effectively improves the prediction accuracy of the deep learning model for influenza within the city.
(3) The present invention requires only mobile phone location information. It does not need multi-dimensional information such as weekly average temperature, air pressure, rainfall, relative humidity, maximum temperature difference, and sunshine duration, which reduces the difficulty of acquiring and processing data.
Please refer to FIG. 2, which is a schematic structural diagram of the urban influenza incidence trend prediction system of the embodiment of this application. The urban influenza incidence trend prediction system 40 of the embodiment of this application includes:
Data collection module 41: used to obtain influenza case data within the city and to collect individual movement trajectory data within the city;
Data processing module 42: used to process the individual movement trajectory data set by a data-driven method, obtain the home address of each individual, set the home address as the starting point of all movement trajectories of that individual, and extract the flow data of each individual traveling from the starting point to other areas, to obtain the population movement relationships between areas;
Spatio-temporal feature extraction module 43: used to extract, based on the population movement relationships, the spatial-scale information of the influenza case data with a graph neural network and the time-series relationships of the influenza case data with a long short-term memory network;
Influenza prediction module 44: used to obtain the urban influenza incidence trend prediction result from the spatial-scale information and the time-series relationships of the influenza case data.
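The work of the data processing module can be sketched as follows: infer each individual's home area as the area with the most night-time observations, then count, for every pair of areas, how many individuals living in one area visit the other. The record layout, the definition of night as hours 0 to 5, and the use of observation counts as a proxy for stay time are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# (anonymized user id, hour of day 0-23, area id): a hypothetical record layout.
Record = Tuple[str, int, str]

NIGHT_HOURS = frozenset(range(0, 6))  # assumed definition of "night"


def infer_homes(records: Iterable[Record]) -> Dict[str, str]:
    """Take the area with the most night-time observations as the home area."""
    night_counts: Dict[str, Counter] = defaultdict(Counter)
    for user, hour, area in records:
        if hour in NIGHT_HOURS:
            night_counts[user][area] += 1
    return {user: counts.most_common(1)[0][0] for user, counts in night_counts.items()}


def home_centered_flows(
    records: Iterable[Record], homes: Dict[str, str]
) -> Dict[Tuple[str, str], int]:
    """Count individuals traveling from their home area to each other area."""
    visited = set()
    for user, _hour, area in records:
        home = homes.get(user)
        if home is not None and area != home:
            visited.add((user, home, area))  # each individual counts once per pair
    return dict(Counter((home, area) for _user, home, area in visited))


records: List[Record] = [
    ("u1", 1, "A"), ("u1", 2, "A"), ("u1", 3, "B"), ("u1", 10, "C"), ("u1", 15, "C"),
    ("u2", 0, "B"), ("u2", 4, "B"), ("u2", 9, "C"), ("u2", 13, "A"),
    ("u3", 12, "A"),  # no night-time record, so no home can be inferred
    ("u4", 2, "A"), ("u4", 11, "C"),
]
homes = infer_homes(records)
flows = home_centered_flows(records, homes)
```

The resulting dictionary maps each (home area, visited area) pair to a head count, which is the edge weight of the weighted directed graph passed to the graph neural network.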
Please refer to FIG. 3, which is a schematic structural diagram of the terminal of the embodiment of this application. The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the above urban influenza incidence trend prediction method.
The processor 51 is used to execute the program instructions stored in the memory 52 to control the urban influenza incidence trend prediction.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
Please refer to FIG. 4, which is a schematic structural diagram of the storage medium of the embodiment of this application. The storage medium stores a program file 61 capable of implementing all of the above methods. The program file 61 may be stored in the storage medium in the form of a software product and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, as well as terminal devices such as computers, servers, mobile phones, and tablets.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A method for predicting the incidence trend of urban influenza, characterized by comprising:
    obtaining influenza case data within a city, and collecting individual movement trajectory data within the city;
    processing the individual movement trajectory data by a data-driven method to obtain the home address of each individual, setting the home address as the starting point of each individual's movement trajectory, and extracting the flow data of each individual traveling from the starting point to other areas, to obtain the population movement relationships between areas;
    based on the population movement relationships, extracting spatial-scale information of the influenza case data using a graph neural network, and extracting time-series relationships of the influenza case data using a long short-term memory network;
    obtaining an urban influenza incidence trend prediction result according to the spatial-scale information and the time-series relationships of the influenza case data.
  2. The method for predicting the incidence trend of urban influenza according to claim 1, characterized in that collecting the individual movement trajectory data within the city comprises:
    collecting the individual movement trajectory data within the city using mobile devices; the mobile devices include mobile phones or smart watches; the individual movement trajectory data include each individual's mobile phone number, signaling timestamps, and base station latitude and longitude.
  3. The method for predicting the incidence trend of urban influenza according to claim 2, characterized in that processing the individual movement trajectory data by a data-driven method specifically comprises:
    merging the original Thiessen polygons built from the city's base stations within grid areas of a set distance over a set time period; dividing the individual movement trajectory data by stay time, setting the location where each individual stays longest at night as that individual's home address, and extracting the OD flows and home-centered flows at set time intervals within the grid areas of the set distance.
  4. The method for predicting the incidence trend of urban influenza according to claim 1, characterized in that processing the individual movement trajectory data by a data-driven method further comprises:
    extracting network flows from the individual movement trajectory data by a theoretical model method, according to the population distribution information and geographic location information of each street of the city in the census data; the theoretical model method includes a gravity model, a radiation model, or a spatial proximity model.
  5. The method for predicting the incidence trend of urban influenza according to any one of claims 1 to 4, characterized in that extracting the spatial-scale information of the influenza case data using a graph neural network specifically comprises:
    the framework of the graph neural network is a message passing neural network, which extracts the spatial-scale information based on spatial-domain graph convolution; an undirected graph $G$ is defined in which the feature vector of node $v$ is $x_v$, the feature of the edge connecting nodes $v$ and $w$ is $e_{vw}$, $N(v)$ denotes the neighbors of node $v$ in $G$, and $t$ is the running time step; after the feature $x_v$ of node $v$ is taken as the initial value of its hidden state
    Figure PCTCN2022076294-appb-100001
    the update of the hidden state by the spatial-domain graph convolution is expressed as:
    Figure PCTCN2022076294-appb-100002
    Figure PCTCN2022076294-appb-100003
    the message passing neural network decomposes the spatial-domain graph convolution into two operations, message passing and state update, carried out by the message function $M_l$ and the node update function $U_l$ respectively; the message function $M_l$ aggregates the features of the neighboring nodes into a message vector to be passed to the central node; the node update function $U_l$ updates the node representation at the current step by combining the current node representation with the message obtained from the message function, to obtain the spatial-scale information.
  6. The method for predicting the incidence trend of urban influenza according to claim 5, characterized in that extracting the spatial-scale information of the influenza case data using a graph neural network specifically comprises:
    converting the individual movement trajectory data for a set time range into a weighted directed graph whose vertices represent street-level areas and whose edges capture movement patterns; at time $t$, the flow between areas $u$ and $v$ forms an edge, and this edge multiplied by the number of cases in area $u$ at time $t$
    Figure PCTCN2022076294-appb-100004
    indicates how many infected people may have moved from area $u$ to area $v$; let
    Figure PCTCN2022076294-appb-100005
    be the vector of node attributes containing the number of cases in area $u$ for each of the past $w$ weeks; the messages passed through the message passing neural network use the combined scores from all areas to compute a feature vector for each area:
    Figure PCTCN2022076294-appb-100006
    where $A$ is the adjacency matrix of population movement flows between areas and $X_t$ is a matrix whose rows contain the attributes of the different areas; $x_u \in \mathbb{R}^w$ is a vector that combines the number of influenza cases within area $u$ with the number moving toward area $u$, and $x_u$ is expressed as:
    $$x_u = (x_u w_{j,u} + x_u w_{i,u} + \dots + x_u w_{v,u}) + x_u w_{u,u}$$
    where $x_u \in \mathbb{R}^w$ is an estimate of the number of new potential cases in area $u$.
  7. The method for predicting the incidence trend of urban influenza according to claim 6, characterized in that extracting the time-series relationships of the influenza case data using a long short-term memory network specifically comprises:
    the long short-term memory network computes the hidden-layer output $h_t$ at the current time step from the current input $x_t$ and the hidden-layer output $h_{t-1}$ of the previous time step, and the computation of the long short-term memory network is:
    $$y_{i,t} = \mathrm{LSTM}(h_{i,t-n}, h_{i,t-n+1}, \dots, h_{i,t-1})$$
    where $h_{i,t-1}$ is the representation of the influenza case data of the $i$-th area in time period $t-1$, and $y_{i,t}$ is the predicted influenza case data of the $i$-th area in time period $t$.
  8. A system for predicting the incidence trend of urban influenza, characterized by comprising:
    a data collection module, used to obtain influenza case data within a city and to collect individual movement trajectory data within the city;
    a data processing module, used to process the individual movement trajectory data by a data-driven method, obtain the home address of each individual, set the home address as the starting point of each individual's movement trajectory, and extract the flow data of each individual traveling from the starting point to other areas, to obtain the population movement relationships between areas;
    a spatio-temporal feature extraction module, used to extract, based on the population movement relationships, the spatial-scale information of the influenza case data using a graph neural network and the time-series relationships of the influenza case data using a long short-term memory network;
    an influenza prediction module, used to obtain an urban influenza incidence trend prediction result according to the spatial-scale information and the time-series relationships of the influenza case data.
  9. A terminal, characterized in that the terminal comprises a processor and a memory coupled to the processor, wherein
    the memory stores program instructions for implementing the method for predicting the incidence trend of urban influenza according to any one of claims 1 to 7;
    the processor is used to execute the program instructions stored in the memory to control the urban influenza incidence trend prediction.
  10. A storage medium, characterized in that it stores program instructions executable by a processor, the program instructions being used to execute the method for predicting the incidence trend of urban influenza according to any one of claims 1 to 7.
PCT/CN2022/076294 2021-12-31 2022-02-15 Method and system for predicting influenza outbreak trend in city, and terminal and storage medium WO2023123624A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111670090.1A CN114388137A (en) 2021-12-31 2021-12-31 Urban influenza incidence trend prediction method, system, terminal and storage medium
CN202111670090.1 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023123624A1 true WO2023123624A1 (en) 2023-07-06

Family

ID=81199598

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076294 WO2023123624A1 (en) 2021-12-31 2022-02-15 Method and system for predicting influenza outbreak trend in city, and terminal and storage medium

Country Status (2)

Country Link
CN (1) CN114388137A (en)
WO (1) WO2023123624A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115966313B (en) * 2023-03-09 2023-06-09 创意信息技术股份有限公司 Integrated management platform based on face recognition
CN116525135B (en) * 2023-04-27 2024-03-19 兰州大学 Method for predicting epidemic situation development situation by space-time model based on meteorological factors
CN116823572B (en) * 2023-06-16 2023-12-19 中国联合网络通信有限公司深圳市分公司 Population flow data acquisition method and device and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150012292A1 (en) * 2012-02-13 2015-01-08 Biodiaspora Inc. Warning System For Infectious Diseases And Method Therefor
US20200176124A1 (en) * 2017-07-28 2020-06-04 Koninklijke Philips N.V. Monitoring direct and indirect transmission of infections in a healthcare facility using a real-time locating system
CN113111581A (en) * 2021-04-09 2021-07-13 重庆邮电大学 LSTM trajectory prediction method combining space-time factors and based on graph neural network
CN113450923A (en) * 2020-03-27 2021-09-28 中国科学院深圳先进技术研究院 Method and system for simulating influenza spatiotemporal propagation process by large-scale trajectory data

Non-Patent Citations (2)

Title
WANG, HAOBO ET AL.: "Prospect of Systems Toxicology Research Based on Graph Neural Network Model", ENVIRONMENTAL CHEMISTRY, vol. 40, no. 11, 11 November 2021 (2021-11-11), pages 3297 - 3303, XP009547549, ISSN: 0254-6108 *
WANG, XIAODONG ET AL.: "User Behavior Analysis with RNN and Graph Neural Networks", JOURNAL OF FRONTIERS OF COMPUTER SCIENCE AND TECHNOLOGY, 9 November 2020 (2020-11-09), pages 839 - 845, XP009547551, ISSN: 1673-9418 *

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN116561385A (en) * 2023-07-10 2023-08-08 中国人民解放军军事科学院系统工程研究院 Knowledge representation-based plan quick matching recommendation method
CN116561385B (en) * 2023-07-10 2023-10-13 中国人民解放军军事科学院系统工程研究院 Knowledge representation-based plan quick matching recommendation method
CN117079148A (en) * 2023-10-17 2023-11-17 腾讯科技(深圳)有限公司 Urban functional area identification method, device, equipment and medium
CN117079148B (en) * 2023-10-17 2024-01-05 腾讯科技(深圳)有限公司 Urban functional area identification method, device, equipment and medium

Also Published As

Publication number Publication date
CN114388137A (en) 2022-04-22

Similar Documents

Publication Publication Date Title
WO2023123624A1 (en) Method and system for predicting influenza outbreak trend in city, and terminal and storage medium
WO2023123625A1 (en) Urban epidemic space-time prediction method and system, terminal and storage medium
Felemban et al. Digital revolution for Hajj crowd management: a technology survey
CN110147904B (en) Urban gathering event prediction and positioning method and device
Zheng et al. Diagnosing New York City's noises with ubiquitous data
Song et al. Prediction and simulation of human mobility following natural disasters
US8341110B2 (en) Temporal-influenced geospatial modeling system and method
CN113673769B (en) Traffic flow prediction method of graph neural network based on multivariate time sequence interpolation
Xiao et al. Predicting urban region heat via learning arrive-stay-leave behaviors of private cars
WO2014194480A1 (en) Air quality inference using multiple data sources
Yuan et al. Smart flood resilience: harnessing community-scale big data for predictive flood risk monitoring, rapid impact assessment, and situational awareness
CN115775085B (en) Digital twinning-based smart city management method and system
Zhao et al. Observing individual dynamic choices of activity chains from location-based crowdsourced data
Rahman et al. A deep learning approach for network-wide dynamic traffic prediction during hurricane evacuation
Zhuang et al. Integrating a deep forest algorithm with vector‐based cellular automata for urban land change simulation
Sai et al. Optimal design of urban transportation planning based on big data
Gong et al. Spatio-temporal parking behaviour forecasting and analysis before and during COVID-19
Zhang et al. Off-deployment traffic estimation—a traffic generative adversarial networks approach
Yang et al. Short‐Term Forecasting of Dockless Bike‐Sharing Demand with the Built Environment and Weather
WO2023004595A1 (en) Parking data recovery method and apparatus, and computer device and storage medium
Pathirana et al. Deep learning based flood prediction and relief optimization
Cheng et al. Network SpaceTime AI: Concepts, Methods and Applications
Hu et al. A simplified deep residual network for citywide crowd flows prediction
Zhang et al. Situational-Aware Multi-Graph Convolutional Recurrent Network (SA-MGCRN) for Travel Demand Forecasting During Wildfires
Guo et al. Traffic flow prediction method of diversion area in peak hours based on double flow graph convolution network

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22912931

Country of ref document: EP

Kind code of ref document: A1