WO2021189516A1 - Method and system for simulating process of temporal and spatial circulation of influenza with massive trajectory data - Google Patents

Publication number
WO2021189516A1
Authority
WO
WIPO (PCT)
Prior art keywords
individual
influenza
individuals
data
spatial
Prior art date
Application number
PCT/CN2020/082708
Other languages
French (fr)
Chinese (zh)
Inventor
尹凌
张�浩
Original Assignee
中国科学院深圳先进技术研究院
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2021189516A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80: ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02: Services making use of location information
    • H04W4/029: Location-based management or tracking services
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • The invention relates to a method and a system for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data.
  • Influenza is an acute respiratory infection caused by influenza virus and parainfluenza virus. The virus spreads easily from person to person through the droplets and particles produced when an infected person coughs or sneezes, and it seriously endangers people's lives and health.
  • The spatiotemporal evolution of human activities and the associated fluctuations in population density are key drivers of the dynamics of infectious disease outbreaks. Tracing contacts between urban individuals at high spatial and temporal resolution, starting from discrete individuals and inferring potentially infected individuals from the time and location of each individual's activities, helps to represent the spatiotemporal spread of influenza accurately and to improve the prediction accuracy of spatial models of influenza spread.
  • Bottlenecks provide new opportunities.
  • Agent-based infectious disease models at the urban scale mainly reconstruct the movement of individuals and the contact network between individuals from travel survey data, and study the transmission characteristics of infectious diseases as time series.
  • The prior art has at least the following shortcomings. Building a real-world-oriented agent model requires a large amount of real individual data; without it, precise individual positions and contacts between individuals cannot be known.
  • Although existing studies have attempted to model population movement from trajectory data in order to build infectious disease spread models, existing trajectory-based methods still cannot effectively meet the individual-modeling need to integrate population attributes (such as age, gender, and occupation), family structure, and the movement characteristics of trajectory data. They lack effective methods for fusing trajectory data to construct spatiotemporal proximity relationships between urban individuals (for example, individuals appearing in the same place at the same time), and they even neglect the complexity of the contact space of individuals in densely populated modern cities. In addition, trajectory data have sample bias and cannot represent the entire population of a city.
  • The present invention provides a method for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data.
  • The method includes the following steps: a. synthesizing an urban population based on population census data and building census data, and assigning corresponding population attributes to the individuals of the synthetic urban population; b. for the individuals of the synthetic urban population that have been assigned population attributes, constructing individual activity chains using mobile phone location data as the primary source and travel survey data as a supplement; c. based on the constructed individual activity chains, dynamically constructing a time series of 24 contact networks per day with a time step of one hour; d. based on the constructed 24 hourly contact networks, using the SEIR model to simulate the spread of influenza at high spatial and temporal resolution.
  • The method further includes step e: analyzing the simulation results from the perspectives of time and space to obtain the spatial transmission paths between infected persons and to accurately locate the key spatial locations in the spread of the influenza epidemic.
  • The population census data includes age, gender, occupation type, family category, family size, and family age composition.
  • The building census data includes building location information, building height, building area, and building function, where the building functions include factories, teaching buildings, residential buildings, office buildings, and shopping malls.
  • The population attributes include individual attributes, family attributes, and whether the individual has a mobile phone, where the individual attributes include age, gender, and occupation, and the family attributes include family structure, home address, and workplace.
  • Step b specifically includes the following steps. The travel trajectories of individuals with mobile phones are constructed from mobile phone location data: the records of the same mobile phone number are sorted by time to form the mobile phone user's travel trajectory for one day; Thiessen polygons are generated from the mobile phone base stations; and the buildings located in the same Thiessen polygon as a base station form the candidate set of individual locations. The travel trajectories of individuals without mobile phones are constructed from travel survey data: the buildings in the same traffic analysis zone form the candidate set of individual locations. An individual activity chain is then constructed from the candidate sets of individual locations obtained above.
  • Step c specifically includes the following steps. Based on the constructed individual activity chains and with one hour as the time granularity, the activity categories of different individuals over a day are compared, and individuals who perform the same activity at the same time in the same building are marked as spatiotemporally co-occurring. Different contact probabilities are assigned to co-occurring individuals according to the activity category, and, under the constraint of these contact probabilities, a dynamic contact network between agents is generated at hourly resolution.
  • The activity categories include home, work/school, and leisure and entertainment.
  • Step d specifically includes the following steps: using one hour as the time unit, tracking when and where each infection occurs and from whom it is transmitted to whom.
  • The probability that an individual is infected is called the effective infection probability $P$.
  • The formula for the effective infection probability is as follows:
  • where $P_c$ is the contact probability between individuals, $P_i$ is the infection probability of the individual, and $r$ is the relative infectivity of the individual.
  • The Monte Carlo method is used to determine whether the individual is infected: a uniformly distributed pseudo-random number is generated by computer and compared with the effective infection probability; if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. This process is repeated until the trend of the daily number of new infections is consistent with the real data and the calculated basic reproduction number is greater than 1.
  • Step e specifically includes the following steps: a comparative analysis of the time series is performed at the city scale and the administrative division scale; the city is divided into 1 km × 1 km grids, and the daily number of newly infected cases in each grid is analyzed.
  • An epidemic tree is constructed from the transmission topology between parent infected persons and offspring infected persons, and an epidemic forest is then composed of multiple epidemic trees, in order to obtain the spatial transmission paths between parent and offspring infected persons and to accurately locate the key spatial locations in the spread of the influenza epidemic.
  • The present invention also provides a system for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data.
  • The system includes a population attribute assignment module, an activity chain construction module, a contact network construction module, and a transmission simulation module.
  • The population attribute assignment module is used to synthesize the urban population based on population census data and building census data, and to assign corresponding population attributes to the individuals of the synthetic urban population.
  • The activity chain construction module is used to construct individual activity chains for the individuals of the synthetic urban population that have been assigned population attributes, using mobile phone location data as the primary source and travel survey data as a supplement.
  • The contact network construction module is used to dynamically construct a time series of 24 contact networks per day, with a time step of one hour, based on the constructed individual activity chains.
  • The transmission simulation module is used to simulate the spread of influenza at high spatial and temporal resolution with the SEIR model, based on the constructed 24 hourly contact networks.
  • The agent-based, spatially explicit epidemic model sets a large number of parameters related to the spread of influenza, which improves the interpretability of the model and helps to reveal the transmission mechanism of influenza at high spatial and temporal resolution.
  • The simulation results can match the trend of the original influenza data at the city scale and the administrative division scale.
  • The spatial information on the epidemic provided by the simulation results effectively addresses the inability to locate the actual spatial positions of infected persons and of potential high-risk transmission areas.
  • FIG. 1 is a flowchart of the method for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data according to the present invention;
  • FIG. 2 is a schematic diagram of the natural history of influenza under the SEIR model;
  • FIG. 3 is a schematic diagram of an epidemic forest provided by an embodiment of the present invention;
  • FIG. 4 is a hardware architecture diagram of the system for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data according to the present invention;
  • FIG. 5 is a schematic diagram of the age and gender distribution of synthetic individuals provided by an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of the structure distribution of synthetic families provided by an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of the comparison between the simulation results of the influenza spreading process at the city scale and the real cases according to an embodiment of the present invention;
  • FIG. 8 is a schematic diagram of the distribution of reproduction numbers generated by 100 simulation results at the city scale according to an embodiment of the present invention;
  • FIG. 9 is a schematic diagram of the comparison, at the scale of 10 regions, between the simulation results of the influenza spreading process and the real cases according to an embodiment of the present invention;
  • FIG. 10 is a schematic diagram of the spatiotemporal distribution of influenza spread in a city according to an embodiment of the present invention;
  • FIG. 11 is a schematic diagram of the intensity of influenza spread within a spatial unit according to an embodiment of the present invention;
  • FIG. 12 is a schematic diagram of the number of grids that one grid can affect during the spread of influenza according to an embodiment of the present invention;
  • FIG. 13 is a schematic diagram of the number of influenza transmission events between two grids (connection strength) according to an embodiment of the present invention;
  • FIG. 14 is a schematic diagram of the spatial aggregation phenomenon during the spread of influenza according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a preferred embodiment of the method for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data according to the present invention.
  • Step S1: Synthesize the urban population based on population census data and building census data, and assign corresponding population attributes to the individuals of the synthetic urban population. That is, multi-source data are fused to build a model of urban population movement.
  • The population census data includes age, gender, occupation type, family category, family size, and family age composition.
  • The building census data includes building location information, building height, building area, and building function, where the building functions include factories, teaching buildings, residential buildings, office buildings, and shopping malls.
  • The population attributes include individual attributes, family attributes, and whether the individual has a mobile phone, where the individual attributes include age, gender, and occupation, and the family attributes include family structure, home address, and workplace.
  • First, based on the probability distributions of individual attributes such as age, gender, and occupation type in the census data, Monte Carlo simulation is used to assign corresponding individual attributes to each individual of the synthetic urban population.
  • Then, according to the probabilities of family attributes such as family category, family size, and family age composition in the census data, synthetic families are constructed and the individuals of the synthetic urban population are filled into them.
  • Finally, Monte Carlo simulation is performed according to the mobile phone usage rates of different genders and age groups to assign each individual the attribute of whether he or she has a mobile phone. The individuals of the synthetic urban population are thus divided into two types: individuals with mobile phones and individuals without mobile phones.
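  • The attribute assignment in Step S1 can be illustrated with a minimal sketch. The distributions below (`AGE_GROUPS`, `GENDERS`, `PHONE_RATE`) and the function names are hypothetical placeholders, not values from the census data, and the sketch conditions phone ownership on age group only, whereas the text conditions it on both gender and age group. Family construction is not covered.

```python
import random

# Hypothetical marginal distributions; real values would come from census data.
AGE_GROUPS = {"0-17": 0.2, "18-59": 0.6, "60+": 0.2}
GENDERS = {"female": 0.5, "male": 0.5}
# Hypothetical mobile phone usage rate per age group (assumption).
PHONE_RATE = {"0-17": 0.3, "18-59": 0.95, "60+": 0.6}


def sample_category(dist, rng):
    """Draw one category from a {category: probability} mapping."""
    categories = list(dist)
    weights = [dist[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]


def synthesize_individuals(n, rng):
    """Assign age group, gender, and phone ownership to n synthetic individuals."""
    people = []
    for pid in range(n):
        age = sample_category(AGE_GROUPS, rng)
        gender = sample_category(GENDERS, rng)
        # Bernoulli draw: the individual has a phone with the group's usage rate.
        has_phone = rng.random() < PHONE_RATE[age]
        people.append({"id": pid, "age": age, "gender": gender, "has_phone": has_phone})
    return people


rng = random.Random(42)
population = synthesize_individuals(1000, rng)
with_phone = [p for p in population if p["has_phone"]]
without_phone = [p for p in population if not p["has_phone"]]
```

  • The two resulting lists correspond to the two types of individuals that the later steps treat differently: trajectories from mobile phone data for the first, and from travel survey data for the second.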
  • Step S2: For the individuals of the synthetic urban population that have been assigned population attributes (hereinafter "individuals"), construct individual activity chains using mobile phone location data as the primary source and travel survey data as a supplement. That is, building-scale modeling of individual movement is achieved by combining mobile phone data with travel survey data, supplemented by the spatiotemporal characteristics of individual travel and by building census data. Specifically:
  • The travel trajectories of individuals with mobile phones are reconstructed from mobile phone location data, and the travel trajectories of individuals without mobile phones are constructed from travel survey data.
  • At this stage, the work and residence locations of individuals with mobile phones, identified from the mobile phone data, are resolved to the service range of a mobile phone base station, and the work and residence locations of individuals without mobile phones, based on travel survey records, are resolved to a traffic analysis zone.
  • The mobile phone location data includes the anonymized mobile phone number, the time, and the latitude and longitude of the base station.
  • The records of the same mobile phone number are sorted by time to form the mobile phone user's travel trajectory for one day.
  • The user's travel location is recorded as the location of the mobile phone base station.
  • Thiessen polygons are generated from the mobile phone base stations (the extent of a Thiessen polygon is taken as the service range of its base station), and the buildings located in the same Thiessen polygon as a base station form the candidate set of individual locations.
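  • A building lies inside a base station's Thiessen polygon exactly when that station is the nearest one to the building, so the candidate sets can be sketched without building the polygons explicitly. The sketch below assumes planar (projected) coordinates rather than the latitude and longitude of the source data, and the identifiers `stations`, `buildings`, and `candidate_buildings` are illustrative.

```python
import math
from collections import defaultdict


def nearest_station(point, stations):
    """Return the id of the station closest to point (planar coordinates)."""
    return min(stations, key=lambda sid: math.dist(point, stations[sid]))


def candidate_buildings(buildings, stations):
    """Group buildings by the Thiessen cell (nearest base station) they fall in."""
    cells = defaultdict(list)
    for bid, xy in buildings.items():
        cells[nearest_station(xy, stations)].append(bid)
    return dict(cells)


# Toy coordinates in arbitrary planar units (assumption).
stations = {"s1": (0.0, 0.0), "s2": (10.0, 0.0)}
buildings = {"b1": (1.0, 1.0), "b2": (2.0, -1.0), "b3": (9.0, 0.5)}
candidates = candidate_buildings(buildings, stations)
# candidates == {"s1": ["b1", "b2"], "s2": ["b3"]}
```

  • For a city-scale building census a spatial index would be preferable to this brute-force nearest-neighbor search; the brute-force form is kept here only for clarity.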
  • A travel survey is an investigation of individual travel behavior.
  • The travel survey data includes the individual's work unit, origin, destination, departure time, end time, travel mode, travel purpose, and so on; its location information is given at the level of traffic analysis zones.
  • The travel trajectory of an individual without a mobile phone is constructed from each trip record in the travel survey, and the buildings in the same traffic analysis zone form the candidate set of individual locations.
  • An individual activity chain is then constructed from the candidate sets of individual locations obtained above.
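  • The text does not spell out how a building is chosen from a candidate set, so the following is only one possible sketch of activity chain construction for an individual with a mobile phone. It assumes records of the form `(hour, station_id)`, carries the last observed station forward through unobserved hours, and picks one building per station uniformly at random; all of these choices, and all names, are assumptions.

```python
import random


def hourly_trajectory(records):
    """Turn time-sorted (hour, station_id) records into 24 hourly slots.

    Hours without an observation reuse the last known station; hours before
    the first observation reuse the first one. Requires at least one record.
    """
    ordered = sorted(records)
    slots = [None] * 24
    for hour, station in ordered:
        slots[hour] = station
    last = ordered[0][1]
    for h in range(24):
        if slots[h] is None:
            slots[h] = last
        else:
            last = slots[h]
    return slots


def activity_chain(records, candidates, rng):
    """Map each hourly station to one building from its candidate set."""
    chosen = {}  # keep the same building for repeated visits to a station
    chain = []
    for station in hourly_trajectory(records):
        if station not in chosen:
            chosen[station] = rng.choice(candidates[station])
        chain.append(chosen[station])
    return chain


candidates = {"s1": ["b1", "b2"], "s2": ["b3"]}
records = [(8, "s2"), (0, "s1"), (18, "s1")]
chain = activity_chain(records, candidates, random.Random(1))
```

  • In this toy example the individual spends hours 0 to 7 and 18 to 23 in one building of station `s1` and hours 8 to 17 in building `b3`.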
  • Step S3: Based on the constructed individual activity chains and with a time step of one hour, dynamically construct a time series of 24 contact networks per day.
  • That is, a contact network is dynamically constructed from the individual activity chains of the urban population.
  • Based on the constructed individual activity chains and with one hour as the time granularity, the activity categories of different individuals over a day are compared, and individuals who perform the same activity at the same time in the same building are marked as spatiotemporally co-occurring.
  • Different contact probabilities are assigned to co-occurring individuals according to the activity category.
  • Under the constraint of these contact probabilities, a dynamic contact network between agents is generated at hourly resolution.
  • The activity categories include home, work/school, and leisure and entertainment.
  • In the contact network, the vertices represent traveling individuals, and individuals appearing at the same location are connected by edges.
  • The locations are distinguished as home (home), workplace (work), and building address (leisure).
  • Individuals appearing at the same location at the same time have a certain probability of contact, denoted $p_c$.
  • There are 24 contact networks in a day. Take one contact network as an example.
  • The vertices in the network represent individuals. Individuals appearing at the same location are connected by edges, meaning that the two individuals are at the same location at the same time. Whether two such individuals actually have contact is then decided according to the contact probabilities in Table 1. For example, when the infected person is an adult at home, he or she has contact with a minor with probability 0.25 and with an adult with probability 0.4.
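  • The construction in Step S3 can be sketched as follows. Table 1 is not reproduced in this text, so the per-category values in `CONTACT_PROB` are placeholders, and the sketch ignores the age dependence shown in the adult/minor example. The activity chain format (24 `(activity, building)` tuples per individual) is also an assumption.

```python
import random
from collections import defaultdict
from itertools import combinations

# Placeholder contact probabilities per activity category (assumption).
CONTACT_PROB = {"home": 0.4, "work": 0.1, "leisure": 0.05}


def build_hourly_network(activity_chains, hour, rng):
    """Return the set of contact edges (a, b) with a < b for one hour."""
    groups = defaultdict(list)
    for pid, chain in activity_chains.items():
        activity, building = chain[hour]
        groups[(activity, building)].append(pid)
    edges = set()
    for (activity, _), members in groups.items():
        # Co-occurring pairs become edges with the category's contact probability.
        for a, b in combinations(sorted(members), 2):
            if rng.random() <= CONTACT_PROB[activity]:
                edges.add((a, b))
    return edges


def build_daily_networks(activity_chains, rng):
    """Build the 24 hourly contact networks of one day."""
    return [build_hourly_network(activity_chains, h, rng) for h in range(24)]


activity_chains = {
    1: [("home", "h1")] * 24,
    2: [("home", "h1")] * 8 + [("work", "w1")] * 10 + [("home", "h1")] * 6,
    3: [("home", "h2")] * 8 + [("work", "w1")] * 10 + [("home", "h2")] * 6,
}
daily = build_daily_networks(activity_chains, random.Random(7))
```

  • Individuals 1 and 3 never share a building, so no network can contain the edge `(1, 3)`; individuals 2 and 3 can only be connected during working hours.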
  • Step S4: Based on the constructed 24 hourly contact networks, the SEIR model is used to simulate the spread of influenza at high spatial and temporal resolution. Specifically:
  • The SEIR model divides individuals into four states based on the natural history of influenza (Figure 2): susceptible, exposed (incubation period), infectious, and recovered.
  • As shown in Figure 2, owing to vaccination and the body's own production of antibodies, part of the susceptible population is immune to the influenza virus. A susceptible individual is infected with a certain probability and enters the incubation period; after the virus has remained in the body for a certain time, the individual becomes infectious and enters the infectious period. An infected person in the infectious period may show influenza symptoms or may show no obvious symptoms. Finally, the individual recovers and enters the recovered state.
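  • The state progression described above can be sketched as a small hourly state machine. The durations `LATENT_HOURS` and `INFECTIOUS_HOURS` are hypothetical values chosen only for illustration (the text gives none), fixed durations are a simplification, and the split into symptomatic and asymptomatic cases is omitted.

```python
from dataclasses import dataclass

LATENT_HOURS = 48      # hypothetical incubation duration (assumption)
INFECTIOUS_HOURS = 96  # hypothetical infectious duration (assumption)


@dataclass
class Agent:
    state: str = "S"        # one of "S", "E", "I", "R"
    hours_in_state: int = 0


def expose(agent):
    """Move a susceptible agent into the exposed (incubation) state."""
    if agent.state == "S":
        agent.state, agent.hours_in_state = "E", 0


def step(agent):
    """Advance the agent by one hour along E -> I -> R."""
    if agent.state in ("S", "R"):
        return
    agent.hours_in_state += 1
    if agent.state == "E" and agent.hours_in_state >= LATENT_HOURS:
        agent.state, agent.hours_in_state = "I", 0
    elif agent.state == "I" and agent.hours_in_state >= INFECTIOUS_HOURS:
        agent.state, agent.hours_in_state = "R", 0


agent = Agent()
expose(agent)
history = []
for _ in range(LATENT_HOURS + INFECTIOUS_HOURS):
    step(agent)
    history.append(agent.state)
```

  • With these placeholder durations the agent becomes infectious at the 48th hourly step and recovers at the 144th.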
  • The simulation uses one hour as the time unit and tracks when and where each infection occurs and from whom it is transmitted to whom.
  • The probability that an individual is infected is called the effective infection probability $P$.
  • The formula for the effective infection probability is as follows:
  • where $P_c$ is the contact probability between individuals, $P_i$ is the infection probability of the individual, and $r$ is the relative infectivity of the individual.
  • The Monte Carlo method is used to determine whether the individual is infected: a uniformly distributed pseudo-random number is generated by computer and compared with the effective infection probability; if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. This process is repeated until the trend of the daily number of new infections is consistent with the real data and the calculated basic reproduction number is greater than 1.
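  • The Monte Carlo infection test can be sketched as follows. The formula for the effective infection probability is not reproduced in this text; the sketch assumes a product form, `P = P_c * P_i * r`, clipped to [0, 1]. That form is an assumption suggested by the three listed quantities, not a statement of the patent's formula, and the parameter values are arbitrary.

```python
import random


def effective_infection_probability(p_c, p_i, r):
    """Assumed product form P = p_c * p_i * r, clipped to the range [0, 1]."""
    return max(0.0, min(1.0, p_c * p_i * r))


def is_infected(p, rng):
    """Monte Carlo test: infected if a uniform draw in [0, 1) is <= p."""
    return rng.random() <= p


rng = random.Random(0)
p = effective_infection_probability(p_c=0.4, p_i=0.05, r=1.5)

# Over many trials the infected fraction should approach p.
trials = 100_000
hits = sum(is_infected(p, rng) for _ in range(trials))
empirical = hits / trials
```

  • In a full simulation this test would be applied to every susceptible contact of an infectious individual in each hourly contact network, and the infector, infectee, hour, and building would be recorded for later analysis.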
  • Step S5: Analyze the simulation results from the perspectives of time and space, explore the trend and intensity of influenza outbreaks, analyze high-risk areas in the process of influenza transmission, and accurately locate the key spatial locations in the spread of the influenza epidemic.
  • The trend and intensity of an influenza outbreak are reflected mainly in the time curve of daily newly infected cases, which is roughly bell-shaped (approximately normal). A narrow curve indicates a fast outbreak and a wide curve a slow one; the peak of the curve indicates the severity of the outbreak: the higher the peak, the more severe.
  • The high-risk areas are areas where the cumulative number of infected persons is large, meaning that susceptible persons have a high risk of being infected there.
  • A key spatial location is a location with strong transmission connectivity between regions; intervening at key spatial locations can reduce the spread of the influenza virus to other regions and thereby reduce the risk of transmission within the city.
  • An epidemic tree is constructed from the transmission topology between parent infected persons and offspring infected persons, and an epidemic forest is then composed of multiple epidemic trees (Figure 3). This reveals the spatial transmission paths between parent and offspring infected persons, accurately locates the key spatial locations in the spread of the influenza epidemic, and reveals the relationship between transmission distance and transmission intensity within the city as well as the spatial agglomeration effect.
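  • An epidemic forest can be sketched from the who-infected-whom records produced by the simulation. The record format `(parent_id, child_id)`, with `None` as the parent of an index case, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def build_epidemic_forest(transmissions):
    """Build the forest from (parent_id or None, child_id) records.

    Returns (roots, children): roots are index cases, and children maps each
    parent infected person to the list of offspring infected persons.
    """
    children = defaultdict(list)
    roots = []
    for parent, child in transmissions:
        if parent is None:
            roots.append(child)
        else:
            children[parent].append(child)
    return roots, dict(children)


def tree_size(root, children):
    """Count the cases in the epidemic tree rooted at root (iterative DFS)."""
    stack, count = [root], 0
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(children.get(node, []))
    return count


records = [(None, "a"), ("a", "b"), ("a", "c"), ("c", "d"), (None, "e"), ("e", "f")]
roots, children = build_epidemic_forest(records)
sizes = {root: tree_size(root, children) for root in roots}
# sizes == {"a": 4, "e": 2}
```

  • Attaching the infection location to each record would turn every parent-to-offspring edge into a spatial transmission path between two locations.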
  • Locating the key spatial locations in the spread of the influenza epidemic mainly involves analyzing the transmission intensity of the influenza virus between spatial grids.
  • The spatial distance between grids represents the transmission distance of influenza, and the connection weight between grids represents the transmission intensity of influenza between them.
  • In this embodiment, the Pearson correlation coefficient $r$ between the two is calculated; $r = -0.098$ indicates an extremely weak negative correlation between transmission distance and transmission intensity.
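  • The grid-level analysis can be sketched as follows: map infection locations to 1 km grid cells, count transmission events between pairs of cells as connection weights, and compute a Pearson correlation. The coordinates are assumed to be planar and in metres, and the sample values are toy numbers, not data from the embodiment.

```python
import math
from collections import Counter


def grid_cell(x_m, y_m, cell_size_m=1000.0):
    """Return the (column, row) index of the 1 km x 1 km cell containing a point."""
    return (int(x_m // cell_size_m), int(y_m // cell_size_m))


def connection_weights(events):
    """Count transmission events between distinct grid cells.

    events: iterable of ((x, y) of infector, (x, y) of infectee) in metres.
    """
    weights = Counter()
    for src, dst in events:
        a, b = grid_cell(*src), grid_cell(*dst)
        if a != b:
            weights[(a, b)] += 1
    return weights


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


events = [((500.0, 500.0), (1500.0, 500.0)), ((600.0, 400.0), (1200.0, 900.0)),
          ((500.0, 500.0), (700.0, 700.0))]
weights = connection_weights(events)

# Toy data with a perfectly linear decreasing relationship.
distances = [1.0, 2.0, 3.0, 4.0]
intensities = [8.0, 6.0, 4.0, 2.0]
r_toy = pearson_r(distances, intensities)
```

  • The toy data give a correlation of -1 by construction; the value of -0.098 reported in the embodiment comes from the simulated city, not from this example.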
  • FIG. 4 is a hardware architecture diagram of the system 10 for simulating the spatiotemporal transmission process of influenza with large-scale trajectory data according to the present invention.
  • The system includes a population attribute assignment module 101, an activity chain construction module 102, a contact network construction module 103, a transmission simulation module 104, and an analysis module 105.
  • The population attribute assignment module 101 is used to synthesize the urban population based on population census data and building census data, and to assign corresponding population attributes to the individuals of the synthetic urban population. That is, multi-source data are fused to build a model of urban population movement. Wherein:
  • The population census data includes age, gender, occupation type, family category, family size, and family age composition.
  • The building census data includes building location information, building height, building area, and building function, where the building functions include factories, teaching buildings, residential buildings, office buildings, and shopping malls.
  • The population attributes include individual attributes, family attributes, and whether the individual has a mobile phone, where the individual attributes include age, gender, and occupation, and the family attributes include family structure, home address, and workplace.
  • The population attribute assignment module 101 first assigns corresponding individual attributes to each individual of the synthetic urban population through Monte Carlo simulation, based on the probability distributions of individual attributes such as age, gender, and occupation type in the census data. It then constructs synthetic families according to the probabilities of family attributes such as family category, family size, and family age composition in the census data, and fills the individuals of the synthetic urban population into the synthetic families. Finally, it performs Monte Carlo simulation according to the mobile phone usage rates of different genders and age groups to assign each individual the attribute of whether he or she has a mobile phone. The individuals of the synthetic urban population are thus divided into two categories: individuals with mobile phones and individuals without mobile phones.
  • The activity chain construction module 102 is used to construct individual activity chains for the individuals of the synthetic urban population that have been assigned population attributes (hereinafter "individuals"), using mobile phone location data as the primary source and travel survey data as a supplement. That is, building-scale modeling of individual movement is achieved by combining mobile phone data with travel survey data, supplemented by the spatiotemporal characteristics of individual travel and by building census data. Specifically:
  • The activity chain construction module 102 reconstructs the travel trajectories of individuals with mobile phones from mobile phone location data, and constructs the travel trajectories of individuals without mobile phones from travel survey data. At this stage, the work and residence locations of individuals with mobile phones, identified from the mobile phone data, are resolved to the service range of a mobile phone base station, and the work and residence locations of individuals without mobile phones, based on travel survey records, are resolved to a traffic analysis zone.
  • The mobile phone location data includes the anonymized mobile phone number, the time, and the latitude and longitude of the base station.
  • The records of the same mobile phone number are sorted by time to form the mobile phone user's travel trajectory for one day.
  • The user's travel location is recorded as the location of the mobile phone base station.
  • Thiessen polygons are generated from the mobile phone base stations (the extent of a Thiessen polygon is taken as the service range of its base station), and the buildings located in the same Thiessen polygon as a base station form the candidate set of individual locations.
  • A travel survey is an investigation of individual travel behavior.
  • The travel survey data includes the individual's work unit, origin, destination, departure time, end time, travel mode, travel purpose, and so on; its location information is given at the level of traffic analysis zones.
  • The travel trajectory of an individual without a mobile phone is constructed from each trip record in the travel survey, and the buildings in the same traffic analysis zone form the candidate set of individual locations.
  • The activity chain construction module 102 constructs an individual activity chain from the candidate sets of individual locations obtained above.
  • The contact network construction module 103 is used to dynamically construct a time series of 24 contact networks per day, with a time step of one hour, based on the constructed individual activity chains. That is, the contact network construction module 103 dynamically constructs a contact network from the individual activity chains of the urban population. Based on the constructed individual activity chains and with one hour as the time granularity, the activity categories of different individuals over a day are compared, and individuals who perform the same activity at the same time in the same building are marked as spatiotemporally co-occurring. Different contact probabilities are assigned to co-occurring individuals according to the activity category, and, under the constraint of these contact probabilities, a dynamic contact network between agents is generated at hourly resolution. The activity categories include home, work/school, and leisure and entertainment.
  • In the contact network, the vertices represent traveling individuals, and individuals appearing at the same location are connected by edges.
  • The locations are distinguished as home (home), workplace (work), and building address (leisure).
  • Individuals appearing at the same location at the same time have a certain probability of contact, denoted $p_c$.
  • There are 24 contact networks in a day. Take one contact network as an example.
  • The vertices in the network represent individuals. Individuals appearing at the same location are connected by edges, meaning that the two individuals are at the same location at the same time. Whether two such individuals actually have contact is then decided according to the contact probabilities in Table 1. For example, when the infected person is an adult at home, he or she has contact with a minor with probability 0.25 and with an adult with probability 0.4.
  • the spread simulation module 104 is used to simulate the spread of influenza at high temporal and spatial resolution by using the SEIR model according to the constructed contact network of 24 time series in a day. in particular:
  • the SEIR model divides individuals into four states based on the natural history of influenza (Figure 2): susceptible, exposed (incubation period), infectious, and recovered.
  • As shown in Figure 2, owing to vaccination and antibodies produced by the body, part of the susceptible population is immune to influenza virus. A susceptible individual is infected with a certain probability and enters the incubation period. After the virus has been in the body for a certain time, the individual becomes infectious and enters the infectious period; an infectious individual may show influenza symptoms or may have no obvious symptoms. Finally, the individual recovers and enters the recovered state.
  • the simulation process takes hours as the time unit to track when and where the infection occurred, and from whom it was transmitted to whom.
  • the probability of an individual being infected is called the effective infection probability P.
  • the formula for the effective infection probability is: P = P_c × P_i × r
  • P_c is the contact probability between individuals
  • P_i is the infection probability of the individual
  • r is the relative infectivity of the individual.
  • the Monte Carlo method is used to determine whether the individual is infected: based on a computer-generated uniformly distributed pseudo-random number, the pseudo-random number is compared with the effective infection probability, if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. Repeat the above process until the final trend of the number of new infections per day is consistent with the real data and the calculated basic reproductive number is greater than 1.
  • the analysis module 105 is used to analyze the simulation results from two perspectives of time and space, explore the trend and intensity of influenza outbreaks, analyze high-risk areas in the process of influenza transmission, and accurately locate key spatial locations in the process of influenza epidemic transmission.
  • the influenza outbreak trend and intensity are defined as follows: the outbreak trend is mainly reflected in the time curve of daily newly infected cases, which roughly follows a normal distribution. A narrow curve indicates a relatively fast outbreak and a wide curve a relatively slow one. The peak of the curve indicates the severity of the outbreak: the higher the peak, the more severe the outbreak.
  • the high-risk areas are areas where the cumulative number of infected persons is large, which means that susceptible persons have a high risk of being infected in these areas.
  • the key spatial location is a location with strong transmission connectivity between regions, that is, intervention in the key spatial location can reduce the spread of influenza virus to other regions, thereby reducing the risk of transmission of infectious diseases within the city.
  • the analysis module 105 performs a comparative analysis over time at the city scale and the administrative division scale. To express the spread of influenza in time and space accurately, the city is divided into 1 km × 1 km grids, and the number of newly infected cases in each grid every day is analyzed. The number of cases is the average of 100 simulation results.
  • the analysis module 105 constructs an epidemic tree from the transmission topology between parent infected persons and offspring infected persons; multiple epidemic trees then form an epidemic forest (Figure 3). The epidemic forest reveals the spatial transmission paths between parent and offspring infected persons, accurately locates the key spatial locations in the spread of the influenza epidemic, and reveals the relationship between transmission distance and transmission intensity in the city as well as the spatial agglomeration effect.
  • locating the key spatial positions in the spread of the influenza epidemic mainly involves analyzing the transmission intensity of influenza virus between spatial grids.
  • the spatial distance between the grids represents the transmission distance of influenza
  • the connection weight between the grids represents the transmission intensity of influenza between the grids.
  • the Pearson correlation coefficient r between the two is calculated in this embodiment; r = -0.098 shows that the transmission distance and the transmission intensity are extremely weakly negatively correlated.
  • Modeling of population attributes: according to the probability distribution of individual attributes such as age, gender, and occupation type in the census data, the corresponding individual attributes are assigned to each synthetic individual through Monte Carlo simulation. The age and gender distribution of the synthetic individuals is shown in Figure 5.
  • a synthetic family is constructed according to the probability of family attributes such as family category, family size, and family age components in the census data, and synthetic individuals are filled into the synthetic family.
  • Figure 6 shows the size distribution of the synthetic family constructed in this embodiment. Compared with the population data of each district released by the 2017 Shenzhen Statistical Yearbook, the synthetic population constructed in this embodiment basically matches the population.
  • Figure 10 is the result of simulating the temporal and spatial distribution of influenza spread in Shenzhen.
  • the value in each grid is the average of 100 simulation results.
  • the agent-based spatially explicit infectious disease model proposed in this embodiment, constructed by fusing mobile phone location data, supports tracking the locations of infected persons and their interactions with susceptible persons, effectively solving the problem that the spatial locations of infected persons and potential high-risk transmission areas could not be truly located.
  • Figure 11 reflects the number of influenza transmissions within each spatial unit, that is, cases where the homes of the parent infected person and the offspring infected person are located in the same grid; such infections often occur between neighbors in the same community. Figure 11 shows a high-risk area in the southeast of Luohu District, where influenza is more likely to spread locally; residents of this area can be reminded to guard against the spread of influenza among neighbors.
  • Figures 12, 13, and 14 reflect the situation where the infected parent and the infected offspring are in two spatial units during the spread of influenza.
  • the straight-line transmission paths between parent and offspring infected persons in different spatial units constitute the connections between grids, and these grid-to-grid connections form a social network.
  • the value of each grid in Figure 12 is the degree of that grid in the social network, reflecting the number of grids that a grid can affect. The spatial unit at the junction of Futian District and Luohu District affects more geographic locations in Shenzhen than other spatial units do.
  • the weight of the connection relationship between the grids is different (the connection weight represents the number of events in which influenza transmission occurs between the two grids).
  • the connection weight between 99.9% of the grids is less than 100 (Figure 13), and the areas where the connection weight is greater than 100 show obvious spatial aggregation (Figure 14).
  • the invention establishes an agent-based spatially explicit infectious disease model. Building on the traditional approach of modeling individual movement from statistical data, it proposes a new spatio-temporal framework that integrates large-scale mobile phone location data into individual movement modeling and reconstructs the activity chains of urban individuals.
  • the model covers the entire population of the urban area (both demographic attributes and mobility behavior). Individual travel locations are estimated down to the building, and a dynamic contact network between individuals is constructed from their activity places: family (home), work unit (work), and building (entertainment). The SEIR model is then used to simulate the spread of influenza in the city in time and space. The simulation time step is 1 hour, the spatial unit is the building, and the heterogeneity of individual responses to influenza virus is fully considered in the simulation.
  • the present invention is not limited to the fusion of data such as mobile phone data, travel survey data, population census data, and building census data.
  • the present invention can simulate the spread of a variety of infectious diseases, such as: influenza, dengue fever and other close-transmitted diseases.
  • the present invention has good capabilities for analyzing the spread of infectious diseases in time and space, especially spatial analysis.
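The excerpts above relate the spatial distance between grids to their connection weight through a Pearson correlation coefficient. A minimal Python sketch of that calculation follows. The function name `pearson_r` and its input layout (one distance and one weight per connected grid pair) are assumptions made for illustration; the embodiment's actual data are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

In this setting `xs` would hold the distance between each connected pair of grids and `ys` the matching connection weight; a value near zero, as reported in the excerpt, indicates almost no linear relationship.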

Landscapes

  • Public Health (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and a system for simulating the process of temporal and spatial circulation of influenza with massive trajectory data. Said method comprises: synthesizing an urban population on the basis of population census data and building census data, and assigning corresponding population attributes to the individuals of the synthesized urban population (S1); for the individuals to which population attributes have been assigned, constructing individual activity chains, mainly from mobile phone position data and assisted by travel survey data (S2); on the basis of the constructed individual activity chains, dynamically constructing, with one hour as the time step, a contact network of 24 time series in a day (S3); according to the constructed contact network, using an SEIR model to simulate the circulation of influenza at high temporal and spatial resolution (S4); and analyzing the simulation result from the two perspectives of time and space, so as to identify high-risk areas during the circulation of influenza and accurately locate key spatial positions during the circulation of the influenza epidemic (S5). Said method can reconstruct the temporal and spatial information of influenza outbreaks at different temporal and spatial scales, effectively solving the problem that the spatial positions of infected persons and potentially high-risk circulation areas cannot be truly located.

Description

Method and system for simulating the spatiotemporal spread of influenza with large-scale trajectory data
Technical field
The invention relates to a method and a system for simulating the spatiotemporal spread of influenza with large-scale trajectory data.
Background
Influenza is an acute respiratory infection caused by influenza and parainfluenza viruses. The virus spreads easily from person to person through the droplets and particles produced when an infected person coughs or sneezes, and it seriously endangers people's lives and health. The spatiotemporal evolution of human activity and the associated fluctuations in population density are key drivers of infectious disease outbreak dynamics. Tracing contacts between urban individuals at high spatiotemporal resolution, starting from discrete individuals and inferring who may have been infected from the time and location of each individual's activities, helps to represent the spatiotemporal diffusion of influenza accurately and to improve the accuracy of predicted spatial transmission patterns. In recent years, large-scale individual trajectory data (mobile phone location data, floating car GPS data, bus card swipe data, etc.) have grown explosively. Such data can pinpoint the time and place of moving individuals and offer a new opportunity to break through the bottleneck in the spatiotemporal precision of infectious disease prevention and control.
At present, agent-based infectious disease models at the urban scale mainly use travel survey data to reconstruct individual movement and the contact network between individuals, and study the transmission characteristics of infectious diseases over time.
However, the prior art has at least the following shortcomings. Building a real-world agent model requires a large amount of real individual data; without it, precise individual locations and contacts between individuals cannot be known. Although some studies have attempted to model population movement from trajectory data and then build infectious disease diffusion models, existing trajectory-based methods do not yet meet the need for individual modeling that fuses demographic attributes (such as age, gender, occupation, and family structure) with the movement characteristics of trajectory data; they lack an effective way to fuse trajectory data into spatiotemporal proximity relationships between urban individuals (for example, individuals appearing in the same place at the same time); and they overlook the complexity of individual contact spaces in densely populated modern cities. Trajectory data are also a biased sample and cannot represent the whole population of a city. As for the transmission process, individuals show marked heterogeneity in the spread of influenza virus, yet many existing studies ignore the interactions between parameters in order to simplify the model. In addition, agent-based spatially explicit infectious disease models are often studied at low spatiotemporal resolution, which makes it difficult to reveal the spread of influenza at the urban scale from the perspective of spatiotemporal dynamics and leads to delays and deviations in predicting the spatiotemporal patterns of influenza outbreaks.
Summary of the invention
In view of this, it is necessary to provide a method and system for simulating the spatiotemporal spread of influenza with large-scale trajectory data.
The present invention provides a method for simulating the spatiotemporal spread of influenza with large-scale trajectory data. The method includes the following steps: a. synthesize an urban population based on population census data and building census data, and assign corresponding population attributes to the individuals of the synthesized urban population; b. for the individuals to which population attributes have been assigned, construct individual activity chains, using mobile phone location data as the primary source and travel survey data as a supplement; c. based on the constructed activity chains, and with one hour as the time step, dynamically construct a contact network for each of the 24 hourly time steps in a day; d. using the constructed contact networks, simulate the spread of influenza at high spatiotemporal resolution with the SEIR model.
The method further includes step e: analyze the simulation results from the two perspectives of time and space to obtain the spatial transmission paths between infected persons and accurately locate the key spatial locations in the spread of the influenza epidemic.
The population census data include age, gender, occupation type, family category, family size, and family age composition. The building census data include building location, building height, floor area, and building function, where the building functions include factory, teaching building, residential building, office building, and shopping mall.
The population attributes include individual attributes, family attributes, and whether the individual has a mobile phone. The individual attributes include age, gender, and occupation; the family attributes include family structure, home address, and workplace.
Step b specifically includes the following steps. For individuals with mobile phones, travel trajectories are constructed from mobile phone location data: the records of each phone number are sorted by time to form that user's travel trajectory for the day; Thiessen polygons are generated from the mobile phone base stations, and the buildings located in the same Thiessen polygon as a base station form the candidate set of individual locations. For individuals without mobile phones, travel trajectories are constructed from travel survey data: each trip record in the travel survey is used, and the buildings in the same traffic zone form the candidate set of individual locations. Individual activity chains are then constructed from these candidate sets.
Step c specifically includes the following steps. Based on the constructed activity chains, and with the hour as the time granularity, the activity categories of different individuals during the day are compared. Individuals who perform the same activity at the same time in the same building are marked as co-occurring in time and space. Depending on the activity category, different contact probabilities are assigned to co-occurring individuals, and under these contact probabilities a dynamic contact network between agents is generated at hourly resolution.
The activity categories include staying at home, work/school, and leisure and entertainment.
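Step c can be sketched in Python as grouping individuals by hour, activity, and building, then sampling an edge for each co-occurring pair. This is a minimal sketch under stated assumptions: the data layout of `activity_chains`, the function name, and the values in `CONTACT_PROB` are hypothetical (the patent's Table 1 of contact probabilities is not reproduced here, and it may also depend on age group).

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical contact probabilities per activity category (placeholder values).
CONTACT_PROB = {"home": 0.4, "work": 0.1, "leisure": 0.05}


def build_hourly_contact_networks(activity_chains, contact_prob=CONTACT_PROB, seed=0):
    """Build one contact network per hour.

    activity_chains: {person_id: [(activity, building_id), ...]} with 24 entries,
    one per hour. Returns a list of 24 sets of undirected edges (a, b), a < b.
    """
    rng = random.Random(seed)
    networks = []
    for hour in range(24):
        # Individuals doing the same activity in the same building co-occur.
        groups = defaultdict(list)
        for person_id, chain in activity_chains.items():
            activity, building = chain[hour]
            groups[(activity, building)].append(person_id)
        edges = set()
        for (activity, _building), members in groups.items():
            p_c = contact_prob[activity]
            # Each co-occurring pair is in contact with probability p_c.
            for a, b in combinations(sorted(members), 2):
                if rng.random() < p_c:
                    edges.add((a, b))
        networks.append(edges)
    return networks
```

Enumerating every pair is quadratic in the number of people per building, so a city-scale run would likely need a cheaper sampling scheme; the pairwise loop is kept here only because it mirrors the description most directly.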
Step d specifically includes the following steps. With the hour as the time unit, each infection event is tracked: when and where it occurred, and who infected whom. During the spread of influenza, the probability that an individual is infected is called the effective infection probability P, given by:
P = P_c × P_i × r
where P_c is the contact probability between individuals, P_i is the infection probability of the individual, and r is the relative infectivity of the individual.
Finally, the Monte Carlo method decides whether the individual is infected.
Deciding infection by the Monte Carlo method includes: generating a uniformly distributed pseudo-random number by computer and comparing it with the effective infection probability; if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. This process is repeated until the trend of daily new infections agrees with the real data and the calculated basic reproduction number is greater than 1.
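The effective infection probability and the Monte Carlo decision can be illustrated with a short Python sketch. The function names are hypothetical, and clamping P to the interval [0, 1] is an added safeguard that the source does not mention. The calibration loop (repeating runs until the curve matches real data) is not shown.

```python
import random


def effective_infection_probability(p_contact, p_infection, relative_infectivity):
    """P = P_c * P_i * r, clamped to [0, 1] (the clamp is an assumption)."""
    p = p_contact * p_infection * relative_infectivity
    return max(0.0, min(1.0, p))


def is_infected(p_effective, rng):
    """Monte Carlo draw: infected when a uniform pseudo-random number <= P."""
    return rng.random() <= p_effective


def simulated_infection_rate(p_effective, trials, seed=0):
    """Fraction of independent draws that result in infection."""
    rng = random.Random(seed)
    hits = sum(is_infected(p_effective, rng) for _ in range(trials))
    return hits / trials
```

Over many draws the infected fraction approaches P, which is what makes the comparison against a uniform random number a valid way to realize the probability.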
Step e specifically includes the following steps. In the time dimension, a comparative analysis is carried out at two scales, the city scale and the administrative division scale. The city is divided into 1 km × 1 km grids, and the number of newly infected cases in each grid every day is analyzed.
An epidemic tree is built from the transmission topology between parent infected persons and offspring infected persons, and multiple epidemic trees form an epidemic forest. This yields the spatial transmission paths between parent and offspring infected persons and accurately locates the key spatial locations in the spread of the influenza epidemic.
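One way to picture step e is to store each infection event as a (parent, child) pair, build the forest from those pairs, and count transmissions within and between 1 km grid cells. The sketch below is an illustration under assumptions: the event format, the use of home coordinates in kilometres, the function names, and the choice to treat grid connections as undirected are all hypothetical.

```python
from collections import Counter, defaultdict


def build_epidemic_forest(infection_events):
    """infection_events: iterable of (parent_id, child_id); parent_id is None
    for a seed case. Returns (roots, children), one tree per root."""
    children = defaultdict(list)
    roots = []
    for parent, child in infection_events:
        if parent is None:
            roots.append(child)
        else:
            children[parent].append(child)
    return roots, dict(children)


def grid_cell(x_km, y_km, cell_km=1.0):
    """Index of the square grid cell that contains the point (x_km, y_km)."""
    return (int(x_km // cell_km), int(y_km // cell_km))


def grid_transmission_counts(infection_events, home_xy):
    """Count transmissions inside each cell and between pairs of cells.

    home_xy: {person_id: (x_km, y_km)}. Returns (within, between), where
    between is keyed by a sorted pair of cells (undirected, an assumption).
    """
    within = Counter()
    between = Counter()
    for parent, child in infection_events:
        if parent is None:
            continue
        cell_a = grid_cell(*home_xy[parent])
        cell_b = grid_cell(*home_xy[child])
        if cell_a == cell_b:
            within[cell_a] += 1
        else:
            between[tuple(sorted((cell_a, cell_b)))] += 1
    return within, between
```

Here `between` plays the role of the connection weight between two grids, and the number of distinct partners of a cell in `between` corresponds to its degree in the grid network.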
The present invention provides a system for simulating the spatiotemporal spread of influenza with large-scale trajectory data. The system includes a population attribute assignment module, an activity chain construction module, a contact network construction module, and a spread simulation module, where: the population attribute assignment module synthesizes an urban population based on population census data and building census data, and assigns corresponding population attributes to the individuals of the synthesized urban population; the activity chain construction module constructs individual activity chains for the individuals to which population attributes have been assigned, using mobile phone location data as the primary source and travel survey data as a supplement; the contact network construction module dynamically constructs, based on the constructed activity chains and with one hour as the time step, a contact network for each of the 24 hourly time steps in a day; and the spread simulation module uses the constructed contact networks to simulate the spread of influenza at high spatiotemporal resolution with the SEIR model.
The beneficial effects of the method and system of the present invention include:
(1) With large-scale mobile phone location data integrated, individual mobility is reproduced more realistically: movement between regions is more clustered, and the activity space is larger than that derived from travel survey data. Individual travel locations in the mobility model are expressed as building coordinates, so the agent-based spatially explicit infectious disease model can reconstruct the spatiotemporal information of influenza outbreaks at different spatiotemporal scales.
(2) The agent-based spatially explicit infectious disease model sets a large number of parameters related to influenza transmission, which improves the interpretability of the model and helps reveal the transmission mechanism of influenza at high spatiotemporal resolution. The simulation results agree with the trend of the original influenza data at both the city scale and the administrative division scale, and the spatial information on the epidemic captured by the simulation effectively solves the problem that the spatial locations of infected persons and potential high-risk transmission areas could not be truly located.
(3) An epidemic forest is used, as a novel approach, to characterize how influenza spreads at small spatial scales, to analyze transmission within and between grids, and to locate high-risk areas in the process of influenza transmission. The model reflects the intensity of influenza transmission within a region and depicts the pattern of transmission between regions, which helps to accurately locate the key spatial locations in the spread of the influenza epidemic.
Description of the drawings
Figure 1 is a flowchart of the method for simulating the spatiotemporal spread of influenza with large-scale trajectory data according to the present invention;
Figure 2 is a schematic diagram of the natural history of influenza under the SEIR model;
Figure 3 is a schematic diagram of the epidemic forest provided by an embodiment of the present invention;
Figure 4 is a hardware architecture diagram of the system for simulating the spatiotemporal spread of influenza with large-scale trajectory data according to the present invention;
Figure 5 is a schematic diagram of the age and gender distribution of synthetic individuals provided by an embodiment of the present invention;
Figure 6 is a schematic diagram of the structure distribution of synthetic families provided by an embodiment of the present invention;
Figure 7 is a schematic diagram comparing the simulated influenza diffusion process at the city scale with real cases, provided by an embodiment of the present invention;
Figure 8 is a schematic diagram of the distribution of reproduction numbers produced by 100 simulation runs at the city scale, provided by an embodiment of the present invention;
Figure 9 is a schematic diagram comparing the simulated influenza diffusion process with real cases at the scale of the 10 districts, provided by an embodiment of the present invention;
Figure 10 is a schematic diagram of the spatiotemporal distribution of influenza spread within the city, provided by an embodiment of the present invention;
Figure 11 is a schematic diagram of the intensity of influenza transmission within spatial units, provided by an embodiment of the present invention;
Figure 12 is a schematic diagram of the number of grids that one grid can affect during the spread of influenza, provided by an embodiment of the present invention;
Figure 13 is a schematic diagram of the number of influenza transmission events between two grids (connection strength), provided by an embodiment of the present invention;
Figure 14 is a schematic diagram of spatial aggregation during the spread of influenza, provided by an embodiment of the present invention.
Detailed description
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
Figure 1 is a flowchart of a preferred embodiment of the method for simulating the spatiotemporal spread of influenza with large-scale trajectory data according to the present invention.
Step S1: synthesize an urban population based on population census data and building census data, and assign corresponding population attributes to the individuals of the synthesized urban population. That is, fuse multi-source data to build an urban population mobility model. Here:
The population census data include age, gender, occupation type, family category, family size, and family age composition. The building census data include building location, building height, floor area, and building function, where the building functions include factory, teaching building, residential building, office building, and shopping mall.
The population attributes include individual attributes, family attributes, and whether the individual has a mobile phone. The individual attributes include age, gender, and occupation; the family attributes include family structure, home address, and workplace.
Specifically:
First, according to the probability distribution of individual attributes such as age, gender, and occupation type in the census data, Monte Carlo simulation assigns the corresponding individual attributes to each individual of the synthesized urban population. Second, synthetic families are constructed according to the probabilities of family attributes such as family category, family size, and family age composition in the census data, and the individuals of the synthesized urban population are placed into the synthetic families. Finally, Monte Carlo simulation based on mobile phone usage rates for each gender and age group assigns each individual the attribute of whether he or she has a mobile phone, dividing the synthesized urban population into two groups: individuals with mobile phones and individuals without.
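The attribute assignment in step S1 amounts to sampling each attribute from a categorical distribution. The following Python sketch shows the idea for age group, gender, and phone ownership. Every probability in it is a placeholder rather than census data, the names are hypothetical, and for brevity phone ownership depends on age group only, whereas the description conditions on both gender and age. Family construction is not shown.

```python
import random

# Placeholder marginal distributions; real values would come from census tables.
AGE_GROUPS = {"0-17": 0.15, "18-59": 0.75, "60+": 0.10}
GENDERS = {"male": 0.54, "female": 0.46}
# Placeholder mobile phone usage rates by age group.
PHONE_RATE = {"0-17": 0.30, "18-59": 0.95, "60+": 0.60}


def sample_category(dist, rng):
    """Draw one key from a {category: probability} table (inverse-CDF sampling)."""
    u = rng.random()
    cumulative = 0.0
    for category, probability in dist.items():
        cumulative += probability
        if u < cumulative:
            return category
    return category  # guards against floating-point round-off in the last bin


def synthesize_individuals(n, seed=0):
    """Create n synthetic individuals with age group, gender, and phone flag."""
    rng = random.Random(seed)
    people = []
    for person_id in range(n):
        age_group = sample_category(AGE_GROUPS, rng)
        gender = sample_category(GENDERS, rng)
        has_phone = rng.random() < PHONE_RATE[age_group]
        people.append({
            "id": person_id,
            "age_group": age_group,
            "gender": gender,
            "has_phone": has_phone,
        })
    return people
```

Fixing the seed makes a synthetic population reproducible, which is convenient when the same population has to be reused across repeated simulation runs.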
Step S2: for the individuals of the synthetic urban population to which demographic attributes have been assigned (hereinafter "individuals"), construct each individual's activity chain, using mobile phone location data as the primary source and travel survey data as a supplement. In other words, with mobile phone data as the primary source and travel survey data as a supplement, the spatio-temporal characteristics of individual travel are combined with the building census data to model individual movement at building scale. Specifically:
The travel trajectories of individuals with mobile phones are reconstructed from the mobile phone location data, and the travel trajectories of individuals without mobile phones are constructed from the travel survey data. At this stage, the workplace and residence of an individual with a mobile phone, as identified from the mobile phone data, are resolved to the service area of a mobile phone base station, while the workplace and residence of an individual without a mobile phone, as recorded in the travel survey, are resolved to a traffic analysis zone.
Further, the mobile phone location data include: an anonymized phone number, a time, and the longitude and latitude of the base station. The records of the same phone number are sorted by time to form that user's travel trajectory for one day, in which each location of the user is the location of a base station. Thiessen polygons are then delineated from the base stations (the extent of a Thiessen polygon is the service area of its base station), and the buildings located in the same Thiessen polygon as a base station form the candidate set for the individual's location.
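The trajectory reconstruction and candidate-set step above can be sketched as follows. The sketch relies on the fact that a building lies in a base station's Thiessen (Voronoi) polygon exactly when that station is the building's nearest station, so no explicit polygon geometry is needed. The record layout, the function names, and the use of planar Euclidean distance (which presumes projected coordinates rather than raw longitude and latitude) are assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_trajectories(records):
    """Group (phone_id, timestamp, x, y) records by phone and sort each group by time."""
    by_phone = defaultdict(list)
    for phone_id, ts, x, y in records:
        by_phone[phone_id].append((ts, x, y))
    return {pid: sorted(points) for pid, points in by_phone.items()}


def candidate_buildings(stations, buildings):
    """Map each base station to the buildings inside its Thiessen polygon.

    stations and buildings are dicts of id -> (x, y) in a projected planar
    coordinate system (an assumption of this sketch).
    """
    candidates = defaultdict(list)
    for b_id, (bx, by) in buildings.items():
        nearest = min(
            stations,
            key=lambda s: math.hypot(stations[s][0] - bx, stations[s][1] - by),
        )
        candidates[nearest].append(b_id)
    return dict(candidates)
```

For a city-scale building inventory, a spatial index would replace the linear nearest-station search; the brute-force form is kept here for clarity.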
A travel survey is a survey of individual travel behavior. The travel survey data include: the individual's employer, origin, destination, departure time, end time, travel mode, travel purpose, and so on, but the location information is given at the level of traffic analysis zones. The travel trajectory of an individual without a mobile phone is constructed from each trip record in the travel survey, and the buildings within the same traffic analysis zone form the candidate set for the individual's location.
The activity chain of each individual is constructed from the candidate sets of individual locations obtained above.
Step S3: based on the constructed individual activity chains, and with a time step of one hour, dynamically construct a time series of 24 contact networks for one day.
In other words, the contact networks are constructed dynamically from the activity chains of the individuals of the urban population. Using the constructed activity chains at hourly granularity, the activity categories of different individuals over the day are compared, and individuals who perform the same activity at the same time in the same building are marked as spatio-temporally co-present. Different contact probabilities are assigned to co-present individuals according to the activity category. Under the constraint of these contact probabilities, a dynamic contact network among the agents is generated at hourly resolution. The activity categories include: staying at home, work/school, and leisure and entertainment.
Specifically:
In the contact network for a given time period, vertices represent traveling individuals, and individuals present at the same location are connected by an edge. The unit of location is the household (staying at home), the workplace (work), or the building address (leisure). Individuals present at the same location at the same time come into contact with a certain probability, denoted $P_c$.
Each individual takes part in 24 contact networks per day. Taking one contact network as an example, the vertices represent individuals, and individuals present at the same location are connected by an edge, indicating that the two individuals are at the same location at the same time. Then, according to the contact probabilities in Table 1, it is determined whether two individuals present at the same location at the same time actually come into contact. For example, when the infected person is an adult staying at home, contact occurs with a minor with probability 0.25 and with another adult with probability 0.4.
The contact probability between individuals differs with the activity they are engaged in (see Table 1). According to the contact probabilities, one inter-individual contact network is generated every hour, and the 24 contact networks of a day constitute the final dynamic contact network.
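The hourly network construction can be sketched as follows. Only the two at-home values for an adult (0.25 with a minor, 0.4 with another adult) come from the text; every other entry of `CONTACT_PROB` is a hypothetical placeholder standing in for Table 1, and treating the probability as symmetric in the two age classes is a simplifying assumption of this sketch. The function name and data layout are likewise illustrative.

```python
import random
from collections import defaultdict
from itertools import combinations

# Keyed by (activity, sorted pair of age classes).
CONTACT_PROB = {
    ("home", ("adult", "minor")): 0.25,         # from the description
    ("home", ("adult", "adult")): 0.4,          # from the description
    ("home", ("minor", "minor")): 0.3,          # hypothetical placeholder
    ("work_school", ("adult", "adult")): 0.1,   # hypothetical placeholder
    ("work_school", ("adult", "minor")): 0.1,   # hypothetical placeholder
    ("work_school", ("minor", "minor")): 0.1,   # hypothetical placeholder
    ("leisure", ("adult", "adult")): 0.05,      # hypothetical placeholder
    ("leisure", ("adult", "minor")): 0.05,      # hypothetical placeholder
    ("leisure", ("minor", "minor")): 0.05,      # hypothetical placeholder
}


def hourly_contact_networks(activity_chains, age_class, contact_prob=CONTACT_PROB, seed=0):
    """Return a list of 24 edge sets, one contact network per hour.

    activity_chains maps person -> list of 24 (activity, location) tuples.
    age_class maps person -> "adult" or "minor".
    """
    rng = random.Random(seed)
    networks = []
    for hour in range(24):
        # Individuals with the same activity at the same location are co-present.
        groups = defaultdict(list)
        for person, chain in activity_chains.items():
            groups[chain[hour]].append(person)
        edges = set()
        for (activity, _location), members in groups.items():
            for a, b in combinations(sorted(members), 2):
                pair = tuple(sorted((age_class[a], age_class[b])))
                # Monte Carlo draw: keep the edge only if contact actually occurs.
                if rng.random() < contact_prob[(activity, pair)]:
                    edges.add((a, b))
        networks.append(edges)
    return networks
```

Grouping by `(activity, location)` before enumerating pairs keeps the pair count bounded by the occupancy of each location rather than by the whole population.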
Table 1: Contact probabilities between individuals
Figure PCTCN2020082708-appb-000001
Step S4: based on the constructed time series of 24 contact networks for one day, use an SEIR model to simulate the spread of influenza at high spatio-temporal resolution. Specifically:
The SEIR model divides individuals into four states according to the natural history of influenza (Figure 2): susceptible, latent (exposed), infectious, and recovered. As shown in Figure 2, owing to vaccination and to antibodies produced by the body itself, part of the susceptible population is immune to the influenza virus. A susceptible individual is infected with a certain probability and enters the latent period. After the virus has resided in the body for some time, the individual becomes contagious and enters the infectious period. An infected person in the infectious period may show the corresponding influenza symptoms or may have no obvious symptoms. Finally, the individual is cured and enters the recovered state.
The simulation uses the hour as its time unit and tracks when and where each infection event occurs, and who infects whom. During the spread of influenza, the probability that an individual is infected is called the effective infection probability $P$. In the model, the effective infection probability is given by:
$$P = P_c \times P_i \times r$$
where $P_c$ is the contact probability between individuals, $P_i$ is the individual's infection probability, and $r$ is the relative infectivity of the individual. Whether the individual is infected is finally decided by the Monte Carlo method: a uniformly distributed pseudo-random number is generated by computer and compared with the effective infection probability, and if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. The above process is repeated until the resulting trend in the number of new infections per day is consistent with the real data and the calculated basic reproduction number is greater than 1.
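One hour of the transmission step can be sketched as follows. The sketch applies the product formula and the "less than or equal" Monte Carlo rule from the text, and records each event as (hour, location, infector, infectee) so that when, where, and by whom an infection occurred can be traced. The single-letter state codes, the function names, and the choice to take the relative infectivity from the infector are assumptions of this example; the text does not fix them.

```python
import random


def effective_infection_probability(p_contact, p_infection, relative_infectivity):
    """P = P_c * P_i * r."""
    return p_contact * p_infection * relative_infectivity


def simulate_hour(hour, copresent_pairs, state, p_contact, p_infection, infectivity, rng):
    """Apply one hourly transmission step and return the infection events.

    copresent_pairs: iterable of (person_a, person_b, location).
    state: dict person -> "S", "E", "I", or "R"; updated in place.
    infectivity: dict person -> relative infectivity r (assumed to belong to the infector).
    """
    events = []
    for a, b, location in copresent_pairs:
        for src, dst in ((a, b), (b, a)):
            if state[src] == "I" and state[dst] == "S":
                p = effective_infection_probability(p_contact, p_infection, infectivity[src])
                # Infected when the uniform pseudo-random number is <= P.
                if rng.random() <= p:
                    state[dst] = "E"  # newly infected individuals enter the latent period
                    events.append((hour, location, src, dst))
    return events
```

Because a newly infected individual moves to the latent state, he or she cannot transmit within the same hour, which matches the latent period described for the SEIR model.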
Step S5: analyze the simulation results from the two perspectives of time and space, examine the trend and intensity of the influenza outbreak, analyze the high-risk areas in the course of influenza transmission, and accurately locate the key spatial locations in the spread of the influenza epidemic.
Here, the trend and intensity of the influenza outbreak mean the following: the outbreak trend is reflected mainly in the curve of newly infected cases per day over time. The curve is approximately normally distributed. A narrow curve indicates a rapid outbreak, and a wide curve indicates a more gradual one. The peak of the curve indicates the severity of the outbreak: the higher the peak, the more severe the outbreak.
The high-risk areas are areas with a large cumulative number of infections, indicating that susceptible persons have a high risk of being infected in these areas.
The key spatial locations are locations with strong inter-regional transmission connectivity. That is, intervening at the key spatial locations can reduce the spread of the influenza virus to other areas and thereby reduce the risk of infectious disease transmission within the city.
Specifically, this includes:
In the time series, a comparative analysis is carried out at two scales: the city scale and the administrative district scale. To express the spatio-temporal diffusion of influenza accurately, the city is divided into a grid of 1 km × 1 km cells, and the number of newly infected cases per cell per day is analyzed, the number of cases being the average of 100 simulation runs. An epidemic tree is built from the transmission topology between parent infectors and the offspring they infect, and multiple epidemic trees form an epidemic forest (Figure 3). This reveals the spatial transmission paths between parent infectors and infected offspring, accurately locates the key spatial locations in the spread of the influenza epidemic, and reveals the relationship between transmission distance and transmission intensity when influenza spreads within the city, as well as the spatial agglomeration effect.
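The epidemic forest and the 1 km grid indexing can be sketched as follows. The sketch assumes that infection events are available as (infector, infectee) pairs and that coordinates are planar and expressed in meters; both the function names and these input conventions are hypothetical.

```python
from collections import defaultdict


def build_epidemic_forest(events):
    """Build the epidemic forest from (infector, infectee) events.

    Returns (children, roots): children maps each infector to the list of
    individuals it infected, and roots are infectors that were never infected
    by anyone in the event list (the seed case of each epidemic tree).
    """
    children = defaultdict(list)
    infected = set()
    for parent, child in events:
        children[parent].append(child)
        infected.add(child)
    roots = sorted(p for p in children if p not in infected)
    return dict(children), roots


def grid_cell(x_m, y_m, cell_size_m=1000.0):
    """Return the (column, row) index of the 1 km x 1 km cell containing a point."""
    return (int(x_m // cell_size_m), int(y_m // cell_size_m))
```

Mapping the locations of each parent and child through `grid_cell` turns every edge of the forest into a cell-to-cell transmission path, which is the input needed for the connectivity analysis.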
The key spatial locations in the spread of the influenza epidemic are located mainly by analyzing the transmission intensity of the influenza virus between grid cells. The stronger the transmission connectivity between a cell and other cells (Figure 13), and the greater the number of cells linked to that cell (Figure 12), the more important the location is and the more it requires focused intervention.
The spatial distance between cells represents the transmission distance of influenza, and the connection weight between cells represents the transmission intensity of influenza between cells. To reveal the relationship between transmission distance and transmission intensity, this embodiment computes the Pearson correlation coefficient between the two, obtaining r = -0.098, which indicates an extremely weak negative correlation between transmission distance and transmission intensity.
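For reference, the Pearson correlation coefficient used above can be computed as in the following sketch, where `xs` would hold the distances between pairs of cells and `ys` the corresponding connection weights. The function name is illustrative, and the sketch does not reproduce the reported value, which depends on simulation data not given here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near zero, such as the one reported, means that distance alone explains almost none of the variation in transmission intensity between cells.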
The spatial agglomeration effect is reflected in the pronounced spatial clustering of areas whose connection weight is greater than 100 (Figure 14), indicating that transmission among these areas is extremely strong.
Referring to Figure 4, which is a hardware architecture diagram of the system 10 of the present invention for simulating the spatio-temporal transmission of influenza with large-scale trajectory data. The system includes: a demographic attribute assignment module 101, an activity chain construction module 102, a contact network construction module 103, a transmission simulation module 104, and an analysis module 105.
The demographic attribute assignment module 101 is configured to synthesize an urban population based on population census data and building census data, and to assign corresponding demographic attributes to each individual of the synthetic urban population. In other words, multi-source data are fused to build an urban population mobility model. Wherein:
The population census data include: age, gender, occupation type, household category, household size, and household age composition. The building census data include: building location, building height, floor area, and building function, wherein the building functions include factory, teaching building, residential building, office building, and shopping mall.
The demographic attributes include individual attributes, household attributes, and whether the individual has a mobile phone, wherein the individual attributes include age, gender, and occupation, and the household attributes include household structure, home address, and workplace.
Specifically:
The demographic attribute assignment module 101 first assigns corresponding individual attributes to each individual of the synthetic urban population by Monte Carlo simulation, according to the probability distributions of individual attributes such as age, gender, and occupation type in the population census data. It then constructs synthetic households according to the probabilities of household attributes such as household category, household size, and household age composition in the population census data, and places the individuals of the synthetic urban population into the synthetic households. Finally, it performs a Monte Carlo simulation according to the mobile phone usage rates of different genders and age groups, so that each individual of the synthetic urban population is given an attribute indicating whether he or she has a mobile phone. The individuals are thereby divided into two classes: individuals with mobile phones and individuals without mobile phones.
The activity chain construction module 102 is configured to construct, for the individuals of the synthetic urban population to which demographic attributes have been assigned (hereinafter "individuals"), each individual's activity chain, using mobile phone location data as the primary source and travel survey data as a supplement. In other words, with mobile phone data as the primary source and travel survey data as a supplement, the spatio-temporal characteristics of individual travel are combined with the building census data to model individual movement at building scale. Specifically:
The activity chain construction module 102 reconstructs the travel trajectories of individuals with mobile phones from the mobile phone location data, and constructs the travel trajectories of individuals without mobile phones from the travel survey data. At this stage, the workplace and residence of an individual with a mobile phone, as identified from the mobile phone data, are resolved to the service area of a mobile phone base station, while the workplace and residence of an individual without a mobile phone, as recorded in the travel survey, are resolved to a traffic analysis zone.
Further, the mobile phone location data include: an anonymized phone number, a time, and the longitude and latitude of the base station. The records of the same phone number are sorted by time to form that user's travel trajectory for one day, in which each location of the user is the location of a base station. Thiessen polygons are then delineated from the base stations (the extent of a Thiessen polygon is the service area of its base station), and the buildings located in the same Thiessen polygon as a base station form the candidate set for the individual's location.
A travel survey is a survey of individual travel behavior. The travel survey data include: the individual's employer, origin, destination, departure time, end time, travel mode, travel purpose, and so on, but the location information is given at the level of traffic analysis zones. The travel trajectory of an individual without a mobile phone is constructed from each trip record in the travel survey, and the buildings within the same traffic analysis zone form the candidate set for the individual's location.
The activity chain construction module 102 constructs the activity chain of each individual from the candidate sets of individual locations obtained above.
The contact network construction module 103 is configured to dynamically construct, based on the constructed individual activity chains and with a time step of one hour, a time series of 24 contact networks for one day. In other words, the contact network construction module 103 constructs the contact networks dynamically from the activity chains of the individuals of the urban population. Using the constructed activity chains at hourly granularity, the activity categories of different individuals over the day are compared, and individuals who perform the same activity at the same time in the same building are marked as spatio-temporally co-present. Different contact probabilities are assigned to co-present individuals according to the activity category. Under the constraint of these contact probabilities, a dynamic contact network among the agents is generated at hourly resolution. The activity categories include: staying at home, work/school, and leisure and entertainment.
Specifically:
In the contact network for a given time period, vertices represent traveling individuals, and individuals present at the same location are connected by an edge. The unit of location is the household (staying at home), the workplace (work), or the building address (leisure). Individuals present at the same location at the same time come into contact with a certain probability, denoted $P_c$.
Each individual takes part in 24 contact networks per day. Taking one contact network as an example, the vertices represent individuals, and individuals present at the same location are connected by an edge, indicating that the two individuals are at the same location at the same time. Then, according to the contact probabilities in Table 1, it is determined whether two individuals present at the same location at the same time actually come into contact. For example, when the infected person is an adult staying at home, contact occurs with a minor with probability 0.25 and with another adult with probability 0.4.
The contact probability between individuals differs with the activity they are engaged in (see Table 1). According to the contact probabilities, one inter-individual contact network is generated every hour, and the 24 contact networks of a day constitute the final dynamic contact network.
Table 1: Contact probabilities between individuals
Figure PCTCN2020082708-appb-000002
The transmission simulation module 104 is configured to use an SEIR model to simulate the spread of influenza at high spatio-temporal resolution, based on the constructed time series of 24 contact networks for one day. Specifically:
The SEIR model divides individuals into four states according to the natural history of influenza (Figure 2): susceptible, latent (exposed), infectious, and recovered. As shown in Figure 2, owing to vaccination and to antibodies produced by the body itself, part of the susceptible population is immune to the influenza virus. A susceptible individual is infected with a certain probability and enters the latent period. After the virus has resided in the body for some time, the individual becomes contagious and enters the infectious period. An infected person in the infectious period may show the corresponding influenza symptoms or may have no obvious symptoms. Finally, the individual is cured and enters the recovered state.
The simulation uses the hour as its time unit and tracks when and where each infection event occurs, and who infects whom. During the spread of influenza, the probability that an individual is infected is called the effective infection probability $P$. In the model, the effective infection probability is given by:
$$P = P_c \times P_i \times r$$
where $P_c$ is the contact probability between individuals, $P_i$ is the individual's infection probability, and $r$ is the relative infectivity of the individual. Whether the individual is infected is finally decided by the Monte Carlo method: a uniformly distributed pseudo-random number is generated by computer and compared with the effective infection probability, and if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected. The above process is repeated until the resulting trend in the number of new infections per day is consistent with the real data and the calculated basic reproduction number is greater than 1.
The analysis module 105 is configured to analyze the simulation results from the two perspectives of time and space, examine the trend and intensity of the influenza outbreak, analyze the high-risk areas in the course of influenza transmission, and accurately locate the key spatial locations in the spread of the influenza epidemic.
Here, the trend and intensity of the influenza outbreak mean the following: the outbreak trend is reflected mainly in the curve of newly infected cases per day over time. The curve is approximately normally distributed. A narrow curve indicates a rapid outbreak, and a wide curve indicates a more gradual one. The peak of the curve indicates the severity of the outbreak: the higher the peak, the more severe the outbreak.
The high-risk areas are areas with a large cumulative number of infections, indicating that susceptible persons have a high risk of being infected in these areas.
The key spatial locations are locations with strong inter-regional transmission connectivity. That is, intervening at the key spatial locations can reduce the spread of the influenza virus to other areas and thereby reduce the risk of infectious disease transmission within the city.
Specifically, this includes:
The analysis module 105 carries out, in the time series, a comparative analysis at two scales: the city scale and the administrative district scale. To express the spatio-temporal diffusion of influenza accurately, the city is divided into a grid of 1 km × 1 km cells, and the number of newly infected cases per cell per day is analyzed, the number of cases being the average of 100 simulation runs. The analysis module 105 builds an epidemic tree from the transmission topology between parent infectors and the offspring they infect, and multiple epidemic trees form an epidemic forest (Figure 3). This reveals the spatial transmission paths between parent infectors and infected offspring, accurately locates the key spatial locations in the spread of the influenza epidemic, and reveals the relationship between transmission distance and transmission intensity when influenza spreads within the city, as well as the spatial agglomeration effect.
The key spatial locations in the spread of the influenza epidemic are located mainly by analyzing the transmission intensity of the influenza virus between grid cells. The stronger the transmission connectivity between a cell and other cells (Figure 13), and the greater the number of cells linked to that cell (Figure 12), the more important the location is and the more it requires focused intervention.
The spatial distance between cells represents the transmission distance of influenza, and the connection weight between cells represents the transmission intensity of influenza between cells. To reveal the relationship between transmission distance and transmission intensity, this embodiment computes the Pearson correlation coefficient between the two, obtaining r = -0.098, which indicates an extremely weak negative correlation between transmission distance and transmission intensity.
The spatial agglomeration effect is reflected in the pronounced spatial clustering of areas whose connection weight is greater than 100 (Figure 14), indicating that transmission among these areas is extremely strong.
本申请实施例一:Example one of this application:
(1)人口属性建模。根据人口普查数据中年龄、性别、职业类型等个体属性的概率分布,通过蒙特卡洛模拟为每个合成个体分配相应的个体属性,合成个体的年龄、性别分布如图5所示。根据人口普查数据中家庭类别、家庭规模、家庭年龄成分等家庭属性的概率构建合成家庭,并将合成个体填充进合成家庭,图6展示了本实施例构建的合成家庭的大小分布情况。本实施例构建的合成人口与2017年深圳市统计年鉴发布 的各区人口数据相比,人口数量基本吻合。(1) Modeling of population attributes. According to the probability distribution of individual attributes such as age, gender, and occupation type in the census data, the corresponding individual attributes are assigned to each synthetic individual through Monte Carlo simulation. The age and gender distribution of the synthetic individual are shown in Figure 5. A synthetic family is constructed according to the probability of family attributes such as family category, family size, and family age components in the census data, and synthetic individuals are filled into the synthetic family. Figure 6 shows the size distribution of the synthetic family constructed in this embodiment. Compared with the population data of each district released by the 2017 Shenzhen Statistical Yearbook, the synthetic population constructed in this embodiment basically matches the population.
(2)在城市整体水平上,模拟曲线与流感真实数据在趋势上吻合度较高,可以反映流感爆发强度、爆发持续时间、峰值等重要信息。图7展示了流感扩散过程模拟结果与真实病例在时间序列上的比较,图8展示了整个模拟周期内模型产生的再生数,再生数大于1表示流行病正在传播扩散,小于1表示流行病趋于消亡。再生数的值大于1时,再生数先上升后下降,且100次模拟结果显示再生数的值在大于1的部分波动性较大。再生数的值小于1时,再生数前期先稍微下降,后期基本趋于平稳,100次模拟结果显示再生数在小于1的部分波动性较小。(2) At the overall level of the city, the simulated curve and the actual flu data are in a high degree of trend agreement, which can reflect important information such as the intensity of the flu outbreak, the duration of the outbreak, and the peak value. Figure 7 shows the comparison between the simulation results of the influenza spreading process and the real cases in time series. Figure 8 shows the number of regenerations generated by the model during the entire simulation cycle. The number of regenerations greater than 1 indicates that the epidemic is spreading, and less than 1 indicates the trend of the epidemic. To die. When the value of the regeneration number is greater than 1, the regeneration number first rises and then decreases, and the results of 100 simulations show that the value of the regeneration number fluctuates greatly in the part greater than 1. When the value of the regeneration number is less than 1, the regeneration number decreases slightly in the early stage and basically stabilizes in the later stage. The 100 simulation results show that the regeneration number is less volatile in the part less than 1.
(3)在行政区划尺度下,大部分行政区仍然可以较好反映流感爆发强度及爆发趋势(图9)。坪山区、龙华区、光明区、大鹏区由于医院、社康较少,流感病例的时空密度稀疏(图9最后一行),造成模拟结果与真实流感数据在趋势上偏差较大。(3) Under the administrative division scale, most administrative regions can still better reflect the intensity and trend of influenza outbreaks (Figure 9). Pingshan District, Longhua District, Guangming District, and Dapeng District have fewer hospitals and social health facilities, and the spatio-temporal density of influenza cases is sparse (the last row in Figure 9), resulting in large deviations in trends between the simulation results and the real influenza data.
(4)图10是模拟深圳市流感扩散时空分布的结果,每个网格内的数值为100次模拟结果的平均值。本实施例提出的融合手机位置数据构建的基于智能体的空间显式传染病模型可以支持追踪感染者的位置信息,及其与其他易感者的交互情况,有效解决了无法真实定位感染者空间位置及潜在高风险传播区域的问题。(4) Figure 10 is the result of simulating the temporal and spatial distribution of influenza spread in Shenzhen. The value in each grid is the average of 100 simulation results. The agent-based spatial explicit infectious disease model constructed by fusing mobile phone location data proposed in this embodiment can support tracking the location information of infected persons and their interaction with other susceptible persons, effectively solving the inability to truly locate the space of infected persons. Location and potential high-risk transmission areas.
(5) An epidemic tree is constructed from the results of the 100 simulation runs, and this embodiment analyzes the importance of each spatial unit in the spread of the epidemic at a spatial scale of 1 km × 1 km (Figures 11-14). Figure 11 shows the number of transmissions occurring within each spatial unit, that is, cases where the home locations of the parent (infecting) case and the child (infected) case fall in the same grid cell; such infection events tend to occur between neighbors in the same residential community. Figure 11 shows a high-risk area in the southeast of Luohu District where influenza is more readily transmitted internally, so residents of that area can be reminded to guard against transmission between neighbors. Figures 12, 13, and 14 cover the case where the parent case and the child case are located in two different spatial units. The straight-line transmission paths between parent and child cases in different spatial units define the connections between grid cells, and these connections together form a social network of grid cells. The value of a grid cell in Figure 12 is its degree in this network, reflecting the number of grid cells that the cell can affect. The spatial units at the boundary between Futian District and Luohu District affect more of Shenzhen's geographic area than other spatial units do. The analysis also shows that the connections between grid cells carry different weights when influenza spreads between cells (the connection weight is the number of influenza transmission events between the two cells). 99.9% of the inter-cell connections have a weight below 100 (Figure 13), and the areas with connection weights above 100 are clearly spatially clustered (Figure 14). Although areas with connection weights above 100 make up a very small share, identifying them is essential: they have a larger effect on influenza transmission and a higher probability of influenza spreading within them, and their pronounced spatial clustering makes containment measures easier to implement. The spatial distance between grid cells represents the transmission distance of influenza, and the connection weight represents the transmission intensity between cells. To examine the relationship between the two, this embodiment computes the Pearson correlation coefficient r between them; r = -0.098, indicating a very weak negative correlation between transmission distance and transmission intensity.
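The grid-level quantities described in (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: home locations are taken to be planar coordinates in kilometres, connections are treated as directed from the parent's cell to the child's cell, the degree is the number of distinct cells a cell transmits to, and the distance between cells is measured between cell indices. The names `cell_of`, `grid_network`, and `distance_weight_correlation` are hypothetical.

```python
import math
from collections import Counter, defaultdict

CELL_KM = 1.0  # 1 km x 1 km grid, as stated in the text


def cell_of(x_km, y_km):
    """Index of the grid cell containing a point (planar km coordinates assumed)."""
    return (math.floor(x_km / CELL_KM), math.floor(y_km / CELL_KM))


def grid_network(events):
    """events: iterable of ((px, py), (cx, cy)) parent/child home locations."""
    within = Counter()   # transmissions inside one cell (cf. Figure 11)
    weights = Counter()  # (src_cell, dst_cell) -> number of events (cf. Figures 13-14)
    for (px, py), (cx, cy) in events:
        src, dst = cell_of(px, py), cell_of(cx, cy)
        if src == dst:
            within[src] += 1
        else:
            weights[(src, dst)] += 1
    targets = defaultdict(set)
    for src, dst in weights:
        targets[src].add(dst)
    # Degree taken as the number of distinct cells reached (cf. Figure 12).
    degree = {cell: len(dsts) for cell, dsts in targets.items()}
    return within, weights, degree


def pearson_r(xs, ys):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def distance_weight_correlation(weights):
    """Correlate cell-to-cell distance (km) with connection weight."""
    dists = [math.dist(src, dst) * CELL_KM for src, dst in weights]
    return pearson_r(dists, list(weights.values()))
```

Given the transmission events pooled from all simulation runs, `grid_network` yields the three maps, and `distance_weight_correlation(weights)` gives the distance-intensity correlation that the text reports as r = -0.098 for the Shenzhen simulation.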
The present invention establishes an agent-based, spatially explicit infectious disease model. Building on traditional methods that model individual movement from statistical data, it proposes a new spatio-temporal framework that fuses large-scale mobile phone location data for individual movement modeling and reconstructs the activity chains of individuals in the city. The model contains the entire population of the urban area (with both demographic attributes and movement behavior). Individual travel locations are estimated down to the building, and activity places are organized by household (staying at home), workplace (working), and building (leisure), from which a dynamic contact network among individuals is built. The SEIR model is then used to simulate the spatio-temporal spread of influenza in the city with a time step of 1 hour and the building as the spatial unit, and heterogeneity in individuals' responses to the influenza virus is fully taken into account in the simulation.
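To make the hourly mechanics concrete, the following Python sketch shows one possible exposure step (the S to E part of an SEIR update) on top of building-level co-occurrence. It uses the effective infection probability P = P_c × P_i × r and the "pseudo-random number less than or equal to P" rule stated in claims 7 and 8. Everything else is an assumption for illustration: the `CONTACT_PROB` values are placeholders, `r` is read as the infector's relative infectivity, and the names `co_located_groups` and `exposure_step` are hypothetical.

```python
import random
from collections import defaultdict

# Placeholder contact probabilities per activity type (not from the source).
CONTACT_PROB = {"home": 1.0, "work": 0.3, "leisure": 0.1}


def co_located_groups(activity_chains, hour):
    """Group individuals who share the same activity and building in one hour.

    activity_chains: {person: [(activity, building_id)] * 24}, a hypothetical layout.
    """
    groups = defaultdict(list)
    for person, chain in activity_chains.items():
        activity, building = chain[hour]
        groups[(activity, building)].append(person)
    return groups


def exposure_step(activity_chains, state, hour, p_infect, rel_infectivity, rng):
    """Apply S -> E transitions for one hour and return the transmission events.

    state maps person -> "S", "E", "I" or "R"; it is updated in place.
    Returns a list of (infector, infectee, hour) records.
    """
    events = []
    groups = co_located_groups(activity_chains, hour)
    for (activity, _building), members in groups.items():
        infectious = [p for p in members if state[p] == "I"]
        susceptible = [p for p in members if state[p] == "S"]
        for infectee in susceptible:
            for infector in infectious:
                # Effective infection probability P = P_c * P_i * r.
                p_eff = CONTACT_PROB[activity] * p_infect * rel_infectivity[infector]
                if rng.random() <= p_eff:  # Monte Carlo decision
                    events.append((infector, infectee, hour))
                    break                  # one infection per susceptible
    for _infector, infectee, _hour in events:
        state[infectee] = "E"
    return events
```

Recording each event as (infector, infectee, hour) keeps the information needed to trace when, where, and by whom each infection was caused. The E to I and I to R transitions are omitted here to keep the sketch short.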
It should be noted that the present invention is not limited to the fusion of mobile phone data, travel survey data, population census data, building census data, and similar data. The invention can simulate the spread of a variety of infectious diseases, for example diseases transmitted at close range such as influenza and dengue fever. In addition, the invention provides good analytical capability for the spread of infectious diseases in both time and space, especially spatial analysis.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make various improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (10)

  1. A method for simulating the spatio-temporal transmission process of influenza using large-scale trajectory data, characterized in that the method comprises the following steps:
    a. synthesizing an urban population based on population census data and building census data, and assigning corresponding demographic attributes to the individuals of the synthetic urban population;
    b. for the individuals of the synthetic urban population that have been assigned demographic attributes, constructing each individual's activity chain using mobile phone location data as the primary source and travel survey data as a supplement;
    c. based on the constructed activity chains, dynamically constructing contact networks for the 24 hourly time slices of a day, with one hour as the time step;
    d. based on the constructed contact networks for the 24 hourly time slices of a day, simulating the spread of influenza at high spatio-temporal resolution using the SEIR model.
  2. The method according to claim 1, characterized in that the method further comprises step e:
    analyzing the simulation results from both temporal and spatial perspectives to obtain the spatial transmission paths between infected individuals and to accurately locate key spatial locations in the spread of the influenza epidemic.
  3. The method according to claim 1, characterized in that:
    the population census data comprise: age, gender, occupation type, household category, household size, and household age composition; the building census data comprise: building location information, building height, floor area, and building function; wherein the building function comprises: factory, teaching building, residential building, office building, and shopping mall;
    the demographic attributes comprise individual attributes, household attributes, and whether the individual has a mobile phone; wherein the individual attributes comprise: age, gender, and occupation; and the household attributes comprise: household structure, home address, and workplace.
  4. The method according to claim 3, characterized in that step b specifically comprises the following steps:
    constructing the travel trajectories of individuals with mobile phones from mobile phone location data: sorting the records of the same phone number by time to form that user's travel trajectory for the day, and partitioning space into Thiessen polygons based on the mobile phone base stations, wherein the buildings located in the same Thiessen polygon as a base station form the candidate set for the individual's location;
    constructing the travel trajectories of individuals without mobile phones from travel survey data: constructing the trajectory from each trip record in the travel survey, wherein the buildings in the same traffic zone form the candidate set for the individual's location;
    constructing each individual's activity chain from the candidate sets of individual locations obtained above.
  5. The method according to claim 4, characterized in that step c specifically comprises the following steps:
    based on the constructed activity chains and at an hourly time granularity, comparing the activity categories performed by different individuals within a day; setting individuals who perform the same activity at the same time in the same building as spatio-temporally co-occurring individuals; assigning different contact probabilities between co-occurring individuals according to the activity category; and, under the constraint of the contact probabilities, generating a dynamic contact network between agents at hourly resolution.
  6. The method according to claim 5, characterized in that the activity categories comprise: staying at home, working or attending school, and leisure and entertainment activities.
  7. The method according to claim 5, characterized in that step d specifically comprises the following steps:
    with the hour as the time unit, tracking when and where each infection event occurs and who infects whom; during the spread of influenza, the probability that an individual is infected is called the effective infection probability P, given by the following formula:
    $P = P_c \times P_i \times r$
    where $P_c$ is the contact probability between individuals, $P_i$ is the infection probability of the individual, and $r$ is the relative infectivity of the individual;
    finally, determining by the Monte Carlo method whether the individual is infected.
  8. The method according to claim 7, characterized in that determining by the Monte Carlo method whether the individual is infected comprises:
    generating uniformly distributed pseudo-random numbers by computer and comparing a pseudo-random number with the effective infection probability; if the pseudo-random number is less than or equal to the effective infection probability, the individual is infected; and repeating the above process until the resulting trend in the number of new infections per day is consistent with the real data and the calculated basic reproduction number is greater than 1.
  9. The method according to claim 8, characterized in that step e specifically comprises the following steps:
    performing comparative analysis on the time series at two scales, the city scale and the administrative-district scale; dividing the city into a grid of 1 km × 1 km cells and analyzing the number of newly infected cases in each grid cell per day;
    constructing epidemic trees from the transmission topology between parent infected individuals and child infected individuals, combining multiple epidemic trees into an epidemic forest, obtaining the spatial transmission paths between parent cases and child cases, and accurately locating key spatial locations in the spread of the influenza epidemic.
  10. A system for simulating the spatio-temporal transmission process of influenza using large-scale trajectory data, characterized in that the system comprises a demographic attribute assignment module, an activity chain construction module, a contact network construction module, and a transmission simulation module, wherein:
    the demographic attribute assignment module is configured to synthesize an urban population based on population census data and building census data, and to assign corresponding demographic attributes to the individuals of the synthetic urban population;
    the activity chain construction module is configured to construct, for the individuals of the synthetic urban population that have been assigned demographic attributes, each individual's activity chain using mobile phone location data as the primary source and travel survey data as a supplement;
    the contact network construction module is configured to dynamically construct, based on the constructed activity chains, contact networks for the 24 hourly time slices of a day, with one hour as the time step;
    the transmission simulation module is configured to simulate, based on the constructed contact networks for the 24 hourly time slices of a day, the spread of influenza at high spatio-temporal resolution using the SEIR model.
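The candidate-set construction recited in claim 4 can be illustrated with a short Python sketch. A building lies in the Thiessen (Voronoi) polygon of a base station exactly when that station is its nearest station, so the polygons do not need to be built explicitly. The planar coordinates, the `(phone_id, timestamp, station_id)` record layout, and the names `candidate_buildings` and `daily_trajectory` are assumptions for illustration only.

```python
import math
from collections import defaultdict


def candidate_buildings(stations, buildings):
    """Assign each building to the Thiessen polygon of its nearest base station.

    stations:  {station_id: (x, y)}  planar coordinates (assumed)
    buildings: {building_id: (x, y)}
    Returns {station_id: [building_id, ...]}, the candidate location sets.
    """
    candidates = defaultdict(list)
    for b_id, b_xy in buildings.items():
        nearest = min(stations, key=lambda s: math.dist(stations[s], b_xy))
        candidates[nearest].append(b_id)
    return dict(candidates)


def daily_trajectory(records, phone_id):
    """Sort one user's records by time to form the day's travel trajectory.

    records: iterable of (phone_id, timestamp, station_id), a hypothetical layout.
    Returns a time-ordered list of (timestamp, station_id).
    """
    return sorted((t, s) for pid, t, s in records if pid == phone_id)


stations = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
buildings = {"b1": (1.0, 1.0), "b2": (9.0, 0.0), "b3": (4.0, 3.0)}
print(candidate_buildings(stations, buildings))  # {'A': ['b1', 'b3'], 'B': ['b2']}
```

For each step of a user's trajectory, the candidate set of the recorded station then narrows the individual's possible location down to a list of buildings, from which the activity chain can be assembled.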
PCT/CN2020/082708 2020-03-27 2020-04-01 Method and system for simulating process of temporal and spatial circulation of influenza with massive trajectory data WO2021189516A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010231036.6 2020-03-27
CN202010231036.6A CN113450923B (en) 2020-03-27 2020-03-27 Method and system for simulating influenza space-time propagation process by large-scale track data

Publications (1)

Publication Number Publication Date
WO2021189516A1 true WO2021189516A1 (en) 2021-09-30

Family

ID=77807923

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082708 WO2021189516A1 (en) 2020-03-27 2020-04-01 Method and system for simulating process of temporal and spatial circulation of influenza with massive trajectory data

Country Status (2)

Country Link
CN (1) CN113450923B (en)
WO (1) WO2021189516A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114171211A (en) * 2021-12-14 2022-03-11 平安国际智慧城市科技股份有限公司 Trajectory tracking method and device, computer equipment and storage medium
CN114388137A (en) * 2021-12-31 2022-04-22 中国科学院深圳先进技术研究院 Urban influenza incidence trend prediction method, system, terminal and storage medium
CN114582522A (en) * 2022-03-04 2022-06-03 中国人民解放军军事科学院军事医学研究院 Infectious disease aerosol propagation modeling simulation method and system based on Gaussian diffusion
CN115394455B (en) * 2022-05-31 2023-07-18 北京乾图科技有限公司 Space-time spread prediction method and device for infectious diseases based on spatial clustering discrete grid
CN115206543A (en) * 2022-06-22 2022-10-18 清华大学 Target object identification method and device in epidemiological investigation and computer equipment
CN115062244B (en) * 2022-08-18 2023-02-03 深圳市城市交通规划设计研究中心股份有限公司 Space-time accompanying person and co-worker resident searching method based on multi-source data
CN115330360B (en) * 2022-10-13 2022-12-27 广东泳华科技有限公司 Pedestrian trajectory calculation method based on multi-agent simulation technology
CN115881310B (en) * 2023-02-17 2023-05-19 中国建筑设计研究院有限公司 Multi-subject-based community space respiratory disease transmission prevention and control method
CN116823572B (en) * 2023-06-16 2023-12-19 中国联合网络通信有限公司深圳市分公司 Population flow data acquisition method and device and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335604A (en) * 2015-08-31 2016-02-17 吉林大学 Epidemic prevention and control oriented population dynamic contact structure modeling and discovery method
CN107256327A (en) * 2017-05-05 2017-10-17 中国科学院深圳先进技术研究院 A kind of infectious disease preventing control method and system
CN108364694A (en) * 2018-03-09 2018-08-03 中华人民共和国陕西出入境检验检疫局 Airport Disease Warning Mechanism based on multi-data source big data and prevention and control system constituting method
CN109192318A (en) * 2018-07-11 2019-01-11 辽宁石油化工大学 The foundation and Laplace for describing the simplification SIS model of infectious disease transmission process are analyzed
WO2019020477A1 (en) * 2017-07-28 2019-01-31 Koninklijke Philips N.V. Monitoring direct and indirect transmission of infections in a healthcare facility using a real-time locating system
CN109360660A (en) * 2018-10-31 2019-02-19 河南省疾病预防控制中心 A kind of preventing control method and prevention and control system of disease control and trip information interconnection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682188A (en) * 2011-03-15 2012-09-19 中国科学院遥感应用研究所 City-wide infectious disease simulation method and device
JP2018067183A (en) * 2016-10-20 2018-04-26 アイシン精機株式会社 Mobile body tracking controller

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIASHENG WANG ; JIANHONG XIONG ; KUN YANG ; SHUANGYUN PENG ; QUANLI XU: "Use of GIS and agent-based modeling to simulate the spread of influenza", GEOINFORMATICS, 2010 18TH INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 18 June 2010 (2010-06-18), Piscataway, NJ, USA, pages 1 - 6, XP031750263, ISBN: 978-1-4244-7301-4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159706A (en) * 2021-03-11 2021-07-23 北京联创新天科技有限公司 Enterprise big data information management system
CN115268310A (en) * 2022-06-20 2022-11-01 江苏南星家纺有限公司 Adjustable control system of gas permeability that textile fabric used
CN115268310B (en) * 2022-06-20 2023-12-15 江苏南星家纺有限公司 Breathable adjustable control system for textile fabric
CN116866842A (en) * 2023-09-05 2023-10-10 成都健康医联信息产业有限公司 Infectious disease cross-space-time target tracking and early warning method, system, terminal and medium
CN116866842B (en) * 2023-09-05 2023-11-21 成都健康医联信息产业有限公司 Infectious disease cross-space-time target tracking and early warning method, system, terminal and medium

Also Published As

Publication number Publication date
CN113450923B (en) 2023-07-14
CN113450923A (en) 2021-09-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927734

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20927734

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03/07/2023)