CN113643824A - Suspected epidemic infected person detection method based on gamma-suspected infected community model - Google Patents

Info

Publication number
CN113643824A
Authority
CN
China
Prior art keywords
infected
suspected
gamma
nodes
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110813384.9A
Other languages
Chinese (zh)
Other versions
CN113643824B (en)
Inventor
范正青
袁龙
陈紫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202110813384.9A
Publication of CN113643824A
Application granted
Publication of CN113643824B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Remote Sensing (AREA)
  • Biomedical Technology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a suspected epidemic infected person detection method based on a gamma-suspected infected community model, which comprises the following steps: crawling travel track information of infected persons from the network; analyzing the travel track data and matching the tracks of persons in pairs to obtain a contact possibility value for any two persons, and storing the values in a matching value file; reading the data in the matching value file to generate a contact probability graph, in which each person's ID is a node and the contact possibility value of two persons is the weight of the edge connecting their nodes; and performing a maximal clique search on the contact probability graph to find its maximal cliques, then analyzing each maximal clique individually to find the gamma-suspected infected communities on all the maximal cliques. If a patient is detected in a gamma-suspected infected community, the other persons in that community have a high probability of infection. The invention helps identify persons likely to carry the virus and greatly reduces the workload of medical personnel.

Description

Suspected epidemic infected person detection method based on gamma-suspected infected community model
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a suspected epidemic infected person detection method based on a gamma-suspected infected community model.
Background
One of the keys to preventing and controlling epidemics such as COVID-19 is discovering infected persons: the earlier an infected person is found, the sooner the epidemic can be brought under control and the lower the pressure on downstream medical treatment. Active detection and epidemiological investigation to uncover potential asymptomatic infected persons or suspected cases therefore play an important role in controlling the source of infection, reducing or eliminating the spread of pathogens, and preventing transmission among people.
However, finding COVID-19 infected persons is difficult. The virus is highly infectious and can be transmitted by coughing, sneezing, speaking, or even standing nearby. Symptoms do not appear immediately after infection: there is a 14-day incubation period during which the infected person is still infectious, which undoubtedly makes infected persons harder to find. Existing algorithms are also limited in that they do not coordinate time and position information well; for COVID-19, a match that does not consider time and geographic position simultaneously has no practical significance.
Disclosure of Invention
The invention aims to provide a method for detecting suspected epidemic infected persons based on a gamma-suspected infected community model, so that all groups with a high risk of infection can be found.
The technical scheme for realizing the purpose of the invention is as follows: a suspected epidemic infected person detection method based on a gamma-suspected infected community model comprises the following specific steps:
step 1, crawling travel track information of infected persons from the network;
step 2, analyzing the travel track data of infected persons, matching the travel track data of persons in pairs to obtain a contact possibility value for any two persons, and storing the values in a matching value file;
step 3, reading the data in the matching value file and generating a contact probability graph, wherein each person's ID is a node of the contact probability graph and the contact possibility value of two persons is the weight of the edge connecting their two nodes;
step 4, performing a maximal clique search on the contact probability graph to find the maximal cliques in it, and analyzing each of the maximal cliques found individually to find the gamma-suspected infected communities on all the maximal cliques.
Preferably, the travel track information of an infected person on the network includes the person's unique ID and track data, the track data including location information and time information.
Preferably, the specific method for analyzing the travel track data of infected persons and matching the travel track data of persons in pairs to obtain the contact possibility value of any two persons is as follows:
selecting any two track files and reading the track information in them;
judging the time matching degree: judging whether the two tracks were generated on the same day and within a set time difference, that is, whether the intervals between the first and last times of the two track files intersect; if not, selecting another pair of track files and judging their time matching degree; if so, proceeding to judge the geographic position matching degree;
judging the geographic position matching degree: matching the locations and judging whether the two tracks lie in the same area within a distance tolerance; if not, selecting another pair of track files and judging their time matching degree; if so, calculating the contact possibility value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact possibility value.
Preferably, the matching value of the two tracks is calculated as follows:
setting the area range of a segment of one track, and taking as the matching value the ratio of the number of points of the other track's segment that fall within that area range during the matching time period to the total number of points of the other track's segment.
Preferably, the specific method for searching for the maximal cliques in the contact probability graph is as follows:
constructing three sets R, P and X to store the state of the enumeration. R is the result set and represents the clique found so far; its nodes are pairwise connected but do not necessarily form a maximal clique. The nodes in P and X are the common neighbors of the nodes in R, so any of them could be added to R to extend the clique. X is the excluded set, containing nodes that have already been added to R in an earlier branch; P is the candidate set, containing nodes that have not yet been added to R;
at each step a node v is selected from P and added to R, and P and X are updated at the same time so that they still contain only common neighbors of the new R. Specifically, P and X are each intersected with the adjacency list of v and the procedure is called recursively; after the recursive call returns, v is moved from the candidate set P to the excluded set X. When P and X are both empty, R is output as a maximal clique.
Preferably, the specific method for finding the gamma-suspected infected community on a maximal clique is as follows:
selecting two adjacent nodes in the maximal clique and adding them to a set C, which represents the gamma-clique found so far;
provided that the weight of the edge between the two selected nodes is greater than gamma, selecting a new node and adding it to the set C only if the product of the weights of all edges in C after the addition is still greater than or equal to gamma;
traversing all nodes of the maximal clique; the set C finally obtained is the gamma-suspected infected community on the maximal clique.
Preferably, the specific method for selecting the new node is as follows:
taking an unvisited node as the starting node, walking along the edges of the current node to an unvisited node, returning to the previous node when no unvisited neighbor remains, and continuing to probe other nodes until all nodes have been visited.
Compared with the prior art, the invention has the following notable advantages. (1) The method matches track data from the real world and uses the value returned by the matching as a measure of the possibility of infectious contact between persons. This removes the limitation of directly questioning isolated persons, makes it possible to find strangers who were in close proximity to a patient, gives an intuitive numeric measure of the degree of contact, and shortens the time needed to find infected persons. (2) The track matching applies pruning: when the two tracks being compared cannot match in time, no spatial matching is performed, which saves the computation and time of geographic position matching. (3) The invention captures the movement track of a patient before isolation, so the places the patient passed through are clearly known; a spatio-temporal range query then finds the people who were in close contact with the patient and directly yields the persons who contacted infected persons within a given period, saving manpower, material and financial resources, and the workload of medical staff. (4) Different thresholds gamma can be set to generate gamma-suspected infected communities under different probabilities, so as to handle infection matching in different scenarios.
Drawings
Fig. 1 is a schematic flow diagram of the present invention.
Fig. 2 is a schematic diagram of a graph.
Fig. 3 is a schematic diagram of the maximal cliques in Fig. 2.
Fig. 4 is a schematic diagram of a contact probability graph.
Fig. 5 is a schematic diagram of a track containing only position information.
Fig. 6 is a schematic diagram of a track containing time and position information.
Fig. 7 is a schematic diagram of the track of Fig. 5 mapped onto a two-dimensional plane.
Fig. 8 is a schematic diagram of a depth-first traversal of an undirected graph.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in Fig. 1, a method for detecting suspected epidemic infected persons based on a gamma-suspected infected community model includes the following steps:
Step 1, crawling travel track information of infected persons from the network.
The travel track information of an infected person on the network comprises the person's unique ID and track data. The track data comprises location information and time information and is stored in the form Trace(ID, Location, Time), where ID is the person's unique ID, Location is the location information, and Time is the time information.
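The Trace(ID, Location, Time) record above could be represented as follows. This is a minimal sketch, not the patent's implementation: the field names, the use of a (latitude, longitude) pair for Location, and the helper `group_by_person` are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Trace:
    """One track point in the Trace(ID, Location, Time) form."""
    person_id: str
    location: Tuple[float, float]  # assumed to be (latitude, longitude)
    time: datetime


def group_by_person(points):
    """Group track points by person ID and sort each track by time."""
    tracks = defaultdict(list)
    for point in points:
        tracks[point.person_id].append(point)
    for track in tracks.values():
        track.sort(key=lambda p: p.time)
    return dict(tracks)
```

Grouping by ID yields one time-ordered track per person, which is the unit compared in the pairwise matching of Step 2.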
Step 2, analyzing the travel track data of infected persons, matching the travel track data of persons in pairs to obtain the contact possibility value of any two persons, and storing the values in a matching value file. The specific method is as follows:
selecting any two track files and reading the track information in them;
judging the time matching degree: judging whether the two tracks were generated on the same day and within a set time difference, that is, whether the intervals between the first and last times of the two track files intersect; if not, selecting another pair of track files and judging their time matching degree; if so, proceeding to judge the geographic position matching degree;
judging the geographic position matching degree: matching the locations and judging whether the two tracks lie in the same area within a distance tolerance; if not, selecting another pair of track files and judging their time matching degree; if so, calculating the contact possibility value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact possibility value. The matching value of the two tracks is calculated as follows:
setting the area range of a segment of one track, and taking as the matching value the ratio of the number of points of the other track's segment that fall within that area range during the matching time period to the total number of points of the other track's segment.
Every pair of track files is matched in this way and its contact possibility value is calculated.
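The matching procedure of Step 2 can be sketched roughly as below. Several choices here are assumptions rather than details from the source: each track is a time-sorted list of `(time, x, y)` tuples, the "area range" of a segment is an axis-aligned bounding box widened by a tolerance `d`, segments are fixed-size slices, and the same-day and time-difference checks are reduced to a head/tail interval intersection. All function names are hypothetical.

```python
from datetime import datetime


def time_overlap(track_a, track_b):
    """Return the shared (start, end) window of two tracks, or None.

    Only head and tail times are compared, so pairs that cannot match
    in time are pruned before any spatial work is done.
    """
    start = max(track_a[0][0], track_b[0][0])
    end = min(track_a[-1][0], track_b[-1][0])
    return (start, end) if start <= end else None


def bounding_region(points, d):
    """Axis-aligned bounding box of the points, widened by tolerance d."""
    xs = [x for _, x, _ in points]
    ys = [y for _, _, y in points]
    return min(xs) - d, max(xs) + d, min(ys) - d, max(ys) + d


def match_value(segment_a, track_b, window, d):
    """Share of B's points inside the window that fall in A's region."""
    start, end = window
    in_window = [p for p in track_b if start <= p[0] <= end]
    if not in_window:
        return 0.0
    x_min, x_max, y_min, y_max = bounding_region(segment_a, d)
    hits = sum(1 for _, x, y in in_window
               if x_min <= x <= x_max and y_min <= y <= y_max)
    return hits / len(in_window)


def contact_value(track_a, track_b, d, segment_len=2):
    """Maximum segment matching value, or 0.0 when the times never overlap."""
    window = time_overlap(track_a, track_b)
    if window is None:
        return 0.0  # pruned: no spatial matching is attempted
    start, end = window
    a_in_window = [p for p in track_a if start <= p[0] <= end]
    segments = [a_in_window[i:i + segment_len]
                for i in range(0, len(a_in_window), segment_len)]
    return max((match_value(seg, track_b, window, d) for seg in segments),
               default=0.0)
```

Checking time first keeps the more expensive region test off pairs that cannot have met, which is the pruning the description relies on.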
Step 3, reading the data in the matching value file and generating a contact probability graph, wherein each person's ID is a node of the contact probability graph and the contact possibility value of two persons is the weight of the edge connecting their two nodes.
In the present invention, a graph is represented as G = (V, E), where V is the set of all nodes in the graph and E is the set of all edges; n and m denote the number of nodes and edges in G respectively, i.e. n = |V| and m = |E|. For any two nodes u, v in graph G, the pair (u, v) represents an edge connecting node u and node v. For each node v in G, let N_G(v) denote the neighbors of v, i.e. N_G(v) = {u | (v, u) ∈ E, u, v ∈ V}. All neighbors of v together make up the adjacency list of v, denoted Γ(v). The size of Γ(v) is called the degree of node v.
A contact probability graph is given as G' = (V, E, γ), where V denotes the set of nodes in G', E denotes the set of edges in G', and γ denotes the uncertainty in the graph, referring to the weights of the edges. In the contact probability graph G' = (V, E, γ), the presence or absence of different edges are mutually independent events, and the probability values on the edges do not affect each other. For a node u in graph G', L(u) denotes all neighbor nodes of u. The set C stores the nodes of the clique found so far and represents the current γ-clique; for a node subset C of the graph, max(C) denotes the node with the largest number in C, and L(C) denotes all neighbor nodes of C. For a set C in graph G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E| denote the number of nodes and the number of edges of graph G' respectively.
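Building the contact probability graph from the matching value file might look like the sketch below. The source does not specify the file layout, so the `id_a,id_b,value` comma-separated rows, the function name `load_contact_graph`, and the choice to drop pairs with a value of zero are all assumptions.

```python
import csv
from collections import defaultdict


def load_contact_graph(lines):
    """Build an undirected weighted graph {node: {neighbor: weight}}.

    `lines` is any iterable of text rows in the assumed form
    "id_a,id_b,value", for example an open matching value file.
    """
    graph = defaultdict(dict)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        id_a, id_b, value = row
        weight = float(value)
        if weight <= 0.0:
            continue  # assumed: no contact means no edge
        graph[id_a][id_b] = weight
        graph[id_b][id_a] = weight
    return dict(graph)
```

With this layout, `len(graph[v])` is the degree of node v and `graph[v]` plays the role of its adjacency list Γ(v).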
Definition 1 (clique): in graph G = (V, E), a node subset C ⊆ V is a clique if every two nodes in C are connected by an edge.
Definition 2 (maximal clique): in graph G = (V, E), a node subset M ⊆ V is a maximal clique if (1) M is a clique; and (2) there is no node v ∈ V \ M such that M ∪ {v} is a clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for an arbitrary clique C, clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-clique if (1) C is a clique; and (2) clique(C, G') ≥ γ.
Definition 5 (γ-suspected infected community): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-suspected infected community in G' if (1) C is a γ-clique; and (2) there is no node v ∈ V \ C such that C ∪ {v} is a γ-clique.
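Definitions 1 and 3 to 5 translate almost directly into brute-force checks. The sketch below is meant only to make the definitions concrete; the function names and the `{node: {neighbor: weight}}` graph layout are assumptions, and the checks are not an efficient search.

```python
from itertools import combinations
from math import prod


def is_clique(graph, nodes):
    """Definition 1: every two nodes are connected by an edge."""
    return all(v in graph[u] for u, v in combinations(nodes, 2))


def clique_probability(graph, nodes):
    """Definition 3: product of all edge weights inside the clique.

    The empty product is 1, matching clique(empty set, G') = 1.
    """
    return prod(graph[u][v] for u, v in combinations(nodes, 2))


def is_gamma_clique(graph, nodes, gamma):
    """Definition 4: a clique whose clique probability is at least gamma."""
    return is_clique(graph, nodes) and clique_probability(graph, nodes) >= gamma


def is_gamma_community(graph, nodes, gamma):
    """Definition 5: a gamma-clique that no outside node can extend."""
    if not is_gamma_clique(graph, nodes, gamma):
        return False
    outside = set(graph) - set(nodes)
    return not any(is_gamma_clique(graph, list(nodes) + [v], gamma)
                   for v in outside)
```

Because every weight is at most 1, adding a node can never raise the clique probability, which is why a larger γ tends to produce smaller communities.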
Step 4, performing a maximal clique search on the contact probability graph: first finding the maximal cliques in the contact probability graph, then analyzing each of the maximal cliques found individually to find the γ-suspected infected communities on all the maximal cliques. The specific method is as follows.
The specific method for finding the maximal cliques is as follows:
constructing three sets R, P and X to store the state of the enumeration. R is the result set and represents the clique found so far; its nodes are pairwise connected but do not necessarily form a maximal clique. The nodes in P and X are the common neighbors of the nodes in R, so any of them could be added to R to extend the clique. X is the excluded set, containing nodes that have already been added to R in an earlier branch; P is the candidate set, containing nodes that have not yet been added to R;
at each step a node v is selected from P and added to R, and P and X are updated at the same time so that they still contain only common neighbors of the new R. Specifically, P and X are each intersected with the adjacency list of v and the procedure is called recursively; after the recursive call returns, v is moved from the candidate set P to the excluded set X. When P and X are both empty, R is output as a maximal clique.
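The R, P, X procedure described above corresponds to the classic Bron-Kerbosch enumeration without pivoting. A minimal sketch follows; the function name and the `{node: set_of_neighbors}` graph layout are assumptions made for illustration.

```python
def bron_kerbosch(graph, r=None, p=None, x=None):
    """Yield every maximal clique of an undirected graph.

    graph maps each node to the set of its neighbors.
    r: clique found so far, p: candidate nodes, x: excluded nodes.
    """
    if r is None:
        r, p, x = set(), set(graph), set()
    if not p and not x:
        yield set(r)  # nothing can extend r, so it is maximal
        return
    for v in list(p):
        # Keep only common neighbors of the extended clique r | {v}.
        yield from bron_kerbosch(graph, r | {v}, p & graph[v], x & graph[v])
        p.remove(v)  # v has been fully explored:
        x.add(v)     # move it from the candidate set to the excluded set
```

The excluded set X is what prevents the same clique from being reported again from a different starting node: a branch whose P is empty but whose X is not is a clique that has already been covered by a larger one.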
The specific method for finding the γ-suspected infected communities on all the maximal cliques is as follows:
two adjacent nodes in the maximal clique whose connecting edge has a weight greater than γ are selected and added to a set C, which represents the γ-clique found so far. γ is a threshold representing the minimum allowed value of the product of all edge weights in the clique. When the γ-clique C is to be extended, only those common neighbors of C whose number is greater than max(C) are considered. Each time a new node is added to C, clique(C, G') can only stay the same or decrease; if clique(C, G') < γ after the addition, the set C is no longer a γ-clique.
Therefore, before a new node is added to the set C, the product of all edge weights that C would have after the addition is computed, and the node is added only if this product is greater than or equal to γ.
New nodes are selected with an algorithm based on depth-first search (DFS), as shown in Fig. 8. Depth-first traversal starts from an unvisited node and walks along the edges of the current node to an unvisited node; when no unvisited neighbor remains, it returns to the previous node and continues to probe other nodes until all nodes have been visited. The nodes of the contact probability graph are processed in ascending order of node number, and all γ-suspected infected communities in the graph are enumerated efficiently by maintaining two search sets A and B. Set A stores data pairs (u, r), where u is a node number with u > max(C) and r is the factor by which clique(C, G') is multiplied when u is added to C; every pair in A guarantees that the product of all edge weights is still greater than or equal to the threshold γ after u is added. The set C stores the nodes of the clique found so far and is a subset of the contact probability graph, and max(C) denotes the node with the largest number in C. Set B also stores data pairs (u, r) with the same meaning as in A, but all of its nodes have already been processed. When A is empty the expansion of C ends, but this alone does not prove that C is a γ-suspected infected community, because a node in B may still be able to extend C; C is a γ-suspected infected community only when both A and B are empty. Finally, it is judged whether all nodes in the maximal clique have been selected, that is, whether all nodes in the maximal clique have been traversed; once they have, the set C obtained is a γ-suspected infected community.
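The enumeration with sets A and B can be sketched as a recursive depth-first search. This is an illustrative reading of the description, not the patent's own listing: the names `gamma_communities`, `narrow` and `expand` are hypothetical, node IDs are assumed to be comparable (for example integers), and the graph is assumed to be `{node: {neighbor: weight}}`. The source applies the search to each maximal clique; the sketch accepts any weighted graph, so it can be run on the subgraph induced by one maximal clique.

```python
def gamma_communities(graph, gamma):
    """Yield every gamma-clique that no further node can extend."""

    def narrow(pairs, q, m, larger_only):
        # Keep the (u, r) pairs that can still join after node m was added;
        # q is the clique probability of the clique that already contains m.
        kept = []
        for u, r in pairs:
            if larger_only and u <= m:
                continue
            w = graph[m].get(u)
            if w is not None and q * r * w >= gamma:
                kept.append((u, r * w))
        return kept

    def expand(c, q, a, b):
        # c: current gamma-clique, q: its clique probability,
        # a: candidates (u, r) with u > max(c), b: processed nodes (u, r).
        if not a and not b:
            yield list(c)
            return
        b = list(b)
        for u, r in sorted(a):  # ascending order of node number
            q2 = q * r
            yield from expand(c + [u], q2,
                              narrow(a, q2, u, True),
                              narrow(b, q2, u, False))
            b.append((u, r))  # u is now a processed node

    yield from expand([], 1.0, [(u, 1.0) for u in graph], [])
```

Storing the factor r next to each candidate means the clique probability after an addition is a single multiplication, rather than a recomputation over all edges of C.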
If a patient is detected in a γ-suspected infected community, the other persons in that community have a high probability of infection. Different thresholds γ can be set to generate γ-suspected infected communities under different probabilities, so as to handle infection matching in different scenarios. For example, for contact in vehicles a small threshold can be used to judge an encounter, whereas for ordinary track data the threshold needs to be set larger than for vehicles.
Example 1: this embodiment provides a method for detecting suspected epidemic infected persons based on a gamma-suspected infected community model. First, the contact degree of every pair of tracks is computed over all track data; then a maximal clique analysis is performed on the graph built from the contact degree data. The search finds the groups of suspected epidemic infected persons, which completes the search for suspected infected persons.
1. Model design
In the present invention, a graph is represented as G = (V, E), where V is the set of all nodes in the graph and E is the set of all edges connecting the nodes in G; n and m denote the number of nodes and edges in G respectively, i.e. n = |V| and m = |E|. For any two nodes u, v in graph G, the pair (u, v) represents an edge connecting node u and node v. For any node v, all nodes connected to it are called its neighbors, and together they form the adjacency list of v, generally denoted Γ(v). The size of Γ(v) is called the degree of node v. For each node v in G, let N_G(v) denote the neighbors of v, i.e. N_G(v) = {u | (v, u) ∈ E, u, v ∈ V}.
A contact probability graph is given as G' = (V, E, γ), where V denotes the set of nodes in G', E denotes the set of edges in G', and γ denotes the uncertainty in the graph, referring to the weights of the edges. In the contact probability graph G' = (V, E, γ), the presence or absence of different edges are mutually independent events, and the probability values on the edges do not affect each other. For a node u in graph G', L(u) denotes all neighbor nodes of u. For a node subset C of the graph, max(C) denotes the node with the largest number in C, and L(C) denotes all neighbor nodes of C. For a clique C in graph G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E| denote the number of nodes and the number of edges of graph G' respectively.
Definition 1 (clique): in graph G = (V, E), a node subset C ⊆ V is a clique if every two nodes in C are connected by an edge.
Definition 2 (maximal clique): in graph G = (V, E), a node subset M ⊆ V is a maximal clique if (1) M is a clique; and (2) there is no node v ∈ V \ M such that M ∪ {v} is a clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for an arbitrary clique C, clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-clique if (1) C is a clique; and (2) clique(C, G') ≥ γ.
Definition 5 (γ-suspected infected community): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-suspected infected community in G' if (1) C is a γ-clique; and (2) there is no node v ∈ V \ C such that C ∪ {v} is a γ-clique.
As shown in Fig. 2, the maximal cliques defined on the graph are denoted C_1 = {1, 2, 3}, C_2 = {3, 5, 6}, C_3 = {8, 9}, C_4 = {1, 4}, C_5 = {1, 5}, C_6 = {5, 10}, C_7 = {9, 10} and C_8 = {6, 7}.
2. Gamma-suspected infection community searching method on contact probability graph
2.1 Gamma-suspected infected Community search Algorithm on contact probability map
Based on the properties of the graph, the invention designs a search algorithm for computing the matching degree of two tracks.
2.1.1 Algorithm 1
[Figure: Algorithm 1 listing (image not reproduced)]
Algorithm 1 compares the place and time data contained in the track data. Each track is mapped onto a two-dimensional plane, and the algorithm judges whether the geographic range covered by the track segment being compared, within the set unit of time, lies within the geographic range of the reference data widened by a certain difference, i.e. within the allowable error range. With track data Trace = (ID, Date, Time, Location), two tracks are considered to match when their time error is within a set range and the ranges indicated by their locations lie within the error distance; the degree of overlap of the two track regions is used as the index of the degree of contact between the two persons. For two tracks Location_A and Location_B generated by different persons, where the extremum region shown shaded lies within the region delimited by the error d, the track points of Location_B are matched against the shaded region, and the number of matching points serves as the reference for the contact degree index. If the pair has several such contact degrees, their maximum is taken as the contact degree between the two persons, i.e. the weight of the edge between the two nodes in the graph, as shown in Fig. 7.
2.1.2 Algorithm 2
[Figure: Algorithm 2 listing (image not reproduced)]
The γ-suspected infected community search algorithm (Maximal Uncertain Clique Enumeration) takes the maximal cliques found on the probability graph as its basis and traverses the nodes in each maximal clique, adding them in turn to a set C that represents the γ-clique found so far. When the γ-clique C is to be extended, only those common neighbors of C whose number is greater than max(C) are considered. A node may be added to C if the product of all edge weights in C after the addition is still greater than or equal to the threshold γ. Each time a new node is added to C the clique probability of C decreases and may fall below γ, in which case the set C would no longer be a γ-clique. When no more nodes can be added to C, a γ-suspected infected community has been found.
Example 2: as shown in Fig. 4, given γ = 0.1, the γ-suspected infected communities in the contact probability graph G' are the node sets C_1 = {1, 2, 3}, C_2 = {2, 5}, C_3 = {3, 4}, C_4 = {3, 5}, C_5 = {4, 5}, C_6 = {5, 6} and C_7 = {6, 7, 8, 9}, a total of 7 γ-suspected infected communities.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art may be incorporated into the invention without departing from the spirit and scope of the inventive concept, and the scope of protection is defined by the appended claims.

Claims (7)

1. A suspected epidemic infected person detection method based on a gamma-suspected infected community model is characterized by comprising the following specific steps:
step 1, crawling travel trajectory information of infected persons from the network;
step 2, analyzing the travel trajectory data of the infected persons, matching the travel trajectory data of the persons in pairs to obtain a contact possibility value for any two persons, and storing the values in a matching value file;
step 3, reading the data in the matching value file and generating a contact probability graph, wherein the person IDs are used as the nodes of the contact probability graph, and the contact possibility value of two persons is used as the weight of the edge connecting the two corresponding nodes;
step 4, performing a maximal clique search on the contact probability graph to find the maximal cliques in the contact probability graph, and analyzing each of the found maximal cliques individually to find the gamma-suspected infected communities on all the maximal cliques.
2. The suspected epidemic infected person detection method based on the gamma-suspected infected community model according to claim 1, wherein the travel trajectory information of the infected person on the network comprises a unique identifier (ID) of the person and trajectory data, and the trajectory data comprises location information and time information.
3. The method for detecting suspected epidemic infected persons based on the gamma-suspected infected community model as claimed in claim 1, wherein the specific method for analyzing the travel trajectory data of infected persons and matching the travel trajectory data of the persons in pairs to obtain a contact possibility value for any two persons comprises:
selecting any two track files, and reading track information in the files;
judging the time matching degree: judging whether the two track data were generated on the same day and within a set time difference, namely whether the time spans between the first and last timestamps of the two track files intersect; if not, reselecting two track files and judging the time matching degree again; if so, proceeding to judge the geographic position matching degree;
judging the geographic position matching degree: matching the positions of the places and judging whether the two track data fall in the same area within a distance tolerance range; if not, reselecting two track files and judging the time matching degree again; if so, calculating a contact possibility value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact possibility value.
4. The suspected epidemic infected person detection method based on the gamma-suspected infected community model according to claim 1, wherein the matching value of the two tracks is calculated by:
setting the area range of one of the track segments, and taking, as the matching value, the ratio of the number of track points of the other track segment that fall within that area range during the matching time period to the total number of track points of the other track segment.
5. The method for detecting suspected epidemic infected persons based on the gamma-suspected infected community model as claimed in claim 1, wherein the specific method for searching for the maximal cliques in the contact probability graph is as follows:
constructing three sets R, P and X for storing the state of the enumeration process, wherein the R set is the result set and represents the clique found so far, the nodes in R being all connected to one another but not necessarily forming a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, and points in the P set and the X set can be added to the R set so as to expand the clique structure of the R set; the X set is the forbidden set, whose nodes are all points that have previously been added to the R set; the P set is the candidate set, whose nodes are points that have not yet been added to the R set;
each time, selecting a point v from the P set and adding it to the R set, and updating the P set and the X set at the same time so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set respectively with the adjacency list of the point v and making a recursive call; after the recursive call returns, moving the point v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when the P set and the X set are both empty.
6. The method for detecting suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the specific method for finding the gamma-suspected infected community on the maximal clique is as follows:
selecting two adjacent nodes in the maximal clique and adding them to a set C, the set C representing the gamma-clique found so far;
on condition that the value of the edge between the two selected nodes is greater than gamma, selecting a new node to add to the set C such that, after the new node is added, the product of all edge values in the set C is greater than or equal to gamma;
traversing all nodes on the maximal clique, the finally obtained set C being the gamma-suspected infected community on the maximal clique.
7. The suspected epidemic infected person detection method based on the gamma-suspected infected community model according to claim 6, wherein the specific method for selecting the new node is as follows:
taking an unvisited node as the starting node, walking along the edges of the current node to an unvisited node, returning to the previous node when no unvisited node remains, and continuing to probe other nodes until all nodes have been visited.
CN202110813384.9A 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model Active CN113643824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110813384.9A CN113643824B (en) 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110813384.9A CN113643824B (en) 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Publications (2)

Publication Number Publication Date
CN113643824A 2021-11-12
CN113643824B 2024-03-26

Family

ID=78417714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110813384.9A Active CN113643824B (en) 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Country Status (1)

Country Link
CN (1) CN113643824B (en)


Also Published As

Publication number Publication date
CN113643824B (en) 2024-03-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yuan Long, Fan Zhengqing, Chen Zi

Inventor before: Fan Zhengqing, Yuan Long, Chen Zi

GR01 Patent grant