CN113643824B - Suspected epidemic infection personnel searching method based on gamma-suspected infection community model - Google Patents
- Publication number
- CN113643824B (application CN202110813384.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Abstract
The invention discloses a method for searching for persons suspected of epidemic infection based on a gamma-suspected infection community model, comprising the following steps: crawling travel track information of infected persons from the network; analyzing the travel track data and matching the tracks pairwise to obtain a contact probability value for every pair of persons, which is stored in a matching value file; reading the matching value file and generating a contact probability graph, in which each person ID is a node and the contact probability value of two persons is the weight of the edge connecting their nodes; searching the contact probability graph for maximal cliques, and analyzing each maximal clique found independently to find the gamma-suspected infection communities on it. If a patient is detected in a gamma-suspected infection community, the other persons in that community have a high likelihood of infection. The invention can identify persons who may be carrying the virus and greatly reduce the workload of medical personnel.
Description
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a suspected epidemic infection personnel searching method based on a gamma-suspected infection community model.
Background
One of the keys to preventing and controlling epidemics such as COVID-19 is finding infected persons: the earlier they are found, the sooner the epidemic can be brought under control and the lower the pressure on downstream medical treatment. Actively carrying out testing and epidemiological investigation to uncover potential asymptomatic carriers and suspected cases therefore plays an important role in controlling sources of infection, reducing or eliminating the spread of pathogens, and preventing the disease from spreading through the population.
However, finding persons infected with COVID-19 is difficult. The virus is highly infectious: coughing, sneezing or talking can infect people standing nearby. The disease also does not develop immediately after infection; there is a 14-day incubation period during which the infected person is still infectious, which makes infected persons harder to find. Existing algorithms are limited in that they do not integrate time and location information well, and for COVID-19 a match is meaningless unless time and geographic location match simultaneously.
Disclosure of Invention
The invention aims to provide a method for searching for persons suspected of epidemic infection based on a gamma-suspected infection community model, with which all communities with a high likelihood of infection can be found.
The technical scheme for realizing the purpose of the invention is as follows: a suspected epidemic infection personnel searching method based on a gamma-suspected infection community model comprises the following specific steps:
step 1, crawling travel track information of infected persons on a network;
step 2, analyzing the travel track data of infected persons, matching the travel tracks of the persons pairwise to obtain a contact probability value for every pair of persons, and storing the contact probability values in a matching value file;
step 3, reading the data in the matching value file and generating a contact probability graph, wherein each person ID is a node of the contact probability graph and the contact probability value of two persons is the weight of the edge connecting their two nodes;
step 4, searching the contact probability graph for maximal cliques, and analyzing each maximal clique found independently to find the gamma-suspected infection communities on all maximal cliques.
Preferably, the travel track information of an infected person on the network comprises the person's unique identification ID and track data, and the track data comprises location information and time information.
Preferably, the travel track data of infected persons are analyzed and matched pairwise as follows:
selecting any two track files and reading the track information in them;
judging the degree of time matching: judging whether the two tracks belong to the same day and lie within a set time difference of each other, i.e., whether the intervals between the first and last timestamps of the two track files intersect; if not, selecting another two track files and judging the time matching again; if so, proceeding to the geographic position matching;
judging the degree of geographic position matching: matching the locations and judging whether the two tracks fall in the same area within the distance tolerance; if not, selecting another two track files and returning to the time matching; if so, calculating the contact probability value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact probability value.
Preferably, the matching value of two tracks is calculated as follows:
setting the region range of a segment of one track, and taking as the matching value the ratio of the number of track points of the other track that fall within this region range to the total number of track points of the other track within the matching time period.
Preferably, the maximal cliques in the contact probability graph are searched for as follows:
constructing three sets R, P and X to hold the state during enumeration, wherein the R set is the result set and represents the clique currently found; the nodes in R are all mutually connected but do not necessarily form a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so any point in P or X could be added to R to extend its clique structure; the X set is the forbidden set, whose nodes have all been added to the R set before; the P set is the candidate set, whose points have not yet been added to the R set;
selecting one point v from the P set at a time and adding it to the R set, and at the same time updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set respectively with the adjacency list of v and making the recursive call; after the recursive call returns, moving v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when both the P set and the X set are empty.
Preferably, the gamma-suspected infection community on a maximal clique is found as follows:
selecting two adjacent nodes in the maximal clique and adding them to a set C, wherein the set C represents the gamma-clique currently found;
provided that the weight of the edge between the two selected nodes is greater than gamma, selecting a new node to add to the set C such that, after the new node is added, the product of the weights of all edges in C is still greater than or equal to gamma;
traversing all nodes of the maximal clique, the final set C being a gamma-suspected infection community on the maximal clique.
Preferably, the new node is selected as follows:
taking an unvisited node as the starting node, walking along an edge of the current node to an unvisited node, returning to the previous node when no unvisited neighbor remains, and continuing to probe other nodes until all nodes have been visited.
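The traversal just described is an ordinary depth-first search. A minimal Python sketch follows; the function name `dfs_order`, the dictionary-of-sets graph representation and the rule of visiting neighbors in ascending order are assumptions made for illustration and are not specified in the source.

```python
def dfs_order(adjacency, start):
    """Return the nodes in the order a depth-first traversal first visits them."""
    order = []
    seen = set()

    def visit(node):
        seen.add(node)
        order.append(node)
        # Walk along an edge of the current node to an unvisited neighbor.
        for neighbor in sorted(adjacency[node]):
            if neighbor not in seen:
                visit(neighbor)
        # Returning from visit() is the "return to the previous node" step.

    visit(start)
    return order


# Hypothetical 4-node undirected graph.
graph = {1: {2, 3}, 2: {1, 4}, 3: {1}, 4: {2}}
print(dfs_order(graph, 1))  # [1, 2, 4, 3]
```

The recursion stack plays the role of the path back to the previous node, so no explicit backtracking bookkeeping is needed.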
Compared with the prior art, the invention has the following notable advantages: (1) real-world track data are matched and the match value is used as a measure of the likelihood of infectious contact between persons, which removes the limitation of relying on directly questioning isolated persons, allows strangers who were in close range of a patient to be found, expresses the degree of contact between persons directly as a number, and shortens the time needed to find infected persons; (2) the track matching uses pruning: if the two tracks under consideration cannot be matched in time, they are not matched in space, which saves the computation and time of geographic position matching; (3) the invention captures a patient's movement track before isolation, so the places the patient visited are known, and a spatio-temporal range query finds the people who were in close contact with the patient during a given period, which saves manpower, material and financial resources and reduces the workload of medical staff; (4) different thresholds gamma can be set to generate gamma-suspected infection communities at different probabilities, so that infection matching in different scenarios can be handled.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic diagram of a graph.
FIG. 3 is a schematic diagram of the maximal cliques of the graph in FIG. 2.
FIG. 4 is a schematic diagram of a contact probability graph.
FIG. 5 is a schematic diagram of a track containing only position information.
FIG. 6 is a schematic diagram of a track containing time and position information.
FIG. 7 is a schematic diagram of the track of FIG. 5 mapped onto a two-dimensional plane.
FIG. 8 is a schematic diagram of a depth-first traversal of an undirected graph.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments described below may be combined with each other as long as they do not conflict.
As shown in fig. 1, a method for searching suspected epidemic infection personnel based on a gamma-suspected infection community model specifically comprises the following steps:
step 1, crawling travel track information of infected persons on a network;
the travel track information of an infected person on the network comprises the person's unique identification (ID) and track data; the track data comprises location information and time information and is stored in the form Track(ID, Location, Time), where ID is the person's unique identifier, Location is the location information and Time is the time information.
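As an illustration of the Track(ID, Location, Time) record, the following Python sketch parses one track point. The comma-separated line layout, the latitude/longitude pair, the timestamp format and the names `TrackPoint` and `parse_track_line` are all assumptions for this example; the source does not specify a file format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrackPoint:
    person_id: str      # ID: unique identifier of the person
    location: tuple     # Location: assumed here to be (latitude, longitude)
    time: datetime      # Time: timestamp of the observation


def parse_track_line(line):
    """Parse a line of the assumed form 'ID,lat,lon,YYYY-mm-dd HH:MM:SS'."""
    person_id, lat, lon, stamp = line.strip().split(",")
    return TrackPoint(
        person_id,
        (float(lat), float(lon)),
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
    )


# Hypothetical sample record.
point = parse_track_line("p001,32.0603,118.7969,2021-07-19 08:30:00")
print(point.person_id, point.location, point.time.date())
```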
Step 2, analyzing the travel track data of infected persons, matching the travel tracks of the persons pairwise to obtain a contact probability value for every pair of persons, and storing the contact probability values in a matching value file, specifically:
selecting any two track files and reading the track information in them;
judging the degree of time matching: judging whether the two tracks belong to the same day and lie within a set time difference of each other, i.e., whether the intervals between the first and last timestamps of the two track files intersect; if not, selecting another two track files and judging the time matching again; if so, proceeding to the geographic position matching;
judging the degree of geographic position matching: matching the locations and judging whether the two tracks fall in the same area within the distance tolerance; if not, selecting another two track files and returning to the time matching; if so, calculating the contact probability value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact probability value. The matching value of two tracks is calculated as follows:
setting the region range of a segment of one track, and taking as the matching value the ratio of the number of track points of the other track that fall within this region range to the total number of track points of the other track within the matching time period.
All track files are matched pairwise in this way, and the contact probability value of each pair is calculated.
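The matching procedure of step 2 can be sketched in Python as follows. This is only one possible reading under stated assumptions: track points are `(t, x, y)` tuples sorted by time with `t` in seconds within a single day and planar coordinates; the region of a segment is taken to be its bounding box enlarged by the distance tolerance; segments are fixed-size runs of points; and the "matching time period" is taken to be the overlap of the two tracks' time intervals. None of these choices, nor the function names, come from the source.

```python
def time_overlap(track_a, track_b):
    """Return the shared (start, end) time window of two tracks, or None."""
    start = max(track_a[0][0], track_b[0][0])
    end = min(track_a[-1][0], track_b[-1][0])
    return (start, end) if start <= end else None


def region_of(segment, tolerance):
    """Bounding box of a track segment, enlarged by the distance tolerance."""
    xs = [p[1] for p in segment]
    ys = [p[2] for p in segment]
    return (min(xs) - tolerance, max(xs) + tolerance,
            min(ys) - tolerance, max(ys) + tolerance)


def matching_value(segment_a, track_b, window, tolerance):
    """Share of B's points in the time window that fall inside A's region."""
    x_lo, x_hi, y_lo, y_hi = region_of(segment_a, tolerance)
    in_window = [p for p in track_b if window[0] <= p[0] <= window[1]]
    if not in_window:
        return 0.0
    inside = [p for p in in_window
              if x_lo <= p[1] <= x_hi and y_lo <= p[2] <= y_hi]
    return len(inside) / len(in_window)


def contact_probability(track_a, track_b, tolerance, segment_size=3):
    """Maximum segment-wise matching value; 0.0 if the tracks never overlap in time."""
    window = time_overlap(track_a, track_b)
    if window is None:
        return 0.0  # pruning: no time overlap, so no spatial matching is done
    a_pts = [p for p in track_a if window[0] <= p[0] <= window[1]]
    segments = [a_pts[i:i + segment_size]
                for i in range(0, len(a_pts), segment_size)]
    return max((matching_value(s, track_b, window, tolerance) for s in segments),
               default=0.0)


# Hypothetical tracks: A and B are close for two of B's three points in the window.
track_a = [(0, 0.0, 0.0), (10, 1.0, 0.0), (20, 2.0, 0.0), (30, 3.0, 0.0)]
track_b = [(10, 1.0, 0.5), (20, 2.0, 0.5), (30, 9.0, 9.0), (40, 9.0, 9.0)]
track_c = [(100, 0.0, 0.0), (110, 0.0, 0.0)]
print(contact_probability(track_a, track_b, tolerance=1.0))
print(contact_probability(track_a, track_c, tolerance=1.0))
```

Checking the time window first mirrors the pruning described in the text: pairs that cannot match in time never reach the more expensive spatial comparison.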
Step 3, reading the data in the matching value file and generating a contact probability graph, wherein each person ID is a node of the contact probability graph and the contact probability value of two persons is the weight of the edge connecting their two nodes;
in the present invention, a graph is denoted G(V, E), where V is the set of all nodes in the graph and E is the set of all edges; n and m denote the number of nodes and the number of edges in G respectively, i.e., n = |V|, m = |E|. For any two points u, v in G, the pair (u, v) denotes the edge connecting u and v. For each node v ∈ V, N_G(v) denotes the neighbors of v, i.e., N_G(v) = {u | (v, u) ∈ E, u ∈ V}. All neighbors of v together form the adjacency list of v, denoted Γ(v). The size of Γ(v) is called the degree of v.
For a given contact probability graph G' = (V, E, γ), V is the set of nodes in G', E is the set of edges in G', and γ expresses the uncertainty in the graph and refers to the weights of its edges. In G', the presence or absence of different edges are independent events, and the probability values on the edges do not affect each other. For a node u in G', L(u) denotes all neighbor nodes of u. The set C stores the nodes of the clique currently found and represents the current γ-clique; for a node subset C, max(C) denotes the highest-numbered node in C and L(C) denotes all neighbor nodes of C. For a clique C in G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E| denote the number of nodes and the number of edges of G' respectively.
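A contact probability graph of this kind can be held as an adjacency list plus an edge-weight table. The sketch below assumes a matching value file with one whitespace-separated `id_a id_b value` triple per line and drops pairs with value 0; both the file layout and the name `load_contact_graph` are assumptions, since the source does not describe the file format.

```python
import io
from collections import defaultdict


def load_contact_graph(lines):
    """Build (adjacency, weights) from lines of the assumed form 'id_a id_b value'."""
    adjacency = defaultdict(set)   # node -> set of neighbors, i.e. the adjacency list
    weights = {}                   # frozenset({u, v}) -> contact probability value
    for line in lines:
        if not line.strip():
            continue
        u, v, value = line.split()
        probability = float(value)
        if probability <= 0.0:
            continue               # no contact: no edge between u and v
        adjacency[u].add(v)
        adjacency[v].add(u)
        weights[frozenset((u, v))] = probability
    return dict(adjacency), weights


# Hypothetical matching value file content.
sample = io.StringIO("1 2 0.9\n1 3 0.8\n2 3 0.5\n3 4 0.0\n")
adjacency, weights = load_contact_graph(sample)
print(adjacency["1"], len(adjacency["3"]))  # neighbors of node 1, degree of node 3
```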
Definition 1 (clique): in a graph G = (V, E), a node subset C ⊆ V is a clique if every pair of nodes in C is connected by an edge.
Definition 2 (maximal clique): for a node subset M ⊆ V in a graph G = (V, E), if (1) M satisfies the definition of a clique and (2) there is no node v ∈ V \ M such that M ∪ {v} is a clique, then M is a maximal clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for any clique C, the clique probability clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), if a node subset C ⊆ V satisfies (1) C is a clique and (2) clique(C, G') ≥ γ, then C is a γ-clique.
Definition 5 (γ-suspected infection community): in the contact probability graph G' = (V, E, γ), if a node subset C ⊆ V satisfies (1) C is a γ-clique and (2) there is no node v ∈ V \ C in G' such that C ∪ {v} is a γ-clique, then C is a γ-suspected infection community in G'.
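Definitions 3 and 4 translate directly into code. In the sketch below the edge weights are stored in a dictionary keyed by `frozenset({u, v})`; this representation and the function names are assumptions for illustration, and the three-node weights are made up.

```python
from itertools import combinations


def clique_probability(nodes, weights):
    """Product of the weights of all edges among `nodes` (1.0 for the empty set).

    Returns 0.0 if some pair is not connected, i.e. `nodes` is not a clique.
    """
    probability = 1.0
    for u, v in combinations(sorted(nodes), 2):
        edge = frozenset((u, v))
        if edge not in weights:
            return 0.0
        probability *= weights[edge]
    return probability


def is_gamma_clique(nodes, weights, gamma):
    """Definition 4: a clique whose clique probability is at least gamma."""
    return clique_probability(nodes, weights) >= gamma


# Hypothetical triangle: 0.9 * 0.8 * 0.5 = 0.36.
weights = {frozenset((1, 2)): 0.9, frozenset((1, 3)): 0.8, frozenset((2, 3)): 0.5}
print(clique_probability({1, 2, 3}, weights))
print(is_gamma_clique({1, 2, 3}, weights, 0.3), is_gamma_clique({1, 2, 3}, weights, 0.4))
```

With these weights the triangle is a γ-clique for γ = 0.3 but not for γ = 0.4, which shows how adding a node can push the product below the threshold.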
Step 4, searching the contact probability graph for maximal cliques: first finding the maximal cliques in the contact probability graph, and then analyzing each maximal clique found independently to find the γ-suspected infection communities on all maximal cliques, specifically:
the maximal cliques are found as follows:
constructing three sets R, P and X to hold the state during enumeration, wherein the R set is the result set and represents the clique currently found; the nodes in R are all mutually connected but do not necessarily form a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so any point in P or X could be added to R to extend its clique structure; the X set is the forbidden set, whose nodes have all been added to the R set before; the P set is the candidate set, whose points have not yet been added to the R set;
selecting one point v from the P set at a time and adding it to the R set, and at the same time updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set respectively with the adjacency list of v and making the recursive call; after the recursive call returns, moving v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when both the P set and the X set are empty.
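The R/P/X procedure described above corresponds to the classic Bron-Kerbosch enumeration without pivoting. A minimal Python sketch follows; the generator name `bron_kerbosch`, the dictionary-of-sets graph representation and the sample graph are assumptions for illustration.

```python
def bron_kerbosch(adjacency, r=None, p=None, x=None):
    """Yield every maximal clique of an undirected graph given as node -> neighbor set."""
    r = set() if r is None else r              # R: clique found so far
    p = set(adjacency) if p is None else p     # P: candidates not yet added to R
    x = set() if x is None else x              # X: forbidden set, already tried
    if not p and not x:
        yield set(r)                           # nothing can extend R: maximal clique
        return
    for v in list(p):
        # Keep only common neighbors of the new R = R + {v} in P and X.
        yield from bron_kerbosch(adjacency, r | {v},
                                 p & adjacency[v], x & adjacency[v])
        p.remove(v)                            # move v from the candidate set ...
        x.add(v)                               # ... to the forbidden set


# Hypothetical graph: a triangle 1-2-3 with a pendant edge 3-4.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = sorted(sorted(c) for c in bron_kerbosch(adjacency))
print(cliques)  # [[1, 2, 3], [3, 4]]
```

The X set is what prevents a clique such as {2, 3} from being reported after {1, 2, 3}: node 1 is still in X for that branch, so the branch ends without output.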
The γ-suspected infection communities on all maximal cliques are found as follows:
two adjacent nodes in the maximal clique are selected and added to a set C, which represents the γ-clique currently found; the weight of the edge between the two nodes must be greater than γ, where γ is a set threshold representing the minimum allowed value of the product of all edge weights in the clique. When the γ-clique C is to be expanded, only those common neighbor nodes of C whose number is greater than max(C) are added. However, each time a new node is added to C, the clique probability of C decreases, which may result in clique(C, G') < γ, in which case the set C is no longer a γ-clique.
Therefore, provided the weight of the edge between the two selected nodes is greater than γ, before a further node is added to the set C it is checked that the product of all edge weights after the node is added is still greater than or equal to γ.
New nodes are selected with a depth-first search (DFS) traversal, as shown in FIG. 8: starting from an unvisited node, the traversal walks along an edge of the current node to an unvisited node, returns to the previous node when no unvisited neighbor remains, and continues probing other nodes until all nodes have been visited. The nodes of the contact probability graph are processed in ascending order of node number, and the search maintains a set A and a set B so that all γ-suspected infection communities in the contact probability graph are enumerated efficiently. The set A stores data pairs (u, r), where u is a node number with u > max(C) and r is the factor by which the clique probability of C is multiplied when u is added to C; every pair in A guarantees that the product of all edge weights remains greater than or equal to the threshold γ after u is added. The set C stores the nodes of the clique currently found, a subset of the nodes of the contact probability graph, and max(C) denotes the highest-numbered node in C. The set B also stores pairs (u, r) with the same meaning as in A, but the nodes in B have all already been processed. When the set A is empty the current branch of the search ends, but this alone does not prove that C is a γ-suspected infection community, because a node in B may still be able to extend C; only when both A and B are empty is C a γ-suspected infection community. Finally, it is judged whether all nodes of the maximal clique have been traversed; when they have, the set C obtained is a γ-suspected infection community.
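The search with sets A and B can be sketched as a recursive enumeration in Python. This is a sketch in the spirit of the description, not the patented implementation: the function names, the `frozenset`-keyed weight table, the requirement that γ > 0, and the sample weights are assumptions. `nodes` would be the node set of one maximal clique (or of the whole graph).

```python
def gamma_communities(nodes, weights, gamma):
    """Enumerate maximal gamma-cliques among `nodes` (gamma must be > 0).

    `weights` maps frozenset({u, v}) to the contact probability of edge (u, v).
    """
    results = []

    def weight(u, v):
        return weights.get(frozenset((u, v)), 0.0)

    def extend(pairs, new_node, q, only_larger):
        """Update (u, r) pairs after `new_node` joins C; q is the new clique probability."""
        kept = []
        for u, r in pairs:
            if u == new_node or (only_larger and u < new_node):
                continue
            r_new = r * weight(u, new_node)
            if q * r_new >= gamma:         # C + {u} would still be a gamma-clique
                kept.append((u, r_new))
        return kept

    def search(c, q, a, b):
        if not a and not b:                # neither A nor B can extend C
            results.append(sorted(c))
            return
        for u, r in a:                     # A is kept in ascending node order
            q_new = q * r
            search(c + [u], q_new,
                   extend(a, u, q_new, only_larger=True),
                   extend(b, u, q_new, only_larger=False))
            b = b + [(u, r)]               # u is now processed: move it to B

    search([], 1.0, [(u, 1.0) for u in sorted(nodes)], [])
    return results


# Hypothetical triangle with clique probability 0.9 * 0.8 * 0.5 = 0.36.
weights = {frozenset((1, 2)): 0.9, frozenset((1, 3)): 0.8, frozenset((2, 3)): 0.5}
print(gamma_communities({1, 2, 3}, weights, 0.4))  # [[1, 2], [1, 3], [2, 3]]
print(gamma_communities({1, 2, 3}, weights, 0.3))  # [[1, 2, 3]]
```

Raising the threshold from 0.3 to 0.4 splits the triangle into its three edges, since the product of all three weights no longer reaches γ while each single edge still does.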
If a patient is detected in a γ-suspected infection community, the other persons in that community have a high likelihood of infection. Different thresholds γ can be set to generate γ-suspected infection communities at different probabilities, so that infection matching in different scenarios can be handled; for example, for vehicles a small threshold can be used to judge whether persons met, whereas ordinary track data uses a larger threshold than that used for contact in vehicles.
Example 1: this embodiment provides a method for searching for persons suspected of epidemic infection based on a γ-suspected infection community model. First, all track data are matched pairwise to obtain the degree of contact between each two tracks; maximal clique analysis is then performed on the graph built from the contact degree data. The search finds the groups suspected of epidemic infection and thereby completes the search for persons suspected of epidemic infection.
1. Model design
In the present invention, a graph is denoted G(V, E), where V is the set of all nodes in the graph and E is the set of all edges; n and m denote the number of nodes and the number of edges in G respectively, i.e., n = |V|, m = |E|. For any two points u, v in G, the pair (u, v) denotes the edge connecting u and v. For any point v, all points connected to it are called its neighbors; for each node v ∈ V, N_G(v) denotes the neighbors of v, i.e., N_G(v) = {u | (v, u) ∈ E, u ∈ V}. The neighbors of v together form the adjacency list of v, denoted Γ(v). The size of Γ(v) is called the degree of v.
For a given contact probability graph G' = (V, E, γ), V is the set of nodes in G', E is the set of edges in G', and γ expresses the uncertainty in the graph and refers to the weights of its edges. In G', the presence or absence of different edges are independent events, and the probability values on the edges do not affect each other. For a node u in G', L(u) denotes all neighbor nodes of u. For a node subset C, max(C) denotes the highest-numbered node in C and L(C) denotes all neighbor nodes of C. For a clique C in G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E| denote the number of nodes and the number of edges of G' respectively.
Definition 1 (clique): in a graph G = (V, E), a node subset C ⊆ V is a clique if every pair of nodes in C is connected by an edge.
Definition 2 (maximal clique): for a node subset M ⊆ V in a graph G = (V, E), if (1) M satisfies the definition of a clique and (2) there is no node v ∈ V \ M such that M ∪ {v} is a clique, then M is a maximal clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for any clique C, the clique probability clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), if a node subset C ⊆ V satisfies (1) C is a clique and (2) clique(C, G') ≥ γ, then C is a γ-clique.
Definition 5 (γ-suspected infection community): in the contact probability graph G' = (V, E, γ), if a node subset C ⊆ V satisfies (1) C is a γ-clique and (2) there is no node v ∈ V \ C in G' such that C ∪ {v} is a γ-clique, then C is a γ-suspected infection community in G'.
As shown in FIG. 2, the maximal cliques defined on the graph are C_1 = {1, 2, 3}, C_2 = {3, 5, 6}, C_3 = {8, 9}, C_4 = {1, 4}, C_5 = {1, 5}, C_6 = {5, 10}, C_7 = {9, 10}, C_8 = {6, 7}.
2. Gamma-suspected infection community searching method on contact probability graph
2.1 Gamma-suspected infection community search algorithm on contact probability map
Based on the properties of the graph, the invention designs an algorithm for computing the degree of matching between two tracks (Algorithm 1) and a γ-suspected infection community search algorithm (Algorithm 2).
2.1.1 Algorithm 1
Based on the information contained in the track data, the location and time of two tracks are compared. Each track is mapped onto a two-dimensional plane, and it is judged whether the geographic range covered by a track segment within the set unit time lies within a certain difference of the geographic range of the track being compared, that is, within the allowed error range. For track data Track = (ID, Date, Time, Location), when the Date is the same, the Time difference is within a set range and the two Locations lie within the error distance of each other, the degree of overlap of the two track regions serves as the index of the degree of contact between the two persons. Suppose two pieces of track data from different persons, position_A and position_B, are given, and the shaded extreme-value region is the region range determined by the error d. The track points of position_B are then matched against the shaded region, and the number of matched points is the basis for measuring the contact degree index of the two persons. If several pieces of track data exist for the two persons, the maximum of their contact degrees is taken as the contact degree between the two persons, i.e., the weight of the edge between the two nodes in the graph, as shown in FIG. 7.
2.1.2 Algorithm 2
The γ-suspected infection community search (Maximal Uncertain Clique Enumeration) algorithm takes the maximal cliques found on the probability graph as its basis and adds the nodes of a maximal clique to the set C one at a time, where C represents the γ-clique currently found. When the γ-clique C is to be expanded, only common neighbor nodes of C numbered higher than max(C) are considered, and a node may be added to extend C if the product of all edge weights in C after adding it is greater than or equal to the threshold γ. Each time a new node is added, the clique probability of C decreases; a node whose addition would give clique(C, G') < γ is not added, because C would then no longer be a γ-clique. When no more nodes can be added to C, a γ-suspected infection community has been found.
Example 2: as shown in FIG. 4, given γ = 0.1, the γ-suspected infection communities in the contact probability graph G' are the node sets C_1 = {1, 2, 3}, C_2 = {2, 5}, C_3 = {3, 4}, C_4 = {3, 5}, C_5 = {4, 5}, C_6 = {5, 6} and C_7 = {6, 7, 8, 9}, a total of 7 γ-suspected infection communities.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that would occur to those skilled in the art without departing from the spirit and scope of the inventive concept are included in the invention, and the scope of protection is defined by the appended claims.
Claims (5)
1. A method for searching for suspected epidemic infected persons based on a gamma-suspected infected community model, characterized by comprising the following specific steps:
step 1, crawling travel track information of infected persons from the network;
step 2, analyzing the travel track data of the infected persons, matching the travel track data of the persons pairwise to obtain a contact probability value for every two persons, and storing the contact probability values in a matching value file;
step 3, reading the data in the matching value file to generate a contact probability graph, wherein each person ID is used as a node of the contact probability graph and the contact probability value of two persons is used as the weight of the edge connecting the two corresponding nodes, the specific method being as follows:
selecting any two track files and reading the track information in the files;
judging the time matching degree: judging whether the two pieces of track data were generated on the same day and within a set time difference, namely whether the spans between the first and last times of the two track files intersect; if not, reselecting two track files and judging the time matching degree again; if so, proceeding to judge the geographic position matching degree;
judging the geographic position matching degree: matching the locations and judging whether the two pieces of track data fall in the same area within the distance tolerance; if not, reselecting two track files and judging the time matching degree again; if so, calculating the contact probability value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact probability value;
step 4, searching the contact probability graph for maximum cliques, and analyzing each maximum clique found independently to find the gamma-suspected infected communities on all the maximum cliques, the specific method being as follows:
selecting two adjacent nodes in the maximum clique and adding them to a set C, wherein the set C represents the gamma-clique found so far;
on condition that the value of the edge between the two selected nodes is larger than gamma, selecting a new node to join the set C such that, after the new node joins the set C, the product of the values of all edges in C is greater than or equal to gamma;
traversing all nodes of the maximum clique, the final set C obtained being the gamma-suspected infected community on that maximum clique.
2. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the travel track information of infected persons on the network includes a unique identification ID for each individual and track data comprising location information and time information.
3. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the matching value of the two tracks is calculated as follows:
setting the region range of one track segment, and taking as the matching value the ratio of the number of track points of the other track segment that fall within that region range to the total number of track points of the other track segment in the matching time period.
4. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the specific method for searching for the maximum cliques in the contact probability graph is as follows:
constructing three sets R, P, and X for saving the state during the enumeration, wherein the R set is the result set and represents the clique found so far, the nodes in R being connected to one another but not necessarily forming a maximum clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so that adding a point of the P set or the X set to the R set expands the clique structure of the R set; the X set is the forbidden set, whose nodes are all points that have already been added to the R set before; the P set is the candidate set, whose nodes are points that have not yet been added to the R set;
selecting a point v from the P set each time and adding it to the R set, while updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set respectively with the adjacency list of the point v and making the recursive call; after the recursive call finishes, moving the point v from the candidate set P to the forbidden set X; and outputting the R set as a maximum clique when both the P set and the X set are empty.
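For illustration only, the R/P/X recursion described in claim 4 has the same shape as the classic Bron-Kerbosch enumeration without pivoting, which outputs cliques that cannot be extended further. The sketch below assumes an unweighted adjacency given as a dict of neighbor sets; the function name `bron_kerbosch` and the sample graph are hypothetical.

```python
def bron_kerbosch(adj):
    """Enumerate the cliques of an undirected graph that cannot be extended.

    adj maps every node to the set of its neighbours."""
    cliques = []

    def recurse(r, p, x):
        # R: current clique, P: candidates, X: nodes already tried.
        if not p and not x:
            cliques.append(sorted(r))    # nothing can extend R: output it
            return
        for v in sorted(p):              # snapshot, because P is mutated below
            recurse(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)                  # v leaves the candidate set P ...
            x.add(v)                     # ... and enters the forbidden set X

    recurse(set(), set(adj), set())
    return cliques


# Illustrative graph: a triangle 1-2-3 with a pendant edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(bron_kerbosch(adj))  # [[1, 2, 3], [3, 4]]
```

The X set is what prevents {1, 3} or {2, 3} from being reported: when those sets are reached, X still holds a common neighbor that was already explored.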
5. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the specific method for selecting the new node is as follows:
taking an unvisited node as the starting node, walking along an edge of the current node to an unvisited node, returning to the previous node when no unvisited node is available, and continuing to probe other nodes until all nodes have been visited.
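For illustration only, the node selection in claim 5 reads as a depth-first traversal with backtracking. The sketch below assumes a connected graph given as a dict of neighbor sets and visits neighbors in ascending order; the function name `depth_first_order` and the sample graph are hypothetical.

```python
def depth_first_order(adj, start):
    """Return the nodes reachable from start in depth-first visiting order."""
    visited = [start]
    seen = {start}
    stack = [start]                      # path from start to the current node
    while stack:
        current = stack[-1]
        nxt = next((n for n in sorted(adj[current]) if n not in seen), None)
        if nxt is None:
            stack.pop()                  # no unvisited neighbour: go back
        else:
            seen.add(nxt)
            visited.append(nxt)
            stack.append(nxt)            # walk along the edge to nxt
    return visited


# Illustrative graph: 1-2, 1-3, 2-4.
adj = {1: {2, 3}, 2: {1, 4}, 3: {1}, 4: {2}}
print(depth_first_order(adj, 1))  # [1, 2, 4, 3]
```

Inside a maximum clique every node is adjacent to every other, so the traversal reaches all nodes from any starting node.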
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110813384.9A CN113643824B (en) | 2021-07-19 | 2021-07-19 | Suspected epidemic infection personnel searching method based on gamma-suspected infection community model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113643824A (en) | 2021-11-12 |
CN113643824B (en) | 2024-03-26 |
Family
ID=78417714
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110813384.9A Active CN113643824B (en) | 2021-07-19 | 2021-07-19 | Suspected epidemic infection personnel searching method based on gamma-suspected infection community model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113643824B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027525A (en) * | 2020-03-09 | 2020-04-17 | 中国民用航空总局第二研究所 | Method, device and system for tracking potential infected persons in public places during epidemic situation |
CN111177473A (en) * | 2018-11-13 | 2020-05-19 | 杭州海康威视数字技术股份有限公司 | Personnel relationship analysis method and device and readable storage medium |
CN111354472A (en) * | 2020-02-20 | 2020-06-30 | 戴建荣 | Infectious disease transmission monitoring and early warning system and method |
CN111540476A (en) * | 2020-04-20 | 2020-08-14 | 中国科学院地理科学与资源研究所 | Respiratory infectious disease infectious tree reconstruction method based on mobile phone signaling data |
CN112383875A (en) * | 2020-06-28 | 2021-02-19 | 中国信息通信研究院 | Data processing method and electronic equipment |
CN112653990A (en) * | 2020-09-18 | 2021-04-13 | 武汉爱迪科技股份有限公司 | Screening algorithm and system for close contact personnel |
CN113113153A (en) * | 2021-04-13 | 2021-07-13 | 上海市疾病预防控制中心 | Method, system, device, processor and storage medium for realizing epidemic situation dynamic information analysis in epidemic situation outbreak period by using graph model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11350889B2 (en) * | 2013-10-10 | 2022-06-07 | Aura Home, Inc. | Covid-19 risk and illness assessment method |
Non-Patent Citations (2)
Title |
---|
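A system dynamics model method for estimating the early transmission scale of COVID-19 and its evaluation: a case study of Gansu Province; Liu Hongliang; Jia Hongwen; Wang Yan; Liu Bin; Yao Jie; Yan Xuanchen; Journal of University of Electronic Science and Technology of China (Social Sciences Edition) (No. 03); full text *
Data mining of the COVID-19 epidemic and analysis of a discrete stochastic transmission dynamics model; Tang Sanyi; Tang Biao; Nicola Luigi Bragazzi; Xia Fan; Li Tangjuan; He Sha; Ren Pengyu; Wang Xia; Xiang Changcheng; Peng Zhihang; Wu Jianhong; Xiao Yanni; Scientia Sinica Mathematica (No. 08); full text *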
Also Published As
Publication number | Publication date |
---|---|
CN113643824A (en) | 2021-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | Inventor after: Yuan Long, Fan Zhengqing, Chen Zi. Inventor before: Fan Zhengqing, Yuan Long, Chen Zi. |
GR01 | Patent grant | ||