CN113643824B - Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Info

Publication number
CN113643824B
Authority
CN
China
Prior art keywords
suspected
gamma
node
track
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110813384.9A
Other languages
Chinese (zh)
Other versions
CN113643824A (en)
Inventor
袁龙
范正青
陈紫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202110813384.9A
Publication of CN113643824A
Application granted
Publication of CN113643824B

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80: ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9537: Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Abstract

The invention discloses a method for detecting persons suspected of epidemic infection based on a γ-suspected infection community model, comprising the following steps: crawling travel track information of infected persons from the network; analyzing the travel track data, matching the tracks of the persons pairwise to obtain a contact probability value for every pair of persons, and storing these values in a matching value file; reading the matching value file and generating a contact probability graph, in which each person ID is a node and the contact probability value of two persons is the weight of the edge connecting their nodes; searching the contact probability graph for its maximal cliques and analyzing each maximal clique found separately to find the γ-suspected infection communities on it. If a patient is detected in a γ-suspected infection community, the other people in that community have a high likelihood of infection. The invention can identify persons who may be carrying the virus and greatly reduce the workload of medical personnel.

Description

Suspected epidemic infection personnel searching method based on gamma-suspected infection community model
Technical Field
The invention belongs to the technical field of data mining, and particularly relates to a suspected epidemic infection personnel searching method based on a gamma-suspected infection community model.
Background
One of the keys to preventing and controlling epidemics such as COVID-19 is finding infected persons: the earlier they are found, the sooner the epidemic can be brought under control and the more pressure is taken off downstream medical treatment. Active detection and epidemiological investigation to uncover potential asymptomatic carriers or suspected cases therefore play an important role in controlling sources of infection, reducing or eliminating the spread of pathogens, and preventing the disease from spreading through the population.
However, finding persons infected with COVID-19 is difficult. The virus is highly infectious: coughing, sneezing, or talking can transmit it to someone who is merely standing nearby. Symptoms do not appear immediately after infection; there is a 14-day incubation period during which the infected person is still infectious, which clearly makes infected persons harder to find. Existing algorithms are limited in that they do not integrate time and location information well, and for COVID-19 a match that does not satisfy time and geographic location simultaneously has no practical significance.
Disclosure of Invention
The invention aims to provide a method for searching for persons suspected of epidemic infection based on a γ-suspected infection community model, with which all high-infection communities can be found.
The technical scheme for realizing the purpose of the invention is as follows: a method for searching for persons suspected of epidemic infection based on a γ-suspected infection community model, comprising the following specific steps:
Step 1, crawling travel track information of infected persons from the network;
Step 2, analyzing the travel track data of the infected persons, matching the travel track data of the persons pairwise to obtain a contact probability value for every pair of persons, and storing the contact probability values in a matching value file;
Step 3, reading the data in the matching value file and generating a contact probability graph, in which each person ID is a node and the contact probability value of two persons is the weight of the edge connecting their two nodes;
Step 4, searching the contact probability graph for maximal cliques, and analyzing each maximal clique found separately to find the γ-suspected infection communities on all maximal cliques.
Preferably, the travel track information of an infected person on the network comprises a unique identification ID of the person and track data, and the track data comprises location information and time information.
Preferably, the specific method for analyzing the travel track data of the infected persons and matching the travel track data pairwise is as follows:
selecting any two track files and reading the track information in the files;
judging the time matching degree: determining whether the two tracks belong to the same day and lie within a set time difference, that is, whether the time spans from the first to the last timestamp of the two track files intersect; if not, selecting another two track files and repeating the time matching judgment; if so, proceeding to the geographic position matching judgment;
judging the geographic position matching degree: matching the locations and determining whether the two tracks lie in the same area within the distance tolerance; if not, selecting another two track files and returning to the time matching judgment; if so, calculating the contact probability value;
calculating the matching value of the two tracks segment by segment, and taking the maximum matching value as the contact probability value.
Preferably, the matching value of two tracks is calculated as follows:
setting a region range around a segment of one track, and taking as the matching value the ratio of the number of track points of the other track's segment that fall within this region range to the total number of track points of the other track's segment within the matching time period.
Preferably, the specific method for searching the contact probability graph for maximal cliques is as follows:
constructing three sets R, P and X that hold the state during enumeration, wherein the R set is the result set and represents the clique found so far, whose nodes are pairwise connected but do not necessarily form a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so that any of them could be added to the R set to extend its clique structure; the P set is the candidate set and holds points that have not yet been added to the R set; the X set is the forbidden set and holds points that have already been added to the R set in an earlier branch;
selecting a point v from the P set each time and adding it to the R set, while updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set each with the adjacency list of v and making the recursive call; after the recursive call returns, moving v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when both the P set and the X set are empty.
Preferably, the specific method for finding the γ-suspected infection community on a maximal clique is as follows:
selecting two adjacent nodes in the maximal clique and adding them to a set C, the set C representing the γ-clique found so far;
provided that the weight of the edge between the two selected nodes is greater than γ, selecting a new node to add to the set C such that, after the new node is added, the product of the weights of all edges in C is greater than or equal to γ;
traversing all nodes on the maximal clique; the final set C obtained is the γ-suspected infection community on the maximal clique.
Preferably, the specific method for selecting the new node is as follows:
taking an unvisited node as the start node, walking along an edge of the current node to an unvisited node, returning to the previous node when no unvisited neighbor remains, and continuing to probe other nodes until all nodes have been visited.
Compared with the prior art, the invention has the following notable advantages: (1) real-world track data is matched, and the value returned by the matching serves as a measure of the likelihood of infectious contact between persons; this removes the limitation of questioning isolated persons directly, allows strangers who came into close range of the patient to be found, expresses the degree of contact between persons intuitively as a number, and shortens the time needed to find infected persons; (2) the track matching uses appropriate pruning: when the two tracks under consideration cannot be matched in time, no spatial matching is attempted, which saves the computation and time of geographic position matching; (3) the invention captures the movement track of the patient before isolation, so that the places the patient visited are clearly known, and the people in close contact with the patient can be found through spatio-temporal range queries, directly yielding information on persons who had contact with infected persons within a given period and saving manpower, material and financial resources as well as the workload of medical staff; (4) different thresholds γ can be set to generate γ-suspected infection communities at different probabilities, so that infection matching in different scenarios can be handled.
Drawings
Fig. 1 is a schematic flow chart of the present invention.
Fig. 2 is a schematic diagram of a graph.
Fig. 3 is a schematic diagram of the maximal cliques of the graph in Fig. 2.
Fig. 4 is a schematic diagram of a contact probability graph.
Fig. 5 is a schematic diagram of a track containing only position information.
Fig. 6 is a schematic diagram of a track containing time and location information.
Fig. 7 is a schematic diagram of the track of Fig. 5 mapped onto a two-dimensional plane.
Fig. 8 is a schematic diagram of a depth-first traversal of an undirected graph.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments described below may be combined with each other as long as they do not conflict.
As shown in Fig. 1, a method for searching for persons suspected of epidemic infection based on a γ-suspected infection community model specifically comprises the following steps:
Step 1, crawling travel track information of infected persons from the network;
the travel track information of an infected person on the network comprises the unique identification (ID) of the person and track data, and the track data comprises location information and time information. The track data is stored in the form track(ID, Location, Time), where ID is the unique identification of the person, Location is the location information, and Time is the time information.
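A track record of this form could be represented as sketched below. This is only an illustration: the patent does not specify how Location is encoded, so the latitude/longitude fields, the comma-separated line layout, and the names `TrackPoint` and `parse_track_line` are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrackPoint:
    """One track(ID, Location, Time) record; the field layout is assumed."""
    person_id: str   # unique ID of the person
    lat: float       # Location, assumed here to be latitude ...
    lon: float       # ... and longitude
    time: datetime   # Time of the observation


def parse_track_line(line: str) -> TrackPoint:
    """Parse one 'ID,lat,lon,ISO-time' line (a hypothetical file layout)."""
    person_id, lat, lon, stamp = line.strip().split(",")
    return TrackPoint(person_id, float(lat), float(lon), datetime.fromisoformat(stamp))


point = parse_track_line("p001,32.0260,118.8520,2021-07-19T08:30:00")
```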
Step 2, analyzing the travel track data of the infected persons, matching the travel track data of the persons pairwise to obtain a contact probability value for every pair of persons, and storing the contact probability values in a matching value file. The specific method is as follows:
selecting any two track files and reading the track information in the files;
judging the time matching degree: determining whether the two tracks belong to the same day and lie within a set time difference, that is, whether the time spans from the first to the last timestamp of the two track files intersect; if not, selecting another two track files and repeating the time matching judgment; if so, proceeding to the geographic position matching judgment;
judging the geographic position matching degree: matching the locations and determining whether the two tracks lie in the same area within the distance tolerance; if not, selecting another two track files and returning to the time matching judgment; if so, calculating the contact probability value;
calculating the matching value of the two tracks segment by segment, and taking the maximum matching value as the contact probability value. The matching value of two tracks is calculated as follows:
setting a region range around a segment of one track, and taking as the matching value the ratio of the number of track points of the other track's segment that fall within this region range to the total number of track points of the other track's segment within the matching time period.
All track files are matched pairwise in this way, and the contact probability value is calculated for each pair.
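The pairwise loop with time-based pruning might look like the following sketch. It assumes each track is a list of `(time, x, y)` tuples held in memory and that the spatial matching is supplied as a callable; the names `time_overlap`, `pairwise_contact_values` and `match_value` are hypothetical and not taken from the patent.

```python
from datetime import datetime, timedelta
from itertools import combinations


def time_overlap(a_times, b_times, tolerance=timedelta(0)):
    """True if both tracks start on the same day and their [first, last]
    time spans intersect, allowing for the given tolerance."""
    a_start, a_end = min(a_times), max(a_times)
    b_start, b_end = min(b_times), max(b_times)
    same_day = a_start.date() == b_start.date()
    return same_day and a_start <= b_end + tolerance and b_start <= a_end + tolerance


def pairwise_contact_values(tracks, match_value, tolerance=timedelta(0)):
    """tracks maps person ID -> list of (time, x, y).
    Returns {(id_a, id_b): contact probability value} for matching pairs."""
    result = {}
    for id_a, id_b in combinations(sorted(tracks), 2):
        times_a = [p[0] for p in tracks[id_a]]
        times_b = [p[0] for p in tracks[id_b]]
        if not time_overlap(times_a, times_b, tolerance):
            continue  # pruning: skip spatial matching when the times cannot match
        value = match_value(tracks[id_a], tracks[id_b])
        if value > 0:
            result[(id_a, id_b)] = value
    return result


tracks = {
    "a": [(datetime(2021, 7, 19, 8, 0), 0.0, 0.0), (datetime(2021, 7, 19, 9, 0), 1.0, 1.0)],
    "b": [(datetime(2021, 7, 19, 8, 30), 0.5, 0.5), (datetime(2021, 7, 19, 10, 0), 2.0, 2.0)],
    "c": [(datetime(2021, 7, 20, 8, 0), 0.0, 0.0)],
}
# A constant stand-in for the spatial matching, used only to show the pruning.
values = pairwise_contact_values(tracks, match_value=lambda ta, tb: 0.5)
```

Only the pair ("a", "b") survives the time check here; person "c" was recorded on a different day, so no spatial matching is run for it.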
Step 3, reading the data in the matching value file and generating a contact probability graph, in which each person ID is a node and the contact probability value of two persons is the weight of the edge connecting their two nodes;
in the present invention, a graph is denoted G = (V, E), where V is the set of all nodes in the graph and E is the set of all edges; n and m denote the number of nodes and the number of edges in G respectively, i.e., n = |V| and m = |E|. For any two points u and v in graph G, the pair (u, v) denotes an edge connecting point u and point v. For each node v ∈ V, N_G(v) denotes the set of neighbors of v, i.e., N_G(v) = {u | (v, u) ∈ E, u ∈ V}. All neighbor points together form the adjacency list of point v, denoted Γ(v). The size of Γ(v) is called the degree of point v.
For a given contact probability graph G' = (V, E, γ), V is the set of nodes in G', E is the set of edges in G', and γ represents the uncertainty in the graph and refers to the weights of the edges. In the contact probability graph G' = (V, E, γ), the existence of different edges are independent events, and the probability values on the edges do not affect each other. For a node u in G', L(u) denotes all neighbor nodes of u. The set C stores the nodes of the clique found so far and represents the current γ-clique; for a node subset C in the graph, max(C) denotes the highest-numbered node in C, and L(C) denotes all neighbor nodes of C. For a clique C in G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E|, so that n and m denote the number of nodes and the number of edges of G' respectively.
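A minimal sketch of building the contact probability graph from the matching value file is given below. The file layout (one `u v weight` triple per line, with integer person IDs) and the name `load_contact_graph` are assumptions, since the patent does not describe the file format.

```python
from collections import defaultdict


def load_contact_graph(lines):
    """Build adjacency lists and edge weights from 'u v weight' lines.
    The line layout is an assumed format for the matching value file."""
    adjacency = defaultdict(set)   # node -> set of neighbor nodes
    weight = {}                    # frozenset({u, v}) -> contact probability
    for line in lines:
        if not line.strip():
            continue
        u, v, w = line.split()
        u, v, w = int(u), int(v), float(w)
        adjacency[u].add(v)
        adjacency[v].add(u)
        weight[frozenset((u, v))] = w
    return dict(adjacency), weight


adjacency, weight = load_contact_graph(["1 2 0.9", "2 3 0.5", ""])
degree_of_2 = len(adjacency[2])   # size of the adjacency list of node 2
```

Keying the weights by `frozenset` makes the lookup independent of the order of the two endpoints, which suits an undirected graph.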
Definition 1 (clique): in graph G = (V, E), a node subset C ⊆ V is a clique if there is an edge between every pair of nodes in C.
Definition 2 (maximal clique): in graph G = (V, E), a node subset M ⊆ V is a maximal clique if (1) M satisfies the definition of a clique; and (2) there is no node v ∈ V\M such that M ∪ {v} is a clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for any clique C, the clique probability clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-clique if (1) C is a clique; and (2) clique(C, G') ≥ γ.
Definition 5 (γ-suspected infection community): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-suspected infection community in G' if (1) C is a γ-clique; and (2) there is no node v ∈ V\C in G' such that C ∪ {v} is a γ-clique.
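Definitions 3 and 4 can be checked on a small example as sketched here. The edge weights are made up for illustration, and the helper names `is_clique`, `clique_probability` and `is_gamma_clique` are hypothetical.

```python
from itertools import combinations
from math import prod


def is_clique(nodes, weight):
    """Definition 1: every pair of nodes is joined by an edge."""
    return all(frozenset(pair) in weight for pair in combinations(nodes, 2))


def clique_probability(nodes, weight):
    """Definition 3: product of the weights of all edges inside the clique.
    The empty product is 1, matching the convention for the empty set."""
    return prod(weight[frozenset(pair)] for pair in combinations(nodes, 2))


def is_gamma_clique(nodes, weight, gamma):
    """Definition 4: a clique whose clique probability is at least gamma."""
    return is_clique(nodes, weight) and clique_probability(nodes, weight) >= gamma


# Illustrative weights only (not taken from the patent's figures).
weight = {
    frozenset((1, 2)): 0.8,
    frozenset((1, 3)): 0.5,
    frozenset((2, 3)): 0.5,
}
p = clique_probability([1, 2, 3], weight)   # 0.8 * 0.5 * 0.5
```

With these weights the triangle {1, 2, 3} has clique probability 0.2, so it is a γ-clique for γ = 0.1 but not for γ = 0.25.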
Step 4, searching the contact probability graph for maximal cliques: first the maximal cliques in the contact probability graph are found, and then each maximal clique found is analyzed separately to find the γ-suspected infection communities on all maximal cliques. The specific method is as follows:
the maximal cliques are found as follows:
constructing three sets R, P and X that hold the state during enumeration, wherein the R set is the result set and represents the clique found so far, whose nodes are pairwise connected but do not necessarily form a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so that any of them could be added to the R set to extend its clique structure; the P set is the candidate set and holds points that have not yet been added to the R set; the X set is the forbidden set and holds points that have already been added to the R set in an earlier branch;
selecting a point v from the P set each time and adding it to the R set, while updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set each with the adjacency list of v and making the recursive call; after the recursive call returns, moving v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when both the P set and the X set are empty.
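The R/P/X procedure described above corresponds to what is commonly known as the Bron-Kerbosch algorithm without pivoting. The sketch below follows that description; the function name `maximal_cliques` and the dictionary-of-sets graph representation are assumptions made for the example.

```python
def maximal_cliques(adjacency):
    """Enumerate all maximal cliques of an undirected graph given as
    {node: set of neighbor nodes}, using the R/P/X recursion."""
    cliques = []

    def expand(r, p, x):
        # r: clique found so far; p: candidates; x: already processed nodes.
        if not p and not x:
            cliques.append(sorted(r))   # nothing can extend r: it is maximal
            return
        for v in sorted(p):
            # Keep only common neighbors of the enlarged clique r + {v}.
            expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}   # v is done: move it from the candidate set ...
            x = x | {v}   # ... to the forbidden set

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# A triangle 1-2-3 with a pendant edge 3-4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
found = maximal_cliques(graph)
```

The forbidden set is what prevents {1, 3} or {2, 3} from being reported separately: when those branches run out of candidates, X still holds a node that could extend them, so they are not output.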
The γ-suspected infection communities on all maximal cliques are found as follows:
Two adjacent nodes of the maximal clique whose connecting edge has a weight greater than γ are selected and added to a set C, which represents the γ-clique found so far. γ is a preset threshold: the minimum allowed value of the product of the weights of all edges in the clique. When the γ-clique C is to be extended, only those common neighbor nodes of C whose number is greater than max(C) are added. However, each time a new node is added to C the clique probability of C decreases, which may lead to clique(C, G') < γ, in which case the set C is no longer a γ-clique.
Therefore, provided that the weight of the edge between the two selected nodes is greater than γ, further nodes are selected, and before a new node is added to the set C it is checked that the product of the weights of all edges after adding it is still greater than or equal to γ.
New nodes are selected with a depth-first search (DFS), as shown in Fig. 8: the traversal takes an unvisited node as the start node, walks along an edge of the current node to an unvisited node, returns to the previous node when no unvisited neighbor remains, and continues probing other nodes until all nodes have been visited. The nodes of the contact probability graph are processed in ascending order of node number, and the search maintains a set A and a set B so that all γ-suspected infection communities in the contact probability graph are enumerated efficiently. Set A stores data pairs (u, r), where u is a node number with u > max(C) and r is the factor by which the clique probability of C is multiplied when node u is added to C; every pair kept in A guarantees that the product of all edge weights is still greater than or equal to the threshold γ after u joins C. Set C stores the nodes of the clique found so far, a subset of the nodes of the contact probability graph, and max(C) denotes the highest-numbered node in C. Set B also stores (u, r) pairs with the same meaning as in A, but its nodes have all already been processed. When A is empty the expansion of C ends, but this alone does not prove that C is a γ-suspected infection community, since a node in B might still be able to extend C; only when both A and B are empty is C a γ-suspected infection community. Finally, it is checked whether all nodes in the maximal clique have been selected, i.e. traversed; once they have, the set C obtained is a γ-suspected infection community.
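The search with the sets A and B can be sketched as a depth-first recursion. This is an illustrative reading of the description rather than the patented implementation: it starts from the empty set instead of an initial pair of nodes, and the name `gamma_communities` and the `frozenset`-keyed weight dictionary are assumptions.

```python
def gamma_communities(nodes, weight, gamma):
    """Enumerate the maximal gamma-cliques among `nodes` (for example the
    nodes of one maximal clique). weight maps frozenset({u, v}) -> edge weight."""
    found = []

    def extend(c, q, a, b):
        # c: current gamma-clique, q: its clique probability.
        # a: candidate pairs (u, r) with u > max(c); b: processed pairs (u, r).
        if not a and not b:
            found.append(sorted(c))   # no node can extend c any further
            return
        for u, r in sorted(a):
            q_new = q * r             # clique probability after adding u

            def refresh(pairs, only_larger):
                kept = []
                for v, s in pairs:
                    if v == u or (only_larger and v < u):
                        continue
                    w = weight.get(frozenset((u, v)))
                    # Keep v only if c + {u, v} would still reach gamma.
                    if w is not None and q_new * s * w >= gamma:
                        kept.append((v, s * w))
                return kept

            extend(c | {u}, q_new, refresh(a, True), refresh(b, False))
            b = b + [(u, r)]          # u has now been processed

    extend(set(), 1.0, [(u, 1.0) for u in sorted(nodes)], [])
    return sorted(found)


# Illustrative triangle: the full clique has probability 0.9 * 0.5 * 0.2 = 0.09.
weight = {frozenset((1, 2)): 0.9, frozenset((1, 3)): 0.5, frozenset((2, 3)): 0.2}
at_010 = gamma_communities([1, 2, 3], weight, 0.10)
at_005 = gamma_communities([1, 2, 3], weight, 0.05)
```

With γ = 0.10 the triangle falls just below the threshold and splits into its three edges; lowering γ to 0.05 yields the whole triangle as a single community.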
If a patient is detected in a γ-suspected infection community, the other people in that community have a high likelihood of infection. Different thresholds γ can be set to generate γ-suspected infection communities at different probabilities, so that infection matching in different scenarios can be handled; for example, a small threshold can be used to judge encounters on vehicles, whereas for ordinary track data a threshold larger than the one used for vehicles is applied.
Example 1: this embodiment provides a method for searching for persons suspected of epidemic infection based on a γ-suspected infection community model. First, the degree of contact is computed pairwise for all track data; then maximal clique analysis is performed on the graph built from the contact degree data. The search finds the groups suspected of epidemic infection and thereby completes the search for persons suspected of infection.
1. Model design
In the present invention, a graph is denoted G = (V, E), where V is the set of all nodes in the graph and E is the set of all edges; n and m denote the number of nodes and the number of edges in G respectively, i.e., n = |V| and m = |E|. For any two points u and v in graph G, the pair (u, v) denotes an edge connecting point u and point v. For any point v, all points connected to it are called its neighbor points, and together they form the adjacency list of v, denoted Γ(v). The size of Γ(v) is called the degree of point v. For each node v ∈ V, N_G(v) denotes the set of neighbors of v, i.e., N_G(v) = {u | (v, u) ∈ E, u ∈ V}.
For a given contact probability graph G' = (V, E, γ), V is the set of nodes in G', E is the set of edges in G', and γ represents the uncertainty in the graph and refers to the weights of the edges. In the contact probability graph G' = (V, E, γ), the existence of different edges are independent events, and the probability values on the edges do not affect each other. For a node u in G', L(u) denotes all neighbor nodes of u. For a node subset C in the graph, max(C) denotes the highest-numbered node in C, and L(C) denotes all neighbor nodes of C. For a clique C in G', clique(C, G') denotes the clique probability of C, and clique(∅, G') = 1 is specified. Let n = |V| and m = |E|, so that n and m denote the number of nodes and the number of edges of G' respectively.
Definition 1 (clique): in graph G = (V, E), a node subset C ⊆ V is a clique if there is an edge between every pair of nodes in C.
Definition 2 (maximal clique): in graph G = (V, E), a node subset M ⊆ V is a maximal clique if (1) M satisfies the definition of a clique; and (2) there is no node v ∈ V\M such that M ∪ {v} is a clique.
Definition 3 (clique probability): in the contact probability graph G' = (V, E, γ), for any clique C, the clique probability clique(C, G') is the product of the weights of all edges in C.
Definition 4 (γ-clique): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-clique if (1) C is a clique; and (2) clique(C, G') ≥ γ.
Definition 5 (γ-suspected infection community): in the contact probability graph G' = (V, E, γ), a node subset C ⊆ V is a γ-suspected infection community in G' if (1) C is a γ-clique; and (2) there is no node v ∈ V\C in G' such that C ∪ {v} is a γ-clique.
As shown in Fig. 2, the maximal cliques of the graph are C1 = {1,2,3}, C2 = {3,5,6}, C3 = {8,9}, C4 = {1,4}, C5 = {1,5}, C6 = {5,10}, C7 = {9,10}, C8 = {6,7}.
2. Gamma-suspected infection community searching method on contact probability graph
2.1 Gamma-suspected infection community search algorithm on contact probability map
In view of the properties of the graph, the invention designs an algorithm for computing the matching degree of two tracks and a search algorithm.
2.1.1 Algorithm 1
Using the information contained in the track data, the position data and times of the tracks are compared. The tracks are mapped onto a two-dimensional plane as illustrated, and it is judged whether the geographic range covered by the compared track segment within the set unit time lies within the geographic range of the reference track extended by a certain difference, i.e., within the allowed error range. For track data track = (ID, Date, Time, Location), when the Date is the same, the Time differs by no more than a set range, and the ranges covered by the two Locations lie within the error distance, the degree of overlap of the two track regions serves as the index of the degree of contact between the two persons. Given two tracks of different people, position_A and position_B, the shaded extreme-value region is the region bounded by the range of the error distance d; the track points of position_B are then matched against the shaded region, and the number of matched points serves as the reference for the contact degree index of the two. If there are several segments of track data, the maximum contact degree over the segments is taken as the contact degree between the two tracks, i.e., the weight of the edge between the two nodes in the graph, as shown in Fig. 7.
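One possible reading of Algorithm 1 is sketched below. It assumes planar (x, y) coordinates, takes the shaded region to be the bounding box of one track segment enlarged by the error distance, and assumes the two tracks have already been cut into time-aligned segments; the names `region_of`, `segment_match` and `contact_value` are hypothetical.

```python
def region_of(points, d_error):
    """Bounding box of a track segment, enlarged on every side by d_error."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs) - d_error, min(ys) - d_error, max(xs) + d_error, max(ys) + d_error


def segment_match(seg_a, seg_b, d_error):
    """Share of seg_b's points that fall inside the enlarged region of seg_a."""
    if not seg_a or not seg_b:
        return 0.0
    x0, y0, x1, y1 = region_of(seg_a, d_error)
    inside = sum(1 for x, y in seg_b if x0 <= x <= x1 and y0 <= y <= y1)
    return inside / len(seg_b)


def contact_value(segs_a, segs_b, d_error):
    """Maximum matching value over the time-aligned segment pairs."""
    return max((segment_match(a, b, d_error) for a, b in zip(segs_a, segs_b)),
               default=0.0)


seg_a = [(0.0, 0.0), (2.0, 2.0)]
seg_b = [(1.0, 1.0), (2.5, 2.5), (5.0, 5.0), (9.0, 9.0)]
single = segment_match(seg_a, seg_b, d_error=1.0)
best = contact_value([seg_a, [(0.0, 0.0)]], [seg_b, [(0.5, 0.5)]], d_error=1.0)
```

In the first segment pair two of the four points of `seg_b` lie inside the enlarged box, giving 0.5; the second pair matches completely, so the maximum over the segments is 1.0.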
2.1.2 Algorithm 2
The γ-suspected infection community search (Maximal Uncertain Clique Enumeration) algorithm takes the maximal cliques found on the probability graph as its basis and adds the nodes of a maximal clique to the set C one by one, where C represents the γ-clique found so far. When the γ-clique C is to be extended, only those common neighbor nodes of C whose number is greater than max(C) are considered; a node can be added to extend C if the product of the weights of all edges in C after adding it is greater than or equal to the threshold γ. Each time a new node is added to C, however, the clique probability of C decreases, which may lead to clique(C, G') < γ, in which case the set C is no longer a γ-clique. When no more nodes can be added to C, a γ-suspected infection community has been found.
Example 2: given γ = 0.1 and the contact probability graph G' shown in Fig. 4, the γ-suspected infection communities in G' are the node sets C1 = {1,2,3}, C2 = {2,5}, C3 = {3,4}, C4 = {3,5}, C5 = {4,5}, C6 = {5,6} and C7 = {6,7,8,9}, a total of 7 γ-suspected infection communities.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that would occur to one skilled in the art are included in the invention without departing from the spirit and scope of the inventive concept, and the scope of the invention is defined by the appended claims.

Claims (5)

1. A suspected epidemic infection personnel searching method based on a gamma-suspected infection community model is characterized by comprising the following specific steps:
step 1, crawling travel track information of infected persons on a network;
step 2, analyzing the travel track data of the infected persons, matching the travel track data of the persons pairwise to obtain a contact probability value for any two persons, and storing the contact probability values in a matching value file;
step 3, reading the data in the matching value file to generate a contact probability graph, wherein each person ID is a node of the contact probability graph and the contact probability value of two persons is the weight of the edge connecting the two corresponding nodes, the specific method being as follows:
selecting any two track files and reading track information in the files;
judging the time matching degree: judging whether the two pieces of track data were generated on the same day and within a set time difference, that is, whether the start-to-end time spans of the two track files intersect; if not, reselecting two track files and repeating the time matching degree judgment; if so, proceeding to the geographic position matching degree judgment;
judging the geographic position matching degree: matching the locations and judging whether the two pieces of track data were generated in the same area within the distance tolerance; if not, reselecting two track files and returning to the time matching degree judgment; if so, calculating the contact probability value;
calculating the matching values of the two tracks segment by segment, and taking the maximum matching value as the contact probability value;
step 4, searching the contact probability graph for maximal cliques, and analyzing each maximal clique found independently to find the gamma-suspected infected communities on all the maximal cliques, the specific method being as follows:
selecting two adjacent nodes in the maximal clique and adding them to a set C, wherein the set C represents the gamma-clique found so far;
provided that the edge between the two selected nodes is larger than gamma, selecting a new node to join the set C such that, after the new node joins the set C, the product of all the edges is larger than or equal to gamma;
traversing all nodes of the maximal clique, the final set C obtained being the gamma-suspected infected community on that maximal clique.
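The time and geographic gating described in claim 1 can be illustrated with a small Python sketch. Everything named here is hypothetical: `same_day_and_overlapping`, `haversine_m`, `within_tolerance`, the `max_gap` parameter, and the use of great-circle distance between (latitude, longitude) points are assumptions made for illustration, since the claim does not specify how the time difference or the distance tolerance is computed.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


def same_day_and_overlapping(start_a, end_a, start_b, end_b,
                             max_gap=timedelta(0)):
    """True if both tracks start on the same calendar day and their
    [start, end] spans intersect, allowing a gap of up to max_gap."""
    same_day = start_a.date() == start_b.date()
    overlap = start_a <= end_b + max_gap and start_b <= end_a + max_gap
    return same_day and overlap


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (mean Earth radius assumed)."""
    r = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * r * asin(sqrt(a))


def within_tolerance(point_a, point_b, tolerance_m):
    """True if two (lat, lon) points lie within tolerance_m metres."""
    return haversine_m(*point_a, *point_b) <= tolerance_m
```

In this sketch a pair of track files would go on to the matching value calculation only if both checks pass; otherwise another pair would be selected.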
2. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model of claim 1, wherein the infected person travel track information on the network includes a unique identification ID for each individual and track data comprising location information and time information.
3. The method for searching for suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the calculation method of the matching value of the two tracks is as follows:
setting a region range for one track segment, and taking as the matching value the ratio of the number of track points of the other track segment that fall within that region range to the total number of track points of the other track segment within the matching time period.
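As a rough illustration of this ratio, the Python sketch below treats the region range of a track segment as its bounding box widened by a tolerance. The bounding-box choice, the function names `matching_value` and `contact_probability`, and the assumption that segments arrive already paired by matching time period are all hypothetical and are not taken from the claims.

```python
def matching_value(seg_a, seg_b, tolerance=0.0):
    """Fraction of seg_b's points that fall inside seg_a's region range.

    seg_a, seg_b: lists of (x, y) track points from the same time period.
    The region range is assumed to be seg_a's bounding box, widened by
    `tolerance` on every side.
    """
    if not seg_a or not seg_b:
        return 0.0
    xs = [p[0] for p in seg_a]
    ys = [p[1] for p in seg_a]
    min_x, max_x = min(xs) - tolerance, max(xs) + tolerance
    min_y, max_y = min(ys) - tolerance, max(ys) + tolerance
    inside = sum(1 for x, y in seg_b
                 if min_x <= x <= max_x and min_y <= y <= max_y)
    return inside / len(seg_b)


def contact_probability(segment_pairs, tolerance=0.0):
    """Largest matching value over all time-aligned segment pairs."""
    return max((matching_value(a, b, tolerance) for a, b in segment_pairs),
               default=0.0)
```

For example, with `seg_a = [(0, 0), (2, 2)]` and `seg_b = [(1, 1), (3, 3), (1, 2), (5, 5)]`, two of the four points of `seg_b` lie inside the box, so the matching value is 0.5.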
4. The method for searching suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the specific method for searching for the maximal cliques in the contact probability graph is as follows:
constructing three sets R, P and X for saving state during the enumeration, wherein the R set is the result set and represents the clique found so far, the nodes in R being mutually connected but not necessarily forming a maximal clique; the nodes in the P set and the X set are common neighbors of the nodes in the R set, so that any point in the P set or the X set could be added to the R set to extend its clique structure; the X set is the forbidden set, whose nodes are all points that have already been added to the R set before; the P set is the candidate set and contains the points that have not yet been added to the R set;
selecting a point v from the P set each time and adding it to the R set, while updating the P set and the X set so that they still contain only common neighbors of the new R set, specifically: intersecting the P set and the X set respectively with the adjacency list of point v and making the recursive call; after the recursive call returns, moving point v from the candidate set P to the forbidden set X; and outputting the R set as a maximal clique when both the P set and the X set are empty.
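The R, P, X procedure in claim 4 matches the classic Bron-Kerbosch enumeration without pivoting. A minimal Python sketch follows; the function name `bron_kerbosch` and the adjacency-dictionary input are assumptions made for illustration.

```python
def bron_kerbosch(adj):
    """Return all maximal cliques of an undirected graph.

    adj: dict mapping each node to the set of its neighbors.
    """
    cliques = []

    def expand(R, P, X):
        # R: clique found so far; P: candidate set; X: forbidden set.
        if not P and not X:
            cliques.append(set(R))   # nothing can extend R: it is maximal
            return
        for v in list(P):            # iterate over a copy, P is mutated
            # Keep only common neighbors of the new R = R + {v}.
            expand(R | {v}, P & adj[v], X & adj[v])
            P.remove(v)              # v has now been fully explored
            X.add(v)                 # move v from candidate to forbidden

    expand(set(), set(adj), set())
    return cliques
```

The forbidden set X is what prevents a non-maximal clique from being reported: if X is non-empty when P runs out, some previously explored node is still adjacent to every node of R, so R is contained in a clique that was already output.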
5. The method for searching suspected epidemic infected persons based on the gamma-suspected infected community model according to claim 1, wherein the specific method for selecting the new node is as follows:
taking an unvisited node as the starting node, walking along an edge of the current node to an unvisited node, returning to the previous node when no unvisited node is available, and continuing to probe other nodes until all nodes have been visited.
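The node selection order in claim 5 is a depth-first traversal. The Python sketch below shows one way to produce that order; the name `dfs_order` and the rule of visiting neighbors in increasing node number are assumptions, since the claim does not fix the order in which edges are tried.

```python
def dfs_order(adj, start):
    """Return the nodes reachable from `start` in depth-first visiting order.

    adj: dict mapping each node to the set of its neighbors.
    """
    order = []
    seen = set()

    def visit(u):
        seen.add(u)
        order.append(u)
        for w in sorted(adj[u]):   # try neighbors in increasing number
            if w not in seen:
                visit(w)           # walk forward to an unvisited node
        # Returning from visit() is the step back to the previous node.

    visit(start)
    return order
```

On a maximal clique every node is adjacent to every other, so a single call from any starting node reaches all of them.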

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110813384.9A CN113643824B (en) 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Publications (2)

Publication Number Publication Date
CN113643824A CN113643824A (en) 2021-11-12
CN113643824B true CN113643824B (en) 2024-03-26

Family

ID=78417714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110813384.9A Active CN113643824B (en) 2021-07-19 2021-07-19 Suspected epidemic infection personnel searching method based on gamma-suspected infection community model

Country Status (1)

Country Link
CN (1) CN113643824B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027525A (en) * 2020-03-09 2020-04-17 中国民用航空总局第二研究所 Method, device and system for tracking potential infected persons in public places during epidemic situation
CN111177473A (en) * 2018-11-13 2020-05-19 杭州海康威视数字技术股份有限公司 Personnel relationship analysis method and device and readable storage medium
CN111354472A (en) * 2020-02-20 2020-06-30 戴建荣 Infectious disease transmission monitoring and early warning system and method
CN111540476A (en) * 2020-04-20 2020-08-14 中国科学院地理科学与资源研究所 Respiratory infectious disease infectious tree reconstruction method based on mobile phone signaling data
CN112383875A (en) * 2020-06-28 2021-02-19 中国信息通信研究院 Data processing method and electronic equipment
CN112653990A (en) * 2020-09-18 2021-04-13 武汉爱迪科技股份有限公司 Screening algorithm and system for close contact personnel
CN113113153A (en) * 2021-04-13 2021-07-13 上海市疾病预防控制中心 Method, system, device, processor and storage medium for realizing epidemic situation dynamic information analysis in epidemic situation outbreak period by using graph model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11350889B2 (en) * 2013-10-10 2022-06-07 Aura Home, Inc. Covid-19 risk and illness assessment method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
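Estimation method and evaluation of a system dynamics model for the initial transmission scale of COVID-19: a case study of Gansu Province; 刘红亮; 贾洪文; 王雁; 刘彬; 姚洁; 闫宣辰; Journal of University of Electronic Science and Technology of China (Social Sciences Edition) (No. 03); full text *
Data mining of the COVID-19 epidemic and analysis of a discrete stochastic transmission dynamics model; 唐三一; 唐彪; Nicola Luigi Bragazzi; 夏凡; 李堂娟; 何莎; 任鹏宇; 王霞; 向长城; 彭志行; 吴建宏; 肖燕妮; Scientia Sinica Mathematica (No. 08); full text *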

Also Published As

Publication number Publication date
CN113643824A (en) 2021-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yuan Long
Inventor after: Fan Zhengqing
Inventor after: Chen Zi
Inventor before: Fan Zhengqing
Inventor before: Yuan Long
Inventor before: Chen Zi

GR01 Patent grant