CN114242261A - Virus propagation control method based on bounded seepage-greedy algorithm - Google Patents

Info

Publication number
CN114242261A
Authority
CN
China
Prior art keywords
node
nodes
network
occupied
unoccupied
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111518210.6A
Other languages
Chinese (zh)
Inventor
刘洋
陈晓祺
王震
王茜
李学龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202111518210.6A priority Critical patent/CN114242261A/en
Publication of CN114242261A publication Critical patent/CN114242261A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Landscapes

  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a virus propagation control method based on a bounded seepage-greedy algorithm. First, target network information is extracted to obtain the nodes and edge attributes of the target network. Then, based on seepage theory, specific nodes of the target network are occupied one after another, which is the reverse of the process of removing key nodes. Each time, the node that minimizes the objective function is occupied, which limits the size of the maximum connected component and the degrees of the remaining network nodes and thereby controls virus propagation. A key index threshold is set; when the key indexes of all candidate nodes exceed the threshold, the threshold is updated so that more candidate nodes can be occupied. Finally, the effect of controlling virus propagation is expressed by the sequence parameter and the network toughness. The invention can decompose large-scale networks quickly and efficiently, so that virus propagation is controlled in time and the loss it causes is reduced.

Description

Virus propagation control method based on bounded seepage-greedy algorithm
Technical Field
The invention belongs to the technical field of network information analysis, and particularly relates to a virus propagation control method based on a bounded seepage-greedy algorithm.
Background
A network can model the interactions inside a complex system: nodes represent the individuals in the system, and edges represent the interactions among them. Using networks helps in studying the global properties of the system. A network can also be used to test the effect of interventions intended for the real system and to provide an optimal solution for controlling, predicting, optimizing, and reconstructing it.
The network decomposition problem is to identify, for a given network, a set of key nodes whose removal decomposes the network to the greatest extent. Network decomposition can reflect and analyze real situations effectively. For example, compared with a diffusion system under the mean-field assumption, a network diffusion system better represents how a virus propagates. By identifying key nodes and decomposing the virus propagation network, the following questions can be answered to some extent: 1) Which groups of people should be isolated first when controlling virus transmission? 2) When resources are constrained and the number of vaccines is limited, which groups should be vaccinated first? 3) Which places should receive the most attention? In addition, key nodes dominate the dynamics of the virus propagation system, and identifying them helps locate the source of virus diffusion, which saves computing resources.
The network decomposition problem has been proven to be NP-hard (non-deterministic polynomial-time hard). Network decomposition methods fall into the following four categories: 1) Methods based on local information: such methods do not require the network topology to be known. Randomly removing nodes from the network achieves decomposition but is often not efficient enough. The acquaintance algorithm was derived from this idea: a group of nodes is selected at random, and one neighbor of each selected node is removed; its efficiency is often lower than that of methods based on node centrality. 2) Methods based on node centrality: the importance of each node is calculated from node indexes such as degree centrality, eigenvector centrality, PageRank, betweenness centrality, and Katz centrality, and the nodes with high importance are selected as key nodes. The degree centrality method holds that nodes with higher degree are more important. The eigenvector centrality method holds that nodes connected to important nodes are also important, so the centrality of a node is obtained by summing the centralities of its neighbors. 3) Heuristic methods: after the most important node is removed, the degrees of its former neighbors decrease. Building on the centrality-based methods, a heuristic method therefore recalculates the importance of the nodes in the remaining network after every removal and again removes the node with the highest importance. 4) Indirect methods: indirect methods can decompose the network more efficiently.
Methods based on cycle removal include the belief-propagation-guided method and the Min-Sum and reverse-greedy method; they achieve network decomposition by solving the feedback vertex set problem that arises in cycle removal. Methods based on graph partitioning perform spectral bisection through an approximation strategy and extract a vertex separator from the minimum vertex cover of the bisection, but they consider the whole network directly and are not efficient enough. The FINDER method solves the network decomposition problem with a graph neural network and reinforcement learning; because it is based on a graph neural network, it is in theory a method that considers the local network.
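The difference between a static centrality ranking and the heuristic (adaptive) variant described above can be illustrated with a short sketch. It is not taken from the patent: the function name `adaptive_high_degree_order`, the use of degree as the importance measure, and the tie-breaking rule (smallest node label) are assumptions made for illustration only.

```python
from collections import defaultdict


def adaptive_high_degree_order(edges):
    """Return the node removal order of an adaptive highest-degree heuristic."""
    adj = defaultdict(set)
    for v, w in edges:
        adj[v].add(w)
        adj[w].add(v)
    order = []
    while adj:
        # Importance is recomputed on the remaining network. Here it is the
        # current degree; ties go to the smallest node label (an assumption).
        u = min(adj, key=lambda x: (-len(adj[x]), x))
        for nb in adj[u]:
            adj[nb].discard(u)  # former neighbors lose one degree
        del adj[u]
        order.append(u)
    return order


# Hypothetical toy network: a hub (node 0) plus a short tail (3-4).
print(adaptive_high_degree_order([(0, 1), (0, 2), (0, 3), (3, 4)]))
```

Because the degrees are recomputed after every removal, a node whose neighbors have already been removed drops in rank, which a static degree ranking would not capture.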
The above methods need to remove a large number of nodes to achieve network decomposition. In practical scenarios, especially in resource-poor areas, the resources for controlling virus propagation are limited: not many people can be vaccinated, isolated, or protected, and not many places can be closed, which makes virus propagation hard to control.
The above methods are also not efficient enough in application, especially on large networks. For large-scale network data, some methods, such as the graph-partitioning-based GND method, take too long to compute or exceed the memory limit and can hardly solve the network decomposition problem. In practice, a large-scale outbreak requires real-time information from all parties to be taken into account and a rapid response for prevention and control; the above methods are difficult to apply to large-scale networks and cannot respond quickly to virus propagation.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a virus propagation control method based on a bounded seepage-greedy algorithm. First, target network information is extracted to obtain the nodes and edge attributes of the target network. Then, based on seepage theory, specific nodes of the target network are occupied one after another, which is the reverse of the process of removing key nodes. Each time, the node that minimizes the objective function is occupied, which limits the size of the maximum connected component and the degrees of the remaining network nodes and thereby controls virus propagation. A key index threshold is set; when the key indexes of all candidate nodes exceed the threshold, the threshold is updated so that more candidate nodes can be occupied. Finally, the effect of controlling virus propagation is expressed by the sequence parameter and the network toughness. The invention can decompose large-scale networks quickly and efficiently, so that virus propagation is controlled in time and the loss it causes is reduced.
A virus propagation control method based on a bounded seepage-greedy algorithm is characterized by comprising the following steps:
Step 1: input the data of the population involved in virus propagation, including individual information, the number of individuals, the connections among individuals, and the probability of propagating the virus; construct the virus propagation network G(N, M) corresponding to these data, taking the individuals as nodes, the connections among individuals as edges, and the probability of propagating the virus between individuals as the edge weights, where N is the point set of the network, M is the edge set, and the edge weight between nodes v and w is $\beta_{vw}$;
Step 2: initialize all nodes in the virus propagation network to the unoccupied state, forming the unoccupied node set $N_r(t)$; construct the candidate node set $N_c(t)$, which is initially any subset of the unoccupied node set $N_r(t)$ whose number of nodes y satisfies y ≤ n, where n is the number of nodes in the point set N of the initial network; construct the occupied node set $N_o(t)$, initially empty; initialize all edges to the unoccupied state, forming the unoccupied edge set $M_r(t)$; construct the occupied edge set $M_o(t)$, initially empty; t denotes a time step after the start of virus propagation control, with t = 0 initially, and the occupied node sequence $S_r(t)$ is initially an empty sequence; set the key index threshold [Figure BDA0003405376390000031], initially 1, and a temporary value [Figure BDA0003405376390000032], initially 1;
Step 3: at time t, select from the candidate node set $N_c(t)$ the node that minimizes the objective function ψ(u) and convert it to the occupied state; if several nodes minimize ψ(u) at the same time, one of them is selected at random and converted to the occupied state; the node is then deleted from the unoccupied node set $N_r(t)$ and the candidate node set $N_c(t)$, added to the occupied node set $N_o(t)$, and appended to the end of the occupied node sequence $S_r(t)$; if two adjacent nodes are both in the occupied state, the edge between them is converted to the occupied state, deleted from the unoccupied edge set $M_r(t)$, and added to the occupied edge set $M_o(t)$;
Step 4: determine whether, at time t, the key index I of every node in the candidate node set $N_c(t)$ exceeds the key index threshold [Figure BDA0003405376390000033]; if so, go to step 5; otherwise, set t = t + 1 and return to step 3;
Step 5: if, between the last update of the key index threshold [Figure BDA0003405376390000034] and the current time t, at least one node in the network has been selected and converted to the occupied state, the key index threshold [Figure BDA0003405376390000035] is updated to $\alpha \times \min_{I} I$, and the temporary value [Figure BDA0003405376390000036] is then updated to the new [Figure BDA0003405376390000037]; otherwise, the key index threshold [Figure BDA0003405376390000038] is updated to [Figure BDA0003405376390000039], and the temporary value [Figure BDA00034053763900000310] is then updated to the new [Figure BDA00034053763900000311]. After the update, determine whether t is greater than the number of nodes n; if so, the occupied node sequence $S_r(t)$ obtained at this time is the final occupied node sequence, and the method proceeds to step 6; otherwise, set t = t + 1 and return to step 3; α is an update parameter, with α > 1;
Step 6: convert all nodes back to the unoccupied state, and then convert the nodes to the occupied state one by one in the internal order of the occupied node sequence $S_r(t)$; calculate the sequence parameter $G_a(q)$ each time a node state is converted; when the sequence parameter first rises from 0 to a non-zero constant, record the unoccupied node proportion q at that moment as the unoccupied node proportion threshold $q_c$; $q_c$ represents the minimum proportion of nodes that must be removed to control virus propagation, and the smaller its value, the smaller the proportion of nodes that must be removed; when all nodes in the occupied node sequence $S_r(t)$ have been converted to the occupied state, calculate the network toughness F, which represents the virus propagation control effect: the smaller the value of F, the better the control effect.
Specifically, for the unoccupied node set $N_r(t)$ and the occupied node set $N_o(t)$ described in step 2, at any time their intersection is empty and their union is the point set N; for the unoccupied edge set $M_r(t)$ and the occupied edge set $M_o(t)$, at any time their intersection is empty and their union is the edge set M.
Specifically, the objective function ψ(u) described in step 3 is set to:
Figure BDA0003405376390000041
wherein, for a node u in the candidate node set $N_c(t)$, ψ(u) is the objective function value and I(u) is the key index value of node u; [Figure BDA0003405376390000042] is a function of node u, and for every node u satisfying [Figure BDA0003405376390000043], [Figure BDA0003405376390000044] is set to the same finite number.
Specifically, the key index I in step 4 is set as the external degree of the node, calculated as:
Figure BDA0003405376390000045
wherein [Figure BDA0003405376390000046] denotes the external degree of node u; c(u) is the connected component containing node u after u is converted to the occupied state, and v denotes a node in c(u); $k_v$ is the degree of node v in the initial network G(N, M), and $k_v'$ is the degree of node v in the occupied network $G(N_o(t), M_o(t))$; the occupied network $G(N_o(t), M_o(t))$ is the network formed, at time t, by the nodes in the occupied node set $N_o(t)$ and the edges in the occupied edge set $M_o(t)$, connected according to the structure of the initial network G(N, M); a connected component is a sub-network of the virus propagation network; the degree of a node is the number of edges connected to it.
Specifically, the key index I in step 4 is set as the external propagation probability of the node, calculated as:
Figure BDA0003405376390000047
wherein [Figure BDA0003405376390000048] denotes the external propagation probability of node u; c(u) is the connected component containing node u after u is converted to the occupied state, and v denotes a node in c(u); Γ(v) denotes the set of unoccupied neighbor nodes of node v; w denotes a node in the set Γ(v); $\beta_{vw}$ denotes the edge weight between nodes v and w.
Specifically, the internal order of the occupied node sequence $S_r(t)$ in step 6 is the optimal order in which nodes are occupied as virus propagation goes from controlled to uncontrolled.
Specifically, the sequence parameter $G_a(q)$ described in step 6 is calculated as:
Figure BDA0003405376390000049
wherein q is the proportion of unoccupied nodes in the network, $c''_{\max}$ is the maximum connected component, and $|c''_{\max}|$ is the number of nodes it contains; the maximum connected component is the sub-network with the largest number of nodes when the proportion of unoccupied nodes in the network is q.
Specifically, the calculation formula of the network toughness F in step 6 is as follows:
Figure BDA0003405376390000051
The beneficial effects of the invention are as follows: by using a seepage process that occupies nodes one after another together with a specific objective function design, the method can decompose large-scale networks quickly and with high efficiency, so that virus propagation is controlled effectively and in time and the loss it causes is reduced; the invention achieves network decomposition while removing fewer nodes and keeping the scale of virus propagation smaller, so fewer resources are consumed, virus propagation can be controlled when resources are scarce, and in theory the network is better protected at lower protection cost; the method has low time and space complexity and high computational efficiency and can respond quickly to sudden virus propagation events; the method performs well on large-scale network data and is suited to the network decomposition problem in large-scale networks.
Drawings
FIG. 1 is a flow chart of the virus propagation control method based on the bounded seepage-greedy algorithm;
FIG. 2 is a schematic of the external degree of a node of the present invention;
FIG. 3 shows how the sequence parameter changes with the proportion of unoccupied nodes for different methods in four networks: (a) the Power network, (b) the loc-Gowalla network, (c) the twitter-L network, and (d) the as-Skitter network;
FIG. 4 shows the computation times of different methods in different networks.
Detailed Description
The invention is further described below with reference to the drawings and embodiments; the invention includes, but is not limited to, the following embodiments.
As shown in fig. 1, the present invention provides a virus propagation control method based on bounded seepage-greedy algorithm, which is implemented as follows:
Step 1: input the data of the population involved in virus propagation, including individual information, the number of individuals, the connections among individuals, and the probability of propagating the virus; construct the virus propagation network G(N, M) corresponding to these data, taking the individuals as nodes, the connections among individuals as edges, and the probability of propagating the virus between individuals as the edge weights, where N is the point set of the network, M is the edge set, and the edge weight between nodes v and w is $\beta_{vw}$.
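As a rough illustration of step 1, the weighted network can be held in a nested dictionary. The sketch below is only one possible representation; the function name `build_network`, the input layout (a list of individuals and a list of `(v, w, beta_vw)` contact triples), and the decision to skip self-contacts are assumptions, not part of the patent text.

```python
def build_network(individuals, contacts):
    """Build a weighted, undirected adjacency map for G(N, M).

    individuals: iterable of node labels (the point set N)
    contacts:    iterable of (v, w, beta_vw) triples (the edge set M with weights)
    """
    adj = {v: {} for v in individuals}
    for v, w, beta in contacts:
        if v == w:
            continue  # self-contacts are skipped (an assumption of this sketch)
        if not 0.0 <= beta <= 1.0:
            raise ValueError(f"beta for ({v}, {w}) must be a probability")
        # Store the edge in both directions because the network is undirected.
        adj.setdefault(v, {})[w] = beta
        adj.setdefault(w, {})[v] = beta
    return adj


# Hypothetical data: four individuals, two contacts, one isolated individual.
adj = build_network(["a", "b", "c", "d"], [("a", "b", 0.3), ("b", "c", 0.1)])
print(adj["b"])
```

Here `adj[v][w]` plays the role of the edge weight between v and w, and an individual with no contacts stays in the point set with an empty neighbor map.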
Step 2: initialize all nodes in the virus propagation network to the unoccupied state, forming the unoccupied node set $N_r(t)$; construct the candidate node set $N_c(t)$, which is initially any subset of the unoccupied node set $N_r(t)$ whose number of nodes y satisfies y ≤ n, where n is the number of nodes in the point set N of the initial network; construct the occupied node set $N_o(t)$, initially empty; initialize all edges to the unoccupied state, forming the unoccupied edge set $M_r(t)$; construct the occupied edge set $M_o(t)$, initially empty. At any time, the unoccupied node set $N_r(t)$ and the occupied node set $N_o(t)$ have an empty intersection and their union is the point set N; likewise, the unoccupied edge set $M_r(t)$ and the occupied edge set $M_o(t)$ have an empty intersection and their union is the edge set M.
t denotes a time step after the start of virus propagation control, with t = 0 initially, and the occupied node sequence $S_r(t)$ is initially an empty sequence; the key index threshold [Figure BDA0003405376390000061] is set, initially 1, together with a temporary value [Figure BDA0003405376390000062], initially 1.
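The initial state of step 2 can be sketched as a plain dictionary. The key names (`N_r`, `N_c`, `N_o`, `M_r`, `M_o`, `S_r`, `threshold`, `temp`) mirror the symbols in the text, but the dictionary layout and the function name `init_state` are assumptions chosen for illustration.

```python
def init_state(adj, candidates=None):
    """Initial bookkeeping for step 2 (names mirror the symbols in the text)."""
    nodes = set(adj)
    # Each undirected edge is stored once as a frozenset {v, w}.
    edges = {frozenset((v, w)) for v in adj for w in adj[v]}
    # N_c may be any subset of the unoccupied nodes; default to all of them.
    cand = set(nodes) if candidates is None else set(candidates) & nodes
    return {
        "N_r": set(nodes),   # unoccupied nodes
        "N_c": cand,         # candidate nodes
        "N_o": set(),        # occupied nodes
        "M_r": edges,        # unoccupied edges
        "M_o": set(),        # occupied edges
        "S_r": [],           # occupied node sequence
        "t": 0,
        "threshold": 1.0,    # key index threshold, initially 1
        "temp": 1.0,         # temporary value, initially 1
    }


# Hypothetical three-node path 1 - 2 - 3 with edge weights.
adj = {1: {2: 0.5}, 2: {1: 0.5, 3: 0.2}, 3: {2: 0.2}}
state = init_state(adj)
print(len(state["N_r"]), len(state["M_r"]))
```

The invariants stated in the text (empty intersection, union equal to N or M) hold trivially at this point because both occupied sets are empty.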
Step 3: at time t, select from the candidate node set $N_c(t)$ the node that minimizes the objective function ψ(u) and convert it to the occupied state; if several nodes minimize ψ(u) at the same time, one of them is selected at random and converted to the occupied state; the node is then deleted from the unoccupied node set $N_r(t)$ and the candidate node set $N_c(t)$, added to the occupied node set $N_o(t)$, and appended to the end of the occupied node sequence $S_r(t)$; if two adjacent nodes are both in the occupied state, the edge between them is converted to the occupied state, deleted from the unoccupied edge set $M_r(t)$, and added to the occupied edge set $M_o(t)$.
The objective function ψ(u) is set to:
Figure BDA0003405376390000063
wherein, for a node u in the candidate node set $N_c(t)$, ψ(u) is the objective function value and I(u) is the key index value of node u; [Figure BDA0003405376390000064] is a function of node u, and for every node u satisfying [Figure BDA0003405376390000065], [Figure BDA0003405376390000066] is set to the same finite number.
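The selection and bookkeeping in step 3 can be sketched as follows. Because the exact form of ψ(u) is given only as a formula image, the sketch takes ψ as a caller-supplied function `psi`; the dictionary layout of `state`, the function name `occupy_next`, and the `rng` parameter are likewise assumptions for illustration.

```python
import random


def occupy_next(state, adj, psi, rng=random):
    """Occupy the candidate that minimizes psi, breaking ties at random."""
    values = {u: psi(u) for u in state["N_c"]}
    if not values:
        raise ValueError("candidate set is empty")
    best = min(values.values())
    ties = [u for u, val in values.items() if val == best]
    u = rng.choice(ties)

    # Move u from the unoccupied and candidate sets to the occupied set.
    state["N_r"].discard(u)
    state["N_c"].discard(u)
    state["N_o"].add(u)
    state["S_r"].append(u)

    # An edge becomes occupied once both of its endpoints are occupied.
    for w in adj[u]:
        if w in state["N_o"]:
            edge = frozenset((u, w))
            state["M_r"].discard(edge)
            state["M_o"].add(edge)
    return u


# Hypothetical path 1 - 2 - 3 in which node 1 is already occupied.
adj = {1: {2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
state = {"N_r": {2, 3}, "N_c": {2, 3}, "N_o": {1}, "S_r": [1],
         "M_r": {frozenset((1, 2)), frozenset((2, 3))}, "M_o": set()}
chosen = occupy_next(state, adj, psi=lambda u: {2: 0.0, 3: 5.0}[u])
print(chosen, state["S_r"])
```

Passing `rng=random.Random(seed)` makes the random tie-breaking reproducible when several candidates share the minimum.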
Step 4: determine whether, at time t, the key index I of every node in the candidate node set $N_c(t)$ exceeds the key index threshold [Figure BDA0003405376390000067]; if so, go to step 5; otherwise, set t = t + 1 and return to step 3.
Wherein, key index I can adopt two kinds of setting modes:
(1) The key index I is set as the external degree of the node, calculated as:
Figure BDA0003405376390000071
wherein [Figure BDA0003405376390000072] denotes the external degree of node u; c(u) is the connected component containing node u after u is converted to the occupied state, and v denotes a node in c(u); $k_v$ is the degree of node v in the initial network G(N, M), and $k_v'$ is the degree of node v in the occupied network $G(N_o(t), M_o(t))$; the occupied network $G(N_o(t), M_o(t))$ is the network formed, at time t, by the nodes in the occupied node set $N_o(t)$ and the edges in the occupied edge set $M_o(t)$, connected according to the structure of the initial network G(N, M); a connected component is a sub-network of the virus propagation network; the degree of a node is the number of edges connected to it. As shown in FIG. 2, black solid nodes represent occupied nodes, gray solid nodes and hollow nodes represent unoccupied nodes, solid lines represent occupied edges, dotted lines represent unoccupied edges, and the light gray shaded regions represent connected components. For example, node u, node j, and node w are unoccupied nodes, and node v and node i are occupied nodes; if the unoccupied node u is converted to the occupied state, the connected component c(u) of node u comprises node u and node v, and the external degree of node u calculated by the formula is
Figure BDA0003405376390000073
Similarly, if the unoccupied node j is converted to the occupied state, the connected component of node j is c(j), and the external degree of node j is
Figure BDA0003405376390000074
(2) The key index I is set as the external propagation probability of the node, calculated as:
Figure BDA0003405376390000075
wherein [Figure BDA0003405376390000076] denotes the external propagation probability of node u; c(u) is the connected component containing node u after u is converted to the occupied state, and v denotes a node in c(u); Γ(v) denotes the set of unoccupied neighbor nodes of node v; w denotes a node in the set Γ(v); $\beta_{vw}$ denotes the edge weight between nodes v and w.
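The two key indexes can be sketched in code under explicit assumptions, since both formulas survive only as images. The sketch assumes that the external degree is the sum of $k_v - k_v'$ over the nodes v of c(u), with u counted as occupied, and that the external propagation probability is the sum of $\beta_{vw}$ over v in c(u) and w in Γ(v). The function names and the toy network are hypothetical.

```python
def component_after_occupying(u, adj, occupied):
    """c(u): u together with every occupied node reachable from it."""
    comp, stack = {u}, [u]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in occupied and w not in comp:
                comp.add(w)
                stack.append(w)
    return comp


def external_degree(u, adj, occupied):
    """ASSUMED form: sum over v in c(u) of (k_v - k_v')."""
    comp = component_after_occupying(u, adj, occupied)
    # k_v - k_v' equals the number of neighbors of v that stay unoccupied,
    # i.e. the neighbors that lie outside c(u).
    return sum(1 for v in comp for w in adj[v] if w not in comp)


def external_propagation(u, adj, occupied):
    """ASSUMED form: sum of beta_vw over v in c(u) and unoccupied neighbors w."""
    comp = component_after_occupying(u, adj, occupied)
    return sum(beta for v in comp for w, beta in adj[v].items() if w not in comp)


# Hypothetical network: v is occupied; u, x, y, z are unoccupied.
adj = {
    "u": {"v": 0.5, "y": 0.1, "z": 0.3},
    "v": {"u": 0.5, "x": 0.2},
    "x": {"v": 0.2},
    "y": {"u": 0.1},
    "z": {"u": 0.3},
}
print(external_degree("u", adj, occupied={"v"}))
print(external_propagation("u", adj, occupied={"v"}))
```

In this toy case c(u) = {u, v}, and the edges leaving the component are v-x, u-y, and u-z. Both indexes therefore measure how strongly the merged component would still be attached to the unoccupied part of the network.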
Step 5: if, between the last update of the key index threshold [Figure BDA0003405376390000077] and the current time t, at least one node in the network has been selected and converted to the occupied state, the key index threshold [Figure BDA0003405376390000078] is updated to $\alpha \times \min_{I} I$, and the temporary value [Figure BDA0003405376390000079] is then updated to the new [Figure BDA00034053763900000710]; otherwise, the key index threshold [Figure BDA00034053763900000711] is updated to [Figure BDA00034053763900000712], and the temporary value [Figure BDA00034053763900000713] is then updated to the new [Figure BDA00034053763900000714]. After the update, determine whether t is greater than the number of nodes n; if so, the occupied node sequence $S_r(t)$ obtained at this time is the final occupied node sequence, and the method proceeds to step 6; otherwise, set t = t + 1 and return to step 3; α is an update parameter, with α > 1.
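The threshold update of step 5 can be sketched as a small function. The first branch follows the text (α times the minimum key index). The second branch is given only as a formula image, so the sketch assumes it scales the stored temporary value by α; that choice, the function name `update_threshold`, and its parameters are all assumptions rather than the patent's definition.

```python
def update_threshold(temp, candidate_indices, alpha, occupied_since_update):
    """Return (new_threshold, new_temp) for one pass through step 5."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if occupied_since_update and candidate_indices:
        # Branch stated in the text: alpha times the minimum key index.
        new_threshold = alpha * min(candidate_indices)
    else:
        # ASSUMPTION: the source formula is an image; here the stored
        # temporary value is simply scaled by alpha.
        new_threshold = alpha * temp
    # The temporary value is then set to the new threshold.
    return new_threshold, new_threshold


# Hypothetical key index values of the current candidates.
threshold, temp = update_threshold(1.0, [6, 9], alpha=2.0, occupied_since_update=True)
print(threshold, temp)
```

Either branch raises the threshold, so after the update at least some candidates can satisfy the bound again and step 3 can continue.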
Step 6: convert all nodes back to the unoccupied state, and then convert the nodes to the occupied state one by one in the internal order of the occupied node sequence $S_r(t)$; the internal order of $S_r(t)$ is the optimal order in which nodes are occupied as virus propagation goes from controlled to uncontrolled.
Each time a node state is converted, the sequence parameter $G_a(q)$ is calculated. $G_a(q)$ represents the degree to which virus propagation in the current network has changed from controlled to uncontrolled: the smaller the sequence parameter, the more strongly virus propagation is controlled in the current network. The sequence parameter $G_a(q)$ is calculated as:
Figure BDA0003405376390000081
wherein q is the proportion of unoccupied nodes in the network, $c''_{\max}$ is the maximum connected component, and $|c''_{\max}|$ is the number of nodes it contains; the maximum connected component is the sub-network with the largest number of nodes when the proportion of unoccupied nodes in the network is q, and it represents the size of the infected population spreading the virus.
The process of successively converting nodes to the occupied state can be regarded as the reverse of the seepage process in which nodes are successively removed from the complete network: the two processes select nodes in opposite orders and have opposite effects, and the unoccupied state of a node or edge is the same as the removed state. Nodes come into contact and spread the virus through edges; if some nodes and edges are removed, the current network breaks into several blocks, contact between the nodes in the occupied state and other nodes is cut off, and wide spread of the virus is limited. The proportion q of unoccupied (that is, removed) nodes therefore represents the size of the immunized or isolated population in virus transmission and reflects the strength of the preventive measures.
When node state conversion begins, the proportion q of unoccupied nodes is 1 and the sequence parameter is 0. As the number of occupied nodes increases, q gradually decreases and the sequence parameter gradually increases. When the sequence parameter first rises from 0 to a non-zero constant, the maximum connected component appears and the virus begins to spread widely; the unoccupied node proportion q at that moment is recorded as the unoccupied node proportion threshold $q_c$. $q_c$ represents the minimum proportion of nodes that must be removed to control virus propagation: the smaller $q_c$ is, the smaller the proportion of nodes that must be removed and the smaller the population that must be immunized or isolated.
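Replaying the occupied node sequence and tracking the largest connected component can be done incrementally with a union-find structure. The sketch below assumes $G_a(q) = |c''_{\max}| / n$, which matches the symbol definitions but is not confirmed because the formula is an image. It also introduces a hypothetical tolerance `eps` to decide when the sequence parameter counts as non-zero. The function names are illustrative.

```python
def order_parameter_curve(adj, sequence):
    """Return [(q, G_a(q))] as nodes are occupied in the given order.

    ASSUMED form: G_a(q) = (size of the largest occupied component) / n.
    """
    n = len(adj)
    parent, size = {}, {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    largest = 0
    curve = [(1.0, 0.0)]  # all nodes unoccupied
    for k, u in enumerate(sequence, start=1):
        parent[u], size[u] = u, 1
        for w in adj[u]:
            if w in parent:  # neighbor already occupied: merge components
                ru, rw = find(u), find(w)
                if ru != rw:
                    if size[ru] < size[rw]:
                        ru, rw = rw, ru
                    parent[rw] = ru
                    size[ru] += size[rw]
        largest = max(largest, size[find(u)])
        curve.append((1 - k / n, largest / n))
    return curve


def unoccupied_threshold(curve, eps):
    """q_c: first q at which G_a(q) exceeds the hypothetical tolerance eps."""
    for q, g in curve:
        if g > eps:
            return q
    return 0.0


# Hypothetical path 1 - 2 - 3 - 4 and occupation order.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
curve = order_parameter_curve(adj, [1, 3, 2, 4])
print(curve, unoccupied_threshold(curve, eps=0.5))
```

Adding nodes rather than removing them is what lets the union-find approach work: components only ever merge, so each occupation step costs near-constant amortized time per incident edge.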
When all nodes in the occupied node sequence $S_r(t)$ have been converted to the occupied state, the network toughness F is calculated. The network toughness F represents the virus propagation control effect: the smaller its value, the better the control effect.
The calculation formula of the network toughness F is as follows:
Figure BDA0003405376390000082
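Since the formula for F is available only as an image, the following sketch uses a commonly used robustness-style average as a stand-in: F is taken to be the mean of the sequence parameter over all n occupation steps, with the sequence parameter taken as the largest component size divided by n. Both choices and the function name `network_toughness` are assumptions, not the patent's definition.

```python
def network_toughness(largest_component_sizes, n):
    """ASSUMED form: F = (1/n) * sum over the n steps of |c_max| / n.

    largest_component_sizes[k] is the size of the largest occupied component
    after the (k+1)-th node of the occupied node sequence has been occupied.
    """
    if n <= 0 or len(largest_component_sizes) != n:
        raise ValueError("expected exactly one component size per node")
    return sum(size / n for size in largest_component_sizes) / n


# Hypothetical 4-node example: largest component sizes after each step.
print(network_toughness([1, 1, 3, 4], n=4))
```

Under this assumed form, an occupation order that keeps the largest component small for as long as possible yields a smaller F, which agrees with the statement that a smaller F means better control.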
To verify the effectiveness of the method of the invention, experiments were performed on virus propagation networks whose parameters are shown in Table 1.
TABLE 1
Data set          Number of nodes    Number of edges
Yeast             2375               11693
Power             4941               6594
p2p-Gnutella08    6301               20777
CA-AstroPh        18771              198050
Email-Enron       36692              183831
loc-Gowalla       196591             950327
twitter-L         532325             694606
web-Google        875713             4322051
PAroad            1088092            1541898
Flickr            1624991            15473043
as-Skitter        1696415            11095298
LiveJournal       3997962            34681189
Five common network decomposition methods, namely HD (High Degree), AHD (Adaptive High Degree), AMSRGS (Min-Sum and Reverse-Greedy Strategy), GND (Generalized Network Dismantling), and FINDER (FInding key players in Networks through DEep Reinforcement learning), are compared with the methods of the present invention, BPG-I (Bounded-Percolation Greedy-I, the bounded seepage-greedy method whose key index is the external degree) and BPG-II (Bounded-Percolation Greedy-II, the bounded seepage-greedy method whose key index is the external propagation probability). Table 2 gives the network toughness F values obtained by the different methods on the different networks, and Table 3 gives the unoccupied node proportion thresholds $q_c$ obtained by the different methods on the different networks.
TABLE 2
Figure BDA0003405376390000091
Figure BDA0003405376390000101
TABLE 3
Figure BDA0003405376390000102
As can be seen from Table 2, the network toughness F of the BPG-I method is 30% or more lower than that of the HD, AHD, and FINDER methods and more than 20% lower than that of the AMSRGS and GND methods, while it is about 5% higher than that of the BPG-II method. The BPG-I and BPG-II methods of the present invention therefore control virus propagation better than the other methods, and the BPG-II method performs slightly better than the BPG-I method.
As can be seen from Table 3, the unoccupied-node proportion threshold q_c of the BPG-I method is more than 40% lower than that of the HD, AHD, GND, and FINDER methods. Compared with the AMSRGS and BPG-II methods, the q_c values of the BPG-I method are similar overall; on the PAroad and as-Skitter networks, the q_c of the BPG-I method is more than 10% lower than that of the AMSRGS method. Compared with the other methods, the BPG-I and BPG-II methods of the invention therefore need to remove a smaller proportion of nodes to control virus propagation, which corresponds to a smaller immunized or isolated population.
In Tables 2 and 3, "-" indicates that the computation took too long or exceeded the memory limit. On large-scale networks such as Flickr and LiveJournal, the GND method cannot complete network decomposition within the computation time and memory limits; the AMSRGS and FINDER methods have the same problem on the LiveJournal network. The BPG-I and BPG-II methods complete network decomposition on large-scale networks such as Flickr and LiveJournal; compared with the HD and AHD methods, their network toughness F is more than 10% lower and their unoccupied-node proportion threshold q_c is more than 20% lower. The present invention therefore performs well on large-scale networks.
FIG. 3 shows how the order parameter obtained by the different methods varies with the proportion of removed nodes on the Power, loc-Gowalla, twitter-L, and as-Skitter networks. The abscissa q is the proportion of unoccupied nodes in the network, and the ordinate G_a(q) is the order parameter. HD is the degree centrality method, AHD is the adaptive degree centrality method, FINDER is the method that finds key nodes through deep reinforcement learning, BPG-I is the method that uses the external degree of a node as the key index, and BPG-II is the method that uses the external propagation probability as the key index.
For a given removed-node proportion q, the order parameter G_a(q) obtained by the invention tends to be smaller, so virus propagation is controlled to a greater extent. For example, on the Power network with q = 0.03, the BPG-I method gives G_a(q) = 0.0820, whereas the FINDER method gives G_a(q) = 0.7758. In practical terms, when the same proportion of individuals is immunized or isolated, using the invention leaves a smaller infected population and controls virus transmission to a greater degree.
Conversely, for a given order parameter G_a(q), the invention requires a smaller removed-node proportion q. For example, on the loc-Gowalla network with G_a(q) = 0.01, the removed-node proportion q is 0.1354 for the BPG-I method and 0.1919 for the FINDER method. In practical terms, the invention needs to immunize or isolate a smaller proportion of individuals to achieve the same degree of control over virus transmission.
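To put the two examples above on a common scale, the short calculation below expresses each BPG-I value as a relative reduction with respect to the corresponding FINDER value. Only the four numbers quoted in the text are used; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


# Power network, q = 0.03: order parameter of FINDER vs. BPG-I.
g_reduction = relative_reduction(0.7758, 0.0820)

# loc-Gowalla network, G_a(q) = 0.01: removed-node proportion of FINDER vs. BPG-I.
q_reduction = relative_reduction(0.1919, 0.1354)

print(f"order parameter reduced by {g_reduction:.1%}")        # 89.4%
print(f"removed-node proportion reduced by {q_reduction:.1%}")  # 29.4%
```

In these two examples the order parameter is roughly 89% lower at the same removal budget, and the removal budget is roughly 29% lower at the same order parameter.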
FIG. 4 compares the computation time of the different methods on the different networks. The abscissa lists the networks: Yeast, Power, p2p (p2p-Gnutella08), CA (CA-AstroPh), Email (Email-Enron), loc (loc-Gowalla), twitter (twitter-L), web (web-Google), PAroad, Flickr, as (as-Skitter), and live (LiveJournal); the ordinate is the computation time. In the legend, AMSRGS is the min-sum and reverse-greedy method.
Compared with the AMSRGS and FINDER methods, the BPG-I and BPG-II methods of the invention significantly reduce computation time and improve efficiency. For example, on the as-Skitter network, the BPG-I method is more than 1500 times faster than the AMSRGS method and more than 70 times faster than the FINDER method. The method is computationally efficient and can respond quickly to sudden virus propagation events.
In conclusion, the method achieves network decomposition with fewer removed nodes and a smaller virus propagation scale; when resources are scarce, using the method to control virus propagation can in theory protect the network better. The method has low time and space complexity and high computational efficiency, performs well on large-scale network data, and is suitable for the network decomposition problem in large-scale networks.

Claims (8)

1. A virus propagation control method based on a bounded seepage-greedy algorithm, characterized by comprising the following steps:
Step 1: input the data of the population involved in virus transmission, including individual information, the number of individuals, the contacts among individuals, and the probabilities of transmitting the virus; taking the individuals as nodes, the contacts among individuals as edges, and the probability of virus transmission between individuals as the edge weight, construct the virus propagation network G(N, M) corresponding to the population data, where N is the point set of the network, M is the edge set, and β_vw is the weight of the edge between nodes v and w;
Step 2: initialize all nodes in the virus propagation network to the unoccupied state, forming the unoccupied node set N_r(t); construct the candidate node set N_c(t), which is initially any subset of the unoccupied node set N_r(t) whose number of nodes y satisfies y ≤ n, where n is the number of nodes in the point set N of the initial network; construct the occupied node set N_o(t), which is initially empty; initialize all edges to the unoccupied state, forming the unoccupied edge set M_r(t); construct the occupied edge set M_o(t), which is initially empty; t denotes each time after virus propagation control starts, with t = 0 initially, and the occupied node sequence S_r(t) is initially an empty sequence; set the key index threshold
Figure FDA0003405376380000011
to an initial value of 1, and set the temporary value
Figure FDA0003405376380000012
to an initial value of 1;
Step 3: at time t, select from the candidate node set N_c(t) the node that minimizes the objective function ψ(u) and convert it to the occupied state; if several nodes minimize ψ(u) simultaneously, randomly select one of them to convert to the occupied state; then delete the node from the unoccupied node set N_r(t) and the candidate node set N_c(t), add it to the occupied node set N_o(t), and append it to the end of the occupied node sequence S_r(t); if two adjacent nodes are both in the occupied state, convert the edge between them to the occupied state, delete the edge from the unoccupied edge set M_r(t), and add it to the occupied edge set M_o(t);
Step 4: determine whether, at time t, the key index I of every node in the candidate node set N_c(t) exceeds the key index threshold
Figure FDA0003405376380000013
If so, go to step 5; otherwise, set t = t + 1 and return to step 3;
and 5: if the key index threshold value is updated from the last time
Figure FDA0003405376380000014
At the current moment t, at least one node is selected from the network and is converted into an occupied state, and then a key index threshold value is set
Figure FDA0003405376380000015
Updated to alpha x minI I and then temporarily stored
Figure FDA0003405376380000016
Is updated to be new
Figure FDA0003405376380000017
Otherwise, the critical index threshold
Figure FDA0003405376380000018
Is updated to
Figure FDA0003405376380000019
Temporarily storing the value again
Figure FDA00034053763800000110
Is updated to be new
Figure FDA00034053763800000111
After the updating is finished, judging whether t is larger than the number n of nodes, if so, using the occupied node sequence S obtained at the momentr(t) is the final occupied node sequence, go to step 6, otherwise, t equals t +1, return to step 3; the alpha is an updating parameter, and alpha is more than 1;
Step 6: reset all nodes to the unoccupied state, and convert the nodes to the occupied state one by one in the order given by the occupied node sequence S_r(t); calculate the order parameter G_a(q) each time a node state is converted; when the order parameter first increases from 0 to a non-zero constant, record the unoccupied-node proportion q at that moment as the unoccupied-node proportion threshold q_c, which represents the minimum proportion of nodes that must be removed to control virus propagation, a smaller value meaning that a smaller proportion of nodes must be removed; when all nodes in the occupied node sequence S_r(t) have been converted to the occupied state, calculate the network toughness F, which represents the virus propagation control effect, a smaller value of F meaning a better control effect.
2. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: at any time, the unoccupied node set N_r(t) and the occupied node set N_o(t) described in step 2 have an empty intersection and their union is the point set N; at any time, the unoccupied edge set M_r(t) and the occupied edge set M_o(t) have an empty intersection and their union is the edge set M.
3. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the objective function ψ (u) described in step 3 is set to:
Figure FDA0003405376380000021
wherein, for a node u in the candidate node set N_c(t), ψ(u) is the objective function value and I(u) is the key index value of node u;
Figure FDA0003405376380000022
is a function of node u; for any node u satisfying
Figure FDA0003405376380000023
the value of
Figure FDA0003405376380000024
is set to the same finite number.
4. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the key index I in the step 4 is set as the external degree of the node, and the calculation formula is as follows:
Figure FDA0003405376380000025
wherein
Figure FDA0003405376380000026
is the external degree of node u; c(u) is the connected component to which node u belongs after it is converted to the occupied state, and v is a node in the connected component c(u); k_v is the degree of node v in the initial network G(N, M), and k'_v is the degree of node v in the occupied network G(N_o(t), M_o(t)); the occupied network G(N_o(t), M_o(t)) is the network formed, at time t, by the nodes in the occupied node set N_o(t) and the edges in the occupied edge set M_o(t), connected according to the structure of the initial network G(N, M); a connected component is a sub-network of the virus propagation network; the degree of a node is the number of edges connected to the node.
5. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the key index I in the step 4 is set as the external propagation probability of the node, and the calculation formula is as follows:
Figure FDA0003405376380000031
wherein
Figure FDA0003405376380000032
is the external propagation probability of node u; c(u) is the connected component to which node u belongs after it is converted to the occupied state, and v is a node in the connected component c(u); Γ(v) is the set of unoccupied neighbor nodes of node v; w is a node in the set Γ(v); β_vw is the weight of the edge between nodes v and w.
6. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the internal order of the occupied node sequence S_r(t) described in step 6 is the optimal order of occupying nodes for the transition of virus propagation from controlled to uncontrolled.
7. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the calculation formula of the order parameter G_a(q) described in step 6 is:
Figure FDA0003405376380000033
wherein q is the proportion of unoccupied nodes in the network, c_max is the maximum connected component, and |c_max| is the number of nodes contained in the maximum connected component; the maximum connected component is the sub-network with the largest number of nodes when the proportion of unoccupied nodes in the network is q.
8. The virus propagation control method based on the bounded seepage-greedy algorithm as claimed in claim 1, wherein: the calculation formula of the network toughness F in the step 6 is as follows:
Figure FDA0003405376380000034
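The two key indices defined in claims 4 and 5 can be sketched as follows. The formulas themselves appear only as images, so the summation forms used here are inferred from the symbol definitions and should be read as assumptions: the external degree is taken as the sum of (k_v - k'_v) over the nodes v of c(u), and the external propagation probability as the sum of β_vw over v in c(u) and w in Γ(v). The function names and the dictionary-of-sets graph representation are hypothetical.

```python
from collections import deque


def component_of(u, occupied, neighbors):
    """Occupied connected component that node u would belong to once occupied."""
    seen = {u}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w in occupied and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def external_degree(u, occupied, neighbors):
    """Assumed form of the claim 4 index: sum over v in c(u) of (k_v - k'_v)."""
    inside = occupied | {u}              # occupied set with u tentatively added
    total = 0
    for v in component_of(u, occupied, neighbors):
        k_v = len(neighbors[v])                               # degree in G(N, M)
        k_occ = sum(1 for w in neighbors[v] if w in inside)   # degree in occupied network
        total += k_v - k_occ
    return total


def external_propagation_probability(u, occupied, neighbors, beta):
    """Assumed form of the claim 5 index: sum of beta_vw over unoccupied neighbors w."""
    inside = occupied | {u}
    return sum(
        beta[frozenset((v, w))]
        for v in component_of(u, occupied, neighbors)
        for w in neighbors[v]
        if w not in inside
    )


# Hypothetical five-node network with edges 0-1, 1-2, 2-3, 1-4; node 0 is occupied.
neighbors = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2}, 4: {1}}
beta = {
    frozenset((0, 1)): 0.5,
    frozenset((1, 2)): 0.25,
    frozenset((2, 3)): 0.5,
    frozenset((1, 4)): 0.5,
}
occupied = {0}
```

In this toy network, occupying node 1 would merge it with node 0 and leave two edges pointing out of the component (to nodes 2 and 4), so its external degree is 2 and its external propagation probability is 0.25 + 0.5 = 0.75; occupying the leaf node 3 would give an external degree of 1. Under these assumed forms, both indices measure how strongly the newly formed component is still connected to unoccupied nodes.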
CN202111518210.6A 2021-12-10 2021-12-10 Virus propagation control method based on bounded seepage-greedy algorithm Pending CN114242261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111518210.6A CN114242261A (en) 2021-12-10 2021-12-10 Virus propagation control method based on bounded seepage-greedy algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111518210.6A CN114242261A (en) 2021-12-10 2021-12-10 Virus propagation control method based on bounded seepage-greedy algorithm

Publications (1)

Publication Number Publication Date
CN114242261A true CN114242261A (en) 2022-03-25

Family

ID=80755265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111518210.6A Pending CN114242261A (en) 2021-12-10 2021-12-10 Virus propagation control method based on bounded seepage-greedy algorithm

Country Status (1)

Country Link
CN (1) CN114242261A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049002A (en) * 2022-06-15 2022-09-13 重庆理工大学 Complex network influence node identification method based on reverse generation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination