CN110019981B - Directed super-edge propagation method integrating unsupervised learning and network out-degree - Google Patents
- Publication number
- CN110019981B
- Authority
- CN
- China
- Prior art keywords
- edge
- node
- directed
- super
- propagation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention relates to a directed hyperedge propagation method integrating unsupervised learning and network out-degree, which comprises the following steps: discovering a series of undirected hyperedges connecting multiple vertices from the network node relationships using an unsupervised learning algorithm; for each undirected hyperedge, mining directed hyperedge relationships, including forward hyperedges and backward hyperedges, until all undirected hyperedges have been traversed; sorting the elements of the front piece vertex set of each directed hyperedge relationship by out-degree and selecting seed nodes in descending order; and, starting from the selected seed nodes, propagating network information with a linear threshold propagation algorithm for the directed hypergraph. While preserving scalability, the invention uses directed hyperedges to select seed nodes and, on that basis, selects the node with the largest out-degree for propagation, thereby improving coverage and propagation efficiency.
Description
Technical Field
The invention relates to the field of social computing and media mining, in particular to a directed super-edge propagation method integrating unsupervised learning and network outdegree.
Background
With the rapid development of internet technology, more and more online social networks are emerging. In these networks, individuals interact with one another through the social network, and information, viewpoints, and influence propagate among them. As big-data research becomes more widespread, influence propagation in social networks has become one of the key issues in data mining and social network analysis. In the field of social networks, the influence maximization problem is as follows: given a propagation model, select a set of seed nodes such that, when information propagates from this set, the coverage of activated nodes in the network is maximized. The goal of influence maximization is to achieve maximum propagation coverage in the shortest time and with the fewest seed nodes. On today's internet, influence propagation is mainly embodied in information propagation, so analyzing the factors that affect information propagation is important for improving influence propagation models. These factors can be regarded as the factors that affect the node activation probability in the influence propagation model.
The most widely used propagation models at present are the independent cascade model and the linear threshold model. The independent cascade model treats an active node as a publisher and a node being activated as a recipient: the publisher activates the recipient. The independent cascade model is therefore publisher-centric: a node can only affect nodes directly connected to it, and once a node is activated it attempts to activate all of its neighboring nodes. In the linear threshold model, each node has an influence threshold that is chosen uniformly at random from the range 0 to 1 and does not change during propagation once determined. As in the independent cascade model, at time t = 0 only the nodes in the seed set $S_0$ are active. At each subsequent time t ≥ 1, each inactive node v determines whether it becomes active according to whether the linear weighted sum of the influence of all its active neighbors has reached its threshold: if so, node v is activated at time t; otherwise, node v remains inactive. The propagation process ends when no new node is activated at some time step.
For the whole propagation process, the selection of the seed node is the basis of propagation, because the selection result of the seed node directly affects the final effect of propagation, including the coverage rate and the propagation time. Currently, common seed node selection methods include node degree-based heuristic algorithms, greedy algorithms, distance-based heuristic algorithms, random algorithms and the like.
Let S be the initial set of active nodes, and let f(S) denote the number of nodes that are finally active when propagation starts from the nodes in S as seed nodes. Taking the greedy algorithm as an example, the set S is first initialized to empty; then, each time a node is added, all nodes are traversed and the node v with the largest marginal gain f(S ∪ {v}) - f(S) is added to S. Because every addition requires a traversal of all nodes, the time complexity is high; moreover, the greedy algorithm does not consider the topological structure of the graph. These are its limitations.
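The greedy selection described above can be sketched as follows. This is a minimal illustration, not the method of the invention: the function name `greedy_seeds`, the toy graph `reach`, and the spread function `coverage` (a seed covers itself and its direct out-neighbours) are all assumptions introduced here, standing in for f(S).

```python
from typing import Callable, Hashable, Iterable, Set


def greedy_seeds(nodes: Iterable[Hashable],
                 f: Callable[[Set[Hashable]], float],
                 k: int) -> Set[Hashable]:
    """Pick up to k seeds, each time adding the node with the largest marginal gain."""
    candidates = list(nodes)
    seeds: Set[Hashable] = set()
    for _ in range(k):
        base = f(seeds)
        best, best_gain = None, float("-inf")
        # Every addition scans all remaining nodes, which is why greedy is slow.
        for v in candidates:
            if v in seeds:
                continue
            gain = f(seeds | {v}) - base  # marginal gain f(S ∪ {v}) - f(S)
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            break
        seeds.add(best)
    return seeds


# Hypothetical toy data: node -> set of nodes it reaches directly.
reach = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"e"}, "e": set()}


def coverage(S: Set[Hashable]) -> int:
    """Assumed stand-in for f(S): seeds plus their direct out-neighbours."""
    covered = set(S)
    for s in S:
        covered |= reach[s]
    return len(covered)


chosen = greedy_seeds(reach, coverage, 2)
```

In a real setting f(S) would be estimated by simulating the propagation model, which makes each of the inner-loop evaluations far more expensive than in this toy.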
The node-degree-based heuristic algorithm selects the k nodes with the highest degree as the initial active nodes. Its time complexity is much lower than that of the greedy algorithm, and when it is run multiple times, selecting k seed nodes each time, the selected seed nodes are relatively fixed, so the propagation results fluctuate little, whereas those of the greedy algorithm fluctuate relatively strongly. However, because the algorithm only ever selects the nodes with the highest degree, the information carried by the other nodes is ignored. The simpler random algorithm selects several nodes at random from the original node set as seed nodes; because it involves many uncertain factors and high randomness, it is generally not used to select seed nodes. When propagation uses the node-degree-based heuristic, the seed set S simply consists of the k highest-degree nodes without regard to the topological structure of the graph, so when most of the high-degree nodes belong to the same group, coverage is relatively low.
To solve these problems, the topological structure of the relationships between users must be introduced in order to improve the quality of the seed nodes, and thereby coverage and propagation efficiency, so that information spreads as widely as possible in as short a time as possible. To this end, a common seed node selection method is combined with association analysis: strongly associated nodes in a large-scale data set are found and formed into a hypergraph. Selecting nodes from each hypergraph as seed nodes can improve coverage. Commonly used association analysis algorithms include the brute-force method, the $F_{k-1} \times F_1$ algorithm, and the Apriori algorithm. In the $F_{k-1} \times F_1$ algorithm, each k-edge is formed by combining a frequent (k-1)-edge with a 1-edge; the k-edges whose support is below the minimum support are then pruned, and the process is repeated until no new edge is generated. Its time complexity is slightly lower than that of the Apriori algorithm, but it is difficult to avoid generating duplicate candidate edges. The conventional Apriori algorithm, however, is not suitable for directed graphs: it finds only undirected hyperedges in user forwarding data. It is therefore necessary to widen the application range of the conventional method.
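A minimal sketch of the degree-based heuristic, assuming the graph is given as an adjacency dictionary; the name `top_k_by_degree`, the tie-breaking by node label, and the toy graph are assumptions made for illustration only.

```python
from typing import Dict, Hashable, List, Set


def top_k_by_degree(adj: Dict[Hashable, Set[Hashable]], k: int) -> List[Hashable]:
    """Return the k nodes with the highest degree."""
    # Sort by degree, largest first; ties are broken by label (an assumption)
    # so that repeated runs select the same seeds.
    return sorted(adj, key=lambda v: (-len(adj[v]), str(v)))[:k]


# Hypothetical toy graph: "a", "b" and "c" form one tightly connected group.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}

seeds = top_k_by_degree(adj, 2)
```

Both selected seeds here lie in the same group, which is the situation in which the text notes that coverage suffers.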
Disclosure of Invention
The invention aims, while preserving scalability, to solve the problems that existing network information propagation methods spread information neither quickly enough nor widely enough.
In order to achieve the above object, the present invention provides a directed hyper-edge propagation method integrating unsupervised learning and network out-degree, which comprises the following steps:
discovering a series of undirected hyperedges connecting multiple vertices from the network node relationships using an unsupervised learning algorithm; for each undirected hyperedge, mining directed hyperedge relationships, including forward hyperedges and backward hyperedges, until all undirected hyperedges have been traversed; sorting the elements of the front piece vertex set of each directed hyperedge relationship by out-degree and selecting seed nodes in descending order; and, starting from the selected seed nodes, propagating network information with a linear threshold propagation algorithm for the directed hypergraph.
Preferably, the step of sorting the elements of the front piece vertex set in the directed hyperedge relationship by out-degree and selecting seed nodes in descending order includes: within one directed hyperedge relationship, preferentially selecting only the one node with the largest out-degree as a seed node.
Preferably, the step of discovering a series of undirected hyperedges connecting multiple vertices from network node relationships using an unsupervised learning algorithm comprises: generating an edge list of the forwarding pairs formed by all propagation content providers and their corresponding propagation content subscribers, pruning the forwarding pairs whose support is less than a given minimum support, and recording the pruned hyperedge list; recombining the undirected hyperedges in the set remaining after pruning in pairs to generate new edges; pruning the new combinations whose support is less than the given minimum support, and recording the hyperedge list after this pruning; wherein the recombination rule is that the first 1/2 of the elements of the two sets are combined to form the first 1/2 of the elements of the first new set, and the last 1/2 of the elements are combined to form the last 1/2 of the elements of the second new set; and repeating these steps until the set remaining after pruning is empty.
Preferably, the given minimum support has a value of 0.25.
Preferably, the step of mining the directed hyperedge relationships from the undirected hyperedge list comprises: selecting any undirected hyperedge, denoted $\{trans_1, trans_2, \dots, trans_k\}$, and selecting any one node (denoted $trans_m$) to create a relationship list (denoted Rlist) of backward hyperedges (the back piece vertex set has 1 element), and calculating the confidence of the hyperedge according to a formula [formula not reproduced in the source text]; repeating this step until the undirected hyperedge is empty, and deleting the relationships whose confidence is less than the given minimum confidence; merging the relationships that were not deleted according to a given rule to obtain a new relationship list; wherein the given merging rule is to compare the front pieces of the relationships in Rlist and, if there are two relationships (denoted R_a and R_b) whose front pieces differ by only one node, and the element in which the front piece of R_a differs from the front piece of R_b is exactly the element in which the back piece of R_b differs from the back piece of R_a, to merge the two relationships; the front piece of the new relationship consists of the elements common to the front pieces of R_a and R_b, and its back piece is the union of the back pieces of R_a and R_b; until the entire hyperedge list has been traversed.
Preferably, the step of propagating network information from the selected seed nodes with a linear threshold propagation algorithm for the directed hypergraph comprises: directly activating the seed node set Already, and randomly assigning a threshold $\theta_u$ to each remaining node u, wherein $\theta_u$ must lie in the range [0, 1]; the larger $\theta_u$ is, the harder the node is to activate, and the smaller $\theta_u$ is, the easier the node is to activate; letting N(v) be the set of neighbor nodes of the current node v in the directed hypergraph and, for any node w ∈ N(v), defining $b_{vw}$ as the degree of influence of node w on node v, satisfying $\sum_{w \in N(v)} b_{vw} \le 1$; for any inactive node v, when the joint influence of its active neighbors is greater than the randomly assigned threshold, i.e. $\sum_{w \in N(v),\ w\ \text{active}} b_{vw} > \theta_v$, the node is activated; and, during network information propagation, repeating these steps until no new active node appears, at which point propagation ends.
Preferably, the unsupervised learning algorithm includes K-means clustering, Apriori, and FP-growth.
Compared with the prior art, the invention, while preserving scalability, uses directed hyperedges to select seed nodes and, on that basis, selects the node with the largest out-degree for propagation, thereby improving coverage and propagation efficiency. By introducing an unsupervised algorithm, the invention discovers undirected hyperedges and from them directed hyperedges, extending the application range of hyperedges from undirected graphs to directed graphs.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below.
Fig. 1 is a flowchart of a directed hyper-edge propagation method for fusing unsupervised learning and network out-degree according to an embodiment of the present invention;
fig. 2 is an application diagram of the directed hyper-edge propagation method for merging unsupervised learning and network out-degree shown in fig. 1.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, concepts related to the present invention are described as follows.
A hypergraph is a generalization of a graph: a hyperedge in a hypergraph can connect any number of vertices, whereas an ordinary edge connects only two vertices. A directed hypergraph is obtained by adding a direction to the hyperedges of an undirected hypergraph; the direction represents the order among the vertices of a hyperedge and divides its vertex set into a front piece vertex set and a back piece vertex set. A forward hyperedge is a directed hyperedge whose front piece vertex set has exactly 1 element; a backward hyperedge is a directed hyperedge whose back piece vertex set has exactly 1 element (as shown in Fig. 2).
Based on these concepts, in the embodiment of the invention the front piece vertex sets and back piece vertex sets of the directed hyperedge relationships together form a directed hypergraph. In this hypergraph, front piece vertices and back piece vertices are connected through directed hyperedges, including forward hyperedges and backward hyperedges; there is no hyperedge within a front piece vertex set and none within a back piece vertex set.
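The definitions above can be captured in a small data structure. The following is only a sketch: the class name `DirectedHyperedge`, its field names, and the checks that both vertex sets are non-empty and disjoint are assumptions introduced for illustration and are not stated in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Hashable


@dataclass(frozen=True)
class DirectedHyperedge:
    front: FrozenSet[Hashable]  # front piece vertex set
    back: FrozenSet[Hashable]   # back piece vertex set

    def __post_init__(self) -> None:
        # Assumed sanity checks (not taken from the source text).
        if not self.front or not self.back:
            raise ValueError("both vertex sets must be non-empty")
        if self.front & self.back:
            raise ValueError("front and back vertex sets must be disjoint")

    @property
    def is_forward(self) -> bool:
        """Forward hyperedge: exactly one vertex in the front piece set."""
        return len(self.front) == 1

    @property
    def is_backward(self) -> bool:
        """Backward hyperedge: exactly one vertex in the back piece set."""
        return len(self.back) == 1

    def vertices(self) -> FrozenSet[Hashable]:
        """All vertices connected by this hyperedge."""
        return self.front | self.back


# Hypothetical example: users u1 and u2 jointly lead to u3.
edge = DirectedHyperedge(frozenset({"u1", "u2"}), frozenset({"u3"}))
```

Under these definitions an ordinary directed edge, with one vertex on each side, is both a forward and a backward hyperedge.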
Fig. 1 is a flowchart of a directed hyper-edge propagation method fusing unsupervised learning and network out-degree according to an embodiment of the present invention, where the method includes steps S101 to S104:
In step S101, a series of undirected hyperedges connecting multiple vertices is discovered from the network node relationships using an unsupervised learning algorithm, including K-means clustering, Apriori, and FP-growth.
In step S102, for each undirected hyperedge, directed hyperedge relationships, including forward hyperedges and backward hyperedges, are mined until all undirected hyperedges have been traversed.
In step S103, the elements of the front piece vertex set of each directed hyperedge relationship are sorted by out-degree, and seed nodes are selected in descending order; preferably, within one directed hyperedge relationship, only the one node with the largest out-degree is selected as a seed node.
In step S104, starting from the selected seed nodes, network information is propagated with a linear threshold propagation algorithm for the directed hypergraph.
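Step S103 can be sketched as below, taking the preferred variant in which one seed is chosen per directed hyperedge relationship. The function name `select_seeds`, the representation of a relationship as a (front piece, back piece) pair, the tie-breaking by label, and the sample out-degrees are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Tuple

# A relationship is represented as (front piece set, back piece set).
Relation = Tuple[FrozenSet[Hashable], FrozenSet[Hashable]]


def select_seeds(relations: Iterable[Relation],
                 out_degree: Dict[Hashable, int]) -> List[Hashable]:
    """Pick, for each relationship, the front-piece vertex with the largest out-degree."""
    seeds: List[Hashable] = []
    for front, _back in relations:
        # Rank front-piece vertices by out-degree, largest first.
        # Unknown vertices count as out-degree 0; ties go to the smaller label.
        ranked = sorted(front, key=lambda v: (-out_degree.get(v, 0), str(v)))
        if ranked and ranked[0] not in seeds:
            seeds.append(ranked[0])
    return seeds


# Hypothetical relationships and out-degrees.
relations = [
    (frozenset({"a", "b"}), frozenset({"c"})),
    (frozenset({"b", "d"}), frozenset({"e"})),
]
out_degree = {"a": 5, "b": 2, "d": 7}

seeds = select_seeds(relations, out_degree)
```

Because each relationship contributes at most one seed, the seeds are spread over different hyperedges rather than concentrated in one high-degree group.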
First, in an embodiment of the present invention, the step of discovering undirected hyperedges using an unsupervised algorithm includes:
1. Generate the edge list of forwarding pairs formed by all broadcast content providers (denoted $trans_{ak}$) and their corresponding broadcast content subscribers (denoted $trans_{bk}$), written $[\{trans_{b1}, trans_{a1}\}, \{trans_{b2}, trans_{a2}\}, \dots, \{trans_{bn}, trans_{an}\}]$. Prune the forwarding pairs whose support is less than the given minimum support, and record the pruned hyperedge list as $L_1$. Preferably, the given minimum support is 0.25.
2. Recombine the undirected hyperedges in the set remaining after pruning in pairs to generate new edges. Prune the new combinations whose support is less than the given minimum support, and record the pruned hyperedge list as $L_k$. The recombination rule is that the first 1/2 of the elements of the two sets are combined to form the first 1/2 of the elements of the first new set, and the last 1/2 of the elements are combined to form the last 1/2 of the elements of the second new set.
3. Repeat step 2 until the set remaining after pruning is empty.
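The level-wise discovery above can be sketched in an Apriori-like form. Several points are assumptions made here and not taken from the text: the input is a list of records, each a set of users that appear together in one forwarding event; the support of a candidate is the fraction of records containing it; and the recombination rule is simplified to a plain union of two surviving sets that grows the set by exactly one element, in place of the first-half/last-half rule described in step 2.

```python
from itertools import combinations
from typing import FrozenSet, Hashable, Iterable, List


def support(candidate: FrozenSet[Hashable],
            records: List[FrozenSet[Hashable]]) -> float:
    """Assumed definition: fraction of records that contain every node of the candidate."""
    return sum(candidate <= r for r in records) / len(records)


def find_hyperedges(raw_records: Iterable[Iterable[Hashable]],
                    min_support: float = 0.25) -> List[FrozenSet[Hashable]]:
    records = [frozenset(r) for r in raw_records]
    # Level 1: every pair of users that co-occurs in some record.
    level = {frozenset(p) for r in records for p in combinations(r, 2)}
    level = {c for c in level if support(c, records) >= min_support}
    hyperedges: List[FrozenSet[Hashable]] = []
    while level:  # stop once nothing survives pruning
        hyperedges.extend(level)
        size = len(next(iter(level))) + 1
        # Simplified recombination: unions of two survivors that add one node.
        merged = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        level = {c for c in merged if support(c, records) >= min_support}
    return hyperedges


# Hypothetical forwarding records.
records = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}]
edges = find_hyperedges(records, min_support=0.5)
```

With the default `min_support=0.25`, the value the text gives as preferred, the pair {"c", "d"} would also survive in this toy data, since it occurs in one of four records.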
Second, in the embodiment of the present invention, the step of mining directed hyperedge relationships from the undirected hyperedge list includes:
1. Select any undirected hyperedge, denoted $\{trans_1, trans_2, \dots, trans_k\}$, and select any one node (denoted $trans_m$) to create a relationship list (denoted Rlist) of backward hyperedges (the back piece vertex set has 1 element); calculate the confidence of the hyperedge according to the following formula:
2. Repeat the above step until the undirected hyperedge is empty, and delete the relationships whose confidence is less than the given minimum confidence.
3. Merge the relationships that were not deleted according to a given rule to obtain a new relationship list (denoted Rlist_new).
Specifically, the merging rule is: compare the front pieces of the relationships in Rlist; if there are two relationships (denoted R_a and R_b) whose front pieces differ by only one node, and the element in which the front piece of R_a differs from the front piece of R_b is exactly the element in which the back piece of R_b differs from the back piece of R_a, merge the two relationships. The front piece of the new relationship consists of the elements common to the front pieces of R_a and R_b, and its back piece is the union of the back pieces of R_a and R_b. For example, {A, B} → {C} and {A, C} → {B} merge into {A} → {B, C}.
4. Repeat step 3 until no further merging is possible.
5. Return to step 1 until the entire hyperedge list has been traversed.
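The mining and merging steps above can be sketched as follows. The confidence formula is not reproduced in this text, so the sketch assumes the standard association-rule confidence, conf(F → {m}) = support(F ∪ {m}) / support(F); this is an assumption, not the formula of the invention. The function names, the `support` callback, and the extra requirement that a merged front piece be non-empty are likewise illustrative assumptions.

```python
from typing import Callable, FrozenSet, Hashable, List, Tuple

NodeSet = FrozenSet[Hashable]
Relation = Tuple[NodeSet, NodeSet]  # (front piece, back piece)


def backward_relations(edge: NodeSet,
                       support: Callable[[NodeSet], float],
                       min_conf: float) -> List[Relation]:
    """Create one backward relationship per node of the hyperedge; keep confident ones."""
    rlist: List[Relation] = []
    for m in edge:
        front = edge - {m}
        if not front:
            continue
        denom = support(front)
        # ASSUMED formula: standard association-rule confidence.
        conf = support(edge) / denom if denom else 0.0
        if conf >= min_conf:
            rlist.append((front, frozenset({m})))
    return rlist


def merge_once(rlist: List[Relation]) -> Tuple[List[Relation], bool]:
    """Merge the first pair that satisfies the rule; report whether anything changed."""
    for i, (fa, ba) in enumerate(rlist):
        for j, (fb, bb) in enumerate(rlist):
            if i == j:
                continue
            only_a, only_b = fa - fb, fb - fa
            common = fa & fb
            # Front pieces differ by one node, and that node of R_a's front piece
            # is exactly what R_b's back piece has beyond R_a's back piece.
            if (len(only_a) == 1 and len(only_b) == 1
                    and only_a == bb - ba and common):
                rest = [r for k, r in enumerate(rlist) if k not in (i, j)]
                return rest + [(common, ba | bb)], True
    return rlist, False


def merge_all(rlist: List[Relation]) -> List[Relation]:
    changed = True
    while changed:  # each merge shortens the list, so this terminates
        rlist, changed = merge_once(rlist)
    return rlist


# Hypothetical support values for the hyperedge {a, b, c} and its subsets.
sup = {frozenset("abc"): 0.5, frozenset("ab"): 0.5,
       frozenset("ac"): 0.5, frozenset("bc"): 1.0}
rlist = backward_relations(frozenset("abc"), sup.__getitem__, min_conf=0.8)
merged = merge_all(rlist)
```

In this toy case the two surviving backward relationships, {a, c} → {b} and {a, b} → {c}, merge into the single forward relationship {a} → {b, c}.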
Third, in an embodiment of the present invention, the linear threshold propagation algorithm for a directed hypergraph includes:
1. Directly activate the seed node set Already, and randomly assign a threshold $\theta_u$ to each remaining node u. The threshold $\theta_u$ must lie in the range [0, 1]; the larger $\theta_u$ is, the harder the node is to activate, and the smaller $\theta_u$ is, the easier the node is to activate.
2. Let N(v) be the set of neighbor nodes of the current node v in the directed hypergraph and, for any node w ∈ N(v), define $b_{vw}$ as the degree of influence of node w on node v, satisfying $\sum_{w \in N(v)} b_{vw} \le 1$. For any inactive node v, when the joint influence of its active neighbors is greater than the randomly assigned threshold, i.e. $\sum_{w \in N(v),\ w\ \text{active}} b_{vw} > \theta_v$, the node is activated.
3. During network information propagation, repeat the above steps until no new active node appears, at which point propagation ends.
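A minimal sketch of the propagation loop described above. The text does not specify how N(v) and the weights are derived from the directed hypergraph, so the sketch assumes that every front-piece vertex of a relationship is a neighbor of every back-piece vertex, and that all neighbors of v carry equal weight 1/|N(v)| so that the weights sum to 1. The function name and the optional `thresholds` argument (used to fix thresholds for reproducible runs) are also assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

Relation = Tuple[Iterable[Hashable], Iterable[Hashable]]  # (front piece, back piece)


def linear_threshold(relations: Iterable[Relation],
                     seeds: Iterable[Hashable],
                     thresholds: Optional[Dict[Hashable, float]] = None,
                     rng: Optional[random.Random] = None) -> Set[Hashable]:
    rng = rng or random.Random()
    in_nbrs: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    nodes: Set[Hashable] = set(seeds)
    for front, back in relations:
        front, back = set(front), set(back)
        nodes |= front | back
        for v in back:
            in_nbrs[v] |= front  # assumed: front-piece vertices influence back-piece vertices

    # Step 1: activate the seeds and draw a threshold in [0, 1) for every node.
    theta = {v: rng.random() for v in nodes}
    if thresholds:
        theta.update(thresholds)
    active = set(seeds)

    # Steps 2 and 3: repeat synchronous rounds until no new node is activated.
    while True:
        newly = set()
        for v in nodes - active:
            nbrs = in_nbrs.get(v, set())
            if not nbrs:
                continue
            # Sum of b_vw over active neighbors, with assumed b_vw = 1 / |N(v)|.
            influence = len(nbrs & active) / len(nbrs)
            if influence > theta[v]:
                newly.add(v)
        if not newly:
            return active
        active |= newly


# Hypothetical example: {a, b} -> {c} and {c} -> {d}, seeded at "a".
relations = [({"a", "b"}, {"c"}), ({"c"}, {"d"})]
result = linear_threshold(relations, {"a"}, thresholds={"c": 0.4, "d": 0.9})
```

With the fixed thresholds shown, "c" is activated in the first round (influence 0.5 exceeds 0.4) and "d" in the second (influence 1.0 exceeds 0.9); raising the threshold of "c" above 0.5 stops the spread at the seed.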
Fig. 2 is an application diagram of the directed hyperedge propagation method integrating unsupervised learning and network out-degree shown in Fig. 1. As shown in Fig. 2, the application is illustrated as a five-layer structure.
The circle node of the first layer is the empty set ∅, which indicates that the initial set is empty;
the circle nodes of the second layer are the forwarding pairs formed by all broadcast content providers and their corresponding broadcast content subscribers, with one forwarding pair inside each circle node, such as 13, 19, 36, 56 and 71; the support of each set in the second layer is calculated and the sets whose support is less than the given minimum support are pruned, the pruned sets being marked in gray, for example 71;
the circle nodes of the third layer are obtained by matching each forwarding pair in the second layer in turn with the remaining forwarding pairs in the second layer and combining the two into a new set, which belongs to the third layer of the graph, for example: 13 and 19 generate 139, 13 and 36 generate 136, 13 and 71 generate 713, and 19 and 71 generate 719; since the support of any superset of a pruned set must also be less than the minimum support, such supersets are pruned as well, i.e. marked in gray, such as 713 and 719;
similarly, the circle nodes of the fourth layer are obtained by matching the sets in the third layer in turn to form new sets, for example: 139 and 136 generate 1396, 136 and 713 generate 7136, 139 and 719 generate 7139, and 136 and 719 generate 71369; the supersets of pruned sets are again marked in gray, such as 7136, 7139 and 71369;
in the circle node of the fifth layer, the set 71396 containing all the elements is generated; it is also a superset of a pruned set and is therefore marked in gray.
According to the embodiment of the invention, while preserving scalability, directed hyperedges are used to select seed nodes and, on that basis, the node with the largest out-degree is selected for propagation, thereby improving coverage and propagation efficiency.
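The pruning of supersets in Fig. 2 can be illustrated with a short sketch. It assumes that each digit in a label such as 13 or 71 stands for one node, so that 13 is the set {1, 3}; this reading, the helper name `is_pruned`, and the restriction to four of the second-layer pairs are assumptions made for the example.

```python
from itertools import combinations

# Second-layer forwarding pairs from the example; 71 is the pruned (gray) pair.
pairs = [frozenset("13"), frozenset("19"), frozenset("36"), frozenset("71")]
pruned = {frozenset("71")}


def is_pruned(candidate: frozenset, pruned_sets: set) -> bool:
    """A candidate is discarded as soon as it contains any already-pruned set."""
    return any(p <= candidate for p in pruned_sets)


# Third layer: unions of two second-layer pairs that yield three nodes.
layer3 = {a | b for a, b in combinations(pairs, 2) if len(a | b) == 3}
gray3 = {c for c in layer3 if is_pruned(c, pruned)}
```

Here `layer3` contains the sets 139, 136, 713 and 719, and `gray3` contains 713 and 719, matching the gray nodes of the third layer; the same check marks 7136, 7139 and 71396 as pruned while leaving 1396 untouched.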
It will be further appreciated by those of ordinary skill in the art that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, the components and steps of the various examples having been described herein generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether these functions are performed in hardware or software depends on the particular application of the solution and design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (5)
1. A directed hyper-edge propagation method integrating unsupervised learning and network out-degree is characterized by comprising the following steps:
discovering a series of undirected hyperedges connecting multiple vertices from the network node relationships using an unsupervised learning algorithm; wherein the step of discovering a series of undirected hyperedges connecting multiple vertices from network node relationships using an unsupervised learning algorithm comprises: generating an edge list of the forwarding pairs formed by all propagation content providers and their corresponding propagation content subscribers, pruning the forwarding pairs whose support is less than a given minimum support, and recording the pruned hyperedge list; recombining the undirected hyperedges in the set remaining after pruning in pairs to generate new edges; pruning the new combinations whose support is less than the given minimum support, and recording the hyperedge list after this pruning; wherein the recombination rule is that the first 1/2 of the elements of the two sets are combined to form the first 1/2 of the elements of the first new set, and the last 1/2 of the elements are combined to form the last 1/2 of the elements of the second new set; repeating these steps until the set remaining after pruning is empty;
for each undirected hyperedge, mining directed hyperedge relationships, including forward hyperedges and backward hyperedges, until all undirected hyperedges have been traversed; wherein the step of mining the directed hyperedge relationships including the forward hyperedges and the backward hyperedges comprises: selecting any undirected hyperedge, denoted $\{trans_1, trans_2, \dots, trans_k\}$, and selecting any one node $trans_m$ to create a relationship list Rlist of backward hyperedges, wherein the back piece vertex set of a backward hyperedge has 1 element, and calculating the confidence of the hyperedge according to the following formula:
repeating the above step until the undirected hyperedge is empty, and deleting the relationships whose confidence is less than the given minimum confidence; merging the relationships that were not deleted according to a given rule to obtain a new relationship list; wherein the given merging rule is to compare the front pieces of the relationships in Rlist and, if there are two relationships R_a and R_b whose front pieces differ by only one node, and the element in which the front piece of R_a differs from the front piece of R_b is exactly the element in which the back piece of R_b differs from the back piece of R_a, to merge the two relationships; the front piece of the new relationship consists of the elements common to the front pieces of R_a and R_b, and its back piece is the union of the back pieces of R_a and R_b; until the entire hyperedge list has been traversed;
sorting the elements of the front piece vertex set of each directed hyperedge relationship by out-degree, and selecting seed nodes in descending order;
and, starting from the selected seed nodes, propagating network information with a linear threshold propagation algorithm for the directed hypergraph.
2. The method according to claim 1, wherein the step of sorting the elements of the front-piece vertex set in the directed super-edge relationship by out-degree and selecting the seed nodes in descending order comprises:
in each directed super-edge relationship, preferentially selecting only the one node with the largest out-degree as a seed node.
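A minimal Python sketch of this seed selection follows. It assumes the relationships are (front piece, back piece) pairs of frozensets and that out-degrees are supplied as a precomputed dictionary; the function name `select_seeds`, the seed budget `k`, and the alphabetical tie-breaking are hypothetical choices made only for illustration.

```python
def select_seeds(relations, out_degree, k):
    """Pick up to k seeds: one highest-out-degree front-piece node per relation."""
    candidates = set()
    for front, _back in relations:
        # claim 2: keep only the front-piece node with the largest out-degree
        # (sorted() makes ties resolve deterministically by node name)
        candidates.add(max(sorted(front), key=lambda n: out_degree.get(n, 0)))
    # claim 1: choose seeds in descending order of out-degree
    ranked = sorted(candidates, key=lambda n: (-out_degree.get(n, 0), n))
    return ranked[:k]

# Hypothetical data: two directed super-edge relationships and node out-degrees
relations = [
    (frozenset({"a", "b"}), frozenset({"c"})),
    (frozenset({"c", "d"}), frozenset({"a"})),
]
out_degree = {"a": 3, "b": 1, "c": 2, "d": 5}
print(select_seeds(relations, out_degree, k=2))
```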
3. The method of claim 1, wherein the given minimum support level has a value of 0.25.
4. The method according to claim 1, wherein the step of propagating the network information by using a linear threshold propagation algorithm for the directed hypergraph from the selected seed node comprises:
directly activating the seed node set Already, and randomly assigning a threshold $\theta_u$ to each of the remaining nodes; wherein the threshold $\theta_u$ is required to take a value within the range [0, 1]; the larger the value of $\theta_u$, the harder the node is to activate, and the smaller the value of $\theta_u$, the easier the node is to activate;
letting the neighbor node set of a current node v in the directed hypergraph be N(v), and defining, for any node $w \in N(v)$, $b_{vw}$ as the degree of influence of node w on node v, satisfying $\sum_{w \in N(v)} b_{vw} \le 1$; for any non-activated node v, when the combined influence of its activated neighbors is greater than its randomly assigned threshold, i.e., $\sum_{w \in N(v),\ w\ \text{active}} b_{vw} > \theta_v$, the node is activated;
and, during network information propagation, repeating the above steps until no new node becomes active, at which point the network information propagation ends.
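The propagation loop of claim 4 can be sketched in Python as follows. The sketch assumes the directed hypergraph has already been reduced to a dictionary `in_neighbors` mapping each node v to the nodes that influence it, and a dictionary `weights` holding the influence of w on v under the key `(v, w)`; both data structures, the function name `linear_threshold`, and the optional `rng` argument are assumptions made for illustration. Thresholds are drawn with `random.Random.random()`, which samples from [0, 1).

```python
import random

def linear_threshold(in_neighbors, weights, seeds, rng=None):
    """Run linear threshold propagation until no new node becomes active.

    in_neighbors: dict node -> iterable of neighbours that influence it
    weights: dict (v, w) -> influence of w on v (sum over w assumed <= 1)
    seeds: iterable of initially active nodes
    """
    rng = rng or random.Random()
    active = set(seeds)  # seed nodes are activated directly
    # every remaining node gets a random threshold in [0, 1)
    theta = {v: rng.random() for v in in_neighbors if v not in active}
    changed = True
    while changed:  # stop once a full pass activates nobody
        changed = False
        for v, threshold in theta.items():
            if v in active:
                continue
            # combined influence of the already-active neighbours of v
            influence = sum(weights.get((v, w), 0.0)
                            for w in in_neighbors[v] if w in active)
            if influence > threshold:
                active.add(v)
                changed = True
    return active

# Hypothetical network: s influences a, a influences b, d influences c
in_neighbors = {"s": [], "a": ["s"], "b": ["a"], "c": ["d"], "d": []}
weights = {("a", "s"): 1.0, ("b", "a"): 1.0, ("c", "d"): 1.0}
print(sorted(linear_threshold(in_neighbors, weights, seeds=["s"])))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing different seed sets on the same thresholds.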
5. The method of claim 1, wherein the unsupervised learning algorithm includes K-means clustering, Apriori, and FP-growth.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711208187.4A CN110019981B (en) | 2017-11-27 | 2017-11-27 | Directed super-edge propagation method integrating unsupervised learning and network out-degree |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110019981A (en) | 2019-07-16 |
CN110019981B (en) | 2021-05-04 |
Family
ID=67186621
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711208187.4A (Active) CN110019981B (en) | 2017-11-27 | 2017-11-27 | Directed super-edge propagation method integrating unsupervised learning and network out-degree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019981B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118013130B (en) * | 2024-04-08 | 2024-06-04 | 烟台大学 | Service recommendation method, system, equipment and storage medium based on super service network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104820945A (en) * | 2015-04-17 | 2015-08-05 | 南京大学 | Online social network information transmission maximization method based on community structure mining algorithm |
WO2015121854A1 (en) * | 2014-02-13 | 2015-08-20 | Sayiqan Ltd | Web-based influence system and method |
CN105306540A (en) * | 2015-09-24 | 2016-02-03 | 华东师范大学 | Method for obtaining top k nodes with maximum influence in social network |
CN106875281A (en) * | 2017-03-13 | 2017-06-20 | 哈尔滨工程大学 | Community network node method for digging based on greedy subgraph |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9767419B2 (en) * | 2014-01-24 | 2017-09-19 | Microsoft Technology Licensing, Llc | Crowdsourcing system with community learning |
Non-Patent Citations (2)
Title |
---|
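Ágnes Bodó. SIS Epidemic Propagation on Hypergraphs. Bulletin of Mathematical Biology. 2016, pp. 713-735. * |
Establishment of a microblog super-network model and key node identification method; Zhang Lei; China Excellent Master's Theses Full-text Database; 2017-03-15 (No. 3); pp. 19-35 * |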
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20210802. Address after: Room 1601, 16th floor, East Tower, Ximei building, No. 6, Changchun Road, high tech Industrial Development Zone, Zhengzhou, Henan 450001. Patentee after: Zhengzhou xinrand Network Technology Co.,Ltd. Address before: 100190, No. 21 West Fourth Ring Road, Beijing, Haidian District. Patentee before: INSTITUTE OF ACOUSTICS, CHINESE ACADEMY OF SCIENCES |