CN116756207A - Network key node mining method based on discount strategy and improved discrete crow search algorithm - Google Patents
- Publication number
- CN116756207A (application CN202310569958.1A)
- Authority
- CN
- China
- Prior art keywords
- node
- crow
- nodes
- network
- influence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/042—Backward inferencing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention relates to the technical field of complex network optimization and discloses a network key node mining method based on a discount strategy and an improved discrete crow search algorithm. The method first preprocesses the citation network into an adjacency matrix and reverses the network to obtain a reverse network; it then uses the LRDiscount algorithm to screen the nodes of the reverse network and obtain a candidate node set C; next, it optimizes C with the local optimization process of the improved discrete crow search algorithm; finally, it selects the optimal set from the optimized node sets and evaluates node influence to obtain the final k key seed nodes. Compared with the prior art, the invention combines an influence discount strategy for network nodes with the improved discrete crow search algorithm: node positions are updated by imitating the crow search process so that influence diffuses, and key nodes are found through the marginal gain produced as individual crows walk through the citation network.
Description
Technical field
The invention belongs to the technical field of influence maximization in complex networks, and in particular relates to a network key node mining method based on a discount strategy and an improved discrete crow search algorithm.
Background art
With the penetration of mobile social services into daily life and social interaction, social networks play a significant role in information sharing, dissemination, and diffusion. Their huge scale poses a great challenge to traditional research on influence maximization and also gives that research greater practical significance. Influence maximization (IM) aims to find K influential nodes in a social network (nodes represent social media users) and to spread information through the "word of mouth" effect so that the influence range of these nodes is maximized. The main difficulty of the problem is therefore how to select K nodes in the network while keeping both time complexity and propagation effect acceptable.
For influence maximization, the key is how to select the key nodes in the network, a question that has attracted researchers since the 20th century. In information science, as research on information dissemination and diffusion in social media has deepened, these key nodes have come to play an extremely important role in studies of the dynamic evolution of network structure and of propagation control. Some researchers have applied metaheuristic search algorithms to key node mining, but these algorithms run slowly and have high time complexity, so they cannot be applied to large-scale networks. Building on this line of work, Zhang et al. observed that neighbor nodes play an important role in measuring node influence and proposed PRDiscount, a heuristic algorithm combined with PageRank that explicitly discounts the influence of all individuals socially related to the selected seeds. Although such an algorithm improves the optimization efficiency of key node mining to some extent, a single-solution algorithm keeps only one solution during iteration: it is simple and fast for small-scale networks, but it easily falls into local optima, which leads to redundant network information.
Gong et al. then proposed a key node identification method based on the discrete particle swarm optimization algorithm. The method defines the position of a particle as node numbers and the velocity of a particle as a flag that decides whether a node is updated. After many iterations it finds the global best particle Gbest under the search constraints, and the position of that particle is the optimal seed node set. To overcome the shortcomings described above, current research on influence maximization combines the search for key nodes with swarm intelligence optimization algorithms, which simulate the cooperative behavior of biological populations or the evolution of physical phenomena. Because of their strong heuristic search and global search capabilities, these algorithms have been widely applied to influence maximization problems in recent years.
It is therefore extremely important to design a high-accuracy, low-cost method for obtaining seed nodes in large-scale networks.
Summary of the invention
Purpose of the invention: Traditional network key node mining has to deal with large networks and large data volumes, so mining key nodes directly is inefficient and has high time complexity. To address this, the invention proposes a network key node mining method based on a discount strategy and an improved discrete crow search algorithm. It combines an influence discount strategy for network nodes with the improved discrete crow search algorithm: node positions are updated by imitating the crow search process so that influence diffuses, and key nodes are found through the marginal gain produced as individual crows walk through the citation network.
Technical solution: The invention proposes a network key node mining method based on a discount strategy and an improved discrete crow search algorithm, comprising the following steps:
S1. Preprocess the citation network: convert the target network into its adjacency matrix, and reverse the citation network to obtain the reverse network;
S2. Using the discount strategy algorithm LRDiscount, discount the influence of every node in the reverse network of S1 that points to a seed node, obtain the discounted node influence, and repeatedly select the node with the largest value and add it to the candidate node set C;
S3. Optimize the candidate node set C with the local optimization process of the improved discrete crow search algorithm, where the candidate node set serves as the candidate crow group. The improved discrete crow search algorithm adds parallelized iteration to the traditional discrete crow search algorithm: in each iteration, a crow individual is compared only with its own memory vector, and the memory vector is updated to follow the update of that individual's position vector. This yields the optimized node set C*;
S4. Select the optimal set from the optimized node set C* and evaluate node influence to obtain the final k key seed nodes.
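Step S1 can be illustrated with a minimal Python sketch. It assumes the citation network is given as a list of `(citing, cited)` index pairs and that `W[j][i] = 1` denotes an edge from node j to node i; the function names `build_adjacency` and `reverse_network` and the toy edge list are hypothetical and do not come from the patent.

```python
def build_adjacency(edges, n):
    """Build an n x n 0/1 adjacency matrix; W[j][i] = 1 means an edge j -> i."""
    W = [[0] * n for _ in range(n)]
    for src, dst in edges:
        W[src][dst] = 1
    return W


def reverse_network(W):
    """Reverse every edge; the result is the transpose of W."""
    n = len(W)
    return [[W[j][i] for j in range(n)] for i in range(n)]


# Toy citation network (hypothetical data): paper 0 cites 1 and 2,
# paper 1 cites 2, paper 3 cites 2.
citations = [(0, 1), (0, 2), (1, 2), (3, 2)]
W = build_adjacency(citations, 4)
W_rev = reverse_network(W)
```

In the reverse network every edge runs from the cited paper to the citing paper, so row 2 of `W_rev` lists the three papers that cite paper 2.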
Further, the specific steps for obtaining the candidate node set C in step S2 are:
S2.1. Add a background node bg to the reverse network and connect it to all nodes in the network, which gives a new, strongly connected network of N+1 nodes;
S2.2. Assign 1 unit of LR value to each of the N nodes other than the background node bg, and an LR value of 0 to bg;
S2.3. Each node distributes its LR value evenly among its directly connected out-neighbors, and the process iterates until a steady state is reached:
where the out-degree of node $v_j$ divides the LR value that $v_j$ passes on, and $w_{ji}$ is the adjacency matrix element: $w_{ji}=1$ if there is an edge between node $v_j$ and node $v_i$, and 0 otherwise;
S2.4. When the iteration ends, distribute the LR value $LR_{bg}(t_c)$ of the background node bg evenly among all nodes in the network to obtain the final LR value of node $v_i$;
S2.5. The discount strategy algorithm LRDiscount discounts the influence of the neighbors of each seed node; that is, in the reverse network it discounts the influence of every node that points to a seed node, which gives the discounted node influence:
where S is the seed node set, and the ratio term is the number of seed nodes among the neighbors of node $v_i$ divided by the total number of its neighbors;
S2.6. Based on the resulting node influence values INF, repeatedly select the node with the largest value and add it to the candidate node set C.
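The formulas for S2.3 and S2.4 are not present in this text. The description (a background node, unit initial scores, even distribution over out-neighbors, and a final even redistribution of the background score) matches the standard LeaderRank procedure, so the following is a reconstruction under that assumption, not the patent's own notation. The symbol $k_j^{\mathrm{out}}$ for the out-degree is assumed, and $w_{ji}=1$ is read as an edge from $v_j$ to $v_i$.

```latex
% S2.3: iterate until the scores reach a steady state at t = t_c
LR_i(t+1) = \sum_{j=1}^{N+1} \frac{w_{ji}}{k_j^{\mathrm{out}}} \, LR_j(t)

% S2.4: share the background node's score equally among the N ordinary nodes
LR_i = LR_i(t_c) + \frac{LR_{bg}(t_c)}{N}
```

Under this reading the total score stays equal to N throughout the iteration, because each node hands out exactly what it holds. The discount formula of S2.5 cannot be recovered from the surrounding text and is left as is.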
Further, the specific steps in step S3 for optimizing the candidate node set C with the local optimization process of the improved discrete crow search algorithm are:
S3.1. Initialize the data of the discrete crow search algorithm framework IDCSA, namely the initial parameters: the crow population size N, the size k of the seed node set to be solved, the maximum number of iterations $t_{max}$, the awareness probability AP, and the neighborhood range S of the local search around a node;
S3.2. Take the candidate node set C obtained in step S2 as the candidate crow group and use it to initialize the position vectors of the crow population, $x_i=(node_1, node_2, \ldots, node_n)$, and the memory vector $Memory^{t-1}=[m_1, m_2, \ldots, m_n]^{-1}$; then select the initial optimal solution position vector $X^*$ from the initialized population;
S3.3. Based on the node encoding and the discretized representation of the position vectors and memory vectors of the crow group, construct the discretized search rule of the network space:
where $R(r_i, s)$ is the local search mechanism, and the symbol "∩" is defined as a logical crossover operation whose purpose is to check whether two position vectors contain duplicate nodes;
S3.4. Using the node pool of the candidate node set C produced in step S2, define an objective function to compute the fitness values of the N crow individuals; the local influence evaluation function LIE gives an approximate estimate of seed node influence while the crow position vectors evolve:
S3.5. Perform a local optimization search over the second-order neighbors of each node in the candidate node set C according to the improved discrete crow search algorithm. If the marginal gain of a second-order neighbor is larger than that of the node itself, replace that node in the current optimal solution with the second-order neighbor. Repeat until the number of iterations reaches the maximum $t_{max}$.
Further, the specific steps in step S3.5 for performing the local optimization search on the node set according to the improved crow search algorithm are:
1) Compute the difference between the optimal position vectors of the current crow individual i and the tracked crow individual j; on this basis, perform the crossover operation to obtain the decision vector $V_{node}$, and then decide whether to perform local search optimization;
2) Store the first-order direct neighbors of node $x_i$ in the node set Neighbors, then traverse these neighbors in turn, find their 2-hop neighbor sets, and add them to the node set NodeSet; once every first-order neighbor has been traversed, deduplicate NodeSet so that the 2-hop neighbors contain no repeated nodes;
3) Compute in turn the LIE fitness value of the seed set position vector $x_i$ after the node at each position is replaced by one of its neighboring nodes, select the node in NodeSet that brings the largest gain to $x_i$, and replace the corresponding node in $x_i$ with it.
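The neighborhood collection in item 2) can be sketched as follows. This is an illustration under stated assumptions: the network is a plain dictionary from node to its direct neighbors, and NodeSet keeps every node reached through a first-order neighbor except the starting node itself. The function name `two_hop_candidates` and the `exclude` parameter are hypothetical.

```python
def two_hop_candidates(adj, node, exclude=()):
    """Return (Neighbors, NodeSet) for `node`.

    adj maps each node to an iterable of its direct neighbors.
    NodeSet is deduplicated by construction because it is built as a set.
    """
    neighbors = list(adj.get(node, ()))
    node_set = set()
    for u in neighbors:
        node_set.update(adj.get(u, ()))   # neighbors of a first-order neighbor
    node_set.discard(node)                # the node is not its own candidate
    node_set.difference_update(exclude)   # e.g. nodes already in the seed vector
    return neighbors, sorted(node_set)


# Hypothetical toy graph: 3 is not reachable twice, 4 is reached via 1 and 2.
adj = {0: [1, 2], 1: [3, 4], 2: [4, 5], 3: [], 4: [], 5: [0]}
neighbors, node_set = two_hop_candidates(adj, 0)
```

Passing the current seed vector as `exclude` keeps nodes that are already seeds out of the candidate list, which is one simple way to avoid duplicates after replacement.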
Further, in S4, the node set C* is the set obtained once the number of local search optimization iterations reaches $t_{max}$, and the set of k key nodes is selected from the nodes in C* by the influence maximization algorithm.
Beneficial effects:
When mining key nodes, the invention relies on a discount strategy and an improved discrete crow search algorithm, which addresses slow convergence and the low overall influence of the optimal seed set. Specifically:
(1) The LRDiscount algorithm "discounts" the mutual influence between nodes in the citation network, making full use of the network topology and node attribute information.
(2) The local search optimization of the discrete crow search algorithm takes the network node discount strategy into account to screen out the initially activated seed set, which avoids the adverse effect that the initial seed node set would otherwise have on convergence speed and result quality.
(3) A crossover operation is added to the discrete crow group search rule. It maintains population diversity during the search and helps avoid local optima, and it also ensures that the updated best nodes contain no duplicates in the vector $x_i$.
For the key node mining problem, the invention first converts the target problem into an optimization problem, then screens the initial seed set with the proposed network node discount strategy, and then applies the improved discrete crow search algorithm to optimize the final seed set. The method improves the search for key nodes in complex networks for future research on influence maximization and can achieve better results under the same conditions.
Description of the drawings
Figure 1 is the overall flow chart of the invention;
Figure 2 is the sub-flow chart of the LRDiscount algorithm in Figure 1;
Figure 3 is the sub-flow chart of the local optimization process of the improved discrete crow search algorithm in Figure 1.
Detailed description of the embodiments
The invention is further explained below with reference to the drawings and specific embodiments. These embodiments only illustrate the invention and do not limit its scope; after reading this disclosure, modifications of equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.
As shown in Figure 1, the specific steps of the disclosed network key node mining method based on a discount strategy and an improved discrete crow search algorithm are as follows:
S1. Preprocess the citation network: convert the target network into its adjacency matrix, and reverse the citation network to obtain the reverse network.
S2. Using the discount strategy algorithm LRDiscount, discount the influence of every node in the reverse network of S1 that points to a seed node, obtain the discounted node influence, and repeatedly select the node with the largest value and add it to the candidate node set C. The specific steps are as follows:
S2.1. In the reverse network obtained in step S1, add a background node bg and connect it to all nodes in the network, which gives a new, strongly connected network of N+1 nodes.
S2.2. Assign 1 unit of LR value to each of the N nodes other than the background node bg, and an LR value of 0 to bg.
S2.3. Each node distributes its LR value evenly among its directly connected out-neighbors, and the process iterates until a steady state is reached:
where the out-degree of node $v_j$ divides the LR value that $v_j$ passes on, and $w_{ji}$ is the adjacency matrix element: $w_{ji}=1$ if there is an edge between node $v_j$ and node $v_i$, and 0 otherwise.
S2.4. When the iteration ends, distribute the LR value $LR_{bg}(t_c)$ of the background node bg evenly among all nodes in the network to obtain the final LR value of node $v_i$:
S2.5. The discount strategy algorithm LRDiscount discounts the influence of the neighbors of each seed node; that is, in the reverse network it discounts the influence of every node that points to a seed node, which gives the discounted node influence:
where S is the seed node set, and the ratio term is the number of seed nodes among the neighbors of node $v_i$ divided by the total number of its neighbors.
S2.6. Based on the resulting node influence values INF, repeatedly select the node with the largest value and add it to the candidate node set C.
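Steps S2.1 to S2.6 can be sketched in Python as below. Two assumptions are made loudly here. First, the scoring loop follows the standard LeaderRank iteration, which matches the verbal description of S2.1 to S2.4. Second, because the discount formula of S2.5 is missing from this text, the sketch uses `score * (1 - fraction of out-neighbors already selected)` as a stand-in; that form is an assumption, not the patent's formula. The names `leader_rank` and `lr_discount_select` and the toy matrix are hypothetical.

```python
def leader_rank(W, tol=1e-9, max_iter=1000):
    """LeaderRank-style scores; W[j][i] = 1 means an edge j -> i."""
    n = len(W)
    bg = n  # index of the background node
    # Out-neighbor lists; every node also links to bg, and bg links to every node.
    out = [[i for i in range(n) if W[j][i]] + [bg] for j in range(n)]
    out.append(list(range(n)))
    lr = [1.0] * n + [0.0]  # S2.2: one unit per ordinary node, zero for bg
    for _ in range(max_iter):
        nxt = [0.0] * (n + 1)
        for j, targets in enumerate(out):
            share = lr[j] / len(targets)  # S2.3: split evenly over out-neighbors
            for i in targets:
                nxt[i] += share
        delta = max(abs(a - b) for a, b in zip(lr, nxt))
        lr = nxt
        if delta < tol:
            break
    # S2.4: hand the background node's score back in equal parts.
    return [lr[i] + lr[bg] / n for i in range(n)]


def lr_discount_select(W, scores, size):
    """Greedy candidate selection with an ASSUMED discount form (see lead-in)."""
    n = len(W)
    nbrs = [[i for i in range(n) if W[j][i]] for j in range(n)]
    chosen, chosen_set = [], set()
    while len(chosen) < min(size, n):
        best, best_val = None, -1.0
        for v in range(n):
            if v in chosen_set:
                continue
            hits = sum(u in chosen_set for u in nbrs[v])
            frac = hits / len(nbrs[v]) if nbrs[v] else 0.0
            val = scores[v] * (1.0 - frac)  # assumed discount, not from the patent
            if val > best_val:
                best, best_val = v, val
        chosen.append(best)
        chosen_set.add(best)
    return chosen


# Hypothetical reverse network with edges 1->0, 2->0, 2->1, 2->3.
W_rev = [[0, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 0, 1],
         [0, 0, 0, 0]]
scores = leader_rank(W_rev)
candidates = lr_discount_select(W_rev, scores, 2)
```

The iteration only settles when the graph with the background node is aperiodic, which holds as soon as the network has at least one edge; an edgeless input would oscillate and simply stop at `max_iter`.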
S3. Optimize the candidate node set C with the local optimization process of the improved discrete crow search algorithm, where the candidate node set serves as the candidate crow group. The improved discrete crow search algorithm adds parallelized iteration to the traditional discrete crow search algorithm: in each iteration, a crow individual is compared only with its own memory vector, and the memory vector is updated to follow the update of that individual's position vector. This yields the optimized node set C*, as shown in Figure 2. The specific steps are as follows:
S3.1. Initialize the data of the discrete crow search algorithm framework IDCSA, including initial parameters such as the crow population size N, the size k of the seed node set to be solved, the maximum number of iterations $t_{max}$, the awareness probability AP, and the neighborhood range S of the local search around a node.
S3.2. Use the candidate node set C obtained in step S2 to initialize the position vectors of the crow population, $x_i=(node_1, node_2, \ldots, node_n)$, and the memory vector $Memory^{t-1}=[m_1, m_2, \ldots, m_n]^{-1}$; then select the initial optimal solution position vector $X^*$ from the initialized population.
S3.3. Based on the node encoding and the discretized representation of the position vectors and memory vectors of the crow group, construct the discretized search rule of the network space:
where $R(r_i, s)$ is the local search mechanism, and the symbol "∩" is defined as a logical crossover operation whose purpose is to check whether two position vectors contain duplicate nodes.
S3.4. Using the node pool of the candidate node set C produced in step S2, define an objective function to compute the fitness values of the N crow individuals; the local influence evaluation function LIE gives an approximate estimate of seed node influence while the crow position vectors evolve:
S3.5. Perform a local optimization search over the second-order neighbors of each node in the candidate node set C according to the improved discrete crow search algorithm. If the marginal gain of a second-order neighbor is larger than that of the node itself, replace that node in the current optimal solution with the second-order neighbor. Repeat until the number of iterations reaches the maximum $t_{max}$. The specific process is as follows:
First, compute the difference between the optimal position vectors of the current crow individual i and the tracked crow individual j; on this basis, perform the crossover operation to obtain the decision vector $V_{node}$, and then decide whether to perform local search optimization.
Then store the first-order direct neighbors of node $x_i$ in the node set Neighbors, traverse these neighbors in turn, find their 2-hop neighbor sets, and add them to the node set NodeSet; once every first-order neighbor has been traversed, deduplicate NodeSet so that the 2-hop neighbors contain no repeated nodes.
Finally, compute in turn the LIE fitness value of the seed set position vector $x_i$ after the node at each position is replaced by one of its neighboring nodes, select the node in NodeSet that brings the largest gain to $x_i$, and replace the corresponding node in $x_i$ with it.
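The replacement rule at the end of S3.5 can be sketched as a small greedy routine. Since the LIE formula is not present in this text, the fitness is passed in as a function; the `coverage` function used in the example (seeds plus their direct out-neighbors) is only a hypothetical stand-in and is not the LIE function. The name `local_replace`, the toy graph, and the candidate lists are likewise illustrative assumptions.

```python
def local_replace(position, node_sets, fitness):
    """Try to improve each slot of a seed vector with its candidate nodes.

    position  : list of seed nodes (one crow's position vector)
    node_sets : node_sets[idx] is the candidate list (NodeSet) for slot idx
    fitness   : callable scoring a seed list; larger is better
    """
    best = list(position)
    best_fit = fitness(best)
    for idx in range(len(best)):
        slot_best, slot_fit = None, best_fit
        for cand in node_sets[idx]:
            if cand in best:
                continue  # keep the position vector free of duplicate nodes
            trial = best[:idx] + [cand] + best[idx + 1:]
            f = fitness(trial)
            if f > slot_fit:  # accept only a strictly larger gain
                slot_best, slot_fit = cand, f
        if slot_best is not None:
            best[idx] = slot_best
            best_fit = slot_fit
    return best, best_fit


# Hypothetical toy graph and stand-in fitness (NOT the patent's LIE function).
adj = {0: [1], 1: [2], 2: [3, 4, 5], 3: [], 4: [], 5: [],
       6: [7, 8], 7: [], 8: []}


def coverage(seeds):
    covered = set(seeds)
    for s in seeds:
        covered.update(adj[s])
    return len(covered)


improved, score = local_replace([0, 1], [[2], [3, 6]], coverage)
```

In a full run this routine would be called once per crow per iteration until $t_{max}$ is reached, with `node_sets` rebuilt from the 2-hop neighborhoods of the current position vector.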
S4. Select the optimal set from the optimized node set C* and evaluate node influence to obtain the final k key seed nodes.
The invention can be combined with a computer system to carry out the mining of seed nodes.
The disclosed network key node mining method based on a discount strategy and an improved discrete crow search algorithm can be used to mine key nodes in complex networks of different scales.
The above embodiments only illustrate the technical concept and features of the invention. Their purpose is to enable those familiar with the technology to understand the invention and implement it accordingly; they do not limit the scope of protection of the invention. All equivalent changes or modifications made according to the spirit of the invention shall fall within its scope of protection.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310569958.1A CN116756207A (en) | 2023-05-19 | 2023-05-19 | Network key node mining method based on discount strategy and improved discrete crow search algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116756207A (en) | 2023-09-15 |
Family ID: 87957940
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310569958.1A Pending CN116756207A (en) | Network key node mining method based on discount strategy and improved discrete crow search algorithm | 2023-05-19 | 2023-05-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116756207A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117811992A (en) * | 2024-02-29 | 2024-04-02 | 山东海量信息技术研究院 | Network bad information propagation inhibition method, device, equipment and storage medium |
CN117811992B (en) * | 2024-02-29 | 2024-05-28 | 山东海量信息技术研究院 | Network bad information propagation inhibition method, device, equipment and storage medium |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN103325061B (en) | A kind of community discovery method and system | |
Syama et al. | A hybrid extreme learning machine model with lévy flight chaotic whale optimization algorithm for wind speed forecasting | |
CN103678671A (en) | Dynamic community detection method in social network | |
CN113962358B (en) | Information diffusion prediction method based on time sequence hypergraph attention neural network | |
Han et al. | Locating multiple equivalent feature subsets in feature selection for imbalanced classification | |
CN113297429B (en) | A social network link prediction method based on neural network architecture search | |
CN110096630A (en) | Big data processing method of the one kind based on clustering | |
Feng et al. | A novel community detection method based on whale optimization algorithm with evolutionary population | |
CN115270007A (en) | A POI recommendation method and system based on hybrid graph neural network | |
Qiao et al. | A hybridized parallel bats algorithm for combinatorial problem of traveling salesman | |
CN116756207A (en) | Network key node mining method based on discount strategy and improved discrete crow search algorithm | |
Li et al. | An improved non-negative latent factor model for missing data estimation via extragradient-based alternating direction method | |
CN117251779A (en) | Node classification method based on global perceptual neural network | |
CN117253037A (en) | Semantic segmentation model structure searching method, automatic semantic segmentation method and system | |
Yu et al. | Community detection in the textile-related trade network using a biased estimation of distribution algorithm | |
CN115965795A (en) | A deep and dark network group discovery method based on network representation learning | |
Hu et al. | Differential evolution based on network structure for feature selection | |
CN118469736A (en) | A community detection method based on community structure enhancement and multi-objective particle swarm optimization | |
CN111275565A (en) | Social network influence maximization method based on local and global influences | |
CN110866838A (en) | Network representation learning algorithm based on transition probability preprocessing | |
CN117171628A (en) | Graph structure data node classification method and device in heterogeneous federated environment | |
CN110223125B (en) | User position obtaining method under node position kernel-edge profit algorithm | |
Duan et al. | [Retracted] The Path of Rural Industry Revitalization Based on Improved Genetic Algorithm in the Internet Era | |
CN114970684A (en) | Community detection method for extracting network core structure by combining VAE | |
Yang et al. | Spatial-temporal data inference with graph attention neural networks in sparse mobile crowdsensing |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |