CN116756207A

CN116756207A - Network key node mining method based on discount strategy and improved discrete crow search algorithm

Info

Publication number: CN116756207A
Application number: CN202310569958.1A
Authority: CN
Inventors: 陈伯伦; 许雪; 王笑颜; 谢乾; 刘步实; 朱鹏程; 于翠莹; 王凌; 刘晓娈
Original assignee: Huaiyin Institute of Technology
Current assignee: Huaiyin Institute of Technology
Priority date: 2023-05-19
Filing date: 2023-05-19
Publication date: 2023-09-15

Abstract

The invention relates to the technical field of complex network optimization and discloses a network key node mining method based on a discount strategy and an improved discrete crow search algorithm. The method comprises: first, preprocessing the citation network, converting it into an adjacency matrix, and reversing the network to obtain a reverse network; then using the LRDiscount algorithm to preliminarily screen the nodes of the reverse network to obtain a candidate node set C; then optimizing the candidate node set C through the local optimization process of the improved discrete crow search algorithm; and finally selecting the optimal set from the optimized node set and evaluating node influence to obtain the final k key seed nodes. Compared with the prior art, the invention combines an influence discount strategy for network nodes with the improved discrete crow search algorithm, updates node positions to spread influence by imitating the crow search process, and finds key nodes through the marginal gain produced as individual crows walk through the citation network.

Description

Network key node mining method based on a discount strategy and an improved discrete crow search algorithm

Technical field

The invention belongs to the technical field of influence maximization in complex networks, and in particular relates to a network key node mining method based on a discount strategy and an improved discrete crow search algorithm.

Background

As mobile social services permeate daily life and social interaction, social networks play a significant role in information sharing and information diffusion. The enormous scale of social networks poses a major challenge to traditional research on influence maximization and also gives that research greater practical significance. Influence maximization (IM) aims to find K influential nodes in a social network (nodes represent social media users) and to use the "word of mouth" effect to spread information so that the influence range of these nodes is maximized. The main difficulty of influence maximization is therefore how to select K nodes in the network while keeping both time complexity and propagation effect acceptable.

The key to influence maximization is how to select the key nodes in a network, a question that has attracted researchers since the 20th century. In information science, as research on information dissemination and diffusion in social media has deepened, these key nodes have come to play an extremely important role in studies of the dynamic evolution of network structure and of propagation control. Some researchers have applied metaheuristic search algorithms to network key node mining, but such algorithms run slowly and have high time complexity, so they cannot be applied to large-scale networks. Building on this line of work, Zhang et al. observed that neighbor nodes play an important role in measuring node influence and proposed PRDiscount, a heuristic algorithm combined with PageRank that explicitly discounts the influence of all individuals who have a social relationship with the selected seeds. Although metaheuristic algorithms improve the optimization efficiency of key node mining to some extent, such a single-solution algorithm maintains only one solution during iteration; it is simple and fast for key node mining in small-scale networks, but it easily falls into local optima, which leads to redundant network information. Gong et al. then proposed a key node identification method based on the discrete particle swarm optimization algorithm. This method defines the position of a particle as a set of node numbers and the velocity of a particle as a flag indicating whether a node should be updated; after multiple iterations it finds the global optimal particle Gbest under the search constraints, and the position of that particle is the optimal seed node set found. To overcome the shortcomings above, current research on influence maximization for finding key nodes is combined with swarm intelligence optimization algorithms, which simulate the cooperative behavior of biological populations or the evolution of physical phenomena. Owing to their strong heuristic search and global search capabilities, they have been widely applied to influence maximization in recent years.

Therefore, it is extremely important to design a high-accuracy, low-cost method for obtaining seed nodes in large-scale networks.

Summary of the invention

Purpose of the invention: Traditional network key node mining faces large network scale and large data volume, so mining key nodes directly is inefficient and has high time complexity. To address this, the invention proposes a network key node mining method based on a discount strategy and an improved discrete crow search algorithm. The method combines an influence discount strategy for network nodes with the improved discrete crow search algorithm, updates node positions to spread influence by imitating the crow search process, and finds key nodes through the marginal gain produced as individual crows walk through the citation network.

Technical solution: The invention proposes a network key node mining method based on a discount strategy and an improved discrete crow search algorithm, comprising the following steps:

S1. Preprocess the citation network: convert the target network into its adjacency matrix, and reverse the citation network to obtain a reverse network;

S2. Using the discount strategy algorithm LRDiscount, discount the influence of each node that points to a seed node in the reverse network from S1 to obtain the discounted node influence, then select the nodes with the largest values in turn and add them to the candidate node set C;

S3. Optimize the candidate node set C with the local optimization process of the improved discrete crow search algorithm, where the candidate node set serves as the candidate crow population. The improved discrete crow search algorithm adds parallelized iteration to the traditional discrete crow search algorithm: in each iteration, a crow individual is compared only with its own memory vector, and the memory vector is updated accordingly to support the update of that crow's position vector, yielding the optimized node set C^*;

S4. Select the optimal set from the optimized node set C^* and evaluate node influence to obtain the final k key seed nodes.
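Step S1 can be illustrated with a minimal Python sketch. It assumes that nodes are numbered 0 to n-1 and that the citation network arrives as a list of (citing, cited) pairs; the helper names `build_adjacency` and `reverse_network` and the toy data are illustrative and do not come from the patent.

```python
def build_adjacency(n, edges):
    """Return an n x n 0/1 adjacency matrix; edges are (source, target) pairs."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1
    return adj


def reverse_network(adj):
    """Reverse every edge by transposing the adjacency matrix."""
    n = len(adj)
    return [[adj[j][i] for j in range(n)] for i in range(n)]


# Toy citation network (hypothetical data): paper 0 cites 1 and 2, paper 1 cites 2.
citations = [(0, 1), (0, 2), (1, 2)]
adj = build_adjacency(3, citations)
rev = reverse_network(adj)  # in rev, edges run from the cited paper to the citing paper
```

Transposing is only one way to reverse a directed network; for large sparse networks an adjacency-list representation would likely be preferable to a dense matrix.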

Further, the specific steps for obtaining the candidate node set C in step S2 are:

S2.1. Add a background node bg to the reverse network and connect it to all nodes in the network, obtaining a new strongly connected network of N+1 nodes;

S2.2. Assign an LR value of 1 unit to each of the N nodes other than the background node bg, and an LR value of 0 to bg;

S2.3. Each node distributes its LR value evenly among its directly connected out-neighbors, and the process iterates until a steady state is reached:

$$LR_i(t+1) = \sum_{j=1}^{N+1} \frac{w_{ji}}{k_j^{out}} \, LR_j(t)$$

where k_j^out is the out-degree of node v_j and w_ji is an element of the adjacency matrix: w_ji = 1 if there is an edge from node v_j to node v_i, and 0 otherwise;

S2.4. When the iteration ends at time t_c, distribute the LR value LR_bg(t_c) of the background node bg equally among all nodes in the network to obtain the final LR value of node v_i:

$$LR_i = LR_i(t_c) + \frac{LR_{bg}(t_c)}{N}$$

S2.5. The discount strategy algorithm LRDiscount discounts the influence of the neighbors of each seed node; that is, in the reverse network, it discounts the influence of each node that points to a seed node, thereby obtaining the discounted node influence:

where S is the seed node set, and the discount term represents the ratio of the number of seed nodes among the neighbors of node v_i to the total number of its neighbors;

S2.6. According to the final node influence values, select the nodes with the largest values in INF in turn and add them to the candidate node set C.
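Steps S2.1 to S2.4 resemble a LeaderRank-style score iteration, which might be sketched in Python as follows. The function name `leader_rank`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions made for illustration; the text only states that iteration continues until a steady state is reached. The sketch also assumes that the background node is linked to every node in both directions, which is what makes the augmented network strongly connected.

```python
def leader_rank(adj, tol=1e-9, max_iter=1000):
    """LeaderRank-style scores for an n x n 0/1 adjacency matrix."""
    n = len(adj)
    bg = n  # index of the background node
    # Out-neighbour lists of the augmented network: every node links to bg
    # and bg links to every node (assumed bidirectional, see lead-in).
    out = [[j for j in range(n) if adj[i][j]] + [bg] for i in range(n)]
    out.append(list(range(n)))
    score = [1.0] * n + [0.0]  # S2.2: one unit per node, zero for bg
    for _ in range(max_iter):
        nxt = [0.0] * (n + 1)
        for j, targets in enumerate(out):
            share = score[j] / len(targets)  # S2.3: split evenly among out-neighbours
            for i in targets:
                nxt[i] += share
        converged = max(abs(a - b) for a, b in zip(nxt, score)) < tol
        score = nxt
        if converged:
            break
    # S2.4: hand the background node's score back in equal parts.
    return [s + score[bg] / n for s in score[:n]]


# Reverse network of a toy graph (hypothetical data): edges 1->0, 2->0, 2->1.
scores = leader_rank([[0, 0, 0], [1, 0, 0], [1, 1, 0]])
```

Because every node passes on its whole score at each step, the total score stays equal to the number of nodes, which gives a cheap sanity check on any implementation.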

Further, the specific steps for optimizing the candidate node set C with the local optimization process of the improved discrete crow search algorithm in step S3 are:

S3.1. Initialize the parameters of the discrete crow search algorithm framework IDCSA: the crow population size N, the size k of the seed node set to be solved, the maximum number of iterations t_max, the awareness probability AP, and the neighborhood range S for local node search;

S3.2. Take the candidate node set C obtained in step S2 as the candidate crow population and use it to initialize the position vector x_i = (node_1, node_2, ..., node_n) and the memory vector Memory_{t-1} = [m_1, m_2, ..., m_n]^{-1} of the crow population; then select the initial optimal solution position vector X^* from the initialized population;

S3.3. Based on the node encoding and the discretized representation of the crow population's position vectors and memory vectors, construct the discretized search rule for the network space:

where R(r_i, s) is the local search mechanism, and the symbol "∩" is defined as a logical crossover operation whose purpose is to check whether two position vectors contain duplicate nodes;

S3.4. Based on the node pool of the candidate node set C produced in step S2, define an objective function to compute the fitness values of the N crow individuals; the local influence evaluation function LIE is used to approximately evaluate the influence of the seed nodes as the crow position vectors evolve:

S3.5. Perform a local optimization search over the second-order neighbors of each node in the candidate node set C according to the improved discrete crow search algorithm: if the marginal gain of a second-order neighbor is larger than that of the node itself, replace the node in the current optimal solution with that second-order neighbor; repeat until the maximum number of iterations t_max is reached.
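The initialization in steps S3.1 and S3.2 and the duplicate check attributed to the "∩" operation can be illustrated with a short Python sketch. The text does not say how position vectors are drawn from the candidate set C, so uniform sampling without replacement is an assumption here; the names `init_population`, `shared_nodes`, and `best_position`, the toy candidate set, and the placeholder fitness are likewise hypothetical.

```python
import random


def init_population(candidates, n_crows, k, rng):
    """Each crow's position is k distinct nodes sampled from the candidate pool (assumed)."""
    return [rng.sample(candidates, k) for _ in range(n_crows)]


def shared_nodes(pos_a, pos_b):
    """Nodes that appear in both position vectors (the duplicate check behind '∩')."""
    return set(pos_a) & set(pos_b)


def best_position(population, fitness):
    """Position vector with the highest fitness, used as the initial X^*."""
    return max(population, key=fitness)


rng = random.Random(0)
candidates = [3, 7, 8, 12, 15, 21]               # hypothetical candidate node set C
population = init_population(candidates, n_crows=4, k=3, rng=rng)
memory = [list(pos) for pos in population]        # each crow first remembers its start
x_best = best_position(population, fitness=sum)   # `sum` is only a placeholder fitness
```

In a fuller implementation the placeholder fitness would be replaced by the LIE evaluation mentioned in step S3.4.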

Further, the specific steps of the local optimization search over the node set according to the improved crow search algorithm in step S3.5 are:

1) Compute the difference between the optimal position vectors of the current crow individual i and the tracked crow individual j; on this basis, perform the crossover operation to obtain the decision vector V_node, and then decide whether to perform local search optimization;

2) Store the first-order (direct) neighbors of node x_i in the node set Neighbors, then traverse these neighbors in turn, find their 2-hop neighbor sets, and add them to the node set NodeSet; after all first-order neighbors have been traversed, deduplicate NodeSet so that the 2-hop neighbors contain no repeated nodes;

3) For each position of the seed set position vector x_i, compute the LIE fitness value obtained when the node at that position is replaced by one of its neighboring nodes, select the node in NodeSet that brings the largest gain to x_i, and replace the corresponding node in x_i with it.
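Steps 2) and 3) above can be sketched in Python as a 2-hop neighborhood scan followed by a best-improvement replacement. The LIE formula is not reproduced in the text, so the sketch takes the fitness function as a parameter and uses a simple neighborhood-coverage count as a stand-in; `two_hop_neighbors`, `local_search`, `coverage`, and the toy graph are illustrative names and data, not part of the patent.

```python
def two_hop_neighbors(graph, node):
    """Deduplicated 2-hop neighbours of `node`; graph maps node -> set of neighbours."""
    node_set = set()
    for nb in graph.get(node, set()):      # first-order neighbours (Neighbors)
        node_set |= graph.get(nb, set())   # their neighbours go into NodeSet
    node_set.discard(node)                 # the node itself is not a replacement
    return node_set


def local_search(graph, position, fitness):
    """Replace each slot with the 2-hop neighbour that most improves fitness, if any."""
    position = list(position)
    for idx in range(len(position)):
        best_node, best_fit = position[idx], fitness(position)
        for cand in two_hop_neighbors(graph, position[idx]):
            if cand in position:           # keep the position vector duplicate-free
                continue
            trial = position[:idx] + [cand] + position[idx + 1:]
            fit = fitness(trial)
            if fit > best_fit:
                best_node, best_fit = cand, fit
        position[idx] = best_node
    return position


# Toy undirected graph (hypothetical data).
graph = {0: {1}, 1: {0, 2}, 2: {1, 3, 4, 5}, 3: {2}, 4: {2}, 5: {2}}


def coverage(seeds):
    """Stand-in fitness: seeds plus their direct neighbours (NOT the patent's LIE)."""
    covered = set(seeds)
    for s in seeds:
        covered |= graph[s]
    return len(covered)


improved = local_search(graph, [0], coverage)
```

In this toy graph the only 2-hop neighbor of node 0 is node 2, which covers five nodes instead of two, so the single seed moves from node 0 to node 2.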

Further, in step S4, the node set C^* is obtained once the number of local search optimization iterations reaches t_max, and k key nodes are selected from the nodes in C^* by an influence maximization algorithm.

Beneficial effects:

By mining key nodes with a discount strategy and an improved discrete crow search algorithm, the invention addresses slow algorithm convergence and the low overall influence of the optimal seed set. Specifically:

(1) The LRDiscount algorithm "discounts" the mutual influence between nodes in the citation network, making full use of network topology and node attribute information.

(2) The local search optimization of the discrete crow search algorithm takes the network node discount strategy into account to screen out the initially activated seed set, which avoids the adverse effect that seed set initialization can have on convergence speed and result quality.

(3) A crossover operation is added to the discrete crow search rule, which maintains population diversity during the search and avoids falling into local optima, and also ensures that the updated best nodes contain no duplicates in the vector x_i.

To address the key node mining problem, the invention first converts the target problem into an optimization problem, then uses the proposed network node discount strategy to screen the initial seed set, and then applies the improved discrete crow search algorithm to optimize the final seed set. The method improves the search for key nodes in complex networks in research on influence maximization and can achieve better results under the same conditions.

Description of the drawings

Figure 1 is the overall flow chart of the invention;

Figure 2 is the sub-flow chart of the LRDiscount algorithm in Figure 1;

Figure 3 is the sub-flow chart of the local optimization process of the improved discrete crow search algorithm in Figure 1.

Detailed description

The invention is further explained below with reference to the drawings and specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading the invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.

As shown in Figure 1, the specific steps of the network key node mining method based on a discount strategy and an improved discrete crow search algorithm disclosed by the invention are as follows:

S1. Preprocess the citation network: convert the target network into its adjacency matrix, and reverse the citation network to obtain a reverse network.

S2. Using the discount strategy algorithm LRDiscount, discount the influence of each node that points to a seed node in the reverse network from S1 to obtain the discounted node influence, then select the nodes with the largest values in turn and add them to the candidate node set C. The specific steps are as follows:

S2.1. In the reverse network obtained in step S1, add a background node bg and connect it to all nodes in the network, obtaining a new strongly connected network of N+1 nodes.

S2.2. Assign an LR value of 1 unit to each of the N nodes other than the background node bg, and an LR value of 0 to bg.

S2.3. Each node distributes its LR value evenly among its directly connected out-neighbors, and the process iterates until a steady state is reached:

$$LR_i(t+1) = \sum_{j=1}^{N+1} \frac{w_{ji}}{k_j^{out}} \, LR_j(t)$$

where k_j^out is the out-degree of node v_j and w_ji is an element of the adjacency matrix: w_ji = 1 if there is an edge from node v_j to node v_i, and 0 otherwise.

S2.4. When the iteration ends at time t_c, distribute the LR value LR_bg(t_c) of the background node bg equally among all nodes in the network to obtain the final LR value of node v_i:

$$LR_i = LR_i(t_c) + \frac{LR_{bg}(t_c)}{N}$$

S2.5. The discount strategy algorithm LRDiscount discounts the influence of the neighbors of each seed node; that is, in the reverse network, it discounts the influence of each node that points to a seed node, thereby obtaining the discounted node influence:

where S is the seed node set, and the discount term represents the ratio of the number of seed nodes among the neighbors of node v_i to the total number of its neighbors.

S2.6. According to the final node influence values, select the nodes with the largest values in INF in turn and add them to the candidate node set C.
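The discount-and-select loop of step S2 can be sketched in Python under one explicit assumption: since the discount formula itself is not reproduced in the text, the sketch multiplies each node's LR value by one minus the seed ratio described above. That multiplicative form, the function name `select_candidates`, and the toy data are assumptions for illustration, not the patent's formula.

```python
def select_candidates(out_neighbors, lr, size):
    """Greedy selection with a seed-ratio discount (the discount form is ASSUMED)."""
    seeds = []
    chosen = set()
    while len(seeds) < min(size, len(lr)):
        best, best_inf = None, float("-inf")
        for v, score in enumerate(lr):
            if v in chosen:
                continue
            nbrs = out_neighbors[v]  # nodes that v points to in the reverse network
            # Ratio of already selected seeds among v's neighbours.
            ratio = len(chosen.intersection(nbrs)) / len(nbrs) if nbrs else 0.0
            inf = score * (1.0 - ratio)  # ASSUMED discount, not taken from the patent
            if inf > best_inf:
                best, best_inf = v, inf
        seeds.append(best)
        chosen.add(best)
    return seeds


# Toy data (hypothetical): node 1 points only to node 0, node 2 points to node 3.
out_neighbors = [[], [0], [3], []]
lr_values = [1.2, 1.0, 0.9, 0.5]
candidate_set = select_candidates(out_neighbors, lr_values, size=2)
```

Here node 0 is chosen first; node 1 points only to the selected seed, so its discounted value drops to zero and node 2 is chosen next despite its lower LR value.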

S3. Optimize the candidate node set C with the local optimization process of the improved discrete crow search algorithm, where the candidate node set serves as the candidate crow population. The improved discrete crow search algorithm adds parallelized iteration to the traditional discrete crow search algorithm: in each iteration, a crow individual is compared only with its own memory vector, and the memory vector is updated accordingly to support the update of that crow's position vector, yielding the optimized node set C^*. As shown in Figure 2, the specific steps are as follows:

S3.1. Initialize the parameters of the discrete crow search algorithm framework IDCSA: the crow population size N, the size k of the seed node set to be solved, the maximum number of iterations t_max, the awareness probability AP, the neighborhood range S for local node search, and other initial parameters.

S3.2. Use the candidate node set C obtained in step S2 to initialize the position vector x_i = (node_1, node_2, ..., node_n) and the memory vector Memory_{t-1} = [m_1, m_2, ..., m_n]^{-1} of the crow population; then select the initial optimal solution position vector X^* from the initialized population.

S3.3. Based on the node encoding and the discretized representation of the crow population's position vectors and memory vectors, construct the discretized search rule for the network space:

where R(r_i, s) is the local search mechanism, and the symbol "∩" is defined as a logical crossover operation whose purpose is to check whether two position vectors contain duplicate nodes.

S3.4. Based on the node pool of the candidate node set C produced in step S2, define an objective function to compute the fitness values of the N crow individuals; the local influence evaluation function LIE is used to approximately evaluate the influence of the seed nodes as the crow position vectors evolve:

S3.5. Perform a local optimization search over the second-order neighbors of each node in the candidate node set C according to the improved discrete crow search algorithm: if the marginal gain of a second-order neighbor is larger than that of the node itself, replace the node in the current optimal solution with that second-order neighbor; repeat until the maximum number of iterations t_max is reached. The specific process is as follows:

First, compute the difference between the optimal position vectors of the current crow individual i and the tracked crow individual j; on this basis, perform the crossover operation to obtain the decision vector V_node, and then decide whether to perform local search optimization.

Then store the first-order (direct) neighbors of node x_i in the node set Neighbors, traverse these neighbors in turn, find their 2-hop neighbor sets, and add them to the node set NodeSet; after all first-order neighbors have been traversed, deduplicate NodeSet so that the 2-hop neighbors contain no repeated nodes.

Finally, for each position of the seed set position vector x_i, compute the LIE fitness value obtained when the node at that position is replaced by one of its neighboring nodes, select the node in NodeSet that brings the largest gain to x_i, and replace the corresponding node in x_i with it.
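The per-crow memory comparison described in step S3 (each crow is compared only with its own memory vector) can be sketched in Python as follows. The function name `update_memory` and the placeholder fitness are assumptions; the rule that a memory is overwritten only when the new position has strictly higher fitness is also an assumption, since the text does not state the tie-breaking behavior.

```python
def update_memory(positions, memory, fitness):
    """Each crow keeps its new position only if it beats that crow's own memory."""
    return [
        list(pos) if fitness(pos) > fitness(mem) else list(mem)
        for pos, mem in zip(positions, memory)
    ]


# Toy data (hypothetical); `sum` is only a placeholder for the LIE fitness.
positions = [[1, 2], [0, 1]]
memory = [[0, 1], [5, 5]]
memory = update_memory(positions, memory, fitness=sum)
```

Because each comparison reads only one crow's position and memory, the iterations are independent of one another, which is consistent with the parallelized iteration the text describes.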

The invention can be combined with a computer system to carry out the mining of seed nodes.

The disclosed network key node mining method based on a discount strategy and an improved discrete crow search algorithm can be used to mine key nodes in complex networks of different scales.

The above embodiments only illustrate the technical concept and features of the invention; their purpose is to enable those familiar with this technology to understand and implement the invention, and they do not limit its scope of protection. All equivalent changes or modifications made according to the spirit and essence of the invention shall fall within its scope of protection.

Claims

1. A network key node mining method based on a discount strategy and an improved discrete crow search algorithm, characterized by comprising the following steps:

S1. preprocessing a citation network, converting the target network to obtain an adjacency matrix of the target network, and reversing the citation network to obtain a reverse network;

S2. in the reverse network of S1, discounting the node influence of the network according to the LRDiscount algorithm to obtain the discounted node influence, and selecting the nodes with the largest values in turn and adding them to a candidate node set C;

S3. optimizing the candidate node set C by the local optimization process of an improved discrete crow search algorithm, wherein the candidate node set is the candidate crow population, and the improved discrete crow search algorithm adds parallelized iteration to the traditional discrete crow search algorithm, namely, in each iteration a crow individual is compared only with its own memory vector, and the memory vector is updated accordingly to support the update of the position vector of the crow individual, thereby obtaining the optimized node set C^*;

S4. finally, selecting an optimal set from the optimized node set C^* and evaluating node influence to obtain the final k key seed nodes.

2. The network key node mining method based on a discount strategy and an improved discrete crow search algorithm according to claim 1, wherein the specific steps of obtaining the candidate seed set C in step S2 are as follows:

S2.1. adding a background node bg to the reverse network and connecting it to all nodes in the network, thereby obtaining a new strongly connected network of N+1 nodes;

S2.2. assigning an LR value of 1 unit to each of the N nodes other than the background node bg, the LR value of the background node bg being 0;

S2.3. distributing the LR value of each node evenly among its directly connected out-neighbors, and iterating until a steady state is reached:

$$LR_i(t+1) = \sum_{j=1}^{N+1} \frac{w_{ji}}{k_j^{out}} \, LR_j(t)$$

wherein k_j^out is the out-degree of node v_j, and w_ji is an element of the adjacency matrix, with w_ji = 1 if there is an edge from node v_j to node v_i, and 0 otherwise;

S2.4. after the iteration ends at time t_c, distributing the LR value LR_bg(t_c) of the background node bg equally among all nodes in the network to obtain the final LR value of node v_i:

$$LR_i = LR_i(t_c) + \frac{LR_{bg}(t_c)}{N}$$

S2.5. discounting, by the discount strategy algorithm LRDiscount, the influence of the neighbors of each seed node, namely, in the reverse network, discounting the influence of each node that points to a seed node, thereby obtaining the discounted node influence:

wherein S is the seed node set, and the discount term represents the ratio of the number of seed nodes among the neighbors of node v_i to the total number of its neighbors;

S2.6. according to the final node influence values, selecting the nodes with the largest values in INF in turn and adding them to the candidate node set C.

3. The network key node mining method based on a discount strategy and an improved discrete crow search algorithm according to claim 1, wherein the specific steps of optimizing the candidate node set C by the local optimization process of the improved discrete crow search algorithm in step S3 are as follows:

S3.1. initializing the parameters of the discrete crow search algorithm framework IDCSA, including the crow population size N, the size k of the seed node set to be solved, the maximum number of iterations t_max, the awareness probability AP, and the neighborhood range S for local node search;

S3.2. taking the candidate node set C obtained in step S2 as the candidate crow population, and thereby initializing the position vector x_i = (node_1, node_2, ..., node_n) and the memory vector Memory_{t-1} = [m_1, m_2, ..., m_n]^{-1} of the crow population; and selecting an initial optimal solution position vector X^* from the initialized population;

S3.3. constructing a discretized search rule for the network space based on the node encoding and the discretized representation of the crow population's position vectors and memory vectors:

wherein R(r_i, s) is the local search mechanism, and the symbol "∩" is defined as a logical crossover operation whose purpose is to check whether two position vectors contain duplicate nodes;

S3.4. based on the node pool of the candidate node set C produced in step S2, defining an objective function to compute the fitness values of the N crow individuals, and using the local influence evaluation function LIE to approximately evaluate the influence of the seed nodes as the crow position vectors evolve:

S3.5. performing a local optimization search over the second-order neighbors of each node in the candidate node set C according to the improved discrete crow search algorithm, wherein if the marginal gain of a second-order neighbor is larger than that of the node itself, the node in the current optimal solution is replaced with that second-order neighbor, and this is repeated until the maximum number of iterations t_max is reached.

4. The network key node mining method based on a discount strategy and an improved discrete crow search algorithm according to claim 3, wherein the specific steps of performing the local optimization search over the node set according to the improved crow search algorithm in step S3.5 are as follows:

1) computing the difference between the optimal position vectors of the current crow individual i and the tracked crow individual j, performing on this basis the crossover operation to obtain a decision vector V_node, and then deciding whether to perform local search optimization;

2) storing the first-order direct neighbors of node x_i in a node set Neighbors, traversing these neighbors in turn, finding their 2-hop neighbor sets and adding them to a node set NodeSet, and, after all first-order neighbors have been traversed, deduplicating NodeSet so that the 2-hop neighbors contain no repeated nodes;

3) for each position of the seed set position vector x_i, computing the LIE fitness value obtained when the node at that position is replaced by one of its neighboring nodes, selecting the node in NodeSet that brings the largest gain to x_i, and replacing the corresponding node in x_i with it.

5. The network key node mining method based on a discount strategy and an improved discrete crow search algorithm according to claim 1, wherein in S4 the node set C^* is obtained after the number of local search optimization iterations reaches t_max, and k key nodes are selected from the nodes in the node set C^* by an influence maximization algorithm.