CN111861122A

CN111861122A - A Social Network Information Credibility Evaluation Method Based on Similarity of Propagation Attributes

Info

Publication number: CN111861122A
Application number: CN202010558019.3A
Authority: CN
Inventors: 李大庆; 张欣予
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2020-06-18
Filing date: 2020-06-18
Publication date: 2020-10-30
Anticipated expiration: 2040-06-18
Also published as: CN111861122B

Abstract

The invention provides a method for evaluating the credibility of social network information based on the similarity of propagation attributes, comprising the following steps. Step A: extract the propagation content of social network information and construct a social network information propagation network. Step B: extract the node attributes of the propagation network, calculate its topological attributes, label the credibility of historical information, and establish a historical information propagation attribute database. Step C: calculate the propagation attribute similarity between the target social network information and the historical social network information, and use it to evaluate the credibility of the target information. Through these steps, the credibility of social network information can be evaluated on the basis of propagation attribute similarity. The method is holistic, objective, and general, and addresses the difficulty of objectively measuring and evaluating the credibility of social network information.

Description

A Social Network Information Credibility Evaluation Method Based on Propagation Attribute Similarity

Technical Field

The invention proposes a method for evaluating the credibility of social network information based on the similarity of propagation attributes, and relates to technical fields such as network science and social network analysis.

Background Art

In a narrow sense, social network information refers to information in a social network environment that, within a specific time period, can be traced to its source, triggers propagation, and has a certain impact on society. Hot news, trending topics, and controversial events are all common forms of social network information.

With the development of information technology and the emergence of new online media, social network information spreads ever faster and at ever larger scale, and its content has a growing influence as the number of propagation levels increases. For example, when a major disaster occurs, people can use social network platforms such as Weibo to transmit disaster-related information quickly and accurately, creating hot news and exposing controversial matters such as whether relief measures were timely, which prompts the government and the public to take rescue measures sooner and more effectively. However, social network information also contains false information and rumors, such as controversial events and information related to pseudoscience and religious superstition. If such content lacking credibility spreads widely on social networks, it can cause public panic and even affect social stability. How to evaluate the credibility of a piece of social network information has therefore become a key research question across society.

Social network information with high credibility can generally be traced back to a socially certified publishing organization or individual. Such information conveys real events in a timely manner and delivers valuable information and knowledge to those who need it. Social network information lacking credibility, by contrast, usually spreads as hearsay passed from person to person; its propagation range is more scattered and uncertain than that of highly credible information, and the groups propagating it are hard to identify. It can therefore mislead the public on social networks and allow untrustworthy information to spread widely.

Because social network information lacking credibility is widespread and the losses it causes have a large impact on people's lives and even on the functioning of society, it is necessary to evaluate and judge the credibility of social network information accurately with scientific and reasonable methods. Several kinds of evaluation methods are available: methods based on attributes of the information publisher, such as authority, influence, activity, and social relations; methods based on attributes of the information content, such as completeness, linguistic analysis, and stance judgment; and methods based on macroscopic features of the timing and volume of posts. How to evaluate credibility quickly from the propagation characteristics of information is the focus of the present invention.

Most previous methods evaluate credibility from publisher attributes, content attributes, or macroscopic attributes such as timing and volume, or require considerable manpower and resources to collect and judge the attributes of the specific information being evaluated. They seldom study or exploit the propagation attributes of information, and they do not refer to historical propagation attributes when evaluating credibility. Although these methods can evaluate the credibility of specific social network information quantitatively or qualitatively, they fail to capture the significant and fundamental differences between the propagation characteristics of credible and non-credible information. As a result, they generalize poorly, offer limited general interpretability, and incur high labor and time costs. Existing research has shown that social network information of similar credibility has clearly similar propagation attributes, whereas information of different credibility differs markedly in its propagation structure attributes, and these differences do not depend on the specific content of the information.

To evaluate the credibility of a piece of target social network information, the invention first extracts the propagation attributes of a certain number of historical information items and of the target information, including the propagator attributes and the propagation network structure. It builds an information propagation network with propagators as nodes and forwarding relationships as directed edges, computes the propagation attributes of the target information, and compares them with the existing database of information propagation networks to compute the propagation attribute similarity between the target information and the historical information. The credibility of the target information is then evaluated by weighting the similarities and comparing against the credibility of the historical information.

By introducing propagation attribute features, this patent computes and analyzes the credibility of social network information. Because it takes into account both the historical credibility and the propagation attribute features of social network information, it can reasonably evaluate the credibility of target information and has good generality and novelty. On this methodological basis and in view of its practical significance, a method for evaluating the credibility of social network information based on propagation attribute similarity is proposed.

Summary of the Invention

(1) Purpose of the Invention

The invention mainly addresses the problem of evaluating information credibility when large volumes of unverified information spread on social network platforms. Most existing methods evaluate credibility from publisher attributes, content attributes, or macroscopic attributes such as timing and volume, or require considerable manpower and resources to collect and judge the attributes of the specific information being evaluated. They seldom study or exploit the propagation attributes of information, and they do not refer to historical propagation attributes when evaluating credibility. In view of these deficiencies, this patent starts from the propagation path and evaluates the credibility of social network information on the basis of propagation attribute similarity.

The method extracts the propagation node attributes and propagation structure attributes of information in a social network environment and establishes a credibility evaluation method by reasonably computing and comparing the propagation features of the information against historical information. This makes it possible to evaluate the credibility of target information, and thus provides a solid basis for adjusting the monitoring intensity of social network information, rating public opinion, and, where necessary, investigating and issuing statements on target events.

(2) Technical Solution

To achieve the above purpose, the technical solution adopted by the invention is a method for evaluating the credibility of social network information based on the similarity of propagation attributes.

The method applies the idea of complex network modeling. On a social network platform, the forwarding relationships and propagators of social network information, such as hot news, trending topics, and controversial events, are taken as edges and nodes to build a network, from which a historical social network information database is constructed. The propagation attribute similarity between the target social network information to be evaluated (for example, a specific piece of hot news) and each historical information propagation network is then extracted, so as to measure and evaluate the credibility of that piece of information.

The steps of the method for evaluating the credibility of social network information based on the similarity of propagation attributes are as follows:

Step A: extract the propagation content of social network information and construct a social network information propagation network;

Step B: extract the node attributes of the propagation network, calculate the topological attributes of the propagation network, label the credibility of historical information, and establish a historical information propagation attribute database;

Step C: calculate the propagation attribute similarity between the target social network information and the historical social network information, and use it to evaluate the credibility of the target social network information.

Through the above steps, the credibility of social network information can be evaluated on the basis of propagation attribute similarity. The method is holistic, objective, and general, and addresses the difficulty of objectively measuring and evaluating the credibility of social network information.

In step A, "extracting the propagation content of social network information and constructing a social network information propagation network" is carried out as follows: for a given number of historical social network information items with known credibility and for the target social network information, select one publisher of each as the initial propagation node, and build the propagation network of that information layer by layer from the direct forwarders of each node and the directed direct-forwarding relationships. The specific steps are as follows:

Step A1: from the public content of the social network platform, select one publisher of a piece of social network information Event (which may be a trending topic, hot news, a controversial event, etc.) as the initial propagation node V₀₀, take the direct forwarders of this initial node as lower-layer nodes V_0j, and treat the direct forwarding relationship between the initial node and each lower-layer node as a directed propagation edge E₀(V₀₀, V_0j), thereby constructing the first-layer information propagation network G₀(V₀, E₀), where V₀ denotes the node set consisting of the initial propagation node V₀₀ and its direct forwarder nodes V_0j, and E₀ denotes the set of forwarding edges between V₀₀ and its direct forwarder nodes V_0j;

Step A2: set a threshold L on the number of layers of the information propagation network, traverse the forwarding nodes of each layer, treat each of them as an initial node of that layer, and repeat step A1 to build the inter-layer propagation network layer by layer until the number of propagation layers reaches the threshold, obtaining the overall propagation network G(V, E) of the social network information Event, where V denotes the set of nodes in the network and E denotes the forwarding relationships between nodes in the network;

Step A3: repeat steps A1 and A2 to build an information propagation network G_history(V_history, E_history) for each of a given number Num of historical social network information items Event_history, and an information propagation network G_obj(V_obj, E_obj) for the target social network information Event_obj, where V_history and E_history denote the node set and the node forwarding relationships of the network of a historical item Event_history, and V_obj and E_obj denote the node set and the node forwarding relationships of the network of the target information Event_obj.
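As an illustration only, the layer-by-layer construction of steps A1 to A3 can be sketched as a breadth-first traversal. The sketch is not part of the claimed method: the input format `reposts` (a mapping from a user to the users who directly forwarded that user's post), the function name `build_propagation_network`, and the reading of the threshold L as the maximum number of forwarding layers are all assumptions made for the example.

```python
from collections import deque


def build_propagation_network(reposts, publisher, max_layers):
    """Build G(V, E) layer by layer, starting from one publisher node.

    reposts:    hypothetical input, dict mapping a user id to the list of
                user ids that directly forwarded that user's post.
    max_layers: the layer threshold L of step A2 (assumed to count
                forwarding layers below the publisher).
    Returns (nodes, edges, layer_of); edges are directed pairs
    (forwarded_from, forwarder) and layer_of maps node -> layer index.
    """
    nodes = {publisher}
    edges = set()
    layer_of = {publisher: 0}
    frontier = deque([publisher])
    while frontier:
        current = frontier.popleft()
        if layer_of[current] >= max_layers:
            continue  # threshold L reached: do not expand this node
        for forwarder in reposts.get(current, []):
            edges.add((current, forwarder))
            if forwarder not in nodes:
                nodes.add(forwarder)
                layer_of[forwarder] = layer_of[current] + 1
                frontier.append(forwarder)
    return nodes, edges, layer_of


# Toy data: u4 lies three layers below the publisher and is cut off by L = 2.
reposts = {"u0": ["u1", "u2"], "u1": ["u3"], "u3": ["u4"]}
nodes, edges, layer_of = build_propagation_network(reposts, "u0", max_layers=2)
```

Applying the same function to each of the Num historical items and to the target item would yield the networks G_history and G_obj of step A3.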

In step B, "extracting the node attributes of the propagation network, calculating the topological attributes of the propagation network, labeling the credibility of historical information, and establishing a historical information propagation attribute database" is carried out as follows: extract the attributes of each propagation node and of the propagation topology in the constructed propagation networks, then label the credibility of each historical social network information item, and establish the historical social network information propagation attribute database. The specific steps are as follows:

Step B1: extract the unique identifier of each propagation node V in the constructed propagation network G(V, E) as the propagation node attribute F_vec of the information;

Step B2: extract the network topology of the constructed propagation network G(V, E) and from it the propagation structure attributes F_struc, including but not limited to:

(1) the initial forwarding layer ratio r_2/1 of the propagation network G(V, E), given by Equation (1), that is, the ratio of the number of nodes n_V(2) in the second layer of the propagation network to the number of nodes n_V(1) in the first layer;

r_2/1 = n_V(2) / n_V(1)    (1)

(2) the node characteristic distance a of the propagation network G(V, E), obtained by fitting the distribution of distances between all node pairs in the propagation network with the fitting equation given as Equation (2), where y is the distribution probability, x is the distance between a node pair, and b is a fitting constant;

(3) the homogeneity index h of the propagation network G(V, E), that is, the difference between the logarithm of the homogeneity H_star of a star network of the same size and the logarithm of the homogeneity H_G of the propagation network G(V, E), calculated by Equation (3), where the homogeneity of a network is calculated by Equation (4);

h = log(H_star) - log(H_G)    (3)

(where N is the total number of nodes in the network and k is the node degree)    (4)
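Two of the structural quantities in step B2 can be illustrated with a short sketch. It computes the layer ratio r_2/1 and the empirical distribution of node-pair distances that Equation (2) is fitted to. Equations (2) and (4) are not reproduced in this text, so the fit for a and the homogeneity H_G are deliberately left out. The function names, the toy network, the convention that layer 1 holds the publisher's direct forwarders, and the treatment of forwarding edges as undirected when measuring distance are assumptions of the example.

```python
from collections import Counter, deque


def layer_ratio(layer_of):
    """r_2/1 = n_V(2) / n_V(1), counting nodes per forwarding layer."""
    counts = Counter(layer_of.values())
    if counts[1] == 0:
        return 0.0  # no direct forwarders: ratio undefined, 0 used here
    return counts[2] / counts[1]


def distance_distribution(nodes, edges):
    """Empirical probability y of each shortest-path distance x over all
    node pairs. ASSUMPTION: edges are treated as undirected."""
    neighbours = {v: set() for v in nodes}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    histogram = Counter()
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            v = queue.popleft()
            for w in neighbours[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for d in dist.values():
            if d > 0:
                histogram[d] += 1  # every pair is counted once per direction
    total = sum(histogram.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in sorted(histogram.items())}


# Toy network: publisher u0, two direct forwarders, one second-layer node.
nodes = {"u0", "u1", "u2", "u3"}
edges = {("u0", "u1"), ("u0", "u2"), ("u1", "u3")}
layer_of = {"u0": 0, "u1": 1, "u2": 1, "u3": 2}

r_21 = layer_ratio(layer_of)
y_of_x = distance_distribution(nodes, edges)
```

The dictionary `y_of_x` holds the (x, y) points to which the fitting equation would then be applied.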

Step B3: based on a compilation of historical facts and certification by authoritative bodies, set the credibility evaluation index Credibility(Event_history) of each historical social network information item Event_history, where the value 0 indicates that the information is completely non-credible and the value 1 indicates that it is completely credible; the propagation node credibility index Credibility_vec(Event_history) and the propagation topology credibility index Credibility_struc(Event_history) of Event_history are set equal to this overall credibility index;

Step B4: for the Num collected historical social network information items Event_history, add the historical propagation node attributes F_vec(Event_history), the historical propagation network structure attributes F_struc(Event_history), and the historical credibility index Credibility(Event_history) to the historical social network information propagation attribute database DS(Event_history).

In step C, "calculating the propagation attribute similarity between the target social network information and the historical social network information, and using it to evaluate the credibility of the target social network information" is carried out as follows: calculate the propagation node similarity Sim_vec between the target social network information Event_obj and the historical information Event_history to obtain the propagation node credibility Credibility_vec(Event_obj) of the target information; then calculate the propagation structure similarity quantiles between Event_obj and Event_history to obtain the propagation structure credibility Credibility_struc(Event_obj); finally, assign a weight to each propagation attribute credibility and calculate the credibility of the target social network information. The specific steps are as follows:

Step C1: using a set similarity measure such as the Jaccard similarity (Equation (5)), calculate the propagation node attribute similarity Sim_vec(Event_obj, Event_history) between the target social network information Event_obj and every historical item Event_history in the historical information propagation attribute database DS(Event_history); then select the largest similarity value Sim_{vec_max} among the results and its corresponding historical item Event_{vec_max}, and record the credibility Credibility_vec(Event_{vec_max}) of that item; the propagation node credibility of the target information Event_obj is then Credibility_vec(Event_obj) = Sim_{vec_max} × Credibility_vec(Event_{vec_max});

where F_vec(Event_obj) denotes the propagation node attributes of the target information and F_vec(Event_history) denotes the propagation node attributes of a historical item;
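Step C1 can be sketched as follows, using the standard Jaccard similarity (size of the intersection divided by size of the union) on sets of node identifiers. The sketch is a hedged illustration: the function names, the tuple layout of the historical records, and the toy data are assumptions, and a non-empty history is assumed.

```python
def jaccard(a, b):
    """Standard Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def node_credibility(target_nodes, history):
    """Credibility_vec(Event_obj) = Sim_vec_max * Credibility_vec(Event_vec_max).

    history: hypothetical list of (event_name, node_id_set, credibility)
             records; must not be empty.
    Returns (credibility, most_similar_event_name, max_similarity).
    """
    best_sim, best_event, best_cred = max(
        ((jaccard(target_nodes, ids), name, cred) for name, ids, cred in history),
        key=lambda record: record[0],
    )
    return best_sim * best_cred, best_event, best_sim


# Toy data: the target shares two of five distinct nodes with h1.
target_nodes = {"a", "b", "c", "d"}
history = [
    ("h1", {"a", "b", "x"}, 1),  # labelled credible
    ("h2", {"c", "y", "z"}, 0),  # labelled non-credible
]
cred_vec, closest_event, sim_max = node_credibility(target_nodes, history)
```

Because the result is the product of the maximum similarity and the matched item's label, a target that closely matches a non-credible historical item receives a node credibility of 0 under this rule.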

Step C2: for the historical items Event_history in the historical information propagation attribute database DS(Event_history), sort them by the value of each of the M sub-attributes of the propagation structure attributes F_struc (including but not limited to r_2/1, a, and h), using the same ordering (ascending or descending) for every sub-attribute, to obtain M sorted sequences of historical items;

Step C3: divide the i-th sorted sequence of historical items obtained in C2 into K intervals containing equal numbers of items, so that each interval contains Num/K historical items; then compare the value of the i-th propagation structure sub-attribute of the target social network information with the values of that sub-attribute for the historical items Event_history in the i-th sorted sequence, find the most similar historical item Event_history, and take the index k_{struc_i} of the interval to which it belongs as the similarity quantile of the target information Event_obj for that sub-attribute; repeat this step for the sub-attributes i = 1, 2, ..., M to obtain the similarity quantile of Event_obj for each propagation structure sub-attribute;
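The sorting and interval assignment of steps C2 and C3 can be sketched for a single sub-attribute as shown below. This is a minimal illustration under stated assumptions: "most similar" is read as the smallest absolute difference in value, the ordering is ascending, intervals are numbered from 1, and the function name and sample values are hypothetical.

```python
def similarity_quantile(target_value, history_values, k):
    """Return the 1-based index of the equal-count interval that holds the
    historical value closest to target_value (steps C2 and C3, one sub-attribute).
    """
    ordered = sorted(history_values)      # step C2: ascending order
    if k <= 0 or k > len(ordered):
        raise ValueError("k must be between 1 and the number of historical items")
    per_interval = len(ordered) // k      # Num / K items per interval
    # ASSUMPTION: "most similar" means smallest absolute difference in value.
    nearest = min(range(len(ordered)), key=lambda i: abs(ordered[i] - target_value))
    # Any remainder items (Num not divisible by K) fall into the last interval.
    return min(nearest // per_interval, k - 1) + 1


# Eight historical values of one sub-attribute, split into K = 4 intervals.
history_r21 = [0.4, 0.1, 0.8, 0.3, 0.6, 0.2, 0.7, 0.5]
k_low = similarity_quantile(0.05, history_r21, k=4)
k_mid = similarity_quantile(0.52, history_r21, k=4)
k_high = similarity_quantile(5.0, history_r21, k=4)
```

Repeating the call for each of the M sub-attributes would give the M interval indices k_{struc_i} used in step C4.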

Step C4: calculate the propagation structure sub-attribute credibility Credibility_{struc_i} of the target social network information Event_obj relative to the historical information Event_history according to Equation (6), where i denotes the i-th of the M propagation structure sub-attributes, and the two counts in Equation (6) are, respectively, the numbers of historical items Event_history with credibility 0 and with credibility 1 in the quantile k_i of this sub-attribute to which the target information Event_obj belongs;
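Equation (6) is not reproduced in this text. Purely as an assumption for illustration, the sketch below takes the sub-attribute credibility to be the share of credible items among the labelled historical items in the matched interval, which is one natural way to combine the two counts described above; the actual Equation (6) may differ. The function name and sample labels are hypothetical.

```python
def sub_attribute_credibility(interval_labels):
    """Credibility of one structure sub-attribute from the labels (0 or 1)
    of the historical items in the interval k_i matched by the target."""
    n_not_credible = sum(1 for label in interval_labels if label == 0)
    n_credible = sum(1 for label in interval_labels if label == 1)
    labelled = n_not_credible + n_credible
    if labelled == 0:
        raise ValueError("interval contains no labelled historical items")
    # ASSUMPTION: Equation (6) is taken to be the share of credible items.
    return n_credible / labelled


# Toy interval with three credible items and one non-credible item.
labels_in_interval = [1, 1, 0, 1]
cred_struc_i = sub_attribute_credibility(labels_in_interval)
```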

Step C5: assign weights w_vec, w_{struc_1}, ..., w_{struc_M} to the propagation node credibility Credibility_vec and the M propagation structure sub-attribute credibilities Credibility_{struc_i}, i = 1, ..., M, such that all weights sum to 1; the weights may be assigned by equal allocation, the analytic hierarchy process, fuzzy evaluation, or similar methods, according to the importance of each attribute or other relevant information;

Step C6: calculate the credibility of the target social network information using Equation (7) and evaluate it from the result: the closer the value is to 0, the lower the credibility, and the closer it is to 1, the higher the credibility;
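Steps C5 and C6 can be illustrated with a short sketch. Equation (7) is not reproduced in this text; since the weights are required to sum to 1 and the result lies between 0 and 1, the sketch assumes it is the weighted sum of the node credibility and the M sub-attribute credibilities. The function name, the equal-allocation weights, and the input values are assumptions of the example.

```python
def overall_credibility(cred_vec, cred_struc, w_vec, w_struc):
    """Weighted combination of the node credibility and the M structure
    sub-attribute credibilities.
    ASSUMPTION: Equation (7) is a plain weighted sum."""
    if len(cred_struc) != len(w_struc):
        raise ValueError("one weight is needed per structure sub-attribute")
    if abs(w_vec + sum(w_struc) - 1.0) > 1e-9:
        raise ValueError("all weights must sum to 1 (step C5)")
    return w_vec * cred_vec + sum(w * c for w, c in zip(w_struc, cred_struc))


# Equal allocation over one node credibility and M = 3 sub-attributes.
cred_vec = 0.4
cred_struc = [1.0, 0.5, 0.0]
weights = [0.25, 0.25, 0.25, 0.25]
credibility_obj = overall_credibility(cred_vec, cred_struc, weights[0], weights[1:])
```

With these inputs the combined value is 0.475, which under step C6 sits roughly midway between non-credible and credible.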

(3) Advantages and Innovations

The invention has the following innovations:

1. Generality: this patent is not a credibility evaluation method tailored to one particular piece of social network information, but a method based on propagation attribute similarity that applies to all kinds of social network information, and it therefore has good generality.

2. Portability: this patent does not restrict the set of propagation attributes or the way the similarity quantile of each attribute is calculated. In a specific credibility evaluation task, propagation attributes can be added or removed and the calculation method adjusted as the situation requires, so the method is highly portable.

3. Objectivity: by introducing the propagation features of social network information together with the verified credibility of historical information, this patent improves the applicability of the method and enables a more objective credibility evaluation.

4. Holistic view: this patent evaluates credibility on the basis of propagation attribute similarity from the perspective of the whole propagation process, so it can capture changes in the overall information and has a strongly holistic character.

In summary, this credibility evaluation method based on propagation attribute similarity combines historical information with the features of information propagation to better evaluate the credibility of a piece of social network information, and it makes up for the deficiencies of existing methods. The method is scientifically sound, practical to implement, and of broad application value.

Description of the Drawings

FIG. 1 is a flow chart of the framework of the method of the invention.

Detailed Description of the Embodiments

To make the technical problem to be solved and the technical solution of the invention clearer, a detailed description is given below with reference to the drawing and a specific embodiment. It should be understood that the embodiment described here serves only to illustrate and explain the invention and does not limit it.

The purpose of the invention is to address the problem of evaluating the credibility of social network information when large volumes of unverified information spread on social network platforms. Most existing methods evaluate credibility from publisher attributes, content attributes, or macroscopic attributes such as timing and volume, or require considerable manpower and resources to collect and judge the attributes of the specific information being evaluated. They seldom study or exploit the propagation attributes of information, and they do not refer to historical propagation attributes when evaluating credibility. In view of these deficiencies, this patent starts from the propagation path and evaluates the credibility of social network information on the basis of propagation attribute similarity.

The method is highly general, portable, objective, and holistic. The present invention is further described below with reference to the accompanying drawing and a specific embodiment.

This embodiment illustrates the method using the credibility evaluation of a target piece of social network information, Event_obj. Event_obj is known to be a piece of social network information that has triggered heated discussion and spread widely on a social network platform, and no official or authoritative body has stated or confirmed whether it is credible or true. In the following, the content of Event_obj is taken to be "Eating garlic cannot eliminate the novel coronavirus".

The "social network information credibility evaluation method based on propagation attribute similarity" of the present invention is shown in FIG. 1, and its steps are as follows:

Step C: Calculate the similarity of propagation attributes between the target social network and the historical social networks, and use it to evaluate the information credibility of the target social network.

Step A1: Based on the public content of the social network platform, select one publisher of a piece of social network information Event as the initial propagation node V₀₀, take the direct forwarders of this initial node as lower-layer nodes V_0j, and treat the direct forwarding relationship between the initial node and each lower-layer node as a directed propagation edge E₀(V₀₀, V_0j), thereby constructing the first-layer information propagation network G₀(V₀, E₀), where V₀ is the node set consisting of the initial propagation node V₀₀ and its direct forwarder nodes V_0j, and E₀ is the set of edges representing the forwarding relationships between V₀₀ and its direct forwarder nodes V_0j;

Step A2: Set a threshold L on the number of layers of the information propagation network (for example, L = 3). Traverse the forwarding nodes of each layer, treat each as an initial node of that layer, and repeat step A1, building the inter-layer propagation network layer by layer until the number of layers reaches the threshold; this yields the overall propagation network G(V, E) of the social network information Event. For example, for the initial propagation node V₀₀, collect its direct forwarders as lower-layer nodes V_0j, j = 1, 2, ..., and establish the edges E₀(V₀₀, V_0j); the propagation network now has 1 layer. Then, for each first-layer node V_0j, j = 1, 2, ..., collect its direct forwarders as lower-layer nodes V_1k, k = 1, 2, ..., and establish the edges E₁(V_0j, V_1k); the propagation network now has 2 layers. Continue in this way until the number of layers reaches the threshold, obtaining the overall propagation network G(V, E) of Event;
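The layer-by-layer construction in steps A1 and A2 can be sketched in Python as follows. This is only an illustrative sketch: the callable `get_forwarders`, the function name, and the toy forwarding data are assumptions, since the patent does not specify how forwarding records are obtained from the platform. The sketch also assumes each user is attached to the first node from which they are seen forwarding.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def build_propagation_network(
    source: str,
    get_forwarders: Callable[[str], Iterable[str]],  # hypothetical data-access helper
    max_layers: int = 3,                             # threshold L from step A2
) -> Tuple[Set[str], Set[Tuple[str, str]], Dict[str, int]]:
    """Build G(V, E) layer by layer, starting from the initial node V00."""
    nodes: Set[str] = {source}
    edges: Set[Tuple[str, str]] = set()
    layer_of: Dict[str, int] = {source: 0}
    frontier: List[str] = [source]
    for layer in range(1, max_layers + 1):
        next_frontier: List[str] = []
        for parent in frontier:
            for child in get_forwarders(parent):
                # Simplification: keep only the first forwarding path to a user.
                if child not in layer_of:
                    layer_of[child] = layer
                    nodes.add(child)
                    edges.add((parent, child))  # directed edge: parent -> forwarder
                    next_frontier.append(child)
        frontier = next_frontier
        if not frontier:
            break
    return nodes, edges, layer_of


# Toy forwarding data (made up for illustration).
forwards = {"u0": ["u1", "u2"], "u1": ["u3"], "u3": ["u4"], "u4": ["u5"]}
nodes, edges, layer_of = build_propagation_network(
    "u0", lambda user: forwards.get(user, []), max_layers=3
)
```

With L = 3, the user `u5` is left out because it would sit in a fourth forwarding layer.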

Step A3: Repeat steps A1 and A2 to build an information propagation network G_history(V_history, E_history) for each of a given number Num (for example, Num = 100) of pieces of historical social network information Event_history. The historical information may be verified content on the social platform that is related to the content of the target information Event_obj (for example, "Eating garlic cannot eliminate viruses" is credible information and "Drinking alcohol can eliminate the novel coronavirus" is non-credible information), or verified content unrelated to Event_obj (for example, "The novel coronavirus has caused one million deaths in China" is non-credible information and "Biological crises are all conspiracies manufactured by the United States" is non-credible information). Likewise, repeat steps A1 and A2 to build the information propagation network G_obj(V_obj, E_obj) of the target social network information Event_obj, namely "Eating garlic cannot eliminate the novel coronavirus".

The "extracting attributes of social network information propagation nodes, calculating topological attributes of the propagation network, labeling the credibility of historical information, and establishing a historical information propagation attribute database" described in step B is carried out as follows: for the given number of pieces of historical social network information with known credibility and for the target social network information, extract the propagation node attributes and propagation topology attributes from each constructed propagation network (whether historical or target); then label the credibility of each piece of historical information and establish the historical social network information propagation attribute database. The specific steps are as follows:

Step B1: Extract the unique identifier of each propagation node V in the constructed propagation network G(V, E), such as the user ID on the social platform, as the propagation node attribute F_vec of that piece of information;

Step B2: Extract the topology of the constructed propagation network G(V, E) and derive the propagation structure attribute F_struc of the network, by computational or manual statistical means, including but not limited to:

(2) The node characteristic distance a of the propagation network G(V, E), obtained by fitting the distribution of distances between all node pairs in the network; the fitting equation is Equation (2), where y is the distribution probability and x is the distance between a node pair;

(3) The homogeneity index h of the propagation network G(V, E), that is, the difference between the logarithms of the homogeneity of G(V, E) and of a star network of the same size, calculated as in Equation (3), where the homogeneity of the propagation network is calculated as in Equation (4);
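Two of the inputs to step B2 can be sketched in Python as follows. The first function computes the layer ratio n_V(2)/n_V(1), which the claims give as formula (1); the sketch assumes layer 1 means the direct forwarders and layer 2 means their forwarders. The second function computes the empirical distribution of node-pair distances (the x and y of Equation (2)); it assumes edges are treated as undirected when measuring distance. The fitting equation itself and the homogeneity formula (4) are not reproduced in the text, so they are not implemented here. All function names and the toy network are hypothetical.

```python
from collections import Counter, deque
from typing import Dict, Set, Tuple


def layer_ratio(layer_of: Dict[str, int]) -> float:
    """Ratio of second-layer to first-layer node counts, n_V(2) / n_V(1)."""
    counts = Counter(layer_of.values())
    if counts[1] == 0:
        raise ValueError("the network has no first-layer forwarders")
    return counts[2] / counts[1]


def distance_distribution(
    nodes: Set[str], edges: Set[Tuple[str, str]]
) -> Dict[int, float]:
    """Empirical probability y of each node-pair distance x (edges as undirected)."""
    adjacency: Dict[str, Set[str]] = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    counts: Counter = Counter()
    for start in nodes:
        # Breadth-first search gives shortest distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for d in dist.values():
            if d > 0:
                counts[d] += 1  # every pair is counted twice; ratios are unaffected
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}


# Toy network (made up): u0 -> u1, u0 -> u2, u1 -> u3.
toy_layers = {"u0": 0, "u1": 1, "u2": 1, "u3": 2}
toy_edges = {("u0", "u1"), ("u0", "u2"), ("u1", "u3")}
ratio = layer_ratio(toy_layers)
distribution = distance_distribution(set(toy_layers), toy_edges)
```

The resulting distribution could then be passed to whatever curve-fitting routine implements Equation (2) to obtain the characteristic distance a.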

Step B3: Based on a summary of historical facts and certification by authoritative bodies, set the comprehensive credibility evaluation index Credibility(Event_history) of each piece of historical social network information Event_history, where the value 0 means the information is completely non-credible and the value 1 means it is completely credible. For example, for the historical information Event1, "Eating garlic cannot eliminate viruses", historical facts show that it is completely credible, so its comprehensive credibility is Credibility(Event1) = 1. Accordingly, the propagation node credibility index Credibility_vec(Event1) and the propagation topology credibility index Credibility_struc(Event1) are set equal to the comprehensive index, so both are 1;

Step B4: For the Num = 100 collected pieces of historical social network information Event_history, add the propagation node attribute F_vec(Event_history), the propagation network structure attribute F_struc(Event_history), and the credibility index Credibility(Event_history) of each to the historical social network information propagation attribute database DS(Event_history).
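One possible in-memory layout for the database DS(Event_history) is sketched below in Python. The class name, field names, and all attribute values are assumptions made for illustration; the patent does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class HistoryRecord:
    """One entry of DS(Event_history); field names are hypothetical."""
    event_id: str
    node_ids: FrozenSet[str]      # F_vec: unique IDs of the propagation nodes
    structure: Dict[str, float]   # F_struc: sub-attributes such as r_2/1, a, h
    credibility: float            # Credibility(Event_history): 0 or 1 in the example


# Made-up values, for illustration only.
database: List[HistoryRecord] = [
    HistoryRecord(
        event_id="Event1",
        node_ids=frozenset({"u1", "u2", "u3"}),
        structure={"struc_1": 0.34, "struc_2": 2.1, "struc_3": 0.9},
        credibility=1.0,
    ),
    HistoryRecord(
        event_id="Event2",
        node_ids=frozenset({"u4", "u9"}),
        structure={"struc_1": 0.12, "struc_2": 3.4, "struc_3": 0.2},
        credibility=0.0,
    ),
]

credible_ids = [record.event_id for record in database if record.credibility == 1.0]
```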

The "calculating the propagation attribute similarity between the target social network information and the historical social network information, so as to evaluate the information credibility of the target social network" described in step C is carried out as follows: for each piece of information Event_history collected in the historical propagation attribute database, calculate the propagation node similarity Sim_vec between the target information Event_obj and that Event_history, and obtain the propagation node credibility Credibility_vec(Event_obj) of the target information; then calculate the propagation structure similarity quantile between Event_obj and the historical information Event_history and obtain the propagation structure credibility Credibility_struc(Event_obj); finally, assign a weight to each propagation attribute credibility and calculate the credibility of the target social network information. The specific steps are as follows:

Step C1: Using a set similarity measure such as the Jaccard similarity (Equation (5)), calculate the propagation node attribute similarity Sim_vec(Event_obj, Event_history) between the target information Event_obj and every piece of historical information Event_history in the database DS(Event_history). Then select the largest similarity value Sim_{vec_max} (for example, Sim_{vec_max} = 0.8) and the corresponding historical information Event_{vec_max} (for example, the item whose content is "Eating garlic cannot eliminate viruses"), and record the credibility of that historical information, Credibility_vec(Event_{vec_max}) (known to be 1). The propagation node credibility of the target information is then Credibility_vec(Event_obj) = Sim_{vec_max} × Credibility_vec(Event_{vec_max}) = 0.8 × 1 = 0.8;
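Step C1 can be sketched in Python as follows, using the standard Jaccard similarity (size of the intersection divided by size of the union) on sets of user IDs. The function names and the toy user IDs are assumptions; the data are chosen so that the result reproduces the 0.8 of the worked example.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b| of two node-ID sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def node_credibility(
    target_nodes: Set[str],
    history: Dict[str, Tuple[Set[str], float]],  # event_id -> (node IDs, credibility)
) -> Tuple[str, float, float]:
    """Return (most similar event, Sim_vec_max, Credibility_vec of the target)."""
    best_id, best_sim = max(
        ((event_id, jaccard(target_nodes, node_ids))
         for event_id, (node_ids, _) in history.items()),
        key=lambda pair: pair[1],
    )
    return best_id, best_sim, best_sim * history[best_id][1]


# Toy data (made up): the target shares 4 of 5 users with the "garlic" event.
target = {"u1", "u2", "u3", "u4"}
history = {
    "garlic": ({"u1", "u2", "u3", "u4", "u5"}, 1.0),
    "alcohol": ({"u4", "u9"}, 0.0),
}
best_event, sim_max, cred_vec = node_credibility(target, history)
```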

Step C2: Sort the pieces of historical information Event_history in the database DS(Event_history) in descending order by the value of each of the M (M = 3 in this example) sub-attributes of the propagation structure attribute F_struc. For example, sorting the historical information in descending order of sub-attribute struc_1 gives the order Event_history1, Event_history2, ..., which means that Event_history1 has the largest struc_1 value in DS(Event_history). Sorting in the same way by each remaining sub-attribute finally yields M (for example, M = 3) reordered sequences of historical information, List_{struc_i}, i = 1, ..., M;

Step C3: For the i-th (for example, i = 1) reordered sequence obtained in step C2, divide the historical information into K = 10 intervals of equal size, so that each interval contains Num/K = 10 pieces of historical information. (For example, List_{struc_1}(Event_history1, Event_history2, ...) contains 100 pieces of historical information sorted in descending order of sub-attribute struc_1; the 10 items with the largest struc_1 values form interval 1, the 11th to 20th items form interval 2, and so on, until the 100 items have been divided into K = 10 intervals.) Then compare the i-th propagation structure sub-attribute value of the target information Event_obj (that is, "Eating garlic cannot eliminate the novel coronavirus") with the corresponding sub-attribute value of each historical item in the i-th reordered sequence, find the most similar historical item Event_history, and take the index k_{struc_i} of the interval to which it belongs as the similarity quantile of that sub-attribute for Event_obj. For example, for sub-attribute struc_1, the value for Event_obj is 0.34, which is closest to the struc_1 value of the 23rd item, Event_history23, in List_{struc_1}; since Event_history23 lies in the 3rd interval of the sequence, k_{struc_1} = 3 is taken as the similarity quantile of struc_1 for Event_obj. Repeating this step for all i = 1, 2, ..., M sub-attributes gives the similarity quantiles of the target information's propagation structure sub-attributes, (3, 4, 2);
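The interval lookup of steps C2 and C3 can be sketched in Python as follows. The function name and the synthetic attribute values are assumptions; the values are constructed so that the 23rd largest one is approximately 0.34, mirroring the worked example.

```python
from typing import Sequence


def similarity_quantile(
    target_value: float, history_values: Sequence[float], k: int = 10
) -> int:
    """Return the 1-based index of the interval holding the closest historical value."""
    ranked = sorted(history_values, reverse=True)  # descending order, as in step C2
    per_interval = len(ranked) // k                # Num / K items per interval
    if per_interval == 0:
        raise ValueError("need at least k historical values")
    closest = min(range(len(ranked)), key=lambda i: abs(ranked[i] - target_value))
    # Any remainder items (when Num is not a multiple of K) fall in the last interval.
    return min(closest // per_interval, k - 1) + 1


# Synthetic struc_1 values for 100 historical items (made up for illustration).
struc_1_values = [0.428 - 0.004 * i for i in range(100)]
k_struc_1 = similarity_quantile(0.34, struc_1_values, k=10)
```

Here the closest historical value is the 23rd in descending order, which lies in interval 3.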

Step C4: Using Equation (6), calculate the propagation structure sub-attribute credibility Credibility_{struc_i} of the target information Event_obj ("Eating garlic cannot eliminate the novel coronavirus") with respect to the historical information Event_history, where i indexes the M = 3 propagation structure sub-attributes. Equation (6) uses the numbers of historical items with credibility 0 and with credibility 1 in the quantile interval k_{struc_i} of the sub-attribute to which Event_obj belongs. For example, for the first sub-attribute struc_1 (i = 1), the quantile is k_{struc_1} = 3, and the 3rd interval of the first reordered sequence contains 3 historical items with credibility 0 and 7 with credibility 1, so the credibility of the first propagation structure sub-attribute of the target information is 7/(3 + 7) = 0.7. In the same way, the credibilities of the other two sub-attributes are 0.7 and 0.8, respectively;
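Step C4 can be sketched in Python as follows. Equation (6) is not reproduced in the text, so the sketch assumes it is the share of credible items in the interval, n1 / (n0 + n1); this assumption agrees with the worked numbers (3 and 7 items, overall result 0.75). The function name and synthetic data are hypothetical.

```python
from typing import Sequence, Tuple


def structure_credibility(
    ranked: Sequence[Tuple[float, int]],  # (sub-attribute value, label 0 or 1), descending
    quantile: int,                        # 1-based interval index k_struc_i from step C3
    k: int = 10,
) -> float:
    """Share of credible (label 1) historical items in the given interval."""
    per_interval = len(ranked) // k
    start = (quantile - 1) * per_interval
    labels = [label for _, label in ranked[start:start + per_interval]]
    n1 = sum(1 for label in labels if label == 1)
    n0 = sum(1 for label in labels if label == 0)
    if n0 + n1 == 0:
        raise ValueError("the interval contains no labeled historical items")
    return n1 / (n0 + n1)  # assumed form of Equation (6)


# Synthetic sequence (made up): interval 3 holds 3 non-credible and 7 credible items.
ranked_struc_1 = [
    (0.428 - 0.004 * i, 0 if i in (20, 21, 22) else 1) for i in range(100)
]
cred_struc_1 = structure_credibility(ranked_struc_1, quantile=3)
```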

Step C5: Assign weights w_vec, w_{struc_1}, ..., w_{struc_M} to the propagation node credibility Credibility_vec and the M = 3 propagation structure sub-attribute credibilities Credibility_{struc_i}, i = 1, ..., M, such that all weights sum to 1. This example uses equal weighting, that is, w_vec = w_{struc_1} = w_{struc_2} = w_{struc_3} = 0.25;

Step C6: Using Equation (7), the credibility of the target information "Eating garlic cannot eliminate the novel coronavirus" is calculated to be 0.75, and the credibility of the target information is evaluated from this result: the closer the value is to 0, the weaker the credibility, and the closer it is to 1, the stronger. A value of 0.75 indicates that the target information is fairly reliable: there is a high probability that it can be believed, and a social network user who sees it on the platform can regard it as broadly credible.
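The final combination in steps C5 and C6 can be sketched in Python as follows. Equation (7) is not reproduced in the text, so the sketch assumes it is a weighted sum of the node credibility and the sub-attribute credibilities; with equal weights this assumption reproduces the 0.75 of the worked example. The function name is hypothetical.

```python
from typing import Optional, Sequence


def overall_credibility(
    cred_vec: float,
    cred_struc: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of node credibility and structure sub-attribute credibilities."""
    scores = [cred_vec, *cred_struc]
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)  # equal weighting, as in step C5
    if len(weights) != len(scores):
        raise ValueError("one weight is required per credibility score")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))  # assumed form of Equation (7)


# Values from the worked example: 0.8 (nodes) and 0.7, 0.7, 0.8 (structure).
credibility_obj = overall_credibility(0.8, [0.7, 0.7, 0.8])
```

Other weighting schemes named in the claims, such as the analytic hierarchy process, would only change the `weights` argument.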

Claims

1. A social network information credibility evaluation method based on propagation attribute similarity, characterized in that its steps are as follows:

Step A: extracting social network information dissemination content, and constructing a social network information dissemination network;

Step B: extracting the attributes of social network information dissemination nodes, calculating the network topology attributes of social network information dissemination, marking the credibility of historical information, and establishing a historical information dissemination attribute database;

Step C: Calculate the similarity of the propagation attributes of the target social network and the historical social network, so as to evaluate the information credibility of the target social network.

2. The social network information credibility evaluation method based on propagation attribute similarity according to claim 1, characterized in that:

The "extracting social network information propagation content and constructing a social network information propagation network" described in step A is carried out as follows: for a piece of social network information, one publisher of the information is selected as the initial propagation node, and the propagation network of the information is constructed layer by layer from the direct forwarding nodes of each node and the directed direct-forwarding relationships; this propagation network is constructed for each of a given number of pieces of historical social network information with known credibility and for the target social network information. The specific steps are as follows:

Step A1: Based on the public information of the social network platform, select one publisher of a piece of social network information Event as the initial propagation node V₀₀, take the direct forwarders of this initial node as lower-layer nodes V_0j, and treat the direct forwarding relationship between the initial node and each lower-layer node as a directed propagation edge E₀(V₀₀, V_0j), thereby constructing the first-layer information propagation network G₀(V₀, E₀), where V₀ is the node set consisting of V₀₀ and its direct forwarder nodes V_0j, and E₀ is the set of edges representing the forwarding relationships between V₀₀ and its direct forwarder nodes V_0j;

Step A2: Set a threshold on the number of layers of the information propagation network, traverse the forwarding nodes of each layer, treat each as an initial node of that layer, and repeat step A1, building the inter-layer information propagation network layer by layer until the number of layers reaches the threshold, thereby obtaining the overall propagation network G(V, E) of the social network information Event, where V denotes the set of nodes in the network and E denotes the forwarding relationships between nodes in the network;

Step A3: Repeat steps A1 and A2 to build an information propagation network G_history(V_history, E_history) for each of a given number Num of pieces of historical social network information Event_history, and build the information propagation network G_obj(V_obj, E_obj) of the target social network information Event_obj, where G_history(V_history, E_history) denotes the information propagation network of each piece of historical information Event_history, V_history denotes the set of nodes in that network, and E_history denotes the forwarding relationships between its nodes; and G_obj(V_obj, E_obj) denotes the information propagation network of the target information Event_obj, V_obj denotes the set of nodes in that network, and E_obj denotes the forwarding relationships between its nodes.

3. The social network information credibility evaluation method based on propagation attribute similarity according to claim 1, characterized in that:

The "extracting attributes of social network information propagation nodes, calculating topological attributes of the propagation network, labeling the credibility of historical information, and establishing a historical information propagation attribute database" described in step B is carried out as follows: extract the propagation node attributes and propagation topology attributes from each constructed social network information propagation network, then label the credibility of each piece of historical social network information and establish the historical social network information propagation attribute database. The specific steps are as follows:

Step B1: Extract the unique identifier of each propagation node V in the constructed propagation network G(V, E) as the propagation node attribute F_vec of that piece of information;

Step B2: Extract the topology of the constructed propagation network G(V, E) and derive the propagation structure attribute F_struc of the network, including but not limited to:

(1) The initial forwarding layer ratio r_2/1 of the propagation network G(V, E), that is, the ratio of the number of nodes n_V(2) in the second-layer propagation network to the number of nodes n_V(1) in the first-layer propagation network, as in formula (1): r_2/1 = n_V(2) / n_V(1);

(2) The node characteristic distance a of the propagation network G(V, E), obtained by fitting the distribution of distances between all node pairs in the network; the fitting equation is formula (2), where y is the distribution probability and x is the distance between a node pair;

(3) The homogeneity index h of the propagation network G(V, E), that is, the difference between the logarithms of the homogeneity of G(V, E) and of a star network of the same size, calculated as in formula (3), where the homogeneity of the propagation network is calculated as in formula (4);

h = log(H_star) - log(H_G) (3)

where N is the total number of network nodes, and k is the node degree (4)

Step B3: Based on a summary of historical facts and certification by authoritative bodies, set the credibility evaluation index Credibility(Event_history) of the historical social network information Event_history, where the value 0 indicates that the information is completely non-credible and the value 1 indicates that it is completely credible; set the propagation node credibility evaluation index Credibility_vec(Event_history) and the propagation topology credibility evaluation index Credibility_struc(Event_history) of Event_history equal to the comprehensive credibility evaluation index;

Step B4: For the Num collected pieces of historical social network information Event_history, add the propagation node attribute F_vec(Event_history), the propagation network structure attribute F_struc(Event_history), and the credibility index Credibility(Event_history) of each to the historical social network information propagation attribute database DS(Event_history).

4. The social network information credibility evaluation method based on propagation attribute similarity according to claim 1, characterized in that:

The "calculating the propagation attribute similarity between the target social network and the historical social network, so as to evaluate the information credibility of the target social network" described in step C is carried out as follows: calculate the propagation node similarity Sim_vec between the target social network information Event_obj and the historical social network information Event_history, and obtain the propagation node credibility Credibility_vec(Event_obj) of the target information; then calculate the propagation structure similarity quantile between Event_obj and the historical information Event_history and obtain the propagation structure credibility Credibility_struc(Event_obj); finally, assign a weight to each propagation attribute credibility and calculate the credibility of the target social network information. The specific steps are as follows:

Step C1: Using a set similarity measure such as the Jaccard similarity, that is, formula (5), calculate the propagation node attribute similarity Sim_vec(Event_obj, Event_history) between the target social network information Event_obj and every piece of historical information Event_history in the historical information propagation attribute database DS(Event_history); then select the largest propagation node attribute similarity value Sim_{vec_max} and the corresponding historical information Event_{vec_max}, and record the credibility of that historical information, Credibility_vec(Event_{vec_max}); the propagation node credibility of the target information is then Credibility_vec(Event_obj) = Sim_{vec_max} × Credibility_vec(Event_{vec_max});

Sim_vec(Event_obj, Event_history) = |F_vec(Event_obj) ∩ F_vec(Event_history)| / |F_vec(Event_obj) ∪ F_vec(Event_history)| (5)

where F_vec(Event_obj) denotes the propagation node attribute of the target information and F_vec(Event_history) denotes the propagation node attribute of the historical information;

Step C2: Sort the pieces of historical information Event_history in the database DS(Event_history) by the value of each of the M sub-attributes of the propagation structure attribute F_struc, using the same sorting order for every sub-attribute, to obtain M reordered sequences of historical information;

Step C3: For the i-th reordered sequence of historical information obtained in step C2, divide the historical information into K intervals of equal size, so that each interval contains Num/K pieces of historical information; then compare the i-th propagation structure sub-attribute value of the target social network information Event_obj with the corresponding i-th sub-attribute value of each piece of historical information Event_history in the database DS(Event_history), find the most similar historical information Event_history, and take the index k_{struc_i} of the interval to which it belongs as the similarity quantile of that propagation structure sub-attribute of Event_obj; repeat this step for i = 1, 2, ..., M to obtain the similarity quantile of each propagation structure sub-attribute of the target social network information Event_obj;

Step C4: Calculate, according to formula (6), the propagation structure sub-attribute credibility Credibility_{struc_i} of the target social network information Event_obj with respect to the historical social network information Event_history, where i is the index of the sub-attribute among the M propagation structure sub-attributes;

Step C5: Assign calculation weights w_vec, w_{struc_1}, ..., w_{struc_M} to the propagation node credibility Credibility_vec and the M propagation structure sub-attribute credibilities Credibility_{struc_i}, i = 1, ..., M, where all weights sum to 1; the weights may be assigned by equal weighting, the analytic hierarchy process, or fuzzy evaluation, according to the importance of each weight and other necessary information;

Step C6: Calculate the credibility of the target social network information using formula (7), and evaluate the credibility of the target social network information according to the result: the closer the calculated value is to 0, the weaker the credibility, and the closer it is to 1, the stronger the credibility.