CN115987810A - Node and community combined multilevel propagation analysis method and analysis system - Google Patents

Node and community combined multilevel propagation analysis method and analysis system

Info

Publication number
CN115987810A
Authority
CN
China
Prior art keywords
node
propagation
nodes
graph
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211648746.4A
Other languages
Chinese (zh)
Inventor
黄音莅
沈宜
贾宇
冯碧怡
杨凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Wanglian Anrui Network Technology Co ltd
Original Assignee
Shenzhen Wanglian Anrui Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Wanglian Anrui Network Technology Co ltd
Priority to CN202211648746.4A
Publication of CN115987810A
Legal status: Pending

Abstract

The invention belongs to the technical field of multi-level propagation analysis of network data and discloses a multi-level propagation analysis method and system combining nodes and communities. Based on a global or local multi-level propagation network, the method combines network-related community algorithms, node-related clustering coefficients, core and bridge coefficients, incitement and following coefficients, and time-related propagation speed. It can identify the tight communities and core and bridge nodes of a multi-level propagation event, same-stance propagation clusters and stance-transition key nodes, inciting nodes and affected clusters, burst clusters and burst nodes, and suspected abnormal graphs and abnormal nodes, providing a point-and-surface (node-level and community-level) integrated analysis method for multi-level propagation. The invention avoids the loss of multi-level propagation nodes that occurs when analysis relies on the user propagation graph alone, and allows cross-comparison of key nodes and associated clusters across different propagation graphs.

Description

Node and community combined multilevel propagation analysis method and analysis system
Technical Field
The invention belongs to the technical field of multi-level propagation analysis of network data, and particularly relates to a multi-level propagation analysis method and analysis system combining nodes and communities.
Background
Multi-level propagation, i.e., n-level propagation, generally covers the propagation of both information and influence: information and influence reach the general audience through intermediaries of varying size. Current multi-level propagation analysis tools are generally embedded in public opinion analysis systems as auxiliary analysis modules rather than offered as independent public opinion analysis products.
Existing multi-level propagation analysis tools mainly focus on discovering key influence nodes in a complex social network according to node properties (such as degree centrality, betweenness centrality and closeness centrality) or random-walk-based indexes (such as PageRank), and on extending the analysis of propagation paths in depth and breadth. In terms of network construction, they form a user propagation network graph with users as the base nodes.
Based on the above analysis, the problems and defects of the prior art are as follows:
(1) The point (node-level) and surface (community-level) dimensions of existing multi-level propagation analysis tools are relatively fragmented, and a comprehensive analysis method is lacking. For example, even when such a tool identifies a key influence node from the global view of the propagation network, it rarely goes on to mine the community connected to that node.
(2) In a propagation network graph constructed with users as nodes, each user can appear only once. When a user forwards multiple times, only one forwarding node can be retained to keep the user unique, so propagation nodes are lost. When propagation paths are analyzed, the loss of propagation nodes can distort the result.
(3) Existing multi-level propagation analysis tools give little consideration to suspected abnormal graphs and suspected abnormal nodes, whose presence may interfere with the analysis of the multi-level propagation network.
Disclosure of Invention
In order to overcome the problems in the related art, the embodiments of the present disclosure provide a multi-level propagation analysis method and analysis system combining nodes and communities.
The technical scheme is as follows: the multi-level propagation analysis method combining nodes and communities comprises the following steps:
S1, constructing the multi-level propagation network: taking users and posts respectively as nodes, and taking forwarding, commenting and replying relations as edges, construct a directed user multi-level propagation connected graph and a directed post multi-level propagation connected graph;
S2, identifying suspected abnormal graphs and abnormal nodes: calculate quantitative indexes of each connected graph and identify suspected abnormal graphs and abnormal nodes according to index thresholds;
S3, denoising: remove the suspected abnormal graphs and abnormal nodes based on the result of step S2 to obtain the denoised user multi-level propagation connected graph and post multi-level propagation connected graph;
S4, assigning node and edge attributes of the connected graphs: assign clustering coefficient, stance tendency and core/bridge coefficient attributes to the nodes, and assign a same-direction/opposite-direction binary value and a stance strength difference attribute to the edges;
S5, identifying tight communities and core and bridge nodes: find the influence clusters of core nodes and the connecting clusters of bridge nodes by combining the clustering coefficient, the core/bridge coefficients and a community algorithm;
S6, identifying same-stance propagation clusters and stance-transition key nodes: discover same-stance propagation clusters and stance-transition key nodes according to the stance tendency of each node and its set of opposing nodes;
S7, identifying inciting nodes and affected clusters: discover inciting nodes and affected clusters according to the incitement and following coefficients and the same-direction/opposite-direction edge attributes;
S8, identifying burst clusters and burst nodes: discover burst nodes and burst clusters according to the propagation speed of propagation paths and propagation subtrees;
S9, cross analysis: by comparing the key nodes and associated clusters obtained in steps S5, S6, S7 and S8, find the key node accounts that coincide with one another, together with their associated clusters and modes of action.
In step S1, the directed user multi-level propagation connected graph takes users as nodes, with the original user as the original node, and takes forwarding, commenting and replying relations as edges;
the directed post multi-level propagation connected graph takes posts as nodes, with the original post as the original node, and takes forwarding, commenting and replying relations as edges.
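As a minimal sketch of step S1, the two graphs can be derived from the same interaction records. The record layout `(post_id, author, parent_post_id, relation)`, the sample data, the function name `build_graphs`, and the choice to direct each edge from the parent item to the forwarding, commenting or replying item are all assumptions made for illustration; the source does not specify them.

```python
# Hypothetical interaction records: (post_id, author, parent_post_id, relation).
# parent_post_id is None for an original post.
records = [
    ("p1", "alice", None, "original"),
    ("p2", "bob", "p1", "forward"),
    ("p3", "carol", "p2", "comment"),
    ("p4", "bob", "p3", "reply"),
]


def build_graphs(records):
    """Return (user_edges, post_edges) as sets of directed (source, target) pairs."""
    author_of = {post_id: author for post_id, author, _, _ in records}
    user_edges, post_edges = set(), set()
    for post_id, author, parent, _relation in records:
        if parent is None:
            continue  # original posts have no incoming edge
        # Assumed direction: from the parent item to the item that reacts to it.
        post_edges.add((parent, post_id))
        user_edges.add((author_of[parent], author))
    return user_edges, post_edges


user_edges, post_edges = build_graphs(records)
user_nodes = {node for edge in user_edges for node in edge}
post_nodes = {node for edge in post_edges for node in edge}
print(len(user_nodes), len(post_nodes))
```

In this toy data `bob` acts twice, so the user graph collapses his two actions into one node while the post graph keeps both posts. This is the node-loss effect that the post graph is meant to avoid.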
In step S2, identifying suspected abnormal graphs and abnormal nodes by calculating quantitative indexes of the connected graph according to index thresholds specifically includes:
for each post multi-level propagation connected graph, calculating the average number of posts per account, the posting frequency of each account per unit time, and the original-to-forwarded ratio;
average number of posts per account = post_n/account_n, where account_n is the number of accounts and post_n is the total number of posts;
posting frequency of each account per unit time = post_n/t, where t is the observation period;
original-to-forwarded ratio = original_n/retweet_n, where original_n is the number of original posts and retweet_n is the number of forwarded posts;
an initial weight is assigned to each index, a composite index is then calculated, suspected abnormal graphs and the nodes in them are identified according to an initial threshold, and the weights and the threshold are adjusted through after-the-fact verification.
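The three indexes and the composite index of step S2 can be sketched as follows. The source gives the three ratios but does not say how they are combined, so the weighted sum, the "greater than threshold" rule, the sample counts, the weights and the threshold are assumptions for illustration only. Whether a high or a low value of a given index signals abnormality is also not stated, so in practice a weight might be negative.

```python
def graph_indicators(post_n, account_n, t, original_n, retweet_n):
    """Compute the three quantitative indexes for one post propagation graph.

    Assumes account_n, t and retweet_n are all non-zero.
    """
    avg_posts = post_n / account_n        # average number of posts per account
    frequency = post_n / t                # posting frequency per unit time
    orig_ratio = original_n / retweet_n   # original-to-forwarded ratio
    return avg_posts, frequency, orig_ratio


def composite_index(indicators, weights):
    """Weighted sum of the indexes (the combination rule is an assumption)."""
    return sum(w * x for w, x in zip(weights, indicators))


def is_suspected_abnormal(indicators, weights, threshold):
    """Flag the graph when the composite index exceeds the threshold."""
    return composite_index(indicators, weights) > threshold


# Hypothetical counts for one graph observed over t = 24 time units.
indicators = graph_indicators(post_n=120, account_n=10, t=24,
                              original_n=10, retweet_n=100)
weights = (0.5, 0.3, 0.2)   # initial weights, to be tuned after verification
score = composite_index(indicators, weights)
flagged = is_suspected_abnormal(indicators, weights, threshold=5.0)
print(indicators, round(score, 2), flagged)
```

The after-the-fact verification described above would then amount to revising `weights` and `threshold` once flagged graphs have been checked manually.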
In step S4, assigning the clustering coefficient, stance tendency and core/bridge coefficient attributes to the nodes, and assigning the same-direction/opposite-direction binary value and the stance strength difference attribute to the edges, specifically includes:
for the user multi-level propagation connected graph, assigning a clustering coefficient to each node; for the post multi-level propagation connected graph, assigning a stance tendency and core/bridge coefficients to each node, and assigning to each edge a same-direction/opposite-direction binary value and the stance strength difference of the nodes it connects, calculated as follows:
for any two nodes n_i and n_j and the edge e_ij between them, let the stance of n_i be S_i and the stance of n_j be S_j. If S_i and S_j are both positive or both negative, e_ij is a same-direction edge and is assigned the value 1; otherwise e_ij is an opposite-direction edge and is assigned the value 0. The stance strength difference between n_i and n_j is diffS_ij = |S_i - S_j|.
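The edge rule above translates directly into a small function. The function name `edge_attributes` and the sample stance values are illustrative; a stance of exactly zero falls under the "otherwise" branch and therefore yields 0.

```python
def edge_attributes(s_i, s_j):
    """Return (same_direction, diff_s) for an edge between stances s_i and s_j.

    same_direction is 1 when both stances are positive or both are negative,
    and 0 otherwise; diff_s is the absolute stance strength difference.
    """
    both_positive = s_i > 0 and s_j > 0
    both_negative = s_i < 0 and s_j < 0
    same_direction = 1 if (both_positive or both_negative) else 0
    diff_s = abs(s_i - s_j)
    return same_direction, diff_s


print(edge_attributes(0.75, 0.25))   # two supportive posts
print(edge_attributes(0.5, -0.25))   # a supportive post and an opposing reply
```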
In step S5, the identification of tight communities and core and bridge nodes is based on the user multi-level propagation connected graph. A set of nodes that are relatively closely connected with their neighbors is screened out according to a clustering coefficient threshold, and a new user multi-level propagation connected graph is formed from this node set. On this new graph, the core coefficient and the bridge coefficient are calculated based on PageRank and betweenness centrality respectively to obtain the core nodes and bridge nodes. At the same time, a community algorithm divides the whole user multi-level propagation connected graph into several tight communities, yielding the influence radiation clusters of the core nodes and the connecting clusters of the bridge nodes.
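The first half of step S5 can be sketched in plain Python: filter nodes by local clustering coefficient, then rank the remaining sub-graph with PageRank to pick a core node. The toy graph, the threshold of 0.3, the damping factor and the even redistribution of rank from dangling nodes are assumptions; betweenness centrality and community detection are omitted for brevity and would in practice come from a graph library.

```python
def clustering_coefficient(adj, node):
    """Local clustering coefficient on an undirected view of the graph."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def pagerank(out_edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; out_edges maps every node to its targets."""
    nodes = list(out_edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_edges[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for u in targets:
                    new[u] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[v] / n
                for u in nodes:
                    new[u] += share
        rank = new
    return rank


# Hypothetical directed user graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
nodes = {u for e in edges for u in e}
adj = {v: set() for v in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

threshold = 0.3  # clustering coefficient threshold (assumed value)
kept = {v for v in nodes if clustering_coefficient(adj, v) >= threshold}
sub_out = {v: [t for s, t in edges if s == v and t in kept] for v in kept}
rank = pagerank(sub_out)
core = max(rank, key=rank.get)
print(sorted(kept), core)
```

Node `d` has only one neighbor, so its clustering coefficient is 0 and it is dropped before ranking; within the remaining triangle, `c` receives links from both other nodes and ends up with the highest PageRank.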
In step S6, the identification of same-stance propagation clusters and stance-transition key nodes is based on the post multi-level propagation connected graph. Same-stance propagation clusters are identified according to the stance tendency of the nodes. For each node, its predecessor and successor nodes are traversed; if clearly opposing stance clusters appear before and after a node, that node is labeled a stance-transition key node.
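One possible reading of the stance-transition test is sketched below. The source does not define what counts as "clearly opposing", so this sketch assumes that the mean stance of a node's predecessors and the mean stance of its successors must have opposite signs and that both must reach a minimum magnitude `min_strength`. The function name, the threshold and the sample data are hypothetical.

```python
def stance_transition_nodes(edges, stance, min_strength=0.2):
    """Return nodes whose predecessors and successors lean in opposite directions."""
    preds, succs = {}, {}
    for src, dst in edges:
        succs.setdefault(src, []).append(dst)
        preds.setdefault(dst, []).append(src)

    key_nodes = []
    for node in stance:
        before = preds.get(node, [])
        after = succs.get(node, [])
        if not before or not after:
            continue  # a transition needs both an upstream and a downstream side
        mean_before = sum(stance[p] for p in before) / len(before)
        mean_after = sum(stance[s] for s in after) / len(after)
        opposite = mean_before * mean_after < 0
        strong = min(abs(mean_before), abs(mean_after)) >= min_strength
        if opposite and strong:
            key_nodes.append(node)
    return key_nodes


# Hypothetical post graph: p1 is supportive, everything downstream of p2 is opposed.
edges = [("p1", "p2"), ("p2", "p3"), ("p2", "p4"), ("p3", "p5")]
stance = {"p1": 0.6, "p2": -0.1, "p3": -0.7, "p4": -0.5, "p5": -0.8}
transition_nodes = stance_transition_nodes(edges, stance)
print(transition_nodes)
```

Here `p2` sits between a supportive upstream post and an opposed downstream cluster, so it is the only node flagged.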
In step S7, discovering inciting nodes and affected clusters according to the incitement and following coefficients and the same-direction/opposite-direction edge attributes specifically includes:
based on the post multi-level propagation connected graph, calculating an incitement coefficient and a following coefficient for each node from the same-direction/opposite-direction binary values and the stance strength differences of its associated edges, as follows;
for a node w, calculate provo_1 = sameOut/diffOut, where sameOut is the number of same-direction edges from w to other nodes and diffOut is the number of opposite-direction edges from w to other nodes;
calculate provo_2 = sumdiffS/(sameOut + diffOut), where sameOut + diffOut is the number of edges from w to other nodes and sumdiffS is the sum of the stance strength differences between w and the nodes it points to;
then the incitement coefficient = a_1*provo_1 + a_2*provo_2, where a_1 and a_2 are the weights of the corresponding indexes;
the following coefficient is calculated in the same way as the incitement coefficient, except that it considers the edges and nodes pointing to w;
the ratio of the incitement coefficient to the following coefficient is thereby obtained, and inciting nodes are identified based on a ratio threshold; the set of nodes adjacent to an inciting node through its same-direction edges is identified as the main cluster affected by that inciting node.
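The two coefficients of step S7 share one formula applied to outgoing or incoming edges, which the sketch below captures in a single function. The equal weights a_1 = a_2 = 0.5, the sample graph, and the handling of a zero count of opposite-direction edges (the denominator is floored at 1) are assumptions, as is restricting the affected cluster to the targets of outgoing same-direction edges.

```python
def directional_coefficient(node, edges, stance, outgoing, a1=0.5, a2=0.5):
    """Incitement coefficient (outgoing=True) or following coefficient (False)."""
    same = diff = 0
    sum_diff_s = 0.0
    for src, dst in edges:
        if outgoing and src == node:
            other = dst
        elif not outgoing and dst == node:
            other = src
        else:
            continue
        if stance[node] * stance[other] > 0:   # both positive or both negative
            same += 1
        else:
            diff += 1
        sum_diff_s += abs(stance[node] - stance[other])
    total = same + diff
    if total == 0:
        return 0.0
    provo1 = same / max(diff, 1)   # assumption: avoid division by zero
    provo2 = sum_diff_s / total
    return a1 * provo1 + a2 * provo2


# Hypothetical post graph around node "w".
edges = [("w", "x"), ("w", "y"), ("w", "z"), ("v", "w")]
stance = {"w": 0.5, "x": 0.25, "y": 0.75, "z": -0.5, "v": 1.0}

incite = directional_coefficient("w", edges, stance, outgoing=True)
follow = directional_coefficient("w", edges, stance, outgoing=False)
ratio = incite / follow
affected = {dst for src, dst in edges
            if src == "w" and stance[src] * stance[dst] > 0}
print(incite, follow, ratio, sorted(affected))
```

A node would be labeled an inciting node when `ratio` exceeds the chosen ratio threshold; the value of that threshold is not given in the source.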
In step S8, discovering burst nodes and burst clusters according to the propagation speed of propagation paths and propagation subtrees specifically includes:
based on the post multi-level propagation connected graph, traversing each propagation path in the connected graph and recording the time at which each node is reached, then calculating the propagation speed of each propagation path to obtain the TopN paths by propagation speed, where N is a user-defined value of 3, 5 or 10; for each node on a TopN propagation path, the burst node of that path is identified according to the average time consumed by its subsequent propagation nodes; and, taking each node in the connected graph as a root node, recording the time at which each child node is reached and calculating the propagation speed of each subtree to obtain the TopN subtrees by propagation speed, where the root node of such a subtree is identified as a burst node and the subtree is identified as the burst cluster corresponding to that burst node.
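The path-ranking part of step S8 can be sketched as follows, using the measure given in the embodiment: total time divided by the number of nodes on the path, so a smaller value means faster propagation. The sketch assumes the post graph is a tree (each post has one parent), that timestamps are plain numbers in a common unit, and that paths run from the root to each leaf; the names and sample data are hypothetical. The same measure could be applied per subtree.

```python
def root_to_leaf_paths(children, root):
    """Yield every root-to-leaf path of a tree given as a parent -> children map."""
    stack = [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            yield path
        for kid in kids:
            stack.append(path + [kid])


def average_time_per_node(path, timestamp):
    """Total elapsed time along the path divided by its number of nodes."""
    return (timestamp[path[-1]] - timestamp[path[0]]) / len(path)


def top_n_fastest_paths(children, root, timestamp, n=3):
    """Return the n paths with the smallest average time per node."""
    paths = list(root_to_leaf_paths(children, root))
    paths.sort(key=lambda p: average_time_per_node(p, timestamp))
    return paths[:n]


# Hypothetical post tree with posting times in minutes after the original post.
children = {"p1": ["p2", "p3"], "p2": ["p4"], "p3": ["p5"]}
timestamp = {"p1": 0, "p2": 10, "p3": 5, "p4": 40, "p5": 11}

fastest = top_n_fastest_paths(children, "p1", timestamp, n=1)
print(fastest)
```

The path through `p3` and `p5` takes 11 minutes over 3 nodes, against 40 minutes over 3 nodes for the path through `p2` and `p4`, so it ranks first.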
In step S9, finding the coinciding key node accounts and their associated clusters and modes of action by comparing the key nodes and associated clusters obtained in steps S5, S6, S7 and S8 specifically includes:
based on the post multi-level propagation connected graphs, identifying the core and bridge nodes of each connected graph and cross-comparing them to judge whether the posting accounts corresponding to the core and bridge nodes coincide; recording the one or more post propagation connected graphs corresponding to each core or bridge node account; and, combined with the method of step S5, finding the post propagation graphs and tight communities in which these accounts act, thereby uncovering the mechanism by which core and bridge nodes act in a multi-level propagation event.
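The account-level cross-comparison of step S9 reduces to grouping key nodes by posting account and keeping the accounts that appear in more than one graph. The input layout (a mapping from graph identifier to key post identifiers, plus a post-to-account mapping), the function name and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def overlapping_accounts(key_nodes_by_graph, author_of):
    """Map each account that owns key nodes in several graphs to those graphs."""
    graphs_by_account = defaultdict(set)
    for graph_id, post_ids in key_nodes_by_graph.items():
        for post_id in post_ids:
            graphs_by_account[author_of[post_id]].add(graph_id)
    return {account: sorted(graph_ids)
            for account, graph_ids in graphs_by_account.items()
            if len(graph_ids) > 1}


# Hypothetical core/bridge posts found in three post propagation graphs.
key_nodes_by_graph = {"g1": ["p2", "p3"], "g2": ["q1"], "g3": ["r4"]}
author_of = {"p2": "bob", "p3": "carol", "q1": "bob", "r4": "dave"}

overlap = overlapping_accounts(key_nodes_by_graph, author_of)
print(overlap)
```

The same grouping could be run over the key nodes from steps S6 to S8 to see which accounts recur across different kinds of key node.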
Another object of the present invention is to provide a multi-level propagation analysis system combining nodes and communities, comprising:
an environment building module, which adopts a machine learning platform for NLP analysis scenarios and uses GPU algorithms and a GPU graph algorithm library to accelerate graph computation;
a data set preprocessing module, which preprocesses the data and splits the data set by event;
an algorithm analysis module, which schedules the underlying algorithms based on the multi-level propagation analysis method combining nodes and communities;
and a visualization module, which displays the multi-level propagation analysis results as a graph by combining static rendering of map tiles loaded on demand with dynamic GPU rendering.
It is another object of the present invention to provide a program storage medium for receiving user input, the stored computer program causing an electronic device to execute the multi-level propagation analysis method combining nodes and communities.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the multi-level propagation analysis method combining nodes and communities.
A further object of the invention is to provide an application of the multi-level propagation analysis method combining nodes and communities in public opinion analysis of social network platforms.
Combining all the above technical schemes, the invention has the following advantages and positive effects:
First, with respect to the technical problems of the prior art and the difficulty of solving them, and in close connection with the claimed technical solution and the results and data obtained during research and development, the technical problems solved by the invention and the inventive technical effects achieved are described as follows:
(1) To address the relative fragmentation of the point and surface dimensions in multi-level propagation analysis, a joint analysis method based on the node, network and time dimensions is provided to mine key nodes and their main clusters of action.
(2) To address the loss of nodes in the user propagation graph, a multi-level propagation graph with posts as nodes is constructed, forming a combination of propagation networks extending from different original posts.
(3) To address the interference of abnormal graphs and abnormal nodes, suspected abnormal graphs and suspected abnormal nodes in the multi-level propagation network are identified based on quantitative indexes relating to posts, accounts and relations.
Second, considering the technical solution as a whole or from a product perspective, its technical effects and advantages are described as follows:
The invention provides a multi-level propagation analysis method and system combining nodes and communities, together with computer equipment and a medium, for event-oriented construction of global or local multi-level propagation networks and comprehensive analysis that jointly considers nodes, networks and time. Based on a global or local multi-level propagation network, the method combines network-related community algorithms, node-related clustering coefficients, core and bridge coefficients, incitement and following coefficients, and time-related propagation speed. It can thereby identify the tight communities and core and bridge nodes of a multi-level propagation event, same-stance propagation clusters and stance-transition key nodes, inciting nodes and affected clusters, burst clusters and burst nodes, and suspected abnormal graphs and abnormal nodes, providing a point-and-surface integrated analysis method for multi-level propagation. In addition, according to the data requirements of each analysis index, the method analyzes the user propagation graph and the post propagation graph separately, which avoids the loss of multi-level propagation nodes that occurs when only the user propagation graph is used and allows cross-comparison of key nodes and associated clusters across different propagation graphs.
Compared with the prior art, the invention has the following advantages:
The method identifies suspected abnormal graphs and abnormal nodes based on quantitative indexes and after-the-fact verification, reducing their influence on multi-level propagation analysis.
The method takes users and posts respectively as nodes and analyzes the two multi-level propagation connected graphs, the user graph and the post graph, in parallel, reducing the influence of node loss in the user propagation graph on multi-level propagation analysis.
Based on node and edge attributes and network structure, the method identifies and cross-analyzes key nodes and their main clusters of action from different angles, improving the point-and-surface integration of multi-level propagation analysis.
In addition, corresponding optimizations are made in the multi-level propagation analysis system built on the method. By using a static rendering strategy based on map tile loading, an optimized FM3 force-directed layout, and CUDA-accelerated computation, full-scale graph analysis results can be rendered smoothly.
Third, as supplementary evidence of the inventiveness of the claims, the following important aspects are also noted:
(1) Unlike the auxiliary analysis modules in public opinion analysis systems, the invention builds an independent system dedicated to multi-level propagation data analysis on the basis of the multi-level propagation analysis method combining nodes and communities. It can analyze large batches of post forwarding graphs and user propagation graphs simultaneously and remove noise interference, thereby mining key nodes together with their associated clusters and modes of action in depth and quickly forming a systematic understanding of multi-level propagation events.
(2) In terms of the analysis method, the invention addresses the industry's tendency to concentrate on influence nodes while ignoring their associated clusters and modes of action, and analyzes the influence radiation clusters and modes of influence of key nodes from the perspectives of community, stance and speed.
(3) At the analysis system level, map tiling and static caching solve the problems of memory overflow, network transmission timeout and rendering stutter on the browser side with large-scale data, while preserving the bird's-eye-view analysis of the graph; the map zoom function allows switching to real-time analysis of the dynamic graph on demand.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure;
FIG. 1 is a flow chart of a multi-level propagation analysis method combining nodes and communities according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multi-level propagation analysis system combining nodes and communities according to an embodiment of the present invention;
FIG. 3 is evidence of the effects of the embodiments of the present invention, showing the core and bridge nodes of an event and the clusters on which they act;
In the figures: 1. environment building module; 2. data set preprocessing module; 3. algorithm analysis module; 4. visualization module.
Detailed Description
In order to make the above objects, features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; it is capable of modification in various respects without departing from its spirit and scope.
1. Illustrative embodiments:
The embodiment of the invention provides a multi-level propagation analysis method combining nodes and communities, which identifies the following:
tight communities and core and bridge nodes, same-stance propagation clusters and stance-transition key nodes, inciting nodes and affected clusters, burst clusters and burst nodes, and suspected abnormal graphs and abnormal nodes.
Example 1
The embodiment of the invention provides a multi-level propagation analysis method combining nodes and communities, which specifically comprises the following steps:
S1, constructing the multi-level propagation network: taking users and posts respectively as nodes, and taking relations such as forwarding, commenting and replying as edges, construct directed user and post multi-level propagation connected graphs;
taking users as nodes, with the original user as the original node, and taking relations such as forwarding, commenting and replying as edges, a directed user multi-level propagation connected graph is formed;
taking posts as nodes, with the original post as the original node, and taking relations such as forwarding, commenting and replying as edges, a directed post multi-level propagation connected graph is formed;
S2, identifying suspected abnormal graphs and abnormal nodes: calculate quantitative indexes of each connected graph and identify suspected abnormal graphs and abnormal nodes according to index thresholds;
for each post multi-level propagation connected graph, indexes such as the average number of posts per account, the posting frequency of each account per unit time, and the original-to-forwarded ratio are calculated.
Average number of posts per account = post_n/account_n, where account_n is the number of accounts and post_n is the total number of posts;
posting frequency of each account per unit time = post_n/t, where t is the observation period;
original-to-forwarded ratio = original_n/retweet_n, where original_n is the number of original posts and retweet_n is the number of forwarded posts;
an initial weight is assigned to each index, a composite index is then calculated, suspected abnormal graphs and the nodes in them are identified according to an initial threshold, and the weights and the threshold are adjusted through after-the-fact verification;
S3, denoising: remove the suspected abnormal graphs and their nodes based on the result of step S2 to obtain the denoised user multi-level propagation connected graph and post multi-level propagation connected graph;
S4, assigning node and edge attributes of the connected graphs: assign clustering coefficient, stance tendency and core/bridge coefficient attributes to the nodes, and assign a same-direction/opposite-direction binary value and a stance strength difference attribute to the edges;
for the user multi-level propagation connected graph, a clustering coefficient is assigned to each node; for the post multi-level propagation connected graph, a stance tendency and core/bridge coefficients are assigned to each node, and each edge is assigned a same-direction/opposite-direction binary value and the stance strength difference of the nodes it connects, calculated as follows.
For any two nodes n_i and n_j and the edge e_ij between them, let the stance of n_i be S_i and the stance of n_j be S_j. If S_i and S_j are both positive or both negative, e_ij is a same-direction edge (assigned the value 1); otherwise e_ij is an opposite-direction edge (assigned the value 0). The stance strength difference between n_i and n_j is diffS_ij = |S_i - S_j|.
S5, identifying tight communities and core and bridge nodes: find the influence clusters of core nodes and the connecting clusters of bridge nodes by combining the clustering coefficient, the core/bridge coefficients and a community algorithm;
based on the user multi-level propagation connected graph, a set of nodes that are relatively closely connected with their neighbors is screened out according to a clustering coefficient threshold, and a new user multi-level propagation connected graph is formed from this node set; on this new graph, the core coefficient and the bridge coefficient are calculated based on PageRank and betweenness centrality respectively to obtain the core nodes and bridge nodes, and at the same time a community algorithm divides the whole user multi-level propagation connected graph into several tight communities, yielding the influence radiation clusters of the core nodes and the connecting clusters of the bridge nodes.
S6, identifying same-stance propagation clusters and stance-transition key nodes: discover same-stance propagation clusters and stance-transition key nodes according to the stance tendency of each node and its set of opposing nodes;
based on the post multi-level propagation connected graph, same-stance propagation clusters are identified according to the stance tendency of the nodes; for each node, its predecessor and successor nodes are traversed, and if clearly opposing stance clusters appear before and after a node, that node is labeled a stance-transition key node;
S7, identifying inciting nodes and affected clusters: discover inciting nodes and affected clusters according to the incitement and following coefficients and the same-direction/opposite-direction edge attributes;
based on the post multi-level propagation connected graph, an incitement coefficient and a following coefficient are calculated for each node from the same-direction/opposite-direction binary values and the stance strength differences of its associated edges, as follows.
For a node w, calculate provo_1 = sameOut/diffOut, where sameOut is the number of same-direction edges from w to other nodes and diffOut is the number of opposite-direction edges from w to other nodes;
calculate provo_2 = sumdiffS/(sameOut + diffOut), where sameOut + diffOut is the number of edges from w to other nodes and sumdiffS is the sum of the stance strength differences between w and the nodes it points to;
then the incitement coefficient = a_1*provo_1 + a_2*provo_2, where a_1 and a_2 are the weights of the corresponding indexes.
The following coefficient is calculated similarly to the incitement coefficient, except that it focuses on the edges and nodes pointing to w.
The ratio of the incitement coefficient to the following coefficient is thereby obtained, and inciting nodes are identified based on a ratio threshold; the set of nodes adjacent to an inciting node through its same-direction edges is identified as the main cluster affected by that inciting node.
S8, identifying the outbreak cluster and the outbreak node: discovering outbreak nodes and outbreak clusters according to the propagation speed of the propagation paths and propagation subtrees;
based on the post multilevel propagation connected graph, each propagation path in the connected graph is traversed and the time of reaching each node is recorded; the propagation speed of each propagation path (the ratio of the total time to the number of nodes, i.e. the average time spent per node) is then calculated to obtain the TopN paths by propagation speed, where N is a user-defined value such as 3, 5 or 10; for each node on a TopN propagation path, the outbreak node of that path is identified according to the average time spent by its subsequent propagation nodes. Likewise, each node in the connected graph is traversed as a root node, the time of reaching each of its child nodes is recorded, and the propagation speed of each subtree is calculated to obtain the TopN subtrees by propagation speed; the root node of each such subtree is identified as an outbreak node, and the subtree is identified as the outbreak cluster corresponding to that outbreak node.
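The path and subtree ranking of step S8 can be illustrated with a minimal Python sketch. It assumes the connected graph is a propagation tree given as a `children` mapping, that `times` holds the arrival time of each node, and that "TopN by propagation speed" means the N smallest average times per node. The function names, the `min_size` filter that excludes trivial subtrees, and the rule for picking the outbreak node on a path are assumptions of this sketch, not details stated above.

```python
def root_to_leaf_paths(children, root):
    """Yield every root-to-leaf path of the propagation tree as a list of nodes."""
    stack = [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            yield path
        for kid in kids:
            stack.append(path + [kid])


def avg_time_per_node(nodes, times, start):
    """Total elapsed time from `start` divided by the number of nodes."""
    return (max(times[n] for n in nodes) - times[start]) / len(nodes)


def top_n_paths(children, times, root, n=3):
    """The n paths with the smallest average time per node (fastest first)."""
    paths = [p for p in root_to_leaf_paths(children, root) if len(p) > 1]
    return sorted(paths, key=lambda p: avg_time_per_node(p, times, p[0]))[:n]


def outbreak_node_on_path(path, times):
    """Node on the path whose subsequent nodes were reached fastest on average."""
    best, best_avg = None, float("inf")
    for i, node in enumerate(path[:-1]):
        avg = (times[path[-1]] - times[node]) / (len(path) - 1 - i)
        if avg < best_avg:
            best, best_avg = node, avg
    return best


def subtree_nodes(children, root):
    """All nodes of the subtree rooted at `root`, including the root itself."""
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        seen.append(node)
        stack.extend(children.get(node, []))
    return seen


def top_n_outbreak_subtrees(children, times, n=3, min_size=3):
    """(outbreak node, outbreak cluster) pairs for the n fastest subtrees.

    Assumption: subtrees with fewer than min_size nodes are skipped, since a
    single leaf would otherwise have an average time of zero.
    """
    scored = []
    for node in times:
        nodes = subtree_nodes(children, node)
        if len(nodes) >= min_size:
            scored.append((avg_time_per_node(nodes, times, node), node, nodes))
    scored.sort(key=lambda item: item[0])
    return [(node, nodes) for _, node, nodes in scored[:n]]
```

The sketch enumerates every path and subtree explicitly, which is adequate for small graphs; a large connected graph would call for a single bottom-up pass instead.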
S9, cross analysis: by comparing the key nodes and associated clusters obtained in the different steps, finding key node accounts that coincide with one another, together with their associated clusters and action pathways;
based on the post multilevel propagation connected graphs, the core and bridge nodes of each connected graph are identified and cross-compared to judge whether the posting accounts corresponding to the core and bridge nodes coincide; the one or more post propagation connected graphs corresponding to each core or bridge node account are recorded, and, in combination with the method of step S5, the post propagation graphs and compact communities in which the core and bridge node accounts act are found, so as to uncover the mechanism by which the core and bridge nodes act in a multilevel propagation event;
in addition, the core bridge nodes and associated communities, the stance-transition key nodes and same-stance propagation clusters, the inciting nodes and affected clusters, and the outbreak nodes and outbreak clusters obtained in steps S5, S6, S7 and S8 are cross-compared to find node accounts that overlap with one another, i.e. key node accounts that act on multiple clusters through several pathways such as influencing communities, inciting stances and triggering outbreak propagation.
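The cross-comparison of step S9 amounts to finding accounts that appear among the key nodes of more than one step. A minimal Python sketch follows, assuming the results of steps S5 to S8 have been collected as sets of account identifiers keyed by step label; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def cross_compare(key_nodes_by_step):
    """Return the accounts that appear as key nodes in more than one step.

    key_nodes_by_step maps a step label (e.g. "S5") to a set of account IDs.
    The result maps each overlapping account to the sorted list of steps
    in which it was identified.
    """
    steps_by_account = defaultdict(set)
    for step, accounts in key_nodes_by_step.items():
        for account in accounts:
            steps_by_account[account].add(step)
    return {
        account: sorted(steps)
        for account, steps in steps_by_account.items()
        if len(steps) > 1
    }
```

The same grouping could be keyed by connected graph instead of by step to record which post propagation graphs a given core or bridge account acts in.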
Example 2
As shown in fig. 2, an embodiment of the present invention provides a multi-level propagation analysis system combining nodes and communities, including:
the environment building module 1, which adopts a machine learning platform for the NLP analysis scenario and uses GPU algorithms and a GPU graph algorithm library to accelerate graph computation;
the data set preprocessing module 2 is used for preprocessing data and segmenting a data set according to an event dimension;
the algorithm analysis module 3 is used for scheduling the underlying algorithms based on the multilevel propagation analysis method combining nodes and communities;
and the visualization module 4 is used for displaying the graph of the multilevel propagation analysis results, combining static rendering of map tiles loaded on demand with GPU dynamic rendering.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Because the information interaction between the above devices/units, their execution processes and related content are based on the same concept as the method embodiments of the present invention, their specific functions and technical effects can be found in the method embodiments and are not repeated here.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments.
2. Application example:
An embodiment of the present invention provides a computer device, including: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
Embodiments of the present invention further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps in the above method embodiments may be implemented.
An embodiment of the present invention further provides an information data processing terminal, which is configured to provide a user input interface for implementing the steps in the above method embodiments when implemented on an electronic device; the information data processing terminal includes, but is not limited to, a mobile phone, a computer, or a switch.
Embodiments of the present invention further provide a server, where the server is configured to provide a user input interface to implement the steps in the foregoing method embodiments when implemented on an electronic device.
Embodiments of the present invention further provide a computer program product which, when run on an electronic device, enables the electronic device to implement the steps in the above method embodiments.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which implements the steps of the method embodiments described above when executed by a processor. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk or an optical disk.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
3. Evidence of the relevant effects of the examples:
As shown in fig. 3, for the core bridge nodes and their action clusters, the distribution of each core bridge node across the communities can be clearly seen based on the core bridge coefficients, the clustering coefficients and the community algorithm, so that the community in which each core bridge node acts is obtained.
As shown in table 1, compared with map tile rendering alone or GPU rendering alone, combining the two rendering modes yields positive effects in terms of speed and of combining dynamic and static rendering.
TABLE 1 Positive effects of combining map tile static rendering and GPU dynamic rendering (table images BDA0004011030450000131 and BDA0004011030450000141 in the original publication)
Table 1 is one piece of evidence of the relevant effects of the embodiment of the present invention, and illustrates the positive effects of combining static rendering of map tiles with GPU dynamic rendering.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, and any modification, equivalent replacement, and improvement made by those skilled in the art within the technical scope of the present invention disclosed herein, which is within the spirit and principle of the present invention, should be covered by the present invention.

Claims (10)

1. A multilevel propagation analysis method combining nodes and communities, characterized by comprising the following steps:
S1, constructing a multilevel propagation network: taking users and posts respectively as nodes, and taking forwarding, commenting and replying relationships as edges, to construct a directed user multilevel propagation connected graph and a directed post multilevel propagation connected graph;
S2, identifying suspected abnormal graphs and abnormal nodes: calculating quantitative indices of each connected graph and identifying suspected abnormal graphs and abnormal nodes according to index thresholds;
S3, denoising: removing the suspected abnormal graphs and abnormal nodes based on the result of step S2 to obtain a denoised user multilevel propagation connected graph and a denoised post multilevel propagation connected graph;
S4, assigning attributes to the nodes and edges of the connected graphs: assigning clustering coefficient, stance tendency and core bridge coefficient attributes to the nodes, and assigning same-direction/opposite-direction edge binary values and stance strength difference attributes to the edges;
S5, identifying compact communities and core bridge nodes: finding the influence clusters of core nodes and the communication clusters of bridge nodes by combining the clustering coefficient, the core bridge coefficient and a community algorithm;
S6, identifying same-stance propagation clusters and stance-transition key nodes: discovering same-stance propagation clusters and stance-transition key nodes according to the stance tendency of the nodes and the sets of opposing nodes;
S7, identifying inciting nodes and affected clusters: discovering inciting nodes and affected clusters according to the incitement and follower coefficients and the same-direction/opposite-direction edge attributes;
S8, identifying outbreak clusters and outbreak nodes: discovering outbreak nodes and outbreak clusters according to the propagation speed of the propagation paths and propagation subtrees;
S9, cross analysis: by comparing the key nodes and associated clusters obtained in steps S5, S6, S7 and S8, finding key node accounts that coincide with one another, together with their associated clusters and action pathways.
2. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S1, the directed user multilevel propagation connected graph is formed by taking users as nodes, with the original user as the original node, and taking forwarding, commenting and replying relationships as edges;
the directed post multilevel propagation connected graph is formed by taking posts as nodes, with the original post as the original node, and taking forwarding, commenting and replying relationships as edges.
3. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S2, identifying the suspected abnormal graphs and abnormal nodes by calculating quantitative indices of the connected graph according to index thresholds specifically comprises:
calculating, for each post multilevel propagation connected graph, the average number of posts issued per account, the posting frequency per unit time, and the original-to-forwarded ratio;
average number of posts issued per account = post_n / account_n, where account_n is the number of accounts and post_n is the total number of posts;
posting frequency per unit time = post_n / t, where t is the observation period;
original-to-forwarded ratio = original_n / retweet_n, where original_n is the number of original posts and retweet_n is the number of forwarded posts;
and assigning an initial weight to each index, calculating a composite index, identifying suspected abnormal graphs and the nodes therein according to an initial threshold, and adjusting the weights and the threshold through subsequent verification.
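By way of illustration only, the three indices and a composite index could be computed as in the following Python sketch. The claim does not state how the indices are combined, so the weighted sum, the default weights, the convention that a larger composite value is more suspicious, and the clamping of a zero forwarding count to 1 are assumptions of this sketch; the function names are hypothetical.

```python
def abnormality_indices(post_n, account_n, original_n, retweet_n, t):
    """Return (average posts per account, posting frequency, original-to-forwarded ratio)."""
    avg_posts = post_n / account_n
    frequency = post_n / t
    original_ratio = original_n / max(retweet_n, 1)  # assumption: avoid division by zero
    return avg_posts, frequency, original_ratio


def composite_index(indices, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three indices (the combination rule is an assumption)."""
    return sum(w * x for w, x in zip(weights, indices))


def is_suspected_abnormal(indices, threshold, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Flag a connected graph whose composite index reaches the threshold."""
    return composite_index(indices, weights) >= threshold
```

Because the three indices have different scales, some form of normalisation before weighting would probably be needed in practice.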
4. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S4, assigning the clustering coefficient, stance tendency and core bridge coefficient attributes to the nodes, and assigning the same-direction/opposite-direction edge binary values and stance strength difference attributes to the edges, specifically comprises:
for the user multilevel propagation connected graph, assigning clustering coefficients to the nodes in the graph; for the post multilevel propagation connected graph, assigning the stance tendency and the core bridge coefficient to the nodes in the graph, and assigning to the edges in the graph the same-direction/opposite-direction edge binary values and the stance strength differences of the nodes associated with each edge, calculated as follows:
for any two nodes $n_i$ and $n_j$ and the edge $e_{ij}$ between them, let the stance of node $n_i$ be $S_i$ and the stance of node $n_j$ be $S_j$; if $S_i$ and $S_j$ are both positive or both negative, $e_{ij}$ is a same-direction edge and is assigned the value 1; otherwise $e_{ij}$ is an opposite-direction edge and is assigned the value 0; the stance strength difference between node $n_i$ and node $n_j$ is $\mathrm{diffS}_{ij} = |S_i - S_j|$.
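The edge attribute rule above can be written as a short Python sketch, assuming each node's stance is available as a signed number in a dictionary. The function name and the dictionary layout are hypothetical; a stance of exactly zero is treated as neither positive nor negative, so such an edge receives the value 0.

```python
def assign_edge_attributes(edges, stance):
    """Map each directed edge (i, j) to its same-direction flag and stance strength difference.

    edges  : iterable of (i, j) node pairs
    stance : dict mapping node -> signed stance value S
    """
    attributes = {}
    for i, j in edges:
        s_i, s_j = stance[i], stance[j]
        # Both positive or both negative means the product is strictly positive.
        same_direction = 1 if s_i * s_j > 0 else 0
        attributes[(i, j)] = {
            "same_direction": same_direction,
            "diffS": abs(s_i - s_j),
        }
    return attributes
```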
5. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S5, the identification of compact communities and core bridge nodes is based on the user multilevel propagation connected graph: a set of nodes closely related to their adjacent nodes is screened out according to a clustering coefficient threshold; a new user multilevel propagation connected graph is formed from this node set; on this new graph, the core coefficient and the bridge coefficient are calculated based on PageRank and betweenness centrality respectively to obtain the core nodes and bridge nodes, and the whole user multilevel propagation connected graph is divided into several compact communities by a community algorithm, thereby obtaining the influence radiation clusters of the core nodes and the communication clusters of the bridge nodes.
6. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S6, the identification of same-stance propagation clusters and stance-transition key nodes is based on the post multilevel propagation connected graph: same-stance propagation clusters are identified according to the stance tendency of the nodes, the preceding and succeeding connected nodes of each node are traversed, and if clusters of clearly opposite stance appear before and after a node, the node is marked as a stance-transition key node.
7. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S7, discovering the inciting nodes and affected clusters according to the incitement and follower coefficients and the same-direction/opposite-direction edge attributes specifically comprises:
based on the post multilevel propagation connected graph, calculating an incitement coefficient and a follower coefficient for each node from the same-direction/opposite-direction edge binary values and the stance strength differences of its associated edges, as follows;
for a given node w, calculating $\mathrm{provo}_1 = \mathrm{sameOut} / \mathrm{diffOut}$, where sameOut is the number of same-direction edges pointing from w to other nodes, and diffOut is the number of opposite-direction edges pointing from w to other nodes;
calculating $\mathrm{provo}_2 = \mathrm{sumdiffS} / (\mathrm{sameOut} + \mathrm{diffOut})$, where sameOut + diffOut is the total number of edges pointing from w to other nodes, and sumdiffS is the sum of the stance strength differences between w and the nodes it points to;
then the incitement coefficient $= a_1 \cdot \mathrm{provo}_1 + a_2 \cdot \mathrm{provo}_2$, where $a_1$ and $a_2$ are the weights of the corresponding indices;
the follower coefficient is calculated in the same way as the incitement coefficient, except that it considers the edges and nodes pointing to w;
the ratio of the incitement coefficient to the follower coefficient is thereby obtained, and inciting nodes are identified based on a ratio threshold; the set of nodes adjacent to a node through its associated same-direction edges is identified as the main cluster affected by the inciting node.
8. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S8, discovering the outbreak nodes and outbreak clusters according to the propagation speed of the propagation paths and propagation subtrees specifically comprises:
based on the post multilevel propagation connected graph, traversing each propagation path in the connected graph, recording the time of reaching each node, and then calculating the propagation speed of each propagation path to obtain the TopN paths by propagation speed, where N is a user-defined value of 3, 5 or 10; for each node on a TopN propagation path, identifying the outbreak node of that path according to the average time spent by its subsequent propagation nodes; and traversing each node in the connected graph as a root node, recording the time of reaching each of its child nodes, and then calculating the propagation speed of each subtree to obtain the TopN subtrees by propagation speed, wherein the root node of each such subtree is identified as an outbreak node, and the subtree is identified as the outbreak cluster corresponding to that outbreak node.
9. The multilevel propagation analysis method combining nodes and communities according to claim 1, wherein in step S9, finding the key node accounts that coincide with one another, together with their associated clusters and action pathways, by comparing the key nodes and associated clusters obtained in steps S5, S6, S7 and S8 specifically comprises:
based on the post multilevel propagation connected graphs, identifying the core and bridge nodes of each connected graph, cross-comparing the core and bridge nodes of the connected graphs, judging whether the posting accounts corresponding to the core and bridge nodes coincide, recording the one or more post propagation connected graphs corresponding to each core or bridge node account, and, in combination with the method of step S5, finding the post propagation graphs and compact communities in which the core and bridge node accounts act, thereby uncovering the mechanism by which the core and bridge nodes act in multilevel propagation events.
10. A multi-stage propagation analysis system combining nodes and communities, which adopts the multi-stage propagation analysis method combining nodes and communities according to any one of claims 1-9, and is characterized by comprising:
the environment building module (1), which adopts a machine learning platform for the NLP analysis scenario and accelerates graph computation by using GPU algorithms and a GPU graph algorithm library;
a data set preprocessing module (2) for preprocessing data and segmenting data sets in event dimensions;
the algorithm analysis module (3) is used for scheduling the underlying algorithms based on the multilevel propagation analysis method combining nodes and communities;
and the visualization module (4) is used for displaying the graph of the multilevel propagation analysis results by combining static rendering of map tiles loaded on demand with GPU dynamic rendering.
CN202211648746.4A 2022-12-21 2022-12-21 Node and community combined multilevel propagation analysis method and analysis system Pending CN115987810A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211648746.4A CN115987810A (en) 2022-12-21 2022-12-21 Node and community combined multilevel propagation analysis method and analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211648746.4A CN115987810A (en) 2022-12-21 2022-12-21 Node and community combined multilevel propagation analysis method and analysis system

Publications (1)

Publication Number Publication Date
CN115987810A true CN115987810A (en) 2023-04-18

Family

ID=85969391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211648746.4A Pending CN115987810A (en) 2022-12-21 2022-12-21 Node and community combined multilevel propagation analysis method and analysis system

Country Status (1)

Country Link
CN (1) CN115987810A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116319379A (en) * 2023-05-17 2023-06-23 云目未来科技(湖南)有限公司 Network information guiding intervention method and system based on propagation chain

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116319379A (en) * 2023-05-17 2023-06-23 云目未来科技(湖南)有限公司 Network information guiding intervention method and system based on propagation chain
CN116319379B (en) * 2023-05-17 2023-08-01 云目未来科技(湖南)有限公司 Network information guiding intervention method and system based on propagation chain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination