CN116150507B - Water army group identification method, device, equipment and medium - Google Patents
- Publication number
- CN116150507B (CN202310349637.0A)
- Authority
- CN
- China
- Prior art keywords
- group
- network
- node
- cooperative
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention provides a water army group identification method, device, equipment and medium. The method comprises the following steps. Step one: based on a social network event, analyze whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network. Step two: based on the global cooperative relationship network, perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity to obtain the node comprehensive similarity. Step three: based on the node comprehensive similarity, divide the cooperative relationship network into groups using a hierarchical division method to obtain each water army group participating in the social network event. The method can accurately mine the water army groups that forward cooperatively and deliberately amplify topic influence in a social network event, and allows the organizational behavior and operating mode of those groups to be examined.
Description
Technical Field
The invention relates to the technical field of computer data processing, in particular to a water army group identification method, a water army group identification device, computer equipment and a computer readable storage medium based on social network cooperative forwarding behaviors.
Background
With the rapid popularization and development of internet technology, netizens can freely socialize and spread personal views on social platforms. Because topic discussions involve many parties and virtual identities are concealed, large amounts of repetitive information have emerged on social platforms and continually intrude on users' attention, and large numbers of internet water army operation groups have appeared accordingly. Water army operations are growing in scale; water army groups stir up public opinion and spread false information, which threatens network security and affects social stability. Discovering and monitoring network water army groups is therefore of significant value for maintaining network security and ensuring the authenticity of network information.
The technical problems to be solved in current water army identification are as follows:
1. Traditional water army identification methods focus on manually constructed features and use machine learning or deep learning to output a probability that a user is a water army member based on the user features, so identification accuracy depends heavily on how the user features are screened and extracted.
2. Existing water army identification methods mainly identify individual users. It is difficult for them to observe the association relationships among water army members automatically and clearly, so they cannot effectively mine water army groups or analyze the organizational characteristics and behavior patterns of the groups.
3. Once the global association relationships among water army members have been discovered and constructed, it remains to be solved how to effectively calculate the similarity between members and divide them into different groups.
Disclosure of Invention
In view of the above, the invention provides a water army group identification method, a device, a computer device and a computer readable storage medium based on social network cooperative forwarding behavior, so as to accurately mine the water army group which performs cooperative forwarding in a social network event and deliberately amplifies the topic influence, and to provide important support for monitoring and limiting water army scale operations by observing the organization behavior and the operational mode of the water army group.
The technical scheme of the invention is as follows:
In a first aspect, the invention provides a water army group identification method based on social network cooperative forwarding behavior, which comprises the following steps:
Step one: based on a social network event, analyze whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network;
Step two: based on the global cooperative relationship network, perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity to obtain the node comprehensive similarity;
Step three: based on the node comprehensive similarity, divide the cooperative relationship network into groups using a hierarchical division method to obtain each water army group participating in the social network event.
In a second aspect, the invention also provides a water army group identification device based on social network cooperative forwarding behavior, which comprises the following modules:
A cooperative relationship network module, configured to: based on a social network event, analyze whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network;
A comprehensive similarity calculation module, configured to: based on the global cooperative relationship network, perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity to obtain the node comprehensive similarity;
A group division module, configured to: based on the node comprehensive similarity, divide the cooperative relationship network into groups using a hierarchical division method to obtain each water army group participating in the social network event.
In a third aspect, the present invention also provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the above method when the processor executes the computer program.
In a fourth aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
Compared with the prior art, the invention has the beneficial effects that:
1. The method identifies water army members through the social behavior pattern of cooperative forwarding and judges abnormal behavior automatically, which is more accurate than identifying them through manually constructed features. The implementation is fully automatic: given the forwarding data of a network event as input, it automatically identifies large-scale cooperative forwarding behavior in the event, mines the water army groups, and identifies the water army leaders. It is fast and accurate and has high practical application value;
2. The invention identifies water army members from the new perspective of water army group mining, which is more accurate than identifying individual members and yields more interpretable results. While the water army groups are mined, their organizational characteristics and behavior patterns can also be discovered;
3. The invention provides a new node comprehensive similarity method that performs weighted fusion of the first-order direct proximity similarity and the neighborhood similarity of node pairs, fully mining and using network structure information. Dividing the global cooperative relationship network into groups based on the node comprehensive similarity allows all water army groups participating in the same network event to be mined accurately.
The preferred embodiments of the present invention and their advantageous effects will be described in further detail with reference to specific embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification, and together with the description serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a water army group identification method based on social network cooperative forwarding behavior of the present invention;
FIG. 2 is a schematic diagram of a structure for constructing a global cooperative relationship network;
FIG. 3 is a water army group identification algorithm description diagram based on social network cooperative forwarding behavior;
FIG. 4 is a block diagram of a water army group identification device based on social network cooperative forwarding behavior.
Detailed Description
The following describes specific embodiments of the present invention in detail with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.
The water army group identification method based on the social network cooperative forwarding behavior can be applied to computer equipment such as terminals and servers. The terminal may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, which may be head-mounted devices, etc.; the server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
Referring to fig. 1, the invention provides a water army group identification method based on social network cooperative forwarding behavior, which comprises the following steps:
Step one: based on a social network event, analyze whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network.
The cooperative forwarding behavior of a water army is defined as two users forwarding the same article source within a short time interval, with the number of occurrences of this behavior reaching a preset threshold. Cooperative forwarding differs from normal forwarding in three ways: 1) it is normal for two users to forward the same article, but not normal for them to do so at the same time; 2) it is normal for two users to occasionally forward at the same time, but abnormal for them to do so often; 3) manual operation takes a certain amount of time, so operations with a very small time difference are often the behavior of programmed water army accounts. Water army groups are driven by profit; they densely forward designated content with a subjective slant, instantly form a homogeneous voice, deliberately steer topics, and manipulate the direction of public opinion, and they are characterized by cooperation, group behavior, and scale.
The first step comprises the following steps:
S11: Calculate the cooperative forwarding relationships of user pairs. For the social network event to be studied, set a keyword group and a time range for the event; based on the keyword group and the time range, retrieve and match data using an API and a crawler tool to obtain data related to the social network event. The acquired fields comprise the user name, the user's post ID, the forwarding source user name, the forwarding source article ID, and the posting time. After the data are cleaned and preprocessed, records that forward the same article source are placed in the same group, i.e. the forwarding source article ID values within each group are equal. Each group is sorted in ascending order of posting time. If the forwarding time interval between users $v_i$ and $v_j$ is within the time interval threshold of $\Delta t$ seconds, the two users forwarded the same article source within a very short time interval, and an edge is constructed between $v_i$ and $v_j$. When users $v_i$ and $v_j$ forward cooperatively multiple times, the occurrences are accumulated, and the accumulated number of cooperative forwards is the edge weight $w_{ij}$ of the network.
S12: Delete the edges whose weight is smaller than the set threshold and construct the global cooperative relationship network. If the edge weight $w_{ij}$ between users $v_i$ and $v_j$ is small, the joint forwarding may be coincidental, and the probability that the users are related is low. To exclude such accidental factors, a weight threshold $\theta$ is set, edges with $w_{ij} < \theta$ are removed, and the cooperative relationship network of the event to be studied is constructed. The cooperative relationship network can be abstracted as a graph $G = (V, E)$ consisting of nodes and edges between nodes, as shown in FIG. 2, where $V$ represents the user nodes and an edge in $E$ with weight $w_{ij}$ indicates that the two users cooperatively forwarded the same article source $w_{ij}$ times. For example, an edge weight $w_{ij} = 3$ for the cooperating user pair $v_i$ and $v_j$ indicates that the two users forwarded the same article source together within the time interval threshold of $\Delta t$ seconds, and that this cooperative forwarding behavior occurred 3 times.
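Step one can be illustrated with a minimal Python sketch. The record layout `(user, source_article_id, timestamp_seconds)` and the names `build_cooperation_network`, `max_gap_seconds` and `min_weight` are assumptions made for illustration and do not appear in the source. The sketch also assumes that a user pair is counted at most once per source article and that a gap equal to the threshold still counts as cooperative; the source does not settle either point.

```python
from collections import defaultdict


def build_cooperation_network(records, max_gap_seconds, min_weight):
    """Return {(user_a, user_b): weight} for pairs that co-forward often enough.

    records: iterable of (user, source_article_id, timestamp_seconds).
    """
    # Group forwards by the article source they forward.
    by_source = defaultdict(list)
    for user, source_id, ts in records:
        by_source[source_id].append((ts, user))

    weights = defaultdict(int)
    for forwards in by_source.values():
        forwards.sort()  # ascending by posting time
        pairs_seen = set()
        for i, (ts_i, user_i) in enumerate(forwards):
            for ts_j, user_j in forwards[i + 1:]:
                if ts_j - ts_i > max_gap_seconds:
                    break  # sorted, so later forwards are even further apart
                if user_i != user_j:
                    pairs_seen.add(frozenset((user_i, user_j)))
        # Assumption: one cooperative occurrence per pair per source article.
        for pair in pairs_seen:
            weights[pair] += 1

    # Drop edges whose accumulated weight is below the threshold.
    return {tuple(sorted(p)): w for p, w in weights.items() if w >= min_weight}
```

Sorting each group by time lets the inner loop stop as soon as the gap exceeds the threshold, which avoids comparing every pair in a large forwarding cascade.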
Step two: based on the global cooperative relationship network, perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity to obtain the node comprehensive similarity.
The second step comprises the following steps:
S21: Calculate the first-order direct proximity similarity
After the cooperative relationship network $G$ is constructed, let $A$ denote the adjacency matrix of $G$. Compute the degree matrix $D$: $D_{ii} = \sum_{j} A_{ij}$. Compute the Laplacian matrix $L$ of the cooperative relationship network $G$: $L = D - A$. Normalize the Laplacian matrix $L$: $L_{sym} = D^{-1/2} L D^{-1/2}$. By the spectral theorem, perform eigenvalue decomposition of the normalized Laplacian matrix $L_{sym}$: $L_{sym} = U \Lambda U^{T}$, where $\Lambda$ is the diagonal matrix of eigenvalues and $U$ is the eigenvector matrix.
Sort the eigenvalues in ascending order, extract the first $k$ eigenvalues, and compute the $k$ eigenvectors corresponding to them. The $k$ eigenvectors form a matrix $Y = [\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k] \in \mathbb{R}^{n \times k}$, where $n$ is the number of user nodes and $k$ is the dimension of the vectors. Each row of $Y$ is the $k$-dimensional representation vector of one user node.
For $Y$, let $y_i$ and $y_j$ denote the $i$-th and $j$-th row vectors of $Y$, and let $y_{il}$ and $y_{jl}$ denote the $l$-th dimension of the representations of user nodes $v_i$ and $v_j$. The first-order direct proximity similarity $S^{\mathrm{dir}}_{ij}$ of user nodes $v_i$ and $v_j$ is computed from these vectors [formula not preserved in the source]. The node embedding vector similarity obtained from the Laplacian eigenmap considers only the first-order similarity of user node pairs and does not consider the similarity of the neighborhood structure of the user nodes.
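The matrix construction in S21 can be sketched in plain Python as follows. This is an illustrative sketch, not the patent's implementation: the function name `normalized_laplacian` is hypothetical, the symmetric normalization $D^{-1/2}(D - A)D^{-1/2}$ is an assumption about which normalization is intended, and every node is assumed to have at least one edge so that no degree is zero.

```python
import math


def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian of an undirected graph.

    adjacency: symmetric square matrix as a list of lists; every node is
    assumed to have nonzero degree.
    """
    n = len(adjacency)
    # Degree of each node is the row sum of the adjacency matrix.
    degrees = [sum(row) for row in adjacency]
    # Unnormalized Laplacian L = D - A.
    laplacian = [
        [(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Scale rows and columns by D^{-1/2}.
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    return [
        [inv_sqrt[i] * laplacian[i][j] * inv_sqrt[j] for j in range(n)]
        for i in range(n)
    ]
```

The eigenvalue decomposition itself is left out of the sketch; in practice it would typically be delegated to a numerical library routine for symmetric matrices, after which the eigenvectors of the smallest eigenvalues are stacked as columns and each row is read as a node's representation vector.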
S22: Calculate the improved neighborhood similarity. A baseline neighborhood similarity of user nodes $v_i$ and $v_j$ is first defined from their neighbor counts and common neighbors [index name and formula not preserved in the source], where $d_i$ and $d_j$ represent the numbers of neighbor nodes of user nodes $v_i$ and $v_j$, and $|N(v_i) \cap N(v_j)|$ represents the number of common neighbors of user node $v_i$ and user node $v_j$. The more common neighbors user node $v_i$ and user node $v_j$ have, the larger the similarity value.
This baseline similarity measures the similarity of two nodes by the number of their common neighbors, but it cannot assess how tightly the common neighbors are connected to each other. Suppose there are two networks $G_1$ and $G_2$ in which node $v_i$ and node $v_j$ have the same neighbor nodes, but in network $G_1$ the common neighbors of node $v_i$ and node $v_j$ are more tightly connected internally, i.e. the common neighbors have a greater network density. In this case the baseline similarity formula gives node $v_i$ and node $v_j$ the same similarity in $G_1$ and $G_2$, whereas a more reasonable estimate should make the similarity of node $v_i$ and node $v_j$ in network $G_1$ greater than in network $G_2$. The present invention therefore proposes an improved neighborhood similarity $S^{\mathrm{nbr}}_{ij}$ [formula not preserved in the source], where $e_{ij}$ is the actual number of edges in the sub-network formed by node $v_i$, node $v_j$ and their common neighbors; $n_{ij}$ is the total number of nodes in that sub-network; $\frac{n_{ij}(n_{ij}-1)}{2}$ is the maximum number of edges that can theoretically form in the sub-network; and $\rho_{ij} = \frac{2 e_{ij}}{n_{ij}(n_{ij}-1)}$ is the network density of the sub-network.
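The two ingredients that S22 describes, the common-neighbor count and the density of the sub-network formed by the two nodes and their common neighbors, can be sketched as below. How the patent combines them into one similarity value is not recoverable from the text, so the sketch returns them separately. The names `make_adjacency` and `common_neighbor_density` are hypothetical.

```python
def make_adjacency(edges):
    """Build {node: set(neighbors)} from an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def common_neighbor_density(adj, u, v):
    """Return (number of common neighbors, density of the induced sub-network).

    The sub-network consists of u, v and their common neighbors.
    """
    common = (adj[u] & adj[v]) - {u, v}
    members = common | {u, v}
    n = len(members)
    # Each edge inside the sub-network is seen from both endpoints.
    edges = sum(len(adj[x] & members) for x in members) // 2
    max_edges = n * (n - 1) // 2  # n >= 2, so this is never zero
    return len(common), edges / max_edges
```

For two networks in which a node pair has the same two common neighbors, the count is identical, while the density is higher in the network whose sub-network has more internal edges. This is the distinction the improved similarity is meant to capture.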
S23: Calculate the node comprehensive similarity
Apply maximum-value normalization to the first-order direct proximity similarity $S^{\mathrm{dir}}_{ij}$ and the improved neighborhood similarity $S^{\mathrm{nbr}}_{ij}$ respectively, then perform weighted fusion of the normalized values to obtain the node comprehensive similarity, which ensures that network structure information is fully mined and used. The comprehensive similarity is calculated as: $S_{ij} = \alpha \frac{S^{\mathrm{dir}}_{ij}}{\max(S^{\mathrm{dir}})} + (1-\alpha) \frac{S^{\mathrm{nbr}}_{ij}}{\max(S^{\mathrm{nbr}})}$, where $\alpha$ is the weight, with a value range of 0 to 1; $\max(S^{\mathrm{dir}})$ is the maximum value of $S^{\mathrm{dir}}_{ij}$; and $\max(S^{\mathrm{nbr}})$ is the maximum value of $S^{\mathrm{nbr}}_{ij}$.
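The fusion in S23 can be sketched as follows. The function name `fuse_similarities` and the dictionary layout are hypothetical, and which of the two similarities receives the weight `alpha` (rather than `1 - alpha`) is an assumption, since the source formula is not legible. Both inputs are assumed to cover the same node pairs and to have a positive maximum.

```python
def fuse_similarities(direct, neighborhood, alpha):
    """Max-normalize two pairwise similarity maps and blend them.

    direct, neighborhood: dicts mapping a node pair to a similarity value.
    alpha: weight in [0, 1] applied to the normalized direct similarity
           (assumption); 1 - alpha goes to the neighborhood similarity.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    max_direct = max(direct.values())
    max_neighborhood = max(neighborhood.values())
    return {
        pair: alpha * direct[pair] / max_direct
        + (1.0 - alpha) * neighborhood[pair] / max_neighborhood
        for pair in direct
    }
```

Dividing each similarity by its own maximum puts both on a 0 to 1 scale before blending, so neither dominates merely because of its numeric range.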
Step three: based on the comprehensive similarity of the nodes, the collaborative relationship network is subjected to group division by using a hierarchical division method, and each water army group participating in the social network event is obtained.
The third step comprises the following steps:
S31: Construct the group similarity formula
For any two groups $C_a$ and $C_b$, the similarity of the two groups is measured by the maximum comprehensive similarity over all node pairs drawn from the two groups. The group similarity formula is: $S(C_a, C_b) = \max_{v_i \in C_a,\, v_j \in C_b} S_{ij}$, where $v_i$ is any node in group $C_a$ and $v_j$ is any node in group $C_b$.
S32: Group division based on the hierarchical method
The network is divided into groups using a hierarchical division method. Suppose there are $n$ nodes to be grouped; initially each node is regarded as a group. The two most similar groups are then merged into one group using the group similarity formula, yielding $n-1$ groups in total. Merging of the two most similar groups then continues until all nodes are merged into one group.
The whole division process can be expressed as a hierarchical tree diagram in which each layer represents one group division result of the relationship network. The modularity value corresponding to each group division result is evaluated, and the group division result with the largest modularity $Q$ is selected as the optimal division of the network. The modularity $Q$ is calculated as: $Q = \frac{1}{2m} \sum_{i,j} \left[ w_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$, where $m$ represents the sum of all cooperative forwarding weights in the network; $w_{ij}$ represents the cooperative forwarding weight of node $v_i$ and node $v_j$; $k_i$ represents the degree of $v_i$, i.e. the sum of the weights of all cooperative forwarding involving node $v_i$; $c_i$ represents the group to which node $v_i$ belongs; and $\delta(c_i, c_j)$ is 1 if users $v_i$ and $v_j$ belong to the same group and 0 otherwise. The greater the modularity, the tighter the associations within each water army group and the sparser the associations between water army groups, so the group structure is more reasonable and the division effect is better.
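Step three can be sketched as a simple agglomerative loop that merges the two most similar groups at each level and keeps the level with the highest modularity. The names `modularity` and `best_partition` and the data layouts are hypothetical; the modularity is computed in its group-wise form, which is algebraically equivalent to the pairwise sum over $\delta(c_i, c_j)$. Node pairs without a stored similarity are treated as having similarity 0, which is an assumption.

```python
from itertools import combinations


def modularity(weights, groups):
    """Weighted modularity Q of a partition.

    weights: dict (u, v) -> weight for undirected edges; groups: list of sets.
    """
    m = sum(weights.values())
    strength = {}
    for (u, v), w in weights.items():
        strength[u] = strength.get(u, 0) + w
        strength[v] = strength.get(v, 0) + w
    q = 0.0
    for group in groups:
        inside = sum(w for (u, v), w in weights.items() if u in group and v in group)
        total = sum(strength.get(x, 0) for x in group)
        q += inside / m - (total / (2 * m)) ** 2
    return q


def best_partition(nodes, similarity, weights):
    """Merge groups bottom-up and return the partition with the largest Q.

    similarity: dict frozenset({u, v}) -> comprehensive similarity.
    """
    groups = [{x} for x in nodes]
    best_groups, best_q = [set(g) for g in groups], modularity(weights, groups)
    while len(groups) > 1:
        def group_similarity(pair):
            # Group similarity = maximum similarity over cross-group node pairs.
            a, b = pair
            return max(
                similarity.get(frozenset((x, y)), 0.0)
                for x in groups[a] for y in groups[b]
            )

        a, b = max(combinations(range(len(groups)), 2), key=group_similarity)
        merged = groups[a] | groups[b]
        groups = [g for i, g in enumerate(groups) if i not in (a, b)] + [merged]
        q = modularity(weights, groups)
        if q > best_q:
            best_groups, best_q = [set(g) for g in groups], q
    return best_groups, best_q
```

The loop recomputes group similarities from scratch at every level, which is fine for small examples but would need caching for networks with many nodes.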
After the water army group division result is obtained, the leader of each water army group is identified using a majority voting method. For each group, the frequency of each forwarding source user among the users in the group is counted and the frequencies are arranged in descending order; the forwarding source users with the highest frequency are selected as the group leaders, i.e. the users whose posts are forwarded most by the group. A description diagram of the water army group identification algorithm based on social network cooperative forwarding behavior is shown in fig. 3.
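The majority-voting step can be sketched with a frequency counter. The function name `group_leaders`, the `(user, source_user)` pair layout and the parameter `top_k` are hypothetical; the number of leaders selected per group is not legible in the source, so it is left as a parameter.

```python
from collections import Counter


def group_leaders(group, forwards, top_k=1):
    """Return the forwarding source users most often forwarded by a group.

    group: set of user names in one water army group.
    forwards: iterable of (user, source_user) pairs.
    top_k: how many leaders to return (hypothetical parameter).
    """
    # Count only forwards made by members of this group.
    counts = Counter(src for user, src in forwards if user in group)
    return [src for src, _ in counts.most_common(top_k)]
```

Only forwards made by group members are counted, so a source that is popular with the general public but rarely forwarded by the group is not selected as its leader.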
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with at least part of the other steps or of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a water army group identification device based on social network cooperative forwarding behavior, in which each module executes the corresponding step of the embodiment of the water army group identification method.
Referring to fig. 4, the water army group identification device includes:
A cooperative relationship network module, configured to: based on a social network event, analyze whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network.
A comprehensive similarity calculation module, configured to: based on the global cooperative relationship network, perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity to obtain the node comprehensive similarity.
A group division module, configured to: based on the node comprehensive similarity, divide the cooperative relationship network into groups using a hierarchical division method to obtain each water army group participating in the social network event.
The cooperative relational network module comprises:
A cooperative forwarding relationship module, configured to: set a keyword group and a time range for the social network event to be studied and, based on the keyword group and the time range, retrieve and match data using an API and a crawler tool to obtain data related to the social network event. The acquired fields comprise the user name, the user's post ID, the forwarding source user name, the forwarding source article ID, and the posting time. After the data are cleaned and preprocessed, records that forward the same article source are placed in the same group, i.e. the forwarding source article ID values within each group are equal. Each group is sorted in ascending order of posting time. If the forwarding time interval between users $v_i$ and $v_j$ is within the time interval threshold of $\Delta t$ seconds, the two users forwarded the same article source within a very short time interval, and an edge is constructed between $v_i$ and $v_j$. When users $v_i$ and $v_j$ forward cooperatively multiple times, the occurrences are accumulated, and the accumulated number of cooperative forwards is the edge weight $w_{ij}$ of the network.
A removal module, configured to: delete the edges whose weight is smaller than the set threshold and construct the global cooperative relationship network. If the edge weight $w_{ij}$ between users $v_i$ and $v_j$ is small, the joint forwarding may be coincidental, and the probability that the users are related is low. To exclude such accidental factors, a weight threshold $\theta$ is set, edges with $w_{ij} < \theta$ are removed, and the cooperative relationship network of the event to be studied is constructed. The cooperative relationship network can be abstracted as a graph $G = (V, E)$ consisting of nodes and edges between nodes, as shown in FIG. 2, where $V$ represents the user nodes and an edge in $E$ with weight $w_{ij}$ indicates that the two users cooperatively forwarded the same article source $w_{ij}$ times. For example, an edge weight $w_{ij} = 3$ for the cooperating user pair $v_i$ and $v_j$ indicates that the two users forwarded the same article source together within the time interval threshold of $\Delta t$ seconds, and that this cooperative forwarding behavior occurred 3 times.
The comprehensive similarity calculation module comprises:
A proximity similarity calculation module, configured to:
after the cooperative relationship network $G$ is constructed, let $A$ denote the adjacency matrix of $G$ and compute the degree matrix $D$: $D_{ii} = \sum_{j} A_{ij}$; compute the Laplacian matrix $L$ of the cooperative relationship network $G$: $L = D - A$; normalize the Laplacian matrix $L$: $L_{sym} = D^{-1/2} L D^{-1/2}$; by the spectral theorem, perform eigenvalue decomposition of the normalized Laplacian matrix: $L_{sym} = U \Lambda U^{T}$, where $\Lambda$ is the diagonal matrix of eigenvalues and $U$ is the eigenvector matrix. Sort the eigenvalues in ascending order, extract the first $k$ eigenvalues, and compute the $k$ eigenvectors corresponding to them. The $k$ eigenvectors form a matrix $Y = [\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k] \in \mathbb{R}^{n \times k}$, where $n$ is the number of user nodes and $k$ is the dimension of the vectors. Each row of $Y$ is the $k$-dimensional representation vector of one user node.
For $Y$, let $y_i$ and $y_j$ denote the $i$-th and $j$-th row vectors of $Y$, and let $y_{il}$ and $y_{jl}$ denote the $l$-th dimension of the representations of user nodes $v_i$ and $v_j$. The first-order direct proximity similarity $S^{\mathrm{dir}}_{ij}$ of user nodes $v_i$ and $v_j$ is computed from these vectors [formula not preserved in the source].
A neighborhood similarity calculation module, configured to:
calculate the improved neighborhood similarity $S^{\mathrm{nbr}}_{ij}$ [formula not preserved in the source], where $d_i$ and $d_j$ represent the numbers of neighbor nodes of user nodes $v_i$ and $v_j$; $|N(v_i) \cap N(v_j)|$ represents the number of common neighbors of user node $v_i$ and user node $v_j$; $e_{ij}$ is the actual number of edges in the sub-network formed by node $v_i$, node $v_j$ and their common neighbors; $n_{ij}$ is the total number of nodes in that sub-network; $\frac{n_{ij}(n_{ij}-1)}{2}$ is the maximum number of edges that can theoretically form in the sub-network; and $\rho_{ij} = \frac{2 e_{ij}}{n_{ij}(n_{ij}-1)}$ is the network density of the sub-network.
A comprehensive similarity calculation module, configured to: apply maximum-value normalization to the first-order direct proximity similarity $S^{\mathrm{dir}}_{ij}$ and the improved neighborhood similarity $S^{\mathrm{nbr}}_{ij}$ respectively, then perform weighted fusion of the normalized values to obtain the node comprehensive similarity, which ensures that network structure information is fully mined and used. The comprehensive similarity is calculated as: $S_{ij} = \alpha \frac{S^{\mathrm{dir}}_{ij}}{\max(S^{\mathrm{dir}})} + (1-\alpha) \frac{S^{\mathrm{nbr}}_{ij}}{\max(S^{\mathrm{nbr}})}$, where $\alpha$ is the weight, with a value range of 0 to 1; $\max(S^{\mathrm{dir}})$ is the maximum value of $S^{\mathrm{dir}}_{ij}$; and $\max(S^{\mathrm{nbr}})$ is the maximum value of $S^{\mathrm{nbr}}_{ij}$.
The group division module comprises:
a group similarity calculation module configured to,
for any two populations、/>The similarity of the two groups is measured based on the maximum value of the comprehensive similarity of any node pair in the two groups, and the group similarity formula is as follows: />The method comprises the steps of carrying out a first treatment on the surface of the Wherein (1)>Is of the group->Any node in>Is of the group->Any node in the hierarchy.
A hierarchy-based partitioning module configured to,
group division of networks using hierarchical division methods, assuming there areThe individual nodes are grouped, initially each node being considered a group. Then combining the two most similar populations into one population using a population similarity formula, co-generating +.>A population. The merging of the two most similar populations then continues until all nodes are merged into one population.
The whole partitioning flow can be expressed as a hierarchical tree diagram, each layer represents a group partitioning result of the relational network, and each group partitioning result is traversedSelecting the corresponding module degree value and selecting the module degreeThe largest group division result is used as the optimal division result of the network. Modularity->The calculation formula of (2) is as follows: />;The method comprises the steps of carrying out a first treatment on the surface of the Wherein (1)>Representing the sum of all cooperative forwarding weights in the network; />Representing node->And node->Is a cooperative forwarding weight of (1); />Representation->Degree of (2), i.e. all AND nodes +.>The sum of the cooperative forwarding weights that occur; />Representing nodesA population to which the method belongs; if the user is->、/>Belongs to the same group, is->The value is 1, otherwise, the value is 0.
The greater the modularity, the tighter the associations within each identified water army group and the sparser the associations between groups; the group structure is then more reasonable and the division result is better.
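For illustration, the weighted modularity of one candidate division can be computed as in the sketch below. It assumes the standard weighted modularity definition, an undirected network without self-loops whose edges are each listed once, and hypothetical input structures (`weights` keyed by node pairs, `membership` mapping each node to a group label).

```python
def modularity(weights, membership):
    """Weighted modularity Q of a partition of an undirected network.

    weights: dict mapping (u, v) to the cooperative forwarding weight; each edge listed once.
    membership: dict mapping every node to its group label.
    """
    m = sum(weights.values())  # total edge weight in the network
    if m == 0:
        return 0.0

    # Weighted degree of each node: sum of the weights of its incident edges.
    degree = {}
    for (u, v), w in weights.items():
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w

    # Sum of A_ij over ordered same-group pairs: each internal edge counts twice.
    internal = sum(2 * w for (u, v), w in weights.items() if membership[u] == membership[v])

    # Sum of k_i * k_j over ordered same-group pairs equals the squared group degree totals.
    group_degree = {}
    for node, k in degree.items():
        label = membership[node]
        group_degree[label] = group_degree.get(label, 0.0) + k
    expected = sum(total * total for total in group_degree.values()) / (2 * m)

    return (internal - expected) / (2 * m)
```

Evaluating this function on every layer of the hierarchy and keeping the layer with the largest value corresponds to the selection rule described above. On two triangles joined by a single unit-weight edge, splitting the network into the two triangles scores higher than keeping all nodes in one group, which scores 0.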
After the water army group division result is obtained, the leaders of each water army group are identified by majority voting. For each group, the frequency with which each source user is forwarded by users in the group is counted and sorted in descending order, and the top $k$ source users by frequency are selected as the group leaders, i.e., the $k$ users most forwarded by the group.
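The majority-voting step can be sketched with a frequency counter. The record layout, a list of `(forwarding_user, source_user)` pairs, and the parameter name `top_k` are assumptions made for this example.

```python
from collections import Counter


def identify_leaders(group_members, forwards, top_k=1):
    """Return the top_k source users most often forwarded by members of one group.

    group_members: collection of user identifiers in the group.
    forwards: iterable of (forwarding_user, source_user) pairs.
    """
    members = set(group_members)
    # Count how often each source user is forwarded by a member of this group.
    counts = Counter(source for user, source in forwards if user in members)
    # most_common sorts by descending frequency.
    return [source for source, _ in counts.most_common(top_k)]
```

Forwards made by users outside the group are ignored, so each group votes only with its own members' behavior.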
It should be understood that, in the structural block diagram of the water army group identification device shown in fig. 4, each module is configured to execute each step in the embodiment corresponding to fig. 1, and each step in the embodiment corresponding to fig. 1 has been explained in detail in the foregoing embodiment, and specific reference is made to fig. 1 and related descriptions in the embodiment corresponding to fig. 1, which are not repeated herein.
Based on the same inventive concept, the embodiment of the application also provides computer equipment for realizing the water army group identification method based on the social network cooperative forwarding behavior. The implementation scheme of the solution to the problem provided by the computer device is similar to the implementation scheme described in the above method, so the specific limitation in the embodiments of the computer device provided below may refer to the limitation of the water army group identification method based on the social network cooperative forwarding behavior hereinabove, and will not be repeated herein.
In one embodiment, a computer device, which may be a terminal, is provided that includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements a water army group identification method based on social network cooperative forwarding behavior. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
In one embodiment, a computer device is provided, including a memory and a processor, where the memory stores a computer program, and the processor implements the method for identifying a water army population based on social network cooperative forwarding behavior as described in the above embodiments when the computer program is executed.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements a water army population identification method based on social network cooperative forwarding behavior as described in the above embodiments.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a water army population identification method based on social network cooperative forwarding behavior as described in the above embodiments.
The water army group identification method, the device, the computer equipment, the computer readable storage medium and the computer program product based on the social network cooperative forwarding behavior have the following beneficial effects:
1. The invention identifies the water army through the social behavior pattern of cooperative forwarding and judges abnormal behavior automatically, which is more accurate than identifying the water army with manually constructed features. The implementation is fully automatic: given the forwarding data of a network event as input, it automatically identifies large-scale cooperative forwarding behavior in the event, mines the water army groups, and identifies the water army leaders. It is fast and accurate, and has high practical application value;
2. The invention identifies the water army from the novel perspective of water army group mining, which is more accurate than identifying individual water army accounts and yields more interpretable results. While the water army groups are mined, the organizational characteristics and behavior patterns of the water army can also be discovered;
3. The invention provides a novel node comprehensive similarity method that performs weighted fusion of the first-order direct proximity similarity and the neighborhood similarity of node pairs, fully mining and utilizing the network structure information. Dividing the global cooperative relationship network into groups based on the node comprehensive similarity allows all water army groups participating in the same network event to be mined accurately.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples only represent a few embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.
Claims (8)
1. The water army group identification method based on the social network cooperative forwarding behavior is characterized by comprising the following steps of:
step one: based on a social network event, analyzing whether each pair of users forwards the same article source within a very short time interval; if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users; extracting the cooperative forwarding relationships of all user pairs in the social network event, and constructing a global cooperative relationship network;
step two: based on a global cooperative relation network, carrying out weighted fusion on the first-order direct adjacent similarity and the neighborhood similarity to obtain node comprehensive similarity;
step three: based on the comprehensive similarity of the nodes, carrying out group division on the cooperative relationship network by using a hierarchical division method to obtain all water army groups participating in social network events;
the second step comprises the following steps:
s21: after the cooperative relationship network $G$ is constructed, define the adjacency matrix of $G$ as $A$, and calculate the degree matrix $D$: $D_{ii} = \sum_{j} A_{ij}$;
according to the spectral theorem, eigenvalue decomposition is performed on the normalized Laplacian matrix $L$: $L = U \Lambda U^{T}$,
wherein $\Lambda$ is the diagonal matrix of eigenvalues and $U$ is the eigenvector matrix;
the eigenvalues are sorted in ascending order, the first $d$ eigenvalues are extracted, and the eigenvectors corresponding to these $d$ eigenvalues are calculated; the $d$ eigenvectors form a matrix $Y \in \mathbb{R}^{n \times d}$,
wherein $n$ is the number of user nodes and $d$ is the dimension of the vectors; $y_i$ denotes the $d$-dimensional representation vector of user node $v_i$;
for $Y$, let $y_i$ and $y_j$ denote the $i$-th and $j$-th row vectors of $Y$ respectively, and let $y_{ik}$ and $y_{jk}$ denote the $k$-th dimension of the representations of user nodes $v_i$ and $v_j$; the first-order direct proximity similarity $S_1(v_i, v_j)$ of user nodes $v_i$ and $v_j$ is calculated from these row vectors;
s22: the neighborhood similarity $S_2(v_i, v_j)$ of user nodes $v_i$ and $v_j$ is calculated, wherein $n_i$ and $n_j$ denote the numbers of neighbor nodes of user nodes $v_i$ and $v_j$; $n_{ij}$ denotes the number of common neighbors of user node $v_i$ and user node $v_j$; $e_{ij}$ denotes the actual number of edges in the sub-network formed by node $v_i$, node $v_j$ and their common neighbors; $N$ denotes the total number of nodes in that sub-network, and $\frac{N(N-1)}{2}$ denotes the maximum number of edges that can theoretically be formed in the sub-network; $\rho_{ij}$ denotes the network density value obtained in the sub-network;
s23: maximum-value normalization is applied to $S_1$ and $S_2$ respectively, and the normalized values are weighted and fused to obtain the node comprehensive similarity, the comprehensive similarity calculation formula being: $S(v_i, v_j) = \alpha \cdot \frac{S_1(v_i, v_j)}{\max(S_1)} + (1 - \alpha) \cdot \frac{S_2(v_i, v_j)}{\max(S_2)}$, wherein $\alpha$ denotes the weight, with a value range of 0 to 1.
2. The water army group identification method based on social network cooperative forwarding behavior according to claim 1, wherein the step one includes the steps of:
s11: for the social network event to be studied, setting a keyword group and a time range of the event, and performing data retrieval and matching using an API and a crawler tool based on the keyword group and the time range to obtain data related to the social network event; after the data is cleaned and preprocessed, data whose forwarded source article identifier values are equal are placed in the same group, and within each group the data are sorted in ascending order of posting time; if the difference between the posting times of users $v_i$ and $v_j$ is less than $\Delta t$ seconds, the pair of users forwarded the same article source within a very short time interval, and an edge is constructed between users $v_i$ and $v_j$; when users $v_i$ and $v_j$ cooperatively forward multiple times, the number of cooperations is accumulated, and the accumulated number of cooperative forwardings is the link weight $w_{ij}$ of the network;
3. The water army group identification method based on social network cooperative forwarding behavior according to claim 1, wherein the third step comprises the following steps:
s31: for any two groups $C_p$ and $C_q$, the similarity of the two groups is measured by the maximum comprehensive similarity over all node pairs drawn from the two groups, the group similarity formula being: $Sim(C_p, C_q) = \max_{v_i \in C_p,\, v_j \in C_q} S(v_i, v_j)$,
wherein $v_i$ is any node in group $C_p$ and $v_j$ is any node in group $C_q$;
s32: the network is divided into groups using a hierarchical division method; assuming there are $n$ nodes to be divided into groups, each node is initially regarded as a group; the two most similar groups are then merged into one group using the group similarity formula, yielding $n-1$ groups in total; merging of the two most similar groups then continues until all nodes are merged into one group;
the whole division process is expressed as a hierarchical tree diagram, each layer representing one group division result of the relationship network; the modularity value corresponding to each group division result is computed by traversal, and the group division result with the largest modularity $Q$ is taken as the optimal division result of the network, the modularity $Q$ being calculated as: $Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$; $\delta(c_i, c_j) = \begin{cases} 1, & c_i = c_j \\ 0, & c_i \neq c_j \end{cases}$;
wherein $m$ represents the sum of all cooperative forwarding weights in the network; $A_{ij}$ represents the cooperative forwarding weight between node $v_i$ and node $v_j$; $k_i$ represents the degree of $v_i$, i.e., the sum of the cooperative forwarding weights of all edges incident to node $v_i$; $c_i$ represents the group to which node $v_i$ belongs; if users $v_i$ and $v_j$ belong to the same group, $\delta(c_i, c_j)$ takes the value 1, otherwise it takes the value 0;
after the water army group division result is obtained, the water army group leaders are identified using a majority voting method.
4. The water army group identification device based on the social network cooperative forwarding behavior is characterized by comprising the following modules:
a cooperative relationship network module, configured to analyze, based on a social network event, whether each pair of users forwards the same article source within a very short time interval, wherein if the number of occurrences of this behavior reaches a preset threshold, a cooperative forwarding relationship exists between the pair of users, and to extract the cooperative forwarding relationships of all user pairs in the social network event and construct a global cooperative relationship network;
a comprehensive similarity calculation module, configured to perform weighted fusion of the first-order direct proximity similarity and the neighborhood similarity based on the global cooperative relationship network to obtain the node comprehensive similarity;
a group division module, configured to divide the cooperative relationship network into groups using a hierarchical division method based on the node comprehensive similarity to obtain all water army groups participating in the social network event;
the comprehensive similarity calculation module comprises:
a proximity similarity calculation module configured to,
after the cooperative relationship network $G$ is constructed, define the adjacency matrix of $G$ as $A$, and calculate the degree matrix $D$: $D_{ii} = \sum_{j} A_{ij}$;
according to the spectral theorem, perform eigenvalue decomposition on the normalized Laplacian matrix $L$: $L = U \Lambda U^{T}$,
wherein $\Lambda$ is the diagonal matrix of eigenvalues and $U$ is the eigenvector matrix;
the eigenvalues are sorted in ascending order, the first $d$ eigenvalues are extracted, and the eigenvectors corresponding to these $d$ eigenvalues are calculated; the $d$ eigenvectors form a matrix $Y \in \mathbb{R}^{n \times d}$,
wherein $n$ is the number of user nodes and $d$ is the dimension of the vectors; $y_i$ denotes the $d$-dimensional representation vector of user node $v_i$;
for $Y$, let $y_i$ and $y_j$ denote the $i$-th and $j$-th row vectors of $Y$ respectively, and let $y_{ik}$ and $y_{jk}$ denote the $k$-th dimension of the representations of user nodes $v_i$ and $v_j$; the first-order direct proximity similarity $S_1(v_i, v_j)$ of user nodes $v_i$ and $v_j$ is calculated from these row vectors;
a neighborhood similarity calculation module configured to calculate the neighborhood similarity $S_2(v_i, v_j)$ of user nodes $v_i$ and $v_j$,
wherein $n_i$ and $n_j$ denote the numbers of neighbor nodes of user nodes $v_i$ and $v_j$; $n_{ij}$ denotes the number of common neighbors of user node $v_i$ and user node $v_j$; $e_{ij}$ denotes the actual number of edges in the sub-network formed by node $v_i$, node $v_j$ and their common neighbors; $N$ denotes the total number of nodes in that sub-network, and $\frac{N(N-1)}{2}$ denotes the maximum number of edges that can theoretically be formed in the sub-network; $\rho_{ij}$ denotes the network density value obtained in the sub-network;
a comprehensive similarity calculation module configured to,
apply maximum-value normalization to $S_1$ and $S_2$ respectively, and perform weighted fusion of the normalized values to obtain the node comprehensive similarity, the comprehensive similarity calculation formula being: $S(v_i, v_j) = \alpha \cdot \frac{S_1(v_i, v_j)}{\max(S_1)} + (1 - \alpha) \cdot \frac{S_2(v_i, v_j)}{\max(S_2)}$, wherein $\alpha$ denotes the weight, with a value range of 0 to 1.
5. The water army group identification device based on social network cooperative forwarding behavior of claim 4, wherein the cooperative relationship network module comprises:
a cooperative forwarding relationship module, configured to set, for the social network event to be studied, a keyword group and a time range of the event, and to perform data retrieval and matching using an API and a crawler tool based on the keyword group and the time range to obtain data related to the social network event; after the data is cleaned and preprocessed, data whose forwarded source article identifier values are equal are placed in the same group, and within each group the data are sorted in ascending order of posting time; if the difference between the posting times of users $v_i$ and $v_j$ is less than $\Delta t$ seconds, the pair of users forwarded the same article source within a very short time interval, and an edge is constructed between users $v_i$ and $v_j$; when users $v_i$ and $v_j$ cooperatively forward multiple times, the number of cooperations is accumulated, and the accumulated number of cooperative forwardings is the link weight $w_{ij}$ of the network;
6. The water army group identification device based on social network cooperative forwarding behavior of claim 4, wherein the group partitioning module comprises:
a group similarity calculation module configured to,
for any two groups $C_p$ and $C_q$, measure the similarity of the two groups by the maximum comprehensive similarity over all node pairs drawn from the two groups, the group similarity formula being: $Sim(C_p, C_q) = \max_{v_i \in C_p,\, v_j \in C_q} S(v_i, v_j)$,
wherein $v_i$ is any node in group $C_p$ and $v_j$ is any node in group $C_q$;
a hierarchy-based division module configured to,
divide the network into groups using a hierarchical division method; assuming there are $n$ nodes to be divided into groups, each node is initially regarded as a group; the two most similar groups are then merged into one group using the group similarity formula, yielding $n-1$ groups in total; merging of the two most similar groups then continues until all nodes are merged into one group;
the whole division process is expressed as a hierarchical tree diagram, each layer representing one group division result of the relationship network; the modularity value corresponding to each group division result is computed by traversal, and the group division result with the largest modularity $Q$ is taken as the optimal division result of the network, the modularity $Q$ being calculated as: $Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$; $\delta(c_i, c_j) = \begin{cases} 1, & c_i = c_j \\ 0, & c_i \neq c_j \end{cases}$;
wherein $m$ represents the sum of all cooperative forwarding weights in the network; $A_{ij}$ represents the cooperative forwarding weight between node $v_i$ and node $v_j$; $k_i$ represents the degree of $v_i$, i.e., the sum of the cooperative forwarding weights of all edges incident to node $v_i$; $c_i$ represents the group to which node $v_i$ belongs; if users $v_i$ and $v_j$ belong to the same group, $\delta(c_i, c_j)$ takes the value 1, otherwise it takes the value 0;
after the water army group division result is obtained, the water army group leaders are identified using a majority voting method.
7. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 3 when the computer program is executed.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310349637.0A CN116150507B (en) | 2023-04-04 | 2023-04-04 | Water army group identification method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116150507A (en) | 2023-05-23 |
CN116150507B (en) | 2023-06-30 |
Family
ID=86362081
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310349637.0A Active CN116150507B (en) | 2023-04-04 | 2023-04-04 | Water army group identification method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116150507B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112052404A (en) * | 2020-09-23 | 2020-12-08 | 西安交通大学 | Group discovery method, system, device and medium for multi-source heterogeneous relation network |
CN112800304A (en) * | 2021-01-08 | 2021-05-14 | 上海海事大学 | Microblog water army group detection method based on clustering |
CN113157993A (en) * | 2021-02-08 | 2021-07-23 | 电子科技大学 | Network water army behavior early warning model based on time sequence graph polarization analysis |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7401294B2 (en) * | 2003-11-24 | 2008-07-15 | International Business Machines Corporation | Method and system for collaborative web browsing |
US8600901B2 (en) * | 2010-07-27 | 2013-12-03 | Yahoo! Inc. | Providing social likeness within a messaging context |
CN107092667B (en) * | 2017-04-07 | 2018-02-27 | 平安科技(深圳)有限公司 | Group's lookup method and device based on social networks |
CN106940732A (en) * | 2016-05-30 | 2017-07-11 | 国家计算机网络与信息安全管理中心 | A kind of doubtful waterborne troops towards microblogging finds method |
US11423082B2 (en) * | 2016-06-29 | 2022-08-23 | Intel Corporation | Methods and apparatus for subgraph matching in big data analysis |
US11032293B2 (en) * | 2018-02-10 | 2021-06-08 | SmartAxiom, Inc. | System and method for managing and securing a distributed ledger for a decentralized peer-to-peer network |
CN113627960A (en) * | 2020-05-06 | 2021-11-09 | 山东科技大学 | Water army group detection method and device |
Also Published As
Publication number | Publication date |
---|---|
CN116150507A (en) | 2023-05-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||