CN112184468A - Dynamic social relationship network link prediction method and device based on spatio-temporal relationship - Google Patents

Info

Publication number
CN112184468A
Authority
CN
China
Prior art keywords
network
social relationship
dynamic social
dynamic
spatio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011047469.2A
Other languages
Chinese (zh)
Inventor
江逸楠
刘家琛
王亚珅
陈诚
吉祥
张雪莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronic Science Research Institute of CTEC
Original Assignee
Electronic Science Research Institute of CTEC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronic Science Research Institute of CTEC
Priority to CN202011047469.2A
Publication of CN112184468A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for predicting a dynamic social relationship network link based on a spatio-temporal relationship, wherein the method comprises the following steps: acquiring dynamic social relationship data, and preprocessing the dynamic social relationship data to generate a sample set; constructing a weighted similarity characteristic time sequence for any node pair in the sample set; calculating the characteristic value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence to construct a characteristic matrix; and inputting the characteristic matrix into a pre-trained classification model, and outputting possible links of the dynamic social relationship network at the moment to be predicted. The invention establishes the characteristic time sequence of the dynamic network on the basis of the network topological structure characteristics and the link generation time sequence information, and expands the prediction method from a static network to a dynamic time-varying network. Moreover, the weight is introduced into the link prediction problem, the network structure characteristics and the node link characteristics are fused, and the accuracy of the prediction result is improved by combining a statistical model and a supervised learning method.

Description

Dynamic social relationship network link prediction method and device based on spatio-temporal relationship
Technical Field
The invention relates to the technical field of machine learning, in particular to a method and a device for predicting a dynamic social relationship network link based on a spatiotemporal relationship.
Background
The research on the social relationship network is often modeled by adopting the idea of network science, namely, network nodes are used for representing individuals in the social network, and connection edges/links between the nodes are used for representing the relationships between the individuals. The link prediction problem of the social relationship network mainly carries out mining and prediction around the relationship between individuals, and is one of the basic problems of the social relationship network research. Meanwhile, a remarkable characteristic of the social relationship network is that the social relationship network has high dynamic performance, namely, the scale (the number of nodes/links), the structure and the interaction behavior among the nodes of the network are constantly changed. Therefore, the dynamic network link prediction problem considering the time-space characteristics and the frequency characteristics of node interaction has important practical application value.
One common idea of dynamic network link prediction is to introduce a time series prediction model in a method based on static network topology features, such as calculating a node similarity score in each time period by using the structure information of the network, and then calculating a future similarity score by using an autoregressive integrated moving average model (ARIMA) as a prediction model and performing final link prediction. However, considering that the similarity indexes are numerous, how to design a good similarity evaluation function is a difficult point of the method. The method based on machine learning introduces a classic classification algorithm to predict by regarding a link prediction problem as a binary classification problem, and has obtained a better result on static link prediction, but for a dynamic network, network spatio-temporal characteristics and weight characteristics need to be better considered.
Disclosure of Invention
The invention provides a method and a device for predicting a dynamic social relationship network link, aiming at solving the technical problem of improving the accuracy of dynamic social relationship network prediction.
The prediction method of the dynamic social relationship network link based on the spatio-temporal relationship comprises the following steps:
acquiring dynamic social relationship data, and preprocessing the dynamic social relationship data to generate a sample set;
constructing a weighted similarity characteristic time sequence for any node pair in the sample set;
calculating the characteristic value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence to construct a characteristic matrix;
and inputting the characteristic matrix into a pre-trained classification model, and outputting possible links of the dynamic social relationship network at the moment to be predicted.
According to the prediction method of the dynamic social relationship network link based on the spatio-temporal relationship, provided by the embodiment of the invention, the characteristic time sequence of the dynamic network is established on the basis of the network topological structure characteristics and link generation time sequence information, so that the application range of the prediction method is expanded from a static network to a dynamic time-varying network. The invention introduces the weight into the link prediction problem, and better reflects the practical characteristics of the network. The invention integrates the network structure characteristics and the node link characteristics, and combines the statistical model and the supervised learning method, thereby being more suitable for the actual situation and having better prediction effect, and improving the accuracy of the prediction result.
According to some embodiments of the invention, preprocessing the dynamic social relationship data comprises: dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval.
In some embodiments of the invention, preprocessing the dynamic social relationship data comprises: assigning a corresponding weight to each node pair based on the link relation of each node pair.
According to some embodiments of the invention, the pre-trained classification model is trained using a random forest or support vector machine algorithm.
In some embodiments of the invention, the predicted dynamic social relationship network is evaluated using an AUC evaluation metric.
The prediction device of the dynamic social relationship network link based on the spatio-temporal relationship comprises:
the data processing module is used for acquiring dynamic social relationship data and preprocessing the dynamic social relationship data to generate a sample set;
the characteristic time sequence construction module is used for constructing a weighted similarity characteristic time sequence for any node pair in the sample set;
the computing module is used for computing the characteristic value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence so as to construct a characteristic matrix;
and the classification prediction module is used for inputting the characteristic matrix into a pre-trained classification model and outputting possible links of the dynamic social relationship network at the moment to be predicted.
According to the prediction device of the dynamic social relationship network link based on the spatio-temporal relationship, provided by the embodiment of the invention, the characteristic time sequence of the dynamic network is established on the basis of the network topological structure characteristics and link generation time sequence information, so that the application range of the prediction method is expanded from a static network to a dynamic time-varying network. The real characteristics of the network are better reflected by introducing the weight into the link prediction problem. The invention integrates the network structure characteristics and the node link characteristics, and combines the statistical model and the supervised learning method, thereby being more suitable for the actual situation and having better prediction effect, and improving the accuracy of the prediction result.
According to some embodiments of the invention, the data processing module comprises: a dividing module, used for dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval.
In some embodiments of the invention, the data processing module comprises: a weight assignment module, used for assigning a corresponding weight to each node pair based on the link relation of each node pair.
According to some embodiments of the invention, the pre-trained classification model is trained using a random forest or support vector machine algorithm.
In some embodiments of the invention, the apparatus further comprises:
a result evaluation module, used for evaluating the predicted dynamic social relationship network by adopting an AUC evaluation index.
Drawings
FIG. 1 is a block diagram of a method for predicting links of a dynamic social relationship network based on spatiotemporal relationships, according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for predicting links of a dynamic social relationship network based on spatiotemporal relationships, according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a dynamic network model based on time slice division according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a prediction device for dynamic social relationship network links according to an embodiment of the present invention.
Reference numerals:
prediction device 100,
data processing module 10, feature time series construction module 20, calculation module 30, classification prediction module 40.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the intended purpose, the present invention will be described in detail with reference to the accompanying drawings and preferred embodiments.
The prediction method of the dynamic social relationship network link based on the spatio-temporal relationship comprises the following steps:
S100, acquiring dynamic social relationship data, and preprocessing the data to generate a sample set;
It should be noted that preprocessing the dynamic social relationship data may include dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval. For example, based on the time information known in the dynamic network, the entire time period may be divided into $n$ time slices, each of length $(t_1 - t_0)/n$. That is, the interactions of the $i$-th time slice ($1 \le i \le n$) occur within $[t_0,\ t_0 + i\,(t_1 - t_0)/n]$. If $S$ denotes the entire network sequence, $G_t$ the network at time $t$, and $T$ the time span of the entire network, then $S$ may be represented as $S = \{G_1, G_2, \ldots, G_t, \ldots, G_T\}$.
Preprocessing the dynamic social relationship data may further include assigning a corresponding weight to each node pair based on the link relation of each node pair. Because components in the network differ in importance, each node pair is given an appropriate weight $w_{(u,v)}$ in order to measure the influence of different edges on node-pair similarity.
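The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. It assumes that interactions arrive as hypothetical `(u, v, timestamp)` tuples, that the slices are disjoint and of equal length, and that the weight of a node pair is its interaction count within the slice; the text does not fix any of these choices.

```python
from collections import defaultdict

def split_into_snapshots(interactions, t0, t1, n):
    """Split (u, v, timestamp) interactions into n equal-length time slices.

    Returns a list of n dicts, each mapping an undirected node pair (u, v)
    with u < v to its weight. The weight is assumed here to be the number
    of interactions observed for that pair inside the slice.
    """
    width = (t1 - t0) / n
    snapshots = [defaultdict(int) for _ in range(n)]
    for u, v, ts in interactions:
        if not (t0 <= ts <= t1) or u == v:
            continue  # ignore out-of-range events and self-loops
        # Slice index; an event exactly at t1 is folded into the last slice.
        i = min(int((ts - t0) / width), n - 1)
        pair = (u, v) if u < v else (v, u)
        snapshots[i][pair] += 1
    return [dict(s) for s in snapshots]

# Hypothetical toy data: four interactions over the period [0, 10].
events = [("a", "b", 0.0), ("b", "a", 1.0), ("a", "c", 4.5), ("b", "c", 10.0)]
snaps = split_into_snapshots(events, t0=0.0, t1=10.0, n=2)
```

With two slices, the first sub-network holds the pairs `("a", "b")` and `("a", "c")` and the second holds `("b", "c")`. Other weighting schemes, such as interaction duration, would fit the same structure.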
S200, constructing a weighted similarity characteristic time sequence for any node pair in the sample set;
For example, Weighted Common Neighbors (WCN), Weighted Adamic-Adar (WAA), Weighted Resource Allocation (WRA), Weighted Preferential Attachment (WPA), and Weighted Jaccard's Coefficient (WJC) may be chosen as the manually extracted network node pair features.
On the basis, for any node pair, the weighted similarity characteristics of the node pair on each time slice are calculated to form a characteristic time sequence.
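The text names the five indices but does not give their formulas. The sketch below uses one commonly cited weighted formulation and should be read as an assumption: each common neighbor `z` contributes `w(u,z) + w(v,z)`, node strength is the sum of incident weights, and WJC normalizes WCN by the combined strength of the two nodes. The function names and the adjacency-dict layout are hypothetical.

```python
import math

def strength(graph, x):
    """Sum of the weights of all edges incident to node x."""
    return sum(graph.get(x, {}).values())

def weighted_similarity(graph, u, v):
    """Return WCN, WAA, WRA, WPA and WJC for the node pair (u, v).

    graph: dict mapping node -> {neighbor: positive weight}, undirected.
    The formulas are assumed variants, not taken from the patent text.
    """
    nu, nv = graph.get(u, {}), graph.get(v, {})
    common = set(nu) & set(nv)
    wcn = sum(nu[z] + nv[z] for z in common)
    waa = sum((nu[z] + nv[z]) / math.log(1 + strength(graph, z)) for z in common)
    wra = sum((nu[z] + nv[z]) / strength(graph, z) for z in common)
    su, sv = strength(graph, u), strength(graph, v)
    wpa = su * sv
    wjc = wcn / (su + sv) if su + sv > 0 else 0.0
    return {"WCN": wcn, "WAA": waa, "WRA": wra, "WPA": wpa, "WJC": wjc}

# Hypothetical weighted sub-network: a-c (2), b-c (1), a-d (1).
graph = {
    "a": {"c": 2, "d": 1},
    "b": {"c": 1},
    "c": {"a": 2, "b": 1},
    "d": {"a": 1},
}
feats = weighted_similarity(graph, "a", "b")
```

Calling this function on every sub-network in turn yields, for each node pair, one value per index per time slice, which is the feature time series the step refers to.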
S300, calculating the feature value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence to construct a characteristic matrix;
It should be noted that, after the time sequence of the network is established according to the time information, the evolution process of the network and the trend of node-pair similarity can be obtained by observing the changes between adjacent sub-networks in the time sequence. In this way the time information of the network is exploited and finally applied to link prediction in the dynamic network.
S400, inputting the characteristic matrix into a pre-trained classification model, and outputting possible links of the dynamic social relationship network at the moment to be predicted.
According to the prediction method of the dynamic social relationship network link based on the spatio-temporal relationship, provided by the embodiment of the invention, the characteristic time sequence of the dynamic network is established on the basis of the network topological structure characteristics and link generation time sequence information, so that the application range of the prediction method is expanded from a static network to a dynamic time-varying network. The invention introduces the weight into the link prediction problem, and better reflects the practical characteristics of the network. The invention integrates the network structure characteristics and the node link characteristics, and combines the statistical model and the supervised learning method, thereby being more suitable for the actual situation and having better prediction effect, and improving the accuracy of the prediction result.
According to some embodiments of the invention, the pre-trained classification model is trained using a random forest or support vector machine algorithm. The random forest classification model is composed of a series of decision trees: m samples are repeatedly drawn at random, with replacement, from the original training sample set to generate a new training sample set, and k classification trees are then generated from the bootstrap sample sets to form a random forest. The support vector machine maps the vectors into a higher-dimensional space in which a maximum-margin hyperplane is constructed. Two mutually parallel hyperplanes are built on either side of the hyperplane that separates the data, and the separating hyperplane is chosen to maximize the distance between these two parallel hyperplanes. The training set is used to train these two supervised learning algorithms, which then predict the dynamic relationships of the nodes in the test set.
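The bootstrap step of the random forest (drawing m samples with replacement, once per tree) can be illustrated as follows. This is only a sketch of the resampling step, with hypothetical function names and toy data; it does not grow the classification trees themselves.

```python
import random

def bootstrap_sample(samples, m, rng):
    """Draw m samples uniformly at random with replacement."""
    return [samples[rng.randrange(len(samples))] for _ in range(m)]

def bootstrap_sets(samples, m, k, seed=0):
    """Build k bootstrap training sets of size m, one per classification tree."""
    rng = random.Random(seed)
    return [bootstrap_sample(samples, m, rng) for _ in range(k)]

# Hypothetical labeled node pairs: (identifier, label).
training = [("pair%d" % i, i % 2) for i in range(10)]
sets = bootstrap_sets(training, m=10, k=3)
```

Because sampling is with replacement, a given bootstrap set usually repeats some node pairs and omits others, which is what makes the k trees differ from one another.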
It should be noted that the present invention is not limited to the above two algorithms; other machine learning algorithms, such as a Multi-Layer Perceptron, can also be used for classification. This algorithm learns features that are more useful for the task through its hierarchical structure of an input layer, hidden layers and an output layer.
In some embodiments of the invention, the predicted dynamic social relationship network is evaluated using the AUC evaluation metric. AUC may be understood as the probability that the score of a randomly selected edge in the test set is higher than the score of a randomly selected non-existent edge. In n independent comparisons, a missing link (an edge in the test set) and a non-existent link are each selected at random and their similarity scores are compared. If the score of the test-set edge is greater than that of the non-existent edge n' times, and the two scores are equal n'' times, then the AUC is defined as follows:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
By adopting this evaluation index, the prediction accuracy of the method can be evaluated, and the influence of different parameter choices, such as the time interval, on the prediction result can be further analyzed.
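The sampled AUC described above, in its usual form AUC = (n' + 0.5 n'')/n, can be estimated with a short routine. The sketch assumes the scores of test-set edges and of non-existent edges are already available as two lists; the function name and the fixed random seed are illustrative choices.

```python
import random

def sampled_auc(test_scores, nonexistent_scores, n, seed=0):
    """Estimate AUC from n independent random comparisons.

    Each comparison draws one test-set edge score and one non-existent
    edge score; wins count 1 and ties count 0.5.
    """
    rng = random.Random(seed)
    higher = ties = 0
    for _ in range(n):
        s_test = rng.choice(test_scores)
        s_none = rng.choice(nonexistent_scores)
        if s_test > s_none:
            higher += 1
        elif s_test == s_none:
            ties += 1
    return (higher + 0.5 * ties) / n

perfect = sampled_auc([0.9, 0.8], [0.1, 0.2], n=100)
chance = sampled_auc([0.5], [0.5], n=50)
```

A value near 0.5 means the scores are no better than random guessing, and a value near 1 means test-set edges are almost always ranked above non-existent ones.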
As shown in fig. 4, the apparatus 100 for predicting spatio-temporal relationship-based dynamic social relationship network links according to an embodiment of the present invention includes: a data processing module 10, a feature time series construction module 20, a calculation module 30 and a classification prediction module 40.
The data processing module 10 is configured to obtain dynamic social relationship data, and perform preprocessing to generate a sample set. Specifically, the data processing module 10 may include: the device comprises a dividing module and a weight assignment module.
The dividing module is used for dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval. For example, based on the time information known in the dynamic network, the dividing module may divide the entire time period into $n$ time slices, each of length $(t_1 - t_0)/n$. That is, the interactions of the $i$-th time slice ($1 \le i \le n$) occur within $[t_0,\ t_0 + i\,(t_1 - t_0)/n]$. If $S$ denotes the entire network sequence, $G_t$ the network at time $t$, and $T$ the time span of the entire network, then $S$ may be represented as $S = \{G_1, G_2, \ldots, G_t, \ldots, G_T\}$.
The weight assignment module is used for assigning a corresponding weight to each node pair based on the link relation of each node pair. Because components in the network differ in importance, the weight assignment module can assign an appropriate weight $w_{(u,v)}$ to each node pair in order to measure the influence of different edges on node-pair similarity.
The feature time sequence construction module 20 is configured to construct a weighted similarity feature time sequence for any node pair in the sample set.
For example, Weighted Common Neighbors (WCN), Weighted Adamic-Adar (WAA), Weighted Resource Allocation (WRA), Weighted Preferential Attachment (WPA), and Weighted Jaccard's Coefficient (WJC) may be chosen as the manually extracted network node pair features.
On this basis, for any node pair, the feature time series construction module 20 calculates the weighted similarity feature of the node pair on each time slice to form a feature time series.
The calculation module 30 is configured to calculate a feature value of any node pair at a time to be predicted by using a preset algorithm based on the weighted similarity feature time sequence to construct a feature matrix.
It should be noted that, after the time sequence of the network is established according to the time information, the evolution process of the network and the trend of node-pair similarity can be obtained by observing the changes between adjacent sub-networks in the time sequence. In this way the time information of the network is exploited and finally applied to link prediction in the dynamic network.
The classification prediction module 40 is configured to input the feature matrix into a classification model trained in advance, and output a possible link of the dynamic social relationship network at a time to be predicted.
According to the prediction device 100 of the dynamic social relationship network link based on the spatio-temporal relationship, provided by the embodiment of the invention, the characteristic time sequence of the dynamic network is established on the basis of the network topological structure characteristics and the link generation time sequence information, so that the application range of the prediction method is expanded from a static network to a dynamic time-varying network. The real characteristics of the network are better reflected by introducing the weight into the link prediction problem. The invention integrates the network structure characteristics and the node link characteristics, and combines the statistical model and the supervised learning method, thereby being more suitable for the actual situation and having better prediction effect, and improving the accuracy of the prediction result.
According to some embodiments of the invention, the pre-trained classification model is trained using a random forest or support vector machine algorithm. The random forest classification model is composed of a series of decision trees: m samples are repeatedly drawn at random, with replacement, from the original training sample set to generate a new training sample set, and k classification trees are then generated from the bootstrap sample sets to form a random forest. The support vector machine maps the vectors into a higher-dimensional space in which a maximum-margin hyperplane is constructed. Two mutually parallel hyperplanes are built on either side of the hyperplane that separates the data, and the separating hyperplane is chosen to maximize the distance between these two parallel hyperplanes. The training set is used to train these two supervised learning algorithms, which then predict the dynamic relationships of the nodes in the test set.
In some embodiments of the invention, the apparatus further comprises a result evaluation module, used for evaluating the predicted dynamic social relationship network by adopting the AUC evaluation index. AUC may be understood as the probability that the score of a randomly selected edge in the test set is higher than the score of a randomly selected non-existent edge. In n independent comparisons, a missing link (an edge in the test set) and a non-existent link are each selected at random and their similarity scores are compared. If the score of the test-set edge is greater than that of the non-existent edge n' times, and the two scores are equal n'' times, then the AUC is defined as follows:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
By adopting this evaluation index, the prediction accuracy of the method can be evaluated, and the influence of different parameter choices, such as the time interval, on the prediction result can be further analyzed.
The method and apparatus for predicting links of a dynamic social relationship network based on spatiotemporal relationships according to the present invention will be described in detail with reference to the accompanying drawings in a specific embodiment. It is to be understood that the following description is only exemplary, and not a specific limitation of the invention.
Aiming at the defects of the prior art, the invention aims to design a link prediction method which can combine the dynamic time sequence characteristics of the network with the learning characteristic representation and improve the accuracy of the link prediction of the dynamic weighting network.
In order to achieve the above objects and other related objects, the present invention provides a dynamic social relationship network link prediction method and apparatus based on machine learning and spatio-temporal relationships. Fig. 2 is a main flow chart of the prediction method of the present invention. As shown in fig. 2, the prediction method of the present invention includes the following steps:
S1, preprocessing the raw data to generate a sample set.
Suppose a dynamic network records the interactions among all components from time $t_0$ to time $t_1$. The components and their interactions are abstracted into an undirected network $G(V, E)$, where $V$ represents the set of components in the network and $E$ represents the set of edges along which an interaction exists. Here $u, v \in V$ denote component nodes in the network, and $(u, v) \in E$ denotes an edge in the network. In addition, because components in the network differ in importance, each node pair is assigned a weight $w_{(u,v)}$ in order to measure the influence of different edges on node-pair similarity.
In order to take into account the evolution information of the network, the network is divided into a plurality of sub-networks by time series, where each sub-network can be considered static. As shown in FIG. 3, $G_1$ to $G_3$ are the sub-network states of the network on the left at different times, and the connections between the nodes change constantly. Based on the known time information in the dynamic network, the whole time period is divided into $n$ time slices, each of length $(t_1 - t_0)/n$. That is, the interactions of the $i$-th time slice ($1 \le i \le n$) occur within $[t_0,\ t_0 + i\,(t_1 - t_0)/n]$. If $S$ denotes the entire network sequence, $G_t$ the network at time $t$, and $T$ the time span of the entire network, then $S$ may be represented as $S = \{G_1, G_2, \ldots, G_t, \ldots, G_T\}$.
S2, constructing a node pair weighted similarity characteristic time sequence.
First, weighted node pair features are extracted based on similarity indices of the local network structure. The method selects Weighted Common Neighbors (WCN), Weighted Adamic-Adar (WAA), Weighted Resource Allocation (WRA), Weighted Preferential Attachment (WPA) and Weighted Jaccard's Coefficient (WJC) as the manually extracted network node pair features.
On the basis, for any node pair, the weighted similarity characteristics of the node pair on each time slice are calculated to form a characteristic time sequence.
And S3, constructing a feature matrix.
After the time sequence of the network is established according to the time information, the evolution process of the network and the change trend of the node to the similarity can be obtained by observing the change condition between the adjacent sub-networks in the time sequence of the network, so that the time information of the network is utilized and finally applied to the link prediction of the dynamic network. The invention adopts a moving average model, the model averages n sub-network characteristics nearest to the t moment, and the model can be expressed as:
Figure BDA0002708459000000091
when n is T-1, the model evolves into an ensemble average model, and the expression of the model is as follows:
Figure BDA0002708459000000092
The output of the model is the average similarity over the most recent n elements of the observed sequence. This final similarity is used as the feature value of the node pair, from which the feature matrix is constructed.
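As a rough sketch of the moving average step: each feature value of a node pair is taken to be the mean of its last n observations in the feature time series. The function names and the dict-of-lists input format are assumptions for illustration, and the handling of series shorter than n (average whatever is available) is a choice made here, not something the text specifies.

```python
def moving_average(series, n):
    """Mean of the most recent n observations of one feature time series."""
    if not series:
        raise ValueError("series must not be empty")
    if n < 1:
        raise ValueError("n must be at least 1")
    window = series[-n:]  # uses all values if fewer than n are available
    return sum(window) / len(window)


def feature_vector(series_by_feature, n):
    """One row of the feature matrix: a moving average per feature.

    Features are ordered by name so that every row has the same layout.
    """
    return [moving_average(series_by_feature[name], n)
            for name in sorted(series_by_feature)]
```

Setting n to the full length of the history gives the overall average as a special case; stacking one `feature_vector` per node pair gives the feature matrix.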
S4, constructing a training set and a test set.
The task of dynamic link prediction is to use the network at historical times to predict links that are likely to be newly added at a future time. For the network time series $S = \{G_1, G_2, \ldots, G_t, \ldots, G_T\}$, the first n-1 sub-networks are taken as the training set and the n-th sub-network as the test set. Labels are assigned to the training set according to the link states.
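One possible way to realize this split is sketched below. The text does not say how labels are derived inside the training set, so this example assumes that a pair is labeled 1 if it is linked in the snapshot immediately following its history and 0 otherwise; `label_pairs` and `train_test_split_by_time` are hypothetical names, and each snapshot is assumed to be a collection of node pairs.

```python
from itertools import combinations


def label_pairs(history, target):
    """Label every pair of nodes seen in `history` by its state in `target`.

    Returns {(u, v): 1 if the pair is linked in `target` else 0}.
    """
    nodes = sorted({u for snap in history for pair in snap for u in pair})
    target_links = {tuple(sorted(p)) for p in target}
    return {pair: int(pair in target_links) for pair in combinations(nodes, 2)}


def train_test_split_by_time(snapshots):
    """Training labels come from the second-to-last snapshot, test labels
    from the last one, so no future information leaks into training."""
    if len(snapshots) < 3:
        raise ValueError("need at least three snapshots")
    train_labels = label_pairs(snapshots[:-2], snapshots[-2])
    test_labels = label_pairs(snapshots[:-1], snapshots[-1])
    return train_labels, test_labels
```

Features for each labeled pair would be computed only from the corresponding history, keeping the split strictly chronological.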
S5, performing link prediction with a machine learning classification algorithm.
The invention adopts two models, a random forest and a support vector machine, as classification algorithms. The random forest classification model consists of a series of decision trees: m samples are repeatedly drawn at random, with replacement, from the original training sample set to generate new training sample sets, and k classification trees are then grown from these bootstrap sample sets to form the random forest. The support vector machine maps the vectors into a higher-dimensional space in which a maximum-margin hyperplane is constructed. Two mutually parallel hyperplanes are built on either side of the hyperplane that separates the data, and the separating hyperplane is chosen to maximize the distance between these two parallel hyperplanes. The two supervised learning algorithms are trained on the training set and then used to predict the dynamic relationships between nodes in the test set.
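The bootstrap-and-vote idea behind the random forest can be shown with a deliberately simplified sketch. It is not a full random forest: single-threshold decision stumps stand in for classification trees, there is no random feature selection, and the support vector machine is not illustrated. All names are hypothetical, and in practice an established library implementation would normally be used.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the one-feature threshold rule with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for hi_label in (0, 1):
                pred = [hi_label if row[j] >= thr else 1 - hi_label for row in X]
                err = sum(p != t for p, t in zip(pred, y))
                if best is None or err < best[0]:
                    best = (err, j, thr, hi_label)
    return best[1:]  # (feature index, threshold, label at or above threshold)


def predict_stump(stump, row):
    j, thr, hi_label = stump
    return hi_label if row[j] >= thr else 1 - hi_label


def fit_bagged_stumps(X, y, k, m, seed=0):
    """Train k stumps, each on m samples drawn with replacement."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(k):
        idx = [rng.randrange(len(X)) for _ in range(m)]  # bootstrap sample
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagged(stumps, row):
    """Majority vote over all stumps (use odd k to avoid ties)."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]
```

Here `X` would hold the node-pair feature vectors and `y` the 0/1 link labels; the vote of the k learners gives the predicted link state.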
S6, evaluating the prediction result.
To evaluate the accuracy of the prediction result, the invention uses AUC (Area Under the Receiver Operating Characteristic Curve) as the evaluation index. AUC can be understood as the probability that the score of a randomly selected edge in the test set is higher than the score of a randomly selected non-existent edge. In each of n independent comparisons, one edge from the test set (a missing link) and one non-existent edge are selected at random and their similarity scores are compared. If the test-set edge has the higher score n' times and the two scores are equal n'' times, then the AUC is defined as follows:
$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}$$
Using this evaluation index, the prediction accuracy of the method can be assessed, and the influence of parameter choices, such as the time interval, on the prediction result can be further analyzed.
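The sampling procedure for AUC can be sketched as follows, using the counts n, n' and n'' described above with ties counted as one half. The function names are hypothetical; `auc_exact` is an added convenience that enumerates every pair instead of sampling, which is feasible only for small score lists.

```python
import random


def auc_sampled(link_scores, nonlink_scores, n, seed=0):
    """Estimate AUC from n random (test edge, non-existent edge) comparisons."""
    rng = random.Random(seed)
    higher = ties = 0
    for _ in range(n):
        s_link = rng.choice(link_scores)    # score of a test-set edge
        s_non = rng.choice(nonlink_scores)  # score of a non-existent edge
        if s_link > s_non:
            higher += 1   # contributes to n'
        elif s_link == s_non:
            ties += 1     # contributes to n''
    return (higher + 0.5 * ties) / n


def auc_exact(link_scores, nonlink_scores):
    """Same quantity computed over all pairs rather than a random sample."""
    higher = sum(a > b for a in link_scores for b in nonlink_scores)
    ties = sum(a == b for a in link_scores for b in nonlink_scores)
    return (higher + 0.5 * ties) / (len(link_scores) * len(nonlink_scores))
```

A value of 0.5 corresponds to random scoring, and values closer to 1 indicate that true links are consistently ranked above non-existent ones.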
In summary, the machine-learning-based dynamic evolution link prediction method provided by the invention has the following beneficial effects:
The method is suitable for evolution prediction in dynamic time-varying networks. The invention builds the feature time series of the dynamic network from the topological features of the network and the timing information of link generation, thereby extending the applicability of the prediction method from static networks to dynamic time-varying networks.
The method is suitable for link prediction in weighted networks. Individuals in a real social network are related to different degrees, for example in the number and frequency of their interactions. In the abstracted network this is reflected as different weights on the links. The invention introduces these weights into the link prediction problem and so better reflects the characteristics of real networks.
The accuracy of the prediction result is improved. The invention integrates network structure features with node link features and combines a statistical model with a supervised learning method, so it is better matched to real conditions and predicts more effectively. Experimental results on the Email-Eu-core Temporal Network data set show that the weighted dynamic network prediction method improves accuracy by 5% compared with a traditional static algorithm, and by 3% compared with a link prediction method that does not consider weights.
While the invention has been described in connection with specific embodiments, it is to be understood that the accompanying drawings and description are illustrative only, and that the invention may be embodied in other specific forms without departing from its spirit or scope.

Claims (10)

1. A method for predicting dynamic social relationship network links based on spatio-temporal relationships, characterized by comprising the following steps:
acquiring dynamic social relationship data, and preprocessing the dynamic social relationship data to generate a sample set;
constructing a weighted similarity characteristic time sequence for any node pair in the sample set;
calculating the characteristic value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence to construct a characteristic matrix;
and inputting the characteristic matrix into a pre-trained classification model, and outputting possible links of the dynamic social relationship network at the moment to be predicted.
2. The spatio-temporal relationship-based dynamic social relationship network link prediction method of claim 1, wherein preprocessing the dynamic social relationship data comprises: dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval.
3. The spatio-temporal relationship-based dynamic social relationship network link prediction method according to claim 1 or 2, wherein preprocessing the dynamic social relationship data comprises: assigning a corresponding weight to each node pair based on the link relation of that node pair.
4. The spatio-temporal relationship-based dynamic social relationship network link prediction method of claim 1, wherein the pre-trained classification model is trained using a random forest or support vector machine algorithm.
5. The spatio-temporal relationship-based dynamic social relationship network link prediction method of claim 1, wherein an AUC evaluation index is used to evaluate the predicted dynamic social relationship network.
6. A prediction apparatus for dynamic social relationship network links based on spatio-temporal relationships, comprising:
the data processing module is used for acquiring dynamic social relationship data and preprocessing the dynamic social relationship data to generate a sample set;
the characteristic time sequence construction module is used for constructing a weighted similarity characteristic time sequence for any node pair in the sample set;
the computing module is used for computing the characteristic value of any node pair at the moment to be predicted by adopting a preset algorithm based on the weighted similarity characteristic time sequence so as to construct a characteristic matrix;
and the classification prediction module is used for inputting the characteristic matrix into a pre-trained classification model and outputting possible links of the dynamic social relationship network at the moment to be predicted.
7. The spatio-temporal relationship-based prediction device for dynamic social relationship network links according to claim 6, wherein the data processing module comprises: a dividing module for dividing the dynamic social relationship data into a plurality of sub-networks according to a preset time interval.
8. The spatio-temporal relationship-based prediction device for dynamic social relationship network links according to claim 6 or 7, wherein the data processing module comprises: a weight assignment module for assigning a corresponding weight to each node pair based on the link relation of that node pair.
9. The spatio-temporal relationship-based prediction device for dynamic social relationship network links according to claim 6, wherein the pre-trained classification model is trained using a random forest or support vector machine algorithm.
10. The spatio-temporal relationship-based prediction device for dynamic social relationship network links according to claim 6, further comprising:
a result evaluation module for evaluating the predicted dynamic social relationship network using an AUC evaluation index.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011047469.2A CN112184468A (en) 2020-09-29 2020-09-29 Dynamic social relationship network link prediction method and device based on spatio-temporal relationship

Publications (1)

Publication Number Publication Date
CN112184468A true CN112184468A (en) 2021-01-05

Family

ID=73945769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011047469.2A Pending CN112184468A (en) 2020-09-29 2020-09-29 Dynamic social relationship network link prediction method and device based on spatio-temporal relationship

Country Status (1)

Country Link
CN (1) CN112184468A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112700056A (en) * 2021-01-06 2021-04-23 中国互联网络信息中心 Complex network link prediction method, complex network link prediction device, electronic equipment and medium
CN112700056B (en) * 2021-01-06 2023-09-15 中国互联网络信息中心 Complex network link prediction method, device, electronic equipment and medium
CN112767186A (en) * 2021-01-26 2021-05-07 东南大学 Social network link prediction method based on 7-subgraph topological structure
CN114202035A (en) * 2021-12-16 2022-03-18 成都理工大学 Multi-feature fusion large-scale network community detection algorithm
CN114553497A (en) * 2022-01-28 2022-05-27 中国科学院信息工程研究所 Internal threat detection method based on feature fusion
CN114553497B (en) * 2022-01-28 2022-11-15 中国科学院信息工程研究所 Internal threat detection method based on feature fusion
CN115509789A (en) * 2022-09-30 2022-12-23 中国科学院重庆绿色智能技术研究院 Computing system fault prediction method and system based on component calling analysis
CN115509789B (en) * 2022-09-30 2023-08-11 中国科学院重庆绿色智能技术研究院 Method and system for predicting faults of computing system based on component call analysis
CN116112379A (en) * 2022-12-09 2023-05-12 国网湖北省电力有限公司信息通信公司 Dynamic prediction method for directed link of multidimensional service sharing equipment of data center
CN116112379B (en) * 2022-12-09 2024-02-02 国网湖北省电力有限公司信息通信公司 Dynamic prediction method for directed link of multidimensional service sharing equipment of data center
CN117151279A (en) * 2023-08-15 2023-12-01 哈尔滨工业大学 Isomorphic network link prediction method and system based on line graph neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105
