CN108270608A - Link prediction model establishment and link prediction method - Google Patents

Link prediction model establishment and link prediction method

Info

Publication number
CN108270608A
Authority
CN
China
Prior art keywords
model
network
link prediction
data
network data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710004638.6A
Other languages
Chinese (zh)
Other versions
CN108270608B (en)
Inventor
颜永红
李太松
张艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Original Assignee
Institute of Acoustics CAS
Beijing Kexin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS and Beijing Kexin Technology Co Ltd
Priority to CN201710004638.6A
Publication of CN108270608A
Application granted
Publication of CN108270608B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Probability & Statistics with Applications (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention provides a method for establishing a link prediction model. The link prediction model comprises a temporal restricted Boltzmann machine (TRBM) model and a gradient boosting decision tree (GBDT) model. The method comprises: crawling a large amount of network data from the Internet or other media and preprocessing it; dividing the network data into historical data and current data, inputting them into the TRBM model, and training its parameters; and extracting the network topology features of node pairs in the network data to form a feature set, inputting the feature set into the GBDT model, and training its parameters. The established link prediction model comprises the trained TRBM model and the trained GBDT model. Based on the link prediction model established by this method, the present invention also provides a link prediction method that can predict all links of the network in its next state.

Description

Link prediction model establishment and link prediction method
Technical field
The present invention relates to the field of Internet technology, and in particular to the establishment of a link prediction model and to a link prediction method. The method uses the topological features of a network together with a deep learning model to perform link prediction on large-scale networks.
Background art
With the rapid development of the Internet and mobile communication technology, connections between people have become increasingly close. Through the Internet and communication networks, people form one huge complex network. The interactions, exchanges, and influence between people in this network have permeated every aspect of life. Research on social networks has accordingly attracted growing attention and has become one of the current research hotspots in science. Many people hope, by analyzing the structure and evolution of social networks, to discover the principles by which nodes connect, to understand the laws hidden beneath common phenomena, and to understand the relationships among the topological features of a social network, the attribute features of its nodes, and the behavioral trends of its nodes. The aim is to uncover the nature of social network evolution and to use this information to allocate resources and process information more effectively, guiding management, interpretation, and decision-making in commodity production, daily life, population management, nature management, planning, and similar areas. Link prediction is an important research topic within the study of node behavior trends.
Link prediction methods describe the future development trend of a network and can predict connections between individual nodes; they can also recover missing or hidden edges in an existing, incomplete network. Traditional link prediction methods generally apply machine learning to network topology features and node attributes. These methods, however, work from a microscopic perspective, taking node pairs as the object of prediction. This makes it difficult to model the evolution of the network's macroscopic structure, which limits their prediction performance.
Summary of the invention
The object of the present invention is to overcome the above drawback of current link prediction methods by proposing a link prediction method based on deep learning. The method uses a temporal restricted Boltzmann machine model to model the adjacency matrices of the network over a macroscopic time sequence, and then uses the trained model as a generative model to predict, at the macroscopic level, the link state of the network at the next time step. In addition, local topological features of the network are extracted from a microscopic perspective, and a machine learning model (a gradient boosting decision tree model) predicts the link state of the network structure. Finally, the two prediction results are fused by weighting to obtain the final link prediction result for the network. The method thus describes the evolution of the network from both the macroscopic and the microscopic perspective, building on a generative deep learning model and fusing a machine learning model to improve link prediction performance.
To achieve the above object, the present invention provides a deep-learning-based method for establishing a link prediction model. The link prediction model comprises a temporal restricted Boltzmann machine model and a gradient boosting decision tree model. The method comprises: crawling a large amount of network data from the Internet or other media and preprocessing it; dividing the network data into historical data and current data, inputting them into the temporal restricted Boltzmann machine model, and training its parameters; and extracting the network topology features of node pairs in the network data to form a feature set, inputting the feature set into the gradient boosting decision tree model, and training its parameters. The link prediction model comprises the trained temporal restricted Boltzmann machine model and the trained gradient boosting decision tree model.
In the above technical solution, the method specifically comprises:
Step S1) crawl a large amount of network data from the Internet or other media, and preprocess the network data so that it contains no isolated nodes or isolated node pairs;
Step S2) divide the network data into time slices of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1}; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1}; then set the time window to N, N < K, where {A_N, A_{N-1}, …, A_2} are the historical data and {A_1} is the current data;
Step S3) input the historical data and the current data into the temporal restricted Boltzmann machine model and train its parameters;
Step S4) merge {G_K, …, G_2} into a base network G'; with G_1 as the reference set, select from G' node pairs that are one hop apart to form positive and negative samples, and make the numbers of positive and negative samples equal; extract the network topology features of the node pairs to form a feature set, input it into the gradient boosting decision tree model, and train its parameters;
Step S5) training of the link prediction model is complete; the link prediction model comprises the trained temporal restricted Boltzmann machine model and the trained gradient boosting decision tree model.
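The time slicing of step S2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_snapshots` and `split_window`, the `(u, v, timestamp)` edge format, and the choice of an undirected graph are all assumptions made for the example.

```python
from collections import defaultdict

def build_snapshots(edges, slice_length):
    """Split timestamped edges (u, v, t) into time slices of equal length.

    Returns (nodes, matrices): the sorted node list and one adjacency
    matrix per slice, oldest first, so matrices[-1] plays the role of A_1.
    """
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    index = {n: i for i, n in enumerate(nodes)}
    t0 = min(t for _, _, t in edges)
    buckets = defaultdict(list)
    for u, v, t in edges:
        buckets[int((t - t0) // slice_length)].append((u, v))
    matrices = []
    for k in range(max(buckets) + 1):
        a = [[0] * len(nodes) for _ in nodes]
        for u, v in buckets.get(k, []):
            # Assumption: links are undirected, so the matrix is symmetric.
            a[index[u]][index[v]] = a[index[v]][index[u]] = 1
        matrices.append(a)
    return nodes, matrices

def split_window(matrices, n):
    """Take the n most recent slices: n - 1 history matrices plus the current one."""
    window = matrices[-n:]
    return window[:-1], window[-1]

# Hypothetical toy data: four edges with integer timestamps.
edges = [("a", "b", 0), ("b", "c", 5), ("a", "c", 12), ("c", "d", 27)]
nodes, mats = build_snapshots(edges, slice_length=10)
history, current = split_window(mats, n=2)
```

With a window of N = 2, `history` holds the single matrix corresponding to {A_2} and `current` corresponds to {A_1}.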
In the above technical solution, the network topology features of step S4) include neighbor-based features and random-walk-based features.
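The embodiment described later names Adamic-Adar as the neighbor-based feature and Rooted PageRank as the random-walk feature. The sketch below shows one plain-Python way to compute both on an adjacency-set graph; the function names, the restart parameter `alpha = 0.85`, the iteration count, and the handling of dangling nodes are assumptions for illustration, since the patent does not specify them.

```python
import math

def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over the common neighbors of u and v."""
    common = adj[u] & adj[v]
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)

def rooted_pagerank(adj, root, alpha=0.85, iterations=100):
    """Visit probabilities of a random walk that jumps back to `root`
    with probability 1 - alpha at every step (power iteration)."""
    rank = {n: 0.0 for n in adj}
    rank[root] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in adj}
        for n, r in rank.items():
            if adj[n]:
                share = alpha * r / len(adj[n])
                for m in adj[n]:
                    nxt[m] += share
            else:
                # Assumption: a walk stuck on a dangling node returns to the root.
                nxt[root] += alpha * r
        nxt[root] += 1.0 - alpha
        rank = nxt
    return rank

# Hypothetical base network G' as adjacency sets.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
aa_score = adamic_adar(adj, "a", "d")
rpr_from_a = rooted_pagerank(adj, "a")
features_a_d = [aa_score, rpr_from_a["d"]]
```

Each candidate node pair then yields a small feature vector such as `features_a_d`, which would be one row of the feature set passed to the gradient boosting model.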
Based on the link prediction model established by the above method, the present invention also provides a link prediction method, the method comprising:
Step T1) crawl the network data to be predicted and preprocess it so that it contains no isolated nodes or isolated node pairs;
Step T2) divide the network data to be predicted into time slices of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1, G_0}, where {G_N, G_{N-1}, …, G_2} are the historical network graphs, {G_1} is the current network graph, and {G_0} is the network to be predicted; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1, A_0}; with a time window of N, N < K, move the time window forward by one unit so that the historical data become {A_{N-1}, A_{N-2}, …, A_1}; initialize the data to be predicted, {A_0}, with random binary values and treat {A_0} as the current data; input these into the temporal restricted Boltzmann machine model and obtain prediction result R1 after multiple iterations;
Step T3) construct the base network from {G_K, G_{K-1}, …, G_1}, extract the feature set as in step S4), input it into the gradient boosting decision tree model, and predict the connection state of node pairs under {G_0} to obtain prediction result R2;
Step T4) fuse the results R1 and R2 by weighting to obtain the final fused prediction result R.
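The iteration in step T2 (random binary initialization of A_0 followed by repeated sampling) can be illustrated with a small Gibbs-sampling loop. The patent does not give the model equations, so this sketch assumes one common formulation in which the history only shifts the visible and hidden biases; every name here (`sample_next`, `W`, `B`, `C`, `b`, `c`) is hypothetical, and the parameters are hand-set rather than trained.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sample_next(history, W, B, C, b, c, steps=50, seed=0):
    """Return link probabilities for the next slice via Gibbs sampling.

    history: past flattened adjacency vectors (binary lists)
    W: hidden x visible weights
    B[k], C[k]: weights from history[k] to the visible / hidden biases
    b, c: static visible / hidden biases
    """
    rng = random.Random(seed)
    # The history is fixed during sampling; it only shifts the biases.
    b_dyn, c_dyn = list(b), list(c)
    for k, past in enumerate(history):
        b_dyn = [x + y for x, y in zip(b_dyn, matvec(B[k], past))]
        c_dyn = [x + y for x, y in zip(c_dyn, matvec(C[k], past))]
    W_t = [list(col) for col in zip(*W)]
    v = [rng.randint(0, 1) for _ in b]  # random binary start for A_0
    p_v = [0.5] * len(b)
    for _ in range(steps):
        h = [int(rng.random() < sigmoid(a + cj))
             for a, cj in zip(matvec(W, v), c_dyn)]
        p_v = [sigmoid(a + bi) for a, bi in zip(matvec(W_t, h), b_dyn)]
        v = [int(rng.random() < p) for p in p_v]
    return p_v

# Toy setup: 3 possible links, 2 hidden units, one history slice.
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]]
C = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
probabilities = sample_next([[1, 0, 1]], W, B, C, b=[-2.0, -2.0, -2.0], c=[0.0, 0.0])
```

In this toy setup the hidden weights are zero, so the output depends only on the history-shifted biases: links present in the previous slice receive a high probability and absent ones a low probability. A trained model would supply learned values for all parameters.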
In the above technical solution, step T4) is implemented as follows:
If a node pair appears in both R1 and R2, the results are merged by weighting as R = αR1 + (1 - α)R2, with α between 0.5 and 0.7; if a node pair appears in R1 but not in R2, the prediction result is R = R1.
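The fusion rule can be written as a short function. This is a sketch that assumes R1 and R2 are dictionaries mapping node pairs to scores; the name `fuse` and the default α = 0.6 (a value inside the stated 0.5 to 0.7 range) are choices made for the example. Pairs that appear only in R2 are not covered by the rule in the text, so the sketch leaves them out.

```python
def fuse(r1, r2, alpha=0.6):
    """Weighted fusion of per-pair link scores from the two models."""
    fused = {}
    for pair, score1 in r1.items():
        if pair in r2:
            # Pair scored by both models: R = alpha * R1 + (1 - alpha) * R2.
            fused[pair] = alpha * score1 + (1 - alpha) * r2[pair]
        else:
            # Pair scored only by the first model: R = R1.
            fused[pair] = score1
    return fused

# Hypothetical scores for two node pairs.
r1 = {("a", "b"): 0.9, ("a", "c"): 0.2}
r2 = {("a", "b"): 0.5}
fused = fuse(r1, r2)
```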
The advantages of the present invention are:
1. The link prediction method of the present invention fuses deep learning and machine learning methods and describes the evolution of the network from two perspectives, overcoming the shortcomings of a single model; moreover, it predicts all links of the network in its next state, so the prediction is more comprehensive and more accurate;
2. The link prediction method of the present invention is applicable across different networks and is robust to networks with different characteristics and of different sizes.
Description of the drawings
Fig. 1 is the sequence diagram of the link prediction method of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
A method for establishing a link prediction model, wherein the link prediction model comprises a temporal restricted Boltzmann machine (Temporal Restricted Boltzmann Machine, TRBM) model and a gradient boosting decision tree (Gradient Boosting Decision Trees, GBDT) model. The method comprises the following steps:
Step S1) crawl a large amount of network data from the Internet or other media and preprocess the network data;
The network data include the time information of the edges. If the crawled network data contain no isolated nodes or isolated node pairs, they can be used directly; otherwise, the crawled network data must be preprocessed to delete the isolated nodes and node pairs;
Step S2) divide the network data into time slices (snapshots) of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1}; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1}; then set the time window to N (N < K), where {A_N, A_{N-1}, …, A_2} are the historical data and {A_1} is the current data;
Step S3) input the historical data and the current data into the TRBM model and train the model parameters;
Step S4) merge {G_K, …, G_2} into a base network G'; with G_1 as the reference set, select from G' node pairs that are one hop apart to form positive and negative samples; because the positive and negative samples are imbalanced, the node pairs must be sampled so that the numbers of positive and negative samples are equal; extract the network topology features of the node pairs to form a feature set, input it into the GBDT model, and train the model parameters;
In this embodiment, the network topology features include a neighbor-based feature and a random-walk-based feature: the neighbor-based feature is Adamic-Adar, and the random-walk feature is Rooted PageRank.
Step S5) training of the link prediction model is complete; the link prediction model comprises the trained TRBM model and the trained GBDT model.
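The sample balancing mentioned in step S4 can be sketched as a simple downsampling of the larger class. This is an illustration under assumptions: the function name `build_balanced_samples` is hypothetical, the candidate pairs are assumed to have been selected from G' already (how they are chosen is left to the caller), and random downsampling is only one possible way to equalize the two classes.

```python
import random

def build_balanced_samples(candidate_pairs, current_edges, seed=0):
    """Label candidate pairs against the current graph G_1 and downsample
    the larger class so that both classes have the same size."""
    truth = {frozenset(edge) for edge in current_edges}
    positives = [p for p in candidate_pairs if frozenset(p) in truth]
    negatives = [p for p in candidate_pairs if frozenset(p) not in truth]
    k = min(len(positives), len(negatives))
    rng = random.Random(seed)
    samples = [(p, 1) for p in rng.sample(positives, k)]
    samples += [(p, 0) for p in rng.sample(negatives, k)]
    rng.shuffle(samples)
    return samples

# Hypothetical candidates from G' and edges observed in G_1.
candidates = [("a", "b"), ("a", "c"), ("a", "d"),
              ("b", "c"), ("b", "d"), ("c", "d")]
g1_edges = [("b", "a"), ("c", "d")]
samples = build_balanced_samples(candidates, g1_edges)
```

Each entry of `samples` is a `(pair, label)` tuple; the features of each pair would then be computed and passed, together with the label, to the GBDT model.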
As shown in Fig. 1, based on the link prediction model established by the above method, the present invention also provides a link prediction method, the method comprising:
Step T1) crawl the network data to be predicted and preprocess it so that it contains no isolated nodes or isolated node pairs;
Step T2) divide the network data to be predicted into time slices of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1, G_0}, where {G_N, G_{N-1}, …, G_2} are the historical network graphs, {G_1} is the current network graph, and {G_0} is the network to be predicted; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1, A_0}; with a time window of N (N < K), move the time window forward by one unit so that the historical data become {A_{N-1}, A_{N-2}, …, A_1}; initialize the data to be predicted, {A_0}, with random binary values and treat {A_0} as the current data; input these into the TRBM model and obtain prediction result R1 after multiple iterations;
Step T3) construct the base network from {G_K, G_{K-1}, …, G_1}, extract the feature set as in step S4), input it into the GBDT model, and predict the connection state of node pairs under {G_0} to obtain prediction result R2;
Step T4) fuse the results R1 and R2 by weighting to obtain the final fused prediction result R.
If a node pair appears in both R1 and R2, the results are merged by weighting as R = αR1 + (1 - α)R2, with α between 0.5 and 0.7; if a node pair appears in R1 but not in R2, the result of R1 is taken as the final result, R = R1.

Claims (5)

1. A method for establishing a link prediction model, wherein the link prediction model comprises a temporal restricted Boltzmann machine model and a gradient boosting decision tree model; the method comprises: crawling a large amount of network data from the Internet or other media and preprocessing the network data; dividing the network data into historical data and current data, inputting them into the temporal restricted Boltzmann machine model, and training its parameters; extracting the network topology features of node pairs in the network data to form a feature set, inputting the feature set into the gradient boosting decision tree model, and training its parameters; the link prediction model comprises the trained temporal restricted Boltzmann machine model and the trained gradient boosting decision tree model.
2. The method for establishing a link prediction model according to claim 1, characterized in that the method specifically comprises:
Step S1) crawl a large amount of network data from the Internet or other media, and preprocess the network data so that it contains no isolated nodes or isolated node pairs;
Step S2) divide the network data into time slices of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1}; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1}; then set the time window to N, N < K, where {A_N, A_{N-1}, …, A_2} are the historical data and {A_1} is the current data;
Step S3) input the historical data and the current data into the temporal restricted Boltzmann machine model and train its parameters;
Step S4) merge {G_K, …, G_2} into a base network G'; with G_1 as the reference set, select from G' node pairs that are one hop apart to form positive and negative samples, and make the numbers of positive and negative samples equal; extract the network topology features of the node pairs to form a feature set, input it into the gradient boosting decision tree model, and train its parameters;
Step S5) training of the link prediction model is complete; the link prediction model comprises the trained temporal restricted Boltzmann machine model and the trained gradient boosting decision tree model.
3. The method for establishing a link prediction model according to claim 2, characterized in that the network topology features of step S4) include neighbor-based features and random-walk-based features.
4. A link prediction method, implemented on the basis of the link prediction model established by the method of either of claims 2-3, the method comprising:
Step T1) crawl the network data to be predicted and preprocess it so that it contains no isolated nodes or isolated node pairs;
Step T2) divide the network data to be predicted into time slices of a fixed duration and construct a network graph for each time slice, G = {G_K, G_{K-1}, …, G_1, G_0}, where {G_N, G_{N-1}, …, G_2} are the historical network graphs, {G_1} is the current network graph, and {G_0} is the network to be predicted; represent G by the time-ordered adjacency matrices A = {A_K, A_{K-1}, …, A_1, A_0}; with a time window of N, N < K, move the time window forward by one unit so that the historical data become {A_{N-1}, A_{N-2}, …, A_1}; initialize the data to be predicted, {A_0}, with random binary values and treat {A_0} as the current data; input these into the temporal restricted Boltzmann machine model and obtain prediction result R1 after multiple iterations;
Step T3) construct the base network from {G_K, G_{K-1}, …, G_1}, extract the feature set as in step S4), input it into the gradient boosting decision tree model, and predict the connection state of node pairs under {G_0} to obtain prediction result R2;
Step T4) fuse the results R1 and R2 by weighting to obtain the final fused prediction result R.
5. The link prediction method according to claim 1, characterized in that step T4) is implemented as follows:
If a node pair appears in both R1 and R2, the results are merged by weighting as R = αR1 + (1 - α)R2, with α between 0.5 and 0.7; if a node pair appears in R1 but not in R2, the prediction result is R = R1.
CN201710004638.6A 2017-01-04 2017-01-04 Link prediction model establishment and link prediction method Active CN108270608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710004638.6A CN108270608B (en) 2017-01-04 2017-01-04 Link prediction model establishment and link prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710004638.6A CN108270608B (en) 2017-01-04 2017-01-04 Link prediction model establishment and link prediction method

Publications (2)

Publication Number Publication Date
CN108270608A true CN108270608A (en) 2018-07-10
CN108270608B CN108270608B (en) 2020-04-03

Family

ID=62771669

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710004638.6A Active CN108270608B (en) 2017-01-04 2017-01-04 Link prediction model establishment and link prediction method

Country Status (1)

Country Link
CN (1) CN108270608B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103490948A (en) * 2013-09-06 2014-01-01 华为技术有限公司 Method and device for predicting network performance
US9129158B1 (en) * 2012-03-05 2015-09-08 Hrl Laboratories, Llc Method and system for embedding visual intelligence

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109639485A (en) * 2018-12-13 2019-04-16 国家电网有限公司 The monitoring method and device of electricity consumption acquisition communication link
CN110061961A (en) * 2019-03-05 2019-07-26 中国科学院信息工程研究所 A kind of anti-tracking network topological smart construction method and system based on limited Boltzmann machine
CN110061961B (en) * 2019-03-05 2020-08-25 中国科学院信息工程研究所 Anti-tracking network topology intelligent construction method and system based on limited Boltzmann machine
CN110445653A (en) * 2019-08-12 2019-11-12 灵长智能科技(杭州)有限公司 Network state prediction technique, device, equipment and medium
CN116132300A (en) * 2022-09-15 2023-05-16 电子科技大学 Link identification method based on gradient lifting decision tree feature combination
CN116132300B (en) * 2022-09-15 2024-04-30 电子科技大学 Link identification method based on gradient lifting decision tree feature combination

Also Published As

Publication number Publication date
CN108270608B (en) 2020-04-03

Similar Documents

Publication Publication Date Title
CN108270608A (en) Link prediction model establishment and link prediction method
Gatti et al. Large-scale multi-agent-based modeling and simulation of microblogging-based online social network
CN104809501B (en) Computer system based on a brain-like coprocessor
CN112084373B (en) Graph embedding-based multi-source heterogeneous network user alignment method
CN114373101A (en) Image classification method for neural network architecture search based on evolution strategy
CN104348829A (en) Network security situation sensing system and method
Tehseen et al. A framework for the prediction of earthquake using federated learning
CN112990378B (en) Scene recognition method and device based on artificial intelligence and electronic equipment
CN104361462B (en) Social network influence maximization approach based on cultural gene algorithm
CN110362728A (en) Information-pushing method, device, equipment and storage medium based on big data analysis
CN107086925B (en) Deep learning-based internet traffic big data analysis method
CN111126578A (en) Joint data processing method, device and system for model training
Garcia-Magarino et al. Survivability strategies for emerging wireless networks with data mining techniques: A case study with NetLogo and RapidMiner
CN107644268B (en) Open source software project incubation state prediction method based on multiple features
Popa et al. Neural networks for production curve pattern recognition applied to cyclic steam optimization in diatomite reservoirs
CN112000793A (en) Man-machine interaction oriented dialogue target planning method
CN110020379A (en) Link prediction method based on a deep dynamic network embedding representation model
Lingyu et al. SMAM: Detecting rumors from microblogs with stance mining assisting task
CN115658971A (en) Attention mechanism-based multi-layer heterogeneous network node importance degree evaluation method
Esmaili et al. Effective synthetic data generation for fake user detection
Ren et al. [Retracted] A Study on Information Classification and Storage in Cloud Computing Data Centers Based on Group Collaborative Intelligent Clustering
CN113743605A (en) Method for searching smoke and fire detection network architecture based on evolution method
Oghenekaro et al. A hierarchical temporal memory based framework for mining event logs
Jena et al. A Comprehensive Study on Metaheuristics, Big Data and Deep Neural Network Strategies
Kozlak et al. Agent-based modelling of social organisations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant