CN108228728A - Parametrized representation learning method for paper network nodes - Google Patents
A parametrized representation learning method for paper network nodes
- Publication number: CN108228728A
- Application number: CN201711308050.6A (CN201711308050A)
- Authority
- CN
- China
- Prior art keywords
- paper
- node
- queue
- neighbours
- triple
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri (under G06F16/30, information retrieval of unstructured textual data)
- G06F16/3347—Query execution using vector based model (under G06F16/33, querying)
- G06N3/045—Combinations of networks (under G06N3/04, neural network architecture)
- G06N3/048—Activation functions
- G06N3/08—Learning methods (under G06N3/02, neural networks)
- G06Q50/01—Social networking (under G06Q50/00, ICT specially adapted for business processes of specific sectors)
Abstract
The invention discloses a parametrized representation learning method for paper network nodes. The method first builds an empty paper node queue, then samples the neighbours of any paper node, and the neighbours of those neighbours, by random walk; the selected paper node becomes the first element of the paper node queue, and the remaining elements are obtained according to jump probabilities. Traversing all paper nodes yields a set of paper node queues. Next, the positive-and-negative sampling method generates training data for a multi-layer perceptron (MLP) neural network. Finally, a neural-network paper probability model is used to learn the nonlinear transformation from the semantic information of a paper node to its vector representation, thereby obtaining the vector representation of each paper node.
Description
Technical field
The present invention relates to representation learning methods for paper networks, and in particular to a parametrized representation learning method for paper network nodes.
Background technology
Social networks are an internet concept, covering social networking service platforms such as blogs, WIKIs, tags, SNS and RSS. The internet has quietly brought about a completely new form of human social organization and existence, constructing a huge networked community that transcends geographic space; the human society of the 21st century is gradually taking on new forms and characteristics, and individuals of the network age are being aggregated into new social groups. A paper network is the network of relations between papers, manifested online as citations and shared authors between papers.
At present, representation learning on paper networks mostly uses non-parametric models, e.g. "DeepWalk: Online Learning of Social Representations", Bryan Perozzi et al., 26 Mar 2014, which applies the non-parametric word2vec method to representation learning on a paper network.
Network structure refers to the physical connectivity of a network. Network topologies come in many kinds; two-dimensional structures include ring, star, tree, nearest-neighbour mesh and pipeline arrays, see Analysis of Interconnection Network Architectures, by Wang Dingxing and Chen Guoliang, October 1990, pp. 36-38. With the development of networks, structures such as meshes and honeycombs have also appeared.
Current representation learning methods for paper networks must traverse all papers in the paper network before the representation of any paper can be learned. When new papers are added to the paper network, the representation of the newly added papers cannot be learned, so the newly added papers cannot be further classified or analyzed.
Invention content
In order to solve the problem that newly added papers cannot undergo representation learning, the present invention proposes a parametrized representation learning method for paper network nodes. In the present invention, the star-shaped paper network structure is first sampled by a random-walk statistical model to obtain paper node vector information; each sampled paper node queue consists of a series of paper nodes, and each choice of the next paper node is random. After the paper-network sampling step, the present invention constructs a deep neural network based on the Siamese network framework, in which the two identical sub-networks of the Siamese network are composed of multi-layer perceptrons (MLP). The present invention uses the learned multi-layer perceptron as the nonlinear mapping function, and obtains the representation vector of a network node through the nonlinear mapping function built from the node's rich text information to its representation vector.
The parametrized representation learning method for paper network nodes proposed by the present invention is characterized by the following steps:
Step 1: based on the random-walk method, sample the neighbour-paper node set of any paper node and the neighbours-of-neighbours paper node set;

Step 101: build an empty paper node queue, denoted V, used to store the paper node sequence; the maximum number of queue elements of V is mv, where mv takes a value of 10 to 20; then perform step 102;

Step 102: choose any paper node papera, and place papera in the 1st position of the paper node queue V; then perform step 103;

Step 103: obtain the set of all neighbour paper nodes belonging to the paper node papera, i.e. the neighbour-paper node set; a neighbour paper node is a paper node that shares an edge with papera; then perform step 104;

Step 104: according to the total number B of neighbour nodes in the neighbour-paper node set, determine the first jump probability of jumping to each neighbour, where c denotes the hop count; then perform step 105;

Step 105: using the alias sampling algorithm (alias sampling), select the next-hop neighbour paper node from the neighbour-paper node set according to the current first jump probability, and place it in the 2nd position of the paper node queue V; then perform step 106;
Step 106: obtain the set of all neighbour paper nodes belonging to that neighbour paper node, i.e. the neighbours-of-neighbours paper node set; then perform step 107;

Step 107: compute the shortest hop count between the neighbour paper node and the paper node papera; then perform step 108;

The shortest hop count represents the minimum-hop distance from any neighbour paper node to the previous paper node;

Step 108: according to the shortest hop count, determine the second jump probability of jumping to each neighbour, where c denotes the hop count; then perform step 109;

Step 109: once the second jump probability is determined, select the next-hop paper node by alias sampling and place it in the 3rd position of the paper node queue V; then perform step 110;

Step 110: repeat steps 106 to 109 until the number of elements in the paper node queue V reaches mv, at which point this random walk stops; then perform step 111;

Step 111: repeat steps 101 to 110 for each paper node in the entire paper network to complete the neighbour-node sampling, yielding the set of paper node queues, denoted VF = {V1, V2, ..., Vf, ..., VF}; then perform step 201;

V1 denotes the first paper node queue;
V2 denotes the second paper node queue;
Vf denotes any paper node queue, where f is the identification number of the paper node queue;
VF denotes the last paper node queue, where F is the total number of paper node queues in the set, f ∈ F.
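The sampling loop of steps 101 to 111 can be sketched as follows. Since the jump-probability formulas are reproduced only as images in the original, this sketch assumes a node2vec-style bias controlled by the jump-out parameters p and q mentioned later in the description; the function and variable names are illustrative, not taken from the patent.

```python
import random

def walk_queue(graph, start, mv=10, p=1.0, q=1.0):
    """Build one paper node queue V of length mv by a biased random walk.

    graph: dict mapping each paper node to the list of its neighbour nodes.
    The bias (1/p for hop count 0 back to the previous node, 1 for hop
    count 1, 1/q for hop count 2) is an assumed node2vec-style form.
    """
    V = [start]                      # step 102: papera is the 1st element
    # steps 104-105: the first jump is drawn over the B neighbours
    V.append(random.choice(graph[start]))
    while len(V) < mv:               # step 110: stop when |V| == mv
        prev, cur = V[-2], V[-1]
        weights = []
        for nxt in graph[cur]:
            if nxt == prev:          # shortest hop count 0 (return)
                weights.append(1.0 / p)
            elif nxt in graph[prev]: # shortest hop count 1
                weights.append(1.0)
            else:                    # shortest hop count 2
                weights.append(1.0 / q)
        V.append(random.choices(graph[cur], weights=weights)[0])
    return V

def sample_all_queues(graph, mv=10):
    """Step 111: one queue per paper node yields the queue set VF."""
    return [walk_queue(graph, node, mv) for node in graph]
```

In the patent the weighted choice is drawn with the alias method rather than `random.choices`; the alias table pays off when the same distribution is sampled many times.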
Step 2: generate training data for the multi-layer perceptron neural network by the positive-and-negative sampling method;
Step 201: establish a positive sample queue Qp and a negative sample queue Qn, storing respectively the positive and negative samples required to train the neural network; then perform step 202;

Step 202: set a neighbour window size hyperparameter WD, applied within each paper node queue Vf; each paper belonging to Vf is denoted by its position in the queue; then perform step 203;

The following notation is used: the first, second, any (with identification number g) and last paper node belonging to queue Vf, where G is the length of Vf and g ∈ G; among the adjacent paper nodes in the window, the node with the smallest identification number, the node with the largest identification number, and the remaining in-queue adjacent paper nodes, whose subscript l denotes an identification number that is neither the largest nor the smallest.
Step 203: for any in-queue paper node, sample its adjacent paper nodes in increasing order of their neighbour identification numbers; in the sampling process each node in the window forms a triple with the in-queue paper node; then perform step 204;

Each adjacent paper node in the window (the smallest-numbered node, the largest-numbered node, and each remaining window node) forms a triple with the in-queue paper node of the form (node, neighbour, δ), where δ = +1 marks the triple as a positive sample and δ = -1 would mark it as a negative sample; each such positive triple is inserted into the positive sample queue Qp.
Step 204: repeat steps 202 and 203 until all paper nodes in all paper node queues of VF = {V1, V2, ..., Vf, ..., VF} have completed the neighbour-paper-node sampling operation, yielding the positive sample queue Qp; then perform step 205;

Step 205: sample over all paper nodes of the network: each time, choose two paper nodes from the network at random, the first arbitrary paper node papera and the second arbitrary paper node papero; if an edge exists between the two paper nodes, or the two randomly chosen paper nodes are identical, repeat this step; otherwise form the triple (papera, papero, -1) from the two paper nodes and store it in the negative sample queue Qn; then perform step 206;

Step 206: repeat step 205; set a positive-to-negative sample ratio parameter μ; assuming the number of triples in Qp is np, stop when the number of triples in Qn equals μ × np; then perform step 207;
Step 207: merge the positive sample queue Qp obtained in step 204 with the negative sample queue Qn obtained in step 206 into a new sample queue Qnew = {Q1, ..., Q(1+μ)×np}; then perform step 208;

Q1 denotes the triple with the smallest identification number in Qnew; Q(1+μ)×np denotes the triple with the largest identification number in Qnew; the subscript (1+μ)×np indicates that Qnew contains (1+μ)×np triples;

Step 208: shuffle the order of all elements of Qnew = {Q1, ..., Q(1+μ)×np} to obtain the shuffled sample queue Qshuf = {Q1, ..., Q(1+μ)×np}; then perform step 301;
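Steps 201 to 208 can be sketched as follows; the window parameter WD, the ratio parameter μ and the triple layout (node, node, δ) follow the description above, while the helper names are illustrative.

```python
import random

def build_samples(queues, graph, WD=2, mu=1):
    """Steps 201-208: window positives, random negatives, merge, shuffle.

    queues: the queue set VF from step 111.
    graph:  adjacency dict, used to reject connected pairs as negatives.
    mu:     positive-to-negative ratio parameter (|Qn| = mu * |Qp|).
    """
    Qp = []                                   # steps 202-204: positives
    for V in queues:
        for i, node in enumerate(V):
            for j in range(max(0, i - WD), min(len(V), i + WD + 1)):
                if j != i:
                    Qp.append((node, V[j], +1))
    nodes = list(graph)
    Qn = []                                   # steps 205-206: negatives
    while len(Qn) < mu * len(Qp):
        a, o = random.choice(nodes), random.choice(nodes)
        if a != o and o not in graph[a]:      # no edge, not identical
            Qn.append((a, o, -1))
    Q = Qp + Qn                               # step 207: merge
    random.shuffle(Q)                         # step 208: shuffle
    return Q
```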
Step 3: processing in the MLP-based neural-network paper probability model;

Step 301: from the shuffled queue Qshuf = {Q1, ..., Q(1+μ)×np} obtained in step 208, select one triple (papera, papero, δ) at a time and feed it into the neural-network paper probability model as a pair of paper nodes for learning; perform step 302;

Step 302: for the two paper nodes papera and papero in each triple, map them through the model to obtain the two corresponding transformed vectors; perform step 303; the mapping uses the multi-layer perceptron function belonging to papera and the multi-layer perceptron function belonging to papero;

Step 303: compute the Euclidean distance between the two paper nodes; perform step 304; Epos denotes the shortest Euclidean distance, Eneg the longest Euclidean distance, and c the hop count;

Step 304: using δ, merge the positive and negative samples into the loss function of the Euclidean distance over the paper distribution, and compute the loss function that balances positive and negative samples, obtaining the overall loss function L; perform step 305;

Step 305: determine the nonlinear transformation function fθ by the stochastic gradient descent algorithm, completing the representation learning of any two paper nodes papera and papero.
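Step 303's distance between the two mapped vectors is not legible in this extraction; the ordinary Euclidean distance between the transformed vectors fθ(papera) and fθ(papero) is assumed, as a minimal sketch:

```python
import math

def euclidean(u, v):
    """Euclidean distance between the two transformed paper vectors
    f_theta(paper_a) and f_theta(paper_o) (step 303)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```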
A network node representation describes each node in the network with a vector. To handle the voluminous information and neighbour-node relations in social networks, the present invention proposes a parametrized network node representation learning method. This method learns a nonlinear mapping function, so that the vector representation of a network node can be obtained simply from the node's content information. To obtain the vector representation of a node, its surrounding nodes are first obtained by random walk; the relation between the node and its neighbours is then built through the Siamese network, from which the nonlinear mapping function is learned. To verify the effect of the present invention, node classification was carried out on all nodes of a paper network using the Cora dataset; in this simulation experiment, with the same SVM classifier, the network node representation vectors obtained by the method of the present invention gave classification results significantly better than other methods, verifying that the present invention is effective for network node representation on paper networks.
Description of the drawings
Fig. 1 is the flow chart of the parametrized paper network node representation learning of the present invention.
Fig. 2 is the evaluation result of the Micro-F1 indexs in Cora data sets.
Fig. 3 is the evaluation result of the Macro-F1 indexs in Cora data sets.
Fig. 4 is the evaluation result of the Micro-F1 indexs in Wiki data sets.
Fig. 5 is the evaluation result of the Macro-F1 indexs in Wiki data sets.
Specific embodiment
The present invention is described in further detail below with reference to the drawings and embodiments.
In the present invention, a paper is denoted paper, and multiple papers form a paper set, denoted AP = {paper1, paper2, ..., papera, ..., papero, ..., paperA}; any paper in the paper set AP is called a paper node in the star-shaped paper network structure;

paper1 denotes the first paper node;
paper2 denotes the second paper node;
papera denotes the a-th paper node, where a is the identification number of the paper node;
paperA denotes the last paper node, where A is the total number of papers, a ∈ A.

For convenience of explanation, papera is called any paper node, and papero is another arbitrary paper node other than papera; hereinafter papera is called the first arbitrary paper node and papero the second arbitrary paper node.
The set of all neighbour paper nodes belonging to any paper node papera is called the neighbour-paper node set.

Its first neighbour node, second neighbour node, any neighbour node (where b is the identification number of the neighbour node) and last neighbour node all belong to papera, where B is the total number of neighbour nodes of papera, B ∈ A.

The set of all neighbour paper nodes belonging to any neighbour node is called the neighbours-of-neighbours paper node set.

Its first neighbour node, second neighbour node, any neighbour node (where e is the identification number of a neighbour of the neighbour paper node) and last neighbour node all belong to the neighbour paper node, where E is the total number of neighbours of that neighbour node (for short, the neighbours-of-neighbours total), E ∈ A.
In the present invention, the star paper network structure used is the structure of Fig. 1.19(c) on page 37 of Analysis of Interconnection Network Architectures, by Wang Dingxing and Chen Guoliang; first edition, October 1990.
In the present invention, the semantic information of a paper node is the vector expression obtained by word-level processing of the words contained in the paper's title, abstract and body. The word-level processing encodes whether each word appears in the semantic information of a given paper's content as a binary 0 or 1, yielding the corresponding 0/1 characterization vector of the paper content: "0" means the word does not appear, "1" means it appears. Applying this word-level processing to all paper nodes in the star paper network structure yields a two-dimensional matrix relating the number of words to the number of paper nodes, called the paper binary matrix.
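The word-level 0/1 encoding described above can be sketched as follows; the vocabulary construction and the helper name are illustrative, and the input is assumed to be already tokenized.

```python
def paper_binary_matrix(papers):
    """Encode each paper's text (title + abstract + body, tokenized into
    words) as a 0/1 vector over the shared vocabulary: 1 if the word
    appears in the paper, 0 if it does not.  Stacking the vectors gives
    the paper binary matrix (words x paper nodes)."""
    vocab = sorted({word for words in papers for word in words})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = []
    for words in papers:
        row = [0] * len(vocab)
        for word in words:
            row[index[word]] = 1   # "1" means the word appears
        matrix.append(row)
    return vocab, matrix
```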
Building the MLP neural-network paper probability model from paper node semantic information

In the present invention, the construction of the paper probability model comprises: (A) setting the neural-network paper probability model expression; (B) choosing any two paper nodes papera and papero from AP = {paper1, paper2, ..., papera, ..., papero, ..., paperA} and mapping them, obtaining respectively the multi-layer perceptron function belonging to papera and the multi-layer perceptron function belonging to papero; (C) computing the Euclidean distance between papera and papero from the two mapped outputs, and balancing the loss function of the positive and negative samples; (D) processing the weight parameters WEIGHT and bias parameters BIAS belonging to fθ with the stochastic gradient descent algorithm to learn the target nonlinear transformation function fθ, traversing all triples, and obtaining the MLP-based neural-network training on paper node semantic information.
In the present invention, papera has its corresponding rich text information, and the multi-layer perceptron neural network is used to obtain the nonlinear transformation of that information. Suppose the multi-layer perceptron has H layers in total; the MLP-based neural network then has a weight parameter WEIGHT and a bias parameter BIAS for each layer.

In the present invention, the weight parameters are WEIGHT = {weight1, weight2, ..., weighth, ..., weightH}.

In the present invention, the bias parameters are BIAS = {bias1, bias2, ..., biash, ..., biasH}.

weight1 denotes the weight parameter of the first network layer of the neural network;
weight2 denotes the weight parameter of the second network layer;
weighth denotes the weight parameter of any network layer, where h is the layer identification number of the perceptron;
weightH denotes the weight parameter of the last network layer, where H is the total number of layers of the perceptron;
bias1 denotes the bias parameter of the first network layer;
bias2 denotes the bias parameter of the second network layer;
biash denotes the bias parameter of any network layer;
biasH denotes the bias parameter of the last network layer.
The output of the first layer of the multi-layer perceptron is denoted out1 = f1(weight1·input + bias1), where out1 represents the output of the first layer of the multi-layer perceptron and f1 represents the activation function of the first-layer neural network.

Similarly, the output of the second layer of the multi-layer perceptron is denoted out2 = f2(weight2·out1 + bias2), where out2 represents the output of the second layer of the multi-layer perceptron and f2 represents the activation function of the second-layer neural network.

The output of any layer h of the multi-layer perceptron is denoted outh = fh(weighth·outh-1 + biash), where fh represents the activation function of that layer of the neural network. The output of the last layer of the multi-layer perceptron is denoted outH.

In the present invention, the activation function fh of any layer of the multi-layer perceptron is generally chosen to be a nonlinear function, such as sigmoid or tanh. The output of the last layer of the multi-layer perceptron is the transformation of the input by multiple composed nonlinear functions, and can therefore simply be written fθ(input), where θ represents the collection of all parametrized functions. This fθ(input) is taken as the final output of the MLP-based neural-network paper probability model.
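The layer-by-layer composition outh = fh(weighth·outh-1 + biash) can be sketched as follows; tanh is used as the activation of every layer (the patent allows sigmoid, tanh, etc.), and the layer sizes and names are illustrative.

```python
import math

def mlp_forward(x, weights, biases):
    """f_theta: compose H layers, out_h = f_h(weight_h . out_{h-1} + bias_h).

    weights[h] is a matrix given as a list of rows, biases[h] a vector;
    this is a plain-Python stand-in for the multi-layer perceptron with
    tanh as every layer's nonlinear activation function.
    """
    out = x
    for W, b in zip(weights, biases):
        out = [math.tanh(sum(w_ij * o for w_ij, o in zip(row, out)) + b_i)
               for row, b_i in zip(W, b)]
    return out
```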
In the present invention, the aim is that the Euclidean distance between similar points in the representation space be as short as possible, while the Euclidean distance between dissimilar points be as long as possible. In its basic form, Epos denotes the shortest Euclidean distance, Eneg the longest Euclidean distance, and c the hop count.
In the present invention, δ in the triple (papera, papero, δ) is the mark of whether the triple is a positive or a negative sample: positive samples are regarded as points that need to be close in the space, and negative samples may be regarded as points that need to be kept as far apart in the space as possible. Therefore, for this application, the present invention uses δ to merge the positive and negative samples into the loss function of the Euclidean distance over the paper distribution.

Here m is the identification number of any triple in the shuffled queue Qshuf, the first arbitrary paper node of triple m is papera(m), the second arbitrary paper node is papero(m), and δ(m) is the positive/negative mark of triple m. L denotes the overall loss function, which is the sum of the loss functions of all elements of the shuffled sample sequence Qshuf.
In the present invention, the proportions of positive and negative samples differ, and so does the similarity within them: positive samples, being connected by an edge, tend to be more alike, while negative samples differ greatly, so the losses generated by positive and by negative samples will differ. For this application the present invention therefore needs a tuning parameter γ to balance the loss functions of the positive and negative samples, and γ is added to the loss function.
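The distance and loss formulas are reproduced only as images in the original; a standard contrastive form consistent with the surrounding definitions (Epos pulling positive pairs together, Eneg pushing negative pairs apart by a margin, δ selecting the term, γ balancing the negative term) would read, as an assumption rather than a transcription:

```latex
E_{pos}^{(m)} = \left\lVert f_\theta\!\left(paper_a^{(m)}\right) - f_\theta\!\left(paper_o^{(m)}\right) \right\rVert_2^{2}
\qquad
E_{neg}^{(m)} = \max\!\left(0,\; c - \left\lVert f_\theta\!\left(paper_a^{(m)}\right) - f_\theta\!\left(paper_o^{(m)}\right) \right\rVert_2 \right)^{2}

L = \sum_{m}\left[ \mathbb{1}\!\left[\delta^{(m)} = +1\right] E_{pos}^{(m)} \;+\; \gamma\, \mathbb{1}\!\left[\delta^{(m)} = -1\right] E_{neg}^{(m)} \right]
```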
In the present invention, the purpose of training the neural network is to reduce the value of the loss function to a minimum; in order to train the neural network and determine the values of the neural network weights and biases, the present invention uses the stochastic gradient descent algorithm to learn the network parameters.
In the present invention, training the model means determining the nonlinear transformation function fθ by the stochastic gradient descent algorithm, since fθ mainly comprises the weight parameters WEIGHT and the bias parameters BIAS. For the weight parameters WEIGHT and the bias parameters BIAS, the update value of each gradient descent step is the partial derivative of L with respect to WEIGHT and BIAS; therefore, at each iteration, WEIGHT and BIAS are updated from the parameter update values with learning rate ε:

WEIGHTafter = WEIGHTbefore - ε·ΔWEIGHT
BIASafter = BIASbefore - ε·ΔBIAS

WEIGHTbefore is the weight parameter before the update, WEIGHTafter is the weight parameter after the update, and ΔWEIGHT is the partial derivative of L with respect to the weight parameter WEIGHT at this gradient step.

BIASbefore is the bias parameter before the update, BIASafter is the bias parameter after the update, and ΔBIAS is the partial derivative of L with respect to the bias parameter BIAS at this gradient step.
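The parameter update of each iteration can be sketched as follows; the gradient computation itself is omitted, and the names are illustrative.

```python
def sgd_step(weights, biases, grad_w, grad_b, eps=0.01):
    """One stochastic-gradient-descent iteration:
    WEIGHT_after = WEIGHT_before - eps * dL/dWEIGHT,
    BIAS_after   = BIAS_before   - eps * dL/dBIAS,
    applied elementwise to every layer's parameters."""
    new_w = [[[w - eps * g for w, g in zip(row, grow)]
              for row, grow in zip(W, G)]
             for W, G in zip(weights, grad_w)]
    new_b = [[b - eps * g for b, g in zip(B, G)]
             for B, G in zip(biases, grad_b)]
    return new_w, new_b
```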
When stochastic gradient descent is used, overfitting can appear because the number of training iterations is too large; the present invention therefore adopts the early-stopping method: training stops as soon as the loss function L no longer continues to decrease, preventing the overfitting that occurs during training. "Early stopping" is covered in section 7.8, page 151 of Deep Learning, by Ian Goodfellow, Yoshua Bengio, et al., translated by Zhao Shenjian, Li Yujun, et al.; first edition, 1 August 2017.
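The early-stopping rule described above, stopping as soon as L no longer decreases, can be sketched as follows; the single-step patience is an assumption, and the names are illustrative.

```python
def train_with_early_stopping(step_fn, max_iters=1000):
    """Run training iterations until the loss stops decreasing.

    step_fn: performs one training iteration and returns the current
    loss L.  Stops after the first iteration whose loss fails to improve
    on the best loss seen so far, preventing overfitting caused by too
    many training iterations.
    """
    best = float("inf")
    for it in range(max_iters):
        loss = step_fn()
        if loss >= best:          # L no longer decreasing: stop training
            return it, best
        best = loss
    return max_iters, best
```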
In the present invention, the weight parameters WEIGHT and bias parameters BIAS of each layer of the perceptron are saved, and the target nonlinear transformation function fθ is learned, completing the MLP-based neural network training; finally, the learned target fθ is applied to papera to generate its representation vector, i.e. the MLP neural-network paper probability model built from paper node semantic information.
The parametrized representation learning method for paper network nodes proposed by the present invention comprises the following steps:
Step 1, based on random walk method sample obtain any one paper node neighbours-paper set of node and
The neighbours of neighbours-paper set of node;
Step 101:A paper node empty queue is built, is denoted as V, the V is used for storing paper sequence node;Paper section
The maximum queue element digit of point empty queue V is mv, and the value of mv is 10~20;Then step 102 is performed;
Step 102: choose any paper node paper_a, and put paper_a into the 1st position of the paper node queue V; then perform step 103;
Step 103: acquire the set of all neighbour paper nodes belonging to the paper node paper_a. In the present invention, a neighbour paper node of paper_a is any paper node connected to paper_a by an edge; then perform step 104;
Step 104: according to the total number B of neighbour nodes in the neighbour paper node set, determine the probability of jumping to each neighbour paper node (called the first jump probability), where c represents the hop count; then perform step 105;
Step 105: using the alias sampling algorithm (alias sampling), obtain the next-hop neighbour paper node from the neighbour paper node set according to the current jump probability, and put it into the 2nd position of the paper node queue V; then perform step 106;
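Alias sampling, as named in step 105, allows O(1) draws from an arbitrary discrete distribution after an O(n) table build. The following is a generic sketch of the standard alias method, not the patent's own code:

```python
import random

def build_alias_table(probs):
    """Build accept/alias tables for O(1) sampling from `probs` (sums to 1)."""
    n = len(probs)
    scaled = [p * n for p in probs]
    alias, accept = [0] * n, [0.0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        accept[s], alias[s] = scaled[s], l   # s keeps prob accept[s], overflow goes to l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:                  # leftovers are exactly full cells
        accept[i] = 1.0
    return accept, alias

def alias_draw(accept, alias, rng=random):
    """Draw one index: pick a cell uniformly, then accept it or take its alias."""
    i = rng.randrange(len(accept))
    return i if rng.random() < accept[i] else alias[i]
```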
Step 106: acquire the set of all neighbour paper nodes belonging to that neighbour paper node, i.e. the neighbours-of-neighbours paper node set; then perform step 107;
Step 107: calculate the shortest hop count between the neighbour paper node and the paper node paper_a; then perform step 108;
In the present invention, the shortest hop count represents the fewest-hops distance from any neighbour paper node to the previous paper node. For example, if a neighbour paper node needs at least 1 hop to reach the paper node paper_a, its shortest hop count is 1; if the neighbour paper node is paper_a itself, its shortest hop count is 0; and so on.
Step 108: according to the shortest hop count, determine the probability of jumping to each neighbour paper node (called the second jump probability); then perform step 109;
In the second jump probability, c represents the hop count.
In the present invention, the shortest hop count refers to the least number of hops needed between two paper nodes.
In the present invention, p is the parameter (called the jump-out parameter) that, in the random walk method, adjusts the size of the second jump probability for paper nodes not in the paper node queue V, and q is the parameter (called the jump-in parameter) that adjusts the size of the second jump probability for paper nodes already in the paper node queue V. Together, p and q control the jump probabilities: if the random walk should jump more locally, p needs to be set larger; conversely, q needs to be set larger.
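A sketch of a second-order jump probability in the spirit of step 108, where the unnormalized weight of each candidate neighbour depends on its shortest hop count to the previous node. The exact placement of p and q below follows the node2vec convention and is an assumption about the patent's (unreproduced) formula:

```python
def second_jump_probs(prev, cur, neighbors, hop_count, p, q):
    """Normalized second jump probabilities from node `cur`, given `prev`.

    neighbors(cur): neighbour list of cur (assumed helper).
    hop_count(x, prev): shortest hop count between candidate x and prev,
    i.e. 0 (x is prev), 1, or 2 (assumed helper).
    """
    weights = []
    for x in neighbors(cur):
        d = hop_count(x, prev)
        if d == 0:
            weights.append(1.0 / p)   # returning to the previous node
        elif d == 1:
            weights.append(1.0)       # x is also a neighbour of prev
        else:
            weights.append(1.0 / q)   # d == 2: x moves away from prev
    total = sum(weights)
    return [w / total for w in weights]
```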
Step 109: after the second jump probability is determined, select the next-hop paper node according to it by alias sampling, and put that node into the 3rd position of the paper node queue V; then perform step 110;
Step 110: cyclically perform steps 106 to 109 until the number of elements in the paper node queue V reaches mv, at which point this random walk stops; then perform step 111;
Step 111: in the present invention, steps 101 to 110 are repeated for each paper node in the entire paper network, so as to complete the sampling of the neighbour nodes of every paper node, yielding the paper node queue set denoted VF = {V1, V2, ..., Vf, ..., VF}; then perform step 201.
V1 represents the first paper node queue;
V2 represents the second paper node queue;
Vf represents any paper node queue, where f is the identification number of the paper node queue;
VF represents the last paper node queue, where F is the total number of paper node queues in the set, f ∈ {1, ..., F}.
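Steps 101 to 111 can be sketched as follows. For brevity this sketch uses uniform first-order jumps throughout, whereas the patent switches to the second jump probability after the 2nd queue position; the helper names are assumptions:

```python
import random

def generate_walks(neighbors, nodes, mv=10, seed=0):
    """One truncated random walk (queue V of length at most mv) per paper node.

    neighbors(n): neighbour list of node n (assumed helper).
    nodes: iterable of all paper nodes in the network.
    """
    rng = random.Random(seed)
    walks = []  # the queue set VF
    for start in nodes:
        walk = [start]                 # step 102: put paper_a in position 1
        while len(walk) < mv:          # step 110: stop at mv elements
            nbrs = neighbors(walk[-1])
            if not nbrs:
                break                  # isolated node: walk cannot continue
            walk.append(rng.choice(nbrs))
        walks.append(walk)
    return walks
```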
Step 2: generate the neural network training data of the multi-layer perceptron using the negative sampling method;
In the present invention, the training data usable for the neural network is the paper node queue set VF = {V1, V2, ..., Vf, ..., VF} obtained in step 1; in addition to the training data in the paper node queue set, the present invention generates the data needed for training the model by a negative sampling algorithm.
Step 201: establish a positive sample queue Qp and a negative sample queue Qn, storing respectively the positive sampled data and the negative sampled data required for training the neural network, then perform step 202;
Step 202: set up the neighbour window size hyper-parameter WD. For a window of size WD within a paper node queue Vf, each paper belonging to the paper node queue Vf is denoted accordingly; then perform step 203;
The first expression denotes the first paper node belonging to any paper node queue Vf;
the second expression denotes the second paper node belonging to any paper node queue Vf;
the general expression denotes any paper node belonging to any paper node queue Vf (called an arbitrary queue-paper node), where g is the identification number of the neighbour paper node;
the last expression denotes the last paper node belonging to any paper node queue Vf, where G is the length of the paper node queue Vf, g ∈ {1, ..., G}.
One expression denotes the node of minimum identification number among the adjacent paper nodes;
another denotes the node of maximum identification number among the adjacent paper nodes;
a third denotes any adjacent paper node other than those two, called a queue-adjacent paper node; the subscript l denotes an identification number that is neither the maximum nor the minimum, i.e. any identifier other than those 2 paper nodes.
Step 203: for any arbitrary queue-paper node, sample its neighbours from small to large according to the order of their identification numbers; the sampling process forms one triple from each node in the window together with the arbitrary queue-paper node; then perform step 204;
The minimum-identification adjacent node and the arbitrary queue-paper node form a triple whose third element is δ, where δ = +1 indicates the triple is a positive sample and δ = −1 indicates the triple is a negative sample; the triple is inserted into the positive sample queue Qp.
The maximum-identification adjacent node and the arbitrary queue-paper node likewise form a triple, with the same definition of δ, which is inserted into the positive sample queue Qp.
Each queue-adjacent paper node and the arbitrary queue-paper node likewise form a triple, with the same definition of δ, which is inserted into the positive sample queue Qp.
Step 204: cyclically perform steps 202 and 203 until all paper nodes in all paper node queues of the paper node queue set VF = {V1, V2, ..., Vf, ..., VF} have completed the neighbour paper node sampling operation, yielding the positive sample queue Qp; then perform step 207;
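Steps 202 to 204 amount to emitting (node, context, +1) triples within a window of size WD over each queue; a sketch under that reading (function name assumed):

```python
def positive_triples(walks, wd):
    """Build the positive sample queue Qp from the paper node queue set.

    walks: list of paper node queues (the set VF).
    wd: neighbour window size hyper-parameter WD.
    """
    qp = []
    for walk in walks:
        for i, u in enumerate(walk):
            lo, hi = max(0, i - wd), min(len(walk), i + wd + 1)
            for j in range(lo, hi):
                if j != i:                       # skip the node itself
                    qp.append((u, walk[j], +1))  # delta = +1: positive sample
    return qp
```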
Step 205: sample over all paper nodes in the network. Each time, choose any two paper nodes from the network (the two chosen paper nodes may be adjacent or non-adjacent), namely a first arbitrary paper node paper_a and a second arbitrary paper node paper_o. If an edge exists between the two paper nodes ((paper_a, paper_o) ∈ E), or the two randomly chosen paper nodes are identical (paper_a = paper_o), repeat this step; otherwise, form the triple (paper_a, paper_o, −1) from the two paper nodes paper_a and paper_o and deposit it into the negative sample queue Qn; then perform step 206;
Step 206: cyclically perform step 205. Set up a positive-negative sample ratio parameter μ; assuming the number of triples in the positive sample queue Qp is np, stop when the number of triples in Qn equals μ × np; then perform step 207;
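Steps 205 and 206 can be sketched as rejection sampling of distinct, non-adjacent node pairs until μ × np negative triples are collected; the function below is an illustrative sketch, not the patent's code:

```python
import random

def negative_triples(nodes, edges, num, rng=random):
    """Build the negative sample queue Qn with `num` = mu * np triples.

    nodes: list of all paper nodes; edges: iterable of (a, b) pairs in E.
    Pairs that are adjacent or identical are resampled (step 205).
    """
    edge_set = {frozenset(e) for e in edges}
    qn = []
    while len(qn) < num:
        a, o = rng.choice(nodes), rng.choice(nodes)
        if a == o or frozenset((a, o)) in edge_set:
            continue                 # adjacent or identical: repeat the step
        qn.append((a, o, -1))        # delta = -1: negative sample
    return qn
```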
Step 207: merge the positive sample queue Qp obtained in step 204 with the negative sample queue Qn obtained in step 206 to obtain a new sample queue Q_new = {Q1, ..., Q(1+μ)×np}; then perform step 208;
Q1 represents the triple of minimum identification number in the new sample queue Q_new.
Q(1+μ)×np represents the triple of maximum identification number in the new sample queue Q_new; the subscript (1+μ) × np indicates that the sample queue Q_new contains (1+μ) × np triples.
Step 208: shuffle all elements of the new sample queue Q_new = {Q1, ..., Q(1+μ)×np} to obtain the out-of-order sample queue Q_shuffled = {Q1, ..., Q(1+μ)×np}; then perform step 301.
Step 3: processing in the neural network paper probabilistic model based on the multi-layer perceptron;
Step 301: from the out-of-order sample queue Q_shuffled = {Q1, ..., Q(1+μ)×np} obtained in step 208, select one triple (paper_a, paper_o, δ) at a time and put it, as a pair of paper nodes, into the neural network paper probabilistic model for learning; perform step 302;
Step 302: for the two paper nodes paper_a and paper_o in each triple, map them using the model to obtain the two corresponding transformed vectors; perform step 303;
one is the multi-layer perceptron function belonging to paper_a;
the other is the multi-layer perceptron function belonging to paper_o;
Step 303: calculate the Euclidean distance between the two paper nodes; perform step 304;
In the present invention, the purpose of the twin (Siamese) network is to make the Euclidean distance between similar points in the representation space as short as possible, and the Euclidean distance between dissimilar points as long as possible. Its basic form is:
E_pos represents the shortest Euclidean distance; E_neg represents the longest Euclidean distance; c represents the hop count.
Step 304: using δ, merge the positive and negative samples into the loss function of the Euclidean distance over the paper representations, and compute the loss function balancing positive and negative samples, obtaining the overall loss function L; perform step 305;
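A hedged sketch of the loss in steps 303 and 304: a standard contrastive (Siamese) loss that pulls positive pairs (δ = +1) together and pushes negative pairs (δ = −1) at least a margin apart. The margin value and the exact balancing term used by the patent are assumptions:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two representation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(triples, margin=1.0):
    """Overall loss L over (vec_a, vec_o, delta) triples.

    Positive pairs contribute d^2 (short E_pos preferred); negative pairs
    contribute max(0, margin - d)^2 (long E_neg preferred). `margin` is an
    assumed hyper-parameter.
    """
    total = 0.0
    for va, vo, delta in triples:
        d = euclidean(va, vo)
        if delta == +1:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
    return total
```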
Step 305: determine the non-linear transform function f_θ using the stochastic gradient descent algorithm, completing the representation learning of any two paper nodes paper_a and paper_o.
Embodiment 1
This embodiment uses the Cora paper dataset and the Pubmed knowledge network dataset for learning and experimental work.
Cora is a paper dataset containing 2708 paper nodes in total, with 2708 nodes and 5429 edges. Each node corresponds to a paper rich-text information vector of length 1433, in which a 0/1 entry indicates whether a word is present. Meanwhile, each node is associated with a category attribute; there are 7 category attribute values in total.
Pubmed is a knowledge network dataset containing 19717 paper nodes in total, with 19717 nodes and 44338 edges. Each node corresponds to a paper rich-text information vector of length 500, in which a 0/1 entry indicates whether a word is present. Meanwhile, each node is associated with a category attribute; there are 3 category attribute values in total.
To verify validity, the present invention mainly compares the performance of different methods on the paper node classification task:
DeepWalk: samples the network with an ordinary random walk algorithm, then obtains the representation of each node in the network with the word2vec algorithm. (Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. KDD 2014: 701-710.)
TADW: factorizes the random walk of DeepWalk and cleverly adds the rich-text information of nodes, obtaining the representation of each node in the network by matrix multiplication. (Yang C, Liu Z, Zhao D, et al. Network representation learning with rich text information. International Conference on Artificial Intelligence, AAAI Press, 2015: 2111-2117.)
Node2Vec: an upgraded version of DeepWalk that samples the network with a second-order random walk algorithm, then obtains the representation of each node in the network with the word2vec algorithm. (Grover A, Leskovec J. node2vec: Scalable Feature Learning for Networks. KDD 2016: 855-864.)
The present invention uses the node classification prediction task to compare the effectiveness of the vector representations. This experiment uses cross-validation and selects an SVM classifier among the different classification prediction methods.
The present invention employs two evaluation metrics, Micro-F1 and Macro-F1.
The computation of Macro-F1 is:
Macro-F1 = 2 · P_macro · R_macro / (P_macro + R_macro),
where P_macro and R_macro represent the macro precision and the macro recall, respectively.
The computation of Micro-F1 is:
Micro-F1 = 2 · P_micro · R_micro / (P_micro + R_micro),
where P_micro and R_micro represent the micro precision and the micro recall, respectively.
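Both metrics can be computed from per-class true positive / false positive / false negative counts; a sketch following the formulas above (the counting convention is an assumption):

```python
def _f1(p, r):
    """F1 = 2PR / (P + R), with the degenerate case mapped to 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def macro_micro_f1(per_class):
    """Return (Macro-F1, Micro-F1) from a list of (tp, fp, fn) per category.

    Macro: average precision and recall over classes, then combine.
    Micro: pool the counts over classes, then combine.
    """
    precs, recs = [], []
    TP = FP = FN = 0
    for tp, fp, fn in per_class:
        precs.append(tp / (tp + fp) if tp + fp else 0.0)
        recs.append(tp / (tp + fn) if tp + fn else 0.0)
        TP, FP, FN = TP + tp, FP + fp, FN + fn
    p_mac = sum(precs) / len(precs)
    r_mac = sum(recs) / len(recs)
    p_mic = TP / (TP + FP) if TP + FP else 0.0
    r_mic = TP / (TP + FN) if TP + FN else 0.0
    return _f1(p_mac, r_mac), _f1(p_mic, r_mic)
```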
The effect on the Cora dataset is shown in Figures 2 and 3, which compare the present invention with the other methods on the Cora dataset. Figure 2 shows the performance of each method under the Micro-F1 evaluation metric, and Figure 3 the performance under the Macro-F1 evaluation metric. The horizontal axis of both figures is the percentage of the total data used as training data for the classifier. The figures show that under both Micro-F1 and Macro-F1 the method of the present invention performs better than the several other network representation learning methods. In particular, compared with the DeepWalk and Node2vec algorithms, which rely purely on the network information without using the semantic information of network nodes, the algorithm of the present invention achieves an improvement of more than 5% under Micro-F1 and Macro-F1 at every training data ratio. This shows that, after fusing network node information with the network topology, the network node representation vectors obtained by the present invention are significantly better than those obtained purely from network topology information. Meanwhile, compared with TADW, a method that also combines network node information and network topology information, the method proposed by the present invention still achieves a 3% improvement on both evaluation metrics.
The effect on the Wiki dataset is shown in Figures 4 and 5. The figures show that the present invention performs better than the several other network representation learning methods under both the Micro-F1 and Macro-F1 evaluation metrics. Because the number of categories in the Wiki dataset far exceeds that of the Cora dataset, the DeepWalk and Node2vec algorithms, which do not use the semantic information of network nodes, classify poorly, far below the results obtained with TADW; this indicates that semantics plays the dominant role in this dataset. Under the Micro-F1 and Macro-F1 evaluation metrics, the method of the present invention achieves a 2% improvement over the experimental results of the TADW method. This shows that, after fusing network node information with the network topology, the network node representation vectors of the present invention are better than representation vectors obtained by directly applying matrix multiplication, and illustrates that the present invention can better fuse network node representations with semantic information in the joint network information, obtaining better representation vectors.
The analysis of Figures 2-5 shows that these experiments embody the present invention's natural fusion of both structural and semantic information into the representation vectors, yielding better network node representation vectors and thereby verifying the effectiveness of the invention.
Claims (6)
1. A parameterized paper network node representation learning method, characterized by comprising the following steps:
Step 1: based on the random walk method, sample the neighbour paper node set of any paper node, and the neighbour paper node sets of those neighbours;
Step 101: build an empty paper node queue, denoted V, used for storing a paper node sequence; the maximum number of queue elements of the paper node queue V is mv, where mv takes a value of 10 to 20; then perform step 102;
Step 102: choose any paper node paper_a, and put paper_a into the 1st position of the paper node queue V; then perform step 103;
Step 103: acquire the set of all neighbour paper nodes belonging to the paper node paper_a, where a neighbour paper node is any paper node connected to paper_a by an edge; then perform step 104;
Step 104: according to the total number B of neighbour nodes in the neighbour paper node set, determine the first jump probability, where c represents the hop count; then perform step 105;
Step 105: using the alias sampling algorithm (alias sampling), obtain the next-hop neighbour paper node from the neighbour paper node set according to the current jump probability, and put it into the 2nd position of the paper node queue V; then perform step 106;
Step 106: acquire the set of all neighbour paper nodes belonging to that neighbour paper node, i.e. the neighbours-of-neighbours paper node set; then perform step 107;
Step 107: calculate the shortest hop count between the neighbour paper node and the paper node paper_a; then perform step 108;
where the shortest hop count represents the fewest-hops distance from any neighbour paper node to the previous paper node;
Step 108: according to the shortest hop count, determine the second jump probability; then perform step 109;
in the second jump probability, c represents the hop count;
Step 109: after the second jump probability is determined, select the next-hop paper node according to it by alias sampling, and put that node into the 3rd position of the paper node queue V; then perform step 110;
Step 110: cyclically perform steps 106 to 109 until the number of elements in the paper node queue V reaches mv, at which point this random walk stops; then perform step 111;
Step 111: repeat steps 101 to 110 for each paper node in the entire paper network, so as to complete the neighbour node sampling of every paper node, yielding the paper node queue set denoted VF = {V1, V2, ..., Vf, ..., VF}; then perform step 201;
V1 represents the first paper node queue;
V2 represents the second paper node queue;
Vf represents any paper node queue, where f is the identification number of the paper node queue;
VF represents the last paper node queue, where F is the total number of paper node queues in the set, f ∈ {1, ..., F}.
Step 2: generate the neural network training data of the multi-layer perceptron using the negative sampling method;
Step 201: establish a positive sample queue Qp and a negative sample queue Qn, storing respectively the positive sampled data and the negative sampled data required for training the neural network, then perform step 202;
Step 202: set up the neighbour window size hyper-parameter WD; for a window of size WD within a paper node queue Vf, each paper belonging to the paper node queue Vf is denoted accordingly; then perform step 203;
the first expression denotes the first paper node belonging to any paper node queue Vf;
the second expression denotes the second paper node belonging to any paper node queue Vf;
the general expression denotes any paper node belonging to any paper node queue Vf, where g is the identification number of the neighbour paper node;
the last expression denotes the last paper node belonging to any paper node queue Vf, where G is the length of the paper node queue Vf, g ∈ {1, ..., G}.
For the nodes in any paper queue, the present invention considers all nodes whose distance in the queue from a given node is less than WD to be positive sample nodes. Each time, for any paper node, the present invention first obtains the 2 × WD nodes belonging to its adjacent paper node set, among which:
one expression denotes the node of minimum identification number among the adjacent paper nodes;
another denotes the node of maximum identification number among the adjacent paper nodes;
a third denotes any queue-adjacent paper node among the adjacent paper nodes other than those two, where the subscript l denotes an identification number that is neither the maximum nor the minimum;
Step 203: for any arbitrary queue-paper node, sample its neighbours from small to large according to the order of their identification numbers; the sampling process forms one triple from each node in the window together with the arbitrary queue-paper node; then perform step 204;
the minimum-identification adjacent node and the arbitrary queue-paper node form a triple whose third element is δ, where δ = +1 indicates the triple is a positive sample and δ = −1 indicates the triple is a negative sample, and the triple is inserted into the positive sample queue Qp;
the maximum-identification adjacent node and the arbitrary queue-paper node likewise form a triple, with the same definition of δ, which is inserted into the positive sample queue Qp;
each queue-adjacent paper node and the arbitrary queue-paper node likewise form a triple, with the same definition of δ, which is inserted into the positive sample queue Qp;
Step 204: cyclically perform steps 202 and 203 until all paper nodes in all paper node queues of the paper node queue set VF = {V1, V2, ..., Vf, ..., VF} have completed the neighbour paper node sampling operation, yielding the positive sample queue Qp; then perform step 207;
Step 205: sample over all paper nodes in the network; each time, choose any two paper nodes from the network, namely a first arbitrary paper node paper_a and a second arbitrary paper node paper_o; if an edge exists between the two paper nodes, or the two randomly chosen paper nodes are identical, repeat this step; otherwise, form the triple (paper_a, paper_o, −1) from the two paper nodes paper_a and paper_o and deposit it into the negative sample queue Qn; then perform step 206;
Step 206: cyclically perform step 205; set up a positive-negative sample ratio parameter μ; assuming the number of triples in the positive sample queue Qp is np, stop when the number of triples in Qn equals μ × np; then perform step 207;
Step 207: merge the positive sample queue Qp obtained in step 204 with the negative sample queue Qn obtained in step 206 to obtain a new sample queue Q_new = {Q1, ..., Q(1+μ)×np}; then perform step 208;
Q1 represents the triple of minimum identification number in the new sample queue Q_new;
Q(1+μ)×np represents the triple of maximum identification number in the new sample queue Q_new, the subscript (1+μ) × np indicating that the sample queue Q_new contains (1+μ) × np triples;
Step 208: shuffle all elements of the new sample queue Q_new = {Q1, ..., Q(1+μ)×np} to obtain the out-of-order sample queue Q_shuffled = {Q1, ..., Q(1+μ)×np}; then perform step 301.
Step 3: processing in the neural network paper probabilistic model based on the multi-layer perceptron;
Step 301: from the out-of-order sample queue Q_shuffled = {Q1, ..., Q(1+μ)×np} obtained in step 208, select one triple (paper_a, paper_o, δ) at a time and put it, as a pair of paper nodes, into the neural network paper probabilistic model for learning; perform step 302;
Step 302: for the two paper nodes paper_a and paper_o in each triple, map them using the model to obtain the two corresponding transformed vectors; perform step 303;
one is the multi-layer perceptron function belonging to paper_a;
the other is the multi-layer perceptron function belonging to paper_o;
Step 303: calculate the Euclidean distance between the two paper nodes; perform step 304;
the Euclidean distance is of the basic form in which E_pos represents the shortest Euclidean distance, E_neg represents the longest Euclidean distance, and c represents the hop count;
Step 304: using δ, merge the positive and negative samples into the loss function of the Euclidean distance over the paper representations, and compute the loss function balancing positive and negative samples, obtaining the overall loss function L; perform step 305;
Step 305: determine the non-linear transform function f_θ using the stochastic gradient descent algorithm, completing the representation learning of any two paper nodes paper_a and paper_o.
2. The parameterized paper network node representation learning method according to claim 1, characterized in that building the neural network paper probabilistic model of the multi-layer perceptron comprises: (A) setting the neural network paper probabilistic model expression; (B) selecting any two paper nodes paper_a and paper_o from AP = {paper1, paper2, ..., paper_a, ..., paper_o, ..., paperA} and mapping paper_a and paper_o, obtaining respectively the multi-layer perceptron function belonging to paper_a and the multi-layer perceptron function belonging to paper_o; (C) calculating the Euclidean distance between paper_a and paper_o from those functions, and processing the loss function balancing positive and negative samples; (D) processing the weight parameter WEIGHT and offset parameter BIAS belonging to f_θ with the stochastic gradient descent algorithm to obtain the non-linear transform function f_θ of the learning objective, and traversing all triples to obtain the neural network training based on the multi-layer perceptron for the semantic information of paper nodes.
3. The parameterized paper network node representation learning method according to claim 1, characterized in that steps 103, 104 and 105 realize the acquisition of the 2nd element of the paper node queue V.
4. The parameterized paper network node representation learning method according to claim 1, characterized in that steps 106 to 110 realize the acquisition of the elements after the 2nd element of the paper node queue V, until mv is reached.
5. The parameterized paper network node representation learning method according to claim 1, characterized in that the paper node classification experiment effect on the Cora dataset is improved by 3%.
6. The parameterized paper network node representation learning method according to claim 1, characterized in that the paper node classification experiment effect on the Wiki dataset is improved by 2%.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711308050.6A CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711308050.6A CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108228728A true CN108228728A (en) | 2018-06-29 |
CN108228728B CN108228728B (en) | 2020-07-17 |
Family
ID=62653503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711308050.6A Active CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108228728B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213831A (en) * | 2018-08-14 | 2019-01-15 | 阿里巴巴集团控股有限公司 | Event detecting method and device calculate equipment and storage medium |
CN109376864A (en) * | 2018-09-06 | 2019-02-22 | 电子科技大学 | A kind of knowledge mapping relation inference algorithm based on stacking neural network |
CN109558494A (en) * | 2018-10-29 | 2019-04-02 | 中国科学院计算机网络信息中心 | A kind of scholar's name disambiguation method based on heterogeneous network insertion |
CN111292062A (en) * | 2020-02-10 | 2020-06-16 | 中南大学 | Crowdsourcing garbage worker detection method and system based on network embedding and storage medium |
WO2020248342A1 (en) * | 2019-06-14 | 2020-12-17 | 清华大学 | Hyper-parameter optimization method and apparatus for large-scale network representation learning |
CN112148876A (en) * | 2020-09-23 | 2020-12-29 | 南京大学 | Paper classification and recommendation method |
CN112559734A (en) * | 2019-09-26 | 2021-03-26 | 中国科学技术信息研究所 | Presentation generation method and device, electronic equipment and computer readable storage medium |
CN117648670A (en) * | 2024-01-24 | 2024-03-05 | 润泰救援装备科技河北有限公司 | Rescue data fusion method, electronic equipment, storage medium and rescue fire truck |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130066921A1 (en) * | 2011-09-09 | 2013-03-14 | Sri International | Adaptive ontology |
CN106250438A (en) * | 2016-07-26 | 2016-12-21 | 上海交通大学 | Based on random walk model zero quotes article recommends method and system |
CN106777339A (en) * | 2017-01-13 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of method that author is recognized based on heterogeneous network incorporation model |
CN107451596A (en) * | 2016-05-30 | 2017-12-08 | 清华大学 | A kind of classified nodes method and device |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130066921A1 (en) * | 2011-09-09 | 2013-03-14 | Sri International | Adaptive ontology |
CN107451596A (en) * | 2016-05-30 | 2017-12-08 | 清华大学 | A kind of classified nodes method and device |
CN106250438A (en) * | 2016-07-26 | 2016-12-21 | 上海交通大学 | Based on random walk model zero quotes article recommends method and system |
CN106777339A (en) * | 2017-01-13 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of method that author is recognized based on heterogeneous network incorporation model |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213831A (en) * | 2018-08-14 | 2019-01-15 | 阿里巴巴集团控股有限公司 | Event detecting method and device calculate equipment and storage medium |
CN109376864A (en) * | 2018-09-06 | 2019-02-22 | 电子科技大学 | A kind of knowledge mapping relation inference algorithm based on stacking neural network |
CN109558494A (en) * | 2018-10-29 | 2019-04-02 | 中国科学院计算机网络信息中心 | A kind of scholar's name disambiguation method based on heterogeneous network insertion |
WO2020248342A1 (en) * | 2019-06-14 | 2020-12-17 | 清华大学 | Hyper-parameter optimization method and apparatus for large-scale network representation learning |
CN112559734A (en) * | 2019-09-26 | 2021-03-26 | 中国科学技术信息研究所 | Presentation generation method and device, electronic equipment and computer readable storage medium |
CN112559734B (en) * | 2019-09-26 | 2023-10-17 | 中国科学技术信息研究所 | Brief report generating method, brief report generating device, electronic equipment and computer readable storage medium |
CN111292062A (en) * | 2020-02-10 | 2020-06-16 | 中南大学 | Crowdsourcing garbage worker detection method and system based on network embedding and storage medium |
CN112148876A (en) * | 2020-09-23 | 2020-12-29 | 南京大学 | Paper classification and recommendation method |
CN112148876B (en) * | 2020-09-23 | 2023-10-13 | 南京大学 | Paper classification and recommendation method |
CN117648670A (en) * | 2024-01-24 | 2024-03-05 | 润泰救援装备科技河北有限公司 | Rescue data fusion method, electronic equipment, storage medium and rescue fire truck |
CN117648670B (en) * | 2024-01-24 | 2024-04-12 | 润泰救援装备科技河北有限公司 | Rescue data fusion method, electronic equipment, storage medium and rescue fire truck |
Also Published As
Publication number | Publication date |
---|---|
CN108228728B (en) | 2020-07-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108228728A (en) | A kind of paper network node of parametrization represents learning method | |
Toju et al. | Species-rich networks and eco-evolutionary synthesis at the metacommunity level | |
Kashef et al. | An advanced ACO algorithm for feature subset selection | |
CN112508085B (en) | Social network link prediction method based on perceptual neural network | |
Barman et al. | A Boolean network inference from time-series gene expression data using a genetic algorithm | |
Jokar et al. | Community detection in social networks based on improved label propagation algorithm and balanced link density | |
Cavalcanti et al. | Combining diversity measures for ensemble pruning | |
Li et al. | New dandelion algorithm optimizes extreme learning machine for biomedical classification problems | |
Guendouz et al. | A discrete modified fireworks algorithm for community detection in complex networks | |
Kundu et al. | Fuzzy-rough community in social networks | |
Liang et al. | A novel multiple rule sets data classification algorithm based on ant colony algorithm | |
CN116635866A (en) | Method and system for mining minority class data samples to train a neural network | |
Huang et al. | Identifying node role in social network based on multiple indicators | |
Picard et al. | An application of swarm intelligence to distributed image retrieval | |
CN109308497A (en) | A kind of multidirectional scale dendrography learning method based on multi-tag network | |
Wang et al. | Effective identification of multiple influential spreaders by DegreePunishment | |
Bagavathi et al. | Multi-net: a scalable multiplex network embedding framework | |
Wang et al. | Ppisb: a novel network-based algorithm of predicting protein-protein interactions with mixed membership stochastic blockmodel | |
Bakhshi et al. | Fast evolution of CNN architecture for image classification | |
Nasiri et al. | A node representation learning approach for link prediction in social networks using game theory and K-core decomposition | |
Ma et al. | Discrete and continuous optimization based on hierarchical artificial bee colony optimizer | |
Du et al. | Structural balance in fully signed networks | |
Nadimi-Shahraki et al. | Discrete improved grey wolf optimizer for community detection | |
Lyu et al. | Network intrusion detection based on an efficient neural architecture search | |
Shin et al. | Deep reinforcement learning-based network routing technology for data recovery in exa-scale cloud distributed clustering systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||