CN108228728B - Parameterized thesis network node representation learning method - Google Patents

Parameterized thesis network node representation learning method

Info

Publication number
CN108228728B
CN108228728B (application CN201711308050.6A)
Authority
CN
China
Prior art keywords
paper
node
queue
neighbor
thesis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711308050.6A
Other languages
Chinese (zh)
Other versions
CN108228728A (en)
Inventor
蒲菊华
陈虞君
刘伟
班崟峰
杜佳鸿
熊璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Beihang Emerging Industrial Technology Research Institute
Beihang University
Original Assignee
Shenzhen Beihang Emerging Industrial Technology Research Institute
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Beihang Emerging Industrial Technology Research Institute, Beihang University filed Critical Shenzhen Beihang Emerging Industrial Technology Research Institute
Priority to CN201711308050.6A priority Critical patent/CN108228728B/en
Publication of CN108228728A publication Critical patent/CN108228728A/en
Application granted granted Critical
Publication of CN108228728B publication Critical patent/CN108228728B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3347Query execution using vector based model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Abstract

The invention discloses a parameterized thesis network node representation learning method. First, an empty paper node queue is constructed, and then the neighbor nodes of any paper node, together with the neighbors of those neighbors, are sampled by random walk; the selected paper node is used as the first element of the paper node queue, and the other elements of the queue are then obtained according to the hop probabilities. After all paper nodes have been traversed, a set of paper node queues is obtained. Next, training data for the multilayer perceptron neural network are generated by a positive and negative sampling method. Finally, a neural network paper probability model is used to obtain the nonlinear transformation from paper node semantic information to paper node vector representations, from which the vector representations of the paper nodes are obtained.

Description

Parameterized thesis network node representation learning method
Technical Field
The present invention relates to a representation learning method of a thesis network, and more particularly, to a parameterized representation learning method of a thesis network node.
Background
Social networks are an Internet concept, covering social networking service platforms such as blogs, wikis, tagging, SNS, and RSS. The Internet has brought entirely new forms of human social organization and ways of living, building an enormous group, the network community, that transcends physical geography; human society in the 21st century is gradually taking on new forms and characteristics, and individuals in the networked era are converging into new social groups. A paper network is the networked form of the relations between papers; it expresses, on a network, the mutual citations and shared authorship between papers.
Representation learning for paper networks currently relies mostly on non-parametric models, for example "DeepWalk: Online Learning of Social Representations" (Bryan Perozzi et al., 26 Mar 2014), in which the non-parametric word2vec method is used to learn the representation of the network.
Network structure refers to the physical connectivity of a network. Network topologies take many forms, such as two-dimensional structures including rings, stars, trees, nearest-neighbor meshes, systolic arrays, and the like; see "Analysis of Interconnection Network Structures", edited by Wang Dingxing and Chen Guoliang, October 1990, pages 36-38. As networks have developed, further structures such as honeycomb structures have also appeared.
Existing representation learning methods for paper networks must traverse all papers in the paper network in order to learn the representations of the papers. When a new paper is added to the paper network, representation learning for the new paper cannot be performed, and the classification and analysis of the new paper therefore cannot be completed.
Disclosure of Invention
In order to solve the problem that a newly added paper cannot undergo representation learning, the invention provides a parameterized paper network node representation learning method. In the invention, a star-shaped paper network structure is sampled by means of a random-walk statistical model to obtain paper node information: each sampled paper node queue consists of a series of paper nodes, and the selection of the next paper node at each step is random. After the paper network sampling step, a deep neural network based on a twin (Siamese) network framework is constructed, in which the two identical sub-networks of the twin network are multilayer perceptrons (MLP). The learned multilayer perceptron serves as a nonlinear mapping function, and network node representation vectors are obtained by constructing this nonlinear mapping from the rich text information of a network node to its representation vector.
The invention provides a parameterized thesis network node representation learning method, which is characterized by comprising the following steps of:
The method comprises the following steps: first, the neighbor paper node set of any paper node and the neighbor paper node set of each such neighbor are obtained by sampling based on a random walk method;
Step 101: construct an empty paper node queue, denoted V, which is used for storing a sequence of paper nodes; the maximum number of queue element positions of the empty paper node queue V is mv, with mv taking a value of 10-20; then execute step 102;
Step 102: select any paper node paper_a, and place paper_a at position 1 of the paper node queue V; then execute step 103;
Step 103: obtain the set of all neighbor paper nodes belonging to the paper node paper_a, denoted NB^{paper_a} = {nb_1^{paper_a}, nb_2^{paper_a}, ..., nb_b^{paper_a}, ..., nb_B^{paper_a}}; the neighbor paper nodes are the set of paper nodes that have an edge connecting them to the paper node paper_a; then execute step 104;
Step 104: according to the total number B of neighbor nodes in the neighbor paper node set NB^{paper_a}, determine the first hop probability P^1_c of jumping to each neighbor paper node, where c denotes the hop count; then execute step 105;
Step 105: using the alias sampling algorithm (alias sampling), based on the current first hop probability P^1_c, obtain from NB^{paper_a} the neighbor paper node nb_b^{paper_a} of the next hop, and at the same time place nb_b^{paper_a} at position 2 of the paper node queue V; then execute step 106;
Step 106: obtain the set of all neighbor paper nodes belonging to the neighbor paper node nb_b^{paper_a}, i.e. the neighbor paper node set of the neighbor, denoted NB^{nb_b^{paper_a}} = {nb_1, nb_2, ..., nb_e, ..., nb_E}; then execute step 107;
Step 107: compute the shortest hop count d between each neighbor-of-neighbor paper node nb_e and the paper node paper_a; then execute step 108;
here d represents the minimum hop distance from any such neighbor paper node to the previous paper node;
Step 108: according to d, determine the second hop probability P^2_c with which nb_b^{paper_a} jumps to each of its neighbor paper nodes; then execute step 109;
the second hop probability is P^2_c, where c denotes the hop count;
Step 109: once P^2_c has been determined, select, according to P^2_c and alias sampling, a node nb_e as the next-hop paper node, and at the same time place nb_e at position 3 of the paper node queue V; then execute step 110;
Step 110: execute step 106 to step 109 in a loop until the number of positions filled in the paper node queue V is mv, and stop the random walk; then execute step 111;
Step 111: repeat step 101 to step 110 for every paper node in the whole paper network to complete the neighbor node sampling of the paper nodes, giving a set of paper node queues denoted VF = {V_1, V_2, ..., V_f, ..., V_F}; then execute step 201;
V_1 represents the first paper node queue;
V_2 represents the second paper node queue;
V_f represents any paper node queue, and f represents the identification number of the paper node queue;
V_F represents the last paper node queue, F represents the total number of paper node queues in the set, f ∈ F;
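For illustration, this sampling stage can be sketched in Python as follows. The adjacency-list representation, the uniform first-hop weights, and the node2vec-style second-hop weighting driven by the hop-out parameter p and the hop-in parameter q introduced later in the description are assumptions made here for concreteness; the patent states the hop probabilities only symbolically, and the function names are illustrative.

    import random

    def random_walk_queue(graph, start, mv=10, p=1.0, q=1.0):
        """Sample one paper node queue V of length mv starting from `start`.

        graph maps each paper node to the set of its neighbor paper nodes;
        p and q adjust the second hop probabilities (assumed node2vec-style).
        """
        V = [start]
        while len(V) < mv:
            prev, cur = (V[-2], V[-1]) if len(V) >= 2 else (None, V[-1])
            neighbors = list(graph[cur])
            if not neighbors:
                break
            if prev is None:
                weights = [1.0] * len(neighbors)      # first hop: uniform over the B neighbors
            else:
                weights = []
                for nb in neighbors:
                    if nb == prev:                    # shortest hop count d = 0
                        weights.append(1.0 / p)
                    elif nb in graph[prev]:           # d = 1
                        weights.append(1.0)
                    else:                             # d = 2
                        weights.append(1.0 / q)
            # the weighted draw plays the role of the alias sampling step
            V.append(random.choices(neighbors, weights=weights, k=1)[0])
        return V

    def sample_all_queues(graph, mv=10, p=1.0, q=1.0):
        """Step 111: one queue per paper node, giving the queue set VF."""
        return [random_walk_queue(graph, node, mv, p, q) for node in graph]

In practice the weighted draw would be backed by a precomputed alias table so that each hop costs O(1); the behaviour is otherwise the same.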
Generating the neural network training data of the multilayer perceptron by adopting a negative sampling method;
Step 201: establish a positive sample queue Q_p and a negative sample queue Q_n, which store, respectively, the positive and negative sampling data required for training the neural network; then execute step 202;
Step 202: set a neighbor window size hyperparameter WD; for a paper node queue V_f, each paper node belonging to the queue V_f is denoted nd_g^{V_f}, so that V_f = {nd_1^{V_f}, nd_2^{V_f}, ..., nd_g^{V_f}, ..., nd_G^{V_f}}; then execute step 203;
nd_1^{V_f} denotes the first paper node belonging to any paper node queue V_f;
nd_2^{V_f} denotes the second paper node belonging to any paper node queue V_f;
nd_g^{V_f} denotes the g-th paper node belonging to any paper node queue V_f, where g denotes the identification number of the neighbor paper node;
nd_G^{V_f} denotes the last paper node belonging to any paper node queue V_f, where G denotes the length of the paper node queue V_f, g ∈ G;
For any node in a paper node queue, all nodes whose distance to that node within the queue is smaller than WD are regarded as positive sample nodes. Each time, for any paper node nd_g^{V_f}, the invention first obtains the set of the 2 × WD paper nodes adjacent to it in the queue, denoted W_g = {w_{g-WD}, ..., w_l, ..., w_{g+WD}};
w_{g-WD} denotes the node with the smallest identification number in the set W_g of adjacent paper nodes;
w_{g+WD} denotes the node with the largest identification number in the set W_g of adjacent paper nodes;
w_l denotes a queue-adjacent paper node in W_g other than w_{g-WD} and w_{g+WD}; the subscript l denotes the identification number of a paper node that is neither the largest nor the smallest;
Step 203: for any queue paper node nd_g^{V_f}, sample in order of neighbor identification number from small to large; the sampling process combines each node in W_g with the queue paper node nd_g^{V_f} to form a triple, and then executes step 204;
the node w_{g-WD} and the queue paper node nd_g^{V_f} form a triple, i.e. (w_{g-WD}, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_{g-WD}, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p;
the node w_l and the queue paper node nd_g^{V_f} form a triple, i.e. (w_l, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_l, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p;
the node w_{g+WD} and the queue paper node nd_g^{V_f} form a triple, i.e. (w_{g+WD}, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_{g+WD}, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p;
Step 204: execute step 202 and step 203 in a loop until all the paper nodes in all the paper node queues of the paper node queue set VF = {V_1, V_2, ..., V_f, ..., V_F} have completed the sampling of their neighboring paper nodes, yielding the positive sample queue Q_p; then execute step 205;
Step 205: sample over all paper nodes in the network, each time selecting any two paper nodes from the network, namely a first paper node paper_a and a second paper node paper_o; if a connecting edge exists between the two paper nodes, or the two randomly selected paper nodes are identical, repeat this step; otherwise, form the triple (paper_a, paper_o, -1) from the two paper nodes paper_a and paper_o and store it in the negative sample queue Q_n; then execute step 206;
Step 206: execute step 205 in a loop, and establish a positive/negative sample ratio parameter μ; assuming the positive sample queue Q_p contains np triples, stop when the number of triples in Q_n equals μ × np, and then execute step 207;
Step 207: combine the positive sample queue Q_p obtained in step 204 and the negative sample queue Q_n obtained in step 206 to obtain a new sample queue Q_New = {Q_1, ..., Q_{(1+μ)×np}}; execute step 208;
Q_1 denotes the triple with the smallest identification number in the new sample queue Q_New;
Q_{(1+μ)×np} denotes the triple with the largest identification number in the new sample queue Q_New; the subscript (1+μ) × np indicates that the sample queue Q_New contains (1+μ) × np triples;
Step 208: shuffle the order of all elements of the new sample queue Q_New = {Q_1, ..., Q_{(1+μ)×np}} to obtain the shuffled sample queue Q_Sorting = {Q_1, ..., Q_{(1+μ)×np}}; then execute step 301;
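Steps 201 to 208 can be illustrated with a short Python sketch under the same assumptions as the previous listing; the helper names and the clipping of the window at the queue ends are illustrative, not the patent's exact specification.

    import random

    def positive_triples(queues, WD):
        """Steps 202-204: window sampling inside each paper node queue V_f."""
        Qp = []
        for Vf in queues:
            for g, nd_g in enumerate(Vf):
                # the up to 2*WD queue-adjacent nodes of nd_g
                window = Vf[max(0, g - WD):g] + Vf[g + 1:g + 1 + WD]
                for w in window:
                    Qp.append((w, nd_g, +1))
        return Qp

    def negative_triples(graph, np_count, mu):
        """Steps 205-206: draw mu * np unconnected, distinct node pairs."""
        nodes = list(graph)
        Qn = []
        while len(Qn) < mu * np_count:
            a, o = random.choice(nodes), random.choice(nodes)
            if a == o or o in graph[a]:
                continue                      # identical or connected: redraw
            Qn.append((a, o, -1))
        return Qn

    def build_training_queue(graph, queues, WD=2, mu=1):
        """Steps 207-208: merge Q_p and Q_n and shuffle into Q_Sorting."""
        Qp = positive_triples(queues, WD)
        Qn = negative_triples(graph, len(Qp), mu)
        Q = Qp + Qn
        random.shuffle(Q)
        return Q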
Processing in a neural network paper probability model based on a multilayer perceptron;
Step 301: for the Q_Sorting = {Q_1, ..., Q_{(1+μ)×np}} obtained in step 208, take one triple (paper_a, paper_o, ±1) at a time and put the pair of paper nodes into the neural network paper probability model for learning; execute step 302;
Step 302: for the two paper nodes paper_a and paper_o in each triple, map them through the model f_θ to obtain the two corresponding transformed vectors f_θ(c_{paper_a}) and f_θ(c_{paper_o}); execute step 303;
f_θ(c_{paper_a}) is the multilayer perceptron function belonging to paper_a;
f_θ(c_{paper_o}) is the multilayer perceptron function belonging to paper_o;
Step 303: compute the Euclidean distances of the two paper nodes, and execute step 304;
the Euclidean distances are the terms E_pos and E_neg, where E_pos represents the Euclidean shortest distance, E_neg represents the Euclidean longest distance, and c represents the hop count;
Step 304: merge the positive and negative samples, put them into the loss function of the Euclidean distance over the distributed paper representations, and perform the loss function calculation that balances the positive and negative samples to obtain the overall loss function L; execute step 305;
Step 305: determine the nonlinear transformation function f_θ by using a stochastic gradient descent algorithm, completing the representation learning of any two paper nodes paper_a and paper_o.
Network node representation describes each node in the network with a vector. In order to process the vast and complex information and neighbor node relations in a social network, the invention provides a parameterized network node representation learning method. The network node representation learning method learns a nonlinear mapping function, so that the vector representation of a network node can be obtained simply from the content information of the network node. For the vector representation of a node, a random walk is used to obtain the nodes surrounding the node, and the relation between the node and its neighbor nodes is then constructed according to the twin network, so that the nonlinear mapping function is learned and determined. In simulation experiments, when the network node representation vectors obtained by the method are used with the same SVM classifier, the classification results are clearly better than those of other methods, which verifies that the method is effective for network node representation of paper networks.
Drawings
FIG. 1 is a flow chart of parameterized paper network node representation learning in accordance with the present invention.
FIG. 2 shows the results of the evaluation of the Micro-F1 index in the Cora data set.
FIG. 3 shows the results of evaluation of the Macro-F1 index in the Cora data set.
FIG. 4 shows the results of the evaluation of the Micro-F1 index in the Wiki data set.
FIG. 5 shows the results of the evaluation of the Macro-F1 index in the Wiki data set.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In the present invention, a paper is referred to as paper; multiple papers form a paper set, referred to as AP, with AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A}; any one of the papers in the paper set AP is called a paper node in the star-shaped paper network structure;
paper_1 represents the first paper node;
paper_2 represents the second paper node;
paper_a represents the a-th paper node, where a represents the identification number of the paper node;
paper_A represents the last paper node, A represents the total number of papers, a ∈ A.
For convenience of explanation, paper_a is also called any paper node, and paper_o is another arbitrary paper node other than paper_a; hereinafter paper_a is referred to as the first arbitrary paper node and paper_o as the second arbitrary paper node.
The set of all neighbor paper nodes belonging to any paper node paper_a is denoted NB^{paper_a} = {nb_1^{paper_a}, nb_2^{paper_a}, ..., nb_b^{paper_a}, ..., nb_B^{paper_a}}, also referred to for short as the neighbor paper node set;
nb_1^{paper_a} represents the first neighbor node belonging to any paper node paper_a;
nb_2^{paper_a} represents the second neighbor node belonging to any paper node paper_a;
nb_b^{paper_a} represents the b-th neighbor node belonging to any paper node paper_a, where b represents the identification number of the neighbor node;
nb_B^{paper_a} represents the last neighbor node belonging to any paper node paper_a, B represents the total number of neighbor nodes belonging to paper_a, b ∈ B.
The set of all neighbor paper nodes belonging to any neighbor node nb_b^{paper_a} is denoted NB^{nb_b^{paper_a}} = {nb_1, nb_2, ..., nb_e, ..., nb_E}, also referred to for short as the neighbor paper node set of the neighbor.
nb_1 represents the first neighbor node belonging to any neighbor paper node nb_b^{paper_a};
nb_2 represents the second neighbor node belonging to any neighbor paper node nb_b^{paper_a};
nb_e represents the e-th neighbor node belonging to any neighbor paper node nb_b^{paper_a}, where e represents the identification number of the neighbor node belonging to nb_b^{paper_a};
nb_E represents the last neighbor node belonging to any neighbor paper node nb_b^{paper_a}, E represents the total number of neighbor nodes belonging to nb_b^{paper_a} (the neighbors of the neighbor for short), e ∈ E.
In the present invention, the star-shaped paper network structure adopts the structure of FIG. 1.19(c) on page 37 of "Analysis of Interconnection Network Structures", edited by Wang Dingxing and Chen Guoliang, first edition, October 1990.
In the invention, the semantic information of a paper node refers to the vector representation, obtained through lexical processing, of the words contained in the title, abstract and body of the paper. The lexical processing performs 0/1 binarization coding according to whether each word occurs in the content of the paper, so as to obtain a 0/1 vector corresponding to the paper content: "0" indicates absence and "1" indicates presence. Applying this lexical processing to all the paper nodes belonging to the star-shaped paper network structure yields a two-dimensional matrix associating the number of words with the number of paper nodes, referred to for short as the paper binary matrix.
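For illustration, the binarization described above can be sketched as follows; the vocabulary construction and the function name are illustrative assumptions.

    import numpy as np

    def paper_binary_matrix(paper_texts):
        """Build the 0/1 paper binary matrix: one row per paper, one column per word."""
        vocab = sorted({w for text in paper_texts for w in text.lower().split()})
        index = {w: j for j, w in enumerate(vocab)}
        matrix = np.zeros((len(paper_texts), len(vocab)), dtype=np.int8)
        for i, text in enumerate(paper_texts):
            for w in set(text.lower().split()):
                matrix[i, index[w]] = 1       # 1 = word occurs, 0 = word absent
        return matrix, vocab

Each row of the returned matrix is the semantic information vector of one paper node.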
Constructing the neural network paper probability model of the multilayer perceptron by using the paper node semantic information
In the invention, the construction of the paper probability model comprises the following steps: (A) setting the neural network paper probability model expression f_θ(c_{paper_a}); (B) choosing any two paper nodes paper_a and paper_o from AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A} and passing paper_a and paper_o through f_θ, respectively obtaining the multilayer perceptron function f_θ(c_{paper_a}) belonging to paper_a and the multilayer perceptron function f_θ(c_{paper_o}) belonging to paper_o; (C) computing, from f_θ(c_{paper_a}) and f_θ(c_{paper_o}), the Euclidean distance between paper_a and paper_o and performing the loss function processing that balances the positive and negative samples; (D) processing the WEIGHT parameter WEIGHT and the BIAS parameter BIAS of f_θ with a stochastic gradient descent algorithm to obtain the nonlinear transformation function f_θ of the learning target, and traversing all the triples to complete the neural network training, based on the multilayer perceptron, over the semantic information of the paper nodes.
In the invention, the constructed neural network paper probability model expression based on the multilayer perceptron is denoted f_θ, a nonlinear mapping function applied to the semantic information c_{paper_a} of the paper node paper_a. The parameter θ of the nonlinear mapping is determined by learning the nonlinear mapping function f_θ. Based on the nonlinear mapping function f_θ, the paper probability model expression f_θ(c_{paper_a}) of any paper paper_a can be obtained.
In the present invention, each paper_a has corresponding rich text information c_{paper_a}, and a multilayer perceptron neural network is used to obtain the nonlinear transformation of c_{paper_a}. Assuming the multilayer perceptron has H layers in total, the neural network based on the multilayer perceptron has a WEIGHT parameter WEIGHT and a BIAS parameter BIAS for each layer.
In the present invention, the WEIGHT parameter WEIGHT = {weight_1, weight_2, ..., weight_h, ..., weight_H}.
In the present invention, the BIAS parameter BIAS = {bias_1, bias_2, ..., bias_h, ..., bias_H}.
weight_1 represents the weight parameter of the first-layer network in the neural network;
weight_2 represents the weight parameter of the second-layer network in the neural network;
weight_h represents the weight parameter of any layer of the network in the neural network, where h represents the layer identification number of the perceptron;
weight_H represents the weight parameter of the last-layer network in the neural network, where H represents the total number of layers of the perceptron;
bias_1 represents the bias parameter of the first-layer network in the neural network;
bias_2 represents the bias parameter of the second-layer network in the neural network;
bias_h represents the bias parameter of any layer of the network in the neural network;
bias_H represents the bias parameter of the last-layer network in the neural network.
The first-layer output of the multilayer perceptron is denoted c^1_{paper_a} = f_1(weight_1 · c_{paper_a} + bias_1), where c^1_{paper_a} represents the output of the first layer of the multilayer perceptron and f_1 represents the activation function of the first-layer neural network.
Similarly, the second-layer output of the multilayer perceptron is denoted c^2_{paper_a} = f_2(weight_2 · c^1_{paper_a} + bias_2), where c^2_{paper_a} represents the output of the second layer of the multilayer perceptron and f_2 represents the activation function of the second-layer neural network.
Any layer output of the multilayer perceptron, denoted c^h_{paper_a}, is c^h_{paper_a} = f_h(weight_h · c^{h-1}_{paper_a} + bias_h), where f_h represents the activation function of that layer of the neural network.
The last output of the multilayer perceptron is denoted c^H_{paper_a}.
In the present invention, the activation function f_h of any layer of the multilayer perceptron is usually chosen to be a nonlinear function, such as the sigmoid or tanh function. The last-layer output c^H_{paper_a} of the multilayer perceptron is the result of applying multiple nonlinear functions to the input c_{paper_a}, and can therefore simply be written as f_θ(c_{paper_a}), where θ denotes all the parameters of the parameterized functions taken together. The final output of the neural network paper probability model based on the multilayer perceptron is therefore f_θ(c_{paper_a}).
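For illustration, the H-layer forward pass can be sketched as follows, assuming NumPy weight matrices and tanh activations (the activation choice is an assumption; the text only requires a nonlinear function such as sigmoid or tanh).

    import numpy as np

    def f_theta(c_paper, WEIGHT, BIAS, activations=None):
        """Apply the H-layer multilayer perceptron to a paper's semantic vector.

        WEIGHT = [weight_1, ..., weight_H], BIAS = [bias_1, ..., bias_H].
        """
        H = len(WEIGHT)
        activations = activations or [np.tanh] * H
        out = c_paper
        for weight_h, bias_h, f_h in zip(WEIGHT, BIAS, activations):
            out = f_h(weight_h @ out + bias_h)   # c^h = f_h(weight_h * c^(h-1) + bias_h)
        return out

Because the same WEIGHT and BIAS are applied to both papers of a triple, the two sub-networks of the twin network share all of their parameters.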
In the present invention, the aim is to make the Euclidean distance between similar points in the representation space as short as possible and the Euclidean distance between dissimilar points as long as possible. The basic form consists of two distance terms, E_pos and E_neg, computed from the two transformed vectors f_θ(c_{paper_a}) and f_θ(c_{paper_o}): E_pos represents the Euclidean shortest distance, E_neg represents the Euclidean longest distance, and c represents the hop count.
In the present invention, each triple (paper_a, paper_o, ±1) carries a flag indicating whether it is a positive or a negative sample; positive samples can be regarded as requiring the two points to be close in the space, while negative samples require the two points to be as far apart as possible. For this application, the invention can therefore merge the positive and negative samples into a single loss function over the Euclidean distance of the distributed paper representations, where m represents the identification number of any triple in Q_Sorting, the first and second arbitrary paper nodes of the triple m are paper_a^(m) and paper_o^(m), and the flag of the triple m indicates whether it is a positive or a negative sample; L represents the overall loss function, which is the sum of the loss functions of all elements in the out-of-order sample queue Q_Sorting.
In the invention, the proportions of positive and negative samples differ, and so does the similarity between them: positive samples tend to be more similar because a connecting edge exists, while negative samples differ more, so the losses produced by the positive and the negative samples are not on the same scale. For this application, the invention therefore needs a harmonic parameter γ to balance the loss functions of the positive and negative samples, and γ is added into the loss function L.
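One possible concrete form of such a balanced loss is sketched below; the squared-distance expression for E_pos, the margin form for E_neg and the margin value are assumptions, since the text introduces these terms only symbolically.

    import numpy as np

    def pair_loss(vec_a, vec_o, label, gamma=1.0, margin=1.0):
        """Loss of one triple (paper_a, paper_o, label), label being +1 or -1.

        Assumed forms: E_pos pulls positive pairs together, E_neg pushes
        negative pairs beyond a margin, and gamma is the harmonic parameter
        balancing the two contributions.
        """
        dist = np.linalg.norm(vec_a - vec_o)
        if label == +1:
            return dist ** 2                              # E_pos term
        return gamma * max(0.0, margin - dist) ** 2       # E_neg term

    def total_loss(triples, embed, gamma=1.0):
        """Overall loss L: sum over every triple m in the shuffled queue Q_Sorting."""
        return sum(pair_loss(embed[a], embed[o], lbl, gamma) for a, o, lbl in triples)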
In the invention, the purpose of training the neural network is to reduce the value of the loss function to a minimum; in order to train the neural network and determine the weights and bias values of the neural network, the invention adopts a stochastic gradient descent algorithm to learn the network parameters.
In the present invention, the model is trained by determining the nonlinear transformation function f_θ through the stochastic gradient descent algorithm. Since the nonlinear transformation function f_θ mainly comprises the WEIGHT parameter WEIGHT and the BIAS parameter BIAS, the update value of each gradient descent step is the partial derivative of L with respect to the WEIGHT parameter WEIGHT and the BIAS parameter BIAS, so that at each iteration WEIGHT and BIAS are updated at a learning rate η according to the parameter update values:
WEIGHT_after = WEIGHT_before + η · ΔWEIGHT
BIAS_after = BIAS_before + η · ΔBIAS
WEIGHT_before is the weight parameter of a perceptron layer before the update (the previous iteration), WEIGHT_after is the weight parameter of that layer after the update (the current iteration), and ΔWEIGHT is the update value given at each gradient descent step by the partial derivative of L with respect to the WEIGHT parameter WEIGHT.
BIAS_before is the bias parameter of a perceptron layer before the update, BIAS_after is the bias parameter of that layer after the update, and ΔBIAS is the update value given at each gradient descent step by the partial derivative of L with respect to the BIAS parameter BIAS.
When stochastic gradient descent is used, overfitting can occur if the number of training iterations is too large; the early stopping method is therefore adopted, and training is stopped when the loss function L no longer decreases, preventing the overfitting phenomenon that arises during training.
In the invention, the WEIGHT parameter WEIGHT and the BIAS parameter BIAS of each layer of the perceptron are saved to obtain the nonlinear transformation function f_θ of the learning target, thereby completing the neural network training based on the multilayer perceptron; finally, according to the learned target f_θ, the representation vector of each paper paper_a is generated, which is the neural network paper probability model of the multilayer perceptron constructed for the semantic information of the paper nodes, f_θ(c_{paper_a}).
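A schematic training loop with this early stopping rule might look as follows; the loss_and_gradients helper is hypothetical (in practice it would be supplied by an automatic differentiation framework), and the learning rate and epoch limit are illustrative.

    def train(WEIGHT, BIAS, triples, loss_and_gradients, lr=0.01, max_epochs=200):
        """Stochastic gradient descent with early stopping on the overall loss L."""
        best = float("inf")
        for epoch in range(max_epochs):
            total = 0.0
            for triple in triples:
                # hypothetical helper returning (L_m, dL/dWEIGHT, dL/dBIAS) for one triple
                L_m, dW, dB = loss_and_gradients(WEIGHT, BIAS, triple)
                total += L_m
                for h in range(len(WEIGHT)):              # gradient descent update per layer
                    WEIGHT[h] -= lr * dW[h]
                    BIAS[h] -= lr * dB[h]
            if total >= best:                             # early stopping: L no longer decreases
                break
            best = total
        return WEIGHT, BIAS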
The invention provides a parameterized thesis network node representation learning method, which specifically comprises the following steps:
The method comprises the following steps: first, the neighbor paper node set of any paper node and the neighbor paper node set of each such neighbor are obtained by sampling based on a random walk method;
In the present invention, for the paper set AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A} in the star-shaped paper network structure, the sampling of the neighbor paper nodes of each paper node is carried out by a random walk whose hop probabilities take both the previous hop and the next hop into account. For any paper node paper_a, the random walk method is used to sample the neighbor paper node set NB^{paper_a} belonging to paper_a.
Step 101: construct an empty paper node queue, denoted V, which is used for storing a sequence of paper nodes; the maximum number of queue element positions of the empty paper node queue V is mv, with mv taking a value of 10-20; then execute step 102;
Step 102: select any paper node paper_a, and place paper_a at position 1 of the paper node queue V; then execute step 103;
Step 103: obtain the set of all neighbor paper nodes belonging to the paper node paper_a, denoted NB^{paper_a} = {nb_1^{paper_a}, nb_2^{paper_a}, ..., nb_b^{paper_a}, ..., nb_B^{paper_a}};
In the invention, the neighbor paper nodes are the set of paper nodes that have an edge connecting them to the paper node paper_a; then execute step 104;
Step 104: according to the total number B of neighbor nodes in the neighbor paper node set NB^{paper_a}, determine the probability P^1_c of jumping to each neighbor paper node (the first hop probability for short), where c denotes the hop count; then execute step 105;
Step 105: using the alias sampling algorithm (alias sampling), based on the current hop probability P^1_c, obtain from NB^{paper_a} the neighbor paper node nb_b^{paper_a} of the next hop, and at the same time place nb_b^{paper_a} at position 2 of the paper node queue V; then execute step 106;
Step 106: obtain the set of all neighbor paper nodes belonging to the neighbor paper node nb_b^{paper_a}, i.e. the neighbor paper node set of the neighbor, denoted NB^{nb_b^{paper_a}} = {nb_1, nb_2, ..., nb_e, ..., nb_E}; then execute step 107;
Step 107: compute the shortest hop count d between each neighbor-of-neighbor paper node nb_e and the paper node paper_a; then execute step 108;
In the present invention, d represents the minimum hop distance from any such neighbor paper node to the previous paper node; for example, if a neighbor paper node nb_e requires a minimum of 1 hop to reach the paper node paper_a, then d = 1; if the neighbor paper node nb_e is the paper node paper_a itself, then d = 0; and so on.
Step 108: according to d, determine the probability P^2_c (the second hop probability for short) with which nb_b^{paper_a} jumps to each of its neighbor paper nodes; then execute step 109;
the second hop probability is P^2_c, where c denotes the hop count.
In the present invention, the shortest hop count refers to the minimum number of hops required between two paper nodes.
In the invention, p is the parameter that adjusts, in the random walk method, the second hop probability P^2_c of paper nodes that are not in the paper node queue V (the hop-out parameter for short), and q is the parameter that adjusts the second hop probability P^2_c of paper nodes that are in the paper node queue V (the hop-in parameter for short); p and q control the hop probability: if the walk is to move more randomly beyond the local neighborhood, p needs to be set larger; conversely, q needs to be set larger.
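Steps 105 and 109 draw the next hop with alias sampling once these probabilities are fixed. A compact, generic sketch of the alias table construction (Vose's method) and the O(1) draw is given below for illustration; it is not taken from the patent text.

    import random

    def build_alias_table(probs):
        """Preprocess a discrete distribution (probs sum to 1) for O(1) draws."""
        n = len(probs)
        scaled = [p * n for p in probs]
        alias, prob = [0] * n, [0.0] * n
        small = [i for i, s in enumerate(scaled) if s < 1.0]
        large = [i for i, s in enumerate(scaled) if s >= 1.0]
        while small and large:
            s, l = small.pop(), large.pop()
            prob[s], alias[s] = scaled[s], l
            scaled[l] = scaled[l] + scaled[s] - 1.0
            (small if scaled[l] < 1.0 else large).append(l)
        for i in small + large:
            prob[i] = 1.0
        return prob, alias

    def alias_draw(prob, alias):
        """Draw one index from the preprocessed distribution in O(1) time."""
        i = random.randrange(len(prob))
        return i if random.random() < prob[i] else alias[i]

Given the second hop probabilities P^2_c over the neighbors of the current node, build_alias_table is called once per node and alias_draw then selects each next hop, which is what keeps long walks cheap.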
Step 109: once P^2_c has been determined, select, according to P^2_c and alias sampling, a node nb_e as the next-hop paper node, and at the same time place nb_e at position 3 of the paper node queue V; then execute step 110;
Step 110: execute step 106 to step 109 in a loop until the number of positions filled in the paper node queue V is mv, and stop the random walk; then execute step 111;
Step 111: in the present invention, step 101 to step 110 are repeated for every paper node in the whole paper network to complete the neighbor node sampling of the paper nodes, giving a set of paper node queues denoted VF = {V_1, V_2, ..., V_f, ..., V_F}; then execute step 201.
V_1 represents the first paper node queue;
V_2 represents the second paper node queue;
V_f represents any paper node queue, and f represents the identification number of the paper node queue;
V_F represents the last paper node queue, F represents the total number of paper node queues in the set, f ∈ F.
Generating neural network training data of the multilayer perceptron by adopting a negative sampling method;
In the invention, the paper node queue set VF = {V_1, V_2, ..., V_f, ..., V_F} obtained in step one is used for generating training data usable by the neural network; in addition to the training data from the paper node queue set, the invention generates the data required for training the model by means of a negative sampling algorithm.
Step 201: establish a positive sample queue Q_p and a negative sample queue Q_n, which store, respectively, the positive and negative sampling data required for training the neural network; then execute step 202;
Step 202: set a neighbor window size hyperparameter WD; for a paper node queue V_f, each paper node belonging to the queue V_f is denoted nd_g^{V_f}, so that V_f = {nd_1^{V_f}, nd_2^{V_f}, ..., nd_g^{V_f}, ..., nd_G^{V_f}}; then execute step 203;
nd_1^{V_f} denotes the first paper node belonging to any paper node queue V_f;
nd_2^{V_f} denotes the second paper node belonging to any paper node queue V_f;
nd_g^{V_f} denotes the g-th paper node belonging to any paper node queue V_f, where g denotes the identification number of the neighbor paper node;
nd_G^{V_f} denotes the last paper node belonging to any paper node queue V_f, where G denotes the length of the paper node queue V_f, g ∈ G.
For any node in a paper node queue, all nodes whose distance to that node within the queue is smaller than WD are regarded as positive sample nodes. Each time, for any paper node nd_g^{V_f}, the invention first obtains the set of the 2 × WD paper nodes adjacent to it in the queue, denoted W_g = {w_{g-WD}, ..., w_l, ..., w_{g+WD}};
w_{g-WD} denotes the node with the smallest identification number in the set W_g of adjacent paper nodes.
w_{g+WD} denotes the node with the largest identification number in the set W_g of adjacent paper nodes.
w_l denotes any adjacent paper node in W_g other than w_{g-WD} and w_{g+WD}, referred to for short as a queue-adjacent paper node; the subscript l denotes the identification number of a paper node that is neither the largest nor the smallest, i.e. an identification number other than those of these 2 paper nodes.
Step 203: for any queue paper node nd_g^{V_f}, sample in order of neighbor identification number from small to large; the sampling process combines each node in W_g with the queue paper node nd_g^{V_f} to form a triple, and then executes step 204;
the node w_{g-WD} and the queue paper node nd_g^{V_f} form a triple, i.e. (w_{g-WD}, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_{g-WD}, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p.
the node w_l and the queue paper node nd_g^{V_f} form a triple, i.e. (w_l, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_l, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p.
the node w_{g+WD} and the queue paper node nd_g^{V_f} form a triple, i.e. (w_{g+WD}, nd_g^{V_f}, +1), where +1 denotes that the triple is a positive sample and, conversely, -1 denotes that the triple is a negative sample; the triple (w_{g+WD}, nd_g^{V_f}, +1) is inserted into the positive sample queue Q_p.
Step 204: execute step 202 and step 203 in a loop until all the paper nodes in all the paper node queues of the paper node queue set VF = {V_1, V_2, ..., V_f, ..., V_F} have completed the sampling of their neighboring paper nodes, yielding the positive sample queue Q_p; then execute step 205;
Step 205: sample over all paper nodes in the network, each time selecting any two paper nodes from the network (the two selected paper nodes may be adjacent or non-adjacent), namely a first arbitrary paper node paper_a and a second paper node paper_o. If a connecting edge exists between the two paper nodes ((paper_a, paper_o) ∈ E), or the two randomly chosen paper nodes are identical (paper_a = paper_o), repeat this step; otherwise, form the triple (paper_a, paper_o, -1) from the two paper nodes paper_a and paper_o and store it in the negative sample queue Q_n; then execute step 206;
Step 206: execute step 205 in a loop, and establish a positive/negative sample ratio parameter μ; assuming the positive sample queue Q_p contains np triples, stop when the number of triples in Q_n equals μ × np, and then execute step 207;
Step 207: combine the positive sample queue Q_p obtained in step 204 and the negative sample queue Q_n obtained in step 206 to obtain a new sample queue Q_New = {Q_1, ..., Q_{(1+μ)×np}}; execute step 208;
Q_1 denotes the triple with the smallest identification number in the new sample queue Q_New.
Q_{(1+μ)×np} denotes the triple with the largest identification number in the new sample queue Q_New; the subscript (1+μ) × np indicates that the sample queue Q_New contains (1+μ) × np triples.
Step 208: shuffle the order of all elements of the new sample queue Q_New = {Q_1, ..., Q_{(1+μ)×np}} to obtain the shuffled sample queue Q_Sorting = {Q_1, ..., Q_{(1+μ)×np}}, and then execute step 301.
Processing in a neural network paper probability model based on a multilayer perceptron;
Step 301: for the Q_Sorting = {Q_1, ..., Q_{(1+μ)×np}} obtained in step 208, take one triple (paper_a, paper_o, ±1) at a time and put the pair of paper nodes into the neural network paper probability model for learning; execute step 302;
Step 302: for the two paper nodes paper_a and paper_o in each triple, map them through the model f_θ to obtain the two corresponding transformed vectors f_θ(c_{paper_a}) and f_θ(c_{paper_o}); execute step 303;
f_θ(c_{paper_a}) is the multilayer perceptron function belonging to paper_a;
f_θ(c_{paper_o}) is the multilayer perceptron function belonging to paper_o;
Step 303: compute the Euclidean distances of the two paper nodes, and execute step 304;
In the present invention, the twin network aims to make the Euclidean distance between similar points in the representation space as short as possible and the Euclidean distance between dissimilar points as long as possible. The basic form consists of the two distance terms E_pos and E_neg, where E_pos represents the Euclidean shortest distance, E_neg represents the Euclidean longest distance, and c represents the hop count.
Step 304: merge the positive and negative samples, put them into the loss function of the Euclidean distance over the distributed paper representations, and perform the loss function calculation that balances the positive and negative samples to obtain the overall loss function L; execute step 305;
Step 305: determine the nonlinear transformation function f_θ by a stochastic gradient descent algorithm, completing the representation learning of any two paper nodes paper_a and paper_o.
Example 1
The embodiment adopts the Cora paper data set and the Pubmed knowledge network data set to carry out learning and experimental work.
Cora is a paper data set containing 2708 paper nodes and 5429 edges; each node corresponds to a paper rich text information vector of length 1433, in which 0/1 indicates the absence or presence of a word. Each node is also associated with a category attribute, with 7 category attribute values in total.
Pubmed is a knowledge network data set containing 19717 paper nodes and 44338 edges; each node corresponds to a paper rich text information vector of length 500, in which 0/1 indicates the absence or presence of a word. Each node is also associated with a category attribute, with 3 category attribute values in total.
In order to verify the effectiveness, the invention mainly compares the performances of different methods in the classification task of the paper nodes:
deepwalk: the network is sampled by adopting a common random walk algorithm, and then the representation of each node in the network is obtained by using a word2vec algorithm. (2014deep walk of online learning of social representation [ J ]. Perozzi B, Alrfou R, Skiiena S.KDD:701-710.)
TADW: and decomposing the random walk in the Deepwalk, skillfully adding rich text information of the nodes, and obtaining the representation of each node in the network by adopting a matrix multiplication mode. (2015, Network representation with rich text information [ C ] YangC, ZHao D, et al. International conference on Intelligent Association. AAAI Press: 2111-
Node2Vec, an upgraded version of Deepwalk, employs a second-order random walk algorithm to sample the network, and then obtains a representation of each Node in the network using a word2Vec algorithm (2016, Node2Vec: Scalable Feature L earning for Networks [ C ]// Grover A, L eskovec J.KDD:855.)
The invention selects a node classification (prediction) task to compare the effect of the vector representations. In the experiments, cross-validation is adopted, and the same SVM classifier is used for classification with each of the compared prediction methods.
The invention adopts two evaluation indexes, Micro-F1 and Macro-F1.
The Macro-F1 calculation method is:
Macro-F1 = 2 × P_macro × R_macro / (P_macro + R_macro)
where P_macro and R_macro respectively represent the macro precision and the macro recall.
The Micro-F1 calculation method is:
Micro-F1 = 2 × P_micro × R_micro / (P_micro + R_micro)
where P_micro and R_micro respectively represent the micro precision and the micro recall.
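For reference, this evaluation protocol can be reproduced with scikit-learn; the kernel and split parameters below are illustrative assumptions, and the input vectors are the paper node representations produced by f_θ.

    from sklearn.svm import SVC
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    def evaluate(vectors, labels, train_ratio=0.5, seed=0):
        """Train an SVM on the learned vectors and report (Micro-F1, Macro-F1)."""
        X_train, X_test, y_train, y_test = train_test_split(
            vectors, labels, train_size=train_ratio, random_state=seed)
        pred = SVC().fit(X_train, y_train).predict(X_test)
        return (f1_score(y_test, pred, average="micro"),
                f1_score(y_test, pred, average="macro"))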
The effect on the Cora data set is shown in FIGS. 2 and 3, which compare the invention with the other methods: FIG. 2 shows the performance of each method on the Micro-F1 evaluation index and FIG. 3 on the Macro-F1 evaluation index. The horizontal axis of both figures is the percentage of the total data used as training data for the classifier. The figures show that, under both the Micro-F1 and the Macro-F1 evaluation indexes, the method has a better effect than the other network representation learning methods. In particular, compared with the Deepwalk and Node2vec algorithms, which use only network information and no network node semantic information, the algorithm of the invention improves by more than 5% at every training data proportion under both Micro-F1 and Macro-F1; after fusing network node information with the network topology structure, the obtained network node representation vectors are significantly better than those obtained from network topology information alone. Compared with the TADW method, which also combines network node information with network topology information, the method of the present invention still improves by 3% on both evaluation indexes.
The effect on the Wiki data set is shown in FIGS. 4 and 5; under both the Micro-F1 and Macro-F1 evaluation indexes, the invention again performs better than the other network representation learning methods. Since the number of categories in the Wiki data set is far greater than in the Cora data set, the Deepwalk and Node2vec algorithms, which do not use network node semantic information, classify poorly, far below the results of TADW; this indicates that semantic information dominates in this data set. Under the Micro-F1 and Macro-F1 evaluation indexes, the method improves on the result of the TADW method by 2%, which shows that the network node representation vectors obtained by fusing network node information with the network topology structure are better than those obtained by direct matrix multiplication. This demonstrates that the invention fuses network information and semantic information more effectively in the network node representation and obtains better representation vectors.
The analysis of FIGS. 2-5 shows that the invention can naturally fuse the network structure and the semantic information to obtain better network node representation vectors, which verifies the effectiveness of the invention.

Claims (3)

1. A parameterized thesis network node representation learning method is characterized by comprising the following steps:
the method comprises the following steps that firstly, a neighbor thesis node set of any one thesis node and a neighbor thesis node set of a neighbor are obtained through sampling based on a random walk method;
step 101: constructing a paper node empty queue marked as V, wherein the V is used for storing a paper node sequence; the maximum queue element bit number of the paper node empty queue V is mv, and the value of mv is 10-20; then, step 102 is executed;
step 102: selecting any paper node paper_a, and placing paper_a at position 1 of the paper node queue V; then executing step 103;
step 103: obtaining the set of all neighbor paper nodes belonging to the paper node paper_a, denoted NB^{paper_a} = {nb_1^{paper_a}, nb_2^{paper_a}, ..., nb_b^{paper_a}, ..., nb_B^{paper_a}}; the neighbor paper node set refers to the set of neighbor paper nodes that have a connecting edge with the paper node paper_a; then executing step 104;
nb_1^{paper_a} represents the first neighbor node belonging to any paper node paper_a, i.e. the first neighbor paper node;
nb_2^{paper_a} represents the second neighbor node belonging to any paper node paper_a, i.e. the second neighbor paper node;
nb_b^{paper_a} represents the b-th neighbor node belonging to any paper node paper_a, where b represents the identification number of the neighbor node;
nb_B^{paper_a} represents the last neighbor node belonging to any paper node paper_a, i.e. the last neighbor paper node, where B represents the total number of neighbor nodes belonging to paper_a, b ∈ B;
step 104: according to the total number B of neighbor nodes in the neighbor paper node set NB^{paper_a}, determining the first hop probability P^1_c of jumping, where c represents the hop count; then executing step 105;
step 105: using the alias sampling algorithm, according to the current first hop probability P^1_c, obtaining from NB^{paper_a} the neighbor paper node nb_b^{paper_a} of the next hop, and at the same time placing nb_b^{paper_a} at position 2 of the paper node queue V; then executing step 106;
step 106: obtaining the set of all neighbor paper nodes belonging to any neighbor paper node nb_b^{paper_a}, i.e. the neighbor paper node set of the neighbor, denoted NB^{nb_b^{paper_a}} = {nb_1, nb_2, ..., nb_e, ..., nb_E}; then executing step 107;
nb_1 represents the first neighbor node of the neighbor belonging to any neighbor paper node nb_b^{paper_a}, i.e. the first neighbor paper node of the neighbor;
nb_2 represents the second neighbor node of the neighbor belonging to any neighbor paper node nb_b^{paper_a}, i.e. the second neighbor paper node of the neighbor;
nb_e represents the e-th neighbor node of the neighbor belonging to any neighbor paper node nb_b^{paper_a}, where e represents the identification number of the neighbor node of nb_b^{paper_a};
nb_E represents the last neighbor node of the neighbor belonging to any neighbor paper node nb_b^{paper_a}, i.e. the last neighbor paper node of the neighbor, where E represents the total number of neighbor nodes of nb_b^{paper_a}, e ∈ E;
step 107: calculating the shortest hop count d between each neighbor paper node nb_e of the neighbor and the paper node paper_a; then executing step 108;
wherein d represents the minimum hop distance from any such neighbor paper node to the previous paper node paper_a;
step 108: according to d, determining the second hop probability P^2_c with which nb_b^{paper_a} jumps to each neighbor paper node; then executing step 109;
the second hop probability is P^2_c, where c represents the hop count; p is the parameter for adjusting, in the random walk method, the second hop probability P^2_c of paper nodes that are not in the paper node queue V, i.e. the hop-out parameter; q is the parameter for adjusting, in the random walk method, the second hop probability P^2_c of paper nodes that are in the paper node queue V, i.e. the hop-in parameter;
step 109: warp beam
Figure FDA00024253740800000220
After determination, according to
Figure FDA00024253740800000221
And alias sampling, selecting
Figure FDA00024253740800000222
As a next-hop thesis node, will simultaneously
Figure FDA00024253740800000223
Bit 3 placed in paper node queue V; then, step 110 is executed;
step 110: circularly executing the step 106 and the step 109 until the digit in the paper node queue V is mv, and stopping the random walk; then step 111 is executed;
step 111: repeat steps 101 to 109 for every paper node in the whole paper network to complete the neighbor node sampling of the paper nodes, yielding a set of paper node queues denoted VF = {V_1, V_2, ..., V_f, ..., V_F}; then step 201 is executed;
V_1 represents the first paper node queue;
V_2 represents the second paper node queue;
V_f represents any paper node queue, f being the identification number of the paper node queue;
V_F represents the last paper node queue, F being the total number of paper node queues in the set, f ∈ F;
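Taken together, steps 101 to 111 amount to running one truncated, biased random walk of length mv from every paper node and collecting the resulting queues into VF. A compact sketch of that outer loop follows; the graph representation, the helper name biased_next_hop (standing in for steps 106 to 109), and the uniform first hop are assumptions made for illustration.

import random

def generate_walks(graph, mv, biased_next_hop):
    # graph: dict mapping each paper node to the list of its neighbor paper nodes.
    vf = []
    for paper_a in graph:                        # step 111: one walk per paper node
        v = [paper_a]                            # bit 1 of the paper node queue V
        if not graph[paper_a]:                   # isolated node: nothing to walk to
            vf.append(v)
            continue
        v.append(random.choice(graph[paper_a]))  # bit 2: first hop (steps 103-105)
        while len(v) < mv:                       # steps 106-110: extend until mv elements
            nxt = biased_next_hop(graph, v)      # alias draw over the second-hop probabilities
            if nxt is None:
                break
            v.append(nxt)
        vf.append(v)
    return vf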
generating training data for the multilayer perceptron neural network by means of a negative sampling method;
step 201: establish a positive sample queue Q_p and a negative sample queue Q_n to store, respectively, the positive and negative sampling data required for training the neural network; then execute step 202;
step 202: set a neighbor window size hyperparameter WD; within any paper node queue V_f, each paper node belonging to the queue V_f is denoted v^f_g; then step 203 is executed;
v^f_1 denotes the first paper node belonging to any paper node queue V_f;
v^f_2 denotes the second paper node belonging to any paper node queue V_f;
v^f_g denotes any paper node belonging to the paper node queue V_f, g being the identification number of the paper node;
v^f_G denotes the last paper node belonging to the paper node queue V_f, G being the length of the paper node queue V_f, g ∈ G;
for any node v^f_g in a paper node queue, all nodes in the queue whose distance to v^f_g is smaller than WD are regarded as positive sample nodes; accordingly, for any paper node v^f_g, first obtain its set of 2 × WD neighbor paper nodes within the queue, denoted N(v^f_g);
n_min denotes the node with the smallest identification number in the neighbor paper node set N(v^f_g);
n_max denotes the node with the largest identification number in the neighbor paper node set N(v^f_g);
n_l denotes a queue-adjacent paper node in N(v^f_g) other than n_min and n_max, the subscript l being the identification number of a node that is neither the largest nor the smallest;
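The 2 × WD window of step 202 is the usual skip-gram-style context: up to WD queue positions on each side of v^f_g. A minimal sketch, with placeholder names and 0-based indices:

def window_neighbors(queue_vf, g, wd):
    # Return the up-to-2*WD queue neighbors of the node at index g (0-based).
    left = queue_vf[max(0, g - wd):g]
    right = queue_vf[g + 1:g + 1 + wd]
    return left + right

# Example: WD = 2 around the third node of a hypothetical queue V_f.
vf_queue = ["p1", "p5", "p9", "p2", "p7", "p4"]
context = window_neighbors(vf_queue, 2, 2)   # -> ["p1", "p5", "p2", "p7"]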
step 203: for any queue paper node v^f_g, sample its neighbors in increasing order of identification number; the sampling process forms a triple from each node in N(v^f_g) together with the queue paper node v^f_g, and then step 204 is executed;
the node n_min and the queue paper node v^f_g form a triple, i.e. (v^f_g, n_min, +1), where +1 denotes that the triple is a positive sample, whereas -1 would denote a negative sample, and (v^f_g, n_min, +1) is inserted into the positive sample queue Q_p;
the node n_l and the queue paper node v^f_g form a triple, i.e. (v^f_g, n_l, +1), where +1 denotes that the triple is a positive sample, whereas -1 would denote a negative sample, and (v^f_g, n_l, +1) is inserted into the positive sample queue Q_p;
the node n_max and the queue paper node v^f_g form a triple, i.e. (v^f_g, n_max, +1), where +1 denotes that the triple is a positive sample, whereas -1 would denote a negative sample, and (v^f_g, n_max, +1) is inserted into the positive sample queue Q_p;
step 204: execute steps 202 and 203 in a loop until all paper nodes in all paper node queues of the set VF = {V_1, V_2, ..., V_f, ..., V_F} have completed the sampling of their neighbor paper nodes, yielding the positive sample queue Q_p; then step 207 is performed;
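Steps 202 to 204 thus turn every (center node, window neighbor) pair into a +1 triple. A sketch of that loop, reusing the window_neighbors helper from the previous sketch:

def build_positive_queue(vf_set, wd):
    # Collect (center, neighbor, +1) triples over every queue V_f in VF.
    q_p = []
    for queue_vf in vf_set:                        # step 204: iterate over all queues
        for g, center in enumerate(queue_vf):      # step 202: every node v^f_g
            for neighbor in window_neighbors(queue_vf, g, wd):
                q_p.append((center, neighbor, +1)) # step 203: positive triple into Q_p
    return q_p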
step 205: sample over all paper nodes in the network: each time, select any two paper nodes from the network, namely a first paper node paper_a and a second paper node paper_o; if a connecting edge exists between the two paper nodes, or the two randomly selected paper nodes are identical, this step is repeated; otherwise, the two paper nodes paper_a and paper_o form the triple (paper_a, paper_o, -1), which is stored in the negative sample queue Q_n; then step 206 is performed;
step 206: execute step 205 in a loop; establish a positive/negative sample ratio parameter μ; given that the number of triples in the positive sample queue Q_p is np, stop when the number of triples in Q_n equals μ × np, and then perform step 207;
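Steps 205 and 206 therefore draw μ × np pairs of distinct, unconnected paper nodes and label them -1. A sketch, assuming graph maps each paper node to the set of its neighbors:

import random

def build_negative_queue(graph, np_count, mu):
    # Draw mu * np_count triples (paper_a, paper_o, -1) of distinct, unconnected nodes.
    nodes = list(graph)
    q_n = []
    target = int(mu * np_count)
    while len(q_n) < target:                          # step 206: stop at the target ratio
        paper_a, paper_o = random.choice(nodes), random.choice(nodes)
        if paper_a == paper_o or paper_o in graph[paper_a]:
            continue                                  # step 205: resample identical or connected pairs
        q_n.append((paper_a, paper_o, -1))
    return q_n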
step 207: merge the positive sample queue Q_p obtained in step 204 and the negative sample queue Q_n obtained in step 206 to obtain a new sample queue Q_New = {Q_1, ..., Q_(1+μ)×np}; then go to step 208;
Q_1 represents the triple with the smallest identification number in the new sample queue Q_New;
Q_(1+μ)×np represents the triple with the largest identification number in the new sample queue Q_New; the subscript (1+μ) × np indicates that the sample queue Q_New contains (1+μ) × np triples in total;
step 208: shuffle the order of all elements in the new sample queue Q_New = {Q_1, ..., Q_(1+μ)×np} to obtain the shuffled sample queue Q_Sorting = {Q_1, ..., Q_(1+μ)×np}; then step 301 is performed;
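Steps 207 and 208 simply concatenate the two queues and randomize their order so that positive and negative triples are interleaved during training; continuing the earlier sketches (all inputs below are the hypothetical variables used there):

import random

q_p = build_positive_queue(vf_set, wd)            # positive triples from steps 202-204
q_n = build_negative_queue(graph, np_count, mu)   # negative triples from steps 205-206
q_new = q_p + q_n                                 # step 207: (1 + mu) * np triples in Q_New
random.shuffle(q_new)                             # step 208: Q_Sorting, order randomized in place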
processing in the neural network paper probability model based on a multilayer perceptron;
step 301: from the Q_Sorting = {Q_1, ..., Q_(1+μ)×np} obtained in step 208, take one triple (paper_a, paper_o, b) at a time and put the pair of paper nodes into the neural network paper probability model for learning, then execute step 302;
step 302: map the two paper nodes paper_a and paper_o of each triple through the model function f_θ to obtain the two corresponding transformed vectors f_θ(paper_a) and f_θ(paper_o), then execute step 303;
f_θ(paper_a) is the multilayer perceptron function belonging to paper_a;
f_θ(paper_o) is the multilayer perceptron function belonging to paper_o;
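Step 302 pushes each paper node's input feature vector through the shared perceptron f_θ to obtain its embedding. The claim does not fix the layer sizes or the activation, so the sketch below is only one plausible shape: the class name, dimensions, tanh activation, and dummy feature vectors are all assumptions.

import numpy as np

class PaperMLP:
    # A toy two-layer perceptron standing in for the non-linear transformation f_theta.
    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.1, (hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def forward(self, x):
        h = np.tanh(x @ self.w1 + self.b1)   # hidden layer with tanh non-linearity
        return h @ self.w2 + self.b2         # linear output: the transformed vector

f_theta = PaperMLP(in_dim=128, hidden_dim=64, out_dim=32)
vec_a = f_theta.forward(np.ones(128))        # f_theta(paper_a), from dummy input features
vec_o = f_theta.forward(np.zeros(128))       # f_theta(paper_o)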
step 303: calculate the Euclidean distance between the two transformed paper node vectors f_θ(paper_a) and f_θ(paper_o), and execute step 304;
E_pos represents the Euclidean distance for the positive samples, which should be the shortest; E_neg represents the Euclidean distance for the negative samples, which should be the longest; c represents the hop count;
step 304: merge the positive and negative samples, put them into the Euclidean-distance loss function for the distributed representation of papers, and compute the loss in a way that balances the positive and negative samples to obtain the overall loss function L, then execute step 305;
γ represents a harmonic parameter of the loss function, used to balance the positive and negative samples;
m represents the identification number of any triple in Q_Sorting;
in a triple (paper_a, paper_o, b), the flag b indicates whether the triple is a positive or a negative sample; positive samples are treated as points that should be close to each other in the representation space, while negative samples are treated as points that should lie as far apart in space as possible;
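The overall loss L of step 304 combines a term that pulls the positive pairs together and a term, balanced by the harmonic parameter γ, that pushes the negative pairs apart; the exact expression is given by the claim's formula and is not reproduced here. The sketch below uses a common contrastive form (squared distance for positives, a hinge with an assumed margin for negatives) purely to make the balancing role of γ concrete; f_theta can be the PaperMLP from the previous sketch.

import numpy as np

def overall_loss(triples, features, f_theta, gamma, margin=1.0):
    # triples:  list of (paper_a, paper_o, flag) with flag in {+1, -1}, as in Q_Sorting
    # features: dict mapping a paper node to its input feature vector
    # The squared-distance / hinge form and the margin are assumptions; only the
    # gamma-weighted balance of positive and negative terms mirrors step 304.
    loss = 0.0
    for paper_a, paper_o, flag in triples:
        dist = np.linalg.norm(f_theta.forward(features[paper_a]) -
                              f_theta.forward(features[paper_o]))   # Euclidean distance, step 303
        if flag == +1:
            loss += dist ** 2                             # E_pos: positive pairs drawn together
        else:
            loss += gamma * max(0.0, margin - dist) ** 2  # E_neg: negatives pushed beyond the margin
    return loss / len(triples)

In practice the minimization of step 305 would be carried out with an autograd framework over mini-batches of Q_Sorting rather than the hand-rolled forward pass above.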
step 305: determine the non-linear transformation function f_θ by a stochastic gradient descent algorithm, thereby completing the representation learning of any two paper nodes paper_a and paper_o.
2. The parameterized thesis network node representation learning method according to claim 1, characterized in that: steps 103, 104 and 105 realize the acquisition of the 2nd element of the paper node queue V.
3. The parameterized thesis network node representation learning method according to claim 1, characterized in that: steps 106 to 110 realize the acquisition of the elements following the 2nd element of the paper node queue V, until the maximum number mv of queue elements of the paper node queue V is reached.
CN201711308050.6A 2017-12-11 2017-12-11 Parameterized thesis network node representation learning method Active CN108228728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711308050.6A CN108228728B (en) 2017-12-11 2017-12-11 Parameterized thesis network node representation learning method


Publications (2)

Publication Number Publication Date
CN108228728A CN108228728A (en) 2018-06-29
CN108228728B true CN108228728B (en) 2020-07-17

Family

ID=62653503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711308050.6A Active CN108228728B (en) 2017-12-11 2017-12-11 Parameterized thesis network node representation learning method

Country Status (1)

Country Link
CN (1) CN108228728B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109213831A (en) * 2018-08-14 2019-01-15 阿里巴巴集团控股有限公司 Event detecting method and device calculate equipment and storage medium
CN109376864A (en) * 2018-09-06 2019-02-22 电子科技大学 A kind of knowledge mapping relation inference algorithm based on stacking neural network
CN109558494A (en) * 2018-10-29 2019-04-02 中国科学院计算机网络信息中心 A kind of scholar's name disambiguation method based on heterogeneous network insertion
CN110322021B (en) * 2019-06-14 2021-03-30 清华大学 Hyper-parameter optimization method and device for large-scale network representation learning
CN112559734B (en) * 2019-09-26 2023-10-17 中国科学技术信息研究所 Brief report generating method, brief report generating device, electronic equipment and computer readable storage medium
CN111292062B (en) * 2020-02-10 2023-04-25 中南大学 Network embedding-based crowd-sourced garbage worker detection method, system and storage medium
CN112148876B (en) * 2020-09-23 2023-10-13 南京大学 Paper classification and recommendation method
CN117648670B (en) * 2024-01-24 2024-04-12 润泰救援装备科技河北有限公司 Rescue data fusion method, electronic equipment, storage medium and rescue fire truck

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250438A (en) * 2016-07-26 2016-12-21 上海交通大学 Based on random walk model zero quotes article recommends method and system
CN106777339A (en) * 2017-01-13 2017-05-31 深圳市唯特视科技有限公司 A kind of method that author is recognized based on heterogeneous network incorporation model
CN107451596A (en) * 2016-05-30 2017-12-08 清华大学 A kind of classified nodes method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8918431B2 (en) * 2011-09-09 2014-12-23 Sri International Adaptive ontology


Also Published As

Publication number Publication date
CN108228728A (en) 2018-06-29

Similar Documents

Publication Publication Date Title
CN108228728B (en) Parameterized thesis network node representation learning method
US11544535B2 (en) Graph convolutional networks with motif-based attention
Liu et al. Principled multilayer network embedding
Suthaharan et al. Decision tree learning
Tran et al. On filter size in graph convolutional networks
CN112508085B (en) Social network link prediction method based on perceptual neural network
Kundu et al. Fuzzy-rough community in social networks
CN110147911B (en) Social influence prediction model and prediction method based on content perception
Amin A novel classification model for cotton yarn quality based on trained neural network using genetic algorithm
Venturelli et al. A Kriging-assisted multiobjective evolutionary algorithm
Nasiri et al. A node representation learning approach for link prediction in social networks using game theory and K-core decomposition
Kepner et al. Mathematics of Big Data
US11669727B2 (en) Information processing device, neural network design method, and recording medium
CN112905906B (en) Recommendation method and system fusing local collaboration and feature intersection
Jenny Li et al. Evaluating deep learning biases based on grey-box testing results
Coscia et al. The node vector distance problem in complex networks
Lokhande et al. Accelerating column generation via flexible dual optimal inequalities with application to entity resolution
CN109697511B (en) Data reasoning method and device and computer equipment
CN113159976B (en) Identification method for important users of microblog network
Kim et al. Network analysis for active and passive propagation models
Jayachitra Devi et al. Link prediction model based on geodesic distance measure using various machine learning classification models
Javaheripi et al. Swann: Small-world architecture for fast convergence of neural networks
Van Tran et al. On filter size in graph convolutional networks
Montiel et al. Reducing the size of combinatorial optimization problems using the operator vaccine by fuzzy selector with adaptive heuristics
Ferdaus et al. A genetic algorithm approach using improved fitness function for classification rule mining

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant