CN108228728B - Parameterized thesis network node representation learning method - Google Patents
- Publication number: CN108228728B (application CN201711308050.6A)
- Authority: CN (China)
- Prior art keywords: paper, node, queue, neighbor, thesis
- Legal status: Active
Classifications
- G06F16/36 — Information retrieval; creation of semantic tools, e.g. ontology or thesauri
- G06F16/3347 — Query execution using a vector-based model
- G06N3/045 — Neural networks; combinations of networks
- G06N3/048 — Neural networks; activation functions
- G06N3/08 — Neural networks; learning methods
- G06Q50/01 — Social networking
Abstract
The invention discloses a parameterized paper network node representation learning method. An empty paper node queue is first constructed, and the neighbor nodes of any paper node, together with the neighbors of those neighbors, are sampled by random walk: the selected paper node becomes the first element of the queue, and the remaining elements are filled in according to the jump probability. Traversing all paper nodes yields a set of paper node queues. Neural network training data for a multilayer perceptron are then generated by a positive and negative sampling method. Finally, a neural network paper probability model learns the nonlinear transformation from paper node semantic information to paper node vector representations, from which the vector representation of each paper node is obtained.
Description
Technical Field
The present invention relates to a representation learning method of a thesis network, and more particularly, to a parameterized representation learning method of a thesis network node.
Background
A social network is an Internet-era concept covering social networking service platforms such as blogs, wikis, tags, SNS, and RSS. The Internet has brought human society into a brand-new form of social organization and way of life, building an enormous networked community that transcends geographic space; in the 21st century, individuals of the networked global era are converging into new social groups. A paper network is the networking of the relations between papers, representing mutual citation and shared authorship between papers as a network.
Representation learning on paper networks currently relies mostly on non-parametric models, such as "DeepWalk: Online Learning of Social Representations" (Bryan Perozzi et al., 26 Mar 2014), in which the non-parametric word2vec method is used to learn representations of the paper network.
Network structure refers to the connectivity pattern of a network. Network topologies come in many forms, such as two-dimensional structures including the ring, star, tree, neighbor-connected network, and systolic array; see "Analysis of Interconnection Network Structures" (Wang Dingxing and Chen Guoliang, eds., October 1990, pages 36-38). With the development of networks, further structures, such as honeycomb structures, have also appeared.
Existing representation learning methods for paper networks must traverse all papers in the network in order to learn their representations. When a new paper is added to the network, its representation cannot be learned, and its classification and analysis therefore cannot be completed.
Disclosure of Invention
In order to solve the problem that a newly added paper cannot undergo representation learning, the invention provides a parameterized paper network node representation learning method. In the invention, a star-shaped paper network structure is sampled by means of a random walk statistical model to obtain paper node vector information; the sampled paper node queue consists of a series of paper nodes, and the selection of the next paper node is random each time. After the paper network sampling step, a deep neural network based on a twin (Siamese) network framework is constructed, in which the two identical sub-networks of the twin network are multilayer perceptrons (MLPs). The learned multilayer perceptron serves as a nonlinear mapping function, and network node representation vectors are obtained by constructing this nonlinear mapping from the rich text information of network nodes to their representation vectors.
The invention provides a parameterized thesis network node representation learning method, which is characterized by comprising the following steps of:
Step one: obtaining, by sampling based on the random walk method, the neighbor-paper node set of any paper node and the neighbor-paper node sets of those neighbors;
step 101: constructing an empty paper node queue, denoted V, used to store a sequence of paper nodes; the maximum number of elements of the queue V is mv, with mv taking a value of 10-20; then step 102 is executed;
step 102: selecting any paper node paper_a and placing paper_a at position 1 of the paper node queue V; then step 103 is executed;
step 103: acquiring the set of all neighbor paper nodes belonging to paper_a; a neighbor paper node is a paper node sharing an edge with paper_a; then step 104 is executed;
step 104: according to the total number B of neighbor nodes in the neighbor-paper node set, determining the first-hop probability of jumping to each neighbor paper node, where c represents the hop count; then step 105 is executed;
step 105: using the alias sampling algorithm, according to the current first-hop probability, acquiring the neighbor paper node of the next hop and placing it at position 2 of the paper node queue V; then step 106 is executed;
step 106: obtaining the set of all neighbor paper nodes belonging to that neighbor paper node, i.e. the neighbor-paper node set of the neighbor; then step 107 is executed;
step 107: computing the shortest hop count between each such neighbor paper node and the original paper node paper_a; then step 108 is executed;
step 108: according to the shortest hop count, determining the second-hop probability of jumping to each neighbor paper node; then step 109 is executed;
step 109: once the second-hop probability is determined, selecting the next-hop paper node according to that probability and alias sampling, and placing it at position 3 of the paper node queue V; then step 110 is executed;
step 110: executing steps 106 to 109 in a loop until the number of elements in the paper node queue V reaches mv, then stopping the random walk; then step 111 is executed;
step 111: repeating steps 101 to 110 for every paper node in the whole paper network to complete neighbor node sampling, giving a set of paper node queues denoted VF = {V_1, V_2, ..., V_f, ..., V_F}; then step 201 is executed;
V_1 represents the first paper node queue;
V_2 represents the second paper node queue;
V_f represents any paper node queue, where f is the identification number of the queue;
V_F represents the last paper node queue, where F is the total number of paper node queues, f ∈ F;
Step two: generating neural network training data for the multilayer perceptron by a positive and negative sampling method;
step 201: establishing a positive sample queue Q_p and a negative sample queue Q_n to store, respectively, the positive and negative sampling data required for training the neural network; then step 202 is executed;
step 202: setting a neighbor window size hyperparameter WD for the paper node queue V_f; each paper node belonging to the queue V_f is recorded as a queue-paper node; then step 203 is executed;
g represents the identification number of a queue-paper node belonging to any paper node queue V_f; G denotes the length of the paper node queue V_f, g ∈ G;
for any node in a paper queue, all nodes whose distance from it within the queue is smaller than WD are regarded as positive sample nodes; for any paper node, the invention first acquires its set of up to 2 × WD adjacent paper nodes in the queue, recorded as the queue-adjacent paper node set;
among the queue-adjacent paper nodes, the subscript l denotes the identification number of a node whose identification number is neither the largest nor the smallest;
step 203: for any queue-paper node, sampling in order of neighbor identification number from small to large; the sampling process pairs each node in the queue-adjacent paper node set with the queue-paper node to form a triple; then step 204 is executed;
each queue-adjacent paper node and the queue-paper node form a triple of the form (adjacent node, queue node, +1), where +1 marks the triple as a positive sample (and -1 would mark a negative sample); each such triple is inserted into the positive sample queue Q_p;
step 204: executing steps 202 and 203 in a loop until all paper nodes in all queues of the set VF = {V_1, V_2, ..., V_f, ..., V_F} have completed neighbor sampling, yielding the positive sample queue Q_p; then step 207 is executed;
step 205: sampling over all paper nodes in the network, selecting any two paper nodes each time, a first paper node paper_a and a second paper node paper_o; if a connecting edge exists between them, or the two randomly selected nodes are identical, this step is repeated; otherwise the pair forms the triple (paper_a, paper_o, -1), which is stored in the negative sample queue Q_n; then step 206 is executed;
step 206: executing step 205 in a loop, with a positive/negative sample ratio parameter μ; letting np be the number of triples in the positive sample queue Q_p, sampling stops when the number of triples in Q_n equals μ × np; then step 207 is executed;
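The negative sampling of steps 205-206 can be sketched as follows; the function and parameter names (negative_sample, np_count, mu) are illustrative and not part of the patent, and nodes/edges are taken as plain Python values:

```python
import random

def negative_sample(nodes, edges, np_count, mu, rng=None):
    """Sketch of steps 205-206: draw random paper-node pairs and keep those
    with no connecting edge as negative triples (paper_a, paper_o, -1),
    stopping once the negative queue holds mu * np_count triples."""
    rng = rng or random.Random(0)
    edge_set = {frozenset(e) for e in edges}
    q_n = []
    target = int(mu * np_count)
    while len(q_n) < target:
        a, o = rng.choice(nodes), rng.choice(nodes)
        if a == o or frozenset((a, o)) in edge_set:
            continue  # connected or identical pairs are skipped, per step 205
        q_n.append((a, o, -1))
    return q_n
```

With μ = 1.0 the negative queue ends up the same size as the positive queue, matching the stopping rule of step 206.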
step 207: merging the positive sample queue Q_p obtained in step 204 and the negative sample queue Q_n obtained in step 206 into a new sample queue Q_New = {Q_1, ..., Q_(1+μ)×np}; then step 208 is executed;
Q_1 denotes the triple with the smallest identification number in the new sample queue Q_New;
Q_(1+μ)×np denotes the triple with the largest identification number; the subscript (1+μ) × np indicates that Q_New contains (1+μ) × np triples in total;
step 208: shuffling the order of all elements of the new sample queue Q_New = {Q_1, ..., Q_(1+μ)×np} to obtain an out-of-order sample queue Q_Sorting = {Q_1, ..., Q_(1+μ)×np}; then step 301 is executed;
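The merge-and-shuffle of steps 207-208 amounts to concatenating the two queues and permuting the result; a minimal sketch (the name build_training_queue and the seed argument are illustrative):

```python
import random

def build_training_queue(q_p, q_n, seed=0):
    """Sketch of steps 207-208: concatenate the positive and negative
    sample queues into Q_New with (1 + mu) * np triples, then shuffle
    them so training batches mix both sample kinds."""
    q_new = list(q_p) + list(q_n)
    random.Random(seed).shuffle(q_new)
    return q_new
```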
Step three: processing with a neural network paper probability model based on a multilayer perceptron;
step 301: for the queue Q_Sorting = {Q_1, ..., Q_(1+μ)×np} obtained in step 208, one triple (paper_a, paper_o, ±1) at a time is fed into the neural network paper probability model as a pair of paper nodes for learning; then step 302 is executed;
step 302: the two paper nodes paper_a and paper_o in each triple are mapped through the model f_θ, giving the two corresponding transformed vectors f_θ(paper_a) and f_θ(paper_o); then step 303 is executed;
step 303: calculating the Euclidean distance between the two paper nodes; then step 304 is executed;
the Euclidean distance between the transformed vectors is E = ||f_θ(paper_a) − f_θ(paper_o)||_2;
E_pos represents the Euclidean shortest distance (for positive pairs); E_neg represents the Euclidean longest distance (for negative pairs); c represents the hop count;
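The Euclidean distance of step 303 between two transformed vectors can be computed as follows (the function name euclidean_distance is illustrative):

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between the two transformed paper-node vectors
    f_theta(paper_a) and f_theta(paper_o), per step 303."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))
```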
step 304: merging the positive and negative samples into the Euclidean-distance loss function for the distributed representation of papers, and computing the loss function that balances positive and negative samples to obtain the overall loss function L; then step 305 is executed;
step 305: determining the nonlinear transformation function f_θ with the stochastic gradient descent algorithm, completing the representation learning of any two paper nodes paper_a and paper_o.
Network node representation describes each node in the network with a vector. To handle the large volume of information and the neighbor relationships in a social network, the invention provides a parameterized network node representation learning method that learns a nonlinear mapping function, so that the vector representation of a network node can be obtained directly from the node's content information. For the vector representation of a node, a random walk first gathers the node's surrounding nodes, and the relation between the node and its neighbors is then modeled with a twin network, through which the nonlinear mapping function is learned and determined. In simulation experiments, with the same SVM classifier, the network node representation vectors produced by this method yield classification results clearly better than other methods, verifying the method's effectiveness for node representation in paper networks.
Drawings
FIG. 1 is a flow chart of parameterized paper network node representation learning in accordance with the present invention.
FIG. 2 shows the results of the evaluation of the Micro-F1 index in the Cora data set.
FIG. 3 shows the results of evaluation of the Macro-F1 index in the Cora data set.
FIG. 4 is the results of the evaluation of the Micro-F1 metric in the Wiki data set.
FIG. 5 is a graph of the results of the evaluation of the Macro-F1 metric in a Wiki dataset.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In the present invention, a paper is denoted paper; multiple papers form a paper set, denoted AP, with AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A}; any paper in the set AP is called a paper node of the star-shaped paper network structure;
paper_1 represents the first paper node;
paper_2 represents the second paper node;
paper_a represents the a-th paper node, where a is the identification number of the paper node;
paper_A represents the last paper node, where A is the total number of papers, a ∈ A.
For convenience of explanation, paper_a is also called any paper node, and paper_o is any other paper node distinct from paper_a; hereinafter paper_a is referred to as the first arbitrary paper node and paper_o as the second arbitrary paper node.
The set of all neighbor paper nodes belonging to any paper node paper_a is referred to for short as the neighbor-paper node set of paper_a;
b represents the identification number of a neighbor node of paper_a;
B represents the total number of neighbor nodes belonging to paper_a, b ∈ B.
The set of all neighbor paper nodes belonging to any one neighbor node is referred to for short as the neighbor-paper node set of the neighbor;
e represents the identification number of a neighbor node of that neighbor paper node;
E represents the total number of neighbor nodes of the neighbor, e ∈ E.
In the present invention, the star-shaped paper network structure adopted is the structure of FIG. 1.19(c) on page 37 of "Analysis of Interconnection Network Structures" (Wang Dingxing and Chen Guoliang, eds., first edition, October 1990).
In the invention, the semantic information of a paper node refers to the vector representation of the words contained in the title, abstract, and body of the paper after lexical processing. The lexical processing performs 0/1 binarized encoding according to whether each word of the paper node's semantic information occurs in the paper's content, producing a 0/1 vector corresponding to the paper: "0" indicates absence and "1" indicates presence. Applying this processing to all paper nodes of the star-shaped paper network structure yields a two-dimensional matrix associating words with paper nodes, referred to for short as the paper binary matrix.
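The paper binary matrix described above can be sketched as follows; the names paper_binary_matrix, papers, and vocabulary are illustrative, with each paper given as its list of tokens:

```python
def paper_binary_matrix(papers, vocabulary):
    """Sketch of the paper binary matrix: one row per paper node, one
    column per word; an entry is 1 if the word occurs among the paper's
    title/abstract/body tokens and 0 otherwise."""
    rows = []
    for tokens in papers:
        present = set(tokens)
        rows.append([1 if w in present else 0 for w in vocabulary])
    return rows
```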
Constructing the multilayer perceptron neural network paper probability model from paper node semantic information
In the invention, constructing the paper probability model comprises: (A) setting the neural network paper probability model expression f_θ(·); (B) choosing any two paper nodes paper_a and paper_o from AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A} and passing them through the model to obtain the multilayer perceptron outputs f_θ(paper_a) and f_θ(paper_o); (C) computing the Euclidean distance between f_θ(paper_a) and f_θ(paper_o) and performing the loss function processing that balances positive and negative samples; (D) updating the WEIGHT parameter WEIGHT and the BIAS parameter BIAS of f_θ with the stochastic gradient descent algorithm to obtain the nonlinear transformation function f_θ of the learning target, and traversing all triples to complete the training of the multilayer-perceptron neural network on paper node semantic information.
In the invention, the constructed neural network paper probability model expression based on the multilayer perceptron is recorded as f_θ, a nonlinear mapping function applied to the semantic information of paper node paper_a. The parameter θ of the nonlinear mapping is determined by learning f_θ. Based on the nonlinear mapping function f_θ, the paper probability model representation f_θ(paper_a) of any paper paper_a can be obtained.
In the present invention, each paper_a has corresponding rich text information, and its nonlinear transformation is obtained with the multilayer perceptron neural network. Assuming the multilayer perceptron has H layers in total, the neural network has a WEIGHT parameter WEIGHT and a BIAS parameter BIAS for each layer.
In the present invention, the WEIGHT parameter is WEIGHT = {weight_1, weight_2, ..., weight_h, ..., weight_H}.
In the present invention, the BIAS parameter is BIAS = {bias_1, bias_2, ..., bias_h, ..., bias_H}.
weight_1 represents the weight parameter of the first layer of the neural network;
weight_2 represents the weight parameter of the second layer;
weight_h represents the weight parameter of any layer, where h is the layer identification number of the perceptron;
weight_H represents the weight parameter of the last layer, where H is the total number of perceptron layers;
bias_1 represents the bias parameter of the first layer of the neural network;
bias_2 represents the bias parameter of the second layer;
bias_h represents the bias parameter of any layer;
bias_H represents the bias parameter of the last layer.
Denoting the input semantic vector of a paper node as x, the first-layer output of the multilayer perceptron is recorded as o_1 = f_1(weight_1 · x + bias_1), where o_1 is the output of the first layer and f_1 is the activation function of the first-layer network.
Similarly, the second-layer output is recorded as o_2 = f_2(weight_2 · o_1 + bias_2), where o_2 is the output of the second layer and f_2 is the activation function of the second-layer network.
Any layer's output is recorded as o_h = f_h(weight_h · o_{h−1} + bias_h), where f_h is the activation function of that layer.
In the present invention, the activation function f_h of any layer of the multilayer perceptron is typically chosen to be a nonlinear function, such as the sigmoid or tanh function. The output of the last layer of the multilayer perceptron is the composition of multiple nonlinear functions applied to the input, and can therefore simply be written as f_θ(·), where θ denotes all of the parameters. The final output of the MLP-based neural network paper probability model is f_θ(paper_a).
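The layer-by-layer forward pass described above can be sketched in a few lines; the names mlp_forward, weights, and biases are illustrative, and sigmoid is used for every layer's activation as one of the choices the description allows:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mlp_forward(x, weights, biases):
    """Sketch of the H-layer perceptron forward pass: each layer applies
    o_h = f_h(weight_h . o_{h-1} + bias_h) with a sigmoid activation, so
    the final output is the composed nonlinear map f_theta(x)."""
    out = x
    for w, b in zip(weights, biases):  # w: matrix (one row per unit), b: vector
        out = [sigmoid(sum(wi * oi for wi, oi in zip(row, out)) + bi)
               for row, bi in zip(w, b)]
    return out
```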
In the present invention, the aim is to make the Euclidean distance between similar points in the representation space as short as possible, and the Euclidean distance between dissimilar points as long as possible. The basic (contrastive) form minimizes E_pos while maximizing E_neg, where:
E_pos represents the Euclidean shortest distance; E_neg represents the Euclidean longest distance; c represents the hop count.
In the present invention, each triple (paper_a, paper_o, ℓ) carries a flag ℓ indicating whether it is a positive or negative sample; positive samples are required to lie close together in the representation space, while negative samples are required to lie as far apart as possible. The invention therefore merges positive and negative samples into a loss function over the Euclidean distance of the distributed paper representations:
m denotes the identification number of any triple in Q_Sorting; paper_a^(m) and paper_o^(m) denote the first and second arbitrary paper nodes of triple m; ℓ^(m) is the positive/negative sample flag of triple m; L denotes the overall loss function, which is the sum of the loss over all elements of the out-of-order sample queue Q_Sorting.
In the invention, the proportion of positive and negative samples is different, and the similarity between the positive and negative samples is different. For example, positive samples may be more similar due to the existence of a connecting edge, while negative samples are more different, so that the loss functions generated by the positive and negative samples will not be the same, and therefore, for this application, the present invention needs a harmonic parameter γ to balance the loss functions of the positive and negative samples, and therefore the loss functions will be added to γ to become:
In the invention, the purpose of training the neural network is to reduce the value of the loss function to a minimum; to train the network and determine its weights and bias values, the invention learns the network parameters with the stochastic gradient descent algorithm.
In the present invention, the model is trained by determining the nonlinear transformation function f_θ through the stochastic gradient descent algorithm. Since f_θ mainly comprises the WEIGHT parameter WEIGHT and the BIAS parameter BIAS, the update value of each gradient descent step is the partial derivative of L with respect to WEIGHT and BIAS, so that at each iteration WEIGHT and BIAS are updated at a learning rate η according to the parameter update values:
WEIGHT_after = WEIGHT_before + η · ΔWEIGHT
BIAS_after = BIAS_before + η · ΔBIAS
WEIGHT_before is the weight parameter before the update and WEIGHT_after the weight parameter after the update; ΔWEIGHT is the gradient-descent update derived from the partial derivative of L with respect to WEIGHT.
BIAS_before is the bias parameter before the update and BIAS_after the bias parameter after the update; ΔBIAS is the gradient-descent update derived from the partial derivative of L with respect to BIAS.
When stochastic gradient descent is used, too many training iterations can cause overfitting; the invention therefore adopts the early-stop method, halting training when the loss function L no longer decreases, to prevent the overfitting phenomenon during training.
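The early-stop rule can be sketched as a loop around the gradient-descent step; the names train_with_early_stopping, step_fn, and patience are illustrative (the patent stops when L no longer decreases; patience generalizes that to a few stale iterations):

```python
def train_with_early_stopping(step_fn, max_iters=1000, patience=3):
    """Sketch of the early-stop rule: run gradient-descent steps (step_fn
    returns the current loss L) and stop once the loss has failed to
    decrease for `patience` consecutive iterations, limiting overfitting."""
    best = float("inf")
    stale = 0
    for i in range(max_iters):
        loss = step_fn(i)
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best
```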
In the invention, the WEIGHT parameter WEIGHT and the BIAS parameter BIAS of each layer of the perceptron are saved to obtain the nonlinear transformation function f_θ of the learning target, completing the training of the multilayer-perceptron neural network; according to the learned target f_θ, the representation vector of each paper_a is generated, i.e. the neural network paper probability model of the multilayer perceptron for paper node semantic information, f_θ(paper_a), is constructed.
The invention provides a parameterized thesis network node representation learning method, which specifically comprises the following steps:
the method comprises the following steps that firstly, a neighbor-paper node set of any paper node and a neighbor-paper node set of the neighbor are obtained through sampling based on a random walk method;
In the present invention, for the paper set AP = {paper_1, paper_2, ..., paper_a, ..., paper_o, ..., paper_A} in the star-shaped paper network structure, the neighbor paper nodes of each paper node are sampled by a random walk whose jump probability combines the previous hop and the next hop. For any paper node paper_a, the random walk method samples the neighbor-paper node set belonging to paper_a.
Step 101: constructing a paper node empty queue marked as V, wherein the V is used for storing a paper node sequence; the maximum queue element bit number of the paper node empty queue V is mv, and the value of mv is 10-20; then, step 102 is executed;
step 102: selecting any paper node paper_a and placing paper_a at position 1 of the paper node queue V; then step 103 is executed;
step 103: acquiring the set of all neighbor paper nodes belonging to any paper node paper_a; in the invention, a neighbor paper node is a paper node sharing an edge with paper_a; then step 104 is executed;
step 104: according to the total number B of neighbor nodes in the neighbor-paper node set, determining the probability of jumping to each neighbor paper node (the first-hop probability for short), where c represents the hop count; then step 105 is executed;
step 105: adopting the alias sampling algorithm, according to the current first-hop probability, acquiring the neighbor paper node of the next hop and placing it at position 2 of the paper node queue V; then step 106 is executed;
step 106: obtaining the set of all neighbor paper nodes belonging to that neighbor paper node, i.e. the neighbor-paper node set of the neighbor; then step 107 is executed;
step 107: computing the shortest hop count between each such neighbor paper node and the original paper node paper_a; then step 108 is executed;
In the present invention, the shortest hop count represents the minimum hop distance from any neighbor paper node back to the previous paper node. For example, if a neighbor paper node requires at least 1 hop to reach paper node paper_a, its shortest hop count is 1; if the neighbor paper node is paper_a itself, its shortest hop count is 0; and so on.
Step 108: according to the shortest hop count, determining the probability of jumping to each neighbor paper node (the second-hop probability for short); then step 109 is executed;
In the present invention, the shortest hop count refers to the minimum number of hops required between two paper nodes.
In the invention, p is the parameter adjusting the second-hop probability of paper nodes not yet in the paper node queue V during the random walk (the jump-out parameter for short), and q is the parameter adjusting the second-hop probability of paper nodes already in the paper node queue V (the jump-in parameter for short). Together p and q control the jump probability: to make the walk explore more randomly beyond local jumps, p is set larger; conversely, q is set larger.
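A hedged, node2vec-style sketch of the second-hop probability follows: each candidate's unnormalized weight depends on its shortest hop count back to the starting paper node, biased by p and q. The exact weighting in the patent's elided formula may differ; the names second_hop_weights and hop_to_origin are illustrative:

```python
def second_hop_weights(candidates, hop_to_origin, p, q):
    """Hedged sketch of the second-hop probability: for each candidate
    next node, the weight depends on its shortest hop count d back to the
    starting paper node (1/p when d == 0, 1 when d == 1, 1/q otherwise),
    following the usual biased-walk scheme; this is an assumption, not
    the patent's exact formula."""
    raw = []
    for node in candidates:
        d = hop_to_origin[node]
        if d == 0:
            raw.append(1.0 / p)
        elif d == 1:
            raw.append(1.0)
        else:
            raw.append(1.0 / q)
    total = sum(raw)
    return [w / total for w in raw]  # normalized second-hop probabilities
```

The normalized probabilities are then what the alias sampling of step 109 would draw from.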
Step 109: once the second-hop probability is determined, selecting the next-hop paper node according to that probability and alias sampling, and placing it at position 3 of the paper node queue V; then step 110 is executed;
step 110: executing steps 106 to 109 in a loop until the number of elements in the paper node queue V reaches mv, then stopping the random walk; then step 111 is executed;
step 111: in the present invention, steps 101 to 110 are repeated for every paper node in the whole paper network to complete neighbor node sampling, giving a set of paper node queues denoted VF = {V_1, V_2, ..., V_f, ..., V_F}; then step 201 is executed.
V_1 represents the first paper node queue;
V_2 represents the second paper node queue;
V_f represents any paper node queue, where f is the identification number of the queue;
V_F represents the last paper node queue, where F is the total number of paper node queues, f ∈ F.
Step two: generating neural network training data for the multilayer perceptron by a positive and negative sampling method;
In the invention, the paper node queue set VF = {V_1, V_2, ..., V_f, ..., V_F} obtained in step one is used to generate training data usable by the neural network; in addition to the training data in the queue set, the invention generates the data required for training the model by means of a negative sampling algorithm.
Step 201: establishing a positive sample queue Qp and a negative sample queue Qn, which respectively store the positive and negative sampling data required for training the neural network; then executing step 202;
step 202: setting a neighbor window size hyperparameter WD; each paper node belonging to a paper node queue Vf is noted; then step 203 is executed;
the noted paper node denotes the g-th paper node belonging to any paper node queue Vf, where g represents the identification number of the paper node;
G denotes the length of the paper node queue Vf, g ∈ G.
For any node in a paper queue, all nodes in the queue whose distance to that node is within WD are considered positive sample nodes. Each time, for any one paper node, the invention first obtains its set of 2 × WD neighboring paper nodes in the queue.
An in-queue neighboring paper node denotes any paper node in that neighboring set other than the two boundary nodes; the subscript l indicates the identification number of a paper node that is neither the largest nor the smallest, i.e., any identification number other than those of these 2 paper nodes.
Step 203: for any in-queue paper node, sampling proceeds in ascending order of the neighbor identification numbers; the sampling process forms a triple from each node in the neighboring set and the in-queue paper node, and then step 204 is executed;
For each node in the neighboring set and the in-queue paper node, a triple labeled +1 is formed, where +1 indicates that the triple is a positive sample, whereas -1 indicates a negative sample, and the triple is inserted into the positive sample queue Qp.
Step 204: step 202 and step 203 are executed in a loop until all paper nodes in all paper node queues of the paper node queue set VF = {V1, V2, ..., Vf, ..., VF} complete the neighbor paper node sampling, obtaining the positive sample queue Qp; then, step 205 is performed;
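Steps 202 to 204 amount to sliding a window of radius WD over each walk and labeling every co-window pair +1. An illustrative sketch (the window takes all nodes within distance WD, giving up to 2 × WD neighbors per node, as in the description):

```python
def positive_triples(VF, WD):
    """Pair every node in every queue with all nodes within distance WD, label +1."""
    Qp = []
    for V in VF:
        for i, u in enumerate(V):
            lo, hi = max(0, i - WD), min(len(V), i + WD + 1)
            for j in range(lo, hi):
                if j != i:
                    Qp.append((u, V[j], +1))   # positive sample triple
    return Qp
```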
step 205: sampling over all paper nodes in the network, selecting any two paper nodes from the network each time (the two selected paper nodes may be adjacent or non-adjacent), namely a first paper node papera and a second paper node papero. If a connecting edge exists between the two paper nodes ((papera, papero) ∈ E), or the two randomly chosen paper nodes are identical (papera = papero), this step is repeated; otherwise the two paper nodes papera, papero compose the triple (papera, papero, -1), which is stored into the negative sample queue Qn; then step 206 is performed;
step 206: step 205 is executed in a loop with a positive/negative sample ratio parameter μ; assuming the number of triples in the positive sample queue Qp is np, the loop stops when the number of triples in Qn equals μ × np, and then step 207 is performed;
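Steps 205 and 206 can be sketched as rejection sampling of distinct, non-adjacent node pairs until μ × np negative triples are collected (names are illustrative; `edges` plays the role of the edge set E):

```python
import random

def negative_triples(nodes, edges, np_count, mu, rng=None):
    """Sample mu * np_count triples (a, o, -1) with a != o and no edge between them."""
    rng = rng or random.Random()
    Qn = []
    target = int(mu * np_count)
    while len(Qn) < target:
        a, o = rng.choice(nodes), rng.choice(nodes)
        if a == o or (a, o) in edges or (o, a) in edges:
            continue                     # adjacent or identical: resample (step 205)
        Qn.append((a, o, -1))
    return Qn
```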
step 207: the positive sample queue Qp obtained in step 204 and the negative sample queue Qn obtained in step 206 are combined to obtain a new sample queue QNew = {Q1, ..., Q(1+μ)×np}; execute step 208;
Q1 denotes the triple with the smallest identification number in the new sample queue QNew.
Q(1+μ)×np denotes the last triple in the new sample queue QNew; the subscript (1+μ)×np indicates that the sample queue QNew comprises (1+μ)×np triples.
Step 208: the order of all elements in the new sample queue QNew = {Q1, ..., Q(1+μ)×np} is shuffled to obtain the shuffled sample queue QSorting = {Q1, ..., Q(1+μ)×np}, and then step 301 is performed.
Processing in a neural network paper probability model based on a multilayer perceptron;
step 301: for the QSorting = {Q1, ..., Q(1+μ)×np} obtained in step 208, one triple (papera, papero, b) is taken at a time, and the pair of paper nodes is fed into the neural network paper probability model for learning; execute step 302;
step 302: for the two paper nodes papera and papero in each triple, the two corresponding transformed vectors are obtained through the model mapping fθ; step 303 is executed;
step 303: calculating the Euclidean distance between the two transformed paper node vectors, and executing step 304;
in the present invention, the goal of the twin (Siamese) network is to make the Euclidean distance between similar points in the representation space as short as possible, and the Euclidean distance between dissimilar points as long as possible. The basic form is as follows:
Epos represents the Euclidean distance term of a positive (similar) pair, to be made as short as possible; Eneg represents the Euclidean distance term of a negative (dissimilar) pair, to be made as long as possible; c represents the hop count.
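The twin-network objective described above is the standard contrastive loss. A numpy sketch under the assumption of a unit margin for dissimilar pairs (the patent's exact form, including the γ balance and the hop count c, is not reproduced here):

```python
import numpy as np

def contrastive_loss(za, zo, label, margin=1.0):
    """za, zo: embedding vectors; label: +1 similar, -1 dissimilar.

    Similar pairs are pulled together (distance squared is minimized);
    dissimilar pairs are pushed apart until their distance exceeds `margin`.
    """
    d = np.linalg.norm(za - zo)
    if label == +1:
        return d ** 2                        # positive term: shrink distance
    return max(0.0, margin - d) ** 2         # negative term: enforce margin
```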
Step 304: merging the positive and negative samples, substituting them into the Euclidean-distance loss function of the distributed paper representation, and computing the loss balancing positive and negative samples to obtain the overall loss function L; execute step 305;
step 305: determining the non-linear transformation function fθ by a stochastic gradient descent algorithm, completing the representation learning of any two paper nodes papera and papero.
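The stochastic gradient descent of step 305 can be illustrated for the simplest case where fθ is a direct embedding lookup (the patent uses a multilayer perceptron; this reduction only shows the update direction of the contrastive objective):

```python
import numpy as np

def sgd_step(emb, a, o, label, lr=0.1, margin=1.0):
    """One SGD update of the contrastive objective on embeddings emb[a], emb[o]."""
    diff = emb[a] - emb[o]
    d = np.linalg.norm(diff)
    if label == +1:
        grad = 2 * diff                        # gradient of d^2 w.r.t. emb[a]
    elif 0 < d < margin:
        grad = -2 * (margin - d) * diff / d    # gradient of (margin - d)^2
    else:
        grad = np.zeros_like(diff)             # margin satisfied: no update
    emb[a] -= lr * grad
    emb[o] += lr * grad
    return emb
```

A positive pair moves closer after the step; a too-close negative pair moves apart.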
Example 1
The embodiment adopts the Cora paper data set and the Pubmed knowledge network data set to carry out learning and experimental work.
Cora is a paper collection data set containing 2708 paper nodes and 5429 edges; each node corresponds to a rich-text information vector of length 1433, in which 0/1 indicates the absence or presence of a word. Meanwhile, each node is associated with a category attribute; there are 7 category attribute values in total.
Pubmed is a knowledge network data set containing 19717 paper nodes and 44338 edges; each node corresponds to a rich-text information vector of length 500, in which 0/1 indicates the absence or presence of a word. Meanwhile, each node is associated with a category attribute; there are 3 category attribute values in total.
In order to verify its effectiveness, the invention mainly compares the performance of different methods on the paper node classification task:
deepwalk: the network is sampled by adopting a common random walk algorithm, and then the representation of each node in the network is obtained by using a word2vec algorithm. (2014deep walk of online learning of social representation [ J ]. Perozzi B, Alrfou R, Skiiena S.KDD:701-710.)
TADW: and decomposing the random walk in the Deepwalk, skillfully adding rich text information of the nodes, and obtaining the representation of each node in the network by adopting a matrix multiplication mode. (2015, Network representation with rich text information [ C ] YangC, ZHao D, et al. International conference on Intelligent Association. AAAI Press: 2111-
Node2Vec: an upgraded version of Deepwalk that samples the network with a second-order random walk algorithm, then obtains the representation of each node in the network using the word2vec algorithm. (2016, node2vec: Scalable Feature Learning for Networks [C]. Grover A, Leskovec J. KDD: 855-864.)
The method selects a node prediction task to compare the quality of the vector representations. In the experiments, cross-validation is adopted, and an SVM classifier is used for classification in the different prediction settings.
The invention adopts two evaluation indexes, Micro-F1 and Macro-F1.
The Macro-F1 calculation method is: Macro-F1 = 2 × Pmacro × Rmacro / (Pmacro + Rmacro),
wherein Pmacro and Rmacro respectively represent the macro precision and the macro recall.
The calculation method of the Micro-F1 is: Micro-F1 = 2 × Pmicro × Rmicro / (Pmicro + Rmicro),
wherein Pmicro and Rmicro respectively represent the micro precision and the micro recall.
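For concreteness, the two indexes can be computed as follows (a plain-Python sketch; in practice scikit-learn's `f1_score` with `average='macro'` or `'micro'` serves the same purpose):

```python
def f1_scores(y_true, y_pred, classes):
    """Return (macro_f1, micro_f1) from per-class true/false positive counts."""
    tp = {c: 0 for c in classes}
    fp = {c: 0 for c in classes}
    fn = {c: 0 for c in classes}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Macro-F1: average of per-class F1 values
    f1s = []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    macro = sum(f1s) / len(classes)
    # Micro-F1: F1 over globally pooled counts
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    p_micro = TP / (TP + FP) if TP + FP else 0.0
    r_micro = TP / (TP + FN) if TP + FN else 0.0
    micro = 2 * p_micro * r_micro / (p_micro + r_micro) if p_micro + r_micro else 0.0
    return macro, micro
```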
The effect on the Cora data set is shown in FIGS. 2 and 3, which compare the present invention with other methods; FIG. 2 shows the performance of each method on the Micro-F1 evaluation index, and FIG. 3 on the Macro-F1 evaluation index. The horizontal axis of both graphs is the percentage of the total data used as classifier training data. The figures show that the method performs better than the other network representation learning methods under both the Micro-F1 and Macro-F1 evaluation indexes. In particular, compared with the Deepwalk and Node2vec algorithms, which use network information alone without node semantic information, the algorithm of the invention improves by more than 5% at every training-data proportion under both indexes, showing that after fusing network node information with the network topology structure, the obtained node representation vectors are markedly better than those obtained from topology information alone. Meanwhile, compared with the TADW method, which also combines network node information with network topology information, the method of the present invention still improves by 3% on both evaluation indexes.
The effect on the Wiki data set is shown in FIGS. 4 and 5; the invention again performs better than the other network representation learning methods under both the Micro-F1 and Macro-F1 evaluation indexes. Since the number of categories in the Wiki data set is far larger than in the Cora data set, the Deepwalk and Node2vec algorithms, which do not use node semantic information, classify poorly, far below the TADW results, illustrating that semantics dominates this data set. Under both the Micro-F1 and Macro-F1 evaluation indexes, the method improves on the TADW results by 2%, showing that the node representation vectors obtained by fusing network node information with the network topology structure are better than those obtained by direct matrix factorization. This demonstrates that the invention better fuses network information and semantic information in the node representation and obtains better representation vectors.
Through the analysis of FIGS. 2-5, these experiments show that the invention can naturally fuse the network structure and the semantic information to obtain better network node representation vectors, which verifies its validity.
Claims (3)
1. A parameterized thesis network node representation learning method is characterized by comprising the following steps:
the method comprises the following steps: step one, obtaining through sampling, based on a random walk method, the neighbor paper node set of any one paper node and the neighbor paper node sets of its neighbors;
step 101: constructing an empty paper node queue, marked as V, wherein V is used for storing a paper node sequence; the maximum number of queue elements of the paper node queue V is mv, mv taking a value of 10-20; then, step 102 is executed;
step 102: selecting any paper node papera, and placing papera at bit 1 of the paper node queue V; then step 103 is executed;
step 103: acquiring the set of all neighbor paper nodes belonging to the paper node papera; the neighbor paper node set refers to the set of paper nodes that have connecting edges with the paper node papera; then step 104 is executed;
the first neighbor paper node represents the first neighbor node belonging to the paper node papera;
the second neighbor paper node represents the second neighbor node belonging to the paper node papera;
b represents the identification number of a neighbor node of the paper node papera;
the last neighbor paper node is the B-th one, B representing the total number of neighbor nodes of the paper node papera, b ∈ B;
step 104: determining the first-hop probability according to the total number B of neighbor nodes in the neighbor paper node set, wherein c represents the hop count; then step 105 is executed;
step 105: adopting the alias sampling algorithm, according to the current first-hop probability, to acquire the next-hop neighbor paper node from the neighbor paper node set, and at the same time placing it at bit 2 of the paper node queue V; then step 106 is executed;
step 106: obtaining the set of all neighbor paper nodes belonging to any neighbor paper node, i.e., the neighbor's neighbor paper node set; then step 107 is performed;
the first neighbor paper node of the neighbor represents the first neighbor node belonging to any neighbor paper node;
the second neighbor paper node of the neighbor represents the second neighbor node belonging to any neighbor paper node;
e represents the identification number of a neighbor node of any neighbor paper node;
the last neighbor paper node of the neighbor is the E-th one, E representing the total number of neighbor nodes of the neighbor paper node, e ∈ E;
step 107: calculating the shortest hop count between any neighbor's neighbor paper node and the paper node papera; then step 108 is executed;
wherein the shortest hop count represents the minimum hop distance from any neighbor's neighbor paper node to the previous paper node papera in the walk;
step 108: determining, according to the shortest hop count, the second-hop probability of hopping to each neighbor paper node; then step 109 is performed;
in the second-hop probability, c represents the hop count; p is a parameter for adjusting, in the random walk method, the magnitude of the second-hop probability for paper nodes not in the paper node queue V, i.e., the skip parameter; q is a parameter for adjusting the magnitude of the second-hop probability for paper nodes already in the paper node queue V, i.e., the hop-in parameter;
step 109: after the second-hop probability is determined, selecting the next-hop paper node according to the second-hop probability and alias sampling, and at the same time placing it at bit 3 of the paper node queue V; then, step 110 is executed;
step 110: circularly executing step 106 to step 109 until the number of elements in the paper node queue V reaches mv, then stopping the random walk; then step 111 is executed;
step 111: repeating the steps 101 to 110 for each paper node in the whole paper network to complete the neighbor node sampling of the paper nodes, obtaining a paper node queue set denoted as VF = {V1, V2, ..., Vf, ..., VF}; then step 201 is executed;
V1 represents the first paper node queue;
V2 represents the second paper node queue;
Vf represents any one paper node queue, f representing the identification number of the paper node queue;
VF represents the last paper node queue, F representing the total number of paper node queues, f ∈ F;
generating neural network training data of the multilayer perceptron by adopting a negative sampling method;
step 201: establishing a positive sample queue Qp and a negative sample queue Qn, which respectively store the positive and negative sampling data required for training the neural network; then executing step 202;
step 202: setting a neighbor window size hyperparameter WD; each paper node belonging to a paper node queue Vf is noted; then step 203 is executed;
the noted paper node denotes the g-th paper node belonging to any paper node queue Vf, g representing the identification number of the paper node;
G denotes the length of the paper node queue Vf, g ∈ G;
for any node in a paper queue, all nodes in the queue whose distance to that node is within WD are considered positive sample nodes; each time, for any one paper node, its set of 2 × WD neighboring paper nodes in the queue is first obtained;
an in-queue neighboring paper node denotes any paper node in that neighboring set other than the two boundary nodes, the subscript l representing the identification number of a paper node that is neither the largest nor the smallest;
step 203: for any in-queue paper node, sampling in ascending order of the neighbor identification numbers, the sampling process forming a triple from each node in the neighboring set and the in-queue paper node; then executing step 204;
for each node in the neighboring set and the in-queue paper node, forming a triple labeled +1, wherein +1 denotes that the triple is a positive sample and -1 denotes a negative sample, and inserting the triple into the positive sample queue Qp;
step 204: circularly executing step 202 and step 203 until all paper nodes in all paper node queues of the paper node queue set VF = {V1, V2, ..., Vf, ..., VF} complete the neighbor paper node sampling, obtaining the positive sample queue Qp; then step 205 is performed;
step 205: sampling over all paper nodes in the network, each time selecting any two paper nodes from the network, namely a first paper node papera and a second paper node papero; if a connecting edge exists between the two paper nodes or the two randomly selected paper nodes are identical, repeating this step; otherwise forming the triple (papera, papero, -1) from the two paper nodes papera, papero and storing it into the negative sample queue Qn; then step 206 is performed;
step 206: circularly executing step 205 with a positive/negative sample ratio parameter μ; assuming the number of triples in the positive sample queue Qp is np, the loop stops when the number of triples in Qn equals μ × np; then step 207 is performed;
step 207: combining the positive sample queue Qp obtained in step 204 and the negative sample queue Qn obtained in step 206 to obtain a new sample queue QNew = {Q1, ..., Q(1+μ)×np}; then executing step 208;
Q1 represents the triple with the smallest identification number in the new sample queue QNew;
Q(1+μ)×np represents the last triple in the new sample queue QNew, the subscript (1+μ)×np indicating that the sample queue QNew comprises (1+μ)×np triples;
step 208: shuffling the order of all elements in the new sample queue QNew = {Q1, ..., Q(1+μ)×np} to obtain the shuffled sample queue QSorting = {Q1, ..., Q(1+μ)×np}; then step 301 is performed;
processing in a neural network paper probability model based on a multilayer perceptron;
step 301: taking one triple (papera, papero, b) at a time from the QSorting = {Q1, ..., Q(1+μ)×np} obtained in step 208, and feeding the pair of paper nodes into the neural network paper probability model for learning; then executing step 302;
step 302: for the two paper nodes papera and papero in each triple, obtaining the two corresponding transformed vectors through the model mapping fθ; executing step 303;
step 303: calculating Euclidean distances of the two thesis nodes, and executing step 304;
the Euclidean distance is: E = ||fθ(papera) − fθ(papero)||2;
Epos represents the Euclidean distance term of a positive (similar) pair, to be made as short as possible; Eneg represents the Euclidean distance term of a negative (dissimilar) pair, to be made as long as possible; c represents the hop count;
step 304: merging the positive and negative samples, substituting them into the Euclidean-distance loss function of the distributed paper representation, and computing the loss balancing positive and negative samples to obtain the overall loss function L; executing step 305;
γ represents a harmonic parameter used in the loss function to balance the positive and negative samples;
m represents the identification number of any one triple in QSorting;
in the triple (papera, papero, b), b is a flag representing whether the triple is a positive or negative sample, wherein positive samples are regarded as points that need to be close in the representation space and negative samples as points that need to be as far apart in the space as possible;
step 305: determining the non-linear transformation function fθ by a stochastic gradient descent algorithm, completing the representation learning of any two paper nodes papera and papero.
2. A parameterized thesis network node representation learning method in accordance with claim 1, characterized in that: step 103, step 104 and step 105 realize the acquisition of the 2nd element in the paper node queue V.
3. A parameterized thesis network node representation learning method in accordance with claim 1, characterized in that: step 106 to step 110 realize the acquisition of the elements after the 2nd-bit element of the paper node queue V, until the maximum queue element number mv of the paper node queue V is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711308050.6A CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711308050.6A CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108228728A CN108228728A (en) | 2018-06-29 |
CN108228728B true CN108228728B (en) | 2020-07-17 |
Family
ID=62653503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711308050.6A Active CN108228728B (en) | 2017-12-11 | 2017-12-11 | Parameterized thesis network node representation learning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108228728B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213831A (en) * | 2018-08-14 | 2019-01-15 | 阿里巴巴集团控股有限公司 | Event detecting method and device calculate equipment and storage medium |
CN109376864A (en) * | 2018-09-06 | 2019-02-22 | 电子科技大学 | A kind of knowledge mapping relation inference algorithm based on stacking neural network |
CN109558494A (en) * | 2018-10-29 | 2019-04-02 | 中国科学院计算机网络信息中心 | A kind of scholar's name disambiguation method based on heterogeneous network insertion |
CN110322021B (en) * | 2019-06-14 | 2021-03-30 | 清华大学 | Hyper-parameter optimization method and device for large-scale network representation learning |
CN112559734B (en) * | 2019-09-26 | 2023-10-17 | 中国科学技术信息研究所 | Brief report generating method, brief report generating device, electronic equipment and computer readable storage medium |
CN111292062B (en) * | 2020-02-10 | 2023-04-25 | 中南大学 | Network embedding-based crowd-sourced garbage worker detection method, system and storage medium |
CN112148876B (en) * | 2020-09-23 | 2023-10-13 | 南京大学 | Paper classification and recommendation method |
CN117648670B (en) * | 2024-01-24 | 2024-04-12 | 润泰救援装备科技河北有限公司 | Rescue data fusion method, electronic equipment, storage medium and rescue fire truck |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250438A (en) * | 2016-07-26 | 2016-12-21 | 上海交通大学 | Based on random walk model zero quotes article recommends method and system |
CN106777339A (en) * | 2017-01-13 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of method that author is recognized based on heterogeneous network incorporation model |
CN107451596A (en) * | 2016-05-30 | 2017-12-08 | 清华大学 | A kind of classified nodes method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8918431B2 (en) * | 2011-09-09 | 2014-12-23 | Sri International | Adaptive ontology |
- 2017-12-11 CN CN201711308050.6A patent/CN108228728B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451596A (en) * | 2016-05-30 | 2017-12-08 | 清华大学 | A kind of classified nodes method and device |
CN106250438A (en) * | 2016-07-26 | 2016-12-21 | 上海交通大学 | Based on random walk model zero quotes article recommends method and system |
CN106777339A (en) * | 2017-01-13 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of method that author is recognized based on heterogeneous network incorporation model |
Also Published As
Publication number | Publication date |
---|---|
CN108228728A (en) | 2018-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108228728B (en) | Parameterized thesis network node representation learning method | |
US11544535B2 (en) | Graph convolutional networks with motif-based attention | |
Liu et al. | Principled multilayer network embedding | |
Suthaharan et al. | Decision tree learning | |
Tran et al. | On filter size in graph convolutional networks | |
CN112508085B (en) | Social network link prediction method based on perceptual neural network | |
Kundu et al. | Fuzzy-rough community in social networks | |
CN110147911B (en) | Social influence prediction model and prediction method based on content perception | |
Amin | A novel classification model for cotton yarn quality based on trained neural network using genetic algorithm | |
Venturelli et al. | A Kriging-assisted multiobjective evolutionary algorithm | |
Nasiri et al. | A node representation learning approach for link prediction in social networks using game theory and K-core decomposition | |
Kepner et al. | Mathematics of Big Data | |
US11669727B2 (en) | Information processing device, neural network design method, and recording medium | |
CN112905906B (en) | Recommendation method and system fusing local collaboration and feature intersection | |
Jenny Li et al. | Evaluating deep learning biases based on grey-box testing results | |
Coscia et al. | The node vector distance problem in complex networks | |
Lokhande et al. | Accelerating column generation via flexible dual optimal inequalities with application to entity resolution | |
CN109697511B (en) | Data reasoning method and device and computer equipment | |
CN113159976B (en) | Identification method for important users of microblog network | |
Kim et al. | Network analysis for active and passive propagation models | |
Jayachitra Devi et al. | Link prediction model based on geodesic distance measure using various machine learning classification models | |
Javaheripi et al. | Swann: Small-world architecture for fast convergence of neural networks | |
Van Tran et al. | On filter size in graph convolutional networks | |
Montiel et al. | Reducing the size of combinatorial optimization problems using the operator vaccine by fuzzy selector with adaptive heuristics | |
Ferdaus et al. | A genetic algorithm approach using improved fitness function for classification rule mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |