CN107729290A - Representation learning method of super-large scale graph by using locality sensitive hash optimization - Google Patents

Representation learning method of super-large scale graph by using locality sensitive hash optimization

Info

Publication number
CN107729290A
CN107729290A
Authority
CN
China
Prior art keywords
node
vector
target
hash function
training sample
Prior art date
Legal status
Granted
Application number
CN201710857844.1A
Other languages
Chinese (zh)
Other versions
CN107729290B (en)
Inventor
李笑宇
陈修司
周畅
高军
Current Assignee
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN201710857844.1A
Publication of CN107729290A
Application granted
Publication of CN107729290B
Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a representation learning method for super-large scale graphs optimized with locality-sensitive hashing. The method is as follows: each node of a target graph is processed with locality-sensitive hash functions, and the node vector of the node is defined from the hash results; training samples are obtained from the graph structure of the target graph; based on the training samples, the node vectors of the nodes in the target graph are trained with the skip-gram model to obtain the node vector representation corresponding to each node in the target graph. The invention resolves the difficulties caused by the "long-tail phenomenon" prevalent in real network structures, exploits both the content information and the structural information of the network, is suited to distributed implementation, and has strong scalability.

Description

Representation learning method of super-large scale graph by using locality sensitive hash optimization
Technical field
The invention belongs to the field of information technology, relates to representation learning methods for large-scale graph structures, and in particular to a representation learning method for super-large scale graphs optimized with locality-sensitive hashing.
Background technology
Representation learning on graphs, also known as "graph embedding" (Graph Embedding), refers to algorithms that map each node of a graph to a low-dimensional vector representation that preserves the node's features. The node vectors produced by a graph embedding algorithm can be regarded as the essential features of the graph's nodes and serve as general-purpose input for other machine learning tasks on the graph.
Most earlier research on graph embedding algorithms learns primarily from the network structure of the graph; typical methods include DeepWalk, LINE and node2vec. However, real-world network structures are often sparse and unevenly distributed: a small number of "popular" nodes attach to most of the edges and carry dense structural information, while a large number of "unpopular" nodes have very few edges and carry sparse structural information. This is the "long-tail effect". As a result, methods based only on structural information often give unsatisfactory representations for the large number of structurally sparse "long-tail" nodes.
Summary of the invention
The present invention proposes a scheme for computing node vector representations of large-scale networks, aimed in particular at resolving the difficulties caused by the "long-tail phenomenon" prevalent in real network structures. The algorithm exploits both the content information and the structural information of the network, is suited to distributed implementation, and has strong scalability.
The present invention uses locality-sensitive hash functions to establish content-based associations between the nodes of a graph, generates training samples from the original graph structure by random walks, and trains the corresponding node vectors with the skip-gram model. The method thus learns representations from both the content information of each node and the overall network structure, is highly scalable, and is particularly effective on graphs with a pronounced structural long-tail effect.
The present application uses locality-sensitive hash functions that take node content information as input. The defining property of locality-sensitive hashing guarantees that nodes with similar content information also receive close hash mappings, so that nodes with similar content become linked through shared hash outputs. The invention redefines the vector of each node from its hash outputs; this redefinition ensures that nodes with similar content information obtain similar final representation vectors. The invention obtains training samples from the original graph structure by random walks and then trains the redefined node vectors with the skip-gram model. Compared with purely structure-based methods, this method has a clear advantage on large graphs with a pronounced structural long-tail effect, which demonstrates the benefit of fusing content information; it is also suited to distributed architectures and therefore highly scalable.
Addressing the shortcomings of existing purely structure-based graph representation learning methods, the present invention innovatively fuses content information by means of locality-sensitive hash functions. Compared with purely structure-based methods, the present invention has the following advantages:
1) The present invention exploits both the content information and the structural information of the network. The combination works as follows: associations are first established between nodes with similar content information through the outputs of locality-sensitive hashing, and training then proceeds on samples generated from the network structure by random walks. Such a combination can associate nodes that are far apart in the graph yet similar in content, which structure-based methods cannot do. The content-based associations enrich graph structure that would otherwise be sparse: once linked by content, structurally sparse unpopular nodes can share the structural information of popular nodes. This effectively solves the poor performance of purely structure-based graph embedding schemes on graph structures with a pronounced "long-tail effect".
2) The present invention saves redundant space. It redefines node vectors through locality-sensitive hash functions that take node content information as input, replacing the earlier scheme in which each node owns an exclusive vector. Real networks often contain large numbers of homogeneous nodes; for example, the commodity graph of an e-commerce website contains essentially identical goods sold by many different shops. If every node occupied its own exclusive storage, the vectors of homogeneous nodes would be stored repeatedly. Under the redefinition, nodes with similar content information share vector parameters with high probability, which eliminates the redundant storage caused by homogeneous nodes.
3) The present invention is suited to distributed computing frameworks and has strong scalability. The algorithm has been implemented smoothly on the Alibaba Cloud distributed computing framework ODPS PS (the Alibaba Cloud parameter server), with graph sizes reaching the order of tens of millions of nodes.
Brief description of the drawings
Fig. 1 is a schematic diagram of redefining node vector representations with locality-sensitive hashing;
Fig. 2 is a schematic diagram of the gradient update for a single sample during training.
Detailed description of the embodiments
The algorithm flow of the method of the invention is explained below through specific embodiments, with reference to the accompanying drawings.
The invention first introduces the design of the locality-sensitive hash functions used in the algorithm, then explains how the vector of each node is redefined, and finally describes the training process that produces the node vector representations.
(1) Locality-sensitive hash functions with node content information as input
Step a. The invention processes the content information of each node into a low-dimensional real-valued vector, which serves as the input to the subsequent hash functions. Any existing algorithm may be used to convert the content information into a low-dimensional real-valued vector. For example, in the invention's experiments on Alibaba Cloud, the graph structure is a commodity graph constructed from users' click sequences; each commodity corresponds to a node in the graph, and the commodity's title text is taken as the content information of the corresponding node. The invention converts a commodity title into a low-dimensional vector as follows: first the title texts of all commodities are segmented into words; then, with all titles as the training corpus and each title treated as one context window, the skip-gram model of word2vec is trained to obtain a d-dimensional word vector for every word in the vocabulary. Because titles are usually short texts, the invention takes the average of the word vectors of all words in a title as the title vector, i.e. the content vector of the corresponding node.
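The following minimal sketch illustrates step a. It is an illustration, not the patented implementation: it assumes the gensim library for word2vec, a pre-segmented corpus, and an arbitrary window size.

import numpy as np
from gensim.models import Word2Vec

def content_vectors(titles, d=200):
    """titles: one list of word tokens per node. Returns an (n, d) array."""
    # Each title is one skip-gram context window, as in the description;
    # window=10 is an assumed value, large enough to cover a short title.
    model = Word2Vec(sentences=titles, vector_size=d, sg=1,
                     window=10, min_count=1)
    # A node's content vector is the average of its title's word vectors.
    return np.array([np.mean([model.wv[w] for w in ws], axis=0)
                     for ws in titles])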
Step b. The invention designs the locality-sensitive hash function by hyperplane cutting. A locality-sensitive hash function must have the property that the closer two input values (node content vectors) are, the higher the probability that they are mapped to the same bucket number. The design is as follows: generate k random d-dimensional hyperplanes. In d-dimensional space, a hyperplane that divides the space into two parts can be represented by a d-dimensional real vector, so this in fact means generating k random d-dimensional real vectors $w_1, w_2, \dots, w_k$; together, these k hyperplanes constitute one locality-sensitive hash function. Two input d-dimensional node content vectors are considered to have the same final output value under this hash function if they lie on the same side of all k hyperplanes. Concretely, the hash function is computed as follows: for the input d-dimensional content vector $e$ of some node, take the inner product of $e$ with the vector $w_i$ of each hyperplane; an inner product greater than or equal to 0 is recorded as 1, and otherwise as 0. An input content vector $e$ thus yields a k-dimensional 0/1 vector, i.e. a k-bit binary code, which is the node's output under this hash function. Two input vectors $e_1$ and $e_2$ have the same output value under the hash function if and only if $\mathrm{sign}(w_i \cdot e_1) = \mathrm{sign}(w_i \cdot e_2)$ for all $i = 1, 2, \dots, k$. It follows that each hash function designed by the invention has $2^k$ buckets.
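The following minimal sketch illustrates one such hyperplane hash function, under the assumption of Gaussian random hyperplanes; it is an illustration, not the patented implementation.

import numpy as np

def make_hash(d, k, rng):
    """One locality-sensitive hash function made of k random hyperplanes."""
    W = rng.standard_normal((k, d))          # hyperplane normals w_1..w_k
    def h(e):
        bits = (W @ e >= 0).astype(int)      # 1 iff e lies on the positive side
        return int("".join(map(str, bits)), 2)  # k-bit code -> bucket in [0, 2^k)
    return h

Two content vectors with a small angle between them flip few of the k bits, so they land in the same bucket with high probability, which is exactly the locality-sensitivity property described above.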
Step c. Each node of the graph obtains an m-dimensional discrete code representation through locality-sensitive hashing. Applying the construction of the previous step m times yields a family of hash functions $\{h^{(j)}\}, j \in \{1, 2, \dots, m\}$, where the range of each hash function $h^{(j)}$ consists of the $2^k$ values $\{0, 1, \dots, 2^k - 1\}$ (each hash function here generates its own k random hyperplanes, so the whole family uses m·k hyperplanes in total). After the content information vectors of all nodes in the graph have been fed through these m hash functions, each node obtains m bucket numbers, i.e. each node has a new m-dimensional discrete code representation, derived from the node's content information through the locality-sensitive hash mapping. By the property of locality-sensitive hash functions, nodes with similar content have similar m-dimensional discrete codes. The next section explains how the node vectors are redefined from the m-dimensional discrete code of each node.
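Continuing the sketch (again illustrative, reusing make_hash from above; m, k and d are assumed values): the hash family is just m independently constructed hash functions, and a node's discrete code is its list of m bucket numbers.

import numpy as np

m = 4                                        # family size (an assumed value)
rng = np.random.default_rng(0)
family = [make_hash(d=200, k=8, rng=rng) for _ in range(m)]

def discrete_code(e):
    """The node's m-dimensional discrete code: one bucket number per h^(j)."""
    return [h(e) for h in family]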
(2) Redefinition of node vectors
In earlier purely structure-based "graph embedding" algorithms, each node u is mapped to two vectors $s_u$ and $t_u$, its "source vector" and "target vector". The algorithm of the present invention retains the concepts of "source vector" and "target vector"; the difference is that each node no longer independently owns its own vectors. Instead, each hash bucket of each hash function in the family $\{h^{(j)}\}$ owns a pair consisting of a "source vector" and a "target vector". Specifically, the bucket numbered i ($i \in \{0, 1, \dots, 2^k - 1\}$) of hash function $h^{(j)}$ ($j \in \{1, 2, \dots, m\}$) owns the "source vector" $S^{(j)}_i$ and the "target vector" $T^{(j)}_i$.
For a node u in the graph, let $e_u$ denote its "content vector", and let its m-dimensional discrete code after the hash mapping be $ind_u = (h^{(1)}(e_u), h^{(2)}(e_u), \dots, h^{(m)}(e_u))$, where "ind" stands for index. The invention then defines the "source vector" $s_u$ and "target vector" $t_u$ of the node as

$s_u = \frac{1}{m}\sum_{j=1}^{m} S^{(j)}_{h^{(j)}(e_u)}, \qquad t_u = \frac{1}{m}\sum_{j=1}^{m} T^{(j)}_{h^{(j)}(e_u)}.$
That is, the invention defines the "source vector" of a node as the average of the "source vectors" of the m hash buckets it maps to under the hash family $\{h^{(j)}\}$, and its "target vector" as the average of the "target vectors" of those m hash buckets. As analysed above, the more similar the content information of two nodes, the more hash buckets they share after the mapping; and since the vectors of each node are simply the averages of its bucket vectors, the more similar the content information of two nodes, the closer their "source vectors" and "target vectors" will be. In this way the invention exploits node content information: the vector representations of nodes with similar content information become linked through shared hash buckets, and the computation of the hash functions automatically determines which nodes have similar content information. Next, the invention brings this new definition into a structure-based training framework, thereby combining textual and structural information.
Fig. 1 illustrates the redefinition of node vectors; a minimal sketch follows.
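The sketch below is illustrative only (it reuses m from the sketch above; k and the embedding size dim are assumed values): the trainable parameters are bucket-level vector tables S and T, and a node's vectors are averages over the m buckets of its discrete code.

import numpy as np

k, dim = 8, 128                              # bits per hash, embedding size (assumed)
S = 0.01 * np.random.randn(m, 2**k, dim)     # S[j, i]: source vector of bucket i of h^(j)
T = 0.01 * np.random.randn(m, 2**k, dim)     # T[j, i]: target vector of bucket i of h^(j)

def node_vectors(code):
    """code: the node's m bucket numbers from discrete_code()."""
    s_u = np.mean([S[j, i] for j, i in enumerate(code)], axis=0)
    t_u = np.mean([T[j, i] for j, i in enumerate(code)], axis=0)
    return s_u, t_u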
(3) The training process
Once the node vectors have been redefined, the invention brings the new definition into a structure-based training framework and obtains the node vector representations it needs. The model used is the skip-gram model, popular in recent years in the field of natural language processing. On the original graph, training samples are generated from the graph structure by a random walk strategy with a termination probability.
The random walk strategy (which is existing work, not a contribution of the present invention) is as follows: a walk starts from some node u. If the node currently reached is t, the walk terminates at t with probability p; with probability 1 − p it continues, moving to one of the nodes connected to t by an edge, where the probability of moving to a given node is proportional to the weight of the edge between t and that node (edge weights in real graphs usually reflect how closely related the entities at the two endpoints are, so making the transition probability proportional to the edge weight is reasonable). The walk continues until the algorithm terminates it at some node v. The pair (u, v), the two endpoints of the path, is then one positive sample. The parameter p can be controlled by the algorithm designer; the larger p is, the shorter the average length of the paths obtained by the random walk. A sketch of this sampling follows.
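The sketch below is illustrative: adj is an assumed adjacency structure mapping a node to (neighbor, weight) pairs, and p = 0.15 is an arbitrary choice.

import random

def sample_pair(adj, u, p=0.15):
    """One walk from u; returns the positive sample (u, v)."""
    nodes, weights = zip(*adj[u])
    t = random.choices(nodes, weights=weights)[0]      # take at least one step
    while random.random() >= p:                        # continue with prob. 1 - p
        nbrs = adj.get(t, [])
        if not nbrs:
            break                                      # dead end: terminate here
        nodes, weights = zip(*nbrs)
        t = random.choices(nodes, weights=weights)[0]  # edge-weight-proportional move
    return (u, t)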
For a positive sample (u, v), the invention defines the probability that source node u predicts target node v with the softmax function:

$p(v \mid u) = \frac{\exp(s_u \cdot t_v)}{\sum_{n \in V} \exp(s_u \cdot t_n)}$
where V is the vertex set of the graph, and $s_u$ and $t_v$ are the redefined "source vector" of node u and "target vector" of node v. Attempting to optimize this probability formula directly is very time-consuming, because the formula requires the inner products of the node's source vector with the target vectors of all nodes in the graph. To improve the efficiency of training, the invention adopts the strategy of negative sampling and redefines the probability that source node u predicts target node v:

$\log p(v \mid u) = \log \sigma(s_u \cdot t_v) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D}\left[\log \sigma(-s_u \cdot t_n)\right]$
where σ is the sigmoid function, $P_D$ is a preset node distribution, usually the uniform distribution over nodes (every node is sampled with equal probability), $n \sim P_D$ denotes drawing a node n at random from the distribution $P_D$, and $t_n$ is the "target vector" of the sampled node n. For each positive sample, the invention draws K negative samples at random from this distribution. Let $\#(u, v)$ be the number of times the positive sample (u, v) is obtained over the whole sampling process; the global objective function of training is then as follows:

$O = \sum_{(u,v)} \#(u,v)\left[\log \sigma(s_u \cdot t_v) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D} \log \sigma(-s_u \cdot t_n)\right]$
Note that in the formula above $s_u, t_v, t_n$ are not independent parameters; they are computed from the vectors of the hash buckets to which the nodes are mapped by the locality-sensitive hashing. The invention therefore substitutes the node-vector redefinition of the previous section, i.e. brings the redefined node vectors into the structure-based training framework above, and obtains the final global objective function:

$O = \sum_{(u,v)} \#(u,v)\Big[\log \sigma\Big(\tfrac{1}{m}\sum_{j=1}^{m} S^{(j)}_{h^{(j)}(e_u)} \cdot \tfrac{1}{m}\sum_{j=1}^{m} T^{(j)}_{h^{(j)}(e_v)}\Big) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D} \log \sigma\Big(-\tfrac{1}{m}\sum_{j=1}^{m} S^{(j)}_{h^{(j)}(e_u)} \cdot \tfrac{1}{m}\sum_{j=1}^{m} T^{(j)}_{h^{(j)}(e_n)}\Big)\Big]$
Training maximizes the global objective function, i.e. the sum of the log-probabilities of all samples, by updating the "source vector" and "target vector" of each hash bucket of each hash function. The final result of training is the "source vectors" and "target vectors" of all hash buckets of all hash functions, from which the vector representations of all nodes in the graph can in turn be computed through the node-vector redefinition.
The global objective function looks cumbersome, but during training the parameter update process is in fact simple. Given a training positive sample (u, v), the invention maximizes the probability of predicting node v from node u and accordingly updates the m "source vectors" corresponding to node u and the m "target vectors" corresponding to node v. The update proceeds as shown in Fig. 2: first, the m "source vectors" of node u and the m "target vectors" of node v are located through the hash bucket indices of u and v; averaging them gives the corresponding vectors $s_u$ and $t_v$; substituting $s_u$ and $t_v$ into the differentiated formula gives the gradient magnitude of each parameter; the gradients are then propagated back to the vectors of the corresponding hash buckets.
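The sketch below (illustrative; it reuses S, T and node_vectors from the sketches above, and the learning rate is an assumed value) performs the single-sample update of Fig. 2 with negative sampling. Because $s_u$ is the average of m bucket vectors, each bucket receives 1/m of the gradient.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update(code_u, code_v, neg_codes, lr=0.025):
    s_u, _ = node_vectors(code_u)
    grad_s = np.zeros_like(s_u)
    # label 1 for the positive target v, label 0 for each sampled negative n
    for code, label in [(code_v, 1.0)] + [(c, 0.0) for c in neg_codes]:
        _, t = node_vectors(code)
        g = label - sigmoid(s_u @ t)        # derivative w.r.t. the inner product
        grad_s += g * t
        for j, i in enumerate(code):        # 1/m of the target gradient per bucket
            T[j, i] += lr * g * s_u / len(code)
    for j, i in enumerate(code_u):          # 1/m of the source gradient per bucket
        S[j, i] += lr * grad_s / len(code_u)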
Embodiment
Example of graph representation learning on the Taobao commodity graph:
Taking Taobao's online commodities as graph nodes, edges are constructed from the intra-day click sequences of Taobao users (an edge connects two commodities that are adjacent in a click sequence), forming a commodity relation graph whose size reaches the order of tens of millions of nodes.
The invention uses the "title" of the Taobao commodity corresponding to each node as the node's "content information" (for example, a sporting-goods title such as "men's mesh-upper air-cushion running shoes, luminous, jogging"). In the experiments, the invention converts each "title" into a 200-dimensional real-valued vector. The conversion first segments the title texts of all commodities into words; then, using all titles as the training corpus and each title as one window, the skip-gram model of word2vec is applied to obtain vector representations of all words in the corpus. Because commodity titles are all short texts, the invention uses the average of the vector representations of all words in a title as the "content vector" of the corresponding node.
After the content vector of each node has been obtained, the locality-sensitive hash functions are constructed by the method described in the invention and the vector representation of each commodity is trained. Experimental results show that when the vector representations computed by the method of the invention are applied to real online recommendation, the click-through rate improves significantly compared with the results of the purely structure-based graph representation learning method (APP, Scalable Graph Embedding for Asymmetric Proximity, AAAI 2017).
The above embodiments merely illustrate the technical solution of the present invention and do not limit it. A person of ordinary skill in the art may modify the technical solution or substitute equivalents for it without departing from the spirit and scope of the present invention; the scope of protection of the present invention shall be defined by the claims.

Claims (9)

1. A representation learning method for super-large scale graphs optimized with locality-sensitive hashing, the steps of which include:
computing each node of a target graph with locality-sensitive hash functions, and defining the node vector of the node according to the results of the computation;
obtaining training samples from the graph structure of the target graph;
based on the training samples, training the node vectors of the nodes in the target graph with the skip-gram model, to obtain the node vector representation corresponding to each node in the target graph.
2. The method of claim 1, wherein m locality-sensitive hash functions are designed by hyperplane cutting, yielding a hash function family $\{h^{(j)}\}, j \in \{1, 2, \dots, m\}$; wherein each locality-sensitive hash function comprises k randomly generated d-dimensional hyperplanes, and if two input d-dimensional vectors lie on the same side of each of the k d-dimensional hyperplanes, the two d-dimensional vectors have the same output value under that hash function.
3. The method of claim 2, wherein the number of buckets of each locality-sensitive hash function is $2^k$.
4. The method of claim 2, wherein each locality-sensitive hash function is computed as follows: for an input d-dimensional vector, the inner product of the d-dimensional vector with the vector corresponding to each hyperplane is taken; an inner product greater than or equal to 0 is recorded as 1, and otherwise as 0; the output result for the d-dimensional vector is the resulting k-bit binary code.
5. The method of claim 2 or 3, wherein computing each node of the target graph with the locality-sensitive hash functions and defining the node vector of the node according to the results comprises:
1) processing the node content information of the target graph into d-dimensional vectors and inputting them into each locality-sensitive hash function in the hash function family $\{h^{(j)}\}$; wherein the j-th locality-sensitive hash function $h^{(j)}$ maps the d-dimensional vector $e_u$ of node u to a hash bucket, and the source vector $S^{(j)}_i$ and target vector $T^{(j)}_i$ of the hash bucket numbered $i = h^{(j)}(e_u)$ correspond to node u;
2) for each node of the target graph, averaging the source vectors of the m hash buckets corresponding to the node as the source vector of the node, and averaging the target vectors of the m hash buckets corresponding to the node as the target vector of the node.
6. The method of claim 5, wherein two nodes whose node vectors are most similar are treated as nodes with similar content information, and an association is established between the vector representations of nodes with similar content information through shared hash buckets.
7. The method of claim 5, wherein training the node vectors of the nodes in the target graph with the skip-gram model based on the training samples comprises: for a training sample (u, v), the training sample (u, v) being a positive sample, computing the probability that source node u predicts target node v with the formula $\log p(v \mid u) = \log \sigma(s_u \cdot t_v) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D}[\log \sigma(-s_u \cdot t_n)]$; wherein $n \sim P_D$ denotes drawing a node n at random from the node distribution $P_D$, $s_u$ is the source vector of source node u, $t_v$ is the target vector of target node v, and $t_n$ is the target vector of the sampled node n; $P_D$ is a preset node distribution, and for each training sample (u, v), K negative samples are drawn at random from the node distribution $P_D$; then the global objective function $O = \sum_{(u,v)} \#(u,v)[\log \sigma(s_u \cdot t_v) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D} \log \sigma(-s_u \cdot t_n)]$ is used to update the source vector and target vector corresponding to each hash bucket of each locality-sensitive hash function; and the node vector representation corresponding to each node in the target graph is then computed according to the update results; wherein $\#(u, v)$ is the number of occurrences of the positive sample (u, v).
8. The method of claim 5, wherein training the node vectors of the nodes in the target graph with the skip-gram model based on the training samples comprises: for a training sample (u, v), the training sample (u, v) being a positive sample, computing the probability that source node u predicts target node v with the formula $\log p(v \mid u) = \log \sigma(s_u \cdot t_v) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D}[\log \sigma(-s_u \cdot t_n)]$; wherein $n \sim P_D$ denotes drawing a node n at random from the node distribution $P_D$, $s_u$ is the source vector of source node u, $t_v$ is the target vector of target node v, and $t_n$ is the target vector of the sampled node n; $P_D$ is a preset node distribution, and for each training sample (u, v), K negative samples are drawn at random from the node distribution $P_D$; then the global objective function with the redefined node vectors substituted, $O = \sum_{(u,v)} \#(u,v)\big[\log \sigma\big(\frac{1}{m}\sum_{j=1}^{m} S^{(j)}_{h^{(j)}(e_u)} \cdot \frac{1}{m}\sum_{j=1}^{m} T^{(j)}_{h^{(j)}(e_v)}\big) + \sum_{i=1}^{K} \mathbb{E}_{n \sim P_D} \log \sigma\big(-\frac{1}{m}\sum_{j=1}^{m} S^{(j)}_{h^{(j)}(e_u)} \cdot \frac{1}{m}\sum_{j=1}^{m} T^{(j)}_{h^{(j)}(e_n)}\big)\big]$, is used to update the source vector and target vector corresponding to each hash bucket of each locality-sensitive hash function; and the node vector representation corresponding to each node in the target graph is then computed according to the update results; wherein $\#(u, v)$ is the number of occurrences of the positive sample (u, v).
9. The method of claim 1, wherein the training samples are generated from the graph structure of the target graph by a random walk method.
CN201710857844.1A 2017-09-21 2017-09-21 Representation learning method of super-large scale graph by using locality sensitive hash optimization Expired - Fee Related CN107729290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710857844.1A CN107729290B (en) 2017-09-21 2017-09-21 Representation learning method of super-large scale graph by using locality sensitive hash optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710857844.1A CN107729290B (en) 2017-09-21 2017-09-21 Representation learning method of super-large scale graph by using locality sensitive hash optimization

Publications (2)

Publication Number Publication Date
CN107729290A 2018-02-23
CN107729290B CN107729290B (en) 2021-05-11

Family

ID=61207259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710857844.1A Expired - Fee Related CN107729290B (en) 2017-09-21 2017-09-21 Representation learning method of super-large scale graph by using locality sensitive hash optimization

Country Status (1)

Country Link
CN (1) CN107729290B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944489A (en) * 2017-11-17 2018-04-20 清华大学 Extensive combination chart feature learning method based on structure semantics fusion
CN109118053A (en) * 2018-07-17 2019-01-01 阿里巴巴集团控股有限公司 It is a kind of steal card risk trade recognition methods and device
CN109194707A (en) * 2018-07-24 2019-01-11 阿里巴巴集团控股有限公司 The method and device of distribution figure insertion
CN109992606A (en) * 2019-03-14 2019-07-09 北京达佳互联信息技术有限公司 A kind of method for digging of target user, device, electronic equipment and storage medium
CN110232393A (en) * 2018-03-05 2019-09-13 腾讯科技(深圳)有限公司 Processing method, device, storage medium and the electronic device of data
WO2020038141A1 (en) * 2018-08-24 2020-02-27 阿里巴巴集团控股有限公司 Distributed graph embedding method, apparatus and system, and device
CN111160552A (en) * 2019-12-17 2020-05-15 北京百度网讯科技有限公司 Negative sampling processing method, device, equipment and computer storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101170578A (en) * 2007-11-30 2008-04-30 北京理工大学 Hierarchical peer-to-peer network structure and constructing method based on syntax similarity
US20110249899A1 (en) * 2010-04-07 2011-10-13 Sony Corporation Recognition device, recognition method, and program
CN102298606A (en) * 2011-06-01 2011-12-28 清华大学 Random walking image automatic annotation method and device based on label graph model
CN103020321A (en) * 2013-01-11 2013-04-03 广东图图搜网络科技有限公司 Neighbor searching method and neighbor searching system
US20140279738A1 (en) * 2013-03-15 2014-09-18 Bazaarvoice, Inc. Non-Linear Classification of Text Samples
CN104794223A (en) * 2015-04-29 2015-07-22 厦门美图之家科技有限公司 Subtitle matching method and system based on image retrieval
CN104866471A (en) * 2015-06-05 2015-08-26 南开大学 Instance matching method based on local sensitive Hash strategy
CN106649715A (en) * 2016-12-21 2017-05-10 中国人民解放军国防科学技术大学 Cross-media retrieval method based on local sensitive hash algorithm and neural network
CN106682233A (en) * 2017-01-16 2017-05-17 华侨大学 Method for Hash image retrieval based on deep learning and local feature fusion
CN106780639A (en) * 2017-01-20 2017-05-31 中国海洋大学 Hash coding method based on the sparse insertion of significant characteristics and extreme learning machine

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
WENGANG ZHOU et al.: "Scalar quantization for large scale image search", ACM *
邹浩 (Zou Hao): "Design and Implementation of a Distributed Image Retrieval Training System", Wanfang Database *
高毫林 (Gao Haolin): "Research on Image Retrieval Based on Hashing Techniques", China Doctoral Dissertations Full-text Database *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944489A (en) * 2017-11-17 2018-04-20 清华大学 Extensive combination chart feature learning method based on structure semantics fusion
CN107944489B (en) * 2017-11-17 2018-10-16 清华大学 Extensive combination chart feature learning method based on structure semantics fusion
CN110232393B (en) * 2018-03-05 2022-11-04 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN110232393A (en) * 2018-03-05 2019-09-13 腾讯科技(深圳)有限公司 Processing method, device, storage medium and the electronic device of data
CN109118053A (en) * 2018-07-17 2019-01-01 阿里巴巴集团控股有限公司 It is a kind of steal card risk trade recognition methods and device
CN109118053B (en) * 2018-07-17 2022-04-05 创新先进技术有限公司 Method and device for identifying card stealing risk transaction
CN109194707B (en) * 2018-07-24 2020-11-20 创新先进技术有限公司 Distributed graph embedding method and device
CN109194707A (en) * 2018-07-24 2019-01-11 阿里巴巴集团控股有限公司 The method and device of distribution figure insertion
TWI703460B (en) * 2018-08-24 2020-09-01 香港商阿里巴巴集團服務有限公司 Distributed graph embedding method, device, equipment and system
WO2020038141A1 (en) * 2018-08-24 2020-02-27 阿里巴巴集团控股有限公司 Distributed graph embedding method, apparatus and system, and device
US11074295B2 (en) 2018-08-24 2021-07-27 Advanced New Technologies Co., Ltd. Distributed graph embedding method and apparatus, device, and system
CN109992606A (en) * 2019-03-14 2019-07-09 北京达佳互联信息技术有限公司 A kind of method for digging of target user, device, electronic equipment and storage medium
CN111160552A (en) * 2019-12-17 2020-05-15 北京百度网讯科技有限公司 Negative sampling processing method, device, equipment and computer storage medium
CN111160552B (en) * 2019-12-17 2023-09-26 北京百度网讯科技有限公司 News information recommendation processing method, device, equipment and computer storage medium

Also Published As

Publication number Publication date
CN107729290B (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN107729290A (en) A kind of expression learning method of ultra-large figure using the optimization of local sensitivity Hash
CN110334219A (en) The knowledge mapping for incorporating text semantic feature based on attention mechanism indicates learning method
CN107515855B (en) Microblog emotion analysis method and system combined with emoticons
CN108073711A (en) A kind of Relation extraction method and system of knowledge based collection of illustrative plates
CN103927394B (en) A kind of multi-tag Active Learning sorting technique and system based on SVM
Pang et al. DeepCity: A feature learning framework for mining location check-ins
KR20210040892A (en) Information Recommendation Method based on Fusion Relation Network, Apparatus, Electronic Device, Non-transitory Computer Readable Medium, and Computer Program
CN109902203A (en) The network representation learning method and device of random walk based on side
CN109376857A (en) A kind of multi-modal depth internet startup disk method of fusion structure and attribute information
CN104298873A (en) Attribute reduction method and mental state assessment method on the basis of genetic algorithm and rough set
CN111210111B (en) Urban environment assessment method and system based on online learning and crowdsourcing data analysis
CN108228728A (en) A kind of paper network node of parametrization represents learning method
CN110383302A (en) Small Maastricht Treaty Rana Fermi's subcode
CN113065974A (en) Link prediction method based on dynamic network representation learning
Li et al. Intelligent medical heterogeneous big data set balanced clustering using deep learning
Bien et al. Non-convex global minimization and false discovery rate control for the TREX
CN107368521A (en) A kind of Promote knowledge method and system based on big data and deep learning
CN109086463A (en) A kind of Ask-Answer Community label recommendation method based on region convolutional neural networks
CN112000788A (en) Data processing method and device and computer readable storage medium
CN110008411A (en) It is a kind of to be registered the deep learning point of interest recommended method of sparse matrix based on user
CN110222839A (en) A kind of method, apparatus and storage medium of network representation study
Wen et al. Attention-aware path-based relation extraction for medical knowledge graph
CN116386895B (en) Epidemic public opinion entity identification method and device based on heterogeneous graph neural network
CN111159424B (en) Method and device for labeling knowledge graph entity, storage medium and electronic equipment
CN108694232A (en) A kind of socialization recommendation method based on trusting relationship feature learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210511

Termination date: 20210921