CN111177479A - Method and device for acquiring feature vectors of nodes in relational network graph - Google Patents

Info

Publication number: CN111177479A (granted as CN111177479B)
Application number: CN201911374127.9A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 苏炜跃, 冯仕堃, 朱志凡, 何径舟
Current and original assignee: Beijing Baidu Netcom Science and Technology Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Legal status: Granted; active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application filed by Beijing Baidu Netcom Science and Technology Co., Ltd., with priority to CN201911374127.9A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90335 Query processing


Abstract

The application discloses a method, an apparatus, an electronic device, and a computer-readable storage medium for obtaining the feature vectors of nodes in a relational network graph, and relates to the technical field of graph processing. The method is implemented as follows: acquire a to-be-processed relational network graph; determine an initial vector sequence corresponding to each node in the graph; taking each embedding vector in the initial vector sequence corresponding to the current node as a query vector, query all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and concatenate the query results corresponding to the query vectors to obtain an output vector sequence corresponding to the current node; and output the output vector sequence corresponding to each node as the feature vector of that node in the to-be-processed relational network graph. Because interaction information among features can be obtained, the accuracy with which the output feature vectors represent each node is improved.

Description

Method and device for acquiring feature vectors of nodes in relational network graph
Technical Field
The present application relates to the field of data processing technologies, and in particular, in the field of graph processing technologies, to a method, an apparatus, an electronic device, and a computer-readable storage medium for obtaining the feature vectors of nodes in a relational network graph.
Background
In the prior art, when the feature vector of a node in a relational network graph is obtained, all features contained in one node are generally treated as a whole, and operations such as aggregation with other nodes are performed on that basis. As a result, the prior art cannot capture interaction information among the features contained in a node, and the obtained feature vectors represent the nodes with low accuracy.
Disclosure of Invention
To solve the above technical problem, the present application provides a method, an apparatus, an electronic device, and a computer-readable medium for obtaining the feature vectors of nodes in a relational network graph, where the method includes: acquiring a to-be-processed relational network graph; determining an initial vector sequence corresponding to each node in the to-be-processed relational network graph; taking each embedding vector in the initial vector sequence corresponding to the current node as a query vector, querying all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and concatenating the query results corresponding to the query vectors to obtain an output vector sequence corresponding to the current node; and outputting the output vector sequence corresponding to each node as the feature vector of that node in the to-be-processed relational network graph. Because interaction information among features can be obtained, the accuracy with which the output feature vectors represent each node is improved.
According to a preferred embodiment of the present application, determining the initial vector sequence corresponding to each node in the to-be-processed relational network graph includes: acquiring each feature contained in a node; mapping each feature to an embedding vector of a preset length; and concatenating the embedding vectors corresponding to the features, taking the concatenation result as the initial vector sequence corresponding to the node.
According to a preferred embodiment of the present application, querying, with each embedding vector in the initial vector sequence corresponding to the current node as a query vector, all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes includes: acquiring all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes; calculating the similarity between each query vector and each of the embedding vectors; and normalizing the similarity calculation results, taking the processing result as the query result corresponding to each query vector.
According to a preferred embodiment of the present application, after the output vector sequence corresponding to each node is obtained, the method further includes: taking the output vector sequence corresponding to each node as the initial vector sequence corresponding to that node; and repeating the operation of obtaining the output vector sequence of the current node from the initial vector sequences of the current node and its neighbor nodes, taking the result after a preset number of iterations as the output vector sequence corresponding to the current node.
To solve the above technical problem, the present application further provides an apparatus for obtaining the feature vectors of nodes in a relational network graph, including: an acquiring unit, configured to acquire a to-be-processed relational network graph; a determining unit, configured to determine an initial vector sequence corresponding to each node in the to-be-processed relational network graph; a processing unit, configured to take each embedding vector in the initial vector sequence corresponding to the current node as a query vector, query all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and concatenate the query results corresponding to the query vectors to obtain an output vector sequence corresponding to the current node; and an output unit, configured to output the output vector sequence corresponding to each node as the feature vector of that node in the to-be-processed relational network graph.
According to a preferred embodiment of the present application, when determining the initial vector sequence corresponding to each node in the to-be-processed relational network graph, the determining unit specifically: acquires each feature contained in a node; maps each feature to an embedding vector of a preset length; and concatenates the embedding vectors corresponding to the features, taking the concatenation result as the initial vector sequence corresponding to the node.
According to a preferred embodiment of the present application, when taking each embedding vector in the initial vector sequence corresponding to the current node as a query vector and querying all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, the processing unit specifically: acquires all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes; calculates the similarity between each query vector and each of the embedding vectors; and normalizes the similarity calculation results, taking the processing result as the query result corresponding to each query vector.
According to a preferred embodiment of the present application, after the output vector sequence corresponding to each node is obtained, the processing unit further: takes the output vector sequence corresponding to each node as the initial vector sequence corresponding to that node; and repeats the operation of obtaining the output vector sequence of the current node from the initial vector sequences of the current node and its neighbor nodes, taking the result after a preset number of iterations as the output vector sequence corresponding to the current node.
One embodiment of the above application has the following advantage or beneficial effect: because interaction information among features can be obtained, the accuracy with which the output feature vectors represent each node is improved. By using the features of a node as the granularity of aggregation, the application solves the technical problem in the prior art that, since all features of a node are treated as a whole, interaction information among the features cannot be obtained and the feature vectors represent the nodes with low accuracy, thereby achieving the technical effect of obtaining interaction information among the features and improving the accuracy with which the output feature vectors represent each node.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a flowchart of a method for obtaining feature vectors of nodes in a relational network graph according to a first embodiment of the present application;
fig. 2 is a structural diagram of an apparatus for obtaining feature vectors of nodes in a relational network graph according to a second embodiment of the present application;
fig. 3 is a block diagram of an electronic device for implementing the method for acquiring feature vectors of nodes in a relational network graph according to the embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flowchart of a method for obtaining the feature vectors of nodes in a relational network graph according to an embodiment of the present application. As shown in fig. 1, the method includes:
in S101, a to-be-processed relationship network map is acquired.
In this step, a network diagram of the relationship to be processed is obtained. The to-be-processed relationship network Graph obtained in this step is a Graph (Graph) composed of a plurality of nodes and edges connecting two points, and is generally used for describing a certain specific relationship between objects, in the Graph, the objects are represented by the nodes, and the corresponding two objects have a certain specific relationship by the edges connecting two points.
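For concreteness, such a graph can be held as a simple adjacency list. The node names below are hypothetical, chosen only to illustrate the kind of structure acquired in S101:

```python
# Hypothetical to-be-processed relational network graph as an adjacency
# list: keys are nodes, values are the neighbor nodes joined by an edge.
graph = {
    "user_a": ["user_b", "user_c"],
    "user_b": ["user_a"],
    "user_c": ["user_a"],
}

def neighbors(graph, node):
    """Return the neighbor nodes of `node` in the graph."""
    return graph.get(node, [])

print(neighbors(graph, "user_a"))  # ['user_b', 'user_c']
```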
It is understood that the to-be-processed relational network graph acquired in this step may correspond to different application scenarios, for example a social-network graph, a biological-network graph, an item-recommendation graph, and the like.
In S102, an initial vector sequence corresponding to each node in the to-be-processed relational network graph is determined.
In this step, the initial vector sequence corresponding to each node in the relational network graph obtained in step S101 is determined. The initial vector sequence of a node is formed by concatenating the embedding vectors corresponding to that node's features, with each embedding vector corresponding to a different feature contained in the node.
It can be understood that, since a node contains different features and the features have different lengths, in order to improve the accuracy of the obtained initial vector sequence, this step may determine the initial vector sequence corresponding to each node in the following way: acquire each feature contained in the node, for example its continuous features and discrete features; map each feature to an embedding vector of a preset length, so that each raw feature corresponds one-to-one to an embedding vector of the same length; and concatenate the embedding vectors corresponding to the features, taking the concatenation result as the initial vector sequence corresponding to the node.
For example, if a node in the to-be-processed relational network graph represents a user, the node may contain user features such as age, gender, education level, and income. In this step, after these features are mapped to embedding vectors of equal length, the embedding vectors are concatenated to obtain the initial vector sequence representing the node.
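The mapping just described can be sketched as follows: each user feature becomes an embedding of a common preset length D, and the embeddings are concatenated into the node's initial vector sequence. The feature set, the table sizes, the scaling of the continuous age feature, and the random (rather than learned) parameters are all illustrative assumptions, not details taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # preset embedding length (illustrative)

# Hypothetical parameters: lookup tables for discrete features and a
# direction vector scaled by the value for a continuous feature.
gender_table = rng.normal(size=(2, D))     # 2 gender ids -> D-dim vectors
education_table = rng.normal(size=(5, D))  # 5 education levels
age_weight = rng.normal(size=D)            # continuous feature: age

def node_initial_sequence(age, gender_id, edu_id):
    """Map each feature to a D-dim embedding and concatenate the
    embeddings into the node's initial vector sequence
    (one row per feature)."""
    embeddings = [
        age / 100.0 * age_weight,   # continuous feature, crudely scaled
        gender_table[gender_id],    # discrete feature lookup
        education_table[edu_id],    # discrete feature lookup
    ]
    return np.stack(embeddings)     # shape (num_features, D)

seq = node_initial_sequence(age=30, gender_id=1, edu_id=3)
print(seq.shape)  # (3, 8)
```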
In S103, each embedding vector in the initial vector sequence corresponding to the current node is taken as a query vector, all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes are queried, and the query results corresponding to the query vectors are concatenated to obtain an output vector sequence corresponding to the current node.
In this step, each embedding vector in the initial vector sequence obtained in step S102 for the current node is used as a query vector to query all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes; the query results corresponding to the query vectors are then concatenated into the output vector sequence of the current node. The output vector sequence obtained in this step has the same length as the node's initial vector sequence.
That is, when the current node is aggregated with its neighbor nodes in this step, each feature of the current node is aggregated with each feature of the neighbor nodes, with the features within a node as the granularity, so that the obtained output vector sequence represents each node more accurately.
Specifically, when each embedding vector in the initial vector sequence of the current node is used as a query vector to query all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes, this step may proceed as follows: acquire all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes; calculate the similarity between each query vector and each of these embedding vectors; and normalize the similarity calculation results, taking the processing result as the query result corresponding to each query vector. The softmax function may be used for normalizing the similarity calculation results.
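The querying procedure above can be sketched as standard dot-product attention. The unspecified details are filled in with assumptions: dot product as the similarity, softmax as the normalization (as the step suggests), and the softmax-weighted sum of the queried embedding vectors as each query's result:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate(node_seq, neighbor_seqs):
    """One aggregation step at feature granularity (a reading of S103):
    each embedding vector of the current node, used as a query, scores
    all embedding vectors of the node and its neighbors by dot-product
    similarity; the softmax-normalized scores weight a sum over those
    embedding vectors, and stacking the per-query results gives an
    output sequence of the same length as `node_seq`."""
    keys = np.vstack([node_seq] + neighbor_seqs)  # all embedding vectors
    out = []
    for q in node_seq:                            # each embedding as query
        weights = softmax(keys @ q)               # normalized similarities
        out.append(weights @ keys)                # weighted combination
    return np.stack(out)

node = np.ones((3, 4))          # current node: 3 features, length-4 embeddings
nbrs = [np.zeros((3, 4))]       # one neighbor node
out = aggregate(node, nbrs)
print(out.shape)  # (3, 4)
```

Note the output sequence has the same shape as the input sequence, matching the statement above that the output and initial vector sequences have equal length.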
It can be understood that the output vector sequences of all nodes are obtained in parallel in this step: each node in the to-be-processed relational network graph is simultaneously taken as a current node, so that the output vector sequence corresponding to each node in the graph is obtained.
In order to obtain interaction information among features more accurately and to enrich the degree of association between each vector in the output vector sequence and other nodes, this step may further include, after the output vector sequence of each node is obtained: taking the output vector sequence of each node as the initial vector sequence of that node; and repeating the operation of obtaining the output vector sequence of the current node from the initial vector sequences of the current node and its neighbor nodes, taking the result after a preset number of iterations as the output vector sequence corresponding to the current node.
That is, after the output vector sequence of each node is obtained, this step may feed each output vector sequence back in as an initial vector sequence, so that the features of the current node can also obtain interaction information from the neighbor nodes of its neighbor nodes and from the features of those nodes.
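The iteration just described can be sketched as repeated rounds of aggregation, each round consuming the previous round's output sequences. The number of rounds and the use of the same aggregation rule in every round are assumptions; `aggregate` restates the feature-level attention of step S103 in compact form so the sketch is self-contained:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate(node_seq, neighbor_seqs):
    # One round of feature-level attention: each embedding of the current
    # node queries all embeddings of the node and its neighbors.
    keys = np.vstack([node_seq] + neighbor_seqs)
    return np.stack([softmax(keys @ q) @ keys for q in node_seq])

def multi_hop(initial_seqs, graph, num_rounds=2):
    """Feed each round's output sequences back in as the next round's
    initial sequences; after a preset number of rounds, a node's
    features have absorbed information from multi-hop neighbors."""
    seqs = dict(initial_seqs)
    for _ in range(num_rounds):
        seqs = {v: aggregate(seqs[v], [seqs[u] for u in graph[v]])
                for v in graph}   # every node acts as current node in parallel
    return seqs

graph = {"a": ["b"], "b": ["a"]}
seqs = {"a": np.ones((2, 4)), "b": np.zeros((2, 4))}
out = multi_hop(seqs, graph, num_rounds=3)
print(out["a"].shape)  # (2, 4)
```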
In S104, the output vector sequence corresponding to each node is output as the feature vector of that node in the to-be-processed relational network graph.
In this step, the output vector sequence obtained in step S103 for each node is output as the feature vector of that node in the relational network graph acquired in step S101. Each feature vector obtained in this step is used to represent the features of its node.
With the above technical solution, the features of a node are used as the granularity of aggregation: when nodes are aggregated, each feature of the current node is aggregated with each feature of its neighbor nodes, so that interaction information among features is obtained more accurately and the output feature vectors represent each node with higher accuracy.
Fig. 2 is a structural diagram of an apparatus for obtaining the feature vectors of nodes in a relational network graph according to an embodiment of the present disclosure. As shown in fig. 2, the apparatus includes: an acquiring unit 201, a determining unit 202, a processing unit 203, and an output unit 204.
An acquiring unit 201, configured to acquire a to-be-processed relational network graph.
The acquiring unit 201 obtains the relational network graph to be processed, which is a graph (Graph) composed of a plurality of nodes and of edges each connecting two nodes, and is generally used to describe a specific relationship between objects: objects are represented by nodes, and an edge connecting two nodes indicates that the corresponding two objects have the specific relationship.
It is understood that the to-be-processed relational network graph acquired by the acquiring unit 201 may correspond to different application scenarios, such as a social-network graph, a biological-network graph, an item-recommendation graph, and the like.
A determining unit 202, configured to determine an initial vector sequence corresponding to each node in the to-be-processed relational network graph.
The determining unit 202 determines the initial vector sequence corresponding to each node in the relational network graph acquired by the acquiring unit 201. The initial vector sequence of a node is formed by concatenating the embedding vectors corresponding to that node's features, with each embedding vector corresponding to a different feature contained in the node.
It can be understood that, since a node contains different features and the features have different lengths, in order to improve the accuracy of the obtained initial vector sequence, the determining unit 202 may determine the initial vector sequence corresponding to each node in the following way: acquire each feature contained in the node, for example its continuous features and discrete features; map each feature to an embedding vector of a preset length, so that each raw feature corresponds one-to-one to an embedding vector of the same length; and concatenate the embedding vectors corresponding to the features, taking the concatenation result as the initial vector sequence corresponding to the node.
A processing unit 203, configured to take each embedding vector in the initial vector sequence corresponding to the current node as a query vector, query all embedding vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and concatenate the query results corresponding to the query vectors to obtain an output vector sequence corresponding to the current node.
The processing unit 203 uses each embedding vector in the initial vector sequence obtained by the determining unit 202 for the current node as a query vector to query all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes, and then concatenates the query results corresponding to the query vectors into the output vector sequence of the current node. The output vector sequence obtained by the processing unit 203 has the same length as the node's initial vector sequence.
That is, when the processing unit 203 aggregates the current node with its neighbor nodes, it aggregates each feature of the current node with each feature of the neighbor nodes, with the features within a node as the granularity, so that the obtained output vector sequence represents each node more accurately.
Specifically, when the processing unit 203 uses each embedding vector in the initial vector sequence of the current node as a query vector to query all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes, it may proceed as follows: acquire all embedding vectors contained in the initial vector sequences of the current node and its neighbor nodes; calculate the similarity between each query vector and each of these embedding vectors; and normalize the similarity calculation results, taking the processing result as the query result corresponding to each query vector. The processing unit 203 may use the softmax function for normalizing the similarity calculation results.
It can be understood that the processing unit 203 obtains the output vector sequences of all nodes in parallel: it simultaneously takes each node in the to-be-processed relational network graph as a current node, thereby obtaining the output vector sequence corresponding to each node in the graph.
In order to obtain interaction information among features more accurately and to enrich the degree of association between each vector in the output vector sequence and other nodes, the processing unit 203 may further, after the output vector sequence of each node is obtained: take the output vector sequence of each node as the initial vector sequence of that node; and repeat the operation of obtaining the output vector sequence of the current node from the initial vector sequences of the current node and its neighbor nodes, taking the result after a preset number of iterations as the output vector sequence corresponding to the current node.
That is, after the output vector sequence of each node is obtained, the processing unit 203 may feed each output vector sequence back in as an initial vector sequence, so that the features of the current node can also obtain interaction information from the neighbor nodes of its neighbor nodes and from the features of those nodes.
An output unit 204, configured to output the output vector sequence corresponding to each node as the feature vector of that node in the to-be-processed relational network graph.
The output unit 204 outputs the output vector sequence obtained by the processing unit 203 for each node as the feature vector of that node in the relational network graph acquired by the acquiring unit 201. Each feature vector obtained by the output unit 204 is used to represent the features of its node.
With the above technical solution, the features of a node are used as the granularity of aggregation: when nodes are aggregated, each feature of the current node is aggregated with each feature of its neighbor nodes, so that interaction information among features is obtained more accurately and the output feature vectors represent each node with higher accuracy.
Fig. 3 is a block diagram of an electronic device for the method for obtaining the feature vectors of nodes in a relational network graph according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit the implementations of the present application described and/or claimed herein.
As shown in fig. 3, the electronic device includes: one or more processors 301, a memory 302, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, together with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 3, one processor 301 is taken as an example.
Memory 302 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the method for obtaining feature vectors of nodes in a relational network graph provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory 302, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for obtaining the feature vectors of nodes in a relational network graph in the embodiment of the present application (for example, the acquiring unit 201, the determining unit 202, the processing unit 203, and the output unit 204 shown in fig. 2). By running the non-transitory software programs, instructions, and modules stored in the memory 302, the processor 301 executes the various functional applications and data processing of the server, that is, implements the method for obtaining the feature vectors of nodes in the relational network graph in the above method embodiment.
The memory 302 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the electronic device that obtains the feature vectors of nodes in the relational network graph, and the like. Further, the memory 302 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 302 optionally includes memory located remotely from the processor 301, and such remote memory may be connected over a network to the electronic device that obtains the feature vectors of nodes in the relational network graph. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for obtaining the feature vector of the node in the relational network graph may further include: an input device 303 and an output device 304. The processor 301, the memory 302, the input device 303 and the output device 304 may be connected by a bus or other means, and fig. 3 illustrates the connection by a bus as an example.
The input device 303 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that acquires the feature vectors of the nodes in the relational network graph; it may be, for example, a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick, or another input device. The output device 304 may include a display device, an auxiliary lighting device (e.g., an LED), a haptic feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose and is coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memories, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, node features are taken as the granularity of aggregation: when nodes are aggregated, each feature of the current node is aggregated with each feature of its neighbor nodes. Interaction information among the features is thereby acquired more accurately, which improves how accurately the output feature vectors represent each node.
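For illustration only, the feature-granularity aggregation described above can be sketched as follows. The function name `aggregate_node`, the dot-product similarity, and the toy dimensions are assumptions for the sketch, not the claimed implementation:

```python
import numpy as np

def aggregate_node(current_seq, neighbor_seqs):
    """Aggregate at the granularity of features: each embedded vector of
    the current node attends over all embedded vectors of the current
    node and its neighbor nodes (hypothetical sketch)."""
    # All embedded vectors of the current node and its neighbors: (M, d)
    keys = np.vstack([current_seq] + neighbor_seqs)
    outputs = []
    for q in current_seq:                  # each feature vector is a query
        scores = keys @ q                  # similarity to every embedded vector
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()           # normalize to a distribution
        outputs.append(weights @ keys)     # query result: weighted sum
    return np.stack(outputs)               # spliced output vector sequence
```

Because every query vector produces one output vector, each node's output sequence keeps the shape of its input sequence, so the outputs can later be fed back in as initial sequences.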
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders; this is not limited herein, as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method for obtaining feature vectors of nodes in a relational network graph is characterized by comprising the following steps:
acquiring a to-be-processed relationship network graph;
determining an initial vector sequence corresponding to each node in the to-be-processed relationship network graph;
taking each embedded vector in the initial vector sequence corresponding to the current node as a query vector, querying all embedded vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and splicing the query result corresponding to each query vector to obtain an output vector sequence corresponding to the current node;
and outputting the output vector sequence corresponding to each node as the characteristic vector of each node in the to-be-processed relational network graph.
2. The method according to claim 1, wherein the determining the initial vector sequence corresponding to each node in the to-be-processed relationship network graph comprises:
acquiring each feature contained in a node;
respectively mapping each feature into an embedded vector with a preset length;
and splicing the embedded vectors corresponding to the features, and taking the splicing result as an initial vector sequence corresponding to the node.
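A minimal sketch of claim 2's mapping-and-splicing, assuming hypothetical per-field lookup tables and an illustrative preset embedding length of 8 (neither the table names nor the dimensions come from the claims):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical embedding tables, one per feature field; each row is an
# embedded vector of preset length 8.
embed_tables = {
    "age_bucket": rng.normal(size=(100, 8)),
    "city":       rng.normal(size=(50, 8)),
    "category":   rng.normal(size=(20, 8)),
}

def initial_vector_sequence(node_features):
    """Map each feature of a node to an embedded vector of preset length
    and splice the vectors into the node's initial vector sequence."""
    vectors = [embed_tables[field][index] for field, index in node_features]
    return np.stack(vectors)    # shape: (number of features, 8)

seq = initial_vector_sequence([("age_bucket", 30), ("city", 7), ("category", 3)])
```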
3. The method of claim 1, wherein the querying, with each embedded vector in the initial vector sequence corresponding to the current node as a query vector, all embedded vectors included in the initial vector sequences corresponding to the current node and its neighboring nodes comprises:
acquiring all embedded vectors contained in an initial vector sequence corresponding to a current node and neighbor nodes thereof;
calculating the similarity between each query vector and each embedded vector in all embedded vectors;
and normalizing the calculation results of the similarity, and taking the processing result as a query result corresponding to each embedded vector.
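The query operation of claim 3 amounts to a normalized similarity distribution. A sketch, assuming dot-product similarity and softmax normalization (the claim does not fix a particular similarity measure or normalization):

```python
import numpy as np

def query_weights(query_vec, all_embeddings):
    """Compute the similarity between one query vector and every embedded
    vector, then normalize the results (numerically stable softmax)."""
    sims = all_embeddings @ query_vec      # one similarity per embedded vector
    shifted = np.exp(sims - sims.max())    # subtract the max for stability
    return shifted / shifted.sum()         # normalized query result
```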
4. The method of claim 1, further comprising, after obtaining the output vector sequence corresponding to each node:
taking the output vector sequence corresponding to each node as an initial vector sequence corresponding to each node;
and repeating the operation of obtaining the output vector sequence corresponding to the current node according to the initial vector sequence corresponding to the current node and the neighbor nodes thereof, and obtaining the output vector sequence corresponding to the current node after circulating for a preset number of times.
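Claim 4's repetition can be sketched as stacking the aggregation for a preset number of rounds. `aggregate` below is a self-contained, hypothetical stand-in for the claim-1 operation, and the round count of 3 is an assumption:

```python
import numpy as np

def aggregate(seq, neighbor_seqs):
    # Hypothetical stand-in for the claim-1 aggregation: every vector of
    # seq attends over all vectors of the node and its neighbors.
    keys = np.vstack([seq] + neighbor_seqs)
    out = []
    for q in seq:
        scores = keys @ q
        w = np.exp(scores - scores.max())
        out.append((w / w.sum()) @ keys)
    return np.stack(out)

def encode(init_seqs, neighbors, num_rounds=3):
    """Use each round's output sequences as the next round's initial
    sequences, repeating for a preset number of times."""
    seqs = init_seqs
    for _ in range(num_rounds):
        seqs = {n: aggregate(seqs[n], [seqs[m] for m in neighbors[n]])
                for n in seqs}
    return seqs
```

Because the aggregation preserves each sequence's shape, the loop can be repeated any preset number of times without reshaping.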
5. An apparatus for obtaining feature vectors of nodes in a relational network graph, comprising:
the acquiring unit is used for acquiring a network diagram of the relation to be processed;
a determining unit, configured to determine an initial vector sequence corresponding to each node in the to-be-processed relationship network graph;
the processing unit is used for taking each embedded vector in the initial vector sequence corresponding to the current node as a query vector, querying all embedded vectors contained in the initial vector sequences corresponding to the current node and its neighbor nodes, and splicing the query result corresponding to each query vector to obtain an output vector sequence corresponding to the current node;
and the output unit is used for outputting the output vector sequence corresponding to each node as the characteristic vector of each node in the to-be-processed relationship network graph.
6. The apparatus according to claim 5, wherein the determining unit, when determining the initial vector sequence corresponding to each node in the to-be-processed relationship network graph, specifically performs:
acquiring each feature contained in a node;
respectively mapping each feature into an embedded vector with a preset length;
and splicing the embedded vectors corresponding to the features, and taking the splicing result as an initial vector sequence corresponding to the node.
7. The apparatus according to claim 5, wherein the processing unit, when taking each embedded vector in the initial vector sequence corresponding to the current node as a query vector and querying all embedded vectors included in the initial vector sequences corresponding to the current node and its neighboring nodes, specifically executes:
acquiring all embedded vectors contained in an initial vector sequence corresponding to a current node and neighbor nodes thereof;
calculating the similarity between each query vector and each embedded vector in all embedded vectors;
and normalizing the calculation results of the similarity, and taking the processing result as a query result corresponding to each embedded vector.
8. The apparatus according to claim 5, wherein the processing unit further performs, after obtaining the output vector sequence corresponding to each node:
taking the output vector sequence corresponding to each node as an initial vector sequence corresponding to each node;
and repeating the operation of obtaining the output vector sequence corresponding to the current node according to the initial vector sequence corresponding to the current node and the neighbor nodes thereof, and obtaining the output vector sequence corresponding to the current node after circulating for a preset number of times.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-4.
CN201911374127.9A 2019-12-23 2019-12-23 Method and device for acquiring feature vector of node in relational network graph Active CN111177479B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374127.9A CN111177479B (en) 2019-12-23 2019-12-23 Method and device for acquiring feature vector of node in relational network graph


Publications (2)

Publication Number Publication Date
CN111177479A true CN111177479A (en) 2020-05-19
CN111177479B CN111177479B (en) 2023-08-18

Family

ID=70655784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374127.9A Active CN111177479B (en) 2019-12-23 2019-12-23 Method and device for acquiring feature vector of node in relational network graph

Country Status (1)

Country Link
CN (1) CN111177479B (en)


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110213801A1 (en) * 2010-03-01 2011-09-01 International Business Machines Corporation Efficient computation of top-k aggregation over graph and network data
CN106407381A (en) * 2016-09-13 2017-02-15 北京百度网讯科技有限公司 Method and device for pushing information based on artificial intelligence
US9760619B1 (en) * 2014-04-29 2017-09-12 Google Inc. Generating weighted clustering coefficients for a social network graph
CN109583562A (en) * 2017-09-28 2019-04-05 西门子股份公司 SGCNN: the convolutional neural networks based on figure of structure
CN109918454A (en) * 2019-02-22 2019-06-21 阿里巴巴集团控股有限公司 The method and device of node insertion is carried out to relational network figure
CN110245269A (en) * 2019-05-06 2019-09-17 阿里巴巴集团控股有限公司 Obtain the method and apparatus for being dynamically embedded into vector of relational network figure interior joint
US20190286655A1 (en) * 2018-03-13 2019-09-19 Pinterest, Inc. Efficient generation of embedding vectors of nodes in a corpus graph
CN110427436A (en) * 2019-07-31 2019-11-08 北京百度网讯科技有限公司 The method and device of entity similarity calculation
CN110569437A (en) * 2019-09-05 2019-12-13 腾讯科技(深圳)有限公司 click probability prediction and page content recommendation methods and devices


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI CHAONAN, "Research on Community Discovery Algorithms in Social Networks Based on Node Similarity", Master's Theses Electronic Journal *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111538870A (en) * 2020-07-07 2020-08-14 北京百度网讯科技有限公司 Text expression method and device, electronic equipment and readable storage medium
CN111538870B (en) * 2020-07-07 2020-12-18 北京百度网讯科技有限公司 Text expression method and device, electronic equipment and readable storage medium
CN112214499A (en) * 2020-12-03 2021-01-12 腾讯科技(深圳)有限公司 Graph data processing method and device, computer equipment and storage medium
CN112214499B (en) * 2020-12-03 2021-03-19 腾讯科技(深圳)有限公司 Graph data processing method and device, computer equipment and storage medium
US11935049B2 (en) 2020-12-03 2024-03-19 Tencent Technology (Shenzhen) Company Limited Graph data processing method and apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN111177479B (en) 2023-08-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant