CN116737965A - Information acquisition method and device, electronic equipment and storage medium

Information acquisition method and device, electronic equipment and storage medium

Info

Publication number: CN116737965A
Application number: CN202311011833.3A
Authority: CN (China)
Prior art keywords: entity, vector, context, triplet, text
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 石志林
Current assignee: Shenzhen Tencent Computer Systems Co Ltd
Original assignee: Shenzhen Tencent Computer Systems Co Ltd
Application filed by Shenzhen Tencent Computer Systems Co Ltd; priority to CN202311011833.3A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F 40/279 Recognition of textual entities


Abstract

The application provides an information acquisition method and apparatus, an electronic device, and a storage medium, relating to the field of artificial intelligence. The information acquisition method includes: acquiring a text vector representation of a text; determining a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to the entity attributes, in a knowledge network, of the first entity and the second entity in the text; determining a triplet context vector of the first entity and the second entity in the knowledge network; and predicting the relationship between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector, and the triplet context vector. The entity attributes and triplet context derived from the knowledge network serve as external features, and relation extraction is performed by combining these external features with the context of the text itself, which can improve the performance of the relation extraction task.

Description

Information acquisition method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of artificial intelligence, in particular to a method, a device, electronic equipment and a storage medium for information acquisition.
Background
Publicly available large-scale knowledge networks (knowledge graphs) are widely used in many practical applications, such as question answering, fact checking, voice assistants, search engines, and the like. Although these knowledge networks have been successful and popular, they are not complete. There is therefore a need for methods that can automatically extract knowledge from unstructured text into a knowledge network. Relation extraction is a knowledge graph completion task that aims to determine the implicit relationship between two given entities and align it to a background knowledge network. For example, given the sentence "Li Bai, courtesy name Taibai, art name Qinglian Jushi, was a great romantic poet of the High Tang era, honored by later generations as the 'Poet Immortal', and together with Du Fu is called 'Li Du'", the goal of the relation extraction task is to infer the semantic relationship; in this example, the entities Li Bai and Du Fu are friends. The effect of background knowledge can be seen here: the correct target relationship "friend" is not explicitly expressed in the sentence, yet the model can still infer the correct relation.
In the related art, relation extraction methods mainly rely on the distant-supervision paradigm. Given a sentence (or instance), multi-entity relation extraction considers all previous occurrences of a given entity pair when predicting the target relationship. However, using only the context information of entity pairs in a neural network model adds noise to the training data, which negatively affects the overall prediction. How to improve the performance of the relation extraction task is therefore a problem to be solved.
Disclosure of Invention
The application provides a method, a device, electronic equipment and a storage medium for acquiring information, which can be beneficial to improving the performance of a relation extraction task and improving the quality of relation extraction.
In a first aspect, an embodiment of the present application provides a method for obtaining information, including:
acquiring a text vector representation of a text;
determining a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network;
determining a triplet context vector of the first entity and the second entity in the knowledge network, wherein the triplet context vector comprises a first entity vector, a relationship vector and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and a triplet vector of a neighborhood entity of the first entity in the knowledge network; the neighborhood entity comprises the second entity;
and predicting the relation between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector and the triplet context vector.
In a second aspect, an embodiment of the present application provides an apparatus for acquiring information, including:
an acquisition unit configured to acquire a text vector representation of a text;
a first determining unit, configured to determine a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network;
a second determining unit, configured to determine a triplet context vector of the first entity and the second entity in the knowledge network, where the triplet context vector includes a first entity vector, a relationship vector, and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and a triplet vector of a neighborhood entity of the first entity in the knowledge network; the neighborhood entity comprises the second entity;
and the predicting unit is used for predicting the relation between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector and the triplet context vector.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory for storing a computer program, the processor being for invoking and running the computer program stored in the memory for performing the method as in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform a method as in the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising computer program instructions for causing a computer to perform the method as in the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program for causing a computer to perform the method as in the first aspect.
Through the technical scheme, the embodiment of the application utilizes the entity attribute and the triplet context derived from the knowledge network as external features, performs relationship extraction by combining the external features on the basis of the context of the text, and reduces the influence of noise from the previous text on the overall relationship extraction performance by supplementing the context obtained from the text by using the knowledge from the knowledge network, thereby being beneficial to improving the performance of the relationship extraction task and improving the quality of relationship extraction.
Drawings
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present application;
FIG. 2 is a schematic diagram of a system architecture according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a method of information acquisition according to an embodiment of the application;
FIG. 4 is a schematic diagram of a process for obtaining entity attribute context vectors for a given entity;
FIG. 5 is a schematic flow chart diagram of another method of information acquisition according to an embodiment of the present application;
FIG. 6 is a schematic flow chart diagram of another method of information acquisition according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of an apparatus for information acquisition according to an embodiment of the present application;
fig. 8 is a schematic block diagram of an electronic device according to an embodiment of the application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application.
It should be understood that in embodiments of the present application, "B corresponding to A" means that B is associated with A. In one implementation, B may be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
In the description of the present application, unless otherwise indicated, "at least one" means one or more, and "a plurality" means two or more. In addition, "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, and B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.
It should be further understood that the description of the first, second, etc. in the embodiments of the present application is for illustration and distinction of descriptive objects, and is not intended to represent any limitation on the number of devices in the embodiments of the present application, nor is it intended to constitute any limitation on the embodiments of the present application.
It should also be appreciated that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the application. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, product, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, product, or device.
The embodiment of the application is applied to the technical field of artificial intelligence.
Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technology, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
With the research and advancement of artificial intelligence technology, artificial intelligence is being researched and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, digital twins, virtual humans, robots, artificial intelligence generated content (AIGC), conversational interaction, smart healthcare, smart customer service, and game AI. It is believed that with the development of technology, artificial intelligence will be applied in more fields and play an increasingly important role.
Embodiments of the present application may relate to natural language processing (Natural Language Processing, NLP) in artificial intelligence technology, an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, i.e., the language people use daily, so it is closely related to the study of linguistics. Natural language processing technology typically includes text processing, semantic understanding, machine translation, question answering, knowledge graph technology, and the like.
The embodiment of the application may relate to computer vision (Computer Vision, CV) technology in artificial intelligence. Computer vision is the science of studying how to make machines "see"; more specifically, it refers to using cameras and computers instead of human eyes to recognize, track, and measure targets, and to perform further graphic processing so that the computer produces images more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technology typically includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
The embodiment of the application may also relate to machine learning (Machine Learning, ML) in artificial intelligence technology. ML is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Related terms related to the embodiments of the present application are described below.
Knowledge network: a knowledge-relationship network that links heterogeneous information together; each node represents an entity and each edge represents a relationship between entities.
Entity attributes: the properties an entity has that describe its identity or state. For example, the attributes of a student entity may include name, student number, grade, etc.
Relation extraction: an information extraction task that identifies the target relationship between entities in text and forms a triplet (subject, relation, object). For example, given the sentence "Li Bai is a famous poet in China", the goal of relation extraction is to extract the triplet (Li Bai, profession, poet). Relation extraction is an important technical link in constructing a knowledge network. The subject may also be referred to as the head entity and the object as the tail entity.
Graph neural network: a deep learning technique that applies a neural network to graph-structured data to capture complex relationships and features in the data. A graph neural network is an optimizable transformation that can update all the attributes of a graph (nodes, edges, global context) while preserving the graph's symmetry (i.e., permutation invariance).
Graph attention mechanism: a method for graph neural networks that uses a self-attention mechanism to calculate the relevance of each node in the graph to its neighbor nodes, and uses the relevance as weights to update the node features, so that the node features better capture the structural information of the graph.
Triplet context: a knowledge representation method that represents a triplet (head entity, relation, tail entity) as its corresponding vector (embedding). The purpose of the triplet context is to capture semantic information between entities and relations, and it can be used for construction and reasoning in relation extraction.
Currently, relation extraction methods in the related art mainly rely on the distant-supervision paradigm. Given a sentence (or instance), multi-entity relation extraction considers all previous occurrences of a given entity pair to predict the target relationship. This method assumes that if two entities have a relationship in the knowledge network, then all sentences containing these entities express the same relationship; using only the context information from the entity pairs therefore adds noise to the training data, negatively affecting the overall prediction.
In order to solve the technical problems, the application provides a method, a device, electronic equipment and a storage medium for acquiring information, which are beneficial to improving the performance of a relation extraction task and improving the quality of relation extraction.
In particular, a text vector representation of the text may be obtained; determining a first entity attribute context vector of the first entity and a second entity attribute context vector of the second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network; determining a triplet context vector of the first entity and the second entity in the knowledge network, wherein the triplet context vector comprises a first entity vector, a relation vector and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and the triplet vector of the neighborhood entity of the first entity in the knowledge network; the neighborhood entity includes a second entity; and predicting the relation between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector and the triplet context vector.
Therefore, the embodiment of the application utilizes the entity attribute and the triplet context derived in the knowledge network as external features, performs relation extraction by combining the external features on the basis of the context of the text, and reduces the influence of noise from the previous text on the overall relation extraction performance by supplementing the context obtained from the text by using the knowledge from the knowledge network, thereby being beneficial to improving the performance of relation extraction tasks and improving the quality of relation extraction.
The embodiment of the application can be applied to any business scenario requiring text processing; it can automatically identify relations in unstructured text (text relation extraction) and align them with a knowledge network, i.e., associate the entity relations extracted from the text with the entity relations in the knowledge network. Publicly available large-scale knowledge networks are widely used in many practical applications, such as question answering, fact checking, voice assistants, search engines, and advertisement or information recommendation. Extracting knowledge from text can make the knowledge network more complete.
Fig. 1 shows a schematic diagram of an application scenario according to an embodiment of the present application.
As shown in fig. 1, the application scenario involves a terminal 102 and a server 104, where the terminal 102 may communicate data with the server 104 through a communication network. Server 104 may be a background server of terminal 102.
Optionally, as shown in FIG. 1, the server 104 may also be coupled to a data storage system 106, such as a database, for providing data storage services for the server 104. The data storage system may be integrated on the server 104, or may be deployed on a cloud or other server, without limitation.
By way of example, the terminal 102 may refer to a device that has rich man-machine interaction, has access to the internet, typically carries various operating systems, and has a high processing power. The terminal device may be a terminal device such as a smart phone, a tablet computer, a portable notebook computer, a desktop computer, a wearable device, a vehicle-mounted device, etc., but is not limited thereto. Optionally, in the embodiment of the present application, the terminal 102 is installed with a relationship extraction application or an application with a relationship extraction function.
The server 104 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content delivery networks), basic cloud computing services such as big data and artificial intelligent platforms, and the like. Servers may also become nodes of the blockchain.
The server may be one or more. Where the servers are multiple, there are at least two servers for providing different services and/or there are at least two servers for providing the same service, such as in a load balancing manner, as embodiments of the application are not limited in this respect.
The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the present application. The present application does not limit the number of servers or terminal devices. The scheme provided by the application can be independently completed by the terminal equipment, can be independently completed by the server, and can be completed by the cooperation of the terminal equipment and the server, and the application is not limited to the scheme.
Optionally, as shown in FIG. 1, a data storage system may also be included. The data storage system may store data required by the server 104. The data storage system may be integrated on the server 104, or may be deployed on a cloud or other server, without limitation.
It should be understood that fig. 1 is only an exemplary illustration, and does not specifically limit the application scenario of the embodiment of the present application. For example, fig. 1 illustrates one terminal device, one server, and may actually include other numbers of terminal devices and servers, which the present application is not limited to.
The following describes the technical scheme of the embodiments of the present application in detail through some embodiments. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 2 is a schematic diagram of a system architecture according to an embodiment of the present application. The system architecture includes a text acquisition module 210, a text vector module 220, an entity attribute context module 230, a triplet context learner 240, and an aggregator 250, wherein the aggregator 250 further includes a text encoder 251, a propagation layer 252, and a classifier 253. The text obtaining module 210 is configured to obtain text, and the text vector module 220 is configured to obtain a text vector representation of the input text. Entity attribute context module 230, triplet context learner 240, and aggregator 250 are new modules introduced by embodiments of the present application, which are described below.
The entity attribute context module 230 is used to learn, from the entity attributes in the knowledge network, a context vector representation of a given entity in the text, so as to enrich the entity vector. The entity attribute context module 230 may be a module based on a recurrent neural network. By way of example and not limitation, the entity attributes include at least one of entity tags, entity aliases, entity descriptions, entity types, and the like.
The triplet context learner 240 is configured to capture a triplet context vector for a given entity in the knowledge network, the triplet context vector comprising a vector representation of entities and relationships in the neighborhood of the given entity, supplementing the context information obtained by the entity attribute context module 230 with additional triplet context vectors. The triplet context learner 240 may include a graph attention network that may learn the entities and relationship vectors of neighbor triples of entities using a graph attention mechanism.
The aggregator 250 may aggregate the entity attribute contexts, triplet contexts, and text vectors using the message-passing capability of a neural network, learn vector representations of the entities stored in the text and the knowledge network, and output the relation extraction result via the classifier. The neural network includes the text encoder 251 and the propagation layer 252. The neural network receives the text vector, the entity attribute context vectors, and the triplet context vector as input and obtains a unified vector representation, and the classifier 253 predicts the entity relationship to obtain the relation extraction result. After passing through the text encoder 251, the text vector is input to the propagation layer 252 together with the entity attribute contexts to obtain entity vector representations at each layer; the per-layer entity vector representations are spliced with the triplet context vector and then input to the classifier 253 to obtain an accurate entity relationship.
In some embodiments, the entity attribute context module 230, the triplet context learner 240, and the aggregator 250 may be separately model trained by embodiments of the present application to optimize model parameters for each module. Alternatively, the model training of the aggregator 250 module to optimize parameters may be continued with parameter updating of the entity attribute context module 230 and the triplet context learner 240. Alternatively, the entity attribute context module 230, the triplet context learner 240, and the aggregator 250 may be trained alternately, i.e., training the entity attribute context module 230 and the triplet context learner 240 with a portion of the samples, then keeping the two parameters unchanged, training the aggregator 250 with a portion of the samples, and then repeating the previous training process with another portion of the samples.
Therefore, the embodiment of the application utilizes the entity attribute and the triplet context derived in the knowledge network as external features, performs relation extraction by combining the external features on the basis of the context of the text, and reduces the influence of noise from the previous text on the overall relation extraction performance by supplementing the context obtained from the text by using the knowledge from the knowledge network, thereby being beneficial to improving the performance of relation extraction tasks and improving the quality of relation extraction.
In order to describe the detailed process of the information acquisition method provided in the embodiment of the present application, first, the problems and related terms related to the embodiment of the present application are defined.
First, a knowledge network can be defined as a tuple $\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{T})$, where $\mathcal{E}$ represents the set of entities (vertices or nodes), $\mathcal{R}$ represents the set of relations (edges), and $\mathcal{T} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$ is the set of all triples, also known as entity relationships.
A triplet $(h, r, t) \in \mathcal{T}$ represents a relation $r \in \mathcal{R}$ between the head entity $h \in \mathcal{E}$ (the starting entity of the relation) and the tail entity $t \in \mathcal{E}$ (the ending entity of the relation). Since the knowledge network is a multigraph, more than one relation may exist between any two entities. A context retrieval function $C(e)$ can be defined; for a given entity $e$ it returns two sets: $A_e$ (the set of all attributes of $e$) and $T_e$ (the set of all triples in which $e$ is the head entity).
A sentence $s$ is a word sequence $[w_1, w_2, \ldots, w_L]$. The entities in the sentence are denoted by the set $E_s = \{e_1, \ldots, e_m\}$, where each $e_i$ is a phrase fragment of $s$. Each phrase fragment is labeled by an entity of the knowledge network, $e_i \in \mathcal{E}$. When a relation exists between two labeled entities in a sentence, they form a relation pair $(e_i, r, e_j)$ (labeled N/A if there is no corresponding relation in the knowledge network).
The relation extraction task is: given a sentence $s$ and an entity pair $(e_i, e_j)$, predict the target relation $r \in \mathcal{R}$; the N/A label is returned if no relation can be inferred. The embodiment of the application treats relation extraction as a classification task and utilizes the context information of the knowledge network (realized by learning over the sets $A_e$ and $T_e$) to improve the classification effect.
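For illustration only, the definitions above can be sketched as plain data structures. The following Python sketch is an exposition aid; the class and field names are assumptions and do not appear in the patent:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Triple:
    head: str      # h: starting entity of the relation
    relation: str  # r
    tail: str      # t: ending entity of the relation

@dataclass
class KnowledgeNetwork:
    entities: set[str] = field(default_factory=set)
    relations: set[str] = field(default_factory=set)
    triples: set[Triple] = field(default_factory=set)
    attributes: dict[str, dict[str, str]] = field(default_factory=dict)

    def context(self, e: str) -> tuple[dict[str, str], set[Triple]]:
        """C(e): return (A_e, T_e), i.e. the attributes of e and the
        triples in which e appears as the head entity."""
        a_e = self.attributes.get(e, {})
        t_e = {t for t in self.triples if t.head == e}
        return a_e, t_e

kn = KnowledgeNetwork()
kn.entities |= {"Li Bai", "Du Fu"}
kn.relations.add("friend")
kn.triples.add(Triple("Li Bai", "friend", "Du Fu"))
kn.attributes["Li Bai"] = {"label": "Li Bai", "type": "person",
                           "description": "Tang dynasty poet"}
print(kn.context("Li Bai"))
```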
The information acquisition method provided by the present application will be described in detail below in conjunction with the above definition.
Fig. 3 is a schematic flow chart of a method 300 of information acquisition according to an embodiment of the present application, where the method 300 of information acquisition may be performed by any electronic device having data processing capabilities, e.g., the electronic device may be implemented as a server or a terminal device, e.g., as the server 104 or the terminal 102 of fig. 1, and the present application is not limited in this regard. As shown in fig. 3, the method 300 of information acquisition includes steps 310 through 340.
A text vector representation of the text is obtained 310.
The text is the text to be processed, for example a sentence. Referring to FIG. 2, the text vector module 220 may obtain a text vector representation of the text; e.g., the text vector representation of a sentence may be the word vector sequence $[\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_L]$. By way of example, the text vector representation may be a static vector representation, which the application does not limit.
By way of example, the text vector module may include, but is not limited to, BERT (Bidirectional Encoder Representations from Transformers), long short-term memory models (long short-term memory, LSTM), convolutional neural networks (Convolutional Neural Networks, CNN), graph neural networks (Graph Neural Network, GNN), and the like, which the application does not limit.
At 320, a first entity attribute context vector for the first entity and a second entity attribute context vector for the second entity are determined based on the entity attributes of the first entity and the second entity in the text in the knowledge network.
Illustratively, for the entity set $E_s = \{e_1, \ldots, e_m\}$ in the sentence, the first entity and the second entity are any two different entities in $E_s$. With continued reference to fig. 2, the entity attribute context module 230 may obtain, from the input text, a first entity attribute context vector of the first entity and a second entity attribute context vector of the second entity in the text as features obtained from the external knowledge network.
In this step, an entity attribute context vector for the entity is constructed from the common attributes of the entity in the knowledge network. Wherein the entity attributes include, but are not limited to, at least one of entity tags, entity aliases, entity descriptions, and entity types, among others.
In some embodiments, the word vector and the character vector of each first entity attribute of the first entity may be connected together and input to the encoder to obtain a coded vector of each first entity attribute; and splicing the coded vectors of each first entity attribute, and inputting the spliced coded vectors into a convolutional neural network to obtain the context vector of the first entity attribute. Similarly, the word vector and the character vector of each second entity attribute of the second entity can be connected and input into the encoder to obtain the encoding vector of each second entity attribute; and splicing the coded vectors of each second entity attribute, and inputting the spliced coded vectors into a convolutional neural network to obtain the context vector of the second entity attribute.
Fig. 4 shows a schematic diagram of a process of obtaining an entity attribute context vector for a given entity. The given entity may be the first entity or the second entity, which is not limited. Taking the example that the entity attributes include entity tags, entity descriptions, entity aliases and entity types, the word vectors and character vectors of each entity attribute may be respectively connected to input encoders, which may be a bi-directional LSTM encoder. The encoder outputs corresponding to each entity attribute may be stacked and input into a one-dimensional CNN to obtain an entity attribute context vector. CNNs are able to use a dynamic number of contexts by max pooling, keeping the model invariant to the order of context input. By way of example, the entity attribute context vector may be expressed as the following formula:
$$\mathbf{a}_e = \mathrm{1D\_CNN}\big(\mathrm{BiLSTM}(a_1) \oplus \mathrm{BiLSTM}(a_2) \oplus \cdots \oplus \mathrm{BiLSTM}(a_k)\big) \qquad (1)$$
where $\mathbf{a}_e$ represents the entity attribute context vector of the given entity, $\mathrm{1D\_CNN}$ represents a one-dimensional CNN network, $\oplus$ is the concatenation operator, $a_i$ is an attribute of the given entity, and $k$ is the number of entity attributes.
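As a rough illustration of formula (1), the following sketch (assuming PyTorch; the dimensions, the choice of the final hidden state as each attribute's summary, and all names are illustrative assumptions) encodes each attribute with a BiLSTM, stacks the encodings, and applies a one-dimensional CNN with max pooling over attributes:

```python
import torch
import torch.nn as nn

class EntityAttributeContext(nn.Module):
    """Sketch of Eq. (1): encode each attribute with a BiLSTM, stack the
    encodings, run a 1-D CNN, and max-pool over attributes so the output
    is invariant to attribute order and count."""
    def __init__(self, emb_dim=64, hidden=64, out_dim=128):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.cnn = nn.Conv1d(2 * hidden, out_dim, kernel_size=1)

    def forward(self, attributes):
        # attributes: list of (num_tokens, emb_dim) tensors, one per attribute a_i
        encoded = []
        for a in attributes:
            out, _ = self.encoder(a.unsqueeze(0))       # (1, T, 2*hidden)
            encoded.append(out[:, -1, :])               # final state as summary
        stacked = torch.stack(encoded, dim=-1).squeeze(0)   # (2*hidden, k)
        feats = torch.relu(self.cnn(stacked.unsqueeze(0)))  # (1, out_dim, k)
        return feats.max(dim=-1).values.squeeze(0)      # max-pool over attributes

attrs = [torch.randn(5, 64), torch.randn(3, 64), torch.randn(7, 64)]
ctx = EntityAttributeContext()(attrs)
print(ctx.shape)  # torch.Size([128])
```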
330, determining a triplet context vector of the first entity and the second entity in the knowledge network, wherein the triplet context vector comprises the first entity vector, the relation vector and the second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and the triplet vector of the neighborhood entity of the first entity in the knowledge network. The neighborhood entity includes a second entity.
In particular, because each entity may participate in multiple relationships in different contexts in the knowledge network, different aspects of the entity may participate in representing the multiple relationships, more expressive power can be captured by learning the triplet context vector, i.e., learning the entity and relationship vector of the triplet. In particular, the triplet context vector is intended to capture semantic information between entities and relationships in the neighborhood of a given entity. The entity attribute context information obtained in step 320 can be supplemented by the additional triplet context in the neighborhood network, resulting in more complete external features.
With continued reference to fig. 2, the triplet context learner 240 may obtain triplet context vectors for the first entity and the second entity in the text as features obtained from the external knowledge network based on the input text.
In some embodiments, a graph attention mechanism may be utilized to obtain a triplet context vector for a first entity and a second entity in a knowledge network. Referring to fig. 5, the triplet context vector may be acquired through the following steps 331 to 333.
331, an initial triplet context vector is obtained, the initial triplet context vector comprising an initial entity vector of the first entity, an initial entity vector of the second entity and an initial relationship vector. The initial relationship vector characterizes an initial relationship of the first entity and the second entity.
Illustratively, let $\mathbf{h}_i$ be the initial entity vector of the first entity, $\mathbf{h}_j$ the initial entity vector of the second entity, and $\mathbf{r}_k$ the initial relation vector of the first entity and the second entity, where the second entity is a neighbor node of the first entity. The vector $\mathbf{c}_{ijk}$ of the triplet $(e_i, r_k, e_j)$ is then expressed as the following formula:
$$\mathbf{c}_{ijk} = W_1 \big(\mathbf{h}_i \oplus \mathbf{h}_j \oplus \mathbf{r}_k\big) \qquad (2)$$
where $W_1$ is a weight matrix and the head entity vector $\mathbf{h}_i$, the tail entity vector $\mathbf{h}_j$, and the relation vector $\mathbf{r}_k$ are concatenated. The vector $\mathbf{c}_{ijk}$ is the initial triplet context vector described above.
332, weighting and aggregating the initial triplet context vectors by using the graph attention network to obtain a neighborhood weighted aggregate vector of the first entity.
Specifically, the graph attention network uses a self-attention mechanism to calculate the relevance of each entity node to its neighbor nodes, and uses the relevance as weights in a weighted aggregation over the neighbor nodes to update the representation of the entity node, so that the features of the entity node better capture the structural information of the graph. In other words, the weighted aggregation vector fuses the relations between the first entity and its neighboring entities and can represent the multiple relationships in which the first entity participates.
Illustratively, the neighborhood weighted aggregate vector for the first entity may be obtained by the following steps 3321 through 3323.
3321, attention weights of the initial triplet context vector are obtained.
Specifically, the attention weight (i.e., the attention value) between the initial triplet context vector of the first entity and each neighboring entity may be learned. Illustratively, the attention weight is denoted $b_{ijk}$, and the calculation process is shown in the following formula (3):
$$b_{ijk} = \mathrm{LeakyReLU}\big(W_2\, \mathbf{c}_{ijk}\big) \qquad (3)$$
where $\mathrm{LeakyReLU}$ is an activation function and $W_2$ is a weight matrix.
3322, normalizing the attention weights of the initial triplet context vector to obtain the relative attention weights of the initial triplet context vector.
Specifically, when aggregating neighbor nodes, the attention values of all neighbors of the entity node need to be normalized to obtain relative attention weights. For example, the softmax function may be applied to formula (3) above, as follows:
$$\alpha_{ijk} = \mathrm{softmax}_{jk}(b_{ijk}) = \frac{\exp(b_{ijk})}{\sum_{n \in \mathcal{N}_i} \sum_{r \in \mathcal{R}_{in}} \exp(b_{inr})} \qquad (4)$$
where $\alpha_{ijk}$ is the relative attention value, $\mathcal{N}_i$ denotes the neighborhood of entity $e_i$, and $\mathcal{R}_{in}$ denotes the set of relations between entities $e_i$ and $e_n$. $\alpha_{ijk}$ is the aggregation coefficient used when the relations between the first entity node and its neighbor nodes are weighted and aggregated.
3323 weight-aggregating the initial triplet context vector according to the relative attention weights to obtain a neighborhood weight-aggregate vector.
Specifically, the triplet vectors $\mathbf{c}_{ijk}$ of the first entity and each neighbor entity in formula (2) can be weighted and summed with the relative attention values $\alpha_{ijk}$ to obtain the neighborhood weighted aggregation vector of the first entity, which fuses the information of the neighbor nodes. To stabilize learning, X independent attention heads can be used; the resulting neighborhood weighted aggregation vector is then the concatenation of the vectors from each attention head, as shown in the following formula (5):
$$\mathbf{h}_i' = \big\Vert_{x=1}^{X}\ \sigma\Big(\sum_{j \in \mathcal{N}_i} \sum_{k \in \mathcal{R}_{ij}} \alpha_{ijk}^{x}\, \mathbf{c}_{ijk}^{x}\Big) \qquad (5)$$
where $\mathbf{h}_i'$ is the neighborhood weighted aggregation vector of the first entity and $\sigma$ is a nonlinear function.
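A minimal sketch of formulas (2) to (5), assuming PyTorch; the weight shapes, the use of sigmoid for the nonlinearity, and the two attention heads are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TripleGraphAttention(nn.Module):
    """Sketch of Eqs. (2)-(5): score each neighbor triple, softmax the
    scores over the neighborhood, and aggregate triple vectors per head."""
    def __init__(self, dim=64, heads=2):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.ModuleDict({
                "w1": nn.Linear(3 * dim, dim, bias=False),  # Eq. (2)
                "w2": nn.Linear(dim, 1, bias=False),        # Eq. (3)
            }) for _ in range(heads))

    def forward(self, h_i, neighbors):
        # neighbors: (N, 2, dim) pairs of (relation vector r_k, tail vector h_j)
        outs = []
        for head in self.heads:
            triples = torch.cat(
                [h_i.expand(len(neighbors), -1),            # h_i
                 neighbors[:, 1], neighbors[:, 0]], dim=-1)  # h_j, r_k
            c = head["w1"](triples)                         # Eq. (2)
            b = F.leaky_relu(head["w2"](c))                 # Eq. (3)
            alpha = torch.softmax(b, dim=0)                 # Eq. (4)
            outs.append(torch.sigmoid((alpha * c).sum(0)))  # Eq. (5), one head
        return torch.cat(outs, dim=-1)  # concatenation over heads

h_i = torch.randn(64)
neigh = torch.randn(5, 2, 64)  # 5 neighbor triples
print(TripleGraphAttention()(h_i, neigh).shape)  # torch.Size([128])
```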
333, determining a triplet context vector from the neighborhood weighted aggregation vector and the initial triplet context vector.
Specifically, since the neighborhood weighted aggregation vector is obtained by weighting and aggregating the relations between the first entity and its neighborhood entities according to the initial triplet context vectors, the triplet context vector can be obtained from the neighborhood weighted aggregation vector. Illustratively, the triplet context vector may be obtained through the following steps 3331 to 3334.
3331 obtaining the first entity vector according to the neighborhood weighted aggregate vector and the initial entity vector of the first entity.
Specifically, the initial entity vector $\mathbf{h}_i$ of the first entity can be added to the neighborhood weighted aggregation vector $\mathbf{h}_i'$, so that the first entity vector $\mathbf{h}_i^{\mathrm{new}}$ preserves the information of the initial entity vector. The specific formula is as follows:
$$\mathbf{h}_i^{\mathrm{new}} = W_E\, \mathbf{h}_i + \mathbf{h}_i' \qquad (6)$$
where $W_E$ is a weight matrix.
3332, performing linear transformation on the initial relation vector to obtain a relation vector.
Specifically, the initial relation vector $\mathbf{r}_k$ can be linearly transformed using a weight matrix $W_R$ to obtain the relation vector $\mathbf{r}_k'$, matching the dimension of the first entity vector $\mathbf{h}_i^{\mathrm{new}}$. The specific formula is as follows:
$$\mathbf{r}_k' = W_R\, \mathbf{r}_k \qquad (7)$$
3333 obtaining a second entity vector based on the first entity vector and the relationship vector.
Specifically, for a valid triplet $(h, r, t)$ in the same vector space, the following formula holds:
$$\mathbf{h} + \mathbf{r} = \mathbf{t} \qquad (8)$$
Based on this, the first entity vector $\mathbf{h}_i^{\mathrm{new}}$ in formula (6) and the relation vector $\mathbf{r}_k'$ can be added to obtain the second entity vector $\mathbf{t}_j$, where $\mathbf{h}_i^{\mathrm{new}}$ and $\mathbf{t}_j$ are vectors in entity space.
In some embodiments, the first entity vector may also be converted from entity space to relationship space via a nonlinear transformation, and the second entity vector may be converted from entity space to relationship space via a nonlinear transformation.
That is, in embodiments of the present application, the entity vectors and the relation vector may be kept in different vector spaces; for example, the first entity vector is converted from entity space to relation space, denoted $\mathbf{h}_i^{r}$, and the second entity vector is converted from entity space to relation space, denoted $\mathbf{t}_j^{r}$.
Illustratively, an entity vector may be converted from entity space to relation space by applying a nonlinear transformation, as shown in the following formula:
$$\mathbf{e}^{r} = \sigma\big(W_r\, \mathbf{e}\big) \qquad (9)$$
where $\mathbf{e}^{r}$ (with $\mathbf{e} \in \{\mathbf{h}_i^{\mathrm{new}}, \mathbf{t}_j\}$) is the relation-specific entity vector in the relation space, $\sigma$ is a nonlinear function, and $W_r$ is a relation-specific transformation matrix.
Specifically, each entity corresponds to a vector and each relation corresponds to a vector; if the vectors corresponding to entities and the vectors corresponding to relations lie in the same space, the differences between entities and relations cannot be clearly expressed. Embedding the vectors corresponding to entities and the vectors corresponding to relations into different vector spaces helps the model learn more expressive entity and relation vectors, and helps capture the relations and entity representations more accurately.
Based on the separated vector spaces, the above formula (8) can be rewritten as follows:
$$\mathbf{h}_i^{r} + \mathbf{r}_k' = \mathbf{t}_j^{r} \qquad (10)$$
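The relation-space projection of formulas (9) and (10) might be sketched as follows, assuming tanh as the nonlinearity $\sigma$ and one transformation matrix per relation; all shapes and names are illustrative:

```python
import torch
import torch.nn as nn

class RelationSpaceProjection(nn.Module):
    """Sketch of Eqs. (9)-(10): project entity vectors into a
    relation-specific space, where a TransE-style constraint
    h_r + r = t_r is expected to hold for valid triples."""
    def __init__(self, dim=64, num_relations=10):
        super().__init__()
        # One W_r per relation (Eq. (9)); the relation index picks the matrix.
        self.w_r = nn.Parameter(torch.randn(num_relations, dim, dim) * 0.01)

    def forward(self, h, t, rel_idx):
        w = self.w_r[rel_idx]
        h_r = torch.tanh(w @ h)  # Eq. (9) for the head entity
        t_r = torch.tanh(w @ t)  # Eq. (9) for the tail entity
        return h_r, t_r

proj = RelationSpaceProjection()
h, t, r = torch.randn(64), torch.randn(64), torch.randn(64)
h_r, t_r = proj(h, t, rel_idx=3)
score = torch.norm(h_r + r - t_r, p=1)  # small for valid triples (Eq. (10))
print(score.item())
```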
3334 obtaining a triplet context vector based on the first entity vector, the relationship vector and the second entity vector.
Specifically, the triplet corresponding to the first entity vector, the relation vector, and the second entity vector can be expressed as $(\mathbf{h}_i^{r}, \mathbf{r}_k', \mathbf{t}_j^{r})$. The triplet context vector corresponding to this triplet is obtained analogously to formula (2), i.e., concatenating $\mathbf{h}_i^{r}$, $\mathbf{r}_k'$, and $\mathbf{t}_j^{r}$ yields the triplet context vector.
In some embodiments, parameters of the triplet context learner may also be updated according to marginal ordering loss to obtain a more accurate triplet context vector. For example, model parameter updates may be performed according to steps 3341 to 3343 below.
3341, determining a first distance metric from a valid triplet corresponding to the first entity and its neighborhood entity, and determining a second distance metric from an invalid triplet corresponding to the first entity and its neighborhood entity, wherein the valid triplet comprises a valid relationship between the first entity and its neighborhood entity, and the invalid triplet comprises a non-existing relationship between the first entity and its neighborhood entity.
Illustratively, a distance metric $d(t_{ijk})$ can be defined for a triplet $t_{ijk} = (\mathbf{h}_i^{r}, \mathbf{r}_k', \mathbf{t}_j^{r})$, determined according to the following formula:
$$d(t_{ijk}) = \big\Vert \mathbf{h}_i^{r} + \mathbf{r}_k' - \mathbf{t}_j^{r} \big\Vert_1 \qquad (11)$$
Specifically, for a valid triplet formed by the first entity and one of its neighborhood nodes, i.e., an actual triplet in the data set with a valid relation between the entities, the distance metric is the first distance metric, denoted $d(t_{ijk})$. For an invalid triplet formed by the first entity and one of its neighborhood nodes, i.e., a triplet that does not exist in the data set, with no relation between the entities, the distance metric is the second distance metric, denoted $d(t'_{ijk})$. A non-existing relation may be represented by an erroneous (corrupted) relation pair.
Illustratively, if in the knowledge network Li Bai and Du Fu have the valid relationship "friend", then the valid triplet is <Li Bai, friend, Du Fu>. An invalid triplet would contain a relationship that does not hold between the two entities.
3342 determining a marginal ordering loss based on the first distance metric and the second distance metric.
For example, the marginal ordering loss may be determined according to the following formula:
$$L = \sum_{t \in \mathcal{T}} \sum_{t' \in \mathcal{T}'} \max\big(d(t) - d(t') + \gamma,\ 0\big) \qquad (12)$$
where $L$ represents the marginal ordering loss, $\mathcal{T}$ is the set of valid triples, $\mathcal{T}'$ is the set of invalid triples, and $\gamma$ is a margin parameter.
3343, updating the parameters of the graph attention network by minimizing the marginal ordering loss.
Therefore, by updating the parameters of the graph attention network, the embodiment of the application enables the network to learn the entity and relation vectors of an entity node and its neighborhood entities more accurately and to obtain a more accurate triplet context vector.
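A sketch of formula (12) using PyTorch's built-in margin ranking loss; treating corrupted triples as the invalid set and the specific margin value are assumptions:

```python
import torch
import torch.nn as nn

# Sketch of Eq. (12) with torch's built-in margin ranking loss.
# d_pos / d_neg are Eq.-(11) distances for valid / corrupted triples
# (corrupted = e.g. tail replaced by a random entity); values illustrative.
loss_fn = nn.MarginRankingLoss(margin=1.0)

d_pos = torch.rand(32)        # distances of valid triples, should shrink
d_neg = torch.rand(32) + 0.5  # distances of invalid triples, should grow
# target = -1 asks for d_pos < d_neg: loss = max(0, d_pos - d_neg + margin)
loss = loss_fn(d_pos, d_neg, target=-torch.ones(32))
print(loss.item())
```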
340 predicting the relationship of the first entity and the second entity in the text based on the text vector representation, the first entity attribute context vector, the second entity attribute context vector, and the triplet context vector.
With continued reference to fig. 2, the text vector representation obtained by the text vector module 220, the first entity attribute context vector and the second entity attribute context vector obtained by the entity attribute context module 230, and the triplet context vector obtained by the triplet context learner 240 may be input to the aggregator 250 to predict the relationship of the first entity and the second entity in the text. The first entity attribute context vector, the second entity attribute context vector and the triplet context vector are used as external features acquired from the knowledge network, and relationship extraction is performed by combining text vector representation to obtain the relationship of the entities.
In some embodiments, referring to fig. 6, the relationship of the first entity and the second entity in the text may be predicted by the following steps 341 through 346.
And 341, acquiring a first text vector representation corresponding to the first entity and a second text vector representation corresponding to the second entity according to the text vector representation.
For example, given the text vector representation $[\mathbf{w}_1, \ldots, \mathbf{w}_L]$, the first text vector representation corresponding to the first entity may be denoted $\mathbf{s}_{e_i}$, and the second text vector representation corresponding to the second entity may be denoted $\mathbf{s}_{e_j}$.
342, encoding the first text vector to represent the position information of the word corresponding to the first entity to obtain a first encoded vector, and encoding the second text vector to represent the position information of the word corresponding to the second entity to obtain a second encoded vector.
With continued reference to fig. 2, the text vector representation (i.e., the word vector) and the location information of the words may be input to a text encoder 251, which encodes the word vector and the location information of the words to obtain an encoded vector. Illustratively, the encoder may encode according to the following formula:
$$\mathbf{x}_t = \mathbf{w}_t \oplus \mathbf{p}_{t-i} \oplus \mathbf{p}_{t-j} \qquad (13)$$
where $\mathbf{x}_t$ is the encoder output for the word at position $t$ in the text and $(i, j)$ are the positions of the corresponding entity pair; $\mathbf{w}_t$ is the word vector, and $\mathbf{p}_{t-i}$ and $\mathbf{p}_{t-j}$ are position vectors representing the relative position of word $t$ to the entities in the text (e.g., sentence). The encoder outputs the concatenation of the word vector and the position vectors of the word in the text. The position vectors may be used to indicate whether a word belongs to the head entity, the tail entity, or neither.
In this step, the first text vector representation corresponding to the first entity and the position information of the word thereof may be input to the encoder to obtain the encoding result corresponding to the first entity (i.e., the first encoding result), and the second text vector representation corresponding to the second entity and the position information of the word thereof may be input to the encoder to obtain the encoding result corresponding to the second entity (i.e., the second encoding result).
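A sketch of the position-aware encoding of formula (13), assuming learned position embeddings indexed by each word's relative offset to the head and tail entities; the offset range and dimensions are illustrative:

```python
import torch
import torch.nn as nn

class PositionAwareEncoder(nn.Module):
    """Sketch of Eq. (13): concatenate each word vector with two position
    embeddings encoding its relative offset to the head and tail entity."""
    def __init__(self, word_dim=64, pos_dim=8, max_len=128):
        super().__init__()
        # offsets in [-max_len, max_len] are shifted to non-negative indices
        self.pos_emb = nn.Embedding(2 * max_len + 1, pos_dim)
        self.max_len = max_len

    def forward(self, words, head_pos, tail_pos):
        # words: (L, word_dim); head_pos / tail_pos: entity word indices
        t = torch.arange(words.size(0))
        p_head = self.pos_emb(t - head_pos + self.max_len)
        p_tail = self.pos_emb(t - tail_pos + self.max_len)
        return torch.cat([words, p_head, p_tail], dim=-1)  # Eq. (13)

enc = PositionAwareEncoder()
x = enc(torch.randn(10, 64), head_pos=2, tail_pos=7)
print(x.shape)  # torch.Size([10, 80])
```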
And 343, inputting the first coding vector and the first entity attribute context vector into a propagation module to obtain a layered representation vector of the first entity.
With continued reference to fig. 2, the encoding result corresponding to the first entity output by the text encoder 251 and the first entity attribute context vector of the first entity obtained by the entity attribute context module 230 are input into the propagation layer 252 to obtain a hierarchical representation vector of the first entity.
In some embodiments, the propagation module (i.e., propagation layer) comprises a multi-layer structure. For each layer (e.g., the nth layer) of the propagation module, a hierarchical representation vector corresponding to that layer may be obtained according to steps 3431 and 3432 as follows. Wherein n is an integer greater than or equal to 0.
3431 the nth layer in the propagation module obtains a first transfer matrix of the nth layer from the first encoding vector.
Specifically, the nth layer may input the first encoded vector into a bidirectional LSTM network and then generate the first transfer matrix via a fully connected network (e.g., an MLP). Illustratively, for each layer (e.g., the nth layer), the transfer matrix $G^{(n)}$ can be learned using the following formula:
$$G^{(n)} = \Big[\mathrm{MLP}\big(\mathrm{BiLSTM}(\{\mathbf{x}_t\}_{t=1}^{L})\big)\Big] \qquad (14)$$
where $[\cdot]$ denotes reshaping the vector into a matrix, $\mathrm{BiLSTM}$ is a layer of bidirectional LSTM, $t$ is the index of a word in the text, and $L$ is the length of the sentence.
3432, the nth layer in the propagation module obtains a representation vector of the first entity in the (n+1) th layer according to the first transfer matrix and the representation vector of the neighborhood entity of the first entity in the nth layer; wherein the neighborhood node of the first entity comprises the second entity; the representation vector of the first entity at layer 0 is a first entity attribute context vector.
Specifically, the nth layer may apply a nonlinear transformation to the product of the first transfer matrix $G^{(n)}$ and the representation vectors at the nth layer of the entities (e.g., $e_j$) in the neighborhood $\mathcal{N}(e_i)$ of the first entity $e_i$, to obtain the representation vector $\mathbf{v}_{e_i}^{(n+1)}$ of the first entity at the (n+1)th layer. Illustratively, $\mathbf{v}_{e_i}^{(n+1)}$ can be obtained according to the following formula:
$$\mathbf{v}_{e_i}^{(n+1)} = \sigma\Big(\sum_{e_j \in \mathcal{N}(e_i)} G^{(n)}\, \mathbf{v}_{e_j}^{(n)}\Big) \qquad (15)$$
in some embodiments, when n is 0, the representation vector of the entity is the entity attribute context vector corresponding to the entity obtained in step 320. For example, the expression vector of the first entity or the second entity at layer 0 is calculated according to formula (1).
344, inputting the second encoding vector and the second entity attribute context vector into the propagation module to obtain a layered representation vector of the second entity.
With continued reference to fig. 2, the second encoding vector output by the text encoder 251 and the second entity attribute context vector of the second entity obtained by the entity attribute context module 230 are input into the propagation layer 252 to obtain the layered representation vector of the second entity.
In some embodiments, the layered representation vector of the second entity at each layer may be derived according to steps 3441 and 3442 as follows.
3441, the nth layer in the propagation module obtains a second transfer matrix of the nth layer according to the second encoding vector.
3442, the nth layer in the propagation module obtains the representation vector of the second entity at the (n+1)th layer according to the second transfer matrix and the representation vector of the neighborhood entity of the second entity at the nth layer; the neighborhood node of the second entity comprises the first entity; wherein the representation vector of the second entity at layer 0 is the second entity attribute context vector.
Specifically, the calculation process of steps 3441 and 3442 is similar to that of steps 3431 and 3432, and reference is made to the above related description, and the detailed description is omitted here.
345, obtaining the connection relation between the first entity and the second entity according to the layered representation vector of the first entity and the layered representation vector of the second entity.
For example, the vectors learned at each layer of the propagation module may be connected together to determine and output the connection relationship of the first entity and the second entity. Specifically, each layer may multiply the layered representation vector of the first entity and the layered representation vector of the second entity at that layer element by element, and the results obtained at the respective layers are spliced to obtain the connection relationship. Illustratively, the connection relationship may be as follows:

$\mathbf{r}_{i,j} = \Big[\big(\mathbf{h}_i^{(1)} \odot \mathbf{h}_j^{(1)}\big)^\top;\ \big(\mathbf{h}_i^{(2)} \odot \mathbf{h}_j^{(2)}\big)^\top;\ \ldots;\ \big(\mathbf{h}_i^{(K)} \odot \mathbf{h}_j^{(K)}\big)^\top\Big] \quad (16)$

wherein $\mathbf{r}_{i,j}$ represents the connection relationship between entity $e_i$ and entity $e_j$, $K$ represents the total number of layers of the propagation module, $\odot$ represents element-by-element multiplication, and $\top$ represents the transpose of a vector.
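For illustration only, formula (16) can be sketched as below; the helper name is hypothetical, and for one-dimensional tensors the transpose in formula (16) is implicit:

```python
import torch

def connection_relation(h_i_layers, h_j_layers):
    """Formula (16): element-by-element product of the two entities' layered
    representation vectors at each layer, spliced across the K layers.
    h_i_layers / h_j_layers: lists of K tensors of shape (hid_dim,)."""
    per_layer = [h_i * h_j for h_i, h_j in zip(h_i_layers, h_j_layers)]
    return torch.cat(per_layer)   # r_{i,j}, shape (K * hid_dim,)
```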
346, predicting the relation between the first entity and the second entity in the text according to the connection relation and the triplet context vector.
With continued reference to fig. 2, the connection relationship output by the propagation layer 252 and the triplet context vector output by the triplet context learner 240 may be spliced and then input into the classifier 253 to predict the relationship between the first entity and the second entity in the text, so as to obtain a relationship extraction result.
For the case where the entity vectors and the relation vector are learned in the same vector space, the entity vectors in the triplet context vector and the connection relationship can be spliced together and input into the classifier to obtain the probability of each relation between the first entity and the second entity. Illustratively, the first entity vector $\mathbf{x}_i$ and the second entity vector $\mathbf{x}_j$ in equation (6) may be connected with $\mathbf{r}_{i,j}$ in equation (16) and input to a classification layer (e.g., MLP) to obtain the probability of each relation, as shown in the following equation:

$P\big(r_{i,j} \mid e_i, e_j\big) = \mathrm{softmax}\Big(\mathrm{MLP}\big([\mathbf{x}_i;\ \mathbf{x}_j;\ \mathbf{r}_{i,j}]\big)\Big) \quad (17)$

wherein $P(r_{i,j} \mid e_i, e_j)$ is the probability that the head entity $e_i$ and the tail entity $e_j$ ($i$ and $j$ being indexes of words in the text) have the relation $r_{i,j}$.
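For illustration only, the same-space classifier of formula (17) can be sketched as follows; the layer sizes and names are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class SameSpaceClassifier(nn.Module):
    """Formula (17): splice the entity vectors with the connection relation
    r_{i,j} and score every candidate relation with an MLP + softmax."""
    def __init__(self, ent_dim, conn_dim, num_relations, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * ent_dim + conn_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_relations),
        )

    def forward(self, x_i, x_j, r_ij):
        logits = self.mlp(torch.cat([x_i, x_j, r_ij], dim=-1))
        return torch.softmax(logits, dim=-1)  # probability of each relation
```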
For the case where the entity vectors and the relation vector are learned in separate vector spaces, the entity vectors of the relation space may be used to find the probability that the triplet expresses a valid relation. Based on this, the entity vectors of the relation space in equation (9) above can be connected with $\mathbf{r}_{i,j}$ in equation (16) and non-linearly transformed, for example, as shown in the following equation (18):

$\mathbf{o}_{i,j} = \phi\big(\big[\mathbf{x}^{\mathrm{rel}}_i;\ \mathbf{x}^{\mathrm{rel}}_j;\ \mathbf{r}_{i,j}\big]\big) \quad (18)$

wherein $\mathbf{o}_{i,j}$ is the vector obtained by applying a non-linear function $\phi$ on the connected vector representation of the propagation layer and the triplet context vectors. According to the distance between $\mathbf{o}_{i,j}$ and the relation vector $\mathbf{x}_r$, the probability $P$ that the relation $r$ exists between the head entity $e_i$ and the tail entity $e_j$ can be obtained, as shown in the following formula:

$P\big(r \mid e_i, e_j, s, \mathcal{C}, \mathcal{KG}\big) = \sigma\big(-\big\|\mathbf{o}_{i,j} - \mathbf{x}_r\big\|\big) \quad (19)$

wherein $s$ is the sentence, $\mathcal{C}$ is the context, and $\mathcal{KG}$ is the knowledge network.
For equation (19) above, the computational cost of optimizing with a binary cross-entropy loss function and negative sampling of the invalid triplets is very high. Therefore, the triplet context of the entities $e_i$ and $e_j$ can instead be converted and compared with each relation for optimization. In particular, see the procedure of steps 3461 to 3463 below.
3461 determining at least one distance measure of the triplet context vector from the first entity vector, the relationship vector and the second entity vector.
Specifically, a distance metric $\mathbf{d}_r$ of the triplet context can be expressed as the following formula:

$\mathbf{d}_r = \mathbf{x}_i + \mathbf{x}_r - \mathbf{x}_j \quad (20)$

wherein one distance metric corresponds to the entity $e_i$ and one entity $e_j$ of its neighborhood; $\mathbf{x}_i$ and $\mathbf{x}_j$ are the first entity vector and the second entity vector, and $\mathbf{x}_r$ is the relation vector in the triplet context vector characterizing a relation $r$ between entity $e_i$ and entity $e_j$.
3462, determining a conversion vector based on the at least one distance metric.
Illustratively, the norms of the at least one distance metric may be concatenated to derive the conversion vector. For example, the conversion vector $\mathbf{t}_{i,j}$ of the entity pair $e_i$ and $e_j$ may be obtained according to the following formula:

$\mathbf{t}_{i,j} = \big[\|\mathbf{d}_{r_1}\|;\ \|\mathbf{d}_{r_2}\|;\ \ldots;\ \|\mathbf{d}_{r_P}\|\big] \quad (21)$

wherein the conversion vector $\mathbf{t}_{i,j}$ represents the proximity of the entity pair $e_i$ and $e_j$ to each of the $P$ relations, $P$ being the number of relations between the entity pair $e_i$ and $e_j$.
3463, predicting the relation between the first entity and the second entity in the text according to the connection relationship, the first entity vector, the second entity vector, and the conversion vector.
Specifically, the connection relationship $\mathbf{r}_{i,j}$, the first entity vector $\mathbf{x}^{\mathrm{rel}}_i$ and the second entity vector $\mathbf{x}^{\mathrm{rel}}_j$ in the relation space, and the conversion vector $\mathbf{t}_{i,j}$ can be spliced and input into the classification head to obtain the classification target relation, i.e., the relation between the first entity and the second entity in the text. Illustratively, the probability of each relation can be obtained according to the following formula:

$P\big(r \mid e_i, e_j, s, \mathcal{C}, \mathcal{KG}\big) = \mathrm{softmax}\Big(\mathrm{MLP}\big(\big[\mathbf{r}_{i,j};\ \mathbf{x}^{\mathrm{rel}}_i;\ \mathbf{x}^{\mathrm{rel}}_j;\ \mathbf{t}_{i,j}\big]\big)\Big) \quad (22)$

The final classifier outputs the relation between the two entities $e_i$ and $e_j$. For example, if there are 100 relation categories, the relation category with the largest probability value is taken as the final result of the relation extraction.
Therefore, the embodiment of the present application uses the entity attributes and the triplet context derived from the knowledge network as external features and performs relation extraction by combining these external features with the context of the text. Supplementing the context obtained from the text with knowledge from the knowledge network reduces the influence of noise in the text on the overall relation extraction performance, which is beneficial to improving the performance of the relation extraction task and the quality of relation extraction.
By way of example, the embodiment of the application can automatically identify the relations in a sentence and align them with the knowledge network, correlating the entity relations in the sentence with the entity relations in the knowledge network, and improving the quality of relation extraction by jointly learning from the sentence and the facts stored in the knowledge network. The facts existing in the knowledge network may include entity attributes (e.g., labels, aliases, descriptions, instances, etc.) and fact triplets. The embodiment of the application can significantly surpass the performance of existing relation extraction methods.
The specific embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments, and various simple modifications can be made to the technical solution of the present application within the scope of the technical concept of the present application, and all the simple modifications belong to the protection scope of the present application. For example, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described further. As another example, any combination of the various embodiments of the present application may be made without departing from the spirit of the present application, which should also be regarded as the disclosure of the present application.
It should be further understood that, in the various method embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present application. It is to be understood that the numbers may be interchanged where appropriate such that the described embodiments of the application may be practiced otherwise than as shown or described.
The method embodiments of the present application are described above in detail, and the apparatus embodiments of the present application are described below in detail with reference to fig. 7 to 8.
Fig. 7 is a schematic block diagram of an apparatus 10 for information acquisition according to an embodiment of the present application. As shown in fig. 7, the information acquisition apparatus 10 may include an acquisition unit 11, a first determination unit 12, a second determination unit 13, and a prediction unit 14.
An acquisition unit 11 for acquiring a text vector representation of a text;
a first determining unit 12, configured to determine a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network;
A second determining unit 13, configured to determine a triplet context vector of the first entity and the second entity in the knowledge network, where the triplet context vector includes a first entity vector, a relationship vector, and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and a triplet vector of a neighborhood entity of the first entity in the knowledge network; the neighborhood entity comprises the second entity;
a prediction unit 14, configured to predict a relationship between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector, and the triplet context vector.
In some embodiments, the second determining unit 13 is specifically configured to:
acquiring an initial triplet context vector, wherein the initial triplet context vector comprises an initial entity vector of the first entity, an initial entity vector of the second entity and an initial relation vector; the neighborhood entity comprises the second entity; the initial relationship vector characterizes an initial relationship of the first entity and the second entity;
Carrying out weighted aggregation on the initial triplet context vector by using a graph attention network to obtain a neighborhood weighted aggregation vector of the first entity;
and determining the triplet context vector according to the neighborhood weighted aggregation vector and the initial triplet context vector.
In some embodiments, the second determining unit 13 is specifically configured to:
acquiring the attention weight of the initial triplet context vector;
normalizing the weight of the initial triplet context vector to obtain the relative attention weight of the initial triplet context vector;
and carrying out weighted aggregation on the initial triplet context vector according to the relative attention weight to obtain the neighborhood weighted aggregation vector.
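For illustration only, the weighted aggregation performed by the second determining unit 13 can be sketched as follows; how the raw attention weights are produced is left out here, as it depends on the graph attention network's parameterization:

```python
import torch
import torch.nn.functional as F

def neighborhood_weighted_aggregation(triplet_vecs, attn_scores):
    """Normalize the attention weights of the initial triplet context vectors,
    then aggregate them into the neighborhood weighted aggregation vector.
    triplet_vecs: (N, d) initial triplet context vectors of the neighborhood
    attn_scores:  (N,)  unnormalized attention weights"""
    rel_weights = F.softmax(attn_scores, dim=0)               # relative attention
    return (rel_weights.unsqueeze(-1) * triplet_vecs).sum(0)  # weighted aggregate
```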
In some embodiments, the second determining unit 13 is specifically configured to:
obtaining the first entity vector according to the neighborhood weighted aggregate vector and the initial entity vector of the first entity;
performing linear transformation on the initial relation vector to obtain the relation vector;
obtaining the second entity vector according to the initial entity vector and the relation vector of the second entity;
and obtaining the triplet context vector according to the first entity vector, the relation vector and the second entity vector.
In some embodiments, the second determining unit 13 is further configured to:
converting the first entity vector from entity space to relation space through nonlinear transformation; and
the second entity vector is converted from entity space to relation space through nonlinear transformation.
In some embodiments, the apparatus 10 for information acquisition further comprises an updating unit for:
determining a first distance measure according to valid triplets corresponding to the first entity and the neighborhood entity, and determining a second distance measure according to invalid triplets corresponding to the first entity and the neighborhood entity, wherein a valid triplet comprises a valid relation between the first entity and the neighborhood entity, and an invalid triplet comprises an invalid relation between the first entity and the neighborhood entity;
determining a margin ranking loss according to the first distance measure and the second distance measure;
minimizing the margin ranking loss to update the parameters of the graph attention network.
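For illustration only, the margin ranking objective used by the updating unit can be sketched as follows; the margin value is an assumed hyperparameter, and the inputs are the triplet distance measures described above:

```python
import torch

def margin_ranking_loss(d_valid, d_invalid, margin=1.0):
    """Valid triplets should have a smaller distance measure than invalid
    ones by at least `margin`; minimizing this loss updates the graph
    attention network's parameters."""
    return torch.clamp(margin + d_valid - d_invalid, min=0).mean()
```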
In some embodiments, prediction unit 14 is further to:
acquiring a first text vector representation corresponding to the first entity and a second text vector representation corresponding to the second entity according to the text vector representation;
encoding the first text vector representation and the position information of the word corresponding to the first entity to obtain a first coding vector, and encoding the second text vector representation and the position information of the word corresponding to the second entity to obtain a second coding vector;
inputting the first coding vector and the first entity attribute context vector into a propagation module to obtain a layered representation vector of the first entity;
inputting the second coding vector and the second entity attribute context vector into the propagation module to obtain a layered representation vector of the second entity;
obtaining a connection relation between the first entity and the second entity according to the layered representation vector of the first entity and the layered representation vector of the second entity;
and predicting the relation between the first entity and the second entity in the text according to the connection relation and the triplet context vector.
In some embodiments, prediction unit 14 is specifically configured to:
an nth layer in the propagation module obtains a first transfer matrix of the nth layer according to the first coding vector; n is an integer greater than or equal to 0;
the nth layer obtains a representation vector of the first entity at an (n+1) th layer according to the first transfer matrix and the representation vector of the neighborhood entity of the first entity at the nth layer; wherein the neighborhood node of the first entity comprises the second entity; the representation vector of the first entity at the 0 th layer is the first entity attribute context;
The inputting the second encoding vector and the second entity attribute context into the propagation module to obtain a layered representation vector of the second entity includes:
an nth layer in the propagation module obtains a second transfer matrix of the nth layer according to the second coding vector;
the nth layer obtains a representation vector of the second entity at the (n+1) th layer according to the second transfer matrix and the representation vector of the neighborhood entity of the second entity at the nth layer; the neighborhood node of the second entity comprises the first entity; wherein the representation vector of the second entity at layer 0 is the second entity attribute context.
In some embodiments, prediction unit 14 is specifically configured to:
determining at least one distance metric for a triplet context vector from the first entity vector, the relationship vector, and the second entity vector in the triplet context vector;
determining a conversion vector according to the at least one distance metric;
and predicting the relation between the first entity and the second entity in the text according to the connection relation, the first entity vector, the second entity vector and the conversion vector.
In some embodiments, the first determining unit 12 is specifically configured to:
connecting word vectors and character vectors of each first entity attribute of the first entity to input into an encoder to obtain a coding vector of each first entity attribute; the coded vector of each first entity attribute is spliced and then input into a convolutional neural network, so that the context vector of the first entity attribute is obtained;
connecting word vectors and character vectors of each second entity attribute of the second entity to input into an encoder to obtain a coding vector of each second entity attribute; and splicing the coded vectors of each second entity attribute, and inputting the spliced coded vectors into a convolutional neural network to obtain the context vector of the second entity attribute.
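For illustration only, the attribute encoding performed by the first determining unit can be sketched as follows. The GRU encoder and the max-pooling over attributes are assumptions of this sketch, since the disclosure only fixes an "encoder" and a "convolutional neural network":

```python
import torch
import torch.nn as nn

class AttributeContextEncoder(nn.Module):
    """Encode each entity attribute (label, alias, description, type) from
    its connected word and character vectors, splice the attribute encodings,
    and apply a CNN to obtain the entity attribute context vector."""
    def __init__(self, word_dim, char_dim, enc_dim, ctx_dim, kernel=3):
        super().__init__()
        self.encoder = nn.GRU(word_dim + char_dim, enc_dim, batch_first=True)
        self.cnn = nn.Conv1d(enc_dim, ctx_dim, kernel_size=kernel, padding=1)

    def forward(self, attributes):
        # attributes: list of (seq_len, word_dim + char_dim) tensors, each the
        # connected word/character vectors of one attribute
        encs = [self.encoder(a.unsqueeze(0))[1].squeeze(0).squeeze(0)
                for a in attributes]
        spliced = torch.stack(encs, dim=-1).unsqueeze(0)  # (1, enc_dim, n_attrs)
        conv = self.cnn(spliced)                          # (1, ctx_dim, n_attrs)
        return conv.max(dim=-1).values.squeeze(0)         # context vector
```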
In some embodiments, the entity attributes include at least one of an entity tag, an entity alias, an entity description, and an entity type.
It should be understood that apparatus embodiments and method embodiments may correspond with each other and that similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the apparatus 10 for information acquisition shown in fig. 7 may perform the above method embodiments, and the operations and/or functions of each module in the apparatus 10 for information acquisition are respectively for implementing the corresponding flow in the method 300 for information acquisition, which is not described herein for brevity.
The apparatus of the embodiments of the present application is described above in terms of functional modules with reference to the accompanying drawings. It should be understood that the functional module may be implemented in hardware, or may be implemented by instructions in software, or may be implemented by a combination of hardware and software modules. Specifically, each step of the method embodiment in the embodiment of the present application may be implemented by an integrated logic circuit of hardware in a processor and/or an instruction in a software form, and the steps of the method disclosed in connection with the embodiment of the present application may be directly implemented as a hardware decoding processor or implemented by a combination of hardware and software modules in the decoding processor. Alternatively, the software modules may be located in a well-established storage medium in the art such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and the like. The storage medium is located in a memory, and the processor reads information in the memory, and in combination with hardware, performs the steps in the above method embodiments.
Fig. 8 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
As shown in fig. 8, the electronic device 30 may include:
a memory 31 and a processor 32, the memory 31 being for storing a computer program and for transmitting the program code to the processor 32. In other words, the processor 32 may call and run a computer program from the memory 31 to implement the method in the embodiment of the present application.
For example, the processor 32 may be configured to perform the above-described method embodiments according to instructions in the computer program.
In some embodiments of the present application, the processor 32 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 31 includes, but is not limited to:
volatile memory and/or nonvolatile memory. The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable EPROM (EEPROM), or a flash Memory. The volatile memory may be random access memory (Random Access Memory, RAM) which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (Double Data Rate SDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), and Direct memory bus RAM (DR RAM).
In some embodiments of the present application, the computer program may be divided into one or more modules, which are stored in the memory 31 and executed by the processor 32 to perform the methods provided by the present application. The one or more modules may be a series of computer program instruction segments capable of performing the specified functions, which are used to describe the execution of the computer program in the electronic device.
As shown in fig. 8, the electronic device 30 may further include:
a transceiver 33, the transceiver 33 being connectable to the processor 32 or the memory 31.
The processor 32 may control the transceiver 33 to communicate with other devices, and in particular, may send information or data to other devices or receive information or data sent by other devices. The transceiver 33 may include a transmitter and a receiver. The transceiver 33 may further include antennas, the number of which may be one or more.
It will be appreciated that the various components in the electronic device are connected by a bus system that includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
The present application also provides a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments. Alternatively, embodiments of the present application also provide a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of the method embodiments described above.
When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the application, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line (digital subscriber line, DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (digital video disc, DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
It will be appreciated that in the specific implementation of the present application, when the above embodiments of the present application are applied to specific products or technologies and relate to data related to user information and the like, user permission or consent needs to be obtained, and the collection, use and processing of the related data needs to comply with the relevant laws and regulations and standards.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, functional modules in various embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily appreciate variations or alternatives within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (15)

1. A method of information acquisition, comprising:
acquiring a text vector representation of a text;
determining a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network;
Determining a triplet context vector of the first entity and the second entity in the knowledge network, wherein the triplet context vector comprises a first entity vector, a relationship vector and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and a triplet vector of a neighborhood entity of the first entity in the knowledge network; the neighborhood entity comprises the second entity;
and predicting the relation between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector and the triplet context vector.
2. The method of information acquisition according to claim 1, wherein said determining a triplet context vector of the first entity and the second entity in the knowledge network comprises:
acquiring an initial triplet context vector, wherein the initial triplet context vector comprises an initial entity vector of the first entity, an initial entity vector of the second entity and an initial relation vector; the initial relationship vector characterizes an initial relationship of the first entity and the second entity;
Carrying out weighted aggregation on the initial triplet context vector by using a graph attention network to obtain a neighborhood weighted aggregation vector of the first entity;
and determining the triplet context vector according to the neighborhood weighted aggregation vector and the initial triplet context vector.
3. The method of information acquisition according to claim 2, wherein the weighting and aggregating the initial triplet context vector with the graph attention network to obtain a neighborhood weighted aggregate vector for the first entity comprises:
acquiring the attention weight of the initial triplet context vector;
normalizing the weight of the initial triplet context vector to obtain the relative attention weight of the initial triplet context vector;
and carrying out weighted aggregation on the initial triplet context vector according to the relative attention weight to obtain the neighborhood weighted aggregation vector.
4. The method of information acquisition according to claim 2, wherein said determining the triplet context vector from the neighborhood weighted aggregation vector and the initial triplet context vector comprises:
obtaining the first entity vector according to the neighborhood weighted aggregate vector and the initial entity vector of the first entity;
Performing linear transformation on the initial relation vector to obtain the relation vector;
obtaining the second entity vector according to the initial entity vector and the relation vector of the second entity;
and obtaining the triplet context vector according to the first entity vector, the relation vector and the second entity vector.
5. The method of information acquisition according to claim 4, further comprising:
converting the first entity vector from entity space to relation space through nonlinear transformation; and
the second entity vector is converted from entity space to relation space through nonlinear transformation.
6. The method of information acquisition according to claim 4, further comprising:
determining a first distance measure according to valid triplets corresponding to the first entity and the neighborhood entity, and determining a second distance measure according to invalid triplets corresponding to the first entity and the neighborhood entity, wherein a valid triplet comprises a valid relation between the first entity and the neighborhood entity, and an invalid triplet comprises an invalid relation between the first entity and the neighborhood entity;
determining a margin ranking loss according to the first distance measure and the second distance measure;
minimizing the margin ranking loss to update parameters of the graph attention network.
7. The method of information acquisition according to claim 1, wherein predicting the relationship of the first entity and the second entity in the text from the text vector representation, the first entity attribute context vector, the second entity attribute context vector, and the triplet context vector comprises:
acquiring a first text vector representation corresponding to the first entity and a second text vector representation corresponding to the second entity according to the text vector representation;
encoding the first text vector representation and the position information of the word corresponding to the first entity to obtain a first coding vector, and encoding the second text vector representation and the position information of the word corresponding to the second entity to obtain a second coding vector;
inputting the first coding vector and the first entity attribute context vector into a propagation module to obtain a layered representation vector of the first entity;
Inputting the second coding vector and the second entity attribute context vector into the propagation module to obtain a layered representation vector of the second entity;
obtaining a connection relation between the first entity and the second entity according to the layered representation vector of the first entity and the layered representation vector of the second entity;
and predicting the relation between the first entity and the second entity in the text according to the connection relation and the triplet context vector.
8. The method of claim 7, wherein inputting the first coding vector and the first entity attribute context vector into a propagation module to obtain a layered representation vector of the first entity comprises:
an nth layer in the propagation module obtains a first transfer matrix of the nth layer according to the first coding vector; n is an integer greater than or equal to 0;
the nth layer obtains a representation vector of the first entity at an (n+1) th layer according to the first transfer matrix and the representation vector of the neighborhood entity of the first entity at the nth layer; wherein the neighborhood node of the first entity comprises the second entity; the representation vector of the first entity at the 0 th layer is the first entity attribute context;
The inputting the second encoding vector and the second entity attribute context into the propagation module to obtain a layered representation vector of the second entity includes:
an nth layer in the propagation module obtains a second transfer matrix of the nth layer according to the second coding vector;
the nth layer obtains a representation vector of the second entity at the (n+1) th layer according to the second transfer matrix and the representation vector of the neighborhood entity of the second entity at the nth layer; the neighborhood node of the second entity comprises the first entity; wherein the representation vector of the second entity at layer 0 is the second entity attribute context.
9. The method of information acquisition according to claim 7, wherein predicting the relationship of the first entity and the second entity in the text based on the connection relationship and the triplet context vector comprises:
determining at least one distance metric for a triplet context vector from the first entity vector, the relationship vector, and the second entity vector in the triplet context vector;
determining a conversion vector according to the at least one distance metric;
And predicting the relation between the first entity and the second entity in the text according to the connection relation, the first entity vector, the second entity vector and the conversion vector.
10. The method of claim 1, wherein determining a first entity attribute context vector for the first entity and a second entity attribute context vector for the second entity based on entity attributes of the first entity and the second entity in the text in a knowledge network comprises:
connecting word vectors and character vectors of each first entity attribute of the first entity to input into an encoder to obtain a coding vector of each first entity attribute; the coded vector of each first entity attribute is spliced and then input into a convolutional neural network, so that the context vector of the first entity attribute is obtained;
connecting word vectors and character vectors of each second entity attribute of the second entity to input into an encoder to obtain a coding vector of each second entity attribute; and splicing the coded vectors of each second entity attribute, and inputting the spliced coded vectors into a convolutional neural network to obtain the context vector of the second entity attribute.
11. The method of information acquisition according to any one of claims 1-10, wherein the entity attributes include at least one of an entity tag, an entity alias, an entity description, and an entity type.
12. An apparatus for information acquisition, comprising:
an acquisition unit configured to acquire a text vector representation of a text;
a first determining unit, configured to determine a first entity attribute context vector of a first entity and a second entity attribute context vector of a second entity according to entity attributes of the first entity and the second entity in the text in a knowledge network;
a second determining unit, configured to determine a triplet context vector of the first entity and the second entity in the knowledge network, where the triplet context vector includes a first entity vector, a relationship vector, and a second entity vector; the sum of the first entity vector and the relationship vector is equal to the second entity vector; the first entity vector is obtained by weighting and aggregating the first entity and a triplet vector of a neighborhood entity of the first entity in the knowledge network; the neighborhood entity comprises the second entity;
And the predicting unit is used for predicting the relation between the first entity and the second entity in the text according to the text vector representation, the first entity attribute context vector, the second entity attribute context vector and the triplet context vector.
13. An electronic device comprising a processor and a memory, the memory having instructions stored therein that when executed by the processor cause the processor to perform the method of information retrieval according to any one of claims 1-11.
14. A computer storage medium, characterized by storing a computer program which, when executed by a computer, causes the computer to perform the method of information acquisition according to any one of claims 1-11.
15. A computer program product comprising computer program code which, when run by an electronic device, causes the electronic device to perform the method of information acquisition as claimed in any one of claims 1-11.
