CN110674637A - Character relation recognition model training method, device, equipment and medium - Google Patents


Publication number
CN110674637A
Authority
CN
China
Prior art keywords
relationship
character
attribute
person
corpus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910839474.8A
Other languages
Chinese (zh)
Other versions
CN110674637B (en)
Inventor
王安然
徐程程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910839474.8A
Publication of CN110674637A
Application granted
Publication of CN110674637B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The invention discloses a character relationship recognition model training method, device and medium. The method comprises: acquiring a character relationship triple set; acquiring a first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set; acquiring a reversible character relationship triple corresponding to each character relationship triple in the character relationship triple set, and acquiring a second equivalent associated attribute table corresponding to the reversible character relationship triple; accessing a corpus, and labeling a positive sample corpus and a negative sample corpus in the corpus according to each character relationship triple, its first equivalent associated attribute table, its reversible character relationship triple and the corresponding second equivalent associated attribute table; and training a preset machine learning model on the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model. The invention can reduce the under-recall of positive samples and the mislabeling of positive samples as negative samples.

Description

Character relation recognition model training method, device, equipment and medium
Technical Field
The invention relates to the field of machine learning, and in particular to a character relationship recognition model training method, device, equipment and medium.
Background
Named entity recognition in natural language processing is an important basic tool in application fields such as information extraction, question-answering systems, syntactic analysis and machine translation, and plays an important role in bringing natural language processing technology into practical use. Character relationship recognition is one of the important parts of named entity recognition. In the prior art, the character relationship expressed in a corpus is usually identified by training a character relationship recognition model, and the quality of the training corpus has an important influence on the quality of that model.
In the prior art, the training corpus for the character relationship recognition model is usually obtained with a simple distant supervision back-labeling strategy. This strategy does not fully consider the reversibility of character relationships, so positive samples may be under-recalled or mistakenly labeled as negative samples, which reduces the quality of the training corpus.
Disclosure of Invention
In order to solve the technical problem in the prior art that under-recall of the corpus and the mislabeling of positive samples as negative samples reduce the quality of the character relationship recognition model, embodiments of the present invention provide a character relationship recognition model training method, device, equipment and medium.
In one aspect, the present invention provides a character relationship recognition model training method, including:
acquiring a character relationship triple set, wherein each character relationship triple in the character relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity;
acquiring a first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set, wherein the equivalent associated attributes in the first equivalent associated attribute table have the same meaning as the attribute in the character relationship triple;
acquiring a reversible character relationship triple corresponding to each character relationship triple in the character relationship triple set, and acquiring a second equivalent associated attribute table corresponding to the reversible character relationship triple, wherein the equivalent associated attributes in the second equivalent associated attribute table have the same meaning as the attribute in the reversible character relationship triple;
accessing a corpus, and labeling a positive sample corpus and a negative sample corpus in the corpus according to each character relationship triple, the first equivalent associated attribute table corresponding to the character relationship triple, the reversible character relationship triple corresponding to the character relationship triple and the second equivalent associated attribute table corresponding to the reversible character relationship triple;
and training a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
In another aspect, the present invention provides a character relationship recognition model training apparatus, including:
a character relationship triple set acquisition module, configured to acquire a character relationship triple set, wherein each character relationship triple in the character relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity;
a first equivalent associated attribute table acquisition module, configured to acquire a first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set, wherein the equivalent associated attributes in the first equivalent associated attribute table have the same meaning as the attribute in the character relationship triple;
a reversible content acquisition module, configured to acquire a reversible character relationship triple corresponding to each character relationship triple in the character relationship triple set and to acquire a second equivalent associated attribute table corresponding to the reversible character relationship triple, wherein the equivalent associated attributes in the second equivalent associated attribute table have the same meaning as the attribute in the reversible character relationship triple;
a sample corpus acquisition module, configured to access a corpus and label a positive sample corpus and a negative sample corpus in the corpus according to each character relationship triple, the first equivalent associated attribute table corresponding to the character relationship triple, the reversible character relationship triple corresponding to the character relationship triple and the second equivalent associated attribute table corresponding to the reversible character relationship triple;
and a training module, configured to train a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
In another aspect, the present invention provides character relationship recognition model training equipment, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the character relationship recognition model training method.
In another aspect, the present invention provides a computer storage medium, wherein at least one instruction, at least one program, a code set, or a set of instructions is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the character relationship recognition model training method.
The invention provides a character relationship recognition model training method, device, equipment and medium. The method expands each existing character relationship triple to obtain a first equivalent associated attribute table, a reversible character relationship triple and a second equivalent associated attribute table, and uses the character relationship triple, the reversible character relationship triple, the first equivalent associated attribute table and the second equivalent associated attribute table as the basis for back-labeling when extracting the positive sample corpus and the negative sample corpus from the existing corpus. This reduces the under-recall of positive samples and the mislabeling of positive samples as negative samples that arise in the prior art, where back-labeling relies only on the existing character relationship triples. Furthermore, embodiments of the invention improve the diversity of the negative samples by constructing two different kinds of negative samples, thereby comprehensively improving the quality of the positive and negative samples and, in turn, the accuracy of the character relationship recognition model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an implementation environment of a character relationship recognition model training method provided by the present invention;
FIG. 2 is a flowchart of a character relationship recognition model training method provided by the present invention;
FIG. 3 is a flowchart of acquiring a character relationship triple set provided by the present invention;
FIG. 4 is a flowchart of expanding the character relationship triples in the character relationship triple set to obtain the first equivalent associated attribute table corresponding to each character relationship triple, provided by the present invention;
FIG. 5 is a flowchart of another way of expanding the character relationship triples in the character relationship triple set to obtain the first equivalent associated attribute table corresponding to each character relationship triple, provided by the present invention;
FIG. 6 is a schematic performance diagram of a character relationship recognition model trained on a corpus constructed without expanding the reversible character relationship triples, provided by the present invention;
FIG. 7 is a schematic performance diagram of a character relationship recognition model trained on samples for which the character relationship triples and the corpus corresponding to the reversible character relationship triples are not used jointly;
FIG. 8 is a block diagram of a character relationship recognition model training apparatus provided by the present invention;
FIG. 9 is a block diagram of a character relationship triple set acquisition module provided by the present invention;
FIG. 10 is a hardware structural diagram of equipment for implementing the method provided by the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In order to make the objects, technical solutions and advantages disclosed in the embodiments of the present invention more clearly apparent, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the embodiments of the invention and are not intended to limit the embodiments of the invention.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified. In order to facilitate understanding of the technical solutions and the technical effects thereof described in the embodiments of the present invention, the embodiments of the present invention first explain related terms:
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field therefore involves natural language, that is, the language people use every day, and is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and the like.
Machine Learning (ML) is a multi-disciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specifically studies how a computer can simulate or implement human learning behavior so as to acquire new knowledge or skills and reorganize its existing knowledge structure to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to give computers intelligence, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Knowledge graph: a knowledge graph is essentially a semantic network. The network contains many nodes, which can be called entities, such as "Liu Dehua", "Jinji Lake", "Beijing", and the like. The edges leaving an entity represent associations between that entity and other related entities. For example, Liu Dehua's wife is Zhu Liqian. In the knowledge graph, Liu Dehua and Zhu Liqian are both entities, an edge exists between the two entities, the edge represents a spousal relationship, and the name of the edge can be "wife".
Triple: the triple is the smallest structural unit in the knowledge graph and has the form (subject, attribute, object). Examples are the triples (Liu Dehua, wife, Zhu Liqian) and (Liu Dehua, birthplace, Hong Kong), in which "wife" and "birthplace" are attributes, the word before the attribute is the subject, and the word after it is the object. The object may also be called the attribute value of the subject; for example, Zhu Liqian is the attribute value of the "wife" attribute of the subject Liu Dehua. In a scenario representing character relationships, the specific form of the triple is (character 1, relationship, character 2), where character 1 and character 2 both correspond to entities, the subject corresponds to the head entity and the object corresponds to the tail entity.
In order to train the character relationship recognition model, a large training corpus needs to be constructed. Obtaining the training corpus by manual labeling is time-consuming and labor-intensive, so in the prior art the training corpus for the character relationship recognition model is generally obtained from triples with a simple distant supervision back-labeling strategy. The basic idea of the distant supervision back-labeling strategy is to match known triple data in a knowledge graph against corpus entries: if a corpus entry contains both the subject and the object of a triple, the corpus entry and the triple form a training sample, and the set of all training samples is the training corpus. In the prior art, a character relationship recognition model is trained on the training corpus obtained with the distant supervision back-labeling strategy, so that the model can recognize the character relationships expressed in unseen corpus entries.
For example, (Liu Dehua, wife, Zhu Liqian) is a triple in the knowledge graph. Using this triple, three corpus entries can be matched: "Liu Dehua's wife is Zhu Liqian", "Liu Dehua and his wife Zhu Liqian returned to Hong Kong on the 8th" and "Liu Dehua and Zhu Liqian returned to Hong Kong together". The three corpus entries can each serve as back-labeling data for the triple, and three training samples can be constructed with the triple. Specifically, if a corpus entry hits all three elements of the triple, it is back-labeled as a positive sample. For example, "Liu Dehua and his wife Zhu Liqian returned to Hong Kong on the 8th" is back-labeled as a positive sample, while "Liu Dehua and Zhu Liqian returned to Hong Kong together" is not. Furthermore, the attribute among the three elements can be expanded to find its equivalent attributes, and a corpus entry that simultaneously hits the subject, the object and an equivalent attribute is also back-labeled as a positive sample. The equivalent attributes of "wife" include other words for wife as well as "spouse", so "Liu Dehua's spouse is Zhu Liqian" can also be back-labeled as a positive sample.
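The prior-art back-labeling rule described above can be sketched roughly as follows. This is a minimal illustration rather than the patent's implementation: the `Triple` type, the function name `back_label`, the use of plain substring matching in place of real entity matching, and the English sample sentences are all assumptions made for the example.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    head: str       # subject (head entity)
    attribute: str  # relationship between head and tail
    tail: str       # object (tail entity)


def back_label(sentence: str, triple: Triple, equivalents: Iterable[str] = ()) -> bool:
    """Return True if the sentence would be back-labeled as a positive sample.

    Assumption: substring matching stands in for real entity matching.
    """
    # Both entities must appear in the sentence.
    if triple.head not in sentence or triple.tail not in sentence:
        return False
    # The attribute itself or one of its equivalent attributes must appear.
    attributes = [triple.attribute, *equivalents]
    return any(attribute in sentence for attribute in attributes)


triple = Triple("Liu Dehua", "wife", "Zhu Liqian")
sentences = [
    "Liu Dehua and his wife Zhu Liqian returned to Hong Kong on the 8th",
    "Liu Dehua and Zhu Liqian returned to Hong Kong together",
    "Liu Dehua's spouse is Zhu Liqian",
]
labels = [back_label(s, triple, equivalents=["spouse"]) for s in sentences]
```

Under these assumptions only the first and third sentences are labeled positive; the second contains both entities but no attribute.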
The embodiment of the invention takes into account that character relationships are reversible. Reversibility means that for any triple (A, relationship, B) representing a character relationship, there must exist another triple (B, inverse relationship, A) representing a character relationship. Although the distant supervision back-labeling strategy in the prior art can improve the quality of the training corpus, it does not consider the reversibility of character relationships, so corpus entries that express a character relationship cannot be recalled well and the quality of the training corpus is reduced. For example, for the character relationship triple (Liu Dehua, wife, Zhu Liqian), the reversible character relationship triple is (Zhu Liqian, husband, Liu Dehua), and the equivalent attributes of "wife" include other words for wife as well as "spouse". During back-labeling, a corpus entry is labeled as a positive sample only when "Liu Dehua", "Zhu Liqian" and either "wife" or one of its equivalent attributes appear in it at the same time. A sentence such as "Zhu Liqian's husband is Liu Dehua" is therefore not back-labeled as a positive sample, although it corresponds to the reversible character relationship triple and should in fact be labeled as a positive sample. Moreover, during distant back-labeling, sentences that contain the head entity and the tail entity but not the attribute are randomly drawn as negative samples, so a sentence such as "Liu Dehua is a good husband to Zhu Liqian" is even likely to receive a negative sample label.
Based on the above analysis, the following problems may occur when the training corpus is obtained with the distant supervision back-labeling strategy of the prior art:
(1) Positive samples cannot be recalled with high quality.
(2) Positive samples are mistakenly labeled as negative samples.
In view of this, embodiments of the present invention provide a method for training a character relationship recognition model, which can construct a corpus from two aspects of a character relationship triplet and a reversible character relationship triplet corresponding to the character relationship triplet on the basis of fully considering the reversibility of a character relationship, thereby reducing the probability that a positive sample in the corpus is incorrectly labeled as a negative sample, increasing the recall rate of the positive sample, achieving the purpose of increasing the quality of the corpus, and further training a high-quality character relationship recognition model based on the high-quality corpus.
First, the embodiment of the present invention discloses an implementation environment for a character relationship recognition model training method in a possible embodiment.
Referring to fig. 1, the implementation environment includes: at least one client 01 and a server 03.
The client 01 may include a physical device, and may also include software running on the physical device, such as an application with a character relationship recognition function or an application providing other services based on character relationship recognition. The application may cover application fields such as information extraction, question-answering systems, syntactic analysis and machine translation. The client 01 may be communicatively connected to the server 03 in Browser/Server (B/S) mode or Client/Server (C/S) mode.
The client 01 may generate corpus entries and transmit them to the server 03 so that the server 03 updates the corpus with them.
The server 03 can acquire a character relationship triple set and access the corpus to obtain a large number of corpus entries; back-label the corpus entries based on each character relationship triple in the character relationship triple set and the reversible character relationship triple corresponding to that character relationship triple; construct a training corpus from the back-labeling results; and train a preset machine learning model on the training corpus to obtain a character relationship recognition model.
The server 03 may also provide the client 01 with a character relationship recognition service, and other services related to it, based on the character relationship recognition model. While providing these services, the server 03 may further collect newly recognized character relationships and use them to supplement and update the character relationship triple set. In addition, more training corpus entries can be back-labeled based on the newly recognized character relationships, and the newly labeled training corpus can be fed to the character relationship recognition model to optimize it.
The server 03 may comprise an independently operating server, or a distributed server, or a server cluster composed of a plurality of servers.
Referring to fig. 2, a flowchart of a method for training a character relationship recognition model, which may be implemented by a server in the implementation environment of fig. 1 as an execution subject, is shown, where the method may include:
S101, acquiring a character relationship triple set, wherein each character relationship triple in the character relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity.
Specifically, acquiring the character relationship triple set, as shown in FIG. 3, may include:
S1011, accessing the knowledge graph and extracting the entities whose type is person to obtain a target head entity set.
S1013, traversing the attribute list corresponding to each target head entity in the target head entity set, and extracting the attributes in the attribute list that represent character relationships to obtain the target attribute set corresponding to that target head entity.
S1015, for each target head entity and each target attribute in the target attribute set corresponding to that target head entity, obtaining the target tail entity that has the target attribute relationship with the target head entity.
S1017, constructing a character relationship triple from each target head entity, target attribute and target tail entity that correspond to one another, so as to obtain the character relationship triple set.
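Steps S1011 to S1017 can be sketched as below, assuming a toy in-memory knowledge graph. The dictionary layout of `KNOWLEDGE_GRAPH`, the predefined set `PERSON_RELATION_ATTRIBUTES` and the function name `build_triple_set` are hypothetical; the source does not specify how the knowledge graph is stored or how relationship attributes are recognized.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy knowledge graph: entity name -> entity type and attribute list.
KNOWLEDGE_GRAPH: Dict[str, dict] = {
    "Liu Dehua": {
        "type": "person",
        "attributes": {"wife": ["Zhu Liqian"], "birthplace": ["Hong Kong"]},
    },
    "Zhu Liqian": {"type": "person", "attributes": {"husband": ["Liu Dehua"]}},
    "Hong Kong": {"type": "place", "attributes": {}},
}

# Assumption: attributes that express a relationship between people are known in advance.
PERSON_RELATION_ATTRIBUTES: Set[str] = {"wife", "husband"}


def build_triple_set(graph: Dict[str, dict]) -> List[Tuple[str, str, str]]:
    triples = []
    # S1011: keep only entities whose type is person (target head entities).
    heads = [name for name, node in graph.items() if node["type"] == "person"]
    for head in heads:
        # S1013: keep only the attributes that represent character relationships.
        target_attributes = [
            attribute
            for attribute in graph[head]["attributes"]
            if attribute in PERSON_RELATION_ATTRIBUTES
        ]
        for attribute in target_attributes:
            # S1015: look up the tail entities linked to the head by this attribute.
            for tail in graph[head]["attributes"][attribute]:
                # S1017: assemble the (head, attribute, tail) triple.
                triples.append((head, attribute, tail))
    return triples


triples = build_triple_set(KNOWLEDGE_GRAPH)
```

In this sketch the "birthplace" attribute is dropped at S1013 because it is not in the assumed set of relationship attributes.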
S103, acquiring a first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set, wherein the equivalent associated attributes in the first equivalent associated attribute table have the same meaning as the attribute in the character relationship triple.
In a possible embodiment, acquiring the first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set, as shown in FIG. 4, may include:
S1031, extracting the attribute in the character relationship triple that represents the relationship between the head entity and the tail entity.
S1033, obtaining at least one equivalent keyword of that attribute, and constructing the first equivalent associated attribute table from the equivalent keywords.
In this embodiment, the first equivalent associated attribute table is constructed by synonym expansion. An equivalent keyword is a synonym of the attribute representing the relationship between the head entity and the tail entity; specifically, equivalent keywords can be obtained by crawling word-explanation websites such as "Baidu Chinese" and "Google Translate".
For example, if the attribute in the character relationship triple that represents the relationship between the head entity and the tail entity is "wife", the equivalent keywords may be other words for wife, "lover", and so on.
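The synonym-expansion embodiment (S1031 and S1033) might look like the following sketch. The crawling step is not reproduced here: the `SYNONYMS` dictionary is a hypothetical stand-in for whatever a word-explanation website would return, and the function name `first_equivalent_table` is likewise an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical synonym dictionary standing in for crawled word-explanation data.
SYNONYMS: Dict[str, List[str]] = {
    "wife": ["spouse", "lover"],
    "husband": ["spouse"],
}


def first_equivalent_table(
    triple: Tuple[str, str, str], synonyms: Dict[str, List[str]] = SYNONYMS
) -> List[str]:
    # S1031: extract the attribute that relates the head entity to the tail entity.
    _, attribute, _ = triple
    # S1033: collect its equivalent keywords, dropping duplicates while keeping order.
    return list(dict.fromkeys(synonyms.get(attribute, [])))


table = first_equivalent_table(("Liu Dehua", "wife", "Zhu Liqian"))
```

An attribute with no known synonyms simply yields an empty table in this sketch.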
In another possible embodiment, acquiring the first equivalent associated attribute table corresponding to each character relationship triple in the character relationship triple set, as shown in FIG. 5, may include:
S1032, obtaining a first word vector corresponding to the attribute in the character relationship triple that represents the relationship between the head entity and the tail entity.
S1034, acquiring a preset attribute set, and calculating a second word vector corresponding to each attribute in the preset attribute set.
S1036, extracting at least one equivalent associated attribute from the preset attribute set, wherein the cosine similarity between the second word vector corresponding to the equivalent associated attribute and the first word vector is greater than a preset threshold.
S1038, obtaining the first equivalent associated attribute table based on the at least one equivalent associated attribute.
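Steps S1032 to S1038 can be illustrated with a small cosine-similarity filter. The three-dimensional vectors in `WORD_VECTORS`, the threshold value and the function names are assumptions for the example; the source does not say which embedding model or threshold is used.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; a real system would use trained word vectors.
WORD_VECTORS: Dict[str, List[float]] = {
    "wife": [0.9, 0.1, 0.0],
    "spouse": [0.8, 0.2, 0.1],
    "colleague": [0.0, 0.1, 0.9],
}


def equivalent_attributes(
    attribute: str,
    preset_attributes: List[str],
    vectors: Dict[str, List[float]],
    threshold: float,
) -> List[str]:
    first_vector = vectors[attribute]  # S1032: first word vector
    table = []
    for candidate in preset_attributes:
        if candidate == attribute:
            continue
        second_vector = vectors[candidate]  # S1034: second word vector
        # S1036: keep candidates whose similarity exceeds the preset threshold.
        if cosine_similarity(first_vector, second_vector) > threshold:
            table.append(candidate)
    return table  # S1038: the first equivalent associated attribute table


table = equivalent_attributes("wife", ["spouse", "colleague"], WORD_VECTORS, threshold=0.8)
```

With these toy vectors "spouse" is close to "wife" and is kept, while "colleague" is nearly orthogonal and is filtered out.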
The embodiment of the present invention provides two technical solutions for obtaining the first equivalence relation attribute table, which may be used alternatively or jointly.
And S105, acquiring a reversible person relation triple corresponding to each person relation triple in the person relation triple set, and acquiring a second equivalent associated attribute table corresponding to the reversible person relation triple, wherein equivalent associated attributes in the second equivalent associated attribute table have the same meaning as attributes in the reversible person relation triple.
Specifically, the head entity of the reversible person relationship triple is the tail entity of the person relationship triple, the tail entity of the reversible person relationship triple is the head entity of the person relationship triple, and the attribute in the reversible person relationship triple has the opposite meaning to the attribute in the person relationship triple. For example, the reversible person relationship triple (leiacang, prodigy, leiacang) corresponds to the person relationship triple (leiacang, prodigy, ruacang).
In a possible embodiment, the second equivalent associated attribute table corresponding to the reversible person relationship triple may be obtained in the same way as the first equivalent associated attribute table corresponding to the person relationship triple. In another possible embodiment, attributes with meanings opposite to those of the attributes in the first equivalent associated attribute table corresponding to the person relationship triple may be obtained directly to form the second equivalent associated attribute table corresponding to the reversible person relationship triple.
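A minimal sketch of building a reversible person relationship triple, and of deriving the second table from the first by taking opposite attributes, might look like this. The `INVERSE_ATTRIBUTE` map and all attribute names in it are hypothetical; the source does not say how opposite-meaning attributes are looked up.

```python
# Hypothetical map from an attribute to the attribute with the opposite
# meaning (an assumption for illustration only).
INVERSE_ATTRIBUTE = {
    "parent": "child",
    "child": "parent",
    "teacher": "student",
    "student": "teacher",
    "mentor": "apprentice",
    "apprentice": "mentor",
}


def reverse_triple(triple, inverse_attribute):
    """Swap head and tail and replace the attribute with its opposite."""
    head, attribute, tail = triple
    return (tail, inverse_attribute[attribute], head)


def second_table_from_first(first_table, inverse_attribute):
    """Second embodiment: take the opposite of each first-table attribute."""
    return [inverse_attribute[a] for a in first_table if a in inverse_attribute]


print(reverse_triple(("A", "parent", "B"), INVERSE_ATTRIBUTE))
print(second_table_from_first(["mentor"], INVERSE_ATTRIBUTE))
```

Attributes without a known opposite are skipped here rather than raising an error; a real system would need to decide how to handle them.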
S107, accessing a corpus, and marking positive sample corpora and negative sample corpora in the corpus according to each person relationship triple, the first equivalent associated attribute table corresponding to the person relationship triple, the reversible person relationship triple corresponding to the person relationship triple, and the second equivalent associated attribute table corresponding to the reversible person relationship triple.
Specifically, the positive sample corpus may be labeled according to a first criterion, and the negative sample corpus may be labeled according to a second criterion. The first criterion may include the corpus satisfying at least one of the following conditions: (1) it simultaneously hits all three elements of a person relationship triple; (2) it hits the head entity and the tail entity of a person relationship triple and hits an attribute in the first equivalent associated attribute table; (3) it simultaneously hits all three elements of a reversible person relationship triple; (4) it hits the head entity and the tail entity of a reversible person relationship triple and hits an attribute in the second equivalent associated attribute table.
Specifically, the attribute hit by a positive sample corpus serves as the label of that positive sample corpus; the attribute belongs to a person relationship triple, a reversible person relationship triple, the first equivalent associated attribute table, or the second equivalent associated attribute table.
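The first criterion can be sketched as a simple matching function. Treating a "hit" as a plain substring match is an assumption of this sketch (it would, for example, wrongly match "wife" inside "housewife"); the example names, tables, and the function `label_positive` are likewise hypothetical.

```python
def label_positive(sentence, triple, first_table, reversible_triple, second_table):
    """Return the hit attribute (the label) if the first criterion holds, else None.

    A "hit" is modeled as a substring match, which is an assumption.
    """
    head, attribute, tail = triple
    _r_head, r_attribute, _r_tail = reversible_triple
    # The triple and its reversible triple share the same two entities,
    # so every condition of the first criterion requires both of them.
    if head not in sentence or tail not in sentence:
        return None
    # Conditions (1) to (4): triple attribute, first table,
    # reversible attribute, second table.
    for candidate in [attribute, *first_table, r_attribute, *second_table]:
        if candidate in sentence:
            return candidate
    return None


TRIPLE = ("Bob", "wife", "Alice")
REVERSIBLE = ("Alice", "husband", "Bob")
FIRST_TABLE = ["spouse"]
SECOND_TABLE = ["hubby"]

print(label_positive("Bob is Alice's husband.", TRIPLE, FIRST_TABLE, REVERSIBLE, SECOND_TABLE))
```

The returned attribute is used as the label of the positive sample, which matches the statement that the hit attribute is the tag of the positive sample corpus.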
The embodiment of the invention further discloses the second criterion; a corpus that meets the second criterion is marked as a negative sample corpus.
In one possible embodiment, the second criterion may be that the corpus hits only the head entity and the tail entity in the person relationship triple. That is, if the corpus includes the head entity and the tail entity but does not include a field representing the attribute relationship between them, the corpus is marked as a negative sample corpus. For example, if the corpus contains the entities A and B of the person relationship triple (A, attribute, B) but does not contain the "attribute", the corpus is labeled as a negative sample. Such negative samples may represent text that does not express the attribute relationship between the head entity and the tail entity.
In another possible embodiment, the second criterion may be that the corpus contains only the attribute in the person relationship triple, hits the head entity or the tail entity in the person relationship triple, and also hits other person entities that do not belong to the person relationship triple. That is, the corpus contains the "attribute" of a person relationship triple (A, attribute, B) and also hits the entity A or B of that triple. In this case, if the corpus also contains another person entity C and does not contain other attribute keywords, the corpus is labeled as a negative sample. Such negative samples may represent text that contains an attribute keyword but does not characterize the relationship between the two entities.
The second criterion in the above two embodiments may be used alternatively or jointly, and the embodiments of the present invention are not limited to other technical solutions for obtaining the negative sample corpus.
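The two forms of the second criterion can be sketched together as follows. Several points are assumptions of this sketch rather than statements of the source: hits are substring matches, "head entity or tail entity" in the second form is read as exactly one of the two, the first form also requires that no other known attribute keyword appears, and the lists `KNOWN_PERSONS` and `ATTRIBUTE_KEYWORDS` are hypothetical inputs.

```python
def is_negative(sentence, triple, known_persons, attribute_keywords):
    """Return True if the sentence meets either form of the second criterion."""
    head, attribute, tail = triple
    has_head = head in sentence
    has_tail = tail in sentence
    has_attribute = attribute in sentence
    other_attributes = [
        k for k in attribute_keywords if k != attribute and k in sentence
    ]
    other_persons = [
        p for p in known_persons if p not in (head, tail) and p in sentence
    ]

    # Form 1: both entities are hit, but no attribute keyword appears.
    if has_head and has_tail and not has_attribute and not other_attributes:
        return True
    # Form 2: the attribute and exactly one of the two entities are hit,
    # together with another person entity and no other attribute keyword.
    if has_attribute and (has_head != has_tail) and other_persons and not other_attributes:
        return True
    return False


NEG_TRIPLE = ("Bob", "wife", "Alice")
KNOWN_PERSONS = ["Bob", "Alice", "Carol"]
ATTRIBUTE_KEYWORDS = ["wife", "husband", "teacher"]

print(is_negative("Bob and Alice went hiking.", NEG_TRIPLE, KNOWN_PERSONS, ATTRIBUTE_KEYWORDS))
print(is_negative("Bob introduced Carol to his wife.", NEG_TRIPLE, KNOWN_PERSONS, ATTRIBUTE_KEYWORDS))
```

A sentence that hits both entities and the attribute satisfies neither form, so it is left for the positive-sample labeling.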
S109, training a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
The character relationship recognition model training method disclosed by the embodiment of the invention expands the existing person relationship triples to obtain the first equivalent associated attribute table, the reversible person relationship triples, and the second equivalent associated attribute table. It then uses the person relationship triples, the reversible person relationship triples, the first equivalent associated attribute table, and the second equivalent associated attribute table obtained after expansion as the basis for labeling, and extracts positive sample corpora and negative sample corpora from the existing corpus. This reduces two problems that arise in the prior art when labeling relies only on the existing person relationship triples: positive samples that are not recalled, and positive samples that are mistakenly recalled as negative samples. Furthermore, the embodiment of the invention improves the diversity of the negative samples by constructing two different kinds of negative samples, thereby comprehensively improving the quality of both the positive and the negative samples and, in turn, the accuracy of the character relationship recognition model.
Further, the embodiment of the present invention tests the performance of a character relationship recognition model trained on corpora obtained as in the prior art. FIG. 6 shows a schematic performance diagram of the character relationship recognition model trained on corpora for which the reversible person relationship triples were not expanded; the F1 values of "parent" and "child" are clearly very low. The F1 value can be calculated according to the formula
$$F1 = \frac{2 \times precision \times recall}{precision + recall}$$
where precision and recall denote the precision rate and the recall rate, respectively. FIG. 7 shows a schematic performance diagram of the character relationship recognition model obtained by training on samples without the corpora of the person relationship triples and the corresponding reversible person relationship triples. Obviously, this model cannot accurately predict the order of the person relationship in a corpus; that is, it cannot determine which entity in the corpus is the head entity and which is the tail entity. The embodiment of the invention overcomes these problems: it can accurately identify each entity in the corpus and can also accurately predict the person relationship between the entities.
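For reference, the F1 value is the harmonic mean of precision and recall, F1 = 2 × precision × recall / (precision + recall). The following sketch computes it from raw counts; the counts used in the example call are made up for illustration and are not results from the patent.

```python
def precision_recall(true_positive, false_positive, false_negative):
    """Return (precision, recall) from raw counts, using 0.0 for empty denominators."""
    predicted = true_positive + false_positive
    actual = true_positive + false_negative
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (not taken from FIG. 6 or FIG. 7).
p, r = precision_recall(true_positive=40, false_positive=10, false_negative=40)
print(p, r, f1_score(p, r))
```

Because F1 is a harmonic mean, a relation with high precision but low recall (or the reverse) still receives a low F1 value.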
The embodiment of the invention also discloses a character relation recognition model training device, as shown in fig. 8, the device comprises:
a person relationship triple set obtaining module 201, configured to obtain a person relationship triple set, where each person relationship triple in the person relationship triple set includes a head entity, a tail entity, and an attribute representing the relationship between the head entity and the tail entity;
a first equivalence correlation attribute table obtaining module 203, configured to obtain a first equivalence correlation attribute table corresponding to each person relationship triple in a person relationship triple set, where an equivalence correlation attribute in the first equivalence correlation attribute table has the same meaning as an attribute in the person relationship triple;
the reversible content obtaining module 205 is configured to obtain a reversible person relationship triple corresponding to each person relationship triple in the person relationship triple set, and obtain a second equivalent associated attribute table corresponding to the reversible person relationship triple, where an equivalent associated attribute in the second equivalent associated attribute table has the same meaning as an attribute in the reversible person relationship triple;
the sample corpus acquiring module 207 is configured to access a corpus, and mark a positive sample corpus and a negative sample corpus in the corpus according to each person relationship triplet, a first equivalence correlation attribute table corresponding to the person relationship triplet, a reversible person relationship triplet corresponding to the person relationship triplet, and a second equivalence correlation attribute table corresponding to the reversible person relationship triplet;
and the training module 209 is configured to train a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
Further, as shown in FIG. 9, the person relationship triple set obtaining module 201 includes:
a target head entity set extraction unit 2011, configured to access the knowledge graph and extract entities of the person type to obtain a target head entity set;
a target attribute set extraction unit 2013, configured to traverse the attribute list corresponding to each target head entity in the target head entity set, and extract the attributes in the attribute list that indicate a person relationship to obtain a target attribute set corresponding to the target head entity;
a target tail entity obtaining unit 2015, configured to obtain, according to each target head entity and each target attribute in the target attribute set corresponding to the target head entity, a target tail entity having a target attribute relationship with the target head entity;
and the person relation triple generating unit 2017 is used for constructing a person relation triple according to the target head entity, the target attribute and the target tail entity which have the corresponding relation so as to obtain a person relation triple set.
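The four units above can be sketched over a toy in-memory knowledge graph. The graph layout (a dictionary with `type` and `attributes` fields), the entity names, and the set `PERSON_RELATION_ATTRIBUTES` are all assumptions made for this example; the source does not describe the knowledge graph's storage format or how person-relationship attributes are recognized.

```python
# Toy knowledge graph (hypothetical layout and contents).
KNOWLEDGE_GRAPH = {
    "A": {"type": "person", "attributes": {"wife": "B", "birthplace": "City X"}},
    "B": {"type": "person", "attributes": {"husband": "A"}},
    "City X": {"type": "place", "attributes": {}},
}

# Hypothetical list of attributes that express a person relationship.
PERSON_RELATION_ATTRIBUTES = {"wife", "husband", "parent", "child"}


def extract_person_triples(graph, relation_attributes):
    """Build (head, attribute, tail) triples for all person entities."""
    triples = []
    # Unit 2011: entities of the person type form the target head entity set.
    heads = [name for name, node in graph.items() if node["type"] == "person"]
    for head in heads:
        # Unit 2013: keep only attributes that express a person relationship.
        target_attributes = [
            a for a in graph[head]["attributes"] if a in relation_attributes
        ]
        for attribute in target_attributes:
            # Unit 2015: the attribute value is the target tail entity.
            tail = graph[head]["attributes"][attribute]
            # Unit 2017: assemble the person relationship triple.
            triples.append((head, attribute, tail))
    return triples


print(extract_person_triples(KNOWLEDGE_GRAPH, PERSON_RELATION_ATTRIBUTES))
```

Non-relationship attributes such as "birthplace" and non-person entities such as "City X" are filtered out, so only person-to-person triples remain.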
Specifically, the character relationship recognition model training device and the method disclosed in the embodiments of the invention are based on the same inventive concept. For details, please refer to the method embodiment; they are not repeated here.
The embodiment of the invention also provides a computer storage medium, which can store a plurality of instructions. The instructions may be adapted to be loaded by a processor to execute the character relationship recognition model training method according to the embodiment of the present invention, the method comprising at least the following steps:
acquiring a person relationship triple set, wherein each person relationship triple in the person relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity;
acquiring a first equivalent associated attribute table corresponding to each character relationship triple in a character relationship triple set, wherein equivalent associated attributes in the first equivalent associated attribute table have the same meaning with attributes in the character relationship triples;
acquiring a reversible person relation triple corresponding to each person relation triple in a person relation triple set, and acquiring a second equivalent associated attribute table corresponding to the reversible person relation triple, wherein equivalent associated attributes in the second equivalent associated attribute table have the same meaning as attributes in the reversible person relation triple;
accessing a corpus, and marking a positive sample corpus and a negative sample corpus in the corpus according to each character relationship triplet, a first equivalence correlation attribute table corresponding to the character relationship triplet, a reversible character relationship triplet corresponding to the character relationship triplet and a second equivalence correlation attribute table corresponding to the reversible character relationship triplet;
and training a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
In a preferred embodiment, the obtaining of the person relationship triple set comprises:
accessing a knowledge graph, and extracting entities of the person type to obtain a target head entity set;
traversing an attribute list corresponding to each target head entity in the target head entity set, and extracting the attributes in the attribute list that express a person relationship to obtain a target attribute set corresponding to the target head entity;
obtaining target tail entities having a target attribute relationship with the target head entities according to the target head entities and the target attributes in the target attribute set corresponding to the target head entities;
and constructing a character relationship triple according to the target head entity, the target attribute and the target tail entity with the corresponding relationship to obtain a character relationship triple set.
In a preferred embodiment, the obtaining a first equivalent associated attribute table corresponding to each person relationship triple in the person relationship triple set includes:
extracting attributes which represent the relationship between the head entity and the tail entity in the character relationship triples;
and acquiring at least one equivalent keyword of the attribute of the relationship between the head entity and the tail entity, and constructing a first equivalent associated attribute table according to the equivalent keyword.
In a preferred embodiment, the obtaining a first equivalent associated attribute table corresponding to each person relationship triple in the person relationship triple set includes:
acquiring a first word vector corresponding to the attribute for representing the relationship between the head entity and the tail entity in the character relationship triple;
acquiring a preset attribute set, and calculating a second word vector corresponding to each attribute in the preset attribute set;
extracting at least one equivalence correlation attribute from the preset attribute set, wherein the cosine similarity between a second word vector corresponding to the equivalence correlation attribute and the first word vector is greater than a preset threshold value;
and obtaining a first equivalent associated attribute table based on the at least one equivalent associated attribute.
In a preferred embodiment, the marking a positive sample corpus and a negative sample corpus in the corpus according to each person relationship triplet, the first equivalent associated attribute table corresponding to the person relationship triplet, the reversible person relationship triplet corresponding to the person relationship triplet, and the second equivalent associated attribute table corresponding to the reversible person relationship triplet includes marking the positive sample corpus according to a first criterion;
the first criterion comprises the corpus satisfying at least one of the following conditions: simultaneously hitting the three elements of a person relationship triple; hitting the head entity and the tail entity of a person relationship triple and hitting an attribute in the first equivalent associated attribute table; simultaneously hitting the three elements of a reversible person relationship triple; hitting the head entity and the tail entity of a reversible person relationship triple and hitting an attribute in the second equivalent associated attribute table.
In a preferred embodiment, the marking a positive sample corpus and a negative sample corpus in the corpus according to each person relationship triplet, the first equivalent associated attribute table corresponding to the person relationship triplet, the reversible person relationship triplet corresponding to the person relationship triplet, and the second equivalent associated attribute table corresponding to the reversible person relationship triplet includes marking the negative sample corpus according to a second criterion;
the second criterion includes that the corpus only hits the head entity and the tail entity in the person relationship triplet,
and/or,
the corpus only contains attributes in the character relationship triplets, the corpus also hits a head entity or a tail entity in the character relationship triplets, and also hits other character entities which do not belong to the character relationship triplets.
Further, FIG. 10 shows a hardware structure diagram of a device for implementing the method provided by the embodiment of the present invention; the device may participate in forming, or may contain, the apparatus or system provided by the embodiment of the present invention. As shown in FIG. 10, the device 10 may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. In addition, the device may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in FIG. 10 is merely illustrative and is not intended to limit the structure of the electronic device. For example, the device 10 may also include more or fewer components than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the device 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the methods described in the embodiments of the present invention, and the processor 102 executes various functional applications and data processing by executing the software programs and modules stored in the memory 104, so as to implement a character relationship recognition model training method as described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 104 may further include memory located remotely from processor 102, which may be connected to device 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of such networks may include wireless networks provided by the communication provider of the device 10. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the device 10 (or mobile device).
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. Specific embodiments of this specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A character relation recognition model training method is characterized by comprising the following steps:
acquiring a person relationship triple set, wherein each person relationship triple in the person relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity;
acquiring a first equivalent associated attribute table corresponding to each character relationship triple in a character relationship triple set, wherein equivalent associated attributes in the first equivalent associated attribute table have the same meaning with attributes in the character relationship triples;
acquiring a reversible person relation triple corresponding to each person relation triple in a person relation triple set, and acquiring a second equivalent associated attribute table corresponding to the reversible person relation triple, wherein equivalent associated attributes in the second equivalent associated attribute table have the same meaning as attributes in the reversible person relation triple;
accessing a corpus, and marking a positive sample corpus and a negative sample corpus in the corpus according to each character relationship triplet, a first equivalence correlation attribute table corresponding to the character relationship triplet, a reversible character relationship triplet corresponding to the character relationship triplet and a second equivalence correlation attribute table corresponding to the reversible character relationship triplet;
and training a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
2. The method of claim 1, wherein obtaining the person relationship triple set comprises:
accessing a knowledge graph, and extracting entities of the person type to obtain a target head entity set;
traversing an attribute list corresponding to each target head entity in the target head entity set, and extracting the attributes in the attribute list that express a person relationship to obtain a target attribute set corresponding to the target head entity;
obtaining target tail entities having a target attribute relationship with the target head entities according to the target head entities and the target attributes in the target attribute set corresponding to the target head entities;
and constructing a character relationship triple according to the target head entity, the target attribute and the target tail entity with the corresponding relationship to obtain a character relationship triple set.
3. The method of claim 1, wherein obtaining the first equivalent associated attribute table corresponding to each of the person relationship triples in the person relationship triplet set comprises:
extracting attributes which represent the relationship between the head entity and the tail entity in the character relationship triples;
and acquiring at least one equivalent keyword of the attribute of the relationship between the head entity and the tail entity, and constructing a first equivalent associated attribute table according to the equivalent keyword.
4. The method of claim 1, wherein obtaining the first equivalent associated attribute table corresponding to each of the person relationship triples in the person relationship triplet set comprises:
acquiring a first word vector corresponding to the attribute for representing the relationship between the head entity and the tail entity in the character relationship triple;
acquiring a preset attribute set, and calculating a second word vector corresponding to each attribute in the preset attribute set;
extracting at least one equivalence correlation attribute from the preset attribute set, wherein the cosine similarity between a second word vector corresponding to the equivalence correlation attribute and the first word vector is greater than a preset threshold value;
and obtaining a first equivalent associated attribute table based on the at least one equivalent associated attribute.
5. The method of claim 1, wherein the labeling of the positive sample corpora and the negative sample corpora in the corpus according to the person relationship triples, the first equivalent associated attribute table corresponding to the person relationship triples, the reversible person relationship triples corresponding to the person relationship triples, and the second equivalent associated attribute table corresponding to the reversible person relationship triples includes labeling the positive sample corpora according to a first criterion;
the first criterion comprises the corpus satisfying at least one of the following conditions: simultaneously hitting the three elements of a person relationship triple; hitting the head entity and the tail entity of a person relationship triple and hitting an attribute in the first equivalent associated attribute table; simultaneously hitting the three elements of a reversible person relationship triple; hitting the head entity and the tail entity of a reversible person relationship triple and hitting an attribute in the second equivalent associated attribute table.
6. The method of claim 1, wherein the labeling of the positive sample corpora and the negative sample corpora in the corpus according to the respective person relationship triples, the first equivalent associated attribute table corresponding to the person relationship triples, the reversible person relationship triples corresponding to the person relationship triples, and the second equivalent associated attribute table corresponding to the reversible person relationship triples includes labeling the negative sample corpora according to a second criterion;
the second criterion includes that the corpus only hits the head entity and the tail entity in the person relationship triplet,
and/or,
the corpus only contains attributes in the character relationship triplets, the corpus also hits a head entity or a tail entity in the character relationship triplets, and also hits other character entities which do not belong to the character relationship triplets.
7. A character relationship recognition model training apparatus, characterized in that the apparatus comprises:
a person relationship triple set acquisition module, configured to acquire a person relationship triple set, wherein each person relationship triple in the person relationship triple set comprises a head entity, a tail entity and an attribute representing the relationship between the head entity and the tail entity;
a first equivalent associated attribute table acquisition module, configured to acquire a first equivalent associated attribute table corresponding to each person relationship triple in the person relationship triple set, wherein the equivalent associated attributes in the first equivalent associated attribute table have the same meaning as the attribute in the person relationship triple;
a reversible content acquisition module, configured to acquire a reversible person relationship triple corresponding to each person relationship triple in the person relationship triple set, and acquire a second equivalent associated attribute table corresponding to the reversible person relationship triple, wherein the equivalent associated attributes in the second equivalent associated attribute table have the same meaning as the attribute in the reversible person relationship triple;
a sample corpus acquisition module, configured to access a corpus and mark positive sample corpora and negative sample corpora in the corpus according to each person relationship triple, the first equivalent associated attribute table corresponding to the person relationship triple, the reversible person relationship triple corresponding to the person relationship triple, and the second equivalent associated attribute table corresponding to the reversible person relationship triple;
and a training module, configured to train a preset machine learning model according to the positive sample corpus and the negative sample corpus to obtain a character relationship recognition model.
8. The apparatus of claim 7, wherein the person relationship triplet acquisition module comprises:
the target head entity set extraction unit is used for accessing the knowledge graph and extracting entities whose type is person to obtain a target head entity set;
the target attribute set extraction unit is used for traversing the attribute list corresponding to each target head entity in the target head entity set and extracting the attributes in the attribute list that express person relationships to obtain a target attribute set corresponding to the target head entity;
a target tail entity obtaining unit, configured to obtain, for each target head entity and each target attribute in the target attribute set corresponding to the target head entity, the target tail entity that has the target attribute relationship with the target head entity;
and the person relationship triplet generating unit is used for constructing a person relationship triplet from each corresponding target head entity, target attribute and target tail entity so as to obtain the person relationship triplet set.
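The four units of claim 8 can be read as one extraction pass over the knowledge graph. The following is a minimal sketch under stated assumptions: the in-memory dictionary `KNOWLEDGE_GRAPH`, its `type` and `attrs` fields, and the whitelist `PERSON_RELATION_ATTRIBUTES` are hypothetical stand-ins, since the claim does not fix a storage format or say how relationship-expressing attributes are recognized.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, attribute, tail entity)

# Hypothetical toy knowledge graph: entity name -> type and attribute list.
KNOWLEDGE_GRAPH: Dict[str, dict] = {
    "Alice": {"type": "person", "attrs": {"husband": ["Bob"], "birthplace": ["Paris"]}},
    "Bob": {"type": "person", "attrs": {"wife": ["Alice"], "occupation": ["actor"]}},
    "Paris": {"type": "place", "attrs": {"country": ["France"]}},
}

# Assumed whitelist of attributes that express a person relationship.
PERSON_RELATION_ATTRIBUTES: Set[str] = {"husband", "wife", "father", "mother"}


def extract_person_relation_triplets(
    graph: Dict[str, dict], relation_attributes: Set[str]
) -> List[Triple]:
    """Build (head, attribute, tail) triplets for person-typed head entities."""
    triplets: List[Triple] = []
    # Target head entity set: entities whose type is person.
    head_entities = [name for name, node in graph.items() if node["type"] == "person"]
    for head in head_entities:
        # Target attribute set: attributes of this head that express a person relationship.
        target_attributes = [a for a in graph[head]["attrs"] if a in relation_attributes]
        for attribute in target_attributes:
            # Target tail entities: values linked to the head by the target attribute.
            for tail in graph[head]["attrs"][attribute]:
                # Triplet generation from the corresponding head, attribute and tail.
                triplets.append((head, attribute, tail))
    return triplets
```

Calling `extract_person_relation_triplets(KNOWLEDGE_GRAPH, PERSON_RELATION_ATTRIBUTES)` on the toy graph keeps the `husband` and `wife` attributes and drops `birthplace`, `occupation` and the non-person entity `Paris`.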
9. A computer storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of training a character relationship recognition model according to any one of claims 1-6.
10. An apparatus for training a character relationship recognition model, the apparatus comprising a processor and a memory, the memory having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to perform the method of training a character relationship recognition model according to any one of claims 1 to 6.
CN201910839474.8A 2019-09-06 2019-09-06 Character relationship recognition model training method, device, equipment and medium Active CN110674637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910839474.8A CN110674637B (en) 2019-09-06 2019-09-06 Character relationship recognition model training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910839474.8A CN110674637B (en) 2019-09-06 2019-09-06 Character relationship recognition model training method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN110674637A (en) 2020-01-10
CN110674637B (en) 2023-07-11

Family

ID=69076578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910839474.8A Active CN110674637B (en) 2019-09-06 2019-09-06 Character relationship recognition model training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN110674637B (en)

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110238610A1 (en) * 2008-12-15 2011-09-29 Korea Institute Of Science & Technology Informatio System and method for efficient reasoning using view in dbms-based rdf triple store
US20130262361A1 (en) * 2012-04-02 2013-10-03 Playence GmBH System and method for natural language querying
CN104657750A (en) * 2015-03-23 2015-05-27 苏州大学张家港工业技术研究院 Method and device for extracting character relation
CN106484675A (en) * 2016-09-29 2017-03-08 北京理工大学 Fusion distributed semantic and the character relation abstracting method of sentence justice feature
KR20170089142A (en) * 2016-01-26 2017-08-03 경북대학교 산학협력단 Generating method and system for triple data
CN107741953A (en) * 2017-09-14 2018-02-27 平安科技(深圳)有限公司 The real relationship match method, apparatus and readable storage medium storing program for executing of social platform user
WO2018072563A1 (en) * 2016-10-18 2018-04-26 中兴通讯股份有限公司 Knowledge graph creation method, device, and system
US20180144252A1 (en) * 2016-11-23 2018-05-24 Fujitsu Limited Method and apparatus for completing a knowledge graph
CN108446769A (en) * 2018-01-23 2018-08-24 深圳市阿西莫夫科技有限公司 Knowledge mapping relation inference method, apparatus, computer equipment and storage medium
CN108647258A (en) * 2018-01-24 2018-10-12 北京理工大学 A kind of expression learning method based on entity associated constraint
CN108959418A (en) * 2018-06-06 2018-12-07 中国人民解放军国防科技大学 Character relation extraction method and device, computer device and computer readable storage medium
CN109145123A (en) * 2018-09-30 2019-01-04 国信优易数据有限公司 Construction method, intelligent interactive method, system and the electronic equipment of knowledge mapping model
CN109472033A (en) * 2018-11-19 2019-03-15 华南师范大学 Entity relation extraction method and system in text, storage medium, electronic equipment
CN109783651A (en) * 2019-01-29 2019-05-21 北京百度网讯科技有限公司 Extract method, apparatus, electronic equipment and the storage medium of entity relevant information
CN109933785A (en) * 2019-02-03 2019-06-25 北京百度网讯科技有限公司 Method, apparatus, equipment and medium for entity associated
WO2019128367A1 (en) * 2017-12-26 2019-07-04 广州广电运通金融电子股份有限公司 Face verification method and apparatus based on triplet loss, and computer device and storage medium
CN109992670A (en) * 2019-04-04 2019-07-09 西安交通大学 A kind of map completion method of knowledge based map neighbour structure
CN110032650A (en) * 2019-04-18 2019-07-19 腾讯科技(深圳)有限公司 A kind of generation method, device and the electronic equipment of training sample data

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
JUNJIE CHEN et al.: "Learning deep unsupervised binary codes for image retrieval", 《IJCAI'18: PROCEEDINGS OF THE 27TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE》, 13 July 2018 (2018-07-13), pages 613 - 619 *
ROBERT WEST et al.: "Knowledge base completion via search-based question answering", PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, pages 515 - 525 *
WENHAN XIONG et al.: "DeepPath: a reinforcement learning method for knowledge graph reasoning", ARXIV, pages 1 - 10 *
XIN LI et al.: "A Structural Representation Learning for Multi-relational Networks", 《ARXIV》, 8 June 2018 (2018-06-08), pages 2 - 9 *
丁宁: "面向新闻事件的人物关系分类研究" [Research on person relationship classification for news events], 中国优秀硕士学位论文全文数据库 信息科技辑 [China Master's Theses Full-text Database, Information Science and Technology], no. 01, pages 138 - 5147 *
林希珣: "大规模知识图谱完善关键算法研究" [Research on key algorithms for large-scale knowledge graph completion], 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 [China Master's Theses Full-text Database, Information Science and Technology], no. 1, 15 January 2019 (2019-01-15), pages 110 - 481 *
王凯强: "大数据环境下的用户信息抽取与分析" [User information extraction and analysis in a big data environment], 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 [China Master's Theses Full-text Database, Information Science and Technology], no. 11, 15 November 2018 (2018-11-15), pages 138 - 581 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111694967A (en) * 2020-06-11 2020-09-22 腾讯科技(深圳)有限公司 Attribute extraction method and device, electronic equipment and medium
CN111694967B (en) * 2020-06-11 2023-10-20 腾讯科技(深圳)有限公司 Attribute extraction method, attribute extraction device, electronic equipment and medium
CN113254549A (en) * 2021-06-21 2021-08-13 中国人民解放军国防科技大学 Character relation mining model training method, character relation mining method and device
CN113254549B (en) * 2021-06-21 2021-11-23 中国人民解放军国防科技大学 Character relation mining model training method, character relation mining method and device
CN113361280A (en) * 2021-06-30 2021-09-07 北京百度网讯科技有限公司 Method for training model, prediction method, prediction device, electronic device and storage medium
CN113361280B (en) * 2021-06-30 2023-10-31 北京百度网讯科技有限公司 Model training method, prediction method, apparatus, electronic device and storage medium

Also Published As

Publication number Publication date
CN110674637B (en) 2023-07-11

Similar Documents

Publication Publication Date Title
CN107679039B (en) Method and device for determining statement intention
CN107436864B (en) Chinese question-answer semantic similarity calculation method based on Word2Vec
US20120084076A1 (en) Context-based disambiguation of acronyms and abbreviations
CN108664599B (en) Intelligent question-answering method and device, intelligent question-answering server and storage medium
Lev et al. In defense of word embedding for generic text representation
CN104050256A (en) Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method
CN110674637B (en) Character relationship recognition model training method, device, equipment and medium
Singh et al. An algorithm to transform natural language into SQL queries for relational databases
US20170193088A1 (en) Entailment knowledge base in natural language processing systems
CN113590776B (en) Knowledge graph-based text processing method and device, electronic equipment and medium
CN112131881B (en) Information extraction method and device, electronic equipment and storage medium
CN116795973B (en) Text processing method and device based on artificial intelligence, electronic equipment and medium
CN112131401B (en) Concept knowledge graph construction method and device
KR20210030068A (en) System and method for ensemble question-answering
WO2023207096A1 (en) Entity linking method and apparatus, device, and nonvolatile readable storage medium
US20230008897A1 (en) Information search method and device, electronic device, and storage medium
CN112507139A (en) Knowledge graph-based question-answering method, system, equipment and storage medium
CN110991183B (en) Predicate determination method, predicate determination device, predicate determination equipment and predicate determination storage medium
CN110209781A (en) A kind of text handling method, device and relevant device
CN112507089A (en) Intelligent question-answering engine based on knowledge graph and implementation method thereof
CN115114419A (en) Question and answer processing method and device, electronic equipment and computer readable medium
CN111931503B (en) Information extraction method and device, equipment and computer readable storage medium
CN117290488A (en) Man-machine interaction method and device based on large model, electronic equipment and storage medium
Kulkarni et al. College chat-bot
CN113704421A (en) Information retrieval method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40020244; Country of ref document: HK)
GR01 Patent grant