CN110598006A - Model training method, triplet embedding method, apparatus, medium, and device - Google Patents


Info

Publication number
CN110598006A
Related: application CN201910875584.XA; granted publication CN110598006B
Authority
CN
China
Prior art keywords
training
model
triples
probability
embedding
Prior art date
Legal status
Granted
Application number
CN201910875584.XA
Other languages
Chinese (zh)
Other versions
CN110598006B (en)
Inventor
王尧
李林峰
Current Assignee
Nanjing Yiyi Yunda Data Technology Co Ltd
Nanjing Medical Duyun Medical Technology Co Ltd
Original Assignee
Nanjing Yiyi Yunda Data Technology Co Ltd
Nanjing Medical Duyun Medical Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Yiyi Yunda Data Technology Co Ltd and Nanjing Medical Duyun Medical Technology Co Ltd
Priority to CN201910875584.XA
Publication of CN110598006A
Application granted
Publication of CN110598006B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a training method for an embedding model of triples, a triple embedding method, an apparatus, a computer-readable medium, and an electronic device, and relates to the technical field of machine learning. The training method for the embedding model of triples comprises the following steps: obtaining N groups of training samples, wherein each group of training samples comprises a triple and the probability that the knowledge expressed by the triple holds, and N is an integer greater than 1; inputting the triple in the i-th group of training samples into an embedding model, and obtaining a projection distance S_i according to the output of the embedding model, wherein i is a positive integer less than or equal to N; and determining a loss function of the embedding model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedding model based on the loss function. With this technical scheme, the information content of a triple after embedding processing contains probability information, which helps improve both the expression accuracy of medical knowledge points and the accuracy of the triple embedding processing.

Description

Model training method, triplet embedding method, apparatus, medium, and device
Technical Field
The present disclosure relates to the field of machine learning technologies, and in particular, to a method and an apparatus for training an embedded model of a triplet, a method and an apparatus for embedding a triplet, a computer-readable medium, and an electronic device.
Background
A Knowledge Graph (KG), or "Knowledge Base" (KB), is a graph structure that represents knowledge through "entities" and "relationships". Knowledge graph embedding (KG Embedding) means representing the "entities" and "relations" in a knowledge graph as low-dimensional vectors, so that in reasoning applications based on the knowledge graph, symbolic reasoning can be replaced by vector calculation. That is, it makes the triples in the knowledge graph more suitable for inference calculations based on the knowledge graph.
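To make this concrete, the following Python sketch (merely an example; the embedding values, entity names, and dimension are invented for illustration) shows how symbolic reasoning can be replaced by vector calculation: candidate tail entities are ranked by the translation distance ||h + r − t||, where smaller means more plausible.

import numpy as np

# Hypothetical pre-trained low-dimensional embeddings (all values invented).
entity_vecs = {
    "type_2_diabetes": np.array([0.32, 0.5, 0.23, 0.12, 0.1, 0.54]),
    "polydipsia":      np.array([0.30, 0.9, 0.20, 0.15, 0.1, 0.50]),
    "cough":           np.array([0.90, 0.1, 0.80, 0.40, 0.3, 0.20]),
}
relation_vecs = {"symptom": np.array([0.0, 0.4, 0.0, 0.03, 0.0, -0.04])}

def score(h, r, t):
    # Smaller ||h + r - t|| means the triple is more plausible.
    return np.linalg.norm(entity_vecs[h] + relation_vecs[r] - entity_vecs[t])

# Rank candidate tails for the query (type_2_diabetes, symptom, ?).
candidates = ["polydipsia", "cough"]
print(sorted(candidates, key=lambda t: score("type_2_diabetes", "symptom", t)))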
In the related art, the embedding model of triples in a knowledge graph is treated as a binary classification model: a triple is a "positive sample" when the knowledge point it expresses holds, and a "negative sample" (a knowledge point that does not hold) when it does not. Accordingly, the training target of the binary model is as follows: for a triple whose knowledge point holds in the training samples, the distance between its vectors h + r and t should be as small as possible. The loss function generally employed requires that the distance value of a positive sample be smaller than that of a negative sample by at least a preset margin.
However, in the method for embedding a knowledge graph provided in the related art, the accuracy of the triple embedding process needs to be improved.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the embodiments of the present disclosure is to provide a training method for an embedded model of a triplet, a training apparatus for an embedded model of a triplet, an embedding method of a triplet, an embedding apparatus of a triplet, a computer-readable medium, and an electronic device, thereby improving accuracy of triplet embedding processing to at least a certain extent.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for training an embedded model of a triplet, including:
obtaining N groups of training samples, wherein each group of training samples comprises: a triple and the probability that the knowledge expressed by the triple holds, wherein N is an integer greater than 1;
inputting the triple in the i-th group of training samples into an embedding model, and obtaining a projection distance S_i according to the output of the embedding model, wherein i is a positive integer less than or equal to N;
determining a loss function of the embedding model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedding model based on the loss function.
In an embodiment of the present disclosure, based on the above scheme, determining the loss function of the embedding model according to the probability P_i of the i-th group of training samples and the projection distance S_i comprises:
mapping the probability P_i in the i-th group of training samples to the target distance D_i corresponding to the i-th group of training samples based on a preset mapping function;
determining the loss function of the embedding model according to the target distance D_i and the projection distance S_i;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the projection distance S_i and the value range of the probability P_i.
In one embodiment of the present disclosure, based on the foregoing scheme:
the above-mentioned N groups of training samples of obtaining includes:
acquiring N1 groups of training positive samples and N2 groups of training negative samples; wherein each set of training positive samples comprises: a first probability that the knowledge represented by the first triplet and the aforementioned triplet holds, each set of training negative examples comprising: a second triple and a preset probability value;
inputting the triples in the ith training sample into an embedded model, and obtaining a projection distance S according to the output of the embedded modeliThe method comprises the following steps:
inputting the first triplet in the i1 th training positive sample into an embedded model, and obtaining a first projection distance S according to the output of the embedded modeli1Wherein i1 is a positive integer less than or equal to N1; and inputting the second triplet in the i2 th training negative sample into the embedded model, and obtaining a second projection distance S according to the output of the embedded modeli2Wherein i2 is a positive integer less than or equal to N2;
probability P of the training sample according to the ith groupiAnd the above projection distance SiDetermining a loss function of the embedded model, comprising:
training the first probability P of the positive sample according to the i1 th groupi1And the first projection distance Si1Determining a first loss function; according to the preset probability value and the second projection distance Si2Determining a second loss function; and determining a loss function of the embedded model based on the first loss function and the second loss function.
In an embodiment of the present disclosure, based on the foregoing scheme, determining the first loss function according to the first probability P_i1 of the i1-th group of training positive samples and the first projection distance S_i1 comprises:
mapping the first probability P_i1 in the i1-th group of training positive samples to a first target distance D_i1 corresponding to the i1-th group of training positive samples based on the preset mapping function;
determining the first loss function according to the first target distance D_i1 and the first projection distance S_i1;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the first projection distance S_i1 and the value range of the first probability P_i1.
In an embodiment of the present disclosure, based on the foregoing scheme, determining the second loss function according to the preset probability value and the second projection distance S_i2 comprises:
mapping the preset probability value to a second target distance D' based on the preset mapping function;
determining the second loss function according to the second target distance D' and the second projection distance S_i2.
In an embodiment of the disclosure, based on the foregoing scheme, inputting the triple in the i-th group of training samples into the embedding model and obtaining the projection distance S_i according to the output of the embedding model comprises:
inputting the triple in the i-th group of training samples into the embedding model, and obtaining, through the embedding processing of the embedding model: a head vector h_i corresponding to the head entity in the triple, an attribute vector r_i corresponding to the attribute in the triple, and a tail vector t_i corresponding to the tail entity in the triple;
obtaining the projection distance S_i according to the head vector h_i, the attribute vector r_i, and the tail vector t_i.
According to a second aspect of the embodiments of the present disclosure, there is provided a method for embedding a triplet, the method including:
acquiring a target triple in the knowledge graph;
inputting the target triple into the embedding model of triples, obtaining a vector representation of the target triple according to the output of the embedding model, and predicting an embedding label of the target triple according to the vector representation;
wherein the embedding model of the triplet is trained according to the method as described in any one of the first aspect of the embodiments above.
According to a third aspect of the embodiments of the present disclosure, there is provided an apparatus for training an embedded model of a triplet, the apparatus including:
a sample acquisition module configured to: obtain N groups of training samples, wherein each group of training samples comprises: a triple and the probability that the knowledge expressed by the triple holds, wherein N is an integer greater than 1;
a projection distance determination module configured to: input the triple in the i-th group of training samples into an embedding model, and obtain a projection distance S_i according to the output of the embedding model, wherein i is a positive integer less than or equal to N;
a model training module configured to: determine a loss function of the embedding model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedding model based on the loss function.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an apparatus for embedding a triple, the apparatus including:
a target triple acquisition module configured to: acquire a target triple in the knowledge graph;
an embedding module configured to: input the target triple into the embedding model of triples, obtain a vector representation of the target triple according to the output of the embedding model, and predict an embedding label of the target triple according to the vector representation;
wherein the embedding model of the triplet is trained according to the method as described in any one of the first aspect of the embodiments above.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for training an embedding model of triples described in any one of the first aspect of the above embodiments, and implements the triple embedding method described in the second aspect of the above embodiments.
According to a sixth aspect of the embodiments of the present disclosure, there is provided an electronic device including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for training an embedding model of triples described in any one of the first aspect of the above embodiments, and to implement the triple embedding method described in the second aspect of the above embodiments.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
In some embodiments of the present disclosure, a plurality of groups of training samples are first obtained, wherein each group of training samples comprises a triple and the probability that the knowledge expressed by the triple holds; then, the triple in any group (the i-th group) of training samples is input into the embedding model, and the projection distance S_i corresponding to the i-th group of training samples is obtained according to the output of the model; further, a loss function of the embedding model is determined according to the probability P_i in the i-th group of training samples and the projection distance S_i, and the embedding model is trained based on the loss function. In this technical scheme, the probability corresponding to a triple is introduced in the training process of the embedding model, so that when the trained embedding model performs embedding processing on a target triple, the probability that the knowledge expressed by the target triple holds can be taken into account. The information content of a triple after embedding processing therefore includes probability information, which helps improve both the expression accuracy of medical knowledge points and the accuracy of the triple embedding processing.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
FIG. 1 illustrates a system architecture diagram for implementing a training method of an embedding model of triples or an embedding method of triples in an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a flow diagram of a method of training an embedded model of triples according to an embodiment of the present disclosure;
FIG. 3 illustrates a flow diagram of a method of determining a projected distance of a triplet in accordance with an embodiment of the present disclosure;
FIG. 4 illustrates a flow diagram of a method of determining a first loss function according to an embodiment of the present disclosure;
FIG. 5 shows a flow diagram of a method of determining a loss function of an embedded model according to an embodiment of the disclosure;
FIG. 6 shows a flow diagram of a method of embedding triples according to an embodiment of the present disclosure;
FIG. 7 shows a schematic structural diagram of a training apparatus for an embedded model of triples according to an embodiment of the present disclosure;
FIG. 8 illustrates a schematic structural diagram of an embedding arrangement of triplets in accordance with an embodiment of the present disclosure;
FIG. 9 shows a schematic structural diagram of a computer storage medium in an exemplary embodiment of the disclosure; and
fig. 10 shows a schematic structural diagram of an electronic device in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The exemplary embodiment first provides a system architecture for implementing a training method of an embedded model of a triplet, which can be applied to various data processing scenarios. Referring to fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send request instructions or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a photo processing application, a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, for example, acquiring a triple that a user inputs through the terminal devices 101, 102, 103 to represent a target medical knowledge point, and acquiring an additional semantic feature that further qualifies the target medical knowledge point on the basis of the semantic features expressed by the triple (merely an example). The background management server may expand the triple into a quadruple based on the additional semantic feature, so as to build a medical knowledge graph based on the quadruples (merely an example). The background management server may also receive a query instruction and perform a query in the medical knowledge graph based on the query instruction.
It should be noted that the training method for the embedding model of triples provided in the embodiments of the present application is generally executed by the server 105, and accordingly, the training apparatus for the embedding model of triples is generally disposed in the server 105.
In mathematical terms a knowledge graph follows the general definition of a graph: it consists of nodes, i.e., the entities in the knowledge structure, and edges, i.e., the relationships between entities. In the knowledge representation of a KG, taking an edge together with its head and tail nodes, and regarding the head node as the Subject, the edge as the Predicate, and the tail node as the Object, yields a Subject-Predicate-Object (SPO) triple. The subject, predicate, and object of an SPO triple may also be referred to as the Head Entity, the Relation, and the Tail Entity of the triple, respectively.
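As a minimal illustration (merely an example), an SPO triple can be represented directly as a small record type:

from typing import NamedTuple

class Triple(NamedTuple):
    head: str      # Subject / head entity
    relation: str  # Predicate / relation (attribute)
    tail: str      # Object / tail entity

spo = Triple("type 2 diabetes", "symptom", "polydipsia")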
In the related art, the embedding model of triples in a knowledge graph is treated as a binary classification model, that is: a triple is a positive sample when the knowledge point it expresses holds, and a negative sample when it does not. In the real world, however, a large number of triples hold only with a certain probability; it is therefore inappropriate to describe positive samples as absolutely correct and negative samples as absolutely incorrect in the binary manner of the related art.
For example, the fact that a triple carries an additional semantic feature such as a probability can be seen in the following: not all type 2 diabetes patients have the symptom of polydipsia, so this medical knowledge point holds only with a certain probability. As another example: patients with infantile pneumonia may show the symptom of choking on milk, and the population for which the medical knowledge point "pneumonia has the symptom of choking on milk" holds is infants. Because the binary classification method of the related art cannot represent such information, the amount of information retained after triple embedding based on the related art is reduced, and the expression accuracy of medical knowledge points needs to be improved.
In view of the above problems in the related art, the present technical solution provides a method and an apparatus for training an embedding model of triples, a computer storage medium, and an electronic device. The training method for the embedding model of triples is described first:
Fig. 2 shows a flow diagram of a method of training an embedding model of triples according to an embodiment of the present disclosure. This embodiment provides a training method for an embedding model of triples, which overcomes the above problems in the related art at least to some extent.
The execution subject of the training method for the embedding model of triples provided in this embodiment may be a device having computing and processing functions, such as a server.
Referring to fig. 2, the training method for the embedding model of triples provided in this embodiment includes:
step S210, obtaining N groups of training samples, wherein each group of training samples includes: the probability that the triplets and the knowledge expressed by the triplets are established, wherein N is an integer greater than 1;
step S220, inputting the triples in the ith group of training samples into an embedded model, and obtaining a projection distance S according to the output of the embedded modeliWherein i is a positive integer less than or equal to N; and the number of the first and second groups,
step S230, according to the probability P of the ith group of training samplesiAnd the projection distance SiDetermining a loss function of the embedded model to train the embedded model based on the loss function.
In the technical solution provided in the embodiment shown in fig. 2, the probability corresponding to a triple is introduced in the training process of the embedding model, so that when the trained embedding model performs embedding processing on a target triple, the probability that the knowledge expressed by the target triple holds can be taken into account. The information content of a triple after embedding processing therefore includes probability information, which helps improve the expression accuracy of medical knowledge points.
Specifically, the embedding model is used for vectorizing the entities and relations in triples, i.e., for knowledge representation. The goal of the embedding model is to represent every entity and relation in the knowledge graph as a low-dimensional vector, i.e., to convert a triple (h, r, t) into its vector form (h, r, t). In this way, an entity expressed in natural language (e.g., the head entity "type 2 diabetes" in <h: type 2 diabetes, r: symptom, t: polydipsia>) is transformed into a low-dimensional vector such as [0.32, 0.5, 0.23, 0.12, 0.1, 0.54]. These low-dimensional vectors make the triples in the knowledge graph more suitable for inference calculations based on the knowledge graph.
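A minimal sketch of this conversion (merely an example; the entity and relation indices and the embedding dimension are chosen for illustration, and the tables would be learned during training):

import numpy as np

rng = np.random.default_rng(0)
dim = 6  # embedding dimension, chosen arbitrarily for this sketch

entity_index = {"type 2 diabetes": 0, "polydipsia": 1}
relation_index = {"symptom": 0}

# Learnable embedding tables (randomly initialized here).
E = rng.normal(size=(len(entity_index), dim))
R = rng.normal(size=(len(relation_index), dim))

def embed(h, r, t):
    # Convert a symbolic triple (h, r, t) into its vector form.
    return E[entity_index[h]], R[relation_index[r]], E[entity_index[t]]

h_vec, r_vec, t_vec = embed("type 2 diabetes", "symptom", "polydipsia")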
The embedding models above have been studied in the related art; among them, translation-based approaches, which score triples by the distance between vectors, have been widely used. Representative work includes the TransE, TransH, TransR, TransD, and TranSparse models. These models are briefly described below, after which their deficiencies are summarized:
the TransE model encodes the head entity head entry, tail entity tail entry and relationship in the same triple into the same space, and its representation vector is marked with (h, r, t). The model training targets are: for an SPO triplet that holds in the training sample, the distance between its vectors h + r and t is as small as possible. The loss function adopts a pairwise contrast loss function based on margin, namely, the distance value of the positive sample (h, r, t) is smaller than that of the negative sample (h ', r, t') by at least one margin. The mathematical expression is as follows formula one, wherein, the idea of the loss function shown in formula one is adopted in the TransE model, the TransH model, the TransR model, the TransD model and the TransSparse model:
wherein, [ x ]]+Max (x,0), Δ represents the set of training positive sample triplets (h, r, t), Δ ' represents the set of training negative sample triplets (h ', r, t '), and the parameter γ is used to adjust the parameter.
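A sketch of this margin-based pairwise loss (Formula 1) in Python, assuming the TransE distance f_r(h, t) = ||h + r − t|| and vector triples produced by an embedding step such as the one sketched above (merely an example):

import numpy as np

def f_r(h, r, t):
    # TransE distance of a triple in vector form.
    return np.linalg.norm(h + r - t)

def margin_loss(pos, neg, gamma=1.0):
    # Pairwise margin loss: the distance of a positive sample should be
    # smaller than that of a negative sample by at least the margin gamma.
    total = 0.0
    for (h, r, t) in pos:          # vector triples from the positive set Δ
        for (h2, r2, t2) in neg:   # corrupted vector triples from Δ'
            total += max(gamma + f_r(h, r, t) - f_r(h2, r2, t2), 0.0)
    return total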
However, because the TransE algorithm maps all entities and relations into the same vector space, its ability to describe one-to-many and many-to-many relations is limited.
The TransH algorithm introduces a hyperplane for each relation, so that for a one-to-many relation, the head entity can obtain different projection vectors on different hyperplanes; this property allows one-to-many relations to be described better. For each triple (h, r, t), its distance function is defined as Formula 2 below, where h_p is the projection vector of h on the hyperplane and t_p is the projection vector of t on the hyperplane. After the projection vectors are obtained, the TransH model uses the distance function in the following form, and the idea of its training loss function is the same as that of the TransE model (Formula 1):

score_value = f_r(h, t) = | h_p + r − t_p |    (Formula 2)
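A sketch of the TransH projection and distance function (Formula 2), assuming each relation carries a unit normal vector w for its hyperplane (merely an example):

import numpy as np

def project(v, w):
    # Projection of v onto the hyperplane with unit normal vector w.
    return v - np.dot(w, v) * w

def transh_distance(h, r, t, w):
    # score_value = f_r(h, t) = |h_p + r - t_p|  (Formula 2)
    h_p, t_p = project(h, w), project(t, w)
    return np.linalg.norm(h_p + r - t_p)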
The TransR algorithm further extends the idea of TransH: entities and relations are projected into different spaces, and when the distance function of a triple (h, r, t) is computed, the entity vectors are converted from the entity space to the relation space through a transfer matrix. The use of separate spaces makes the characterization capability of TransR stronger than that of TransH. Apart from the way the projection vectors h_p and t_p are computed, the distance function and loss function used by TransR are the same as those of TransH (see Formula 2).
The TransD model provides a dynamic mapping method, whose basic ideas are: 1) even for the same relation, the meaning and properties of the head and tail entities may differ, so describing all entities under a given relation with the same transfer matrix is not fine-grained enough, and a transfer matrix needs to be designed for each entity + relation combination; 2) the number of entity + relation combinations can be very large, so to avoid an explosion in the number of parameters to be learned, the transfer matrix is formed from the projection vectors of the entities and relations themselves. TransD likewise uses the same distance function and loss function as the models before it.
The TranSparse model proposes using a sparse matrix as the transfer matrix; the sparse matrix can either be shared by all entities of a relation (the shared mode) or be a separate training parameter for each entity under a relation (the separate mode). In the many-to-many relation prediction task, the separate-mode TranSparse model outperforms TransD. However, TranSparse still uses the same distance function and loss function as the earlier models.
As can be seen, none of the aforementioned TransE, TransH, TransR, TransD, and TranSparse models considers the probability that the knowledge points expressed by triples hold; the present technical solution therefore further improves these models. That is, in this technical solution the information content of a triple after embedding processing contains probability information, which improves the expression accuracy of medical knowledge points. Specifically:
the implementation details of the steps of the embodiment of the training method for embedded models shown in fig. 2 are described in detail below:
in an exemplary embodiment, in step S210, N sets of training samples are obtained, wherein each set of training samples includes: the probability that the triplets and the knowledge expressed by the triplets are true, and N is an integer greater than 1.
Illustratively, N1 groups of training positive samples and N2 groups of training negative samples are obtained; wherein each group of training positive samples comprises: a first triple and a first probability that the knowledge expressed by the first triple holds, and each group of training negative samples comprises: a second triple and a preset probability value.
Illustratively, in the medical field, a certain disease A and a commonly used drug B form a relation triple, and this triple appears M times in N historical medical records; the occurrence probability of the triple (disease A, related drug, drug B) can thus be found to be M/N. In other words, on a data set consisting of the given N historical medical records, (disease A, related drug, drug B) holds with probability M/N.
Illustratively, a group of training positive samples includes: the first triple <h: type 2 diabetes, r: symptom, t: polydipsia> and the first probability, 80%, that the knowledge expressed by the first triple holds. Specifically, the medical knowledge of this training positive sample is: 80% of type 2 diabetes patients have polydipsia among their symptoms.
For example, a negative sample should strictly be defined as a sample whose probability is 0. However, a probability of 0 cannot be used in the mapping function during training: the logarithm of 0 is undefined, which would make the target distance negative infinity. To avoid such incomputable quantities in the calculation logic, this embodiment presets the holding probability of a negative sample as ε_n, where ε_n is generally a small positive value whose specific value can be set according to actual requirements. A group of training negative samples then includes: the second triple <h: gastric cancer, r: symptom, t: polydipsia> and the preset probability value ε_n corresponding to the second triple.
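A sketch of how such training samples might be assembled (merely an example; the value of ε_n and the helper names are illustrative assumptions):

EPSILON_N = 1e-3  # preset holding probability for negative samples (assumed value)

def positive_sample(triple, occurrences, total_records):
    # Probability that the knowledge point holds, estimated as M/N.
    return triple, occurrences / total_records

def negative_sample(triple):
    return triple, EPSILON_N

pos = positive_sample(("type 2 diabetes", "symptom", "polydipsia"), 80, 100)  # P = 0.8
neg = negative_sample(("gastric cancer", "symptom", "polydipsia"))            # P = ε_n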
The medical field is again taken as an example: the sample data can be obtained from an already-constructed medical knowledge graph, or constructed from medical data. The medical data may be derived from empirical medical knowledge, i.e., real-world clinical data accumulated in clinical practice through empirical observation, such as medical records; it may also be derived from medical literature knowledge, by learning from textbooks, clinical guidelines, monographs, treatises, and the like.
With continued reference to fig. 2, after the training samples are obtained, in step S220 the triple in the i-th group of training samples is input into the embedding model, and the projection distance S_i is obtained according to the output of the embedding model, wherein i is a positive integer less than or equal to N.
In an exemplary embodiment, fig. 3 shows a flowchart of a method for determining a projection distance of a triplet according to an embodiment of the present disclosure, which may be a specific implementation manner of step S220. Referring to fig. 3, the embodiment shown in this figure comprises:
step S310, inputting the triples in the ith group of training samples into an embedded model to obtain through the embedding processing of the embedded model: a head vector h corresponding to the head entity in the tripletiAnd the attribute vector r corresponding to the attribute in the tripleiAnd, the tail vector t corresponding to the tail entity in the tripleti(ii) a And the number of the first and second groups,
step S320, according to the head vector hiAttribute vector riAnd a tail vector tiObtaining the projection distance Si
For example, the specific embedding processing that the embedding model applies to the triples in the training samples may be the embedding processing of any one of the TransE, TransH, TransR, TransD, or TranSparse models. The output of the embedding model is then the head vector h_i corresponding to the head entity of the triple in the i-th group of training samples, the attribute vector r_i corresponding to the attribute in that triple, and the tail vector t_i corresponding to the tail entity in that triple; that is, the triple (h_i, r_i, t_i) is converted into its vector form (h_i, r_i, t_i).
The projection distance of the triple (h_i, r_i, t_i) is then determined from these vectors, e.g., S_i = |h_i + r_i − t_i|, following the distance function of the chosen base model.
In an exemplary embodiment, fig. 4 shows a flowchart of a method for determining the projection distances of positive and negative samples according to an embodiment of the present disclosure, which may serve as another specific implementation of step S220. Referring to fig. 4, the embodiment shown in this figure comprises:
Step S410, inputting the first triple in the i1-th group of training positive samples into the embedding model, and obtaining a first projection distance S_i1 according to the output of the embedding model, wherein i1 is a positive integer less than or equal to N1; and
Step S420, inputting the second triple in the i2-th group of training negative samples into the embedding model, and obtaining a second projection distance S_i2 according to the output of the embedding model, wherein i2 is a positive integer less than or equal to N2.
As in the embodiment shown in fig. 3, in step S410, after the first triple in the i1-th group of training positive samples is input into the embedding model, the output of the embedding model consists of the head vector h_i1 corresponding to the head entity in the first triple, the attribute vector r_i1 corresponding to the attribute, and the tail vector t_i1 corresponding to the tail entity; that is, the first triple (h_i1, r_i1, t_i1) is converted into its vector form, from which the first projection distance S_i1 = |h_i1 + r_i1 − t_i1| is determined.
In an exemplary embodiment, the specific implementation of step S420 is the same as that of step S410 and is not repeated here; it finally determines the second projection distance S_i2 = |h_i2 + r_i2 − t_i2| of the second triple (h_i2, r_i2, t_i2) in the training negative sample.
It should be noted that steps S410 and S420 may be executed in either order.
With continued reference to fig. 2, after the projection distances of the triples are determined, in step S230 the loss function of the embedding model is determined according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedding model based on the loss function.
In the related art (the TransE, TransH, TransR, TransD, and TranSparse models), the projection distance of a triple is treated as a measure of the triple's credibility. That is, the loss function of these models is determined by comparing the projection distances of positive and negative samples; specifically, the objective is to make the projection distance of a positive sample smaller than that of a negative sample.
In the present technical solution, however, the probability that the knowledge point expressed by a triple holds is introduced, so considering the projection distance alone, as in the related art above, cannot reflect all the features of a sample (for example, it cannot reflect the probability feature). Therefore, in this technical solution both factors, the probability and the projection distance, must be reflected when determining the loss function of the embedding model.
Specifically, in the present technical solution, a mapping function Φ(·) is set so that the first probability in a training positive sample is mapped to the target distance D_i corresponding to that group of training samples. That is, the probability in a training positive sample can be mapped to a target distance through the inverse of the mapping function Φ(·). The loss function of the embedding model is then determined from the target distance D_i and the projection distance S_i.
In an exemplary embodiment, fig. 5 shows a flowchart of a method for determining a loss function according to an embodiment of the disclosure, which may be used as a specific implementation manner of step S230. Referring to fig. 5, the embodiment shown in the figure includes step S510 and step S520.
In step S510, the probability P_i in the i-th group of training samples is mapped, based on a preset mapping function, to the target distance D_i corresponding to the i-th group of training samples.
The mapping function is a monotonic function that implements a one-to-one mapping between the value range of the projection distance S_i and the value range of the probability P_i. Illustratively, P_i denotes the probability of the triple (h_i, r_i, t_i), and D_i denotes the target distance of the triple (h_i, r_i, t_i). The mapping function used in this embodiment is the exponential function P_i = Φ(D_i) = e^(−D_i); the conversion formula between P_i and D_i is then D_i = −ln(P_i).
Thus, from the probability P_i of the i-th group of training samples, the target distance D_i corresponding to that group of samples can be determined.
Illustratively, from the first probability P_i1 of the i1-th group of training positive samples, the target distance D_i1 corresponding to that group of positive samples is determined as Formula 3 below:

D_i1 = −ln(P_i1)    (Formula 3)

where i1 is an integer greater than or equal to 1 and less than or equal to N1, and N1 is the number of groups of training positive samples.
Illustratively, from the preset probability value ε_n corresponding to the training negative samples, the second target distance D' corresponding to each group of negative samples is determined as Formula 4 below:

D' = −ln(ε_n)    (Formula 4)
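A sketch of the mapping function and its inverse under the exponential form Φ(D) = e^(−D) reconstructed above (merely an example):

import math

def prob_to_target_distance(p):
    # Inverse mapping: D = -ln(P). Monotonic and one-to-one on (0, 1].
    return -math.log(p)

def target_distance_to_prob(d):
    # Mapping function Φ: P = exp(-D).
    return math.exp(-d)

D_i1 = prob_to_target_distance(0.8)      # positive sample with P_i1 = 0.8 (Formula 3)
D_prime = prob_to_target_distance(1e-3)  # negative sample with ε_n = 1e-3 (Formula 4)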
in step S520, according to the target distance DiAnd the projection distance SiDetermines a loss function of the embedded model.
Illustratively, for the training positive samples: the first loss function of the embedding model is determined from the first target distance D_i1 and the first projection distance S_i1, as shown in Formula 5 below:

f_loss1 = | S_i1 − D_i1 |    (Formula 5)

Formula 5 represents the difference between the first projection distance S_i1 obtained for the first triple in a training positive sample and the first target distance D_i1. For a group of training positive samples, the training target of the first loss function is to bring the projection distance as close as possible to the first target distance, i.e., to minimize Formula 5.
Illustratively, for the training negative samples: the second loss function of the embedding model is determined from the second target distance D' and the second projection distance S_i2, as shown in Formula 6 below:

f_loss2 = [ D' − S_i2 ]_+    (Formula 6)

where [x]_+ = max(x, 0). Formula 6 represents the shortfall of the second projection distance S_i2, obtained for the second triple in a training negative sample, relative to the second target distance D'. For a group of training negative samples, the training target of the second loss function is to push the projection distance at least as far as the second target distance, i.e., to minimize Formula 6.
In an exemplary embodiment, the loss function of the embedding model determined from the first loss function and the second loss function can be expressed as Formula 7:

L = Σ_{Δ} | S_i1 − D_i1 | + γ' · Σ_{Δ'} [ D' − S_i2 ]_+    (Formula 7)

where Δ denotes the set of first triples in the training positive samples, Δ' denotes the set of second triples in the training negative samples, and the parameter γ' is used to adjust the weight of the negative-sample loss.
In an exemplary embodiment, the loss function of the embedding model may also be determined from the first loss function, the second loss function, and exponential-function terms in the first projection distance and the second projection distance, as in Formula 8, where γ_r, α_r, and β_r are the weights of the sample losses, whose specific values can be determined according to the actual situation; the subscript r indicates that the weight values may differ according to the relation in different triples.
The significance of the exponential-function terms in the first and second projection distances in Formula 8 is that the first target distance and the second target distance are mapped back through the exponential mapping function and combined with the distance loss, so as to improve the prediction accuracy of the embedding model trained on this loss function.
Illustratively, the model parameters of the embedding model are obtained by finding the minimum of Formula 7 or Formula 8, thereby completing the training process of the model.
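An end-to-end training sketch in PyTorch under the same assumptions (TransE-style projection distance, targets D = −ln(P)); all names and hyperparameters are illustrative, not the exact implementation of this technical solution:

import torch

def train_prtrans(pos_triples, pos_probs, neg_triples, n_entities, n_relations,
                  dim=50, epochs=100, lr=0.01, eps_n=1e-3, gamma_prime=1.0):
    # Embedding tables for entities and relations (the model parameters).
    E = torch.nn.Embedding(n_entities, dim)
    R = torch.nn.Embedding(n_relations, dim)
    opt = torch.optim.SGD(list(E.parameters()) + list(R.parameters()), lr=lr)

    def s(triples):
        # Projection distances S = ||h + r - t|| for (head, relation, tail) index triples.
        h, r, t = (torch.as_tensor(col) for col in zip(*triples))
        return (E(h) + R(r) - E(t)).norm(dim=1)

    d_pos = -torch.log(torch.as_tensor(pos_probs))  # target distances D_i = -ln(P_i)
    d_neg = -torch.log(torch.tensor(eps_n))         # second target distance D' = -ln(ε_n)

    for _ in range(epochs):
        opt.zero_grad()
        loss = (s(pos_triples) - d_pos).abs().sum() \
             + gamma_prime * torch.clamp(d_neg - s(neg_triples), min=0).sum()  # Formula 7
        loss.backward()
        opt.step()
    return E, R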
It should be noted that, in the present technical solution, the loss function of the embedding model may also be another mathematical expression of the first projection distance, the second projection distance, the first target distance, and the second target distance.
In the training methods for the embedding model provided in the above embodiments, the probability corresponding to a triple is introduced in the training process of the embedding model, so that when the trained embedding model performs embedding processing on a target triple, the probability that the knowledge expressed by the target triple holds can be taken into account. The information content of a triple after embedding processing therefore includes probability information, which helps improve the expression accuracy of medical knowledge points.
In an exemplary embodiment, fig. 6 shows a flow diagram of a triple embedding method according to an embodiment of the present disclosure. Referring to fig. 6, the embodiment shown in this figure comprises:
step S610, acquiring a target triple in the knowledge graph; and the number of the first and second groups,
step S620, inputting the target probability into an embedding model of the triple, obtaining a vector representation of the target triple according to the output of the embedding model of the triple, and predicting the embedding label of the target triple according to the vector representation.
Wherein the embedding model of the triplet is trained according to the method described in the embodiment shown in fig. 2 to 5.
By introducing probability into the knowledge graph embedding (Embedding) algorithm, probability information of the triples can be learned effectively, yielding a better application effect.
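A usage sketch of the embedding method of fig. 6, building on the trained tables E and R returned by the previous sketch: the target triple is embedded, its projection distance is mapped back to a probability through P = e^(−S), and a label is predicted by thresholding (the threshold value is an illustrative assumption):

import torch

def embed_and_label(E, R, target_triple, threshold=0.5):
    # Steps S610-S620 (sketch): vector representation and predicted embedding label.
    h, r, t = target_triple  # entity/relation indices of the target triple
    with torch.no_grad():
        vec = (E.weight[h], R.weight[r], E.weight[t])  # vector representation
        s = (vec[0] + vec[1] - vec[2]).norm()          # projection distance S
        prob = torch.exp(-s).item()                    # P = exp(-S)
    return vec, ("holds" if prob >= threshold else "does not hold"), prob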
Taking the TransH model as an example, the algorithm formed by applying the training method of the embedding model provided by this technical solution to the TransH model can be denoted the PrTransH model (Probability TransH, a TransH model containing probability information). On a certain medical knowledge graph, under the same training-set and test-set conditions, the performance of the two algorithms was examined on a tail-entity prediction task; the results of one experiment are shown in Table 1 below:
Table 1. Tail-entity prediction performance of the TransH and PrTransH models.
Referring to Table 1: Hits@10 is the proportion of queries for which the correct answer appears in the top ten predictions; the closer to 1, the better. Mean Rank is the average rank of all correct answers in the prediction sequence; the smaller, the better. NDCG@10 describes whether the correct answer is ranked near the top of the first ten predictions; the closer to 1, the better. The rows of the table (disease_to_medicine, disease_to_symptom, disease_to_operation, disease_to_laboratory, and the like) identify the five relations of disease-related drugs, symptoms, operations, examinations, and laboratory tests, respectively.
As the table shows, the PrTransH model is significantly better than the TransH model overall, and the prediction accuracy after introducing probability is greatly improved.
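For reference, sketches of the three metrics for a tail-entity prediction task (merely an example; ranks holds the 1-based rank of the correct tail for each test query, and the NDCG@10 form assumes a single correct answer per query):

import math

def hits_at_10(ranks):
    return sum(r <= 10 for r in ranks) / len(ranks)  # closer to 1 is better

def mean_rank(ranks):
    return sum(ranks) / len(ranks)                   # smaller is better

def ndcg_at_10(ranks):
    # With one correct answer per query: DCG = 1/log2(rank+1) if rank <= 10, ideal DCG = 1.
    return sum(1 / math.log2(r + 1) if r <= 10 else 0.0 for r in ranks) / len(ranks)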
It should be noted that the training method for the embedding model shown in figs. 2 to 5 can be applied not only to the TransH model but also to the TransE, TransR, TransD, and TranSparse models; applying this model training method yields optimized models that may be denoted: the PrTransE, PrTransR, PrTransD, and PrTranSparse models.
Those skilled in the art will appreciate that all or part of the steps for implementing the above embodiments are implemented as computer programs executed by a processor (a CPU or a GPU). For example, the model training of the embedding model may be performed by the GPU, and the embedding processing based on the trained embedding model may be performed by the CPU or the GPU. When executed by the processor, the program performs the functions defined by the above methods provided by the present disclosure. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk, or the like.
Furthermore, it should be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following describes an embodiment of the training apparatus for the embedding model of triples of the present disclosure, which may be used to perform the training method for the embedding model of triples of the embodiments shown in figs. 2 to 5 of the present disclosure.
Fig. 7 shows a schematic structural diagram of a training apparatus for the embedding model of triples according to an embodiment of the present disclosure. Referring to fig. 7, the training apparatus 700 provided by this embodiment includes: a sample acquisition module 701, a projection distance determination module 702, and a model training module 703.
The sample acquisition module 701 is configured to: obtain N groups of training samples, wherein each group of training samples comprises: a triple and the probability that the knowledge expressed by the triple holds, wherein N is an integer greater than 1;
The projection distance determination module 702 is configured to: input the triple in the i-th group of training samples into an embedding model, and obtain a projection distance S_i according to the output of the embedding model, wherein i is a positive integer less than or equal to N;
The model training module 703 is configured to: determine a loss function of the embedding model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedding model based on the loss function.
In an embodiment of the present disclosure, based on the foregoing scheme, the model training module 703 is specifically configured to:
map the probability P_i in the i-th group of training samples to the target distance D_i corresponding to the i-th group of training samples based on a preset mapping function;
determine the loss function of the embedding model according to the target distance D_i and the projection distance S_i;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the projection distance S_i and the value range of the probability P_i.
In one embodiment of the present disclosure, based on the foregoing:
the sample acquiring module 701 is specifically configured to:
acquiring N1 groups of training positive samples and N2 groups of training negative samples; wherein each set of training positive samples comprises: a first triplet and a first probability of knowledge being held by the triplet, each set of training negative examples comprising: a second triple and a preset probability value;
the projection distance determining module 702 is specifically configured to:
inputting a first triple in the ith 1 th training positive sample into an embedded model, and obtaining a first projection distance S according to the output of the embedded modeli1Wherein i1 is a positive integer less than or equal to N1; inputting the second triple in the i2 th training negative sample into the embedded model, and obtaining a second projection distance S according to the output of the embedded modeli2Wherein i2 is a positive integer less than or equal to N2;
the model training module 703 includes: a first loss function determination unit, a second loss function determination unit, and a model loss function determination unit.
Wherein the first loss function determination unit is configured to: determine a first loss function according to the first probability P_i1 of the i1-th group of training positive samples and the first projection distance S_i1; the second loss function determination unit is configured to: determine a second loss function according to the preset probability value and the second projection distance S_i2; and the model loss function determination unit is configured to: determine the loss function of the embedding model from the first loss function and the second loss function.
In an embodiment of the disclosure, based on the foregoing scheme, the first loss function determination unit is specifically configured to:
map the first probability P_i1 in the i1-th group of training positive samples to the first target distance D_i1 corresponding to the i1-th group of training positive samples based on the preset mapping function;
determine the first loss function according to the first target distance D_i1 and the first projection distance S_i1;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the first projection distance S_i1 and the value range of the first probability P_i1.
In an embodiment of the disclosure, based on the foregoing scheme, the second loss function determination unit is specifically configured to:
map the preset probability value to the second target distance D' based on the preset mapping function;
determine the second loss function according to the second target distance D' and the second projection distance S_i2.
In an embodiment of the present disclosure, based on the foregoing scheme, the projection distance determination module 702 is further specifically configured to:
input the triple in the i-th group of training samples into the embedding model and obtain, through the embedding processing of the embedding model: the head vector h_i corresponding to the head entity in the triple, the attribute vector r_i corresponding to the attribute in the triple, and the tail vector t_i corresponding to the tail entity in the triple;
obtain the projection distance S_i according to the head vector h_i, the attribute vector r_i, and the tail vector t_i.
For details not disclosed in this embodiment of the training apparatus for the embedding model of triples, please refer to the above embodiments of the training method for the embedding model of triples of the present disclosure.
The following describes an embodiment of the apparatus for embedding triples of the present disclosure, which may be used to perform the triple embedding method of the embodiment shown in fig. 6 of the present disclosure.
Fig. 8 shows a schematic structural diagram of an apparatus for embedding triples according to an embodiment of the present disclosure, and referring to fig. 8, an apparatus 800 for embedding triples provided by this embodiment includes: a target triple obtaining module 801 and an embedding module 802.
The target triple obtaining module 801 is configured to: acquiring a target triple in the knowledge graph;
the embedding module 802 is configured to: inputting the target triple into the embedding model of the triple, obtaining a vector representation of the target triple according to the output of the embedding model of the triple, and predicting the embedding label of the target triple according to the vector representation.
Wherein the embedding model of the triplet is trained according to the method embodiment described in any one of fig. 2 to 5.
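Purely as a usage illustration (the function and variable names below are hypothetical, not an API defined by the disclosure), a trained lookup table of vectors might be applied to a target triple as follows:

```python
def predict_embedding_label(emb, target_triple, threshold: float = 1.0):
    """Hypothetical inference helper: `emb` maps entity and attribute
    names to trained vectors; `target_triple` is (head, attribute, tail).
    """
    head, attr, tail = target_triple
    h, r, t = emb[head], emb[attr], emb[tail]  # vector representation
    s = projection_distance(h, r, t)           # see the sketch above
    # A small projection distance suggests the knowledge holds; turning
    # that into a binary label via a fixed threshold is an assumed rule.
    return "holds" if s < threshold else "does_not_hold"
```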
Since the functional modules of the apparatus for embedding triples in the exemplary embodiment of the present disclosure correspond to the steps of the exemplary embodiment of the method for embedding triples described above, for details that are not disclosed in the apparatus embodiment of the present disclosure, please refer to the method embodiment described above.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer storage medium capable of implementing the above method, on which a program product capable of implementing the above-described method of this specification is stored. In some possible embodiments, various aspects of the present disclosure may also be implemented in the form of a program product including program code for causing a terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "exemplary methods" section above of this specification when the program product is run on the terminal device.
Referring to fig. 9, a program product 900 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product described above may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit", "module", or "system".
An electronic device 1000 according to this embodiment of the disclosure is described below with reference to fig. 10. The electronic device 1000 shown in fig. 10 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 10, the electronic device 1000 is embodied in the form of a general purpose computing device. The components of the electronic device 1000 may include, but are not limited to: the at least one processing unit 1010, the at least one memory unit 1020, and a bus 1030 that couples various system components including the memory unit 1020 and the processing unit 1010.
Wherein the storage unit stores program code executable by the processing unit 1010, so that the processing unit 1010 executes the steps according to various exemplary embodiments of the present disclosure described in the "exemplary method" section above in this specification. For example, the processing unit 1010 may perform the following steps as shown in fig. 2: step S210, obtaining N groups of training samples, wherein each group of training samples includes: the triples and the probability that the knowledge expressed by the triples is established, wherein N is an integer greater than 1; step S220, inputting the triples in the i-th group of training samples into an embedded model, and obtaining a projection distance S_i according to the output of the embedded model, wherein i is a positive integer less than or equal to N; and step S230, determining a loss function of the embedded model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedded model based on the loss function.
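A self-contained sketch of one optimization step covering steps S210 to S230 follows. The translation-style distance, the negative-log mapping, the squared-error loss, and the plain SGD update are all illustrative assumptions, and `emb` is a hypothetical lookup table from entity and attribute names to vectors:

```python
import numpy as np

def train_step(emb, triple, p, lr=0.01, eps=1e-9):
    """One SGD step on the loss L = (S - D)^2 for a single training sample.

    `triple` is (head, attribute, tail); `p` is the probability that the
    knowledge expressed by the triple is established (step S210).
    """
    head, attr, tail = triple
    h, r, t = emb[head], emb[attr], emb[tail]
    diff = h + r - t
    s = np.linalg.norm(diff) + eps    # projection distance S_i (step S220)
    d = -np.log(p)                    # assumed mapping from P_i to D_i
    grad_s = 2.0 * (s - d)            # dL/dS for the squared-error loss
    grad_vec = grad_s * diff / s      # chain rule through the L2 norm
    emb[head] -= lr * grad_vec        # step S230: update the embeddings
    emb[attr] -= lr * grad_vec
    emb[tail] += lr * grad_vec
    return (s - d) ** 2               # loss value, e.g. for monitoring
```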
For example, the processing unit 1010 may further perform a training method of an embedded model of a triplet as shown in any one of fig. 2 to 6.
The storage unit 1020 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)10201 and/or a cache memory unit 10202, and may further include a read-only memory unit (ROM) 10203.
The memory unit 1020 may also include a program/utility 10204 having a set (at least one) of program modules 10205, such program modules 10205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 1030 may be any one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, and a local bus using any of a variety of bus architectures.
The electronic device 1000 may also communicate with one or more external devices 1100 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1000, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 1000 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 1050. Also, the electronic device 1000 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 1060. As shown, the network adapter 1060 communicates with the other modules of the electronic device 1000 over the bus 1030. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (11)

1. A method of training an embedded model of triples, the method comprising:
obtaining N groups of training samples, wherein each group of training samples comprises: the triples and the probability that the knowledge expressed by the triples is established, wherein N is an integer greater than 1;
inputting the triples in the i-th group of training samples into an embedded model, and obtaining a projection distance S_i according to the output of the embedded model, wherein i is a positive integer less than or equal to N;
determining a loss function of the embedded model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedded model based on the loss function.
2. The method of training an embedded model of triples according to claim 1, wherein the determining of the loss function of the embedded model according to the probability P_i of the i-th group of training samples and the projection distance S_i comprises:
mapping, based on a preset mapping function, the probability P_i in the i-th group of training samples to a target distance D_i corresponding to the i-th group of training samples;
determining the loss function of the embedded model according to the target distance D_i and the projection distance S_i;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the projection distance S_i and the value range of the probability P_i.
3. The method of training an embedded model of triples according to claim 1,
wherein the obtaining of the N groups of training samples comprises:
obtaining N1 groups of training positive samples and N2 groups of training negative samples, wherein each group of training positive samples comprises: a first triple and a first probability that the knowledge expressed by the first triple is established; and each group of training negative samples comprises: a second triple and a preset probability value;
wherein the inputting of the triples in the i-th group of training samples into the embedded model and the obtaining of the projection distance S_i according to the output of the embedded model comprises:
inputting the first triple in the i1-th group of training positive samples into the embedded model, and obtaining a first projection distance S_i1 according to the output of the embedded model, wherein i1 is a positive integer less than or equal to N1; inputting the second triple in the i2-th group of training negative samples into the embedded model, and obtaining a second projection distance S_i2 according to the output of the embedded model, wherein i2 is a positive integer less than or equal to N2;
wherein the determining of the loss function of the embedded model according to the probability P_i of the i-th group of training samples and the projection distance S_i comprises:
determining a first loss function according to the first probability P_i1 of the i1-th group of training positive samples and the first projection distance S_i1; determining a second loss function according to the preset probability value and the second projection distance S_i2; and determining the loss function of the embedded model from the first loss function and the second loss function.
4. The method of training an embedded model of triples according to claim 3, wherein the determining of the first loss function according to the first probability P_i1 of the i1-th group of training positive samples and the first projection distance S_i1 comprises:
mapping, based on a preset mapping function, the first probability P_i1 in the i1-th group of training positive samples to a first target distance D_i1 corresponding to the i1-th group of training positive samples;
determining the first loss function according to the first target distance D_i1 and the first projection distance S_i1;
wherein the mapping function is a monotonic function that implements a one-to-one mapping between the value range of the first projection distance S_i1 and the value range of the first probability P_i1.
5. The method of training an embedded model of triples according to claim 4, wherein the determining of the second loss function according to the preset probability value and the second projection distance S_i2 comprises:
mapping the preset probability value to a second target distance D' based on the preset mapping function;
determining the second loss function according to the second target distance D' and the second projection distance S_i2.
6. The method of training an embedded model of triples according to any one of claims 1 to 5, wherein the inputting of the triples in the i-th group of training samples into the embedded model and the obtaining of the projection distance S_i according to the output of the embedded model comprises:
inputting the triples in the i-th group of training samples into the embedded model, and obtaining, through the embedding process of the embedded model: a head vector h_i corresponding to the head entity in the triple, an attribute vector r_i corresponding to the attribute in the triple, and a tail vector t_i corresponding to the tail entity in the triple;
obtaining the projection distance S_i according to the head vector h_i, the attribute vector r_i, and the tail vector t_i.
7. A method for embedding triples, the method comprising:
acquiring a target triple in the knowledge graph;
inputting the target triple into an embedding model of the triple, obtaining a vector representation of the target triple according to the output of the embedding model of the triple, and predicting an embedding label of the target triple according to the vector representation;
wherein the embedded model of triples is trained according to the method of any one of claims 1 to 6.
8. An apparatus for training an embedded model of triples, the apparatus comprising:
a sample acquisition module to: obtaining N groups of training samples, wherein each group of training samples comprises: the triples and the probability that the knowledge expressed by the triples is established, wherein N is an integer greater than 1;
a projection distance determination module to: inputting the triples in the i-th group of training samples into an embedded model, and obtaining a projection distance S_i according to the output of the embedded model, wherein i is a positive integer less than or equal to N;
a model training module to: determining a loss function of the embedded model according to the probability P_i of the i-th group of training samples and the projection distance S_i, so as to train the embedded model based on the loss function.
9. An apparatus for embedding triples, the apparatus comprising:
a target triplet acquisition module to: acquiring a target triple in the knowledge graph;
an embedding module to: inputting the target triple into the embedding model of the triple, obtaining a vector representation of the target triple according to the output of the embedding model of the triple, and predicting an embedding label of the target triple according to the vector representation;
wherein the embedded model of triples is trained according to the method of any one of claims 1 to 6.
10. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of training an embedded model of triples according to any one of claims 1 to 6,
and implements the method of embedding triples according to claim 7.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of training an embedded model of triples according to any one of claims 1 to 6,
and to implement the method of embedding triples according to claim 7.
CN201910875584.XA 2019-09-17 2019-09-17 Model training method, triplet embedding method, apparatus, medium, and device Active CN110598006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910875584.XA CN110598006B (en) 2019-09-17 2019-09-17 Model training method, triplet embedding method, apparatus, medium, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910875584.XA CN110598006B (en) 2019-09-17 2019-09-17 Model training method, triplet embedding method, apparatus, medium, and device

Publications (2)

Publication Number Publication Date
CN110598006A 2019-12-20
CN110598006B CN110598006B (en) 2022-04-01

Family

ID=68860040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910875584.XA Active CN110598006B (en) 2019-09-17 2019-09-17 Model training method, triplet embedding method, apparatus, medium, and device

Country Status (1)

Country Link
CN (1) CN110598006B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255002A (en) * 2018-09-11 2019-01-22 浙江大学 A method of it is excavated using relation path and solves knowledge mapping alignment task
CN109815345A (en) * 2019-02-25 2019-05-28 南京大学 A kind of knowledge mapping embedding grammar based on path
CN109840283A (en) * 2019-03-01 2019-06-04 东北大学 A kind of local adaptive knowledge mapping optimization method based on transitive relation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LINFENG LI ET AL.: "PrTransH: Embedding Probabilistic Medical Knowledge from Real World EMR Data", 《ARXIV》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259085A (en) * 2019-12-30 2020-06-09 福州大学 Relation prediction method of time perception translation model based on relation hyperplane
CN111259085B (en) * 2019-12-30 2022-08-16 福州大学 Relation prediction method of time perception translation model based on relation hyperplane
CN110795926A (en) * 2020-01-03 2020-02-14 四川大学 Judgment document similarity judgment method and system based on legal knowledge graph
CN110795926B (en) * 2020-01-03 2020-04-07 四川大学 Judgment document similarity judgment method and system based on legal knowledge graph
CN111611344A (en) * 2020-05-06 2020-09-01 北京智通云联科技有限公司 Complex attribute query method, system and equipment based on dictionary and knowledge graph
CN111611344B (en) * 2020-05-06 2023-06-13 北京智通云联科技有限公司 Complex attribute query method, system and equipment based on dictionary and knowledge graph
CN111696636A (en) * 2020-05-15 2020-09-22 平安科技(深圳)有限公司 Data processing method and device based on deep neural network
CN111696636B (en) * 2020-05-15 2023-09-22 平安科技(深圳)有限公司 Data processing method and device based on deep neural network
CN113032580A (en) * 2021-03-29 2021-06-25 浙江星汉信息技术股份有限公司 Associated file recommendation method and system and electronic equipment
CN113609311A (en) * 2021-09-30 2021-11-05 航天宏康智能科技(北京)有限公司 Method and device for recommending items
CN115858886A (en) * 2022-12-12 2023-03-28 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN115858886B (en) * 2022-12-12 2024-02-27 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN110598006B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CN110598006B (en) Model training method, triplet embedding method, apparatus, medium, and device
US20210335469A1 (en) Systems and Methods for Automatically Tagging Concepts to, and Generating Text Reports for, Medical Images Based On Machine Learning
JP6929971B2 (en) Neural network-based translation of natural language queries into database queries
CN113535984B (en) Knowledge graph relation prediction method and device based on attention mechanism
Miller The medical AI insurgency: what physicians must know about data to practice with intelligent machines
US20190164084A1 (en) Method of and system for generating prediction quality parameter for a prediction model executed in a machine learning algorithm
GB2571825A (en) Semantic class localization digital environment
US11183308B2 (en) Estimating personalized drug responses from real world evidence
US10311058B1 (en) Techniques for processing neural queries
CN112287089B (en) Classification model training and automatic question-answering method and device for automatic question-answering system
US11874798B2 (en) Smart dataset collection system
WO2022001724A1 (en) Data processing method and device
US20200012930A1 (en) Techniques for knowledge neuron enhancements
US11645500B2 (en) Method and system for enhancing training data and improving performance for neural network models
US11599749B1 (en) Method of and system for explainable knowledge-based visual question answering
CN114078597A (en) Decision trees with support from text for healthcare applications
Zang et al. Scehr: Supervised contrastive learning for clinical risk prediction using electronic health records
CN112463973A (en) Construction method, device and medium of medical knowledge graph and electronic equipment
US11797281B2 (en) Multi-language source code search engine
WO2024114659A1 (en) Summary generation method and related device
WO2023116572A1 (en) Word or sentence generation method and related device
US20220215287A1 (en) Self-supervised pretraining through text alignment
US10395169B1 (en) Self learning neural knowledge artifactory for autonomous decision making
US20230206068A1 (en) Methodology to automatically incorporate feedback to enable self learning in neural learning artifactories
CN117216194B (en) Knowledge question-answering method and device, equipment and medium in literature and gambling field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant