CN116757277A - Relation prediction model training method, application method, device and equipment

Relation prediction model training method, application method, device and equipment

Info

Publication number
CN116757277A
CN116757277A (application number CN202310777378.1A)
Authority
CN
China
Prior art keywords
target
information
sample
convolution
scoring function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310777378.1A
Other languages
Chinese (zh)
Inventor
张喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310777378.1A
Publication of CN116757277A
Legal status: Pending

Abstract

The present disclosure relates to the field of data processing technologies, is applicable to the financial field, for example to application scenarios such as banking, and in particular to a relation prediction model training method, an application method, an apparatus and a device. The method includes the following steps: constructing a plurality of sample triples; creating a plurality of pieces of negative sample information and a weight corresponding to each piece of negative sample information; processing the sample triples and the negative sample information with a convolution scoring function to obtain corresponding scoring function values; iteratively optimizing, based on the scoring function values and the weights, the vectors to be trained that correspond to the entities and to the connection relations, until target vectors satisfying a preset condition are obtained; and processing the target vectors with the convolution scoring function to obtain threshold data, and forming a trained relation prediction model from the convolution scoring function and the target vectors based on the threshold data. By controlling the dimension of the circular convolution, the method and the device control model complexity, thereby reducing the demand on computing resources and improving computing efficiency.

Description

Relation prediction model training method, application method, device and equipment
Technical Field
The embodiments of this specification relate to the technical field of data processing, are applicable to application scenarios such as banking in the financial field, and in particular relate to a relation prediction model training method, an application method, an apparatus and a device.
Background
In the network topology of a knowledge graph, each node represents an entity with a textual description and can refer to people, things and objects that exist in reality, such as a teacher or a school, as well as abstract concepts such as a language or a science; each edge represents a relationship, including symmetric relationships such as "is a classmate of …" and antisymmetric relationships such as "is younger than …".
The unit topology of a knowledge graph can be represented by a triple (head entity, relation, tail entity), abbreviated (h, r, t). A triple indicates that the relation r holds between a head entity and a tail entity; it is the minimum unit for representing relationships among entities and is also called a fact. A complete knowledge graph consists of a large number of facts and, from a set point of view, can be seen as the set of all triples. Each node in the knowledge graph carries its own description information, but this description information does not belong to the topology structure.
However, a knowledge graph by itself only models a knowledge structure and stores knowledge; it cannot be put into large-scale practical applications that require a computer to automatically identify or predict relationships. Current knowledge graph embedding models fall into two main types: translation distance models and semantic matching models. A translation distance model is based on vector-space norms and measures the probability that a fact holds by the distance defined between entities; it is intuitive, its scoring function is simple to define, and its computation is efficient. A semantic matching model is based on similarity and measures the likelihood that a triple fact exists by matching the latent semantics of entities and relations in the embedding space; such models are further divided into bilinear models and neural network models. For the same model, the more parameters it has, the larger the amount of computation and the more relationships it can fit, but the higher its demand on computing resources and the lower its computational efficiency.
A relation prediction model training method and an application method are therefore needed to solve the problems of the existing knowledge-graph-embedding-based relation prediction methods, namely their high demand on computing resources and low computational efficiency.
Disclosure of Invention
In order to solve the problems in the prior art, the embodiments of this specification provide a relation prediction model training method, an application method, an apparatus and a device, which control model complexity by controlling the dimension of the circular convolution, thereby reducing the demand on computing resources and improving computing efficiency. In addition, many-to-many relationships in a knowledge graph can be modeled, widening the application scope of knowledge graph embedding models.
In order to solve the technical problems, the specific technical scheme in the specification is as follows:
in one aspect, an embodiment of the present disclosure provides a method for training a relational prediction model, including:
constructing a plurality of pieces of sample triplet information based on the entities in a sample knowledge graph and the connection relations between the entities;
creating a plurality of negative sample information and weights corresponding to each of the negative sample information for the plurality of sample triplet information;
processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values;
based on the scoring function value and the weight, carrying out iterative optimization on vectors to be trained corresponding to the entity and the connection relation respectively until target vectors meeting preset conditions are obtained;
And processing the target vectors corresponding to each sample triplet information and each negative sample information respectively by using the convolution scoring function to obtain threshold data, and forming a trained relation prediction model by using the convolution scoring function and the target vectors based on the threshold data.
Further, the creating a plurality of negative sample information and weights corresponding to each of the negative sample information for the plurality of sample triplet information includes:
selecting a plurality of target sample triplet information from the plurality of sample triplet information;
creating a plurality of corresponding negative sample information for each target sample triplet information;
and determining the weight corresponding to the negative sample information based on the association degree between the negative sample information and the target sample triplet information.
Further, for each of the target sample triples, creating a corresponding plurality of negative sample information includes:
determining that the target sample triplet information comprises two target entities and a target connection relationship between the two target entities;
for any one of the two target entities, determining candidate entities which do not have the target connection relation with the other target entity from the entities;
And replacing the target entity by the candidate entity so as to form the negative sample information with the other target entity of the two target entities and the target connection relation.
Further, the scoring function values include a first sub-scoring function value and a second sub-scoring function value;
processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values comprises:
determining a vector to be trained for each entity and each connection relation;
for each sample triplet information, processing the vector to be trained corresponding to the sample triplet information by using the convolution scoring function to obtain the first sub-scoring function value;
and processing the vector to be trained corresponding to the negative sample information by using a convolution scoring function to obtain the second sub-scoring function value.
Further, performing iterative optimization on the vectors to be trained corresponding to the entity and the connection relation respectively based on the scoring function value and the weight until obtaining the target vector meeting the preset condition comprises:
Processing the first sub-score function value, the second sub-score function value and the weight by using a loss function to obtain a loss function value;
optimizing the vectors to be trained corresponding to the entity and the connection relation respectively based on an optimization algorithm and the loss function value to obtain optimized vectors;
and iterating by using the optimized vector to replace the vector to be trained until the preset condition is met, so as to obtain the target vector.
Further, the convolution scoring function includes:
wherein f_r(h, t) denotes the scoring function value corresponding to (h, r, t); h and t denote the vectors to be trained or the target vectors corresponding to the two entities included in the sample triplet information, respectively; r denotes the vector to be trained or the target vector corresponding to the connection relationship included in the sample triplet information; ⊛ denotes the circular convolution operation; and F denotes the L1 norm or the L2 norm.
Further, the loss function includes:
wherein L denotes the loss function, log denotes the logarithmic loss function, σ denotes the Logistic function, γ denotes the training stringency for the sample triplet information and the negative sample information, (h'_i, r, t'_i) denotes the vectors to be trained corresponding to the i-th piece of negative sample information corresponding to (h, r, t), (h, r, t) denotes the vectors to be trained corresponding to the sample triplet information, f_r(h, t) denotes the first sub-scoring function value corresponding to (h, r, t), f_r(h'_i, t'_i) denotes the second sub-scoring function value corresponding to (h'_i, r, t'_i), p(h'_i, r, t'_i) denotes the weight corresponding to (h'_i, r, t'_i), and n denotes the total number of pieces of negative sample information corresponding to the sample triplet information.
Based on the same inventive concept, the embodiment of the present disclosure further provides a method for applying a relational prediction model, including:
for a received triplet to be predicted, determining, from the target vectors, the vectors to be processed corresponding to the entities in the triplet to be predicted and to the connection relationship between the entities;
processing the vector to be processed by using a convolution scoring function to obtain a target value;
judging whether the target value is smaller than threshold data or not;
determining that the connection relationship is established when the target value is determined to be smaller than the threshold value;
wherein the target vector and the threshold data are determined based on a relational prediction model training method as described.
On the other hand, the embodiment of the specification also provides a relational prediction model training device, which comprises:
the sample triplet information construction unit is used for constructing a plurality of sample triplet information based on the entity in the sample knowledge graph and the connection relation between the entities;
a creating unit configured to create, for the plurality of sample triplet information, a plurality of negative sample information and weights corresponding to each of the negative sample information;
the convolution score calculation unit is used for processing the plurality of sample triplet information and the plurality of negative sample information by utilizing a convolution score function to obtain corresponding score function values;
the optimizing unit is used for carrying out iterative optimization on the vectors to be trained corresponding to the entity and the connection relation respectively on the basis of the loss function value obtained by the scoring function value and the weight until a target vector meeting a preset condition is obtained;
and the relation prediction model determining unit is used for processing the target vectors corresponding to each sample triplet information and each negative sample information respectively by using the convolution scoring function to obtain threshold data, and forming a trained relation prediction model by using the convolution scoring function and the target vectors based on the threshold data.
Based on the same inventive concept, the embodiments of the present disclosure further provide a relational prediction model application device, including:
the device comprises a to-be-processed vector determining unit, a to-be-processed vector determining unit and a processing unit, wherein the to-be-processed vector determining unit is used for determining to-be-processed vectors corresponding to the entity in the to-be-predicted triples and the connection relation between the entity from the target vectors;
the convolution score calculation unit is used for processing the vector to be processed by utilizing a convolution score function to obtain a target value;
a judging unit for judging whether the target value is smaller than threshold data;
a connection relation determining unit configured to determine that the connection relation is established in a case where it is determined that the target value is smaller than the threshold value,
wherein the target vector and the threshold data are determined based on the relationship prediction model training device.
In another aspect, embodiments of the present disclosure further provide a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the method described above when executing the computer program.
In another aspect, embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer instructions that, when executed by a processor, perform the above-described method.
In another aspect, the embodiments of this specification also provide a computer program product comprising a computer program/instructions which, when executed by a processor, implement the above method.
According to the embodiments of this specification, sample triplet information is first constructed based on the entities in a sample knowledge graph and the connection relations between them; corresponding negative sample information and the weight of each piece of negative sample information are then created; the sample triplet information and the negative sample information are processed with a convolution scoring function to obtain the corresponding scoring function values; based on the scoring function values and the weights, the vectors to be trained that correspond to the entities and to the connection relations are iteratively optimized until they satisfy a preset condition and are taken as the target vectors; finally, the target vectors corresponding to each piece of sample triplet information and each piece of negative sample information are processed with the convolution scoring function to obtain threshold data, and a trained relation prediction model is formed from the convolution scoring function and the target vectors based on the threshold data. The convolution scoring function enables prediction of one-to-many, many-to-one and many-to-many relationships, improves prediction efficiency, and solves the problems of the existing knowledge-graph-embedding-based relation prediction methods, namely their high demand on computing resources and low computational efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation system of a method for training a relational prediction model according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for training a relational prediction model according to an embodiment of the present disclosure;
FIG. 3 illustrates steps for creating a plurality of negative sample information and weights corresponding to each negative sample information for a plurality of sample triples in an embodiment of the present disclosure;
FIG. 4 shows steps for creating a plurality of negative sample information for each of the target sample triples in the embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating steps for processing a plurality of sample triples and a plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values according to an embodiment of the present disclosure;
FIG. 6 shows the steps for iteratively optimizing a vector to be trained in an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a training device for a relational prediction model according to an embodiment of the present disclosure;
FIG. 8 is a flowchart of a method for applying a relational prediction model according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a relational prediction model application device according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
[ reference numerals description ]
101. A terminal;
102. a server;
701. a sample triplet information construction unit;
702. a creation unit;
703. a convolution score calculation unit;
704. an optimizing unit;
705. a relationship prediction model determination unit;
901. a vector determination unit to be processed;
902. a convolution score calculation unit;
903. a judging unit;
904. a connection relation determination unit;
1002. a computer device;
1004. a processing device;
1006. a storage resource;
1008. a driving mechanism;
1010. an input/output module;
1012. an input device;
1014. an output device;
1016. a presentation device;
1018. a graphical user interface;
1020. a network interface;
1022. A communication link;
1024. a communication bus.
Detailed Description
The technical solutions of the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
It should be noted that the terms "first," "second," and the like in the description and the claims of the specification and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present description described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
Fig. 1 is a schematic diagram of an implementation system of a method for training a relational prediction model according to an embodiment of the present disclosure, including a terminal 101 and a server 102. The terminal 101 and the server 102 may communicate over a network, which may include a local area network (Local Area Network, abbreviated as LAN), a wide area network (Wide Area Network, abbreviated as WAN), the internet, or a combination thereof, and be connected to a website, user equipment (e.g., computing device), and a backend system.
The staff may send data to the server 102 through the terminal 101, and the server 102 may construct a knowledge graph from the data input by the terminal 101, and then train and store a relationship prediction model based on the entity and the connection relationship between the entities in the knowledge graph.
After the relationship prediction model is trained, the staff can also send the data of the relationship to be predicted to the server 102 through the terminal 101, the server 102 calculates the data of the relationship to be predicted input by the terminal 101 by using the stored relationship prediction model, so as to obtain the connection relationship between the data of the relationship to be predicted, and sends the connection relationship to the terminal 101, so that the terminal 101 displays the connection relationship to the staff.
The server 102 may be provided with a training relationship prediction model and codes for applying the relationship prediction model, and execute codes to train the relationship prediction model or calculate the connection relationship of the data of the relationship to be predicted.
In addition, it should be noted that, fig. 1 is only one application environment provided by the present disclosure, and in practical application, other application environments may also be included, which is not limited in this specification.
Specifically, the embodiments of this specification provide a relation prediction model training method which controls model complexity by controlling the dimension of the circular convolution, thereby reducing the demand on computing resources and improving computing efficiency. In addition, many-to-many relationships in a knowledge graph can be modeled, widening the application scope of knowledge graph embedding models. Fig. 2 is a schematic flow chart of a relation prediction model training method provided by an embodiment of this specification; the figure describes the process of training the relation prediction model. The order of the steps recited in the embodiments is only one of many possible orders of execution and does not represent the only order; when an actual system or apparatus product executes, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or the drawings. As shown in fig. 2, the method may be performed by the server 102 and may include:
Step 201: constructing a plurality of pieces of sample triplet information based on the entities in a sample knowledge graph and the connection relations between the entities;
step 202: creating a plurality of negative sample information and weights corresponding to each of the negative sample information for the plurality of sample triplet information;
step 203: processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values;
step 204: based on the scoring function value and the weight, carrying out iterative optimization on vectors to be trained corresponding to the entity and the connection relation respectively until target vectors meeting preset conditions are obtained;
step 205: and processing the target vectors corresponding to each sample triplet information and each negative sample information respectively by using the convolution scoring function to obtain threshold data, and forming a trained relation prediction model by using the convolution scoring function and the target vectors based on the threshold data.
According to the embodiments of this specification, sample triplet information is first constructed based on the entities in a sample knowledge graph and the connection relations between them; corresponding negative sample information and the weight of each piece of negative sample information are then created; the sample triplet information and the negative sample information are processed with a convolution scoring function to obtain the corresponding scoring function values; based on the scoring function values and the weights, the vectors to be trained that correspond to the entities and to the connection relations are iteratively optimized until they satisfy a preset condition and are taken as the target vectors; finally, the target vectors corresponding to each piece of sample triplet information and each piece of negative sample information are processed with the convolution scoring function to obtain threshold data, and a trained relation prediction model is formed from the convolution scoring function and the target vectors based on the threshold data. The convolution scoring function enables prediction of one-to-many, many-to-one and many-to-many relationships, improves prediction efficiency, and solves the problems of the existing knowledge-graph-embedding-based relation prediction methods, namely their high demand on computing resources and low computational efficiency.
In the embodiments of this specification, the objects on which the relation prediction model is trained are the entities and relations in the knowledge graph, the training result is the algebraic representation (i.e. a multidimensional vector) to which each entity and relation is mapped, and the trained relation prediction model is a set of key-value pairs mapping all entities and relationships in the knowledge graph to their corresponding algebraic representations.
In the embodiments of this specification, the data in the knowledge graph may be customer data of a bank. The bank relationship graph can be scanned in full to extract all triples (h, r, t) contained in the relationship graph, for example h = "customer 1", r = "card number", t = "card number 1". An entity set E and a relation set R are obtained respectively, and finally a big data set S = {(h, r, t) | h, t ∈ E and r ∈ R} is obtained.
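As an illustration only, the following sketch shows one way such a full scan could populate the sets E, R and S from an exported edge list; the input format and field names are assumptions, not part of the patent.

```python
# Minimal sketch (assumption): collect the triple set S and the entity/relation
# sets E and R from an edge list exported from the relationship graph.
def build_sets(edge_list):
    """edge_list: iterable of (head, relation, tail) tuples,
    e.g. ("customer 1", "card number", "card number 1")."""
    S, E, R = set(), set(), set()
    for h, r, t in edge_list:
        S.add((h, r, t))   # every observed fact
        E.update((h, t))   # head and tail entities
        R.add(r)           # relation labels
    return E, R, S
```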
The entities and relations in the data sets E, R and S are then normalized according to the bank's specific needs, keeping only the important features; this simplifies and reduces redundant information that may exist in the original knowledge graph and improves computational efficiency. For example, two entity names that both mean "father" (such as "dad" and "father") are unified as "father", reducing the amount of redundant data in the three data sets.
The sample triplet information in the embodiments of this specification is not numeric but symbolic, so it must be initialized to numeric form before training. In the embodiments of this specification, the entities and relations in E and R are initialized algebraically. The initialization method may optionally use uniform or Gaussian sampling, so that the data are distributed as uniformly as possible over the space and the likelihood of iterating to a locally optimal solution is increased. If a uniform distribution is used, the samples in each dimension typically follow a uniform sampling formula in which X denotes the sampled data, U denotes the uniformly distributed sampling algorithm, i is a user-defined integer parameter that flexibly controls the value range of the random initialization variables, ε denotes an expected value, and k denotes the embedding space dimension. The expected value ε reflects the stringency of training; it generally has to be obtained through multiple training runs so as to avoid, to some extent, over-fitting or under-fitting. Choosing the dimension k of the embedding space, i.e. mapping each entity and relation to a k-dimensional vector, determines the amount of information carried by an entity or relation, which grows exponentially with k; at the same time, the larger k is, the greater the complexity of the circular convolution and the larger the circulant matrix generated by the discrete circular convolution, so the value of k cannot be too large.
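The sampling formula itself is not reproduced above, so the sketch below assumes a common uniform initialisation whose bound is built from i, ε and the dimension k; the exact bound used in the patent may differ.

```python
import numpy as np

# Hedged sketch: uniform initialisation of entity and relation embeddings.
# The bound i * eps / sqrt(k) is an assumption standing in for the patent's
# (unreproduced) sampling formula; i and eps play the roles described in the text.
def init_embeddings(E, R, k, eps=6.0, i=1, rng=None):
    rng = rng or np.random.default_rng(0)
    bound = i * eps / np.sqrt(k)   # value range controlled by i, eps and k
    entity_vec = {e: rng.uniform(-bound, bound, size=k) for e in E}
    relation_vec = {r: rng.uniform(-bound, bound, size=k) for r in R}
    return entity_vec, relation_vec
```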
Then creating a plurality of negative sample information and a weight corresponding to each negative sample information.
Specifically, as shown in fig. 3, the creating, for the plurality of sample triples of information, a plurality of negative sample information and weights corresponding to each of the negative sample information includes:
step 301: selecting a plurality of target sample triplet information from the plurality of sample triplet information;
step 302: creating a plurality of corresponding negative sample information for each target sample triplet information;
step 303: and determining the weight corresponding to the negative sample information based on the association degree between the negative sample information and the target sample triplet information.
In the embodiments of this specification, a plurality of pieces of target sample triplet information are selected first, and the negative sample information corresponding to the target sample triplet information is then created. Here, a positive sample represents a triple that actually exists, while a negative sample represents a triple that does not exist. A negative sample can be obtained by replacing the head entity and/or the tail entity of a positive sample triple, where the replacement entity must itself be an entity that actually exists (typically the head entity or the tail entity is replaced with a randomly chosen entity). In practice, one positive sample triple can generally be used to generate a large number of negative sample triples. Each generated negative sample can be multiplied by a weight determined, using a self-adaptive strategy, from the degree of association between the positive sample and the corresponding generated negative sample: the more strongly a negative sample is associated with the positive sample, the smaller the weight it is given, thereby weakening the adverse impact of negative samples that are strongly associated with the positive sample.
To improve the accuracy of the calculation, according to one embodiment of the present disclosure, as shown in fig. 4, creating a corresponding plurality of negative sample information for each of the target sample triples includes:
step 401: determining that the target sample triplet information comprises two target entities and a target connection relationship between the two target entities;
step 402: for any one of the two target entities, determining candidate entities which do not have the target connection relation with the other target entity from the entities;
step 403: and replacing the target entity by the candidate entity so as to form the negative sample information with the other target entity of the two target entities and the target connection relation.
In the embodiments of this specification, the two target entities A and B included in the target sample triplet information and the target connection relationship x between them are determined first; the triple (A, x, B) is a positive sample. Then, for either one of the two target entities A and B (for example target entity A; that is, the embodiments of this specification do not restrict whether the head entity or the tail entity is replaced), an entity C that has no target connection relationship x with the other target entity B is determined from all the entities of the sample knowledge graph and used as a candidate entity. The candidate entity C replaces the target entity A, and finally the candidate entity C, the target entity B and the target connection relationship x together form a negative sample.
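A minimal sketch of this replacement strategy follows, assuming the positive triple set S is available so that accidental positives can be filtered out; the 50/50 head-or-tail choice is an illustrative detail, not something fixed by the patent.

```python
import random

# Sketch under assumptions: corrupt a positive triple (h, r, t) by replacing one
# of its entities with a candidate entity that does not hold relation r with the
# remaining entity (checked against the positive set S).
def make_negatives(triple, E, S, n, rng=random):
    h, r, t = triple
    entities = list(E)
    negatives = []
    while len(negatives) < n:
        c = rng.choice(entities)
        corrupt_head = rng.random() < 0.5            # replace head or tail at random
        neg = (c, r, t) if corrupt_head else (h, r, c)
        if neg not in S and neg != triple:           # candidate must not already hold relation r
            negatives.append(neg)
    return negatives
```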
And then processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values.
According to one embodiment of the present specification, the scoring function values include a first sub-scoring function value and a second sub-scoring function value;
as shown in fig. 5, the processing the plurality of sample triplet information and the plurality of negative sample information by using the convolution scoring function to obtain the corresponding scoring function value includes:
step 501: determining a vector to be trained for each entity and each connection relation;
step 502: for each sample triplet information, processing the vector to be trained corresponding to the sample triplet information by using the convolution scoring function to obtain the first sub-scoring function value;
step 503: and processing the vector to be trained corresponding to the negative sample information by using a convolution scoring function to obtain the second sub-scoring function value.
In this embodiment, the scoring function scores whether a triple (h, r, t) is a true sample. When computing the scoring function values, a vector to be trained is first randomly initialized for each entity and each connection relation; the convolution scoring function is then applied to the vectors to be trained that correspond to the head entity, tail entity and connection relation of each piece of sample triplet information to obtain the first sub-scoring function value, and the same convolution scoring function is applied to the vectors to be trained that correspond to the head entity, tail entity and connection relation of each piece of negative sample information to obtain the second sub-scoring function value.
Specifically, the convolution scoring function is
wherein f_r(h, t) denotes the scoring function value corresponding to (h, r, t); h and t denote the vectors to be trained corresponding to the two entities included in the sample triplet information or negative sample information, respectively; r denotes the vector to be trained corresponding to the connection relationship included in the sample triplet information or negative sample information; ⊛ denotes the circular convolution operation; and F denotes the L1 norm or the L2 norm.
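Because the formula image is not reproduced above, the sketch below assumes a translation-style form f_r(h, t) = ‖(h ⊛ r) − t‖ with an L1 or L2 norm, where the circular convolution ⊛ is implemented with the FFT so that its cost is governed by the embedding dimension k; the patent's exact expression may differ.

```python
import numpy as np

def circular_convolution(a, b):
    # Discrete circular convolution of two k-dimensional vectors via the FFT;
    # the complexity is controlled by the embedding dimension k.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

# Hedged sketch of the assumed scoring form: distance between (h ⊛ r) and t
# under an L1 or L2 norm, so a smaller score means a more plausible triple.
def score(h, r, t, norm=1):
    diff = circular_convolution(h, r) - t
    return np.linalg.norm(diff, ord=norm)
```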
According to one embodiment of the present disclosure, as shown in fig. 6, performing iterative optimization on the to-be-trained vectors corresponding to the entity and the connection relationship respectively based on the scoring function value and the weight until obtaining a target vector meeting a preset condition includes:
step 601: processing the first sub-score function value, the second sub-score function value and the weight by using a loss function to obtain a loss function value;
step 602: optimizing the vectors to be trained corresponding to the entity and the connection relation respectively based on an optimization algorithm and the loss function value to obtain optimized vectors;
step 603: and iterating by using the optimized vector to replace the vector to be trained until the preset condition is met, so as to obtain the target vector.
In the embodiments of this specification, parameters such as an adjustable learning rate and a maximum number of iterations are set to optimize the vectors to be trained. The learning rate can be set to an initial value and then gradually reduced as training progresses; it is also reduced when the loss function value oscillates, so that the solution can approach the local optimum more closely. The maximum number of iterations is generally set to the number of iterations at which the training effect no longer changes significantly, and serves to stop training.
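One possible realisation of this training control is sketched below; the decay factor, patience and tolerance are illustrative values, and step_fn stands for one pass of gradient updates on the embeddings, which the patent does not spell out.

```python
# Illustrative training-control sketch (assumed details): an adjustable learning
# rate that shrinks when the loss stops improving, plus a maximum iteration count.
def train(step_fn, init_lr=0.01, decay=0.95, max_iters=1000, patience=20):
    lr, best, stall = init_lr, float("inf"), 0
    for _ in range(max_iters):
        loss = step_fn(lr)          # one pass of gradient updates on the embeddings
        if loss < best - 1e-6:
            best, stall = loss, 0
        else:
            stall += 1
            lr *= decay             # shrink the step when the loss oscillates
        if stall >= patience:       # stop once training no longer improves significantly
            break
    return best
```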
Further, the loss function includes:
wherein L denotes the loss function, log denotes the logarithmic loss function, σ denotes the Logistic function, γ denotes the training stringency for the sample triplet information and the negative sample information, (h'_i, r, t'_i) denotes the vectors to be trained corresponding to the i-th piece of negative sample information corresponding to (h, r, t), (h, r, t) denotes the vectors to be trained corresponding to the sample triplet information, f_r(h, t) denotes the first sub-scoring function value corresponding to (h, r, t), f_r(h'_i, t'_i) denotes the second sub-scoring function value corresponding to (h'_i, r, t'_i), p(h'_i, r, t'_i) denotes the weight corresponding to (h'_i, r, t'_i), and n denotes the total number of pieces of negative sample information corresponding to the sample triplet information.
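Because the loss formula image is not reproduced above, the sketch below assumes a weighted logistic (self-adversarial style) loss built from the quantities defined in the preceding paragraph; the patent's exact expression may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hedged sketch of an assumed loss: gamma is the training stringency (margin),
# f_pos is the first sub-scoring function value, f_neg the second sub-scoring
# function values of the n negatives, and p_neg their per-negative weights.
def loss(f_pos, f_neg, p_neg, gamma):
    pos_term = -np.log(sigmoid(gamma - f_pos))                       # positive-triple term
    neg_term = -np.sum(np.asarray(p_neg) *
                       np.log(sigmoid(np.asarray(f_neg) - gamma)))   # weighted negative terms
    return pos_term + neg_term
```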
The convolution scoring function is then applied to the target vectors corresponding to each piece of sample triplet information and each piece of negative sample information to obtain threshold data. The threshold data are compared with the target value computed by the convolution scoring function in order to judge whether a predicted relationship holds. The convolution scoring function used to compute the threshold data is the same as the one used to compute the scoring function values, i.e. the vectors to be trained and the target vectors have the same data type, and the threshold data have the same data type as the scoring function values. In addition, performance indicators such as MR, MRR and HITS@n can be used to compute the threshold data. MR (Mean Rank) is the average rank of the scoring function value of each positive triple when the scoring function values of its generated negative sample triples are sorted from small to large; the smaller this indicator, the better the effect of the relation prediction model. MRR (Mean Reciprocal Ranking) is computed in a similar way to MR, except that whereas MR averages the ranks, MRR averages the reciprocals of the ranks; the larger this indicator, the better the effect of the relation prediction model. HITS@n is the proportion of positive samples whose rank is smaller than n; the larger this value, the better the effect of the relation prediction model.
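The indicators can be computed as sketched below, assuming that lower scores are better (consistent with the threshold comparison used later); how the patent turns these indicators into the final threshold data is not specified, so only the indicators themselves are shown.

```python
import numpy as np

# Sketch under assumptions: rank each positive triple's score among the scores of
# its generated negatives (sorted from small to large), then derive MR, MRR and HITS@n.
def rank_metrics(pos_scores, neg_scores_list, n=10):
    ranks = []
    for pos, negs in zip(pos_scores, neg_scores_list):
        rank = 1 + int(np.sum(np.asarray(negs) < pos))   # position in ascending order
        ranks.append(rank)
    ranks = np.asarray(ranks, dtype=float)
    mr = ranks.mean()                         # smaller is better
    mrr = (1.0 / ranks).mean()                # larger is better
    hits_at_n = float(np.mean(ranks <= n))    # larger is better
    return mr, mrr, hits_at_n
```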
And finally, constructing a trained relation prediction model based on the threshold data by using the convolution scoring function and the target vector.
Based on the same inventive concept, the embodiment of the present disclosure further provides a relational prediction model training device, as shown in fig. 7, including:
a sample triplet information construction unit 701, configured to construct a plurality of sample triplet information based on an entity in a sample knowledge graph and a connection relationship between the entities;
a creating unit 702, configured to create, for the plurality of sample triplet information, a plurality of negative sample information and a weight corresponding to each of the negative sample information;
a convolution score calculating unit 703, configured to process the plurality of sample triplet information and the plurality of negative sample information by using a convolution score function, so as to obtain a corresponding score function value;
an optimizing unit 704, configured to iteratively optimize the vectors to be trained corresponding to the entity and the connection relationship respectively, based on the loss function value obtained by the scoring function value and the weight, until a target vector satisfying a preset condition is obtained;
and the relation prediction model determining unit 705 is configured to process the target vectors corresponding to each sample triplet information and each negative sample information by using the convolution scoring function to obtain threshold data, and form a trained relation prediction model by using the convolution scoring function and the target vectors based on the threshold data.
Since the principle of the device for solving the problem is similar to that of the method, the implementation of the device can be referred to the implementation of the method, and the repetition is omitted.
The relation prediction model of the embodiments of this specification can be used in scenarios such as completing a bank's relationship graph and mining potential customer relationships. Therefore, based on the same inventive concept, the embodiments of this specification further provide a relation prediction model application method, as shown in fig. 8, including:
step 801: for a received triplet to be predicted, determining, from the target vectors, the vectors to be processed corresponding to the entities in the triplet to be predicted and to the connection relationship between the entities;
step 802: processing the vector to be processed by using a convolution scoring function to obtain a target value;
step 803: judging whether the target value is smaller than threshold data or not;
step 804: and determining that the connection relation is established under the condition that the target value is smaller than the threshold value.
The target vector and the threshold data are determined based on the relationship prediction model training method.
In the embodiments of this specification, the triplet to be predicted includes two entities and a specified connection relationship, and the task may be to predict whether the specified connection relationship exists between the two entities. The entities and the connection relationship of the triplet to be predicted are also data in the sample knowledge graph, so the vectors to be processed that correspond to the entities and the connection relationship in the triplet to be predicted are determined from the target vectors obtained by training. The vectors to be processed are then processed with a convolution scoring function to obtain a target value; this convolution scoring function is the same as the one used when training the relation prediction model, i.e. the vectors to be processed have the same data type as the target vectors and the resulting target value has the same data type as the threshold data, so it can be judged whether the target value is smaller than the threshold data. When the target value is smaller than the threshold data, the triplet to be predicted is determined to be true, i.e. the specified connection relationship exists between the head entity and the tail entity of the triplet to be predicted.
Taking the mining of potential customer relationships as an example, suppose it is currently desired to predict whether relationship x exists between customer A and customer B, i.e. to predict the triplet (A, x, B). All triples (A, x, ${tail_entity}) and (${head_entity}, x, B) are first extracted from the trained relation prediction model, where the variables ${tail_entity} and ${head_entity} denote tail and head entities that actually exist in the sample knowledge graph. These triples are evaluated with the convolution scoring function to obtain the target vectors and the threshold data, the target value of the triplet (A, x, B) to be predicted is then computed with the convolution scoring function, and if the target value is smaller than the threshold data it is determined that relationship x does indeed exist between customer A and customer B.
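A hedged sketch of this prediction step follows, using the same assumed scoring form as in the training sketch; entity_vec, relation_vec and threshold stand for the trained target vectors and threshold data, and all names are illustrative.

```python
import numpy as np

# Sketch under assumptions: score the triple (A, x, B) with the assumed convolution
# scoring function and compare the result with the stored threshold data.
def predict(h_name, r_name, t_name, entity_vec, relation_vec, threshold, norm=1):
    h, r, t = entity_vec[h_name], relation_vec[r_name], entity_vec[t_name]
    conv = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(r)))   # circular convolution via FFT
    target_value = np.linalg.norm(conv - t, ord=norm)
    return target_value < threshold   # relation x deemed to hold if the score is below the threshold
```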
Based on the same inventive concept, the embodiment of the present disclosure further provides a relational prediction model application device, as shown in fig. 9, including:
a to-be-processed vector determining unit 901, configured to determine, for a received to-be-predicted triplet, a to-be-processed vector corresponding to a connection relationship between an entity in the to-be-predicted triplet and the entity from a target vector;
a convolution score calculating unit 902, configured to process the vector to be processed by using a convolution score function to obtain a target value;
A judging unit 903, configured to judge whether the target value is smaller than threshold data;
a connection relation determining unit 904 configured to determine that the connection relation is established in a case where it is determined that the target value is smaller than the threshold value.
Wherein the target vector and the threshold data are determined based on a relational prediction model training device as shown in fig. 7.
Since the principle of the device for solving the problem is similar to that of the method, the implementation of the device can be referred to the implementation of the method, and the repetition is omitted.
Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure, where the apparatus may be a computer device according to the present disclosure, and perform the method of the present disclosure. The computer device 1002 may include one or more processing devices 1004, such as one or more Central Processing Units (CPUs), each of which may implement one or more hardware threads. The computer device 1002 may also include any storage resources 1006 for storing any kind of information, such as code, settings, data, etc. For example, and without limitation, storage resources 1006 may include any one or more of the following combinations: any type of RAM, any type of ROM, flash memory devices, hard disks, optical disks, etc. More generally, any storage resource may store information using any technology. Further, any storage resource may provide volatile or non-volatile retention of information. Further, any storage resources may represent fixed or removable components of computer device 1002. In one case, when the processing device 1004 executes associated instructions stored in any storage resource or combination of storage resources, the computer device 1002 can perform any of the operations of the associated instructions. The computer device 1002 also includes one or more drive mechanisms 1008, such as a hard disk drive mechanism, an optical disk drive mechanism, and the like, for interacting with any storage resources.
The computer device 1002 may also include an input/output module 1010 (I/O) for receiving various inputs (via input device 1012) and for providing various outputs (via output device 1014). One particular output mechanism may include a presentation device 1016 and an associated Graphical User Interface (GUI) 1018. In other embodiments, input/output module 1010 (I/O), input device 1012, and output device 1014 may not be included as just one computer device in a network. Computer device 1002 may also include one or more network interfaces 1020 for exchanging data with other devices via one or more communication links 1022. One or more communication buses 1024 couple the above-described components together.
The communication link 1022 may be implemented in any manner, for example, through a local area network, a wide area network (e.g., the internet), a point-to-point connection, etc., or any combination thereof. Communication links 1022 may include any combination of hardwired links, wireless links, routers, gateway functions, name servers, etc., governed by any protocol or combination of protocols.
The embodiments of the present specification also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the above method.
The present description also provides a computer program product comprising a computer program which, when executed by a processor, implements the above method.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing detailed description of the embodiments has been presented for purposes of illustration and description, and it should be understood that the foregoing is by way of example only, and is not intended to limit the scope of the invention.

Claims (13)

1. A method for training a relational predictive model, comprising:
constructing a plurality of pieces of sample triplet information based on the entities in a sample knowledge graph and the connection relations between the entities;
creating a plurality of negative sample information and weights corresponding to each of the negative sample information for the plurality of sample triplet information;
processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values;
based on the scoring function value and the weight, carrying out iterative optimization on vectors to be trained corresponding to the entity and the connection relation respectively until target vectors meeting preset conditions are obtained;
and processing the target vectors corresponding to each sample triplet information and each negative sample information respectively by using the convolution scoring function to obtain threshold data, and forming a trained relation prediction model by using the convolution scoring function and the target vectors based on the threshold data.
2. The method of claim 1, wherein creating a plurality of negative sample information and weights corresponding to each of the negative sample information for the plurality of sample triplet information comprises:
Selecting a plurality of target sample triplet information from the plurality of sample triplet information;
creating a plurality of corresponding negative sample information for each target sample triplet information;
and determining the weight corresponding to the negative sample information based on the association degree between the negative sample information and the target sample triplet information.
3. The method of claim 2, wherein creating a corresponding plurality of negative sample information for each of the target sample triplet information comprises:
determining that the target sample triplet information comprises two target entities and a target connection relationship between the two target entities;
for any one of the two target entities, determining candidate entities which do not have the target connection relation with the other target entity from the entities;
and replacing the target entity by the candidate entity so as to form the negative sample information with the other target entity of the two target entities and the target connection relation.
4. The method of claim 1, wherein the scoring function values comprise a first sub-scoring function value and a second sub-scoring function value;
Processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values comprises:
determining a vector to be trained for each entity and each connection relation;
for each sample triplet information, processing the vector to be trained corresponding to the sample triplet information by using the convolution scoring function to obtain the first sub-scoring function value;
and processing the vector to be trained corresponding to the negative sample information by using a convolution scoring function to obtain the second sub-scoring function value.
5. The method of claim 4, wherein iteratively optimizing the to-be-trained vectors corresponding to the entity and the connection relationship, respectively, based on the scoring function value and the weight, until a target vector satisfying a preset condition is obtained comprises:
processing the first sub-scoring function value, the second sub-scoring function value and the weight by using a loss function to obtain a loss function value;
optimizing the vectors to be trained corresponding to the entity and the connection relation respectively based on an optimization algorithm and the loss function value to obtain optimized vectors;
and iterating by using the optimized vector to replace the vector to be trained until the preset condition is met, so as to obtain the target vector.
6. The method of claim 4, wherein the convolution scoring function comprises:
wherein f_r(h, t) characterizes the scoring function value corresponding to (h, r, t); h and t characterize the vectors to be trained or target vectors respectively corresponding to the two entities comprised in the sample triplet information; r characterizes the vector to be trained or target vector corresponding to the connection relationship comprised in the sample triplet information; the convolution operator characterizes the circular convolution operation; and F characterizes the L1 norm or the L2 norm (an illustrative scoring-function sketch follows the claims).
7. The method of claim 5, wherein the loss function comprises:
wherein L characterizes the loss function; log characterizes the logarithmic function; σ characterizes the Logistic function; γ characterizes a parameter controlling the training strictness over the sample triplet information and the negative sample information; (h'_i, r, t'_i) characterizes the vectors to be trained corresponding to the i-th negative sample information created for (h, r, t); (h, r, t) characterizes the vectors to be trained corresponding to the sample triplet information; f_r(h, t) characterizes the first sub-scoring function value corresponding to (h, r, t); f_r(h'_i, t'_i) characterizes the second sub-scoring function value corresponding to (h'_i, r, t'_i); p(h'_i, r, t'_i) characterizes the weight corresponding to (h'_i, r, t'_i); and n characterizes the total number of negative sample information corresponding to the sample triplet information (an illustrative loss-function sketch follows the claims).
8. A method of applying a relational prediction model, comprising:
for a received triplet to be predicted, determining, from target vectors, vectors to be processed corresponding to the entities in the triplet to be predicted and the connection relationship between the entities;
processing the vector to be processed by using a convolution scoring function to obtain a target value;
judging whether the target value is smaller than threshold data or not;
and determining that the connection relationship is established when the target value is determined to be smaller than the threshold data (an illustrative prediction sketch follows the claims);
wherein the target vector and the threshold data are determined based on the method of any of claims 1-7.
9. A relation prediction model training device, comprising:
a sample triplet information construction unit, used for constructing a plurality of sample triplet information based on entities in a sample knowledge graph and connection relationships between the entities;
a creating unit configured to create, for the plurality of sample triplet information, a plurality of negative sample information and weights corresponding to each of the negative sample information;
a convolution score calculation unit, used for processing the plurality of sample triplet information and the plurality of negative sample information by using a convolution scoring function to obtain corresponding scoring function values;
an optimizing unit, used for carrying out iterative optimization on the vectors to be trained respectively corresponding to the entities and the connection relationships, on the basis of a loss function value obtained from the scoring function values and the weights, until target vectors meeting a preset condition are obtained;
and a relation prediction model determining unit, used for processing the target vectors respectively corresponding to each sample triplet information and each negative sample information by using the convolution scoring function to obtain threshold data, and forming a trained relation prediction model from the convolution scoring function and the target vectors based on the threshold data.
10. A relation prediction model application device, characterized by comprising:
a to-be-processed vector determining unit, used for determining, for a received triplet to be predicted, to-be-processed vectors corresponding to the entities in the triplet to be predicted and the connection relationship between the entities from the target vectors;
a convolution score calculation unit, used for processing the vectors to be processed by using a convolution scoring function to obtain a target value;
a judging unit, used for judging whether the target value is smaller than threshold data;
and a connection relationship determining unit, used for determining that the connection relationship is established in a case where it is determined that the target value is smaller than the threshold data;
wherein the target vector and the threshold data are determined based on the apparatus of claim 9.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-8 when executing the computer program.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the method of any of the preceding claims 1-8.
13. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the method according to any of claims 1-8.
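Illustrative sketches (not part of the claims). The following Python fragments sketch how the claimed steps could be realized; every function name, parameter, and concrete rule in them is a hypothetical choice of this description rather than something fixed by the claims. First, negative sample creation and weighting (claims 2-3): one of the two target entities is replaced by a candidate entity that does not already stand in the target connection relationship with the other entity, and each negative sample receives a weight reflecting its association degree with the target triple. The claims do not specify how the association degree is computed; the softmax over negated scores below is only one plausible stand-in.

```python
import random
import numpy as np

def create_negative_samples(triple, entities, known_triples, n_neg=5):
    """Create negative samples for one target sample triple (claims 2-3):
    replace the head or the tail with a candidate entity, skipping any
    candidate that would reproduce a triple already present in the
    sample knowledge graph."""
    h, r, t = triple
    negatives = []
    while len(negatives) < n_neg:
        e = random.choice(entities)
        corrupted = (e, r, t) if random.random() < 0.5 else (h, r, e)
        if corrupted not in known_triples and corrupted != triple:
            negatives.append(corrupted)
    return negatives

def negative_weights(neg_scores):
    """Hypothetical weighting rule: a softmax over negated scores, so that
    negatives scoring closer to the target triple (lower distance) receive
    a larger weight. The claims only require that the weight reflect the
    association degree between the negative sample and the target triple."""
    s = -np.asarray(neg_scores, dtype=float)
    e = np.exp(s - s.max())
    return e / e.sum()
```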
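Next, the convolution scoring function of claim 6. The formula itself is not reproduced in this text, so the sketch below assumes one common reading consistent with the listed symbols: the score is the L1 or L2 norm of the residual between the circular convolution of the head and relation vectors and the tail vector, with the circular convolution computed through the FFT so that model complexity is controlled by the chosen embedding dimension.

```python
import numpy as np

def circular_convolution(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Circular convolution of two d-dimensional real vectors via the FFT
    (O(d log d)), so complexity is controlled by the chosen dimension d."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def score(h: np.ndarray, r: np.ndarray, t: np.ndarray, norm: int = 2) -> float:
    """Hypothetical convolution scoring function f_r(h, t): norm of the
    residual between (h circularly convolved with r) and t; a lower value
    is taken to mean a more plausible triple. The exact formula in claim 6
    is assumed, not quoted."""
    return float(np.linalg.norm(circular_convolution(h, r) - t, ord=norm))
```

For example, with 200-dimensional vectors, `score(h, r, t, norm=1)` would give the L1 variant and `norm=2` the L2 variant.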
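The loss of claim 7 combines a log term, the Logistic function σ, a strictness parameter γ, and the per-negative weights p. Its exact form is likewise not reproduced here; the sketch assumes the widely used margin-based log-sigmoid negative-sampling loss, which matches those symbols.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def triple_loss(pos_score, neg_scores, neg_weights, gamma=6.0):
    """Assumed form of the claim-7 loss for one sample triple:
        L = -log sigma(gamma - f_r(h, t))
            - sum_i p(h'_i, r, t'_i) * log sigma(f_r(h'_i, t'_i) - gamma)
    gamma controls how strictly positive and negative triples are separated."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    neg_weights = np.asarray(neg_weights, dtype=float)
    positive_term = -np.log(sigmoid(gamma - pos_score))
    negative_term = -np.sum(neg_weights * np.log(sigmoid(neg_scores - gamma)))
    return float(positive_term + negative_term)
```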
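After training, claim 1 derives threshold data by scoring every sample triple and negative sample with the trained target vectors. The claims do not fix the selection rule, so the sketch below simply sweeps the observed scores and keeps the cut-off that best separates positives (below the threshold) from negatives (at or above it).

```python
import numpy as np

def fit_threshold(pos_scores, neg_scores):
    """Hypothetical rule for the threshold data: choose the score value that
    maximizes the balanced separation between positive-triple scores and
    negative-sample scores."""
    pos_scores = np.asarray(pos_scores, dtype=float)
    neg_scores = np.asarray(neg_scores, dtype=float)
    candidates = np.sort(np.concatenate([pos_scores, neg_scores]))
    best_thr, best_acc = float(candidates[0]), -1.0
    for thr in candidates:
        acc = (np.mean(pos_scores < thr) + np.mean(neg_scores >= thr)) / 2.0
        if acc > best_acc:
            best_thr, best_acc = float(thr), acc
    return best_thr
```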
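Finally, the application stage of claim 8: look up the trained target vectors for the entities and the connection relationship of a triple to be predicted, score them with the convolution scoring function, and declare the relationship established when the score falls below the threshold data. The sketch reuses the `score` function from the scoring sketch above; the lookup-table names are illustrative.

```python
def predict_relation(head_id, rel_id, tail_id, entity_vecs, relation_vecs, threshold):
    """Application-stage check (claim 8): the connection relationship is taken
    to hold when the convolution score of the looked-up target vectors is
    smaller than the learned threshold data."""
    h = entity_vecs[head_id]
    r = relation_vecs[rel_id]
    t = entity_vecs[tail_id]
    return score(h, r, t) < threshold
```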
CN202310777378.1A 2023-06-28 2023-06-28 Relation prediction model training method, application method, device and equipment Pending CN116757277A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310777378.1A CN116757277A (en) 2023-06-28 2023-06-28 Relation prediction model training method, application method, device and equipment


Publications (1)

Publication Number Publication Date
CN116757277A true CN116757277A (en) 2023-09-15

Family

ID=87951209

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310777378.1A Pending CN116757277A (en) 2023-06-28 2023-06-28 Relation prediction model training method, application method, device and equipment

Country Status (1)

Country Link
CN (1) CN116757277A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination