CN113298160B - Triple verification method, apparatus, device and medium - Google Patents


Info

Publication number: CN113298160B
Application number: CN202110594046.0A
Authority: CN (China)
Prior art keywords: triple, target, entity, relation, training
Legal status: Active
Other versions: CN113298160A (Chinese)
Inventor: 曾钢欣
Current and original assignee: Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Application filed by Shenzhen Shuliantianxia Intelligent Technology Co Ltd; published as CN113298160A; granted and published as CN113298160B.


Classifications

    • G06F18/214 Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F16/367 Information retrieval, semantic tools: ontology
    • G06F18/24 Pattern recognition: classification techniques
    • G06N3/044 Neural networks: recurrent networks, e.g. Hopfield networks
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/048 Neural networks: activation functions
    • G06N3/08 Neural networks: learning methods


Abstract

The invention discloses a triple verification method comprising the following steps. First, the triples to be verified are randomly sampled and labeled, yielding labeled first triples and unlabeled second triples. Next, the input vectors and label information of the first triples are obtained and used to train a first binary classification model, which then verifies the second triples. Using a self-training approach, a labeled dataset is determined from the first triples, the second triples, and the verification results; this dataset trains a second binary classification model, and the trained target classification model finally completes the verification of the triples to be verified. With this scheme, a labeled dataset of the sample size required for training can be obtained while manually labeling only a small number of triples, the fitting capacity of the trained target classification model is improved, and good results are achieved with only a small amount of labeled data. A triple verification apparatus, a device, and a storage medium are also provided.

Description

Triple verification method, apparatus, device and medium
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a triple verification method, apparatus, device, and medium.
Background
With the development of artificial intelligence, knowledge graphs have become increasingly important as an application foundation of artificial intelligence. Traditional knowledge graph construction consumes substantial manpower and material resources throughout the process, so unsupervised knowledge graph construction has become the mainstream at the present stage. In an unsupervised scheme, however, the extracted triples are not very accurate for lack of manual intervention and must be corrected by hand, and erroneous triples degrade the constructed knowledge graph. How to ensure the accuracy of the triples with little manual intervention is therefore very important.
Disclosure of Invention
Based on this, it is necessary to provide a triple verification method, apparatus, device, and medium that ensure the accuracy of triples with little human intervention.
A triple verification method, the method comprising:
acquiring triples to be verified, and randomly sampling and labeling them to obtain labeled first triples and unlabeled second triples;
embedding the triple information of each first triple into an input vector through a pre-training model, the triple information comprising the head entity, the relation, the tail entity, and the sentence in which the first triple is located; acquiring the label information of the first triples, the label information indicating whether each first triple is trustworthy; and training a first binary classification model according to the input vectors and the label information;
performing a first verification of the second triples with the trained first binary classification model to obtain a first verification result for each second triple, the first verification result being whether the second triple is trustworthy;
determining a labeled dataset according to the first triples, the second triples, and the first verification results, training a second binary classification model according to the labeled dataset, and performing a second verification of the triples to be verified with the trained second binary classification model to obtain a second verification result for each triple to be verified, the second verification result being whether the triple to be verified is trustworthy.
In one embodiment, embedding the triple information of the first triple into an input vector through the pre-training model comprises:
for each first triple, separately encoding the head entity, the relation, and the tail entity through the pre-training model to obtain a first vector for the head entity, a second vector for the relation, and a third vector for the tail entity;
concatenating the first vector, the second vector, and the third vector in that order to obtain a first input vector;
and encoding the sentence through the pre-training model and taking the encoded sentence as a second input vector, the input vector comprising the first input vector and the second input vector.
In one embodiment, the first binary classification model comprises a feedforward neural network and an activation function, and training the first binary classification model according to the input vector and the label information comprises:
the first binary classification model mapping the trustworthiness of the triple to be verified to a value between 0 and 1 according to the input vector, and computing a mapping error from the label information and the mapping result;
and adjusting the model parameters of the first binary classification model according to the mapping error until the mapping result meets a preset verification standard.
In one embodiment, determining the labeled dataset from the first triples, the second triples, and the first verification results comprises:
taking the first triples, together with the second triples whose first verification result is trustworthy, as the labeled dataset.
In one embodiment, acquiring the triples to be verified comprises:
acquiring text data and extracting the triples to be verified from the text data, the extraction being rule-based or based on syntactic analysis.
In one embodiment, the pre-training model is any one of BERT, word2vec, XLNet, and ALBERT.
In one embodiment, after the target classification model is used to verify the triples to be verified and the verification result is obtained, the method further comprises:
acquiring the trustworthy target triples from the labeled dataset according to the second verification result, each target triple comprising a target head entity, a target relation, and a target tail entity;
constructing a plurality of first co-occurrence matrices of the target head entities and the target relations, screening out the first target matrices exceeding a first segmentation threshold, and combining the head entity types and relation types corresponding to the first target matrices to obtain a first combination;
constructing a plurality of second co-occurrence matrices of the target tail entities and the target relations, screening out the second target matrices exceeding a second segmentation threshold, and combining the tail entity types and relation types corresponding to the second target matrices to obtain a second combination;
and cross-combining the first combination and the second combination to obtain a knowledge graph.
A triple verification apparatus, the apparatus comprising:
a labeling module configured to acquire the triples to be verified, and to randomly sample and label them to obtain labeled first triples and unlabeled second triples;
an initial training module configured to embed the triple information of each first triple into an input vector through a pre-training model, the triple information comprising the head entity, the relation, the tail entity, and the sentence in which the first triple is located; to acquire the label information of the first triples, the label information indicating whether each first triple is trustworthy; and to train a first binary classification model according to the input vectors and the label information;
a first verification module configured to perform a first verification of the second triples with the trained first binary classification model to obtain a first verification result for each second triple, the first verification result being whether the second triple is trustworthy;
and a training and verification module configured to determine a labeled dataset according to the first triples, the second triples, and the first verification results, to train a second binary classification model according to the labeled dataset, and to verify the triples to be verified with the trained second binary classification model (the target classification model) to obtain a second verification result for each triple to be verified, the second verification result being whether the triple to be verified is trustworthy.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring triples to be verified, and randomly sampling and labeling them to obtain labeled first triples and unlabeled second triples;
embedding the triple information of each first triple into an input vector through a pre-training model, the triple information comprising the head entity, the relation, the tail entity, and the sentence in which the first triple is located; acquiring the label information of the first triples, the label information indicating whether each first triple is trustworthy; and training a first binary classification model according to the input vectors and the label information;
performing a first verification of the second triples with the trained first binary classification model to obtain a first verification result for each second triple, the first verification result being whether the second triple is trustworthy;
determining a labeled dataset according to the first triples, the second triples, and the first verification results, training a second binary classification model according to the labeled dataset, and performing a second verification of the triples to be verified with the trained second binary classification model to obtain a second verification result for each triple to be verified, the second verification result being whether the triple to be verified is trustworthy.
A triple verification device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
acquiring triples to be verified, and randomly sampling and labeling them to obtain labeled first triples and unlabeled second triples;
embedding the triple information of each first triple into an input vector through a pre-training model, the triple information comprising the head entity, the relation, the tail entity, and the sentence in which the first triple is located; acquiring the label information of the first triples, the label information indicating whether each first triple is trustworthy; and training a first binary classification model according to the input vectors and the label information;
performing a first verification of the second triples with the trained first binary classification model to obtain a first verification result for each second triple, the first verification result being whether the second triple is trustworthy;
determining a labeled dataset according to the first triples, the second triples, and the first verification results, training a second binary classification model according to the labeled dataset, and performing a second verification of the triples to be verified with the trained second binary classification model to obtain a second verification result for each triple to be verified, the second verification result being whether the triple to be verified is trustworthy.
The present invention provides a triple verification method, apparatus, device, and medium. The triples to be verified are first randomly sampled and labeled, yielding labeled first triples and unlabeled second triples. The input vectors and label information of the first triples are then obtained to train a first binary classification model, which is used to verify the second triples; the scheme therefore needs only a small amount of data labeling, greatly reducing labeling cost. A self-training approach then determines a labeled dataset from the first triples, the second triples, and the verification results; the labeled dataset trains a second binary classification model, and the target classification model finally completes the verification of the triples to be verified. A labeled dataset of the sample size required for training is thus obtained while labeling only a small number of triples, the fitting capacity of the trained target classification model is improved, and good results are achieved with only a small amount of labeled data.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a schematic flow chart of a triple verification method in an embodiment;
FIG. 2 is a schematic diagram of a triple verification apparatus in an embodiment;
FIG. 3 is a block diagram of a triple verification device in an embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art from the given embodiments without creative effort fall within the protection scope of the present invention.
As shown in FIG. 1, a schematic flow chart of a triple verification method in an embodiment, the method of this embodiment comprises the following steps.
Step 102: acquire the triples to be verified, and randomly sample and label them to obtain labeled first triples and unlabeled second triples.
A triple to be verified is a group consisting of a head entity, a relation, and a tail entity; whether it is trustworthy means whether its information is correct, and a knowledge graph for artificial intelligence can be built from the trustworthy triples. Illustratively, in (Jay Chou, singer, Nunchucks), "Jay Chou" is the head entity, "singer" is the relation, and "Nunchucks" is the tail entity.
Specifically, a passage of text data is acquired first; the text data is a paragraph composed of several sentences, and its length can be supplied by the user as needed. The triples to be verified are then extracted from the text data, either by rules or by syntactic analysis. Illustratively, rule-based extraction proceeds in three steps. First, define the set of relations to extract, such as (father, mother, son, daughter). Second, traverse each sentence of the text data and remove every word that is neither a head entity nor in the relation set. Third, traverse from the second word of each sentence; whenever a word in the relation set is encountered, select the head entity closest to that word. Rule-based extraction needs no training and its rules are simple, so it is the more common choice. Extraction based on syntactic analysis must determine the syntactic structure of the sentence (for example, identifying its subject-verb-object structure) or the dependency relations between its words (for example, attributive, adverbial, or right-adjunct relations); its logic is more complicated to set up, but its results are more accurate than rule-based extraction.
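The three rule-based steps above can be sketched as follows. The relation set, head-entity list, and sentence are illustrative assumptions, and since the text does not spell out how the tail entity is attached, the sketch stops at (head entity, relation) pairs:

```python
# Hypothetical relation set and head-entity list for illustration.
RELATIONS = {"father", "mother", "son", "daughter"}
HEAD_ENTITIES = {"Alice", "Bob"}

def extract_pairs(sentence_tokens):
    """Step 2: keep only head entities and relation words.
    Step 3: pair each relation word with the nearest preceding head entity."""
    kept = [w for w in sentence_tokens if w in RELATIONS or w in HEAD_ENTITIES]
    pairs = []
    for i in range(1, len(kept)):
        if kept[i] in RELATIONS:
            # Scan backwards for the closest head entity.
            for j in range(i - 1, -1, -1):
                if kept[j] in HEAD_ENTITIES:
                    pairs.append((kept[j], kept[i]))
                    break
    return pairs
```

On the sample sentence `["Alice", "is", "the", "mother", "of", "Bob"]`, the backward scan pairs "mother" with the nearest preceding head entity "Alice".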
After the triples to be verified are extracted, part of them are labeled by random sampling. To guarantee randomness and avoid drawing too many similar triples, in a specific application scenario the triples to be verified can be divided into strata by type, for example by the kind of words they contain (concrete nouns: car, room; abstract nouns: life, friendship), and samples are then drawn at random from each stratum in a specified proportion. This finally yields the labeled first triples, for example (Jay Chou, is a, singer) labeled 1 and (Jay Chou, is a, writer) labeled 0, together with the unlabeled second triples.
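A minimal sketch of the stratified random sampling just described, assuming a hypothetical `type_of` function that assigns each triple to a stratum:

```python
import random
from collections import defaultdict

def stratified_sample(triples, type_of, ratio, seed=0):
    """Group triples into strata by type, then randomly draw `ratio` of each
    stratum for manual labeling; the rest stay unlabeled (the second triples)."""
    random.seed(seed)
    strata = defaultdict(list)
    for t in triples:
        strata[type_of(t)].append(t)
    to_label = []
    for group in strata.values():
        k = max(1, int(len(group) * ratio))   # at least one per stratum
        to_label.extend(random.sample(group, k))
    unlabeled = [t for t in triples if t not in to_label]
    return to_label, unlabeled
```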
Step 104: embed the triple information of the first triples into input vectors through a pre-training model, acquire the label information of the first triples, and train the first binary classification model according to the input vectors and the label information.
The pre-training model adopted in this embodiment is any one of BERT, word2vec, XLNet, and ALBERT. The triple information of a first triple is its related attribute information: the head entity, relation, and tail entity, the sentence of the text data in which the triple is located, and information such as the triple's position in that sentence. The label information indicates whether each first triple is trustworthy, i.e., whether the statement the triple composes is correct, and is generated when step 102 is executed.
In one embodiment, the embedded input vector comprises a first input vector and a second input vector, and embedding the triple information proceeds as follows. For each first triple, the head entity, the relation, and the tail entity are separately encoded through the pre-training model to obtain a first vector for the head entity, a second vector for the relation, and a third vector for the tail entity; one available encoding is the existing doc2vec. The three vectors are then concatenated in order to obtain the first input vector. For example, with the first vector A = [1,2,3], the second vector B = [4,5,6], and the third vector C = [7,8,9], concatenation gives the first input vector A⊕B⊕C = [1,2,3,4,5,6,7,8,9]. Each sentence is likewise encoded by doc2vec, and the encoded sentence is taken as the second input vector. The triple's position can be encoded by index: the index is a number, so for a sentence of 128 words a 128 × 10 matrix is initialized, and based on the triple's position n in the sentence, row n of the matrix is looked up as the position encoding.
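The concatenation and position-lookup steps can be illustrated directly; the vectors here are the toy values from the text rather than real pre-trained encodings:

```python
import numpy as np

head = np.array([1, 2, 3])   # first vector  (head entity)
rel  = np.array([4, 5, 6])   # second vector (relation)
tail = np.array([7, 8, 9])   # third vector  (tail entity)

# First input vector: the three vectors joined in order.
first_input = np.concatenate([head, rel, tail])

# Position encoding by index: a 128 x 10 table, with row n looked up
# from the triple's position n in the sentence.
pos_table = np.random.default_rng(0).normal(size=(128, 10))
n = 5                               # triple's position in the sentence
position_encoding = pos_table[n]
```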
The first binary classification model in this embodiment comprises a feedforward neural network (FFN) and a Sigmoid activation function. Each neuron in the feedforward network consists of a linear fit and a nonlinear activation function; the neurons are arranged in layers, adjacent layers are fully connected, and each neuron connects only to neurons of the previous layer, receiving that layer's output and passing its own output to the next layer. The Sigmoid activation introduces a nonlinear factor into the neurons so that the network can approximate arbitrary nonlinear functions, mapping a variable to a value between 0 and 1.
During training, the first binary classification model takes the input vector as its input and maps the trustworthiness of the triple to be verified to a value between 0 and 1. Specifically, the feedforward network uses the formula ffn_output = Relu(Wx · input + b), where Wx and b are trainable parameters and Relu is an activation function. In the original space, the input features do not distinguish well between the trustworthy and untrustworthy classes; the nonlinear activation transforms the input into another implicit space that separates the two classes better. The feedforward network is then followed by a fully connected layer with the formula Sigmoid(Wy · ffn_output + b2), where Wy and b2 are also trainable. Sigmoid maps the features to a value between 0 and 1, i.e., the trustworthiness of the triple to be verified. A mapping result of at least 0.5 is considered trustworthy and one below 0.5 untrustworthy, so the output separates into two classes. The mapping error is then computed from the mapping results and the label information, namely the ratio of wrong mapping results to all mapping results. Finally, the loss is back-propagated to update the parameters, such as the weights and biases of the model; after multiple iterations the mapping error falls below a preset error threshold, yielding accurate parameters Wx and Wy.
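A forward pass of the FFN-plus-Sigmoid classifier can be sketched with NumPy; the layer sizes and random weights are placeholder assumptions standing in for the trained parameters Wx, b, Wy, b2:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d_in, d_hid = 9, 16                       # toy dimensions
Wx, b = rng.normal(size=(d_hid, d_in)), np.zeros(d_hid)
Wy, b2 = rng.normal(size=(1, d_hid)), np.zeros(1)

def score(x):
    ffn_output = relu(Wx @ x + b)         # ffn_output = Relu(Wx * input + b)
    return sigmoid(Wy @ ffn_output + b2)  # Sigmoid(Wy * ffn_output + b2)

p = score(rng.normal(size=d_in)).item()   # trustworthiness in (0, 1)
trusted = p >= 0.5                        # threshold from the text
```

In training, the mapping error computed against the labels would drive gradient-descent updates of Wx, b, Wy, and b2; only the inference pass is shown here.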
In this way, the feedforward network maps samples that are inseparable in the original space into another feature space where the two classes separate better; Sigmoid maps the features to a value between 0 and 1, taken as the probability of a class, with maximum likelihood estimation assigning each sample to the class of highest probability. Because Sigmoid is a continuous function it can be differentiated normally, and the parameters are updated by gradient descent. The main purpose of Sigmoid is thus to map feature values to probabilities and to enable parameter updates.
Step 106: perform the first verification of the second triples with the trained first binary classification model to obtain the first verification result for each second triple.
The triple information of each second triple is embedded into an input vector through the pre-training model, and the embedded input vector is fed into the first binary classification model for verification. After the input vectors pass through the feedforward network and activation function, a first verification result of whether each second triple is trustworthy is obtained.
Step 108: determine the labeled dataset according to the first triples, the second triples, and the first verification results; train the second binary classification model according to the labeled dataset; and perform the second verification of the triples to be verified with the second binary classification model to obtain the second verification result.
Specifically, the first triples, together with the second triples whose first verification result is trustworthy, form the labeled dataset. A labeled dataset of the sample size required for training is thus obtained while labeling only a few triples, improving the fitting capacity of the trained second classification model.
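The self-training selection just described amounts to a simple filter; the list shapes here are illustrative assumptions:

```python
def build_labeled_dataset(first, first_labels, second, second_preds):
    """Self-training selection: keep all manually labeled first triples, plus
    only those second triples the first model judged trustworthy (pred == 1)."""
    data = list(zip(first, first_labels))
    data += [(t, 1) for t, p in zip(second, second_preds) if p == 1]
    return data
```

For example, with one labeled first triple and two machine-verified second triples of which only the first is trustworthy, the dataset keeps two entries.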
The model parameters of the second binary classification model are initialized afresh; it is identical to the first binary classification model only in structure. Training the second classification model with the labeled dataset proceeds essentially as in step 104, only the samples differ, so the description is omitted. The trained second classification model then performs the second verification on all triples to be verified; this is essentially consistent with step 106, only the samples differ, so the description is likewise omitted.
With the above triple verification method, the triples to be verified are first randomly sampled and labeled to obtain labeled first triples and unlabeled second triples. The input vectors and label information of the first triples are then obtained to train a first binary classification model, which is used to verify the second triples, so the scheme needs only a small amount of data labeling and greatly reduces labeling cost. A self-training approach then determines a labeled dataset from the first triples, the second triples, and the verification results; the labeled dataset trains a second binary classification model, and the target classification model finally completes the verification of the triples to be verified. A labeled dataset of the sample size required for training is thus obtained while labeling only a small number of triples, the fitting capacity of the trained target classification model is improved, and good results are achieved with only a small amount of labeled data.
Further, after the verification of the triples is completed, the downstream task of building the knowledge graph can proceed from the second verification result. Specifically, the trustworthy triples are first acquired from the labeled dataset according to the second verification result as the target triples, and the head entity, relation, and tail entity of each target triple serve as the target head entity, target relation, and target tail entity.
Then a plurality of first co-occurrence matrices of the target head entities and the target relations are constructed; a first co-occurrence matrix is a matrix whose rows and columns are indexed by the target head entities and target relations. Construction begins with several two-dimensional matrices of target head entities and target relations, each comprising several head-entity columns and several relation rows: the column of data in which any target head entity sits is called a head-entity column, and the row of data in which any target relation sits is called a relation row. A type of target head entity is chosen as the target head-entity type and a type of target relation as the target relation type; starting from an all-zero two-dimensional matrix, 1 is added to every head-entity column whose entity matches the target head-entity type and to every relation row whose relation matches the target relation type, giving a first co-occurrence matrix. Repeatedly choosing target head-entity types and/or target relation types yields the plurality of first co-occurrence matrices. The first target matrices exceeding a first segmentation threshold are then screened out, for example those whose entries sum to more than K, and the target head entities and target relations corresponding to each first target matrix are combined to obtain the first combination.
For example, suppose the target head entities include Jay Chou, JJ Lin, Jiang Wen and the like, the target relations include singer, composer, brother and the like, a column of data in the two-dimensional matrix serves as a head entity column, and a row of data serves as a relation series. If "person" is determined to be the target head entity type, 1 is added to the columns in which Jay Chou, JJ Lin, Jiang Wen and the like lie; if "occupation" is determined to be the target relation type, 1 is added to the rows in which singer, composer and the like lie, finally yielding a first co-occurrence matrix. If the first co-occurrence matrix of this example is determined to be a first target matrix based on K, combination yields first combinations including "Jay Chou-singer", "Jay Chou-composer", "JJ Lin-singer", "JJ Lin-composer" and the like.
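The construction of a first co-occurrence matrix and the threshold screening described above can be sketched as follows. The toy data, the type labels ("person", "occupation") and the helper names are assumptions for illustration, not part of the patent:

```python
import numpy as np

# Toy data; entity/relation types are assumptions for illustration.
head_entities = ["Jay Chou", "JJ Lin", "Jiang Wen"]
relations = ["singer", "composer", "brother"]
head_type = {"Jay Chou": "person", "JJ Lin": "person", "Jiang Wen": "person"}
rel_type = {"singer": "occupation", "composer": "occupation", "brother": "kinship"}

def first_cooccurrence(target_head_type, target_rel_type):
    """Start from an all-zero matrix (rows = relations, columns = head entities),
    add 1 to each head entity column matching the chosen head entity type and
    1 to each relation row matching the chosen relation type."""
    m = np.zeros((len(relations), len(head_entities)), dtype=int)
    for j, h in enumerate(head_entities):
        if head_type[h] == target_head_type:
            m[:, j] += 1
    for i, r in enumerate(relations):
        if rel_type[r] == target_rel_type:
            m[i, :] += 1
    return m

def first_combinations(m, K):
    """If the matrix total exceeds the segmentation threshold K, combine every
    head entity and relation whose column and row were both incremented."""
    if m.sum() <= K:
        return []
    return [f"{h}-{r}" for i, r in enumerate(relations)
            for j, h in enumerate(head_entities) if m[i, j] == 2]

m = first_cooccurrence("person", "occupation")
pairs = first_combinations(m, K=5)
```

Cells that received both increments (value 2) mark head entity / relation pairs of the selected types, which become the first combinations.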
Similarly, a plurality of second co-occurrence matrices of the target tail entities and target relations are constructed, where a second co-occurrence matrix is a symmetric matrix with the target tail entities and target relations as its rows and columns. Constructing a second co-occurrence matrix involves constructing a plurality of two-dimensional matrices of target tail entities and target relations, each comprising a plurality of tail entity columns and a plurality of relation series, where the row or column of data in which any tail entity lies is called a tail entity column. The type of a chosen target tail entity is determined as the target tail entity type, and the type of a chosen target relation as the target relation type; in an all-zero two-dimensional matrix, 1 is added to every tail entity column matching the target tail entity type and 1 to every relation series matching the target relation type, yielding a second co-occurrence matrix. Repeatedly reselecting the target tail entity type and/or the target relation type yields a plurality of second co-occurrence matrices. Second target matrices greater than a second segmentation threshold are then screened out from the plurality of second co-occurrence matrices; for example, any matrix whose values sum to more than K is taken as a second target matrix. The target tail entities and target relations corresponding to each second target matrix are combined to obtain second combinations.
For example, suppose the target tail entities include "Rice Fragrance", "Jiangnan", Jiang Wu and the like, the target relations include singer, composer, brother and the like, a column of data in the two-dimensional matrix serves as a tail entity column, and a row of data serves as a relation series. If "work" is determined to be the target tail entity type, 1 is added to the columns in which "Rice Fragrance", "Jiangnan" and the like lie; if "occupation" is determined to be the target relation type, 1 is added to the rows in which singer, composer and the like lie, finally yielding a second co-occurrence matrix. If the second co-occurrence matrix of this example is determined to be a second target matrix based on K, combination yields second combinations including "singer-Rice Fragrance", "singer-Jiangnan", "composer-Rice Fragrance", "composer-Jiangnan" and the like. Finally, first combinations and second combinations having the same target relation are cross-combined to obtain the knowledge graph.
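The final cross-combination step can be sketched as follows, joining first (head-relation) and second (relation-tail) combinations on the shared target relation; the data is assumed toy data:

```python
# Hypothetical first and second combinations that share some target relations.
first_combos = [("Jay Chou", "singer"), ("Jay Chou", "composer"), ("JJ Lin", "singer")]
second_combos = [("singer", "Rice Fragrance"), ("composer", "Rice Fragrance"),
                 ("singer", "Jiangnan")]

def cross_combine(first_pairs, second_pairs):
    """Join head-relation and relation-tail pairs on the shared target relation
    to produce (head, relation, tail) triples for the knowledge graph."""
    return [(h, r, t)
            for h, r in first_pairs
            for r2, t in second_pairs
            if r == r2]

triples = cross_combine(first_combos, second_combos)
```

Each resulting triple links a head entity to a tail entity only through a relation that both combinations agree on.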
In one embodiment, as shown in fig. 2, a triple verifying apparatus is provided, the apparatus including:
the labeling module 202 is configured to obtain triples to be verified, and randomly sample and label them to obtain labeled first triples and unlabeled second triples;

the initial training module 204 is configured to embed the triple information of each first triple into an input vector through a pre-training model, where the triple information includes the head entity, relation, tail entity and the sentence in which the first triple is located; obtain the labeling information of the first triples, where the labeling information indicates whether each first triple is credible; and train a first binary classification model according to the input vectors and the labeling information;

the first verification module 206 is configured to perform a first verification on the second triples by using the trained first binary classification model to obtain first verification results corresponding to the second triples;

the training and verification module 208 is configured to determine an annotated dataset according to the first triples, the second triples and the first verification results, train a second binary classification model according to the annotated dataset, and verify the triples to be verified with the trained second binary classification model to obtain second verification results of the triples to be verified.
The triple verification apparatus first performs random sampling and labeling on the triples to be verified to obtain labeled first triples and unlabeled second triples. It then acquires the input vectors and labeling information of the first triples to train a first binary classification model, and verifies the second triples with that model, so the scheme requires only a small amount of data labeling and greatly reduces labeling cost. The scheme then adopts a self-training method: it determines an annotated dataset from the first triples, the second triples and the verification results, trains a second binary classification model on the annotated dataset, and finally completes verification of the triples to be verified with the trained model. In this way, an annotated dataset of the sample size required for training is obtained while labeling only a small number of triples, the fitting capacity of the trained model is improved, and a good result is achieved with only a small amount of labeled data.
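The self-training flow described above can be sketched as follows; the helper names (annotate, train, predict) and the control flow are assumptions for illustration, not the patent's API:

```python
import random

def self_train(triples, label_fraction, annotate, train, predict):
    """Sketch of the self-training flow: label a small random sample, train the
    first binary classifier on it, pseudo-label the remaining triples with that
    classifier, keep the ones predicted credible, and retrain a second classifier
    on the enlarged annotated dataset."""
    triples = list(triples)
    random.Random(0).shuffle(triples)                 # fixed seed: reproducible sketch
    k = max(1, int(len(triples) * label_fraction))
    first, second = triples[:k], triples[k:]          # labeled / unlabeled split
    labels = [annotate(t) for t in first]             # the only manual labeling cost
    model_1 = train(first, labels)                    # first binary classification model
    trusted = [t for t in second if predict(model_1, t)]  # first verification
    annotated = first + trusted                       # annotated dataset
    model_2 = train(annotated, labels + [True] * len(trusted))
    return model_2, annotated
```

In practice annotate would be a human labeler and train/predict would wrap the binary classifiers; here they are placeholders so the control flow itself can be read and exercised.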
In one embodiment, the initial training module 204 is specifically configured to: for each first triple, encode the head entity, relation and tail entity separately through the pre-training model to obtain a first vector corresponding to the head entity, a second vector corresponding to the relation and a third vector corresponding to the tail entity; connect the first vector, the second vector and the third vector in sequence to obtain a first input vector; and encode the sentence through the pre-training model and take the encoded sentence as a second input vector, the input vector comprising the first input vector and the second input vector.
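A minimal sketch of this embedding step, with a deterministic toy encode() standing in for a real pre-training model such as BERT (an assumption made so the example runs without model weights):

```python
import numpy as np

def encode(text, dim=8):
    """Stand-in for a pre-trained encoder: a deterministic toy embedding seeded
    from the text's bytes, so repeated calls give the same vector."""
    seed = sum(text.encode("utf-8")) % (2**32)
    rng = np.random.default_rng(seed)
    return rng.standard_normal(dim)

def build_input_vectors(head, relation, tail, sentence):
    """Encode head, relation and tail separately, concatenate them in that order
    as the first input vector, and encode the sentence as the second."""
    v1, v2, v3 = encode(head), encode(relation), encode(tail)
    first_input = np.concatenate([v1, v2, v3])   # head, then relation, then tail
    second_input = encode(sentence)
    return first_input, second_input

fi, si = build_input_vectors("Jay Chou", "singer", "Rice Fragrance",
                             "Jay Chou sang Rice Fragrance.")
```

The first input vector is three times the encoder dimension because the three component vectors are connected in sequence rather than pooled.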
In one embodiment, the initial training module 204 is specifically configured to: have the first binary classification model map the credibility of the triple to be verified to a value between 0 and 1 according to the input vector, and calculate a mapping error according to the labeling information and the mapping result; and adjust the model parameters of the first binary classification model according to the mapping error until the mapping result meets a preset verification standard.
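This mapping-and-adjustment loop can be sketched with a single linear layer plus a sigmoid trained by gradient descent; the toy data, learning rate and epoch count are assumptions, and a real implementation would use the pre-training model's vectors as input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_first_classifier(X, y, lr=0.5, epochs=500):
    """Minimal sketch of the first binary classification model: a linear layer
    plus a sigmoid maps each input vector to a credibility in (0, 1); the mapping
    error (binary cross-entropy against the labels) drives gradient updates of
    the model parameters."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)          # credibility between 0 and 1
        grad = p - y                    # gradient of the cross-entropy mapping error
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy separable data: first two vectors labeled credible, last two not.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b = train_first_classifier(X, y)
credibility = sigmoid(X @ w + b)
```

Training stops here after a fixed number of epochs; the patent's "preset verification standard" would instead be an accuracy or error threshold checked each iteration.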
In one embodiment, the training and verification module 208 is specifically configured to: take the first triples, together with the second triples whose first verification results are credible, as the annotated dataset.
In one embodiment, the labeling module 202 is specifically configured to: acquire text data and extract the triples to be verified from the text data, the extraction being rule-based or based on syntactic analysis.
In one embodiment, the triple verification apparatus further includes a knowledge graph construction module, configured to: acquire credible target triples from the annotated dataset according to the second verification result, each target triple comprising a target head entity, a target relation and a target tail entity; construct a plurality of first co-occurrence matrices of the target head entities and target relations, screen out first target matrices greater than a first segmentation threshold from the plurality of first co-occurrence matrices, and combine the head entity types and relation types corresponding to the first target matrices to obtain first combinations; construct a plurality of second co-occurrence matrices of the target tail entities and target relations, screen out second target matrices greater than a second segmentation threshold from the plurality of second co-occurrence matrices, and combine the tail entity types and relation types corresponding to the second target matrices to obtain second combinations; and cross-combine the first combinations and second combinations to obtain the knowledge graph.
Fig. 3 shows an internal configuration diagram of a triple verification device in one embodiment. As shown in fig. 3, the triple verification device includes a processor, a memory and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the triple verification device stores an operating system and may further store a computer program which, when executed by the processor, enables the processor to implement the triple verification method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the triple verification method. It will be understood by those skilled in the art that the structure shown in fig. 3 is only a block diagram of part of the structure related to the present application and does not constitute a limitation on the triple verification device to which the present application is applied; a specific triple verification device may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
A triple verification device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring triples to be verified, and randomly sampling and labeling them to obtain labeled first triples and unlabeled second triples; embedding the triple information of the first triples into input vectors through a pre-training model, where the triple information includes the head entity, relation, tail entity and the sentence in which the first triple is located; acquiring the labeling information of the first triples, where the labeling information indicates whether each first triple is credible, and training a first binary classification model according to the input vectors and the labeling information; performing a first verification on the second triples by using the trained first binary classification model to obtain first verification results corresponding to the second triples; and determining an annotated dataset according to the first triples, the second triples and the first verification results, training a second binary classification model according to the annotated dataset, and performing a second verification on the triples to be verified by using the trained second binary classification model to obtain second verification results of the triples to be verified.
In one embodiment, embedding the triple information of a first triple into an input vector through the pre-training model includes: encoding the head entity, relation and tail entity of each first triple separately through the pre-training model to obtain a first vector corresponding to the head entity, a second vector corresponding to the relation and a third vector corresponding to the tail entity; connecting the first vector, the second vector and the third vector in sequence to obtain a first input vector; and encoding the sentence through the pre-training model and taking the encoded sentence as a second input vector, the input vector comprising the first input vector and the second input vector.
In one embodiment, the first binary classification model includes a feedforward neural network and an activation function, and training the first binary classification model according to the input vector and the labeling information includes: mapping, by the first binary classification model, the credibility of the triple to be verified to a value between 0 and 1 according to the input vector, and calculating a mapping error according to the labeling information and the mapping result; and adjusting the model parameters of the first binary classification model according to the mapping error until the mapping result meets a preset verification standard.
In one embodiment, determining the annotated dataset from the first triple, the second triple, and the first verification result includes: and taking the first triple and the second triple with the credible first verification result as an annotation data set.
In one embodiment, obtaining a triplet to be checked includes: and acquiring text data, extracting the triples to be checked from the text data, wherein the extraction is based on rules or syntactic analysis.
In one embodiment, after the second verification is performed on the triple to be verified by using the trained second binary classification model to obtain the second verification result of the triple to be verified, the method further includes: acquiring credible target triples from the annotated dataset according to the second verification result, each target triple comprising a target head entity, a target relation and a target tail entity; constructing a plurality of first co-occurrence matrices of the target head entities and target relations, screening out first target matrices greater than a first segmentation threshold from the plurality of first co-occurrence matrices, and combining the head entity types and relation types corresponding to the first target matrices to obtain first combinations; constructing a plurality of second co-occurrence matrices of the target tail entities and target relations, screening out second target matrices greater than a second segmentation threshold from the plurality of second co-occurrence matrices, and combining the tail entity types and relation types corresponding to the second target matrices to obtain second combinations; and cross-combining the first combinations and second combinations to obtain the knowledge graph.
A computer-readable storage medium stores a computer program which, when executed by a processor, performs the following steps: acquiring triples to be verified, and randomly sampling and labeling them to obtain labeled first triples and unlabeled second triples; embedding the triple information of the first triples into input vectors through a pre-training model, where the triple information includes the head entity, relation, tail entity and the sentence in which the first triple is located; acquiring the labeling information of the first triples, where the labeling information indicates whether each first triple is credible, and training a first binary classification model according to the input vectors and the labeling information; performing a first verification on the second triples by using the trained first binary classification model to obtain first verification results corresponding to the second triples; and determining an annotated dataset according to the first triples, the second triples and the first verification results, training a second binary classification model according to the annotated dataset, and performing a second verification on the triples to be verified by using the trained second binary classification model to obtain second verification results of the triples to be verified.
In one embodiment, embedding the triple information of a first triple into an input vector through the pre-training model includes: encoding the head entity, relation and tail entity of each first triple separately through the pre-training model to obtain a first vector corresponding to the head entity, a second vector corresponding to the relation and a third vector corresponding to the tail entity; connecting the first vector, the second vector and the third vector in that order to obtain a first input vector; and encoding the sentence through the pre-training model and taking the encoded sentence as a second input vector, the input vector comprising the first input vector and the second input vector.
In one embodiment, the first binary classification model includes a feedforward neural network and an activation function, and training the first binary classification model according to the input vector and the labeling information includes: mapping, by the first binary classification model, the credibility of the triple to be verified to a value between 0 and 1 according to the input vector, and calculating a mapping error according to the labeling information and the mapping result; and adjusting the model parameters of the first binary classification model according to the mapping error until the mapping result meets a preset verification standard.
In one embodiment, determining the annotated dataset from the first triple, the second triple, and the first verification result includes: and taking the first triple and the second triple with the credible first verification result as an annotation data set.
In one embodiment, acquiring the triple to be verified includes: acquiring text data and extracting the triple to be verified from the text data, the extraction being rule-based or based on syntactic analysis.
In one embodiment, after the second verification is performed on the triple to be verified by using the trained second binary classification model to obtain the second verification result of the triple to be verified, the method further includes: acquiring credible target triples from the annotated dataset according to the second verification result, each target triple comprising a target head entity, a target relation and a target tail entity; constructing a plurality of first co-occurrence matrices of the target head entities and target relations, screening out first target matrices greater than a first segmentation threshold from the plurality of first co-occurrence matrices, and combining the head entity types and relation types corresponding to the first target matrices to obtain first combinations; constructing a plurality of second co-occurrence matrices of the target tail entities and target relations, screening out second target matrices greater than a second segmentation threshold from the plurality of second co-occurrence matrices, and combining the tail entity types and relation types corresponding to the second target matrices to obtain second combinations; and cross-combining the first combinations and second combinations to obtain the knowledge graph.
It should be noted that the triple verification method, apparatus, device and computer-readable storage medium described above belong to a single general inventive concept, and the content of their respective embodiments is mutually applicable.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, and these fall within its scope of protection. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (8)

1. A method for verifying a triplet, the method comprising:
acquiring a triple to be verified, and randomly sampling and labeling the triple to be verified to obtain a labeled first triple and an unlabeled second triple;

embedding triple information of the first triple into an input vector through a pre-training model, wherein the triple information comprises a head entity, a relation, a tail entity and a sentence in which the first triple is located; acquiring labeling information of the first triple, wherein the labeling information comprises whether the first triple is credible; and training a first binary classification model according to the input vector and the labeling information;

performing a first verification on the second triple by using the trained first binary classification model to obtain a first verification result corresponding to the second triple, wherein the first verification result is whether the second triple is credible;

determining an annotated dataset according to the first triple, the second triple and the first verification result, training a second binary classification model according to the annotated dataset, and performing a second verification on the triple to be verified by using the trained second binary classification model to obtain a second verification result of the triple to be verified, wherein the second verification result is whether the triple to be verified is credible;

wherein the acquiring of the triple to be verified comprises: acquiring text data, and extracting the triple to be verified from the text data, the extraction being rule-based or based on syntactic analysis, and the text data being a text paragraph composed of a plurality of sentences;

wherein, after the second verification is performed on the triple to be verified by using the trained second binary classification model to obtain the second verification result of the triple to be verified, the method further comprises: acquiring credible target triples from the annotated dataset according to the second verification result, each target triple comprising a target head entity, a target relation and a target tail entity; constructing a plurality of first co-occurrence matrices of the target head entity and the target relation, the first co-occurrence matrices being symmetric matrices with the target head entity and the target relation as rows and columns, screening out first target matrices greater than a first segmentation threshold from the plurality of first co-occurrence matrices, and combining the head entity types and relation types corresponding to the first target matrices to obtain a first combination; constructing a plurality of second co-occurrence matrices of the target tail entity and the target relation, the second co-occurrence matrices being symmetric matrices with the target tail entity and the target relation as rows and columns, screening out second target matrices greater than a second segmentation threshold from the plurality of second co-occurrence matrices, and combining the tail entity types and relation types corresponding to the second target matrices to obtain a second combination; and cross-combining the first combination and the second combination to obtain a knowledge graph.
2. The verification method of claim 1, wherein embedding the triple information of the first triple into an input vector through the pre-training model comprises:
respectively encoding the head entity, the relation and the tail entity through the pre-training model to obtain a first vector corresponding to the head entity, a second vector corresponding to the relation and a third vector corresponding to the tail entity;
connecting the first vector, the second vector and the third vector in sequence to obtain a first input vector;
and coding the sentence through the pre-training model, and taking the coded sentence as a second input vector, wherein the input vector comprises the first input vector and the second input vector.
3. The verification method of claim 1, wherein training the first binary classification model according to the input vector and the labeling information comprises:

mapping, by the first binary classification model, the credibility of the triple to be verified to a value between 0 and 1 according to the input vector, and calculating a mapping error according to the labeling information and the mapping result;

and adjusting the model parameters of the first binary classification model according to the mapping error until the mapping result meets a preset verification standard.
4. The verification method of claim 1, wherein determining an annotated dataset from the first triple, the second triple, and the first verification result comprises:
taking the first triple, together with the second triple whose first verification result is credible, as the annotated dataset.
5. The verification method according to claim 1, wherein the pre-training model is any one of BERT, word2vec, XLNet and ALBERT.
6. A triple verification apparatus, comprising:
a labeling module, configured to acquire triples to be verified, and randomly sample and label the triples to be verified to obtain labeled first triples and unlabeled second triples;

an initial training module, configured to embed the triple information of the first triple into an input vector through a pre-training model, the triple information comprising a head entity, a relation, a tail entity and the sentence in which the first triple is located; acquire labeling information of the first triple, the labeling information comprising whether each first triple is credible; and train a first binary classification model according to the input vector and the labeling information;

a first verification module, configured to perform a first verification on the second triple by using the trained first binary classification model to obtain a first verification result corresponding to the second triple, the first verification result being whether the second triple is credible;

a training and verification module, configured to determine an annotated dataset according to the first triple, the second triple and the first verification result, train a second binary classification model according to the annotated dataset, and verify the triple to be verified by using the trained second binary classification model to obtain a second verification result of the triple to be verified, the second verification result being whether the triple to be verified is credible;

wherein the labeling module is specifically configured to: acquire text data and extract the triples to be verified from the text data, the extraction being rule-based or based on syntactic analysis, and the text data being a text paragraph composed of a plurality of sentences;

the triple verification apparatus further comprising a knowledge graph construction module, configured to: acquire credible target triples from the annotated dataset according to the second verification result, each target triple comprising a target head entity, a target relation and a target tail entity; construct a plurality of first co-occurrence matrices of the target head entity and the target relation, the first co-occurrence matrices being symmetric matrices with the target head entity and the target relation as rows and columns, screen out first target matrices greater than a first segmentation threshold from the plurality of first co-occurrence matrices, and combine the head entity types and relation types corresponding to the first target matrices to obtain a first combination; construct a plurality of second co-occurrence matrices of the target tail entity and the target relation, the second co-occurrence matrices being symmetric matrices with the target tail entity and the target relation as rows and columns, screen out second target matrices greater than a second segmentation threshold from the plurality of second co-occurrence matrices, and combine the tail entity types and relation types corresponding to the second target matrices to obtain a second combination; and cross-combine the first combination and the second combination to obtain a knowledge graph.
7. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 5.
8. A triple verification device comprising a memory and a processor, characterized in that the memory stores a computer program which, when executed by the processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 5.
CN202110594046.0A 2021-05-28 2021-05-28 Triple verification method, apparatus, device and medium Active CN113298160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110594046.0A CN113298160B (en) 2021-05-28 2021-05-28 Triple verification method, apparatus, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110594046.0A CN113298160B (en) 2021-05-28 2021-05-28 Triple verification method, apparatus, device and medium

Publications (2)

Publication Number Publication Date
CN113298160A CN113298160A (en) 2021-08-24
CN113298160B true CN113298160B (en) 2023-03-07

Family

ID=77326020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110594046.0A Active CN113298160B (en) 2021-05-28 2021-05-28 Triple verification method, apparatus, device and medium

Country Status (1)

Country Link
CN (1) CN113298160B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115982352B (en) * 2022-12-12 2024-04-02 北京百度网讯科技有限公司 Text classification method, device and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021595A (en) * 2016-10-28 2018-05-11 北大方正集团有限公司 Method and device for examining knowledge base triples
CN109378053A (en) * 2018-11-30 2019-02-22 安徽影联云享医疗科技有限公司 Knowledge graph construction method for medical images
CN109816027A (en) * 2019-01-29 2019-05-28 北京三快在线科技有限公司 Training method and device for unmanned driving decision model, and unmanned device
CN110263697A (en) * 2019-06-17 2019-09-20 哈尔滨工业大学(深圳) Unsupervised-learning-based pedestrian re-identification method, device and medium
CN112818138A (en) * 2021-04-19 2021-05-18 中译语通科技股份有限公司 Knowledge graph ontology construction method and device, terminal device and readable storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN109165385B (en) * 2018-08-29 2022-08-09 中国人民解放军国防科技大学 Multi-triple extraction method based on entity relationship joint extraction model
CN112015859B (en) * 2019-05-31 2023-08-18 百度在线网络技术(北京)有限公司 Knowledge hierarchy extraction method and device for text, computer equipment and readable medium
CN112613306A (en) * 2020-12-31 2021-04-06 恒安嘉新(北京)科技股份公司 Method, device, electronic equipment and storage medium for extracting entity relationship

Similar Documents

Publication Publication Date Title
CN111767707B (en) Method, device, equipment and storage medium for detecting duplicate cases
CN110059320B (en) Entity relationship extraction method and device, computer equipment and storage medium
CN110032739B (en) Method and system for extracting named entities of Chinese electronic medical record
CN112016318B (en) Triage information recommendation method, device, equipment and medium based on interpretation model
CN111738004A (en) Training method of named entity recognition model and named entity recognition method
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN111666775B (en) Text processing method, device, equipment and storage medium
CN112507039A (en) Text understanding method based on external knowledge embedding
CN110472049B (en) Disease screening text classification method, computer device and readable storage medium
CN111191457A (en) Natural language semantic recognition method and device, computer equipment and storage medium
CN110321426B (en) Digest extraction method and device and computer equipment
CN111710383A (en) Medical record quality control method and device, computer equipment and storage medium
Gong et al. Continual pre-training of language models for math problem understanding with syntax-aware memory network
CN113298160B (en) Triple verification method, apparatus, device and medium
CN110808095B (en) Diagnostic result recognition method, model training method, computer equipment and storage medium
CN116245107A (en) Electric power audit text entity identification method, device, equipment and storage medium
CN115526234A (en) Cross-domain model training and log anomaly detection method and device based on transfer learning
CN111191439A (en) Natural sentence generation method and device, computer equipment and storage medium
CN112036151B (en) Gene disease relation knowledge base construction method, device and computer equipment
CN115238645A (en) Asset data identification method and device, electronic equipment and computer storage medium
CN113283461A (en) Financial big data processing system and method based on block chain
CN115422357A (en) Text classification method and device, computer equipment and storage medium
CN114638229A (en) Entity identification method, device, medium and equipment for record data
CN113821571A (en) Food safety relation extraction method based on BERT and improved PCNN
CN111562943A (en) Code clone detection method and device based on event embedded tree and GAT network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant