CN114626368A - Method and system for acquiring common knowledge of vertical domain rules - Google Patents

Method and system for acquiring common knowledge of vertical domain rules

Info

Publication number
CN114626368A
Authority
CN
China
Prior art keywords
knowledge
common knowledge
rule common
rule
rules
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210266934.4A
Other languages
Chinese (zh)
Other versions
CN114626368B (en
Inventor
刘鑫
崔莹
李春豹
刘万里
黄刘
陈莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 10 Research Institute filed Critical CETC 10 Research Institute
Priority to CN202210266934.4A priority Critical patent/CN114626368B/en
Publication of CN114626368A publication Critical patent/CN114626368A/en
Application granted granted Critical
Publication of CN114626368B publication Critical patent/CN114626368B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of artificial intelligence and discloses a method and a system for acquiring rule common sense knowledge in a vertical domain. The acquisition method comprises the following steps: S1, formulating a rule common sense knowledge modeling specification; S2, constructing a rule common sense knowledge acquisition network basic model; and S3, constructing a rule common sense knowledge acquisition network complete model. The invention solves the problems of low efficiency, low accuracy, high cost and the like of rule common sense knowledge acquisition in the prior art.

Description

Method and system for acquiring common knowledge of vertical domain rules
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a system for acquiring rule common knowledge in the vertical field.
Background
At present, artificial intelligence technologies have achieved good results in many fields, such as natural language processing, image recognition, and audio and video synthesis. In essence, large-scale data training lets a model learn to discriminate according to differences in data features. However, an intelligent processing model obtained in this way is strongly affected by data quality: when the data distribution deviates even slightly, for example when part of a text is missing or slight noise is added to an image, the model makes discrimination errors, and its generalization and robustness are poor. It is even harder for such a model to perform high-order tasks such as logical reasoning, scene understanding, and decision analysis, or to migrate and adapt to unknown conditions. The reason is that traditional machine learning algorithms do not introduce multiple types of knowledge, which is why the importance of knowledge in the field of artificial intelligence is increasingly prominent.
Among the various kinds of knowledge there is a special kind, rule common sense knowledge, whose acquisition, representation, and processing have long been a core problem in the field of artificial intelligence. Many researchers have found that things a child of only a few years can do easily still cannot be handled effectively by artificial intelligence after many years of research. The artificial intelligence expert Hubert Dreyfus held that the common sense problem is the biggest obstacle to realizing general artificial intelligence, and that if the common sense problem were solved, research on artificial intelligence would be complete. Introducing rule common sense knowledge is therefore of great significance for breakthroughs in artificial intelligence technology. There is currently no commonly accepted definition of rule common sense knowledge in academia or industry. It is usually described abstractly as the shared ability of almost all people to perceive, understand, and judge things, which people can reasonably expect of one another without debate. Meanwhile, different vertical fields, such as medical treatment, finance, and military affairs, have rule common sense knowledge specific to the field, often presented in rule form, such as "a nuclear submarine needs nuclear energy", "an aircraft carrier is larger than a protective carrier", and "load before firing the cannon". Rule common sense knowledge in a vertical field is likewise assumed and shared by almost everyone in that field. It is usually reflected in people's thinking, awareness, and unconscious behavior, and is rarely recorded explicitly in data carriers such as text, images, audio, and video.
Moreover, the biggest difference between rule common sense knowledge and encyclopedic knowledge, for which the knowledge graph is the mainstream form, is that the entities and relations of encyclopedic knowledge are easy to model and the relations are relatively clear, whereas the relations between entities in rule common sense knowledge are difficult to determine. As a result, the extraction-based techniques of conventional knowledge acquisition cannot acquire common sense relations that are not defined in advance, so rule common sense knowledge is missed, and extraction-based techniques are not feasible for acquiring it.
For these reasons, no technical approach for rapidly and automatically acquiring rule common sense knowledge exists at present. A survey of mainstream rule common sense knowledge bases at home and abroad, such as Cyc, NELL, and ConceptNet, shows that current high-confidence acquisition methods rely mainly on manual crowdsourcing, which consumes a large amount of manpower and material resources. Because this mode of acquisition is too costly, a new method for acquiring rule common sense knowledge in the vertical field is urgently needed to reduce cost, enable rapid large-scale acquisition, and support downstream reasoning applications based on rule common sense knowledge.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a method and a system for acquiring the common knowledge of the rules in the vertical field, and solves the problems of low acquisition efficiency, low accuracy, high cost and the like of the common knowledge of the rules in the prior art.
The technical scheme adopted by the invention for solving the problems is as follows:
a method for acquiring common knowledge of vertical domain rules comprises the following steps:
S1, formulating a rule common sense knowledge modeling specification: classifying the rule common sense knowledge and formulating a modeling specification according to the coverage of the rule common sense knowledge and its support requirements for downstream tasks;
S2, constructing a rule common sense knowledge acquisition network basic model: constructing the basic model by utilizing the language knowledge that a language pre-training model acquires from text; designing pre-training reasoning tasks that meet the requirements of rule common sense knowledge acquisition, and inputting part of the known domain rule common sense knowledge into the basic model to fine-tune it, so that the basic model learns the connotations of rule common sense knowledge and gains the capability of generating rule common sense knowledge;
S3, constructing a rule common sense knowledge acquisition network complete model: training the basic model around the particular way rule common sense knowledge is used in downstream reasoning tasks, so as to acquire rule common sense knowledge that conforms to human cognition.
As a preferred technical solution, the step S2 includes the following steps:
S21, input layer representation: obtaining the input representation of the rule common sense knowledge acquisition network from the original input text;
S22, network model encoding: fully learning the semantic association between the words in the text to obtain the context semantic representation of the text;
S23, network model parameter optimization: calculating the cross entropy loss and continuously optimizing the parameters of the rule common sense knowledge acquisition network basic model, and stopping optimization when the cross entropy loss is less than a set threshold to obtain the final parameters of the basic model.
As a preferred technical solution, in step S3, the pre-training inference task includes: mask language model task, next sentence prediction task.
As a preferable technical solution, in step S3, when executing the mask language model task, the input is defined as a form of two-segment text concatenation.
As a preferred technical solution, in step S21, it is assumed that the original input text is $x_1 x_2 \dots x_i \dots x_n$ and that the input text after the mask operation is $x'_1 x'_2 \dots x'_i \dots x'_n$. The masked input text is processed to obtain the input representation $v$ of the rule common sense knowledge acquisition network, calculated as:
$v = \mathrm{InputRepresentation}(X)$,
where $X = [\mathrm{CLS}]\, x'_1 x'_2 \dots x'_i \dots x'_n\, [\mathrm{SEP}]$, $x_i$ denotes the $i$-th word of the input text, $x'_i$ denotes the $i$-th word after mask processing, [CLS] is a special mark indicating the beginning of a text sequence, and [SEP] is a separation mark between text sequences.
As a preferred technical solution, in step S22, the input representation $v$ passes through 4 Transformer layers, and the semantic association between the words in the text is fully learned by the self-attention mechanism, finally yielding the context semantic representation of the text, calculated as:
$h^{[l]} = \mathrm{TransformerBlock}(h^{[l-1]}), \quad l \in \{1, 2, \dots, L\}$,
where $h^{[l]} \in \mathbb{R}^{N \times d}$ denotes the hidden layer output of the $l$-th Transformer layer, $h^{[0]} = v$, $N$ denotes the sequence length, and $d$ denotes the hidden layer dimension of the rule common sense knowledge acquisition network.
As a preferred technical solution, in step S23, the probability distribution $P_i$ over the vocabulary corresponding to the $i$-th component of the mask representation is calculated, and the cross entropy loss is calculated from $P_i$ and the label $y_i$. $P_i$ is calculated as:
$P_i = \mathrm{Softmax}(h_i^m W^{\mathsf{T}} + b^o)$,
where $\mathrm{Softmax}(\cdot)$ denotes the activation function, $i$ denotes the component index in the mask representation, $m$ marks a masked position, $W$ denotes the word vector matrix, $\mathsf{T}$ denotes the transpose operation, $W^{\mathsf{T}}$ denotes the transpose of the word vector matrix, $h_i^m$ denotes the vector representation of the $i$-th masked word, and $b^o$ denotes the bias of the fully connected layer.
As a preferred technical solution, the step S3 includes the following steps:
s31, adding a rule common knowledge acquisition task layer based on the large-scale language pre-training model, and constructing a rule common knowledge acquisition network;
s32, selecting part of rule common knowledge from the existing rule common knowledge base as a seed knowledge set, and converting the seed knowledge set from a triple form into a natural language form;
s33, using the seed knowledge set in the natural language form in the step S32 to carry out mask prediction training on the rule common knowledge acquisition network, so that the network has the capability of generating the field rule common knowledge;
s34, obtaining a head entity by an entity extraction mode aiming at the sources of the open domain text data and/or the domain text database;
S35, inputting the head entity obtained in step S34, together with a specific common sense relation to be predicted within the rule dimensions, into the rule common sense knowledge acquisition network trained in step S33;
s36, the rule common knowledge acquisition network generates a series of corresponding reasonable tail entities according to different head entity and common knowledge relation combinations to obtain a new rule common knowledge triple in the vertical field;
and S37, storing the rule common sense knowledge obtained in the step S36 in a database, and expanding the scale of the rule common sense knowledge database.
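Steps S32 and S35 can be illustrated with a minimal sketch. The patent does not state how a triple is verbalized or how the tail slot is marked, so the relation names, the `TEMPLATES` table, and the use of a `[MASK]` placeholder below are all assumptions made for illustration only.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

# Hypothetical verbalization templates; the patent does not specify them.
TEMPLATES = {
    "Requires": "{head} requires {tail}",
    "LargerThan": "{head} is larger than {tail}",
    "Before": "{head} happens before {tail}",
}

MASK = "[MASK]"  # assumed placeholder for the tail entity to be generated

def triple_to_text(t: Triple) -> str:
    """S32 (sketch): turn a seed triple into a natural language sentence."""
    return TEMPLATES[t.relation].format(head=t.head, tail=t.tail)

def build_query(head: str, relation: str) -> str:
    """S35 (sketch): combine a head entity with a relation and mask the tail."""
    return TEMPLATES[relation].format(head=head, tail=MASK)

seed = Triple("nuclear submarine", "Requires", "nuclear energy")
print(triple_to_text(seed))
print(build_query("aircraft carrier", "Requires"))
```

In this reading, the sentences produced by `triple_to_text` would serve as training text for S33, and the strings produced by `build_query` would be the inputs whose masked slot the trained network fills in S36.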
As a preferred technical solution, the rule common sense knowledge is classified according to the following dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivation rules, characteristic rules, comparability rules, temporal rules.
A vertical domain rule common sense knowledge acquisition system is based on the vertical domain rule common sense knowledge acquisition method and comprises the following modules which are electrically connected in sequence:
a rule common knowledge modeling specification formulation module: the method is used for classifying the rule common knowledge and formulating modeling specifications according to the coverage scope of the rule common knowledge and the support requirements of the rule common knowledge on downstream tasks;
a rule common sense knowledge acquisition network basic model construction module: used for constructing the rule common sense knowledge acquisition network basic model by utilizing the language knowledge that a language pre-training model acquires from text; designing pre-training reasoning tasks that meet the requirements of rule common sense knowledge acquisition, and inputting part of the known domain rule common sense knowledge into the basic model to fine-tune it, so that the basic model learns the connotations of rule common sense knowledge and gains the capability of generating rule common sense knowledge;
a rule common knowledge acquisition network complete model construction module: the rule common knowledge acquisition network basic model is trained by surrounding the particularity of the use mode of the rule common knowledge in the downstream reasoning task, so that the acquisition of the rule common knowledge conforming to human cognition is realized.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention provides a rule common sense knowledge acquisition technology based on a generative Transformer. To address the difficulty of determining the relations between entities in rule common sense knowledge, it builds on the "knowledge reserve" of a large-scale language pre-training model and adds supervision from a small amount of knowledge, which strengthens the acquisition network model's understanding of concepts and rules so that it generates rule common sense knowledge that spans multiple dimensions and conforms to human cognition;
(2) the invention formulates a rule common sense knowledge modeling specification. Academia and industry have not yet provided a unified definition or classification standard for rule common sense knowledge. Building on a summary of mainstream foreign rule common sense knowledge bases, the invention flexibly classifies rule common sense knowledge hierarchically and models it to a specification according to the storage management and downstream application requirements of vertical domain rule common sense knowledge, thereby guiding its acquisition;
(3) the invention constructs a rule common sense knowledge acquisition network and, on the basis of a large-scale language pre-training model, designs pre-training tasks that meet the requirements of rule common sense knowledge acquisition. With the supervision and guidance of only a small sample of domain rule common sense knowledge, the model can come to understand domain concepts and rules in depth and gains the capability of generating domain rule common sense knowledge.
Drawings
FIG. 1 is a schematic diagram of a method for acquiring knowledge of rules in the vertical field according to the present invention;
FIG. 2 is a schematic diagram of a vertical domain rule knowledge acquisition system according to the present invention;
FIG. 3 is a schematic diagram of a rule common knowledge classification;
FIG. 4 is a first partial enlarged view of FIG. 3;
FIG. 5 is a second partial enlarged view of FIG. 3;
FIG. 6 is a third partial enlarged view of FIG. 3;
FIG. 7 is a schematic diagram of a rule common knowledge modeling specification;
FIG. 8 is a first partial enlarged view of FIG. 7;
FIG. 9 is a second partial enlarged view of FIG. 7;
FIG. 10 is a schematic diagram of the construction of a rule common knowledge acquisition network base model;
FIG. 11 is a schematic diagram of rule common knowledge acquisition;
FIG. 12 is a first partial enlarged view of FIG. 11;
FIG. 13 is a second partial enlarged view of FIG. 11.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
Example 1
As shown in fig. 1 to fig. 13, the invention discloses a generative method for acquiring rule common sense knowledge in a vertical domain. Rule common sense knowledge is abstract and dispersed, its relations are ambiguous, and it is difficult to acquire automatically. To address these problems, the method makes breakthroughs mainly in rule common sense knowledge modeling specification formulation, rule common sense knowledge acquisition network construction, and rule common sense knowledge acquisition. It enables automatic acquisition of large-scale rule common sense knowledge and supports downstream tasks such as machine reasoning that fuses rule common sense knowledge.
The method for acquiring rule common sense knowledge in the vertical field comprises rule common sense knowledge modeling specification formulation, rule common sense knowledge acquisition network construction, and rule common sense knowledge acquisition. The function of each part is briefly explained as follows:
Rule common sense knowledge modeling specification formulation: according to the coverage of rule common sense knowledge and its support requirements for downstream tasks, rule common sense knowledge is classified and a modeling specification is formulated, which guides subsequent classified acquisition, storage management, and the design of reasoning application methods.
Construction of the rule common sense knowledge acquisition network basic model: the basic model is constructed using the "knowledge reserve" of a large-scale language pre-training model; pre-training tasks that meet the rule common sense knowledge acquisition requirement are designed; and part of the known domain rule common sense knowledge is input to fine-tune the basic model, so that it learns the connotations of rule common sense knowledge and gains the capability of generating it.
Construction of the rule common sense knowledge acquisition network complete model: around the particular way rule common sense knowledge is used in downstream reasoning tasks, a rule common sense knowledge acquisition task layer is added on top of the basic model to construct the complete model, so that rule common sense knowledge conforming to human cognition is acquired efficiently.
The beneficial effects of the invention are:
(1) The invention provides a rule common sense knowledge acquisition technology based on a generative Transformer. To address the difficulty of determining the relations between entities in rule common sense knowledge, it builds on the "knowledge reserve" of a large-scale language pre-training model and adds supervision from a small amount of knowledge, which strengthens the acquisition network model's understanding of concepts and rules so that it generates rule common sense knowledge that spans multiple dimensions and conforms to human cognition.
(2) The invention formulates a rule common sense knowledge modeling specification. Academia and industry have not yet provided a unified definition or classification standard for rule common sense knowledge. Building on a summary of mainstream foreign rule common sense knowledge bases, the invention flexibly classifies rule common sense knowledge hierarchically and models it to a specification according to the storage management and downstream application requirements of vertical domain rule common sense knowledge, thereby guiding its acquisition.
(3) The invention constructs a rule common sense knowledge acquisition network and, on the basis of a large-scale language pre-training model, designs pre-training tasks that meet the requirements of rule common sense knowledge acquisition. With the supervision and guidance of only a small sample of domain rule common sense knowledge, the model can come to understand domain concepts and rules in depth and gains the capability of generating domain rule common sense knowledge.
Example 2
As shown in fig. 1 to fig. 13, as a further optimization of the embodiment 1, this embodiment includes all the technical features of the embodiment 1, and in addition, this embodiment also includes the following technical features:
The method is implemented by the following steps:
1. Formulating the rule common sense knowledge modeling specification;
Referring to fig. 3, there is currently no authoritative, unified standard for classifying rule common sense knowledge in academia or industry. To organize, store, manage, and iteratively update rule common sense knowledge more scientifically, and to support different downstream reasoning tasks (such as intelligent question answering and retrieval recommendation), the invention first divides rule common sense knowledge into 11 dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivation rules, characteristic rules, comparability rules, and temporal rules. Each dimension is then subdivided into different rule relation types. Because rule common sense knowledge used in downstream reasoning tasks does not need its relations to be specified as strictly as knowledge graph knowledge does, the subdivision under the 11 dimensions is left open to users, which greatly increases the flexibility and ease of use of rule common sense knowledge. For example, similar rules may be subdivided into specific relation types such as synonyms, similar, defined, and liked.
Second, the invention models rule common sense knowledge. Referring to fig. 7, rule common sense knowledge, for example "wife is synonymous with wife", "nuclear submarine requires nuclear power", and "war is caused by resource contention", is stored and used in two forms: natural language text description and structured triple. In the figure, the rule dimension is the dimension to which a piece of rule common sense knowledge belongs; the common sense definition is its abstract expression form; the common sense description is the specific knowledge content in natural language text form; and the relation name, head entity, and tail entity are the content elements of the knowledge in structured triple form. From the perspective of practical reasoning applications, because the concrete relations of rule common sense knowledge are uncertain, a more generalized and abstract relation is used during modeling to normalize and constrain the type of rule common sense knowledge; that is, diverse concrete relations may exist within one rule dimension.
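The modeling specification described above can be pictured as a small record type. The following is only a sketch: the field names mirror the elements named in the text (rule dimension, common sense definition, common sense description, relation name, head entity, tail entity), while the class name `RuleCommonSense`, the short dimension labels, and the assignment of the example to the "usage" dimension are assumptions, not taken from the patent figures.

```python
from dataclasses import dataclass

# The 11 rule dimensions listed in the text (short labels are illustrative).
RULE_DIMENSIONS = (
    "similar", "different", "classification", "part-whole", "spatial",
    "creation", "usage", "motivation", "characteristic", "comparability",
    "temporal",
)

@dataclass(frozen=True)
class RuleCommonSense:
    dimension: str    # rule dimension the knowledge belongs to
    definition: str   # abstract expression form
    description: str  # natural language text form
    relation: str     # relation name (open-ended within a dimension)
    head: str         # head entity
    tail: str         # tail entity

    def __post_init__(self):
        # Only the dimension is constrained; concrete relations stay open.
        if self.dimension not in RULE_DIMENSIONS:
            raise ValueError(f"unknown rule dimension: {self.dimension}")

    def as_triple(self):
        """Return the structured triple form (head, relation, tail)."""
        return (self.head, self.relation, self.tail)

# Hypothetical record; the dimension chosen here is an assumption.
record = RuleCommonSense(
    dimension="usage",
    definition="X requires Y",
    description="A nuclear submarine requires nuclear power.",
    relation="requires",
    head="nuclear submarine",
    tail="nuclear power",
)
print(record.as_triple())
```

Validating only the dimension, and not the relation name, reflects the point made in the text that a rule dimension constrains the knowledge type while the concrete relations under it remain diverse.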
2. Constructing the rule common sense knowledge acquisition network basic model;
Referring to fig. 10, the basic model of the rule common sense knowledge acquisition network is composed of multiple Transformer layers. To meet the task requirements of rule common sense knowledge acquisition, two pre-training reasoning tasks are designed for model training: the Masked Language Model task (MLM) and the Next Sentence Prediction task (NSP).
During network model training, the input of the network is formed by splicing two text segments $x^{(1)}$ and $x^{(2)}$. The rule common sense knowledge acquisition network model models the input text to obtain its semantic representation, and finally learns the masked language model and the next sentence prediction model. The masked language model places no special requirement on the input form, which can be one text segment or two, whereas the next sentence prediction model requires the input to be two text segments. Therefore, to unify the subsequent training process of the rule common sense knowledge acquisition network model, the input in the pre-training task stage is uniformly specified as two spliced text segments. The training process is introduced next in three steps: input layer representation, network model encoding, and network model parameter optimization.
(1) Input layer representation;
Assume the original input text is $x_1 x_2 \dots x_n$ and the input text after the mask operation is $x'_1 x'_2 \dots x'_n$, where $x_i$ denotes the $i$-th word of the input text and $x'_i$ denotes the $i$-th word after mask processing. The masked input text is processed as follows to obtain the input representation $v$ of the rule common sense knowledge acquisition network:
$X = [\mathrm{CLS}]\, x'_1 x'_2 \dots x'_n\, [\mathrm{SEP}]$,
$v = \mathrm{InputRepresentation}(X)$,
where [CLS] is a special mark indicating the beginning of a text sequence and [SEP] is a separation mark between text sequences. Note that if the length of the input sequence $X$ is less than the maximum sequence length $N$ of the rule common sense knowledge acquisition network, padding tokens [PAD] must be appended after the input text until the maximum sequence length $N$ is reached. For example, if the maximum sequence length is $N = 10$ and the input sequence length is 7 (two special marks plus $x_1$ to $x_5$), 3 [PAD] marks must be appended to the input sequence:
[CLS] $x_1 x_2 x_3 x_4 x_5$ [SEP] [PAD] [PAD] [PAD]
If the length of the input sequence $X$ is greater than the maximum sequence length of the rule common sense knowledge acquisition network, the input sequence must be truncated to the maximum sequence length of the network. For example, if the maximum sequence length is $N = 5$ and the input sequence length is 7 (two special marks plus $x_1$ to $x_5$), the sequence must be truncated so that the length of the valid sequence (the input sequence without the 2 special marks) becomes 3:
[CLS] $x_1 x_2 x_3$ [SEP]
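The padding and truncation rule can be sketched in a few lines. This is a minimal illustration for a single text segment; the function name `build_input` is hypothetical, and tokens are plain strings rather than vocabulary ids.

```python
CLS, SEP, PAD = "[CLS]", "[SEP]", "[PAD]"

def build_input(tokens, max_len):
    """Wrap tokens in [CLS]/[SEP], then truncate or pad to exactly max_len."""
    if max_len < 2:
        raise ValueError("max_len must leave room for [CLS] and [SEP]")
    body = list(tokens)[: max_len - 2]         # keep at most max_len - 2 valid tokens
    seq = [CLS] + body + [SEP]
    return seq + [PAD] * (max_len - len(seq))  # right-pad when the sequence is short

tokens = ["x1", "x2", "x3", "x4", "x5"]
print(build_input(tokens, 10))  # padding case from the text (N = 10)
print(build_input(tokens, 5))   # truncation case from the text (N = 5)
```

Truncating the body before adding the special marks guarantees that [SEP] is never cut off, which matches the second example in the text, where only three valid tokens remain.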
(2) Network model coding;
In the encoding layer of the rule common sense knowledge acquisition network, the input representation $v$ passes through 4 Transformer layers, and the semantic association between the words in the text is fully learned by means of the self-attention mechanism. The specific encoding method of the Transformer is well established in the field of artificial intelligence and is not described here.
$h^{[l]} = \mathrm{TransformerBlock}(h^{[l-1]}), \quad l \in \{1, 2, \dots, L\}$,
where $h^{[l]} \in \mathbb{R}^{N \times d}$ denotes the hidden layer output of the $l$-th Transformer layer, and $h^{[0]} = v$ is specified to keep the equation complete. For convenience of description, the layer labels are omitted and the above is simplified as:
$h = \mathrm{Transformer}(v)$,
where $h$ denotes the output of the last Transformer layer, i.e. $h^{[L]}$. This finally yields the context semantic representation $h \in \mathbb{R}^{N \times d}$ of the text, where $d$ denotes the hidden layer dimension of the rule common sense knowledge acquisition network.
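The layer-by-layer recursion can be illustrated with a deliberately simplified NumPy sketch. It is not the patent's encoder: each layer here is a single-head scaled dot-product self-attention with a residual connection only, omitting the multi-head split, feed-forward sublayer, and layer normalization of a full Transformer block. The weights are random and the sizes ($N = 10$, $d = 16$) are assumptions chosen for the demonstration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_layer(h, Wq, Wk, Wv):
    """Simplified layer: single-head self-attention plus a residual connection."""
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (N, N) attention weights
    return h + weights @ v

def encode(v, layers):
    h = v                      # h[0] = v
    for Wq, Wk, Wv in layers:  # h[l] is computed from h[l-1]
        h = self_attention_layer(h, Wq, Wk, Wv)
    return h                   # h[L]

rng = np.random.default_rng(0)
N, d, L = 10, 16, 4            # sequence length, hidden size, number of layers
v = rng.normal(size=(N, d))
layers = [tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
          for _ in range(L)]
h = encode(v, layers)
print(h.shape)
```

The point of the sketch is only the shape bookkeeping: every layer maps an $N \times d$ matrix to another $N \times d$ matrix, so the output of the fourth layer can serve directly as the context semantic representation $h$.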
(3) Network model parameter optimization.
Since the masked language model masks only some of the words in the input text, it is not necessary to predict every position in the input text, only the masked positions. Assume the set $\mathbb{M} = \{m_1, m_2, \dots, m_k\}$ denotes the subscripts of all mask positions, where $k$ denotes the total number of masks. If the input text length is $n$ and the mask ratio is 15%, then $k = n \times 15\%$. Then, using the elements of $\mathbb{M}$ as subscripts, the corresponding representations are extracted from the context semantic representation $h$ of the input sequence and spliced to obtain the mask representation $h^m \in \mathbb{R}^{k \times d}$.
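Selecting the mask positions and gathering the mask representation can be sketched as follows. The function names are hypothetical, and two details are assumptions the text leaves open: $k$ is rounded to the nearest integer with a minimum of 1, and positions index the rows of $h$ directly without an offset for [CLS].

```python
import numpy as np

def choose_mask_positions(n, ratio=0.15, rng=None):
    """Pick k = round(n * ratio) distinct positions to mask (at least one)."""
    if rng is None:
        rng = np.random.default_rng()
    k = max(1, round(n * ratio))
    return np.sort(rng.choice(n, size=k, replace=False))

def gather_mask_repr(h, positions):
    """Stack the rows of h at the mask positions into a (k, d) matrix."""
    return h[positions]

rng = np.random.default_rng(0)
n, d = 20, 8                       # illustrative text length and hidden size
h = rng.normal(size=(n, d))        # stand-in for the encoder output
positions = choose_mask_positions(n, rng=rng)
h_m = gather_mask_repr(h, positions)
print(positions.shape, h_m.shape)
```

With $n = 20$ and a 15% ratio this yields $k = 3$ masked positions, so `h_m` has three rows, one per position to be predicted.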
In the rule common sense knowledge acquisition network, because the input representation dimension e equals the hidden-layer dimension d, the word vector matrix W ∈ R^{|V|×e} can be used directly to map the mask representation to the vocabulary space. For the i-th component of the mask representation, the probability distribution P_i over the vocabulary at that masked position is calculated by the following formula:
P_i = Softmax(h^m_i W^T + b^o),
where b^o ∈ R^{|V|} denotes the bias of the fully connected layer. Finally, after the probability distribution P_i for the masked position is obtained, the cross-entropy loss is calculated against the label y_i (i.e., the one-hot vector representation of the original word x_i), and the model parameters are optimized.
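The projection to the vocabulary and the loss can be sketched as follows, assuming the form P_i = Softmax(h^m_i W^T + b^o) described in this section. The toy vocabulary size, the numeric values, and the function names are illustrative assumptions. Because the label is one-hot, the cross-entropy reduces to the negative log-probability of the original word.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_probabilities(h_m_i, W, b):
    """Compute Softmax(h_m_i W^T + b): one logit per vocabulary row of W."""
    logits = [sum(x * w for x, w in zip(h_m_i, row)) + bias for row, bias in zip(W, b)]
    return softmax(logits)


def cross_entropy(p, target_index):
    """Cross-entropy against a one-hot label: -log of the target probability."""
    return -math.log(p[target_index])


# Toy setup (assumed values): |V| = 3 words, e = d = 2.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # word vector matrix, one row per word
b = [0.0, 0.0, 0.0]                       # bias of the fully connected layer
h_m_i = [2.0, 0.0]                        # representation of one masked position

p = mask_probabilities(h_m_i, W, b)
loss = cross_entropy(p, target_index=0)   # the original word is vocabulary entry 0
print(p, loss)
```

Reusing W for the output projection is only possible because e equals d, as the text notes; otherwise an extra projection layer would be required.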
3. Constructing the complete model of the rule common sense knowledge acquisition network.
Referring to Fig. 11, the invention provides a vertical domain rule common sense knowledge acquisition technique based on a generative Transformer. It trains a rule common sense knowledge acquisition network on top of a large-scale pre-trained language model, and then automatically generates rule common sense knowledge in the 11 dimensions for the entities extracted from text. The specific steps are as follows:
Step 1: Add a rule common sense knowledge acquisition task layer on top of a large-scale pre-trained language model such as BERT or GPT to construct the complete model of the rule common sense knowledge acquisition network;
Step 2: Select part of the rule common sense knowledge from existing rule common sense knowledge bases (such as ConceptNet, Cyc, NELL, and domain common sense knowledge bases) as a seed knowledge set, for example part of the open-domain rule common sense knowledge and part of the vertical domain rule common sense knowledge, and convert it from triple form into natural language form;
Step 3: Train the complete model of the rule common sense knowledge acquisition network on the natural-language seed knowledge set from Step 2, so that the network is able to generate domain rule common sense knowledge;
Step 4: Obtain head entities such as objects, events, and concepts by entity extraction from sources such as open-domain text data and domain text databases;
Step 5: Input the head entities obtained in Step 4, together with the specific common sense relations to be predicted in the 11 dimensions, into the complete model trained in Step 3;
Step 6: For each combination of head entity and common sense relation, the rule common sense knowledge acquisition network generates a series of plausible tail entities, yielding new rule common sense knowledge triples in the vertical domain;
Step 7: Store the rule common sense knowledge obtained in Step 6 in the knowledge base, expanding the scale of the rule common sense knowledge base.
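The data flow of Steps 2 and 5 to 7 can be sketched in Python as below. Everything specific in this sketch is hypothetical: the sentence templates, the relation names used as dictionary keys, the `[MASK]` placeholder for the tail entity, and `stub_generator`, which merely stands in for the trained network and returns a placeholder rather than real output.

```python
# Hypothetical templates for three of the relation dimensions.
TEMPLATES = {
    "part-whole": "{head} is part of {tail}.",
    "usage": "{head} is used for {tail}.",
    "spatial": "{head} is located at {tail}.",
}


def triple_to_sentence(head, relation, tail):
    """Step 2: convert a (head, relation, tail) triple into natural language."""
    return TEMPLATES[relation].format(head=head, tail=tail)


def build_query(head, relation, mask_token="[MASK]"):
    """Step 5: a head entity and relation with the tail left for the model to fill."""
    return triple_to_sentence(head, relation, mask_token)


def acquire(heads, relations, generate_tails):
    """Steps 5 to 7: query every head/relation pair and collect new triples."""
    knowledge_base = []
    for head in heads:
        for relation in relations:
            for tail in generate_tails(build_query(head, relation)):
                knowledge_base.append((head, relation, tail))
    return knowledge_base


def stub_generator(query):
    """Placeholder for the trained network; returns one dummy tail per query."""
    return ["<generated tail>"]


print(triple_to_sentence("antenna", "part-whole", "radar"))
print(acquire(["radar"], ["usage", "spatial"], stub_generator))
```

Separating the templates from the generation loop means the same verbalization serves both the seed knowledge used for training and the queries issued at acquisition time.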
As described above, the present invention can be preferably realized.
All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
The foregoing is only a preferred embodiment of the present invention, and the present invention is not limited thereto in any way, and any simple modification, equivalent replacement and improvement made to the above embodiment within the spirit and principle of the present invention still fall within the protection scope of the present invention.

Claims (10)

1. A method for acquiring vertical domain rule common knowledge, characterized by comprising the following steps:
S1, establishing a rule common knowledge modeling specification: classifying the rule common knowledge and formulating a modeling specification according to the coverage scope of the rule common knowledge and its support requirements for downstream tasks;
S2, establishing a rule common knowledge acquisition network basic model: establishing the rule common knowledge acquisition network basic model by utilizing the language knowledge capability that a language pre-training model acquires in text learning; designing a pre-training reasoning task that meets the rule common knowledge acquisition requirement, and inputting part of the rule common knowledge of known fields into the rule common knowledge acquisition network basic model so as to fine-tune it, prompting the basic model to learn the connotation of the rule common knowledge and thereby gain the capability of generating rule common knowledge;
S3, constructing a rule common knowledge acquisition network complete model: training the rule common knowledge acquisition network basic model around the particularity of how rule common knowledge is used in downstream reasoning tasks, so as to acquire rule common knowledge that conforms to human cognition.
2. The method for acquiring knowledge of common general knowledge of vertical domain rules according to claim 1, wherein the step S2 comprises the following steps:
S21, input layer representation: obtaining the input representation of the rule common knowledge acquisition network from the original input text;
S22, network model encoding: fully learning the semantic associations between the words in the text to obtain the contextual semantic representation of the text;
S23, network model parameter optimization: calculating the cross-entropy loss and iteratively optimizing the parameters of the rule common knowledge acquisition network basic model, and stopping the optimization when the cross-entropy loss is less than a set threshold, to obtain the final parameters of the rule common knowledge acquisition network basic model.
3. The method for acquiring the common general knowledge of the vertical domain rule according to claim 2, wherein in the step S3, the pre-trained inference task is designed to include: mask language model task, next sentence prediction task.
4. The method for acquiring common sense knowledge of vertical domain rules according to claim 3, wherein in step S3, when performing the task of mask language model, the input is defined uniformly as a form of two text concatenations.
5. The method for obtaining knowledge of vertical domain rules general knowledge according to claim 4, wherein in step S21, the original input text is assumed to be x_1 x_2 … x_i … x_n, the input text after the mask operation is x'_1 x'_2 … x'_i … x'_n, and the masked input text is processed to obtain the input representation v of the rule common knowledge acquisition network, calculated as follows:
v = InputRepresentation(X),
wherein X = [CLS] x'_1 x'_2 … x'_i … x'_n [SEP], x_i denotes the i-th word of the input text, x'_i denotes the i-th word after mask processing, [CLS] denotes the special marker for the beginning of a text sequence, and [SEP] denotes the separator between text sequences.
6. The method for obtaining knowledge of common general knowledge in vertical domain rules according to claim 5, wherein in step S22, the input representation v passes through 4 layers of transformers, and the semantic association between each word in the text is fully learned by means of a self-attention mechanism, so as to obtain the semantic representation of the context of the text, and the calculation formula is as follows:
h^{[l]} = Transformer-Block(h^{[l-1]}), ∀ l ∈ {1, 2, …, L},
wherein h^{[l]} ∈ R^{N×d} denotes the hidden-layer output of the l-th Transformer layer, h^{[0]} = v, N denotes the sequence length, and d denotes the hidden-layer dimension of the rule common knowledge acquisition network.
7. The method for obtaining knowledge of vertical domain rules general knowledge according to claim 6, wherein in step S23, the probability distribution P_i over the vocabulary corresponding to the i-th component of the mask representation is calculated, and the cross-entropy loss is calculated from P_i and the label y_i; P_i is calculated as follows:
P_i = Softmax(h^m_i W^T + b^o),
where Softmax(·) denotes the activation function, i denotes the component index in the mask representation, m denotes the masked marker, W denotes the word vector matrix, T denotes the transpose operation, W^T denotes the transpose of the word vector matrix, h^m_i denotes the vector representation of the i-th masked word, and b^o denotes the bias of the fully connected layer.
8. The method for acquiring the common general knowledge of the vertical domain rule according to claim 7, wherein the step S3 comprises the following steps:
S31, adding a rule common knowledge acquisition task layer based on the large-scale language pre-training model, and constructing a rule common knowledge acquisition network;
S32, selecting part of rule common knowledge from the existing rule common knowledge base as a seed knowledge set, and converting the seed knowledge set from a triple form into a natural language form;
S33, using the seed knowledge set in the natural language form in the step S32 to carry out mask prediction training on the rule common knowledge acquisition network, so that the network has the capability of generating the field rule common knowledge;
S34, obtaining a head entity by an entity extraction mode aiming at the sources of the open domain text data and/or the domain text database;
S35, inputting the head entity obtained in the step S34 and the specific common sense relation to be predicted in the dimensions into the rule common sense knowledge acquisition network trained in the step S33;
S36, the rule common knowledge acquisition network generates a series of corresponding reasonable tail entities according to different head entity and common knowledge relation combinations to obtain a new rule common knowledge triple in the vertical field;
S37, storing the rule common sense knowledge obtained in the step S36 in a database, and expanding the scale of the rule common sense knowledge database.
9. The method for acquiring the common general knowledge of the vertical domain rules according to any one of claims 1 to 8, wherein the common general knowledge of the rules is classified according to the following dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivational rules, characteristic rules, comparability rules, temporal rules.
10. A vertical domain rule common sense knowledge acquisition system, based on any one of claims 1 to 9, comprising the following modules electrically connected in sequence:
a rule common knowledge modeling specification formulation module: used for classifying the rule common knowledge and formulating a modeling specification according to the coverage scope of the rule common knowledge and its support requirements for downstream tasks;
a rule common knowledge acquisition network basic model building module: used for establishing a rule common knowledge acquisition network basic model by utilizing the language knowledge capability that a language pre-training model acquires in text learning; designing a pre-training reasoning task that meets the rule common knowledge acquisition requirement; and inputting part of the rule common knowledge of known fields into the rule common knowledge acquisition network basic model so as to fine-tune it, prompting the basic model to learn the connotation of the rule common knowledge and thereby gain the capability of generating rule common knowledge;
a rule common knowledge acquisition network complete model construction module: used for training the rule common knowledge acquisition network basic model around the particularity of how rule common knowledge is used in downstream reasoning tasks, so as to acquire rule common knowledge that conforms to human cognition.
CN202210266934.4A 2022-03-18 2022-03-18 Method and system for acquiring rule common sense knowledge in vertical field Active CN114626368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210266934.4A CN114626368B (en) 2022-03-18 2022-03-18 Method and system for acquiring rule common sense knowledge in vertical field

Publications (2)

Publication Number Publication Date
CN114626368A true CN114626368A (en) 2022-06-14
CN114626368B CN114626368B (en) 2023-06-09

Family

ID=81901139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210266934.4A Active CN114626368B (en) 2022-03-18 2022-03-18 Method and system for acquiring rule common sense knowledge in vertical field

Country Status (1)

Country Link
CN (1) CN114626368B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846000A (en) * 2018-04-11 2018-11-20 中国科学院软件研究所 A kind of common sense semanteme map construction method and device based on supernode and the common sense complementing method based on connection prediction
CN111428053A (en) * 2020-03-30 2020-07-17 西安交通大学 Tax field knowledge graph construction method
CN112148863A (en) * 2020-10-15 2020-12-29 哈尔滨工业大学 Generation type dialogue abstract method integrated with common knowledge
CN112199511A (en) * 2020-09-28 2021-01-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Cross-language multi-source vertical domain knowledge graph construction method
CN112597316A (en) * 2020-12-30 2021-04-02 厦门渊亭信息科技有限公司 Interpretable reasoning question-answering method and device
CN113779260A (en) * 2021-08-12 2021-12-10 华东师范大学 Domain map entity and relationship combined extraction method and system based on pre-training model
CN114153955A (en) * 2021-11-11 2022-03-08 科讯嘉联信息技术有限公司 Construction method of multi-skill task type dialogue system fusing chatting and common knowledge

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
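XIN LIU et al.: "KGR: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation" *
乔晶晶; 温政; 段利国; 王莉: "Multi-entity relation extraction based on entity graph path aggregation" *
刘鑫: "Large-scale automatic commonsense acquisition technology based on pre-trained models" *
周烨恒; 石嘉晗; 徐睿峰: "Text matching method combining pre-trained models and language knowledge bases" *
白林亭; 文鹏程; 李亚晖: "Research on visual question answering technology based on deep learning" *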

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114626362A (en) * 2022-03-18 2022-06-14 中国电子科技集团公司第十研究所 Controllable open type combination rule knowledge generation method and system
CN114626362B (en) * 2022-03-18 2023-06-06 中国电子科技集团公司第十研究所 Controllable open type combination rule knowledge generation method and system
WO2024087754A1 (en) * 2022-10-27 2024-05-02 中国电子科技集团公司第十研究所 Multi-dimensional comprehensive text identification method

Also Published As

Publication number Publication date
CN114626368B (en) 2023-06-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant