CN114626368A

CN114626368A - Method and system for acquiring common knowledge of vertical domain rules

Info

Publication number: CN114626368A
Application number: CN202210266934.4A
Authority: CN
Inventors: 刘鑫; 崔莹; 李春豹; 刘万里; 黄刘; 陈莹
Original assignee: CETC 10 Research Institute
Current assignee: CETC 10 Research Institute
Priority date: 2022-03-18
Filing date: 2022-03-18
Publication date: 2022-06-14
Anticipated expiration: 2042-03-18
Also published as: CN114626368B

Abstract

The invention relates to the technical field of artificial intelligence and discloses a method and a system for acquiring rule common sense knowledge in a vertical domain. The acquisition method comprises the following steps: S1, formulating a rule common sense knowledge modeling specification; S2, constructing a basic model of the rule common sense knowledge acquisition network; and S3, constructing a complete model of the rule common sense knowledge acquisition network. The invention addresses the low efficiency, low accuracy, and high cost of rule common sense knowledge acquisition in the prior art.

Description

Method and system for acquiring common knowledge of vertical domain rules

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a method and a system for acquiring rule common knowledge in the vertical field.

Background

At present, artificial intelligence technologies have achieved good results in many fields, such as natural language processing, image recognition, and audio and video synthesis. In essence, large-scale data training lets a model learn data features and make distinctions based on them. However, a model obtained in this way is strongly affected by data quality: when the data distribution shifts slightly, for example when part of a text is missing or slight noise is added to an image, the model makes recognition errors, and its generalization and robustness are poor. It is even harder for such a model to perform high-order tasks such as logical reasoning, scene understanding, and decision analysis, or to transfer and adapt under unknown conditions. The reason is that traditional machine learning algorithms do not incorporate multiple types of knowledge, so the importance of knowledge in the field of artificial intelligence is increasingly prominent.

Among the various kinds of knowledge there is a special kind, rule common sense knowledge, whose acquisition, representation, and processing have long been a core problem in artificial intelligence. Many researchers have found that tasks a child of only a few years old performs with ease have not been handled effectively by artificial intelligence despite many years of research. The artificial intelligence expert Hubert Dreyfus held that the common sense problem is the biggest obstacle to realizing general artificial intelligence, and that if it were solved, the research on artificial intelligence would be complete. Introducing rule common sense knowledge is therefore of great significance for breakthroughs in artificial intelligence technology. There is currently no commonly accepted definition of rule common sense knowledge in academia or industry. In general and abstract terms, it is the shared basis on which almost everyone perceives, understands, and judges things, and which people can reasonably expect others to hold without debate. In different vertical domains, such as medicine, finance, and the military, there is also domain-specific common sense that usually takes the form of rules, such as "a nuclear submarine needs nuclear energy", "an aircraft carrier is larger than a frigate", and "load before firing the cannon". Rule common sense knowledge in a vertical domain is likewise default knowledge shared by almost everyone. It is usually reflected in people's thinking, awareness, and unreflective behavior, and is seldom recorded explicitly in data carriers such as text, images, audio, and video.
Moreover, the biggest difference between rule common sense knowledge and encyclopedic knowledge, whose mainstream form is the knowledge graph, is that the entities and relations of encyclopedic knowledge are easy to model and the relations are relatively well defined, whereas the relations between entities in rule common sense knowledge are difficult to determine. Extraction-based techniques in conventional knowledge acquisition therefore cannot acquire common sense relations that have not been defined in advance, which leads to omissions, so extraction-based techniques are not suitable for acquiring rule common sense knowledge.

For these reasons, there is currently no technical approach for acquiring rule common sense knowledge quickly and automatically. A survey of mainstream rule common sense knowledge bases in China and abroad, such as Cyc, NELL, and ConceptNet, shows that current high-confidence acquisition methods rely mainly on manual crowdsourcing, which consumes a large amount of manpower and material resources. Because this mode of acquisition is too costly, a new method for acquiring rule common sense knowledge in vertical domains is urgently needed to reduce cost, enable rapid large-scale acquisition, and support downstream reasoning applications based on rule common sense knowledge.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a method and a system for acquiring the common knowledge of the rules in the vertical field, and solves the problems of low acquisition efficiency, low accuracy, high cost and the like of the common knowledge of the rules in the prior art.

The technical scheme adopted by the invention for solving the problems is as follows:

A method for acquiring rule common sense knowledge in a vertical domain comprises the following steps:

S1, formulating a rule common sense knowledge modeling specification: classifying the rule common sense knowledge and formulating a modeling specification according to the coverage scope of the rule common sense knowledge and the support it must provide to downstream tasks;

S2, constructing a rule common sense knowledge acquisition network basic model: constructing the basic model by using the linguistic knowledge that a language pre-training model acquires through text learning; designing pre-training reasoning tasks that meet the requirements of rule common sense knowledge acquisition; and inputting part of the known domain rule common sense knowledge into the basic model to fine-tune it, so that the basic model learns the connotations related to rule common sense knowledge and gains the ability to generate rule common sense knowledge;

S3, constructing a rule common sense knowledge acquisition network complete model: training the basic model around the particular way rule common sense knowledge is used in downstream reasoning tasks, so as to acquire rule common sense knowledge that conforms to human cognition.

As a preferred technical solution, the step S2 includes the following steps:

S21, input layer representation: obtaining the input representation of the rule common sense knowledge acquisition network from the original input text;

S22, network model encoding: fully learning the semantic associations among the words in the text to obtain the contextual semantic representation of the text;

S23, network model parameter optimization: calculating the cross-entropy loss and continuously optimizing the parameters of the basic model, and stopping optimization when the cross-entropy loss falls below a set threshold, thereby obtaining the final parameters of the rule common sense knowledge acquisition network basic model.
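The stopping rule in S23 can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the training procedure of the invention: it optimizes the logits of a single masked position directly by gradient descent instead of the parameters of a Transformer, and the names `train_until_threshold`, `threshold`, `lr`, and `max_steps` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_until_threshold(target, vocab_size, threshold=0.05, lr=0.5, max_steps=1000):
    """Gradient descent on cross-entropy; stop once the loss drops below threshold.

    Assumption: the trainable "parameters" are just the logits of one masked
    position, which keeps the example small and deterministic.
    """
    logits = [0.0] * vocab_size
    loss = float("inf")
    for step in range(1, max_steps + 1):
        probs = softmax(logits)
        loss = -math.log(probs[target])  # cross-entropy against a one-hot label
        if loss < threshold:
            return logits, loss, step    # stopping criterion from S23
        for j in range(vocab_size):
            # Gradient of the cross-entropy w.r.t. logit j is p_j - y_j.
            grad = probs[j] - (1.0 if j == target else 0.0)
            logits[j] -= lr * grad
    return logits, loss, max_steps


logits, loss, steps = train_until_threshold(target=2, vocab_size=5)
```

In a real network the same check would wrap the optimizer step over all model parameters; only the comparison of the loss with the set threshold is taken from the text.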

As a preferred technical solution, in step S3, the pre-training inference task includes: mask language model task, next sentence prediction task.

As a preferable technical solution, in step S3, when executing the mask language model task, the input is defined as a form of two-segment text concatenation.

As a preferred technical solution, in step S21, assume the original input text is $x_1 x_2 \dots x_i \dots x_n$ and the input text after the mask operation is $x'_1 x'_2 \dots x'_i \dots x'_n$. The masked input text is processed to obtain the input representation $v$ of the rule common sense knowledge acquisition network, calculated as:

$v = \mathrm{InputRepresentation}(X)$,

where $X = [\mathrm{CLS}]\, x'_1 x'_2 \dots x'_i \dots x'_n\, [\mathrm{SEP}]$, $x_i$ denotes the $i$-th word of the input text, $x'_i$ denotes the $i$-th word after mask processing, [CLS] is a special token marking the beginning of a text sequence, and [SEP] is the separator token between text sequences.

As a preferred technical solution, in step S22, the input representation $v$ passes through 4 Transformer layers, and the self-attention mechanism fully learns the semantic associations among the words in the text, finally yielding the contextual semantic representation of the text, calculated as:

$h^{[l]} = \mathrm{Transformer\text{-}Block}(h^{[l-1]}), \quad l \in \{1, 2, \dots, L\}$,

where $h^{[l]} \in \mathbb{R}^{N \times d}$ denotes the hidden-layer output of the $l$-th Transformer layer, $h^{[0]} = v$, $N$ denotes the sequence length, and $d$ denotes the hidden-layer dimension of the rule common sense knowledge acquisition network.

As a preferred technical solution, in step S23, the probability distribution $P_i$ over the vocabulary corresponding to the $i$-th component of the mask representation is calculated, and the cross-entropy loss is calculated from $P_i$ and the label $y_i$. $P_i$ is calculated as:

$P_i = \mathrm{Softmax}(h_i^m W^{\mathsf{T}} + b^o)$,

where $\mathrm{Softmax}(\cdot)$ denotes the activation function, $i$ denotes the index of a component in the mask representation, $m$ marks a masked position, $W$ denotes the word vector matrix, $\mathsf{T}$ denotes the transpose operation, $W^{\mathsf{T}}$ denotes the transpose of the word vector matrix, $h_i^m$ denotes the vector representation of the $i$-th masked word, and $b^o$ denotes the bias of the fully connected layer.

As a preferred technical solution, the step S3 includes the following steps:

S31, adding a rule common sense knowledge acquisition task layer on top of a large-scale language pre-training model to construct the rule common sense knowledge acquisition network;

S32, selecting part of the rule common sense knowledge in an existing rule common sense knowledge base as a seed knowledge set, and converting the seed knowledge set from triple form into natural language form;

S33, performing mask prediction training on the rule common sense knowledge acquisition network with the natural-language seed knowledge set from step S32, so that the network gains the ability to generate domain rule common sense knowledge;

S34, obtaining head entities by entity extraction from open-domain text data and/or a domain text database;

S35, inputting the head entities obtained in step S34, together with the specific common sense relations to be predicted under the rule dimensions, into the rule common sense knowledge acquisition network trained in step S33;

S36, generating, by the rule common sense knowledge acquisition network, a series of plausible tail entities for each combination of head entity and common sense relation, to obtain new rule common sense knowledge triples in the vertical domain;

S37, storing the rule common sense knowledge obtained in step S36 in a database to expand the scale of the rule common sense knowledge base.

As a preferred technical solution, the rule common sense knowledge is classified according to the following dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivation rules, characteristic rules, comparability rules, temporal rules.

A vertical domain rule common sense knowledge acquisition system is based on the vertical domain rule common sense knowledge acquisition method and comprises the following modules which are electrically connected in sequence:

a rule common sense knowledge modeling specification formulation module, used for classifying the rule common sense knowledge and formulating a modeling specification according to the coverage scope of the rule common sense knowledge and the support it must provide to downstream tasks;

a rule common sense knowledge acquisition network basic model construction module, used for constructing the rule common sense knowledge acquisition network basic model by using the linguistic knowledge that a language pre-training model acquires through text learning; designing pre-training reasoning tasks that meet the requirements of rule common sense knowledge acquisition; and inputting part of the known domain rule common sense knowledge into the basic model to fine-tune it, so that the basic model learns the connotations related to rule common sense knowledge and gains the ability to generate rule common sense knowledge;

a rule common sense knowledge acquisition network complete model construction module, used for training the basic model around the particular way rule common sense knowledge is used in downstream reasoning tasks, so as to acquire rule common sense knowledge that conforms to human cognition.

Compared with the prior art, the invention has the following beneficial effects:

(1) the invention provides a rule common sense knowledge acquisition technique based on a generative Transformer. To address the difficulty of determining the relations between entities in rule common sense knowledge, it builds on the "knowledge reserve" of a large-scale language pre-training model and adds supervision from a small amount of knowledge, which strengthens the acquisition network's understanding of concepts and rules and lets it generate multi-dimensional rule common sense knowledge that conforms to human cognition;

(2) the invention formulates a rule common sense knowledge modeling specification. Because academia and industry have not yet provided a unified definition and classification standard for rule common sense knowledge, the invention summarizes mainstream foreign rule common sense knowledge bases and, according to the storage, management, and downstream application requirements of vertical-domain rule common sense knowledge, classifies it flexibly and hierarchically and models it in a standardized way, which guides its acquisition;

(3) the invention constructs a rule common sense knowledge acquisition network and, on the basis of a large-scale language pre-training model, designs pre-training tasks targeted at rule common sense knowledge acquisition. With supervision from only a small sample of domain rule common sense knowledge, the model comes to understand domain concepts and rules in depth and gains the ability to generate domain rule common sense knowledge.

Drawings

FIG. 1 is a schematic diagram of a method for acquiring knowledge of rules in the vertical field according to the present invention;

FIG. 2 is a schematic diagram of a vertical domain rule knowledge acquisition system according to the present invention;

FIG. 3 is a schematic diagram of a rule common knowledge classification;

FIG. 4 is one of the enlarged partial views of FIG. 3;

FIG. 5 is a second enlarged view of the portion of FIG. 3;

FIG. 6 is a third enlarged view of the portion of FIG. 3;

FIG. 7 is a schematic diagram of a rule common knowledge modeling specification;

FIG. 8 is one of the enlarged partial views of FIG. 7;

FIG. 9 is a second enlarged view of the portion of FIG. 7;

FIG. 10 is a schematic diagram of the construction of a rule common knowledge acquisition network base model;

FIG. 11 is a schematic diagram of rule common knowledge acquisition;

FIG. 12 is one of the enlarged partial views of FIG. 11;

FIG. 13 is a second enlarged view of the portion of FIG. 11.

Detailed Description

The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.

Example 1

As shown in FIGS. 1 to 13, the invention discloses a generative method for acquiring rule common sense knowledge in a vertical domain. To address the problems that rule common sense knowledge is abstract, scattered, ambiguous in its relations, and difficult to acquire automatically, the method makes advances in rule common sense knowledge modeling specification formulation, acquisition network construction, and knowledge acquisition. It enables automatic acquisition of large-scale rule common sense knowledge and supports downstream tasks such as machine reasoning that incorporates rule common sense knowledge.

The method for acquiring the rule common knowledge in the vertical field comprises the steps of rule common knowledge modeling specification formulation, rule common knowledge acquisition network construction, rule common knowledge acquisition and the like. The functions realized by the various parts are briefly explained as follows:

Rule common sense knowledge modeling specification formulation: according to the coverage scope of the rule common sense knowledge and the support it must provide to downstream tasks, the rule common sense knowledge is classified and a modeling specification is formulated, which guides subsequent classified acquisition, storage management, and the design of reasoning application methods.

Rule common sense knowledge acquisition network basic model construction: the basic model is constructed from the "knowledge reserve" of a large-scale language pre-training model, pre-training tasks that meet the requirements of rule common sense knowledge acquisition are designed, and part of the known domain rule common sense knowledge is input to fine-tune the basic model, so that it learns the connotations related to rule common sense knowledge and gains the ability to generate rule common sense knowledge.

Rule common sense knowledge acquisition network complete model construction: around the particular way rule common sense knowledge is used in downstream reasoning tasks, a rule common sense knowledge acquisition task layer is added on top of the basic model to construct the complete model, so that rule common sense knowledge conforming to human cognition is acquired efficiently.

The beneficial effects of the invention are:

(1) the invention provides a rule common knowledge acquisition technology based on a generative Transformer, aiming at the problem that the relation among entities in the rule common knowledge is difficult to determine, the knowledge storage based on a large-scale language pre-training model is integrated with the supervision of a small amount of knowledge, and the understanding of a rule common knowledge acquisition network model on concepts/rules is enhanced, so that the rule common knowledge which is in multiple dimensions and accords with human cognition is generated.

(2) The invention relates to rule common knowledge modeling specification formulation, which aims at the problem that the current academic and industrial fields do not provide unified rule common knowledge definition and classification standards temporarily, and flexibly classifies the rule common knowledge in a layered way and performs standard modeling according to the storage management and downstream application requirements of the rule common knowledge in the vertical field on the basis of the induction and summarization of a foreign mainstream rule common knowledge base so as to guide the acquisition of the rule common knowledge.

Example 2

As shown in fig. 1 to fig. 13, as a further optimization of the embodiment 1, this embodiment includes all the technical features of the embodiment 1, and in addition, this embodiment also includes the following technical features:

the method is realized by adopting the following steps:

1. Formulating the rule common sense knowledge modeling specification;

Referring to FIG. 3, academia and industry currently have no authoritative, unified standard for classifying rule common sense knowledge. To organize, store, manage, and iteratively update rule common sense knowledge more scientifically, and to support different downstream reasoning tasks (such as intelligent question answering and retrieval recommendation), the invention first divides rule common sense knowledge into 11 dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivation rules, characteristic rules, comparability rules, and temporal rules. Each dimension is then subdivided into different rule relation types. Because rule common sense knowledge used in downstream reasoning tasks does not require strictly regulated relations as knowledge graph knowledge does, the subdivision under the 11 dimensions is left open to users, which greatly increases the flexibility and ease of use of rule common sense knowledge. For example, similar rules may be subdivided into specific relation types such as synonymous, similar, defined, and liked.

Second, the invention models rule common sense knowledge. Referring to FIG. 7, rule common sense knowledge such as "wife is synonymous with wife", "a nuclear submarine requires nuclear power", and "war is caused by resource contention" is stored and used in two forms: natural language text description and structured triple. In the figure, the rule dimension is the dimension to which an item of rule common sense knowledge belongs; the common sense definition is the abstract expression form of the item; the common sense description is the specific content of the item in natural language text form; and the relation name, head entity, and tail entity are the content elements of the item in structured triple form. From the perspective of practical reasoning applications, because the specific relations of rule common sense knowledge are uncertain, a more general and abstract relation is used to standardize and constrain the type of rule common sense knowledge during modeling; that is, diverse specific relations may exist within a single rule dimension.
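The modeling specification above can be pictured as a small record type. The following is a minimal sketch, not the schema of FIG. 7: the class names `RuleDimension` and `RuleKnowledge`, the field names, the wording of the `definition` value, and the choice of dimension for the example are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RuleDimension(Enum):
    """The 11 rule dimensions listed in the specification."""
    SIMILAR = "similar rules"
    DIFFERENT = "different rules"
    CLASSIFICATION = "classification rules"
    PART_WHOLE = "part-whole rules"
    SPATIAL = "spatial rules"
    CREATION = "creation rules"
    USAGE = "usage rules"
    MOTIVATION = "motivation rules"
    CHARACTERISTIC = "characteristic rules"
    COMPARABILITY = "comparability rules"
    TEMPORAL = "temporal rules"


@dataclass(frozen=True)
class RuleKnowledge:
    dimension: RuleDimension  # rule dimension the item belongs to
    definition: str           # abstract expression form (hypothetical wording)
    description: str          # natural language text form
    relation: str             # relation name; free-form within a dimension
    head: str                 # head entity
    tail: str                 # tail entity

    def as_triple(self):
        """Return the structured triple form (head, relation, tail)."""
        return (self.head, self.relation, self.tail)


example = RuleKnowledge(
    dimension=RuleDimension.USAGE,  # assumed; the text does not assign one
    definition="X requires Y",
    description="A nuclear submarine requires nuclear power.",
    relation="requires",
    head="nuclear submarine",
    tail="nuclear power",
)
```

Keeping `relation` as a plain string rather than a fixed enumeration mirrors the point that the specific relations within a dimension are left open.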

2. Constructing the rule common sense knowledge acquisition network basic model;

Referring to FIG. 10, the basic model of the rule common sense knowledge acquisition network consists of multiple Transformer layers. To meet the task requirements of rule common sense knowledge acquisition, two pre-training reasoning tasks are designed for model training: the Masked Language Model (MLM) task and the Next Sentence Prediction (NSP) task.

During network training, the network input is formed by concatenating two text segments $x^{(1)}$ and $x^{(2)}$. The rule common sense knowledge acquisition network models the input text to obtain its contextual semantic representation, and finally learns the masked language model and the next sentence prediction model. The masked language model places no special requirement on the input form, which may be one text segment or two, whereas the next sentence prediction model requires two text segments as input. Therefore, to unify the training process of the subsequent rule common sense knowledge acquisition network model, the input in the pre-training task stage is uniformly specified as the concatenation of two text segments. The training process of the rule common sense knowledge acquisition network is introduced below in three steps: input layer representation, network model encoding, and network model parameter optimization.

(1) Input layer representation;

Assume the original input text is $x_1 x_2 \dots x_n$ and the input text after the mask operation is $x'_1 x'_2 \dots x'_n$, where $x_i$ denotes the $i$-th word of the input text and $x'_i$ denotes the $i$-th word after mask processing. The masked input text is processed as follows to obtain the input representation $v$ of the rule common sense knowledge acquisition network:

$X = [\mathrm{CLS}]\, x'_1 x'_2 \dots x'_n\, [\mathrm{SEP}]$,

$v = \mathrm{InputRepresentation}(X)$,

where [CLS] is a special token marking the beginning of a text sequence and [SEP] is the separator token between text sequences. Note that if the length of the input sequence is less than the maximum sequence length $N$ of the rule common sense knowledge acquisition network, padding tokens [PAD] must be appended after the input text until the maximum sequence length $N$ is reached. For example, assume the maximum sequence length of the network is $N = 10$ and the input sequence length is 7 (two special tokens plus $x_1$ to $x_5$); then 3 [PAD] tokens must be appended after the input sequence:

[CLS] $x_1 x_2 x_3 x_4 x_5$ [SEP] [PAD] [PAD] [PAD],

If the length of the input sequence $X$ is greater than the maximum sequence length of the rule common sense knowledge acquisition network, the input sequence must be truncated to the network's maximum sequence length. For example, assume the maximum sequence length of the network is $N = 5$ and the input sequence length is 7 (two special tokens plus $x_1$ to $x_5$); the sequence must be truncated so that the length of the valid sequence (the input sequence excluding the 2 special tokens) becomes 3:

[CLS] $x_1 x_2 x_3$ [SEP],
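The padding and truncation rules just described can be sketched in a few lines. This is an illustrative helper only; the function name `build_input` and its parameters are hypothetical, and tokens are plain strings rather than vocabulary indices.

```python
def build_input(tokens, max_len):
    """Wrap tokens in [CLS]/[SEP], then truncate or pad to exactly max_len."""
    if max_len < 2:
        raise ValueError("max_len must leave room for [CLS] and [SEP]")
    body = tokens[: max_len - 2]                      # truncate the valid sequence
    sequence = ["[CLS]"] + body + ["[SEP]"]
    sequence += ["[PAD]"] * (max_len - len(sequence))  # pad up to max_len
    return sequence


words = ["x1", "x2", "x3", "x4", "x5"]
padded = build_input(words, max_len=10)    # 7 tokens followed by 3 [PAD]
truncated = build_input(words, max_len=5)  # valid sequence cut to 3 words
```

With `max_len=10` the result ends in three `[PAD]` tokens, and with `max_len=5` only `x1` to `x3` survive between the special tokens, matching the two examples in the text.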

(2) Network model encoding;

In the encoding layer of the rule common sense knowledge acquisition network, the input representation $v$ passes through 4 Transformer layers, and the self-attention mechanism fully learns the semantic associations among the words in the text. The specific Transformer encoding method is mature and widely used in the field of artificial intelligence and is not described here.

$h^{[l]} = \mathrm{Transformer\text{-}Block}(h^{[l-1]}), \quad l \in \{1, 2, \dots, L\}$,

where $h^{[l]} \in \mathbb{R}^{N \times d}$ denotes the hidden-layer output of the $l$-th Transformer layer, and $h^{[0]} = v$ is specified to keep the formula complete. For convenience of description, the layer labels are omitted and the formula is simplified to:

$h = \mathrm{Transformer}(v)$,

where $h$ denotes the output of the last Transformer layer, i.e., $h^{[L]}$. This finally yields the contextual semantic representation $h \in \mathbb{R}^{N \times d}$ of the text, where $d$ denotes the hidden-layer dimension of the rule common sense knowledge acquisition network.

(3) Network model parameter optimization.

Since the masked language model masks only some of the words in the input text, it is not necessary to predict every position in the input text, only the masked positions. Let the set $M = \{m_1, m_2, \dots, m_k\}$ denote the indices of all masked positions, where $k$ is the total number of masked positions. If the input text length is $n$ and the mask ratio is 15%, then $k = n \times 15\%$. The elements of $M$ are then used as indices to extract the corresponding representations from the contextual semantic representation $h$ of the input sequence, and these representations are concatenated to obtain the mask representation $h^m \in \mathbb{R}^{k \times d}$.
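Selecting the masked positions and gathering their representations can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, `k` is rounded down with a floor of one position (the text only gives $k = n \times 15\%$), positions are drawn uniformly at random, and $h$ is a toy list of vectors rather than real Transformer output.

```python
import random


def choose_mask_positions(n, percent=15, seed=0):
    """Pick k = n * percent / 100 positions (rounded down, at least 1)."""
    k = max(1, n * percent // 100)  # rounding rule is an assumption
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), k))


def gather_mask_representation(h, positions):
    """Stack the rows of h at the masked positions into a k x d list."""
    return [h[m] for m in positions]


n, d = 20, 4
h = [[float(i)] * d for i in range(n)]   # toy context representation, n x d
M = choose_mask_positions(n)             # indices of the masked positions
h_m = gather_mask_representation(h, M)   # mask representation, k x d
```

Only the rows indexed by `M` take part in the loss, which is why the prediction layer operates on `h_m` rather than on all of `h`.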

In the rule common sense knowledge acquisition network, because the input representation dimension $e$ equals the hidden-layer dimension $d$, the word vector matrix $W \in \mathbb{R}^{|V| \times e}$ can be used directly to map the mask representation to the vocabulary space. For the $i$-th component of the mask representation, the probability distribution $P_i$ over the vocabulary at that masked position is calculated as:

$P_i = \mathrm{Softmax}(h_i^m W^{\mathsf{T}} + b^o)$,

where $b^o \in \mathbb{R}^{|V|}$ denotes the bias of the fully connected layer. After the probability distribution $P_i$ at the masked position is obtained, the cross-entropy loss is calculated against the label $y_i$ (the one-hot vector representation of the original word $x_i$) and the model parameters are optimized.
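The projection onto the vocabulary and the cross-entropy loss can be worked through on toy numbers. The sketch below is illustrative only: the function names are hypothetical, the vocabulary has three entries, the hidden dimension is two, and all values are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def vocab_distribution(h_i, W, b):
    """P_i = Softmax(h_i W^T + b), with W of shape |V| x d and b of length |V|."""
    logits = [
        sum(w * x for w, x in zip(row, h_i)) + bias
        for row, bias in zip(W, b)
    ]
    return softmax(logits)


def cross_entropy(probs, target_index):
    """Cross-entropy against a one-hot label at target_index."""
    return -math.log(probs[target_index])


W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy word vector matrix, |V| = 3, d = 2
b = [0.0, 0.0, 0.0]                       # toy bias of the fully connected layer
h_i = [2.0, 0.0]                          # toy representation of one masked word
P_i = vocab_distribution(h_i, W, b)
loss = cross_entropy(P_i, target_index=0)
```

Reusing the word vector matrix `W` for the output projection is possible only because the embedding dimension equals the hidden dimension, as the text notes.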

3. Constructing the complete model of the rule common sense knowledge acquisition network.

Referring to FIG. 11, the invention provides a vertical-domain rule common sense knowledge acquisition technique based on a generative Transformer. A rule common sense knowledge acquisition network is trained on the basis of a large-scale language pre-training model, and rule common sense knowledge in the 11 dimensions is then generated automatically for entities extracted from text. The specific steps are as follows:

Step 1: add a rule common sense knowledge acquisition task layer on top of a large-scale language pre-training model such as BERT or GPT to construct the complete model of the rule common sense knowledge acquisition network;

Step 2: select part of the rule common sense knowledge in existing rule common sense knowledge bases (such as ConceptNet, Cyc, NELL, and domain common sense knowledge bases) as a seed knowledge set, for example part of the open-domain rule common sense knowledge and part of the vertical-domain rule common sense knowledge, and convert it from triple form into natural language form;

Step 3: train the complete model of the rule common sense knowledge acquisition network with the natural-language seed knowledge set from step 2, so that the network gains the ability to generate domain rule common sense knowledge;

Step 4: obtain head entities such as objects, events, and concepts by entity extraction from sources such as open-domain text data and domain text databases;

Step 5: input the head entities obtained in step 4, together with the specific common sense relations to be predicted under the 11 dimensions, into the complete model trained in step 3;

Step 6: the rule common sense knowledge acquisition network generates a series of plausible tail entities for each combination of head entity and common sense relation, yielding new rule common sense knowledge triples in the vertical domain;

Step 7: store the rule common sense knowledge obtained in step 6 in the knowledge base, expanding the scale of the rule common sense knowledge base.
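The data flow of steps 2, 5, and 6 can be sketched without a real model. Everything here is an assumption made for illustration: the `TEMPLATES` table, the function names, the `[MASK]` query format, and in particular `stub_generator`, which merely stands in for the trained acquisition network.

```python
# Hypothetical relation templates for turning triples into sentences (step 2).
TEMPLATES = {
    "requires": "{head} requires {tail}",
    "is larger than": "{head} is larger than {tail}",
}


def triple_to_text(head, relation, tail):
    """Convert a (head, relation, tail) triple into natural language form."""
    template = TEMPLATES.get(relation, "{head} " + relation + " {tail}")
    return template.format(head=head, tail=tail)


def build_query(head, relation, mask_token="[MASK]"):
    """Build the model input for a head entity and relation (step 5)."""
    return triple_to_text(head, relation, mask_token)


def acquire_triples(head, relation, generate_tails):
    """Turn generated tail entities into new triples (step 6)."""
    query = build_query(head, relation)
    return [(head, relation, tail) for tail in generate_tails(query)]


def stub_generator(query):
    """Stand-in for the trained network; returns canned tail entities."""
    return ["nuclear power"] if "nuclear submarine" in query else []


seed_sentence = triple_to_text("aircraft carrier", "is larger than", "frigate")
new_triples = acquire_triples("nuclear submarine", "requires", stub_generator)
```

Passing the generator in as a callable keeps the verbalization and triple assembly independent of whichever pre-training model is chosen in step 1.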

As described above, the present invention can be preferably realized.

All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.

The foregoing is only a preferred embodiment of the present invention, and the present invention is not limited thereto in any way, and any simple modification, equivalent replacement and improvement made to the above embodiment within the spirit and principle of the present invention still fall within the protection scope of the present invention.

Claims

1. A method for acquiring the common knowledge of the rules in the vertical field is characterized by comprising the following steps:

S2, establishing a rule common knowledge acquisition network basic model: establishing the rule common knowledge acquisition network basic model by utilizing the linguistic knowledge that a pre-trained language model acquires from text; designing pre-training inference tasks that meet the requirements of rule common knowledge acquisition, and inputting part of the known domain rule common knowledge into the rule common knowledge acquisition network basic model so as to fine-tune it, prompting the rule common knowledge acquisition network basic model to learn the connotations of rule common knowledge and thereby acquire the capability of generating rule common knowledge;

2. The method for acquiring knowledge of common general knowledge of vertical domain rules according to claim 1, wherein the step S2 comprises the following steps:

3. The method for acquiring the common general knowledge of the vertical domain rule according to claim 2, wherein in the step S3, the designed pre-training inference tasks include: a masked language model task and a next sentence prediction task.

4. The method for acquiring common sense knowledge of vertical domain rules according to claim 3, wherein in step S3, when performing the masked language model task, the input is uniformly defined as the concatenation of two texts.

5. The method for obtaining knowledge of vertical domain rules general knowledge according to claim 4, wherein in step S21, the original input text is assumed to be $x_1 x_2 \ldots x_i \ldots x_n$, and the input text after the mask operation is $x'_1 x'_2 \ldots x'_i \ldots x'_n$; the masked input text is processed to obtain the input representation $v$ of the rule common knowledge acquisition network, and the calculation formula is as follows:

$v = \mathrm{InputRepresentation}(X)$,

wherein $X = [\mathrm{CLS}]\, x'_1 x'_2 \ldots x'_i \ldots x'_n\, [\mathrm{SEP}]$, $x_i$ denotes the i-th word of the input text, $x'_i$ denotes the i-th word after mask processing, [CLS] is a special mark indicating the beginning of a text sequence, and [SEP] is a separation mark between text sequences.

6. The method for obtaining knowledge of common general knowledge in vertical domain rules according to claim 5, wherein in step S22, the input representation v passes through 4 Transformer layers, and the semantic associations between the words in the text are fully learned by means of a self-attention mechanism, so as to obtain the contextual semantic representation of the text, and the calculation formula is as follows:

7. The method for obtaining knowledge of vertical domain rules general knowledge according to claim 6, wherein in step S23, the probability distribution $P_i$ over the vocabulary corresponding to the i-th component of the mask representation is calculated, and the cross-entropy loss is calculated from $P_i$ and the label $y_i$; $P_i$ is calculated as follows:

a transpose matrix representing a matrix of word vectors,

8. The method for acquiring the common general knowledge of the vertical domain rule according to claim 7, wherein the step S3 comprises the following steps:

9. The method for acquiring the common general knowledge of the vertical domain rules according to any one of claims 1 to 8, wherein the common general knowledge of the rules is classified according to the following dimensions: similar rules, different rules, classification rules, part-whole rules, spatial rules, creation rules, usage rules, motivational rules, characteristic rules, comparability rules, temporal rules.

10. A vertical domain rule common sense knowledge acquisition system, based on the method of any one of claims 1 to 9, comprising the following modules electrically connected in sequence:

a rule common knowledge modeling specification formulation module: used for classifying the rule common knowledge and formulating a modeling specification according to the coverage of the rule common knowledge and the support that downstream tasks require of the rule common knowledge;

a rule common knowledge acquisition network basic model building module: used for establishing a rule common knowledge acquisition network basic model by utilizing the linguistic knowledge that a pre-trained language model acquires from text; designing pre-training inference tasks that meet the requirements of rule common knowledge acquisition, and inputting part of the known domain rule common knowledge into the rule common knowledge acquisition network basic model so as to fine-tune it, prompting the rule common knowledge acquisition network basic model to learn the connotations of rule common knowledge and thereby acquire the capability of generating rule common knowledge;

a rule common knowledge acquisition network complete model construction module: used for training the rule common knowledge acquisition network basic model according to the particular way in which rule common knowledge is used in downstream reasoning tasks, so as to acquire rule common knowledge that conforms to human cognition.