CN113435212A - Text inference method and device based on rule embedding - Google Patents
Text inference method and device based on rule embedding
- Publication number
- CN113435212A (application CN202110984877.9A)
- Authority
- CN
- China
- Prior art keywords
- network
- text
- input text
- rule
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/041—Abduction
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
A text inference method based on rule embedding performs neural retrieval and inference over the different components of a logic rule using a pre-trained semantic logic network, and supports changes in user requirements and task migration. It further combines the semantic logic network with a neural classification network in a parallel structure and, during network fine-tuning training, constrains the consistency of the two inference results with the Jensen-Shannon divergence, a probability-distribution distance function. The proposed semantic logic network encodes user rules into semantic vectors, so that detection of the logic rules better preserves the semantic information of the text and accommodates linguistic flexibility and textual diversity. The invention also provides a method that integrates user rules into the neural classification network to improve text inference performance: the parallel prediction structure of the neural classification network and the semantic logic network is trained with a joint consistency loss, so that the two networks benefit from each other, and the rule detection results serve as evidence for the text inference.
Description
Technical Field
The invention discloses a text inference method and a text inference device based on rule embedding, and belongs to the technical field of natural language processing.
Background
Public opinion subscription is an important application scenario in the new-media era: a media organization regularly pushes texts, such as internet public opinion or news items of interest, to subscribing users according to their requirements. These user requirements are usually expressed as keyword logic rules that describe the text content the user prefers. The text inference task based on user requirements is to judge whether a given text satisfies a user requirement; this task has important application value in the scenario.
Existing techniques for this inference task fall into two main categories. The first infers from keyword Boolean retrieval: a text is compared with the keyword logic expression defined by the user to find texts matching the expression. Keyword Boolean retrieval, however, is limited: the flexibility of natural language means that texts with the same meaning can take very different surface forms, which degrades the matching result. The second is classification based on deep learning, which performs text category inference with pre-trained word vectors and a neural network trained by supervised learning on a large-scale labeled data set, so that the network understands and infers at the semantic level whether a text meets the user requirement. For example, Chinese patent document CN113076488A, "Method and system for recommending information based on user data", models features of specific sentences in a text that carry user information through preset keywords, based on text representation vectors obtained from a convolutional neural network. Its drawbacks are that it struggles with the topical diversity of user requirements and adapts poorly to changes in those requirements.
Disclosure of Invention
Aiming at the problems in the prior art, the invention discloses a text inference method based on rule embedding.
The invention also discloses a device implementing the text inference method, so as to perform inference processing on texts.
Summary of the invention:
A text inference method based on rule embedding comprises two parts. First, based on a pre-trained semantic logic network, the different components of a logic rule are neurally retrieved and inferred over, supporting changes in user requirements and task migration. Second, the semantic logic network is combined with a neural classification network in a parallel structure, and the consistency of the two inference results is constrained during network fine-tuning training using the Jensen-Shannon divergence, a probability-distribution distance function. Finally, fused inference is performed from the prediction results of the semantic logic network and the neural classification network, and the activation result of the semantic logic network serves as evidence for the text inference result.
The invention provides a semantic logic network that approximates the logical inference process in a neural manner. The process comprises detecting components of the logic rule at different granularities in the text and combining the detection results; the components are items, conjunction rules and disjunction rules. Three independent loss functions verify the containment relationship of the text to these components respectively. To handle the challenge posed by dynamically changing user requirements, the semantic logic network is trained with a pre-training and fine-tuning scheme. The network consists of three modules that perform semantic detection of the items, the conjunction rules and the disjunction rules in the user rules, and text inference is performed by combining their detection results. Texts are obtained from a general Chinese corpus such as Chinese Wikipedia, and a general keyword-set corpus is obtained from a Chinese synonym forest such as the Chinese WordNet. Each module is pre-trained with the general corpus to strengthen the robustness of keyword detection and then fine-tuned on user data, improving adaptability to changing user requirements.
In addition, the invention provides an optional parallel structure that combines a neural classification network with the semantic logic network, and the networks are fine-tuned by joint training to improve inference performance. To combine the neural classification network and the semantic logic network, a Jensen-Shannon loss term is used as a regularizer, and the consistency of the prediction results on the two sides of the parallel structure is constrained during the network fine-tuning stage.
Technical term interpretation:
1. User requirement: also called a user rule. In the present invention, a subscribing user describes their preferences for text content in the form of a logic rule over keyword sets, where keywords are words or phrases. A dynamically changing user requirement means that when the user raises a new concern, the logic expression is changed by adding or deleting keywords.
2. Text inference: for a given user requirement, it is inferred whether the input text meets the requirement.
3. Semantic logic network: refers to a neural network used for semantic detection and inference of input text.
4. Parallel network: the parallel network in the invention comprises a semantic logic network and a neural classification network which are arranged in parallel.
5. Consistency constraint: the Jensen-Shannon divergence (JS distance for short), a probability-distribution distance function, is introduced into the loss function as a regularization term, constraining the inference results on the two sides of the parallel network to be optimized toward consistent probability distributions. The JS distance is a variant of the Kullback-Leibler (KL) divergence that resolves the asymmetry of the KL divergence.
The detailed technical scheme of the invention is as follows:
A text inference method based on rule embedding, the method comprising:
1) Converting the keyword logic expression describing the user requirement into an equivalent disjunctive normal form, wherein the user requirement is a propositional formula P, and the disjunctive normal form of P is:

P = r_1 ∨ r_2 ∨ … ∨ r_n    (1)

In formula (1), n indicates the number of conjunction rules and r_i is the i-th user rule. In the propositional formula P, the connectives are taken from the set {∧, ∨}; an item is a keyword set containing keywords related to a topic or semantics and their synonyms. By the normal-form existence theorem, the propositional formula P can always be converted into an equivalent disjunctive normal form. Each r_i is a conjunction rule formed from keyword-set items, i.e. r_i = t_{i,1} ∧ t_{i,2} ∧ … ∧ t_{i,m_i}, where m_i denotes the number of items in r_i; the set of all conjunction rules that make up the user requirement is written R = {r_1, …, r_n}, i.e. the user rule set, where n is the number of conjunction rules. Disjunctive Normal Form is abbreviated DNF. The DNF offers flexibility in handling changes to the user requirement, since such changes can be accommodated efficiently by adding or deleting conjunction rules. The conversion itself follows the conventional transformation of logic expressions and is not part of the content to be protected by the invention.
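For illustration only, the DNF of a user requirement can be held as nested collections, as in the following Python sketch; the type aliases and the function name matches_boolean are hypothetical, and the matcher shows the plain keyword Boolean matching that the semantic logic network is designed to go beyond.

```python
from typing import List, Set

# An item is a keyword set (a keyword plus its synonyms);
# a conjunction rule r_i is a list of items that must all be satisfied;
# the user requirement P is the disjunction of all conjunction rules.
Item = Set[str]
ConjunctionRule = List[Item]
UserRequirement = List[ConjunctionRule]   # P = r_1 OR r_2 OR ... OR r_n

def matches_boolean(text: str, requirement: UserRequirement) -> bool:
    """Exact string-containment Boolean matching baseline."""
    for rule in requirement:                          # disjunction over rules
        if all(any(kw in text for kw in item)         # conjunction over items
               for item in rule):
            return True
    return False

# Example: "(typhoon OR rainstorm) AND (Shandong OR Jinan)" as one conjunction rule.
requirement = [[{"typhoon", "rainstorm"}, {"Shandong", "Jinan"}]]
print(matches_boolean("A rainstorm hit Jinan yesterday", requirement))  # True
```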
2) Determining whether an input text satisfies the user rules:

Given a text collection X and the user rule set R = {r_1, …, r_n}, an input text x ∈ X is considered. The inferred output at the text level is the probability ŷ that the input text x satisfies the user requirement. The rule-level output is a probability vector p of length n, whose i-th component is the predicted probability that the input text x satisfies user rule r_i; from the values of p, it can be judged which user rules the text x satisfies.
The invention understands the input text x with the semantic logic network and infers whether the user rules corresponding to the user requirement are satisfied:

The semantic logic network performs item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn, and finally judges whether the input text satisfies the user rules. This step judges whether the input text x satisfies the semantics of the items, the conjunction rules and the disjunction rule; as shown on the right side of FIG. 1, the three modules perform, from bottom to top, item detection, conjunction rule detection and disjunction rule detection respectively.
Preferably, according to the present invention, the text inference method based on rule embedding further includes a neural classification network arranged in parallel with the semantic logic network. The neural classification network is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result.

The input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each; finally, the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).
According to the invention, the specific method for performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn comprises the following steps:
2-1) Item detection

Item detection determines whether the input text x contains semantics related to an item t of the disjunctive normal form. The output is recorded as the detection result ŷ_t, representing the probability that the input text x contains item t.

The input text x is converted into a matrix formed by the corresponding pre-trained word vectors. The pre-trained word vectors are Chinese word vectors obtained by training the word2vec algorithm on the Chinese Wikipedia corpus. The matrix composed of the pre-trained word vectors of all words in the input text x is recorded as E ∈ ℝ^(u×d), where ℝ denotes the real number field, u is the truncation length of the input text x, d is the length of the pre-trained word vectors, and each row e_j is the vector of length d corresponding to the j-th word.

The item t is converted into vector form: the vector v_t of item t is the average of the pre-trained word vectors of all keywords in the corresponding keyword set, i.e. v_t = (1/|t|) Σ_{k∈t} e_k, where k is a keyword in the set and e_k is its pre-trained word vector.

The mutual information between item t and every word of the input text x is then computed: the vector v_t and the pre-trained word embedding matrix E of the input text x are multiplied to obtain the interaction vector, recorded as

a = E · v_t    (2)

The input text x is semantically encoded by the encoding network ENC to obtain the text semantic vector h. Different convolutional neural networks may be adopted; a TextCNN structure is preferred as the encoding network ENC, with three convolution kernel sizes of 2×d, 3×d and 4×d, where d is the dimension of the pre-trained word vectors, and 64 kernels of each size.

The text semantic vector h and the interaction vector a are concatenated and reduced in dimension by a multilayer perceptron network MLP to obtain the vector u_t, which expresses the containment relationship of the input text x with respect to item t:

u_t = MLP([h; a])    (3)

The value activated by the sigmoid function is taken as the detected probability that the input text x contains item t, i.e. the inference result ŷ_t, representing the degree to which the input text x satisfies the semantics of the keyword set corresponding to item t:

ŷ_t = σ(W_t · u_t + b_t)    (4)

The semantic logic network thereby predicts the probability that the input text x contains item t; the vector u_t also serves as the input of the next-stage conjunction rule module. σ denotes the sigmoid activation function, and W_t and b_t are network parameters.

A cross-entropy loss function evaluates the difference between the distributions of the inference result ŷ_t and the true result, i.e. the true item label y_t, to obtain the loss

L_T = E_x[ −(1/M) Σ_t ( y_t log ŷ_t + (1 − y_t) log(1 − ŷ_t) ) ] + λ‖Θ_T‖    (5)

where y_t, the true label of the item, is obtained by string matching between the text and the keywords together with synonym expansion; E_x denotes the expectation over training-set samples; M is the number of keyword sets. The training process updates all parameters of the item detection network by minimizing the loss L_T, and λ‖Θ_T‖ denotes norm regularization of the item detection network parameters to avoid overfitting.
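The following PyTorch sketch illustrates the item detection module described above (formulas (2)-(5)). The hidden size, MLP depth and truncation length u are assumptions; the patent fixes only the TextCNN kernel widths (2, 3, 4) and 64 kernels per width.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ItemDetector(nn.Module):
    """Sketch of item detection: interaction vector + TextCNN + MLP + sigmoid."""
    def __init__(self, d: int = 300, u: int = 1000, hidden: int = 128):
        super().__init__()
        # TextCNN-style encoder ENC over the u x d word-vector matrix.
        self.convs = nn.ModuleList(
            [nn.Conv1d(d, 64, kernel_size=k) for k in (2, 3, 4)])
        self.mlp = nn.Sequential(nn.Linear(3 * 64 + u, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, 1)          # sigmoid head, formula (4)

    def forward(self, E: torch.Tensor, v_t: torch.Tensor):
        # E: (batch, u, d) pre-trained word vectors of the input text
        # v_t: (batch, d) item vector, the mean of the keyword-set word vectors
        a = torch.bmm(E, v_t.unsqueeze(-1)).squeeze(-1)          # interaction vector, formula (2)
        h = torch.cat([F.relu(conv(E.transpose(1, 2))).max(dim=-1).values
                       for conv in self.convs], dim=-1)          # text semantic vector h
        u_t = self.mlp(torch.cat([h, a], dim=-1))                # formula (3); reused by the conjunction module
        y_hat = torch.sigmoid(self.head(u_t)).squeeze(-1)        # formula (4)
        return y_hat, u_t

# Item-level training step, formula (5): binary cross entropy against weak labels,
# with optimizer weight decay standing in for the norm regularization term.
detector = ItemDetector()
E, v_t = torch.randn(2, 1000, 300), torch.randn(2, 300)
y_true = torch.tensor([1.0, 0.0])
y_hat, _ = detector(E, v_t)
loss = F.binary_cross_entropy(y_hat, y_true)
```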
2-2) Conjunction rule detection

Conjunction rule detection verifies whether the input text x satisfies the semantics of a conjunction rule r_i.

A conjunction rule embedding network, CNet, is used; the invention verifies that different network structures have the capability of approximating the logical conjunction operation. A conjunction rule r_i comprises a sequence of items t_{i,1}, …, t_{i,m_i}, and the representation vectors obtained when these items are detected form a sequence u_{i,1}, …, u_{i,m_i}. All vectors in the sequence are concatenated as the input and passed through CNet to obtain the representation vector g_i of the conjunction rule; this output vector encodes the containment relationship of the input text with respect to conjunction rule r_i:

g_i = CNet([u_{i,1}; …; u_{i,m_i}])    (6)

Activation by the sigmoid function yields the detection probability of the conjunction rule, as shown in formula (7), where σ denotes the sigmoid activation function, W_r and b_r are network parameters, and ŷ_{r_i} is the probability that the input text contains conjunction rule r_i, i.e. the inference result:

ŷ_{r_i} = σ(W_r · g_i + b_r)    (7)

A cross-entropy loss function measures the difference between the prediction ŷ_{r_i} and the true result, i.e. the true rule label y_{r_i}, giving the loss L_C, where y_{r_i} is obtained by the Boolean conjunction of the labels of the related items and E_x denotes the expectation over training-set samples. The training process updates all parameters of UNet and the conjunction rule detection module by minimizing the loss L_C, and λ‖Θ_C‖ denotes norm regularization of all parameters of UNet and the conjunction rule detection module to avoid overfitting:

L_C = E_x[ −( y_{r_i} log ŷ_{r_i} + (1 − y_{r_i}) log(1 − ŷ_{r_i}) ) ] + λ‖Θ_C‖    (8)
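A minimal PyTorch sketch of the conjunction rule module (formulas (6)-(8)) follows. The MLP realisation and the fixed number of items per rule are assumptions; the patent only requires a structure able to approximate logical conjunction.

```python
import torch
import torch.nn as nn

class ConjunctionRuleDetector(nn.Module):
    """Sketch of CNet: concatenate item vectors, embed the rule, score it."""
    def __init__(self, item_dim: int = 128, n_items: int = 3, rule_dim: int = 64):
        super().__init__()
        self.cnet = nn.Sequential(                        # CNet, formula (6)
            nn.Linear(n_items * item_dim, rule_dim), nn.ReLU())
        self.head = nn.Linear(rule_dim, 1)                # sigmoid head, formula (7)

    def forward(self, item_vectors: torch.Tensor):
        # item_vectors: (batch, n_items, item_dim), the u_t vectors from item detection
        g = self.cnet(item_vectors.flatten(start_dim=1))  # rule representation vector g_i
        y_hat = torch.sigmoid(self.head(g)).squeeze(-1)   # probability the text satisfies r_i
        return y_hat, g

# The rule label used in formula (8) is the Boolean AND of the item labels,
# e.g. y_rule = int(all(item_labels)); training again uses binary cross entropy.
```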
2-3) Disjunctive normal form detection

Disjunctive normal form detection verifies whether the input text x satisfies the complete user rule set, which is equivalent to whether the text satisfies any conjunction rule in the user rule set.

The input is: the conjunction rule representation vector from step 2-2) and the representation vectors of the other associated conjunction rules.

The output is: the predicted probability ŷ that the input text satisfies the user rule set.

The disjunction network is implemented with the max function: the maximum probability among the inference results of step 2-2) is taken as the text inference result, representing the inferred probability that the input text x satisfies the user requirement, where ŷ is the predicted probability that the input text satisfies the user rule set, max denotes taking the maximum probability, and ŷ_{r_i} is the inference result output by the conjunction rule detection module:

ŷ = max_i ŷ_{r_i}    (9)

The loss L_R is computed with a cross-entropy loss function as shown in formula (10), where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, and E_x denotes the expectation over training-set samples. The training process updates all parameters of the semantic logic network by minimizing the loss L_R, and λ‖Θ‖ denotes norm regularization of the semantic logic network parameters to avoid overfitting:

L_R = E_x[ −( y log ŷ + (1 − y) log(1 − ŷ) ) ] + λ‖Θ‖    (10)
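A short sketch of the max-based disjunction step and the text-level loss (formulas (9)-(10)), assuming the rule probabilities come from the conjunction rule module:

```python
import torch
import torch.nn.functional as F

def disjunction_inference(rule_probs: torch.Tensor) -> torch.Tensor:
    """Formula (9): the text satisfies the requirement if it satisfies any rule."""
    # rule_probs: (batch, n_rules) probabilities from the conjunction rule module
    return rule_probs.max(dim=-1).values

def text_level_loss(rule_probs: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Formula (10): binary cross entropy against the expert text-level labels."""
    return F.binary_cross_entropy(disjunction_inference(rule_probs), y)

print(disjunction_inference(torch.tensor([[0.98, 0.73, 0.43]])))  # tensor([0.9800])
```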
According to a preferred embodiment of the present invention, the processing method of the neural classification network comprises the following steps:

A semantic vector of the input text is constructed by a text encoding module; the text encoding network used is ENC2, preferably a CNN-, RNN- or BERT-based encoding module. After the semantic representation vector h_c of the input text is obtained through the text encoding module, category prediction is performed on it, as shown in formula (11), where ŷ_c represents the probability, predicted by the neural classification network, that the input text meets the user requirement, i.e. the output text-level label; σ denotes the sigmoid activation function, and W_c and b_c are the network parameters:

ŷ_c = σ(W_c · h_c + b_c)    (11)

A cross-entropy loss function measures the difference between the prediction ŷ_c of the neural classification network and the true result, i.e. the true label y of the input text, as shown in formula (12), giving the loss L_c; all parameters of the neural classification network are updated by minimizing L_c, where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, E_x denotes the expectation over training-set samples, and λ‖Θ_c‖ denotes norm regularization of all parameters of the neural classification network to avoid overfitting:

L_c = E_x[ −( y log ŷ_c + (1 − y) log(1 − ŷ_c) ) ] + λ‖Θ_c‖    (12)
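A minimal sketch of the parallel neural classification network (formulas (11)-(12)). Any encoder ENC2 (CNN-, RNN- or BERT-based) can be plugged in; mean pooling over word vectors is used here only as a placeholder encoder.

```python
import torch
import torch.nn as nn

class NeuralClassifier(nn.Module):
    """Sketch of the classification branch: placeholder ENC2 + sigmoid head."""
    def __init__(self, d: int = 300, hidden: int = 128):
        super().__init__()
        self.enc2 = nn.Sequential(nn.Linear(d, hidden), nn.ReLU())  # placeholder ENC2
        self.head = nn.Linear(hidden, 1)

    def forward(self, E: torch.Tensor) -> torch.Tensor:
        # E: (batch, u, d) pre-trained word vectors of the input text
        h_c = self.enc2(E.mean(dim=1))                     # semantic representation vector
        return torch.sigmoid(self.head(h_c)).squeeze(-1)   # formula (11)
```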
3) The input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).

The JS distance measures the similarity between the prediction distributions of the neural classification network and the semantic logic network; the greater their similarity, the smaller the JS distance. Denote the probability distribution output by the neural classification network as P and that output by the semantic logic network as Q; the JS distance between them is computed as

JS(P ‖ Q) = ½ KL(P ‖ (P+Q)/2) + ½ KL(Q ‖ (P+Q)/2)    (13)

KL(·‖·) denotes the Kullback-Leibler (KL) divergence, computed as shown in formula (14); the JS distance is a variant of the KL divergence that resolves the asymmetry of the KL divergence:

KL(P ‖ Q) = Σ_z P(z) log( P(z) / Q(z) )    (14)

The JS distance is taken as a regularization term in the joint loss. The joint loss is computed as in formula (15), combining the neural classification loss L_c, the semantic logic network loss L_R and the JS term, where the hyperparameters weighing the different loss terms take values in (0, 1) and satisfy the stated constraint.

During training of the parallel structure, all parameters of the neural classification network and the semantic logic network are updated by minimizing the joint loss.
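A hedged sketch of the consistency term and the joint loss (formulas (13)-(15)). The weighting with alpha + beta + gamma = 1 is an assumption, since the patent states only that the hyperparameters lie in (0, 1) and satisfy a constraint; each scalar output probability is expanded to the two-point distribution [p, 1-p].

```python
import torch

def js_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """JS distance between the two branches' output probabilities (formulas (13)-(14))."""
    P = torch.stack([p, 1 - p], dim=-1).clamp_min(eps)
    Q = torch.stack([q, 1 - q], dim=-1).clamp_min(eps)
    M = 0.5 * (P + Q)
    kl = lambda a, b: (a * (a / b).log()).sum(dim=-1)     # formula (14)
    return (0.5 * kl(P, M) + 0.5 * kl(Q, M)).mean()       # formula (13)

def joint_loss(loss_cls, loss_logic, p_cls, p_logic, alpha=0.4, beta=0.4, gamma=0.2):
    """Joint loss in the spirit of formula (15); the weighting scheme is assumed."""
    return alpha * loss_cls + beta * loss_logic + gamma * js_divergence(p_cls, p_logic)
```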
An apparatus for implementing the text inference method is characterized by comprising: a semantic logic network module;
the semantic logic network module is used for: determining whether an input text satisfies a user rule; the semantic logic network module comprises: the device comprises an item detection module, a conjunction rule detection module and a disjunction normal form detection module which are sequentially arranged along the direction of data flow.
According to the preferable embodiment of the present invention, the apparatus for implementing the text inference method further includes a neural classification network module disposed in parallel with the semantic logic network module;
the neural classification network is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;
the input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence.
The technical advantages of the invention are as follows:
(1) The semantic logic network provided by the invention encodes the user rules into semantic vectors, so that the semantic information of the text is better retained while the logic rules are detected, supporting linguistic flexibility and textual diversity.
(2) The invention also provides a method that integrates the user rules into the neural classification network to improve text inference performance: the parallel prediction structure of the neural classification network and the semantic logic network is combined with a joint consistency loss, so that the two networks benefit from each other, and the rule detection results serve as evidence for the text inference.
(3) The pre-training-based semantic logic inference provided by the invention better accommodates dynamically changing user requirements. To address the challenge this poses to supervised learning, the invention pre-trains the semantic logic network on massive general corpora such as Chinese Wikipedia and on open-domain linguistic knowledge such as synonym and near-synonym sets extracted from the Chinese WordNet, and then fine-tunes it on specific user data, which strengthens the robustness of keyword detection and helps handle dynamically changing user requirements efficiently.
Drawings
FIG. 1 is a schematic diagram of an apparatus for implementing a rule embedding based text inference method of the present invention;
FIG. 2 is an example of a user requirement decision tree in the embodiment of the present invention.
Detailed Description
The invention is described in detail below with reference to the following examples and the accompanying drawings of the specification, but is not limited thereto.
Embodiment 1
A text inference method based on rule embedding, the method comprising:
1) Converting the keyword logic expression describing the user requirement into an equivalent disjunctive normal form, wherein the user requirement is a propositional formula P, and the disjunctive normal form of P is:

P = r_1 ∨ r_2 ∨ … ∨ r_n    (1)

In formula (1), n indicates the number of conjunction rules and r_i is the i-th user rule. In the propositional formula P, the connectives are taken from the set {∧, ∨}; an item is a keyword set containing keywords related to a topic or semantics and their synonyms. By the normal-form existence theorem, the propositional formula P can always be converted into an equivalent disjunctive normal form. Each r_i is a conjunction rule formed from keyword-set items, i.e. r_i = t_{i,1} ∧ t_{i,2} ∧ … ∧ t_{i,m_i}, where m_i denotes the number of items in r_i; the set of all conjunction rules that make up the user requirement is written R = {r_1, …, r_n}, i.e. the user rule set, where n is the number of conjunction rules. Disjunctive Normal Form is abbreviated DNF. The DNF offers flexibility in handling changes to the user requirement, since such changes can be accommodated efficiently by adding or deleting conjunction rules.
2) Determining whether an input text satisfies the user rules:

Given a text collection X and the user rule set R = {r_1, …, r_n}, an input text x ∈ X is considered. The inferred output at the text level is the probability ŷ that the input text x satisfies the user requirement. The rule-level output is a probability vector p of length n, whose i-th component is the predicted probability that the input text x satisfies user rule r_i; from the values of p, it can be judged which user rules the text x satisfies.
The invention understands the input text x with the semantic logic network and infers whether the user rules corresponding to the user requirement are satisfied:

The semantic logic network performs item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn, and finally judges whether the input text satisfies the user rules.
The specific method for performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn comprises the following steps:
2-1) Item detection

Item detection determines whether the input text x contains semantics related to an item t of the disjunctive normal form. The output is recorded as the detection result ŷ_t, representing the probability that the input text x contains item t.

The input text x is converted into a matrix formed by the corresponding pre-trained word vectors. The pre-trained word vectors are Chinese word vectors obtained by training the word2vec algorithm on the Chinese Wikipedia corpus. The matrix composed of the pre-trained word vectors of all words in the input text x is recorded as E ∈ ℝ^(u×d), where ℝ denotes the real number field, u is the truncation length of the input text x, d is the length of the pre-trained word vectors, and each row e_j is the vector of length d corresponding to the j-th word.

The item t is converted into vector form: the vector v_t of item t is the average of the pre-trained word vectors of all keywords in the corresponding keyword set, i.e. v_t = (1/|t|) Σ_{k∈t} e_k, where k is a keyword in the set and e_k is its pre-trained word vector.

The mutual information between item t and every word of the input text x is then computed: the vector v_t and the pre-trained word embedding matrix E of the input text x are multiplied to obtain the interaction vector, recorded as

a = E · v_t    (2)

The input text x is semantically encoded by the encoding network ENC to obtain the text semantic vector h. Different convolutional neural networks may be adopted; a TextCNN structure is preferred as the encoding network ENC, with three convolution kernel sizes of 2×d, 3×d and 4×d, where d is the dimension of the pre-trained word vectors, and 64 kernels of each size.

The text semantic vector h and the interaction vector a are concatenated and reduced in dimension by a multilayer perceptron network MLP to obtain the vector u_t, which expresses the containment relationship of the input text x with respect to item t:

u_t = MLP([h; a])    (3)

The value activated by the sigmoid function is taken as the detected probability that the input text x contains item t, i.e. the inference result ŷ_t, representing the degree to which the input text x satisfies the semantics of the keyword set corresponding to item t:

ŷ_t = σ(W_t · u_t + b_t)    (4)

The semantic logic network thereby predicts the probability that the input text x contains item t; the vector u_t also serves as the input of the next-stage conjunction rule module. σ denotes the sigmoid activation function, and W_t and b_t are network parameters.

A cross-entropy loss function evaluates the difference between the distributions of the inference result ŷ_t and the true result, i.e. the true item label y_t, to obtain the loss

L_T = E_x[ −(1/M) Σ_t ( y_t log ŷ_t + (1 − y_t) log(1 − ŷ_t) ) ] + λ‖Θ_T‖    (5)

where y_t, the true label of the item, is obtained by string matching between the text and the keywords together with synonym expansion; E_x denotes the expectation over training-set samples; M is the number of keyword sets. The training process updates all parameters of the item detection network by minimizing the loss L_T, and λ‖Θ_T‖ denotes norm regularization of the item detection network parameters to avoid overfitting.
2-2) Conjunction rule detection

Conjunction rule detection verifies whether the input text x satisfies the semantics of a conjunction rule r_i.

A conjunction rule embedding network, CNet, is used; the invention verifies that different network structures have the capability of approximating the logical conjunction operation. A conjunction rule r_i comprises a sequence of items t_{i,1}, …, t_{i,m_i}, and the representation vectors obtained when these items are detected form a sequence u_{i,1}, …, u_{i,m_i}. All vectors in the sequence are concatenated as the input and passed through CNet to obtain the representation vector g_i of the conjunction rule; this output vector encodes the containment relationship of the input text with respect to conjunction rule r_i:

g_i = CNet([u_{i,1}; …; u_{i,m_i}])    (6)

Activation by the sigmoid function yields the detection probability of the conjunction rule, as shown in formula (7), where σ denotes the sigmoid activation function, W_r and b_r are network parameters, and ŷ_{r_i} is the probability that the input text contains conjunction rule r_i, i.e. the inference result:

ŷ_{r_i} = σ(W_r · g_i + b_r)    (7)

A cross-entropy loss function measures the difference between the prediction ŷ_{r_i} and the true result, i.e. the true rule label y_{r_i}, giving the loss L_C, where y_{r_i} is obtained by the Boolean conjunction of the labels of the related items and E_x denotes the expectation over training-set samples. The training process updates all parameters of UNet and the conjunction rule detection module by minimizing the loss L_C, and λ‖Θ_C‖ denotes norm regularization of all parameters of UNet and the conjunction rule detection module to avoid overfitting:

L_C = E_x[ −( y_{r_i} log ŷ_{r_i} + (1 − y_{r_i}) log(1 − ŷ_{r_i}) ) ] + λ‖Θ_C‖    (8)
2-3) Disjunctive normal form detection

Disjunctive normal form detection verifies whether the input text x satisfies the complete user rule set, which is equivalent to whether the text satisfies any conjunction rule in the user rule set.

The input is: the conjunction rule representation vector from step 2-2) and the representation vectors of the other associated conjunction rules.

The output is: the predicted probability ŷ that the input text satisfies the user rule set.

The disjunction network is implemented with the max function: the maximum probability among the inference results of step 2-2) is taken as the text inference result, representing the inferred probability that the input text x satisfies the user requirement, where ŷ is the predicted probability that the input text satisfies the user rule set, max denotes taking the maximum probability, and ŷ_{r_i} is the inference result output by the conjunction rule detection module:

ŷ = max_i ŷ_{r_i}    (9)

The loss L_R is computed with a cross-entropy loss function as shown in formula (10), where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, and E_x denotes the expectation over training-set samples. The training process updates all parameters of the semantic logic network by minimizing the loss L_R, and λ‖Θ‖ denotes norm regularization of the semantic logic network parameters to avoid overfitting:

L_R = E_x[ −( y log ŷ + (1 − y) log(1 − ŷ) ) ] + λ‖Θ‖    (10)
examples 2,
The text inference method based on rule embedding of Embodiment 1 further comprises a neural classification network arranged in parallel with the semantic logic network. The neural classification network is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result.

The input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each; finally, the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).
The processing method of the neural classification network comprises the following steps:

A semantic vector of the input text is constructed by a text encoding module; the text encoding network used is ENC2, preferably a CNN-, RNN- or BERT-based encoding module. After the semantic representation vector h_c of the input text is obtained through the text encoding module, category prediction is performed on it, as shown in formula (11), where ŷ_c represents the probability, predicted by the neural classification network, that the input text meets the user requirement, i.e. the output text-level label; σ denotes the sigmoid activation function, and W_c and b_c are the network parameters:

ŷ_c = σ(W_c · h_c + b_c)    (11)

A cross-entropy loss function measures the difference between the prediction ŷ_c of the neural classification network and the true result, i.e. the true label y of the input text, as shown in formula (12), giving the loss L_c; all parameters of the neural classification network are updated by minimizing L_c, where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, E_x denotes the expectation over training-set samples, and λ‖Θ_c‖ denotes norm regularization of all parameters of the neural classification network to avoid overfitting:

L_c = E_x[ −( y log ŷ_c + (1 − y) log(1 − ŷ_c) ) ] + λ‖Θ_c‖    (12)
3) The input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).

The JS distance measures the similarity between the prediction distributions of the neural classification network and the semantic logic network; the greater their similarity, the smaller the JS distance. Denote the probability distribution output by the neural classification network as P and that output by the semantic logic network as Q; the JS distance between them is computed as

JS(P ‖ Q) = ½ KL(P ‖ (P+Q)/2) + ½ KL(Q ‖ (P+Q)/2)    (13)

KL(·‖·) denotes the Kullback-Leibler (KL) divergence, computed as shown in formula (14); the JS distance is a variant of the KL divergence that resolves the asymmetry of the KL divergence:

KL(P ‖ Q) = Σ_z P(z) log( P(z) / Q(z) )    (14)

The JS distance is taken as a regularization term in the joint loss. The joint loss is computed as in formula (15), combining the neural classification loss L_c, the semantic logic network loss L_R and the JS term, where the hyperparameters weighing the different loss terms take values in (0, 1) and satisfy the stated constraint.

During training of the parallel structure, all parameters of the neural classification network and the semantic logic network are updated by minimizing the joint loss.
Embodiment 3
An apparatus for implementing the text inference method according to embodiment 1, comprising: a semantic logic network module;
the semantic logic network module is used for: determining whether an input text satisfies a user rule; the semantic logic network module comprises: the device comprises an item detection module, a conjunction rule detection module and a disjunction normal form detection module which are sequentially arranged along the direction of data flow.
Embodiment 4
On the basis of embodiment 3, the apparatus for implementing the text inference method further includes a neural classification network module disposed in parallel with the semantic logic network module;
the neural classification network is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;
the input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence.
Application example
A practical application of the method and apparatus described in Embodiments 1-4 is as follows.
Pre-training a semantic logic network based on the general corpus:
obtaining a universal corpus, comprising: training texts are obtained from a Chinese general corpus such as Chinese Wikipedia, and keyword sets are obtained from a Chinese synonym forest such as Chinese version WordNet.
Item-level and conjunction-rule-level labels are generated automatically for the general corpus, as follows. For item annotation, a text x that contains at least one keyword of an item t receives item label y_t = 1; otherwise y_t = 0. For conjunction rule annotation, keyword sets are combined at random to generate conjunction rules; if the text x simultaneously satisfies all items of a conjunction rule r_i, the conjunction rule label y_{r_i} of the text is 1.
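A small sketch of this weak labeling procedure, assuming keyword sets already include their synonym expansions; the function names are illustrative.

```python
from typing import List, Set

def item_label(text: str, keyword_set: Set[str]) -> int:
    """Item-level weak label: 1 if the text contains at least one keyword."""
    return int(any(kw in text for kw in keyword_set))

def conjunction_label(text: str, rule: List[Set[str]]) -> int:
    """Conjunction-rule weak label: Boolean AND over the rule's item labels."""
    return int(all(item_label(text, item) for item in rule))

rule = [{"earthquake", "seism"}, {"rescue", "relief"}]
print(conjunction_label("earthquake relief work is under way", rule))  # 1
print(conjunction_label("a new museum opened downtown", rule))         # 0
```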
According to step 2) of Embodiment 1 and with reference to FIG. 1, the item detection module and the conjunction rule detection module are pre-trained with the general corpus, specifically as follows:

The general corpus texts x and the general keyword sets are input to train the item detection module.

The input tokens are converted into the corresponding pre-trained word vectors in the embedding layer of the model. For a keyword set to be detected, its vector is the average of the pre-trained word vectors of all words in the set; because synonyms occupy adjacent positions in the embedding space, the average vector captures their common semantic features. For place-name sets, on the other hand, the top-level prefecture word of the geographic region is used as a proxy word, because prefecture-level place names all imply that the event occurred in that region.

The item detection module outputs the probability ŷ_t corresponding to formula (4); with the true label y_t from the annotation above, the loss of formula (5) is computed and back-propagated to update the parameters of the item detection network, iterating until the improvement of the validation-set accuracy falls below a threshold.
According to step 2-2) of Embodiment 1, a conjunction rule detection module is added to UNet, and the two modules are trained with the general corpus, specifically as follows:

The text x and an item t are input to UNet to obtain the output vector u_t; the vectors corresponding to all items contained in a conjunction rule are concatenated and input to CNet to obtain the detection probability of the conjunction rule, corresponding to formula (7). All conjunction rules are detected in sequence.

The loss L_C is computed from the prediction ŷ_{r_i} and the label y_{r_i}, corresponding to formula (8), discarding the prediction part of the item detection module; the loss is back-propagated to update the parameters of UNet and CNet.
The network is then fine-tuned on user data, specifically as follows:
acquiring user requirements in the form of logic rules:
A subscribing user follows emergencies in a specific region, including public security incidents, natural disasters and the like. The user's requirement is depicted as a decision tree in FIG. 2, where white nodes represent logical OR operations and black nodes represent logical AND operations.

For a target text, Boolean predicate values are evaluated at the leaf nodes and passed up to the root node; examples of the keyword sets in FIG. 2 are shown in Table 1.
TABLE 1 keyword set example of subscribed users
The logic rules corresponding to the requirement decision tree of the subscribing user are written according to step 1) of embodiment 1, and propositional formulas and disjunctive normal forms equivalent to the decision tree are shown in table 2.
TABLE 2 logic rules for subscribing users
Fine tuning the semantic logic network using user samples and rules:
The sample set comprises texts of historical interest to the user, i.e. texts judged by experts and pushed; these form the positive sample set with label y = 1. Texts the user was historically not interested in, i.e. texts judged by experts not to be pushed, form the negative sample set with label y = 0.

The texts of the sample set are preprocessed, including Chinese word segmentation, text truncation or padding, and conversion of the segmented words into token input form. All keywords contained in the logic rules are likewise converted into token input form.
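A minimal preprocessing sketch, assuming the jieba segmenter and a word2vec-style vocabulary; the truncation length and the <pad>/<unk> handling are assumptions, not details fixed by the patent.

```python
import jieba  # Chinese word segmentation

def preprocess(text: str, vocab: dict, u: int = 1000) -> list:
    """Segment, truncate or pad to length u, and map words to token ids."""
    tokens = jieba.lcut(text)
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens[:u]]
    ids += [vocab["<pad>"]] * (u - len(ids))
    return ids

vocab = {"<pad>": 0, "<unk>": 1, "台风": 2, "山东": 3}
print(preprocess("台风影响山东", vocab, u=6))
```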
The procedure for fine-tuning the semantic logic network with the sample set is as follows:

According to step 2-1) of Embodiment 1, UNet is fine-tuned with the user sample set and the items of the logic rules; analogous to the pre-training process, UNet is trained with user data iteratively until the improvement of the validation-set accuracy falls below a threshold.

According to step 2-2), UNet and CNet are fine-tuned with the user sample set and the conjunction rules, analogous to the pre-training process, training iteratively until the improvement of the validation-set accuracy falls below the threshold.

According to step 2-3), the disjunction rule detection module is added and DNet is trained with the user sample set, specifically as follows:

The text x and the rules are input to UNet and CNet to obtain the predicted probabilities of all conjunction rules; the maximum probability in the MAX network is taken as the inferred probability that the text meets the user requirement, as shown in formula (9). For example, if CNet outputs three predicted probabilities of 0.98, 0.73 and 0.43, the MAX network outputs 0.98, reflecting that the text satisfies the user requirement as long as any one of the rules is satisfied.

Alternatively, if DNet is implemented with an MLP, all conjunction rule representation vectors are concatenated and input to DNet to obtain a representation vector R, and the prediction probability is obtained from R.

The loss L_R is computed from the prediction ŷ and the label y, as in formula (10), discarding the prediction parts of the item and conjunction rule detection modules; the loss L_R is back-propagated to update the parameters of the whole semantic logic network.
Training a parallel network based on user data, which is specifically as follows:
according to embodiments 2 and 4, training a parallel network structure using a user sample set specifically includes:
independently training the neural classification network: the neural classification network is fully trained using the user sample set, and the loss function is as in equation (12).
Jointly training the semantic logic network and the neural classification network: the trained semantic logic network and the trained neural classification network are combined for fine-tuning, and a JS term is introduced into the joint loss to constrain the consistency of the prediction results on the two sides of the parallel structure; the joint loss is as in formula (15). At this point both branches of the parallel network predict the category of the text simultaneously: the output of the neural classification network branch is ŷ_c, computed as in formula (11), and the output of the semantic logic network is ŷ, computed as in formula (9); the invention preferably adopts a fusion of the two as the final output. For example, in this application, the input text "Bin… is affected by the strong typhoon 'Liqima' …" is predicted to meet the user requirement, while the input text "In the latest updated scenario, … returns to Qingzhou …" is predicted not to meet the user requirement.
Claims (7)
1. A text inference method based on rule embedding, the method comprising:
1) converting the keyword logic expression describing the user requirement into an equivalent disjunctive normal form, wherein the user requirement is a propositional formula P, and the disjunctive normal form of P is:

P = r_1 ∨ r_2 ∨ … ∨ r_n    (1)

in formula (1), n indicates the number of conjunction rules and r_i is the i-th user rule; in the propositional formula P, the connectives are taken from the set {∧, ∨}; an item is a keyword set containing keywords related to a topic or semantics and their synonyms; by the normal-form existence theorem, the propositional formula P can always be converted into an equivalent disjunctive normal form; each r_i is a conjunction rule formed from keyword-set items, i.e. r_i = t_{i,1} ∧ t_{i,2} ∧ … ∧ t_{i,m_i}, where m_i denotes the number of items in r_i; the set of all conjunction rules that make up the user requirement is written R = {r_1, …, r_n}, i.e. the user rule set, where n is the number of conjunction rules; Disjunctive Normal Form is abbreviated DNF; the DNF offers flexibility in handling changes to the user requirement, since such changes can be accommodated efficiently by adding or deleting conjunction rules;
2) determining whether an input text satisfies the user rules:

performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn with a semantic logic network, and finally judging whether the input text satisfies the user rules.
2. The method of claim 1, further comprising a neural classification network disposed in parallel with the semantic logic network, the neural classification network being configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;

the input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each; finally, the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).
3. The method of claim 1, wherein the specific method for performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn comprises the following steps:
2-1) item detection

item detection determines whether the input text x contains semantics related to an item t of the disjunctive normal form; the output is recorded as the detection result ŷ_t, representing the probability that the input text x contains item t;

the input text x is converted into a matrix formed by the corresponding pre-trained word vectors, recorded as E ∈ ℝ^(u×d), where ℝ represents the real number field, u is the truncation length of the input text x, d is the length of the pre-trained word vectors, and each row e_j is the vector of length d corresponding to the j-th word;

the item t is converted into vector form: the vector v_t of item t is the average of the pre-trained word vectors of all keywords in the corresponding keyword set, i.e. v_t = (1/|t|) Σ_{k∈t} e_k, where k is a keyword in the set and e_k is its pre-trained word vector;

the vector v_t and the pre-trained word embedding matrix E of the input text x are multiplied to obtain the interaction vector, recorded as

a = E · v_t    (2)

the text semantic vector h and the interaction vector a are concatenated and reduced in dimension by a multilayer perceptron network MLP to obtain the vector u_t, which expresses the containment relationship of the input text x with respect to item t:

u_t = MLP([h; a])    (3)

the value activated by the sigmoid function is taken as the detected probability that the input text x contains item t, i.e. the inference result ŷ_t, representing the degree to which the input text x satisfies the semantics of the keyword set corresponding to item t:

ŷ_t = σ(W_t · u_t + b_t)    (4)

the semantic logic network thereby predicts the probability that the input text x contains item t; the vector u_t also serves as the input of the next-stage conjunction rule module; σ denotes the sigmoid activation function, and W_t and b_t are network parameters;

a cross-entropy loss function evaluates the difference between the distributions of the inference result ŷ_t and the true result, i.e. the true item label y_t, to obtain the loss

L_T = E_x[ −(1/M) Σ_t ( y_t log ŷ_t + (1 − y_t) log(1 − ŷ_t) ) ] + λ‖Θ_T‖    (5)

where y_t, the true label of the item, is obtained by string matching between the text and the keywords together with synonym expansion; E_x denotes the expectation over training-set samples; M is the number of keyword sets; the training process updates all parameters of the item detection network by minimizing the loss L_T, and λ‖Θ_T‖ denotes norm regularization of the item detection network parameters;
2-2) conjunction rule detection

conjunction rule detection verifies whether the input text x satisfies the semantics of a conjunction rule r_i;

a conjunction rule embedding network, CNet, is used; a conjunction rule r_i comprises a sequence of items t_{i,1}, …, t_{i,m_i}, and the representation vectors obtained when these items are detected form a sequence u_{i,1}, …, u_{i,m_i}; all vectors in the sequence are concatenated as the input and passed through CNet to obtain the representation vector g_i of the conjunction rule:

g_i = CNet([u_{i,1}; …; u_{i,m_i}])    (6)

the detection probability of the conjunction rule is shown in formula (7), where σ denotes the sigmoid activation function, W_r and b_r are network parameters, and ŷ_{r_i} is the probability that the input text contains conjunction rule r_i, i.e. the inference result:

ŷ_{r_i} = σ(W_r · g_i + b_r)    (7)

a cross-entropy loss function measures the difference between the prediction ŷ_{r_i} and the true result, i.e. the true rule label y_{r_i}, giving the loss L_C, where y_{r_i} is obtained by the Boolean conjunction of the labels of the related items and E_x denotes the expectation over training-set samples; the training process updates all parameters of UNet and the conjunction rule detection module by minimizing the loss L_C, and λ‖Θ_C‖ denotes norm regularization of all parameters of UNet and the conjunction rule detection module:

L_C = E_x[ −( y_{r_i} log ŷ_{r_i} + (1 − y_{r_i}) log(1 − ŷ_{r_i}) ) ] + λ‖Θ_C‖    (8)
2-3) disjunctive normal form detection

disjunctive normal form detection verifies whether the input text x satisfies the complete user rule set;

the input is: the conjunction rule representation vector from step 2-2) and the representation vectors of the other associated conjunction rules;

the output is: the predicted probability ŷ that the input text satisfies the user rule set;

the disjunction network is implemented with the max function: the maximum probability among the inference results of step 2-2) is taken as the text inference result, where ŷ is the predicted probability that the input text satisfies the user rule set, max denotes taking the maximum probability, and ŷ_{r_i} is the inference result output by the conjunction rule detection module:

ŷ = max_i ŷ_{r_i}    (9)

the loss L_R is computed with a cross-entropy loss function as shown in formula (10), where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, and E_x denotes the expectation over training-set samples; the training process updates all parameters of the semantic logic network by minimizing the loss L_R, and λ‖Θ‖ denotes norm regularization of the semantic logic network parameters:

L_R = E_x[ −( y log ŷ + (1 − y) log(1 − ŷ) ) ] + λ‖Θ‖    (10)
4. The method of claim 2, wherein the processing method of the neural classification network comprises:

a semantic vector of the input text is constructed by a text encoding module; the text encoding network used is ENC2; after the semantic representation vector h_c of the input text is obtained through the text encoding module, category prediction is performed on it, as shown in formula (11), where ŷ_c represents the probability, predicted by the neural classification network, that the input text meets the user requirement, i.e. the output text-level label; σ denotes the sigmoid activation function, and W_c and b_c are the network parameters:

ŷ_c = σ(W_c · h_c + b_c)    (11)

a cross-entropy loss function measures the difference between the prediction ŷ_c of the neural classification network and the true result, i.e. the true label y of the input text, as shown in formula (12), giving the loss L_c; all parameters of the neural classification network are updated by minimizing L_c, where y is the true label of the input text, i.e. whether the text meets the user requirement as annotated by an expert, E_x denotes the expectation over training-set samples, and λ‖Θ_c‖ denotes norm regularization of all parameters of the neural classification network:

L_c = E_x[ −( y log ŷ_c + (1 − y) log(1 − ŷ_c) ) ] + λ‖Θ_c‖    (12)

the input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence (JS distance for short).
5. The method of claim 4, wherein:

the JS distance measures the similarity between the prediction distributions of the neural classification network and the semantic logic network; denote the probability distribution output by the neural classification network as P and that output by the semantic logic network as Q; the JS distance between them is computed as

JS(P ‖ Q) = ½ KL(P ‖ (P+Q)/2) + ½ KL(Q ‖ (P+Q)/2)    (13)

the JS distance is taken as a regularization term in the joint loss; the joint loss is computed as in formula (15), combining the neural classification loss L_c, the semantic logic network loss L_R and the JS term, where the hyperparameters weighing the different loss terms take values in (0, 1) and satisfy the stated constraint.
6. An apparatus for implementing the text inference method of any one of claims 1-5, comprising: a semantic logic network module;
the semantic logic network module is used for: determining whether an input text satisfies a user rule; the semantic logic network module comprises: the device comprises an item detection module, a conjunction rule detection module and a disjunction normal form detection module which are sequentially arranged along the direction of data flow.
7. The apparatus of claim 6, further comprising a neural classification network module disposed in parallel with the semantic logic network module;
the neural classification network is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;
the input text is processed by the neural classification network and the semantic logic network respectively to obtain a prediction result from each, and finally the consistency of the two prediction results is constrained with the Jensen-Shannon divergence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110984877.9A CN113435212B (en) | 2021-08-26 | 2021-08-26 | Text inference method and device based on rule embedding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113435212A true CN113435212A (en) | 2021-09-24 |
CN113435212B CN113435212B (en) | 2021-11-16 |
Family
ID=77797888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110984877.9A Active CN113435212B (en) | 2021-08-26 | 2021-08-26 | Text inference method and device based on rule embedding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113435212B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708096A (en) * | 2012-05-29 | 2012-10-03 | 代松 | Network intelligence public sentiment monitoring system based on semantics and work method thereof |
CN103605729A (en) * | 2013-11-19 | 2014-02-26 | 段炼 | POI (point of interest) Chinese text categorizing method based on local random word density model |
CN103699663A (en) * | 2013-12-27 | 2014-04-02 | 中国科学院自动化研究所 | Hot event mining method based on large-scale knowledge base |
US10621499B1 (en) * | 2015-08-03 | 2020-04-14 | Marca Research & Development International, Llc | Systems and methods for semantic understanding of digital information |
CN110069623A (en) * | 2017-12-06 | 2019-07-30 | 腾讯科技(深圳)有限公司 | Summary texts generation method, device, storage medium and computer equipment |
CN109840322A (en) * | 2018-11-08 | 2019-06-04 | 中山大学 | It is a kind of based on intensified learning cloze test type reading understand analysis model and method |
CN110321432A (en) * | 2019-06-24 | 2019-10-11 | 拓尔思信息技术股份有限公司 | Textual event information extracting method, electronic device and non-volatile memory medium |
CN113268565A (en) * | 2021-04-27 | 2021-08-17 | 山东大学 | Method and device for quickly generating word vector based on concept text |
Non-Patent Citations (3)
Title |
---|
LI ZHAOHUI;BAI XIAOCHEN;HU RUI;LI XIAOLI: ""Measuring Phase-Amplitude Coupling Based on the Jensen-Shannon Divergence and Correlation Matrix"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING : A PUBLICATION OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY》 * |
刘云: "Attribute Inference Oriented to the Comment Behavior of Social Media Users", China Excellent Doctoral and Master's Theses Full-text Database (Master), Information Science and Technology Series * |
陈良军; 洪彧; SUJITH MANGALATHU; 勾红叶; 蒲黔辉: "Efficient reliability analysis with an adaptive sampling method based on the Jensen-Shannon divergence", Journal of Central South University * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114003726A (en) * | 2021-12-31 | 2022-02-01 | 山东大学 | Subspace embedding-based academic thesis difference analysis method |
Also Published As
Publication number | Publication date |
---|---|
CN113435212B (en) | 2021-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11182562B2 (en) | Deep embedding for natural language content based on semantic dependencies | |
Marivate et al. | Improving short text classification through global augmentation methods | |
US11281976B2 (en) | Generative adversarial network based modeling of text for natural language processing | |
US11481416B2 (en) | Question Answering using trained generative adversarial network based modeling of text | |
US10657259B2 (en) | Protecting cognitive systems from gradient based attacks through the use of deceiving gradients | |
Mahmood et al. | Deep sentiments in roman urdu text using recurrent convolutional neural network model | |
Ezaldeen et al. | A hybrid E-learning recommendation integrating adaptive profiling and sentiment analysis | |
Suissa et al. | Text analysis using deep neural networks in digital humanities and information science | |
US11663518B2 (en) | Cognitive system virtual corpus training and utilization | |
Rauf et al. | Using bert for checking the polarity of movie reviews | |
CN110781666B (en) | Natural language processing text modeling based on generative antagonism network | |
Essa et al. | Fake news detection based on a hybrid BERT and LightGBM models | |
Jiang et al. | A hierarchical model with recurrent convolutional neural networks for sequential sentence classification | |
Suresh Kumar et al. | Local search five‐element cycle optimized reLU‐BiLSTM for multilingual aspect‐based text classification | |
Patil et al. | Hate speech detection using deep learning and text analysis | |
CN113435212B (en) | Text inference method and device based on rule embedding | |
CN116956228A (en) | Text mining method for technical transaction platform | |
Neill et al. | Meta-embedding as auxiliary task regularization | |
Nazarizadeh et al. | Using Group Deep Learning and Data Augmentation in Persian Sentiment Analysis | |
Kandi | Language Modelling for Handling Out-of-Vocabulary Words in Natural Language Processing | |
Lou | Deep learning-based sentiment analysis of movie reviews | |
Jawale et al. | Sentiment analysis and vector embedding: A comparative study | |
Ait Benali et al. | Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism | |
Baruah et al. | Detection of Hate Speech in Assamese Text | |
Han | Emotion Analysis of Literary Works Based on Attentional Mechanisms and the Fusion of Two-Channel Features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||