CN113435212A - Text inference method and device based on rule embedding

Text inference method and device based on rule embedding

Info

Publication number
CN113435212A
CN113435212A
Authority
CN
China
Prior art keywords
network
text
input text
rule
semantic
Prior art date
Legal status
Granted
Application number
CN202110984877.9A
Other languages
Chinese (zh)
Other versions
CN113435212B (en)
Inventor
孙宇清
郑威
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University
Priority to CN202110984877.9A
Publication of CN113435212A
Application granted
Publication of CN113435212B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models
    • G06N 5/041 Abduction


Abstract

A text inference method based on rule embedding performs neural retrieval and inference over the different components of a logic rule using a pre-trained semantic logic network, and supports changes in user requirements and task migration. It combines the semantic logic network with a neural classification network in a parallel structure and adopts the Jensen-Shannon divergence, a probability-distribution distance function, to constrain the consistency of the two inference results through network fine-tuning. The proposed semantic logic network encodes user rules into semantic vectors, which preserves the semantic information of the text while detecting logic rules and supports language flexibility and text diversity. The invention further provides a method for integrating user rules into the neural classification network to improve text inference performance: a parallel prediction structure combines the neural classification network with semantic logic network inference under a joint consistency loss, so that the two networks benefit from each other, and the rule detection results serve as evidence for the text inference.

Description

Text inference method and device based on rule embedding
Technical Field
The invention discloses a text inference method and device based on rule embedding, belonging to the technical field of natural language processing.
Background
Public opinion subscription is an important application scenario in the new media era: a media organization regularly pushes texts, such as internet public opinion or news, to subscribing users according to their requirements. These user requirements are usually expressed as keyword logic rules describing the text content the user prefers. The text inference task based on user requirements is to judge whether a text meets a user's requirement, a task of significant application value in this scenario.
Existing technologies for this inference task fall into two categories. The first infers from keyword Boolean retrieval results, finding the texts that match a user-defined keyword logic expression by comparing each text against the expression. Keyword Boolean retrieval is limited, however: the flexibility of natural language gives texts with the same semantics great freedom of expression, which affects the matching result. The second is deep-learning classification, which performs text category inference with pre-trained word vectors and a neural network trained by supervised learning on a large-scale labeled dataset, so that the network can understand and infer at a semantic level whether a text meets the user's requirements. For example, Chinese patent document CN113076488A, "A method and system for recommending information based on user data", models features of specific sentences carrying user information in a text through preset keywords, based on text representation vectors obtained from a convolutional neural network. Its drawbacks are that it struggles with the topical diversity of user requirements and adapts poorly to changes in those requirements.
Disclosure of Invention
Aiming at the problems in the prior art, the invention discloses a text inference method based on rule embedding.
The invention also discloses a device for implementing the text inference method, so as to realize inference processing of text.
Summary of the invention:
A text inference method based on rule embedding comprises two parts. First, based on a pre-trained semantic logic network, neural retrieval and inference are performed over the different components of a logic rule, supporting changes in user requirements and task migration. Second, a parallel structure combining the semantic logic network with a neural classification network adopts the Jensen-Shannon divergence, a probability-distribution distance function, to constrain the consistency of the inference results through network fine-tuning. Finally, fused inference is performed from the prediction results of the semantic logic network and the neural classification network, and the activation results of the semantic logic network serve as evidence for the text inference result.
The invention provides a semantic logic network that approximates the logical inference process in a neural manner. The process comprises detecting components of different granularity of a logic rule in the text and combining the detection results, the components being items, conjunctions and disjunctions. Three independent loss functions separately verify the containment relation of the text to each kind of component. To address the challenge posed by dynamically changing user requirements, the semantic logic network is trained with a pre-training and fine-tuning mechanism. The network consists of three modules, used respectively for semantic detection of the items, the conjunction rules and the disjunction rules in the user rules, and text inference combines their detection results. Texts are obtained from a general Chinese corpus such as Chinese Wikipedia, and a general keyword-set corpus is obtained from a Chinese synonym thesaurus such as the Chinese WordNet. Each module is pre-trained on the general corpus to strengthen the robustness of keyword detection, and then fine-tuned on the user data, improving adaptability to changes in user requirements.
In addition, the invention provides a parallel structure combining an optional neural classification network with the semantic logic network, fine-tuned through joint training to improve inference performance. To combine the neural classification network and the semantic logic network, a Jensen-Shannon loss term is used as a regularizer, constraining the consistency of the prediction results on the two sides of the parallel structure during the fine-tuning stage.
Interpretation of technical terms:
1. User requirements: also called user rules. In the present invention, a subscribing user describes his or her preferences for text content, given in the form of logic rules over keyword sets, i.e. domain-specific words or phrases. Dynamically changing user requirements means that when the user raises a new point of attention, the logic expression is changed by adding or deleting keywords.
2. Text inference: for a given user requirement, inferring whether an input text meets that requirement.
3. Semantic logic network: the neural network used for semantic detection and inference over the input text.
4. Parallel network: in the present invention, the parallel network comprises a semantic logic network and a neural classification network arranged in parallel.
5. Consistency constraint: the loss function introduces the Jensen-Shannon divergence (JS distance for short), a probability-distribution distance function, as a regularization term that constrains the inference results on the two sides of the parallel network to optimize toward consistent probability distributions. The JS distance is a variant of the Kullback-Leibler (KL) divergence and resolves the asymmetry of the KL divergence.
The detailed technical scheme of the invention is as follows:
A text inference method based on rule embedding, the method comprising:
1) Converting the keyword logic expression describing the user requirement into an equivalent disjunctive normal form. The user requirement is a propositional formula P, whose disjunctive normal form is:

P = r_1 ∨ r_2 ∨ ... ∨ r_n    (1)

In formula (1), n denotes the number of conjunction rules and r_i is the i-th user rule. In the propositional formula P, the connectives are taken from the set {∧, ∨}; an item is a keyword set e containing keywords, and their synonyms, related in topic or semantics. According to the normal form existence theorem, the propositional formula P can always be converted into an equivalent disjunctive normal form. Each r_i is a conjunction rule formed from keyword sets, i.e.

r_i = e_1 ∧ e_2 ∧ ... ∧ e_{m_i}

where m_i denotes the number of items in the conjunction rule r_i. The set of all conjunction rules constituting the user requirement, i.e. the user rule set, is denoted R = {r_1, r_2, ..., r_n}, where n denotes the number of conjunction rules. In this step, disjunctive normal form is abbreviated DNF. The disjunctive normal form has the flexibility to handle changes in user requirements: a change can be accommodated efficiently by adding or deleting conjunction rules. The conversion itself follows the conventional transformation of logic expressions and is not part of the content to be protected by the invention.
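To make the rule structure concrete, the sketch below shows one possible in-memory representation of a user rule set in disjunctive normal form and its Boolean evaluation against a tokenized text. It is an illustrative assumption rather than the patented network: the class names, keyword sets and matching by set intersection are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """An item e: a set of keywords and synonyms sharing one topic or semantics."""
    keywords: set[str]

    def matches(self, tokens: set[str]) -> bool:
        # An item is satisfied if the text contains at least one of its keywords.
        return bool(self.keywords & tokens)

@dataclass
class ConjunctionRule:
    """A conjunction rule r_i = e_1 AND e_2 AND ... AND e_mi."""
    items: list[Item]

    def matches(self, tokens: set[str]) -> bool:
        return all(item.matches(tokens) for item in self.items)

@dataclass
class UserRuleSet:
    """The DNF P = r_1 OR r_2 OR ... OR r_n."""
    rules: list[ConjunctionRule]

    def matches(self, tokens: set[str]) -> bool:
        return any(rule.matches(tokens) for rule in self.rules)

# Hypothetical example: (typhoon AND region) OR (earthquake AND region).
typhoon = Item({"typhoon", "tropical storm"})
earthquake = Item({"earthquake", "seism"})
region = Item({"Shandong", "Jinan", "Qingdao"})
P = UserRuleSet([ConjunctionRule([typhoon, region]),
                 ConjunctionRule([earthquake, region])])
print(P.matches({"typhoon", "hit", "Qingdao"}))  # True
```

Under this representation, a change in user requirements is simply an append or delete on the `rules` list, which is the flexibility the DNF conversion provides.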
2) Determining whether an input text satisfies the user rules:
Given a text collection and the user rule set R = {r_1, ..., r_n}, an input text is denoted x. The inferred text-level output probability, p(R|x), represents the probability that the input text x satisfies the user rules. The rule-level probability is a vector of length n whose i-th component represents the predicted probability that the input text x satisfies user rule r_i; from these values one judges which user rules the text x satisfies.
The invention uses the semantic logic network to understand the input text x and infer whether it meets the user rules corresponding to the user requirement:
The semantic logic network performs, in turn, item detection, conjunction rule detection and disjunctive normal form detection on the input text x, and finally judges whether the input text satisfies the user rules. This step judges whether the input text x satisfies the semantics of the items e, the conjunction rules r_i and the disjunction rules in the user rules. As shown on the right side of FIG. 1, the three modules, from bottom to top, perform item detection, conjunction rule detection and disjunction rule detection respectively.
Preferably, according to the present invention, the text inference method based on rule embedding further includes a neural classification network arranged in parallel with the semantic logic network. The neural classification network is configured to perform category prediction on the input text, yielding the probability that the input text meets the user requirement, i.e. its prediction result.
The input text is inferred through the neural classification network and the semantic logic network respectively, giving the two prediction results; finally, the consistency of the two prediction results is constrained using the Jensen-Shannon divergence (JS distance for short).
According to the invention, the specific method of performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn is as follows:
2-1) Item detection
Item detection determines whether the input text x contains semantics related to an item e_j of the disjunctive normal form.
The input is the input text x; the output, recorded as the detection result p(e_j|x), represents the probability that the input text x contains item e_j.
The input text x is converted into a matrix of the corresponding pre-trained word vectors. The pre-trained word vectors are Chinese word vectors trained on the Chinese Wikipedia corpus with the word2vec algorithm. The matrix of pre-trained word vectors for all words of the input text x is recorded as X ∈ R^{u×d}, where R denotes the real field, u is the truncation length of the input text, d is the length of a pre-trained word vector, and w_k is the vector of length d corresponding to word x_k.
The item e_j is converted into vector form: the vector e_j of the item is the average of the pre-trained word vectors of all keywords in the corresponding keyword set, i.e. e_j = (1/|e_j|) Σ_{w∈e_j} v_w, where w is a keyword in the set and v_w is its corresponding pre-trained word vector.
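A minimal sketch of this item embedding, assuming a dict-based word-vector lookup (the variable names and the lookup table are illustrative, not from the patent):

```python
import numpy as np

def item_vector(keywords: list[str], word_vecs: dict[str, np.ndarray], d: int) -> np.ndarray:
    """Average the pre-trained word vectors of all keywords in the item's set."""
    vecs = [word_vecs[w] for w in keywords if w in word_vecs]
    return np.mean(vecs, axis=0) if vecs else np.zeros(d)
```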
The mutual information between the item e_j and every word of the input text x is then computed: the interaction vector, recorded as c_j, is obtained by matrix multiplication of the pre-trained word embedding matrix X of the input text x with the item vector e_j:

c_j = X e_j    (2)

The input text x is semantically encoded by the encoding network ENC to obtain the text semantic vector s. Different convolutional neural networks can be adopted in the present invention; a TEXTCNN structure is preferably adopted as the encoding network ENC, with three convolution kernel sizes of 2×d, 3×d and 4×d, where d is the dimension of the pre-trained word vectors, and with 64 kernels of each size.
The text semantic vector s and the interaction vector c_j are spliced and reduced in dimension through a multilayer perceptron network MLP to obtain the vector t_j, which represents the containment relation of the input text x to the item e_j:

t_j = MLP([s; c_j])    (3)

The value of t_j activated through the sigmoid function serves as the detected probability that the input text x contains the item e_j, i.e. the inference result, representing the degree to which the input text x satisfies the semantics of the keyword set corresponding to item e_j:

p(e_j|x) = σ(W_e t_j + b_e)    (4)

Here p(e_j|x) is the probability, predicted by the semantic logic network, that the input text x contains item e_j; the vector t_j also serves as the input of the next-stage conjunction rule module; σ denotes the sigmoid activation function, and W_e, b_e are network parameters.
A cross-entropy loss function evaluates the difference between the distribution of the inference result p(e_j|x) and the true result, i.e. the true item label y_{e_j}, to obtain the loss L_e:

L_e = −E_x [ (1/M) Σ_{j=1..M} ( y_{e_j} log p(e_j|x) + (1 − y_{e_j}) log(1 − p(e_j|x)) ) ] + λ‖Θ_e‖_2    (5)

Here y_{e_j}, the true label of the item, is obtained by string-matching detection between the text and the keywords together with synonym expansion; E_x denotes the expectation over training-set samples; M is the number of keyword sets. The training process updates all parameters of the item detection network by minimizing the loss L_e, and ‖Θ_e‖_2 indicates that the L2 norm regularizes the parameters of the item detection network to avoid overfitting.
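The following PyTorch sketch assembles the item detection module (the network later called UNet) as described: TextCNN encoding, interaction vector by matrix multiplication, splicing, MLP reduction and sigmoid scoring. It is a sketch under assumptions: the hidden size, the MLP depth and the output parameterization are illustrative, since the patent only fixes the kernel sizes (2×d, 3×d, 4×d) and the kernel count (64).

```python
import torch
import torch.nn as nn

class ItemDetector(nn.Module):
    """Item detection (step 2-1): predicts p(e_j | x) for one item vector e_j."""
    def __init__(self, d: int, u: int, hidden: int = 128):
        super().__init__()
        # TextCNN encoder ENC: kernels of size 2xd, 3xd, 4xd, 64 of each.
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels=d, out_channels=64, kernel_size=k) for k in (2, 3, 4))
        # MLP reduces the splice of the text vector s (3*64 dims) and the
        # interaction vector c_j (length u, one interaction score per word).
        self.mlp = nn.Sequential(nn.Linear(3 * 64 + u, hidden), nn.ReLU())
        self.score = nn.Linear(hidden, 1)

    def forward(self, X: torch.Tensor, e_j: torch.Tensor):
        # X: (batch, u, d) pre-trained word-vector matrix; e_j: (d,) item vector.
        s = torch.cat([conv(X.transpose(1, 2)).relu().max(dim=2).values
                       for conv in self.convs], dim=1)    # text semantic vector s
        c_j = X @ e_j                                     # interaction vector, formula (2)
        t_j = self.mlp(torch.cat([s, c_j], dim=1))        # formula (3)
        p = torch.sigmoid(self.score(t_j)).squeeze(1)     # formula (4)
        return p, t_j                                     # t_j feeds the conjunction module

# Training uses binary cross-entropy against the item labels, as in formula (5):
# loss = nn.functional.binary_cross_entropy(p, item_labels)
```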
2-2) Conjunction rule detection
Conjunction rule detection verifies whether the input text x satisfies the semantics of a conjunction rule r_i.
The input is the item representation vectors t_j from step 2-1); the output is the predicted probability that the input text contains the conjunction rule r_i.
For the conjunction rule embedding network CNet, the invention verifies that different structures, MLP or CNN, have the capability to approximate the logical conjunction operation. A conjunction rule r_i comprises a sequence of items e_1, ..., e_{m_i}; the representation vectors corresponding to the detected items form a sequence t_1, ..., t_{m_i}. All vectors in the sequence are spliced together as input, and the representation vector h_i of the conjunction rule is obtained through CNet; this output vector contains the containment relation of the input text to the conjunction rule r_i:

h_i = CNet([t_1; t_2; ...; t_{m_i}])    (6)

where [t_1; t_2; ...; t_{m_i}] denotes the sequence of all items of r_i. Activating h_i through the sigmoid function yields the detection probability of the conjunction rule, shown in formula (7), where σ denotes the sigmoid activation function, W_r and b_r are network parameters, and p(r_i|x) is the probability that the input text contains conjunction rule r_i, i.e. the inference result:

p(r_i|x) = σ(W_r h_i + b_r)    (7)

A cross-entropy loss function measures the difference between the prediction p(r_i|x) and the true result, i.e. the true rule label y_{r_i}, to obtain the loss L_r, where y_{r_i}, the true label of the rule, is obtained by the Boolean conjunction of the related item labels, and E_x denotes the expectation over training-set samples. The training process updates all parameters in the item detection network UNet of step 2-1) and in the conjunction rule detection module by minimizing the loss L_r; ‖Θ_r‖_2 indicates that the L2 norm regularizes all parameters in UNet and the conjunction rule detection module to avoid overfitting:

L_r = −E_x [ (1/n) Σ_{i=1..n} ( y_{r_i} log p(r_i|x) + (1 − y_{r_i}) log(1 − p(r_i|x)) ) ] + λ‖Θ_r‖_2    (8)
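A sketch of CNet in its MLP variant, under the same caveats (the layer sizes are illustrative; the patent states only that an MLP or CNN over the spliced item vectors approximates logical conjunction):

```python
import torch
import torch.nn as nn

class ConjunctionDetector(nn.Module):
    """Conjunction rule detection (step 2-2), MLP variant of CNet."""
    def __init__(self, item_dim: int, n_items: int, hidden: int = 128):
        super().__init__()
        self.cnet = nn.Sequential(                    # CNet over spliced item vectors
            nn.Linear(n_items * item_dim, hidden), nn.ReLU())
        self.score = nn.Linear(hidden, 1)

    def forward(self, item_vectors: list[torch.Tensor]):
        # item_vectors: m_i tensors of shape (batch, item_dim) from the item module.
        h_i = self.cnet(torch.cat(item_vectors, dim=1))    # formula (6)
        return torch.sigmoid(self.score(h_i)).squeeze(1)   # formula (7): p(r_i | x)
```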
2-3) Disjunctive normal form detection
Disjunctive normal form detection verifies whether the input text x satisfies the complete user rule set, which is equivalent to whether the text satisfies any conjunction rule in the user rule set.
The input is the conjunction rule representation vector h_i from step 2-2) together with the representation vectors of the other associated conjunction rules; the output is the predicted probability that the input text satisfies the user rule set.
The disjunction network DNet is implemented with the max function: the maximum probability among the inference results of step 2-2) is taken as the text inference result, representing the inferred probability that the input text x satisfies the user requirement, where p(R|x) is the predicted probability that the input text satisfies the user rule set, max denotes the maximum-probability function, and p(r_i|x) is the inference result output by the conjunction rule detection module:

p(R|x) = max_i p(r_i|x)    (9)

The loss L_R is computed with a cross-entropy loss function as shown in formula (10), where y is the true label of the input text, annotated by experts as to whether the text meets the user requirement, and E_x denotes the expectation over training-set samples. The training process updates all parameters of the semantic logic network by minimizing the loss L_R; ‖Θ‖_2 indicates that the L2 norm regularizes the parameters of the semantic logic network to avoid overfitting:

L_R = −E_x [ y log p(R|x) + (1 − y) log(1 − p(R|x)) ] + λ‖Θ‖_2    (10)
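The max-based DNet reduces to one line; a sketch (the MLP variant mentioned later in the application example would replace the max with a small network over the spliced rule vectors):

```python
import torch

def disjunction_max(rule_probs: torch.Tensor) -> torch.Tensor:
    """DNet via max (formula (9)): rule_probs is (batch, n) of p(r_i | x)."""
    return rule_probs.max(dim=1).values   # p(R | x), one probability per text
```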
According to a preferred embodiment of the present invention, the processing method of the neural classification network is as follows:
A text encoding module constructs the semantic vector of the input text; the text encoding network ENC2 used there is preferably a CNN-, RNN- or BERT-based encoding module. After the semantic representation vector of the input text is obtained through the text encoding module, category prediction is performed on it, as shown in formula (11), where q(y|x) represents the probability with which the neural classification network predicts that the input text meets the user requirement and yields the output text-level label ŷ; σ denotes the sigmoid activation function, and W_c, b_c are the network parameters:

q(y|x) = σ(W_c ENC2(x) + b_c)    (11)

A cross-entropy loss function measures the difference between the prediction q(y|x) of the neural classification network and the true result, i.e. the true label y of the input text, as shown in formula (12); all parameters of the neural classification network are updated by minimizing the loss L_c. Here y is the true label of the input text, annotated by experts as to whether the text meets the user requirement, E_x denotes the expectation over training-set samples, and ‖Θ_c‖_2 indicates that the L2 norm regularizes all parameters of the neural classification network to avoid overfitting:

L_c = −E_x [ y log q(y|x) + (1 − y) log(1 − q(y|x)) ] + λ‖Θ_c‖_2    (12)
3) The input text is inferred through the neural classification network and the semantic logic network respectively to obtain the two prediction results, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence (JS distance for short).
The JS distance measures the similarity between the prediction distributions of the neural classification network and the semantic logic network; the greater their similarity, the smaller the JS distance. Denote the probability distribution output by the neural classification network as P_1 and that output by the semantic logic network as P_2. The JS distance between them is computed as:

JS(P_1 ‖ P_2) = (1/2) KL(P_1 ‖ (P_1 + P_2)/2) + (1/2) KL(P_2 ‖ (P_1 + P_2)/2)    (13)

Here KL denotes the Kullback-Leibler (KL) divergence, computed as shown in formula (14); the JS distance is a variant of the KL divergence that resolves the asymmetry of the KL divergence:

KL(P_1 ‖ P_2) = Σ_z P_1(z) log( P_1(z) / P_2(z) )    (14)

The JS distance is taken as a regularization term in the joint loss. The joint loss L is computed as in formula (15), where the hyperparameters α and β weigh the different loss terms, take values in the range (0, 1) and satisfy the constraint α + β < 1; L_c is the loss function shown in formula (12) and L_R is the loss function shown in formula (10):

L = α · L_c + β · L_R + (1 − α − β) · JS(P_1 ‖ P_2)    (15)

During the parallel-structure training process, all parameters of the neural classification network and the semantic logic network are updated by minimizing the joint loss L.
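A sketch of the consistency-constrained joint loss for binary predictions, assuming α and β are scalars chosen by the practitioner; the discrete two-outcome JS computation below is one standard way to realize formula (13) for Bernoulli outputs:

```python
import torch

def js_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """JS distance between Bernoulli distributions given by probabilities p and q."""
    def kl(a, b):  # KL over the two outcomes {1, 0}, formula (14)
        return a * torch.log((a + eps) / (b + eps)) \
             + (1 - a) * torch.log((1 - a + eps) / (1 - b + eps))
    m = (p + q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)           # formula (13)

def joint_loss(q_cls, p_sln, y, alpha=0.4, beta=0.4):
    """Formula (15): joint loss over the parallel structure (alpha + beta < 1).
    q_cls, p_sln, y: (batch,) float tensors; y holds the 0/1 expert labels."""
    bce = torch.nn.functional.binary_cross_entropy
    l_c = bce(q_cls, y)                              # neural classification loss, (12)
    l_r = bce(p_sln, y)                              # semantic logic network loss, (10)
    return alpha * l_c + beta * l_r + (1 - alpha - beta) * js_divergence(q_cls, p_sln).mean()
```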
An apparatus for implementing the text inference method, characterized by comprising: a semantic logic network module;
the semantic logic network module is used for determining whether an input text satisfies the user rules; it comprises an item detection module, a conjunction rule detection module and a disjunctive normal form detection module arranged in sequence along the direction of data flow.
According to a preferred embodiment of the present invention, the apparatus for implementing the text inference method further includes a neural classification network module arranged in parallel with the semantic logic network module;
the neural classification network is used for performing category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;
the input text is inferred through the neural classification network and the semantic logic network respectively to obtain the two prediction results, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence.
The technical advantages of the invention are as follows:
(1) The semantic logic network provided by the invention encodes user rules into semantic vectors, which preserves the semantic information of the text while detecting logic rules and supports language flexibility and text diversity.
(2) The invention also provides a method for integrating user rules into the neural classification network to improve text inference performance: a parallel prediction structure combines the neural classification network with semantic logic network inference under a joint consistency loss, so that the semantic logic network and the neural classification network benefit from each other, and the rule detection results serve as evidence for the text inference.
The pre-training-based semantic logic inference provided by the invention better accommodates dynamically changing user requirements. Against the challenge this poses to supervised learning methods, the invention pre-trains the semantic logic network on massive general corpora such as Chinese Wikipedia and on open-domain linguistic knowledge such as synonym and near-synonym sets extracted from the Chinese WordNet, and fine-tunes it on specific user data, which strengthens the robustness of keyword detection and helps to efficiently handle dynamically changing user requirements.
Drawings
FIG. 1 is a schematic diagram of an apparatus for implementing a rule embedding based text inference method of the present invention;
FIG. 2 is an example of a user requirement decision tree in the embodiment of the present invention.
Detailed Description
The invention is described in detail below with reference to the following examples and the accompanying drawings of the specification, but is not limited thereto.
Embodiment 1
A text inference method based on rule embedding, the method comprising:
1) Converting the keyword logic expression describing the user requirement into an equivalent disjunctive normal form. The user requirement is a propositional formula P, whose disjunctive normal form is:

P = r_1 ∨ r_2 ∨ ... ∨ r_n    (1)

In formula (1), n denotes the number of conjunction rules and r_i is the i-th user rule. In the propositional formula P, the connectives are taken from the set {∧, ∨}; an item is a keyword set e containing keywords, and their synonyms, related in topic or semantics. According to the normal form existence theorem, the propositional formula P can always be converted into an equivalent disjunctive normal form. Each r_i is a conjunction rule formed from keyword sets, i.e.

r_i = e_1 ∧ e_2 ∧ ... ∧ e_{m_i}

where m_i denotes the number of items in the conjunction rule r_i. The set of all conjunction rules constituting the user requirement, i.e. the user rule set, is denoted R = {r_1, r_2, ..., r_n}, where n denotes the number of conjunction rules. In this step, disjunctive normal form is abbreviated DNF. The disjunctive normal form has the flexibility to handle changes in user requirements: a change can be accommodated efficiently by adding or deleting conjunction rules.
2) Determining whether an input text satisfies the user rules:
Given a text collection and the user rule set R = {r_1, ..., r_n}, an input text is denoted x. The inferred text-level output probability, p(R|x), represents the probability that the input text x satisfies the user rules. The rule-level probability is a vector of length n whose i-th component represents the predicted probability that the input text x satisfies user rule r_i; from these values one judges which user rules the text x satisfies.
The invention uses the semantic logic network to understand the input text x and infer whether it meets the user rules corresponding to the user requirement:
The semantic logic network performs, in turn, item detection, conjunction rule detection and disjunctive normal form detection on the input text x, and finally judges whether the input text satisfies the user rules.
The specific method of performing item detection, conjunction rule detection and disjunctive normal form detection on the input text x in turn is as follows:
2-1) Item detection
Item detection determines whether the input text x contains semantics related to an item e_j of the disjunctive normal form.
The input is the input text x; the output, recorded as the detection result p(e_j|x), represents the probability that the input text x contains item e_j.
The input text x is converted into a matrix of the corresponding pre-trained word vectors. The pre-trained word vectors are Chinese word vectors trained on the Chinese Wikipedia corpus with the word2vec algorithm. The matrix of pre-trained word vectors for all words of the input text x is recorded as X ∈ R^{u×d}, where R denotes the real field, u is the truncation length of the input text, d is the length of a pre-trained word vector, and w_k is the vector of length d corresponding to word x_k.
The item e_j is converted into vector form: the vector e_j of the item is the average of the pre-trained word vectors of all keywords in the corresponding keyword set, i.e. e_j = (1/|e_j|) Σ_{w∈e_j} v_w, where w is a keyword in the set and v_w is its corresponding pre-trained word vector.
The mutual information between the item e_j and every word of the input text x is then computed: the interaction vector, recorded as c_j, is obtained by matrix multiplication of the pre-trained word embedding matrix X of the input text x with the item vector e_j:

c_j = X e_j    (2)

The input text x is semantically encoded by the encoding network ENC to obtain the text semantic vector s. Different convolutional neural networks can be adopted in the present invention; a TEXTCNN structure is preferably adopted as the encoding network ENC, with three convolution kernel sizes of 2×d, 3×d and 4×d, where d is the dimension of the pre-trained word vectors, and with 64 kernels of each size.
The text semantic vector s and the interaction vector c_j are spliced and reduced in dimension through a multilayer perceptron network MLP to obtain the vector t_j, which represents the containment relation of the input text x to the item e_j:

t_j = MLP([s; c_j])    (3)

The value of t_j activated through the sigmoid function serves as the detected probability that the input text x contains the item e_j, i.e. the inference result, representing the degree to which the input text x satisfies the semantics of the keyword set corresponding to item e_j:

p(e_j|x) = σ(W_e t_j + b_e)    (4)

Here p(e_j|x) is the probability, predicted by the semantic logic network, that the input text x contains item e_j; the vector t_j also serves as the input of the next-stage conjunction rule module; σ denotes the sigmoid activation function, and W_e, b_e are network parameters.
A cross-entropy loss function evaluates the difference between the distribution of the inference result p(e_j|x) and the true result, i.e. the true item label y_{e_j}, to obtain the loss L_e:

L_e = −E_x [ (1/M) Σ_{j=1..M} ( y_{e_j} log p(e_j|x) + (1 − y_{e_j}) log(1 − p(e_j|x)) ) ] + λ‖Θ_e‖_2    (5)

Here y_{e_j}, the true label of the item, is obtained by string-matching detection between the text and the keywords together with synonym expansion; E_x denotes the expectation over training-set samples; M is the number of keyword sets. The training process updates all parameters of the item detection network by minimizing the loss L_e, and ‖Θ_e‖_2 indicates that the L2 norm regularizes the parameters of the item detection network to avoid overfitting.
2-2) Conjunction rule detection
Conjunction rule detection verifies whether the input text x satisfies the semantics of a conjunction rule r_i.
The input is the item representation vectors t_j from step 2-1); the output is the predicted probability that the input text contains the conjunction rule r_i.
For the conjunction rule embedding network CNet, the invention verifies that different structures, MLP or CNN, have the capability to approximate the logical conjunction operation. A conjunction rule r_i comprises a sequence of items e_1, ..., e_{m_i}; the representation vectors corresponding to the detected items form a sequence t_1, ..., t_{m_i}. All vectors in the sequence are spliced together as input, and the representation vector h_i of the conjunction rule is obtained through CNet; this output vector contains the containment relation of the input text to the conjunction rule r_i:

h_i = CNet([t_1; t_2; ...; t_{m_i}])    (6)

where [t_1; t_2; ...; t_{m_i}] denotes the sequence of all items of r_i. Activating h_i through the sigmoid function yields the detection probability of the conjunction rule, shown in formula (7), where σ denotes the sigmoid activation function, W_r and b_r are network parameters, and p(r_i|x) is the probability that the input text contains conjunction rule r_i, i.e. the inference result:

p(r_i|x) = σ(W_r h_i + b_r)    (7)

A cross-entropy loss function measures the difference between the prediction p(r_i|x) and the true result, i.e. the true rule label y_{r_i}, to obtain the loss L_r, where y_{r_i}, the true label of the rule, is obtained by the Boolean conjunction of the related item labels, and E_x denotes the expectation over training-set samples. The training process updates all parameters in the item detection network UNet and in the conjunction rule detection module by minimizing the loss L_r; ‖Θ_r‖_2 indicates that the L2 norm regularizes all parameters in UNet and the conjunction rule detection module to avoid overfitting:

L_r = −E_x [ (1/n) Σ_{i=1..n} ( y_{r_i} log p(r_i|x) + (1 − y_{r_i}) log(1 − p(r_i|x)) ) ] + λ‖Θ_r‖_2    (8)
2-3) Disjunctive normal form detection
Disjunctive normal form detection verifies whether the input text x satisfies the complete user rule set, which is equivalent to whether the text satisfies any conjunction rule in the user rule set.
The input is the conjunction rule representation vector h_i from step 2-2) together with the representation vectors of the other associated conjunction rules; the output is the predicted probability that the input text satisfies the user rule set.
The disjunction network DNet is implemented with the max function: the maximum probability among the inference results of step 2-2) is taken as the text inference result, representing the inferred probability that the input text x satisfies the user requirement, where p(R|x) is the predicted probability that the input text satisfies the user rule set, max denotes the maximum-probability function, and p(r_i|x) is the inference result output by the conjunction rule detection module:

p(R|x) = max_i p(r_i|x)    (9)

The loss L_R is computed with a cross-entropy loss function as shown in formula (10), where y is the true label of the input text, annotated by experts as to whether the text meets the user requirement, and E_x denotes the expectation over training-set samples. The training process updates all parameters of the semantic logic network by minimizing the loss L_R; ‖Θ‖_2 indicates that the L2 norm regularizes the parameters of the semantic logic network to avoid overfitting:

L_R = −E_x [ y log p(R|x) + (1 − y) log(1 − p(R|x)) ] + λ‖Θ‖_2    (10)
examples 2,
The method of embodiment 1 for rule-based embedded text inference further comprising a neural classification network disposed in parallel with the semantic logic network, the neural classification network configured to: performing category prediction on an input text to obtain the probability that the input text meets the requirements of a user, namely a prediction result;
respectively deducing the input text through a neural classification network and a semantic logic network to respectively obtain the prediction results of the input text and the semantic logic network; and finally, constraining the consistency of the prediction results of the Jensen-Shannon divergence, namely JS distance for short.
The processing method of the neural classification network is as follows:
A text encoding module constructs the semantic vector of the input text; the text encoding network ENC2 used there is preferably a CNN-, RNN- or BERT-based encoding module. After the semantic representation vector of the input text is obtained through the text encoding module, category prediction is performed on it, as shown in formula (11), where q(y|x) represents the probability with which the neural classification network predicts that the input text meets the user requirement and yields the output text-level label ŷ; σ denotes the sigmoid activation function, and W_c, b_c are the network parameters:

q(y|x) = σ(W_c ENC2(x) + b_c)    (11)

A cross-entropy loss function measures the difference between the prediction q(y|x) of the neural classification network and the true result, i.e. the true label y of the input text, as shown in formula (12); all parameters of the neural classification network are updated by minimizing the loss L_c. Here y is the true label of the input text, annotated by experts as to whether the text meets the user requirement, E_x denotes the expectation over training-set samples, and ‖Θ_c‖_2 indicates that the L2 norm regularizes all parameters of the neural classification network to avoid overfitting:

L_c = −E_x [ y log q(y|x) + (1 − y) log(1 − q(y|x)) ] + λ‖Θ_c‖_2    (12)
3) The input text is inferred through the neural classification network and the semantic logic network respectively to obtain the two prediction results, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence (JS distance for short).
The JS distance measures the similarity between the prediction distributions of the neural classification network and the semantic logic network; the greater their similarity, the smaller the JS distance. Denote the probability distribution output by the neural classification network as P_1 and that output by the semantic logic network as P_2. The JS distance between them is computed as:

JS(P_1 ‖ P_2) = (1/2) KL(P_1 ‖ (P_1 + P_2)/2) + (1/2) KL(P_2 ‖ (P_1 + P_2)/2)    (13)

Here KL denotes the Kullback-Leibler (KL) divergence, computed as shown in formula (14); the JS distance is a variant of the KL divergence that resolves the asymmetry of the KL divergence:

KL(P_1 ‖ P_2) = Σ_z P_1(z) log( P_1(z) / P_2(z) )    (14)

The JS distance is taken as a regularization term in the joint loss. The joint loss L is computed as in formula (15), where the hyperparameters α and β weigh the different loss terms, take values in the range (0, 1) and satisfy the constraint α + β < 1; L_c is the loss function shown in formula (12) and L_R is the loss function shown in formula (10):

L = α · L_c + β · L_R + (1 − α − β) · JS(P_1 ‖ P_2)    (15)

During the parallel-structure training process, all parameters of the neural classification network and the semantic logic network are updated by minimizing the joint loss L.
Embodiment 3
An apparatus for implementing the text inference method according to Embodiment 1, comprising: a semantic logic network module;
the semantic logic network module is used for determining whether an input text satisfies the user rules; it comprises an item detection module, a conjunction rule detection module and a disjunctive normal form detection module arranged in sequence along the direction of data flow.
Embodiment 4
On the basis of Embodiment 3, the apparatus for implementing the text inference method further includes a neural classification network module arranged in parallel with the semantic logic network module;
the neural classification network is used for performing category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e. its prediction result;
the input text is inferred through the neural classification network and the semantic logic network respectively to obtain the two prediction results, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence.
Application example
A practical application of the method and apparatus described in Embodiments 1-4 is as follows.
Pre-training the semantic logic network on the general corpus:
Obtaining the general corpus comprises: training texts are obtained from a general Chinese corpus such as Chinese Wikipedia, and keyword sets are obtained from a Chinese synonym thesaurus such as the Chinese-version WordNet.
Automatic annotation of the general corpus at the item level and the conjunction rule level comprises: for item annotation, a text x that contains at least one keyword of item e_j receives the item label y_{e_j} = 1; otherwise y_{e_j} is 0. For conjunction rule annotation, keyword sets are randomly combined to generate conjunction rules; if a text x simultaneously satisfies all items in conjunction rule r_i, the conjunction rule label y_{r_i} of the text is 1; if at least one item is not satisfied, y_{r_i} is 0.
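A sketch of this automatic annotation, reusing the hypothetical Item and ConjunctionRule classes from the DNF sketch above (the random rule-generation parameter k is illustrative):

```python
import random

def item_labels(tokens: set[str], items: list[Item]) -> list[int]:
    """y_e = 1 iff the text contains at least one keyword of the item."""
    return [1 if item.matches(tokens) else 0 for item in items]

def random_conjunction(items: list[Item], k: int = 2) -> ConjunctionRule:
    """Generate a conjunction rule by randomly combining k keyword sets."""
    return ConjunctionRule(random.sample(items, k))

def rule_label(tokens: set[str], rule: ConjunctionRule) -> int:
    """y_r = 1 iff the text satisfies every item of the conjunction rule."""
    return 1 if rule.matches(tokens) else 0
```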
According to step 2) of Embodiment 1, and with reference to FIG. 1, the item detection module and the conjunction rule detection module are pre-trained on the general corpus, specifically:
According to step 2-1), the item detection network UNet is used: the general corpus texts x and the general keyword sets are input to train the item detection module.
The input tokens are converted into the corresponding pre-trained word vectors at the embedding layer of the model. For a keyword set to be detected, its vector is the average of the pre-trained word vectors of all words in the set; because synonyms occupy adjacent positions in the embedding semantic space, the average vector presents their common semantic features. For sets of place names, on the other hand, the superordinate prefecture-level place name of the geographic region is used as a proxy word, because all place names under a prefecture entail that the event occurred in that region.
The item detection module outputs the probability p(e_j|x) corresponding to formula (4), with the true label corresponding to y_{e_j} above. The loss L_e of formula (5) is computed and back-propagated to update the parameters of the item detection network, training iteratively until the improvement of validation-set accuracy falls below a threshold.
According to step 2-2) of Embodiment 1, the conjunction rule detection module is added to the untrained UNet, and the two modules are trained with the general corpus, specifically:
x and the keyword sets are input; the output vectors t are obtained through UNet; the vectors t corresponding to all items contained in a conjunction rule are spliced and input to CNet to obtain the detection probability p(r_i|x) of the conjunction rule, corresponding to formula (7); all conjunction rules are detected in sequence.
The loss L_r is computed from p(r_i|x) and the label y_{r_i}, corresponding to formula (8), discarding the prediction part of the item detection module; the loss L_r is back-propagated to update the parameters of UNet and CNet.
The network is fine-tuned on user data, specifically as follows:
User requirements are acquired in the form of logic rules:
A subscribing user pays attention to emergencies in a specific area, including social security events, natural disasters and the like. The user's requirement is described in FIG. 2, where white nodes represent logical OR operations and black nodes represent logical AND operations. For a target text, evaluation starts at the leaf nodes and passes the Boolean predicate values up to the root node. Examples of the keyword sets of FIG. 2 are shown in Table 1.
TABLE 1 Keyword set examples of the subscribing user
The logic rules corresponding to the subscribing user's requirement decision tree are written according to step 1) of Embodiment 1; the propositional formula equivalent to the decision tree and its disjunctive normal form are shown in Table 2.
TABLE 2 Logic rules of the subscribing user
The semantic logic network is fine-tuned with user samples and rules:
The sample set comprises texts of historical interest to the user, i.e. texts judged by experts and pushed, which form the positive sample set with the corresponding label y = 1, and texts of no historical interest to the user, i.e. texts judged by experts not to be pushed, which form the negative sample set with the corresponding label y = 0.
The sample-set texts are preprocessed, including Chinese word segmentation, text truncation or padding, and conversion of the segmented words into the token input form. All keywords contained in the logic rules are likewise converted into the token input form.
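A sketch of this preprocessing, assuming the jieba segmenter and a fixed vocabulary (both are illustrative choices; the patent does not name a segmentation tool):

```python
import jieba  # a common Chinese word segmenter; an assumption, not named in the patent

def preprocess(text: str, vocab: dict[str, int], u: int, pad_id: int = 0) -> list[int]:
    """Segment, truncate or pad to length u, and map words to token ids."""
    words = jieba.lcut(text)[:u]                  # Chinese word segmentation + truncation
    ids = [vocab.get(w, pad_id) for w in words]   # token input form
    return ids + [pad_id] * (u - len(ids))        # padding
```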
The method for fine-tuning the semantic logic network with the sample set is specifically as follows:
According to step 2-1) of Embodiment 1, the item detection network UNet is fine-tuned with the user sample set and the items of the logic rules, training UNet on the user data by analogy with the pre-training process, iteratively until the improvement of validation-set accuracy falls below a threshold.
According to step 2-2), the networks UNet and CNet are fine-tuned with the user sample set and the conjunction rules, by analogy with the pre-training process, training iteratively until the improvement of validation-set accuracy falls below a threshold.
According to step 2-3), the disjunctive normal form detection module is added and DNet is trained using the user sample set, specifically as follows:

Input the text $x$ and the item vectors $k_j$; obtain through UNet and CNet the predicted probabilities $\hat{y}_{r_i}$ of all conjunction rules $r_i$. The MAX network takes the maximum of these probabilities as the probability that the inferred text meets the user requirement, as shown in formula (9). For example, if CNet outputs three predicted probabilities, 0.98, 0.73, and 0.43, the MAX network outputs 0.98, reflecting that the text satisfies the user requirement as soon as any one rule is satisfied.

Alternatively, if DNet is implemented as an MLP, all conjunction rule representation vectors $c_i$ are concatenated and input into DNet to obtain a representation vector $R$, and the prediction probability $\hat{y}_D$ is obtained from the vector $R$.

Compute the loss $L_R$ from the prediction $\hat{y}_D$ and the label $y$, as in equation (10), discarding the prediction parts of the item detection module and of the conjunction rule detection module; back-propagate the loss $L_R$ to update the parameters of the whole semantic logic network. Both DNet variants are sketched below.
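A hedged PyTorch sketch of the two DNet variants; the dimensions, module names, and probability values are illustrative assumptions:

```python
import torch
import torch.nn as nn

# MAX-network variant (formula (9)): the text-level probability is the
# maximum over the per-rule probabilities produced by CNet.
rule_probs = torch.tensor([0.98, 0.73, 0.43])
y_max = rule_probs.max()                      # -> 0.98, requirement satisfied

# MLP variant: concatenate the rule representation vectors c_i, map them
# to a representation vector R, then to a single prediction probability.
n_rules, c_dim = 3, 64                        # hypothetical dimensions
dnet = nn.Sequential(nn.Linear(n_rules * c_dim, c_dim),  # -> vector R
                     nn.ReLU(),
                     nn.Linear(c_dim, 1),
                     nn.Sigmoid())
c = torch.randn(n_rules, c_dim)               # rule representation vectors
y_mlp = dnet(c.flatten())                     # prediction probability
```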
Train the parallel network based on user data, specifically as follows:

According to embodiments 2 and 4, the parallel network structure is trained using the user sample set, specifically comprising:

Independently training the neural classification network: the neural classification network is fully trained using the user sample set, with the loss function of equation (12).

Jointly training the semantic logic network and the neural classification network: the trained semantic logic network and neural classification network are combined for fine-tuning, and a JS term is introduced into the joint loss to constrain the consistency of the prediction results on the two sides of the parallel structure. The joint loss is as in equation (15). At this point both branches of the parallel network predict the category of a text simultaneously: the output on the neural classification network side is $\hat{y}_N$, computed as shown in formula (11), and the output of the semantic logic network is $\hat{y}_D$, computed as shown in formula (9); the invention preferably adopts $\hat{y}_N$ as the final output. A sketch of the joint loss follows the example below.

For example, in this application, the prediction result for the input text "Bin is affected by 'Liqima' strong typhoon, …" is positive, and the text is judged to meet the user requirement; the prediction result for the input text "in the latest updated scenario, wing views with feather back to Qingzhou …" is negative, and the text is judged not to meet the user requirement.
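A minimal PyTorch sketch of the consistency-constrained joint loss of formulas (13) and (15): binary cross-entropy on both branches plus a Jensen-Shannon term between the two predicted distributions. The weights alpha, beta, gamma and all tensor values are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def js_divergence(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """JS(p||q) for Bernoulli outputs given as probabilities (formula (13))."""
    p2 = torch.stack([p, 1 - p], dim=-1)        # two-class distributions
    q2 = torch.stack([q, 1 - q], dim=-1)
    m = 0.5 * (p2 + q2)
    kl = lambda a, b: (a * (a / b).log()).sum(-1)
    return 0.5 * kl(p2, m) + 0.5 * kl(q2, m)

y = torch.tensor([1.0])                          # expert label
y_n = torch.tensor([0.92], requires_grad=True)   # neural classification output
y_d = torch.tensor([0.85], requires_grad=True)   # semantic logic output

alpha, beta, gamma = 0.4, 0.4, 0.2               # hypothetical weights, sum to 1
loss = (alpha * F.binary_cross_entropy(y_n, y)   # L_N of formula (12)
        + beta * F.binary_cross_entropy(y_d, y)  # L_R of formula (10)
        + gamma * js_divergence(y_n, y_d).mean())
loss.backward()   # in training, gradients would update both branches
```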

Claims (7)

1. A text inference method based on rule embedding, the method comprising:

1) converting a keyword logic expression describing a user requirement into an equivalent disjunctive normal form, wherein the user requirement is a propositional formula $P$, and the disjunctive normal form of $P$ is:

$$P = r_1 \vee r_2 \vee \cdots \vee r_n \quad (1)$$

in formula (1), $n$ indicates the number of conjunction rules, and $r_i$ is the $i$-th user rule; in the propositional formula $P$, the connectives are taken from the set $\{\wedge, \vee\}$; an item is a keyword set $p$ comprising keywords related to the described topic or semantics, together with their synonyms; according to the normal-form existence theorem, the propositional formula $P$ can always be converted into an equivalent disjunctive normal form; $r_i$ is a conjunction rule composed of keyword sets, i.e., $r_i = p_1 \wedge p_2 \wedge \cdots \wedge p_{m_i}$, where $m_i$ represents the number of items in the conjunction rule $r_i$; the set of all conjunction rules constituting the user requirement, i.e., the user rule set, is expressed as $R = \{r_1, r_2, \ldots, r_n\}$, where $n$ represents the number of conjunction rules; in this step, the English abbreviation of Disjunctive Normal Form is DNF; the disjunctive normal form has the flexibility to handle changes in user requirements, which can be adapted to efficiently by adding or deleting conjunction rules;

2) determining whether an input text satisfies the user rules:

using the semantic logic network, item detection, conjunction rule detection, and disjunctive normal form detection are performed in turn on the input text $x$, and whether the input text satisfies the user rules is finally judged.
2. The method of claim 1, further comprising a neural classification network disposed in parallel with the semantic logic network, the neural classification network configured to: perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e., a prediction result;

the input text is inferred through the neural classification network and the semantic logic network respectively, obtaining the prediction result of each; finally, the consistency of the two prediction results is constrained using the Jensen-Shannon divergence, abbreviated as the JS distance.
3. The method of claim 1, wherein the specific method of performing item detection, conjunction rule detection, and disjunctive normal form detection in turn on the input text $x$ is as follows:

2-1) Item detection

Item detection is used to determine whether the input text $x$ contains semantics related to an item $p_j$ of the disjunctive normal form;

the input is the input text $x$; the output is recorded as the detection result $\hat{y}_{p_j}$, representing the probability that the input text $x$ contains the item $p_j$;

the input text $x$ is converted into a matrix formed by the corresponding pre-trained word vectors, denoted $X \in \mathbb{R}^{u \times d}$, where $\mathbb{R}$ represents the real number field, $u$ is the truncation length of the input text $x$, $d$ is the length of the pre-trained word vectors, and $x_i$ is the vector of length $d$ corresponding to the word $w_i$;

the item $p_j$ is converted into vector form: the vector $k_j$ of item $p_j$ is the average of the pre-trained word vectors corresponding to all keywords in its keyword set, i.e., $k_j = \frac{1}{|p_j|}\sum_{w \in p_j} e_w$, where $w$ is a keyword in the set and $e_w$ is the pre-trained word vector corresponding to $w$;

the vector $k_j$ and the pre-trained word embedding matrix $X$ of the input text $x$ are combined by matrix multiplication to obtain the interaction vector, recorded as $g_j$:

$$g_j = X k_j \quad (2)$$

the input text $x$ is semantically encoded by the encoding network ENC to obtain the text semantic vector $s$; the text semantic vector $s$ and the interaction vector $g_j$ are concatenated and reduced in dimension through a multilayer perceptron network MLP to obtain the vector $t_j$, i.e., the containment relationship of the input text $x$ with respect to the item $p_j$:

$$t_j = \mathrm{MLP}([s; g_j]) \quad (3)$$

$t_j$ is passed through the sigmoid activation function, and the activation value serves as the detected probability that the input text $x$ contains the item $p_j$, i.e., the inference result, representing the degree to which the input text $x$ satisfies the semantics of the keyword set corresponding to the item $p_j$:

$$\hat{y}_{p_j} = \sigma(W_U t_j + b_U) \quad (4)$$

$\hat{y}_{p_j}$ is the probability, predicted by the semantic logic network, that the input text $x$ contains the item $p_j$; the vector $t_j$ also serves as the input of the conjunction rule module of the next stage; $\sigma$ represents the sigmoid activation function, and $W_U, b_U$ are network parameters;

a cross-entropy loss function is used to evaluate the difference between the distribution of the inference results $\hat{y}_{p_j}$ and the true results, i.e., the true labels $y_{p_j}$ of the items, to obtain the loss $L_U$:

$$L_U = -\mathbb{E}\Big[\textstyle\sum_{j=1}^{M}\big(y_{p_j}\log\hat{y}_{p_j} + (1-y_{p_j})\log(1-\hat{y}_{p_j})\big)\Big] + \lambda\lVert\theta_U\rVert_2 \quad (5)$$

wherein the true label $y_{p_j}$ of an item is obtained by string-matching detection between the text and the keywords with synonym expansion; $\mathbb{E}$ represents the expectation over the training set samples; $M$ is the number of keyword sets; the training process updates all parameters in the item detection network by minimizing the loss $L_U$; $\lVert\theta_U\rVert_2$ indicates that the parameters of the item detection network are regularized using the $L_2$ norm;
2-2) Conjunction rule detection

Conjunction rule detection is used to confirm whether the input text $x$ satisfies the semantics of a conjunction rule $r_i$;

the input is the item representation vectors $t_j$ of step 2-1); the output is the predicted probability that the input text contains the conjunction rule $r_i$;

conjunction rule embedding network CNet: the conjunction rule $r_i$ comprises a sequence of items $p_1, p_2, \ldots, p_{m_i}$, and the representation vectors obtained when these items are detected form the sequence $t_1, t_2, \ldots, t_{m_i}$; all vectors in the sequence are concatenated as input, and the representation vector $c_i$ of the conjunction rule is obtained through an MLP:

$$c_i = \mathrm{MLP}_C([t_1; t_2; \ldots; t_{m_i}]) \quad (6)$$

wherein $[t_1; t_2; \ldots; t_{m_i}]$ represents the sequence of all items of $r_i$;

$c_i$ is passed through the sigmoid activation function to obtain the detection probability of the conjunction rule, as shown in formula (7), where $\sigma$ represents the sigmoid activation function, $W_C, b_C$ are network parameters, and $\hat{y}_{r_i}$ is the probability that the input text contains the conjunction rule $r_i$, i.e., the inference result:

$$\hat{y}_{r_i} = \sigma(W_C c_i + b_C) \quad (7)$$

a cross-entropy loss function is used to measure the difference between the prediction results $\hat{y}_{r_i}$ and the true results, i.e., the true labels $y_{r_i}$ of the rules, giving the loss $L_C$, wherein the true label $y_{r_i}$ of a rule is obtained by the conjunction of the Boolean values of the labels of its related items; $\mathbb{E}$ represents the expectation over the training set samples; the training process updates all parameters in UNet and the conjunction rule detection module by minimizing the loss $L_C$; $\lVert\theta_C\rVert_2$ indicates that all parameters in UNet and the conjunction rule detection module are regularized using the $L_2$ norm:

$$L_C = -\mathbb{E}\Big[\textstyle\sum_{i=1}^{n}\big(y_{r_i}\log\hat{y}_{r_i} + (1-y_{r_i})\log(1-\hat{y}_{r_i})\big)\Big] + \lambda\lVert\theta_C\rVert_2 \quad (8)$$
2-3) Disjunctive normal form detection

Disjunctive normal form detection is used to confirm whether the input text $x$ satisfies the complete user rule set;

the input is the conjunction rule representation vector of step 2-2) together with the representation vectors of the other associated conjunction rules; the output is the predicted probability that the input text satisfies the user rule set;

a max function is used to implement the disjunction network DNet: the maximum probability among the inference results of step 2-2) is used as the text inference result, where $\hat{y}_D$ is the predicted probability that the input text satisfies the user rule set, $\max$ represents the function taking the maximum probability, and $\hat{y}_{r_i}$ represents the inference results output by the conjunction rule detection module, as follows:

$$\hat{y}_D = \max\big(\hat{y}_{r_1}, \hat{y}_{r_2}, \ldots, \hat{y}_{r_n}\big) \quad (9)$$

a cross-entropy loss function is used to compute the loss $L_R$, as shown in formula (10), where $y$ is the true label of the input text, marked by experts according to whether the text meets the user requirement; $\mathbb{E}$ represents the expectation over the training set samples; the training process updates all parameters of the semantic logic network by minimizing the loss $L_R$; $\lVert\theta\rVert_2$ indicates that the parameters of the semantic logic network are regularized using the $L_2$ norm:

$$L_R = -\mathbb{E}\big[y\log\hat{y}_D + (1-y)\log(1-\hat{y}_D)\big] + \lambda\lVert\theta\rVert_2 \quad (10)$$
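A hedged PyTorch sketch of the item detection pipeline of claim 3 (formulas (2)-(4)); the mean-pooling stand-in for ENC and all dimensions are illustrative assumptions, not the claimed implementation:

```python
import torch
import torch.nn as nn

u, d, h = 128, 50, 32                    # truncation length, vec length, t dim
X = torch.randn(u, d)                    # pre-trained word-vector matrix of x
k = torch.randn(d)                       # item vector: mean of keyword vectors

g = X @ k                                # formula (2): interaction vector in R^u
s = X.mean(dim=0)                        # ENC stand-in: text semantic vector in R^d

mlp = nn.Linear(u + d, h)                # dimension-reducing MLP
t = mlp(torch.cat([s, g]))               # formula (3): item representation t
head = nn.Linear(h, 1)                   # parameters W_U, b_U of formula (4)
y_p = torch.sigmoid(head(t))             # formula (4): P(x contains item p)
```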
4. The method of claim 2, wherein the processing method of the neural classification network comprises:

constructing a semantic vector of the input text through a text encoding module, the text encoding network used therein being ENC2; after the semantic representation vector of the input text is obtained through the text encoding module, category prediction is performed based on the semantic representation vector, as shown in formula (11), where $\hat{y}_N$ represents the probability, predicted by the neural classification network, that the input text meets the user requirement, i.e., the output text-level label; $\sigma$ represents the sigmoid activation function, and $W_N, b_N$ are network parameters:

$$\hat{y}_N = \sigma\big(W_N \,\mathrm{ENC2}(x) + b_N\big) \quad (11)$$

a cross-entropy loss function is used to measure the difference between the prediction result $\hat{y}_N$ of the neural classification network and the true result, i.e., the true label $y$ of the input text, shown in formula (12) as the loss $L_N$; all parameters of the neural classification network are updated by minimizing the loss $L_N$, where $y$ is the true label of the input text, marked by experts according to whether the text meets the user requirement; $\mathbb{E}$ represents the expectation over the training set samples; $\lVert\theta_N\rVert_2$ indicates that all parameters of the neural classification network are regularized using the $L_2$ norm:

$$L_N = -\mathbb{E}\big[y\log\hat{y}_N + (1-y)\log(1-\hat{y}_N)\big] + \lambda\lVert\theta_N\rVert_2 \quad (12)$$

the input text is inferred through the neural classification network and the semantic logic network respectively to obtain the prediction result of each, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence, abbreviated as the JS distance.
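A minimal sketch of the neural classification branch of claim 4 (formulas (11)-(12)), with mean pooling standing in for ENC2; all names and dimensions are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralClassifier(nn.Module):
    def __init__(self, d: int = 50):
        super().__init__()
        self.head = nn.Linear(d, 1)          # W_N, b_N of formula (11)

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        s = X.mean(dim=0)                    # ENC2 stand-in: encode text x
        return torch.sigmoid(self.head(s))   # formula (11): P(x meets need)

clf = NeuralClassifier()
X = torch.randn(128, 50)                     # word-vector matrix of one text
y_hat = clf(X)
loss = F.binary_cross_entropy(y_hat, torch.tensor([1.0]))  # formula (12)
```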
5. The method of claim 4, wherein, in the text inference based on rule embedding,

the JS distance is adopted to measure the similarity between the prediction result distributions of the neural classification network and the semantic logic network; the probability distribution output by the neural classification network is recorded as $P_N$, and the probability distribution output by the semantic logic network as $P_S$; the JS distance $\mathrm{JS}(P_N \Vert P_S)$ between them is computed as:

$$\mathrm{JS}(P_N \Vert P_S) = \tfrac{1}{2}\,\mathrm{KL}\Big(P_N \,\Big\Vert\, \tfrac{P_N + P_S}{2}\Big) + \tfrac{1}{2}\,\mathrm{KL}\Big(P_S \,\Big\Vert\, \tfrac{P_N + P_S}{2}\Big) \quad (13)$$

the JS distance is taken as a regularization term in the joint loss, and the joint loss $L$ is computed as in equation (15), where the hyper-parameters $\alpha, \beta, \gamma$ weigh the different loss terms, each taking values in the range (0, 1) and satisfying the constraint $\alpha + \beta + \gamma = 1$; $L_N$ is the loss function shown in equation (12), and $L_R$ is the loss function shown in equation (10):

$$L = \alpha L_N + \beta L_R + \gamma\,\mathrm{JS}(P_N \Vert P_S) \quad (15)$$

all parameters of the neural classification network and the semantic logic network are updated by minimizing the joint loss $L$.
6. An apparatus for implementing the text inference method of any one of claims 1-5, comprising: a semantic logic network module;

the semantic logic network module is configured to determine whether an input text satisfies the user rules; the semantic logic network module comprises an item detection module, a conjunction rule detection module, and a disjunctive normal form detection module arranged in sequence along the direction of data flow.
7. The apparatus of claim 6, further comprising a neural classification network module disposed in parallel with the semantic logic network module;

the neural classification network module is configured to perform category prediction on the input text to obtain the probability that the input text meets the user requirement, i.e., a prediction result;

the input text is inferred through the neural classification network and the semantic logic network respectively, obtaining the prediction result of each, and finally the consistency of the two prediction results is constrained using the Jensen-Shannon divergence.
CN202110984877.9A 2021-08-26 2021-08-26 Text inference method and device based on rule embedding Active CN113435212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110984877.9A CN113435212B (en) 2021-08-26 2021-08-26 Text inference method and device based on rule embedding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110984877.9A CN113435212B (en) 2021-08-26 2021-08-26 Text inference method and device based on rule embedding

Publications (2)

Publication Number Publication Date
CN113435212A true CN113435212A (en) 2021-09-24
CN113435212B CN113435212B (en) 2021-11-16

Family

ID=77797888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110984877.9A Active CN113435212B (en) 2021-08-26 2021-08-26 Text inference method and device based on rule embedding

Country Status (1)

Country Link
CN (1) CN113435212B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708096A (en) * 2012-05-29 2012-10-03 代松 Network intelligence public sentiment monitoring system based on semantics and work method thereof
CN103605729A (en) * 2013-11-19 2014-02-26 段炼 POI (point of interest) Chinese text categorizing method based on local random word density model
CN103699663A (en) * 2013-12-27 2014-04-02 中国科学院自动化研究所 Hot event mining method based on large-scale knowledge base
US10621499B1 (en) * 2015-08-03 2020-04-14 Marca Research & Development International, Llc Systems and methods for semantic understanding of digital information
CN110069623A (en) * 2017-12-06 2019-07-30 腾讯科技(深圳)有限公司 Summary texts generation method, device, storage medium and computer equipment
CN109840322A (en) * 2018-11-08 2019-06-04 中山大学 It is a kind of based on intensified learning cloze test type reading understand analysis model and method
CN110321432A (en) * 2019-06-24 2019-10-11 拓尔思信息技术股份有限公司 Textual event information extracting method, electronic device and non-volatile memory medium
CN113268565A (en) * 2021-04-27 2021-08-17 山东大学 Method and device for quickly generating word vector based on concept text

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LI ZHAOHUI; BAI XIAOCHEN; HU RUI; LI XIAOLI: "Measuring Phase-Amplitude Coupling Based on the Jensen-Shannon Divergence and Correlation Matrix", IEEE Transactions on Neural Systems and Rehabilitation Engineering *
LIU YUN: "Attribute Inference Oriented to User Comment Behavior in Social Media", China Master's Theses Full-text Database, Information Science and Technology *
CHEN LIANGJUN; HONG YU; SUJITH MANGALATHU; GOU HONGYE; PU QIANHUI: "Efficient Reliability Analysis with an Adaptive Sampling Method Based on Jensen-Shannon Divergence", Journal of Central South University *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003726A (en) * 2021-12-31 2022-02-01 山东大学 Subspace embedding-based academic thesis difference analysis method

Also Published As

Publication number Publication date
CN113435212B (en) 2021-11-16

Similar Documents

Publication Publication Date Title
US11182562B2 (en) Deep embedding for natural language content based on semantic dependencies
Marivate et al. Improving short text classification through global augmentation methods
US11281976B2 (en) Generative adversarial network based modeling of text for natural language processing
US11481416B2 (en) Question Answering using trained generative adversarial network based modeling of text
US10657259B2 (en) Protecting cognitive systems from gradient based attacks through the use of deceiving gradients
Mahmood et al. Deep sentiments in roman urdu text using recurrent convolutional neural network model
Ezaldeen et al. A hybrid E-learning recommendation integrating adaptive profiling and sentiment analysis
Suissa et al. Text analysis using deep neural networks in digital humanities and information science
US11663518B2 (en) Cognitive system virtual corpus training and utilization
Rauf et al. Using bert for checking the polarity of movie reviews
CN110781666B (en) Natural language processing text modeling based on generative antagonism network
Essa et al. Fake news detection based on a hybrid BERT and LightGBM models
Jiang et al. A hierarchical model with recurrent convolutional neural networks for sequential sentence classification
Suresh Kumar et al. Local search five‐element cycle optimized reLU‐BiLSTM for multilingual aspect‐based text classification
Patil et al. Hate speech detection using deep learning and text analysis
CN113435212B (en) Text inference method and device based on rule embedding
CN116956228A (en) Text mining method for technical transaction platform
Neill et al. Meta-embedding as auxiliary task regularization
Nazarizadeh et al. Using Group Deep Learning and Data Augmentation in Persian Sentiment Analysis
Kandi Language Modelling for Handling Out-of-Vocabulary Words in Natural Language Processing
Lou Deep learning-based sentiment analysis of movie reviews
Jawale et al. Sentiment analysis and vector embedding: A comparative study
Ait Benali et al. Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism
Baruah et al. Detection of Hate Speech in Assamese Text
Han Emotion Analysis of Literary Works Based on Attentional Mechanisms and the Fusion of Two-Channel Features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant