CN112001166B - Intelligent question-answer sentence semantic matching method and device for government affair consultation service - Google Patents

Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Info

Publication number: CN112001166B
Application number: CN202010855426.0A
Authority: CN (China)
Prior art keywords: word, sentence, layer, matching, semantic
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN112001166A
Inventors: 鹿文鹏, 于瑞
Current assignee: Qilu University of Technology
Original assignee: Qilu University of Technology
Application filed by Qilu University of Technology
Priority to CN202010855426.0A
Publication of CN112001166A
Application granted
Publication of CN112001166B

Classifications

    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3344 Query execution using natural language analysis
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/30 Semantic analysis
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent question-answer sentence semantic matching method and device for government affair consultation service, and belongs to the technical field of artificial intelligence and natural language processing. The invention aims to solve the technical problem of capturing deeper semantic context features and inter-sentence interaction information while reducing the loss of semantic information, so as to realize intelligent semantic matching of sentence pairs. The technical scheme is as follows: a sentence pair semantic matching model consisting of a multi-granularity embedding module, a gated deep feature residual fusion network module, a semantic feature interaction matching module and a label prediction module is constructed and trained; the model realizes a gated deep feature residual fusion representation of sentence information, generates the final matching tensor of the sentence pair through an attention mechanism and a gating mechanism, and judges the matching degree of the sentence pair, thereby achieving the goal of intelligent semantic matching of sentence pairs. The device comprises a sentence pair semantic matching knowledge base construction unit, a training data set generation unit, a sentence pair semantic matching model construction unit and a sentence pair semantic matching model training unit.

Description

Intelligent question-answer sentence semantic matching method and device for government affair consultation service
Technical Field
The invention relates to the technical field of artificial intelligence and natural language processing, and in particular to an intelligent question-answer sentence semantic matching method and device for government affair consultation service.
Background
When handling business in daily life, people need to understand the relevant policies and documents issued by the government. To better serve the public, governments at all levels set up service institutions for government affair consultation. As the demand for government consultation grows, the workload of these institutions keeps increasing, yet most consultations are repeated ones. For such repeated consultations, a historical consultation library can be mined so that the answers are resolved automatically. Intelligent question-answering systems have unique advantages in addressing this difficulty. As one of the core technologies of human-computer interaction, an intelligent question-answering system can automatically find, for a question posed by a user, the matching standard question in a question-answer knowledge base and push the answer of that standard question to the user, which greatly reduces the burden of manual answering. Intelligent question-answering systems are widely applied in fields such as self-service and intelligent customer service. For the great variety of questions posed by users, finding the matching standard question is the core technology of an intelligent question-answering system. Its essence is to measure the degree of matching between the question posed by the user and the standard questions in the question-answer knowledge base, which is in essence a sentence pair semantic matching task.
The sentence pair semantic matching task aims to measure whether the semantics of two sentences are consistent, which coincides with the core objective of many natural language processing tasks, including the intelligent question-answering system for government consultation services described above. Computing the semantic matching degree of natural language sentences is very challenging, and existing methods cannot solve it perfectly.
When matching the semantics of sentence pairs, existing methods generally design a specific neural network to encode the sentences and extract the corresponding semantic features. For text semantic encoding, the most widely used models are the recurrent neural network and its variants. With its chain structure, a recurrent neural network can capture long-distance semantic features well and therefore has strong advantages for processing text data. However, the representation capability of a single-layer network is limited; to improve the representation capability of the network and extract deep, rich semantic features, the depth of the network is usually increased. Yet not all of the encoding results of each layer in a deep network are effective information, which leads to two problems: first, if only the output of the last layer is taken as the encoding result, semantic information is inevitably lost; second, if the outputs of all layers are simply combined, e.g., concatenated or added directly, the resulting encoding inevitably contains a large amount of noise. Therefore, existing deep network structures still have these non-negligible drawbacks for text semantic encoding.
Disclosure of Invention
The technical task of the invention is to provide an intelligent question-answer sentence semantic matching method and device for government affair consultation service, which give full play to the advantages of the gating mechanism and the residual mechanism, capture more semantic context information and inter-sentence interaction information, and finally achieve intelligent semantic matching of sentence pairs through an attention mechanism and a gating mechanism.
The technical task of the invention is realized as follows. An intelligent question-answer sentence pair semantic matching method for government affair consultation service constructs and trains a sentence pair semantic matching model consisting of a multi-granularity embedding module, a gated deep feature residual fusion network module, a semantic feature interaction matching module and a label prediction module, realizes a gated deep feature residual fusion representation of sentence information, generates the final matching tensor of the sentence pair through an attention mechanism and a gating mechanism, and judges the matching degree of the sentence pair, so as to achieve the goal of intelligent semantic matching of sentence pairs. The method comprises the following steps:
the multi-granularity embedding module performs embedding operations on the input sentences at character granularity and at word granularity respectively, obtaining a multi-granularity embedded representation of each sentence;
the gated deep feature residual fusion network module performs encoding operations on the multi-granularity embedded representations of the sentences, obtaining the gated deep feature residual fusion representation of each sentence;
the semantic feature interaction matching module performs feature matching and feature screening operations on the gated deep feature residual fusion representations of the sentence pair, obtaining the matching tensor of the sentence pair;
the label prediction module maps the matching tensor of the sentence pair to a floating-point value on a designated interval, compares this value, taken as the matching degree, with a preset threshold, and judges whether the semantics of the sentence pair match according to the comparison result.
Preferably, the construction of the multi-granularity embedding module comprises constructing a character mapping conversion table, constructing a word mapping conversion table, constructing an input module, constructing a character vector mapping layer and constructing a word vector mapping layer;
wherein, constructing the character mapping conversion table: the mapping rule is: starting with the number 1, the characters are sorted in ascending order according to the order in which each character is recorded into the character table, thereby forming the character mapping conversion table required by the invention; the character table is built from the sentence pair semantic matching word-breaking processing knowledge base, which is obtained by performing a word-breaking (character-level segmentation) operation on the original data text of the semantic matching knowledge base; then, a Word2Vec character vector model is trained to obtain the character vector matrix of each character;
constructing the word mapping conversion table: the mapping rule is: starting with the number 1, the words are sorted in ascending order according to the order in which each word is recorded into the word table, thereby forming the word mapping conversion table required by the invention; the word table is built from the sentence pair semantic matching word-segmentation processing knowledge base, which is obtained by performing a word segmentation operation on the original data text of the semantic matching knowledge base; then, a Word2Vec word vector model is trained to obtain the word vector matrix of each word; in the invention, the sentence pair semantic matching word-breaking processing knowledge base and the sentence pair semantic matching word-segmentation processing knowledge base are collectively called the sentence pair semantic matching knowledge base;
constructing the input module: the input layer comprises four inputs; each sentence pair in the training data set, or each sentence pair to be predicted, is preprocessed by word breaking and word segmentation to obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, where the suffixes char and word indicate that the corresponding sentence has been processed by word breaking or by word segmentation, and they are formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word); each character or word in the input sentences is converted into the corresponding digital identifier according to the character mapping conversion table and the word mapping conversion table;
constructing the character vector mapping layer: the weights of the character vector matrix trained in the step of constructing the character mapping conversion table are loaded to initialize the weight parameters of the current layer; for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; every sentence in the sentence pair semantic matching word-breaking processing knowledge base can be converted into vector form through character vector mapping.
constructing the word vector mapping layer: the weights of the word vector matrix trained in the step of constructing the word mapping conversion table are loaded to initialize the weight parameters of the current layer; for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained; every sentence in the sentence pair semantic matching word-segmentation processing knowledge base can be converted into vector form through word vector mapping.
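As an illustration only (not the exact implementation of the invention), the character vector mapping layer and the word vector mapping layer might be realized in Keras as follows; max_len is an assumed shared padded sequence length, and the two matrices are the Word2Vec matrices trained in the steps above:

from keras.layers import Input, Embedding

# Four inputs of the model; max_len (a shared padded length for the character and
# word sequences) is an assumption made for this sketch.
sentence1_char = Input(shape=(max_len,), dtype='int32')
sentence2_char = Input(shape=(max_len,), dtype='int32')
sentence1_word = Input(shape=(max_len,), dtype='int32')
sentence2_word = Input(shape=(max_len,), dtype='int32')

# Embedding layers initialized with the pretrained Word2Vec matrices
char_embedding = Embedding(char_embedding_matrix.shape[0], char_embedding_matrix.shape[1],
                           weights=[char_embedding_matrix])
word_embedding = Embedding(word_embedding_matrix.shape[0], word_embedding_matrix.shape[1],
                           weights=[word_embedding_matrix])

sentence1_char_embed = char_embedding(sentence1_char)
sentence2_char_embed = char_embedding(sentence2_char)
sentence1_word_embed = word_embedding(sentence1_word)
sentence2_word_embed = word_embedding(sentence2_word)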
More preferably, the construction process of the gated deep feature residual fusion network module specifically comprises the following steps:
firstly, the character embedding representation and the word embedding representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain a gated embedded fusion representation, with the formulas as follows:

gate_emb = σ(W_c·E_c + W_w·E_w)    (1.1)
E_g = gate_emb ⊙ E_c + (1 − gate_emb) ⊙ E_w    (1.2)

wherein formula (1.1) represents constructing the embedded representation information selection gate: W_c and W_w represent weight matrices to be trained, E_c represents sentence1_char_embed or sentence2_char_embed, E_w represents sentence1_word_embed or sentence2_word_embed, σ represents the sigmoid function, and gate_emb represents the embedded representation information selection gate; formula (1.2) represents selectively fusing the character embedding representation and the word embedding representation through the embedded representation information selection gate, where ⊙ represents element-wise multiplication and E_g represents the gated embedded fusion representation.
Further, the first-layer encoding structure BiLSTM_1 performs encoding operations on the character embedding representation and the word embedding representation to obtain the preliminary first-layer character encoding result and first-layer word encoding result; the first-layer character encoding result and the first-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedded fusion representation through a further gate, obtaining the gated first-layer feature residual fusion representation, with the formulas as follows:

Hc1_i = BiLSTM_1(E_c, i_c)    (2.1)
Hw1_i = BiLSTM_1(E_w, i_w)    (2.2)
gate_1* = σ(W1_c·Hc1 + W1_w·Hw1)    (2.3)
F1 = gate_1* ⊙ Hc1 + (1 − gate_1*) ⊙ Hw1    (2.4)
gate_1 = σ(U1·E_g + V1·F1)    (2.5)
R1 = gate_1 ⊙ E_g + (1 − gate_1) ⊙ F1    (2.6)

wherein formula (2.1) represents encoding the character embedding representation with BiLSTM_1: E_c represents sentence1_char_embed or sentence2_char_embed, i_c denotes the relative position of the i-th character vector in the sentence, and Hc1 represents the first-layer character encoding result; formula (2.2) represents encoding the word embedding representation with BiLSTM_1: E_w represents sentence1_word_embed or sentence2_word_embed, i_w denotes the relative position of the i-th word vector in the sentence, and Hw1 represents the first-layer word encoding result; formula (2.3) represents constructing the first-layer encoding result selection gate, where W1_c and W1_w represent weight matrices to be trained, σ represents the sigmoid function, and gate_1* represents the first-layer encoding result selection gate; formula (2.4) represents selectively fusing the first-layer character encoding result and the first-layer word encoding result through the first-layer encoding result selection gate, where ⊙ represents element-wise multiplication and F1 represents the first-layer gated encoding result fusion representation; formula (2.5) represents constructing the first-layer feature residual selection gate, where U1 and V1 represent weight matrices to be trained, E_g is the gated embedded fusion representation output by formula (1.2), σ represents the sigmoid function, and gate_1 represents the first-layer feature residual selection gate; formula (2.6) represents selectively fusing the gated embedded fusion representation and the first-layer gated encoding result fusion representation through the first-layer feature residual selection gate, where R1 represents the gated first-layer feature residual fusion representation.
Further, the first-layer character encoding result and the first-layer word encoding result are passed to the second-layer encoding structure BiLSTM_2; BiLSTM_2 performs encoding operations on the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result and the second-layer word encoding result; the second-layer character encoding result and the second-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through a further gate, obtaining the gated second-layer feature residual fusion representation, with the formulas as follows:

Hc2_i = BiLSTM_2(Hc1, i_c)    (3.1)
Hw2_i = BiLSTM_2(Hw1, i_w)    (3.2)
gate_2* = σ(W2_c·Hc2 + W2_w·Hw2)    (3.3)
F2 = gate_2* ⊙ Hc2 + (1 − gate_2*) ⊙ Hw2    (3.4)
gate_2 = σ(U2·R1 + V2·F2)    (3.5)
R2 = gate_2 ⊙ R1 + (1 − gate_2) ⊙ F2    (3.6)

wherein formula (3.1) represents encoding the first-layer character encoding result with BiLSTM_2: Hc1 represents the first-layer character encoding result, i_c represents the i-th time step, and Hc2 represents the second-layer character encoding result; formula (3.2) represents encoding the first-layer word encoding result with BiLSTM_2: Hw1 represents the first-layer word encoding result, i_w represents the i-th time step, and Hw2 represents the second-layer word encoding result; formula (3.3) represents constructing the second-layer encoding result selection gate, where W2_c and W2_w represent weight matrices to be trained, σ represents the sigmoid function, and gate_2* represents the second-layer encoding result selection gate; formula (3.4) represents selectively fusing the second-layer character encoding result and the second-layer word encoding result through the second-layer encoding result selection gate, where ⊙ represents element-wise multiplication and F2 represents the second-layer gated encoding result fusion representation; formula (3.5) represents constructing the second-layer feature residual selection gate, where U2 and V2 represent weight matrices to be trained, R1 is the gated first-layer feature residual fusion representation output by formula (2.6), σ represents the sigmoid function, and gate_2 represents the second-layer feature residual selection gate; formula (3.6) represents selectively fusing the gated first-layer feature residual fusion representation and the second-layer gated encoding result fusion representation through the second-layer feature residual selection gate, where R2 represents the gated second-layer feature residual fusion representation.
Further, the second-layer character encoding result and the second-layer word encoding result are passed to the third-layer encoding structure BiLSTM_3; by analogy, repeating the encoding multiple times generates multi-level gated feature residual fusion representations, and the final gated deep feature residual fusion representation is generated according to the preset hierarchical depth of the model. For the depth-th layer, the formulas are as follows:

Hc_depth_i = BiLSTM_depth(Hc_(depth−1), i_c)    (4.1)
Hw_depth_i = BiLSTM_depth(Hw_(depth−1), i_w)    (4.2)
gate_depth* = σ(Wdepth_c·Hc_depth + Wdepth_w·Hw_depth)    (4.3)
F_depth = gate_depth* ⊙ Hc_depth + (1 − gate_depth*) ⊙ Hw_depth    (4.4)
gate_depth = σ(U_depth·R_(depth−1) + V_depth·F_depth)    (4.5)
R_depth = gate_depth ⊙ R_(depth−1) + (1 − gate_depth) ⊙ F_depth    (4.6)

wherein formula (4.1) represents encoding the (depth−1)-layer character encoding result with BiLSTM_depth: Hc_(depth−1) represents the (depth−1)-layer character encoding result, i_c represents the i-th time step, and Hc_depth represents the depth-layer character encoding result; formula (4.2) represents encoding the (depth−1)-layer word encoding result with BiLSTM_depth: Hw_(depth−1) represents the (depth−1)-layer word encoding result, i_w represents the i-th time step, and Hw_depth represents the depth-layer word encoding result; formula (4.3) represents constructing the depth-layer encoding result selection gate, where Wdepth_c and Wdepth_w represent weight matrices to be trained, σ represents the sigmoid function, and gate_depth* represents the depth-layer encoding result selection gate; formula (4.4) represents selectively fusing the depth-layer character encoding result and the depth-layer word encoding result through the depth-layer encoding result selection gate, where ⊙ represents element-wise multiplication and F_depth represents the depth-layer gated encoding result fusion representation; formula (4.5) represents constructing the depth-layer feature residual selection gate, where U_depth and V_depth represent weight matrices to be trained, R_(depth−1) represents the gated (depth−1)-layer feature residual fusion representation, σ represents the sigmoid function, and gate_depth represents the depth-layer feature residual selection gate; formula (4.6) represents selectively fusing the gated (depth−1)-layer feature residual fusion representation and the depth-layer gated encoding result fusion representation through the depth-layer feature residual selection gate, where R_depth represents the gated depth-layer feature residual fusion representation, i.e., the gated deep feature residual fusion representation.
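For illustration, a minimal Keras-style sketch of one round of the gated fusion pattern described above (a sigmoid gate over two representations followed by selective fusion) is given below; it continues the sketch after the vector mapping layers, assumes the character and word streams are padded to the same length and share dimension dim (e.g. dim = 400), builds only the sentence1 branch, and is not the exact implementation of the invention:

from keras.layers import Bidirectional, LSTM, Dense, Lambda, Concatenate

def gated_fusion(x, y, dim):
    # gate = sigmoid(W1·x + W2·y); a Dense layer over the concatenation [x; y] is equivalent.
    gate = Dense(dim, activation='sigmoid')(Concatenate(axis=-1)([x, y]))
    # fused = gate ⊙ x + (1 - gate) ⊙ y
    return Lambda(lambda t: t[0] * t[1] + (1.0 - t[0]) * t[2])([gate, x, y])

# Gated embedded fusion representation, formulas (1.1)-(1.2)
embed_fusion = gated_fusion(sentence1_char_embed, sentence1_word_embed, dim)

# First-layer encoding structure: the same BiLSTM encodes the character and word streams
bilstm_1 = Bidirectional(LSTM(dim // 2, return_sequences=True))
char_enc_1 = bilstm_1(sentence1_char_embed)    # first-layer character encoding result
word_enc_1 = bilstm_1(sentence1_word_embed)    # first-layer word encoding result

fused_1 = gated_fusion(char_enc_1, word_enc_1, dim)     # formulas (2.3)-(2.4)
residual_1 = gated_fusion(embed_fusion, fused_1, dim)   # formulas (2.5)-(2.6)
# Deeper layers repeat the same pattern, feeding char_enc_1 / word_enc_1 into BiLSTM_2, and so on.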
More preferably, the construction process of the semantic feature interaction matching module specifically comprises the following steps:
the layer receives the gated deep feature residual type fusion representation output by the gated deep feature residual type fusion network module as input, performs semantic feature matching and semantic feature screening operation on the gated deep feature residual type fusion representation in three steps, and accordingly generates a final sentence pair semantic matching tensor, and the specific operation is as follows:
Firstly, an attention mechanism is applied to complete the interactive matching process between the sentence pair, obtaining a preliminary sentence matching tensor. Taking sentence1 matching against sentence2 as an example, the formulas are as follows:

p̃_i = W_p ⊙ p_i,  q̃_j = W_q ⊙ q_j    (5.1)
a_ij = exp(p̃_i·q̃_j) / Σ_k exp(p̃_i·q̃_k)    (5.2)
m1_i = Σ_j a_ij·q_j    (5.3)

wherein formula (5.1) represents mapping the gated deep feature residual fusion representations of the two sentences: p_i represents the i-th component of the gated deep feature residual fusion representation of sentence1, q_j represents the j-th component of the gated deep feature residual fusion representation of sentence2, W_p and W_q represent weights to be trained, and ⊙ represents element-wise multiplication; formula (5.2) represents the calculated attention weight; formula (5.3) represents completing the interactive matching process with the attention weights, where m1 represents the result of matching sentence2 against sentence1, i.e., the preliminary matching tensor of sentence1. Similarly, matching sentence1 against sentence2 yields the analogous preliminary matching tensor m2.
Secondly, a gating mechanism is used to perform a feature screening operation on the preliminary matching tensor, obtaining the sentence matching tensor, with the formulas as follows:

gate_m = σ(W_m·m1 + b_m)    (6.5)
M1 = gate_m ⊙ m1    (6.6)

wherein formula (6.5) represents constructing the matching tensor gate: m1 represents the result of matching sentence2 against sentence1, and W_m and b_m are the parameters to be trained; formula (6.6) represents performing feature screening on the matching tensor with the matching tensor gate, where ⊙ represents element-wise multiplication and M1 represents the sentence1 matching tensor. Similarly, processing the result of matching sentence1 against sentence2 yields the sentence2 matching tensor M2.
Thirdly, the two sentence matching tensors are concatenated to obtain the sentence pair matching tensor, with the formula as follows:

M = [M1 ; M2]

wherein M represents the sentence pair matching tensor and [· ; ·] denotes concatenation.
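A hedged sketch of the interaction matching described above follows; the soft-alignment and gate forms follow standard attention practice and are assumptions where the formulas leave details open, and residual_p / residual_q stand for the gated deep feature residual fusion representations of the two sentences (both assumed of shape (batch, seq_len, dim)):

from keras.layers import Dense, Lambda, Concatenate
import keras.backend as K

def interaction_matching(p, q, dim):
    # attention scores between every position of p and every position of q
    scores = Lambda(lambda t: K.batch_dot(t[0], t[1], axes=[2, 2]))([p, q])
    # soft alignment: the result of matching q against p (preliminary matching tensor)
    p_matched = Lambda(lambda t: K.batch_dot(K.softmax(t[0], axis=-1), t[1]))([scores, q])
    # matching tensor gate used for feature screening
    gate = Dense(dim, activation='sigmoid')(p_matched)
    return Lambda(lambda t: t[0] * t[1])([gate, p_matched])

match_1 = interaction_matching(residual_p, residual_q, dim)      # sentence1 matching tensor
match_2 = interaction_matching(residual_q, residual_p, dim)      # sentence2 matching tensor
sentence_pair_match = Concatenate(axis=-1)([match_1, match_2])   # sentence pair matching tensor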
More preferably, the label prediction module is constructed as follows:
the sentence pair semantic matching tensor is taken as the input of this module and processed by a fully connected layer with dimension 1 and a sigmoid activation function, obtaining a matching degree value in [0,1], denoted y_pred; finally, y_pred is compared with the set threshold (0.5) to judge whether the semantics of the sentence pair match: if y_pred is not less than 0.5, the semantics of the sentence pair are predicted to match, otherwise they do not match. If the sentence pair semantic matching model has not yet been sufficiently trained, it needs to be trained on the training data set to optimize the model parameters; once trained, the label prediction module can predict whether the semantics of a target sentence pair match.
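Continuing the sketches above (and assuming the branches for both sentences have been built symmetrically), the label prediction module could look as follows; pooling the matching tensor into a vector is an assumption made here for illustration, as the text only specifies the one-unit fully connected layer with sigmoid activation:

from keras.layers import Dense, GlobalMaxPooling1D
from keras.models import Model

pooled = GlobalMaxPooling1D()(sentence_pair_match)          # reduce the matching tensor to a vector (assumption)
y_pred = Dense(1, activation='sigmoid')(pooled)             # matching degree value in [0, 1]
model = Model(inputs=[sentence1_char, sentence2_char, sentence1_word, sentence2_word], outputs=y_pred)
# At prediction time the sentence pair is judged to match when y_pred >= 0.5.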
More preferably, the sentence-to-semantic matching knowledge base construction is specifically as follows:
downloading a data set from the network to obtain original data: a publicly available sentence pair semantic matching data set, or a manually constructed data set, is downloaded from the network and used as the original data for building the sentence pair semantic matching knowledge base;
preprocessing raw data: preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word breaking operation and word segmentation operation on each sentence to obtain a sentence-to-semantic matching word breaking processing knowledge base and a word segmentation processing knowledge base;
summarizing a sub-knowledge base: summarizing a sentence-to-semantic matching word breaking processing knowledge base and a sentence-to-semantic matching word segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base.
The sentence pair semantic matching model is obtained by training by using a training data set, and the training data set is constructed as follows:
building training positive examples: sentence pairs with consistent semantics are constructed as positive examples in the sentence pair semantic matching knowledge base, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1); wherein sentence1_char and sentence2_char refer to sentence1 and sentence2 in the sentence pair semantic matching word-breaking processing knowledge base respectively, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the sentence pair semantic matching word-segmentation processing knowledge base respectively, and 1 indicates that the semantics of the two sentences match, i.e., a positive example;
building training negative examples: select a sentence s1, randomly select from the sentence pair semantic matching knowledge base a sentence s2 that does not match s1, and combine s1 and s2 to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0); wherein sentence1_char and sentence1_word refer to sentence1 in the sentence pair semantic matching word-breaking processing knowledge base and the word-segmentation processing knowledge base respectively; sentence2_char and sentence2_word refer to sentence2 in the sentence pair semantic matching word-breaking processing knowledge base and the word-segmentation processing knowledge base respectively; and 0 indicates that the semantics of sentence s1 and sentence s2 do not match, i.e., a negative example;
building the training data set: all positive example sentence pairs and negative example sentence pairs obtained from the operations of building training positive examples and building training negative examples are combined, and their order is shuffled to construct the final training data set; both positive example data and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1;
after the sentence semantic matching model is constructed, training and optimizing the sentence semantic matching model through a training data set, wherein the training and optimizing steps are as follows:
constructing the loss function: as can be seen from the construction process of the label prediction module, y_pred is the matching degree value computed by the sentence pair semantic matching model, and y_true is the true label indicating whether the semantics of the two sentences match, whose value is restricted to 0 or 1; cross entropy is adopted as the loss function, with the formula as follows:

L = −( y_true·log(y_pred) + (1 − y_true)·log(1 − y_pred) )
optimizing the training model: RMSProp is used as the optimization algorithm; apart from its learning rate, which is set to 0.0015, the remaining hyperparameters of RMSProp use the default settings in Keras; the sentence pair semantic matching model is optimized and trained on the training data set.
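A minimal Keras sketch of this training setup, assuming the model built in the sketches above; the digitized inputs (s1_char_ids, s2_char_ids, s1_word_ids, s2_word_ids), the 0/1 labels, and the batch size and epoch values are illustrative assumptions:

from keras.optimizers import RMSprop

# Cross entropy loss and RMSProp with learning rate 0.0015; other hyperparameters keep Keras defaults.
model.compile(loss='binary_crossentropy', optimizer=RMSprop(lr=0.0015), metrics=['accuracy'])
model.fit([s1_char_ids, s2_char_ids, s1_word_ids, s2_word_ids], labels, batch_size=32, epochs=10)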
An intelligent question-answer sentence pair semantic matching device for government affair consulting service, which comprises,
the sentence pair semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data, and then preprocessing the sentence pair data to obtain a sentence pair semantic matching knowledge base meeting training requirements;
a training data set generating unit for constructing positive example data and training negative example data for training according to sentences in the sentence semantic matching knowledge base, and constructing a final training data set based on the positive example data and the negative example data;
the sentence pair semantic matching model construction unit is used for constructing a character mapping conversion table and a word mapping conversion table, and simultaneously constructing an input module, a character vector mapping layer, a word vector mapping layer, a gated deep feature residual fusion network module, a semantic feature interaction matching module and a label prediction module; the sentence pair semantic matching model construction unit includes,
the character mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the sentence pair semantic matching knowledge base at character granularity or at word granularity, storing each character or word into a list in turn to obtain a character table or word table, and then, starting with the number 1, sorting the characters or words in ascending order according to the order in which they are recorded into the character table or word table, thereby forming the character mapping conversion table or word mapping conversion table required by the invention; after the character mapping conversion table or word mapping conversion table is constructed, each character or word in the table is mapped to a unique digital identifier; then, Word2Vec is used to train a character vector model or word vector model, obtaining the character vector matrix of each character or the word vector matrix of each word;
the input module construction unit is responsible for preprocessing each sentence pair in the training data set, or each sentence pair to be predicted, obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
the character vector mapping layer or word vector mapping layer construction unit is responsible for loading the character vector matrix or word vector matrix trained in the step of the character mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for character vector mapping, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained for the input sentences sentence1_char and sentence2_char; for word vector mapping, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained for the input sentences sentence1_word and sentence2_word;
the gated deep feature residual fusion network module construction unit is responsible for capturing and screening the semantic features of sentences; specifically, it receives the character embedding representation output by the character vector mapping layer and the word embedding representation output by the word vector mapping layer as inputs; the character embedding representation and the word embedding representation are selectively fused through a gating mechanism to obtain a gated embedded fusion representation, while the character embedding representation and the word embedding representation before fusion are passed to the first-layer encoding structure; the first-layer encoding structure performs encoding operations on the character embedding representation and the word embedding representation respectively, obtaining the first-layer character encoding result and the first-layer word encoding result; the first-layer character encoding result and the first-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedded fusion representation through a gating mechanism to obtain the gated first-layer feature residual fusion representation, while the first-layer character encoding result and the first-layer word encoding result before fusion are passed to the second-layer encoding structure; the second-layer encoding structure performs encoding operations on the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result and the second-layer word encoding result; the second-layer character encoding result and the second-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through a gating mechanism to obtain the gated second-layer feature residual fusion representation, while the second-layer character encoding result and the second-layer word encoding result before fusion are passed to the third-layer encoding structure; by analogy, repeating the encoding multiple times generates multi-level gated feature residual fusion representations, until the final gated deep feature residual fusion representation is generated according to the preset hierarchical depth of the model;
The semantic feature interaction matching module construction unit is responsible for further processing the gated deep feature residual type fusion representation of the corresponding sentence, and performing semantic feature interaction matching, semantic feature screening and other operations on the fusion representation, so that a final sentence pair semantic matching tensor is generated;
the label prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and comparing the matching degree value with a set threshold value so as to judge whether the semantics of the sentence pair are matched;
the sentence pair semantic matching model training unit is used for constructing a loss function required in the model training process and completing the optimization training of the model.
Preferably, the sentence-to-semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manual construction data set which is already disclosed on the network, and taking the sentence pair semantic matching data set as the original data for constructing a sentence pair semantic matching knowledge base;
the original data word breaking preprocessing or word segmentation preprocessing unit is responsible for preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word breaking or word segmentation operation on each sentence in the original data, so as to construct a sentence-to-semantic matching word breaking processing knowledge base or a sentence-to-semantic matching word segmentation processing knowledge base;
The sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word breaking processing knowledge base and the sentence-to-semantic matching word segmentation processing knowledge base, so as to construct the sentence-to-semantic matching knowledge base.
The training data set generation unit comprises,
the training positive example data construction unit is responsible for constructing sentences with consistent semantics and the matching labels 1 thereof in the sentence pair-semantic matching knowledge base into training positive example data;
the training negative example data construction unit is responsible for selecting one sentence, randomly selecting a sentence that does not match it for combination, and constructing negative example data together with the matching label 0;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disturbing the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
and the model optimization unit is responsible for training and adjusting parameters in model training, and reduces prediction errors.
A storage medium, in which a plurality of instructions are stored, the instructions being loaded by a processor to perform the steps of the above intelligent question-answer sentence semantic matching method for government affair consultation service.
An electronic device, the electronic device comprising: the storage medium described above; and a processor for executing the instructions in the storage medium.
The intelligent question-answering sentence semantic matching method and device for the government affair consultation service have the following advantages:
through the gated deep feature residual fusion network structure, the encoding results output by each layer of the network can be effectively screened for useful information, so that the extracted semantic features are more accurate; deeper semantic features can be captured, and the depth can be controlled freely, so the structure can be flexibly adapted to different data sets and has better generality;
secondly, the invention performs interactive matching through the attention mechanism, so that the information between the sentence pair can be effectively aligned and the matching process is more reliable; the generated sentence pair matching tensor carries rich interaction features, which improves the accuracy of sentence semantic matching;
the method processes the coding result of the sentence pairs through a gating mechanism, and can effectively screen the interaction characteristics among the sentence pairs, so that the prediction accuracy of the model is improved;
drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method for semantic matching of intelligent question-answering sentences for government consulting services;
FIG. 2 is a flow chart for constructing a sentence-to-semantic matching knowledge base;
FIG. 3 is a flow chart for constructing a training dataset;
FIG. 4 is a flow chart for constructing a sentence-to-semantic matching model;
FIG. 5 is a flow chart of training a sentence to semantic matching model;
FIG. 6 is a schematic diagram of a semantic matching device for intelligent question-answering sentences for government consulting services;
FIG. 7 is a schematic diagram of a structure for constructing a gated deep feature residual fusion network;
fig. 8 is a schematic diagram of a framework of a semantic matching model for intelligent question-answering sentences for government consulting services.
The specific embodiment is as follows:
the method and apparatus for semantic matching of intelligent question-answering sentences for government consulting services of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments of the present invention.
Example 1:
As shown in FIG. 8, the main framework of the invention comprises a multi-granularity embedding module, a gated deep feature residual fusion network module, a semantic feature interaction matching module and a label prediction module. The multi-granularity embedding module performs embedding operations on an input sentence at character granularity and at word granularity respectively, and passes the results to the gated deep feature residual fusion network module of the model. The gated deep feature residual fusion network module comprises a multi-layer encoding structure, as shown in FIG. 7: the character embedding representation and the word embedding representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain a gated embedded fusion representation, while the character embedding representation and the word embedding representation before fusion are passed to the first-layer encoding structure; the first-layer encoding structure performs encoding operations on the character embedding representation and the word embedding representation respectively, obtaining the first-layer character encoding result and the first-layer word encoding result; the first-layer character encoding result and the first-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedded fusion representation through a gating mechanism to obtain the gated first-layer feature residual fusion representation, while the first-layer character encoding result and the first-layer word encoding result before fusion are passed to the second-layer encoding structure; the second-layer encoding structure performs encoding operations on the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result and the second-layer word encoding result; the second-layer character encoding result and the second-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through a gating mechanism to obtain the gated second-layer feature residual fusion representation, while the second-layer character encoding result and the second-layer word encoding result before fusion are passed to the third-layer encoding structure; by analogy, repeating the encoding multiple times generates multi-level gated feature residual fusion representations, until the final gated deep feature residual fusion representation is generated according to the preset hierarchical depth of the model; the gated deep feature residual fusion representation is passed to the semantic feature interaction matching module of the model. The semantic feature interaction matching module performs semantic feature matching and feature screening operations on the gated deep feature residual fusion representations, where semantic feature matching is accomplished through an attention mechanism and feature screening is realized through a gating mechanism; the matching tensor of the sentence pair is finally obtained and passed to the label prediction module of the model.
The label prediction module maps the matching tensor of the sentence pair to a floating-point value on a designated interval, compares this value, taken as the matching degree, with a preset threshold, and judges whether the semantics of the sentence pair match according to the comparison result. The method comprises the following steps:
(1) The multi-granularity embedding module performs embedding operations on the input sentences at character granularity and at word granularity respectively, obtaining a multi-granularity embedded representation of each sentence;
(2) The gated deep feature residual fusion network module performs encoding operations on the multi-granularity embedded representations of the sentences, obtaining the gated deep feature residual fusion representation of each sentence;
(3) The semantic feature interaction matching module performs feature matching and feature screening operations on the gated deep feature residual fusion representations of the sentence pair, obtaining the matching tensor of the sentence pair;
(4) The label prediction module maps the matching tensor of the sentence pair to a floating-point value on a designated interval, compares this value, taken as the matching degree, with a preset threshold, and judges whether the semantics of the sentence pair match according to the comparison result.
Example 2:
as shown in figure 1, the intelligent question-answer sentence semantic matching method for the government affair consultation service comprises the following specific steps:
s1, constructing a sentence pair semantic matching knowledge base, as shown in a figure 2, specifically comprising the following steps:
s101, downloading a data set on a network to obtain original data: and downloading the sentence-to-semantic matching data set or the manually constructed data set which is already disclosed on the network, and taking the data set as the original data for constructing a sentence-to-semantic matching knowledge base.
Illustrating: according to the history record of government affair consultation, the problem sentence pairs contained in the history record can be collected, a government affair consultation problem sentence pair data set is constructed manually, and the original data for constructing a sentence pair semantic matching knowledge base is obtained. Sentence pair examples are represented as follows:
sentence1: How should a college graduate's file be deposited in the talent market?
sentence2: What is the procedure for depositing a university graduate's file?
S102, preprocessing original data: preprocessing is used for constructing the original data of the sentence-to-semantic matching knowledge base, and performing word breaking operation and word segmentation operation on each sentence in the original data to obtain the sentence-to-semantic matching word breaking processing knowledge base and the word segmentation processing knowledge base.
Each sentence in the original data for building the sentence pair semantic matching knowledge base obtained in step S101 is preprocessed by word breaking and by word segmentation, obtaining the sentence pair semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base. The word-breaking operation is as follows: each sentence is segmented with each character of the Chinese sentence as a unit, using a space as the separator. The word segmentation operation is as follows: the Jieba word segmentation tool in its default precise mode is used to segment each sentence. In this operation, to avoid the loss of semantic information, all contents of a sentence, including punctuation marks, special characters and stop words, are preserved.
Illustrating: taking sentence1 shown in S101 as an example, the word-breaking operation yields the same sentence with a space inserted between every character; after processing with the Jieba word segmentation tool, the sentence becomes its word sequence with a space between adjacent words.
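As a small runnable sketch of the two preprocessing operations (the Chinese example sentence is the one whose characters appear in the mapping table of step S301):

import jieba

sentence = "高校毕业生档案如何存放在人才市场？"
chars = list(sentence)                    # word breaking: one character per unit
words = list(jieba.cut(sentence))         # word segmentation: Jieba's default precise mode
print(" ".join(chars))
print(" ".join(words))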
S103, summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word breaking processing knowledge base and a sentence-to-semantic matching word segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base.
And (3) summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base obtained in the step (S102) under the same folder, thereby obtaining the sentence-to-semantic matching knowledge base. The flow is shown in fig. 2. It should be noted here that the data after the word breaking operation and the data after the word segmentation operation are not combined into the same file, i.e. the sentence-to-semantic matching knowledge base actually comprises two independent sub-knowledge bases. Each preprocessed sentence retains the ID information of its original sentence.
S2, constructing a training data set of a sentence pair semantic matching model: for each sentence pair in the sentence pair semantic matching knowledge base, if the semantics of the sentence pairs are consistent, the sentence pairs can be used for constructing training positive examples; if the semantics of the sentence pairs are inconsistent, the sentence pairs can be used for constructing training negative examples; mixing a certain amount of positive example data with negative example data, thereby constructing a model training data set; as shown in fig. 3, the specific steps are as follows:
S201, building training positive examples: sentence pairs with consistent semantics are constructed as positive examples in the sentence pair semantic matching knowledge base, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1);
illustrating: after the word-breaking operation processing of step S102 and the word-segmentation operation processing of step S103 are performed on sentence1 and sentence2 shown in S101, the constructed positive example data takes the following form:
("how does college graduate files store in talent market.
S202, building training negative examples: select a sentence s1, randomly select from the sentence pair semantic matching knowledge base a sentence s2 that does not match s1, and combine s1 and s2 to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0);
illustrating: taking a semantically mismatched sentence pair in the LCQMC data set as an example, "sentence1: How is a college graduate's file deposited in the talent market? sentence2: Where to register a marriage?", after the word-breaking operation processing of step S102 and the word-segmentation operation processing of step S103, the constructed negative example data takes the following form:
("how do college graduates store in talent markets.
S203, building the training data set: all positive example sentence pair data and negative example sentence pair data obtained from the operations of step S201 and step S202 are combined, and their order is shuffled to construct the final training data set. Both positive and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1.
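The following sketch illustrates steps S201-S203 end to end; the input format "pairs" (a list of semantically consistent sentence pairs) and the helper names are assumptions made for illustration:

import random, jieba

def char_cut(s):
    return " ".join(list(s))            # word breaking: space-separated characters

def word_cut(s):
    return " ".join(jieba.cut(s))       # word segmentation: space-separated words

def build_training_set(pairs):
    # 'pairs' is a list of semantically consistent (sentence1, sentence2) tuples (assumed input format)
    examples = []
    all_sentences = [s2 for _, s2 in pairs]
    for s1, s2 in pairs:
        examples.append((char_cut(s1), char_cut(s2), word_cut(s1), word_cut(s2), 1))        # positive example (S201)
        s_neg = random.choice([s for s in all_sentences if s != s2])                        # a mismatched sentence
        examples.append((char_cut(s1), char_cut(s_neg), word_cut(s1), word_cut(s_neg), 0))  # negative example (S202)
    random.shuffle(examples)   # disturb the order to obtain the final training data set (S203)
    return examples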
S3, constructing the sentence pair semantic matching model: this mainly comprises constructing the character mapping conversion table, constructing the word mapping conversion table, constructing the input module, constructing the character vector mapping layer, constructing the word vector mapping layer, constructing the gated deep feature residual fusion network module, constructing the semantic feature interaction matching module and constructing the label prediction module. The character mapping conversion table, the word mapping conversion table, the input module, the character vector mapping layer and the word vector mapping layer together correspond to the multi-granularity embedding module in FIG. 8, and the remaining parts correspond one-to-one to the modules in FIG. 8. The specific steps are as follows:
S301, constructing the character mapping conversion table: the character table is built from the sentence pair semantic matching word-breaking processing knowledge base obtained in step S102. After the character table is constructed, each character in the table is mapped to a unique digital identifier according to the following rule: starting with the number 1, the characters are sorted in ascending order according to the order in which each character is recorded into the character table, thereby forming the character mapping conversion table required by the invention.
Illustrating: taking the sentence processed in step S102, "How is a college graduate's file stored in the talent market?" (高校毕业生档案如何存放在人才市场?), as an example, the character table and character mapping conversion table are constructed as follows:

Character: 高  校  毕  业  生  档  案  如  何  存  放  在  人  才  市  场  ?
Mapping:   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
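A small sketch of how such a mapping table can be built from the word-breaking processing knowledge base; the function name and the assumption that characters are separated by spaces are illustrative.

def build_char_mapping(char_kb_sentences):
    # char_kb_sentences: sentences from the word-breaking processing knowledge base,
    # each already split into characters separated by spaces.
    char_index = {}
    for sentence in char_kb_sentences:
        for ch in sentence.split():
            if ch not in char_index:
                # Numbering starts at 1 and grows in the order characters enter the table.
                char_index[ch] = len(char_index) + 1
    return char_index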
Then, word2Vec is used for training a Word vector model to obtain a Word vector matrix char_compressing_matrix of each Word.
Illustrating: in Keras, the code implementation described above is as follows:
w2v_model_char = gensim.models.Word2Vec(w2v_corpus_char, size=char_embedding_dim, window=5, min_count=1, sg=1, workers=4, seed=1234, iter=25)
tokenizer = keras.preprocessing.text.Tokenizer(num_words=len(char_set))
tokenizer.fit_on_texts(w2v_corpus_char)
char_embedding_matrix = numpy.zeros([len(tokenizer.word_index) + 1, char_embedding_dim])
for char, idx in tokenizer.word_index.items():
    char_embedding_matrix[idx, :] = w2v_model_char.wv[char]
wherein w2v_corpus_char is the word-breaking training corpus, i.e. all data in the sentence pair semantic matching word-breaking processing knowledge base; char_embedding_dim is the character vector dimension, which this model sets to 400; and char_set is the character table.
S302, constructing a word mapping conversion table: the vocabulary is constructed by processing a knowledge base for semantic matching and word segmentation of sentences obtained by the processing in the step S103. After the vocabulary is constructed, each word in the table is mapped into a unique digital identifier, and the mapping rule is as follows: starting with the number 1, each word is then sequentially and incrementally ordered in the order in which it was entered into the vocabulary, thereby forming the word mapping conversion table required by the present invention.
Illustrating: taking the sentence processed in step S103, "How is a college graduate's file stored in the talent market?" (高校 毕业生 档案 如何 存放 在 人才 市场 ?), as an example, the vocabulary and word mapping conversion table are constructed as follows:

Word:    高校  毕业生  档案  如何  存放  在  人才  市场  ?
Mapping: 1     2       3     4     5     6   7     8     9
Then, word2Vec is used for training a Word vector model, and a Word vector matrix word_compressing_matrix of each Word is obtained.
Illustrating: in Keras, for the code implementation described above, it is substantially identical to that illustrated in S301, except that the parameters are changed from char to word dependent. In view of the space limitations, they are not described in detail herein.
Compared with the example in S301, w2v_corpus_char is replaced by w2v_corpus_word, the word-segmentation training corpus, i.e. all data in the sentence pair semantic matching word-segmentation processing knowledge base; char_embedding_dim is replaced by word_embedding_dim, the word vector dimension, which this model sets to 400; and char_set is replaced by word_set, the vocabulary.
S303, constructing an input layer: the input layer includes four inputs. For each sample of the training data set, sentence1_char, sentence2_char, sentence1_word and sentence2_word are obtained and formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
each character or word in the input sentence is converted into the corresponding numerical identifier according to the character mapping conversion table and the word mapping conversion table.
Illustrating: using the sentence pair shown in step S201 as a sample, a piece of input data is composed. The results are shown below:
( "how do college graduates files store in talent markets? "," university graduate file storage procedure? How do the university graduation files store in talent market? "," university graduate file storage procedure? " )
Each piece of input data contains 4 clauses. The first two clauses are converted into a numerical representation according to the character mapping conversion table of step S301; the latter two clauses are converted into a numerical representation according to the word mapping conversion table of step S302. The 4 clauses of the input data are then jointly represented as follows:
("1,2,3,4,5,6,7,8,9, 10, 11, 12, 13, 14, 15, 16, 17","18, 19,3,4,5,6,7, 10, 11, 20, 21, 17","1,2,3,4,5,6,7,8,9","10,2,3,5, 11,9"). Wherein, for partial characters in the content 2, the mapping relationship is: large-18, school-19, hand-20, continuous-21; for partial words in sense 2, the mapping relationship is: university-10, procedure-11.
S304, constructing a character vector mapping layer: the weight parameters of the current layer are initialized by loading the character vector matrix trained in the step of constructing the character mapping conversion table; for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; each sentence in the sentence pair semantic matching word-breaking processing knowledge base can thus be converted into vector form through character vector mapping.
Illustrating: in Keras, the code implementation described above is as follows:
char_embedding_layer = Embedding(char_embedding_matrix.shape[0], char_embedding_dim, weights=[char_embedding_matrix], input_length=input_dim, trainable=False)
wherein char_embedding_matrix is the weight of the character vector matrix trained in the step of constructing the character mapping conversion table, char_embedding_matrix.shape[0] is the size of the character table of the character vector matrix, char_embedding_dim is the dimension of the output character vector, and input_length is the length of the input sequence.
The input sentences sentence1_char and sentence2_char are processed by this Embedding layer of Keras to obtain the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed.
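For completeness, a sketch of how the frozen embedding layer defined above can be applied to the two character-level inputs; the Input definitions are assumptions for illustration.

from keras.layers import Input

sentence1_char_input = Input(shape=(input_dim,), dtype='int32')
sentence2_char_input = Input(shape=(input_dim,), dtype='int32')
# The same (frozen) embedding layer maps both ID sequences to sequences of character vectors.
sentence1_char_embed = char_embedding_layer(sentence1_char_input)
sentence2_char_embed = char_embedding_layer(sentence2_char_input)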
S305, constructing a word vector mapping layer: the weight parameters of the current layer are initialized by loading the word vector matrix trained in the step of constructing the word mapping conversion table; for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained; each sentence in the sentence pair semantic matching word-segmentation processing knowledge base can thus be converted into vector form through word vector mapping.
Illustrating: in Keras, for the code implementation described above, it is basically identical to that in S304, except that the parameters are related by changing char to word. In view of the space limitations, they are not described in detail herein.
The input sentences sentence1_word and sentence2_word are processed by the Embedding layer of Keras to obtain the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed.
S306, constructing a gated deep feature residual type fusion network module: the structure is shown in fig. 7, and the specific steps are as follows:
firstly, the character embedding representation and the word embedding representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain a gated embedding fusion representation. The specific implementation is shown in the following formula.
Further, the first layer coding structure BiLSTM1 performs a coding operation on the character embedding representation and the word embedding representation to obtain a preliminary first-layer character coding result and first-layer word coding result; the first-layer character coding result and first-layer word coding result are selectively fused through a gating mechanism, and then the fusion result is selectively fused with the gated embedding fusion representation through the gating mechanism, so as to obtain the gated first-layer feature residual type fusion representation. The specific implementation is shown in the following formula.
Further, the first-layer character coding result and first-layer word coding result are transferred to the second layer coding structure BiLSTM2; BiLSTM2 performs a coding operation on the first-layer character coding result and the first-layer word coding result respectively to obtain the second-layer character coding result and second-layer word coding result; the second-layer character coding result and second-layer word coding result are selectively fused through a gating mechanism, and then the fusion result is selectively fused with the gated first-layer feature residual type fusion representation through the gating mechanism, so as to obtain the gated second-layer feature residual type fusion representation. The specific implementation is shown in the following formula.
Further, the second-layer character coding result and second-layer word coding result are transmitted to the third layer coding structure BiLSTM3. Similarly, the multi-level gated feature residual type fusion representation can be generated through repeated coding, according to the preset hierarchical depth of the model, until the final gated deep feature residual type fusion representation is generated. For the depth-th layer, the specific implementation is shown in the following formula.
Illustrating: when the invention is implemented on the manually constructed government affair consultation data set, the number of layers of the structure is 6, and the optimal result can be obtained when the coding dimension of BiLSTM in each layer is set to 300. Furthermore, to avoid the over-fitting problem, a dropout strategy is used in each layer of BiLSTM, and the best results are obtained when dropout is set to 0.01.
In Keras, the code implementation described above is as follows:
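A minimal sketch of the stacked BiLSTM encoding with gated residual fusion described above, assuming a custom GateFeatureLayer (its core computation is given below) that fuses two tensors of equal dimension, and assuming the embedding dimension equals the BiLSTM output dimension so that the element-wise gated fusion is well defined; the function and variable names are illustrative.

from keras.layers import Bidirectional, LSTM

def gated_deep_feature_residual_fusion(sentence_embed_char, sentence_embed_word,
                                       depth=6, units=300, dropout=0.01):
    # Gated fusion of the character and word embedding representations.
    residual = GateFeatureLayer()([sentence_embed_char, sentence_embed_word])
    char_feat, word_feat = sentence_embed_char, sentence_embed_word
    for _ in range(depth):
        # Encode the character stream and the word stream separately at this layer.
        char_feat = Bidirectional(LSTM(units, return_sequences=True, dropout=dropout))(char_feat)
        word_feat = Bidirectional(LSTM(units, return_sequences=True, dropout=dropout))(word_feat)
        # Gated fusion of the two coding results of this layer.
        layer_fused = GateFeatureLayer()([char_feat, word_feat])
        # Gated residual fusion with the representation accumulated so far.
        residual = GateFeatureLayer()([layer_fused, residual])
    return residual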
wherein sentence_embed_char is the character embedding representation of the sentence, sentence_embed_word is the word embedding representation of the sentence, 300 is the coding dimension of each BiLSTM, and the returned value is the gated deep feature residual type fusion representation of the corresponding sentence; it should be noted that GateFeatureLayer denotes the gated feature fusion layer, whose core computation in Keras is as follows:
q = feature_1                                 # first object to be fused
p = feature_2                                 # second object to be fused
# gate = sigmoid(v1 + p.W2 + q.W1): decides how much of each input to keep
gate = tf.sigmoid(tf.add(self.v1, tf.add(K.dot(p, self.W2), K.dot(q, self.W1))))
xj_ = gate * q                                # gated part taken from q
xp_ = tf.subtract(self.v2, gate) * p          # complementary part taken from p
result = tf.add(xj_, xp_)                     # gated fusion result
wherein feature_1 and feature_2 are the objects to be fused, self.W1, self.W2, self.v1 and self.v2 are weights to be trained, gate is the constructed gate, and result is the fusion result.
S307, constructing a semantic feature interaction matching module: step S306 yields the gated deep feature residual type fusion representations of sentence1 and sentence2 respectively; semantic feature matching, semantic feature screening and related operations are performed on these fusion representations, so as to generate the final sentence pair semantic matching tensor; the specific steps are as follows:
firstly, an attention mechanism is applied to complete the interactive matching process between the sentence pair, so as to obtain a preliminary sentence matching tensor. Taking the matching of sentence1 to sentence2 as an example, the specific implementation is shown in the following formula.
Illustrating: in Keras, the code implementation described above is as follows:
sentence1 = feature_s1
sentence2 = feature_s2
# Pairwise element-wise products between every position of sentence1 and sentence2.
q_p_dot = tf.expand_dims(sentence2, axis=1) * tf.expand_dims(sentence1, axis=2)
# Attention scores computed from the mapped products.
sd = tf.multiply(tf.tanh(K.dot(q_p_dot, self.Wd)), self.vd)
sd = tf.squeeze(sd, axis=-1)
ad = tf.nn.softmax(sd)              # attention weights
h = K.batch_dot(ad, sentence2)      # preliminary matching tensor: sentence2 aligned to sentence1
wherein feature_s1 and feature_s2 represent the gated deep feature residual type fusion representations of the corresponding sentences, self.Wd and self.vd represent weights to be trained, and h represents the preliminary sentence matching tensor.
Secondly, a gating mechanism is used to perform a feature screening operation on the preliminary sentence matching tensor, so as to obtain the sentence matching tensor. The specific implementation is shown in the following formula.
Illustrating: in Keras, the code implementation described above is as follows:
q=h1
gj=tf.sigmoid(tf.add(self.v1,K.dot(q,self.W1)))
M_s1=gj*q
wherein h1 represents the preliminary sentence matching tensor, self.W1 and self.v1 represent weights to be trained, and M_s1 represents the sentence matching tensor.
And thirdly, connecting the two sentence matching tensors to obtain a sentence pair matching tensor. The specific implementation is shown in the following formula.
Illustrating: in Keras, the code implementation described above is as follows:
similarity=Concatenate(axis=2)([M_s1,M_s2])
where M_s1 and M_s2 represent the corresponding sentence matching tensors, similarity represents the sentence pair matching tensors.
S308, constructing a label prediction module: the sentence pair semantic matching tensor obtained in step S307 is used as the input of this module and is processed by a layer of fully-connected network whose dimension is 1 and whose activation function is sigmoid, thereby obtaining a matching degree value in [0,1], denoted y_pred; finally, y_pred is compared with the set threshold value (0.5) to judge whether the semantics of the sentence pair match; that is, when y_pred ≥ 0.5, the semantics of the sentence pair are predicted to match, otherwise they do not match.
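A sketch of this prediction layer in Keras; the Flatten step is an assumption about how the matching tensor is reduced before the fully-connected layer, since that detail is not spelled out here.

from keras.layers import Dense, Flatten

# similarity: the sentence pair semantic matching tensor produced in step S307.
flat = Flatten()(similarity)
y_pred = Dense(1, activation='sigmoid')(flat)
# At prediction time the pair is judged to match when y_pred >= 0.5.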
When the sentence pair semantic matching model provided by the invention has not yet been trained, step S4 must be executed to train the model and optimize its parameters; once the model has been trained, step S308 can predict whether the semantics of the target sentence pair match.
S4, training the sentence pair semantic matching model: the sentence pair semantic matching model constructed in step S3 is trained on the training data set obtained in step S2, as shown in fig. 5; the details are as follows:
S401, constructing a loss function: as can be seen from the construction process of the label prediction module, y_pred is the matching degree value calculated by the sentence pair semantic matching model, and y_true is the true label indicating whether the semantics of the two sentences match, its value limited to 0 or 1; cross entropy is adopted as the loss function, with the following formula:
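Written out, this is the standard binary cross-entropy that Keras computes for binary_crossentropy; for a single sample:

$L = -\left[\, y_{true}\log(y_{pred}) + (1 - y_{true})\log(1 - y_{pred}) \,\right]$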
In Keras, the loss function and evaluation metrics described above are specified as follows:
parallel_model.compile(loss="binary_crossentropy",optimizer=op,metrics=['accuracy',precision,recall,f1_score])
S402, optimizing the training model: RMSProp is used as the optimization algorithm; apart from its learning rate, which is set to 0.0015, the remaining hyperparameters of RMSProp keep the default settings in Keras; the sentence pair semantic matching model is then optimized and trained on the training data set;
Illustrating: the optimization functions described above and their settings are expressed in Keras using code:
optim=keras.optimizers.RMSprop(lr=0.0015).
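A sketch of the corresponding training call; the batch size, epoch count and validation split are illustrative assumptions, and the four input arrays are assumed to come from the training data set of step S2.

history = parallel_model.fit(
    [train_s1_char, train_s2_char, train_s1_word, train_s2_word],
    train_labels,
    batch_size=64,
    epochs=20,
    validation_split=0.1)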
the model provided by the invention can obtain the accuracy of more than 80% on the manually constructed government affair consultation data set.
Example 3:
As shown in fig. 6, the intelligent question-answering sentence pair semantic matching device for the government affair consultation service of embodiment 2 comprises,
the sentence pair semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data, and then preprocessing the sentence pair data to obtain a sentence pair semantic matching knowledge base meeting training requirements; the sentence-to-semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manual construction data set which is already disclosed on the network, and taking the sentence pair semantic matching data set as the original data for constructing a sentence pair semantic matching knowledge base;
the original data word breaking preprocessing or word segmentation preprocessing unit is responsible for preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word breaking or word segmentation operation on each sentence in the original data, so as to construct a sentence-to-semantic matching word breaking processing knowledge base or a sentence-to-semantic matching word segmentation processing knowledge base;
The sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word breaking processing knowledge base and the sentence-to-semantic matching word segmentation processing knowledge base, so as to construct the sentence-to-semantic matching knowledge base.
A training data set generating unit for constructing positive example data and negative example data for training according to sentences in the sentence semantic matching knowledge base, and constructing a final training data set based on the positive example data and the negative example data; the training data set generation unit comprises a data processing unit,
the training positive example data construction unit is responsible for constructing sentences with consistent semantics and the matching labels 1 thereof in the sentence pair-semantic matching knowledge base into training positive example data;
the training negative example data construction unit, which is responsible for selecting a sentence, randomly selecting a sentence that does not match it, combining the two, and constructing negative example data together with the matching label 0;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disturbing the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence pair semantic matching model construction unit is used for constructing a character mapping conversion table and a word mapping conversion table, and simultaneously constructing an input module, a character vector mapping layer, a word vector mapping layer, a gated deep feature residual type fusion network module, a semantic feature interaction matching module and a label prediction module; the sentence pair semantic matching model construction unit includes,

the character mapping conversion table or word mapping conversion table construction unit, which is responsible for segmenting each sentence in the sentence pair semantic matching knowledge base at character granularity or word granularity, storing each character or word sequentially into a list to obtain a character table or word table, and ordering the characters or words sequentially and incrementally, starting with the number 1, in the order in which they enter the character table or word table, thereby forming the character mapping conversion table or word mapping conversion table required by the present invention; after the character mapping conversion table or word mapping conversion table is constructed, each character or word in the table is mapped to a unique digital identifier; then a Word2Vec character vector model or word vector model is trained to obtain the character vector matrix of each character or the word vector matrix of each word;

the input module construction unit, which is responsible for preprocessing each sentence pair in the training data set or each sentence pair to be predicted, obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);

the character vector mapping layer or word vector mapping layer construction unit, which is responsible for loading the character vector matrix or word vector matrix obtained by training in the step of the character mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained;
The gate-control deep feature residual type fusion network module construction unit is responsible for capturing and screening semantic features of sentences, and specifically operates to receive word embedded representations output by a word vector mapping layer and word embedded representations output by the word vector mapping layer as inputs; the word embedding representation and the word embedding representation are selectively fused through a gating mechanism to obtain a gating embedding fusion representation, and simultaneously, the word embedding representation and the word embedding representation before fusion are transmitted to a first layer of coding structure; the first layer coding structure respectively carries out coding operation on the word embedding representation and the word embedding representation to obtain a first layer word coding result and a first layer word coding result; selectively fusing the first layer character encoding result and the first layer word encoding result through a gating mechanism, then selectively fusing the fusion result with a gating embedded fusion representation through the gating mechanism to obtain a gating first layer characteristic residual type fusion representation, and simultaneously transmitting the first layer character encoding result and the first layer word encoding result before fusing to a second layer encoding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result to obtain a second layer character coding result and a second layer word coding result; selectively fusing the second-layer word coding result and the second-layer word coding result through a gating mechanism, then selectively fusing the fusion result with a gated first-layer characteristic residual fusion representation through the gating mechanism to obtain a gated second-layer characteristic residual fusion representation, and simultaneously transmitting the second-layer word coding result and the second-layer word coding result before fusing to a third-layer coding structure; similarly, the multi-level gating characteristic residual error type fusion representation can be generated through repeated coding for many times; according to the preset hierarchical depth of the model, until a final gated deep feature residual type fusion representation is generated;
The semantic feature interaction matching module construction unit is responsible for carrying out feature matching and feature screening operation on the gated deep feature residual type fusion representation of the sentence pair to obtain a matching vector of the sentence pair;
the label prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and comparing the matching degree value with a set threshold value so as to judge whether the semantics of the sentence pair are matched;
the sentence pair semantic matching model training unit is used for constructing a loss function required in the model training process and completing the optimization training of the model; the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
the model optimization unit is in charge of training and adjusting parameters in model training, and reduces prediction errors;
example 4:
A storage medium based on embodiment 2, in which a plurality of instructions are stored; the instructions are loaded by a processor to perform the steps of the intelligent question-answer sentence pair semantic matching method for the government affair consultation service of embodiment 2.
Example 5:
based on the electronic apparatus of embodiment 4, the electronic apparatus includes: the storage medium of example 4; and a processor configured to execute the instructions in the storage medium of embodiment 4.

Claims (9)

1. The intelligent question-answering sentence pair semantic matching method for the government affair consultation service is characterized by comprising the steps of constructing and training a sentence pair semantic matching model consisting of a multi-granularity embedding module, a gating deep feature residual error fusion network module, a semantic feature interaction matching module and a label prediction module, realizing gating deep feature residual error fusion representation of sentence information, generating a final matching tensor of sentence pairs through an attention mechanism and a gating mechanism, and judging the matching degree of the sentence pairs so as to achieve the aim of intelligent semantic matching of the sentence pairs; the method comprises the following steps:
the multi-granularity embedding module respectively performs embedding operation on the input sentences according to the granularity of the words and the granularity of the words to obtain multi-granularity embedded representation of the sentences;
the method comprises the steps that a gating deep feature residual type fusion network module carries out coding operation on multi-granularity embedded representations of sentences to obtain gating deep feature residual type fusion representations of the sentences;
the semantic feature interaction matching module performs feature matching and feature screening operation on the gated deep feature residual type fusion representation of the sentence pair to obtain a matching vector of the sentence pair;
the label prediction module maps the matching tensor of the sentence pair into a floating point type numerical value on a designated interval, compares the floating point type numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the sentence pair are matched according to a comparison result;
The construction process of the gated deep feature residual type fusion network module is specifically as follows:
firstly, the character embedding representation and the word embedding representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain a gated embedding fusion representation, and the formulas are as follows:

$gate_{emb} = \sigma\left(E^{c}W_{1}^{e} + E^{w}W_{2}^{e} + b^{e}\right)$  (1.1)

$\bar{E} = gate_{emb} \odot E^{c} + (1 - gate_{emb}) \odot E^{w}$  (1.2)

wherein formula (1.1) constructs the embedding representation information selection gate, $W_{1}^{e}$, $W_{2}^{e}$ and $b^{e}$ being weights to be trained, $E^{c}$ denoting sentence1_char_embed or sentence2_char_embed, $E^{w}$ denoting sentence1_word_embed or sentence2_word_embed, $\sigma$ the sigmoid function, and $gate_{emb}$ the embedding representation information selection gate; formula (1.2) selectively fuses the character embedding representation and the word embedding representation through the embedding representation information selection gate, $\odot$ denoting element-wise multiplication and $\bar{E}$ the gated embedding fusion representation;
the first layer coding structure BiLSTM1 performs a coding operation on the character embedding representation and the word embedding representation to obtain a preliminary first-layer character coding result and first-layer word coding result; the first-layer character coding result and first-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedding fusion representation through the gating mechanism, so as to obtain the gated first-layer feature residual type fusion representation; the formulas are as follows:

$h^{c,1}_{i_c} = \mathrm{BiLSTM}_{1}(E^{c}, i_c)$  (2.1)

$h^{w,1}_{i_w} = \mathrm{BiLSTM}_{1}(E^{w}, i_w)$  (2.2)

$gate_{1}^{*} = \sigma\left(h^{c,1}W_{1}^{c} + h^{w,1}W_{1}^{w} + b_{1}^{*}\right)$  (2.3)

$\bar{h}^{1} = gate_{1}^{*} \odot h^{c,1} + (1 - gate_{1}^{*}) \odot h^{w,1}$  (2.4)

$gate_{1} = \sigma\left(\bar{h}^{1}W_{1} + \bar{E}V_{1} + b_{1}\right)$  (2.5)

$R^{1} = gate_{1} \odot \bar{E} + (1 - gate_{1}) \odot \bar{h}^{1}$  (2.6)

wherein formula (2.1) uses BiLSTM1 to encode the character embedding representation, $E^{c}$ denoting sentence1_char_embed or sentence2_char_embed, $i_c$ the relative position of the i-th character vector in the sentence, and $h^{c,1}$ the first-layer character coding result; formula (2.2) uses BiLSTM1 to encode the word embedding representation, $E^{w}$ denoting sentence1_word_embed or sentence2_word_embed, $i_w$ the relative position of the i-th word vector in the sentence, and $h^{w,1}$ the first-layer word coding result; formula (2.3) constructs the first-layer coding result selection gate $gate_{1}^{*}$, where $W_{1}^{c}$, $W_{1}^{w}$ and $b_{1}^{*}$ are weights to be trained and $\sigma$ is the sigmoid function; formula (2.4) selectively fuses the first-layer character coding result and first-layer word coding result through the first-layer coding result selection gate, $\odot$ denoting element-wise multiplication and $\bar{h}^{1}$ the first-layer gated coding result fusion representation; formula (2.5) constructs the first-layer feature residual selection gate $gate_{1}$, where $W_{1}$, $V_{1}$ and $b_{1}$ are weights to be trained and $\bar{E}$ is the gated embedding fusion representation output by formula (1.2); formula (2.6) selectively fuses the gated embedding fusion representation and the first-layer gated coding result fusion representation through the first-layer feature residual selection gate, $R^{1}$ denoting the gated first-layer feature residual type fusion representation;
the first-layer character coding result and first-layer word coding result are transmitted to the second layer coding structure BiLSTM2; BiLSTM2 performs a coding operation on them respectively to obtain the second-layer character coding result and second-layer word coding result; the second-layer character coding result and second-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual type fusion representation through the gating mechanism, so as to obtain the gated second-layer feature residual type fusion representation; the formulas are as follows:

$h^{c,2}_{i_c} = \mathrm{BiLSTM}_{2}(h^{c,1}, i_c)$  (3.1)

$h^{w,2}_{i_w} = \mathrm{BiLSTM}_{2}(h^{w,1}, i_w)$  (3.2)

$gate_{2}^{*} = \sigma\left(h^{c,2}W_{2}^{c} + h^{w,2}W_{2}^{w} + b_{2}^{*}\right)$  (3.3)

$\bar{h}^{2} = gate_{2}^{*} \odot h^{c,2} + (1 - gate_{2}^{*}) \odot h^{w,2}$  (3.4)

$gate_{2} = \sigma\left(\bar{h}^{2}W_{2} + R^{1}V_{2} + b_{2}\right)$  (3.5)

$R^{2} = gate_{2} \odot R^{1} + (1 - gate_{2}) \odot \bar{h}^{2}$  (3.6)

wherein formula (3.1) uses BiLSTM2 to encode the first-layer character coding result, $i_c$ denoting the i-th time step and $h^{c,2}$ the second-layer character coding result; formula (3.2) uses BiLSTM2 to encode the first-layer word coding result, $i_w$ denoting the i-th time step and $h^{w,2}$ the second-layer word coding result; formula (3.3) constructs the second-layer coding result selection gate $gate_{2}^{*}$, where $W_{2}^{c}$, $W_{2}^{w}$ and $b_{2}^{*}$ are weights to be trained and $\sigma$ is the sigmoid function; formula (3.4) selectively fuses the second-layer character coding result and second-layer word coding result through the second-layer coding result selection gate, $\odot$ denoting element-wise multiplication and $\bar{h}^{2}$ the second-layer gated coding result fusion representation; formula (3.5) constructs the second-layer feature residual selection gate $gate_{2}$, where $W_{2}$, $V_{2}$ and $b_{2}$ are weights to be trained and $R^{1}$ is the gated first-layer feature residual type fusion representation output by formula (2.6); formula (3.6) selectively fuses the gated first-layer feature residual type fusion representation and the second-layer gated coding result fusion representation through the second-layer feature residual selection gate, $R^{2}$ denoting the gated second-layer feature residual type fusion representation;
the second-layer character coding result and second-layer word coding result are transmitted to the third layer coding structure BiLSTM3; similarly, the multi-level gated feature residual type fusion representation can be generated through repeated coding, according to the preset hierarchical depth of the model, until the final gated deep feature residual type fusion representation is generated; for the depth-th layer, the formulas are as follows:

$h^{c,depth}_{i_c} = \mathrm{BiLSTM}_{depth}(h^{c,depth-1}, i_c)$  (4.1)

$h^{w,depth}_{i_w} = \mathrm{BiLSTM}_{depth}(h^{w,depth-1}, i_w)$  (4.2)

$gate_{depth}^{*} = \sigma\left(h^{c,depth}W_{depth}^{c} + h^{w,depth}W_{depth}^{w} + b_{depth}^{*}\right)$  (4.3)

$\bar{h}^{depth} = gate_{depth}^{*} \odot h^{c,depth} + (1 - gate_{depth}^{*}) \odot h^{w,depth}$  (4.4)

$gate_{depth} = \sigma\left(\bar{h}^{depth}W_{depth} + R^{depth-1}V_{depth} + b_{depth}\right)$  (4.5)

$R^{depth} = gate_{depth} \odot R^{depth-1} + (1 - gate_{depth}) \odot \bar{h}^{depth}$  (4.6)

wherein formula (4.1) uses BiLSTM_depth to encode the (depth-1)-layer character coding result, $i_c$ denoting the i-th time step and $h^{c,depth}$ the depth-layer character coding result; formula (4.2) uses BiLSTM_depth to encode the (depth-1)-layer word coding result, $i_w$ denoting the i-th time step and $h^{w,depth}$ the depth-layer word coding result; formula (4.3) constructs the depth-layer coding result selection gate $gate_{depth}^{*}$, where $W_{depth}^{c}$, $W_{depth}^{w}$ and $b_{depth}^{*}$ are weights to be trained and $\sigma$ is the sigmoid function; formula (4.4) selectively fuses the depth-layer character coding result and depth-layer word coding result through the depth-layer coding result selection gate, $\odot$ denoting element-wise multiplication and $\bar{h}^{depth}$ the depth-layer gated coding result fusion representation; formula (4.5) constructs the depth-layer feature residual selection gate $gate_{depth}$, where $W_{depth}$, $V_{depth}$ and $b_{depth}$ are weights to be trained and $R^{depth-1}$ is the gated (depth-1)-layer feature residual type fusion representation; formula (4.6) selectively fuses the gated (depth-1)-layer feature residual type fusion representation and the depth-layer gated coding result fusion representation through the depth-layer feature residual selection gate, $R^{depth}$ denoting the gated depth-layer feature residual type fusion representation, namely the gated deep feature residual type fusion representation.
2. The intelligent question-answering sentence pair semantic matching method for government affair consultation services according to claim 1, wherein the multi-granularity embedding module is realized by constructing a character mapping conversion table, constructing a word mapping conversion table, constructing an input module, constructing a character vector mapping layer and constructing a word vector mapping layer;

wherein the character mapping conversion table or word mapping conversion table is constructed as follows: the mapping rule is: starting with the number 1, the characters or words are ordered sequentially and incrementally in the order in which they enter the character table or word table, thereby forming the required character mapping conversion table or word mapping conversion table; the character table or word table is constructed from the sentence pair semantic matching knowledge base, which comprises a word-breaking processing knowledge base and a word-segmentation processing knowledge base, obtained by performing word-breaking preprocessing or word-segmentation preprocessing on the original data text of the semantic matching knowledge base respectively; then a Word2Vec character vector model or word vector model is trained to obtain the character vector matrix of each character or the word vector matrix of each word;
constructing an input module: the input layer comprises four inputs; for each sentence pair in the training data set or each sentence pair to be predicted, word-breaking and word-segmentation preprocessing are performed to obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, wherein the suffixes char and word indicate that the corresponding sentence is processed by word breaking or by word segmentation, and they are formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word); each character or word in the input sentence is converted into the corresponding digital identifier according to the character mapping conversion table and the word mapping conversion table;

constructing a character vector mapping layer or word vector mapping layer: the character vector matrix or word vector matrix obtained by training in the step of constructing the character mapping conversion table or word mapping conversion table is loaded to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained.
3. The intelligent question-answer sentence semantic matching method for government affair consultation services according to claim 1, wherein the construction process of the semantic feature interactive matching module is specifically as follows:
The layer receives the gated deep feature residual type fusion representation output by the gated deep feature residual type fusion network module as input, performs semantic feature matching and semantic feature screening operation on the gated deep feature residual type fusion representation in three steps, and accordingly generates a final sentence pair semantic matching tensor, and the specific operation is as follows:
firstly, an attention mechanism is applied to complete the interactive matching process between the sentence pair, so as to obtain a preliminary sentence matching tensor; taking the matching of sentence1 to sentence2 as an example, the formulas are as follows:

$s_{ij} = v_{d} \cdot \tanh\left(W_{d}\left(P^{1}_{i} \odot P^{2}_{j}\right)\right)$  (5.1)

$a_{ij} = \dfrac{\exp(s_{ij})}{\sum_{k}\exp(s_{ik})}$  (5.2)

$h_{i} = \sum_{j} a_{ij} P^{2}_{j}$  (5.3)

wherein formula (5.1) maps the gated deep feature residual type fusion representations of the two sentences, $P^{1}_{i}$ denoting the i-th component of the representation of sentence1, $P^{2}_{j}$ the j-th component of the representation of sentence2, $W_{d}$ and $v_{d}$ weights to be trained, and $\odot$ element-wise multiplication; formula (5.2) calculates the attention weight; formula (5.3) completes the interactive matching process using the attention weight, $h$ denoting the result of matching sentence2 to sentence1, namely the preliminary sentence matching tensor; similarly, matching sentence1 to sentence2 yields the corresponding preliminary sentence matching tensor $h'$;
secondly, a gating mechanism is used to perform a feature screening operation on the preliminary sentence matching tensor, so as to obtain the sentence matching tensor; the formulas are as follows:

$g = \sigma\left(hW_{g} + b_{g}\right)$  (6.5)

$M^{s1} = g \odot h$  (6.6)

wherein formula (6.5) constructs the matching tensor gate, $h$ denoting the result of matching sentence2 to sentence1, and $W_{g}$ and $b_{g}$ weights to be trained; formula (6.6) performs feature screening on the matching tensor using the matching tensor gate, $\odot$ denoting element-wise multiplication and $M^{s1}$ the sentence1 matching tensor; similarly, processing the result of matching sentence1 to sentence2 yields the sentence2 matching tensor $M^{s2}$;
thirdly, the two sentence matching tensors are concatenated to obtain the sentence pair matching tensor, with the following formula:

$M_{sp} = \left[\,M^{s1};\,M^{s2}\,\right]$

wherein $M_{sp}$ denotes the sentence pair matching tensor.
4. The intelligent question-answering sentence meaning matching method for government affair consultation service according to claim 3, characterized in that the label prediction module constructing process is as follows:
the sentence pair semantic matching tensor is used as the input of the module and is processed by a layer of fully-connected network whose dimension is 1 and whose activation function is sigmoid, so as to obtain a matching degree value in [0,1], denoted y_pred; finally, y_pred is compared with the established threshold value 0.5 to judge whether the semantics of the sentence pair match: when y_pred ≥ 0.5, the semantics of the sentence pair are predicted to match, otherwise they do not match; when the sentence pair semantic matching model has not been sufficiently trained, training on the training data set is required to optimize the model parameters; when training is finished, the label prediction module can predict whether the semantics of the target sentence pair match.
5. The intelligent question-answer sentence pair semantic matching method for government affair consultation service according to claim 4, wherein the sentence pair semantic matching knowledge base is constructed specifically as follows:
downloading a data set on a network to obtain original data: downloading a sentence-to-semantic matching data set or a manual construction data set which is already disclosed on the network, and taking the sentence-to-semantic matching data set or the manual construction data set as original data for constructing a sentence-to-semantic matching knowledge base;
preprocessing raw data: preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word breaking operation and word segmentation operation on each sentence to obtain a sentence-to-semantic matching word breaking processing knowledge base and a word segmentation processing knowledge base;
summarizing a sub-knowledge base: summarizing a sentence-to-semantic matching word breaking processing knowledge base and a sentence-to-semantic matching word segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base;
The sentence pair semantic matching model is obtained by training by using a training data set, and the training data set is constructed as follows:
building training positive examples: sentence pairs whose semantics are consistent in the sentence pair semantic matching knowledge base are constructed as positive examples and formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1); wherein sentence1_char and sentence2_char refer to sentence1 and sentence2 in the sentence pair semantic matching word-breaking processing knowledge base, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the sentence pair semantic matching word-segmentation processing knowledge base, and 1 indicates that the semantics of the two sentences match, i.e. a positive example;
building training negative examples: a sentence s1 is selected, a sentence s2 that does not match s1 is randomly selected from the sentence pair semantic matching knowledge base, and s1 and s2 are combined to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0); wherein sentence1_char and sentence1_word refer to sentence1 in the sentence pair semantic matching word-breaking processing knowledge base and in the sentence pair semantic matching word-segmentation processing knowledge base respectively; sentence2_char and sentence2_word refer to sentence2 in the sentence pair semantic matching word-breaking processing knowledge base and in the sentence pair semantic matching word-segmentation processing knowledge base respectively; 0 indicates that sentence s1 and sentence s2 do not match semantically, i.e. a negative example;
building a training data set: all positive example sentence pairs and negative example sentence pairs obtained after the operations of constructing training positive examples and constructing training negative examples are combined, and their order is shuffled, so as to construct the final training data set; both the positive example data and the negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1;
after the sentence semantic matching model is constructed, training and optimizing the sentence semantic matching model through a training data set, wherein the training and optimizing steps are as follows:
constructing a loss function: adopting cross entropy as a loss function;
optimizing a training model: using RMSProp as an optimization algorithm, the remaining super parameters of RMSProp all select default settings in Keras except for its learning rate setting of 0.0015; and on the training data set, carrying out optimization training on the sentence pair semantic matching model.
6. The intelligent question-answering sentence-to-semantic matching device for government affair consulting service is characterized by comprising,
the sentence pair semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data, and then preprocessing the sentence pair data to obtain a sentence pair semantic matching knowledge base meeting training requirements;
A training data set generating unit for constructing positive example data and negative example data for training according to sentences in the sentence semantic matching knowledge base, and constructing a final training data set based on the positive example data and the negative example data;
the sentence pair semantic matching model construction unit is used for constructing a character mapping conversion table and a word mapping conversion table, and simultaneously constructing an input module, a character vector mapping layer, a word vector mapping layer, a gated deep feature residual type fusion network module, a semantic feature interaction matching module and a label prediction module; the sentence pair semantic matching model construction unit includes,

the character mapping conversion table or word mapping conversion table construction unit, which is responsible for segmenting each sentence in the sentence pair semantic matching knowledge base at character granularity or word granularity, storing each character or word sequentially into a list to obtain a character table or word table, and ordering the characters or words sequentially and incrementally, starting with the number 1, in the order in which they enter the character table or word table, thereby forming the required character mapping conversion table or word mapping conversion table; after the character mapping conversion table or word mapping conversion table is constructed, each character or word in the table is mapped to a unique digital identifier; then a Word2Vec character vector model or word vector model is trained to obtain the character vector matrix of each character or the word vector matrix of each word;

the input module construction unit, which is responsible for preprocessing each sentence pair in the training data set or each sentence pair to be predicted, obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);

the character vector mapping layer or word vector mapping layer construction unit, which is responsible for loading the character vector matrix or word vector matrix obtained by training in the step of the character mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained;
the gate-control deep feature residual type fusion network module construction unit is responsible for capturing and screening semantic features of sentences, and specifically operates to receive word embedded representations output by a word vector mapping layer and word embedded representations output by the word vector mapping layer as inputs; the word embedding representation and the word embedding representation are selectively fused through a gating mechanism to obtain a gating embedding fusion representation, and simultaneously, the word embedding representation and the word embedding representation before fusion are transmitted to a first layer of coding structure; the first layer coding structure respectively carries out coding operation on the word embedding representation and the word embedding representation to obtain a first layer word coding result and a first layer word coding result; selectively fusing the first layer character encoding result and the first layer word encoding result through a gating mechanism, then selectively fusing the fusion result with a gating embedded fusion representation through the gating mechanism to obtain a gating first layer characteristic residual type fusion representation, and simultaneously transmitting the first layer character encoding result and the first layer word encoding result before fusing to a second layer encoding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result to obtain a second layer character coding result and a second layer word coding result; selectively fusing the second-layer word coding result and the second-layer word coding result through a gating mechanism, then selectively fusing the fusion result with a gated first-layer characteristic residual fusion representation through the gating mechanism to obtain a gated second-layer characteristic residual fusion representation, and simultaneously transmitting the second-layer word coding result and the second-layer word coding result before fusing to a third-layer coding structure; similarly, the multi-level gating characteristic residual error type fusion representation can be generated through repeated coding for many times; according to the preset hierarchical depth of the model, until a final gated deep feature residual type fusion representation is generated;
The semantic feature interaction matching module construction unit is responsible for further processing the gated deep feature residual type fusion representation of the corresponding sentence, and performing semantic feature interaction matching, semantic feature screening and other operations on the fusion representation, so that a final sentence pair semantic matching tensor is generated;
the label prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and comparing the matching degree value with a set threshold value so as to judge whether the semantics of the sentence pair are matched;
the sentence pair semantic matching model training unit is used for constructing a loss function required in the model training process and completing the optimization training of the model.
7. The intelligent question-answering sentence pair semantic matching device for the government affair consultation service according to claim 6, characterized in that the sentence pair semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manual construction data set which is already disclosed on the network, and taking the sentence pair semantic matching data set as the original data for constructing a sentence pair semantic matching knowledge base;
the original data word breaking preprocessing or word segmentation preprocessing unit is responsible for preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word breaking or word segmentation operation on each sentence in the original data, so as to construct a sentence-to-semantic matching word breaking processing knowledge base or a sentence-to-semantic matching word segmentation processing knowledge base;
The sub knowledge base summarizing unit is responsible for summarizing a sentence-to-semantic matching word breaking processing knowledge base and a sentence-to-semantic matching word segmentation processing knowledge base, so as to construct a sentence-to-semantic matching knowledge base;
the training data set generation unit comprises,
the training positive example data construction unit is responsible for constructing sentences with consistent semantics and the matching labels 1 thereof in the sentence pair-semantic matching knowledge base into training positive example data;
the training negative example data construction unit, which is responsible for selecting a sentence, randomly selecting a sentence that does not match it, combining the two, and constructing negative example data together with the matching label 0;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disturbing the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
and the model optimization unit is responsible for training and adjusting parameters in model training, and reduces prediction errors.
8. A storage medium having stored therein a plurality of instructions, wherein the instructions are loaded by a processor to perform the steps of the intelligent question-answer sentence pair semantic matching method for the government affair consultation service of any one of claims 1-5.
9. An electronic device, the electronic device comprising: the storage medium of claim 8; and a processor for executing the instructions in the storage medium.
CN202010855426.0A 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service Active CN112001166B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010855426.0A CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010855426.0A CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Publications (2)

Publication Number Publication Date
CN112001166A CN112001166A (en) 2020-11-27
CN112001166B true CN112001166B (en) 2023-10-17

Family

ID=73470203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010855426.0A Active CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Country Status (1)

Country Link
CN (1) CN112001166B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966524B (en) * 2021-03-26 2024-01-26 湖北工业大学 Chinese sentence semantic matching method and system based on multi-granularity twin network
CN113065359B (en) * 2021-04-07 2022-05-24 齐鲁工业大学 Sentence-to-semantic matching method and device oriented to intelligent interaction
CN113065358B (en) * 2021-04-07 2022-05-24 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113268962B (en) * 2021-06-08 2022-05-24 齐鲁工业大学 Text generation method and device for building industry information service question-answering system
CN114238563A (en) * 2021-12-08 2022-03-25 齐鲁工业大学 Multi-angle interaction-based intelligent matching method and device for Chinese sentences to semantic meanings
CN114328883B (en) * 2022-03-08 2022-06-28 恒生电子股份有限公司 Data processing method, device, equipment and medium for machine reading understanding

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032635A (en) * 2019-04-22 2019-07-19 齐鲁工业大学 One kind being based on the problem of depth characteristic fused neural network to matching process and device
CN111310439A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN111325028A (en) * 2020-02-20 2020-06-23 齐鲁工业大学 Intelligent semantic matching method and device based on deep hierarchical coding
WO2020143137A1 (en) * 2019-01-07 2020-07-16 北京大学深圳研究生院 Multi-step self-attention cross-media retrieval method based on restricted text space and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143137A1 (en) * 2019-01-07 2020-07-16 北京大学深圳研究生院 Multi-step self-attention cross-media retrieval method based on restricted text space and system
CN110032635A (en) * 2019-04-22 2019-07-19 齐鲁工业大学 One kind being based on the problem of depth characteristic fused neural network to matching process and device
CN111310439A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN111325028A (en) * 2020-02-20 2020-06-23 齐鲁工业大学 Intelligent semantic matching method and device based on deep hierarchical coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chinese medical question-answer matching method based on attention and character embedding; Chen Zhihao; Yu Xiang; Liu Zichen; Qiu Dawei; Gu Bengang; Journal of Computer Applications (计算机应用), No. 06; full text *

Also Published As

Publication number Publication date
CN112001166A (en) 2020-11-27

Similar Documents

Publication Publication Date Title
CN112001166B (en) Intelligent question-answer sentence semantic matching method and device for government affair consultation service
CN111310438B (en) Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN110348016B (en) Text abstract generation method based on sentence correlation attention mechanism
CN112000770B (en) Semantic feature graph-based sentence semantic matching method for intelligent question and answer
CN111144448A (en) Video barrage emotion analysis method based on multi-scale attention convolutional coding network
CN112000772B (en) Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer
CN112000771B (en) Judicial public service-oriented sentence pair intelligent semantic matching method and device
CN110188192B Multi-task network construction and multi-scale joint charge and law article prediction method
CN112015859A (en) Text knowledge hierarchy extraction method and device, computer equipment and readable medium
CN110750635B Law article recommendation method based on joint deep learning model
CN113065358B (en) Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN110287482B Semi-automatic word segmentation corpus labeling and training device
CN111310439A (en) Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN116738959B (en) Resume rewriting method and system based on artificial intelligence
CN109919175A (en) A kind of more classification methods of entity of combination attribute information
CN116975776A (en) Multi-mode data fusion method and device based on tensor and mutual information
CN111340006B (en) Sign language recognition method and system
CN113705242B (en) Intelligent semantic matching method and device for education consultation service
CN114155477A (en) Semi-supervised video paragraph positioning method based on average teacher model
CN117574898A (en) Domain knowledge graph updating method and system based on power grid equipment
CN117193823A (en) Code workload assessment method, system and equipment for software demand change
CN113705241B (en) Intelligent semantic matching method and device based on multi-view attention for college entrance examination consultation
CN115408506B (en) NL2SQL method combining semantic analysis and semantic component matching
CN116822513A (en) Named entity identification method integrating entity types and keyword features
CN111158640B (en) One-to-many demand analysis and identification method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant