CN112001166A - Intelligent question-answer sentence-to-semantic matching method and device for government affair consultation service - Google Patents

Intelligent question-answer sentence-to-semantic matching method and device for government affair consultation service

Info

Publication number
CN112001166A
CN112001166A CN202010855426.0A
Authority
CN
China
Prior art keywords
word
sentence
layer
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010855426.0A
Other languages
Chinese (zh)
Other versions
CN112001166B (en)
Inventor
鹿文鹏
于瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN202010855426.0A priority Critical patent/CN112001166B/en
Publication of CN112001166A publication Critical patent/CN112001166A/en
Application granted granted Critical
Publication of CN112001166B publication Critical patent/CN112001166B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent question-answering sentence-pair semantic matching method and device for government affair consultation services, belonging to the technical fields of artificial intelligence and natural language processing. The technical problem addressed by the invention is how to capture deeper semantic context features and inter-sentence interaction information while reducing the loss of semantic information, so as to achieve intelligent semantic matching of sentence pairs. The adopted technical scheme is to construct and train a sentence-pair semantic matching model composed of a multi-granularity embedding module, a gated deep-feature residual fusion network module, a semantic feature interactive matching module and a label prediction module. The model produces a gated deep-feature residual fusion representation of sentence information, generates the final matching tensor of the sentence pair through an attention mechanism and a gating mechanism, and judges the matching degree of the sentence pair, thereby achieving intelligent semantic matching of sentence pairs. The device comprises a sentence-pair semantic matching knowledge base construction unit, a training data set generation unit, a sentence-pair semantic matching model construction unit and a sentence-pair semantic matching model training unit.

Description

Intelligent question-answer sentence-to-semantic matching method and device for government affair consultation service
Technical Field
The invention relates to the technical fields of artificial intelligence and natural language processing, and in particular to an intelligent question-answering sentence-pair semantic matching method and device for government affair consultation services.
Background
People need to understand the relevant policies and documents issued by governments when handling daily affairs. To serve the public better, governments at all levels set up service institutions for government affair consultation. As the demand for government consultation grows, the workload of these institutions also increases. Although the demand keeps rising, most consultations are repeated ones. Repeated consultations can be answered automatically by mining a historical consultation library, and an intelligent question-answering system has unique advantages for this task. As one of the core technologies of human-computer interaction, an intelligent question-answering system can automatically find, in a question-answer knowledge base, the standard question that matches the question raised by a user and push the answer of that standard question to the user, thereby greatly reducing the burden of manual answering. Intelligent question-answering systems are widely used in fields such as self-service and intelligent customer service. For the various questions raised by users, finding the matched standard question is the core technology of an intelligent question-answering system. The essence of this technology is to measure the matching degree between the question raised by the user and the standard questions in the question-answer knowledge base, which is essentially a sentence-pair semantic matching task.
The sentence-pair semantic matching task aims to measure whether the semantics implied by two sentences are consistent, which coincides with the core goal of many natural language processing tasks, such as an intelligent question-answering system for government consultation services. Computing the semantic matching of natural language sentences is a very challenging task, and existing methods cannot solve this problem perfectly.
In existing methods, when matching the semantics of a sentence pair, a specific neural network must be designed to encode the sentences semantically and extract the corresponding semantic features. For text semantic encoding, the most widely applied encoding models are the recurrent neural network and its variant structures. The recurrent neural network adopts a chain structure, so it can capture long-distance semantic features well and has strong advantages for processing text data. However, a single-layer network has limited representation capability; therefore, to improve the representation capability of the network and extract deeper and richer semantic features, the depth of the network is usually increased. However, the encoding result of each layer in a deep network structure is not entirely valid information, and two problems then arise: first, if only the output of the last layer is taken as the encoding result, semantic information is inevitably lost; second, if the outputs of all layers are simply combined directly, e.g., concatenated or added, the resulting encoding is inevitably noisy. Therefore, existing deep network structures still have the above non-negligible disadvantages for text semantic encoding.
Disclosure of Invention
The technical task of the invention is to provide an intelligent question-answering sentence-pair semantic matching method and device for government affair consultation services, which give full play to the advantages of the gating mechanism and the residual mechanism, capture more semantic context information and inter-sentence interaction information, and finally achieve intelligent semantic matching of sentence pairs through an attention mechanism and a gating mechanism.
The technical task of the invention is realized as follows. The intelligent question-answering sentence-pair semantic matching method for government affair consultation services constructs and trains a sentence-pair semantic matching model composed of a multi-granularity embedding module, a gated deep-feature residual fusion network module, a semantic feature interactive matching module and a label prediction module. The model produces a gated deep-feature residual fusion representation of sentence information, generates the final matching tensor of the sentence pair through an attention mechanism and a gating mechanism, and judges the matching degree of the sentence pair, thereby achieving intelligent semantic matching of the sentence pair. The specific steps are as follows:
the multi-granularity embedding module embeds the input sentences at character granularity and word granularity respectively to obtain the multi-granularity embedded representation of each sentence;
the gated deep-feature residual fusion network module encodes the multi-granularity embedded representation of each sentence to obtain its gated deep-feature residual fusion representation;
the semantic feature interactive matching module performs feature matching and feature screening on the gated deep-feature residual fusion representations of the sentence pair to obtain the matching tensor of the sentence pair;
and the label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval, compares this value, taken as the matching degree, with a preset threshold, and judges whether the semantics of the sentence pair match according to the comparison result.
Preferably, the multi-granularity embedding module is built by constructing a character mapping conversion table, a word mapping conversion table, an input module, a character vector mapping layer and a word vector mapping layer;
Constructing the character mapping conversion table: the character table is built from the sentence-pair semantic matching word-breaking processing knowledge base, which is obtained by performing the word-breaking operation (character-level segmentation) on the original texts of the sentence-pair semantic matching knowledge base. The mapping rule is: starting from the number 1, characters are numbered in ascending order according to the order in which each character is recorded in the character table, thereby forming the character mapping conversion table required by the invention. Afterwards, a character vector model is trained with Word2Vec to obtain the character vector matrix;
Constructing the word mapping conversion table: the word table is built from the sentence-pair semantic matching word-segmentation processing knowledge base, which is obtained by performing the word segmentation operation on the original texts of the sentence-pair semantic matching knowledge base. The mapping rule is: starting from the number 1, words are numbered in ascending order according to the order in which each word is entered into the word table, thereby forming the word mapping conversion table required by the invention. Afterwards, a word vector model is trained with Word2Vec to obtain the word vector matrix. The sentence-pair semantic matching word-breaking processing knowledge base and the sentence-pair semantic matching word-segmentation processing knowledge base are collectively called the sentence-pair semantic matching knowledge base;
Constructing the input module: the input layer contains four inputs. Each sentence pair in the training data set, or each sentence pair to be predicted, is preprocessed by word-breaking and by word segmentation to obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, where the suffixes char and word indicate character-level or word-level processing of the corresponding sentence. The input is formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word). Each character or word in the input sentences is converted into the corresponding numeric identifier according to the character mapping conversion table and the word mapping conversion table;
Constructing the character vector mapping layer: the character vector matrix obtained in the step of constructing the character mapping conversion table is loaded to initialize the weight parameters of the current layer. For the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained. Through character vector mapping, every sentence in the sentence-pair semantic matching word-breaking processing knowledge base can be converted into vector form.
Constructing the word vector mapping layer: the word vector matrix obtained in the step of constructing the word mapping conversion table is loaded to initialize the weight parameters of the current layer. For the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained. Through word vector mapping, every sentence in the sentence-pair semantic matching word-segmentation processing knowledge base can be converted into vector form.
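As an illustration only, the following minimal sketch shows how the four inputs and the two vector mapping layers described above could be assembled in Keras (the framework named in the optimization step). The vocabulary sizes, the sequence length and the randomly filled Word2Vec matrices are placeholders, and every variable name is illustrative rather than taken from the patent.

```python
import numpy as np
from tensorflow.keras import layers

# Illustrative sizes; real values come from the mapping conversion tables.
CHAR_VOCAB, WORD_VOCAB, EMB_DIM, MAX_LEN = 3000, 20000, 300, 40

# Placeholder Word2Vec matrices; row 0 is reserved for padding, rows 1..N follow
# the character/word mapping conversion tables.
char_w2v = np.random.normal(size=(CHAR_VOCAB + 1, EMB_DIM)).astype("float32")
word_w2v = np.random.normal(size=(WORD_VOCAB + 1, EMB_DIM)).astype("float32")

# Character and word vector mapping layers, initialized from the trained matrices.
char_embedding = layers.Embedding(CHAR_VOCAB + 1, EMB_DIM, weights=[char_w2v],
                                  name="char_vector_mapping")
word_embedding = layers.Embedding(WORD_VOCAB + 1, EMB_DIM, weights=[word_w2v],
                                  name="word_vector_mapping")

# The four inputs of the input module.
s1_char = layers.Input(shape=(MAX_LEN,), dtype="int32", name="sentence1_char")
s2_char = layers.Input(shape=(MAX_LEN,), dtype="int32", name="sentence2_char")
s1_word = layers.Input(shape=(MAX_LEN,), dtype="int32", name="sentence1_word")
s2_word = layers.Input(shape=(MAX_LEN,), dtype="int32", name="sentence2_word")

s1_char_embed = char_embedding(s1_char)   # sentence1_char_embed
s2_char_embed = char_embedding(s2_char)   # sentence2_char_embed
s1_word_embed = word_embedding(s1_word)   # sentence1_word_embed
s2_word_embed = word_embedding(s2_word)   # sentence2_word_embed
```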
Preferably, the gated deep-feature residual fusion network module is constructed as follows:
Firstly, the character embedded representation and the word embedded representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain the gated embedding fusion representation. Writing $E^{c}$ for sentence1_char_embed or sentence2_char_embed and $E^{w}$ for sentence1_word_embed or sentence2_word_embed, the formulas are as follows:

$$gate_{emb} = \sigma\left(W_{emb}^{c}\,E^{c} + W_{emb}^{w}\,E^{w}\right) \qquad (1.1)$$

$$\bar{E} = gate_{emb} \odot E^{c} + \left(1 - gate_{emb}\right) \odot E^{w} \qquad (1.2)$$

Formula (1.1) constructs the embedded-representation information selection gate, where $W_{emb}^{c}$ and $W_{emb}^{w}$ are weight matrices to be trained, $\sigma$ denotes the sigmoid function, and $gate_{emb}$ denotes the embedded-representation information selection gate. Formula (1.2) selectively fuses the character embedded representation and the word embedded representation through this gate, where $\odot$ denotes element-wise multiplication and $\bar{E}$ denotes the gated embedding fusion representation.
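A minimal sketch of the gated fusion of formulas (1.1)–(1.2), continuing the embedding sketch above. It assumes the character and word embeddings share the same dimensionality and that the gate is applied in the complementary form shown in formula (1.2); the layer is illustrative, not the patent's reference implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

class GatedFusion(layers.Layer):
    """Selective fusion of two representations: gate * a + (1 - gate) * b."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        # Two trainable weight matrices, one per input of the selection gate.
        self.dense_a = layers.Dense(units, use_bias=False)
        self.dense_b = layers.Dense(units, use_bias=False)

    def call(self, a, b):
        gate = tf.sigmoid(self.dense_a(a) + self.dense_b(b))   # formula (1.1)
        return gate * a + (1.0 - gate) * b                      # formula (1.2)

# Gated embedding fusion of the character and word embedded representations,
# with one gate shared by both sentences of the pair.
embed_gate = GatedFusion(EMB_DIM, name="gate_emb")
s1_fused_embed = embed_gate(s1_char_embed, s1_word_embed)
s2_fused_embed = embed_gate(s2_char_embed, s2_word_embed)
```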
Further, the first-layer coding structure BiLSTM$_1$ encodes the character embedded representation and the word embedded representation respectively to obtain the preliminary first-layer character encoding result and first-layer word encoding result. The first-layer character encoding result and the first-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedding fusion representation through another gating mechanism to obtain the gated first-layer feature residual fusion representation. The formulas are as follows:

$$\overline{C}^{1}_{i_c} = \mathrm{BiLSTM}_1\left(E^{c}, i_c\right) \qquad (2.1)$$

$$\overline{W}^{1}_{i_w} = \mathrm{BiLSTM}_1\left(E^{w}, i_w\right) \qquad (2.2)$$

$$gate_{1}^{*} = \sigma\left(W_{1}^{c}\,\overline{C}^{1} + W_{1}^{w}\,\overline{W}^{1}\right) \qquad (2.3)$$

$$\overline{H}^{1} = gate_{1}^{*} \odot \overline{C}^{1} + \left(1 - gate_{1}^{*}\right) \odot \overline{W}^{1} \qquad (2.4)$$

$$gate_{1} = \sigma\left(U_{1}^{e}\,\bar{E} + U_{1}^{h}\,\overline{H}^{1}\right) \qquad (2.5)$$

$$\overline{R}^{1} = gate_{1} \odot \bar{E} + \left(1 - gate_{1}\right) \odot \overline{H}^{1} \qquad (2.6)$$

Formula (2.1) encodes the character embedded representation with BiLSTM$_1$, where $E^{c}$ denotes sentence1_char_embed or sentence2_char_embed, $i_c$ denotes the relative position of the $i$-th character vector in the sentence, and $\overline{C}^{1}$ denotes the first-layer character encoding result. Formula (2.2) encodes the word embedded representation with BiLSTM$_1$, where $E^{w}$ denotes sentence1_word_embed or sentence2_word_embed, $i_w$ denotes the relative position of the $i$-th word vector in the sentence, and $\overline{W}^{1}$ denotes the first-layer word encoding result. Formula (2.3) constructs the first-layer encoding result selection gate, where $W_{1}^{c}$ and $W_{1}^{w}$ are weight matrices to be trained, $\sigma$ denotes the sigmoid function, and $gate_{1}^{*}$ denotes the first-layer encoding result selection gate. Formula (2.4) selectively fuses the first-layer character encoding result and the first-layer word encoding result through this gate, where $\odot$ denotes element-wise multiplication and $\overline{H}^{1}$ denotes the gated first-layer encoding result fusion representation. Formula (2.5) constructs the first-layer feature residual selection gate, where $U_{1}^{e}$ and $U_{1}^{h}$ are weight matrices to be trained, $\bar{E}$ is the gated embedding fusion representation output by formula (1.2), and $gate_{1}$ denotes the first-layer feature residual selection gate. Formula (2.6) selectively fuses the gated embedding fusion representation and the gated first-layer encoding result fusion representation through the first-layer feature residual selection gate, where $\overline{R}^{1}$ denotes the gated first-layer feature residual fusion representation.
Further, the first-layer character encoding result and the first-layer word encoding result are passed to the second-layer coding structure BiLSTM$_2$. BiLSTM$_2$ encodes the first-layer character encoding result and the first-layer word encoding result respectively to obtain the second-layer character encoding result and the second-layer word encoding result. The second-layer character encoding result and the second-layer word encoding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through another gating mechanism to obtain the gated second-layer feature residual fusion representation. The formulas are as follows:

$$\overline{C}^{2}_{i_c} = \mathrm{BiLSTM}_2\left(\overline{C}^{1}, i_c\right) \qquad (3.1)$$

$$\overline{W}^{2}_{i_w} = \mathrm{BiLSTM}_2\left(\overline{W}^{1}, i_w\right) \qquad (3.2)$$

$$gate_{2}^{*} = \sigma\left(W_{2}^{c}\,\overline{C}^{2} + W_{2}^{w}\,\overline{W}^{2}\right) \qquad (3.3)$$

$$\overline{H}^{2} = gate_{2}^{*} \odot \overline{C}^{2} + \left(1 - gate_{2}^{*}\right) \odot \overline{W}^{2} \qquad (3.4)$$

$$gate_{2} = \sigma\left(U_{2}^{r}\,\overline{R}^{1} + U_{2}^{h}\,\overline{H}^{2}\right) \qquad (3.5)$$

$$\overline{R}^{2} = gate_{2} \odot \overline{R}^{1} + \left(1 - gate_{2}\right) \odot \overline{H}^{2} \qquad (3.6)$$

Formula (3.1) encodes the first-layer character encoding result with BiLSTM$_2$, where $\overline{C}^{1}$ denotes the first-layer character encoding result, $i_c$ denotes the $i$-th time step, and $\overline{C}^{2}$ denotes the second-layer character encoding result. Formula (3.2) encodes the first-layer word encoding result with BiLSTM$_2$, where $\overline{W}^{1}$ denotes the first-layer word encoding result, $i_w$ denotes the $i$-th time step, and $\overline{W}^{2}$ denotes the second-layer word encoding result. Formula (3.3) constructs the second-layer encoding result selection gate, where $W_{2}^{c}$ and $W_{2}^{w}$ are weight matrices to be trained, $\sigma$ denotes the sigmoid function, and $gate_{2}^{*}$ denotes the second-layer encoding result selection gate. Formula (3.4) selectively fuses the second-layer character encoding result and the second-layer word encoding result through this gate, where $\odot$ denotes element-wise multiplication and $\overline{H}^{2}$ denotes the gated second-layer encoding result fusion representation. Formula (3.5) constructs the second-layer feature residual selection gate, where $U_{2}^{r}$ and $U_{2}^{h}$ are weight matrices to be trained, $\overline{R}^{1}$ is the gated first-layer feature residual fusion representation output by formula (2.6), and $gate_{2}$ denotes the second-layer feature residual selection gate. Formula (3.6) selectively fuses the gated first-layer feature residual fusion representation and the gated second-layer encoding result fusion representation through the second-layer feature residual selection gate, where $\overline{R}^{2}$ denotes the gated second-layer feature residual fusion representation.
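The loop below sketches how the per-layer pattern just described generalizes to a preset depth: each level re-encodes the previous character and word results with a new BiLSTM, fuses them with one gate, and fuses that result with the previous level's representation with a second, residual-style gate, anticipating the depth-layer formulas given in the next paragraphs. It reuses the GatedFusion layer from the earlier sketch, assumes 2*HIDDEN equals the embedding dimension so the residual operands are compatible, and is a schematic reading rather than the patent's reference implementation.

```python
from tensorflow.keras import layers

HIDDEN, DEPTH = 150, 3   # illustrative: 2 * HIDDEN == EMB_DIM; DEPTH is the preset level depth

def gated_residual_encoder(char_embed, word_embed, fused_embed):
    """Gated deep-feature residual fusion network for one sentence (sketch)."""
    char_in, word_in, residual = char_embed, word_embed, fused_embed
    for _ in range(DEPTH):
        # BiLSTM_k re-encodes the previous layer's character and word results.
        char_out = layers.Bidirectional(layers.LSTM(HIDDEN, return_sequences=True))(char_in)
        word_out = layers.Bidirectional(layers.LSTM(HIDDEN, return_sequences=True))(word_in)
        # First gate: selectively fuse this layer's character and word encoding results.
        layer_fused = GatedFusion(2 * HIDDEN)(char_out, word_out)
        # Second gate: residual-style fusion with the previous level's representation.
        residual = GatedFusion(2 * HIDDEN)(layer_fused, residual)
        char_in, word_in = char_out, word_out
    return residual   # gated deep-feature residual fusion representation

s1_repr = gated_residual_encoder(s1_char_embed, s1_word_embed, s1_fused_embed)
s2_repr = gated_residual_encoder(s2_char_embed, s2_word_embed, s2_fused_embed)
```

In this sketch each sentence gets its own encoder weights; a shared (siamese) encoder for both sentences would be an equally plausible design choice.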
Further, the second-layer character encoding result and the second-layer word encoding result are passed to the third-layer coding structure BiLSTM$_3$; by analogy, repeated encoding produces multi-level gated feature residual fusion representations, until the final gated deep-feature residual fusion representation is generated according to the preset level depth of the model. For the depth-th layer, the formulas are as follows:

$$\overline{C}^{depth}_{i_c} = \mathrm{BiLSTM}_{depth}\left(\overline{C}^{depth-1}, i_c\right) \qquad (4.1)$$

$$\overline{W}^{depth}_{i_w} = \mathrm{BiLSTM}_{depth}\left(\overline{W}^{depth-1}, i_w\right) \qquad (4.2)$$

$$gate_{depth}^{*} = \sigma\left(W_{depth}^{c}\,\overline{C}^{depth} + W_{depth}^{w}\,\overline{W}^{depth}\right) \qquad (4.3)$$

$$\overline{H}^{depth} = gate_{depth}^{*} \odot \overline{C}^{depth} + \left(1 - gate_{depth}^{*}\right) \odot \overline{W}^{depth} \qquad (4.4)$$

$$gate_{depth} = \sigma\left(U_{depth}^{r}\,\overline{R}^{depth-1} + U_{depth}^{h}\,\overline{H}^{depth}\right) \qquad (4.5)$$

$$\overline{R}^{depth} = gate_{depth} \odot \overline{R}^{depth-1} + \left(1 - gate_{depth}\right) \odot \overline{H}^{depth} \qquad (4.6)$$

Formula (4.1) encodes the (depth-1)-th layer character encoding result with BiLSTM$_{depth}$, where $\overline{C}^{depth-1}$ denotes the (depth-1)-th layer character encoding result, $i_c$ denotes the $i$-th time step, and $\overline{C}^{depth}$ denotes the depth-th layer character encoding result. Formula (4.2) encodes the (depth-1)-th layer word encoding result with BiLSTM$_{depth}$, where $\overline{W}^{depth-1}$ denotes the (depth-1)-th layer word encoding result, $i_w$ denotes the $i$-th time step, and $\overline{W}^{depth}$ denotes the depth-th layer word encoding result. Formula (4.3) constructs the depth-th layer encoding result selection gate, where $W_{depth}^{c}$ and $W_{depth}^{w}$ are weight matrices to be trained, $\sigma$ denotes the sigmoid function, and $gate_{depth}^{*}$ denotes the depth-th layer encoding result selection gate. Formula (4.4) selectively fuses the depth-th layer character encoding result and the depth-th layer word encoding result through this gate, where $\odot$ denotes element-wise multiplication and $\overline{H}^{depth}$ denotes the gated depth-th layer encoding result fusion representation. Formula (4.5) constructs the depth-th layer feature residual selection gate, where $U_{depth}^{r}$ and $U_{depth}^{h}$ are weight matrices to be trained, $\overline{R}^{depth-1}$ denotes the gated (depth-1)-th layer feature residual fusion representation, and $gate_{depth}$ denotes the depth-th layer feature residual selection gate. Formula (4.6) selectively fuses the gated (depth-1)-th layer feature residual fusion representation and the gated depth-th layer encoding result fusion representation through the depth-th layer feature residual selection gate, where $\overline{R}^{depth}$ denotes the gated depth-th layer feature residual fusion representation, i.e., the gated deep-feature residual fusion representation.
Preferably, the semantic feature interactive matching module is constructed as follows:
This layer receives the gated deep-feature residual fusion representations output by the gated deep-feature residual fusion network module as input, and performs semantic feature matching and semantic feature screening on them in three steps, so as to generate the final sentence-pair semantic matching tensor. The specific operations are as follows:

Firstly, the interactive matching process between the sentence pair is completed by applying an attention mechanism to obtain the preliminary matching tensor of each sentence. Taking the matching of sentence2 to sentence1 as an example, and writing $r^{1}_{i}$ for the $i$-th component of the gated deep-feature residual fusion representation of sentence1 and $r^{2}_{j}$ for the $j$-th component of that of sentence2, the formulas are as follows:

$$\tilde{r}^{1}_{i} = W_{a}^{1} \odot r^{1}_{i}, \qquad \tilde{r}^{2}_{j} = W_{a}^{2} \odot r^{2}_{j} \qquad (5.1)$$

$$\alpha_{ij} = \frac{\exp\left(\tilde{r}^{1}_{i} \cdot \tilde{r}^{2}_{j}\right)}{\sum_{k}\exp\left(\tilde{r}^{1}_{i} \cdot \tilde{r}^{2}_{k}\right)} \qquad (5.2)$$

$$m^{1}_{i} = \sum_{j}\alpha_{ij}\, r^{2}_{j} \qquad (5.3)$$

Formula (5.1) maps the gated deep-feature residual fusion representations of the two sentences, where $W_{a}^{1}$ and $W_{a}^{2}$ are weights to be trained and $\odot$ denotes element-wise multiplication. Formula (5.2) computes the attention weights. Formula (5.3) completes the interactive matching process using the attention weights, where $m^{1}$ denotes the result of matching sentence2 to sentence1, i.e., the preliminary matching tensor of sentence1. Similarly, matching sentence1 to sentence2 yields the preliminary matching tensor $m^{2}$ of sentence2.
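A sketch of the attention-based interactive matching of formulas (5.1)–(5.3): element-wise mapping weights, softmax attention over dot-product scores, and a weighted sum over the other sentence's representation. The exact parameterization is an assumption consistent with the reconstruction above, not the patent's definitive form.

```python
import tensorflow as tf
from tensorflow.keras import layers

class InteractiveMatching(layers.Layer):
    """Aligns the second sentence's features to the first with attention."""
    def __init__(self, dim, **kwargs):
        super().__init__(**kwargs)
        # Element-wise mapping weights of formula (5.1).
        self.w1 = self.add_weight(name="w1", shape=(dim,), initializer="ones")
        self.w2 = self.add_weight(name="w2", shape=(dim,), initializer="ones")

    def call(self, repr1, repr2):
        p1 = self.w1 * repr1                            # formula (5.1)
        p2 = self.w2 * repr2
        scores = tf.matmul(p1, p2, transpose_b=True)    # alignment scores
        alpha = tf.nn.softmax(scores, axis=-1)          # formula (5.2)
        return tf.matmul(alpha, repr2)                  # formula (5.3)

attend = InteractiveMatching(2 * 150)   # matches the encoder width of the sketch above
m1 = attend(s1_repr, s2_repr)   # sentence2 matched to sentence1 (preliminary matching tensor)
m2 = attend(s2_repr, s1_repr)   # sentence1 matched to sentence2
```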
Secondly, a feature screening operation is performed on the preliminary matching tensors with a gating mechanism to obtain the sentence matching tensors. The formulas are as follows:

$$gate_{m}^{1} = \sigma\left(W_{m}\, m^{1} + b_{m}\right) \qquad (6.5)$$

$$\hat{m}^{1} = gate_{m}^{1} \odot m^{1} \qquad (6.6)$$

Formula (6.5) constructs the matching tensor gate, where $m^{1}$ denotes the result of matching sentence2 to sentence1, and $W_{m}$ and $b_{m}$ are the weights to be trained. Formula (6.6) performs feature screening on the preliminary matching tensor with the matching tensor gate, where $\odot$ denotes element-wise multiplication and $\hat{m}^{1}$ denotes the sentence1 matching tensor. Similarly, processing the result of matching sentence1 to sentence2 yields the sentence2 matching tensor $\hat{m}^{2}$.
Thirdly, the two sentence matching tensors are concatenated to obtain the sentence-pair matching tensor:

$$M = \left[\hat{m}^{1};\ \hat{m}^{2}\right]$$

where $M$ denotes the sentence-pair matching tensor.
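A sketch of the matching tensor gates, the feature screening of formulas (6.5)–(6.6), and the concatenation that yields the sentence-pair matching tensor; realizing the gate as a sigmoid Dense layer over the preliminary matching tensor is one straightforward, assumed realization.

```python
from tensorflow.keras import layers

DIM = 2 * 150   # width of the matching tensors in this sketch

# Matching tensor gates (formula 6.5) and feature screening (formula 6.6).
m1_screened = layers.Multiply()([layers.Dense(DIM, activation="sigmoid")(m1), m1])
m2_screened = layers.Multiply()([layers.Dense(DIM, activation="sigmoid")(m2), m2])

# Concatenation of the two screened tensors into the sentence-pair matching tensor.
pair_matching_tensor = layers.Concatenate(axis=-1)([m1_screened, m2_screened])
```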
Preferably, the label prediction module is constructed as follows:
The sentence-pair semantic matching tensor serves as the input of this module and is processed by a fully-connected layer whose output dimension is 1 and whose activation function is sigmoid, yielding a value in [0, 1] that is recorded as the matching degree y_pred. Finally, y_pred is compared with the preset threshold (0.5) to judge whether the semantics of the sentence pair match: when y_pred ≥ 0.5, the sentence pair is predicted to be semantically matched; otherwise, it is not matched. When the sentence-pair semantic matching model has not been sufficiently trained, it must be trained on the training data set to optimize the model parameters; when training is finished, the label prediction module can predict whether the semantics of a target sentence pair match.
Preferably, the sentence-pair semantic matching knowledge base is constructed as follows:
downloading a data set from the network to obtain the original data: a publicly available sentence-pair semantic matching data set, or a manually constructed data set, is downloaded and used as the original data for constructing the sentence-pair semantic matching knowledge base;
preprocessing the original data: the original data used for constructing the sentence-pair semantic matching knowledge base are preprocessed, and the word-breaking operation and the word segmentation operation are performed on each sentence to obtain the sentence-pair semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base;
summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word-breaking processing knowledge base and a sentence-to-semantic matching word-segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base.
The sentence-pair semantic matching model is obtained by training on a training data set, and the training data set is constructed as follows:
constructing training positive examples: in the sentence-pair semantic matching knowledge base, two sentences whose semantics are consistent are combined into a positive example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1); where sentence1_char and sentence2_char refer to sentence1 and sentence2 in the sentence-pair semantic matching word-breaking processing knowledge base, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the sentence-pair semantic matching word-segmentation processing knowledge base, and the label 1 indicates that the semantics of the two sentences match, i.e., a positive example;
constructing training negative examples: a sentence s1 is selected, a sentence s2 that does not match s1 is randomly selected from the sentence-pair semantic matching knowledge base, and s1 and s2 are combined to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0); where sentence1_char and sentence1_word refer to sentence s1 in the word-breaking processing knowledge base and the word-segmentation processing knowledge base respectively, sentence2_char and sentence2_word refer to sentence s2 in the same two knowledge bases respectively, and the label 0 indicates that the semantics of sentence s1 and sentence s2 do not match, i.e., a negative example;
constructing the training data set: all positive example sentence pairs and negative example sentence pairs obtained by the above two operations are combined and their order is shuffled to construct the final training data set; both positive example data and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1;
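A minimal sketch of assembling the training data set from positive and negative examples as just described. Here `knowledge_base` is assumed to be a list of semantically consistent (sentence1, sentence2) pairs, and the `break_chars`/`segment_words` arguments are hypothetical word-breaking and word-segmentation helpers (one possible form of them is sketched later in the embodiment).

```python
import random

def build_training_set(knowledge_base, break_chars, segment_words, seed=42):
    """Returns shuffled examples of the form (s1_char, s2_char, s1_word, s2_word, label)."""
    rng = random.Random(seed)
    all_sentences = [s for pair in knowledge_base for s in pair]
    examples = []
    for s1, s2 in knowledge_base:
        # Positive example: a semantically consistent pair, labeled 1.
        examples.append((break_chars(s1), break_chars(s2),
                         segment_words(s1), segment_words(s2), 1))
        # Negative example: s1 combined with a randomly chosen non-matching sentence, labeled 0.
        s_neg = rng.choice(all_sentences)
        while s_neg in (s1, s2):
            s_neg = rng.choice(all_sentences)
        examples.append((break_chars(s1), break_chars(s_neg),
                         segment_words(s1), segment_words(s_neg), 0))
    rng.shuffle(examples)
    return examples
```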
after the sentence-to-semantic matching model is built, training and optimizing the sentence-to-semantic matching model through a training data set are carried out, which specifically comprises the following steps:
constructing a loss function: as can be seen from the tag prediction module construction process,ypredis a matching degree calculation value y obtained by processing a sentence to a semantic matching modeltrueThe semantic matching method is a real label for judging whether the semantics of two sentences are matched, the value of the label is limited to 0 or 1, cross entropy is used as a loss function, and the formula is as follows:
Figure BDA0002646257970000081
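A sketch of the label prediction head and the training configuration, combining the cross-entropy loss above with the RMSProp settings described in the optimization step that follows (learning rate 0.0015, remaining hyper-parameters at Keras defaults). Pooling the sentence-pair matching tensor before the single sigmoid unit is an assumption made here so that one matching degree is produced per sentence pair; the tensors and inputs are those of the earlier sketches, and the training arrays are placeholders.

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.optimizers import RMSprop

# Label prediction module: reduce the sequence dimension, then one sigmoid unit.
pooled = layers.GlobalMaxPooling1D()(pair_matching_tensor)
y_pred = layers.Dense(1, activation="sigmoid", name="matching_degree")(pooled)

model = Model(inputs=[s1_char, s2_char, s1_word, s2_word], outputs=y_pred)
model.compile(optimizer=RMSprop(learning_rate=0.0015),
              loss="binary_crossentropy",   # the cross-entropy loss above
              metrics=["accuracy"])

# x_* are integer-id matrices produced by the mapping conversion tables; y holds the 0/1 labels.
# model.fit([x1_char, x2_char, x1_word, x2_word], y, batch_size=64, epochs=10)
# At prediction time, a sentence pair is judged matched when model.predict(...) >= 0.5.
```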
optimizing the training model: RMSProp is used as the optimization algorithm; except for the learning rate, which is set to 0.0015, the remaining hyper-parameters of RMSProp keep the default settings in Keras; the sentence-pair semantic matching model is optimized and trained on the training data set.
An intelligent question-answering sentence-pair semantic matching device for government affair consultation services comprises:
the sentence-pair semantic matching knowledge base construction unit, which is used for acquiring a large amount of sentence pair data and preprocessing it, so as to obtain a sentence-pair semantic matching knowledge base that meets the training requirements;
the training data set generation unit, which is used for constructing positive example data and negative example data for training according to the sentences in the sentence-pair semantic matching knowledge base, and constructing the final training data set based on them;
the sentence-pair semantic matching model construction unit, which is used for constructing the character mapping conversion table, the word mapping conversion table, the input module, the character vector mapping layer, the word vector mapping layer, the gated deep-feature residual fusion network module, the semantic feature interactive matching module and the label prediction module; the sentence-pair semantic matching model construction unit includes,
the character mapping conversion table and word mapping conversion table construction unit, which is responsible for segmenting each sentence in the sentence-pair semantic matching knowledge base at character granularity or word granularity, storing each character or word in a list in turn to obtain a character table or word table, and numbering the characters or words in ascending order starting from 1 according to the order in which they are recorded, thereby forming the character mapping conversion table or word mapping conversion table required by the invention; after the table is constructed, each character or word in it is mapped to a unique numeric identifier; afterwards, a character vector model or word vector model is trained with Word2Vec to obtain the character vector matrix or word vector matrix;
the input module construction unit, which is responsible for preprocessing each sentence pair in the training data set or each sentence pair to be predicted, obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
the character vector mapping layer and word vector mapping layer construction unit, which is responsible for loading the character vector matrix or word vector matrix obtained by the mapping conversion table construction unit to initialize the weight parameters of the current layer; for character vector mapping, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained for the input sentences sentence1_char and sentence2_char; for word vector mapping, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained for the input sentences sentence1_word and sentence2_word;
the gated deep-feature residual fusion network module construction unit, which is used to capture and screen the semantic features of sentences; specifically, it receives the character embedded representation output by the character vector mapping layer and the word embedded representation output by the word vector mapping layer as input; the character embedded representation and the word embedded representation are selectively fused through a gating mechanism to obtain the gated embedding fusion representation, while the character and word embedded representations before fusion are passed to the first-layer coding structure; the first-layer coding structure encodes the character embedded representation and the word embedded representation respectively to obtain the first-layer character encoding result and the first-layer word encoding result; these are selectively fused through a gating mechanism, the fusion result is then selectively fused with the gated embedding fusion representation through another gating mechanism to obtain the gated first-layer feature residual fusion representation, and the first-layer character and word encoding results before fusion are passed to the second-layer coding structure; the second-layer coding structure encodes the first-layer character encoding result and the first-layer word encoding result respectively to obtain the second-layer character encoding result and the second-layer word encoding result; these are selectively fused through a gating mechanism, the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through another gating mechanism to obtain the gated second-layer feature residual fusion representation, and the second-layer character and word encoding results before fusion are passed to the third-layer coding structure; by analogy, repeated encoding produces multi-level gated feature residual fusion representations, and the final gated deep-feature residual fusion representation is generated according to the preset level depth of the model;
the semantic feature interactive matching module construction unit, which is responsible for further processing the gated deep-feature residual fusion representation of each sentence and performing semantic feature interactive matching and semantic feature screening operations on it, so as to generate the final sentence-pair semantic matching tensor;
the label prediction module construction unit, which is responsible for processing the sentence-pair semantic matching tensor to obtain a matching degree value that is compared with a preset threshold to judge whether the semantics of the sentence pair match;
and the sentence-to-semantic matching model training unit is used for constructing a loss function required in the model training process and finishing the optimization training of the model.
Preferably, the sentence-to-semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manually constructed data set which is already disclosed on a network, and the sentence pair data set is used as original data for constructing a sentence pair semantic matching knowledge base;
the system comprises an original data word-breaking preprocessing or word-segmentation preprocessing unit, a word-breaking or word-segmentation preprocessing unit and a word-segmentation processing unit, wherein the original data word-breaking preprocessing or word-segmentation preprocessing unit is responsible for preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word-breaking or word-segmentation operation on each sentence in the original data word-breaking or word-segmentation preprocessing unit so as to construct a sentence-to-semantic matching word-breaking processing knowledge base or a sentence-;
and the sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base so as to construct the sentence-to-semantic matching knowledge base.
The training data set generation unit comprises,
the training positive case data construction unit is responsible for constructing two sentences with consistent semantics in the sentence-to-semantic matching knowledge base and the matching labels 1 thereof into training positive case data;
the training negative case data construction unit is responsible for selecting one sentence, randomly selecting a sentence which is not matched with the sentence for combination, and constructing the sentence and the matched label 0 into negative case data;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disordering the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
and the model optimization unit is responsible for training and adjusting parameters in model training to reduce prediction errors.
A storage medium having stored therein a plurality of instructions, the instructions being loaded by a processor to perform the steps of the above intelligent question-answering sentence-pair semantic matching method for government affair consultation services.
An electronic device, the electronic device comprising: the storage medium described above; and a processor for executing instructions in the storage medium.
The intelligent question-answering sentence-pair semantic matching method and device for government affair consultation services have the following advantages:
Firstly, through the gated deep-feature residual fusion network structure, the encoding results output by every layer of the network can be effectively screened, so the extracted semantic features are more accurate; deeper semantic features can be captured, and the depth can be controlled freely, so the structure can flexibly adapt to different data sets and has better generality;
Secondly, interactive matching through the attention mechanism effectively aligns the information between the two sentences, making the matching process more reliable; the generated sentence-pair matching tensor contains rich interaction features, which improves the accuracy of sentence-pair semantic matching;
Thirdly, processing the encoding results of the sentence pair through the gating mechanism effectively screens the interaction features between the sentences and improves the prediction accuracy of the model.
drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of the intelligent question-answering sentence-pair semantic matching method for government affair consultation services;
FIG. 2 is a flow chart of building the sentence-pair semantic matching knowledge base;
FIG. 3 is a flow chart of constructing the training data set;
FIG. 4 is a flow chart of constructing the sentence-pair semantic matching model;
FIG. 5 is a flow chart of training the sentence-pair semantic matching model;
FIG. 6 is a schematic structural diagram of the intelligent question-answering sentence-pair semantic matching device for government affair consultation services;
FIG. 7 is a schematic structural diagram of the gated deep-feature residual fusion network;
FIG. 8 is a framework diagram of the intelligent question-answering sentence-pair semantic matching model for government affair consultation services.
The specific implementation mode is as follows:
the intelligent question-answer sentence pair semantic matching method and device for the government consulting service according to the invention are described in detail below with reference to the drawings and the specific embodiments of the specification.
Example 1:
as shown in fig. 8, the main framework of the invention comprises a multi-granularity embedding module, a gated deep-feature residual fusion network module, a semantic feature interactive matching module and a label prediction module. The multi-granularity embedding module embeds the input sentences at character granularity and word granularity respectively and passes the result to the gated deep-feature residual fusion network module of the model. The gated deep-feature residual fusion network module comprises several layers of coding structures, as shown in fig. 7. The character embedded representation and the word embedded representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain the gated embedding fusion representation, while the character and word embedded representations before fusion are passed to the first-layer coding structure. The first-layer coding structure encodes the character embedded representation and the word embedded representation respectively to obtain the first-layer character encoding result and the first-layer word encoding result; these are selectively fused through a gating mechanism, the fusion result is then selectively fused with the gated embedding fusion representation through another gating mechanism to obtain the gated first-layer feature residual fusion representation, and the first-layer character and word encoding results before fusion are passed to the second-layer coding structure. The second-layer coding structure encodes the first-layer character encoding result and the first-layer word encoding result respectively to obtain the second-layer character encoding result and the second-layer word encoding result; these are selectively fused through a gating mechanism, the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through another gating mechanism to obtain the gated second-layer feature residual fusion representation, and the second-layer character and word encoding results before fusion are passed to the third-layer coding structure. By analogy, repeated encoding produces multi-level gated feature residual fusion representations, and the final gated deep-feature residual fusion representation is generated according to the preset level depth of the model; this representation is passed to the semantic feature interactive matching module of the model. The semantic feature interactive matching module performs semantic feature matching and feature screening on the gated deep-feature residual fusion representations; semantic feature matching is completed by an attention mechanism, and the feature screening operation is realized through a gating mechanism; finally, the matching tensor of the sentence pair is obtained and passed to the label prediction module of the model.
The label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval; this value, taken as the matching degree, is compared with a preset threshold, and whether the semantics of the sentence pair match is judged according to the comparison result. The specific steps are as follows:
(1) the multi-granularity embedding module embeds the input sentences at character granularity and word granularity respectively to obtain the multi-granularity embedded representation of each sentence;
(2) the gated deep-feature residual fusion network module encodes the multi-granularity embedded representation of each sentence to obtain its gated deep-feature residual fusion representation;
(3) the semantic feature interactive matching module performs feature matching and feature screening on the gated deep-feature residual fusion representations of the sentence pair to obtain the matching tensor of the sentence pair;
(4) the label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval, compares this value, taken as the matching degree, with a preset threshold, and judges whether the semantics of the sentence pair match according to the comparison result.
Example 2:
as shown in fig. 1, the intelligent question-answer sentence pair semantic matching method for government consulting services of the present invention specifically comprises the following steps:
s1, constructing a sentence-to-semantic matching knowledge base, as shown in the attached figure 2, and specifically comprising the following steps:
s101, downloading a data set on a network to obtain original data: and downloading a sentence-to-semantic matching data set or a manually constructed data set which is already disclosed on the network, and taking the sentence-to-semantic matching data set or the manually constructed data set as original data for constructing a sentence-to-semantic matching knowledge base.
For example: according to the historical records of government affair consultation, the question sentence pairs contained in them can be collected and a government affair consultation question sentence-pair data set can be constructed manually, yielding the original data for building the sentence-pair semantic matching knowledge base. The example sentences are represented as follows:
sentence1: How should a college graduate's archive be deposited in the talent market?
sentence2: What is the procedure for depositing a university graduate's archive?
S102, preprocessing the original data: the original data used for constructing the sentence-pair semantic matching knowledge base are preprocessed, and the word-breaking operation and the word segmentation operation are performed on each sentence to obtain the sentence-pair semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base.
Each sentence in the original data obtained in step S101 is preprocessed by word-breaking and by word segmentation to obtain the sentence-pair semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base. The word-breaking operation takes each character of a Chinese sentence as a unit and separates the characters of each sentence with spaces. The word segmentation operation segments each sentence with the Jieba word segmentation tool in its default accurate mode. In both operations, all punctuation, special characters and stop words in the sentences are preserved in order to avoid the loss of semantic information.
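A minimal sketch of the two preprocessing operations, assuming the jieba package is available; the Chinese sentence in the usage example is a back-translation of sentence1 and is used only for illustration.

```python
import jieba

def break_chars(sentence: str) -> str:
    """Word-breaking: split the sentence character by character, separated by spaces."""
    return " ".join(list(sentence))

def segment_words(sentence: str) -> str:
    """Word segmentation: split the sentence with jieba's default accurate mode."""
    return " ".join(jieba.cut(sentence, cut_all=False))

print(break_chars("高校毕业生档案如何存放在人才市场?"))
print(segment_words("高校毕业生档案如何存放在人才市场?"))
```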
For example: taking sentence1 shown in S101 as an example, the word-breaking operation splits it into individual characters separated by spaces, and the Jieba word segmentation tool splits the same sentence into words separated by spaces.
S103, summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word-breaking processing knowledge base and a sentence-to-semantic matching word-segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base.
And summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base obtained in the step S102 to the same folder, so as to obtain the sentence-to-semantic matching knowledge base. The flow is shown in fig. 2. It should be noted here that the data processed by the word-breaking operation and the data processed by the word-segmentation operation are not merged into the same file, i.e., the sentence-to-semantic matching knowledge base actually contains two independent sub-knowledge bases. Each preprocessed sentence retains the ID information of its original sentence.
S2, constructing a training data set of the sentence-to-semantic matching model: for each sentence pair in the sentence pair semantic matching knowledge base, if the semantics are consistent, the sentence pair can be used for constructing a training positive example; if the semantics are inconsistent, the sentence pair can be used for constructing a training negative example; mixing a certain amount of positive example data and negative example data to construct a model training data set; as shown in fig. 3, the specific steps are as follows:
S201, constructing a training positive example: in the sentence-pair semantic matching knowledge base, a sentence pair whose two sentences have consistent semantics is constructed into a positive example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1);
By way of example: after sentence1 and sentence2 shown in step S101 are processed by the word-breaking operation of step S102 and the word segmentation operation of step S103, the positive example data are constructed in the following form:
(sentence1 in word-broken form, sentence2 in word-broken form, sentence1 in word-segmented form, sentence2 in word-segmented form, 1).
S202, constructing a training negative example: a sentence s1 is selected, a sentence s2 that does not match s1 is randomly selected from the sentence-pair semantic matching knowledge base, and s1 and s2 are combined to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0);
By way of example: taking the pair "sentence1: How are college graduate archives deposited in the talent market? sentence2: Where is marriage registration handled?" as an example, after the word-breaking operation of step S102 and the word segmentation operation of step S103, the negative example data are constructed in the following form:
(sentence1 in word-broken form, sentence2 in word-broken form, sentence1 in word-segmented form, sentence2 in word-segmented form, 0).
S203, constructing a training data set: all positive example sentence pair data and negative example sentence pair data obtained after the operations of step S201 and step S202 are combined together and their order is shuffled, so as to construct the final training data set. Both positive example data and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1.
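As an illustration only (a minimal sketch under the assumption that each example is stored as a five-element tuple; the names are not from the original):
import random

def build_training_set(positive_examples, negative_examples):
    # Each element: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, label)
    data = positive_examples + negative_examples
    random.shuffle(data)  # disturb the order of positive and negative examples
    return data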
S3, constructing the sentence-pair semantic matching model: this mainly comprises constructing a character mapping conversion table, a word mapping conversion table, an input module, a character vector mapping layer, a word vector mapping layer, a gated deep-layer feature residual fusion network module, a semantic feature interactive matching module and a label prediction module. The character mapping conversion table, the word mapping conversion table, the input module, the character vector mapping layer and the word vector mapping layer together correspond to the multi-granularity embedding module in fig. 8; the remaining parts correspond one by one to the modules in fig. 8. The specific steps are as follows:
S301, constructing the character mapping conversion table: the character table is constructed from the sentence-pair semantic matching word-breaking processing knowledge base obtained in step S102. After the character table is constructed, each character in the table is mapped to a unique numeric identifier; the mapping rule is: starting with the number 1, the characters are ordered in ascending order according to the order in which each character is entered into the character table, thereby forming the character mapping conversion table required by the present invention.
By way of example: with the word-broken content processed in step S102 ("How are college graduate archives deposited in the talent market?"), the character table and character mapping conversion table are constructed as follows:
Character: 高  校  毕  业  生  档  案  如  何  存  放  在  人  才  市  场  ?
Mapping:   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
(i.e., the characters of the example sentence "高校毕业生档案如何存放在人才市场?" in order of first appearance.)
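A minimal sketch of the mapping rule described above (illustrative only; variable names are assumptions; the input sentences are assumed to be the blank-space-separated sentences of the word-breaking processing knowledge base):
def build_mapping(knowledge_base_sentences):
    # Map each newly seen token to the next integer identifier, starting from 1.
    mapping = {}
    for sentence in knowledge_base_sentences:
        for token in sentence.split():
            if token not in mapping:
                mapping[token] = len(mapping) + 1
    return mapping
The same rule applies to the word table of step S302, with the word-segmented sentences as input.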
Then, the present invention trains a character vector model using Word2Vec to obtain the character vector matrix char_embedding_matrix.
By way of example, in Keras this can be implemented as follows:
# The tokenizer is assumed to have been fitted on the word-breaking corpus beforehand.
tokenizer = keras.preprocessing.text.Tokenizer(num_words=len(char_set))
w2v_model_char = gensim.models.Word2Vec(w2v_corpus_char, size=char_embedding_dim, window=5, min_count=1, sg=1, workers=4, seed=1234, iter=25)
char_embedding_matrix = numpy.zeros([len(tokenizer.char_index) + 1, char_embedding_dim])
for char, idx in tokenizer.char_index.items():
    char_embedding_matrix[idx, :] = w2v_model_char.wv[char]
wherein w2v_corpus_char is the word-breaking training corpus, namely all data in the sentence-pair semantic matching word-breaking processing knowledge base; char_embedding_dim is the character vector dimension, which the model sets to 400; and char_set is the character table.
S302, constructing the word mapping conversion table: the word table is constructed from the sentence-pair semantic matching word segmentation processing knowledge base obtained in step S103. After the word table is constructed, each word in the table is mapped to a unique numeric identifier; the mapping rule is: starting with the number 1, the words are ordered in ascending order according to the order in which each word is entered into the word table, thereby forming the word mapping conversion table required by the present invention.
By way of example: with the word-segmented content processed in step S103 ("How are college graduate archives deposited in the talent market?"), the word table and word mapping conversion table are constructed as follows:
Word:    高校  毕业生  档案  如何  存放  在  人才  市场  ?
Mapping: 1     2       3     4     5     6   7     8     9
(i.e., the words of the example sentence "高校 毕业生 档案 如何 存放 在 人才 市场 ?" in order of first appearance.)
Then, a word vector model is trained using Word2Vec to obtain the word vector matrix word_embedding_matrix.
By way of example: the Keras code implementation is basically the same as that illustrated in S301, with the parameters changed from char to word; for brevity it is not repeated here.
That is, in the code of S301, w2v_corpus_char is replaced by w2v_corpus_word, the word segmentation training corpus, namely all data in the sentence-pair semantic matching word segmentation processing knowledge base; char_embedding_dim is replaced by word_embedding_dim, the word vector dimension, which the model sets to 400; and char_set is replaced by word_set, the word table.
S303, constructing the input layer: the input layer includes four inputs, obtained from a training data set sample: sentence1_char, sentence2_char, sentence1_word and sentence2_word, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
and for each character or word in the input sentence, converting the character or word into a corresponding numerical identifier according to the character mapping conversion table and the word mapping conversion table.
By way of example: the sentence pair shown in step S201 is used as a sample to compose a piece of input data, with the following result:
(sentence1 in word-broken form, sentence2 in word-broken form, sentence1 in word-segmented form, sentence2 in word-segmented form).
Each piece of input data contains 4 clauses. The first two clauses are converted into numerical representations according to the character mapping conversion table of step S301; the latter two clauses are converted into numerical representations according to the word mapping conversion table of step S302. The combined numerical representation of the 4 clauses of the input data is as follows:
("1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17", "18,19,3,4,5,6,7,10,11,20,21,17", "1,2,3,4,5,6,7,8,9", "10,2,3,5,11,9"). Wherein, for some of the characters in sentence2, the mapping relationships are: 大 (large)-18, 学 (school)-19, 手 (hand)-20, 续 (continuation)-21; for some of the words in sentence2, the mapping relationships are: 大学 (university)-10, 手续 (procedure)-11.
S304, constructing the character vector mapping layer: the weight parameters of this layer are initialized by loading the character vector matrix weights trained in the step of constructing the character mapping conversion table; for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; every sentence in the sentence-pair semantic matching word-breaking processing knowledge base can have its information converted into vector form through character vector mapping.
For example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
char_embedding_layer = Embedding(char_embedding_matrix.shape[0], char_embedding_dim, weights=[char_embedding_matrix], input_length=input_dim, trainable=False)
wherein char_embedding_matrix is the character vector matrix trained in the step of constructing the character mapping conversion table, char_embedding_matrix.shape[0] is the size of the character table of that matrix, char_embedding_dim is the dimension of the output character vectors, and input_length is the length of the input sequence.
The corresponding sentences sentence1_char and sentence2_char are processed by the Keras Embedding layer to obtain the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed.
S305, constructing the word vector mapping layer: the weight parameters of this layer are initialized by loading the word vector matrix weights trained in the step of constructing the word mapping conversion table; for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained; every sentence in the sentence-pair semantic matching word segmentation processing knowledge base can have its information converted into vector form through word vector mapping.
By way of example: the Keras code implementation is basically the same as in S304, with the parameters changed from char to word; for brevity it is not repeated here.
The corresponding sentences sentence1_word and sentence2_word are processed by the Keras Embedding layer to obtain the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed.
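For completeness, a sketch of the word-level mapping layer mirroring the code in S304 (the variable names follow S302 and S304; Embedding is keras.layers.Embedding, as above):
word_embedding_layer = Embedding(word_embedding_matrix.shape[0], word_embedding_dim, weights=[word_embedding_matrix], input_length=input_dim, trainable=False)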
S306, constructing the gated deep-layer feature residual fusion network module: the structure is shown in fig. 7, and the specific steps are as follows:
First, the character embedding representation and the word embedding representation output by the multi-granularity embedding module are selectively fused through a gating mechanism to obtain the gated embedding fusion representation, as given by the following formulas.
gate_{emb} = σ(E_c·W_1^e + E_w·W_2^e)   (1.1)
E_fuse = gate_{emb} ⊙ E_c + (1 − gate_{emb}) ⊙ E_w   (1.2)
wherein E_c and E_w denote the character embedding representation and the word embedding representation of a sentence, W_1^e and W_2^e are weight matrices to be trained, σ denotes the sigmoid function, ⊙ denotes element-wise multiplication, gate_{emb} is the embedding representation information selection gate, and E_fuse is the gated embedding fusion representation.
Further, the first-layer coding structure BiLSTM_1 respectively performs a coding operation on the character embedding representation and the word embedding representation to obtain a preliminary first-layer character coding result and first-layer word coding result; the first-layer character coding result and the first-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedding fusion representation through a gating mechanism to obtain the gated first-layer feature residual fusion representation, as given by the following formulas.
h_{c,i}^1 = BiLSTM_1(E_c, i_c)   (2.1)
h_{w,i}^1 = BiLSTM_1(E_w, i_w)   (2.2)
gate_1^* = σ(H_c^1·W_1^a + H_w^1·W_1^b)   (2.3)
F^1 = gate_1^* ⊙ H_c^1 + (1 − gate_1^*) ⊙ H_w^1   (2.4)
gate_1 = σ(F^1·V_1^a + E_fuse·V_1^b)   (2.5)
R^1 = gate_1 ⊙ F^1 + (1 − gate_1) ⊙ E_fuse   (2.6)
wherein H_c^1 and H_w^1 denote the first-layer character and word coding results (h_{c,i}^1 and h_{w,i}^1 being their components at position i), W_1^a, W_1^b, V_1^a and V_1^b are weight matrices to be trained, F^1 is the gated first-layer coding result fusion representation, and R^1 is the gated first-layer feature residual fusion representation.
Further, the first-layer character coding result and the first-layer word coding result are transmitted to the second-layer coding structure BiLSTM_2; BiLSTM_2 respectively performs a coding operation on them to obtain a second-layer character coding result and a second-layer word coding result; the second-layer character coding result and the second-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through a gating mechanism to obtain the gated second-layer feature residual fusion representation, as given by the following formulas.
h_{c,i}^2 = BiLSTM_2(H_c^1, i_c)   (3.1)
h_{w,i}^2 = BiLSTM_2(H_w^1, i_w)   (3.2)
gate_2^* = σ(H_c^2·W_2^a + H_w^2·W_2^b)   (3.3)
F^2 = gate_2^* ⊙ H_c^2 + (1 − gate_2^*) ⊙ H_w^2   (3.4)
gate_2 = σ(F^2·V_2^a + R^1·V_2^b)   (3.5)
R^2 = gate_2 ⊙ F^2 + (1 − gate_2) ⊙ R^1   (3.6)
wherein H_c^2 and H_w^2 denote the second-layer character and word coding results, F^2 is the gated second-layer coding result fusion representation, and R^2 is the gated second-layer feature residual fusion representation.
Further, the second-layer character coding result and the second-layer word coding result are transmitted to the third-layer coding structure BiLSTM_3; by analogy, multi-level gated feature residual fusion representations can be generated through repeated coding, until the final gated deep-layer feature residual fusion representation is generated according to the preset depth of the model. For the depth-th layer, the implementation is given by the following formulas.
h_{c,i}^{depth} = BiLSTM_{depth}(H_c^{depth−1}, i_c)   (4.1)
h_{w,i}^{depth} = BiLSTM_{depth}(H_w^{depth−1}, i_w)   (4.2)
gate_{depth}^* = σ(H_c^{depth}·W_{depth}^a + H_w^{depth}·W_{depth}^b)   (4.3)
F^{depth} = gate_{depth}^* ⊙ H_c^{depth} + (1 − gate_{depth}^*) ⊙ H_w^{depth}   (4.4)
gate_{depth} = σ(F^{depth}·V_{depth}^a + R^{depth−1}·V_{depth}^b)   (4.5)
R^{depth} = gate_{depth} ⊙ F^{depth} + (1 − gate_{depth}) ⊙ R^{depth−1}   (4.6)
wherein H_c^{depth} and H_w^{depth} denote the depth-th layer character and word coding results, F^{depth} is the gated depth-th layer coding result fusion representation, and R^{depth} is the gated deep-layer feature residual fusion representation.
For example, the following steps are carried out: when the method is implemented on a manually constructed government affair consultation data set, the number of layers of the structure is 6, and the optimal result can be obtained when the coding dimension of the BilSTM in each layer is set to be 300. In addition, in order to avoid the over-fitting problem, a dropout strategy is used in each layer of BilSTM, and the optimal result can be obtained when dropout is set to be 0.01.
In Keras, the implementation for the code described above is as follows:
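The original listing appears only as an image in the source document; the following is a hedged reconstruction based on the variable names explained below (sentence_embedded_char, sentence_embedded_word, sentence_encode) and on the GateFeatureLayer described afterwards. The exact wiring of the layers and the call convention of GateFeatureLayer are assumptions, not the original code:
from keras.layers import Bidirectional, LSTM

def gated_residual_encoding(sentence_embedded_char, sentence_embedded_word, depth=6):
    char_feature, word_feature = sentence_embedded_char, sentence_embedded_word
    # Gated fusion of the character-granularity and word-granularity embeddings
    # (all representations are assumed to share the same last dimension).
    fused = GateFeatureLayer()([char_feature, word_feature])
    for _ in range(depth):
        # Each layer encodes both granularities with a BiLSTM (coding dimension 300, dropout 0.01).
        char_feature = Bidirectional(LSTM(300, return_sequences=True, dropout=0.01))(char_feature)
        word_feature = Bidirectional(LSTM(300, return_sequences=True, dropout=0.01))(word_feature)
        layer_fused = GateFeatureLayer()([char_feature, word_feature])
        # Gated residual fusion with the representation accumulated so far.
        fused = GateFeatureLayer()([layer_fused, fused])
    sentence_encode = fused  # gated deep-layer feature residual fusion representation
    return sentence_encode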
wherein sentence_embedded_char is the character embedding representation of a sentence, sentence_embedded_word is the word embedding representation of a sentence, 300 is the coding dimension of the BiLSTM, and sentence_encode is the gated deep-layer feature residual fusion representation of the corresponding sentence. It should be noted that GateFeatureLayer denotes the gated feature fusion layer; its Keras code implementation follows:
# Body of GateFeatureLayer; self.W1, self.W2, self.v1 and self.v2 are trainable weights.
q = feature_1
p = feature_2
# gate = sigmoid(q·W1 + p·W2 + v1)
gate = tf.sigmoid(tf.add(self.v1, tf.add(K.dot(p, self.W2), K.dot(q, self.W1))))
xj_ = gate * q                          # gated contribution of the first feature
xp_ = tf.subtract(self.v2, gate) * p    # (v2 - gate) applied to the second feature
result = tf.add(xj_, xp_)               # selective fusion result
wherein feature_1 and feature_2 are the objects to be fused, self.W1, self.W2, self.v1 and self.v2 are weights to be trained, gate is the constructed gate, and result is the fusion result.
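For context, the snippet above corresponds to the body of a custom layer; a hedged sketch of how it could be wrapped as a Keras layer is given below. The class skeleton, the weight shapes in build() and the list-style call convention are assumptions; only the gating computation follows the snippet above:
import tensorflow as tf
from keras import backend as K
from keras.layers import Layer

class GateFeatureLayer(Layer):
    def build(self, input_shape):
        dim = int(input_shape[0][-1])
        # Trainable weights used by the gate: two projection matrices and two vectors.
        self.W1 = self.add_weight(name='W1', shape=(dim, dim), initializer='glorot_uniform')
        self.W2 = self.add_weight(name='W2', shape=(dim, dim), initializer='glorot_uniform')
        self.v1 = self.add_weight(name='v1', shape=(dim,), initializer='zeros')
        self.v2 = self.add_weight(name='v2', shape=(dim,), initializer='ones')
        super(GateFeatureLayer, self).build(input_shape)

    def call(self, inputs):
        q, p = inputs  # the two features to be fused
        gate = tf.sigmoid(tf.add(self.v1, tf.add(K.dot(p, self.W2), K.dot(q, self.W1))))
        return tf.add(gate * q, tf.subtract(self.v2, gate) * p)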
S307, constructing the semantic feature interactive matching module: after the processing of step S306, the gated deep-layer feature residual fusion representations of sentence1 and sentence2 are obtained; semantic feature matching, semantic feature screening and related operations are performed on them, thereby generating the final sentence-pair semantic matching tensor. The specific steps are as follows:
First, an interactive matching process between the sentence pair is completed by applying an attention mechanism to obtain the preliminary matching tensor of each sentence. Taking sentence1 matching sentence2 as an example, the implementation is given by the following formulas.
e_{ij} = v_d·tanh((r_i^{s1} ⊙ r_j^{s2})·W_d)   (5.1)
a_{ij} = exp(e_{ij}) / Σ_k exp(e_{ik})   (5.2)
h_i^{s1} = Σ_j a_{ij}·r_j^{s2}   (5.3)
wherein r_i^{s1} and r_j^{s2} denote the i-th and j-th components of the gated deep-layer feature residual fusion representations of sentence1 and sentence2, W_d and v_d are weights to be trained, a_{ij} are the attention weights, and h^{s1} is the preliminary matching tensor obtained by matching sentence2 with sentence1.
For example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
sentence1=feature_s1
sentence2=feature_s2
q_p_dot=tf.expand_dims(sentence2,axis=1)*tf.expand_dims(sentence1,axis=2)
sd=tf.multiply(tf.tanh(K.dot(q_p_dot,self.Wd)),self.vd)
sd=tf.squeeze(sd,axis=-1)
ad=tf.nn.softmax(sd)
h=K.batch_dot(ad,sentence2)
the feature _ s1 and the feature _ s2 represent gated deep feature residual fusion representations of corresponding sentences, self.Wd and self.Vd represent weights to be trained, and h represents a preliminary matching tensor of the sentences.
Secondly, a feature screening operation is performed on the preliminary sentence matching tensor using a gating mechanism to obtain the sentence matching tensor. The specific implementation is given by the following formulas.
g^{s1} = σ(h^{s1}·W_g + v_g)   (6.5)
M^{s1} = g^{s1} ⊙ h^{s1}   (6.6)
wherein W_g and v_g are weights to be trained, g^{s1} is the matching tensor gate, and M^{s1} is the sentence1 matching tensor; the sentence2 matching tensor M^{s2} is obtained analogously.
For example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
q=h1
gj=tf.sigmoid(tf.add(self.v1,K.dot(q,self.W1)))
M_s1=gj*q
where h1 denotes the sentence preliminary matching tensor, self.w1 and self.v1 denote weights to be trained, and M _ s1 denotes the sentence matching tensor.
And thirdly, connecting the two sentence matching tensors to obtain a sentence pair matching tensor. The specific implementation is shown in the following formula.
M = [M^{s1}; M^{s2}]
For example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
similarity=Concatenate(axis=2)([M_s1,M_s2])
where M _ s1 and M _ s2 represent the corresponding sentence match tensors, and similarity represents the sentence-to-match tensors.
S308, constructing the label prediction module: the sentence-pair semantic matching tensor obtained in step S307 is used as the input of this module and is processed by a fully-connected network with one layer of dimension 1 and a sigmoid activation function, so as to obtain a matching degree value in [0,1], recorded as y_pred; finally, whether the semantics of the sentence pair match is judged by comparison with the set threshold (0.5), i.e., the semantics of the sentence pair are predicted to match when y_pred ≥ 0.5, and not to match otherwise.
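As an illustration, a minimal Keras sketch of this module (the flattening step and the variable names are assumptions; similarity is the sentence-pair matching tensor from S307):
from keras.layers import Dense, Flatten

match_vector = Flatten()(similarity)                    # flatten the sentence-pair matching tensor (assumed)
y_pred = Dense(1, activation='sigmoid')(match_vector)   # matching degree value in [0, 1]
# At prediction time: the semantics of the sentence pair are judged to match if y_pred >= 0.5.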
When the sentence-pair semantic matching model provided by the present invention has not yet been trained, step S4 needs to be executed to train the model and optimize its parameters; once the model has been trained, step S308 can predict whether the semantics of a target sentence pair match.
S4, training the sentence-pair semantic matching model: the sentence-pair semantic matching model constructed in step S3 is trained on the training data set obtained in step S2, as shown in fig. 5; the details are as follows:
S401, constructing a loss function: as known from the construction process of the label prediction module, y_pred is the matching degree value computed by the sentence-pair semantic matching model, and y_true is the real label indicating whether the semantics of the two sentences match, whose value is limited to 0 or 1; cross entropy is adopted as the loss function, with the following formula:
L = −(1/N)·Σ_{i=1}^{N} [ y_true^{(i)}·log(y_pred^{(i)}) + (1 − y_true^{(i)})·log(1 − y_pred^{(i)}) ]
where N is the number of training samples.
The loss function described above and its settings are expressed in Keras with the following code:
parallel_model.compile(loss="binary_crossentropy",optimizer=op,metrics=['accuracy',precision,recall,f1_score])
S402, optimizing the training model: RMSProp is used as the optimization algorithm; except that the learning rate is set to 0.0015, the remaining hyper-parameters of RMSProp use the default settings in Keras; the sentence-pair semantic matching model is optimally trained on the training data set.
By way of example, the optimization algorithm described above and its settings are expressed in Keras with the following code:
optim = keras.optimizers.RMSprop(lr=0.0015)
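For completeness, a hedged sketch of how the optimizer and loss are wired together and the model trained (parallel_model and the train_* arrays follow the snippets above; the batch size, number of epochs and validation split are assumptions, not settings from the original):
op = keras.optimizers.RMSprop(lr=0.0015)
parallel_model.compile(loss="binary_crossentropy", optimizer=op, metrics=['accuracy'])
parallel_model.fit([train_s1_char, train_s2_char, train_s1_word, train_s2_word],
                   train_labels, batch_size=64, epochs=20, validation_split=0.1)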
the model provided by the invention can obtain more than 80% of accuracy on the manually constructed government affair consultation data set.
Example 3:
as shown in fig. 6, the intelligent question-answer sentence pair semantic matching apparatus for the government counseling service according to embodiment 2, comprises,
the sentence-to-semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data and then carrying out preprocessing operation on the sentence pair data so as to obtain a sentence-to-semantic matching knowledge base which meets the training requirement; the sentence-to-semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manually constructed data set which is already disclosed on a network, and the sentence pair data set is used as original data for constructing a sentence pair semantic matching knowledge base;
the original data word-breaking preprocessing or word segmentation preprocessing unit is responsible for preprocessing the original data used for constructing the sentence-pair semantic matching knowledge base, performing the word-breaking or word segmentation operation on each sentence therein, so as to construct the sentence-pair semantic matching word-breaking processing knowledge base or the sentence-pair semantic matching word segmentation processing knowledge base;
and the sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base so as to construct the sentence-to-semantic matching knowledge base.
A training data set generating unit for constructing positive case data and negative case data for training according to sentences in the sentence-to-semantic matching knowledge base, and constructing a final training data set based on the positive case data and the negative case data; the training data set generating unit comprises a training data set generating unit,
the training positive case data construction unit is responsible for constructing two sentences with consistent semantics in the sentence-to-semantic matching knowledge base and the matching labels 1 thereof into training positive case data;
the training negative case data construction unit is responsible for selecting one sentence, randomly selecting a sentence which is not matched with the sentence for combination, and constructing the sentence and the matched label 0 into negative case data;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disordering the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence pair semantic matching model building unit is used for building a word mapping conversion table, an input module, a word vector mapping layer, a gating deep layer characteristic residual difference type fusion network module, a semantic characteristic interactive matching module and a label prediction module; the sentence-to-semantic matching model construction unit includes,
the word mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the semantic matching knowledge base by the sentence according to the word granularity or the word granularity, sequentially storing each word or word in a list to obtain a word list or word list, and sequentially increasing and sequencing the words or words according to the sequence of the words or word lists recorded by the words or words with the number 1 as the start to form the word mapping conversion table or word mapping conversion table required by the invention; after the word mapping conversion table or the word mapping conversion table is constructed, each word or word in the table is mapped into a unique digital identifier; then, the Word vector model or the Word vector model is trained by using Word2Vec to obtain a Word vector matrix of each Word or a Word vector matrix of each Word;
the input module construction unit is responsible for preprocessing each sentence pair in the training data set or each sentence pair to be predicted, respectively obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
the character vector mapping layer or word vector mapping layer construction unit is responsible for loading the character vector matrix or word vector matrix trained in the step of the character mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, their corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, their corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained;
the gate control deep layer characteristic residual difference type fusion network module construction unit is responsible for capturing and screening semantic characteristics of sentences, and specifically operates by receiving word embedded representation output by a word vector mapping layer and word embedded representation output by a word vector mapping layer as input; the word embedding expression and the word embedding expression are selectively fused through a gating mechanism to obtain gating embedding fusion expression, and meanwhile, the word embedding expression and the word embedding expression before fusion are transmitted to a first layer of coding structure; the first layer of coding structure respectively carries out coding operation on the word embedding representation and the word embedding representation to obtain a first layer of word coding result and a first layer of word coding result; selectively fusing the first layer character coding result and the first layer word coding result through a gate control mechanism, then selectively fusing the fusion result and the gate control embedded fusion representation through the gate control mechanism to obtain a gate control first layer characteristic residual difference type fusion representation, and simultaneously transmitting the first layer character coding result and the first layer word coding result before fusion to a second layer coding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result so as to obtain a second layer character coding result and a second layer word coding result; selectively fusing the second layer character coding result and the second layer word coding result through a gating mechanism, then selectively fusing the fusion result and the gated first layer characteristic residual difference type fusion representation through the gating mechanism to obtain gated second layer characteristic residual difference type fusion representation, and simultaneously transmitting the second layer character coding result and the second layer word coding result before fusion to a third layer coding structure; by analogy, multi-level gating characteristic residual difference type fusion representation can be generated through repeated coding for many times; according to the preset level depth of the model, generating a final gated deep layer characteristic residual difference type fusion representation;
the semantic feature interactive matching module construction unit is responsible for carrying out feature matching and feature screening on the gated deep feature residual type fusion representation of the sentence pairs to obtain matching vectors of the sentence pairs;
the tag prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and the matching degree value is compared with an established threshold value so as to judge whether the semantics of the sentence pair are matched or not;
the sentence-to-semantic matching model training unit is used for constructing a loss function required in the model training process and finishing the optimization training of the model; the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
the model optimization unit is responsible for training and adjusting parameters in model training to reduce prediction errors;
example 4:
a storage medium according to embodiment 2, wherein a plurality of instructions are stored, and the instructions are loaded by a processor, and the steps of the intelligent question-answer sentence-to-semantic matching method for government counseling service according to embodiment 2 are executed.
Example 5:
the electronic device according to embodiment 4, the electronic device comprising: the storage medium of example 4; and a processor for executing the instructions in the storage medium of embodiment 4.

Claims (10)

1. The intelligent question-answer sentence pair semantic matching method for the government affair consultation service is characterized in that a sentence pair semantic matching model consisting of a multi-granularity embedding module, a gated deep layer feature residual difference type fusion network module, a semantic feature interactive matching module and a label prediction module is constructed and trained to realize the gated deep layer feature residual difference type fusion representation of sentence information, and meanwhile, a final matching tensor of the sentence pair is generated through an attention mechanism and a gating mechanism and the matching degree of the sentence pair is judged so as to achieve the aim of performing intelligent semantic matching on the sentence pair; the method comprises the following specific steps:
the multi-granularity embedding module is used for respectively embedding the input sentences by word granularity and word granularity to obtain multi-granularity embedded expression of the sentences;
the gate control deep layer characteristic residual difference type fusion network module carries out coding operation on the multi-granularity embedded expression of the sentence to obtain gate control deep layer characteristic residual difference type fusion expression of the sentence;
the semantic feature interactive matching module performs feature matching and feature screening operation on the gated deep feature residual type fusion representation of the sentence pair to obtain a matching vector of the sentence pair;
and the tag prediction module maps the matching tensor of the sentence pair into a floating point type numerical value in the designated interval, compares the floating point type numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the sentence pair are matched or not according to the comparison result.
2. The intelligent question-answer sentence pair semantic matching method for the government affair consultation service according to claim 1, wherein the multi-granularity embedding module is used for constructing a character mapping conversion table, a word mapping conversion table, an input module, a character vector mapping layer and a word vector mapping layer;
wherein, constructing the character mapping conversion table or word mapping conversion table: the mapping rule is: starting with the number 1, the characters or words are ordered in ascending order according to the order in which each character or word is entered into the character table or word table, thereby forming the character mapping conversion table or word mapping conversion table required by the invention; the character table or word table is constructed from the sentence-pair semantic matching word-breaking processing knowledge base or word segmentation processing knowledge base, which are obtained by performing word-breaking preprocessing or word segmentation preprocessing respectively on the original data texts of the sentence-pair semantic matching knowledge base; then, Word2Vec is used to train a character vector model or a word vector model to obtain the character vector matrix of each character or the word vector matrix of each word;
constructing the input module: the input layer comprises four inputs; each sentence pair in the training data set or sentence pair to be predicted is subjected to word-breaking and word segmentation preprocessing to respectively obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word, wherein the suffixes char and word respectively denote the word-breaking or word segmentation processing of the corresponding sentence, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word); each character or word in the input sentences is converted into the corresponding numeric identifier according to the character mapping conversion table and the word mapping conversion table;
constructing the character vector mapping layer or word vector mapping layer: the character vector matrix or word vector matrix trained in the step of constructing the character mapping conversion table or word mapping conversion table is loaded to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, their corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, their corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained.
3. The intelligent question-answer sentence pair semantic matching method for the government counseling service according to claim 1 or 2, wherein the construction process of the gated deep level feature residual fusion network module is as follows:
firstly, selectively fusing the character embedding representation and the word embedding representation output by the multi-granularity embedding module through a gating mechanism to obtain a gated embedding fusion representation, with the formulas as follows:

gate_{emb} = σ(E_c·W_1^e + E_w·W_2^e)   (1.1)
E_fuse = gate_{emb} ⊙ E_c + (1 − gate_{emb}) ⊙ E_w   (1.2)

wherein formula (1.1) represents constructing the embedding representation information selection gate, in which W_1^e and W_2^e represent weight matrices to be trained, E_c represents sentence1_char_embed or sentence2_char_embed, E_w represents sentence1_word_embed or sentence2_word_embed, σ represents the sigmoid function, and gate_{emb} represents the embedding representation information selection gate; formula (1.2) represents the selective fusion of the character embedding representation and the word embedding representation through the embedding representation information selection gate, in which ⊙ represents element-wise multiplication and E_fuse represents the gated embedding fusion representation;
the first-layer coding structure BiLSTM_1 respectively performs a coding operation on the character embedding representation and the word embedding representation to obtain a preliminary first-layer character coding result and first-layer word coding result; the first-layer character coding result and the first-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated embedding fusion representation through a gating mechanism to obtain the gated first-layer feature residual fusion representation, with the formulas as follows:

h_{c,i}^1 = BiLSTM_1(E_c, i_c)   (2.1)
h_{w,i}^1 = BiLSTM_1(E_w, i_w)   (2.2)
gate_1^* = σ(H_c^1·W_1^a + H_w^1·W_1^b)   (2.3)
F^1 = gate_1^* ⊙ H_c^1 + (1 − gate_1^*) ⊙ H_w^1   (2.4)
gate_1 = σ(F^1·V_1^a + E_fuse·V_1^b)   (2.5)
R^1 = gate_1 ⊙ F^1 + (1 − gate_1) ⊙ E_fuse   (2.6)

wherein formula (2.1) represents encoding the character embedding representation with BiLSTM_1, in which E_c represents sentence1_char_embed or sentence2_char_embed, i_c represents the relative position of the i-th character vector in the sentence, and h_{c,i}^1 represents the first-layer character coding result at that position, the whole sequence being denoted H_c^1; formula (2.2) represents encoding the word embedding representation with BiLSTM_1, in which E_w represents sentence1_word_embed or sentence2_word_embed, i_w represents the relative position of the i-th word vector in the sentence, and h_{w,i}^1 represents the first-layer word coding result, the whole sequence being denoted H_w^1; formula (2.3) represents constructing the first-layer coding result selection gate, in which W_1^a and W_1^b represent weight matrices to be trained, σ represents the sigmoid function, and gate_1^* represents the first-layer coding result selection gate; formula (2.4) represents the selective fusion of the first-layer character coding result and the first-layer word coding result through the first-layer coding result selection gate, in which ⊙ represents element-wise multiplication and F^1 represents the gated first-layer coding result fusion representation; formula (2.5) represents constructing the first-layer feature residual selection gate, in which V_1^a and V_1^b represent weight matrices to be trained, E_fuse represents the gated embedding fusion representation output by formula (1.2), σ represents the sigmoid function, and gate_1 represents the first-layer feature residual selection gate; formula (2.6) represents the selective fusion of the gated embedding fusion representation and the gated first-layer coding result fusion representation through the first-layer feature residual selection gate, in which R^1 represents the gated first-layer feature residual fusion representation;
the first-layer character coding result and the first-layer word coding result are transmitted to the second-layer coding structure BiLSTM_2; BiLSTM_2 respectively performs a coding operation on the first-layer character coding result and the first-layer word coding result to obtain a second-layer character coding result and a second-layer word coding result; the second-layer character coding result and the second-layer word coding result are selectively fused through a gating mechanism, and the fusion result is then selectively fused with the gated first-layer feature residual fusion representation through a gating mechanism to obtain the gated second-layer feature residual fusion representation, with the formulas as follows:

h_{c,i}^2 = BiLSTM_2(H_c^1, i_c)   (3.1)
h_{w,i}^2 = BiLSTM_2(H_w^1, i_w)   (3.2)
gate_2^* = σ(H_c^2·W_2^a + H_w^2·W_2^b)   (3.3)
F^2 = gate_2^* ⊙ H_c^2 + (1 − gate_2^*) ⊙ H_w^2   (3.4)
gate_2 = σ(F^2·V_2^a + R^1·V_2^b)   (3.5)
R^2 = gate_2 ⊙ F^2 + (1 − gate_2) ⊙ R^1   (3.6)

wherein formula (3.1) represents encoding the first-layer character coding result with BiLSTM_2, in which H_c^1 represents the first-layer character coding result, i_c represents the i-th time step, and h_{c,i}^2 represents the second-layer character coding result, the whole sequence being denoted H_c^2; formula (3.2) represents encoding the first-layer word coding result with BiLSTM_2, in which H_w^1 represents the first-layer word coding result, i_w represents the i-th time step, and h_{w,i}^2 represents the second-layer word coding result, the whole sequence being denoted H_w^2; formula (3.3) represents constructing the second-layer coding result selection gate, in which W_2^a and W_2^b represent weight matrices to be trained, σ represents the sigmoid function, and gate_2^* represents the second-layer coding result selection gate; formula (3.4) represents the selective fusion of the second-layer character coding result and the second-layer word coding result through the second-layer coding result selection gate, in which ⊙ represents element-wise multiplication and F^2 represents the gated second-layer coding result fusion representation; formula (3.5) represents constructing the second-layer feature residual selection gate, in which V_2^a and V_2^b represent weight matrices to be trained, R^1 represents the gated first-layer feature residual fusion representation output by formula (2.6), σ represents the sigmoid function, and gate_2 represents the second-layer feature residual selection gate; formula (3.6) represents the selective fusion of the gated first-layer feature residual fusion representation and the gated second-layer coding result fusion representation through the second-layer feature residual selection gate, in which R^2 represents the gated second-layer feature residual fusion representation;
the second-layer character coding result and the second-layer word coding result are transmitted to the third-layer coding structure BiLSTM_3; by analogy, multi-level gated feature residual fusion representations can be generated through repeated coding; according to the preset depth of the model, the final gated deep-layer feature residual fusion representation is generated; for the depth-th layer, the formulas are as follows:

h_{c,i}^{depth} = BiLSTM_{depth}(H_c^{depth−1}, i_c)   (4.1)
h_{w,i}^{depth} = BiLSTM_{depth}(H_w^{depth−1}, i_w)   (4.2)
gate_{depth}^* = σ(H_c^{depth}·W_{depth}^a + H_w^{depth}·W_{depth}^b)   (4.3)
F^{depth} = gate_{depth}^* ⊙ H_c^{depth} + (1 − gate_{depth}^*) ⊙ H_w^{depth}   (4.4)
gate_{depth} = σ(F^{depth}·V_{depth}^a + R^{depth−1}·V_{depth}^b)   (4.5)
R^{depth} = gate_{depth} ⊙ F^{depth} + (1 − gate_{depth}) ⊙ R^{depth−1}   (4.6)

wherein formula (4.1) represents encoding the (depth−1)-th layer character coding result with BiLSTM_{depth}, in which H_c^{depth−1} represents the (depth−1)-th layer character coding result, i_c represents the i-th time step, and h_{c,i}^{depth} represents the depth-th layer character coding result, the whole sequence being denoted H_c^{depth}; formula (4.2) represents encoding the (depth−1)-th layer word coding result with BiLSTM_{depth}, in which H_w^{depth−1} represents the (depth−1)-th layer word coding result, i_w represents the i-th time step, and h_{w,i}^{depth} represents the depth-th layer word coding result, the whole sequence being denoted H_w^{depth}; formula (4.3) represents constructing the depth-th layer coding result selection gate, in which W_{depth}^a and W_{depth}^b represent weight matrices to be trained, σ represents the sigmoid function, and gate_{depth}^* represents the depth-th layer coding result selection gate; formula (4.4) represents the selective fusion of the depth-th layer character coding result and the depth-th layer word coding result through the depth-th layer coding result selection gate, in which ⊙ represents element-wise multiplication and F^{depth} represents the gated depth-th layer coding result fusion representation; formula (4.5) represents constructing the depth-th layer feature residual selection gate, in which V_{depth}^a and V_{depth}^b represent weight matrices to be trained, R^{depth−1} represents the gated (depth−1)-th layer feature residual fusion representation, σ represents the sigmoid function, and gate_{depth} represents the depth-th layer feature residual selection gate; formula (4.6) represents the selective fusion of the gated (depth−1)-th layer feature residual fusion representation and the gated depth-th layer coding result fusion representation through the depth-th layer feature residual selection gate, in which R^{depth} represents the gated depth-th layer feature residual fusion representation, i.e., the gated deep-layer feature residual fusion representation.
4. The intelligent sentence-in-sentence pair semantic matching method for government consulting services according to claim 3, wherein the semantic feature interactive matching module is specifically constructed as follows:
this module receives the gated deep-layer feature residual fusion representations output by the gated deep-layer feature residual fusion network module as input, and performs semantic feature matching and semantic feature screening operations on them in three steps, thereby generating the final sentence-pair semantic matching tensor; the specific operations are as follows:
firstly, an interactive matching process between the sentence pair is completed by applying an attention mechanism to obtain the preliminary matching tensor of each sentence; taking sentence1 matching sentence2 as an example, the formulas are as follows:

e_{ij} = v_d·tanh((r_i^{s1} ⊙ r_j^{s2})·W_d)   (5.1)
a_{ij} = exp(e_{ij}) / Σ_k exp(e_{ik})   (5.2)
h_i^{s1} = Σ_j a_{ij}·r_j^{s2}   (5.3)

wherein formula (5.1) represents the mapping of the gated deep-layer feature residual fusion representations of the two sentences, in which r_i^{s1} represents the i-th component of the gated deep-layer feature residual fusion representation of sentence1, r_j^{s2} represents the j-th component of the gated deep-layer feature residual fusion representation of sentence2, W_d and v_d represent weights to be trained, and ⊙ represents element-wise multiplication; formula (5.2) represents computing the attention weights; formula (5.3) represents completing the interactive matching process using the attention weights, in which h_i^{s1} represents the result of matching sentence2 with sentence1, i.e., the preliminary matching tensor of sentence1; similarly, matching sentence1 with sentence2 yields the analogous preliminary matching tensor h^{s2};
secondly, a feature screening operation is performed on the preliminary sentence matching tensor by using a gating mechanism to obtain the sentence matching tensor; the formulas are as follows:

g^{s1} = σ(h^{s1}·W_g + v_g)   (6.5)
M^{s1} = g^{s1} ⊙ h^{s1}   (6.6)

wherein formula (6.5) represents constructing the matching tensor gate, in which h^{s1} represents the result of matching sentence2 with sentence1, and W_g and v_g are weights to be trained; formula (6.6) represents performing feature screening on the preliminary matching tensor using the matching tensor gate, in which ⊙ represents element-wise multiplication and M^{s1} represents the sentence1 matching tensor; similarly, processing the result of matching sentence1 with sentence2 yields the sentence2 matching tensor M^{s2};
and thirdly, concatenating the two sentence matching tensors to obtain the sentence-pair matching tensor, with the formula as follows:

M = [M^{s1}; M^{s2}]

wherein M^{s1} and M^{s2} represent the corresponding sentence matching tensors, and M represents the sentence-pair matching tensor.
5. The intelligent sentence-in-sentence pair semantic matching method for government counseling service according to claim 4, wherein the tag prediction module is constructed by the following steps:
the sentence-pair semantic matching tensor is used as the input of this module and is processed by a fully-connected network with one layer of dimension 1 and a sigmoid activation function, so as to obtain a matching degree value in [0,1], recorded as y_pred; finally, whether the semantics of the sentence pair match is judged by comparison with the set threshold of 0.5, i.e., the semantics of the sentence pair are predicted to match when y_pred ≥ 0.5, and not to match otherwise; when the sentence-pair semantic matching model has not been sufficiently trained, it needs to be trained on the training data set to optimize the model parameters; when training is completed, the label prediction module can predict whether the semantics of the target sentence pair match.
6. The intelligent question-answer sentence-to-sentence matching method for government counseling service according to claim 5, wherein the sentence-to-sentence matching knowledge base is constructed as follows:
downloading a data set on a network to obtain original data: downloading a sentence-to-semantic matching data set or a manually constructed data set which is already disclosed on a network, and taking the sentence-to-semantic matching data set or the manually constructed data set as original data for constructing a sentence-to-semantic matching knowledge base;
preprocessing raw data: preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word segmentation operation and word segmentation operation on each sentence to obtain a sentence-to-semantic matching word segmentation processing knowledge base and a word segmentation processing knowledge base;
summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word-breaking processing knowledge base and a sentence-to-semantic matching word-segmentation processing knowledge base, and constructing a sentence-to-semantic matching knowledge base;
the sentence-to-semantic matching model is obtained by training by using a training data set, and the construction process of the training data set is as follows:
constructing a training positive example: in the sentence-pair semantic matching knowledge base, a sentence pair whose two sentences have consistent semantics is constructed into a positive example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1); wherein sentence1_char and sentence2_char refer to sentence1 and sentence2 in the word-breaking processing knowledge base respectively, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the word segmentation processing knowledge base respectively, and 1 indicates that the semantics of the two sentences match, which is a positive example;
constructing a training negative example: selecting a sentence s1, randomly selecting from the sentence-pair semantic matching knowledge base a sentence s2 that does not match s1, combining s1 and s2, and constructing a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0); wherein sentence1_char and sentence1_word refer to sentence1 in the sentence-pair semantic matching word-breaking processing knowledge base and the word segmentation processing knowledge base respectively; sentence2_char and sentence2_word refer to sentence2 in the sentence-pair semantic matching word-breaking processing knowledge base and the word segmentation processing knowledge base respectively; 0 indicates that the semantics of sentence s1 and sentence s2 do not match, which is a negative example;
constructing a training data set: combining all positive example sentence pairs and negative example sentence pairs obtained after the operations of constructing training positive examples and training negative examples, and shuffling their order to construct the final training data set; both positive example data and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1;
after the sentence-to-semantic matching model is built, training and optimizing the sentence-to-semantic matching model through a training data set are carried out, which specifically comprises the following steps:
constructing a loss function: adopting cross entropy as a loss function;
optimizing a training model: using RMSProp as an optimization algorithm, except that the learning rate is set to 0.0015, the remaining hyper-parameters of RMSProp all select default settings in Keras; and optimally training the sentence pair semantic matching model on the training data set.
7. The intelligent question-answer sentence pair semantic matching device facing the government affair consultation service is characterized by comprising,
the sentence-to-semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data and then carrying out preprocessing operation on the sentence pair data so as to obtain a sentence-to-semantic matching knowledge base which meets the training requirement;
a training data set generating unit for constructing positive case data and negative case data for training according to sentences in the sentence-to-semantic matching knowledge base, and constructing a final training data set based on the positive case data and the negative case data;
the sentence pair semantic matching model building unit is used for building a word mapping conversion table, an input module, a word vector mapping layer, a gating deep layer characteristic residual difference type fusion network module, a semantic characteristic interactive matching module and a label prediction module; the sentence-to-semantic matching model construction unit includes,
the word mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the semantic matching knowledge base by the sentence according to the word granularity or the word granularity, sequentially storing each word or word in a list to obtain a word list or word list, and sequentially increasing and sequencing the words or words according to the sequence of the words or word lists recorded by the words or words with the number 1 as the start to form the word mapping conversion table or word mapping conversion table required by the invention; after the word mapping conversion table or the word mapping conversion table is constructed, each word or word in the table is mapped into a unique digital identifier; then, the Word vector model or the Word vector model is trained by using Word2Vec to obtain a Word vector matrix of each Word or a Word vector matrix of each Word;
the input module construction unit is responsible for preprocessing each sentence pair or sentence pair to be predicted in the training data set, respectively acquiring sensor 1_ char, sensor 2_ char, sensor 1_ word and sensor 2_ word, and formalizing the words as follows: (sensor 1_ char, sensor 2_ char, sensor 1_ word, sensor 2_ word);
the word vector mapping layer or word vector mapping layer construction unit is responsible for loading a word vector matrix or word vector matrix obtained by training in the step of the word mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for word vector mapping, aiming at input sentences of sensor 1_ char and sensor 2_ char, obtaining corresponding sentence vectors of sensor 1_ char _ embed and sensor 2_ char _ embed; for word vector mapping, for input sentences of presence 1_ word and presence 2_ word, obtaining their corresponding sentence vectors of presence 1_ word _ embedded and presence 2_ word _ embedded;
the gate control deep layer characteristic residual difference type fusion network module construction unit is responsible for capturing and screening semantic characteristics of sentences, and specifically operates by receiving word embedded representation output by a word vector mapping layer and word embedded representation output by a word vector mapping layer as input; the word embedding expression and the word embedding expression are selectively fused through a gating mechanism to obtain gating embedding fusion expression, and meanwhile, the word embedding expression and the word embedding expression before fusion are transmitted to a first layer of coding structure; the first layer of coding structure respectively carries out coding operation on the word embedding representation and the word embedding representation to obtain a first layer of word coding result and a first layer of word coding result; selectively fusing the first layer character coding result and the first layer word coding result through a gate control mechanism, then selectively fusing the fusion result and the gate control embedded fusion representation through the gate control mechanism to obtain a gate control first layer characteristic residual difference type fusion representation, and simultaneously transmitting the first layer character coding result and the first layer word coding result before fusion to a second layer coding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result so as to obtain a second layer character coding result and a second layer word coding result; selectively fusing the second layer character coding result and the second layer word coding result through a gating mechanism, then selectively fusing the fusion result and the gated first layer characteristic residual difference type fusion representation through the gating mechanism to obtain gated second layer characteristic residual difference type fusion representation, and simultaneously transmitting the second layer character coding result and the second layer word coding result before fusion to a third layer coding structure; by analogy, multi-level gating characteristic residual difference type fusion representation can be generated through repeated coding for many times; according to the preset level depth of the model, generating a final gated deep layer characteristic residual difference type fusion representation;
the semantic feature interactive matching module construction unit is responsible for further processing the gated deep feature residual fusion representations of the two sentences, performing semantic feature interactive matching and semantic feature screening on them, so as to generate the final sentence-pair semantic matching tensor (a speculative sketch of this matching step follows the claims below);
the label prediction module unit is responsible for processing the sentence-pair semantic matching tensor to obtain a matching degree value, which is compared with a preset threshold to judge whether the semantics of the sentence pair match;
and the sentence-pair semantic matching model training unit is responsible for constructing the loss function required in the model training process and completing the optimization training of the model.
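The following is a minimal PyTorch sketch of the gated deep feature residual fusion described in the claim above. It is illustrative only: the sigmoid gate over concatenated inputs, the BiLSTM coding structures, the depth of 3 and the assumption that the character-level and word-level sequences are padded to the same length are choices made for this sketch, not details disclosed by the claims.

# Illustrative sketch only; gate form, BiLSTM encoders and depth are assumptions.
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Selectively fuse two equally shaped representations with a learned gate."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, a, b):
        g = torch.sigmoid(self.gate(torch.cat([a, b], dim=-1)))
        return g * a + (1.0 - g) * b


class GatedResidualEncoder(nn.Module):
    """Stack of char/word encoders whose per-level fusions are gated into a
    running residual-style representation, following the claim's description."""

    def __init__(self, dim, depth=3):
        super().__init__()
        # separate coding structures for the character view and the word view
        self.char_encoders = nn.ModuleList(
            [nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True) for _ in range(depth)]
        )
        self.word_encoders = nn.ModuleList(
            [nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True) for _ in range(depth)]
        )
        self.char_word_fuse = nn.ModuleList([GatedFusion(dim) for _ in range(depth + 1)])
        self.residual_fuse = nn.ModuleList([GatedFusion(dim) for _ in range(depth)])

    def forward(self, char_embed, word_embed):
        # char_embed, word_embed: (batch, seq_len, dim); assumed padded to the same length
        fused = self.char_word_fuse[0](char_embed, word_embed)   # gated embedding fusion
        char_h, word_h = char_embed, word_embed
        for i in range(len(self.char_encoders)):
            char_h, _ = self.char_encoders[i](char_h)            # i-th level char coding result
            word_h, _ = self.word_encoders[i](word_h)            # i-th level word coding result
            level = self.char_word_fuse[i + 1](char_h, word_h)   # fuse the two granularities
            fused = self.residual_fuse[i](level, fused)          # gated residual fusion with the running representation
        return fused   # gated deep feature residual fusion representation


# usage: batch of 2 sentences, sequence length 20, dimension 128
enc = GatedResidualEncoder(dim=128, depth=3)
rep = enc(torch.randn(2, 20, 128), torch.randn(2, 20, 128))      # -> (2, 20, 128)

The property the claim emphasizes is preserved: at every level the fused char/word result is again gated against the accumulated representation (a residual-style path), while the unfused coding results are what feed the next coding level.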
8. The intelligent question-answer sentence pair semantic matching device for government affair consultation service according to claim 7, wherein the sentence-pair semantic matching knowledge base construction unit comprises,
the sentence pair data acquisition unit is responsible for downloading a sentence-pair semantic matching data set already published on the network, or a manually constructed data set, which serves as the original data for constructing the sentence-pair semantic matching knowledge base;
the original data character-segmentation preprocessing or word-segmentation preprocessing unit is responsible for preprocessing the original data used for constructing the sentence-pair semantic matching knowledge base, performing character-segmentation or word-segmentation on each sentence therein, so as to construct the sentence-pair semantic matching character-segmentation knowledge base or the sentence-pair semantic matching word-segmentation knowledge base;
the sub-knowledge base summarizing unit is responsible for summarizing the sentence-pair semantic matching character-segmentation knowledge base and the sentence-pair semantic matching word-segmentation knowledge base, so as to construct the sentence-pair semantic matching knowledge base;
the training data set generating unit comprises,
the training positive example data construction unit is responsible for combining two sentences with consistent semantics in the sentence-pair semantic matching knowledge base with their matching label 1, so as to construct training positive example data;
the training negative example data construction unit is responsible for selecting one sentence, randomly selecting another sentence that does not match it for combination, and constructing them together with the matching label 0 into training negative example data;
the training data set construction unit is responsible for combining all the training positive example data and training negative example data and shuffling their order, so as to construct the final training data set (a sketch of this data construction follows this claim);
the sentence-pair semantic matching model training unit comprises,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between sentence1 and sentence2;
and the model optimization unit is responsible for adjusting the model parameters during training so as to reduce the prediction error.
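A short Python sketch of the knowledge-base and training-set construction above, and of the loss the training unit minimizes. The use of jieba for word segmentation, the one-negative-per-positive sampling ratio and the binary cross-entropy loss are assumptions made for illustration; the claim does not fix these choices.

# Illustrative only; jieba, the 1:1 negative ratio and BCE loss are assumptions.
import random
import torch
import jieba   # third-party Chinese word segmenter, assumed for the word-level view


def preprocess(sentence):
    """Return the character-level and word-level views of one sentence."""
    chars = list(sentence.replace(" ", ""))          # character segmentation
    words = list(jieba.cut(sentence))                # word segmentation
    return chars, words


def build_training_set(matched_pairs):
    """matched_pairs: list of (sentence1, sentence2) whose semantics are consistent."""
    examples = []
    all_sentences = [s for pair in matched_pairs for s in pair]
    for s1, s2 in matched_pairs:
        examples.append((s1, s2, 1))                                   # positive example, label 1
        candidates = [s for s in all_sentences if s not in (s1, s2)]   # crude non-matching pool
        examples.append((s1, random.choice(candidates), 0))            # negative example, label 0
    random.shuffle(examples)                                           # disorder the sequence
    return examples


# the loss the model optimization unit would minimize on the matching degree values
criterion = torch.nn.BCELoss()
degrees = torch.tensor([0.83, 0.12])      # predicted matching degree values
labels = torch.tensor([1.0, 0.0])         # matching labels from the training set
loss = criterion(degrees, labels)

Sampling negatives uniformly from the remaining sentences is a simplification; in practice one would verify that the sampled sentence genuinely does not match sentence1.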
9. A storage medium having stored thereon a plurality of instructions, characterized in that the instructions are loaded by a processor to execute the steps of the intelligent question-answer sentence pair semantic matching method for government affair consultation service according to any one of claims 1-6.
10. An electronic device, characterized in that the electronic device comprises: the storage medium of claim 9; and a processor for executing instructions in the storage medium.
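Below is a speculative sketch of the interactive matching and label prediction units referenced in claim 7 above. The cross-attention alignment, the max-pooling used for feature screening and the 0.5 decision threshold are assumptions for illustration; the claims only state that the two fused representations are interactively matched, screened, reduced to a matching degree value and compared with a threshold.

# Speculative sketch; alignment, pooling and threshold are illustrative assumptions.
import torch
import torch.nn as nn


class InteractiveMatcher(nn.Module):
    def __init__(self, dim, threshold=0.5):
        super().__init__()
        self.score = nn.Linear(4 * dim, 1)
        self.threshold = threshold

    def forward(self, rep1, rep2):
        # rep1: (batch, len1, dim), rep2: (batch, len2, dim) - the two sentences'
        # gated deep feature residual fusion representations
        attn = torch.softmax(rep1 @ rep2.transpose(1, 2), dim=-1)     # soft alignment
        aligned2 = attn @ rep2                                        # rep2 aligned to rep1 positions
        inter = torch.cat([rep1, aligned2, rep1 - aligned2, rep1 * aligned2], dim=-1)
        matched, _ = inter.max(dim=1)                                 # feature screening by max-pooling
        degree = torch.sigmoid(self.score(matched)).squeeze(-1)       # matching degree value in (0, 1)
        return degree, degree > self.threshold                        # degree and match / no-match decision


matcher = InteractiveMatcher(dim=128)
degree, is_match = matcher(torch.randn(2, 20, 128), torch.randn(2, 24, 128))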
CN202010855426.0A 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service Active CN112001166B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010855426.0A CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010855426.0A CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Publications (2)

Publication Number Publication Date
CN112001166A true CN112001166A (en) 2020-11-27
CN112001166B CN112001166B (en) 2023-10-17

Family

ID=73470203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010855426.0A Active CN112001166B (en) 2020-08-24 2020-08-24 Intelligent question-answer sentence semantic matching method and device for government affair consultation service

Country Status (1)

Country Link
CN (1) CN112001166B (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143137A1 (en) * 2019-01-07 2020-07-16 北京大学深圳研究生院 Multi-step self-attention cross-media retrieval method based on restricted text space and system
CN110032635A (en) * 2019-04-22 2019-07-19 齐鲁工业大学 One kind being based on the problem of depth characteristic fused neural network to matching process and device
CN111310439A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN111325028A (en) * 2020-02-20 2020-06-23 齐鲁工业大学 Intelligent semantic matching method and device based on deep hierarchical coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈志豪; 余翔; 刘子辰; 邱大伟; 顾本刚: "Chinese medical question-answer matching method based on attention and character embedding" (基于注意力和字嵌入的中文医疗问答匹配方法), 计算机应用 (Journal of Computer Applications), no. 06 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966524A (en) * 2021-03-26 2021-06-15 湖北工业大学 Chinese sentence semantic matching method and system based on multi-granularity twin network
CN112966524B (en) * 2021-03-26 2024-01-26 湖北工业大学 Chinese sentence semantic matching method and system based on multi-granularity twin network
CN113065359A (en) * 2021-04-07 2021-07-02 齐鲁工业大学 Sentence-to-semantic matching method and device oriented to intelligent interaction
CN113065358A (en) * 2021-04-07 2021-07-02 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113065358B (en) * 2021-04-07 2022-05-24 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113065359B (en) * 2021-04-07 2022-05-24 齐鲁工业大学 Sentence-to-semantic matching method and device oriented to intelligent interaction
CN113268962A (en) * 2021-06-08 2021-08-17 齐鲁工业大学 Text generation method and device for building industry information service question-answering system
CN114238563A (en) * 2021-12-08 2022-03-25 齐鲁工业大学 Multi-angle interaction-based intelligent matching method and device for Chinese sentences to semantic meanings
CN114328883A (en) * 2022-03-08 2022-04-12 恒生电子股份有限公司 Data processing method, device, equipment and medium for machine reading understanding
CN114328883B (en) * 2022-03-08 2022-06-28 恒生电子股份有限公司 Data processing method, device, equipment and medium for machine reading understanding

Also Published As

Publication number Publication date
CN112001166B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN111310438B (en) Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN112001166A (en) Intelligent question-answer sentence-to-semantic matching method and device for government affair consultation service
CN109597891B (en) Text emotion analysis method based on bidirectional long-and-short-term memory neural network
CN108415977B (en) Deep neural network and reinforcement learning-based generative machine reading understanding method
CN113312500B (en) Method for constructing event map for safe operation of dam
CN112000772B (en) Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer
CN111144448A (en) Video barrage emotion analysis method based on multi-scale attention convolutional coding network
CN110390397B (en) Text inclusion recognition method and device
CN112000770B (en) Semantic feature graph-based sentence semantic matching method for intelligent question and answer
CN113065358B (en) Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN110750635B (en) French recommendation method based on joint deep learning model
CN112000771B (en) Judicial public service-oriented sentence pair intelligent semantic matching method and device
CN113254610B (en) Multi-round conversation generation method for patent consultation
CN111310439A (en) Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN111325028A (en) Intelligent semantic matching method and device based on deep hierarchical coding
CN108549658A (en) A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN112115687A (en) Problem generation method combining triples and entity types in knowledge base
CN110196967A (en) Sequence labelling method and apparatus based on depth converting structure
CN109919175A (en) A kind of more classification methods of entity of combination attribute information
CN105975497A (en) Automatic microblog topic recommendation method and device
CN116975776A (en) Multi-mode data fusion method and device based on tensor and mutual information
CN114297399A (en) Knowledge graph generation method, knowledge graph generation system, storage medium and electronic equipment
CN117313728A (en) Entity recognition method, model training method, device, equipment and storage medium
CN113705242B (en) Intelligent semantic matching method and device for education consultation service
CN114356990A (en) Base named entity recognition system and method based on transfer learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant