CN112000772A - Sentence-pair semantic matching method based on a semantic feature cube and oriented to intelligent question answering - Google Patents

Info

Publication number
CN112000772A
Authority
CN
China
Prior art keywords: word, sentence, layer, semantic, matching
Prior art date
Legal status: Granted
Application number
CN202010855971.XA
Other languages
Chinese (zh)
Other versions
CN112000772B (en)
Inventor
鹿文鹏 (Lu Wenpeng)
于瑞 (Yu Rui)
张旭 (Zhang Xu)
Current Assignee: Qilu University of Technology
Original Assignee: Qilu University of Technology
Priority date
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN202010855971.XA priority Critical patent/CN112000772B/en
Publication of CN112000772A publication Critical patent/CN112000772A/en
Application granted granted Critical
Publication of CN112000772B publication Critical patent/CN112000772B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks


Abstract

The invention discloses a sentence-pair semantic matching method based on a semantic feature cube and oriented to intelligent question answering, belonging to the technical field of artificial intelligence and natural language processing. The technical problem to be solved is how to capture more semantic context features, temporal (word-order) features, the connections between coding information in different dimensions, and the interaction information between sentences, so as to achieve intelligent semantic matching of sentence pairs. The adopted technical scheme is as follows: a sentence-pair semantic matching model consisting of a multi-granularity embedding module, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module is constructed and trained. The model builds a deep semantic feature cube representation of sentence information and a three-dimensional convolutional encoding of semantic features, generates the final matching tensor of the sentence pair through an attention mechanism, and judges the matching degree of the sentence pair, thereby achieving the goal of intelligent semantic matching of sentence pairs. The corresponding device comprises a sentence-pair semantic matching knowledge base construction unit, a training data set generation unit, a sentence-pair semantic matching model construction unit and a sentence-pair semantic matching model training unit.

Description

Sentence-pair semantic matching method based on a semantic feature cube and oriented to intelligent question answering
Technical Field
The invention relates to the technical field of artificial intelligence and natural language processing, and in particular to a sentence-pair semantic matching method based on a semantic feature cube and oriented to intelligent question answering.
Background
The intelligent question-answering system is one of the core technologies of human-computer interaction. For a question posed by a user, it can automatically find the matching standard question in a question-answering knowledge base and push the answer of that standard question to the user, which greatly reduces the burden of manual answering. Intelligent question-answering systems are widely used in fields such as self-service and intelligent customer service. For the varied questions posed by users, finding the matching standard question is the core technology of an intelligent question-answering system. Its essence is to measure the matching degree between the question posed by the user and the standard questions in the question-answering knowledge base, which is precisely the task of sentence-pair semantic matching.
The sentence-pair semantic matching task aims to measure whether the semantics implied by two sentences are consistent, which coincides with the core goal of many natural language processing tasks, such as the intelligent question answering described above. Computing the semantic match of natural language sentences is a very challenging task, and existing methods have not yet solved it satisfactorily.
In existing methods, matching the semantics of a sentence pair requires designing a specific neural network to encode the sentences semantically and extract the corresponding semantic features. For text semantic encoding, the most widely applied models are one-dimensional convolutional neural networks, two-dimensional convolutional neural networks, recurrent neural networks and their many variants. Most methods built on these network structures focus on mining deeper semantic information, while the temporal information in the sentence is not fully utilized. For example, convolutional neural networks are applied to a variety of natural language processing tasks because they excel at capturing local features, but whether one-dimensional or two-dimensional, the temporal information they can capture is determined by their convolution kernel size, which is very limited compared with the length of a sentence. Therefore, methods using one-dimensional or two-dimensional convolutional neural networks inevitably lose part of the temporal features when processing sentences. Recurrent neural networks and their variants have a chain structure, so they naturally retain the temporal information contained in sentences when processing text, and this information is very important for modeling and representing text. However, a close look at the structure of the recurrent neural network shows that the temporal information is merely retained: the recurrent neural network does not perform any targeted mining or processing of the information along the temporal dimension. In summary, one-dimensional convolutional neural networks, two-dimensional convolutional neural networks and recurrent neural networks all fail to fully exploit the temporal information in sentences.
Disclosure of Invention
The technical task of the invention is to provide a sentence-pair semantic matching method based on a semantic feature cube and oriented to intelligent question answering, which fully exploits the advantages of convolutional neural networks, captures more semantic context information and inter-sentence interaction information, and finally achieves intelligent semantic matching of sentence pairs through an attention mechanism.
The technical task of the invention is achieved in the following way. The sentence-pair semantic matching method for intelligent question answering based on a semantic feature cube constructs and trains a sentence-pair semantic matching model consisting of a multi-granularity embedding module, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module. The model builds a deep semantic feature cube representation of sentence information and a three-dimensional convolutional encoding of semantic features, generates the final matching tensor of the sentence pair through an attention mechanism and judges the matching degree of the sentence pair, thereby achieving intelligent semantic matching of sentence pairs. The specific steps are as follows:
the multi-granularity embedding module embeds the input sentence at character granularity and word granularity respectively, to obtain a multi-granularity embedded representation of the sentence;
the deep semantic feature cube construction network module performs encoding operations on the multi-granularity embedded representation of the sentence to obtain the deep semantic feature cube of the sentence;
the feature conversion network module further performs feature encoding, feature matching and feature screening operations on the deep semantic feature cubes of the sentence pair to obtain the matching tensor of the sentence pair;
the label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval, compares this value, taken as the matching degree, with a preset threshold, and judges from the comparison result whether the semantics of the sentence pair match.
Preferably, building the multi-granularity embedding module consists of constructing a character mapping conversion table, a word mapping conversion table, an input module, a character vector mapping layer and a word vector mapping layer;
constructing the character mapping conversion table: the mapping rule is as follows: starting with the number 1, each character is numbered in increasing order according to the order in which it is recorded in the character table, thereby forming the character mapping conversion table required by the invention. The character table is built from the sentence-pair semantic matching word-breaking knowledge base, which is obtained by performing a word-breaking (character segmentation) operation on the original data text of the sentence-pair semantic matching knowledge base. Afterwards, Word2Vec is used to train a character vector model, obtaining the character vector matrix of each character;
constructing the word mapping conversion table: the mapping rule is as follows: starting with the number 1, each word is numbered in increasing order according to the order in which it enters the word table, thereby forming the word mapping conversion table required by the invention. The word table is built from the sentence-pair semantic matching word-segmentation knowledge base, which is obtained by performing word segmentation on the original data text of the sentence-pair semantic matching knowledge base. Afterwards, Word2Vec is used to train a word vector model, obtaining the word vector matrix of each word;
constructing the input module: the input layer comprises four inputs. Each sentence pair in the training data set, or each sentence pair to be predicted, is preprocessed by word breaking and word segmentation to obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word, where the suffixes char and word denote the word-breaking and word-segmentation processing of the corresponding sentence respectively; the input is formalized as (sentence1_char, sentence2_char, sentence1_word, sentence2_word). Each character or word in the input sentence is converted into the corresponding numeric identifier according to the character mapping conversion table and the word mapping conversion table;
constructing the character vector mapping layer: the character vector matrix weights obtained in the step of constructing the character mapping conversion table are loaded to initialize the weight parameters of this layer. For the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained. In this way, every sentence in the sentence-pair semantic matching word-breaking knowledge base can have its sentence information converted into vector form through character vector mapping.
Constructing the word vector mapping layer: the word vector matrix weights obtained in the step of constructing the word mapping conversion table are loaded to initialize the weight parameters of this layer. For the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained. In this way, every sentence in the sentence-pair semantic matching word-segmentation knowledge base can have its sentence information converted into vector form through word vector mapping.
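As an illustration of the multi-granularity embedding module, the following sketch (assuming Keras/TensorFlow and Gensim; the variable names char_corpus, char_list and emb_dim are hypothetical and not taken from the patent) shows how a character mapping conversion table starting from the number 1 and a Word2Vec-initialized character vector mapping layer might be built; the word-granularity mapping table and word vector mapping layer can be built in exactly the same way on the word-segmentation knowledge base.

    import numpy as np
    from gensim.models import Word2Vec
    from tensorflow.keras.layers import Embedding

    # char_corpus: list of character-segmented sentences from the word-breaking knowledge base,
    # e.g. [["世", "界", "上", ...], ...]  (hypothetical variable)
    char_list = sorted({ch for sent in char_corpus for ch in sent})
    # character mapping conversion table: ids start from 1 (0 is reserved for padding)
    char2id = {ch: idx + 1 for idx, ch in enumerate(char_list)}

    # train a character vector model with Word2Vec and assemble the character vector matrix
    emb_dim = 300
    w2v = Word2Vec(sentences=char_corpus, vector_size=emb_dim, min_count=1, sg=1)
    char_matrix = np.zeros((len(char2id) + 1, emb_dim))
    for ch, idx in char2id.items():
        char_matrix[idx] = w2v.wv[ch]

    # character vector mapping layer initialized with the trained Word2Vec weights
    char_embedding = Embedding(input_dim=char_matrix.shape[0],
                               output_dim=emb_dim,
                               weights=[char_matrix],
                               name="char_vector_mapping")

In use, sentence1_char and sentence2_char are first converted to id sequences through char2id and then passed through this layer to obtain sentence1_char_embed and sentence2_char_embed.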
Preferably, the deep semantic feature cube construction network module is built as follows:
The first-layer encoding structure BiLSTM_1 encodes the character embedding representation and the word embedding representation output by the multi-granularity embedding module, obtaining the first-layer character encoding result h^{c}_{1} and the first-layer word encoding result h^{w}_{1}. Two new dimensions, defined as the granularity dimension and the depth dimension, are then added to each of them, giving the dimension-raised first-layer character encoding result \hat{h}^{c}_{1} and the dimension-raised first-layer word encoding result \hat{h}^{w}_{1}; these are concatenated along the granularity dimension to generate the first-level semantic feature cube. The formulas are as follows:

\hat{h}^{c}_{1} = reshape(BiLSTM_{1}(sentence\_char\_embed))    (1.1)
\hat{h}^{w}_{1} = reshape(BiLSTM_{1}(sentence\_word\_embed))    (1.2)
cube^{1} = concat_{granularity}(\hat{h}^{c}_{1}, \hat{h}^{w}_{1})    (1.3)

Formula (1.1) means that BiLSTM_1 encodes the character embedding representation output by the multi-granularity embedding module (sentence1_char_embed or sentence2_char_embed, in which the i_c-th vector represents the character at relative position i_c in the sentence), and two dimensions are added by a reshape operation. Concretely, after BiLSTM_1 processing, a tensor of shape (batch_size, time_steps, output_dimension) is obtained; the reshape operation adds two new dimensions, yielding a tensor of shape (batch_size, time_steps, 1, output_dimension, 1), in which the newly added third dimension is called the granularity dimension and the newly added fifth dimension is called the depth dimension. This tensor is the dimension-raised first-layer character encoding result. Formula (1.2) applies the same operation to the word embedding representation (sentence1_word_embed or sentence2_word_embed, in which the i_w-th vector represents the word at relative position i_w in the sentence), yielding the dimension-raised first-layer word encoding result, likewise of shape (batch_size, time_steps, 1, output_dimension, 1). Formula (1.3) concatenates the first-layer character encoding result and the first-layer word encoding result along the granularity dimension to obtain the first-level semantic feature cube cube^{1}, whose shape is (batch_size, time_steps, 2, output_dimension, 1).
Further, the first-layer character encoding result and the first-layer word encoding result before dimension raising, i.e. h^{c}_{1} and h^{w}_{1}, are passed to the second-layer encoding structure BiLSTM_2. BiLSTM_2 encodes the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result h^{c}_{2} and the second-layer word encoding result h^{w}_{2}. Two dimensions are then added to each of them, giving the dimension-raised second-layer character and word encoding results \hat{h}^{c}_{2} and \hat{h}^{w}_{2}; the newly added dimensions are again the granularity dimension and the depth dimension, and the two results are concatenated along the granularity dimension to generate the second-level semantic feature cube. The formulas are as follows:

\hat{h}^{c}_{2} = reshape(BiLSTM_{2}(h^{c}_{1}))    (2.1)
\hat{h}^{w}_{2} = reshape(BiLSTM_{2}(h^{w}_{1}))    (2.2)
cube^{2} = concat_{granularity}(\hat{h}^{c}_{2}, \hat{h}^{w}_{2})    (2.3)

The meaning of formula (2.1) is similar to formula (1.1), except that the object encoded by BiLSTM_2 is the first-layer character encoding result before dimension raising; \hat{h}^{c}_{2} denotes the second-layer character encoding result obtained after BiLSTM_2 processing and the addition of two dimensions. The meaning of formula (2.2) is similar to formula (1.2), except that the object encoded by BiLSTM_2 is the first-layer word encoding result before dimension raising; \hat{h}^{w}_{2} denotes the second-layer word encoding result obtained after BiLSTM_2 processing and the addition of two dimensions. The construction steps and shapes of \hat{h}^{c}_{2} and \hat{h}^{w}_{2} are consistent with those of \hat{h}^{c}_{1} and \hat{h}^{w}_{1}. The meaning of formula (2.3) is similar to formula (1.3), except that the objects being concatenated are the dimension-raised second-layer character and word encoding results; cube^{2} denotes the second-level semantic feature cube, whose shape is (batch_size, time_steps, 2, output_dimension, 1).
Further, the second-layer character encoding result and the second-layer word encoding result before dimension raising, i.e. h^{c}_{2} and h^{w}_{2}, are passed to the third-layer encoding structure BiLSTM_3. By analogy, repeating the encoding several times generates semantic feature cubes of multiple levels; according to the preset hierarchy depth of the model, the depth-th-level semantic feature cube is generated. For the depth-th layer, the formulas are as follows:

\hat{h}^{c}_{depth} = reshape(BiLSTM_{depth}(h^{c}_{depth-1}))    (3.1)
\hat{h}^{w}_{depth} = reshape(BiLSTM_{depth}(h^{w}_{depth-1}))    (3.2)
cube^{depth} = concat_{granularity}(\hat{h}^{c}_{depth}, \hat{h}^{w}_{depth})    (3.3)

The meaning of formula (3.1) is similar to formula (2.1), except that the object encoded by BiLSTM_depth is the (depth-1)-th-layer character encoding result before dimension raising; \hat{h}^{c}_{depth} denotes the depth-th-layer character encoding result obtained after BiLSTM_depth processing and the addition of two dimensions. The meaning of formula (3.2) is similar to formula (2.2), except that the object encoded by BiLSTM_depth is the (depth-1)-th-layer word encoding result before dimension raising; \hat{h}^{w}_{depth} denotes the depth-th-layer word encoding result obtained after BiLSTM_depth processing and the addition of two dimensions. The construction steps and shapes of \hat{h}^{c}_{depth} and \hat{h}^{w}_{depth} are consistent with those of \hat{h}^{c}_{2} and \hat{h}^{w}_{2}. The meaning of formula (3.3) is similar to formula (2.3), except that the objects being concatenated are the dimension-raised depth-th-layer character and word encoding results; cube^{depth} denotes the depth-th-level semantic feature cube, whose shape is (batch_size, time_steps, 2, output_dimension, 1).
Further, after the semantic feature cubes of all levels have been obtained, they are concatenated along the depth dimension to generate the final deep semantic feature cube, with the formula:

cube = concat_{depth}(cube^{1}, cube^{2}, ..., cube^{depth})    (4)

Formula (4) concatenates the semantic feature cubes of all levels along the depth dimension; cube denotes the final deep semantic feature cube, whose shape is (batch_size, time_steps, 2, output_dimension, depth).
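The layered construction described above can be sketched in Keras roughly as follows. This is a simplified sketch under the assumptions that the character and word embeddings share the same dimension and are padded to the same number of time steps; the helper name build_cube and the parameters units and depth are illustrative, not taken from the patent.

    from tensorflow.keras.layers import Bidirectional, LSTM, Reshape, Concatenate

    def build_cube(char_embed, word_embed, time_steps, units, depth):
        """Stack BiLSTM layers and assemble the deep semantic feature cube."""
        cubes = []
        c, w = char_embed, word_embed
        for _ in range(depth):
            # one shared BiLSTM per level encodes both granularities (BiLSTM_1, BiLSTM_2, ...)
            bilstm = Bidirectional(LSTM(units, return_sequences=True))
            c = bilstm(c)                                   # (batch, time_steps, 2*units)
            w = bilstm(w)
            # raise dimensions: add a granularity axis and a depth axis
            c5 = Reshape((time_steps, 1, 2 * units, 1))(c)
            w5 = Reshape((time_steps, 1, 2 * units, 1))(w)
            # concatenate the character/word results on the granularity dimension
            cubes.append(Concatenate(axis=2)([c5, w5]))     # (batch, time_steps, 2, 2*units, 1)
        # concatenate the cubes of all levels on the depth dimension
        return cubes[0] if depth == 1 else Concatenate(axis=4)(cubes)
        # final shape: (batch, time_steps, 2, 2*units, depth)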
Preferably, the feature conversion network module is built as follows:
Constructing the three-dimensional convolutional semantic feature encoding layer: this layer receives the deep semantic feature cube output by the deep semantic feature cube construction network module as input, and then encodes it with a three-dimensional convolutional neural network to obtain the corresponding semantic feature encoded representation. The formulas are as follows:

conv^{f}_{i,j,k} = ReLU(W^{f} \cdot cube[i:i+x_1-1, j:j+y_1-1, k:k+z_1-1] + b^{f})    (5.1)
E^{f}_{k} = [conv^{f}_{i,j,k}],  i \in \{1, 1+s_{x1}, ...\},  j \in \{1, 1+s_{y1}, ...\}    (5.2)
E^{f} = [E^{f}_{k}],  k \in \{1, 1+s_{z1}, ...\}    (5.3)
E = [E^{1}, E^{2}, ..., E^{n}]    (5.4)

Here, the deep semantic feature cube obtained after the processing of step 3.6 is the input of this layer. Formula (5.1) gives the ReLU-mapped result of convolving the f-th convolution kernel over a specific region of the deep semantic feature cube, where [x_1, y_1, z_1] is the size of the convolution kernel, W^{f} is the weight matrix of the f-th kernel, i, j and k are the horizontal, vertical and depth coordinates of the convolution region, m_l, m_h and m_d are the length, height and depth of the deep semantic feature cube, i:i+x_1-1, j:j+y_1-1, k:k+z_1-1 denotes the convolution region, b^{f} is the bias of the f-th kernel, and conv^{f}_{i,j,k} is the convolution result of the f-th kernel on that region. Formula (5.2) assembles the horizontal and vertical convolution results of the f-th kernel in each region to obtain the k-th depth convolution result of the f-th kernel, where s_{x1} and s_{y1} are the horizontal and vertical convolution strides and E^{f}_{k} is the k-th depth convolution result of the f-th kernel. Formula (5.3) assembles all depth convolution results of the f-th kernel to obtain its depth convolution result E^{f}, where s_{z1} is the depth convolution stride. Formula (5.4) assembles the depth convolution results of all n kernels to obtain the final convolution result of this layer on the deep semantic feature cube, i.e. E, which is called the semantic feature encoded representation.
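A minimal Keras sketch of this three-dimensional convolutional encoding is given below. It assumes one plausible reading of the cube layout in which the granularity axis (size 2) is moved to the channel position so that the time, output-dimension and depth axes are the three convolved axes; the kernel size, stride and filter count are illustrative values, not taken from the patent, and must not exceed the corresponding cube dimensions.

    from tensorflow.keras.layers import Permute, Conv3D

    # deep_cube: (batch, time_steps, 2, output_dimension, depth), e.g. from build_cube above
    n_filters = 32
    # move the granularity axis (size 2) to the end so it serves as the channel axis
    x = Permute((1, 3, 4, 2))(deep_cube)            # (batch, time_steps, output_dimension, depth, 2)
    # n filters of size [x1, y1, z1] with strides [sx1, sy1, sz1] and ReLU, as in formulas (5.1)-(5.4)
    encoded = Conv3D(filters=n_filters,
                     kernel_size=(3, 3, 2),
                     strides=(1, 1, 1),
                     padding="valid",
                     activation="relu")(x)          # semantic feature encoded representation E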
Constructing the semantic feature matching layer: this layer first concatenates the semantic feature encoded representations of sentence1 and sentence2, E^{s1} and E^{s2}, along the depth dimension to obtain the sentence-pair connection tensor C, with the formula:

C = concat_{depth}(E^{s1}, E^{s2})    (6)

Subsequently, semantic feature matching operations are performed on the sentence-pair connection tensor to generate the sentence-pair preliminary matching tensor; the specific process comprises the following two steps:
First, another three-dimensional convolutional neural network performs three-dimensional convolution matching on C to obtain the sentence-pair three-dimensional convolution matching tensor, with the formulas:

match^{f}_{i,j,k} = ReLU(V^{f} \cdot C[i:i+x_2-1, j:j+y_2-1, k:k+z_2-1] + d^{f})    (7.1)
M^{f}_{k} = [match^{f}_{i,j,k}],  i \in \{1, 1+s_{x2}, ...\},  j \in \{1, 1+s_{y2}, ...\}    (7.2)
M^{f} = [M^{f}_{k}],  k \in \{1, 1+s_{z2}, ...\}    (7.3)
M = [M^{1}, M^{2}, ..., M^{n}]    (7.4)

Here, the sentence-pair connection tensor C is the input of this layer. Formula (7.1) gives the ReLU-mapped result of convolving the f-th convolution kernel over a specific region of the connection tensor, where [x_2, y_2, z_2] is the size of the convolution kernel, V^{f} is the weight matrix of the f-th kernel, i, j and k are the horizontal, vertical and depth coordinates of the convolution region, r_l, r_h and r_d are the length, height and depth of the sentence-pair connection tensor, i:i+x_2-1, j:j+y_2-1, k:k+z_2-1 denotes the convolution region, d^{f} is the bias of the f-th kernel, and match^{f}_{i,j,k} is the convolution result of the f-th kernel on that region. Formula (7.2) assembles the horizontal and vertical convolution results of the f-th kernel in each region to obtain the k-th depth convolution result of the f-th kernel, where s_{x2} and s_{y2} are the horizontal and vertical convolution strides and M^{f}_{k} is the k-th depth convolution result of the f-th kernel. Formula (7.3) assembles all depth convolution results of the f-th kernel to obtain its depth convolution result M^{f}, where s_{z2} is the depth convolution stride. Formula (7.4) assembles the depth convolution results of all n kernels to obtain the final convolution result of this layer on the sentence-pair connection tensor, i.e. M, which is called the sentence-pair three-dimensional convolution matching tensor.
Second, a two-dimensional convolutional neural network performs two-dimensional convolution matching on M to obtain the sentence-pair preliminary matching tensor, with the formulas:

g^{f}_{i,j} = ReLU(U^{f} \cdot M[i:i+x_3-1, j:j+y_3-1] + e^{f})    (8.1)
G^{f} = [g^{f}_{i,j}],  i \in \{1, 1+s_{x3}, ...\},  j \in \{1, 1+s_{y3}, ...\}    (8.2)
G = [G^{1}, G^{2}, ..., G^{n}]    (8.3)

Here, the sentence-pair three-dimensional convolution matching tensor M is the input of this layer. Formula (8.1) gives the ReLU-mapped result of convolving the f-th convolution kernel over a specific region of the three-dimensional convolution matching tensor, where [x_3, y_3] is the size of the convolution kernel, U^{f} is the weight matrix of the f-th kernel, i and j are the horizontal and vertical coordinates of the convolution region, t_l and t_h are the length and height of the sentence-pair three-dimensional convolution matching tensor, i:i+x_3-1, j:j+y_3-1 denotes the convolution region, e^{f} is the bias of the f-th kernel, and g^{f}_{i,j} is the convolution result of the f-th kernel on that region. Formula (8.2) assembles the convolution results of the f-th kernel over all regions to obtain its final convolution result G^{f}, where s_{x3} and s_{y3} are the horizontal and vertical convolution strides. Formula (8.3) combines the final convolution results of the n kernels to obtain the final convolution result of this layer on the sentence-pair three-dimensional convolution matching tensor, i.e. G, which is called the sentence-pair preliminary matching tensor.
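Taken together, the matching layer can be sketched as follows. This is a rough Keras sketch: encoded_1 and encoded_2 are placeholder names for the semantic feature encoded representations of the two sentences, the kernel sizes are illustrative, and folding the depth axis into the channel axis before the two-dimensional convolution is one plausible reading of formulas (8.1)-(8.3), not a detail stated in the patent.

    from tensorflow.keras.layers import Concatenate, Conv3D, Conv2D, Reshape

    n_filters = 32

    # formula (6): join the two encoded representations on the depth dimension
    joined = Concatenate(axis=3)([encoded_1, encoded_2])

    # formulas (7.1)-(7.4): three-dimensional convolution matching
    match_3d = Conv3D(filters=n_filters, kernel_size=(3, 3, 3),
                      strides=(1, 1, 1), activation="relu")(joined)

    # fold the remaining depth axis into the channels (assumes static shapes)
    t, h = match_3d.shape[1], match_3d.shape[2]
    flat = Reshape((t, h, -1))(match_3d)

    # formulas (8.1)-(8.3): two-dimensional convolution matching
    preliminary_match = Conv2D(filters=n_filters, kernel_size=(3, 3),
                               strides=(1, 1), activation="relu")(flat)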
Constructing the semantic feature screening layer: this layer receives the sentence-pair preliminary matching tensor output by the semantic feature matching layer as input, and performs semantic feature screening and weighting on it, with the formulas:

\tilde{g}_{i} = \tanh(W_{1} \hat{g}_{i}) W_{2}    (9.1)
a_{i} = softmax(\tilde{g}_{i})    (9.2)
P = \sum_{i=1}^{N} a_{i} \hat{g}_{i}    (9.3)

Formula (9.1) maps the feature vectors \hat{g}_{i} of the sentence-pair preliminary matching tensor G, where W_{1} and W_{2} are the corresponding trainable weight matrices in the model and \tilde{g}_{i} is the mapped result. Formula (9.2) computes the attention weights, where a_{i} denotes the attention weight. Formula (9.3) uses the attention weights to generate the final matching vector, where N is the number of feature vectors in G and P is the final sentence-pair semantic matching tensor.
Preferably, the label prediction module is constructed as follows:
The sentence-pair semantic matching tensor is used as the input of this module and is processed by a fully connected layer with output dimension 1 and a sigmoid activation function, yielding a matching degree value in [0, 1], denoted y_pred. Finally, y_pred is compared with the preset threshold (0.5) to judge whether the semantics of the sentence pair match: if y_pred ≥ 0.5, the sentence pair is predicted to be semantically matched; otherwise it is not matched. When the sentence-pair semantic matching model has not been sufficiently trained, it must be trained on the training data set to optimize the model parameters; once training is complete, the label prediction module can predict whether the semantics of a target sentence pair match.
Preferably, the sentence-pair semantic matching knowledge base is constructed as follows:
Downloading a data set from the network to obtain the original data: a publicly available sentence-pair semantic matching data set, or a manually constructed data set, is downloaded and used as the original data for constructing the sentence-pair semantic matching knowledge base;
Preprocessing the original data: the original data used for constructing the sentence-pair semantic matching knowledge base is preprocessed, and word-breaking and word-segmentation operations are performed on each sentence to obtain the sentence-pair semantic matching word-breaking knowledge base and word-segmentation knowledge base;
Summarizing the sub-knowledge bases: the sentence-pair semantic matching word-breaking knowledge base and the sentence-pair semantic matching word-segmentation knowledge base are gathered together to construct the sentence-pair semantic matching knowledge base.
The sentence-pair semantic matching model is obtained by training on a training data set, which is constructed as follows:
Constructing training positive examples: two sentences with consistent semantics in the sentence-pair semantic matching knowledge base are combined into a positive example, formalized as (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1), where sentence1_char and sentence2_char refer to sentence1 and sentence2 in the word-breaking knowledge base, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the word-segmentation knowledge base, and 1 indicates that the semantics of the two sentences match, i.e. a positive example;
Constructing training negative examples: a sentence s1 is selected, and a sentence s2 that does not match s1 is randomly selected from the sentence-pair semantic matching knowledge base; s1 and s2 are combined to construct a negative example, formalized as (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0), where sentence1_char and sentence1_word refer to sentence s1 in the word-breaking and word-segmentation knowledge bases respectively, sentence2_char and sentence2_word refer to sentence s2 in the word-breaking and word-segmentation knowledge bases respectively, and 0 indicates that the semantics of sentence s1 and sentence s2 do not match, i.e. a negative example;
Constructing the training data set: all positive example sentence pairs and negative example sentence pairs obtained from the above two operations are combined and shuffled to construct the final training data set. Both positive and negative example data contain five dimensions, namely sentence1_char, sentence2_char, sentence1_word, sentence2_word, and 0 or 1;
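A plain-Python sketch of this construction is given below. The name knowledge_base is hypothetical (a list of matched pairs, each stored in both granularities), and a real implementation would additionally verify that the randomly sampled sentence indeed does not match s1.

    import random

    def build_training_set(knowledge_base):
        """knowledge_base: list of (s1_char, s2_char, s1_word, s2_word) matched pairs."""
        positives = [(s1c, s2c, s1w, s2w, 1) for s1c, s2c, s1w, s2w in knowledge_base]
        negatives = []
        for s1c, _, s1w, _ in knowledge_base:
            # randomly pick a sentence from another pair as the non-matching partner
            s2c_neg, _, s2w_neg, _ = random.choice(knowledge_base)
            negatives.append((s1c, s2c_neg, s1w, s2w_neg, 0))
        dataset = positives + negatives
        random.shuffle(dataset)          # mix and shuffle positive and negative examples
        return dataset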
after the sentence-to-semantic matching model is built, training and optimizing the sentence-to-semantic matching model through a training data set are carried out, which specifically comprises the following steps:
constructing a loss function: known from the label prediction module construction process, ypredIs a matching degree calculation value y obtained by processing a sentence to a semantic matching modeltrueIs two sentence languagesDefining whether the matched real label is present, wherein the value of the matched real label is limited to 0 or 1, the cross entropy is adopted as a loss function, and the formula is as follows:
Figure BDA0002646398060000091
optimizing a training model: using RMSProp as an optimization algorithm, except that the learning rate is set to 0.0015, the remaining hyper-parameters of RMSProp all select default settings in Keras; and optimally training the sentence pair semantic matching model on the training data set.
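Under these settings the training step can be sketched as follows, assuming the model has been assembled with the Keras functional API; model, the id-sequence inputs and labels are placeholder names.

    from tensorflow.keras.optimizers import RMSprop

    model.compile(loss="binary_crossentropy",               # cross-entropy loss of formula (10)
                  optimizer=RMSprop(learning_rate=0.0015),  # other hyper-parameters keep Keras defaults
                  metrics=["accuracy"])
    model.fit([s1_char_ids, s2_char_ids, s1_word_ids, s2_word_ids], labels,
              batch_size=batch_size, epochs=epochs)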
A sentence-pair semantic matching device for intelligent question answering based on a semantic feature cube comprises:
the sentence-to-semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data and then carrying out preprocessing operation on the sentence pair data so as to obtain a sentence-to-semantic matching knowledge base which meets the training requirement;
a training data set generating unit for constructing positive example data and negative example data for training according to sentences in the sentence-to-semantic matching knowledge base, and constructing a final training data set based on the positive example data and the negative example data;
the sentence pair semantic matching model construction unit is used for constructing a word mapping conversion table and a word mapping conversion table, and simultaneously constructing an input module, a word vector mapping layer, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module; the sentence-to-semantic matching model construction unit includes,
the word mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the semantic matching knowledge base by the sentence according to the word granularity or the word granularity, sequentially storing each word or word in a list to obtain a word list or word list, and sequentially increasing and sequencing the words or words according to the sequence of the words or word lists recorded by the words or words with the number 1 as the start to form the word mapping conversion table or word mapping conversion table required by the invention; after the word mapping conversion table or the word mapping conversion table is constructed, each word or word in the table is mapped into a unique digital identifier; then, the Word vector model or the Word vector model is trained by using Word2Vec to obtain a Word vector matrix of each Word or a Word vector matrix of each Word;
the input module construction unit is responsible for preprocessing each sentence pair or sentence pair to be predicted in the training data set, respectively acquiring sensor 1_ char, sensor 2_ char, sensor 1_ word and sensor 2_ word, and formalizing the words as follows: (sensor 1_ char, sensor 2_ char, sensor 1_ word, sensor 2_ word);
the word vector mapping layer or word vector mapping layer construction unit is responsible for loading a word vector matrix or word vector matrix obtained by training in the step of the word mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for word vector mapping, aiming at input sentences of sensor 1_ char and sensor 2_ char, obtaining corresponding sentence vectors of sensor 1_ char _ embed and sensor 2_ char _ embed; for word vector mapping, for input sentences of presence 1_ word and presence 2_ word, obtaining their corresponding sentence vectors of presence 1_ word _ embedded and presence 2_ word _ embedded;
the deep semantic feature cube construction network module construction unit is responsible for constructing one-dimensional semantic information into a semantic feature cube, and specifically operates to receive word embedded representations output by the word vector mapping layer and word embedded representations output by the word vector mapping layer as input; the first layer of coding structure respectively carries out coding operation on the word embedding expression and the word embedding expression output by the multi-granularity embedding module so as to obtain a first layer of word coding result and a first layer of word coding result; newly adding two dimensions, namely a first layer word coding result and a first layer word coding result, and defining the two dimensions as a granularity dimension and a depth dimension; the first layer word coding result and the first layer word coding result are connected in granularity dimension to generate a first layer semantic feature cube, and meanwhile, the first layer word coding result and the first layer word coding result before dimension increase are transmitted to a second layer coding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result so as to obtain a second layer character coding result and a second layer word coding result; adding two dimensions, namely a granularity dimension and a depth dimension, to the second layer word coding result and the second layer word coding result; the second layer word coding result and the second layer word coding result are connected in the granularity dimension to generate a second level semantic feature cube, and meanwhile, the second layer word coding result and the second layer word coding result before the dimension is increased are transmitted to a third layer coding structure; by analogy, the multi-level semantic feature cube can be generated by repeatedly encoding for many times; after the semantic feature cubes of each level are obtained, the semantic feature cubes of all levels are connected on a depth dimension to generate a final deep semantic feature cube;
the feature conversion network module construction unit is responsible for further processing a deep semantic feature cube of a corresponding sentence and carrying out semantic feature coding, semantic feature matching, semantic feature screening and other operations on the deep semantic feature cube so as to generate a final sentence to semantic matching tensor; the corresponding operation is realized through a three-dimensional convolution semantic feature coding layer, a semantic feature matching layer and a semantic feature screening layer respectively;
the tag prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and the matching degree value is compared with an established threshold value so as to judge whether the semantics of the sentence pair are matched or not;
and the sentence-to-semantic matching model training unit is used for constructing a loss function required in the model training process and finishing the optimization training of the model.
Preferably, the sentence-pair semantic matching knowledge base construction unit includes,
the sentence-pair data acquisition unit, which is responsible for downloading a publicly available sentence-pair semantic matching data set or a manually constructed data set from the network, to be used as the original data for constructing the sentence-pair semantic matching knowledge base;
the original data word-breaking/word-segmentation preprocessing unit, which is responsible for preprocessing the original data used for constructing the sentence-pair semantic matching knowledge base, performing word-breaking and word-segmentation operations on each sentence in it, so as to construct the sentence-pair semantic matching word-breaking knowledge base and the sentence-pair semantic matching word-segmentation knowledge base;
and the sub-knowledge base summarizing unit, which is responsible for gathering the sentence-pair semantic matching word-breaking knowledge base and the sentence-pair semantic matching word-segmentation knowledge base together to construct the sentence-pair semantic matching knowledge base.
The training data set generation unit includes,
the training positive example data construction unit, which is responsible for combining two sentences with consistent semantics in the sentence-pair semantic matching knowledge base, together with their matching label 1, into training positive example data;
the training negative example data construction unit, which is responsible for selecting one sentence, randomly selecting a sentence that does not match it for combination, and constructing them together with the matching label 0 into negative example data;
and the training data set construction unit, which is responsible for combining all training positive example data and training negative example data and shuffling their order to construct the final training data set;
The sentence-pair semantic matching model training unit includes,
the loss function construction unit, which is responsible for computing the error of the semantic matching degree between sentence1 and sentence2;
and the model optimization unit, which is responsible for training and tuning the parameters during model training to reduce the prediction error.
A storage medium, in which a plurality of instructions are stored, the instructions being loaded by a processor to execute the steps of the above sentence-pair semantic matching method for intelligent question answering based on a semantic feature cube.
An electronic device, comprising: the above storage medium; and a processor for executing the instructions in the storage medium.
The sentence-pair semantic matching method for intelligent question answering based on a semantic feature cube has the following advantages:
through the deep semantic feature cube construction network, the invention can assemble one-dimensional semantic information into a three-dimensional cube form, so that the choice of semantic encoding networks becomes wider; deeper semantic features can be captured, and the depth can be controlled freely, so the structure can flexibly adapt to different data sets;
the method semantically encodes sentences with a three-dimensional convolutional neural network, so the information along the temporal dimension of the sentence can be mined and analysed in a targeted way, improving the accuracy of sentence-pair semantic matching; the generated sentence-pair matching tensor contains richer temporal features, which improves the prediction accuracy of the model;
the method matches the semantics of sentence pairs with a two-dimensional convolutional neural network, which effectively captures the interaction features between the two sentences; the generated sentence-pair matching tensor contains rich interaction features, which improves the prediction accuracy of the model.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a sentence-to-semantic matching method for intelligent question answering based on a semantic feature cube;
FIG. 2 is a flow chart of building a sentence-to-semantic matching knowledge base;
FIG. 3 is a flow chart for constructing a training data set;
FIG. 4 is a flow chart for constructing a sentence-to-semantic matching model;
FIG. 5 is a flow chart of training a sentence-to-semantic matching model;
FIG. 6 is a schematic structural diagram of a sentence-to-semantic matching device for intelligent question answering based on a semantic feature cube;
FIG. 7 is a schematic structural diagram of a deep semantic feature cube construction network;
FIG. 8 is a frame diagram of a sentence-to-semantic matching model based on a semantic feature cube for intelligent question answering.
Detailed Description
The sentence-pair semantic matching method for intelligent question answering based on a semantic feature cube according to the present invention is described in detail below with reference to the drawings and specific embodiments of the specification.
Example 1:
As shown in FIG. 8, the main framework of the present invention comprises a multi-granularity embedding module, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module. The multi-granularity embedding module embeds the input sentence at character granularity and word granularity respectively and passes the results to the deep semantic feature cube construction network module of the model.

The deep semantic feature cube construction network module comprises several layers of encoding structures, as shown in FIG. 7. The first-layer encoding structure encodes the character embedding representation and the word embedding representation output by the multi-granularity embedding module respectively, obtaining the first-layer character encoding result and the first-layer word encoding result; two new dimensions, defined as the granularity dimension and the depth dimension, are added to them; the dimension-raised first-layer character and word encoding results are concatenated along the granularity dimension to generate the first-level semantic feature cube, while the first-layer character and word encoding results before dimension raising are passed to the second-layer encoding structure. The second-layer encoding structure encodes the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result and the second-layer word encoding result; two dimensions, the granularity dimension and the depth dimension, are added to them; the dimension-raised second-layer character and word encoding results are concatenated along the granularity dimension to generate the second-level semantic feature cube, while the second-layer character and word encoding results before dimension raising are passed to the third-layer encoding structure. By analogy, repeating the encoding several times generates semantic feature cubes of multiple levels; after the semantic feature cubes of all levels are obtained, they are concatenated along the depth dimension to generate the final deep semantic feature cube, which is passed to the feature conversion network module of the model.

The feature conversion network module further performs feature encoding, feature matching and feature screening operations on the deep semantic feature cube: the feature encoding operation is performed by a three-dimensional convolutional neural network; the feature matching operation is completed by a three-dimensional convolutional neural network and a two-dimensional convolutional neural network; the feature screening operation is realized by an attention mechanism. Finally the matching tensor of the sentence pair is obtained and passed to the label prediction module of the model.

The label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval, compares this value, taken as the matching degree, with a preset threshold, and judges from the comparison result whether the semantics of the sentence pair match.
The method comprises the following specific steps:
(1) the multi-granularity embedding module embeds the input sentence at character granularity and word granularity respectively, to obtain a multi-granularity embedded representation of the sentence;
(2) the deep semantic feature cube construction network module performs encoding operations on the multi-granularity embedded representation of the sentence to obtain the deep semantic feature cube of the sentence;
(3) the feature conversion network module further performs feature encoding, feature matching and feature screening operations on the deep semantic feature cubes of the sentence pair to obtain the matching tensor of the sentence pair;
(4) the label prediction module maps the matching tensor of the sentence pair to a floating-point value in a specified interval, compares this value, taken as the matching degree, with a preset threshold, and judges from the comparison result whether the semantics of the sentence pair match.
Example 2:
As shown in FIG. 1, the sentence-pair semantic matching method for intelligent question answering based on a semantic feature cube comprises the following specific steps:
S1, constructing the sentence-pair semantic matching knowledge base, as shown in FIG. 2, specifically comprising the following steps:
S101, downloading a data set from the network to obtain the original data: a publicly available sentence-pair semantic matching data set, or a manually constructed data set, is downloaded and used as the original data for constructing the sentence-pair semantic matching knowledge base.
For example, there are many publicly released sentence-pair semantic matching data sets oriented to intelligent question-answering systems on the network, such as the LCQMC data set [Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, and Buzhou Tang. LCQMC: A Large-scale Chinese Question Matching Corpus. COLING 2018]. The present invention collects and downloads such data to obtain the raw data used to build the sentence-pair semantic matching knowledge base.
Sentence pairs in the LCQMC dataset are illustrated as follows:
sentence1: what is the most sad song in the world?
sentence2: What songs are the most sad in the world?
S102, preprocessing the original data: preprocess the original data used for constructing the sentence-to-semantic matching knowledge base, performing a word-breaking operation and a word-segmentation operation on each sentence to obtain the sentence-to-semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base.
Word-breaking preprocessing and word-segmentation preprocessing are performed on each sentence in the original data obtained in step S101 for constructing the sentence-to-semantic matching knowledge base, so as to obtain the sentence-to-semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base. The word-breaking operation is as follows: each character in the Chinese sentence is taken as a unit, and each sentence is segmented with a space as the separator. The word-segmentation operation is as follows: each sentence is segmented with the Jieba word-segmentation tool in its default accurate mode. In both operations, all punctuation, special characters and stop words in the sentence are preserved in order to avoid loss of semantic information.
Examples are: taking sentence1 shown in S101 as an example, the word-breaking operation yields "世 界 上 最 伤 感 的 歌 曲 是 什 么 ？" (each character of "what is the most sad song in the world?" separated by a space); the Jieba word-segmentation tool in accurate mode yields "世界 上 最 伤感 的 歌曲 是 什么 ？".
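For illustration, a minimal Python sketch of the two preprocessing operations (the variable names are illustrative; only the use of Jieba in its default accurate mode is specified by the invention):
import jieba

sentence = "世界上最伤感的歌曲是什么？"
# Word-breaking operation: take every character as a unit and join with spaces
char_result = " ".join(list(sentence))
# Word-segmentation operation: Jieba in default accurate mode; punctuation and stop words are kept
word_result = " ".join(jieba.cut(sentence, cut_all=False))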
S103, summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word-breaking processing knowledge base and a sentence-to-semantic matching word-segmentation processing knowledge base to construct a sentence-to-semantic matching knowledge base.
And summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base obtained in the step S102 to the same folder, so as to obtain the sentence-to-semantic matching knowledge base. The flow is shown in fig. 2. It should be noted here that the data processed by the word-breaking operation and the data processed by the word-segmentation operation are not merged into the same file, i.e., the sentence-to-semantic matching knowledge base actually contains two independent sub-knowledge bases. Each preprocessed sentence retains the ID information of its original sentence.
S2, constructing a training data set of the sentence-to-semantic matching model: for each sentence pair in the sentence pair semantic matching knowledge base, if the semantics are consistent, the sentence pair can be used for constructing a training positive example; if the semantics are inconsistent, the sentence pair can be used for constructing a training negative example; mixing a certain amount of positive example data and negative example data to construct a model training data set; as shown in fig. 3, the specific steps are as follows:
S201, constructing a training positive example: construct a sentence pair whose two sentences are semantically consistent in the sentence-to-semantic matching knowledge base into a positive example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1);
Examples are: after the word-breaking operation and the word-segmentation operation of step S102 are performed on sentence1 and sentence2 shown in step S101, the positive example data is constructed as follows:
("世 界 上 最 伤 感 的 歌 曲 是 什 么 ？", "世 界 上 最 伤 感 的 是 什 么 歌 ？", "世界 上 最 伤感 的 歌曲 是 什么 ？", "世界 上 最 伤感 的 是 什么 歌 ？", 1).
S202, constructing a training negative example: select a sentence s1, randomly select from the sentence-to-semantic matching knowledge base a sentence s2 whose semantics do not match s1, combine s1 and s2, and construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0);
Examples are: take a sentence pair with mismatched semantics from the LCQMC dataset, such as "sentence1: What is a smart band? sentence2: What is the smart band used for?" After the word-breaking operation and the word-segmentation operation of step S102, the negative example data is constructed as follows:
(the word-broken form of sentence1, the word-broken form of sentence2, the word-segmented form of sentence1, the word-segmented form of sentence2, 0).
In the LCQMC dataset, the ratio of positive to negative sentence pairs is 1.38: 1.
S203, constructing a training data set: all positive example sentence-pair data and negative example sentence-pair data obtained after the operations of step S201 and step S202 are combined together and their order is shuffled, so as to construct the final training data set. Whether positive example data or negative example data, each sample contains five fields: sentence1_char, sentence2_char, sentence1_word, sentence2_word, and the label 0 or 1.
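A minimal sketch of assembling the training data set as described above (assuming each sample is already a five-field tuple; the function and seed are illustrative):
import random

def build_training_set(positive_samples, negative_samples, seed=1234):
    # Each sample: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, label)
    dataset = positive_samples + negative_samples
    random.Random(seed).shuffle(dataset)  # shuffle the combined positive and negative examples
    return dataset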
S3, constructing a sentence-to-semantic matching model: this mainly comprises constructing a character mapping conversion table, constructing an input module, constructing a character vector mapping layer, constructing a word mapping conversion table, constructing a word vector mapping layer, constructing a deep semantic feature cube construction network module, constructing a feature conversion network module, and constructing a label prediction module. The character mapping conversion table, the word mapping conversion table, the input module, the character vector mapping layer and the word vector mapping layer correspond to the multi-granularity embedding module in fig. 8, and the remaining parts correspond one by one to the modules in fig. 8. The method comprises the following specific steps:
S301, constructing a character mapping conversion table: the character table is constructed from the sentence-to-semantic matching word-breaking processing knowledge base obtained in step S102. After the character table is constructed, each character in the table is mapped to a unique numeric identifier; the mapping rule is: starting from the number 1, the characters are numbered in ascending order according to the order in which each character is recorded into the character table, thereby forming the character mapping conversion table required by the present invention.
Examples are: with sentence1 processed in step S102, "世 界 上 最 伤 感 的 歌 曲 是 什 么 ？" (i.e., "what is the most sad song in the world?"), the character table and character mapping conversion table are constructed as follows:
Character 世 界 上 最 伤 感 的 歌 曲 是 什 么 ？
Mapping 1 2 3 4 5 6 7 8 9 10 11 12 13
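A small sketch of the mapping rule described in S301, assuming characters are numbered from 1 in the order in which they are first recorded into the character table (the function name is illustrative):
def build_char_mapping(word_broken_sentences):
    char_to_id = {}
    for sent in word_broken_sentences:      # sentences from the word-breaking processing knowledge base
        for ch in sent.split():             # characters are separated by spaces
            if ch not in char_to_id:
                char_to_id[ch] = len(char_to_id) + 1   # identifiers start from the number 1
    return char_to_id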
Then, the invention trains a character vector model with Word2Vec to obtain the character vector matrix char_embedding_matrix.
For example, in Keras the above can be implemented as follows:
# Train a character-level Word2Vec model on the word-breaking corpus (gensim)
w2v_model_char = gensim.models.Word2Vec(w2v_corpus_char, size=char_embedding_dim, window=5, min_count=1, sg=1, workers=4, seed=1234, iter=25)
# Character-level tokenizer; fitting it builds the character-to-index mapping (word_index)
tokenizer = keras.preprocessing.text.Tokenizer(num_words=len(char_set))
tokenizer.fit_on_texts(w2v_corpus_char)
# Row idx of the matrix holds the vector of the character whose numeric identifier is idx
char_embedding_matrix = numpy.zeros([len(tokenizer.word_index) + 1, char_embedding_dim])
for char, idx in tokenizer.word_index.items():
    char_embedding_matrix[idx, :] = w2v_model_char.wv[char]
where w2v_corpus_char is the word-breaking training corpus, i.e., all data in the sentence-to-semantic matching word-breaking processing knowledge base; char_embedding_dim is the character vector dimension, which the model sets to 400; and char_set is the character table.
S302, constructing a word mapping conversion table: the word table is constructed from the sentence-to-semantic matching word-segmentation processing knowledge base obtained in step S102. After the word table is constructed, each word in the table is mapped to a unique numeric identifier; the mapping rule is: starting from the number 1, the words are numbered in ascending order according to the order in which each word is recorded into the word table, thereby forming the word mapping conversion table required by the present invention.
Examples are: with sentence1 processed in step S102, "世界 上 最 伤感 的 歌曲 是 什么 ？", the word table and word mapping conversion table are constructed as follows:
Word 世界 上 最 伤感 的 歌曲 是 什么 ？
Mapping 1 2 3 4 5 6 7 8 9
Then, the Word vector model is trained by using Word2Vec to obtain a Word vector matrix Word _ embedding _ matrix of each Word.
For example: in Keras, the code implementation is basically the same as illustrated in S301, except that the char parameters are replaced by word parameters; for brevity it is not repeated here.
Specifically, w2v_corpus_char in S301 is replaced by w2v_corpus_word, the word-segmentation training corpus, i.e., all data in the sentence-to-semantic matching word-segmentation processing knowledge base; char_embedding_dim is replaced by word_embedding_dim, the word vector dimension, which the model sets to 400; and char_set is replaced by word_set, the word table.
S303, constructing an input layer: the input layer includes four inputs, from which a training data set sample is obtained, respectively, sensor 1_ char, sensor 2_ char, sensor 1_ word, and sensor 2_ word, formalized as: (sensor 1_ char, sensor 2_ char, sensor 1_ word, sensor 2_ word);
and for each character or word in the input sentence, converting the character or word into a corresponding numerical identifier according to the character mapping conversion table and the word mapping conversion table.
For example, the following steps are carried out: the sentence pair shown in step S201 is used as a sample to compose a piece of input data. The results are shown below:
("世 界 上 最 伤 感 的 歌 曲 是 什 么 ？", "世 界 上 最 伤 感 的 是 什 么 歌 ？", "世界 上 最 伤感 的 歌曲 是 什么 ？", "世界 上 最 伤感 的 是 什么 歌 ？").
Each piece of input data contains 4 clauses. The first two clauses are converted into numerical representations according to the character mapping conversion table of step S301; the latter two clauses are converted according to the word mapping conversion table of step S302 (assuming the word "歌" (song), which appears in sentence2 but not in sentence1, is mapped to 10). The combined representation of the 4 clauses of the input data is as follows:
(“1,2,3,4,5,6,7,8,9,10,11,12,13”,“1,2,3,4,5,6,7,10,11,12,8,13”,“1,2,3,4,5,6,7,8,9”,“1,2,3,4,5,7,8,10,9”)。
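A sketch of this numerical conversion using the mapping tables of S301/S302 (padding to a fixed input length is an assumption added here for the later Embedding layer):
from keras.preprocessing.sequence import pad_sequences

def to_ids(tokenized_sentence, mapping, max_len=None):
    ids = [mapping[token] for token in tokenized_sentence.split()]
    # Optionally pad/truncate to the model's fixed input length
    return pad_sequences([ids], maxlen=max_len)[0] if max_len else ids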
S304, constructing a character vector mapping layer: the weight parameters of this layer are initialized by loading the character vector matrix trained in the step of constructing the character mapping conversion table; for the input sentences sentence1_char and sentence2_char, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; every sentence in the sentence-to-semantic matching word-breaking processing knowledge base can have its information converted into vector form through character vector mapping.
For example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
char_embedding_layer = Embedding(char_embedding_matrix.shape[0], char_embedding_dim, weights=[char_embedding_matrix], input_length=input_dim, trainable=False)
where char_embedding_matrix is the weight of the character vector matrix trained in the step of constructing the character mapping conversion table, char_embedding_matrix.shape[0] is the size of the character table of the character vector matrix, char_embedding_dim is the dimension of the output character vectors, and input_length is the length of the input sequence.
The corresponding sentences sentence1_char and sentence2_char are processed by the Embedding layer of Keras to obtain the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed.
S305, constructing a word vector mapping layer: the weight parameters of this layer are initialized by loading the word vector matrix trained in the step of constructing the word mapping conversion table; for the input sentences sentence1_word and sentence2_word, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained; every sentence in the sentence-to-semantic matching word-segmentation processing knowledge base can have its information converted into vector form through word vector mapping.
For example, the following steps are carried out: in Keras, the code implementation described above is basically the same as in S304, except that the parameters are changed from char to word. For the sake of brevity, no further description is provided herein.
The corresponding sentences sentence1_word and sentence2_word are processed by the Embedding layer of Keras to obtain the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed.
S306, constructing a deep semantic feature cube construction network module: the structure is shown in fig. 7, and the specific steps are as follows:
The first-layer coding structure BiLSTM1 encodes the character embedding representation and the word embedding representation output by the multi-granularity embedding module respectively, obtaining the first-layer character encoding result H_char^1 and the first-layer word encoding result H_word^1; two dimensions are added to H_char^1 and H_word^1, obtaining the dimension-increased first-layer character encoding result C_char^1 and first-layer word encoding result C_word^1; the newly added dimensions are defined as the granularity dimension and the depth dimension; C_char^1 and C_word^1 are then connected in the granularity dimension to generate the first-level semantic feature cube Cube^1. The specific implementation is shown in the following formulas:
H_char^1 = BiLSTM_1(sentence_char_embed),  C_char^1 = reshape(H_char^1)    (1.1)
H_word^1 = BiLSTM_1(sentence_word_embed),  C_word^1 = reshape(H_word^1)    (1.2)
Cube^1 = concat_granularity(C_char^1, C_word^1)    (1.3)
where BiLSTM_1(·) produces a tensor of shape (batch_size, time_steps, output_dimension), the reshape operation adds the granularity dimension and the depth dimension so that C_char^1 and C_word^1 have shape (batch_size, time_steps, 1, output_dimension, 1), and Cube^1 has shape (batch_size, time_steps, 2, output_dimension, 1).
Further, the first-layer character encoding result and the first-layer word encoding result before dimension increase, namely H_char^1 and H_word^1, are passed to the second-layer coding structure BiLSTM2. BiLSTM2 encodes the first-layer character encoding result and the first-layer word encoding result respectively, obtaining the second-layer character encoding result H_char^2 and the second-layer word encoding result H_word^2; two dimensions are added to H_char^2 and H_word^2, obtaining the dimension-increased second-layer character encoding result C_char^2 and second-layer word encoding result C_word^2; the newly added dimensions are defined as the granularity dimension and the depth dimension; C_char^2 and C_word^2 are then connected in the granularity dimension to generate the second-level semantic feature cube Cube^2. The specific implementation is shown in the following formulas:
H_char^2 = BiLSTM_2(H_char^1),  C_char^2 = reshape(H_char^2)    (2.1)
H_word^2 = BiLSTM_2(H_word^1),  C_word^2 = reshape(H_word^2)    (2.2)
Cube^2 = concat_granularity(C_char^2, C_word^2)    (2.3)
Further, the second-layer character encoding result and the second-layer word encoding result before dimension increase, namely H_char^2 and H_word^2, are passed to the third-layer coding structure BiLSTM3. By analogy, repeated encoding generates the multi-level semantic feature cubes according to the preset hierarchy depth of the model, until the depth-level semantic feature cube is generated. For the depth-th layer, the implementation is as follows:
H_char^depth = BiLSTM_depth(H_char^(depth-1)),  C_char^depth = reshape(H_char^depth)    (3.1)
H_word^depth = BiLSTM_depth(H_word^(depth-1)),  C_word^depth = reshape(H_word^depth)    (3.2)
Cube^depth = concat_granularity(C_char^depth, C_word^depth)    (3.3)
Further, after the semantic feature cube of each level is obtained, the semantic feature cubes of all levels are connected in the depth dimension to generate the final deep semantic feature cube:
FeatureCube = concat_depth(Cube^1, Cube^2, …, Cube^depth)    (4)
For example: when the invention is implemented on the LCQMC data set, the number of layers of this structure is 2, and the optimal result is obtained when the encoding dimension of the BiLSTM in each layer is set to 300. In addition, in order to avoid over-fitting, a dropout strategy is used in each BiLSTM layer, and the optimal result is obtained when dropout is set to 0.15. In Keras, this construction can be implemented along the lines of the sketch below.
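A minimal Keras sketch of this two-layer construction; the Lambda helper for adding the granularity and depth dimensions, and the assumption that the character and word sequences are padded to the same length, are illustrative rather than part of the original listing.
from keras.layers import Bidirectional, LSTM, Lambda, Concatenate
import keras.backend as K

# Adds the granularity dimension (axis 2) and the depth dimension (axis 4) to a BiLSTM output
add_dims = Lambda(lambda t: K.expand_dims(K.expand_dims(t, axis=2), axis=4))

def build_feature_cube(sentence_embedded_char, sentence_embedded_word, layers=2, dim=300, drop=0.15):
    char_code, word_code = sentence_embedded_char, sentence_embedded_word
    level_cubes = []
    for _ in range(layers):
        # Each layer's BiLSTM encodes the previous layer's (pre-expansion) encoding results
        char_code = Bidirectional(LSTM(dim, return_sequences=True, dropout=drop))(char_code)
        word_code = Bidirectional(LSTM(dim, return_sequences=True, dropout=drop))(word_code)
        # Connect the dimension-increased character and word encodings in the granularity dimension
        level_cubes.append(Concatenate(axis=2)([add_dims(char_code), add_dims(word_code)]))
    # Connect the semantic feature cubes of all levels in the depth dimension
    return Concatenate(axis=4)(level_cubes)

feature_cube = build_feature_cube(sentence_embedded_char, sentence_embedded_word)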
where sentence_embedded_char is the character embedding representation of a sentence, sentence_embedded_word is the word embedding representation of the sentence, 300 is the encoding dimension of the BiLSTM, and feature_cube is the deep semantic feature cube of the corresponding sentence.
S307, constructing a feature conversion network module: after the processing of step S306, the deep semantic feature cube representations of sentence1 and sentence2 are obtained; semantic feature coding, semantic feature matching and semantic feature screening operations are performed on them to generate the final sentence-pair semantic matching tensor. The specific steps are as follows:
constructing a three-dimensional convolution semantic feature coding layer: this layer receives the deep semantic feature cube output by the deep semantic feature cube construction network module as its input, and then encodes it with a three-dimensional convolutional neural network, obtaining the corresponding semantic feature coded representation. The specific implementation is shown in the following formulas.
conv_f(i, j, k) = ReLU( W1_f * FeatureCube[i:i+x1-1, j:j+y1-1, k:k+z1-1] + b1_f )    (5.1)
E_f(k) = [ conv_f(i, j, k) ]_{i, j}, with horizontal convolution stride s_x1 and vertical convolution stride s_y1    (5.2)
E_f = [ E_f(k) ]_{k}, with depth convolution stride s_z1    (5.3)
E = [ E_1, E_2, …, E_n ]    (5.4)
where [x1, y1, z1] is the convolution kernel size, W1_f and b1_f are the weight and bias of the f-th convolution kernel, n is the number of convolution kernels, and E is the semantic feature coded representation.
For example: when the invention is implemented on the LCQMC dataset, the optimal result is obtained when [x1, y1, z1] is set to [2, 2, 5], s_x1, s_y1 and s_z1 are set to 1, 1 and 1 respectively, and n is set to 16.
In Keras, the implementation for the code described above is as follows:
encode_3DCNN = Conv3D(filters=16, kernel_size=(2, 2, 5), padding='valid', strides=[1, 1, 1], data_format='channels_last', activation='relu')(feature_cube)
the feature _ cube represents a deep semantic feature cube of a corresponding sentence, 16 represents that the convolutional neural network has 16 convolutional kernels, and encode _3DCNN represents an encoding result of the deep semantic feature cube of the corresponding sentence processed by the three-dimensional convolutional neural network.
Constructing a semantic feature matching layer: this layer first connects the semantic feature coded representations E_S1 and E_S2 of sentence1 and sentence2 in the depth dimension, obtaining the sentence-pair connection tensor Con. The specific implementation is shown in the following formula:
Con = concat_depth(E_S1, E_S2)    (6)
Subsequently, a semantic feature coding operation is performed on the sentence-pair connection tensor to generate the sentence-pair preliminary matching tensor; the specific process comprises the following two steps:
First, another three-dimensional convolutional neural network performs three-dimensional convolution matching processing on Con to obtain the sentence-pair three-dimensional convolution matching tensor M3D. The specific implementation is shown in the following formulas:
match_f(i, j, k) = ReLU( W2_f * Con[i:i+x2-1, j:j+y2-1, k:k+z2-1] + b2_f )    (7.1)
M_f(k) = [ match_f(i, j, k) ]_{i, j}, with horizontal convolution stride s_x2 and vertical convolution stride s_y2    (7.2)
M_f = [ M_f(k) ]_{k}, with depth convolution stride s_z2    (7.3)
M3D = [ M_1, M_2, …, M_n ]    (7.4)
For example: when the invention is implemented on the LCQMC dataset, the optimal result is obtained when [x2, y2, z2] is set to [5, 1, 2], s_x2, s_y2 and s_z2 are set to 1, 1 and 1 respectively, and n is set to 32.
In Keras, the implementation for the code described above is as follows:
sentence_pairs_con = Concatenate(axis=4)([encode_3DCNN_S1, encode_3DCNN_S2])
match_3DCNN = Conv3D(filters=32, kernel_size=(5, 1, 2), padding='valid', strides=[1, 1, 1], data_format='channels_last', activation='relu')(sentence_pairs_con)
where encode_3DCNN_S1 is the semantic feature coded representation of sentence1, encode_3DCNN_S2 is the semantic feature coded representation of sentence2, sentence_pairs_con is the result of connecting the coded representations of the two sentences in the depth dimension, 32 indicates that the convolutional neural network has 32 convolution kernels, and match_3DCNN is the sentence-pair three-dimensional convolution matching tensor.
Second, a two-dimensional convolutional neural network performs two-dimensional convolution matching processing on M3D to obtain the sentence-pair preliminary matching tensor M2D. The specific implementation is shown in the following formulas:
match2_f(i, j) = ReLU( W3_f * M3D[i:i+x3-1, j:j+y3-1] + b3_f )    (8.1)
M2_f = [ match2_f(i, j) ]_{i, j}, with horizontal convolution stride s_x3 and vertical convolution stride s_y3    (8.2)
M2D = [ M2_1, M2_2, …, M2_n ]    (8.3)
For example: when the invention is implemented on the LCQMC dataset, the optimal result is obtained when [x3, y3] is set to [25, 3], s_x3 and s_y3 are set to 1 and 1 respectively, and n is set to 64.
In Keras, the implementation for the code described above is as follows:
match_2DCNN = Conv2D(filters=64, kernel_size=(25, 3), padding='valid', strides=[1, 1], data_format='channels_last', activation='relu')(match_3DCNN)
where match_3DCNN is the sentence-pair three-dimensional convolution matching tensor, 64 indicates that the convolutional neural network has 64 convolution kernels, and match_2DCNN is the sentence-pair preliminary matching tensor.
Constructing a semantic feature screening layer: this layer receives the sentence-pair preliminary matching tensor output by the semantic feature matching layer as its input, and then performs the semantic feature screening and weighting operations; the formulas are as follows:
z = tanh(M2D · W) · V    (9.1)
a = softmax(z)    (9.2)
m = Σ_i a_i · M2D_i    (9.3)
where W and V are trainable weight matrices, a is the attention weight, and m is the final sentence-pair semantic matching tensor.
for example, the following steps are carried out: in Keras, the implementation for the code described above is as follows:
# Inside the call() of a custom attention layer; self.w and self.v are the trainable weights
sentence_output = match_2DCNN
# Formula (9.1): map the preliminary matching tensor with the trainable weights
z = tf.multiply(tf.tanh(K.dot(sentence_output, self.w)), self.v)
z = tf.squeeze(z, axis=-1)
# Formula (9.2): attention weights
a = tf.nn.softmax(z)
# Formula (9.3): weighted sum gives the final matching tensor
m = K.batch_dot(a, sentence_output)
the match _2DCNN represents the preliminary matching tensor of the sentence pair, self.w and self.v both refer to the weight matrix to be trained, and m represents the final matching tensor of the sentence pair after attention mechanism processing.
S308, constructing a label prediction module: the sentence-pair semantic matching tensor obtained in step S307 is used as the input of this module and is processed by a fully-connected layer with dimension 1 and a sigmoid activation function, obtaining a matching degree value in [0, 1], denoted y_pred; finally, whether the semantics of the sentence pair match is judged by comparing y_pred with the set threshold (0.5), i.e., when y_pred ≥ 0.5 the semantics of the sentence pair are predicted to match, otherwise they do not match.
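A minimal sketch of this prediction head, assuming the final sentence-pair matching tensor has already been flattened into a vector named matching_tensor (an illustrative name):
from keras.layers import Dense

# One fully-connected layer with dimension 1 and sigmoid activation gives the matching degree in [0, 1]
y_pred = Dense(1, activation='sigmoid')(matching_tensor)
# At prediction time the semantics are judged to match when y_pred >= 0.5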
When the sentence-to-semantic matching model based on the semantic feature cube provided by the invention has not yet been trained, step S4 needs to be executed to train it and optimize the model parameters; once the model has been trained, step S308 can predict whether the semantics of the target sentence pair match.
S4, training the sentence-to-semantic matching model: the sentence-to-semantic matching model constructed in step S3 is trained on the training data set obtained in step S2, as shown in fig. 5, specifically as follows:
S401, constructing a loss function: as known from the label prediction module construction process, y_pred is the matching degree value computed by the sentence-to-semantic matching model, and y_true is the real label indicating whether the semantics of the two sentences match, whose value is restricted to 0 or 1; cross entropy is used as the loss function, with the formula as follows:
loss = - [ y_true · log(y_pred) + (1 - y_true) · log(1 - y_pred) ]
For example: the loss function described above and its settings are expressed in Keras with the following code:
parallel_model.compile(loss="binary_crossentropy",optimizer=op,metrics=['accuracy',precision,recall,f1_score])
S402, optimizing the training model: RMSProp is used as the optimization algorithm; except for the learning rate, which is set to 0.0015, the remaining hyper-parameters of RMSProp use the default settings in Keras; the sentence-to-semantic matching model is optimally trained on the training data set.
for example, the following steps are carried out: the optimization function described above and its settings are expressed in Keras using code:
optim = keras.optimizers.RMSprop(lr=0.0015)
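A hedged sketch of the training call itself, assuming the compiled model of S401 and the numeric inputs built from the training data set (the array names, batch size and epoch count are illustrative and not specified by the invention):
parallel_model.fit(
    [sentence1_char_ids, sentence2_char_ids, sentence1_word_ids, sentence2_word_ids],
    labels,
    batch_size=128,        # illustrative value
    epochs=25,             # illustrative value
    validation_split=0.1,  # illustrative value
)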
the model provided by the invention obtains a result superior to the current advanced model on the LCQMC data set, and the comparison of the experimental results is specifically shown in the following table.
(Comparison table of experimental results on the LCQMC dataset: three prior-art models followed by the model of the present invention.)
The experimental results show that the model of the invention is greatly improved compared with the existing models. The first three rows of the table are the experimental results of prior-art models (taken from the LCQMC paper, Liu et al. 2018), and the last row is the experimental result of the model of the invention; it can be seen that the invention achieves a considerable improvement over the prior-art models.
Example 3:
as shown in fig. 6, the sentence-to-semantic matching apparatus based on the semantic feature cube for intelligent question answering in embodiment 2 comprises,
the sentence-to-semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data and then carrying out preprocessing operation on the sentence pair data so as to obtain a sentence-to-semantic matching knowledge base which meets the training requirement; the sentence-to-semantic matching knowledge base construction unit includes,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manually constructed data set which is already disclosed on a network, and the sentence pair data set is used as original data for constructing a sentence pair semantic matching knowledge base;
the system comprises an original data word-breaking preprocessing or word-segmentation preprocessing unit, a word-breaking or word-segmentation preprocessing unit and a word-segmentation processing unit, wherein the original data word-breaking preprocessing or word-segmentation preprocessing unit is responsible for preprocessing original data used for constructing a sentence-to-semantic matching knowledge base, and performing word-breaking or word-segmentation operation on each sentence in the original data word-breaking or word-segmentation preprocessing unit so as to construct a sentence-to-semantic matching word-breaking processing knowledge base or a sentence-;
and the sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base so as to construct the sentence-to-semantic matching knowledge base.
A training data set generating unit for constructing positive case data and negative case data for training according to sentences in the sentence-to-semantic matching knowledge base, and constructing a final training data set based on the positive case data and the negative case data; the training data set generating unit comprises a training data set generating unit,
the training positive case data construction unit is responsible for constructing two sentences with consistent semantics in the sentence-to-semantic matching knowledge base and the matching labels 1 thereof into training positive case data;
the training negative case data construction unit is responsible for selecting one sentence, randomly selecting a sentence which is not matched with the sentence for combination, and constructing the sentence and the matched label 0 into negative case data;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disordering the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence pair semantic matching model construction unit is used for constructing a word mapping conversion table and a word mapping conversion table, and simultaneously constructing an input module, a word vector mapping layer, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module; the sentence-to-semantic matching model construction unit includes,
the word mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the semantic matching knowledge base by the sentence according to the word granularity or the word granularity, sequentially storing each word or word in a list to obtain a word list or word list, and sequentially increasing and sequencing the words or words according to the sequence of the words or word lists recorded by the words or words with the number 1 as the start to form the word mapping conversion table or word mapping conversion table required by the invention; after the word mapping conversion table or the word mapping conversion table is constructed, each word or word in the table is mapped into a unique digital identifier; then, the Word vector model or the Word vector model is trained by using Word2Vec to obtain a Word vector matrix of each Word or a Word vector matrix of each Word;
the input module construction unit is responsible for preprocessing each sentence pair or sentence pair to be predicted in the training data set, respectively acquiring sensor 1_ char, sensor 2_ char, sensor 1_ word and sensor 2_ word, and formalizing the words as follows: (sensor 1_ char, sensor 2_ char, sensor 1_ word, sensor 2_ word);
the word vector mapping layer or word vector mapping layer construction unit is responsible for loading a word vector matrix or word vector matrix obtained by training in the step of the word mapping conversion table or word mapping conversion table construction unit to initialize the weight parameters of the current layer; for word vector mapping, aiming at input sentences of sensor 1_ char and sensor 2_ char, obtaining corresponding sentence vectors of sensor 1_ char _ embed and sensor 2_ char _ embed; for word vector mapping, for input sentences of presence 1_ word and presence 2_ word, obtaining their corresponding sentence vectors of presence 1_ word _ embedded and presence 2_ word _ embedded;
the deep semantic feature cube construction network module construction unit is responsible for constructing one-dimensional semantic information into a semantic feature cube, and specifically operates to receive word embedded representations output by the word vector mapping layer and word embedded representations output by the word vector mapping layer as input; the first layer of coding structure respectively carries out coding operation on the word embedding expression and the word embedding expression output by the multi-granularity embedding module so as to obtain a first layer of word coding result and a first layer of word coding result; newly adding two dimensions, namely a first layer word coding result and a first layer word coding result, and defining the two dimensions as a granularity dimension and a depth dimension; the first layer word coding result and the first layer word coding result are connected in granularity dimension to generate a first layer semantic feature cube, and meanwhile, the first layer word coding result and the first layer word coding result before dimension increase are transmitted to a second layer coding structure; the second layer coding structure respectively carries out coding operation on the first layer character coding result and the first layer word coding result so as to obtain a second layer character coding result and a second layer word coding result; adding two dimensions, namely a granularity dimension and a depth dimension, to the second layer word coding result and the second layer word coding result; the second layer word coding result and the second layer word coding result are connected in the granularity dimension to generate a second level semantic feature cube, and meanwhile, the second layer word coding result and the second layer word coding result before the dimension is increased are transmitted to a third layer coding structure; by analogy, the multi-level semantic feature cube can be generated by repeatedly encoding for many times; after the semantic feature cubes of each level are obtained, the semantic feature cubes of all levels are connected on a depth dimension to generate a final deep semantic feature cube;
the feature conversion network module construction unit is responsible for further processing a deep semantic feature cube of a corresponding sentence and carrying out semantic feature coding, semantic feature matching, semantic feature screening and other operations on the deep semantic feature cube so as to generate a final sentence to semantic matching tensor; the corresponding operation is realized through a three-dimensional convolution semantic feature coding layer, a semantic feature matching layer and a semantic feature screening layer respectively;
the tag prediction module unit is responsible for processing the semantic matching tensor of the sentence pair so as to obtain a matching degree value, and the matching degree value is compared with an established threshold value so as to judge whether the semantics of the sentence pair are matched or not;
the sentence-to-semantic matching model training unit is used for constructing a loss function required in the model training process and finishing the optimization training of the model; the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
the model optimization unit is responsible for training and adjusting parameters in model training to reduce prediction errors;
example 4:
the storage medium according to embodiment 2, in which a plurality of instructions are stored, the instructions being loaded by a processor, and the steps of the sentence-to-semantic matching method for intelligent question answering based on semantic feature cubes according to embodiment 2 are executed.
Example 5:
the electronic device according to embodiment 4, the electronic device comprising: the storage medium of example 4; and a processor for executing the instructions in the storage medium of embodiment 4.

Claims (10)

1. A sentence pair semantic matching method facing intelligent question answering and based on a semantic feature cube is characterized in that a sentence pair semantic matching model consisting of a multi-granularity embedding module, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module is constructed and trained to realize deep semantic feature cube representation of sentence information and three-dimensional convolutional coding representation of semantic features, and meanwhile, a final matching tensor of a sentence pair is generated through an attention mechanism and the matching degree of the sentence pair is judged so as to achieve the aim of intelligent semantic matching of the sentence pair; the method comprises the following specific steps:
the multi-granularity embedding module is used for embedding the input sentences at character granularity and word granularity respectively to obtain the multi-granularity embedded representation of the sentences;
the deep semantic feature cube construction network module carries out coding operation on the multi-granularity embedded representation of the sentence to obtain a deep semantic feature cube of the sentence;
the feature conversion network module further performs feature coding, feature matching and feature screening operations on the deep semantic feature cube of the sentence pair to obtain a matching vector of the sentence pair;
and the tag prediction module maps the matching tensor of the sentence pair into a floating point type numerical value in the designated interval, compares the floating point type numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the sentence pair are matched or not according to the comparison result.
2. The intelligent question answering oriented sentence-pair semantic matching method according to claim 1, wherein the multi-granularity embedding module is used for constructing a character mapping conversion table, a word mapping conversion table, an input module, a character vector mapping layer and a word vector mapping layer;
wherein, constructing a character mapping conversion table or word mapping conversion table: the mapping rule is: starting from the number 1, the characters or words are numbered in ascending order according to the order in which each character or word is recorded into the character table or word table, thereby forming the character mapping conversion table or word mapping conversion table required by the invention; the character table or word table is constructed from the sentence-to-semantic matching word-breaking processing knowledge base or word-segmentation processing knowledge base, which are obtained by performing word-breaking preprocessing or word-segmentation preprocessing on the original data text of the sentence-to-semantic matching knowledge base respectively; then, Word2Vec is used to train a character vector model or word vector model to obtain the character vector matrix of each character or the word vector matrix of each word;
constructing an input module: the input layer includes four inputs; each sentence pair in the training data set or sentence pair to be predicted is subjected to word-breaking and word-segmentation preprocessing to obtain sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, where the suffixes char and word denote the word-breaking or word-segmentation processing of the corresponding sentence, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word); each character or word in the input sentences is converted into the corresponding numeric identifier according to the character mapping conversion table and the word mapping conversion table;
constructing a character vector mapping layer or word vector mapping layer: the weight parameters of the current layer are initialized by loading the character vector matrix or word vector matrix trained in the step of constructing the character mapping conversion table or word mapping conversion table; for character vector mapping, the corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained for the input sentences sentence1_char and sentence2_char; for word vector mapping, the corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained for the input sentences sentence1_word and sentence2_word.
3. The sentence-to-semantic matching method for intelligent question answering based on the semantic feature cube according to claim 1 or 2, wherein the construction process of the deep semantic feature cube construction network module is as follows:
the first-layer coding structure BiLSTM1 encodes the character embedding representation and the word embedding representation output by the multi-granularity embedding module respectively, obtaining the first-layer character encoding result H_char^1 and the first-layer word encoding result H_word^1; two dimensions are added to H_char^1 and H_word^1, obtaining the dimension-increased first-layer character encoding result C_char^1 and first-layer word encoding result C_word^1; the newly added dimensions are defined as the granularity dimension and the depth dimension; C_char^1 and C_word^1 are then connected in the granularity dimension to generate the first-level semantic feature cube Cube^1; the formulas are as follows:
H_char^1 = BiLSTM_1(sentence_char_embed),  C_char^1 = reshape(H_char^1)    (1.1)
H_word^1 = BiLSTM_1(sentence_word_embed),  C_word^1 = reshape(H_word^1)    (1.2)
Cube^1 = concat_granularity(C_char^1, C_word^1)    (1.3)
wherein formula (1.1) indicates that BiLSTM1 is used to encode the character embedding representation output by the multi-granularity embedding module and that two dimensions are then added by a reshape operation; sentence_char_embed denotes sentence1_char_embed or sentence2_char_embed; sentence_char_embed is processed by BiLSTM1 to obtain a tensor of shape (batch_size, time_steps, output_dimension), and the reshape operation then yields a tensor of shape (batch_size, time_steps, 1, output_dimension, 1), in which the newly added third dimension is called the granularity dimension and the newly added fifth dimension is called the depth dimension; this tensor C_char^1 is the dimension-increased first-layer character encoding result; formula (1.2) indicates that BiLSTM1 is used to encode the word embedding representation output by the multi-granularity embedding module and that two dimensions are added by a reshape operation, where sentence_word_embed denotes sentence1_word_embed or sentence2_word_embed and C_word^1, the dimension-increased first-layer word encoding result, is obtained in the same way; formula (1.3) indicates that the dimension-increased first-layer character encoding result and first-layer word encoding result are connected in the granularity dimension to obtain the first-level semantic feature cube Cube^1, whose shape is (batch_size, time_steps, 2, output_dimension, 1);
the first layer word encoding result before dimension increment and the first layer word encoding result, i.e.
Figure FDA0002646398050000032
And
Figure FDA0002646398050000033
delivery to the second layer coding Structure BilsTM2;BiLSTM2Respectively carrying out coding operation on the coding result of the first layer of characters and the coding result of the first layer of words to obtain a coding result of the second layer of characters and a coding result of the second layer of words, and marking the coding results as the coding results of the second layer of words
Figure FDA0002646398050000034
And
Figure FDA0002646398050000035
Figure FDA0002646398050000036
and
Figure FDA0002646398050000037
adding two dimensions to obtain the second layer word coding result and the second layer word coding result after dimension increase, and recording the results as
Figure FDA0002646398050000038
And
Figure FDA0002646398050000039
the newly added dimensions are defined as a granularity dimension and a depth dimension, and then connected in the granularity dimension
Figure FDA00026463980500000310
And
Figure FDA00026463980500000311
to generate a second hierarchical semantic feature cube; the formula is as follows:
Figure FDA00026463980500000312
Figure FDA00026463980500000313
Figure FDA00026463980500000314
wherein, the meaning of the formula (2.1) is similar to the formula (1.1) except that the BilSTM in the formula2The coded object is the first layer word coding result before dimension increase, wherein,
Figure FDA00026463980500000315
to represent
Figure FDA00026463980500000316
Through BiLSTM2Processing and newly adding a second layer word coding result after two dimensions; the meaning of equation (2.2) is similar to equation (1.2) except that BilSTM in this equation2The coded object is a first-layer word coding result before dimension increasing, wherein,
Figure FDA00026463980500000317
to represent
Figure FDA00026463980500000318
Through BiLSTM2Processing and adding a second layer word coding result after two dimensions;
Figure FDA00026463980500000319
and
Figure FDA00026463980500000320
the obtaining step and shape are all the same as
Figure FDA00026463980500000321
And
Figure FDA00026463980500000322
the consistency is achieved; the meaning of the formula (2.3) is similar to that of the formula (1.3), except that the objects connected by the formula are the second layer word encoding result and the second layer word encoding result after dimension increase, wherein,
Figure FDA00026463980500000323
representing a second level semantic feature cube whose shape is (batch _ size, time _ steps,2, output _ dimension, 1);
the second layer word coding result before dimension increment and the second layer word coding result, namely
Figure FDA00026463980500000324
And
Figure FDA00026463980500000325
to the third layer of coding structure BilSTM3(ii) a By analogy, the multi-level semantic feature cube can be generated by repeatedly encoding for multiple times; according to the preset hierarchy depth of the model, until a depth level semantic feature cube is generated; for the depth layer, the formula is as follows:
Figure FDA00026463980500000326
Figure FDA00026463980500000327
Figure FDA0002646398050000041
wherein, the meaning of formula (3.1) is similar to that of formula (2.1) except that BilSTM in the formuladepthThe coded object is the result of encoding the depth-1 layer word before dimension increase, wherein,
Figure FDA0002646398050000042
to represent
Figure FDA0002646398050000043
Through BiLSTMdepthProcessing and newly adding a depth layer word coding result after two dimensions; the meaning of equation (3.2) is similar to equation (2.2) except that BilSTM in this equationdepthThe coded object is the coded result of the depth-1 layer word before dimension increase, wherein,
Figure FDA0002646398050000044
to represent
Figure FDA0002646398050000045
Through BiLSTMdepthProcessing and newly adding a depth layer word coding result after two dimensions;
Figure FDA0002646398050000046
and
Figure FDA0002646398050000047
the obtaining step and shape are all the same as
Figure FDA0002646398050000048
And
Figure FDA0002646398050000049
the consistency is achieved; the meaning of the formula (3.3) is similar to that of the formula (2.3), except that the object connected by the formula is the depth-1 layer word encoding result and the depth-1 layer word encoding result after dimension increaseAs a result, among other things,
Figure FDA00026463980500000410
representing a depth level semantic feature cube, the shape of which is (batch _ size, time _ steps,2, output _ dimension, 1);
after the semantic feature cubes of each level are obtained, the semantic feature cubes of all levels are connected on the depth dimension to generate a final deep semantic feature cube, and the formula is as follows:
Figure FDA00026463980500000411
wherein equation (4) represents a semantic feature cube joining all levels in the depth dimension,
Figure FDA00026463980500000412
represents the final deep semantic feature cube with shape (batch _ size, time _ steps,2, output _ dimension, depth).
4. The intelligent question-answering based sentence pair semantic matching method according to claim 3, wherein the construction process of the feature conversion network module is as follows:
constructing a three-dimensional convolution semantic feature coding layer: this layer receives the deep semantic feature cube output by the deep semantic feature cube construction network module as input, and then encodes it with a three-dimensional convolutional neural network to obtain the corresponding semantic feature coded representation; the formulas are as follows:
conv_f(i, j, k) = ReLU( W1_f * FeatureCube[i:i+x1-1, j:j+y1-1, k:k+z1-1] + b1_f )    (5.1)
E_f(k) = [ conv_f(i, j, k) ]_{i, j}, with horizontal convolution stride s_x1 and vertical convolution stride s_y1    (5.2)
E_f = [ E_f(k) ]_{k}, with depth convolution stride s_z1    (5.3)
E = [ E_1, E_2, …, E_n ]    (5.4)
wherein the deep semantic feature cube FeatureCube is the input of this layer; formula (5.1) represents the result, mapped by the ReLU function, of the f-th convolution kernel convolving a specific region of the deep semantic feature cube, where [x1, y1, z1] is the size of the convolution kernel, W1_f is the weight matrix of the f-th convolution kernel, b1_f is its bias matrix, i, j and k are the abscissa, ordinate and depth coordinate of the convolution region, m_l, m_h and m_d are the length, height and depth of the deep semantic feature cube, and i:i+x1-1, j:j+y1-1, k:k+z1-1 denotes the convolution region; formula (5.2) represents integrating the horizontal and vertical convolution results of the f-th convolution kernel in each region to obtain its k-th depth convolution result E_f(k), where s_x1 and s_y1 are the horizontal and vertical convolution strides; formula (5.3) represents integrating all depth convolution results of the f-th convolution kernel to obtain its overall convolution result E_f, where s_z1 is the depth convolution stride; formula (5.4) represents integrating the convolution results of all n convolution kernels to obtain the final convolution result E of this layer network on the deep semantic feature cube, which is called the semantic feature coded representation;
constructing a semantic feature matching layer: this layer first connects the semantic feature coded representations E_S1 and E_S2 of sentence1 and sentence2 in the depth dimension, obtaining the sentence-pair connection tensor Con; the formula is as follows:
Con = concat_depth(E_S1, E_S2)    (6)
subsequently, a semantic feature coding operation is performed on the sentence-pair connection tensor to generate the sentence-pair preliminary matching tensor; the specific process comprises the following two steps:
first, another three-dimensional convolutional neural network performs three-dimensional convolution matching processing on Con to obtain the sentence-pair three-dimensional convolution matching tensor M3D; the formulas are as follows:
match_f(i, j, k) = ReLU( W2_f * Con[i:i+x2-1, j:j+y2-1, k:k+z2-1] + b2_f )    (7.1)
M_f(k) = [ match_f(i, j, k) ]_{i, j}, with horizontal convolution stride s_x2 and vertical convolution stride s_y2    (7.2)
M_f = [ M_f(k) ]_{k}, with depth convolution stride s_z2    (7.3)
M3D = [ M_1, M_2, …, M_n ]    (7.4)
wherein the sentence-pair connection tensor Con is the input of this step; formula (7.1) represents the result, mapped by the ReLU function, of the f-th convolution kernel convolving a specific region of the connection tensor, where [x2, y2, z2] is the size of the convolution kernel, W2_f is the weight matrix of the f-th convolution kernel, b2_f is its bias matrix, i, j and k are the abscissa, ordinate and depth coordinate of the convolution region, r_l, r_h and r_d are the length, height and depth of the sentence-pair connection tensor, and i:i+x2-1, j:j+y2-1, k:k+z2-1 denotes the convolution region; formula (7.2) represents integrating the horizontal and vertical convolution results of the f-th convolution kernel in each region to obtain its k-th depth convolution result M_f(k), where s_x2 and s_y2 are the horizontal and vertical convolution strides; formula (7.3) represents integrating all depth convolution results of the f-th convolution kernel to obtain its overall convolution result M_f, where s_z2 is the depth convolution stride; formula (7.4) represents integrating the convolution results of all n convolution kernels to obtain the final convolution result M3D of this step, which is called the sentence-pair three-dimensional convolution matching tensor;
second, a two-dimensional convolutional neural network performs two-dimensional convolution matching processing on M3D to obtain the sentence-pair preliminary matching tensor M2D; the formulas are as follows:
match2_f(i, j) = ReLU( W3_f * M3D[i:i+x3-1, j:j+y3-1] + b3_f )    (8.1)
M2_f = [ match2_f(i, j) ]_{i, j}, with horizontal convolution stride s_x3 and vertical convolution stride s_y3    (8.2)
M2D = [ M2_1, M2_2, …, M2_n ]    (8.3)
wherein the sentence-pair three-dimensional convolution matching tensor M3D is the input of this step; formula (8.1) represents the result, mapped by the ReLU function, of the f-th convolution kernel convolving a specific region of the three-dimensional convolution matching tensor, where [x3, y3] is the size of the convolution kernel, W3_f is the weight matrix of the f-th convolution kernel, b3_f is its bias matrix, i and j are the abscissa and ordinate of the convolution region, t_l and t_h are the length and height of the sentence-pair three-dimensional convolution matching tensor, and i:i+x3-1, j:j+y3-1 denotes the convolution region; formula (8.2) represents integrating the convolution results of the f-th convolution kernel in each region to obtain its final convolution result M2_f, where s_x3 and s_y3 are the horizontal and vertical convolution strides; formula (8.3) represents combining the final convolution results of the n convolution kernels to obtain the final convolution result M2D of this step on the sentence-pair three-dimensional convolution matching tensor, which is called the sentence-pair preliminary matching tensor;
constructing a semantic feature screening layer: this layer receives the output of the semantic feature matching layer, namely the sentence-pair preliminary matching tensor, as input, and then completes the semantic feature screening and weighting operations, using the following formulas:

$$\bar{p}_i=W_2\,\tanh\!\left(W_1\,p_i\right)\qquad(9.1)$$

$$\alpha_i=\frac{\exp(\bar{p}_i)}{\sum_{j=1}^{N}\exp(\bar{p}_j)}\qquad(9.2)$$

$$\mathbf{m}=\sum_{i=1}^{N}\alpha_i\,p_i\qquad(9.3)$$

wherein equation (9.1) maps each feature vector $p_i$ of the preliminary matching tensor, where $W_1$ and $W_2$ are the corresponding trainable weight matrices in the model and $\bar{p}_i$ is the result after mapping; equation (9.2) computes the attention weight, where $\alpha_i$ denotes the attention weight; and equation (9.3) uses the attention weights to generate the final matching vector, where $N$ is the number of feature vectors in the preliminary matching tensor and $\mathbf{m}$ is the final sentence-pair semantic matching tensor.
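The screening step can be sketched as follows, assuming the tanh/softmax reading of equations (9.1)-(9.3) reconstructed above; the tensor sizes, weight shapes and helper name are illustrative assumptions, not values from the patent.

```python
# Hedged sketch of the semantic feature screening layer (equations (9.1)-(9.3)).
import tensorflow as tf

def screen_features(features, w1, w2):
    """features: (batch, N, d); w1: (d, h); w2: (h,)."""
    mapped = tf.tanh(tf.einsum("bnd,dh->bnh", features, w1))   # (9.1): map each feature vector
    scores = tf.einsum("bnh,h->bn", mapped, w2)
    alpha = tf.nn.softmax(scores, axis=1)                      # (9.2): attention weights
    return tf.einsum("bn,bnd->bd", alpha, features)            # (9.3): weighted sum -> matching tensor

# Example usage with random preliminary matching features and randomly initialised weights.
batch, n_vectors, d, h = 2, 36, 32, 16
feats = tf.random.normal((batch, n_vectors, d))
w1 = tf.Variable(tf.random.normal((d, h)))
w2 = tf.Variable(tf.random.normal((h,)))
print(screen_features(feats, w1, w2).shape)   # (2, 32)
```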
5. The sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer according to claim 4, wherein the label prediction module is constructed as follows:
the sentence-pair semantic matching tensor is used as the input of this module and is processed by a one-layer fully-connected network with dimensionality 1 and a sigmoid activation function, yielding a matching degree value in [0, 1], denoted y_pred, which is finally compared with the set threshold of 0.5 to judge whether the semantics of the sentence pair match; i.e., if y_pred is not less than 0.5, the semantics of the sentence pair match, otherwise they do not match. When the sentence-pair semantic matching model has not been sufficiently trained, it needs to be trained on the training data set to optimize the model parameters; once model training is finished, the label prediction module can predict whether the semantics of a target sentence pair match.
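A minimal sketch of this label prediction step, assuming an input dimensionality of 32 for the sentence-pair semantic matching tensor (an illustrative value only):

```python
# Hedged sketch of the label prediction module: Dense(1, sigmoid) plus a 0.5 threshold.
import tensorflow as tf
from tensorflow.keras import layers, Model

match_tensor = layers.Input(shape=(32,), name="sentence_pair_semantic_matching_tensor")
y_pred = layers.Dense(1, activation="sigmoid", name="y_pred")(match_tensor)
label_predictor = Model(match_tensor, y_pred)

def semantics_match(vectors, threshold=0.5):
    """True where the predicted matching degree y_pred reaches the threshold."""
    return label_predictor(vectors, training=False).numpy().ravel() >= threshold

print(semantics_match(tf.random.normal((3, 32))))
```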
6. The sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer according to claim 5, wherein the sentence-to-semantic matching knowledge base is constructed as follows:
downloading a data set on a network to obtain original data: downloading a sentence-to-semantic matching data set or a manually constructed data set which is already disclosed on a network, and taking the sentence-to-semantic matching data set or the manually constructed data set as original data for constructing a sentence-to-semantic matching knowledge base;
preprocessing the raw data: preprocessing the original data used for constructing the sentence-to-semantic matching knowledge base, and performing a word-breaking operation and a word-segmentation operation on each sentence, so as to obtain the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base;
summarizing the sub-knowledge base: summarizing a sentence-to-semantic matching word-breaking processing knowledge base and a sentence-to-semantic matching word-segmentation processing knowledge base, and constructing a sentence-to-semantic matching knowledge base;
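As an illustration of the word-breaking and word-segmentation preprocessing, the following sketch segments one invented sentence pair at character granularity and at word granularity; the use of jieba as the Chinese word segmenter and the helper name preprocess_pair are assumptions, not requirements of the claim.

```python
# Hedged preprocessing sketch: character-granularity (word-breaking) and word-granularity entries.
import jieba

def preprocess_pair(sentence1, sentence2):
    char_entry = (" ".join(sentence1), " ".join(sentence2))                        # word-breaking
    word_entry = (" ".join(jieba.cut(sentence1)), " ".join(jieba.cut(sentence2)))  # word segmentation
    return char_entry, word_entry

char_kb, word_kb = [], []
for s1, s2 in [("怎么开通网上银行", "网上银行如何开通")]:   # invented example pair
    char_entry, word_entry = preprocess_pair(s1, s2)
    char_kb.append(char_entry)
    word_kb.append(word_entry)

print(char_kb[0])
print(word_kb[0])
```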
the sentence-to-semantic matching model is obtained by training on a training data set, and the construction process of the training data set is as follows:
constructing a training positive example: in the sentence-to-semantic matching knowledge base, a sentence pair whose two sentences are semantically consistent is constructed into a positive example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 1); wherein sentence1_char and sentence2_char refer to sentence1 and sentence2 in the word-breaking processing knowledge base respectively, sentence1_word and sentence2_word refer to sentence1 and sentence2 in the word-segmentation processing knowledge base respectively, and 1 indicates that the semantics of the two sentences match, i.e. a positive example;
constructing a training negative example: selecting a sentence s1, randomly selecting from the sentence-to-semantic matching knowledge base a sentence s2 that does not match s1, and combining s1 and s2 to construct a negative example, formalized as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word, 0); wherein 0 indicates that the semantics of sentence s1 and sentence s2 do not match, i.e. a negative example;
constructing a training data set: combining all the positive example sentence pairs and negative example sentence pairs obtained by the operations of constructing training positive examples and constructing training negative examples, and shuffling their order to construct the final training data set;
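A hedged sketch of this training-set construction, assuming parallel word-breaking and word-segmentation knowledge bases holding more than one matched pair; the helper name build_dataset, the toy sentences and the sampling details are illustrative only.

```python
# Hedged sketch: positive examples (label 1), randomly paired negatives (label 0), shuffled.
import random

def build_dataset(char_kb, word_kb, seed=42):
    """char_kb / word_kb: parallel lists of (sentence1, sentence2) entries of matching pairs."""
    rng = random.Random(seed)
    n = len(char_kb)
    dataset = []
    for i in range(n):
        c1, c2 = char_kb[i]
        w1, w2 = word_kb[i]
        dataset.append((c1, c2, w1, w2, 1))                          # positive example
        j = rng.choice([k for k in range(n) if k != i])              # another pair, assumed unmatched
        dataset.append((c1, char_kb[j][1], w1, word_kb[j][1], 0))    # negative example
    rng.shuffle(dataset)                                             # disorder the sequence
    return dataset

# Example usage with invented toy knowledge-base entries.
char_kb = [("怎 么 开 通 网 上 银 行", "网 上 银 行 如 何 开 通"),
           ("如 何 重 置 密 码", "密 码 忘 了 怎 么 办")]
word_kb = [("怎么 开通 网上 银行", "网上 银行 如何 开通"),
           ("如何 重置 密码", "密码 忘 了 怎么办")]
for example in build_dataset(char_kb, word_kb):
    print(example)
```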
after the sentence-to-semantic matching model is built, training and optimizing the sentence-to-semantic matching model through a training data set are carried out, which specifically comprises the following steps:
constructing a loss function: adopting cross entropy as a loss function;
optimizing the training model: RMSProp is used as the optimization algorithm; apart from the learning rate, which is set to 0.0015, the remaining hyper-parameters of RMSProp keep the default settings in Keras; the sentence-to-semantic matching model is then optimized and trained on the training data set.
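A minimal Keras sketch of this training step: cross entropy as the loss function and RMSprop with learning rate 0.0015, all other optimizer hyper-parameters left at their Keras defaults. The stand-in model and random data are placeholders for the actual sentence-to-semantic matching model and training data set.

```python
# Hedged training sketch: binary cross entropy + RMSprop(learning_rate=0.0015).
import numpy as np
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import RMSprop

inputs = layers.Input(shape=(32,))                       # stand-in for the real model inputs
outputs = layers.Dense(1, activation="sigmoid")(inputs)
model = Model(inputs, outputs)

model.compile(optimizer=RMSprop(learning_rate=0.0015),
              loss="binary_crossentropy",
              metrics=["accuracy"])

x_train = np.random.rand(128, 32).astype("float32")      # stand-in training data
y_train = np.random.randint(0, 2, size=(128, 1))
model.fit(x_train, y_train, batch_size=32, epochs=2, verbose=0)
```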
7. An intelligent question-answering sentence-to-semantic matching device based on a semantic feature cube is characterized by comprising,
the sentence-to-semantic matching knowledge base construction unit is used for acquiring a large amount of sentence pair data and then carrying out preprocessing operation on the sentence pair data so as to obtain a sentence-to-semantic matching knowledge base which meets the training requirement;
a training data set generating unit for constructing positive case data and negative case data for training according to sentences in the sentence-to-semantic matching knowledge base, and constructing a final training data set based on the positive case data and the negative case data;
the sentence-to-semantic matching model construction unit is used for constructing a character mapping conversion table and a word mapping conversion table, and for constructing an input module, a character vector mapping layer and a word vector mapping layer, a deep semantic feature cube construction network module, a feature conversion network module and a label prediction module; the sentence-to-semantic matching model construction unit includes,
the character mapping conversion table or word mapping conversion table construction unit is responsible for segmenting each sentence in the sentence-to-semantic matching knowledge base at character granularity or word granularity, storing each character or word sequentially in a list to obtain a character list or word list, and numbering the characters or words in increasing order starting from 1, according to the order in which they are recorded in the character list or word list, thereby forming the character mapping conversion table or word mapping conversion table required by the invention; after the character mapping conversion table or word mapping conversion table is constructed, each character or word in the table is mapped to a unique numeric identifier; then, Word2Vec is used to train a character vector model or word vector model, so as to obtain the character vector matrix of each character or the word vector matrix of each word;
the input module construction unit is responsible for preprocessing each sentence pair in the training data set or each sentence pair to be predicted, obtaining sentence1_char, sentence2_char, sentence1_word and sentence2_word respectively, and formalizing them as: (sentence1_char, sentence2_char, sentence1_word, sentence2_word);
the character vector mapping layer or word vector mapping layer construction unit is responsible for loading the character vector matrix or word vector matrix obtained by training in the step of the character mapping conversion table or word mapping conversion table construction unit, so as to initialize the weight parameters of the current layer; for character vector mapping, for the input sentences sentence1_char and sentence2_char, their corresponding sentence vectors sentence1_char_embed and sentence2_char_embed are obtained; for word vector mapping, for the input sentences sentence1_word and sentence2_word, their corresponding sentence vectors sentence1_word_embed and sentence2_word_embed are obtained;
the deep semantic feature cube construction network module construction unit is used for constructing one-dimensional semantic information into a semantic feature cube; specifically, it receives the character embedding representation output by the character vector mapping layer and the word embedding representation output by the word vector mapping layer as input; the first-layer coding structure performs coding operations on the character embedding representation and the word embedding representation output by the multi-granularity embedding module, so as to obtain the first-layer character coding result and the first-layer word coding result; two new dimensions, defined as the granularity dimension and the depth dimension, are added to the first-layer character coding result and the first-layer word coding result; the first-layer character coding result and the first-layer word coding result are connected in the granularity dimension to generate the first-level semantic feature cube, while the first-layer character coding result and first-layer word coding result before dimension expansion are passed to the second-layer coding structure; the second-layer coding structure performs coding operations on the first-layer character coding result and the first-layer word coding result, so as to obtain the second-layer character coding result and the second-layer word coding result; two dimensions, namely the granularity dimension and the depth dimension, are added to the second-layer character coding result and the second-layer word coding result; the second-layer character coding result and the second-layer word coding result are connected in the granularity dimension to generate the second-level semantic feature cube, while the second-layer character coding result and second-layer word coding result before dimension expansion are passed to the third-layer coding structure; by analogy, repeated coding generates the multi-level semantic feature cubes; after the semantic feature cubes of each level are obtained, the semantic feature cubes of all levels are connected in the depth dimension to generate the final deep semantic feature cube (an illustrative sketch of this construction is given after this claim);
the feature conversion network module construction unit is responsible for further processing a deep semantic feature cube of a corresponding sentence and carrying out semantic feature coding, semantic feature matching, semantic feature screening and other operations on the deep semantic feature cube so as to generate a final sentence to semantic matching tensor; the corresponding operation is realized through a three-dimensional convolution semantic feature coding layer, a semantic feature matching layer and a semantic feature screening layer respectively;
the label prediction module unit is responsible for processing the sentence-pair semantic matching tensor so as to obtain a matching degree value, which is compared with the established threshold so as to judge whether the semantics of the sentence pair match;
and the sentence-to-semantic matching model training unit is used for constructing a loss function required in the model training process and finishing the optimization training of the model.
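For illustration, the sketch below (referenced in the deep semantic feature cube construction network module construction unit above) shows one way such a cube could be assembled: character- and word-granularity encodings are produced level by level, each level's pair of coding results is joined on a new granularity dimension, and the per-level cubes are joined on a depth dimension. The choice of BiLSTM encoders, the shared sequence length for both granularities, and all sizes are assumptions; this is not the patented network.

```python
# Hedged sketch of a deep semantic feature cube built from stacked character/word encoders.
from tensorflow.keras import layers, Model

seq_len, vocab_chars, vocab_words, emb_dim, units, levels = 40, 5000, 20000, 128, 64, 3

char_ids = layers.Input(shape=(seq_len,), dtype="int32", name="sentence_char")
word_ids = layers.Input(shape=(seq_len,), dtype="int32", name="sentence_word")

# Embedding layers; in the described model their weights would be initialised from Word2Vec.
char_emb = layers.Embedding(vocab_chars, emb_dim)(char_ids)
word_emb = layers.Embedding(vocab_words, emb_dim)(word_ids)

cubes = []
char_code, word_code = char_emb, word_emb
for level in range(levels):
    encoder_c = layers.Bidirectional(layers.LSTM(units, return_sequences=True))
    encoder_w = layers.Bidirectional(layers.LSTM(units, return_sequences=True))
    char_code = encoder_c(char_code)                      # this level's character coding result
    word_code = encoder_w(word_code)                      # this level's word coding result
    # Add granularity and depth dimensions, then join the two codings on granularity.
    c = layers.Reshape((seq_len, 1, 1, 2 * units))(char_code)
    w = layers.Reshape((seq_len, 1, 1, 2 * units))(word_code)
    cubes.append(layers.Concatenate(axis=2)([c, w]))      # level cube: (len, gran=2, depth=1, feat)

# Join the per-level cubes on the depth dimension to form the deep semantic feature cube.
deep_cube = layers.Concatenate(axis=3)(cubes)             # (len, 2, levels, feat)

builder = Model([char_ids, word_ids], deep_cube, name="deep_semantic_feature_cube_sketch")
builder.summary()
```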
8. The sentence-to-semantic matching device based on semantic feature cube and oriented to intelligent question and answer according to claim 7, wherein the sentence-to-semantic matching knowledge base construction unit comprises,
the sentence pair data acquisition unit is responsible for downloading a sentence pair semantic matching data set or a manually constructed data set which is already disclosed on a network, and the sentence pair data set is used as original data for constructing a sentence pair semantic matching knowledge base;
the original data word-breaking preprocessing or word-segmentation preprocessing unit is responsible for preprocessing the original data used for constructing the sentence-to-semantic matching knowledge base, performing a word-breaking operation or word-segmentation operation on each sentence therein, so as to construct the sentence-to-semantic matching word-breaking processing knowledge base or the sentence-to-semantic matching word-segmentation processing knowledge base;
the sub-knowledge base summarizing unit is responsible for summarizing the sentence-to-semantic matching word-breaking processing knowledge base and the sentence-to-semantic matching word-segmentation processing knowledge base so as to construct the sentence-to-semantic matching knowledge base;
the training data set generating unit comprises a training data set generating unit,
the training positive case data construction unit is responsible for constructing two sentences with consistent semantics in the sentence-to-semantic matching knowledge base and the matching labels 1 thereof into training positive case data;
the training negative case data construction unit is responsible for selecting one sentence, randomly selecting a sentence which does not match it for combination, and constructing the two sentences and their matching label 0 into training negative case data;
the training data set construction unit is responsible for combining all training positive example data and training negative example data together and disordering the sequence of the training positive example data and the training negative example data so as to construct a final training data set;
the sentence-to-semantic matching model training unit includes,
the loss function construction unit is responsible for calculating the error of the semantic matching degree between the sentences 1 and 2;
and the model optimization unit is responsible for training and adjusting parameters in model training to reduce prediction errors.
9. A storage medium having a plurality of instructions stored thereon, characterized in that the instructions are loaded by a processor to perform the steps of the sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer according to any one of claims 1-6.
10. An electronic device, characterized in that the electronic device comprises: the storage medium of claim 9; and a processor for executing instructions in the storage medium.
CN202010855971.XA 2020-08-24 2020-08-24 Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer Active CN112000772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010855971.XA CN112000772B (en) 2020-08-24 2020-08-24 Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer

Publications (2)

Publication Number Publication Date
CN112000772A true CN112000772A (en) 2020-11-27
CN112000772B CN112000772B (en) 2022-09-06

Family

ID=73471688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010855971.XA Active CN112000772B (en) 2020-08-24 2020-08-24 Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer

Country Status (1)

Country Link
CN (1) CN112000772B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032635A (en) * 2019-04-22 2019-07-19 齐鲁工业大学 One kind being based on the problem of depth characteristic fused neural network to matching process and device
CN110083692A (en) * 2019-04-22 2019-08-02 齐鲁工业大学 A kind of the text interaction matching process and device of finance knowledge question
CN111310439A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Intelligent semantic matching method and device based on depth feature dimension-changing mechanism
CN111310438A (en) * 2020-02-20 2020-06-19 齐鲁工业大学 Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN111325028A (en) * 2020-02-20 2020-06-23 齐鲁工业大学 Intelligent semantic matching method and device based on deep hierarchical coding
CN111339249A (en) * 2020-02-20 2020-06-26 齐鲁工业大学 Deep intelligent text matching method and device combining multi-angle features

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966524A (en) * 2021-03-26 2021-06-15 湖北工业大学 Chinese sentence semantic matching method and system based on multi-granularity twin network
CN112966524B (en) * 2021-03-26 2024-01-26 湖北工业大学 Chinese sentence semantic matching method and system based on multi-granularity twin network
CN113065359A (en) * 2021-04-07 2021-07-02 齐鲁工业大学 Sentence-to-semantic matching method and device oriented to intelligent interaction
CN113065358A (en) * 2021-04-07 2021-07-02 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113268962A (en) * 2021-06-08 2021-08-17 齐鲁工业大学 Text generation method and device for building industry information service question-answering system
CN114238563A (en) * 2021-12-08 2022-03-25 齐鲁工业大学 Multi-angle interaction-based intelligent matching method and device for Chinese sentences to semantic meanings
CN114547256A (en) * 2022-04-01 2022-05-27 齐鲁工业大学 Text semantic matching method and device for intelligent question answering of fire safety knowledge
CN114547256B (en) * 2022-04-01 2024-03-15 齐鲁工业大学 Text semantic matching method and device for intelligent question and answer of fire safety knowledge
CN116306811A (en) * 2023-02-28 2023-06-23 苏州亿铸智能科技有限公司 Weight distribution method for deploying neural network for ReRAM
CN116306811B (en) * 2023-02-28 2023-10-27 苏州亿铸智能科技有限公司 Weight distribution method for deploying neural network for ReRAM

Also Published As

Publication number Publication date
CN112000772B (en) 2022-09-06

Similar Documents

Publication Publication Date Title
CN112000772B (en) Sentence-to-semantic matching method based on semantic feature cube and oriented to intelligent question and answer
CN111310438B (en) Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model
CN112000770B (en) Semantic feature graph-based sentence semantic matching method for intelligent question and answer
CN109376242B (en) Text classification method based on cyclic neural network variant and convolutional neural network
CN109614471B (en) Open type problem automatic generation method based on generation type countermeasure network
CN111325028B (en) Intelligent semantic matching method and device based on deep hierarchical coding
CN110188272B (en) Community question-answering website label recommendation method based on user background
CN110134946B (en) Machine reading understanding method for complex data
CN111310439B (en) Intelligent semantic matching method and device based on depth feature dimension changing mechanism
CN110413785A (en) A kind of Automatic document classification method based on BERT and Fusion Features
CN112001166B (en) Intelligent question-answer sentence semantic matching method and device for government affair consultation service
CN112000771B (en) Judicial public service-oriented sentence pair intelligent semantic matching method and device
CN109063164A (en) A kind of intelligent answer method based on deep learning
CN110222184A (en) A kind of emotion information recognition methods of text and relevant apparatus
CN107544960B (en) Automatic question-answering method based on variable binding and relation activation
CN112380319A (en) Model training method and related device
CN112527993B (en) Cross-media hierarchical deep video question-answer reasoning framework
CN114528398A (en) Emotion prediction method and system based on interactive double-graph convolutional network
CN116796045A (en) Multi-dimensional book grading method, system and readable medium
CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium
CN115204143A (en) Method and system for calculating text similarity based on prompt
CN115203206A (en) Data content searching method and device, computer equipment and readable storage medium
CN113722439A (en) Cross-domain emotion classification method and system based on antagonism type alignment network
CN113821610A (en) Information matching method, device, equipment and storage medium
CN114547256B (en) Text semantic matching method and device for intelligent question and answer of fire safety knowledge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant