CN115269808A - Text semantic matching method and device for medical intelligent question answering - Google Patents
- Publication number
- CN115269808A CN115269808A CN202210996504.8A CN202210996504A CN115269808A CN 115269808 A CN115269808 A CN 115269808A CN 202210996504 A CN202210996504 A CN 202210996504A CN 115269808 A CN115269808 A CN 115269808A
- Authority
- CN
- China
- Prior art keywords
- text
- word
- feature
- alignment
- granularity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 40
- 230000002452 interceptive effect Effects 0.000 claims abstract description 143
- 238000012549 training Methods 0.000 claims abstract description 87
- 239000013598 vector Substances 0.000 claims abstract description 68
- 238000000605 extraction Methods 0.000 claims abstract description 37
- 230000003993 interaction Effects 0.000 claims abstract description 36
- 230000004927 fusion Effects 0.000 claims abstract description 18
- 238000010276 construction Methods 0.000 claims abstract description 5
- 235000019580 granularity Nutrition 0.000 claims description 598
- 238000006116 polymerization reaction Methods 0.000 claims description 50
- 238000013507 mapping Methods 0.000 claims description 37
- 230000002776 aggregation Effects 0.000 claims description 33
- 238000004220 aggregation Methods 0.000 claims description 33
- 230000006870 function Effects 0.000 claims description 33
- 238000011176 pooling Methods 0.000 claims description 32
- 238000012545 processing Methods 0.000 claims description 27
- 230000011218 segmentation Effects 0.000 claims description 21
- 238000006243 chemical reaction Methods 0.000 claims description 19
- 230000002457 bidirectional effect Effects 0.000 claims description 18
- 230000015654 memory Effects 0.000 claims description 16
- 230000004913 activation Effects 0.000 claims description 15
- 239000011159 matrix material Substances 0.000 claims description 15
- 238000005457 optimization Methods 0.000 claims description 14
- 238000007781 pre-processing Methods 0.000 claims description 12
- 230000004931 aggregating effect Effects 0.000 claims description 6
- 230000007787 long-term memory Effects 0.000 claims description 4
- 230000006403 short-term memory Effects 0.000 claims description 4
- 230000008569 process Effects 0.000 claims description 3
- 230000000379 polymerizing effect Effects 0.000 claims 1
- 238000009411 base construction Methods 0.000 abstract description 3
- 238000003058 natural language processing Methods 0.000 abstract description 3
- 208000024891 symptom Diseases 0.000 description 12
- 238000010586 diagram Methods 0.000 description 4
- 230000007246 mechanism Effects 0.000 description 2
- 201000009240 nasopharyngitis Diseases 0.000 description 2
- 208000032023 Signs and Symptoms Diseases 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a text semantic matching method and device for medical intelligent question answering, and belongs to the technical field of natural language processing. The technical problem to be solved by the invention is how to capture fine-grained semantic features within a single text and semantic interaction features between texts so as to realize semantic matching of texts. The technical scheme is as follows: a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; character- and word-granularity features of the texts are extracted, the fine-grained semantic features within the same text and the inter-text semantic interaction features are captured, multiple related features are finally combined, multiple matching operations are then performed to generate the final matching feature vector, and the similarity of the texts is judged. The device comprises a text semantic matching knowledge base construction unit, a training data set generation unit, a text semantic matching model construction unit and a text semantic matching model training unit.
Description
Technical Field
The invention relates to the technical field of artificial intelligence and natural language processing, in particular to a text semantic matching method and device for medical intelligent question answering.
Background
Medical intelligent question answering can automatically find questions with similar semantics in a question-answer knowledge base for the questions raised by patients and push the corresponding answers to the user, which greatly reduces the burden of manual answering by doctors. For the varied questions raised by patients, how to find the standard questions semantically similar to them is the core of a medical intelligent question-answering system. Its essence is to measure the degree of matching between the question raised by the patient and the standard questions in the question-answer knowledge base, which is in essence a text semantic matching task.
The text semantic matching task aims to measure whether the semantics contained in two texts are consistent, which coincides with the core objective of many natural language processing tasks. Semantic matching of natural language texts is a very challenging task, and existing methods cannot solve it completely.
Existing methods mainly focus on similarity discrimination for English texts: they model the semantic information inside a single text at the word-granularity level and model the semantic interaction information between texts at the text level. However, Chinese texts are more complex than English texts; Chinese carries rich semantic information at both the character-granularity level and the word-granularity level, and how to better capture semantic information at the character-granularity, word-granularity and text levels so as to better determine the semantic similarity between texts is a challenging task. Aiming at the shortcomings of existing text semantic matching methods, the invention provides a text semantic matching method and device for medical intelligent question answering, which capture fine-grained semantic features within the same text and semantic interaction features between texts at multiple levels. The core idea is to extract the character- and word-granularity features of the texts by combining a multi-layer coding structure with multiple attention mechanisms, capture the fine-grained semantic features within the same text and the inter-text semantic interaction features, finally combine multiple related features, then perform multiple matching operations to generate the final matching feature vector, and judge the similarity of the texts.
Disclosure of Invention
The technical task of the invention is to provide a text semantic matching method and device for medical intelligent question answering that extract the character- and word-granularity features of the texts, capture the fine-grained semantic features within the same text and the inter-text semantic interaction features, finally combine multiple related features, then perform multiple matching operations to generate the final matching feature vector, and judge the similarity of the texts.
The technical task of the invention is realized in the following way. In the text semantic matching method for medical intelligent question answering, a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; the character- and word-granularity features of the texts are extracted, the fine-grained semantic features within the same text and the inter-text semantic interaction features are captured, multiple related features are finally combined, and multiple matching operations are then performed to generate the final matching feature vector and judge the similarity of the texts. The method comprises the following specific steps:
the embedding layer carries out embedding operation on the input text according to the character granularity and the word granularity respectively to obtain text character embedding representation and word embedding representation;
the semantic coding layer receives the text character embedding representation and word embedding representation, encodes them with a bidirectional long short-term memory network (BiLSTM), and outputs the character- and word-granularity features of the text;
the multi-level fine-grained feature extraction layer performs intra-text and inter-text coding operations on the character- and word-granularity features output by the semantic coding layer to obtain the fine-grained semantic features within the same text and the inter-text semantic interaction features;
the feature fusion layer combines the related features, and then performs various matching operations to generate a final matching feature vector;
and the prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result.
Preferably, the embedding layer comprises a character-word mapping conversion table, an input layer and a character-word vector mapping layer, and outputs the text character embedding representation and word embedding representation;
wherein the character-word mapping conversion table: the mapping rule starts from the number 1 and assigns identifiers incrementally according to the order in which each character or word is recorded into the character-word list, thereby forming the character-word mapping conversion table; a Word2Vec word vector model is then trained to obtain the vector of each character and word, forming the character-word vector matrix;
the input layer: the input layer has four inputs; each text in the training data set, or each text to be predicted, is preprocessed by word breaking and word segmentation to obtain txtP_char, txtQ_char, txtP_word and txtQ_word respectively, where the suffixes char and word indicate that the corresponding text has been processed by word breaking or word segmentation, formalized as (txtP_char, txtQ_char, txtP_word, txtQ_word); each character and word in the input texts is converted into the corresponding numerical identifier according to the character-word mapping conversion table;
the character-word vector mapping layer: the character-word vector matrix obtained in the step of constructing the character-word mapping conversion table is loaded to initialize the weight parameters of this layer; for the input texts txtP_char, txtQ_char, txtP_word and txtQ_word, the corresponding text character embedding representations and word embedding representations txtP_char_embedded, txtQ_char_embedded, txtP_word_embedded and txtQ_word_embedded are obtained.
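As a minimal illustrative sketch only (not the patented implementation), the character-word mapping conversion table, the Word2Vec vector matrix and the id conversion described above could be assembled roughly as follows; the helper names, gensim usage and padding length are assumptions:

```python
# Illustrative sketch of the embedding-layer preprocessing; helper names and sizes are assumptions.
import numpy as np
from gensim.models import Word2Vec

def build_vocab(tokenized_texts):
    """Character-word mapping conversion table: ids start from 1, assigned in recording order."""
    vocab = {}
    for tokens in tokenized_texts:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab

def build_embedding_matrix(vocab, tokenized_texts, dim=300):
    """Train a Word2Vec model and copy its vectors into an id-indexed matrix (row 0 = padding)."""
    w2v = Word2Vec(sentences=tokenized_texts, vector_size=dim, min_count=1)
    matrix = np.zeros((len(vocab) + 1, dim), dtype=np.float32)
    for tok, idx in vocab.items():
        if tok in w2v.wv:
            matrix[idx] = w2v.wv[tok]
    return matrix

def texts_to_ids(tokenized_texts, vocab, max_len=40):
    """Convert txtP_char / txtQ_char / txtP_word / txtQ_word into padded numerical id sequences."""
    out = []
    for toks in tokenized_texts:
        ids = [vocab.get(t, 0) for t in toks][:max_len]
        out.append(ids + [0] * (max_len - len(ids)))
    return np.array(out)
```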
More preferably, the implementation details of the semantic coding layer are as follows:
Taking the text P as an example, this module receives the character embedding representation and word embedding representation of the text P, encodes them with the bidirectional long short-term memory network BiLSTM, and obtains the character-granularity and word-granularity features of the text P, denoted P_c and P_w. The specific formulas are as follows:

p_i^c = [→p_i^c ; ←p_i^c],  i = 1, …, N   (1)

p_j^w = [→p_j^w ; ←p_j^w],  j = 1, …, N   (2)

wherein N denotes the length of the character-granularity and word-granularity features; formula (1) denotes that the character embedding representation of the text P is encoded with the bidirectional long short-term memory network BiLSTM, where p_i^c denotes the character-granularity feature at the i-th position of the text P obtained by BiLSTM encoding, →p_i^c denotes the character-granularity feature at the i-th position of the text P obtained by forward LSTM encoding, and ←p_i^c denotes the character-granularity feature at the i-th position of the text P obtained by backward LSTM encoding; the symbols in formula (2) have essentially the same meaning as those in formula (1), where p_j^w denotes the word-granularity feature at the j-th position of the text P obtained by BiLSTM encoding, →p_j^w denotes the word-granularity feature at the j-th position of the text P obtained by forward LSTM encoding, and ←p_j^w denotes the word-granularity feature at the j-th position of the text P obtained by backward LSTM encoding.
Similarly, the text Q is processed in the same way as the text P to obtain the character-granularity and word-granularity features of the text Q, denoted Q_c and Q_w.
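A sketch of how the semantic coding layer could be realized in Keras, assuming the embedding matrices from the sketch above; the layer sizes and the use of tf.keras are assumptions, not the patent's exact configuration:

```python
# Illustrative Keras sketch of the semantic coding layer (sizes are assumptions).
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, VOCAB_SIZE, EMB_DIM, LSTM_UNITS = 40, 20000, 300, 150

def encode_branch(name):
    """Embed one input sequence and encode it with a BiLSTM, returning per-position features."""
    inp = layers.Input(shape=(MAX_LEN,), name=name)
    emb = layers.Embedding(VOCAB_SIZE, EMB_DIM, mask_zero=True)(inp)
    # The BiLSTM output at each position concatenates the forward and backward hidden states,
    # i.e. the per-position granularity features of formulas (1) and (2).
    enc = layers.Bidirectional(layers.LSTM(LSTM_UNITS, return_sequences=True))(emb)
    return inp, enc

p_char_in, P_c = encode_branch("txtP_char")   # character-granularity features of text P
p_word_in, P_w = encode_branch("txtP_word")   # word-granularity features of text P
q_char_in, Q_c = encode_branch("txtQ_char")   # character-granularity features of text Q
q_word_in, Q_w = encode_branch("txtQ_word")   # word-granularity features of text Q
```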
Preferably, the implementation details of the multi-level fine-grained feature extraction layer are as follows:
intra-text and inter-text coding operations are performed on the character- and word-granularity features output by the semantic coding layer to obtain the fine-grained semantic features within the same text and the inter-text semantic interaction features; this layer comprises two sub-modules: the first sub-module is responsible for extracting the fine-grained semantic features within the same text, mainly using multiple attention modules to encode the different granularities of the same text; the second sub-module is responsible for extracting the inter-text semantic interaction features, mainly using several layers of coding structures between the texts;
First sub-module, extraction of fine-grained semantic features within the same text:
First, for convenience of subsequent description, taking the text P as an example, the following attention modules are defined:
defining a soft alignment attention module, denoted as SOA, and the formula is as follows:
wherein p_i^c denotes the character-granularity feature at the i-th position of the text P from formula (1), and p_j^w denotes the word-granularity feature at the j-th position of the text P from formula (2); the soft alignment attention weight between the character-granularity feature at the i-th position and the word-granularity feature at the j-th position of the text P is computed and mapped by the softmax operation to a value between 0 and 1; using soft alignment attention, the character-granularity feature at the i-th position of the text P is then re-expressed as the weighted sum of all word-granularity features of the text P, and the word-granularity feature at the j-th position of the text P is re-expressed as the weighted sum of all character-granularity features of the text P;
define the multiplicative alignment attention module, denoted MUA, as follows:
wherein TimeDistributed(Dense()) denotes that the same Dense() layer operation is applied to the tensor at each time step, ⊙ denotes the bit-wise multiplication operation, and tanh denotes the activation function; P_c denotes the character-granularity feature of the text P, and the projected feature is the character-granularity feature of the text P after being processed by the Dense() layer; P_w denotes the word-granularity feature of the text P; the multiplication alignment attention weight is mapped by the softmax operation to a value between 0 and 1; using multiplication alignment attention, the character-granularity feature at the i-th position of the text P is then re-expressed as the weighted sum of all word-granularity features of the text P, and the word-granularity feature at the j-th position of the text P is re-expressed as the weighted sum of all character-granularity features of the text P;
defining the Subtraction alignment attention module as SUA, the formula is as follows:
wherein TimeDistributed(Dense()) denotes that the same Dense() layer operation is applied to the tensor at each time step, a bit-wise subtraction operation is applied, and tanh denotes the activation function; P_c denotes the character-granularity feature of the text P, and the projected feature is the character-granularity feature of the text P after being processed by the Dense() layer; P_w denotes the word-granularity feature of the text P; the subtraction alignment attention weight is mapped by the softmax operation to a value between 0 and 1; using subtraction alignment attention, the character-granularity feature at the i-th position of the text P is then re-expressed as the weighted sum of all word-granularity features of the text P, and the word-granularity feature at the j-th position of the text P is re-expressed as the weighted sum of all character-granularity features of the text P;
define the self-alignment attention module as SEA, the formula is as follows:
wherein,the i-th position word granularity characteristic of the text P is represented,representing the jth position word granularity characteristic of the text P,representing a self-aligned attention weight between the ith position word granularity feature of the text P and the jth position word granularity feature of the text P,indicating that the softmax operation on the self-aligned attention weight maps to a value of 0-1,the representation uses self-alignment attention to enable the ith position word granularity feature of the text P to be represented again by weighted summation of all word granularity features of the text P;
in the following description, the SOA symbol is used to represent the operation of formula (3), the MUA symbol is used to represent the operation of formula (4), the SUA symbol is used to represent the operation of formula (5), and the SEA symbol is used to represent the operation of formula (6);
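The formula images for (3)-(6) are not reproduced here; as a rough numpy sketch under the descriptions above (the exact weight computation used in the patent may differ), the soft alignment and self alignment modules could look like the code below, with the multiplication and subtraction variants differing only in how the raw weights are formed:

```python
# Rough numpy sketch of the alignment-attention modules; this is one plausible reading of the
# text, not the patent's exact formulas.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soa(P_c, P_w):
    """Soft alignment (SOA): weights between the two granularities; each side re-expressed by the other."""
    e = P_c @ P_w.T                        # raw alignment weights between positions
    P_c_hat = softmax(e, axis=1) @ P_w     # char-granularity feature re-expressed via word features
    P_w_hat = softmax(e, axis=0).T @ P_c   # word-granularity feature re-expressed via char features
    return P_c_hat, P_w_hat

def mua(P_c, P_w, W, b):
    """Multiplication alignment (MUA): project P_c with a shared Dense + tanh, then align as in SOA."""
    proj = np.tanh(P_c @ W + b)            # TimeDistributed(Dense) with tanh activation
    e = proj @ P_w.T
    return softmax(e, axis=1) @ P_w, softmax(e, axis=0).T @ P_c

def sea(X):
    """Self alignment (SEA): a sequence attends over itself."""
    e = X @ X.T
    return softmax(e, axis=1) @ X
```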
The first layer of the coding structure uses multiple attention modules to extract the fine-grained initial semantic features of the same text:

First, using soft alignment attention, the character-granularity feature P_c and the word-granularity feature P_w of the text P undergo soft alignment attention to obtain the character-granularity-level soft alignment feature and the word-granularity-level soft alignment feature of the text P, as shown in formula (7);

Second, using multiplication alignment attention, the character-granularity feature P_c and the word-granularity feature P_w of the text P undergo multiplication alignment attention to obtain the character-granularity-level multiplication alignment feature and the word-granularity-level multiplication alignment feature of the text P, as shown in formula (8);

Then, using subtraction alignment attention, the character-granularity feature P_c and the word-granularity feature P_w of the text P undergo subtraction alignment attention to obtain the character-granularity-level subtraction alignment feature and the word-granularity-level subtraction alignment feature of the text P, as shown in formula (9);

Similarly, the text Q is processed in the same way as the text P to obtain the character-granularity-level and word-granularity-level soft alignment features, multiplication alignment features and subtraction alignment features of the text Q, which completes the extraction of the fine-grained initial semantic features of the same text;
The second layer of the coding structure enhances the fine-grained initial semantic features of the same text to complete the extraction of fine-grained semantic features of the same text:

At the character granularity, the character-granularity-level soft alignment feature of the text P in formula (7) is first added to the character-granularity feature P_c in formula (1) to obtain the character-granularity-level deep soft alignment feature of the text P, as shown in formula (10); the character-granularity-level multiplication alignment feature of the text P in formula (8) is then added to P_c to obtain the character-granularity-level deep multiplication alignment feature of the text P, as shown in formula (11); the character-granularity-level subtraction alignment feature of the text P in formula (9) is then added to P_c to obtain the character-granularity-level deep subtraction alignment feature of the text P, as shown in formula (12); the three deep alignment features of formulas (10), (11) and (12) are then concatenated to obtain the character-granularity-level high-level feature P'_c of the text P, as shown in formula (13);

The word granularity is handled analogously to the character granularity: the word-granularity-level soft, multiplication and subtraction alignment features of the text P in formulas (7), (8) and (9) are each added to the word-granularity feature P_w in formula (2) to obtain the word-granularity-level deep soft alignment feature, deep multiplication alignment feature and deep subtraction alignment feature of the text P, as shown in formulas (14), (15) and (16); these three features are then concatenated to obtain the word-granularity-level high-level feature P'_w of the text P, as shown in formula (17);

The character-granularity-level deep soft alignment feature of formula (10) and the word-granularity-level deep soft alignment feature of formula (14) are concatenated to obtain the deep semantic feature of the text P, as shown in formula (18);

Similarly, the text Q is processed in the same way as the text P to obtain its character-granularity-level and word-granularity-level deep soft alignment features, deep multiplication alignment features and deep subtraction alignment features, the character-granularity-level high-level feature Q'_c, the word-granularity-level high-level feature Q'_w and the deep semantic feature of the text Q, which completes the extraction of fine-grained semantic features of the same text; a compact sketch of this enhancement step is given below.
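As a compact numpy sketch of the residual-style enhancement and concatenation just described (formulas (10)-(18)); the feature names and the concatenation axis are assumptions:

```python
# Sketch of the second coding layer of the first sub-module: alignment features are added back
# to the raw granularity features and concatenated into high-level features (axis is assumed).
import numpy as np

def enhance_same_text(P_g, soft_feat, mul_feat, sub_feat):
    """P_g: raw character- or word-granularity feature (N, d); the *_feat come from formulas (7)-(9)."""
    deep_soft = soft_feat + P_g                                            # formulas (10)/(14)
    deep_mul = mul_feat + P_g                                              # formulas (11)/(15)
    deep_sub = sub_feat + P_g                                              # formulas (12)/(16)
    high_level = np.concatenate([deep_soft, deep_mul, deep_sub], axis=-1)  # formulas (13)/(17)
    return deep_soft, high_level

def deep_semantic(deep_soft_char, deep_soft_word):
    """Formula (18): concatenate the char-level and word-level deep soft alignment features."""
    return np.concatenate([deep_soft_char, deep_soft_word], axis=-1)
```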
Second sub-module, extraction of inter-text semantic interaction features:

The first layer of the coding structure uses several coding structures simultaneously to extract the initial inter-text semantic interaction features:

At the character granularity, the character-granularity feature P_c of the text P in formula (1) and the character-granularity feature Q_c of the text Q first undergo soft alignment attention to obtain the character-granularity-level soft alignment interaction features of the text P and of the text Q, as shown in formula (19); P_c and Q_c then undergo subtraction alignment attention to obtain the character-granularity-level subtraction alignment interaction features of the text P and of the text Q, as shown in formula (20);

The word granularity is handled analogously: the word-granularity feature P_w of the text P in formula (2) and the word-granularity feature Q_w of the text Q first undergo soft alignment attention to obtain the word-granularity-level soft alignment interaction features of the text P and of the text Q, as shown in formula (21); P_w and Q_w then undergo subtraction alignment attention to obtain the word-granularity-level subtraction alignment interaction features of the text P and of the text Q, as shown in formula (22);

The second layer of the coding structure enhances the initial inter-text semantic interaction features to complete the extraction of inter-text semantic interaction features:

At the character granularity, the character-granularity-level soft alignment interaction feature of the text P in formula (19) is first added to P_c to obtain the character-granularity-level deep soft alignment interaction feature of the text P, as shown in formula (23); the character-granularity-level subtraction alignment interaction feature of the text P in formula (20) is then added to P_c to obtain the character-granularity-level deep subtraction alignment interaction feature of the text P, as shown in formula (24); finally, the features of formulas (23) and (24) are concatenated to obtain the character-granularity-level high-level interaction feature of the text P, as shown in formula (25);

At the word granularity, the word-granularity-level soft alignment interaction feature of the text P in formula (21) is first added to P_w to obtain the word-granularity-level deep soft alignment interaction feature of the text P, as shown in formula (26); the word-granularity-level subtraction alignment interaction feature of the text P in formula (22) is then added to P_w to obtain the word-granularity-level deep subtraction alignment interaction feature of the text P, as shown in formula (27); finally, the features of formulas (26) and (27) are concatenated to obtain the word-granularity-level high-level interaction feature of the text P, as shown in formula (28);

The character-granularity-level deep subtraction alignment interaction feature of formula (24) and the word-granularity-level deep subtraction alignment interaction feature of formula (27) are concatenated to obtain the deep semantic interaction feature of the text P, as shown in formula (29);

Similarly, the text Q is processed in the same way as the text P to obtain its character-granularity-level deep soft alignment interaction feature, character-granularity-level deep subtraction alignment interaction feature, character-granularity-level high-level interaction feature, word-granularity-level deep soft alignment interaction feature, word-granularity-level deep subtraction alignment interaction feature, word-granularity-level high-level interaction feature and deep semantic interaction feature, which completes the extraction of inter-text semantic interaction features; a sketch of this branch is given below.
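A numpy sketch of the inter-text branch just described (formulas (19)-(28)) at one granularity; the subtraction cross-alignment is passed in as a function because its exact form is not recoverable from the text, and a symmetric call handles the other granularity:

```python
# Sketch of the second sub-module at one granularity; `sub_align` stands in for the
# subtraction alignment attention, whose exact parameterization is not given here.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_soft_align(P, Q):
    """Formulas (19)/(21): each position of P re-expressed by Q, and vice versa."""
    e = P @ Q.T
    return softmax(e, axis=1) @ Q, softmax(e, axis=0).T @ P

def interaction_features(P_g, Q_g, sub_align):
    P_soft, Q_soft = cross_soft_align(P_g, Q_g)    # soft alignment interaction features
    P_sub, Q_sub = sub_align(P_g, Q_g)             # subtraction alignment interaction features
    # Formulas (23)-(28): residual enhancement, then concatenation into high-level interaction features.
    P_high = np.concatenate([P_soft + P_g, P_sub + P_g], axis=-1)
    Q_high = np.concatenate([Q_soft + Q_g, Q_sub + Q_g], axis=-1)
    return P_high, Q_high, P_sub + P_g, Q_sub + Q_g
```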
More preferably, the implementation details of the feature fusion layer are as follows:
first, for convenience of subsequent description, the following definitions are made:
The vector operation of bit-wise subtraction followed by the absolute value is denoted AB, as shown in formula (30):

AB(P, Q) = |P - Q|   (30)

wherein P and Q are two different vectors, which are subtracted bit-wise and then the absolute value is taken;

The bit-wise multiplication operation of vectors is denoted MU, as shown in formula (31):

MU(P, Q) = P ⊙ Q   (31)

wherein P and Q are two different vectors, and the operation multiplies the P and Q vectors bit-wise;
in the following description, the AB symbol represents the operation of formula (30), the MU symbol represents the operation of formula (31), the SOA symbol represents the operation of formula (3), the MUA symbol represents the operation of formula (4), the SUA symbol represents the operation of formula (5), and the SEA symbol represents the operation of formula (6);
the feature fusion layer is divided into two sub-modules, the first sub-module combines various related features, and the second sub-module performs various matching operations to obtain a final matching feature vector;
The first sub-module combines multiple related features:

At the character granularity, the character-granularity-level high-level feature P'_c of the text P in formula (13) is concatenated with the character-granularity-level high-level interaction feature of the text P in formula (25) to obtain the character-granularity-level aggregation feature of the text P, and self-attention is applied to this aggregation feature to obtain the character-granularity-level deep aggregation feature of the text P, as shown in formula (32);

The word granularity is handled analogously: the word-granularity-level high-level feature P'_w of the text P in formula (17) is concatenated with the word-granularity-level high-level interaction feature of the text P in formula (28) to obtain the word-granularity-level aggregation feature of the text P, and self-attention is applied to it to obtain the word-granularity-level deep aggregation feature of the text P, as shown in formula (33);

Then the character-granularity-level deep aggregation feature of formula (32) and the word-granularity-level deep aggregation feature of formula (33) are concatenated and a max-pooling operation is performed to obtain the pooled semantic feature P' of the text P, as shown in formula (34);

Next, the deep semantic feature of the text P in formula (18) is concatenated with the deep semantic interaction feature of the text P in formula (29) to obtain the deep aggregation feature of the text P, as shown in formula (35);

Similarly, the same operations as for the text P are performed on the text Q to obtain the character-granularity-level aggregation feature, the character-granularity-level deep aggregation feature, the word-granularity-level aggregation feature, the word-granularity-level deep aggregation feature, the pooled semantic feature Q' and the deep aggregation feature of the text Q;

Then the deep aggregation feature of the text P in formula (35) and the deep aggregation feature of the text Q undergo soft alignment attention to obtain the soft-aligned deep aggregation features of the text P and of the text Q, as shown in formula (36);

Then a max-pooling operation is performed on the soft-aligned deep aggregation feature of the text P in formula (36) to obtain the pooled deep aggregation feature P'' of the text P, and a max-pooling operation is performed on the soft-aligned deep aggregation feature of the text Q to obtain the pooled deep aggregation feature Q'' of the text Q, as shown in formula (37);
The second sub-module performs multiple matching operations to obtain the final matching feature vector:

First, the pooled semantic feature P' of the text P and the pooled semantic feature Q' of the text Q in formula (34) undergo bit-wise subtraction followed by the absolute value to obtain the subtraction matching feature PQ_ab, as shown in formula (38):

PQ_ab = AB(P', Q')   (38)

Second, the pooled semantic features P' and Q' in formula (34) undergo bit-wise multiplication to obtain the dot-product matching feature PQ_mu, as shown in formula (39):

PQ_mu = MU(P', Q')   (39)

Third, the pooled deep aggregation feature P'' of the text P and the pooled deep aggregation feature Q'' of the text Q in formula (37) undergo bit-wise subtraction followed by the absolute value to obtain the deep subtraction matching feature PQ'_ab, as shown in formula (40):

PQ'_ab = AB(P'', Q'')   (40)

Then, the pooled deep aggregation features P'' and Q'' in formula (37) undergo bit-wise multiplication to obtain the deep dot-product matching feature PQ'_mu, as shown in formula (41):

PQ'_mu = MU(P'', Q'')   (41)

Finally, the pooled semantic feature P' of the text P and the pooled semantic feature Q' of the text Q in formula (34), the subtraction matching feature PQ_ab in formula (38), the dot-product matching feature PQ_mu in formula (39), the deep subtraction matching feature PQ'_ab in formula (40) and the deep dot-product matching feature PQ'_mu in formula (41) are concatenated to obtain the final matching feature vector F, as shown in formula (42):

F = [P'; Q'; PQ_ab; PQ_mu; PQ'_ab; PQ'_mu]   (42)
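The matching operations of formulas (30), (31) and (38)-(42) reduce to element-wise arithmetic and concatenation; a minimal numpy sketch, with the argument names assumed:

```python
# Minimal sketch of the feature fusion matching operations (argument names are assumptions).
import numpy as np

def match_vector(P1, Q1, P2, Q2):
    """P1/Q1: pooled semantic features P', Q'; P2/Q2: pooled deep aggregation features P'', Q''."""
    pq_ab = np.abs(P1 - Q1)       # AB, formulas (30) and (38)
    pq_mu = P1 * Q1               # MU, formulas (31) and (39)
    pq_ab_deep = np.abs(P2 - Q2)  # deep subtraction matching feature, formula (40)
    pq_mu_deep = P2 * Q2          # deep dot-product matching feature, formula (41)
    return np.concatenate([P1, Q1, pq_ab, pq_mu, pq_ab_deep, pq_mu_deep], axis=-1)  # F, formula (42)
```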
More preferably, the implementation details of the prediction layer are as follows:

The final matching feature vector F is used as input to three fully connected layers; a ReLU activation function is used after the first and second fully connected layers and a sigmoid function after the third, yielding a value in [0, 1] that is taken as the matching degree and recorded as y_pred; finally, whether the text semantics match is judged by comparing y_pred with the set threshold of 0.5, i.e. when y_pred >= 0.5 the text semantics are predicted to match, otherwise they do not match; when the text semantic matching model has not been trained, it needs to be trained on a training data set constructed from the semantic matching knowledge base to optimize the model parameters; after the model is trained, the prediction layer can predict whether the semantics of the target texts match.
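A Keras sketch of the prediction layer just described; the hidden sizes are assumptions, while the ReLU/ReLU/sigmoid arrangement and the 0.5 threshold follow the text:

```python
# Keras sketch of the prediction layer (hidden sizes are assumptions).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_prediction_head(match_dim):
    f = layers.Input(shape=(match_dim,), name="matching_feature_vector")
    h = layers.Dense(256, activation="relu")(f)
    h = layers.Dense(128, activation="relu")(h)
    y_pred = layers.Dense(1, activation="sigmoid")(h)   # matching degree in [0, 1]
    return models.Model(f, y_pred)

# At prediction time the two texts are judged to match when y_pred >= 0.5.
```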
Preferably, the construction of the text semantic matching knowledge base comprises downloading a data set from the network to obtain raw data, preprocessing the raw data, and summarizing the sub-knowledge bases;
downloading a data set on a network to obtain original data: downloading a text semantic matching data set or a manually constructed data set which is already disclosed on a network, and taking the data set as original data for constructing a text semantic matching knowledge base;
preprocessing raw data: preprocessing original data used for constructing a text semantic matching knowledge base, and performing word breaking operation and word segmentation operation on each text to obtain a text semantic matching word breaking processing knowledge base and a word segmentation processing knowledge base;
summarizing the sub-knowledge base: summarizing a text semantic matching word-breaking processing knowledge base and a text semantic matching word-segmentation processing knowledge base to construct a text semantic matching knowledge base;
the text semantic matching model is obtained by training through a training data set, and the construction process of the training data set comprises the steps of constructing a training positive case, constructing a training negative case and constructing a training data set;
constructing training positive examples: for each pair of texts in the text semantic matching knowledge base whose semantics are consistent, the pair can be used to construct a training positive example;
constructing training negative examples: a text txtP is selected, a text txtQ that does not match txtP is randomly selected from the text semantic matching knowledge base, and txtP and txtQ are combined to construct a negative example;
constructing a training data set: combining all positive example data and negative example data obtained after the operations of constructing the training positive example and constructing the training negative example, and disordering the sequence of the positive example data and the negative example data to construct a final training data set;
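A minimal sketch of this training set construction, assuming the knowledge base is available as a list of semantically consistent (txtP, txtQ) pairs plus a pool of candidate texts:

```python
# Sketch of training data set construction: positives from consistent pairs, negatives by
# random pairing with a non-matching text, then shuffling (data layout is assumed).
import random

def build_training_set(positive_pairs, all_texts):
    data = [(p, q, 1) for p, q in positive_pairs]
    matches = {}
    for p, q in positive_pairs:
        matches.setdefault(p, set()).add(q)
    for p, _ in positive_pairs:
        q_neg = random.choice(all_texts)
        while q_neg in matches.get(p, set()) or q_neg == p:   # assumes non-matching texts exist
            q_neg = random.choice(all_texts)
        data.append((p, q_neg, 0))
    random.shuffle(data)   # disorder the order of positive and negative examples
    return data
```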
after the text semantic matching model is built, training and optimizing the text semantic matching model through a training data set, which specifically comprises the following steps:
constructing a loss function: as known from the prediction layer implementation, y_pred is the matching degree value calculated by the text semantic matching model; y_true is the true label indicating whether the semantics of the two texts match, and its value is limited to 0 or 1; cross entropy is used as the loss function;
constructing an optimization function: using Adam optimization functions; and performing optimization training on the text semantic matching model on the training data set.
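Assuming the model is assembled from the sketches above, the loss function and optimization function could be attached with a standard Keras compile/fit call; the hyperparameters shown are assumptions:

```python
# Sketch of model training: binary cross-entropy loss and the Adam optimization function
# (hyperparameters are assumptions; `model` and the id arrays come from the earlier sketches).
import tensorflow as tf

def compile_and_train(model, inputs, y_true):
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model.fit(inputs, y_true, batch_size=64, epochs=10, validation_split=0.1)
```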
A text semantic matching device for medical intelligent question answering comprises a text semantic matching knowledge base building unit, a training data set generating unit, a text semantic matching model building unit and a text semantic matching model training unit;
the specific function of each cell of the summary text knowledge base is as follows:
the text semantic matching knowledge base construction unit is used for acquiring a large amount of text data and then preprocessing the text data so as to acquire a text semantic matching knowledge base which meets the training requirement;
the training data set generating unit is used for matching data in the text semantic matching knowledge base, if the semantics of the data are consistent, the text is used for constructing a training positive example, otherwise, the text is used for constructing a training negative example, and all the positive example data and the negative example data are mixed to obtain a training data set;
the text semantic matching model building unit is used for building a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer;
and the text semantic matching model training unit is used for constructing a training loss function and an optimization function and finishing the training of the model.
A storage medium having stored therein a plurality of instructions, the instructions being loadable by a processor and adapted to perform the steps of the above-described text semantic matching method for medical intelligent question answering.
An electronic device, the electronic device comprising:
the storage medium described above; and
a processor to execute the instructions in the storage medium.
The text semantic matching method and device for medical intelligent question answering have the following advantages:
firstly, embedding operations are carried out at the character and word granularities of the text, so that the semantic information contained at different granularities of the text is extracted and the extracted semantic features are more detailed and abundant;
secondly, the text is semantically encoded through a bidirectional long short-term memory network, so that the bidirectional semantic dependencies of the text can be better captured;
thirdly, by constructing the fine-grained feature extraction layer, semantic features of different granularities and different levels can be captured, so that semantic features of more granularities and deeper levels are extracted as far as possible;
fourthly, the texts are semantically encoded through attention mechanisms, so that the dependencies between texts and between granularities within a text can be effectively captured, the generated text matching tensor has rich interaction features, and the prediction accuracy of the model is improved;
fifthly, through the max-pooling operation, invalid information in the matching tensor can be effectively filtered and effective information can be strengthened, so that the matching process is more accurate and the accuracy of text semantic matching is improved.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a text semantic matching method for medical intelligent question answering;
FIG. 2 is a flow chart for building a text semantic matching knowledge base;
FIG. 3 is a flow chart for constructing a training data set;
FIG. 4 is a flow chart of building a text semantic matching model;
FIG. 5 is a flow chart of training a text semantic matching model;
FIG. 6 is a schematic diagram of a semantic coding layer model (taking text P as an example);
FIG. 7 is a schematic diagram of a structure for extracting fine-grained semantic features of the same text (taking the text P as an example);
FIG. 8 is a schematic diagram of a structure for extracting semantic interaction features between texts;
FIG. 9 is a schematic view of a feature fusion layer;
FIG. 10 is a schematic structural diagram of a text semantic matching device for medical intelligent question answering
Detailed Description
The text semantic matching method and device for medical intelligent question answering according to the invention are described in detail below with reference to the drawings and specific embodiments of the specification.
Example 1:
the main framework structure of the invention comprises an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer. The embedding layer carries out embedding operation on the input text according to the word granularity and the word granularity respectively, and outputs text word embedding representation and word embedding representation. The semantic coding layer structure is shown in fig. 6, taking a text P as an example, receiving a text P word embedded representation and a word embedded representation, coding and outputting text P word and word granularity features by using a bidirectional long-short term memory network BiLSTM, and transmitting the text P word and word granularity features to a multi-level fine granularity feature extraction layer. The multi-level fine-grained feature extraction layer comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text as shown in fig. 7, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text for different granularities of the same text; the second sub-module, as shown in fig. 8, is responsible for extracting semantic interactive features between texts, and mainly obtains the semantic interactive features between texts by using a plurality of layers of coding structures between texts; in the first sub-module, as shown in fig. 7, taking a text P as an example, the first layer of coding structure uses multiple attention modules to extract the initial semantic features of the same text with fine granularity, which specifically includes: firstly, carrying out soft alignment attention on the granularity characteristic of a text P word and the granularity characteristic of a text P word by using soft alignment attention to obtain the soft alignment characteristic of the text P at the word granularity level and the soft alignment characteristic of the text P at the word granularity level, secondly, carrying out multiplication alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text P word by using multiplication alignment attention to obtain the multiplication alignment characteristic of the text P at the word granularity level and the multiplication alignment characteristic of the text P at the word granularity level, and then carrying out subtraction alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text P word to obtain the subtraction alignment characteristic of the text P at the word granularity level and the subtraction alignment characteristic of the text P at the word granularity level by using subtraction alignment attention to complete the extraction of the initial semantic characteristic of the fine granularity of the same text; the second layer of coding structure enhances the fine-grained initial semantic features of the same text to complete the extraction of the fine-grained semantic features of the same text, and specifically comprises the following steps: firstly, under the word granularity, adding a text P soft alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep soft alignment feature at the word granularity level, then adding a text P multiplication alignment feature at the word granularity level and a text P word granularity feature 
to obtain a text P deep multiplication alignment feature at the word granularity level, then adding a text P subtraction alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep subtraction alignment feature at the word granularity level, and then connecting the text P deep soft alignment feature at the word granularity level, the text P deep multiplication alignment feature at the word granularity level and the text P deep subtraction alignment feature at the word granularity level to obtain a text P high-level feature at the word granularity level; the word granularity is similar to the word granularity, firstly, adding a text P soft alignment feature at a word granularity level and a text P word granularity feature to obtain a text P deep soft alignment feature at the word granularity level, then adding a text P multiplication alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep multiplication alignment feature at the word granularity level, then adding a text P subtraction alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep subtraction alignment feature at the word granularity level, then connecting the text P deep soft alignment feature at the word granularity level, the text P deep multiplication alignment feature at the word granularity level and the text P deep subtraction alignment feature at the word granularity level to obtain a text P high-level feature at the word granularity level, and connecting the text P deep soft alignment feature at the word granularity level and the text P deep soft alignment feature at the word granularity level to obtain a text P semantic deep soft alignment feature, namely extracting fine granularity features of the same text; in the second sub-module, as shown in fig. 
8, the first layer of coding structure applies multiple alignment attention operations between the two texts to extract the initial semantic interaction features between texts, which specifically includes: at the character granularity, soft alignment attention is first performed between the text P character granularity feature and the text Q character granularity feature to obtain the text P soft alignment interactive feature and the text Q soft alignment interactive feature at the character granularity level, and subtraction alignment attention is then performed between the same two features to obtain the text P subtraction alignment interactive feature and the text Q subtraction alignment interactive feature at the character granularity level; at the word granularity, the same soft alignment attention and subtraction alignment attention are performed between the text P word granularity feature and the text Q word granularity feature to obtain the text P and text Q soft alignment interactive features and subtraction alignment interactive features at the word granularity level.
The second layer of coding structure enhances the initial semantic interaction features between texts to complete the extraction of the inter-text semantic interaction features, which specifically includes: at the character granularity, the text P soft alignment interactive feature at the character granularity level is added to the text P character granularity feature to obtain the text P deep soft alignment interactive feature at the character granularity level, the text P subtraction alignment interactive feature at the character granularity level is added to the text P character granularity feature to obtain the text P deep subtraction alignment interactive feature at the character granularity level, and the two deep interactive features are concatenated to obtain the text P high-level interactive feature at the character granularity level; the word granularity is handled in the same way to obtain the text P deep soft alignment interactive feature, the text P deep subtraction alignment interactive feature and the text P high-level interactive feature at the word granularity level; the text P deep subtraction alignment interactive features at the character granularity level and at the word granularity level are then concatenated to obtain the text P deep semantic interactive feature. The text Q is processed in the same way as the text P to obtain the text Q high-level interactive features at the character and word granularity levels and the text Q deep semantic interactive feature, which completes the extraction of the inter-text semantic interaction features.
In the feature fusion layer, as shown in fig. 9, the first sub-module merges multiple related features and the second sub-module performs multiple matching operations to obtain the final matching feature vector. Taking the text P as an example, the first sub-module proceeds as follows: at the character granularity, the text P high-level feature at the character granularity level is concatenated with the text P high-level interactive feature at the character granularity level to obtain the text P aggregation feature at the character granularity level, and self-attention is applied to it to obtain the text P deep aggregation feature at the character granularity level; the word granularity is handled in the same way to obtain the text P deep aggregation feature at the word granularity level; the two deep aggregation features are then concatenated and a maximum pooling operation is applied to obtain the pooled text P semantic feature, and the text Q is processed in the same way to obtain the pooled text Q semantic feature. Next, the text P deep semantic interactive feature is concatenated with the text P deep semantic feature to obtain the text P deep polymerization feature, and the same operation on the text Q gives the text Q deep polymerization feature; soft alignment attention is performed between the text P and text Q deep polymerization features to obtain the soft-aligned deep polymerization features, and a maximum pooling operation is applied to each of them to obtain the pooled text P deep polymerization feature and the pooled text Q deep polymerization feature. The second sub-module performs multiple matching operations to obtain the final matching feature vector, specifically: the absolute bit-wise difference of the pooled text P semantic feature and the pooled text Q semantic feature gives the subtraction matching feature; their bit-wise multiplication gives the point multiplication matching feature; the absolute bit-wise difference of the pooled text P deep polymerization feature and the pooled text Q deep polymerization feature gives the deep subtraction matching feature; their bit-wise multiplication gives the deep point multiplication matching feature; finally, the pooled text P semantic feature, the pooled text Q semantic feature, the subtraction matching feature, the point multiplication matching feature, the deep subtraction matching feature and the deep point multiplication matching feature are concatenated to obtain the final matching feature vector.
The prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result. The method comprises the following specific steps:
(1) The embedding layer carries out embedding operation on the input text according to the character granularity and the word granularity respectively, and outputs text character embedding representation and word embedding representation;
(2) The semantic coding layer receives the text character embedding representation and word embedding representation, encodes them with a bidirectional long short-term memory network BiLSTM, and outputs the text character and word granularity features;
(3) The multilevel fine-grained feature extraction layer performs the same text and text inter-coding operation on the text character and word granularity features output by the semantic coding layer to obtain the same text fine-grained semantic features and the text inter-semantic interactive features;
(4) The feature fusion layer combines various related features, and then performs various matching operations to obtain a final matching feature vector;
(5) And the prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result.
Example 2:
as shown in the attached figure 1, the text semantic matching method for the medical intelligent question answering specifically comprises the following steps:
s1, constructing a text semantic matching knowledge base, as shown in the attached figure 2, and specifically comprising the following steps:
s101, downloading a data set on a network to obtain original data: downloading a text semantic matching data set or a manually constructed data set which is already disclosed on a network, and taking the text semantic matching data set or the manually constructed data set as original data for constructing a text semantic matching knowledge base;
examples are: a plurality of published text semantic matching data sets oriented to medical intelligent question answering exist on the network, and a plurality of question-answer data pairs also exist in many medical community forums;
text pairs for the example, as follows:
txt P | What are the symptoms of a cold? |
txt Q | What symptoms can be judged to be a cold? |
S102, preprocessing original data: the original data used for constructing the text semantic matching knowledge base is preprocessed, and a word-breaking operation and a word segmentation operation are performed on each text to obtain a text semantic matching word-breaking processing knowledge base and a word segmentation processing knowledge base;
taking txt P shown in S101 as an example, the word-breaking operation splits it character by character to obtain the character-granularity form of "What are the symptoms of a cold?"; the Jieba word segmentation tool is then used to split the same text word by word to obtain its word-granularity form.
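For illustration, a minimal Python sketch of this preprocessing step, assuming the raw question is a Chinese string; the helper name and the space-joined output are assumptions, only the Jieba call is the tool named above:

import jieba

def preprocess(text):
    # Character-granularity view: split the question character by character
    chars = " ".join(list(text.replace(" ", "")))
    # Word-granularity view: split the same question with the Jieba segmenter
    words = " ".join(jieba.cut(text))
    return chars, words

# e.g. chars, words = preprocess("感冒表现的症状都有哪些？")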
S103, summarizing the sub-knowledge base: summarizing a text semantic matching word-breaking processing knowledge base and a text semantic matching word-segmentation processing knowledge base to construct a text semantic matching knowledge base;
the text semantic matching word-breaking processing knowledge base and the text semantic matching word segmentation processing knowledge base obtained in step S102 are collected under the same folder to obtain the text semantic matching knowledge base, the flow of which is shown in fig. 2; it should be noted that the data processed by the word-breaking operation and the data processed by the word segmentation operation are not merged into the same file, that is, the text semantic matching knowledge base actually comprises two independent sub-knowledge bases.
S2, constructing a text semantic matching model training data set: for each text pair in the text semantic matching knowledge base, if the semantics of the two texts are consistent, the pair is used to construct a training positive example; if the semantics are inconsistent, the pair is used to construct a training negative example; a certain amount of positive example data and negative example data are mixed to construct the training data set required by the model; as shown in fig. 3, the specific steps are as follows:
S201, constructing training positive example data: two texts with consistent semantics are constructed into positive example data, which is formalized as: (txt P_char, txt Q_char, txt P_word, txt Q_word, 1);
examples are: after the txt P and txt Q displayed in step S101 are subjected to the word-breaking operation and word segmentation operation of step S102, the constructed positive example data takes the following form:
(What are the symptoms of a cold? [character-segmented], What symptoms can be judged to be a cold? [character-segmented], What are the symptoms of a cold? [word-segmented], What symptoms can be judged to be a cold? [word-segmented], 1)
S202, constructing training negative example data: a certain text contained in the knowledge base is selected, and another text that does not match it is randomly selected for combination; the two texts with inconsistent semantics are constructed into negative example data, which, with an operation similar to step S201, can be formalized as: (txt P_char, txt Q_char, txt P_word, txt Q_word, 0), where each symbol has the same meaning as in step S201 and 0 represents that the semantics of the two texts do not match, i.e. a negative example;
examples are: the construction is very similar to that of the training positive example data and is not described in detail here.
S203, constructing a training data set: all positive example data and negative example data obtained after the operations of the steps S201 and S202 are combined together, the sequence of the positive example data and the negative example data is disordered, and a final training data set is constructed, wherein the positive example data and the negative example data both comprise 5 dimensions, namely txt P _ char, txt Q _ char, txt P _ word, txt Q _ word,0 or 1.
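For illustration, a minimal Python sketch of steps S201–S203, assuming the matched and unmatched text pairs are already available as lists of 4-tuples; the variable and function names are illustrative:

import random

def build_dataset(matched_pairs, unmatched_pairs):
    # Each pair is (txtP_char, txtQ_char, txtP_word, txtQ_word)
    positives = [(*p, 1) for p in matched_pairs]    # label 1: semantics match
    negatives = [(*p, 0) for p in unmatched_pairs]  # label 0: semantics do not match
    dataset = positives + negatives
    random.shuffle(dataset)                         # disorder the order of positive and negative examples
    return dataset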
S3, constructing a text semantic matching model: the method mainly comprises the steps of constructing a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; as shown in fig. 4, the specific steps are as follows:
S301, constructing a character and word mapping conversion table: the character and word list is built from the text semantic matching word-breaking processing knowledge base and word segmentation processing knowledge base obtained after the processing of step S102; after the list is constructed, each character or word in it is mapped to a unique numerical identifier, with the following mapping rule: starting from the number 1, characters and words are numbered incrementally in the order in which they are recorded in the list, thereby forming the character and word mapping conversion table required by the invention;
examples are as follows: using the character-segmented form and the word-segmented form of "What are the symptoms of a cold?" obtained in step S102, the character and word list and the mapping conversion table are constructed as follows:
| Character | 感 | 冒 | 表 | 现 | 的 | 症 | 状 | 都 | 有 | 哪 | 些 | ？ |
| Mapping | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| Word | 感冒 | 表现 | 症状 | 都有 | 哪些 |
| Mapping | 13 | 14 | 15 | 16 | 17 |
where the word-granularity tokens 的 and ？ reuse the character mappings 5 and 12.
Then, Word2Vec is used to train the word vector model and obtain the word vector matrix char_embedding_matrix of each character and word;
for example, the following steps are carried out: in Keras, the following is implemented for the code described above:
from gensim import models
from keras.preprocessing.text import Tokenizer
import numpy as np

# Train the Word2Vec model on the whole text semantic matching knowledge base
w2v_model = models.Word2Vec(w2v_corpus, size=EMB_DIM, window=5, min_count=1, sg=1, workers=4, seed=1234, iter=25)
# Fit the tokenizer so that every character/word receives a numerical identifier
tokenizer = Tokenizer(num_words=len(word_set))
tokenizer.fit_on_texts(w2v_corpus)
# Fill the embedding matrix row by row according to the mapping conversion table
embedding_matrix = np.zeros([len(tokenizer.word_index) + 1, EMB_DIM])
for word, idx in tokenizer.word_index.items():
    embedding_matrix[idx, :] = w2v_model.wv[word]
wherein w2v_corpus is all the data in the text semantic matching knowledge base, EMB_DIM is the word vector dimension, which the model sets to 300, and word_set is the character and word list.
S302, constructing an input layer: the input layer comprises four inputs; for each sample of the training data set, txt P_char, txt Q_char, txt P_word and txt Q_word are obtained and formalized as: (txt P_char, txt Q_char, txt P_word, txt Q_word);
for each character and word in the input text, the invention converts the character and word into corresponding numerical identifiers according to the word mapping conversion table constructed in the step S301;
for example, the following steps are carried out: using the text shown in step S201 as a sample, a piece of input data is formed, and the result is as follows:
( "are all symptoms of the common cold present? "," which symptoms are manifested can be judged as a cold? "," what are all symptoms of the cold? "," which symptoms are manifested are judged to be colds? " )
Each piece of input data contains 4 sub-texts; based on the word mapping conversion table in step S301, which is converted into a numerical representation (assuming that "may", "to", "decide", "to", "may", "to", and "to" which "appear in txt Q but do not appear in txt P are mapped to 18, 19, 20, 21, 22, 23, 24, respectively), 4 sub-texts of data are input, and the combined representation results are as follows:
(“1,2,3,4,5,6,7,8,9,10,11,12”,“3,4,10,11,6,7,18,19,20,21,22,1,2,12”,“13,14,5,15,16,17,12”,“14,17,15,23,24,22,13,12”)。
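For illustration, a minimal sketch of this numericalization step in Keras, reusing the tokenizer fitted in step S301 on the space-separated character and word corpora; the variable names are illustrative:

from keras.preprocessing.sequence import pad_sequences

# Convert the four space-separated sub-texts of one sample into id sequences
sample = [txtP_char, txtQ_char, txtP_word, txtQ_word]
sample_ids = tokenizer.texts_to_sequences(sample)
# Pad every sequence to the fixed input length expected by the model
sample_ids = pad_sequences(sample_ids, maxlen=input_dim, padding='post')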
S303, constructing a word vector mapping layer: the weight parameters of this layer are initialized by loading the word vector matrix trained in the step of constructing the character and word mapping conversion table; for the input texts txt P_char, txt Q_char, txt P_word and txt Q_word, the corresponding text character embedding representations and word embedding representations txt P_char_embedded, txt Q_char_embedded, txt P_word_embedded and txt Q_word_embedded are obtained; in this way, every text in the text semantic matching knowledge base converts its text information into vector form through word vector mapping;
for example, the following steps are carried out: in Keras, the following is implemented for the code described above:
from keras.layers import Embedding

# Embedding layer shared by the four inputs, initialized with the pre-trained word vector matrix and frozen during training
embedding_layer = Embedding(embedding_matrix.shape[0], emb_dim, weights=[embedding_matrix], input_length=input_dim, trainable=False)

wherein embedding_matrix is the word vector matrix obtained by training in the step of constructing the character and word mapping conversion table, embedding_matrix.shape[0] is the vocabulary size of the word vector matrix, emb_dim is the dimension of the output text character embedding representations and word embedding representations, and input_length is the length of the input sequence;
the corresponding texts txt P_char, txt Q_char, txt P_word and txt Q_word are processed by the Embedding layer of Keras to obtain the corresponding text character embedding representations and word embedding representations txt P_char_embedded, txt Q_char_embedded, txt P_word_embedded and txt Q_word_embedded.
S304, constructing a semantic coding layer:
taking the text P as an example, this module receives the text P character and word embedding representations and encodes them with a bidirectional long short-term memory network BiLSTM to obtain the text P character and word granularity features, denoted $P_c=\{p_i^c\}_{i=1}^{N}$ and $P_w=\{p_j^w\}_{j=1}^{N}$. The concrete formulas are as follows:

$$p_i^c=\mathrm{BiLSTM}(txtP\_char\_embedded)_i=\big[\overrightarrow{\mathrm{LSTM}}(txtP\_char\_embedded)_i;\overleftarrow{\mathrm{LSTM}}(txtP\_char\_embedded)_i\big] \tag{1}$$

$$p_j^w=\mathrm{BiLSTM}(txtP\_word\_embedded)_j=\big[\overrightarrow{\mathrm{LSTM}}(txtP\_word\_embedded)_j;\overleftarrow{\mathrm{LSTM}}(txtP\_word\_embedded)_j\big] \tag{2}$$

where $N$ denotes the length of the character granularity features and the word granularity features. Equation (1) encodes the text P character embedding representation with the bidirectional long short-term memory network BiLSTM: $p_i^c$ denotes the i-th position character granularity feature of the text P obtained by BiLSTM encoding, $\overrightarrow{\mathrm{LSTM}}(\cdot)_i$ denotes the i-th position character granularity feature obtained by the forward long short-term memory network LSTM, and $\overleftarrow{\mathrm{LSTM}}(\cdot)_i$ denotes the i-th position character granularity feature obtained by the backward LSTM. The symbols in equation (2) have essentially the same meaning at the word granularity: $p_j^w$ denotes the j-th position word granularity feature of the text P obtained by BiLSTM encoding, composed of the corresponding forward and backward LSTM outputs.
Similarly, the text Q is processed in the same way as the text P to obtain the text Q character and word granularity features, denoted $Q_c$ and $Q_w$.
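For illustration, a minimal Keras sketch of this semantic coding layer, assuming the embedded representations from S303 are available; the hidden size 300 and the sharing of a single encoder across texts and granularities are assumptions, not values fixed by the text:

from keras.layers import Bidirectional, LSTM

# BiLSTM encoder producing one feature vector per character / word position
bilstm = Bidirectional(LSTM(300, return_sequences=True))
P_c = bilstm(txtP_char_embedded)  # text P character granularity features
P_w = bilstm(txtP_word_embedded)  # text P word granularity features
Q_c = bilstm(txtQ_char_embedded)  # text Q character granularity features
Q_w = bilstm(txtQ_word_embedded)  # text Q word granularity features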
S305, constructing a multi-level fine-grained feature extraction layer:
the multilevel fine-grained feature extraction layer takes the granularity features of the text characters and words output by the semantic coding layer as input; performing encoding operation between the same text and the same text to obtain fine-grained semantic features of the same text and semantic interaction features between the same text; the method comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text according to different granularities of the same text, as shown in FIG. 7; the second sub-module is responsible for extracting semantic interactive features between texts, and mainly obtains the semantic interactive features between the texts by using a plurality of layers of coding structures between the texts, as shown in fig. 8.
S30501, extracting fine-grained semantic features of the same text of a first sub-module:
first, for convenience of subsequent description, taking the text P as an example, the following attention modules are defined:
defining a soft alignment attention module, denoted SOA, with the following formula:

$$e_{ij}=(p_i^c)^{\top}p_j^w,\qquad \tilde{p}_i^c=\sum_{j=1}^{N}\frac{\exp(e_{ij})}{\sum_{k=1}^{N}\exp(e_{ik})}\,p_j^w,\qquad \tilde{p}_j^w=\sum_{i=1}^{N}\frac{\exp(e_{ij})}{\sum_{k=1}^{N}\exp(e_{kj})}\,p_i^c \tag{3}$$

where $p_i^c$ denotes the i-th position character granularity feature of the text P from equation (1), $p_j^w$ denotes the j-th position word granularity feature of the text P from equation (2), $e_{ij}$ denotes the soft alignment attention weight between the i-th position character granularity feature and the j-th position word granularity feature of the text P, the softmax operation maps the soft alignment attention weights to values between 0 and 1, $\tilde{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using soft alignment attention, and $\tilde{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using soft alignment attention;
defining a multiplication alignment attention module, denoted MUA, with the following formula:

$$P_c^{\mathrm{D}}=\tanh\big(\mathrm{TimeDistributed}(\mathrm{Dense}())(P_c)\big),\qquad m_{ij}=(p_i^{c,\mathrm{D}})^{\top}p_j^w,\qquad \hat{p}_i^c=\sum_{j=1}^{N}\frac{\exp(m_{ij})}{\sum_{k=1}^{N}\exp(m_{ik})}\,p_j^w,\qquad \hat{p}_j^w=\sum_{i=1}^{N}\frac{\exp(m_{ij})}{\sum_{k=1}^{N}\exp(m_{kj})}\,p_i^c \tag{4}$$

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, ⊙ denotes the bit-wise (alignment) multiplication whose sum over dimensions gives the weight $m_{ij}$, tanh denotes the activation function, $P_c$ denotes the text P character granularity features, $P_c^{\mathrm{D}}$ denotes the text P character granularity features after processing by the Dense() layer, $P_w$ denotes the text P word granularity features, $m_{ij}$ denotes the multiplication alignment attention weight, the softmax operation maps the multiplication alignment attention weights to values between 0 and 1, $\hat{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using multiplication alignment attention, and $\hat{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using multiplication alignment attention;

defining a subtraction alignment attention module, denoted SUA, with the following formula:

$$P_c^{\mathrm{D}}=\tanh\big(\mathrm{TimeDistributed}(\mathrm{Dense}())(P_c)\big),\qquad s_{ij}=-\big\|p_i^{c,\mathrm{D}}-p_j^w\big\|_1,\qquad \check{p}_i^c=\sum_{j=1}^{N}\frac{\exp(s_{ij})}{\sum_{k=1}^{N}\exp(s_{ik})}\,p_j^w,\qquad \check{p}_j^w=\sum_{i=1}^{N}\frac{\exp(s_{ij})}{\sum_{k=1}^{N}\exp(s_{kj})}\,p_i^c \tag{5}$$

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, − denotes the bit-wise subtraction whose aggregated magnitude gives the weight $s_{ij}$, tanh denotes the activation function, $P_c$ denotes the text P character granularity features, $P_c^{\mathrm{D}}$ denotes the text P character granularity features after processing by the Dense() layer, $P_w$ denotes the text P word granularity features, $s_{ij}$ denotes the subtraction alignment attention weight, the softmax operation maps the subtraction alignment attention weights to values between 0 and 1, $\check{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using subtraction alignment attention, and $\check{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using subtraction alignment attention;
defining a self-alignment attention module, denoted SEA, with the following formula:

$$a_{ij}=(p_i)^{\top}p_j,\qquad \dot{p}_i=\sum_{j=1}^{N}\frac{\exp(a_{ij})}{\sum_{k=1}^{N}\exp(a_{ik})}\,p_j \tag{6}$$

where $p_i$ and $p_j$ denote the i-th and j-th position features of the text P at a given granularity, $a_{ij}$ denotes the self-alignment attention weight between the i-th position feature and the j-th position feature of the text P, the softmax operation maps the self-alignment attention weights to values between 0 and 1, and $\dot{p}_i$ indicates that the i-th position feature of the text P is re-expressed by the weighted sum of all features of the text P using self-alignment attention;
in the following description, the SOA symbol is used to represent the operation of formula (3), the MUA symbol is used to represent the operation of formula (4), the SUA symbol is used to represent the operation of formula (5), and the SEA symbol is used to represent the operation of formula (6);
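For illustration, a minimal NumPy sketch of the soft alignment attention SOA defined in equation (3); the function name and the use of NumPy instead of Keras layers are assumptions made for compactness:

import numpy as np

def soft_align_attention(X, Y):
    """SOA of equation (3): cross re-expression of two feature sequences X:(N, d) and Y:(M, d)."""
    e = X @ Y.T                                            # alignment weights e_ij
    a = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)   # softmax over the positions of Y
    b = np.exp(e) / np.exp(e).sum(axis=0, keepdims=True)   # softmax over the positions of X
    return a @ Y, b.T @ X                                  # re-expressed X and re-expressed Y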
S3050101, the first layer of coding structure uses a plurality of attention modules to extract the fine-grained initial semantic features of the same text:

first, using soft alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P soft alignment feature $\tilde{P}_c$ at the character granularity level and the text P soft alignment feature $\tilde{P}_w$ at the word granularity level, as shown in equation (7):

$$\tilde{P}_c,\ \tilde{P}_w=\mathrm{SOA}(P_c,P_w) \tag{7}$$

second, using multiplication alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P multiplication alignment feature $\hat{P}_c$ at the character granularity level and the text P multiplication alignment feature $\hat{P}_w$ at the word granularity level, as shown in equation (8):

$$\hat{P}_c,\ \hat{P}_w=\mathrm{MUA}(P_c,P_w) \tag{8}$$

then, using subtraction alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P subtraction alignment feature $\check{P}_c$ at the character granularity level and the text P subtraction alignment feature $\check{P}_w$ at the word granularity level, as shown in equation (9):

$$\check{P}_c,\ \check{P}_w=\mathrm{SUA}(P_c,P_w) \tag{9}$$

similarly, the text Q is processed in the same way as the text P to obtain the text Q soft alignment features $\tilde{Q}_c$, $\tilde{Q}_w$, the text Q multiplication alignment features $\hat{Q}_c$, $\hat{Q}_w$ and the text Q subtraction alignment features $\check{Q}_c$, $\check{Q}_w$ at the character and word granularity levels, which completes the extraction of the fine-grained initial semantic features of the same text;
s3050102, the second-layer coding structure enhances the fine-grained initial semantic features of the same text to complete extraction of the fine-grained semantic features of the same text:
first, the text P soft alignment feature $\tilde{P}_c$ at the character granularity level in equation (7) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep soft alignment feature $\tilde{P}_c^{\,d}$ at the character granularity level, as shown in equation (10):

$$\tilde{P}_c^{\,d}=\tilde{P}_c+P_c \tag{10}$$

then, the text P multiplication alignment feature $\hat{P}_c$ at the character granularity level in equation (8) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep multiplication alignment feature $\hat{P}_c^{\,d}$ at the character granularity level, as shown in equation (11):

$$\hat{P}_c^{\,d}=\hat{P}_c+P_c \tag{11}$$

then, the text P subtraction alignment feature $\check{P}_c$ at the character granularity level in equation (9) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep subtraction alignment feature $\check{P}_c^{\,d}$ at the character granularity level, as shown in equation (12):

$$\check{P}_c^{\,d}=\check{P}_c+P_c \tag{12}$$

then, the text P deep soft alignment feature at the character granularity level in equation (10), the text P deep multiplication alignment feature at the character granularity level in equation (11) and the text P deep subtraction alignment feature at the character granularity level in equation (12) are concatenated to obtain the text P high-level feature $P'_c$ at the character granularity level, as shown in equation (13):

$$P'_c=\big[\tilde{P}_c^{\,d};\hat{P}_c^{\,d};\check{P}_c^{\,d}\big] \tag{13}$$

the word granularity is handled in the same way as the character granularity: first, the text P soft alignment feature $\tilde{P}_w$ at the word granularity level in equation (7) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep soft alignment feature $\tilde{P}_w^{\,d}$ at the word granularity level, as shown in equation (14):

$$\tilde{P}_w^{\,d}=\tilde{P}_w+P_w \tag{14}$$

then, the text P multiplication alignment feature $\hat{P}_w$ at the word granularity level in equation (8) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep multiplication alignment feature $\hat{P}_w^{\,d}$ at the word granularity level, as shown in equation (15):

$$\hat{P}_w^{\,d}=\hat{P}_w+P_w \tag{15}$$

then, the text P subtraction alignment feature $\check{P}_w$ at the word granularity level in equation (9) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep subtraction alignment feature $\check{P}_w^{\,d}$ at the word granularity level, as shown in equation (16):

$$\check{P}_w^{\,d}=\check{P}_w+P_w \tag{16}$$

next, the text P deep soft alignment feature at the word granularity level in equation (14), the text P deep multiplication alignment feature at the word granularity level in equation (15) and the text P deep subtraction alignment feature at the word granularity level in equation (16) are concatenated to obtain the text P high-level feature $P'_w$ at the word granularity level, as shown in equation (17):

$$P'_w=\big[\tilde{P}_w^{\,d};\hat{P}_w^{\,d};\check{P}_w^{\,d}\big] \tag{17}$$

the text P deep soft alignment feature at the character granularity level in equation (10) and the text P deep soft alignment feature at the word granularity level in equation (14) are concatenated to obtain the text P deep semantic feature $P'_{deep}$, as shown in equation (18):

$$P'_{deep}=\big[\tilde{P}_c^{\,d};\tilde{P}_w^{\,d}\big] \tag{18}$$

similarly, the text Q is processed in the same way as the text P to obtain the text Q deep soft alignment, deep multiplication alignment and deep subtraction alignment features at the character and word granularity levels, the text Q high-level feature $Q'_c$ at the character granularity level, the text Q high-level feature $Q'_w$ at the word granularity level and the text Q deep semantic feature $Q'_{deep}$, which completes the extraction of the fine-grained semantic features of the same text;
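For illustration, a minimal Keras sketch of this second-layer enhancement at the character granularity, equations (10)–(13); the tensor names are illustrative and assume the first-layer attention outputs are available:

from keras.layers import Add, Concatenate

# Equations (10)-(12): add each first-layer alignment feature back to the original feature
P_c_soft_deep = Add()([P_c_soft, P_c])   # deep soft alignment feature
P_c_mul_deep  = Add()([P_c_mul, P_c])    # deep multiplication alignment feature
P_c_sub_deep  = Add()([P_c_sub, P_c])    # deep subtraction alignment feature
# Equation (13): concatenate the three deep features into the high-level feature P'_c
P_c_high = Concatenate()([P_c_soft_deep, P_c_mul_deep, P_c_sub_deep])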
s30502, semantic interaction feature extraction between texts:
S3050201, the first layer of coding structure extracts the initial semantic interaction features between texts:

at the character granularity, first, the text P character granularity feature $P_c$ in equation (1) and the text Q character granularity feature $Q_c$ are soft-aligned to obtain the text P soft alignment interactive feature $\overline{P}_c$ and the text Q soft alignment interactive feature $\overline{Q}_c$ at the character granularity level, as shown in equation (19):

$$\overline{P}_c,\ \overline{Q}_c=\mathrm{SOA}(P_c,Q_c) \tag{19}$$

second, the text P character granularity feature $P_c$ in equation (1) and the text Q character granularity feature $Q_c$ are aligned with subtraction alignment attention to obtain the text P subtraction alignment interactive feature $\underline{P}_c$ and the text Q subtraction alignment interactive feature $\underline{Q}_c$ at the character granularity level, as shown in equation (20):

$$\underline{P}_c,\ \underline{Q}_c=\mathrm{SUA}(P_c,Q_c) \tag{20}$$

the word granularity is handled in the same way as the character granularity: first, the text P word granularity feature $P_w$ in equation (2) and the text Q word granularity feature $Q_w$ are soft-aligned to obtain the text P soft alignment interactive feature $\overline{P}_w$ and the text Q soft alignment interactive feature $\overline{Q}_w$ at the word granularity level, as shown in equation (21):

$$\overline{P}_w,\ \overline{Q}_w=\mathrm{SOA}(P_w,Q_w) \tag{21}$$

then, the text P word granularity feature $P_w$ in equation (2) and the text Q word granularity feature $Q_w$ are aligned with subtraction alignment attention to obtain the text P subtraction alignment interactive feature $\underline{P}_w$ and the text Q subtraction alignment interactive feature $\underline{Q}_w$ at the word granularity level, as shown in equation (22):

$$\underline{P}_w,\ \underline{Q}_w=\mathrm{SUA}(P_w,Q_w) \tag{22}$$

S3050202, the second layer of coding structure enhances the initial semantic interaction features between texts and completes the extraction of the inter-text semantic interaction features:

at the character granularity, first, the text P soft alignment interactive feature $\overline{P}_c$ at the character granularity level in equation (19) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep soft alignment interactive feature $\overline{P}_c^{\,d}$ at the character granularity level, as shown in equation (23):

$$\overline{P}_c^{\,d}=\overline{P}_c+P_c \tag{23}$$

then, the text P subtraction alignment interactive feature $\underline{P}_c$ at the character granularity level in equation (20) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep subtraction alignment interactive feature $\underline{P}_c^{\,d}$ at the character granularity level, as shown in equation (24):

$$\underline{P}_c^{\,d}=\underline{P}_c+P_c \tag{24}$$

finally, the text P deep soft alignment interactive feature at the character granularity level in equation (23) and the text P deep subtraction alignment interactive feature at the character granularity level in equation (24) are concatenated to obtain the text P high-level interactive feature $P''_c$ at the character granularity level, as shown in equation (25):

$$P''_c=\big[\overline{P}_c^{\,d};\underline{P}_c^{\,d}\big] \tag{25}$$

at the word granularity, first, the text P soft alignment interactive feature $\overline{P}_w$ at the word granularity level in equation (21) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep soft alignment interactive feature $\overline{P}_w^{\,d}$ at the word granularity level, as shown in equation (26):

$$\overline{P}_w^{\,d}=\overline{P}_w+P_w \tag{26}$$

then, the text P subtraction alignment interactive feature $\underline{P}_w$ at the word granularity level in equation (22) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep subtraction alignment interactive feature $\underline{P}_w^{\,d}$ at the word granularity level, as shown in equation (27):

$$\underline{P}_w^{\,d}=\underline{P}_w+P_w \tag{27}$$

finally, the text P deep soft alignment interactive feature at the word granularity level in equation (26) and the text P deep subtraction alignment interactive feature at the word granularity level in equation (27) are concatenated to obtain the text P high-level interactive feature $P''_w$ at the word granularity level, as shown in equation (28):

$$P''_w=\big[\overline{P}_w^{\,d};\underline{P}_w^{\,d}\big] \tag{28}$$

the text P deep subtraction alignment interactive feature at the character granularity level in equation (24) and the text P deep subtraction alignment interactive feature at the word granularity level in equation (27) are concatenated to obtain the text P deep semantic interactive feature $P''_{deep}$, as shown in equation (29):

$$P''_{deep}=\big[\underline{P}_c^{\,d};\underline{P}_w^{\,d}\big] \tag{29}$$

similarly, the text Q is processed in the same way as the text P to obtain the text Q deep soft alignment and deep subtraction alignment interactive features at the character and word granularity levels, the text Q high-level interactive feature $Q''_c$ at the character granularity level, the text Q high-level interactive feature $Q''_w$ at the word granularity level and the text Q deep semantic interactive feature $Q''_{deep}$, which completes the extraction of the inter-text semantic interaction features.
S306, constructing a feature fusion layer:
firstly, for convenience of subsequent description, the following operations are defined:
the bit-wise absolute value of the difference of two vectors is defined as the AB operation, as shown in equation (30):
AB(P,Q)=|P−Q| (30)
where P and Q are two different vectors, and the operation subtracts the two vectors and then takes the absolute value bit by bit;
the bit-wise multiplication of two vectors is defined as the MU operation, as shown in equation (31):
MU(P,Q)=P⊙Q (31)
where P and Q are two different vectors, and the operation multiplies the two vectors bit by bit;
in the following description, the AB symbol represents the operation of formula (30), the MU symbol represents the operation of formula (31), the SOA symbol represents the operation of formula (3), the MUA symbol represents the operation of formula (4), the SUA symbol represents the operation of formula (5), and the SEA symbol represents the operation of formula (6);
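For illustration, a minimal Keras sketch of the AB and MU operations in equations (30) and (31); the Lambda-based formulation is one possible realization, and the variable names are illustrative:

from keras.layers import Lambda, Multiply
import keras.backend as K

# AB: bit-wise absolute difference of two vectors, equation (30)
ab = Lambda(lambda t: K.abs(t[0] - t[1]))
# MU: bit-wise multiplication of two vectors, equation (31)
mu = Multiply()

# usage on two pooled feature vectors p_vec and q_vec:
# sub_match = ab([p_vec, q_vec]); dot_match = mu([p_vec, q_vec])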
the feature fusion layer is divided into two sub-modules, the first sub-module combines multiple related features, and the second sub-module performs multiple matching operations to obtain a final matching feature vector, as shown in fig. 9.
S30601, combining a plurality of related features by a first submodule:
at the character granularity, the text P high-level feature $P'_c$ at the character granularity level in equation (13) and the text P high-level interactive feature $P''_c$ at the character granularity level in equation (25) are concatenated to obtain the text P aggregation feature $P_c^{agg}$ at the character granularity level, and self-attention is applied to it to obtain the text P deep aggregation feature $P_c^{agg,d}$ at the character granularity level, as shown in equation (32):

$$P_c^{agg}=\big[P'_c;P''_c\big],\qquad P_c^{agg,d}=\mathrm{SEA}(P_c^{agg}) \tag{32}$$

at the word granularity, similarly to the character granularity, the text P high-level feature $P'_w$ at the word granularity level in equation (17) and the text P high-level interactive feature $P''_w$ at the word granularity level in equation (28) are concatenated to obtain the text P aggregation feature $P_w^{agg}$ at the word granularity level, and self-attention is applied to it to obtain the text P deep aggregation feature $P_w^{agg,d}$ at the word granularity level, as shown in equation (33):

$$P_w^{agg}=\big[P'_w;P''_w\big],\qquad P_w^{agg,d}=\mathrm{SEA}(P_w^{agg}) \tag{33}$$

then, the text P deep aggregation feature at the character granularity level in equation (32) and the text P deep aggregation feature at the word granularity level in equation (33) are concatenated and a maximum pooling operation is applied to obtain the pooled text P semantic feature $P'$, as shown in equation (34):

$$P'=\mathrm{MaxPooling}\big(\big[P_c^{agg,d};P_w^{agg,d}\big]\big) \tag{34}$$

next, the text P deep semantic feature $P'_{deep}$ in equation (18) and the text P deep semantic interactive feature $P''_{deep}$ in equation (29) are concatenated to obtain the text P deep polymerization feature $P^{agg}$, as shown in equation (35):

$$P^{agg}=\big[P'_{deep};P''_{deep}\big] \tag{35}$$

similarly, the same operations are performed on the text Q to obtain the text Q aggregation features and deep aggregation features at the character and word granularity levels, the pooled text Q semantic feature $Q'$ and the text Q deep polymerization feature $Q^{agg}$;

then, soft alignment attention is performed between the text P deep polymerization feature $P^{agg}$ in equation (35) and the text Q deep polymerization feature $Q^{agg}$ to obtain the soft-aligned text P deep polymerization feature $\tilde{P}^{agg}$ and the soft-aligned text Q deep polymerization feature $\tilde{Q}^{agg}$, as shown in equation (36):

$$\tilde{P}^{agg},\ \tilde{Q}^{agg}=\mathrm{SOA}(P^{agg},Q^{agg}) \tag{36}$$

then, a maximum pooling operation is applied to the soft-aligned text P deep polymerization feature to obtain the pooled text P deep polymerization feature $P''$, and to the soft-aligned text Q deep polymerization feature to obtain the pooled text Q deep polymerization feature $Q''$, as shown in equation (37):

$$P''=\mathrm{MaxPooling}(\tilde{P}^{agg}),\qquad Q''=\mathrm{MaxPooling}(\tilde{Q}^{agg}) \tag{37}$$
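For illustration, a minimal Keras sketch of the pooling step of equation (34); GlobalMaxPooling1D realizes the maximum pooling over the sequence dimension, and the tensor names are illustrative:

from keras.layers import Concatenate, GlobalMaxPooling1D

# Equation (34): concatenate the char- and word-level deep aggregation features and max-pool
P_agg = Concatenate()([P_c_agg_deep, P_w_agg_deep])
P_semantic = GlobalMaxPooling1D()(P_agg)   # pooled text P semantic feature P'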
s30602, performing multiple matching operations to obtain a final matching feature vector:
first, the absolute bit-wise difference of the pooled text P semantic feature $P'$ and the pooled text Q semantic feature $Q'$ in equation (34) is taken to obtain the subtraction matching feature $PQ_{ab}$, as shown in equation (38):

$$PQ_{ab}=\mathrm{AB}(P',Q') \tag{38}$$

second, the pooled text P semantic feature $P'$ and the pooled text Q semantic feature $Q'$ in equation (34) are multiplied bit by bit to obtain the point multiplication matching feature $PQ_{mu}$, as shown in equation (39):

$$PQ_{mu}=\mathrm{MU}(P',Q') \tag{39}$$

third, the absolute bit-wise difference of the pooled text P deep polymerization feature $P''$ and the pooled text Q deep polymerization feature $Q''$ in equation (37) is taken to obtain the deep subtraction matching feature $PQ'_{ab}$, as shown in equation (40):

$$PQ'_{ab}=\mathrm{AB}(P'',Q'') \tag{40}$$

then, the pooled text P deep polymerization feature $P''$ and the pooled text Q deep polymerization feature $Q''$ in equation (37) are multiplied bit by bit to obtain the deep point multiplication matching feature $PQ'_{mu}$, as shown in equation (41):

$$PQ'_{mu}=\mathrm{MU}(P'',Q'') \tag{41}$$

finally, the pooled text P semantic feature $P'$ and the pooled text Q semantic feature $Q'$ in equation (34), the subtraction matching feature $PQ_{ab}$ in equation (38), the point multiplication matching feature $PQ_{mu}$ in equation (39), the deep subtraction matching feature $PQ'_{ab}$ in equation (40) and the deep point multiplication matching feature $PQ'_{mu}$ in equation (41) are concatenated to obtain the final matching feature vector $F$, as shown in equation (42):

$$F=\big[P';Q';PQ_{ab};PQ_{mu};PQ'_{ab};PQ'_{mu}\big] \tag{42}$$
s307, constructing a prediction layer:
the final matching feature vector is used as input to three fully-connected layers, with the ReLU activation function applied after the first and second fully-connected layers and the sigmoid function applied after the third, yielding a matching degree value in [0,1] denoted $y_{pred}$; finally, whether the text semantics match is judged by comparing this value with the preset threshold of 0.5, i.e. when $y_{pred}\geq 0.5$ the text semantics are predicted to match, otherwise they do not match;
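For illustration, a minimal Keras sketch of this prediction layer; the hidden sizes 600 and 300 are assumptions, only the three-layer structure, the ReLU activations and the final sigmoid follow the description above:

from keras.layers import Dense

x = Dense(600, activation='relu')(F)        # first fully-connected layer
x = Dense(300, activation='relu')(x)        # second fully-connected layer
y_pred = Dense(1, activation='sigmoid')(x)  # matching degree in [0, 1]
# predicted as a match when y_pred >= 0.5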
when the text semantic matching model is not trained, training is required to be carried out on a training data set constructed according to a semantic matching knowledge base so as to optimize model parameters; when the model is trained, the prediction layer can predict whether the semantics of the target text are matched.
S4, training a text semantic matching model: training the text semantic matching model constructed in the step S3 on the training data set obtained in the step S2, as shown in fig. 5, specifically as follows:
S401, constructing a loss function: from step S307, $y_{pred}$ is the matching degree value obtained after processing by the text semantic matching model, and $y_{true}$ is the true label indicating whether the two text semantics match, restricted to 0 or 1; cross entropy is used as the loss function, with the following formula:

$$L=-\big(y_{true}\log y_{pred}+(1-y_{true})\log(1-y_{pred})\big)$$
s402, constructing an optimization function:
testing various optimization functions of the model, and finally selecting Adam optimization functions as the optimization functions of the model, wherein hyper-parameters of the Adam optimization functions are set by default values in Keras;
for example, the following steps are carried out: the optimization function described above and its settings are expressed in Keras using code:
optim=keras.optimizers.Adam()
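For illustration, a minimal sketch of how the loss function of S401 and the Adam optimizer above can be combined to train the model; the variable names and the batch/epoch settings are assumptions:

model.compile(optimizer=optim, loss='binary_crossentropy', metrics=['accuracy'])
model.fit([trainP_char, trainQ_char, trainP_word, trainQ_word], train_labels,
          batch_size=64, epochs=20, validation_split=0.1)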
the proposed model can achieve excellent effects on medical intelligent question and answer data sets.
Example 3:
as shown in fig. 10, the text semantic matching device for medical intelligent question answering according to embodiment 2 includes,
the text semantic matching knowledge base construction unit is used for acquiring a large amount of text data and then carrying out preprocessing operation on the text data so as to obtain a text semantic matching knowledge base meeting the training requirement;
the training data set generating unit is used for matching data in the knowledge base according to the semantics of the text, if the semantics of the data are consistent, the text is used for constructing a training positive example, otherwise, the text is used for constructing a training negative example, and all positive example data and all negative example data are mixed to obtain a training data set;
a text semantic matching model construction unit: the system is used for constructing a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer;
a text semantic matching model training unit: and the method is used for constructing a training loss function and an optimization function and finishing the training of the model.
Example 4:
a storage medium according to embodiment 2, in which a plurality of instructions are stored; the instructions are loaded by a processor to execute the steps of the medical intelligent question-answering oriented text semantic matching method of embodiment 2.
Example 5:
electronic equipment based on embodiment 4, electronic equipment includes: the storage medium of example 4; and
a processor for executing the instructions in the storage medium of embodiment 4.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A text semantic matching method for medical intelligent question answering is characterized in that a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer, text characters and word granularity features are extracted, fine-grained semantic features and text semantic interaction features of the same text are captured, multiple relevant features are combined finally, then multiple matching operations are carried out, a final matching feature vector is generated, and the similarity of the text is judged; the method comprises the following specific steps:
the embedding layer carries out embedding operation on the input text according to the word granularity and the word granularity respectively, and outputs text word embedding representation and word embedding representation;
the semantic coding layer receives text character embedded representation and word embedded representation, codes the text character embedded representation and the word embedded representation by using a bidirectional long-short term memory network BilSTM, and outputs text character and word granularity characteristics;
the multilevel fine-grained feature extraction layer performs the same text and text inter-coding operation on the text character and word granularity features output by the semantic coding layer to obtain the same text fine-grained semantic features and the text inter-semantic interactive features;
the feature fusion layer combines various related features, and then performs various matching operations to generate a final matching feature vector;
and the prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result.
2. The medical intelligent question-answering oriented text semantic matching method according to claim 1, wherein the embedding layer comprises a word mapping conversion table, an input layer, a word vector mapping layer, an output text word embedding representation and a word embedding representation;
wherein, the word mapping conversion table: the mapping rule is that the number 1 is used as the starting point, and then the characters or the words are sequentially and progressively ordered according to the sequence of the character word list recorded into each character or word, so that a character word mapping conversion table is formed; then, using Word2Vec to train the Word vector model to obtain a Word vector matrix of each Word;
an input layer: the input layer comprises four inputs, word breaking and word segmentation preprocessing are carried out on each text or text to be predicted in the training data set, txt P _ char, txt Q _ char, txt P _ word and txt Q _ word are respectively obtained, wherein suffixes char and word respectively represent that the corresponding text is subjected to word breaking or word segmentation processing, and the suffixes char and word are formed as follows: (txt P _ char, txt Q _ char, txt P _ word, txt Q _ word); converting each character and word in the input text into corresponding numerical identification according to a character and word mapping conversion table;
word vector mapping layer: loading the word vector matrix obtained by training in the step of constructing the word mapping conversion table to initialize the weight parameters of the current layer; and obtaining corresponding text word embedding representation and word embedding representation txt P _ char _ embedded, txt Q _ char _ embedded, txt P _ word _ embedded and txt Q _ word _ embedded for the input text txt P _ char, txt Q _ char, txt P _ word _ embedded and txt Q _ word _ embedded.
3. The text semantic matching method for medical intelligent question answering according to claim 2, wherein implementation details of the semantic coding layer are as follows:
taking the text P as an example, this module receives the text P character and word embedding representations and encodes them with a bidirectional long short-term memory network BiLSTM to obtain the text P character and word granularity features, denoted $P_c=\{p_i^c\}_{i=1}^{N}$ and $P_w=\{p_j^w\}_{j=1}^{N}$. The concrete formulas are as follows:

$$p_i^c=\mathrm{BiLSTM}(txtP\_char\_embedded)_i=\big[\overrightarrow{\mathrm{LSTM}}(txtP\_char\_embedded)_i;\overleftarrow{\mathrm{LSTM}}(txtP\_char\_embedded)_i\big] \tag{1}$$

$$p_j^w=\mathrm{BiLSTM}(txtP\_word\_embedded)_j=\big[\overrightarrow{\mathrm{LSTM}}(txtP\_word\_embedded)_j;\overleftarrow{\mathrm{LSTM}}(txtP\_word\_embedded)_j\big] \tag{2}$$

where $N$ denotes the length of the character granularity features and the word granularity features. Equation (1) encodes the text P character embedding representation with the bidirectional long short-term memory network BiLSTM: $p_i^c$ denotes the i-th position character granularity feature of the text P obtained by BiLSTM encoding, $\overrightarrow{\mathrm{LSTM}}(\cdot)_i$ denotes the i-th position character granularity feature obtained by the forward long short-term memory network LSTM, and $\overleftarrow{\mathrm{LSTM}}(\cdot)_i$ denotes the i-th position character granularity feature obtained by the backward LSTM. The symbols in equation (2) have essentially the same meaning at the word granularity: $p_j^w$ denotes the j-th position word granularity feature of the text P obtained by BiLSTM encoding, composed of the corresponding forward and backward LSTM outputs.
Similarly, the text Q is processed in the same way as the text P to obtain the text Q character and word granularity features, denoted $Q_c$ and $Q_w$.
4. The medical intelligent question-answering oriented text semantic matching method according to claim 3, wherein the implementation details of the multi-level fine-grained feature extraction layer are as follows:
carrying out encoding operation between the same text and the same text on the granularity characteristics of the text characters and the words output by the semantic encoding layer to obtain the fine granularity semantic characteristics of the same text and the semantic interaction characteristics between the texts; the method comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text according to different granularities of the same text; the second sub-module is responsible for extracting semantic interaction features among texts, and mainly obtains the semantic interaction features among the texts by using a plurality of layers of coding structures among the texts;
extracting fine-grained semantic features of the same text in a first sub-module:
first, for convenience of subsequent description, in the first section, taking the text P as an example, the following attention module is defined:
defining a soft alignment attention module, denoted SOA, with the following formula:

$$e_{ij}=(p_i^c)^{\top}p_j^w,\qquad \tilde{p}_i^c=\sum_{j=1}^{N}\frac{\exp(e_{ij})}{\sum_{k=1}^{N}\exp(e_{ik})}\,p_j^w,\qquad \tilde{p}_j^w=\sum_{i=1}^{N}\frac{\exp(e_{ij})}{\sum_{k=1}^{N}\exp(e_{kj})}\,p_i^c \tag{3}$$

where $p_i^c$ denotes the i-th position character granularity feature of the text P from equation (1), $p_j^w$ denotes the j-th position word granularity feature of the text P from equation (2), $e_{ij}$ denotes the soft alignment attention weight between the i-th position character granularity feature and the j-th position word granularity feature of the text P, the softmax operation maps the soft alignment attention weights to values between 0 and 1, $\tilde{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using soft alignment attention, and $\tilde{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using soft alignment attention;
defining a multiplication alignment attention module, denoted MUA, with the following formula:

$$P_c^{\mathrm{D}}=\tanh\big(\mathrm{TimeDistributed}(\mathrm{Dense}())(P_c)\big),\qquad m_{ij}=(p_i^{c,\mathrm{D}})^{\top}p_j^w,\qquad \hat{p}_i^c=\sum_{j=1}^{N}\frac{\exp(m_{ij})}{\sum_{k=1}^{N}\exp(m_{ik})}\,p_j^w,\qquad \hat{p}_j^w=\sum_{i=1}^{N}\frac{\exp(m_{ij})}{\sum_{k=1}^{N}\exp(m_{kj})}\,p_i^c \tag{4}$$

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, ⊙ denotes the bit-wise (alignment) multiplication whose sum over dimensions gives the weight $m_{ij}$, tanh denotes the activation function, $P_c$ denotes the text P character granularity features, $P_c^{\mathrm{D}}$ denotes the text P character granularity features after processing by the Dense() layer, $P_w$ denotes the text P word granularity features, $m_{ij}$ denotes the multiplication alignment attention weight, the softmax operation maps the multiplication alignment attention weights to values between 0 and 1, $\hat{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using multiplication alignment attention, and $\hat{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using multiplication alignment attention;

defining a subtraction alignment attention module, denoted SUA, with the following formula:

$$P_c^{\mathrm{D}}=\tanh\big(\mathrm{TimeDistributed}(\mathrm{Dense}())(P_c)\big),\qquad s_{ij}=-\big\|p_i^{c,\mathrm{D}}-p_j^w\big\|_1,\qquad \check{p}_i^c=\sum_{j=1}^{N}\frac{\exp(s_{ij})}{\sum_{k=1}^{N}\exp(s_{ik})}\,p_j^w,\qquad \check{p}_j^w=\sum_{i=1}^{N}\frac{\exp(s_{ij})}{\sum_{k=1}^{N}\exp(s_{kj})}\,p_i^c \tag{5}$$

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, − denotes the bit-wise subtraction whose aggregated magnitude gives the weight $s_{ij}$, tanh denotes the activation function, $P_c$ denotes the text P character granularity features, $P_c^{\mathrm{D}}$ denotes the text P character granularity features after processing by the Dense() layer, $P_w$ denotes the text P word granularity features, $s_{ij}$ denotes the subtraction alignment attention weight, the softmax operation maps the subtraction alignment attention weights to values between 0 and 1, $\check{p}_i^c$ indicates that the i-th position character granularity feature of the text P is re-expressed by the weighted sum of all word granularity features of the text P using subtraction alignment attention, and $\check{p}_j^w$ indicates that the j-th position word granularity feature of the text P is re-expressed by the weighted sum of all character granularity features of the text P using subtraction alignment attention;
defining a self-alignment attention module, denoted SEA, with the following formula:

$$a_{ij}=(p_i)^{\top}p_j,\qquad \dot{p}_i=\sum_{j=1}^{N}\frac{\exp(a_{ij})}{\sum_{k=1}^{N}\exp(a_{ik})}\,p_j \tag{6}$$

where $p_i$ and $p_j$ denote the i-th and j-th position features of the text P at a given granularity, $a_{ij}$ denotes the self-alignment attention weight between the i-th position feature and the j-th position feature of the text P, the softmax operation maps the self-alignment attention weights to values between 0 and 1, and $\dot{p}_i$ indicates that the i-th position feature of the text P is re-expressed by the weighted sum of all features of the text P using self-alignment attention;
in the following description, the SOA symbol is used to represent the operation of formula (3), the MUA symbol is used to represent the operation of formula (4), the SUA symbol is used to represent the operation of formula (5), and the SEA symbol is used to represent the operation of formula (6);
the first layer of coding structure uses a plurality of attention modules to extract fine-grained initial semantic features of the same text:
first, using soft alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P soft alignment features $\tilde{P}_c$ and $\tilde{P}_w$ at the character and word granularity levels, as shown in equation (7):

$$\tilde{P}_c,\ \tilde{P}_w=\mathrm{SOA}(P_c,P_w) \tag{7}$$

second, using multiplication alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P multiplication alignment features $\hat{P}_c$ and $\hat{P}_w$ at the character and word granularity levels, as shown in equation (8):

$$\hat{P}_c,\ \hat{P}_w=\mathrm{MUA}(P_c,P_w) \tag{8}$$

then, using subtraction alignment attention, the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are aligned to obtain the text P subtraction alignment features $\check{P}_c$ and $\check{P}_w$ at the character and word granularity levels, as shown in equation (9):

$$\check{P}_c,\ \check{P}_w=\mathrm{SUA}(P_c,P_w) \tag{9}$$

similarly, the text Q is processed in the same way as the text P to obtain the text Q soft alignment features $\tilde{Q}_c$, $\tilde{Q}_w$, multiplication alignment features $\hat{Q}_c$, $\hat{Q}_w$ and subtraction alignment features $\check{Q}_c$, $\check{Q}_w$ at the character and word granularity levels, which completes the extraction of the fine-grained initial semantic features of the same text;

the second layer of coding structure enhances the fine-grained initial semantic features of the same text to complete the extraction of the fine-grained semantic features of the same text:

first, the text P soft alignment feature $\tilde{P}_c$ at the character granularity level in equation (7) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep soft alignment feature $\tilde{P}_c^{\,d}$ at the character granularity level, as shown in equation (10):

$$\tilde{P}_c^{\,d}=\tilde{P}_c+P_c \tag{10}$$

then, the text P multiplication alignment feature $\hat{P}_c$ at the character granularity level in equation (8) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep multiplication alignment feature $\hat{P}_c^{\,d}$ at the character granularity level, as shown in equation (11):

$$\hat{P}_c^{\,d}=\hat{P}_c+P_c \tag{11}$$

then, the text P subtraction alignment feature $\check{P}_c$ at the character granularity level in equation (9) is added to the text P character granularity feature $P_c$ in equation (1) to obtain the text P deep subtraction alignment feature $\check{P}_c^{\,d}$ at the character granularity level, as shown in equation (12):

$$\check{P}_c^{\,d}=\check{P}_c+P_c \tag{12}$$

then, the text P deep soft alignment, deep multiplication alignment and deep subtraction alignment features at the character granularity level in equations (10), (11) and (12) are concatenated to obtain the text P high-level feature $P'_c$ at the character granularity level, as shown in equation (13):

$$P'_c=\big[\tilde{P}_c^{\,d};\hat{P}_c^{\,d};\check{P}_c^{\,d}\big] \tag{13}$$

the word granularity is handled in the same way as the character granularity: first, the text P soft alignment feature $\tilde{P}_w$ at the word granularity level in equation (7) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep soft alignment feature $\tilde{P}_w^{\,d}$ at the word granularity level, as shown in equation (14):

$$\tilde{P}_w^{\,d}=\tilde{P}_w+P_w \tag{14}$$

then, the text P multiplication alignment feature $\hat{P}_w$ at the word granularity level in equation (8) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep multiplication alignment feature $\hat{P}_w^{\,d}$ at the word granularity level, as shown in equation (15):

$$\hat{P}_w^{\,d}=\hat{P}_w+P_w \tag{15}$$

then, the text P subtraction alignment feature $\check{P}_w$ at the word granularity level in equation (9) is added to the text P word granularity feature $P_w$ in equation (2) to obtain the text P deep subtraction alignment feature $\check{P}_w^{\,d}$ at the word granularity level, as shown in equation (16):

$$\check{P}_w^{\,d}=\check{P}_w+P_w \tag{16}$$

next, the text P deep soft alignment, deep multiplication alignment and deep subtraction alignment features at the word granularity level in equations (14), (15) and (16) are concatenated to obtain the text P high-level feature $P'_w$ at the word granularity level, as shown in equation (17):

$$P'_w=\big[\tilde{P}_w^{\,d};\hat{P}_w^{\,d};\check{P}_w^{\,d}\big] \tag{17}$$

the text P deep soft alignment feature at the character granularity level in equation (10) and the text P deep soft alignment feature at the word granularity level in equation (14) are concatenated to obtain the text P deep semantic feature $P'_{deep}$, as shown in equation (18):

$$P'_{deep}=\big[\tilde{P}_c^{\,d};\tilde{P}_w^{\,d}\big] \tag{18}$$

similarly, the text Q is processed in the same way as the text P to obtain the text Q deep soft alignment, deep multiplication alignment and deep subtraction alignment features at the character and word granularity levels, the text Q high-level feature $Q'_c$ at the character granularity level, the text Q high-level feature $Q'_w$ at the word granularity level and the text Q deep semantic feature $Q'_{deep}$, finishing the extraction of the fine-grained semantic features of the same text;
and the second sub-module extracts the semantic interaction features between texts:
the first layer of coding structure simultaneously uses a plurality of layers of coding structures to extract the initial semantic interaction characteristics between texts:
At the character granularity, first, soft-alignment attention is performed between the text P character-granularity feature P_c and the text Q character-granularity feature Q_c from equation (1) to obtain the text P soft-alignment interaction feature and the text Q soft-alignment interaction feature at the character-granularity level, as shown in equation (19);
next, subtractive-alignment attention is performed between the text P character-granularity feature P_c and the text Q character-granularity feature Q_c from equation (1) to obtain the text P subtractive-alignment interaction feature and the text Q subtractive-alignment interaction feature at the character-granularity level, as shown in equation (20);
at the word granularity, similar to the character granularity, first, soft-alignment attention is performed between the text P word-granularity feature P_w and the text Q word-granularity feature Q_w from equation (2) to obtain the text P soft-alignment interaction feature and the text Q soft-alignment interaction feature at the word-granularity level, as shown in equation (21);
then, subtractive-alignment attention is performed between the text P word-granularity feature P_w and the text Q word-granularity feature Q_w from equation (2) to obtain the text P subtractive-alignment interaction feature and the text Q subtractive-alignment interaction feature at the word-granularity level, as shown in equation (22);
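As a rough illustration of what soft-alignment and subtractive-alignment attention between two texts can look like: the exact SOA and SUA operations are defined in equations (3)-(6) of the earlier claims and are not reproduced here, so the sketch below is only a commonly used stand-in, assuming PyTorch tensors; all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def soft_align_attention(p, q):
    """Illustrative bidirectional soft-alignment attention between texts P and Q.

    p: (batch, len_p, dim), q: (batch, len_q, dim)
    Returns P re-represented over Q and Q re-represented over P.
    """
    e = torch.bmm(p, q.transpose(1, 2))                             # similarity matrix (batch, len_p, len_q)
    p_aligned = torch.bmm(F.softmax(e, dim=2), q)                   # each P position attends over Q
    q_aligned = torch.bmm(F.softmax(e, dim=1).transpose(1, 2), p)   # each Q position attends over P
    return p_aligned, q_aligned

def subtractive_align_attention(p, q):
    """Illustrative subtractive-alignment interaction: element-wise difference
    between each text's features and its soft-aligned counterpart."""
    p_aligned, q_aligned = soft_align_attention(p, q)
    return p - p_aligned, q - q_aligned
```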
The second-layer coding structure enhances the initial semantic interaction features between the texts to complete the extraction of semantic interaction features between the texts:
at the character granularity, first, the text P soft-alignment interaction feature at the character-granularity level from equation (19) and the text P character-granularity feature P_c from equation (1) are added to obtain the text P deep soft-alignment interaction feature at the character-granularity level, as shown in equation (23);
then, the text P subtractive-alignment interaction feature at the character-granularity level from equation (20) and the text P character-granularity feature P_c from equation (1) are added to obtain the text P deep subtractive-alignment interaction feature at the character-granularity level, as shown in equation (24);
finally, the text P deep soft-alignment interaction feature from equation (23) and the text P deep subtractive-alignment interaction feature from equation (24) are concatenated to obtain the text P high-level interaction feature P''_c at the character-granularity level, as shown in equation (25);
at the word granularity, first, the text P soft-alignment interaction feature at the word-granularity level from equation (21) and the text P word-granularity feature P_w from equation (2) are added to obtain the text P deep soft-alignment interaction feature at the word-granularity level, as shown in equation (26);
then, the text P subtractive-alignment interaction feature at the word-granularity level from equation (22) and the text P word-granularity feature P_w from equation (2) are added to obtain the text P deep subtractive-alignment interaction feature at the word-granularity level, as shown in equation (27);
finally, the text P deep soft-alignment interaction feature from equation (26) and the text P deep subtractive-alignment interaction feature from equation (27) are concatenated to obtain the text P high-level interaction feature P''_w at the word-granularity level, as shown in equation (28);
the text P deep subtractive-alignment interaction feature at the character-granularity level from equation (24) and the text P deep subtractive-alignment interaction feature at the word-granularity level from equation (27) are concatenated to obtain the text P deep semantic interaction feature P''_deep, as shown in equation (29);
similarly, text Q is processed in the same way as text P, yielding the text Q deep soft-alignment and deep subtractive-alignment interaction features at the character-granularity level, the text Q high-level interaction feature Q''_c at the character-granularity level, the text Q deep soft-alignment and deep subtractive-alignment interaction features at the word-granularity level, the text Q high-level interaction feature Q''_w at the word-granularity level, and the text Q deep semantic interaction feature Q''_deep, thereby completing the extraction of semantic interaction features between the texts.
5. The text semantic matching method for medical intelligent question answering according to claim 4, wherein implementation details of the feature fusion layer are as follows:
first, for convenience of subsequent description, the following definitions are made:
the operation of subtracting two vectors and taking the element-wise absolute value is denoted AB, as shown in equation (30):
AB(P,Q)=|P-Q| (30)
where P and Q are two different vectors; the element-wise absolute value is taken after subtracting the Q vector from the P vector;
the element-wise multiplication of two vectors is denoted MU, as shown in equation (31):
MU(P,Q)=P⊙Q (31)
where P and Q are two different vectors and ⊙ denotes the element-wise multiplication of the P and Q vectors;
in the following description, the symbol AB denotes the operation of equation (30), MU the operation of equation (31), SOA the operation of equation (3), MUA the operation of equation (4), SUA the operation of equation (5), and SEA the operation of equation (6);
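For reference, the AB and MU operations are simple element-wise computations; a minimal sketch in PyTorch (function names chosen here purely for illustration) is:

```python
import torch

def AB(p, q):
    """Element-wise absolute difference of two vectors, equation (30)."""
    return torch.abs(p - q)

def MU(p, q):
    """Element-wise (Hadamard) product of two vectors, equation (31)."""
    return p * q
```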
the feature fusion layer is divided into two submodules: the first submodule combines the various related features, and the second submodule performs several matching operations to obtain the final matching feature vector;
the first sub-module combines a plurality of relevant features:
At the character granularity, the text P high-level feature P'_c at the character-granularity level from equation (13) and the text P high-level interaction feature P''_c at the character-granularity level from equation (25) are concatenated to obtain the text P aggregation feature at the character-granularity level, and self-attention is applied to this aggregation feature to obtain the text P deep aggregation feature at the character-granularity level, as shown in equation (32);
at the word granularity, similar to the character granularity, the text P high-level feature P'_w at the word-granularity level from equation (17) and the text P high-level interaction feature P''_w at the word-granularity level from equation (28) are concatenated to obtain the text P aggregation feature at the word-granularity level, and self-attention is applied to this aggregation feature to obtain the text P deep aggregation feature at the word-granularity level, as shown in equation (33);
thereafter, the text P deep aggregation feature at the character-granularity level from equation (32) and the text P deep aggregation feature at the word-granularity level from equation (33) are concatenated, and a max-pooling operation is then performed to obtain the pooled text P semantic feature P', as shown in equation (34);
next, the text P deep semantic feature P'_deep from equation (18) and the text P deep semantic interaction feature P''_deep from equation (29) are concatenated to obtain the text P deep aggregation feature, as shown in equation (35);
similarly, the same operations as for text P are applied to text Q to obtain the text Q aggregation feature and deep aggregation feature at the character-granularity level, the text Q aggregation feature and deep aggregation feature at the word-granularity level, the pooled text Q semantic feature Q', and the text Q deep aggregation feature;
then, soft-alignment attention is performed between the text P deep aggregation feature from equation (35) and the text Q deep aggregation feature to obtain the soft-aligned text P deep aggregation feature and the soft-aligned text Q deep aggregation feature, as shown in equation (36);
then, a max-pooling operation is performed on the soft-aligned text P deep aggregation feature from equation (36) to obtain the pooled text P deep aggregation feature P'', and a max-pooling operation is performed on the soft-aligned text Q deep aggregation feature to obtain the pooled text Q deep aggregation feature Q'', as shown in equation (37);
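A minimal sketch of this first fusion submodule for text P is given below, assuming the character-level and word-level features share the same hidden size so they can be concatenated along the sequence dimension; soft_align_attention refers to the illustrative function sketched earlier, and all other names are hypothetical.

```python
import torch
import torch.nn.functional as F

def self_attention(x):
    """Illustrative dot-product self-attention over one sequence (equations (32)/(33))."""
    scores = F.softmax(torch.bmm(x, x.transpose(1, 2)), dim=-1)
    return torch.bmm(scores, x)

def fuse_text(p_high_c, p_inter_c, p_high_w, p_inter_w,
              p_deep, p_inter_deep, q_deep_agg):
    """Minimal sketch of the first fusion submodule for text P, equations (32)-(37)."""
    agg_c = torch.cat([p_high_c, p_inter_c], dim=-1)                # char-granularity aggregation feature
    agg_w = torch.cat([p_high_w, p_inter_w], dim=-1)                # word-granularity aggregation feature
    deep_c = self_attention(agg_c)                                  # eq. (32)
    deep_w = self_attention(agg_w)                                  # eq. (33)
    p_sem = torch.cat([deep_c, deep_w], dim=1).max(dim=1).values    # pooled P', eq. (34)
    p_deep_agg = torch.cat([p_deep, p_inter_deep], dim=-1)          # eq. (35)
    p_soft, _ = soft_align_attention(p_deep_agg, q_deep_agg)        # eq. (36)
    p_pooled = p_soft.max(dim=1).values                             # pooled P'', eq. (37)
    return p_sem, p_pooled
```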
the second sub-module performs multiple matching operations to obtain a final matching feature vector:
first, the absolute difference of the pooled text P semantic feature P' and the pooled text Q semantic feature Q' from equation (34) is taken to obtain the subtraction matching feature PQ_ab, as shown in equation (38):
PQ_ab = AB(P′, Q′)   (38)
second, the pooled text P semantic feature P' and the pooled text Q semantic feature Q' from equation (34) are multiplied element-wise to obtain the element-wise product matching feature PQ_mu, as shown in equation (39):
PQ_mu = MU(P′, Q′)   (39)
third, the absolute difference of the pooled text P deep aggregation feature P'' from equation (37) and the pooled text Q deep aggregation feature Q'' from equation (37) is taken to obtain the deep subtraction matching feature PQ′_ab, as shown in equation (40):
PQ′_ab = AB(P″, Q″)   (40)
then, the pooled text P deep aggregation feature P'' from equation (37) and the pooled text Q deep aggregation feature Q'' from equation (37) are multiplied element-wise to obtain the deep element-wise product matching feature PQ′_mu, as shown in equation (41):
PQ′_mu = MU(P″, Q″)   (41)
finally, the pooled text P semantic feature P' from equation (34), the pooled text Q semantic feature Q', the subtraction matching feature PQ_ab from equation (38), the element-wise product matching feature PQ_mu from equation (39), the deep subtraction matching feature PQ′_ab from equation (40), and the deep element-wise product matching feature PQ′_mu from equation (41) are concatenated to obtain the final matching feature vector F, as shown in equation (42):
F = [P′; Q′; PQ_ab; PQ_mu; PQ′_ab; PQ′_mu]   (42)
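A minimal sketch of this second fusion submodule, reusing the illustrative AB and MU helpers above (tensor shapes (batch, dim); names are hypothetical):

```python
import torch

def matching_vector(p_sem, q_sem, p_pooled, q_pooled):
    """Build the final matching feature vector F of equation (42)."""
    pq_ab = AB(p_sem, q_sem)             # subtraction matching feature, eq. (38)
    pq_mu = MU(p_sem, q_sem)             # element-wise product matching feature, eq. (39)
    pq_ab_deep = AB(p_pooled, q_pooled)  # deep subtraction matching feature, eq. (40)
    pq_mu_deep = MU(p_pooled, q_pooled)  # deep product matching feature, eq. (41)
    return torch.cat([p_sem, q_sem, pq_ab, pq_mu, pq_ab_deep, pq_mu_deep], dim=-1)
```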
6. the text semantic matching method for medical intelligent question answering according to claim 5, wherein implementation details of the prediction layer are as follows:
the final matching feature vector F is taken as input and passed through three fully connected layers, the ReLU activation function being applied after the first and second fully connected layers and the sigmoid function after the third fully connected layer, producing a matching-degree value in [0, 1], which is recorded as y_pred; finally, whether the text semantics match is judged by comparing y_pred with the preset threshold of 0.5, i.e., when y_pred ≥ 0.5 the text semantics are predicted to match, otherwise they do not match;
when the text semantic matching model has not been trained, it needs to be trained on a training data set constructed from the semantic matching knowledge base in order to optimize the model parameters; once the model has been trained, the prediction layer can predict whether the semantics of the target texts match.
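A minimal sketch of such a prediction head in PyTorch; the hidden size of 256 is an assumption for illustration, as the patent does not specify the layer widths.

```python
import torch
import torch.nn as nn

class PredictionLayer(nn.Module):
    """Three fully connected layers: ReLU after the first two, sigmoid after the third."""
    def __init__(self, in_dim, hidden_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, 1)

    def forward(self, f):
        h = torch.relu(self.fc1(f))
        h = torch.relu(self.fc2(h))
        y_pred = torch.sigmoid(self.fc3(h)).squeeze(-1)  # matching degree in [0, 1]
        return y_pred

# decision rule: predicted as matching when y_pred >= 0.5
# is_match = (y_pred >= 0.5)
```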
7. The text semantic matching method for medical intelligent question answering according to claim 1, wherein the construction of the text semantic matching knowledge base comprises downloading a data set from the network to obtain raw data, preprocessing the raw data, and summarizing the sub-knowledge bases;
downloading a data set from the network to obtain raw data: a publicly available text semantic matching data set, or a manually constructed data set, is downloaded from the network and used as the raw data for constructing the text semantic matching knowledge base;
preprocessing the raw data: the raw data used for constructing the text semantic matching knowledge base are preprocessed, and a word-breaking (character-level segmentation) operation and a word-segmentation operation are performed on each text to obtain a text semantic matching word-breaking processing knowledge base and a text semantic matching word-segmentation processing knowledge base;
summarizing the sub-knowledge bases: the text semantic matching word-breaking processing knowledge base and the text semantic matching word-segmentation processing knowledge base are combined to construct the text semantic matching knowledge base;
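For illustration, one way to carry out the two segmentation operations on a Chinese text is sketched below; the use of the jieba segmenter is an assumption made for this sketch, as the patent does not name a specific word-segmentation tool.

```python
import jieba  # assumed segmenter; the patent does not name a specific tool

def preprocess(text):
    """Produce the character-level (word-breaking) and word-level segmentation of one text."""
    chars = [c for c in text if not c.isspace()]   # word-breaking result, one entry per character
    words = jieba.lcut(text)                       # word-segmentation result
    return chars, words

# each processed text goes into the word-breaking and word-segmentation sub knowledge bases,
# which are then combined into the text semantic matching knowledge base
```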
the text semantic matching model is obtained by training on a training data set, and the construction of the training data set comprises constructing training positive examples, constructing training negative examples, and constructing the training data set;
constructing training positive examples: for each pair of texts in the text semantic matching knowledge base whose semantics are consistent, the pair can be used to construct a training positive example;
constructing training negative examples: a text txt_P is selected, a text txt_Q that does not match txt_P is randomly selected from the text semantic matching knowledge base, and txt_P and txt_Q are combined to construct a negative example;
constructing the training data set: all the positive-example data and negative-example data obtained from the above two operations are combined, and their order is shuffled to construct the final training data set;
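A minimal sketch of this construction is shown below; the pairing format, the one-negative-per-positive ratio, and the fixed random seed are all assumptions made for illustration.

```python
import random

def build_training_set(positive_pairs, all_texts, neg_per_pos=1, seed=42):
    """Illustrative construction of the training data set.

    positive_pairs: list of (txt_P, txt_Q) pairs with consistent semantics -> label 1
    all_texts:      pool of texts used to sample a non-matching txt_Q      -> label 0
    """
    rng = random.Random(seed)
    data = [(p, q, 1) for p, q in positive_pairs]
    matched = {frozenset(pair) for pair in positive_pairs}
    for p, _ in positive_pairs:
        for _ in range(neg_per_pos):
            q_neg = rng.choice(all_texts)
            if q_neg != p and frozenset((p, q_neg)) not in matched:
                data.append((p, q_neg, 0))          # negative example
    rng.shuffle(data)   # disorder the sequence of positive and negative examples
    return data
```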
after the text semantic matching model is built, it is trained and optimized on the training data set, specifically as follows:
constructing the loss function: as described for the prediction layer, y_pred is the matching-degree value computed by the text semantic matching model, and y_true is the ground-truth label indicating whether the two text semantics match, taking the value 0 or 1; cross entropy is used as the loss function;
constructing the optimization function: the Adam optimizer is used, and the text semantic matching model is trained and optimized on the training data set.
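A minimal sketch of the resulting training loop; the batch format, learning rate, and epoch count are illustrative assumptions, and BCELoss is used here as the binary cross entropy for 0/1 labels.

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-3, device="cpu"):
    """Illustrative optimization loop: binary cross entropy loss + Adam optimizer."""
    model.to(device)
    criterion = nn.BCELoss()                       # cross entropy for 0/1 labels
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        model.train()
        for p_batch, q_batch, y_true in loader:    # batches drawn from the training data set
            y_pred = model(p_batch.to(device), q_batch.to(device))
            loss = criterion(y_pred, y_true.float().to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```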
8. A text semantic matching device for medical intelligent question answering, characterized by comprising a text semantic matching knowledge base construction unit, a training data set generation unit, a text semantic matching model construction unit, and a text semantic matching model training unit, which respectively implement the steps of the text semantic matching method for medical intelligent question answering according to claims 1-7.
9. A storage medium having stored thereon a plurality of instructions, wherein the instructions are loaded by a processor to perform the steps of the text semantic matching method for medical intelligent question answering according to claims 1 to 7.
10. An electronic device, characterized in that the electronic device comprises:
the storage medium of claim 9 and a processor to execute instructions in the storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210996504.8A CN115269808A (en) | 2022-08-19 | 2022-08-19 | Text semantic matching method and device for medical intelligent question answering |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115269808A true CN115269808A (en) | 2022-11-01 |
Family
ID=83752534
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210996504.8A Pending CN115269808A (en) | 2022-08-19 | 2022-08-19 | Text semantic matching method and device for medical intelligent question answering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115269808A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117637153A (en) * | 2024-01-23 | 2024-03-01 | 吉林大学 | Informationized management system and method for patient safety nursing |
CN117637153B (en) * | 2024-01-23 | 2024-03-29 | 吉林大学 | Informationized management system and method for patient safety nursing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111310438B (en) | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model | |
CN110364251B (en) | Intelligent interactive diagnosis guide consultation system based on machine reading understanding | |
CN113312500B (en) | Method for constructing event map for safe operation of dam | |
CN111325028B (en) | Intelligent semantic matching method and device based on deep hierarchical coding | |
CN111382565B (en) | Emotion-reason pair extraction method and system based on multiple labels | |
CN109614471B (en) | Open type problem automatic generation method based on generation type countermeasure network | |
CN111310439B (en) | Intelligent semantic matching method and device based on depth feature dimension changing mechanism | |
CN113065358B (en) | Text-to-semantic matching method based on multi-granularity alignment for bank consultation service | |
CN112000771B (en) | Judicial public service-oriented sentence pair intelligent semantic matching method and device | |
CN112000770B (en) | Semantic feature graph-based sentence semantic matching method for intelligent question and answer | |
CN112001166B (en) | Intelligent question-answer sentence semantic matching method and device for government affair consultation service | |
CN112860930B (en) | Text-to-commodity image retrieval method based on hierarchical similarity learning | |
CN113435208A (en) | Student model training method and device and electronic equipment | |
CN112463924B (en) | Text intention matching method for intelligent question answering based on internal correlation coding | |
CN115269808A (en) | Text semantic matching method and device for medical intelligent question answering | |
CN116431827A (en) | Information processing method, information processing device, storage medium and computer equipment | |
CN116204674A (en) | Image description method based on visual concept word association structural modeling | |
CN117313728A (en) | Entity recognition method, model training method, device, equipment and storage medium | |
CN115510236A (en) | Chapter-level event detection method based on information fusion and data enhancement | |
CN114004220A (en) | Text emotion reason identification method based on CPC-ANN | |
CN113705242B (en) | Intelligent semantic matching method and device for education consultation service | |
CN113761192B (en) | Text processing method, text processing device and text processing equipment | |
CN117933226A (en) | Context-aware dialogue information extraction system and method | |
CN113705241B (en) | Intelligent semantic matching method and device based on multi-view attention for college entrance examination consultation | |
CN113065359B (en) | Sentence-to-semantic matching method and device oriented to intelligent interaction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information |
Country or region after: China
Address after: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501
Applicant after: Qilu University of Technology (Shandong Academy of Sciences)
Address before: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501
Applicant before: Qilu University of Technology
Country or region before: China