CN115269808A - Text semantic matching method and device for medical intelligent question answering - Google Patents

Text semantic matching method and device for medical intelligent question answering

Info

Publication number
CN115269808A
Authority
CN
China
Prior art keywords
text
word
feature
alignment
granularity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210996504.8A
Other languages
Chinese (zh)
Inventor
鹿文鹏
张鑫
赵鹏宇
郑超群
张维玉
马凤英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN202210996504.8A priority Critical patent/CN115269808A/en
Publication of CN115269808A publication Critical patent/CN115269808A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F 16/30 Information retrieval of unstructured textual data
              • G06F 16/33 Querying
                • G06F 16/332 Query formulation
                  • G06F 16/3329 Natural language query formulation or dialogue systems
                • G06F 16/3331 Query processing
                  • G06F 16/334 Query execution
                    • G06F 16/3344 Query execution using natural language analysis
          • G06F 40/00 Handling natural language data
            • G06F 40/20 Natural language analysis
              • G06F 40/279 Recognition of textual entities
                • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
            • G06F 40/30 Semantic analysis
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/04 Architecture, e.g. interconnection topology
                • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
              • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text semantic matching method and device for medical intelligent question answering, belonging to the technical field of natural language processing. The technical problem to be solved by the invention is how to capture the fine-grained semantic features within a text and the semantic interaction features between texts in order to realize text semantic matching. The technical scheme is as follows: a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; the model extracts the character-granularity and word-granularity features of the texts, captures the fine-grained semantic features within each text and the semantic interaction features between the texts, finally combines the various related features, performs several matching operations to generate the final matching feature vector, and judges the similarity of the texts. The device comprises a text matching knowledge base construction unit, a training data set generation unit, a text matching model construction unit and a text semantic matching model training unit.

Description

Text semantic matching method and device for medical intelligent question answering
Technical Field
The invention relates to the technical field of artificial intelligence and natural language processing, in particular to a text semantic matching method and device for medical intelligent question answering.
Background
For a question posed by a patient, medical intelligent question answering can automatically find semantically similar questions in a question-answering knowledge base and push the matched question and its answer to the user, greatly reducing the burden of manual answering by doctors. For the varied questions posed by patients, finding the standard questions whose semantics are similar to them is the core of a medical intelligent question-answering system. Its essence is to measure the degree of matching between the question posed by the patient and the standard questions in the question-answering knowledge base, which is in essence a text semantic matching task.
The text semantic matching task aims to measure whether the semantics contained in two texts are consistent, which coincides with the core objective of many natural language processing tasks. Semantic matching of natural language texts is a very challenging task, and existing methods do not solve the problem completely.
Existing methods mainly focus on similarity discrimination for English texts: they model the semantic information inside a text at the word granularity level and model the semantic interaction information between texts at the text level. Chinese text, however, is more complex than English text. Chinese carries rich semantic information at both the character granularity level and the word granularity level, and how to better capture the semantic information at the character granularity, word granularity and text levels so as to better determine the semantic similarity between texts is a challenging piece of work. Aiming at the shortcomings of existing text semantic matching methods, the invention proposes a text semantic matching method and device for medical intelligent question answering, which capture the fine-grained semantic features within a text and the semantic interaction features between texts at multiple levels. The core idea is to extract the character-granularity and word-granularity features of the texts by combining a multi-layer coding structure with several attention mechanisms, capture the fine-grained semantic features within each text and the semantic interaction features between the texts, finally combine the various related features, perform several matching operations to generate the final matching feature vector, and judge the similarity of the texts.
Disclosure of Invention
The technical task of the invention is to provide a text semantic matching method and device for medical intelligent question answering that extract the character-granularity and word-granularity features of the texts, capture the fine-grained semantic features within each text and the semantic interaction features between the texts, finally combine the various related features, perform several matching operations to generate the final matching feature vector, and judge the similarity of the texts.
The technical task of the invention is realized as follows. In the text semantic matching method for medical intelligent question answering, a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; the character-granularity and word-granularity features of the texts are extracted, the fine-grained semantic features within each text and the semantic interaction features between the texts are captured, the various related features are finally combined, and several matching operations are then performed to generate the final matching feature vector and judge the similarity of the texts. The specific steps are as follows:
the embedding layer performs embedding operations on the input texts at the character granularity and the word granularity respectively, obtaining the text character embedding representation and word embedding representation;
the semantic coding layer receives the text character embedding representation and word embedding representation, encodes them with a bidirectional long short-term memory network (BiLSTM), and outputs the text character-granularity and word-granularity features;
the multi-level fine-grained feature extraction layer performs intra-text and inter-text coding operations on the character-granularity and word-granularity features output by the semantic coding layer, obtaining the fine-grained semantic features within each text and the semantic interaction features between the texts;
the feature fusion layer combines the related features and then performs several matching operations to generate the final matching feature vector;
the prediction layer feeds the final matching feature vector into a multilayer perceptron to obtain a floating-point value, compares this value, taken as the matching degree, with a preset threshold, and judges from the comparison result whether the semantics of the texts match.
Preferably, the embedding layer comprises the construction of a character-word mapping conversion table, an input layer and a character-word vector mapping layer, and outputs the text character embedding representation and word embedding representation;
wherein the character-word mapping conversion table: the mapping rule is that, starting from the number 1, each character or word is assigned an incrementally increasing identifier in the order in which it is recorded in the character/word vocabulary, thereby forming the character-word mapping conversion table; Word2Vec is then used to train a character/word vector model, obtaining the vector matrix of each character and word;
the input layer: the input layer has four inputs; character-segmentation and word-segmentation preprocessing are applied to each text in the training data set (or each text to be predicted), obtaining txt P_char, txt Q_char, txt P_word and txt Q_word respectively, where the suffixes char and word indicate that the corresponding text has undergone character segmentation or word segmentation, formalized as (txt P_char, txt Q_char, txt P_word, txt Q_word); each character and word in the input texts is then converted into the corresponding numerical identifier according to the character-word mapping conversion table;
the character-word vector mapping layer: the character/word vector matrix obtained by training in the step of constructing the character-word mapping conversion table is loaded to initialize the weight parameters of this layer; for the input texts txt P_char, txt Q_char, txt P_word and txt Q_word, the corresponding text character embedding representations and word embedding representations txt P_char_embedded, txt Q_char_embedded, txt P_word_embedded and txt Q_word_embedded are obtained.
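To make the embedding-layer construction concrete, the following is a minimal preprocessing sketch in Python. It assumes gensim's Word2Vec for training the character/word vectors and jieba for Chinese word segmentation; the helper names (tokenize, build_vocab, texts_to_ids) and the dimension/length values are illustrative and not taken from the patent.

```python
import jieba
import numpy as np
from gensim.models import Word2Vec

def tokenize(text, level):
    # character granularity: split into single characters; word granularity: jieba segmentation
    return list(text) if level == "char" else jieba.lcut(text)

def build_vocab(token_lists):
    # identifiers start from 1 and increase in the order tokens are first recorded (0 kept for padding)
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab

def train_vectors(token_lists, dim=300):
    # Word2Vec vectors for every character/word in the corpus
    return Word2Vec(sentences=token_lists, vector_size=dim, min_count=1, window=5)

def embedding_matrix(vocab, w2v, dim=300):
    # row i holds the vector of the character/word with identifier i
    mat = np.zeros((len(vocab) + 1, dim), dtype="float32")
    for tok, idx in vocab.items():
        if tok in w2v.wv:
            mat[idx] = w2v.wv[tok]
    return mat

def texts_to_ids(texts, vocab, level, maxlen=40):
    # convert each text into a fixed-length sequence of numerical identifiers
    ids = np.zeros((len(texts), maxlen), dtype="int32")
    for r, text in enumerate(texts):
        for c, tok in enumerate(tokenize(text, level)[:maxlen]):
            ids[r, c] = vocab.get(tok, 0)
    return ids
```

The same routines would be applied twice, once at the character granularity and once at the word granularity, to produce the four inputs txt P_char, txt Q_char, txt P_word and txt Q_word.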
More preferably, the implementation details of the semantic coding layer are as follows:
taking text P as an example, this module receives the text P character embedding representation and word embedding representation, encodes them with the bidirectional long short-term memory network BiLSTM, and obtains the text P character-granularity feature and word-granularity feature, denoted P_c and P_w. The specific formulas are as follows:

P_c = {p_i^c}, p_i^c = [→p_i^c ; ←p_i^c] = BiLSTM(txt P_char_embedded, i), i = 1, ..., N    (1)
P_w = {p_j^w}, p_j^w = [→p_j^w ; ←p_j^w] = BiLSTM(txt P_word_embedded, j), j = 1, ..., N    (2)

where N denotes the length of the character-granularity and word-granularity features; formula (1) encodes the text P character embedding representation with the bidirectional long short-term memory network BiLSTM, where p_i^c denotes the character-granularity feature at the i-th position of text P obtained by BiLSTM encoding, →p_i^c denotes the one obtained by forward LSTM encoding, and ←p_i^c the one obtained by backward LSTM encoding; the symbols in formula (2) are essentially the same as in formula (1), with p_j^w, →p_j^w and ←p_j^w denoting the word-granularity feature at the j-th position of text P obtained by BiLSTM, forward LSTM and backward LSTM encoding respectively.
Similarly, text Q is processed in the same way as text P, obtaining the text Q character-granularity and word-granularity features, denoted Q_c and Q_w.
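A minimal sketch of the semantic coding layer, assuming a Keras BiLSTM; the hidden size of 128 and the trainable embedding are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def encode_granularity(ids, emb_matrix, units=128):
    """Encode one granularity (character or word) of one text, cf. formulas (1)-(2)."""
    # ids: (batch, N) integer tensor of character/word identifiers
    emb = layers.Embedding(
        input_dim=emb_matrix.shape[0],
        output_dim=emb_matrix.shape[1],
        embeddings_initializer=tf.keras.initializers.Constant(emb_matrix),
        trainable=True)(ids)
    # forward and backward LSTM states are concatenated at every position
    return layers.Bidirectional(layers.LSTM(units, return_sequences=True))(emb)

# P_c = encode_granularity(p_char_ids, char_emb_matrix)
# P_w = encode_granularity(p_word_ids, word_emb_matrix)   # and likewise Q_c, Q_w
```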
Preferably, the implementation details of the multi-level fine-grained feature extraction layer are as follows:
intra-text and inter-text coding operations are performed on the character-granularity and word-granularity features output by the semantic coding layer, obtaining the fine-grained semantic features within each text and the semantic interaction features between the texts; the layer comprises two sub-modules: the first sub-module is responsible for extracting the fine-grained semantic features within the same text, mainly using several attention modules to encode the different granularities of the same text; the second sub-module is responsible for extracting the semantic interaction features between the texts, mainly using several layers of coding structures between the texts;
extraction of fine-grained semantic features within the same text by the first sub-module:
first, for the convenience of subsequent description and taking text P as an example, the following attention modules are defined:
the soft alignment attention module, denoted SOA, is defined by formula (3):

e_ij = p_i^c · p_j^w,  α_ij = softmax_j(e_ij),  β_ij = softmax_i(e_ij),
p̃_i^c = Σ_j α_ij · p_j^w,  p̃_j^w = Σ_i β_ij · p_i^c    (3)

where p_i^c denotes the character-granularity feature at the i-th position of text P from formula (1), p_j^w denotes the word-granularity feature at the j-th position of text P from formula (2), e_ij denotes the soft alignment attention weight between them, the softmax operations map the attention weights to values between 0 and 1, p̃_i^c indicates that the character-granularity feature at the i-th position of text P is re-expressed by soft alignment attention as the weighted sum of all word-granularity features of text P, and p̃_j^w indicates that the word-granularity feature at the j-th position of text P is re-expressed as the weighted sum of all character-granularity features of text P;
the multiplicative alignment attention module, denoted MUA, is defined by formula (4):

ū_i^c = tanh(TimeDistributed(Dense(p_i^c))),
m_ij = ū_i^c ⊙ p_j^w,  α_ij = softmax_j(m_ij),  β_ij = softmax_i(m_ij),
p̂_i^c = Σ_j α_ij ⊙ p_j^w,  p̂_j^w = Σ_i β_ij ⊙ p_i^c    (4)

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, tanh denotes the activation function, ⊙ denotes the element-wise multiplication operation, P_c denotes the text P character-granularity features and ū_i^c the character-granularity feature at the i-th position after the Dense() layer, P_w denotes the text P word-granularity features, m_ij denotes the multiplicative alignment attention weight, the softmax operations map the multiplicative alignment attention weights to values between 0 and 1, p̂_i^c indicates that the character-granularity feature at the i-th position of text P is re-expressed by multiplicative alignment attention as the weighted sum of all word-granularity features of text P, and p̂_j^w indicates the same for the word-granularity feature at the j-th position of text P;
the subtraction alignment attention module, denoted SUA, is defined by formula (5):

ū_i^c = tanh(TimeDistributed(Dense(p_i^c))),
s_ij = ū_i^c − p_j^w,  α_ij = softmax_j(s_ij),  β_ij = softmax_i(s_ij),
p̌_i^c = Σ_j α_ij ⊙ p_j^w,  p̌_j^w = Σ_i β_ij ⊙ p_i^c    (5)

where TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, − here denotes the element-wise subtraction operation, tanh denotes the activation function, P_c denotes the text P character-granularity features and ū_i^c the character-granularity feature at the i-th position after the Dense() layer, P_w denotes the text P word-granularity features, s_ij denotes the subtraction alignment attention weight, the softmax operations map the subtraction alignment attention weights to values between 0 and 1, p̌_i^c indicates that the character-granularity feature at the i-th position of text P is re-expressed by subtraction alignment attention as the weighted sum of all word-granularity features of text P, and p̌_j^w indicates the same for the word-granularity feature at the j-th position of text P;
the self-alignment attention module, denoted SEA, is defined by formula (6), taking the character-granularity features of text P as an example:

e_ij = p_i^c · p_j^c,  α_ij = softmax_j(e_ij),  p̄_i^c = Σ_j α_ij · p_j^c    (6)

where p_i^c denotes the character-granularity feature at the i-th position of text P, p_j^c denotes the character-granularity feature at the j-th position of text P, e_ij denotes the self-alignment attention weight between them, the softmax operation maps the self-alignment attention weights to values between 0 and 1, and p̄_i^c indicates that the character-granularity feature at the i-th position of text P is re-expressed by self-alignment attention as the weighted sum of all character-granularity features of text P;
in the following description, the symbol SOA denotes the operation of formula (3), MUA denotes the operation of formula (4), SUA denotes the operation of formula (5), and SEA denotes the operation of formula (6);
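The shared skeleton of these attention modules (alignment scores, softmax normalization, weighted re-expression) can be sketched as follows. The dot-product scoring shown follows formulas (3) and (6) for SOA and SEA; MUA and SUA differ only in how the alignment weights are computed.

```python
import tensorflow as tf

def soa(a, b):
    """Soft alignment attention between two sequences, cf. formula (3)."""
    # a: (batch, Na, d), b: (batch, Nb, d)
    e = tf.matmul(a, b, transpose_b=True)            # (batch, Na, Nb) alignment weights
    a_new = tf.matmul(tf.nn.softmax(e, axis=2), b)   # each position of a re-expressed over b
    b_new = tf.matmul(tf.nn.softmax(e, axis=1), a, transpose_a=True)  # each position of b over a
    return a_new, b_new

def sea(a):
    """Self-alignment attention within a single sequence, cf. formula (6)."""
    e = tf.matmul(a, a, transpose_b=True)
    return tf.matmul(tf.nn.softmax(e, axis=2), a)
```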
the first layer of the coding structure uses several attention modules to extract the initial fine-grained semantic features within the same text:
first, using soft alignment attention, soft alignment attention is performed between the text P character-granularity feature P_c and the text P word-granularity feature P_w, obtaining the character-granularity-level text P soft alignment feature P_c^so and the word-granularity-level text P soft alignment feature P_w^so, as shown in formula (7):

(P_c^so, P_w^so) = SOA(P_c, P_w)    (7)

second, using multiplicative alignment attention, multiplicative alignment attention is performed between the text P character-granularity feature P_c and the text P word-granularity feature P_w, obtaining the character-granularity-level text P multiplicative alignment feature P_c^mu and the word-granularity-level text P multiplicative alignment feature P_w^mu, as shown in formula (8):

(P_c^mu, P_w^mu) = MUA(P_c, P_w)    (8)

then, using subtraction alignment attention, subtraction alignment attention is performed between the text P character-granularity feature P_c and the text P word-granularity feature P_w, obtaining the character-granularity-level text P subtraction alignment feature P_c^su and the word-granularity-level text P subtraction alignment feature P_w^su, as shown in formula (9):

(P_c^su, P_w^su) = SUA(P_c, P_w)    (9)

similarly, text Q is processed in the same way as text P, obtaining the character-granularity-level text Q soft alignment feature Q_c^so, the word-granularity-level text Q soft alignment feature Q_w^so, the character-granularity-level text Q multiplicative alignment feature Q_c^mu, the word-granularity-level text Q multiplicative alignment feature Q_w^mu, the character-granularity-level text Q subtraction alignment feature Q_c^su and the word-granularity-level text Q subtraction alignment feature Q_w^su; this completes the extraction of the initial fine-grained semantic features within the same text;
the second layer of the coding structure enhances the initial fine-grained semantic features within the same text to complete the extraction of the fine-grained semantic features within the same text:
first, the character-granularity-level text P soft alignment feature P_c^so from formula (7) and the text P character-granularity feature P_c from formula (1) are added, obtaining the character-granularity-level text P deep soft alignment feature P_c^dso, as shown in formula (10):

P_c^dso = P_c^so + P_c    (10)

then, the character-granularity-level text P multiplicative alignment feature P_c^mu from formula (8) and the text P character-granularity feature P_c from formula (1) are added, obtaining the character-granularity-level text P deep multiplicative alignment feature P_c^dmu, as shown in formula (11):

P_c^dmu = P_c^mu + P_c    (11)

then, the character-granularity-level text P subtraction alignment feature P_c^su from formula (9) and the text P character-granularity feature P_c from formula (1) are added, obtaining the character-granularity-level text P deep subtraction alignment feature P_c^dsu, as shown in formula (12):

P_c^dsu = P_c^su + P_c    (12)

then, the character-granularity-level text P deep soft alignment feature P_c^dso from formula (10), the character-granularity-level text P deep multiplicative alignment feature P_c^dmu from formula (11) and the character-granularity-level text P deep subtraction alignment feature P_c^dsu from formula (12) are concatenated, obtaining the character-granularity-level text P high-level feature P'_c, as shown in formula (13):

P'_c = [P_c^dso ; P_c^dmu ; P_c^dsu]    (13)
the word granularity is handled similarly to the character granularity: first, the word-granularity-level text P soft alignment feature P_w^so from formula (7) and the text P word-granularity feature P_w from formula (2) are added, obtaining the word-granularity-level text P deep soft alignment feature P_w^dso, as shown in formula (14):

P_w^dso = P_w^so + P_w    (14)

then, the word-granularity-level text P multiplicative alignment feature P_w^mu from formula (8) and the text P word-granularity feature P_w from formula (2) are added, obtaining the word-granularity-level text P deep multiplicative alignment feature P_w^dmu, as shown in formula (15):

P_w^dmu = P_w^mu + P_w    (15)

then, the word-granularity-level text P subtraction alignment feature P_w^su from formula (9) and the text P word-granularity feature P_w from formula (2) are added, obtaining the word-granularity-level text P deep subtraction alignment feature P_w^dsu, as shown in formula (16):

P_w^dsu = P_w^su + P_w    (16)

next, the word-granularity-level text P deep soft alignment feature P_w^dso from formula (14), the word-granularity-level text P deep multiplicative alignment feature P_w^dmu from formula (15) and the word-granularity-level text P deep subtraction alignment feature P_w^dsu from formula (16) are concatenated, obtaining the word-granularity-level text P high-level feature P'_w, as shown in formula (17):

P'_w = [P_w^dso ; P_w^dmu ; P_w^dsu]    (17)
the character-granularity-level text P deep soft alignment feature P_c^dso from formula (10) and the word-granularity-level text P deep soft alignment feature P_w^dso from formula (14) are concatenated, obtaining the text P deep semantic feature P'_deep, as shown in formula (18):

P'_deep = [P_c^dso ; P_w^dso]    (18)
similarly, text Q is processed in the same way as text P, obtaining the character-granularity-level text Q deep soft alignment feature Q_c^dso, the character-granularity-level text Q deep multiplicative alignment feature Q_c^dmu, the character-granularity-level text Q deep subtraction alignment feature Q_c^dsu, the character-granularity-level text Q high-level feature Q'_c, the word-granularity-level text Q deep soft alignment feature Q_w^dso, the word-granularity-level text Q deep multiplicative alignment feature Q_w^dmu, the word-granularity-level text Q deep subtraction alignment feature Q_w^dsu, the word-granularity-level text Q high-level feature Q'_w and the text Q deep semantic feature Q'_deep; this completes the extraction of the fine-grained semantic features within the same text;
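The second-layer enhancement described above is essentially a residual addition followed by concatenation; a sketch under that reading, with illustrative variable names (the concatenation axes are assumptions):

```python
import tensorflow as tf

def enhance(original, soft_aligned, mult_aligned, sub_aligned):
    """Residual-add each aligned feature, then concatenate, cf. formulas (10)-(17)."""
    deep_soft = soft_aligned + original        # e.g. formula (10) / (14)
    deep_mult = mult_aligned + original        # e.g. formula (11) / (15)
    deep_sub = sub_aligned + original          # e.g. formula (12) / (16)
    high = tf.concat([deep_soft, deep_mult, deep_sub], axis=-1)   # formula (13) / (17)
    return high, deep_soft

# P_high_c, P_deep_soft_c = enhance(P_c, P_soa_c, P_mua_c, P_sua_c)
# P_high_w, P_deep_soft_w = enhance(P_w, P_soa_w, P_mua_w, P_sua_w)
# P_deep_same = tf.concat([P_deep_soft_c, P_deep_soft_w], axis=-1)  # formula (18)
```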
extraction of the semantic interaction features between texts by the second sub-module:
the first layer of the coding structure uses several attention modules simultaneously to extract the initial semantic interaction features between the texts:
at the character granularity, first, soft alignment attention is performed between the text P character-granularity feature P_c from formula (1) and the text Q character-granularity feature Q_c, obtaining the character-granularity-level text P soft alignment interaction feature P_c^iso and the character-granularity-level text Q soft alignment interaction feature Q_c^iso, as shown in formula (19):

(P_c^iso, Q_c^iso) = SOA(P_c, Q_c)    (19)

second, subtraction alignment attention is performed between the text P character-granularity feature P_c from formula (1) and the text Q character-granularity feature Q_c, obtaining the character-granularity-level text P subtraction alignment interaction feature P_c^isu and the character-granularity-level text Q subtraction alignment interaction feature Q_c^isu, as shown in formula (20):

(P_c^isu, Q_c^isu) = SUA(P_c, Q_c)    (20)

the word granularity is handled similarly to the character granularity: first, soft alignment attention is performed between the text P word-granularity feature P_w from formula (2) and the text Q word-granularity feature Q_w, obtaining the word-granularity-level text P soft alignment interaction feature P_w^iso and the word-granularity-level text Q soft alignment interaction feature Q_w^iso, as shown in formula (21):

(P_w^iso, Q_w^iso) = SOA(P_w, Q_w)    (21)

then, subtraction alignment attention is performed between the text P word-granularity feature P_w from formula (2) and the text Q word-granularity feature Q_w, obtaining the word-granularity-level text P subtraction alignment interaction feature P_w^isu and the word-granularity-level text Q subtraction alignment interaction feature Q_w^isu, as shown in formula (22):

(P_w^isu, Q_w^isu) = SUA(P_w, Q_w)    (22)
the second layer of the coding structure enhances the initial semantic interaction features between the texts to complete the extraction of the semantic interaction features between the texts:
at the character granularity, first, the character-granularity-level text P soft alignment interaction feature P_c^iso from formula (19) and the text P character-granularity feature P_c from formula (1) are added, obtaining the character-granularity-level text P deep soft alignment interaction feature P_c^diso, as shown in formula (23):

P_c^diso = P_c^iso + P_c    (23)

then, the character-granularity-level text P subtraction alignment interaction feature P_c^isu from formula (20) and the text P character-granularity feature P_c from formula (1) are added, obtaining the character-granularity-level text P deep subtraction alignment interaction feature P_c^disu, as shown in formula (24):

P_c^disu = P_c^isu + P_c    (24)

finally, the character-granularity-level text P deep soft alignment interaction feature P_c^diso from formula (23) and the character-granularity-level text P deep subtraction alignment interaction feature P_c^disu from formula (24) are concatenated, obtaining the character-granularity-level text P high-level interaction feature P*_c, as shown in formula (25):

P*_c = [P_c^diso ; P_c^disu]    (25)
at the word granularity, first, the word-granularity-level text P soft alignment interaction feature P_w^iso from formula (21) and the text P word-granularity feature P_w from formula (2) are added, obtaining the word-granularity-level text P deep soft alignment interaction feature P_w^diso, as shown in formula (26):

P_w^diso = P_w^iso + P_w    (26)

then, the word-granularity-level text P subtraction alignment interaction feature P_w^isu from formula (22) and the text P word-granularity feature P_w from formula (2) are added, obtaining the word-granularity-level text P deep subtraction alignment interaction feature P_w^disu, as shown in formula (27):

P_w^disu = P_w^isu + P_w    (27)

finally, the word-granularity-level text P deep soft alignment interaction feature P_w^diso from formula (26) and the word-granularity-level text P deep subtraction alignment interaction feature P_w^disu from formula (27) are concatenated, obtaining the word-granularity-level text P high-level interaction feature P*_w, as shown in formula (28):

P*_w = [P_w^diso ; P_w^disu]    (28)
the character-granularity-level text P deep subtraction alignment interaction feature P_c^disu from formula (24) and the word-granularity-level text P deep subtraction alignment interaction feature P_w^disu from formula (27) are concatenated, obtaining the text P deep semantic interaction feature P_deep, as shown in formula (29):

P_deep = [P_c^disu ; P_w^disu]    (29)
similarly, text Q is processed in the same way as text P, obtaining the character-granularity-level text Q deep soft alignment interaction feature Q_c^diso, the character-granularity-level text Q deep subtraction alignment interaction feature Q_c^disu, the character-granularity-level text Q high-level interaction feature Q*_c, the word-granularity-level text Q deep soft alignment interaction feature Q_w^diso, the word-granularity-level text Q deep subtraction alignment interaction feature Q_w^disu, the word-granularity-level text Q high-level interaction feature Q*_w and the text Q deep semantic interaction feature Q_deep; this completes the extraction of the semantic interaction features between the texts.
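The inter-text branch follows the same residual-add and concatenate pattern with two alignment modules (soft and subtractive); a sketch, assuming alignment functions with the two-output signature of soa() above and illustrative names:

```python
import tensorflow as tf

def interact(p, q, align_fns):
    """Inter-text interaction for one granularity, cf. formulas (19)-(28)."""
    # p, q: granularity features of texts P and Q; align_fns: e.g. [soa, sua]
    aligned_p, aligned_q = [], []
    for fn in align_fns:
        ap, aq = fn(p, q)
        aligned_p.append(ap + p)   # deep interaction features, formulas (23)-(24) / (26)-(27)
        aligned_q.append(aq + q)
    # high-level interaction features, formulas (25) / (28)
    return tf.concat(aligned_p, axis=-1), tf.concat(aligned_q, axis=-1)
```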
More preferably, the implementation details of the feature fusion layer are as follows:
first, for the convenience of subsequent description, the following definitions are made:
the operation of subtracting two vectors and then taking the bit-wise absolute value is denoted AB, as shown in formula (30):
AB(P,Q) = |P − Q|    (30)
where P and Q are two different vectors, and the operation takes the bit-wise absolute value after subtracting the two vectors;
the bit-wise multiplication of two vectors is denoted MU, as shown in formula (31):
MU(P,Q) = P ⊙ Q    (31)
where P and Q are two different vectors, and ⊙ denotes the bit-wise multiplication of the vectors P and Q;
in the following description, the symbol AB denotes the operation of formula (30), MU denotes the operation of formula (31), SOA denotes the operation of formula (3), MUA denotes the operation of formula (4), SUA denotes the operation of formula (5), and SEA denotes the operation of formula (6);
the feature fusion layer is divided into two sub-modules, the first sub-module combines various related features, and the second sub-module performs various matching operations to obtain a final matching feature vector;
the first sub-module combines a plurality of relevant features:
the character-granularity-level text P high-level feature P'_c from formula (13) and the character-granularity-level text P high-level interaction feature P*_c from formula (25) are concatenated, obtaining the character-granularity-level text P aggregation feature P_c^agg, and self-alignment attention is applied to it, obtaining the character-granularity-level text P deep aggregation feature P_c^dagg, as shown in formula (32):

P_c^agg = [P'_c ; P*_c],  P_c^dagg = SEA(P_c^agg)    (32)

the word granularity is handled similarly to the character granularity: the word-granularity-level text P high-level feature P'_w from formula (17) and the word-granularity-level text P high-level interaction feature P*_w from formula (28) are concatenated, obtaining the word-granularity-level text P aggregation feature P_w^agg, and self-alignment attention is applied to it, obtaining the word-granularity-level text P deep aggregation feature P_w^dagg, as shown in formula (33):

P_w^agg = [P'_w ; P*_w],  P_w^dagg = SEA(P_w^agg)    (33)
then, the character-granularity-level text P deep aggregation feature P_c^dagg from formula (32) and the word-granularity-level text P deep aggregation feature P_w^dagg from formula (33) are concatenated and a max-pooling operation is applied, obtaining the pooled text P semantic feature P', as shown in formula (34):

P' = MaxPooling([P_c^dagg ; P_w^dagg])    (34)

next, the text P deep semantic feature P'_deep from formula (18) and the text P deep semantic interaction feature P_deep from formula (29) are concatenated, obtaining the text P deep aggregation feature P_agg, as shown in formula (35):

P_agg = [P'_deep ; P_deep]    (35)
similarly, the same operations as for text P are applied to text Q, obtaining the character-granularity-level text Q aggregation feature Q_c^agg, the character-granularity-level text Q deep aggregation feature Q_c^dagg, the word-granularity-level text Q aggregation feature Q_w^agg, the word-granularity-level text Q deep aggregation feature Q_w^dagg, the pooled text Q semantic feature Q' and the text Q deep aggregation feature Q_agg; then, soft alignment attention is performed between the text P deep aggregation feature P_agg from formula (35) and the text Q deep aggregation feature Q_agg, obtaining the soft-aligned text P deep aggregation feature P_agg^so and the soft-aligned text Q deep aggregation feature Q_agg^so, as shown in formula (36):

(P_agg^so, Q_agg^so) = SOA(P_agg, Q_agg)    (36)
then, a max-pooling operation is applied to the soft-aligned text P deep aggregation feature P_agg^so from formula (36), obtaining the pooled text P deep aggregation feature P″, and a max-pooling operation is applied to the soft-aligned text Q deep aggregation feature Q_agg^so, obtaining the pooled text Q deep aggregation feature Q″, as shown in formula (37):

P″ = MaxPooling(P_agg^so),  Q″ = MaxPooling(Q_agg^so)    (37)
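A sketch of the aggregation and pooling steps of formulas (32)-(37); self_attend and soft_align stand for the SEA and SOA operations defined earlier and are passed in as callables so the snippet stays self-contained, and the variable names are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def aggregate_text(high_c, inter_c, high_w, inter_w, deep_same, deep_inter, self_attend):
    """Per-text aggregation, cf. formulas (32)-(35)."""
    agg_c = self_attend(tf.concat([high_c, inter_c], axis=-1))    # formula (32)
    agg_w = self_attend(tf.concat([high_w, inter_w], axis=-1))    # formula (33)
    pooled = layers.GlobalMaxPooling1D()(tf.concat([agg_c, agg_w], axis=-1))  # P', formula (34)
    deep_agg = tf.concat([deep_same, deep_inter], axis=-1)        # P_agg, formula (35)
    return pooled, deep_agg

def cross_pool(p_deep_agg, q_deep_agg, soft_align):
    """Cross-text soft alignment and max pooling, cf. formulas (36)-(37)."""
    p_soft, q_soft = soft_align(p_deep_agg, q_deep_agg)           # formula (36)
    pool = layers.GlobalMaxPooling1D()
    return pool(p_soft), pool(q_soft)                             # P'', Q'', formula (37)
```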
the second sub-module performs multiple matching operations to obtain a final matching feature vector:
first, the element-wise absolute difference of the pooled text P semantic feature P' and the pooled text Q semantic feature Q' from formula (34) is taken, obtaining the subtraction matching feature PQ_ab, as shown in formula (38):
PQ_ab = AB(P', Q')    (38)
second, the pooled text P semantic feature P' and the pooled text Q semantic feature Q' from formula (34) are multiplied element-wise, obtaining the dot-product matching feature PQ_mu, as shown in formula (39):
PQ_mu = MU(P', Q')    (39)
third, the element-wise absolute difference of the pooled text P deep aggregation feature P″ and the pooled text Q deep aggregation feature Q″ from formula (37) is taken, obtaining the deep subtraction matching feature PQ'_ab, as shown in formula (40):
PQ'_ab = AB(P″, Q″)    (40)
then, the pooled text P deep aggregation feature P″ and the pooled text Q deep aggregation feature Q″ from formula (37) are multiplied element-wise, obtaining the deep dot-product matching feature PQ'_mu, as shown in formula (41):
PQ'_mu = MU(P″, Q″)    (41)
finally, the pooled text P semantic feature P' from formula (34), the pooled text Q semantic feature Q', the subtraction matching feature PQ_ab from formula (38), the dot-product matching feature PQ_mu from formula (39), the deep subtraction matching feature PQ'_ab from formula (40) and the deep dot-product matching feature PQ'_mu from formula (41) are concatenated, obtaining the final matching feature vector F, as shown in formula (42):
F = [P'; Q'; PQ_ab; PQ_mu; PQ'_ab; PQ'_mu]    (42)
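The matching operations of formulas (38)-(42) reduce to absolute difference, element-wise product and concatenation; a minimal sketch:

```python
import tensorflow as tf

def matching_vector(p1, q1, p2, q2):
    # p1, q1: pooled semantic features P', Q'; p2, q2: pooled deep aggregation features P'', Q''
    pq_ab = tf.abs(p1 - q1)    # AB(P', Q'),   formula (38)
    pq_mu = p1 * q1            # MU(P', Q'),   formula (39)
    pq_ab2 = tf.abs(p2 - q2)   # AB(P'', Q''), formula (40)
    pq_mu2 = p2 * q2           # MU(P'', Q''), formula (41)
    return tf.concat([p1, q1, pq_ab, pq_mu, pq_ab2, pq_mu2], axis=-1)  # F, formula (42)
```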
More preferably, the implementation details of the prediction layer are as follows:
the final matching feature vector F is taken as input and passed through three fully connected layers, with the ReLU activation function applied after the first and second fully connected layers and the sigmoid function applied after the third fully connected layer, producing a value in [0, 1] that is taken as the matching degree and recorded as y_pred; finally, whether the text semantics match is judged by comparing it with the preset threshold of 0.5, i.e. when y_pred ≥ 0.5 the text semantics are predicted to match, otherwise they do not match; if the text semantic matching model has not yet been trained, it needs to be trained on the training data set constructed from the semantic matching knowledge base in order to optimize the model parameters; once the model has been trained, the prediction layer can predict whether the semantics of the target texts match.
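A sketch of the prediction layer as described: three fully connected layers with ReLU, ReLU and sigmoid activations and a 0.5 decision threshold; the hidden sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def prediction_head(f):
    # f: final matching feature vector F, shape (batch, d)
    h = layers.Dense(256, activation="relu")(f)
    h = layers.Dense(128, activation="relu")(h)
    y_pred = layers.Dense(1, activation="sigmoid")(h)   # matching degree in [0, 1]
    return y_pred

# the texts are predicted to match when y_pred >= 0.5, otherwise not to match
```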
Preferably, the construction of the text semantic matching knowledge base comprises downloading a data set from the network to obtain raw data, preprocessing the raw data, and summarizing the sub-knowledge bases;
downloading a data set from the network to obtain raw data: a text semantic matching data set already published on the network, or a manually constructed data set, is downloaded and used as the raw data for constructing the text semantic matching knowledge base;
preprocessing the raw data: the raw data used for constructing the text semantic matching knowledge base are preprocessed, and character segmentation and word segmentation are applied to each text, obtaining a character-segmented text semantic matching knowledge base and a word-segmented text semantic matching knowledge base;
summarizing the sub-knowledge bases: the character-segmented text semantic matching knowledge base and the word-segmented text semantic matching knowledge base are combined to construct the text semantic matching knowledge base;
the text semantic matching model is obtained by training through a training data set, and the construction process of the training data set comprises the steps of constructing a training positive case, constructing a training negative case and constructing a training data set;
constructing a training positive example: any pair of texts in the text semantic matching knowledge base whose semantics are consistent can be used to construct a training positive example;
constructing a training negative example: selecting a text txt P, randomly selecting a text txt Q which is not matched with the text txt P from a text semantic matching knowledge base, and combining the txt P and the txt Q to construct a negative case;
constructing a training data set: combining all positive example data and negative example data obtained after the operations of constructing the training positive example and constructing the training negative example, and disordering the sequence of the positive example data and the negative example data to construct a final training data set;
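A small sketch of the positive/negative pair construction described above; sentence_pairs is assumed to be a list of semantically consistent question pairs and all_questions the pool used for random negative sampling (names and the sampling ratio are illustrative).

```python
import random

def build_dataset(sentence_pairs, all_questions, neg_per_pos=1, seed=42):
    rng = random.Random(seed)
    data = [(p, q, 1) for p, q in sentence_pairs]          # positive examples
    for p, q in sentence_pairs:
        for _ in range(neg_per_pos):
            neg = rng.choice(all_questions)
            while neg == q or neg == p:                    # ensure the sampled text does not match
                neg = rng.choice(all_questions)
            data.append((p, neg, 0))                       # negative examples
    rng.shuffle(data)                                      # shuffle positive and negative examples
    return data
```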
after the text semantic matching model is built, training and optimizing the text semantic matching model through a training data set, which specifically comprises the following steps:
constructing a loss function: as described in the prediction layer implementation, y_pred is the matching degree value computed by the text semantic matching model, and y_true is the true label indicating whether the semantics of the two texts match, restricted to 0 or 1; cross entropy is used as the loss function;
constructing an optimization function: the Adam optimization function is used, and the text semantic matching model is optimized by training on the training data set.
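A sketch of the training configuration: binary cross entropy between y_pred and y_true, optimised with Adam; the learning rate and the commented fit settings are illustrative assumptions.

```python
import tensorflow as tf

def compile_model(model: tf.keras.Model) -> tf.keras.Model:
    # cross entropy between the predicted matching degree y_pred and the 0/1 label y_true
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# compile_model(text_matching_model)
# text_matching_model.fit([p_char, q_char, p_word, q_word], labels, batch_size=64, epochs=10)
```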
A text semantic matching device for medical intelligent question answering comprises a text semantic matching knowledge base building unit, a training data set generating unit, a text semantic matching model building unit and a text semantic matching model training unit;
wherein the specific function of each unit is as follows:
the text semantic matching knowledge base construction unit is used for acquiring a large amount of text data and then preprocessing the text data so as to acquire a text semantic matching knowledge base which meets the training requirement;
the training data set generating unit is used for matching data in the text semantic matching knowledge base, if the semantics of the data are consistent, the text is used for constructing a training positive example, otherwise, the text is used for constructing a training negative example, and all the positive example data and the negative example data are mixed to obtain a training data set;
the text semantic matching model building unit is used for building a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer;
and the text semantic matching model training unit is used for constructing a training loss function and an optimization function and finishing the training of the model.
A storage medium having stored therein a plurality of instructions, the instructions being loadable by a processor and adapted to perform the steps of the above text semantic matching method for medical intelligent question answering.
An electronic device, the electronic device comprising:
the storage medium described above; and
a processor to execute the instructions in the storage medium.
The text semantic matching method and device for medical intelligent question answering have the following advantages:
firstly, embedding operations are performed on the text at the character granularity and the word granularity, so that the semantic information contained in the different granularities of the text is extracted and the extracted semantic features are more detailed and abundant;
secondly, the text is semantically encoded with a bidirectional long short-term memory network, which better captures the bidirectional semantic dependencies of the text;
thirdly, by constructing the fine-grained feature extraction layer, semantic features of different granularities and different levels can be captured, extracting semantic features of as many granularities and as deep levels as possible;
fourthly, the texts are semantically encoded through attention mechanisms, which effectively captures the dependency relationships between the texts and between the granularities within a text, so that the generated text matching tensor has rich interaction features, improving the prediction accuracy of the model;
fifthly, through the max-pooling operation, invalid information in the matching tensor can be effectively filtered out and valid information strengthened, so that the matching process is more accurate and the accuracy of text semantic matching is improved.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a text semantic matching method for medical intelligent question answering;
FIG. 2 is a flow chart for building a text semantic matching knowledge base;
FIG. 3 is a flow chart for constructing a training data set;
FIG. 4 is a flow chart of building a text semantic matching model;
FIG. 5 is a flow chart of training a text semantic matching model;
FIG. 6 is a schematic diagram of a semantic coding layer model (taking text P as an example);
FIG. 7 is a schematic diagram of a structure for extracting fine-grained semantic features of the same text (taking the text P as an example);
FIG. 8 is a schematic diagram of a structure for extracting semantic interaction features between texts;
FIG. 9 is a schematic view of a feature fusion layer;
FIG. 10 is a schematic structural diagram of the text semantic matching device for medical intelligent question answering.
Detailed Description
The text semantic matching method and device for medical intelligent question answering according to the invention are described in detail below with reference to the drawings and specific embodiments of the specification.
Example 1:
the main framework structure of the invention comprises an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer. The embedding layer carries out embedding operation on the input text according to the word granularity and the word granularity respectively, and outputs text word embedding representation and word embedding representation. The semantic coding layer structure is shown in fig. 6, taking a text P as an example, receiving a text P word embedded representation and a word embedded representation, coding and outputting text P word and word granularity features by using a bidirectional long-short term memory network BiLSTM, and transmitting the text P word and word granularity features to a multi-level fine granularity feature extraction layer. The multi-level fine-grained feature extraction layer comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text as shown in fig. 7, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text for different granularities of the same text; the second sub-module, as shown in fig. 8, is responsible for extracting semantic interactive features between texts, and mainly obtains the semantic interactive features between texts by using a plurality of layers of coding structures between texts; in the first sub-module, as shown in fig. 7, taking a text P as an example, the first layer of coding structure uses multiple attention modules to extract the initial semantic features of the same text with fine granularity, which specifically includes: firstly, carrying out soft alignment attention on the granularity characteristic of a text P word and the granularity characteristic of a text P word by using soft alignment attention to obtain the soft alignment characteristic of the text P at the word granularity level and the soft alignment characteristic of the text P at the word granularity level, secondly, carrying out multiplication alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text P word by using multiplication alignment attention to obtain the multiplication alignment characteristic of the text P at the word granularity level and the multiplication alignment characteristic of the text P at the word granularity level, and then carrying out subtraction alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text P word to obtain the subtraction alignment characteristic of the text P at the word granularity level and the subtraction alignment characteristic of the text P at the word granularity level by using subtraction alignment attention to complete the extraction of the initial semantic characteristic of the fine granularity of the same text; the second layer of coding structure enhances the fine-grained initial semantic features of the same text to complete the extraction of the fine-grained semantic features of the same text, and specifically comprises the following steps: firstly, under the word granularity, adding a text P soft alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep soft alignment feature at the word granularity level, then adding a text P multiplication alignment feature at the word granularity level and a text P word granularity feature 
to obtain a text P deep multiplication alignment feature at the word granularity level, then adding a text P subtraction alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep subtraction alignment feature at the word granularity level, and then connecting the text P deep soft alignment feature at the word granularity level, the text P deep multiplication alignment feature at the word granularity level and the text P deep subtraction alignment feature at the word granularity level to obtain a text P high-level feature at the word granularity level; the word granularity is similar to the word granularity, firstly, adding a text P soft alignment feature at a word granularity level and a text P word granularity feature to obtain a text P deep soft alignment feature at the word granularity level, then adding a text P multiplication alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep multiplication alignment feature at the word granularity level, then adding a text P subtraction alignment feature at the word granularity level and a text P word granularity feature to obtain a text P deep subtraction alignment feature at the word granularity level, then connecting the text P deep soft alignment feature at the word granularity level, the text P deep multiplication alignment feature at the word granularity level and the text P deep subtraction alignment feature at the word granularity level to obtain a text P high-level feature at the word granularity level, and connecting the text P deep soft alignment feature at the word granularity level and the text P deep soft alignment feature at the word granularity level to obtain a text P semantic deep soft alignment feature, namely extracting fine granularity features of the same text; in the second sub-module, as shown in fig. 
8, the first layer of coding structure simultaneously uses several layers of coding structures to extract the initial semantic interaction features between texts, which specifically includes: in the word granularity, firstly, carrying out soft alignment attention on the granularity characteristic of a text P word and the granularity characteristic of a text Q word to obtain the soft alignment interactive characteristic of the text P at the word granularity level and the soft alignment interactive characteristic of the text Q at the word granularity level, and secondly, carrying out subtraction alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text Q word to obtain the subtraction alignment interactive characteristic of the text P at the word granularity level and the subtraction alignment interactive characteristic of the text Q at the word granularity level; performing soft alignment attention on the granularity characteristic of a text P word and the granularity characteristic of a text Q word to obtain the soft alignment interactive characteristic of the text P at the word granularity level and the soft alignment interactive characteristic of the text Q at the word granularity level, and then performing subtraction alignment attention on the granularity characteristic of the text P word and the granularity characteristic of the text Q word to obtain the subtraction alignment interactive characteristic of the text P at the word granularity level and the subtraction alignment interactive characteristic of the text Q at the word granularity level; the second layer of coding structure enhances the initial semantic interactive features between texts to complete the extraction of the semantic interactive features between the texts, and specifically comprises the following steps: under the word granularity, firstly, adding the text P soft alignment interactive feature at the word granularity level and the text P word granularity feature to obtain a text P deep soft alignment interactive feature at the word granularity level, then adding the text P subtraction alignment interactive feature at the word granularity level and the text P word granularity feature to obtain a text P deep subtraction alignment interactive feature at the word granularity level, and finally, connecting the text P deep soft alignment interactive feature at the word granularity level and the text P deep subtraction alignment interactive feature at the word granularity level to obtain a text P high-level interactive feature at the word granularity level; in the word granularity, firstly, adding a text P soft alignment interactive feature at a word granularity level and a text P word granularity feature to obtain a text P deep soft alignment interactive feature at the word granularity level, then adding a text P subtraction alignment interactive feature at the word granularity level and the text P word granularity feature to obtain a text P deep subtraction alignment interactive feature at the word granularity level, and finally, connecting the text P deep soft alignment interactive feature at the word granularity level and the text P deep subtraction alignment interactive feature at the word granularity level to obtain a text P high-level interactive feature at the word granularity level, and connecting the text P deep subtraction alignment interactive feature at the word granularity level and the text P deep subtraction alignment interactive feature at the 
word granularity level are connected to obtain a text P deep semantic interactive feature. The text Q is processed in the same way as the text P, yielding the text Q high-level interactive feature at the character granularity level, the text Q high-level interactive feature at the word granularity level and the text Q deep semantic interactive feature, which completes the extraction of the inter-text semantic interaction features. In the feature fusion layer, as shown in fig. 9, the first sub-module merges multiple related features and the second sub-module performs multiple matching operations to obtain the final matching feature vector. Taking the text P as an example, the first sub-module proceeds as follows: at the character granularity, the text P high-level feature at the character granularity level is connected with the text P high-level interactive feature at the character granularity level to obtain a text P aggregation feature at the character granularity level, and self-attention is applied to this aggregation feature to obtain a text P deep aggregation feature at the character granularity level; at the word granularity, the processing is similar, yielding a text P aggregation feature and a text P deep aggregation feature at the word granularity level; the text P deep aggregation features at the character granularity level and at the word granularity level are then connected and a maximum pooling operation is applied to obtain the pooled text P semantic feature, and the same operations are performed on the text Q to obtain the pooled text Q semantic feature. Next, the text P deep semantic interactive feature is connected with the text P deep semantic feature to obtain a text P deep polymerization feature, and the text Q is processed in the same way to obtain a text Q deep polymerization feature; soft alignment attention is applied to the text P deep polymerization feature and the text Q deep polymerization feature to obtain the soft-aligned text P deep polymerization feature and the soft-aligned text Q deep polymerization feature, and a maximum pooling operation is applied to each of them to obtain the pooled text P deep polymerization feature and the pooled text Q deep polymerization feature. The second sub-module performs multiple matching operations to obtain the final matching feature vector, specifically: first, the absolute value of the difference between the pooled text P semantic feature and the pooled text Q semantic feature is taken to obtain a subtraction matching feature; second, the pooled text P semantic feature and the pooled text Q semantic feature are point-multiplied to obtain a point multiplication matching feature; third, the absolute value of the difference between the pooled text P deep polymerization feature and the pooled text Q deep polymerization feature is taken to obtain a deep subtraction matching feature; then, point multiplication is performed on the deep polymerization feature of the pooled text P
and the deep polymerization features of the pooled text Q to obtain deep point multiplication matching features, and finally, connecting the semantic features of the pooled text P, the semantic features of the pooled text Q, the subtraction matching features, the point multiplication matching features, the deep subtraction matching features and the deep point multiplication matching features to obtain final matching feature vectors. And the prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result. The method comprises the following specific steps:
(1) The embedding layer carries out embedding operation on the input text according to the character granularity and the word granularity respectively, and outputs text character embedding representation and word embedding representation;
(2) The semantic coding layer receives the text character embedding representation and word embedding representation, encodes them with the bidirectional long-short term memory network BiLSTM, and outputs the text character and word granularity features;
(3) The multilevel fine-grained feature extraction layer performs the same text and text inter-coding operation on the text character and word granularity features output by the semantic coding layer to obtain the same text fine-grained semantic features and the text inter-semantic interactive features;
(4) The feature fusion layer combines various related features, and then performs various matching operations to obtain a final matching feature vector;
(5) The prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the texts are matched according to the comparison result.
Example 2:
as shown in the attached figure 1, the text semantic matching method for the medical intelligent question answering specifically comprises the following steps:
s1, constructing a text semantic matching knowledge base, as shown in the attached figure 2, and specifically comprising the following steps:
s101, downloading a data set on a network to obtain original data: downloading a text semantic matching data set or a manually constructed data set which is already disclosed on a network, and taking the text semantic matching data set or the manually constructed data set as original data for constructing a text semantic matching knowledge base;
For example: several text semantic matching data sets oriented to medical intelligent question answering have already been published on the network, and a large number of question-answer data pairs exist in medical community forums;
The text pair used in the example is as follows:
txt P: What are the symptoms of the cold?
txt Q: What symptoms can be judged to be a cold?
S102, preprocessing original data: the original data used for constructing the text semantic matching knowledge base is preprocessed, and a word-breaking operation and a word-segmentation operation are performed on each text to obtain a text semantic matching word-breaking processing knowledge base and a word-segmentation processing knowledge base;
Taking txt P shown in S101 as an example, the word-breaking operation splits it into the character sequence of "What are the symptoms of the cold?"; the Jieba word segmentation tool is then used to perform the word segmentation operation on it, producing the word sequence of the same sentence.
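As an illustrative sketch, assuming the original Chinese form of txt P and the jieba library, the word-breaking and word-segmentation operations of this step might be written in Python as follows:

import jieba

raw_text = "感冒表现的症状都有哪些？"            # assumed original Chinese form of txt P
char_sequence = " ".join(list(raw_text))        # word-breaking operation: split into single characters
word_sequence = " ".join(jieba.lcut(raw_text))  # word segmentation operation with the Jieba tool
print(char_sequence)
print(word_sequence)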
S103, summarizing the sub-knowledge base: summarizing a text semantic matching word-breaking processing knowledge base and a text semantic matching word-segmentation processing knowledge base to construct a text semantic matching knowledge base;
The text semantic matching word-breaking processing knowledge base and the text semantic matching word-segmentation processing knowledge base obtained in step S102 are collected under the same folder to obtain the text semantic matching knowledge base; the flow is shown in fig. 2. It should be noted that the data processed by the word-breaking operation and the data processed by the word-segmentation operation are not merged into the same file, i.e. the text semantic matching knowledge base actually comprises two independent sub-knowledge bases.
S2, constructing a text semantic matching model training data set: for the texts in the text semantic matching knowledge base, if their semantics are consistent they can be used to construct a training positive example, and if their semantics are inconsistent they can be used to construct a training negative example; a certain amount of positive example data and negative example data is mixed to construct the training data set required by the model; as shown in fig. 3, the specific steps are as follows:
S201, constructing training positive example data: two texts with consistent semantics are combined to construct positive example data, which is formalized as: (txt P_char, txt Q_char, txt P_word, txt Q_word, 1), where 1 indicates that the semantics of the two texts match;
examples are: after the txt P and txt Q displayed in step S101 are subjected to the word-breaking operation processing and the word-segmentation operation processing in step S102, a formal example data form is constructed as follows:
("感 冒 表 现 的 症 状 都 有 哪 些 ？", "表 现 哪 些 症 状 可 以 判 断 为 感 冒 ？", "感冒 表现 的 症状 都有 哪些 ？", "表现 哪些 症状 可以 判断 为 感冒 ？", 1).
S202, constructing training negative example data: a certain text is selected from the knowledge base, and another text that does not match it semantically is randomly selected to be combined with it; the two texts with inconsistent semantics are constructed into negative example data which, by an operation similar to step S201, is formalized as: (txt P_char, txt Q_char, txt P_word, txt Q_word, 0), where each symbol has the same meaning as in step S201 and 0 indicates that the semantics of the two texts do not match, i.e. a negative example;
For example: the construction is very similar to that of the training positive example and is not described in detail here.
S203, constructing a training data set: all positive example data and negative example data obtained after the operations of the steps S201 and S202 are combined together, the sequence of the positive example data and the negative example data is disordered, and a final training data set is constructed, wherein the positive example data and the negative example data both comprise 5 dimensions, namely txt P _ char, txt Q _ char, txt P _ word, txt Q _ word,0 or 1.
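As an illustrative sketch, the mixing and shuffling of the positive and negative example data in this step could be written as follows; the two toy example tuples are assumptions and are not taken from the actual knowledge base:

import random

# assumed toy example tuples (txtP_char, txtQ_char, txtP_word, txtQ_word, label)
positive_examples = [("感 冒 表 现 的 症 状 都 有 哪 些 ？", "表 现 哪 些 症 状 可 以 判 断 为 感 冒 ？",
                      "感冒 表现 的 症状 都有 哪些 ？", "表现 哪些 症状 可以 判断 为 感冒 ？", 1)]
negative_examples = [("感 冒 表 现 的 症 状 都 有 哪 些 ？", "高 血 压 吃 什 么 药 ？",
                      "感冒 表现 的 症状 都有 哪些 ？", "高血压 吃 什么 药 ？", 0)]

training_set = positive_examples + negative_examples
random.shuffle(training_set)   # disorder the sequence of positive and negative example data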
S3, constructing a text semantic matching model: the method mainly comprises the steps of constructing a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer; as shown in fig. 4, the specific steps are as follows:
S301, constructing the character/word mapping conversion table: the character/word table is constructed from the text semantic matching word-breaking processing knowledge base and word-segmentation processing knowledge base obtained in step S102; after the table is constructed, each character or word in the table is mapped to a unique numeric identifier according to the following rule: starting from the number 1, identifiers are assigned in increasing order according to the order in which each character or word is entered into the table, thereby forming the character/word mapping conversion table required by the invention;
For example: with the content processed in step S102, namely the character sequence "感 冒 表 现 的 症 状 都 有 哪 些 ？" and the word sequence "感冒 表现 的 症状 都有 哪些 ？" of "What are the symptoms of the cold?", the character/word table and mapping conversion table are constructed as follows:

Character: 感  冒  表  现  的  症  状  都  有  哪  些  ？
Mapping:   1   2   3   4   5   6   7   8   9   10  11  12
Word:      感冒  表现  症状  都有  哪些
Mapping:   13   14   15   16   17
Then, Word2Vec is used to train the word vector model, and the character/word vector matrix char_embedding_matrix is obtained;
For example, in Keras, the above process is implemented by the following code:
from gensim import models
import numpy as np
from keras.preprocessing.text import Tokenizer

w2v_model = models.Word2Vec(w2v_corpus, size=EMB_DIM, window=5, min_count=1, sg=1, workers=4, seed=1234, iter=25)
tokenizer = Tokenizer(num_words=len(word_set))
tokenizer.fit_on_texts(w2v_corpus)  # build the character/word index before it is used
embedding_matrix = np.zeros([len(tokenizer.word_index) + 1, EMB_DIM])
for word, idx in tokenizer.word_index.items():
    embedding_matrix[idx, :] = w2v_model.wv[word]
where w2v_corpus is all the data in the text semantic matching knowledge base, EMB_DIM is the vector dimension (the model sets EMB_DIM to 300), and word_set is the character/word table.
S302, constructing an input layer: the input layer comprises four inputs; txt P_char, txt Q_char, txt P_word and txt Q_word are obtained from each training data set sample, and are formed as follows: (txt P_char, txt Q_char, txt P_word, txt Q_word);
for each character and word in the input text, the invention converts the character and word into corresponding numerical identifiers according to the word mapping conversion table constructed in the step S301;
For example: using the texts shown in step S201 as a sample, a piece of input data is formed as follows:
("感 冒 表 现 的 症 状 都 有 哪 些 ？", "表 现 哪 些 症 状 可 以 判 断 为 感 冒 ？", "感冒 表现 的 症状 都有 哪些 ？", "表现 哪些 症状 可以 判断 为 感冒 ？")
Each piece of input data contains 4 sub-texts. Based on the character/word mapping conversion table in step S301, they are converted into numeric representations (assuming that the characters "可", "以", "判", "断", "为" and the words "可以", "判断", which appear in txt Q but do not appear in txt P, are mapped to 18, 19, 20, 21, 22, 23 and 24, respectively); the combined representation of the 4 sub-texts of this piece of input data is as follows:
(“1,2,3,4,5,6,7,8,9,10,11,12”,“3,4,10,11,6,7,18,19,20,21,22,1,2,12”,“13,14,5,15,16,17,12”,“14,17,15,23,24,22,13,12”)。
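As an illustrative sketch, the conversion of the sub-texts into numeric identifiers can be reproduced with the Keras Tokenizer; this is only one possible realization of the mapping rule of step S301, and the identifiers it assigns may differ from the table shown above:

from keras.preprocessing.text import Tokenizer

corpus = ["感 冒 表 现 的 症 状 都 有 哪 些 ？",
          "表 现 哪 些 症 状 可 以 判 断 为 感 冒 ？",
          "感冒 表现 的 症状 都有 哪些 ？",
          "表现 哪些 症状 可以 判断 为 感冒 ？"]
tokenizer = Tokenizer(filters="", lower=False)    # keep the "？" token instead of filtering it out
tokenizer.fit_on_texts(corpus)
sequences = tokenizer.texts_to_sequences(corpus)  # each character/word replaced by its numeric identifier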
s303, constructing a word vector mapping layer: initializing the weight parameter of the current layer by loading the word vector matrix obtained by training in the step of constructing a word mapping conversion table; aiming at input texts txt P _ char, txt Q _ char, txt P _ word and txt Q _ word, obtaining corresponding text word embedding representation and word embedding representation txt P _ char _ embedded, txt Q _ char _ embedded, txt P _ word _ embedded and txt Q _ word _ embedded; each text in the text semantic matching knowledge base can convert text information into a vector form in a word vector mapping mode;
for example, the following steps are carried out: in Keras, the following is implemented for the code described above:
embedding_layer=Embedding(embedding_matrix.shape[0],emb_dim,weights=[embedding_matrix],input_length=input_dim,trainable=False)
where embedding_matrix is the character/word vector matrix obtained by training in the step of constructing the character/word mapping conversion table, embedding_matrix.shape[0] is the size of the character/word table of the vector matrix, emb_dim is the dimension of the output text character embedding representation and word embedding representation, and input_length is the length of the input sequence;
The corresponding texts txt P_char, txt Q_char, txt P_word and txt Q_word are processed by the Embedding layer of Keras to obtain the corresponding text character embedding representations and word embedding representations txt P_char_embedded, txt Q_char_embedded, txt P_word_embedded and txt Q_word_embedded.
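As an illustrative sketch, applying such an Embedding layer to the inputs of the text P might look as follows; the vocabulary size, sequence lengths and the all-zero stand-in for the Word2Vec matrix are assumptions:

import numpy as np
from keras.layers import Input, Embedding

EMB_DIM, VOCAB_SIZE = 300, 25                     # assumed values for illustration
MAX_CHAR_LEN, MAX_WORD_LEN = 14, 8                # assumed maximum sequence lengths
embedding_matrix = np.zeros([VOCAB_SIZE + 1, EMB_DIM])   # stands in for the Word2Vec matrix of step S301

txtP_char = Input(shape=(MAX_CHAR_LEN,), dtype="int32")
txtP_word = Input(shape=(MAX_WORD_LEN,), dtype="int32")
embedding_layer = Embedding(embedding_matrix.shape[0], EMB_DIM,
                            weights=[embedding_matrix], trainable=False)
txtP_char_embedded = embedding_layer(txtP_char)   # text P character embedding representation
txtP_word_embedded = embedding_layer(txtP_word)   # text P word embedding representation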
S304, constructing a semantic coding layer:
Taking the text P as an example, this module receives the text P character embedding representation and word embedding representation and encodes them with the bidirectional long-short term memory network BiLSTM to obtain the text P character granularity feature and word granularity feature, denoted as $P_c=\{p_i^c\}_{i=1}^{N}$ and $P_w=\{p_j^w\}_{j=1}^{N}$. The concrete formulas are as follows:

$p_i^c=\mathrm{BiLSTM}(txtP\_char\_embedded)_i=[\overrightarrow{p_i^c};\overleftarrow{p_i^c}]$  (1)

$p_j^w=\mathrm{BiLSTM}(txtP\_word\_embedded)_j=[\overrightarrow{p_j^w};\overleftarrow{p_j^w}]$  (2)

where N denotes the length of the character granularity feature and of the word granularity feature. Formula (1) encodes the text P character embedding representation with the BiLSTM: $p_i^c$ denotes the i-th position character granularity feature of the text P obtained by the BiLSTM, $\overrightarrow{p_i^c}$ denotes the i-th position character granularity feature of the text P obtained by the forward LSTM, and $\overleftarrow{p_i^c}$ denotes the i-th position character granularity feature of the text P obtained by the backward LSTM. The symbols in formula (2) have essentially the same meaning: $p_j^w$ denotes the j-th position word granularity feature of the text P obtained by the BiLSTM, $\overrightarrow{p_j^w}$ the j-th position word granularity feature obtained by the forward LSTM, and $\overleftarrow{p_j^w}$ the j-th position word granularity feature obtained by the backward LSTM.
Similarly, the text Q is processed in the same way as the text P to obtain the text Q character and word granularity features, denoted as $Q_c$ and $Q_w$.
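As an illustrative sketch, continuing the embedding sketch of step S303, the BiLSTM encoding of this layer can be written in Keras as follows; the hidden size is an assumption, and whether the character-level and word-level encoders share weights is left open here:

from keras.layers import Bidirectional, LSTM

HIDDEN_DIM = 150                                  # assumed hidden size of each LSTM direction
bilstm = Bidirectional(LSTM(HIDDEN_DIM, return_sequences=True))
P_c = bilstm(txtP_char_embedded)                  # text P character granularity features, one vector per position
P_w = bilstm(txtP_word_embedded)                  # text P word granularity features, one vector per position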
S305, constructing a multi-level fine-grained feature extraction layer:
the multilevel fine-grained feature extraction layer takes the granularity features of the text characters and words output by the semantic coding layer as input; performing encoding operation between the same text and the same text to obtain fine-grained semantic features of the same text and semantic interaction features between the same text; the method comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text according to different granularities of the same text, as shown in FIG. 7; the second sub-module is responsible for extracting semantic interactive features between texts, and mainly obtains the semantic interactive features between the texts by using a plurality of layers of coding structures between the texts, as shown in fig. 8.
S30501, extracting fine-grained semantic features of the same text of a first sub-module:
first, for convenience of subsequent description, in the first section, taking the text P as an example, the following attention module is defined:
Define the soft alignment attention module, denoted SOA. Given the text P character granularity feature $P_c$ and word granularity feature $P_w$, formula (3) computes the soft alignment attention weight $e_{ij}$ between the i-th position character granularity feature $p_i^c$ (see formula (1)) and the j-th position word granularity feature $p_j^w$ (see formula (2)), maps the weights to values between 0 and 1 with a softmax operation, and re-expresses the i-th position character granularity feature of the text P as the weighted sum of all word granularity features of the text P and the j-th position word granularity feature of the text P as the weighted sum of all character granularity features of the text P:

$e_{ij}=p_i^c\cdot p_j^w,\quad \tilde{p}_i^{\,c}=\sum_{j=1}^{N}\frac{\exp(e_{ij})}{\sum_{k}\exp(e_{ik})}\,p_j^w,\quad \tilde{p}_j^{\,w}=\sum_{i=1}^{N}\frac{\exp(e_{ij})}{\sum_{k}\exp(e_{kj})}\,p_i^c$  (3)

Define the multiplication alignment attention module, denoted MUA, by formula (4). Here TimeDistributed(Dense()) indicates that the same Dense() layer operation is applied to the tensor of each time step, ⊙ indicates the aligned (element-wise) multiplication operation, and tanh is the activation function; $P_c$ denotes the text P character granularity feature, $\bar{P}_c$ denotes the text P character granularity feature after the Dense() layer, and $P_w$ denotes the text P word granularity feature. Formula (4) computes the multiplication alignment attention weights, maps them to values between 0 and 1 with a softmax operation, and uses them to re-express the i-th position character granularity feature of the text P as the weighted sum of all word granularity features of the text P and the j-th position word granularity feature of the text P as the weighted sum of all character granularity features of the text P.

Define the subtraction alignment attention module, denoted SUA, by formula (5). Here TimeDistributed(Dense()) again applies the same Dense() layer operation to the tensor of each time step, − indicates the bit-wise subtraction operation, and tanh is the activation function; $P_c$, $\bar{P}_c$ and $P_w$ have the same meaning as above. Formula (5) computes the subtraction alignment attention weights, maps them to values between 0 and 1 with a softmax operation, and uses them to re-express the i-th position character granularity feature of the text P as the weighted sum of all word granularity features of the text P and the j-th position word granularity feature of the text P as the weighted sum of all character granularity features of the text P.

Define the self-alignment attention module, denoted SEA. Formula (6) computes the self-alignment attention weight between the i-th position and the j-th position character granularity features of the text P, maps it to a value between 0 and 1 with a softmax operation, and re-expresses the i-th position character granularity feature of the text P as the weighted sum of all character granularity features of the text P:

$e_{ij}=p_i^c\cdot p_j^c,\quad \hat{p}_i^{\,c}=\sum_{j=1}^{N}\frac{\exp(e_{ij})}{\sum_{k}\exp(e_{ik})}\,p_j^c$  (6)
in the following description, the SOA symbol is used to represent the operation of formula (3), the MUA symbol is used to represent the operation of formula (4), the SUA symbol is used to represent the operation of formula (5), and the SEA symbol is used to represent the operation of formula (6);
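As an illustrative sketch, one possible numpy realization of the soft alignment attention operation SOA is given below; the dot-product weights and row/column softmax are an interpretation of formula (3), not necessarily the exact implementation:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soa(P_c, P_w):
    # P_c: (N_c, d) character granularity features, P_w: (N_w, d) word granularity features
    e = P_c @ P_w.T                           # alignment weights e_ij between i-th character and j-th word feature
    P_c_aligned = softmax(e, axis=1) @ P_w    # each character feature re-expressed by weighted word features
    P_w_aligned = softmax(e.T, axis=1) @ P_c  # each word feature re-expressed by weighted character features
    return P_c_aligned, P_w_aligned

# toy usage with random features
P_c, P_w = np.random.rand(12, 256), np.random.rand(7, 256)
P_c_soa, P_w_soa = soa(P_c, P_w)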
s3050101, extracting fine-grained initial granularity semantic features of the same text by using a plurality of attention modules in a first layer of coding structure:
First, soft alignment attention is used: the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are fed into the soft alignment attention module to obtain the text P soft alignment feature at the character granularity level $P_c^{so}$ and the text P soft alignment feature at the word granularity level $P_w^{so}$, as shown in formula (7):

$P_c^{so},\,P_w^{so}=\mathrm{SOA}(P_c,P_w)$  (7)

Second, multiplication alignment attention is used: the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are fed into the multiplication alignment attention module to obtain the text P multiplication alignment feature at the character granularity level $P_c^{mu}$ and the text P multiplication alignment feature at the word granularity level $P_w^{mu}$, as shown in formula (8):

$P_c^{mu},\,P_w^{mu}=\mathrm{MUA}(P_c,P_w)$  (8)

Then, subtraction alignment attention is used: the text P character granularity feature $P_c$ and the text P word granularity feature $P_w$ are fed into the subtraction alignment attention module to obtain the text P subtraction alignment feature at the character granularity level $P_c^{su}$ and the text P subtraction alignment feature at the word granularity level $P_w^{su}$, as shown in formula (9):

$P_c^{su},\,P_w^{su}=\mathrm{SUA}(P_c,P_w)$  (9)

Similarly, the text Q is processed in the same way as the text P, obtaining the text Q soft alignment features at the character and word granularity levels $Q_c^{so}$ and $Q_w^{so}$, the text Q multiplication alignment features at the character and word granularity levels $Q_c^{mu}$ and $Q_w^{mu}$, and the text Q subtraction alignment features at the character and word granularity levels $Q_c^{su}$ and $Q_w^{su}$; this completes the extraction of the fine-grained initial semantic features of the same text;
s3050102, the second-layer coding structure enhances the fine-grained initial semantic features of the same text to complete extraction of the fine-grained semantic features of the same text:
First, the text P soft alignment feature at the character granularity level $P_c^{so}$ of formula (7) and the text P character granularity feature $P_c$ of formula (1) are added to obtain the text P deep soft alignment feature at the character granularity level $\bar{P}_c^{so}$, as shown in formula (10):

$\bar{P}_c^{so}=P_c^{so}+P_c$  (10)

Then, the text P multiplication alignment feature at the character granularity level $P_c^{mu}$ of formula (8) and the text P character granularity feature $P_c$ of formula (1) are added to obtain the text P deep multiplication alignment feature at the character granularity level $\bar{P}_c^{mu}$, as shown in formula (11):

$\bar{P}_c^{mu}=P_c^{mu}+P_c$  (11)

Then, the text P subtraction alignment feature at the character granularity level $P_c^{su}$ of formula (9) and the text P character granularity feature $P_c$ of formula (1) are added to obtain the text P deep subtraction alignment feature at the character granularity level $\bar{P}_c^{su}$, as shown in formula (12):

$\bar{P}_c^{su}=P_c^{su}+P_c$  (12)

Then, the text P deep soft alignment feature at the character granularity level $\bar{P}_c^{so}$ of formula (10), the text P deep multiplication alignment feature at the character granularity level $\bar{P}_c^{mu}$ of formula (11) and the text P deep subtraction alignment feature at the character granularity level $\bar{P}_c^{su}$ of formula (12) are connected to obtain the text P high-level feature at the character granularity level $P'_c$, as shown in formula (13):

$P'_c=[\bar{P}_c^{so};\bar{P}_c^{mu};\bar{P}_c^{su}]$  (13)
The word granularity is processed similarly to the character granularity. First, the text P soft alignment feature at the word granularity level $P_w^{so}$ of formula (7) and the text P word granularity feature $P_w$ of formula (2) are added to obtain the text P deep soft alignment feature at the word granularity level $\bar{P}_w^{so}$, as shown in formula (14):

$\bar{P}_w^{so}=P_w^{so}+P_w$  (14)

Then, the text P multiplication alignment feature at the word granularity level $P_w^{mu}$ of formula (8) and the text P word granularity feature $P_w$ of formula (2) are added to obtain the text P deep multiplication alignment feature at the word granularity level $\bar{P}_w^{mu}$, as shown in formula (15):

$\bar{P}_w^{mu}=P_w^{mu}+P_w$  (15)

Then, the text P subtraction alignment feature at the word granularity level $P_w^{su}$ of formula (9) and the text P word granularity feature $P_w$ of formula (2) are added to obtain the text P deep subtraction alignment feature at the word granularity level $\bar{P}_w^{su}$, as shown in formula (16):

$\bar{P}_w^{su}=P_w^{su}+P_w$  (16)

Next, the text P deep soft alignment feature at the word granularity level $\bar{P}_w^{so}$ of formula (14), the text P deep multiplication alignment feature at the word granularity level $\bar{P}_w^{mu}$ of formula (15) and the text P deep subtraction alignment feature at the word granularity level $\bar{P}_w^{su}$ of formula (16) are connected to obtain the text P high-level feature at the word granularity level $P'_w$, as shown in formula (17):

$P'_w=[\bar{P}_w^{so};\bar{P}_w^{mu};\bar{P}_w^{su}]$  (17)

Finally, the text P deep soft alignment feature at the character granularity level $\bar{P}_c^{so}$ of formula (10) and the text P deep soft alignment feature at the word granularity level $\bar{P}_w^{so}$ of formula (14) are connected to obtain the text P deep semantic feature $P'_{deep}$, as shown in formula (18):

$P'_{deep}=[\bar{P}_c^{so};\bar{P}_w^{so}]$  (18)
Similarly, the text Q is processed in the same way as the text P, obtaining the text Q deep soft alignment feature $\bar{Q}_c^{so}$, deep multiplication alignment feature $\bar{Q}_c^{mu}$ and deep subtraction alignment feature $\bar{Q}_c^{su}$ at the character granularity level, the text Q high-level feature at the character granularity level $Q'_c$, the text Q deep soft alignment feature $\bar{Q}_w^{so}$, deep multiplication alignment feature $\bar{Q}_w^{mu}$ and deep subtraction alignment feature $\bar{Q}_w^{su}$ at the word granularity level, the text Q high-level feature at the word granularity level $Q'_w$, and the text Q deep semantic feature $Q'_{deep}$; this finishes the extraction of the fine-grained semantic features of the same text;
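As an illustrative sketch, the residual-style addition and concatenation of formulas (10)-(13) can be expressed with Keras layers as follows; the feature dimension D and the use of Input placeholders for the intermediate features are assumptions for illustration:

from keras.layers import Input, Add, Concatenate

D = 300                                            # assumed feature dimension (2 * LSTM hidden size)
P_c     = Input(shape=(None, D))                   # text P character granularity feature from the BiLSTM
P_soa_c = Input(shape=(None, D))                   # soft alignment feature, formula (7)
P_mua_c = Input(shape=(None, D))                   # multiplication alignment feature, formula (8)
P_sua_c = Input(shape=(None, D))                   # subtraction alignment feature, formula (9)

P_deep_soa_c = Add()([P_soa_c, P_c])               # formula (10): deep soft alignment feature
P_deep_mua_c = Add()([P_mua_c, P_c])               # formula (11): deep multiplication alignment feature
P_deep_sua_c = Add()([P_sua_c, P_c])               # formula (12): deep subtraction alignment feature
P_high_c = Concatenate()([P_deep_soa_c, P_deep_mua_c, P_deep_sua_c])   # formula (13): high-level feature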
s30502, semantic interaction feature extraction between texts:
s3050201, extracting initial semantic interaction characteristics between texts by using a plurality of layers of coding structures simultaneously through the first layer of coding structure:
At the character granularity, first, the text P character granularity feature $P_c$ of formula (1) and the text Q character granularity feature $Q_c$ are fed into the soft alignment attention module to obtain the text P soft alignment interactive feature at the character granularity level $\hat{P}_c^{so}$ and the text Q soft alignment interactive feature at the character granularity level $\hat{Q}_c^{so}$, as shown in formula (19):

$\hat{P}_c^{so},\,\hat{Q}_c^{so}=\mathrm{SOA}(P_c,Q_c)$  (19)

Second, the text P character granularity feature $P_c$ of formula (1) and the text Q character granularity feature $Q_c$ are fed into the subtraction alignment attention module to obtain the text P subtraction alignment interactive feature at the character granularity level $\hat{P}_c^{su}$ and the text Q subtraction alignment interactive feature at the character granularity level $\hat{Q}_c^{su}$, as shown in formula (20):

$\hat{P}_c^{su},\,\hat{Q}_c^{su}=\mathrm{SUA}(P_c,Q_c)$  (20)

The word granularity is processed similarly. First, the text P word granularity feature $P_w$ of formula (2) and the text Q word granularity feature $Q_w$ are fed into the soft alignment attention module to obtain the text P soft alignment interactive feature at the word granularity level $\hat{P}_w^{so}$ and the text Q soft alignment interactive feature at the word granularity level $\hat{Q}_w^{so}$, as shown in formula (21):

$\hat{P}_w^{so},\,\hat{Q}_w^{so}=\mathrm{SOA}(P_w,Q_w)$  (21)

Then, the text P word granularity feature $P_w$ of formula (2) and the text Q word granularity feature $Q_w$ are fed into the subtraction alignment attention module to obtain the text P subtraction alignment interactive feature at the word granularity level $\hat{P}_w^{su}$ and the text Q subtraction alignment interactive feature at the word granularity level $\hat{Q}_w^{su}$, as shown in formula (22):

$\hat{P}_w^{su},\,\hat{Q}_w^{su}=\mathrm{SUA}(P_w,Q_w)$  (22)
s3050202, enhancing initial semantic interaction features between texts by using a second-layer coding structure, and finishing extraction of semantic interaction features between texts:
At the character granularity, first, the text P soft alignment interactive feature at the character granularity level $\hat{P}_c^{so}$ of formula (19) and the text P character granularity feature $P_c$ of formula (1) are added to obtain the text P deep soft alignment interactive feature at the character granularity level $\tilde{P}_c^{so}$, as shown in formula (23):

$\tilde{P}_c^{so}=\hat{P}_c^{so}+P_c$  (23)

Then, the text P subtraction alignment interactive feature at the character granularity level $\hat{P}_c^{su}$ of formula (20) and the text P character granularity feature $P_c$ of formula (1) are added to obtain the text P deep subtraction alignment interactive feature at the character granularity level $\tilde{P}_c^{su}$, as shown in formula (24):

$\tilde{P}_c^{su}=\hat{P}_c^{su}+P_c$  (24)

Finally, the text P deep soft alignment interactive feature at the character granularity level $\tilde{P}_c^{so}$ of formula (23) and the text P deep subtraction alignment interactive feature at the character granularity level $\tilde{P}_c^{su}$ of formula (24) are connected to obtain the text P high-level interactive feature at the character granularity level $P''_c$, as shown in formula (25):

$P''_c=[\tilde{P}_c^{so};\tilde{P}_c^{su}]$  (25)
At the word granularity, first, the text P soft alignment interactive feature at the word granularity level $\hat{P}_w^{so}$ of formula (21) and the text P word granularity feature $P_w$ of formula (2) are added to obtain the text P deep soft alignment interactive feature at the word granularity level $\tilde{P}_w^{so}$, as shown in formula (26):

$\tilde{P}_w^{so}=\hat{P}_w^{so}+P_w$  (26)

Then, the text P subtraction alignment interactive feature at the word granularity level $\hat{P}_w^{su}$ of formula (22) and the text P word granularity feature $P_w$ of formula (2) are added to obtain the text P deep subtraction alignment interactive feature at the word granularity level $\tilde{P}_w^{su}$, as shown in formula (27):

$\tilde{P}_w^{su}=\hat{P}_w^{su}+P_w$  (27)

Finally, the text P deep soft alignment interactive feature at the word granularity level $\tilde{P}_w^{so}$ of formula (26) and the text P deep subtraction alignment interactive feature at the word granularity level $\tilde{P}_w^{su}$ of formula (27) are connected to obtain the text P high-level interactive feature at the word granularity level $P''_w$, as shown in formula (28):

$P''_w=[\tilde{P}_w^{so};\tilde{P}_w^{su}]$  (28)

In addition, the text P deep subtraction alignment interactive feature at the character granularity level $\tilde{P}_c^{su}$ of formula (24) and the text P deep subtraction alignment interactive feature at the word granularity level $\tilde{P}_w^{su}$ of formula (27) are connected to obtain the text P deep semantic interactive feature $P''_{deep}$, as shown in formula (29):

$P''_{deep}=[\tilde{P}_c^{su};\tilde{P}_w^{su}]$  (29)
Similarly, the text Q is processed in the same way as the text P, obtaining the text Q deep soft alignment interactive feature $\tilde{Q}_c^{so}$ and deep subtraction alignment interactive feature $\tilde{Q}_c^{su}$ at the character granularity level, the text Q high-level interactive feature at the character granularity level $Q''_c$, the text Q deep soft alignment interactive feature $\tilde{Q}_w^{so}$ and deep subtraction alignment interactive feature $\tilde{Q}_w^{su}$ at the word granularity level, the text Q high-level interactive feature at the word granularity level $Q''_w$, and the text Q deep semantic interactive feature $Q''_{deep}$; this completes the extraction of the inter-text semantic interaction features.
S306, constructing a feature fusion layer:
firstly, for convenience of subsequent description, the following operations are defined:
Define the bit-wise absolute-value-of-subtraction operation on vectors, denoted AB, as shown in formula (30):
AB(P,Q)=|P-Q| (30)
where P and Q are two different vectors; the two vectors are subtracted and the absolute value is taken bit by bit;
Define the bit-wise multiplication operation on vectors, denoted MU, as shown in formula (31):
MU(P,Q)=P⊙Q (31)
where P and Q are two different vectors, and ⊙ denotes the bit-wise multiplication of the vectors P and Q;
in the following description, the AB symbol represents the operation of formula (30), the MU symbol represents the operation of formula (31), the SOA symbol represents the operation of formula (3), the MUA symbol represents the operation of formula (4), the SUA symbol represents the operation of formula (5), and the SEA symbol represents the operation of formula (6);
the feature fusion layer is divided into two sub-modules, the first sub-module combines multiple related features, and the second sub-module performs multiple matching operations to obtain a final matching feature vector, as shown in fig. 9.
S30601, combining a plurality of related features by a first submodule:
The text P high-level feature at the character granularity level $P'_c$ of formula (13) and the text P high-level interactive feature at the character granularity level $P''_c$ of formula (25) are connected to obtain the text P aggregation feature at the character granularity level $P_c^{agg}$, and self-alignment attention is applied to it to obtain the text P deep aggregation feature at the character granularity level $\bar{P}_c^{agg}$, as shown in formula (32):

$P_c^{agg}=[P'_c;P''_c],\quad \bar{P}_c^{agg}=\mathrm{SEA}(P_c^{agg})$  (32)

At the word granularity, similarly, the text P high-level feature at the word granularity level $P'_w$ of formula (17) and the text P high-level interactive feature at the word granularity level $P''_w$ of formula (28) are connected to obtain the text P aggregation feature at the word granularity level $P_w^{agg}$, and self-alignment attention is applied to it to obtain the text P deep aggregation feature at the word granularity level $\bar{P}_w^{agg}$, as shown in formula (33):

$P_w^{agg}=[P'_w;P''_w],\quad \bar{P}_w^{agg}=\mathrm{SEA}(P_w^{agg})$  (33)

Then, the text P deep aggregation feature at the character granularity level $\bar{P}_c^{agg}$ of formula (32) and the text P deep aggregation feature at the word granularity level $\bar{P}_w^{agg}$ of formula (33) are connected and a maximum pooling operation is performed to obtain the pooled text P semantic feature $P'$, as shown in formula (34):

$P'=\mathrm{MaxPooling}([\bar{P}_c^{agg};\bar{P}_w^{agg}])$  (34)

Next, the text P deep semantic feature $P'_{deep}$ of formula (18) and the text P deep semantic interactive feature $P''_{deep}$ of formula (29) are connected to obtain the text P deep polymerization feature $P^{dp}$, as shown in formula (35):

$P^{dp}=[P'_{deep};P''_{deep}]$  (35)
Similarly, the same operations as for the text P are performed on the text Q to obtain the text Q aggregation feature at the character granularity level $Q_c^{agg}$, the text Q deep aggregation feature at the character granularity level $\bar{Q}_c^{agg}$, the text Q aggregation feature at the word granularity level $Q_w^{agg}$, the text Q deep aggregation feature at the word granularity level $\bar{Q}_w^{agg}$, the pooled text Q semantic feature $Q'$ and the text Q deep polymerization feature $Q^{dp}$.

Then, the text P deep polymerization feature $P^{dp}$ of formula (35) and the text Q deep polymerization feature $Q^{dp}$ are fed into the soft alignment attention module to obtain the soft-aligned text P deep polymerization feature $\hat{P}^{dp}$ and the soft-aligned text Q deep polymerization feature $\hat{Q}^{dp}$, as shown in formula (36):

$\hat{P}^{dp},\,\hat{Q}^{dp}=\mathrm{SOA}(P^{dp},Q^{dp})$  (36)

Then, a maximum pooling operation is performed on the soft-aligned text P deep polymerization feature $\hat{P}^{dp}$ to obtain the pooled text P deep polymerization feature $P''$, and a maximum pooling operation is performed on the soft-aligned text Q deep polymerization feature $\hat{Q}^{dp}$ to obtain the pooled text Q deep polymerization feature $Q''$, as shown in formula (37):

$P''=\mathrm{MaxPooling}(\hat{P}^{dp}),\quad Q''=\mathrm{MaxPooling}(\hat{Q}^{dp})$  (37)
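As an illustrative sketch, the connection and maximum pooling of formula (34) might be expressed in Keras as follows; the feature dimension is an assumption, and concatenating the character-level and word-level sequences along the time axis is only one possible reading of the connection operation:

from keras.layers import Input, Concatenate, GlobalMaxPooling1D

D = 900                                           # assumed dimension of the deep aggregation features
P_agg_c = Input(shape=(None, D))                  # text P deep aggregation feature at the character granularity level
P_agg_w = Input(shape=(None, D))                  # text P deep aggregation feature at the word granularity level
P_pooled = GlobalMaxPooling1D()(Concatenate(axis=1)([P_agg_c, P_agg_w]))  # pooled text P semantic feature P'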
s30602, performing multiple matching operations to obtain a final matching feature vector:
First, the absolute value of the bit-wise subtraction of the pooled text P semantic feature $P'$ and the pooled text Q semantic feature $Q'$ of formula (34) is taken to obtain the subtraction matching feature $PQ_{ab}$, as shown in formula (38):

$PQ_{ab}=\mathrm{AB}(P',Q')$  (38)

Second, the pooled text P semantic feature $P'$ and the pooled text Q semantic feature $Q'$ of formula (34) are point-multiplied to obtain the point multiplication matching feature $PQ_{mu}$, as shown in formula (39):

$PQ_{mu}=\mathrm{MU}(P',Q')$  (39)

Third, the absolute value of the bit-wise subtraction of the pooled text P deep polymerization feature $P''$ and the pooled text Q deep polymerization feature $Q''$ of formula (37) is taken to obtain the deep subtraction matching feature $PQ'_{ab}$, as shown in formula (40):

$PQ'_{ab}=\mathrm{AB}(P'',Q'')$  (40)

Then, the pooled text P deep polymerization feature $P''$ and the pooled text Q deep polymerization feature $Q''$ of formula (37) are point-multiplied to obtain the deep point multiplication matching feature $PQ'_{mu}$, as shown in formula (41):

$PQ'_{mu}=\mathrm{MU}(P'',Q'')$  (41)

Finally, the pooled text P semantic feature $P'$, the pooled text Q semantic feature $Q'$, the subtraction matching feature $PQ_{ab}$ of formula (38), the point multiplication matching feature $PQ_{mu}$ of formula (39), the deep subtraction matching feature $PQ'_{ab}$ of formula (40) and the deep point multiplication matching feature $PQ'_{mu}$ of formula (41) are connected to obtain the final matching feature vector F, as shown in formula (42):

$F=[P';Q';PQ_{ab};PQ_{mu};PQ'_{ab};PQ'_{mu}]$  (42)
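As an illustrative sketch, the matching operations of formulas (38)-(42) can be expressed with Keras layers roughly as follows; the dimension of the pooled vectors is an assumption:

from keras.layers import Input, Subtract, Multiply, Concatenate, Lambda
import keras.backend as K

D = 600                                                    # assumed dimension of the pooled feature vectors
P1, Q1 = Input(shape=(D,)), Input(shape=(D,))              # pooled semantic features P', Q'
P2, Q2 = Input(shape=(D,)), Input(shape=(D,))              # pooled deep polymerization features P'', Q''

PQ_ab  = Lambda(lambda x: K.abs(x))(Subtract()([P1, Q1]))  # formula (38): subtraction matching feature
PQ_mu  = Multiply()([P1, Q1])                              # formula (39): point multiplication matching feature
PQ2_ab = Lambda(lambda x: K.abs(x))(Subtract()([P2, Q2]))  # formula (40): deep subtraction matching feature
PQ2_mu = Multiply()([P2, Q2])                              # formula (41): deep point multiplication matching feature
F = Concatenate()([P1, Q1, PQ_ab, PQ_mu, PQ2_ab, PQ2_mu])  # formula (42): final matching feature vector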
s307, constructing a prediction layer:
The final matching feature vector is taken as input and passed through three fully-connected layers, with the ReLU activation function applied after the first and second fully-connected layers and the sigmoid function applied after the third fully-connected layer, resulting in a matching degree value in [0,1], denoted as $y_{pred}$; finally, whether the text semantics match is judged by comparing it with the set threshold value of 0.5, i.e. when $y_{pred}\geq 0.5$ the text semantics are predicted to match, otherwise they do not match;
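As an illustrative sketch, the three fully-connected layers of the prediction layer might be written in Keras as follows; the hidden sizes and the length of the final matching feature vector are assumptions:

from keras.layers import Input, Dense

F_input = Input(shape=(3600,))                    # assumed length of the final matching feature vector F
h = Dense(600, activation="relu")(F_input)        # first fully-connected layer with ReLU activation
h = Dense(300, activation="relu")(h)              # second fully-connected layer with ReLU activation
y_pred = Dense(1, activation="sigmoid")(h)        # third fully-connected layer with sigmoid: matching degree in [0,1]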
when the text semantic matching model is not trained, training is required to be carried out on a training data set constructed according to a semantic matching knowledge base so as to optimize model parameters; when the model is trained, the prediction layer can predict whether the semantics of the target text are matched.
S4, training a text semantic matching model: training the text semantic matching model constructed in the step S3 on the training data set obtained in the step S2, as shown in fig. 5, specifically as follows:
S401, constructing a loss function: from step S307, $y_{pred}$ is the matching degree value obtained after processing by the text semantic matching model, and $y_{true}$ is the real label indicating whether the two text semantics match, whose value is limited to 0 or 1; cross entropy is used as the loss function, with the formula:

$L=-\big(y_{true}\log y_{pred}+(1-y_{true})\log(1-y_{pred})\big)$
s402, constructing an optimization function:
testing various optimization functions of the model, and finally selecting Adam optimization functions as the optimization functions of the model, wherein hyper-parameters of the Adam optimization functions are set by default values in Keras;
For example, in Keras, the above optimization function and its settings are expressed in code as:
optim=keras.optimizers.Adam()
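As an illustrative sketch, combining the loss function of step S401 with this optimization function, the model could be compiled and trained roughly as follows; the variable model and the four input arrays are assumed to have been built in the preceding steps:

import keras

model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(), metrics=["accuracy"])
model.fit([p_char_ids, q_char_ids, p_word_ids, q_word_ids], labels, batch_size=64, epochs=20)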
the proposed model can achieve excellent effects on medical intelligent question and answer data sets.
Example 3:
as shown in fig. 10, the text semantic matching device for medical intelligent question answering according to embodiment 2 includes,
the text semantic matching knowledge base construction unit is used for acquiring a large amount of text data and then carrying out preprocessing operation on the text data so as to obtain a text semantic matching knowledge base meeting the training requirement;
the training data set generating unit is used for matching data in the knowledge base according to the semantics of the text, if the semantics of the data are consistent, the text is used for constructing a training positive example, otherwise, the text is used for constructing a training negative example, and all positive example data and all negative example data are mixed to obtain a training data set;
a text semantic matching model construction unit: the system is used for constructing a word mapping conversion table, an input layer, a word vector mapping layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer;
a text semantic matching model training unit: and the method is used for constructing a training loss function and an optimization function and finishing the training of the model.
Example 4:
Based on embodiment 2, a storage medium stores a plurality of instructions; the instructions are loaded by a processor to execute the steps of the text semantic matching method for medical intelligent question answering of embodiment 2.
Example 5:
Electronic equipment based on embodiment 4, the electronic equipment comprising: the storage medium of embodiment 4; and
a processor for executing the instructions in the storage medium of embodiment 4.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A text semantic matching method for medical intelligent question answering is characterized in that a text semantic matching model is formed by constructing and training an embedding layer, a semantic coding layer, a multi-level fine-grained feature extraction layer, a feature fusion layer and a prediction layer, text characters and word granularity features are extracted, fine-grained semantic features and text semantic interaction features of the same text are captured, multiple relevant features are combined finally, then multiple matching operations are carried out, a final matching feature vector is generated, and the similarity of the text is judged; the method comprises the following specific steps:
the embedding layer carries out embedding operation on the input text according to the word granularity and the word granularity respectively, and outputs text word embedding representation and word embedding representation;
the semantic coding layer receives text character embedded representation and word embedded representation, codes the text character embedded representation and the word embedded representation by using a bidirectional long-short term memory network BilSTM, and outputs text character and word granularity characteristics;
the multilevel fine-grained feature extraction layer performs the same text and text inter-coding operation on the text character and word granularity features output by the semantic coding layer to obtain the same text fine-grained semantic features and the text inter-semantic interactive features;
the feature fusion layer combines various related features, and then performs various matching operations to generate a final matching feature vector;
and the prediction layer inputs the final matching feature vector into the multilayer perceptron to obtain a floating-point numerical value, compares the floating-point numerical value serving as the matching degree with a preset threshold value, and judges whether the semantics of the text are matched or not according to the comparison result.
2. The medical intelligent question-answering oriented text semantic matching method according to claim 1, wherein the embedding layer comprises a word mapping conversion table, an input layer, a word vector mapping layer, an output text word embedding representation and a word embedding representation;
wherein, the word mapping conversion table: the mapping rule is that the number 1 is used as the starting point, and then the characters or the words are sequentially and progressively ordered according to the sequence of the character word list recorded into each character or word, so that a character word mapping conversion table is formed; then, using Word2Vec to train the Word vector model to obtain a Word vector matrix of each Word;
an input layer: the input layer comprises four inputs, word breaking and word segmentation preprocessing are carried out on each text or text to be predicted in the training data set, txt P _ char, txt Q _ char, txt P _ word and txt Q _ word are respectively obtained, wherein suffixes char and word respectively represent that the corresponding text is subjected to word breaking or word segmentation processing, and the suffixes char and word are formed as follows: (txt P _ char, txt Q _ char, txt P _ word, txt Q _ word); converting each character and word in the input text into corresponding numerical identification according to a character and word mapping conversion table;
word vector mapping layer: loading the word vector matrix obtained by training in the step of constructing the word mapping conversion table to initialize the weight parameters of the current layer; and obtaining corresponding text word embedding representation and word embedding representation txt P _ char _ embedded, txt Q _ char _ embedded, txt P _ word _ embedded and txt Q _ word _ embedded for the input text txt P _ char, txt Q _ char, txt P _ word _ embedded and txt Q _ word _ embedded.
3. The text semantic matching method for medical intelligent question answering according to claim 2, wherein implementation details of the semantic coding layer are as follows:
taking the text P as an example, this module receives the text P character embedding representation and word embedding representation and encodes them with the bidirectional long-short term memory network BiLSTM to obtain the text P character granularity feature and word granularity feature, denoted as $P_c=\{p_i^c\}_{i=1}^{N}$ and $P_w=\{p_j^w\}_{j=1}^{N}$; the concrete formulas are as follows:

$p_i^c=\mathrm{BiLSTM}(txtP\_char\_embedded)_i=[\overrightarrow{p_i^c};\overleftarrow{p_i^c}]$  (1)

$p_j^w=\mathrm{BiLSTM}(txtP\_word\_embedded)_j=[\overrightarrow{p_j^w};\overleftarrow{p_j^w}]$  (2)

wherein N denotes the length of the character granularity feature and of the word granularity feature; formula (1) encodes the text P character embedding representation with the BiLSTM, where $p_i^c$ denotes the i-th position character granularity feature of the text P obtained by the BiLSTM, $\overrightarrow{p_i^c}$ denotes the i-th position character granularity feature of the text P obtained by the forward LSTM, and $\overleftarrow{p_i^c}$ denotes the i-th position character granularity feature of the text P obtained by the backward LSTM; the symbols in formula (2) have essentially the same meaning, with $p_j^w$, $\overrightarrow{p_j^w}$ and $\overleftarrow{p_j^w}$ denoting the j-th position word granularity feature of the text P obtained by the BiLSTM, the forward LSTM and the backward LSTM, respectively;

similarly, the text Q is processed in the same way as the text P to obtain the text Q character and word granularity features, denoted as $Q_c$ and $Q_w$.
4. The medical intelligent question-answering oriented text semantic matching method according to claim 3, wherein the implementation details of the multi-level fine-grained feature extraction layer are as follows:
carrying out encoding operation between the same text and the same text on the granularity characteristics of the text characters and the words output by the semantic encoding layer to obtain the fine granularity semantic characteristics of the same text and the semantic interaction characteristics between the texts; the method comprises two sub-modules, wherein the first sub-module is responsible for extracting fine-grained semantic features of the same text, and mainly uses a plurality of attention module codes to obtain the fine-grained semantic features of the same text according to different granularities of the same text; the second sub-module is responsible for extracting semantic interaction features among texts, and mainly obtains the semantic interaction features among the texts by using a plurality of layers of coding structures among the texts;
extracting fine-grained semantic features of the same text in a first sub-module:
first, for convenience of subsequent description, in the first section, taking the text P as an example, the following attention module is defined:
defining a soft alignment attention module, denoted as SOA, and the formula is as follows:
Figure FDA0003805765270000027
wherein
Figure FDA0003805765270000028
The i-th position word granularity characteristic of the text P is represented by the formula (1),
Figure FDA0003805765270000029
the granularity characteristic of the word at the jth position of the text P is shown in a formula (2),
Figure FDA00038057652700000210
representing the soft alignment attention weight between the ith position word granularity characteristic and the jth position word granularity characteristic of the text P,
Figure FDA00038057652700000211
indicating that the softmax operation on the soft-alignment attention weight maps to a value of 0-1,
Figure FDA00038057652700000212
indicating that the ith position word granularity characteristic of the text P can be re-expressed by weighted summation of all word granularity characteristics of the text P by using soft alignment attention,
Figure FDA00038057652700000213
the representation uses soft alignment attention to enable the granularity characteristic of the jth position word of the text P to be represented again by the weighted summation of all the granularity characteristics of the words of the text P;
a multiplicative alignment attention module, denoted MUA, is defined as shown in equation (4):
in equation (4), TimeDistributed(Dense()) indicates that the same Dense() operation is applied to the tensor at each time step, ⊙ denotes the element-wise multiplication operation, and tanh denotes the activation function; P_c denotes the character-granularity features of text P, which are passed through the Dense() layer, and P_w denotes the word-granularity features of text P; a multiplicative alignment attention weight is computed between them, a softmax operation maps the weights to values between 0 and 1, and with these weights each character-granularity feature of text P is re-expressed as a weighted sum of all word-granularity features of text P, and each word-granularity feature of text P is re-expressed as a weighted sum of all character-granularity features of text P;
a subtractive alignment attention module, denoted SUA, is defined as shown in equation (5):
in equation (5), TimeDistributed(Dense()) indicates that the same Dense() operation is applied to the tensor at each time step, "-" denotes the element-wise subtraction operation, and tanh denotes the activation function; P_c denotes the character-granularity features of text P, which are passed through the Dense() layer, and P_w denotes the word-granularity features of text P; a subtractive alignment attention weight is computed between them, a softmax operation maps the weights to values between 0 and 1, and with these weights each character-granularity feature of text P is re-expressed as a weighted sum of all word-granularity features of text P, and each word-granularity feature of text P is re-expressed as a weighted sum of all character-granularity features of text P;
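For illustration only, the multiplicative and subtractive variants can be sketched by replacing the dot-product score above with a score derived from an element-wise product or difference of features projected by a position-wise Dense layer with tanh. This is a loose reading of equations (4) and (5); the exact score form, dimensions, and weights below are assumptions:

```python
# Loose sketch of multiplicative (MUA) and subtractive (SUA) alignment scores.
import numpy as np

rng = np.random.default_rng(0)
d = 300
W, b = rng.normal(size=(d, d)) * 0.01, np.zeros(d)  # shared Dense() weights (assumed)

def dense_tanh(X):
    """TimeDistributed(Dense()) with tanh: the same projection at every time step."""
    return np.tanh(X @ W + b)

def mua_scores(Pc, Pw):
    Pc_proj = dense_tanh(Pc)                     # (m, d)
    return np.einsum("id,jd->ij", Pc_proj, Pw)   # multiplicative interaction reduced to (m, n)

def sua_scores(Pc, Pw):
    Pc_proj = dense_tanh(Pc)                     # (m, d)
    diff = Pc_proj[:, None, :] - Pw[None, :, :]  # (m, n, d) element-wise differences
    return -np.abs(diff).sum(axis=-1)            # closer pairs receive larger scores

# The scores are then normalized with softmax and used for weighted sums,
# exactly as in the soft-alignment sketch above.
```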
a self-alignment attention module, denoted SEA, is defined as shown in equation (6):
in equation (6), a self-alignment attention weight is computed between the i-th position feature and the j-th position feature of the same feature sequence of text P, a softmax operation maps the weights to values between 0 and 1, and with these weights each position feature of text P is re-expressed as a weighted sum of all features of the same sequence;
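Self-alignment is the same soft-attention pattern applied within a single sequence; a minimal sketch under the same assumptions:

```python
# Sketch of self-alignment attention (SEA): every position of a sequence X (m x d)
# is re-expressed as an attention-weighted sum of all positions of the same sequence.
import numpy as np

def self_align(X):
    scores = X @ X.T                                   # pairwise alignment weights, (m, m)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over the second axis
    return weights @ X                                 # re-expressed sequence, (m, d)
```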
in the following description, the SOA symbol is used to represent the operation of formula (3), the MUA symbol is used to represent the operation of formula (4), the SUA symbol is used to represent the operation of formula (5), and the SEA symbol is used to represent the operation of formula (6);
the first layer of the coding structure uses several attention modules to extract the fine-grained initial semantic features within the same text:
first, soft-alignment attention is applied between the character-granularity feature P_c and the word-granularity feature P_w of text P, yielding the character-level and the word-level soft-alignment features of text P, as shown in equation (7);
second, multiplicative alignment attention is applied between P_c and P_w, yielding the character-level and the word-level multiplicative alignment features of text P, as shown in equation (8);
then, subtractive alignment attention is applied between P_c and P_w, yielding the character-level and the word-level subtractive alignment features of text P, as shown in equation (9);
similarly, text Q is processed in the same way as text P to obtain its character-level and word-level soft-alignment, multiplicative alignment, and subtractive alignment features; this completes the extraction of the fine-grained initial semantic features within the same text;
the second layer of the coding structure enhances the fine-grained initial semantic features of the same text, completing the extraction of its fine-grained semantic features:
at the character granularity, the character-level soft-alignment feature of text P from equation (7) is first added to the character-granularity feature P_c from equation (1), giving the character-level deep soft-alignment feature of text P, as shown in equation (10); next, the character-level multiplicative alignment feature from equation (8) is added to P_c, giving the character-level deep multiplicative alignment feature, as shown in equation (11); then, the character-level subtractive alignment feature from equation (9) is added to P_c, giving the character-level deep subtractive alignment feature, as shown in equation (12); finally, the features from equations (10), (11), and (12) are concatenated to obtain the character-level high-level feature P'_c of text P, as shown in equation (13);
the word granularity is handled analogously: the word-level soft-alignment feature from equation (7) is added to the word-granularity feature P_w from equation (2), giving the word-level deep soft-alignment feature, as shown in equation (14); the word-level multiplicative alignment feature from equation (8) is added to P_w, giving the word-level deep multiplicative alignment feature, as shown in equation (15); the word-level subtractive alignment feature from equation (9) is added to P_w, giving the word-level deep subtractive alignment feature, as shown in equation (16); the features from equations (14), (15), and (16) are then concatenated to obtain the word-level high-level feature P'_w of text P, as shown in equation (17);
the character-level deep soft-alignment feature from equation (10) and the word-level deep soft-alignment feature from equation (14) are concatenated to obtain the deep semantic feature P'_deep of text P, as shown in equation (18);
similarly, text Q is processed in the same way as text P to obtain its character-level and word-level deep soft-alignment, deep multiplicative alignment, and deep subtractive alignment features, its character-level high-level feature Q'_c, its word-level high-level feature Q'_w, and its deep semantic feature Q'_deep, completing the extraction of the fine-grained semantic features within the same text;
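As a loose illustration of this enhancement step (residual addition of each aligned feature to the original feature, then concatenation), assuming all features share the same dimensionality:

```python
# Sketch of the second-layer enhancement for one granularity of one text:
# add each aligned feature back to the original feature (a residual connection),
# then concatenate the three enhanced features along the feature axis.
import numpy as np

def enhance(P_orig, P_soft, P_mult, P_sub):
    deep_soft = P_orig + P_soft          # e.g. equation (10) / (14), roughly
    deep_mult = P_orig + P_mult          # e.g. equation (11) / (15), roughly
    deep_sub  = P_orig + P_sub           # e.g. equation (12) / (16), roughly
    high_level = np.concatenate([deep_soft, deep_mult, deep_sub], axis=-1)  # eq. (13) / (17)
    return deep_soft, high_level
```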
the second sub-module extracts semantic interaction features between texts:
the first layer of the coding structure extracts the initial semantic interaction features between the texts:
at the character granularity, soft-alignment attention is first applied between the character-granularity feature P_c of text P from equation (1) and the character-granularity feature Q_c of text Q, yielding the character-level soft-alignment interaction features of text P and of text Q, as shown in equation (19); next, subtractive alignment attention is applied between P_c and Q_c, yielding the character-level subtractive alignment interaction features of text P and of text Q, as shown in equation (20);
the word granularity is handled analogously: soft-alignment attention is applied between the word-granularity feature P_w of text P from equation (2) and the word-granularity feature Q_w of text Q, yielding the word-level soft-alignment interaction features of text P and of text Q, as shown in equation (21); then, subtractive alignment attention is applied between P_w and Q_w, yielding the word-level subtractive alignment interaction features of text P and of text Q, as shown in equation (22);
the second layer of the coding structure enhances the initial inter-text semantic interaction features, completing the extraction of the semantic interaction features between texts:
at the character granularity, the character-level soft-alignment interaction feature of text P from equation (19) is first added to the character-granularity feature P_c from equation (1), giving the character-level deep soft-alignment interaction feature of text P, as shown in equation (23); then, the character-level subtractive alignment interaction feature from equation (20) is added to P_c, giving the character-level deep subtractive alignment interaction feature, as shown in equation (24); finally, the features from equations (23) and (24) are concatenated to obtain the character-level high-level interaction feature P''_c of text P, as shown in equation (25);
at the word granularity, the word-level soft-alignment interaction feature of text P from equation (21) is first added to the word-granularity feature P_w from equation (2), giving the word-level deep soft-alignment interaction feature, as shown in equation (26); then, the word-level subtractive alignment interaction feature from equation (22) is added to P_w, giving the word-level deep subtractive alignment interaction feature, as shown in equation (27); finally, the features from equations (26) and (27) are concatenated to obtain the word-level high-level interaction feature P''_w of text P, as shown in equation (28);
the character-level deep subtractive alignment interaction feature from equation (24) and the word-level deep subtractive alignment interaction feature from equation (27) are concatenated to obtain the deep semantic interaction feature P''_deep of text P, as shown in equation (29);
similarly, text Q is processed in the same way as text P to obtain its character-level and word-level deep soft-alignment and deep subtractive alignment interaction features, its character-level high-level interaction feature Q''_c, its word-level high-level interaction feature Q''_w, and its deep semantic interaction feature Q''_deep, completing the extraction of the semantic interaction features between texts.
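A compact, self-contained illustration of this inter-text pathway for one granularity (cross-text soft and subtractive alignment, residual addition, concatenation); the score forms are the same assumptions used in the earlier sketches:

```python
# Sketch of the inter-text interaction pathway at one granularity:
# align text P features against text Q features (soft and subtractive attention),
# add the aligned features back to the originals, then concatenate.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_interaction(P, Q):
    soft_scores = P @ Q.T                                         # soft alignment, eq. (19)/(21) roughly
    P_soft = softmax(soft_scores, axis=1) @ Q
    sub_scores = -np.abs(P[:, None, :] - Q[None, :, :]).sum(-1)   # subtractive alignment, eq. (20)/(22) roughly
    P_sub = softmax(sub_scores, axis=1) @ Q
    deep_soft, deep_sub = P + P_soft, P + P_sub                   # eq. (23)/(26) and (24)/(27) roughly
    return np.concatenate([deep_soft, deep_sub], axis=-1)         # eq. (25)/(28)
```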
5. The text semantic matching method for medical intelligent question answering according to claim 4, wherein implementation details of the feature fusion layer are as follows:
first, for convenience of subsequent description, the following definitions are made:
the vector subtraction followed by element-wise absolute value is defined as AB, as shown in equation (30):
AB(P, Q) = |P - Q|    (30)
where P and Q are two different vectors; the vectors are subtracted and the absolute value of the difference is taken element-wise;
the element-wise multiplication of vectors is defined as MU, as shown in equation (31):
MU(P, Q) = P ⊙ Q    (31)
where P and Q are two different vectors and ⊙ denotes the element-wise multiplication of P and Q;
in the following description, the AB symbol represents the operation of formula (30), the MU symbol represents the operation of formula (31), the SOA symbol represents the operation of formula (3), the MUA symbol represents the operation of formula (4), the SUA symbol represents the operation of formula (5), and the SEA symbol represents the operation of formula (6);
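These two fusion operations are straightforward; a minimal sketch:

```python
# Sketch of the two fusion operations used by the feature fusion layer.
import numpy as np

def AB(P, Q):
    """Element-wise absolute difference of two vectors, equation (30)."""
    return np.abs(P - Q)

def MU(P, Q):
    """Element-wise product of two vectors, equation (31)."""
    return P * Q
```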
the feature fusion layer is divided into two sub-modules, the first sub-module combines various related features, and the second sub-module performs various matching operations to obtain a final matching feature vector;
the first sub-module combines several related features:
at the character granularity, the character-level high-level feature P'_c of text P from equation (13) and the character-level high-level interaction feature P''_c from equation (25) are concatenated to obtain the character-level aggregation feature of text P, and self-alignment attention is applied to it to obtain the character-level deep aggregation feature of text P, as shown in equation (32);
the word granularity is handled analogously: the word-level high-level feature P'_w from equation (17) and the word-level high-level interaction feature P''_w from equation (28) are concatenated to obtain the word-level aggregation feature of text P, and self-alignment attention is applied to it to obtain the word-level deep aggregation feature of text P, as shown in equation (33);
then, the character-level deep aggregation feature from equation (32) and the word-level deep aggregation feature from equation (33) are concatenated and a max-pooling operation is applied to obtain the pooled semantic feature P' of text P, as shown in equation (34);
next, the deep semantic feature P'_deep of text P from equation (18) and the deep semantic interaction feature P''_deep from equation (29) are concatenated to obtain the deep aggregation feature of text P, as shown in equation (35);
similarly, text Q is processed in the same way as text P to obtain its character-level aggregation and deep aggregation features, its word-level aggregation and deep aggregation features, its pooled semantic feature Q', and its deep aggregation feature;
then, soft-alignment attention is applied between the deep aggregation feature of text P from equation (35) and the deep aggregation feature of text Q, giving the soft-aligned deep aggregation features of text P and of text Q, as shown in equation (36);
finally, max pooling is applied to the soft-aligned deep aggregation feature of text P from equation (36) to obtain the pooled deep aggregation feature P'' of text P, and to the soft-aligned deep aggregation feature of text Q to obtain the pooled deep aggregation feature Q'' of text Q, as shown in equation (37);
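A rough sketch of the aggregation-and-pooling pattern used here (self-attention over a concatenated feature sequence, followed by max pooling over positions); dimensions are assumptions:

```python
# Sketch of the aggregation step: concatenate two feature sequences along the
# feature axis, refine them with self-alignment attention, then max-pool over
# the sequence positions to obtain a fixed-size vector.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_and_pool(A, B):
    agg = np.concatenate([A, B], axis=-1)         # aggregation feature, (m, 2d)
    attn = softmax(agg @ agg.T, axis=1) @ agg     # self-alignment attention, e.g. eq. (32)/(33)
    return attn.max(axis=0)                       # max pooling over positions, e.g. eq. (34)/(37)
```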
the second sub-module performs multiple matching operations to obtain a final matching feature vector:
first, the pooled semantic feature P' of text P and the pooled semantic feature Q' of text Q from equation (34) are combined by element-wise absolute difference to obtain the subtraction matching feature PQ_ab, as shown in equation (38):
PQ_ab = AB(P', Q')    (38)
second, the pooled semantic features P' and Q' from equation (34) are combined by element-wise multiplication to obtain the product matching feature PQ_mu, as shown in equation (39):
PQ_mu = MU(P', Q')    (39)
third, the pooled deep aggregation feature P'' of text P and the pooled deep aggregation feature Q'' of text Q from equation (37) are combined by element-wise absolute difference to obtain the deep subtraction matching feature PQ'_ab, as shown in equation (40):
PQ'_ab = AB(P'', Q'')    (40)
then, the pooled deep aggregation features P'' and Q'' from equation (37) are combined by element-wise multiplication to obtain the deep product matching feature PQ'_mu, as shown in equation (41):
PQ'_mu = MU(P'', Q'')    (41)
finally, the pooled semantic features P' and Q' from equation (34), the subtraction matching feature PQ_ab from equation (38), the product matching feature PQ_mu from equation (39), the deep subtraction matching feature PQ'_ab from equation (40), and the deep product matching feature PQ'_mu from equation (41) are concatenated to obtain the final matching feature vector F, as shown in equation (42):
F = [P'; Q'; PQ_ab; PQ_mu; PQ'_ab; PQ'_mu]    (42)
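A minimal sketch of assembling the final matching vector from the pooled representations of equations (34) and (37):

```python
# Sketch of the final matching feature vector F of equation (42), built from the
# pooled representations P', Q' (eq. 34) and P'', Q'' (eq. 37).
import numpy as np

def matching_vector(P1, Q1, P2, Q2):
    """P1, Q1 correspond to P', Q'; P2, Q2 correspond to P'', Q''."""
    pq_ab  = np.abs(P1 - Q1)      # eq. (38)
    pq_mu  = P1 * Q1              # eq. (39)
    pqd_ab = np.abs(P2 - Q2)      # eq. (40)
    pqd_mu = P2 * Q2              # eq. (41)
    return np.concatenate([P1, Q1, pq_ab, pq_mu, pqd_ab, pqd_mu])  # eq. (42)
```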
6. The text semantic matching method for medical intelligent question answering according to claim 5, wherein implementation details of the prediction layer are as follows:
the final matching feature vector F is taken as input and passed through three fully connected layers; the ReLU activation function is applied after the first and second fully connected layers, and the sigmoid function is applied after the third fully connected layer, yielding a matching degree value in [0, 1], denoted y_pred; finally, whether the two text semantics match is judged by comparing this value with a preset threshold of 0.5, i.e., when y_pred ≥ 0.5 the text semantics are predicted to match, and otherwise they do not match;
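A minimal Keras-style sketch of such a prediction head; the claim fixes only the layer count and activations, so the hidden size below is a hypothetical choice:

```python
# Sketch of the prediction layer: three fully connected layers over the matching
# vector F, ReLU after the first two and sigmoid after the third, then a 0.5 threshold.
import tensorflow as tf
from tensorflow.keras import layers

def prediction_head(f_dim, hidden=600):  # hidden size is an assumption
    inp = layers.Input(shape=(f_dim,))
    x = layers.Dense(hidden, activation="relu")(inp)
    x = layers.Dense(hidden, activation="relu")(x)
    y_pred = layers.Dense(1, activation="sigmoid")(x)   # matching degree in [0, 1]
    return tf.keras.Model(inp, y_pred)

# Decision rule: predict a match if y_pred >= 0.5, otherwise a mismatch.
```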
before the text semantic matching model has been trained, it must be trained on a training data set constructed from the semantic matching knowledge base so as to optimize the model parameters; once the model has been trained, the prediction layer can predict whether the semantics of the target texts match.
7. The text semantic matching method for medical intelligent question answering according to claim 1, wherein the construction of the text semantic matching knowledge base comprises downloading a data set from the network to obtain raw data, preprocessing the raw data, and summarizing the sub-knowledge bases;
downloading a data set from the network to obtain raw data: a publicly available text semantic matching data set, or a manually constructed data set, is downloaded and used as the raw data for constructing the text semantic matching knowledge base;
preprocessing the raw data: the raw data used for constructing the text semantic matching knowledge base are preprocessed, and character-breaking and word-segmentation operations are performed on each text to obtain a character-broken knowledge base and a word-segmented knowledge base for text semantic matching;
summarizing the sub-knowledge bases: the character-broken knowledge base and the word-segmented knowledge base for text semantic matching are combined to construct the text semantic matching knowledge base;
the text semantic matching model is obtained by training on a training data set, and the construction of the training data set comprises constructing training positive examples, constructing training negative examples, and constructing the training data set;
constructing a training positive example: for each pair of texts in the text semantic matching knowledge base whose semantics are consistent, the pair can be used to construct a positive training example;
constructing a training negative example: a text txt_P is selected, a text txt_Q that does not match txt_P is randomly selected from the text semantic matching knowledge base, and txt_P and txt_Q are combined to construct a negative example;
constructing the training data set: all positive and negative example data obtained from the above two operations are combined and their order is shuffled to construct the final training data set;
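For illustration, constructing positive and negative pairs and shuffling them might look like the following sketch; the data layout and function names are hypothetical, since the claim does not fix a file format:

```python
# Sketch of training-set construction: keep matched pairs as positive examples,
# pair each text with a randomly chosen non-matching text as a negative example,
# then shuffle everything.
import random

def build_training_set(matched_pairs, all_texts, seed=42):
    """matched_pairs: list of (txt_p, txt_q) pairs with consistent semantics."""
    rng = random.Random(seed)
    examples = [(p, q, 1) for p, q in matched_pairs]     # positive examples
    for p, q in matched_pairs:
        neg = rng.choice(all_texts)
        while neg == q:                                  # avoid picking the true match
            neg = rng.choice(all_texts)
        examples.append((p, neg, 0))                     # negative example
    rng.shuffle(examples)
    return examples
```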
after the text semantic matching model is built, it is trained and optimized on the training data set, specifically as follows:
constructing the loss function: as described for the prediction layer, y_pred is the matching degree value computed by the text semantic matching model, and y_true is the true label indicating whether the two text semantics match, taking values of 0 or 1; cross entropy is used as the loss function;
constructing the optimization function: the Adam optimizer is used, and the text semantic matching model is optimized on the training data set.
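A minimal sketch of this training setup (binary cross entropy with Adam), assuming a Keras model whose output is y_pred; the learning rate, batch size, and epoch count are assumptions:

```python
# Sketch of the training configuration: binary cross entropy loss and the Adam
# optimizer, applied to the full matching model on the constructed training set.
import tensorflow as tf

def compile_and_train(model, train_inputs, train_labels):
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # assumed learning rate
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_inputs, train_labels, batch_size=64, epochs=10)        # assumed hyperparameters
    return model
```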
8. A text semantic matching device for medical intelligent question answering, characterized by comprising a text semantic matching knowledge base construction unit, a training data set generation unit, a text semantic matching model construction unit, and a text semantic matching model training unit, which respectively implement the steps of the text semantic matching method for medical intelligent question answering according to claims 1-7.
9. A storage medium having stored thereon a plurality of instructions, wherein the instructions are loaded by a processor to perform the steps of the text semantic matching method for medical intelligent question answering according to claims 1 to 7.
10. An electronic device, characterized in that the electronic device comprises:
the storage medium of claim 9 and a processor to execute instructions in the storage medium.
CN202210996504.8A 2022-08-19 2022-08-19 Text semantic matching method and device for medical intelligent question answering Pending CN115269808A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210996504.8A CN115269808A (en) 2022-08-19 2022-08-19 Text semantic matching method and device for medical intelligent question answering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210996504.8A CN115269808A (en) 2022-08-19 2022-08-19 Text semantic matching method and device for medical intelligent question answering

Publications (1)

Publication Number Publication Date
CN115269808A true CN115269808A (en) 2022-11-01

Family

ID=83752534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210996504.8A Pending CN115269808A (en) 2022-08-19 2022-08-19 Text semantic matching method and device for medical intelligent question answering

Country Status (1)

Country Link
CN (1) CN115269808A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117637153A (en) * 2024-01-23 2024-03-01 吉林大学 Informationized management system and method for patient safety nursing
CN117637153B (en) * 2024-01-23 2024-03-29 吉林大学 Informationized management system and method for patient safety nursing


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501

Applicant after: Qilu University of Technology (Shandong Academy of Sciences)

Address before: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501

Applicant before: Qilu University of Technology

Country or region before: China