CN113127599B - Question-answering stance detection method and device with a hierarchical alignment structure - Google Patents

Question-answering stance detection method and device with a hierarchical alignment structure

Info

Publication number
CN113127599B
CN113127599B (application CN202110230676.XA)
Authority
CN
China
Prior art keywords
question
answer
sequence
sample
coarse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110230676.XA
Other languages
Chinese (zh)
Other versions
CN113127599A (en
Inventor
付鹏
林政
刘欢
王伟平
孟丹
Current Assignee
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN202110230676.XA priority Critical patent/CN113127599B/en
Publication of CN113127599A publication Critical patent/CN113127599A/en
Application granted granted Critical
Publication of CN113127599B publication Critical patent/CN113127599B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3346Query execution using probabilistic model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Abstract

The invention discloses a question-answering stance detection method and device with a hierarchical alignment structure, wherein the method comprises the following steps: converting the question text and the answer text into a question sequence and an answer sequence, respectively; concatenating the question sequence and the answer sequence to obtain a question-answer sequence; and inputting the question sequence, the answer sequence, and the question-answer sequence into a hierarchical alignment model to obtain a question-answering stance detection result. The hierarchical alignment model uses a pre-trained BERT model to obtain a coarse-grained stance representation, then performs concept-level target alignment and evidence-level information alignment from both the question and answer sides of the QA pair to obtain a coarse-to-fine stance representation, and thereby achieves higher accuracy and F1 values on the question-answering stance detection task.

Description

Question-answering stance detection method and device with a hierarchical alignment structure
Technical Field
The invention relates to the fields of social media, stance detection, and natural language processing, and in particular to a question-answering stance detection method and device with a hierarchical alignment structure.
Technical background
The stance detection task is a classification problem that aims to identify the stance an author expresses toward a specific target (such as an entity, a claim, or an event). It plays an important role in tasks such as opinion mining, political debate analysis, rumor detection, and fake news detection. On social question-answering platforms, question-answering stance detection is a new variant of this task, aiming to identify the stance carried in an answer toward a specific question.
For the stance detection task, early research focused on online debate forum text, mainly using rule-based algorithms (Walker, M., Tree, J.F., Anand, P., Abbott, R., King, J.: A corpus for research on deliberation and debate. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp. 812-817. European Language Resources Association (ELRA), Istanbul, Turkey (May 2012)) and SVM-based methods (Hasan, K.S., Ng, V.: Stance classification of ideological debates: Data, models, features, and constraints. In: Proceedings of the Sixth International Joint Conference on Natural Language Processing (2013)). Recent work has gradually shifted to the social media domain, and research methods have shifted to deep learning, using deep neural network models to analyze the stance toward a target, for example character- and word-level CNNs (Vijayaraghavan, P., Sysoev, I., Vosoughi, S., Roy, D.: DeepStance at SemEval-2016 Task 6: Detecting stance in tweets using character and word-level CNNs. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 413-419. Association for Computational Linguistics, San Diego, California (2016)), transfer learning with recurrent networks (Zarrella, G., Marsh, A.: MITRE at SemEval-2016 Task 6: Transfer learning for stance detection. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) (2016)), and hierarchical attention networks (Sun, Q., Wang, Z., Zhu, Q., Zhou, G.: Stance detection with hierarchical attention network. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 2399-2409. Association for Computational Linguistics, Santa Fe, New Mexico, USA (Aug 2018)).
In addition, some studies, such as the literature (Zhang, B., Yang, M., Li, X., Ye, Y., Xu, X., Dai, K.: Enhancing cross-target stance detection with transferable semantic-emotion knowledge. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3188-3197. Association for Computational Linguistics, Online (Jul 2020)), use knowledge shared across multiple targets, migrating the stance information learned on source targets to related, unseen targets.
Question-answering stance detection targets the question in a question-answer text and identifies the stance expressed in the answer text. Given a question-answer (QA) pair, the most recent approach proposes a recurrent conditional attention network (Yuan, J., Zhao, Y., Xu, J., Qin, B.: Exploring answer stance detection with recurrent conditional attention. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019), which reads the question and answer texts through a conditional attention mechanism to obtain the final stance prediction. To solve the question-answering stance detection task, a model must not only understand the semantics of the question and answer texts but also model the relationship between them.
Furthermore, related stance detection subtasks include rumour stance detection (Gorrell, G., Kochkina, E., Liakata, M., Aker, A., Zubiaga, A., Bontcheva, K., Derczynski, L.: SemEval-2019 Task 7: RumourEval, determining rumour veracity and support for rumours. In: Proceedings of the 13th International Workshop on Semantic Evaluation, pp. 845-854. Association for Computational Linguistics, Minneapolis, Minnesota, USA (Jun 2019)) and fake news stance detection; compared with these, question-answering stance detection focuses more on learning the interaction within the QA pair and on modeling the stance representation under a specified target. Related tasks also include target-dependent sentiment analysis, whose goal is to learn a representation associated with a given target, whereas question-answering stance detection needs to find the targets and information associated with the entire question.
The prior art, when applied to the question-answering stance detection task, ignores the following two problems. First, in question-answering stance detection, the stance relates to the concept-related target in the question text, but the words expressing the same concept in the question and answer texts may not coincide, so target alignment should be performed. Second, the answer text may contain more than one concept-related target, and the extra target information may interfere with stance recognition, so context alignment should be performed to find the content that supports the question text, i.e., the evidence-related context.
Disclosure of Invention
The invention provides a question-answering stance detection method and device with a hierarchical alignment structure. Through concept-related target alignment and evidence-related context alignment, it solves the problem that the concept-related targets and the evidence-related contexts in a QA pair may be inconsistent, and builds a coarse-to-fine vector representation of the stance, thereby effectively improving performance on the question-answering stance detection task and accurately identifying the stance carried by the answer text toward the question in the QA pair.
To achieve this purpose, the invention provides the following technical solution:
a question-answering position detection method of a hierarchical alignment structure comprises the following steps:
1) respectively converting the question text and the answer text into a question sequence and an answer sequence;
2) splicing the question sequence and the answer sequence to obtain a question answer sequence;
3) inputting the question sequence, the answer sequence and the question answer sequence into a hierarchical alignment model to obtain a question-answer standpoint detection result;
wherein, a question-answering place detection model is obtained through the following steps:
a) respectively converting the plurality of sample question texts and the plurality of sample answer texts into sample question sequences and sample answer sequences, and splicing the sample question sequences and the corresponding sample answer sequences to obtain a plurality of sample question answer sequences;
b) respectively coding each sample question sequence, sample answer sequence and sample question answer sequence to obtain a plurality of question sequence representations SQAnswer sequence representation SAAnd coarse grain size in the vertical representation of SQA
c) Representing S by a question sequenceQAs a query and representing the corresponding answer sequence as SAObtaining a number of question-dependent answer representations M as keys and valuesQ→AExpressing S as a sequence of answersAAs a query and representing the corresponding answer sequence as SQObtaining, as keys and values, a number of answer-dependent question representations MA→QAnd connecting the question-dependent answer representation MQ→AWith corresponding answer-dependent question representation MA→QObtaining a plurality of fine-grained representations DQA
d) Aligned fine-grained representation D based on a multi-head attention mechanismQARepresenting S from the corresponding coarse-grained standpointQAThe sentence meanings related to evidence between the two groups of the sentences obtain a plurality of vectors representing O from a coarse position to a fine position;
e) and classifying a plurality of vector representations O from coarse to fine to obtain a hierarchical alignment model.
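The training steps a)-e) above can be sketched end to end as follows. This is a minimal stub pipeline for illustration only: random vectors stand in for BERT encoding, the attention-based matching and alignment modules are replaced by pass-through stubs, and the mean-pooling classifier is an assumption, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding size (a real BERT-base model would use 768)

# Random stand-ins for the real modules: `encode` replaces BERT, `cross`
# replaces the attention-based matching/alignment blocks (it simply passes
# the query through), and `classify` is a mean-pool + argmax stub.
encode = lambda tokens: rng.normal(size=(len(tokens), d))      # step b)
cross = lambda query, keys_values: query                       # steps c) and d)
classify = lambda O: int(np.argmax(O.mean(axis=0)[:3]))        # step e)

def detect_stance(q_tokens, a_tokens):
    S_Q, S_A = encode(q_tokens), encode(a_tokens)              # step b)
    S_QA = encode(q_tokens + ["[SEP]"] + a_tokens)             # coarse-grained
    M_qa = cross(S_Q, S_A)          # question-dependent answer representation
    M_aq = cross(S_A, S_Q)          # answer-dependent question representation
    D_QA = np.concatenate([M_qa, M_aq], axis=0)                # fine-grained
    O = cross(D_QA, S_QA)           # coarse-to-fine stance representation
    return classify(O)              # 0 = favor, 1 = against, 2 = neutral

label = detect_stance(["is", "it", "safe"], ["no", "it", "is", "not"])
assert label in (0, 1, 2)
```

Concatenating M_Q→A and M_A→Q along the sequence axis is one plausible reading of "connecting" in step c); the real modules are detailed in the description below.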
Further, the sample question sequence, the sample answer sequence, and the sample question-answer sequence are encoded using a pre-trained BERT model.
Further, the question-dependent answer representation M_Q→A is obtained through the following steps:
1) using the question sequence representation S_Q as the query and the corresponding answer sequence representation S_A as the keys and values, obtaining the output of the first question-answer matching block, comprising the steps of:
a) obtaining the output of the i-th head ATT_i(S_Q, S_A) = softmax( (S_Q W_i^Q)(S_A W_i^K)^T / sqrt(d_k) ) (S_A W_i^V), where d_k = d/h is the dimension of S_A W_i^K, d is the embedding size used when converting the sample question text and sample answer text into the sample question sequence and sample answer sequence, h is the number of heads, W_i^Q, W_i^K, W_i^V are learnable parameters, and 1 ≤ i ≤ h;
b) concatenating the outputs of the h heads and applying a linear projection to the concatenation to obtain MATT(S_Q, S_A) = [ATT_1(S_Q, S_A), ATT_2(S_Q, S_A), ..., ATT_h(S_Q, S_A)] W^O, where W^O is a learnable parameter;
c) applying a residual connection between the question sequence representation S_Q and MATT(S_Q, S_A) to obtain Z = LN(S_Q + MATT(S_Q, S_A)), where LN is layer normalization;
d) feeding Z into a feed-forward network and another residual connection layer to obtain the output of the first matching block, TIM_1(S_Q, S_A) = LN(Z + MLP(Z)), where MLP is the feed-forward network;
2) stacking l_m question-answer matching blocks and taking the output of the last block as the question-dependent answer representation M_Q→A.
Further, the coarse-to-fine stance vector representation is O = MATT'(D_QA, S_QA) = [ATT'_1(D_QA, S_QA), ..., ATT'_h'(D_QA, S_QA)] W'^O, where ATT'_j(D_QA, S_QA) = softmax( (D_QA W'_j^Q)(S_QA W'_j^K)^T / sqrt(d_k) ) (S_QA W'_j^V), W'_j^Q, W'_j^K, W'_j^V, and W'^O are learnable parameters, 1 ≤ j ≤ h', and h' is the number of attention heads.
Further, the coarse-to-fine stance vector representations O are classified by:
1) calculating, with a softmax function, the probability that each coarse-to-fine stance vector representation O belongs to each stance class;
2) taking the class with the highest probability as the class of that coarse-to-fine stance vector representation O.
Further, before calculating these probabilities, a linear layer is used to reduce the dimensionality of each coarse-to-fine stance vector representation O.
Further, the loss function for training the hierarchical alignment model is L = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{|C|} y_ij log(ŷ_ij), where ŷ_ij is the predicted probability that sample i belongs to class j, y_ij is 1 if class j is the true label of sample i and 0 otherwise, N is the number of sample question texts or sample answer texts, and |C| is the number of stance classes.
Further, the set of stance classes C comprises: favor, against, and neutral.
A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above method when run.
An electronic device comprising a memory, in which a computer program is stored, and a processor arranged to run the computer program to perform the above method.
Compared with the prior art, the invention has the following advantages:
Compared with the recurrent conditional attention network scheme, the invention explicitly models target-dependency information through an attention-based encoding strategy. The conditional attention and extraction process of that scheme only simulates the interaction between the QA pair; it neither learns a feature-rich text representation nor explicitly performs target and context alignment at the encoding stage. The invention instead uses the BERT pre-training model to obtain a coarse-grained stance representation, and then performs concept-level target alignment and evidence-level information alignment from both the question and answer sides of the QA pair to obtain a coarse-to-fine stance representation. Experiments show that this method achieves higher accuracy and F1 values on the question-answering stance detection task.
Drawings
FIG. 1 is the architecture diagram of the Hierarchical Alignment Transformer (HAT) model of the present invention.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the objects, features, and advantages of the present invention more comprehensible, the technical core of the present invention is described in further detail below with reference to the accompanying drawings and examples.
The invention provides a novel question-answering stance detection model, a Transformer-based Hierarchical Alignment (HAT) model, as shown in FIG. 1. The model aligns the concept-related targets and the evidence-related contexts in a question-answer pair, learns a coarse-to-fine stance representation, and is applied to question-answering stance detection. The HAT model mainly comprises three modules: a question-answer text encoding module, a concept-related target alignment module, and an evidence-related context alignment module. First, the invention uses the pre-trained model BERT (Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171-4186. Association for Computational Linguistics, Minneapolis, Minnesota (Jun 2019)) to compute the basic semantic features of the question-answer text. Then, QA interaction matching blocks are introduced to align the concept-related targets from two directions, obtaining a question-dependent answer representation and an answer-dependent question representation. Finally, a multi-head attention mechanism is used to align the evidence-related contexts and learn a better stance representation for question-answering stance detection.
The method is mainly divided into the following four parts: question-answer text encoding, target alignment, context alignment, and stance classification.
1. Question-answer text encoding
For the question text, the invention converts the token sequence into a sequence representation X = {x_1, x_2, ..., x_N}, where x_i ∈ R^d is the sum of its word embedding, segment embedding, and position embedding, N is the length of the question sequence, and d is the embedding size, which is also the hidden dimension of the pre-trained BERT model used to obtain the text representation. The encoded text is the output of the last layer of the BERT encoder, i.e., the question sequence representation S_Q ∈ R^{N×d}. The answer sequence representation S_A ∈ R^{M×d} is obtained in the same way, where M is the length of the answer sequence. Then, the invention concatenates the question sequence and the answer sequence and inputs the result into the pre-trained BERT model to obtain the coarse-grained stance representation S_QA ∈ R^{(N+1+M)×d}, where the one extra dimension in (N+1+M) corresponds to the separator [SEP] between the Q and A sequences.
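The shapes involved can be sketched as follows; random vectors stand in for BERT's token embeddings and output (d, N, and M are toy values, not the real 768-dimensional model), and for simplicity the coarse-grained representation is built by concatenating the toy embeddings directly, whereas the patent encodes the concatenated token sequence with BERT.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8        # toy embedding size; BERT-base would give d = 768
N, M = 5, 7  # question and answer sequence lengths

def embed(num_tokens):
    """Stand-in for BERT: the sum of word, segment, and position embeddings
    per token. A real implementation would return the last hidden layer of a
    pre-trained BERT encoder instead of random vectors."""
    word = rng.normal(size=(num_tokens, d))
    segment = rng.normal(size=(num_tokens, d))
    position = rng.normal(size=(num_tokens, d))
    return word + segment + position

S_Q = embed(N)   # question sequence representation, shape (N, d)
S_A = embed(M)   # answer sequence representation, shape (M, d)
sep = embed(1)   # the [SEP] separator between the Q and A sequences
S_QA = np.concatenate([S_Q, sep, S_A], axis=0)  # coarse-grained stance rep.

assert S_QA.shape == (N + 1 + M, d)
```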
2. Target alignment
The role of the concept-related target alignment module is to align the concept-related targets from both the question and answer sides of the QA pair, learning a question-dependent answer representation and an answer-dependent question representation. A QA interaction matching module is therefore constructed, using an attention mechanism to align the concept-level targets from the two sides. We propose two QA interaction matching blocks: a question-answer matching block and an answer-question matching block.
The question-answer matching block uses the question sequence representation S_Q as the query and the answer sequence representation S_A as the keys and values. Conversely, the answer-question matching block uses the answer sequence representation S_A as the query and the question sequence representation S_Q as the keys and values. In this way, the model focuses on the concept-related targets from both the question and answer sides, and thus obtains a question-dependent answer representation and an answer-dependent question representation.
Specifically, the i-th head of the question-answer matching block is computed as:
ATT_i(S_Q, S_A) = softmax( (S_Q W_i^Q)(S_A W_i^K)^T / sqrt(d_k) ) (S_A W_i^V)
where d_k = d/h is the dimension of S_A W_i^K, W_i^Q, W_i^K, W_i^V are learnable parameters, and h is the number of heads.
Then, the outputs of the h heads are concatenated and a linear projection is applied:
MATT(S_Q, S_A) = [ATT_1(S_Q, S_A), ATT_2(S_Q, S_A), ..., ATT_h(S_Q, S_A)] W^O
where W^O is a learnable parameter.
Then, a residual connection is applied between S_Q and MATT(S_Q, S_A):
Z = LN(S_Q + MATT(S_Q, S_A))
where LN is layer normalization. After that, Z is fed into a feed-forward network (MLP) and another residual connection layer, yielding:
TIM(S_Q, S_A) = LN(Z + MLP(Z))
which is the output of the first question-answer matching block.
We stack l_m question-answer matching blocks and take the output of the last block as the question-dependent answer representation M_Q→A ∈ R^{N×d}, where l_m is a hyperparameter giving the number of matching blocks.
Similarly to the computation of the question-answer matching block, by stacking l_m answer-question matching blocks we obtain the answer-dependent question representation M_A→Q ∈ R^{M×d}.
Finally, we connect the two representations M_Q→A and M_A→Q to obtain the fine-grained representation D_QA as the output of the concept-related target alignment module.
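The whole target alignment module can be sketched in numpy as follows. This is a minimal illustration, not the patent's implementation: parameter initialization is random, the two directions share per-block parameters for brevity, and connecting M_Q→A with M_A→Q along the sequence axis is an assumption about what "connect" means.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def matt(Q, KV, p):
    """Multi-head cross-attention MATT: Q supplies queries, KV keys and values."""
    d_k = p["Wq"][0].shape[1]
    heads = [softmax((Q @ Wq) @ (KV @ Wk).T / np.sqrt(d_k)) @ (KV @ Wv)
             for Wq, Wk, Wv in zip(p["Wq"], p["Wk"], p["Wv"])]
    return np.concatenate(heads, axis=-1) @ p["Wo"]

def matching_block(Q, KV, p):
    """TIM(Q, KV) = LN(Z + MLP(Z)) with Z = LN(Q + MATT(Q, KV))."""
    Z = layer_norm(Q + matt(Q, KV, p))
    return layer_norm(Z + np.maximum(Z @ p["W1"], 0.0) @ p["W2"])

def target_align(S_Q, S_A, blocks):
    """Stack l_m matching blocks in each direction, then connect the outputs."""
    M_qa, M_aq = S_Q, S_A
    for p in blocks:                          # l_m = len(blocks)
        M_qa = matching_block(M_qa, S_A, p)   # question-dependent answer rep.
        M_aq = matching_block(M_aq, S_Q, p)   # answer-dependent question rep.
    return np.concatenate([M_qa, M_aq], axis=0)  # fine-grained D_QA

rng = np.random.default_rng(1)
d, h, N, M, l_m = 8, 2, 5, 7, 2
d_k = d // h

def make_params():
    return {"Wq": [rng.normal(size=(d, d_k)) for _ in range(h)],
            "Wk": [rng.normal(size=(d, d_k)) for _ in range(h)],
            "Wv": [rng.normal(size=(d, d_k)) for _ in range(h)],
            "Wo": rng.normal(size=(d, d)),
            "W1": rng.normal(size=(d, 4 * d)),
            "W2": rng.normal(size=(4 * d, d))}

S_Q = rng.normal(size=(N, d))
S_A = rng.normal(size=(M, d))
D_QA = target_align(S_Q, S_A, [make_params() for _ in range(l_m)])
assert D_QA.shape == (N + M, d)
```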
3. Context alignment
The evidence-related context alignment module aims to align the evidence contexts of the QA pair and accumulate the stance representation from coarse-grained to fine-grained for question-answering stance classification. To this end, the invention employs a multi-head attention layer to align the evidence-related sentence meanings between the fine-grained representation D_QA and the coarse-grained stance representation S_QA.
Specifically, the multi-head attention is computed as:
ATT'_j(D_QA, S_QA) = softmax( (D_QA W'_j^Q)(S_QA W'_j^K)^T / sqrt(d_k) ) (S_QA W'_j^V)
MATT'(D_QA, S_QA) = [ATT'_1(D_QA, S_QA), ..., ATT'_h'(D_QA, S_QA)] W'^O
where h' is the number of attention heads. The final coarse-to-fine stance vector is denoted O = MATT'(D_QA, S_QA), which completes the context alignment.
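The evidence-level alignment above can be sketched as one multi-head cross-attention pass; shapes and random parameters are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def matt_prime(D_QA, S_QA, Wq, Wk, Wv, Wo):
    """MATT'(D_QA, S_QA): each of the h' heads uses the fine-grained
    representation as the query and the coarse-grained stance representation
    as keys and values; head outputs are concatenated and projected by W'_O."""
    d_k = Wq[0].shape[1]
    heads = [softmax((D_QA @ q) @ (S_QA @ k).T / np.sqrt(d_k)) @ (S_QA @ v)
             for q, k, v in zip(Wq, Wk, Wv)]
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(2)
d, h_prime = 8, 2
d_k = d // h_prime
L_fine, L_coarse = 12, 13   # e.g. N + M and N + 1 + M
Wq = [rng.normal(size=(d, d_k)) for _ in range(h_prime)]
Wk = [rng.normal(size=(d, d_k)) for _ in range(h_prime)]
Wv = [rng.normal(size=(d, d_k)) for _ in range(h_prime)]
Wo = rng.normal(size=(d, d))
D_QA = rng.normal(size=(L_fine, d))
S_QA = rng.normal(size=(L_coarse, d))
O = matt_prime(D_QA, S_QA, Wq, Wk, Wv, Wo)  # coarse-to-fine stance vectors
assert O.shape == (L_fine, d)
```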
4. Stance classification
After the stance vector representation is obtained, the final stance classification is performed. In the stance classification part, a linear layer is first used to reduce the dimensionality; the probability of each stance class is then computed with a softmax function, and the class with the highest probability is taken as the stance class of the given QA pair. This part is formulated as:
p = softmax(W_p O + b_p)
where W_p and b_p are the parameters of the linear layer.
The loss function during training is:
L = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{|C|} y_ij log(ŷ_ij)
where ŷ_ij is the predicted probability that sample i belongs to class j; y_ij is 1 when class j is the true label of sample i, and 0 otherwise; N is the size of the training data; and |C| is the number of stance classes, where the set of stance classes is C = {Favor, Against, Neutral}.
(III) Beneficial effects
To verify the effect of the method, the experiments use the open-source dataset proposed in the above recurrent conditional attention network scheme, which contains Chinese question-answer pairs collected from three websites: Baidu Zhidao, Sogou Wenwen, and a public medical Q&A website. The concept-related targets mainly involve pregnancy, food safety, diseases, and the like. The training set contains 10,598 QA pairs and the test set contains 2,993; the per-class statistics of the training and test sets are shown in Table 1.
The evaluation metrics are accuracy, F1-macro, F1-micro, F1-favor, and F1-against, where F1-favor is the F1 value on samples whose stance label is favor, and F1-against is the F1 value on samples whose stance label is against. The method (the HAT model) is compared with several mainstream methods; the specific results are shown in Table 2.
TABLE 1 Dataset statistics
[table reproduced as an image in the original document]
TABLE 2 Experimental results
[table reproduced as an image in the original document]
It can be seen that the model provided by the invention achieves the best result on every evaluation metric and exceeds the performance of several mainstream models, demonstrating the effectiveness of the proposed method.
The above examples are provided only for the purpose of describing the present invention, and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims. Various equivalent substitutions and modifications can be made without departing from the spirit and principles of the invention, and are intended to be included within the scope of the invention.

Claims (10)

1. A question-answering stance detection method with a hierarchical alignment structure, comprising the following steps:
1) converting the question text and the answer text into a question sequence and an answer sequence, respectively;
2) concatenating the question sequence and the answer sequence to obtain a question-answer sequence;
3) inputting the question sequence, the answer sequence, and the question-answer sequence into a hierarchical alignment model to obtain a question-answering stance detection result;
wherein the hierarchical alignment model is obtained through the following steps:
a) converting a number of sample question texts and sample answer texts into sample question sequences and sample answer sequences, respectively, and concatenating each sample question sequence with its corresponding sample answer sequence to obtain a number of sample question-answer sequences;
b) encoding each sample question sequence, sample answer sequence, and sample question-answer sequence to obtain a number of question sequence representations S_Q, answer sequence representations S_A, and coarse-grained stance representations S_QA;
c) using each question sequence representation S_Q as the query and the corresponding answer sequence representation S_A as the keys and values to obtain a number of question-dependent answer representations M_Q→A; using each answer sequence representation S_A as the query and the corresponding question sequence representation S_Q as the keys and values to obtain a number of answer-dependent question representations M_A→Q; and connecting each question-dependent answer representation M_Q→A with the corresponding answer-dependent question representation M_A→Q to obtain a number of fine-grained representations D_QA;
d) aligning, via a multi-head attention mechanism, the evidence-related sentence meanings between each fine-grained representation D_QA and the corresponding coarse-grained stance representation S_QA to obtain a number of coarse-to-fine stance vector representations O;
e) classifying the coarse-to-fine stance vector representations O to obtain the hierarchical alignment model.
2. The method of claim 1, wherein the sample question sequence, the sample answer sequence, and the sample question-answer sequence are encoded using a pre-trained BERT model.
3. The method of claim 1, wherein the question-dependent answer representation M is obtained by the following stepsQ→A
1) Representing S by a question sequenceQAs a query and representing the corresponding answer sequence as SAObtaining an output of a first answer-question matching block as a key and value, comprising the steps of:
a) obtaining the output of the ith head
Figure FDA0002957732540000011
Wherein
Figure FDA0002957732540000012
Figure FDA0002957732540000013
Is that
Figure FDA0002957732540000014
D is sample question text and sample answerThe text is converted into the embedded size of the sample question sequence and the sample answer sequence, h is the number of headers,
Figure FDA0002957732540000015
is a parameter which can be learned, i is more than or equal to 1 and less than or equal to h;
b) splicing the outputs of h heads, and performing linear projection operation on the splicing result to obtain an operation result MATT (S)Q,SA)=[ATT1(SQ,SA),ATT2(SQ,SA),...,ATTh(SQ,SA)WOIn which
Figure FDA0002957732540000016
Are learnable parameters;
c) in question sequence representation SQAnd operation result MATT (S)Q,SA) Performing residual connection between the two to obtain the result Z ═ LN (S)Q+MATT(SQ,SA) LN is a hierarchical normalization operation;
d) accessing the result Z into a feedforward network and another residual connecting layer to obtain the output TIM of the first transform encoder1(SQ,SA) LN (Z + MLP (Z)), where MLP is a feed-forward network;
2) stacking m answer-question matching blocks, the output of each block serving as the query of the next while the answer sequence representation S_A remains the key and value, to obtain the question-dependent answer representation M_Q→A.
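The answer-question matching block of claim 3 can be sketched as follows in NumPy. This is a minimal illustration of the described computation, not the patent's implementation; all shapes, parameter names, and the ReLU feed-forward network are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # LN: hierarchical (layer) normalization over the feature dimension
    return (x - x.mean(axis=-1, keepdims=True)) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def matching_block(S_Q, S_A, params):
    """One answer-question matching block: multi-head attention with
    S_Q as query and S_A as key/value, residual + LN, then a feed-forward
    network with another residual connection (TIM)."""
    heads = []
    for W_q, W_k, W_v in params["heads"]:
        Q, K, V = S_Q @ W_q, S_A @ W_k, S_A @ W_v
        heads.append(softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V)   # ATT_i
    matt = np.concatenate(heads, axis=-1) @ params["W_O"]           # MATT(S_Q, S_A)
    Z = layer_norm(S_Q + matt)                                      # residual + LN
    mlp = np.maximum(0, Z @ params["W_1"]) @ params["W_2"]          # MLP(Z)
    return layer_norm(Z + mlp)                                      # TIM(S_Q, S_A)

def stacked(S_Q, S_A, blocks):
    """Stack m blocks: each block's output is the next block's query."""
    x = S_Q
    for p in blocks:
        x = matching_block(x, S_A, p)
    return x   # question-dependent answer representation M_{Q->A}

d, h, m = 8, 2, 2
rng = np.random.default_rng(1)
blocks = [{
    "heads": [tuple(rng.normal(size=(d, d // h)) for _ in range(3)) for _ in range(h)],
    "W_O": rng.normal(size=(d, d)),
    "W_1": rng.normal(size=(d, 4 * d)),
    "W_2": rng.normal(size=(4 * d, d)),
} for _ in range(m)]
S_Q = rng.normal(size=(5, d))   # question sequence representation
S_A = rng.normal(size=(7, d))   # answer sequence representation
M = stacked(S_Q, S_A, blocks)
print(M.shape)   # (5, 8)
```

The output keeps the query's sequence length and the embedding size d, so the block can be stacked freely.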
4. The method of claim 3, wherein the coarse-to-fine standpoint vector representation O = MATT′(D_QA, S_QA) = [ATT′_1(D_QA, S_QA), ..., ATT′_{h′}(D_QA, S_QA)] W′^O, where
ATT′_j(D_QA, S_QA) = softmax((D_QA W′_j^Q)(S_QA W′_j^K)^T / √(d/h′)) (S_QA W′_j^V),
W′_j^Q, W′_j^K, W′_j^V and W′^O are learnable parameters, 1 ≤ j ≤ h′, and h′ is the number of attention heads.
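The hierarchical alignment of claim 4 is again multi-head cross-attention, now with the fine-grained representation D_QA as query and the coarse-grained standpoint representation S_QA as key and value. A minimal NumPy sketch (all shapes and names are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coarse_to_fine(D_QA, S_QA, heads, W_O):
    """O = MATT'(D_QA, S_QA): the fine-grained question-answer
    representation queries the coarse-grained standpoint representation."""
    outs = []
    for W_q, W_k, W_v in heads:
        Q, K, V = D_QA @ W_q, S_QA @ W_k, S_QA @ W_v
        outs.append(softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V)   # ATT'_j
    return np.concatenate(outs, axis=-1) @ W_O                     # projection W'^O

d, h = 8, 2
rng = np.random.default_rng(3)
D_QA = rng.normal(size=(6, d))   # fine-grained question-answer representation
S_QA = rng.normal(size=(4, d))   # coarse-grained standpoint representation
heads = [tuple(rng.normal(size=(d, d // h)) for _ in range(3)) for _ in range(h)]
W_O = rng.normal(size=(d, d))
O = coarse_to_fine(D_QA, S_QA, heads, W_O)
print(O.shape)   # (6, 8)
```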
5. The method of claim 1, wherein the coarse-to-fine standpoint vector representations O are classified by:
1) calculating, with a softmax function, the probability that a coarse-to-fine standpoint vector representation O belongs to each standpoint category;
2) taking the category with the highest probability as the category of the coarse-to-fine standpoint vector representation O.
6. The method of claim 5, wherein a linear layer is used to reduce the dimensionality of each coarse-to-fine standpoint vector representation O before calculating the probability that O belongs to each standpoint category.
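Claims 5 and 6 together describe a standard linear-projection-plus-softmax classifier. A minimal NumPy sketch, where the input dimension, the pooling of O into a single vector, and the category order are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def classify(O, W, b):
    """Linear layer reduces O to |C| logits; softmax gives per-category
    probabilities; the argmax is the predicted standpoint category."""
    probs = softmax(O @ W + b)
    return probs, int(np.argmax(probs))

categories = ["approve", "disapprove", "neutral"]   # the set C of claim 8
rng = np.random.default_rng(2)
O = rng.normal(size=(16,))                  # coarse-to-fine vector representation
W, b = rng.normal(size=(16, 3)), np.zeros(3)
probs, idx = classify(O, W, b)
print(categories[idx], probs.round(3))
```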
7. The method of claim 1, wherein the loss function for training the hierarchical alignment model is
L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{|C|} y_{i,c} log ŷ_{i,c},
where ŷ_{i,c} is the predicted probability that sample i belongs to standpoint category c, y_{i,c} is the corresponding ground-truth label, N is the number of sample question texts or sample answer texts, and |C| is the size of the set of standpoint categories.
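The training objective of claim 7 is the usual multi-class cross-entropy averaged over samples; a minimal NumPy sketch with toy one-hot labels (the numbers are illustrative, not from the patent):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -(1/N) * sum_i sum_c y_{i,c} * log(yhat_{i,c})."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))

# N = 2 samples, |C| = 3 standpoint categories, one-hot ground truth
y_true = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])
print(cross_entropy(y_true, y_pred))   # -log(0.7) ~= 0.3567
```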
8. The method of claim 7, wherein the set of standpoint categories C comprises: approved, disapproved, and neutral.
9. A storage medium having a computer program stored thereon, wherein the computer program is arranged to, when run, perform the method of any of claims 1-8.
10. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method according to any of claims 1-8.
CN202110230676.XA 2021-03-02 2021-03-02 Question-answering position detection method and device of hierarchical alignment structure Active CN113127599B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110230676.XA CN113127599B (en) 2021-03-02 2021-03-02 Question-answering position detection method and device of hierarchical alignment structure


Publications (2)

Publication Number Publication Date
CN113127599A CN113127599A (en) 2021-07-16
CN113127599B (en) 2022-07-12

Family

ID=76772366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110230676.XA Active CN113127599B (en) 2021-03-02 2021-03-02 Question-answering position detection method and device of hierarchical alignment structure

Country Status (1)

Country Link
CN (1) CN113127599B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558477B (en) * 2018-10-23 2021-03-23 深圳先进技术研究院 Community question-answering system and method based on multitask learning and electronic equipment
US11281863B2 (en) * 2019-04-18 2022-03-22 Salesforce.Com, Inc. Systems and methods for unifying question answering and text classification via span extraction
CN111581979B (en) * 2020-05-06 2022-08-16 西安交通大学 False news detection system and method based on evidence perception layered interactive attention network
CN112256861B (en) * 2020-09-07 2023-09-26 中国科学院信息工程研究所 Rumor detection method based on search engine return result and electronic device
CN112232058B (en) * 2020-10-15 2022-11-04 济南大学 False news identification method and system based on deep learning three-layer semantic extraction framework


Similar Documents

Publication Publication Date Title
Lei et al. Re-examining the role of schema linking in text-to-SQL
Yan et al. Joint learning of response ranking and next utterance suggestion in human-computer conversation system
CN111191002A (en) Neural code searching method and device based on hierarchical embedding
CN112115253B (en) Depth text ordering method based on multi-view attention mechanism
CN109614480B (en) Method and device for generating automatic abstract based on generation type countermeasure network
CN116097250A (en) Layout aware multimodal pre-training for multimodal document understanding
CN111222330B (en) Chinese event detection method and system
CN116992005B (en) Intelligent dialogue method, system and equipment based on large model and local knowledge base
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
Cornia et al. A unified cycle-consistent neural model for text and image retrieval
CN114330483A (en) Data processing method, model training method, device, equipment and storage medium
Gupta et al. A Comparative Analysis of Sentence Embedding Techniques for Document Ranking
Yang et al. Adaptive syncretic attention for constrained image captioning
CN114281931A (en) Text matching method, device, equipment, medium and computer program product
CN111581365B (en) Predicate extraction method
Michael et al. A First Experimental Demonstration of Massive Knowledge Infusion.
Peng et al. MPSC: A multiple-perspective semantics-crossover model for matching sentences
CN113127599B (en) Question-answering position detection method and device of hierarchical alignment structure
Zhu et al. Knowledge-based question answering by jointly generating, copying and paraphrasing
CN116911252A (en) Entity relationship joint extraction method based on relationship attention enhancement and part-of-speech mask
Zhang et al. Dual attention model for citation recommendation with analyses on explainability of attention mechanisms and qualitative experiments
CN112749554B (en) Method, device, equipment and storage medium for determining text matching degree
Li et al. Grading Chinese answers on specialty subjective questions
Huang et al. DFS-NER: Description Enhanced Few-shot NER via Prompt Learning and Meta-Learning
CN111611392B (en) Educational resource reference analysis method, system and medium for integrating multiple features and voting strategies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant