CN112183085A - Machine reading understanding method and device, electronic equipment and computer storage medium - Google Patents
- Publication number
- CN112183085A (application number CN202010955175.3A)
- Authority
- CN
- China
- Prior art keywords
- vector
- article
- attention
- question
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention relates to a machine reading understanding method, device, equipment and medium. The machine reading understanding method comprises the following steps: receiving a question and an article; converting the question into a question vector and the article into an article vector through calculation of a coding layer; calculating the question vector and the article vector through a multiple attention layer to obtain an interaction information vector, wherein the multiple attention layer comprises a trained self-attention model and a plurality of attention matching models; calculating the article through a trained BTM topic model to obtain topic words, and encoding the topic words to obtain a topic feature vector; and calculating the interaction information vector and the topic feature vector through a nonlinear output layer to obtain an answer related to the question. By the method and the device, the problem of low answer accuracy when machine reading understanding uses a single word matching method is solved, and the accuracy of answers obtained by machine reading understanding is improved.
Description
Technical Field
The present invention relates to the field of natural language processing, and in particular, to a machine reading and understanding method and apparatus, an electronic device, and a computer storage medium.
Background
Machine reading understanding is a technique in which algorithms enable a computer to parse the semantics of an article and answer related questions. In the related art, a word matching method matches a question against the words in an article and then obtains answers related to the question from the article. However, a single word matching method makes the interaction information between the question and the words in the article too simple, so the obtained answers may be unrelated to the question and answer accuracy is low.
For the problem in the related art of low answer accuracy when machine reading understanding uses a single word matching method, no effective solution has yet been proposed.
Disclosure of Invention
The embodiment of the application provides a machine reading understanding method, a machine reading understanding device, electronic equipment and a computer storage medium, so as to at least solve the problem that the accuracy of answers is low when a single word matching method is used for machine reading understanding in the related art.
In a first aspect, an embodiment of the present application provides a machine reading understanding method, where the method includes:
receiving a question and an article, converting the question into a question vector through calculation of a coding layer, and converting the article into an article vector through calculation of the coding layer;
calculating the question vector and the article vector through a multiple attention layer to obtain an interaction information vector, wherein the multiple attention layer comprises a trained self-attention model and a plurality of attention matching models;
calculating the article through a trained BTM topic model to obtain topic words, and coding the topic words to obtain topic feature vectors;
and calculating the interaction information vector and the theme characteristic vector through a nonlinear output layer to obtain an answer related to the question.
In some embodiments, the coding layer comprises a BERT model and a preset gated hole convolution layer, and the converting the question into a question vector through coding layer calculation comprises:
learning the question based on the BERT model to obtain a first word vector of each single character in the question, and forming the first word vectors of the single characters into a first intermediate vector;
calculating the first intermediate vector through the gated hole convolution layer to obtain the question vector;
the converting the article into an article vector through coding layer calculation comprises:
learning the article based on the BERT model to obtain a second word vector of each single character in the article, and forming the second word vectors of the single characters into a second intermediate vector;
and calculating the second intermediate vector through the gated hole convolution layer to obtain the article vector.
In some embodiments, the gated hole convolution layer includes a plurality of layers of sequentially connected hole convolution gate units, each hole convolution gate unit is configured to sample an input vector at intervals to obtain an output vector, and use the output vector as an input vector of a next layer of hole convolution gate units.
In some embodiments, where the plurality of attention matching models includes a dot product attention model and a Concat-based attention model, the calculating the question vector and the article vector through multiple attention layers to obtain an interaction information vector includes:
calculating the question vector and the article vector through a trained dot product attention model to obtain a first vector and a second vector, wherein the first vector is the dot product attention vector of the question about the article, and the second vector is the dot product attention vector of the article about the question;
calculating the question vector and the article vector through a trained Concat-based attention model to obtain a third vector and a fourth vector, wherein the third vector is a Concat attention vector of the question about the article, and the fourth vector is a Concat attention vector of the article about the question;
merging the first vector, the second vector, the third vector and the fourth vector to obtain a merged vector;
and calculating the merged vector through the trained self-attention model to obtain the interactive information vector.
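To make the merge-and-self-attend flow concrete, here is a toy numerical sketch; the self-attention below is an unparameterized scaled dot-product stand-in for the patent's trained self-attention model, and all shapes and values are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(m):
    """Unparameterized scaled dot-product self-attention over rows of m."""
    scores = m @ m.T / np.sqrt(m.shape[1])
    return softmax(scores) @ m

# merge the four attention outputs along the feature axis, then self-attend
v1, v2 = np.ones((3, 2)), np.zeros((3, 2))  # toy stand-ins for the
v3, v4 = np.ones((3, 2)), np.zeros((3, 2))  # first..fourth vectors
merged = np.concatenate([v1, v2, v3, v4], axis=1)  # shape (3, 8)
interaction = self_attention(merged)               # interaction information vector
```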
In some of these embodiments, the trained BTM topic model is obtained by:
acquiring an article corpus training set;
sampling the theme distribution on the article corpus training set through a Dirichlet distribution function;
sampling the distribution of terms under a plurality of subjects through the Dirichlet distribution function;
and extracting a target theme from the theme distribution, and extracting word pairs from the target theme to make the word pairs obey the term distribution under the target theme.
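The generative steps above can be sketched with NumPy's Dirichlet sampler; the topic count, vocabulary size, and Dirichlet hyperparameters below are illustrative choices, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_topics, vocab_size = 3, 6  # hypothetical sizes

# sample the corpus-level topic distribution from a Dirichlet
theta = rng.dirichlet(np.full(n_topics, 2.0))
# sample a term distribution under each topic from a Dirichlet
phi = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)

# extract a target topic from the topic distribution, then draw a word
# pair (biterm) obeying the term distribution under that topic
z = rng.choice(n_topics, p=theta)
biterm = (rng.choice(vocab_size, p=phi[z]), rng.choice(vocab_size, p=phi[z]))
```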
In some embodiments, the encoding the topic word to obtain a topic feature vector includes:
acquiring a training corpus, learning the training corpus based on a BERT model to obtain a third word vector of each single word in the training corpus, and forming a word vector library by the third word vector of each single word;
and obtaining a third word vector of each single word in the topic words from the word vector library, and forming the topic feature vector by the third word vector of each single word of the topic words.
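A minimal sketch of this lookup, with a toy word-vector library standing in for the BERT-derived one:

```python
# Form the topic feature vector by looking up each single character of the
# topic words in a (toy) word vector library; unknown characters map to zeros.

def topic_feature_vectors(topic_words, library, dim=3):
    """Collect each character's third word vector from the library."""
    return [library.get(ch, [0.0] * dim)
            for word in topic_words for ch in word]

library = {"a": [0.1, 0.2, 0.3], "b": [0.4, 0.5, 0.6]}
features = topic_feature_vectors(["ab", "ba"], library)  # 4 character vectors
```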
In some embodiments, the nonlinear output layer includes a preset hyperbolic tangent function, and the calculating the interaction information vector and the topic feature vector through the nonlinear output layer to obtain the answer related to the question includes:
and according to the interaction information vector and the theme characteristic vector, carrying out nonlinear mapping through the hyperbolic tangent function to obtain the answer extracted from the article.
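The nonlinear mapping can be sketched as follows; the concatenation of the two vectors and the shape of the output scores are assumptions for illustration, and W and b stand in for trained parameters:

```python
import numpy as np

def nonlinear_output(interaction_vec, topic_vec, W, b):
    """Hyperbolic-tangent nonlinear mapping over the concatenated interaction
    and topic features; the output could score candidate answer positions."""
    z = np.concatenate([interaction_vec, topic_vec])
    return np.tanh(W @ z + b)

interaction_vec = np.array([0.5, -0.5])
topic_vec = np.array([1.0, 0.0])
W = np.zeros((3, 4))  # toy stand-in for a trained weight matrix
b = np.zeros(3)
scores = nonlinear_output(interaction_vec, topic_vec, W, b)
```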
In a second aspect, an embodiment of the present application provides a machine reading and understanding device, including: the system comprises a coding module, a multi-attention module, a theme acquisition module and an answer calculation module;
the encoding module is used for receiving questions and articles, converting the questions into question vectors through encoding layer calculation, and converting the articles into article vectors through the encoding layer calculation;
the multi-attention module is used for calculating the question vector and the article vector through a multi-attention layer to obtain an interactive information vector, wherein the multi-attention layer comprises a trained self-attention model and a plurality of attention matching models;
the topic acquisition module is used for calculating the article through a trained BTM topic model to obtain topic words, and coding the topic words to obtain topic feature vectors;
and the answer calculation module is used for calculating the interaction information vector and the theme characteristic vector through a nonlinear output layer to obtain an answer related to the question.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the processor implements the machine reading understanding method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the machine reading understanding method according to the first aspect.
Compared with the related art, the machine reading understanding method provided by the embodiments of the application receives a question and an article; converts the question into a question vector and the article into an article vector through calculation of a coding layer; calculates the question vector and the article vector through a multiple attention layer, which comprises a trained self-attention model and a plurality of attention matching models, to obtain an interaction information vector; calculates the article through a trained BTM topic model to obtain topic words, and encodes the topic words to obtain a topic feature vector; and calculates the interaction information vector and the topic feature vector through a nonlinear output layer to obtain an answer related to the question. This solves the problem of low answer accuracy when machine reading understanding uses a single word matching method, and improves the accuracy of answers obtained by machine reading understanding.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a machine reading understanding method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a structure of a void convolution gating cell according to an embodiment of the present application;
FIG. 3 is a flow chart of obtaining an interaction information vector according to an embodiment of the application;
FIG. 4 is a flow diagram of a machine reading understanding method in accordance with a preferred embodiment of the present application;
FIG. 5 is a schematic diagram of a machine-readable understanding apparatus according to an embodiment of the present application;
fig. 6 is an internal structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. The words "a", "an", "the", and similar terms in this application do not denote a limitation of quantity and may be singular or plural. The terms "including", "comprising", "having", and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a list of steps or modules (units) is not limited to the listed steps or units but may include other steps or units not expressly listed or inherent to such process, method, product, or device. The words "connected", "coupled", and the like are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect. The term "plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. The terms "first", "second", "third", and the like merely distinguish similar objects and do not denote a particular ordering of the objects.
The embodiment provides a machine reading understanding method. Fig. 1 is a flowchart of a machine reading understanding method according to an embodiment of the present application, as shown in fig. 1, the flowchart includes the following steps:
s110, receiving the question and the article, converting the question into a question vector through calculation of a coding layer, and converting the article into an article vector through calculation of the coding layer. The coding layer is used for respectively carrying out bottom layer processing on the articles and the problems, carrying out digital coding on the texts of the articles and the problems and converting the articles and the problems into information units which can be processed by a computer. The coding layer is used for coding each single word, phrase and sentence in the article and the question on the basis of understanding the context so as to keep the semantic meaning of the original sentence in the article. In the coding layer, the problem and the article are respectively segmented to obtain a plurality of single characters in respective texts, then each single character is converted into a corresponding word vector, and the word vector of each single character is further coded to obtain word level vector representation of the problem and the article. The conversion of each Word into a Word vector may use any one of Word2vec model, glove (global Vectors for Word representation) model, and elmo (embedding from Language models). The word-level vector representation of the questions and articles can be Bi-directionally encoded using a BiGRU (Bi-directional Gated RNN), also known as a Bi-directional Gated round-robin network.
And S120, calculating the question vector and the article vector through multiple attention layers to obtain an interactive information vector. The multi-attention layer includes a trained self-attention model and a plurality of attention matching models. Because of the relevance between the articles and the questions, the connection between the articles and the questions needs to be established so as to obtain answers related to the questions from the articles according to the interaction information between the articles and the questions. By combining the semantics of the articles and the questions together for consideration, the understanding of the questions can be deepened by means of the semantic analysis of the articles, and the understanding of the articles can be deepened by means of the semantic analysis of the questions, so that the semantic relation between the articles and the questions is focused. The attention mechanism in deep learning is similar to the selective visual attention mechanism of human beings in nature, and the core goal is to select information which is more critical to the current task goal from a plurality of information. The multi-attention layer integrates various attention mechanisms, interaction on the word level in the sentence can be enhanced by using a plurality of attention matching models, and interaction information among words can be enriched. And a long-distance dependency relationship can be established by the self-attention mechanism, important word characteristics in the sentence are found, and the method is suitable for establishing the relationship between the long article and the problem. The interaction information vector obtained by multiple attention layers can contain richer and more effective interaction information between the question and the article.
S130, the article is calculated through the trained BTM topic model to obtain topic words, and the topic words are encoded to obtain a topic feature vector. The BTM (Biterm Topic Model) works on word pairs: after a text is segmented, any two words within the window length are paired, and such a pair is called a biterm; after the topic distribution and the word distributions are determined, each topic word of the document is obtained according to the probability that each biterm belongs to each topic. By relaxing the constraint that an entire document must belong to one topic to the constraint that two words within the window length belong to one topic, the BTM topic model can effectively alleviate the sparsity problem. The topic words reflect the topical content of the article, so the topic feature vector contains the topic information of the article.
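The biterm construction can be sketched as follows; the exact window semantics (here, partner index j < i + window) is one plausible reading of the description, not a detail the patent fixes:

```python
# Minimal sketch of BTM biterm extraction: after segmentation, pair any
# two words whose positions fall within the window length.

def extract_biterms(tokens, window=3):
    """Return all word pairs (biterms) within the window length."""
    pairs = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            pairs.append((tokens[i], tokens[j]))
    return pairs

biterms = extract_biterms(["topic", "model", "word", "pair"], window=3)
```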
And S140, calculating the interactive information vector and the theme characteristic vector through a nonlinear output layer to obtain answers related to the questions. By integrating the topic information of the article into the interactive information between the article and the question, the extraction of answers related to the question from the article can be better guided.
Through the above steps, multiple attention mechanisms are fused: a plurality of attention matching models enhance the word-level interaction information between the question and the article and strengthen the relevance between them, while the topic information of the article is fused into the interaction information to guide the extraction of the answer from the article. This improves the degree of relevance between the answer and the question, and solves the problem of low answer accuracy when machine reading understanding uses a single word matching method.
In some embodiments, the coding layer includes a BERT model and a preset gated hole convolution layer. Converting the question into a question vector through coding-layer calculation specifically includes: learning the question based on the BERT model to obtain a first word vector of each single character in the question, and forming the first word vectors into a first intermediate vector Query = {E_q^1, E_q^2, ..., E_q^m}, where m is the number of single characters in question q and E_q^i (i = 1, ..., m) is the first word vector of the i-th character. The first intermediate vector is then calculated through the gated hole convolution layer to obtain the question vector, denoted V_q: V_q = Conv_res(Query), where Conv_res denotes the gated hole convolution layer operation. Similarly, converting the article into an article vector through coding-layer calculation specifically includes: learning the article based on the BERT model to obtain a second word vector of each single character in the article, and forming the second word vectors into a second intermediate vector Document = {E_d^1, E_d^2, ..., E_d^n}, where n is the number of single characters in article d and E_d^j (j = 1, ..., n) is the second word vector of the j-th character. The second intermediate vector is calculated through the gated hole convolution layer to obtain the article vector, denoted V_d: V_d = Conv_res(Document).
The BERT (Bidirectional Encoder Representations from Transformers) model is a deep bidirectional pre-trained language understanding model that uses the Transformer model as a feature extractor; in essence, it learns good feature representations for words by running a self-supervised learning method over massive corpora. The Transformer model is a classic NLP model proposed by the Google team; because it uses a self-attention mechanism rather than the sequential structure of a recurrent neural network, it can be trained in parallel and can capture global information. Therefore, the first intermediate vector learned by the BERT model can effectively express the semantics among the single characters in the question, and likewise the second intermediate vector can effectively express the semantics among the single characters in the article. Hole (dilated) convolution is a variant of convolution; unlike ordinary convolution, it samples the input text at intervals, spanning text segments according to its dilation rate, and by stacking hole convolutions with exponentially increasing dilation rates, most sentence lengths can be covered with fewer layers. Through the BERT model and the gated hole convolution layer, the representation capability of the question vector and the article vector is improved, a more accurate interaction information vector can be obtained from them, and the accuracy of the answer is further improved.
In some embodiments, the gated hole convolution layer includes multiple layers of sequentially connected hole convolution gating units. FIG. 2 is a schematic structural diagram of a hole convolution gating unit according to an embodiment of the present application. As shown in FIG. 2, each hole convolution gating unit includes a hole convolution network and a residual network. Each hole convolution gating unit samples its input vector at intervals to obtain an output vector, which serves as the input vector of the next layer's hole convolution gating unit. The operation of each hole convolution gating unit can be expressed by the following formula: Y = X ⊗ (1 − σ) + Conv1(X) ⊗ σ, with σ = Sig(Conv2(X)), where X is the input vector of the hole convolution gating unit, Y is its output vector, Conv1 denotes convolution operation 1, Conv2 denotes convolution operation 2, Conv1 and Conv2 are both hole convolutions whose set filter number and window size are identical but whose weights are not shared, Sig denotes the sigmoid activation function, and ⊗ denotes element-wise multiplication.
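A numerical sketch of one gating unit is shown below, using the residual gate form Y = X ⊗ (1 − g) + Conv1(X) ⊗ g with g = Sig(Conv2(X)); the scalar channels and hand-set kernels are illustrative stand-ins, not trained parameters from the patent:

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """'Same'-padded 1-D dilated (hole) convolution on a scalar sequence."""
    k = len(w)
    pad = (k - 1) * rate // 2
    xp = np.pad(x, pad)
    return np.array([sum(w[m] * xp[t + m * rate] for m in range(k))
                     for t in range(len(x))])

def gated_unit(x, w1, w2, rate=1):
    """Residual gated dilated convolution: Y = X*(1-g) + Conv1(X)*g,
    where the gate g = sigmoid(Conv2(X))."""
    g = 1.0 / (1.0 + np.exp(-dilated_conv1d(x, w2, rate)))
    return x * (1.0 - g) + dilated_conv1d(x, w1, rate) * g

x = np.array([1.0, 2.0, 3.0, 4.0])
# identity kernel for Conv1 and zero gate logits (g = 0.5 everywhere)
y = gated_unit(x, w1=[0.0, 1.0, 0.0], w2=[0.0, 0.0, 0.0])
```

With these kernels the unit reduces to an even blend of the residual path and an identity convolution, so the output equals the input, which makes the residual pass-through easy to check.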
Ordinary convolution operations can process text in parallel, which markedly shortens model training time, but handling long-distance dependencies well requires stacking many convolution layers, which increases the risk of vanishing gradients. The gated hole convolution layer introduces a residual mechanism on top of the hole convolution network, so that information is transmitted through multiple channels; effective information can be strengthened and useless information suppressed, alleviating the vanishing-gradient problem brought by deep networks.
In some embodiments, fig. 3 is a flowchart of obtaining an interaction information vector according to an embodiment of the present application. As shown in fig. 3, when the plurality of attention matching models include a dot product attention model and a Concat-based attention model, calculating the question vector and the article vector through the multiple attention layer to obtain the interaction information vector includes the following steps:
S310, calculating the question vector and the article vector through the trained dot product attention model to obtain a first vector and a second vector. The first vector is the dot product attention vector of question q with respect to article d, denoted Vm1; the second vector is the dot product attention vector of article d with respect to question q, denoted Vm2. Specifically, the first vector Vm1 is obtained as follows: the two vectors Vq_j and Vd_t are multiplied element-wise (so the dimension stays consistent with the original), a nonlinear transformation is applied through the hyperbolic tangent function, and a linear transformation through vd then yields the raw weight; finally, normalizing the weights gives Vm1, the weighted vector expression of question q with respect to article d. That is, the first vector Vm1 is obtained by the following formulas:
s_{j,t} = vd · tanh(Vq_j ⊙ Vd_t), α_{j,t} = softmax_t(s_{j,t}), Vm1_j = Σ_t α_{j,t} Vd_t,
where Vq_j represents the vector of the jth word in the question vector, Vd_t represents the vector of the tth word in the article vector, vd is a trained parameter, and tanh(·) represents the hyperbolic tangent function. The second vector Vm2 can be obtained in the same way.
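Step S310 can be sketched as follows. This is a NumPy illustration under the assumption that the raw weights are normalised with a softmax over article positions; the function and parameter names are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(Vq, Vd, vd):
    """First vector Vm1: for each question word j, score every article
    word t with vd . tanh(Vq[j] * Vd[t]) (element-wise product keeps
    the original dimension), normalise over t, and take the weighted
    sum of article word vectors."""
    scores = np.tanh(Vq[:, None, :] * Vd[None, :, :]) @ vd  # (m, n)
    alpha = softmax(scores, axis=1)     # weights over article words
    return alpha @ Vd                   # (m, dim)
```

Swapping the roles of Vq and Vd in the call gives the second vector Vm2 in the same way.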
S320, calculating the question vector and the article vector through the trained Concat-based attention model to obtain a third vector and a fourth vector. The third vector is the Concat attention vector of question q with respect to article d, denoted Vm3; the fourth vector is the Concat attention vector of article d with respect to question q, denoted Vm4. Specifically, the third vector Vm3 is obtained as follows: the vector Vq_j mapped through a weight matrix Wq is combined with the vector Vd_t mapped through a weight matrix Wd; to ensure nonlinearity, the combination is passed through the hyperbolic tangent function and then mapped through the vector vc to obtain the raw weight; finally, normalizing the weights gives Vm3, the weighted vector expression of question q with respect to article d. That is, the third vector Vm3 can be obtained by the following formulas:
s_{j,t} = vc · tanh(Wq Vq_j + Wd Vd_t), α_{j,t} = softmax_t(s_{j,t}), Vm3_j = Σ_t α_{j,t} Vd_t,
where Vq_j represents the vector of the jth word in the question vector, Vd_t represents the vector of the tth word in the article vector, Wq, Wd and vc are trained parameters, and tanh(·) represents the hyperbolic tangent function. The fourth vector Vm4 can be obtained in the same way.
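Step S320 is additive (Bahdanau-style) attention, and can be sketched in NumPy as below. The matrix names Wq and Wd stand in for the trained mapping matrices, whose symbols were lost in the translated text; the softmax normalisation is an assumption, as in the previous sketch.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def concat_attention(Vq, Vd, Wq, Wd, vc):
    """Third vector Vm3: additive ('Concat') attention. Both word
    vectors are mapped through weight matrices, combined, passed
    through tanh for nonlinearity, then projected to a scalar by vc."""
    QW = Vq @ Wq                                            # (m, p)
    DW = Vd @ Wd                                            # (n, p)
    scores = np.tanh(QW[:, None, :] + DW[None, :, :]) @ vc  # (m, n)
    alpha = softmax(scores, axis=1)
    return alpha @ Vd
```

Unlike the dot-product variant, the extra projection lets question and article vectors live in different spaces before they are compared.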
S330, merging the first vector, the second vector, the third vector and the fourth vector to obtain a merged vector: Vc = Concat(Vm1, Vm2, Vm3, Vm4), where Vc represents the merged vector, Vm1, Vm2, Vm3 and Vm4 respectively represent the first, second, third and fourth vectors, and Concat(·) represents the merge operation.
S340, calculating the merged vector through the trained self-attention model to obtain the interaction information vector. The interaction information vector is obtained by the following formula:
Vs = Attention(Vc Wc, Vs Ws), where Vc represents the merged vector, Vs represents the interaction information vector, Wc is the weight of the merged vector, Ws is the weight of the interaction information vector, and Attention(·) represents the self-attention operation. The self-attention mechanism can find important word features in a sentence and continuously adjust them during back-propagation, changing their weights. Meanwhile, self-attention can be computed in parallel through matrix multiplication, which accelerates model training.
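The patent's Attention(·) formula is not spelled out in the translated text; a standard scaled dot-product self-attention, sketched below in NumPy with illustrative names, matches both properties claimed here (important-word weighting and matrix-multiplication parallelism).

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Vc, Wq, Wk, Wv):
    """Scaled dot-product self-attention over the merged vector:
    every position attends to every other position, so important word
    features are weighted up, and the whole computation is plain
    matrix multiplication, hence parallelisable."""
    Q, K, V = Vc @ Wq, Vc @ Wk, Vc @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise similarities
    return softmax(scores, axis=-1) @ V      # weighted mix per position
```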
In some embodiments, the trained BTM topic model is obtained by: acquiring an article corpus training set; sampling the topic distribution over the article corpus training set through a Dirichlet distribution function; sampling the term distribution under each of a plurality of topics through a Dirichlet distribution function; and extracting a target topic from the topic distribution and extracting word pairs from the target topic, so that the word pairs obey the term distribution under the target topic.
Specifically, an article corpus training set is constructed from a large-scale article corpus. The word distribution under each topic is sampled from the parameter β through a Dirichlet distribution, φ_k ~ Dir(β), k = 1, ..., K, where K is the number of topics and Dir(·) represents a Dirichlet distribution function. The topic distribution θ ~ Dir(α) of the article corpus training set is sampled from the Dirichlet distribution with parameter α. A target topic z is extracted from the corpus-level parameter θ and obeys z ~ Mult(θ), where Mult(·) represents a multinomial distribution function. A word pair in the corpus is set as b = (w_i, w_j); the two words w_i and w_j are extracted for the sampled topic z and obey w_i, w_j ~ Mult(φ_z).
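The BTM generative process described above can be sketched directly with NumPy's Dirichlet and categorical samplers; parameter names and sizes below are illustrative.

```python
import numpy as np

def btm_generate(n_biterms, vocab_size, n_topics, alpha, beta, seed=0):
    """Generative process of the Biterm Topic Model (BTM):
    per-topic word distributions phi_k ~ Dir(beta), a corpus-level
    topic mixture theta ~ Dir(alpha), and each word pair (biterm)
    drawn by sampling one topic z ~ Mult(theta) and then two words
    from Mult(phi_z)."""
    rng = np.random.default_rng(seed)
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)  # topic-word
    theta = rng.dirichlet(np.full(n_topics, alpha))                # topic mixture
    biterms = []
    for _ in range(n_biterms):
        z = rng.choice(n_topics, p=theta)                 # pick a topic
        w1, w2 = rng.choice(vocab_size, size=2, p=phi[z]) # two words, same topic
        biterms.append((int(z), int(w1), int(w2)))
    return theta, phi, biterms
```

Note the key modelling choice: both words of a biterm share one topic draw, which is what lets BTM work on short texts where per-document topic mixtures are too sparse.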
In some embodiments, the topic words are encoded into topic feature vectors as follows: a training corpus is obtained, the training corpus is learned based on a BERT model to obtain a third word vector for each individual character in the training corpus, and these third word vectors form a word vector library. The third word vector of each individual character in the topic words is then obtained from the word vector library, and these vectors form the topic feature vector. The topic feature vector obtained based on the BERT model can effectively express the semantics among the individual characters in the topic words, improving its representation capability; integrating the topic feature vector with the interaction information vector can further improve the accuracy of the answer.
In some embodiments, the nonlinear output layer includes a preset hyperbolic tangent function. Calculating the interaction information vector and the topic feature vector through the nonlinear output layer to obtain the answer related to the question includes: performing nonlinear mapping through the hyperbolic tangent function according to the interaction information vector and the topic feature vector to obtain the answer extracted from the article. The answer is predicted by the following formula:
gIR(d, q) = tanh(Ws Vs + Wt Vt), where Vs represents the interaction information vector, Vt represents the topic feature vector, Ws and Wt are trained parameters, tanh(·) represents the hyperbolic tangent function, and gIR(d, q) represents the answer related to question q extracted from article d.
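The output formula can be sketched as a per-position scoring function. The shapes below are an assumption (one interaction vector per article position, one global topic feature vector), since the patent does not fix them; a span extractor would then pick the highest-scoring positions.

```python
import numpy as np

def answer_scores(Vs, Vt, Ws, Wt):
    """gIR(d, q) = tanh(Ws*Vs + Wt*Vt): combine the interaction
    vectors (one per article position) with the topic feature vector
    through trained projections, then squash with tanh to get one
    bounded score per position. Shapes are illustrative."""
    return np.tanh(Vs @ Ws + Vt @ Wt)  # (n_positions, 1), broadcast over positions
```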
The embodiments of the present application are described and illustrated below by means of preferred embodiments.
Fig. 4 is a flow chart of a machine reading understanding method according to the preferred embodiment of the present application. As shown in fig. 4, the parameters of the models and functions in the coding layer, the multiple attention layer and the nonlinear output layer first need to be trained. The training process includes the following steps. Product documents in a specific field are obtained based on a large number of customer-service question-answer corpora; after the corpora are cleaned and segmented, word vectors are pre-trained with a BERT model to generate word vectors for the specific vertical field. The parameters for training the word vector learning model are set, including the dimension of the word vectors, batch parameters, window size, initial learning rate, word vector matrices, auxiliary vector matrices, and so on. For large-scale training data comprising a training set and a development set, the training questions and training articles in the training set are segmented and encoded to obtain word-level vector representations of the training questions and of the training articles, respectively. The word-level vector representations of the training questions and the training articles are then each calculated through a gated hole convolution layer, which includes multiple layers of sequentially connected hole convolution gating units, yielding the training question vectors and training article vectors.
The training question vector and the training article vector are then input into the dot product attention model and the Concat-based attention model for calculation, obtaining the training dot product attention vector V1 and the training Concat attention vector V3 of the training question with respect to the training article, and the training dot product attention vector V2 and the training Concat attention vector V4 of the training article with respect to the training question. V1, V2, V3 and V4 are merged to obtain a training merged vector Vc-train, and Vc-train is calculated through the self-attention model to obtain a training interaction information vector. The training articles are calculated through the trained BTM topic model to obtain training topic words, which are encoded into training topic feature vectors. The training interaction information vector and the training topic feature vector are combined nonlinearly for answer prediction. Finally, the parameters of the models and functions in the coding layer, the multiple attention layer and the nonlinear output layer are trained on the training set; the parameters with the best F1 score on the development set are selected and saved, completing the training.
Then, a received question and article are input into the models and functions with trained parameters, so that the correct answer to the question can be extracted from the article. The prediction process includes the following steps. The received question and article are converted into a question vector Vq and an article vector Vd, respectively, through the BERT model and the gated hole convolution layer with trained parameters in the coding layer. The question vector Vq and the article vector Vd are input into the trained dot product attention model and Concat-based attention model in the multiple attention layer to obtain a first vector Vm1, a second vector Vm2, a third vector Vm3 and a fourth vector Vm4; Vm1, Vm2, Vm3 and Vm4 are merged and input into the trained self-attention model for calculation to obtain an interaction information vector Vs. The article is calculated through the trained BTM topic model and the result is encoded to obtain a topic feature vector Vt. From the topic feature vector Vt and the interaction information vector Vs, the answer related to the question is obtained from the article through the hyperbolic tangent function with trained parameters in the nonlinear output layer.
The embodiment of the application provides a machine reading understanding apparatus. Fig. 5 is a schematic structural diagram of a machine reading understanding apparatus according to an embodiment of the present application. As shown in fig. 5, the apparatus includes an encoding module 510, a multiple attention module 520, a topic acquisition module 530 and an answer calculation module 540: the encoding module 510 is configured to receive a question and an article, convert the question into a question vector through encoding layer calculation, and convert the article into an article vector through the encoding layer calculation; the multiple attention module 520 is configured to calculate the question vector and the article vector through the multiple attention layer to obtain an interaction information vector, where the multiple attention layer includes a trained self-attention model and a plurality of attention matching models; the topic acquisition module 530 is configured to calculate the article through a trained BTM topic model to obtain topic words, and encode the topic words to obtain topic feature vectors; and the answer calculation module 540 is configured to calculate the interaction information vector and the topic feature vector through a nonlinear output layer to obtain an answer related to the question.
For specific limitations of the machine reading understanding apparatus, reference may be made to the above limitations of the machine reading understanding method, which are not described herein again. The various modules in the machine reading and understanding apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
An embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the steps in any of the method embodiments described above.
Optionally, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the machine reading understanding method in the foregoing embodiments, the embodiments of the present application may provide a storage medium for implementation. The storage medium stores a computer program; when the computer program is executed by a processor, it implements any of the machine reading understanding methods of the above embodiments.
In an embodiment, fig. 6 is a schematic diagram of the internal structure of an electronic device according to an embodiment of the present application. As shown in fig. 6, an electronic device is provided, which may be a server. The electronic device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the electronic device is used for storing data. The network interface of the electronic device is used for connecting and communicating with an external terminal through a network. The computer program is executed by the processor to implement a machine reading understanding method.
Those skilled in the art will appreciate that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application, and does not constitute a limitation on the electronic device to which the present application is applied, and a particular electronic device may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.
It will be understood by those skilled in the art that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A machine reading understanding method, the method comprising:
receiving a question and an article, converting the question into a question vector through calculation of a coding layer, and converting the article into an article vector through calculation of the coding layer;
calculating the problem vector and the article vector through a multiple attention layer to obtain an interactive information vector, wherein the multiple attention layer comprises a trained self-attention model and a plurality of attention matching models;
calculating the article through a trained BTM topic model to obtain topic words, and coding the topic words to obtain topic feature vectors;
and calculating the interaction information vector and the theme characteristic vector through a nonlinear output layer to obtain an answer related to the question.
2. The method of claim 1, wherein the coding layer comprises a BERT model and a preset gated hole convolution layer, and wherein converting the problem into a problem vector by coding layer computation comprises:
learning the problem based on the BERT model to obtain a first word vector of each single word in the problem, and forming a first intermediate vector by the first word vector of each single word;
calculating the first intermediate vector through the gated hole convolution layer to obtain the question vector;
the converting the article into an article vector through the encoding layer calculation comprises:
learning the article based on the BERT model to obtain second word vectors of the individual characters in the article, and forming second intermediate vectors by the second word vectors of the individual characters;
and calculating the second intermediate vector through the gated hole convolution layer to obtain the article vector.
3. The method according to claim 2, wherein the gated hole convolution layer comprises a plurality of layers of sequentially connected hole convolution gate units, each hole convolution gate unit is configured to sample an input vector at intervals to obtain an output vector, and use the output vector as an input vector of a next layer of hole convolution gate units.
4. The method of claim 1, wherein in a case where the plurality of attention matching models includes a dot product attention model and a Concat-based attention model, the calculating the question vector and the article vector through multiple attention layers to obtain an interaction information vector comprises:
calculating the question vector and the article vector through a trained dot product attention model to obtain a first vector and a second vector, wherein the first vector is the dot product attention vector of the question about the article, and the second vector is the dot product attention vector of the article about the question;
calculating the question vector and the article vector through a trained Concat-based attention model to obtain a third vector and a fourth vector, wherein the third vector is a Concat attention vector of the question about the article, and the fourth vector is a Concat attention vector of the article about the question;
merging the first vector, the second vector, the third vector and the fourth vector to obtain a merged vector;
and calculating the merged vector through the trained self-attention model to obtain the interactive information vector.
5. The method of claim 1, wherein the trained BTM topic model is obtained by:
acquiring an article corpus training set;
sampling the theme distribution on the article corpus training set through a Dirichlet distribution function;
sampling the distribution of terms under a plurality of subjects through the Dirichlet distribution function;
and extracting a target theme from the theme distribution, and extracting word pairs from the target theme to make the word pairs obey the term distribution under the target theme.
6. The method of claim 1, wherein encoding the topic terms into topic feature vectors comprises:
acquiring a training corpus, learning the training corpus based on a BERT model to obtain a third word vector of each single word in the training corpus, and forming a word vector library by the third word vector of each single word;
and obtaining a third word vector of each single word in the topic words from the word vector library, and forming the topic feature vector by the third word vector of each single word of the topic words.
7. The method of claim 1, wherein the nonlinear output layer comprises a preset hyperbolic tangent function, and wherein the calculating the interaction information vector and the topic feature vector through the nonlinear output layer to obtain the answer related to the question comprises:
and according to the interaction information vector and the theme characteristic vector, carrying out nonlinear mapping through the hyperbolic tangent function to obtain the answer extracted from the article.
8. A machine reading understanding apparatus, the apparatus comprising: the system comprises a coding module, a multi-attention module, a theme acquisition module and an answer calculation module;
the encoding module is used for receiving questions and articles, converting the questions into question vectors through encoding layer calculation, and converting the articles into article vectors through the encoding layer calculation;
the multi-attention module is used for calculating the question vector and the article vector through a multi-attention layer to obtain an interactive information vector, wherein the multi-attention layer comprises a trained self-attention model and a plurality of attention matching models;
the topic acquisition module is used for calculating the article through a trained BTM topic model to obtain topic words, and coding the topic words to obtain topic feature vectors;
and the answer calculation module is used for calculating the interaction information vector and the theme characteristic vector through a nonlinear output layer to obtain an answer related to the question.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the machine reading understanding method of any of claims 1 to 7 when executing the computer program.
10. A computer storage medium on which a computer program is stored, the program, when executed by a processor, implementing a machine reading understanding method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010955175.3A CN112183085A (en) | 2020-09-11 | 2020-09-11 | Machine reading understanding method and device, electronic equipment and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010955175.3A CN112183085A (en) | 2020-09-11 | 2020-09-11 | Machine reading understanding method and device, electronic equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112183085A true CN112183085A (en) | 2021-01-05 |
Family
ID=73920607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010955175.3A Pending CN112183085A (en) | 2020-09-11 | 2020-09-11 | Machine reading understanding method and device, electronic equipment and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112183085A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112966499A (en) * | 2021-03-17 | 2021-06-15 | 中山大学 | Question and answer matching method based on self-adaptive fusion multi-attention network |
CN113705664A (en) * | 2021-08-26 | 2021-11-26 | 南通大学 | Model, training method and surface electromyographic signal gesture recognition method |
CN114398976A (en) * | 2022-01-13 | 2022-04-26 | 福州大学 | Machine reading understanding method based on BERT and gate control type attention enhancement network |
CN114492451A (en) * | 2021-12-22 | 2022-05-13 | 马上消费金融股份有限公司 | Text matching method and device, electronic equipment and computer readable storage medium |
CN114564562A (en) * | 2022-02-22 | 2022-05-31 | 平安科技(深圳)有限公司 | Question generation method, device and equipment based on answer guidance and storage medium |
CN115169367A (en) * | 2022-09-06 | 2022-10-11 | 杭州远传新业科技股份有限公司 | Dialogue generating method and device, and storage medium |
CN114398976B (en) * | 2022-01-13 | 2024-06-07 | 福州大学 | Machine reading and understanding method based on BERT and gating type attention enhancement network |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109460553A (en) * | 2018-11-05 | 2019-03-12 | 中山大学 | A kind of machine reading understanding method based on thresholding convolutional neural networks |
CN109492227A (en) * | 2018-11-16 | 2019-03-19 | 大连理工大学 | It is a kind of that understanding method is read based on the machine of bull attention mechanism and Dynamic iterations |
CN109657226A (en) * | 2018-09-20 | 2019-04-19 | 北京信息科技大学 | The reading of multi-joint knot attention understands model, system and method |
CN109657246A (en) * | 2018-12-19 | 2019-04-19 | 中山大学 | A kind of extraction-type machine reading based on deep learning understands the method for building up of model |
CN109739986A (en) * | 2018-12-28 | 2019-05-10 | 合肥工业大学 | A kind of complaint short text classification method based on Deep integrating study |
CN110134771A (en) * | 2019-04-09 | 2019-08-16 | 广东工业大学 | A kind of implementation method based on more attention mechanism converged network question answering systems |
CN110309305A (en) * | 2019-06-14 | 2019-10-08 | 中国电子科技集团公司第二十八研究所 | Machine based on multitask joint training reads understanding method and computer storage medium |
CN110334184A (en) * | 2019-07-04 | 2019-10-15 | 河海大学常州校区 | The intelligent Answer System understood is read based on machine |
CN110619123A (en) * | 2019-09-19 | 2019-12-27 | 电子科技大学 | Machine reading understanding method |
CN110633472A (en) * | 2019-09-19 | 2019-12-31 | 电子科技大学 | Article and question fusion method based on attention and aggregation mechanism |
- 2020-09-11 CN CN202010955175.3A patent/CN112183085A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109947912B (en) | | Model method based on intra-paragraph reasoning and joint question-answer matching |
CN112183085A (en) | | Machine reading understanding method and device, electronic equipment and computer storage medium |
CN108427771B (en) | | Abstract text generation method and device and computer equipment |
CN109597891B (en) | | Text emotion analysis method based on bidirectional long short-term memory neural network |
CN111783474B (en) | | Comment text viewpoint information processing method and device and storage medium |
CN107798140B (en) | | Dialog system construction method, semantically controlled response method and device |
CN110928997A (en) | | Intention recognition method and device, electronic equipment and readable storage medium |
Ferrer-i-Cancho et al. | | Optimal coding and the origins of Zipfian laws |
CN110598779A (en) | | Abstract description generation method and device, computer equipment and storage medium |
CN110969020A (en) | | Chinese named entity recognition method, system and medium based on CNN and attention mechanism |
CN111966812B (en) | | Automatic question answering method based on dynamic word vectors and storage medium |
CN111680494A (en) | | Similar text generation method and device |
CN112560502B (en) | | Semantic similarity matching method and device and storage medium |
CN110795944A (en) | | Recommended content processing method and device, and emotion attribute determination method and device |
CN112528637A (en) | | Text processing model training method and device, computer equipment and storage medium |
CN111259113A (en) | | Text matching method and device, computer-readable storage medium and computer equipment |
CN114707005B (en) | | Knowledge graph construction method and system for ship equipment |
CN113536795A (en) | | Method, system, electronic device and storage medium for entity relation extraction |
CN115795044A (en) | | User relationship mining method and device based on knowledge injection |
CN116050352A (en) | | Text encoding method and device, computer equipment and storage medium |
CN110298046B (en) | | Translation model training method, text translation method and related device |
CN111723572B (en) | | Chinese short text correlation measurement method based on CNN convolutional layers and BiLSTM |
CN113806646A (en) | | Sequence labeling system and training system for sequence labeling model |
CN111783430A (en) | | Sentence pair matching rate determination method and device, computer equipment and storage medium |
CN116109980A (en) | | Action recognition method based on video-text matching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |