CN115936000A - Discourse relation identification method, system, equipment and computer storage medium - Google Patents

Discourse relation identification method, system, equipment and computer storage medium

Info

Publication number
CN115936000A
CN115936000A (application CN202111325913.7A)
Authority
CN
China
Prior art keywords
vector
texts
sections
argument
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111325913.7A
Other languages
Chinese (zh)
Inventor
袁鹏
李浩然
Current Assignee
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN202111325913.7A priority Critical patent/CN115936000A/en
Publication of CN115936000A publication Critical patent/CN115936000A/en
Pending legal-status Critical Current

Classifications

    • Y — General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02 — Technologies or applications for mitigation or adaptation against climate change
    • Y02D — Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a discourse relation identification method, system, equipment and computer storage medium. Two sections of text describing objects are acquired; the two sections of text are input into the pre-trained language model BERT for encoding, and a text semantic representation is output; Brown cluster mapping is performed on the two sections of text to obtain a Brown cluster vector; the two sections of text are marked as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector; the first argument vector and the second argument vector are combined to obtain the argument vector of the two sections of text; a CLS vector is obtained based on the text semantic representation, the Brown cluster vector and the argument vector; and the discourse relation of the two sections of text is predicted using the CLS vector. In this scheme, BERT encodes the two sections of text, and the CLS vector obtained from the text semantic representation, the Brown cluster vector and the argument vector is used to predict the discourse relation of the two sections of text, which improves the accuracy of discourse relation identification.

Description

Discourse relation identification method, system, equipment and computer storage medium
Technical Field
The present invention relates to the field of natural language processing, and in particular to a discourse relation identification method, system, device and computer storage medium.
Background
Discourse relation identification refers to the task of identifying the discourse relation between two given sections of text. For example, given the two inputs "Today the weather is good" and "Yesterday there was also lightning and thunder", the discourse relation between the two sections of text is predicted to be: contrast.
In the prior art, discourse relation identification is usually realized with Brown clustering features. However, because such methods add the Brown clusters to the discourse relation classifier only as discrete features, the accuracy of discourse relation identification is not high.
Disclosure of Invention
In view of this, embodiments of the present invention provide a discourse relation identification method, system, device and computer storage medium, so as to improve the accuracy of discourse relation identification.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
The first aspect of the embodiments of the present invention discloses a discourse relation identification method, which comprises the following steps:
acquiring two sections of text describing an object;
inputting the two sections of text into the pre-trained language model BERT for encoding, and outputting a text semantic representation;
performing Brown cluster mapping on the two sections of text to obtain a Brown cluster vector;
marking the two sections of text as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of text;
combining the first argument vector and the second argument vector to obtain the argument vector of the two sections of text;
obtaining a classification (CLS) vector based on the text semantic representation, the Brown cluster vector and the argument vector;
and predicting the discourse relation of the two sections of text by using the CLS vector to obtain the discourse relation of the two sections of text.
Optionally, performing Brown cluster mapping on the two sections of text to obtain a Brown cluster vector includes:
for each section of text, establishing the category indexes pointing to the Brown clusters of the text;
obtaining a vector corresponding to each category index based on the category index of each Brown cluster;
and combining the vectors corresponding to the category indexes to obtain the Brown cluster vector.
Optionally, obtaining a vector corresponding to each category index based on the category index of each Brown cluster includes:
obtaining, based on the category index of each Brown cluster, the word vectors in the text that share the same category index;
and summing the word vectors sharing the same category index in the text, and taking the average of the sum as the vector corresponding to that category index.
Optionally, predicting the discourse relation of the two sections of text by using the CLS vector to obtain the discourse relation of the two sections of text includes:
taking the CLS vector as the input of a logistic regression (Softmax) model to predict the discourse relation of the two sections of text, and outputting the discourse relation of the two sections of text.
Optionally, the method further includes:
taking the discourse relation of the two sections of text and the standard class label as the input of a loss function, and calculating a loss value of the discourse relation of the two sections of text.
The second aspect of the embodiments of the present invention discloses a discourse relation identification system, which comprises:
an acquisition module, used for acquiring two sections of text describing an object;
an encoding module, used for inputting the two sections of text into the pre-trained language model BERT for encoding and outputting a text semantic representation;
a mapping module, used for performing Brown cluster mapping on the two sections of text to obtain a Brown cluster vector;
a marking module, used for marking the two sections of text as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of text;
a combining module, used for combining the first argument vector and the second argument vector to obtain the argument vector of the two sections of text;
an obtaining module, used for obtaining a CLS vector based on the text semantic representation, the Brown cluster vector and the argument vector;
and a prediction module, used for predicting the discourse relation of the two sections of text by using the CLS vector to obtain the discourse relation of the two sections of text.
Optionally, the mapping module is specifically configured to:
for each section of text, establish the category indexes pointing to the Brown clusters of the text; obtain a vector corresponding to each category index based on the category index of each Brown cluster; and combine the vectors corresponding to the category indexes to obtain the Brown cluster vector.
Optionally, the prediction module is specifically configured to:
take the CLS vector as the input of a logistic regression (Softmax) model to predict the discourse relation of the two sections of text, and output the discourse relation of the two sections of text.
The third aspect of the embodiments of the present invention discloses an electronic device configured to run a program, wherein when the program runs, the discourse relation identification method according to any one of the first aspect is executed.
The fourth aspect of the embodiments of the present invention discloses a computer storage medium storing a program, wherein when the program runs, a device on which the storage medium resides is controlled to execute the discourse relation identification method according to any one of the first aspect.
Based on the discourse relation identification method, system, equipment and computer storage medium provided by the embodiments of the present invention, the method comprises: acquiring two sections of text describing an object; inputting the two sections of text into the pre-trained language model BERT for encoding, and outputting a text semantic representation; performing Brown cluster mapping on the two sections of text to obtain a Brown cluster vector; marking the two sections of text as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector; combining the first argument vector and the second argument vector to obtain the argument vector of the two sections of text; obtaining a CLS vector based on the text semantic representation, the Brown cluster vector and the argument vector; and predicting the discourse relation of the two sections of text by using the CLS vector. In this scheme, the pre-trained language model BERT encodes the two sections of text, and the CLS vector obtained from the text semantic representation, the Brown cluster vector and the argument vector is used to predict the discourse relation of the two sections of text, which improves the accuracy of discourse relation identification.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a schematic flow chart of a discourse relation identification method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a process for obtaining a Brown cluster vector according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of a process for obtaining the vector corresponding to each category index according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of another discourse relation identification method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a discourse relation identification system according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of another discourse relation identification system according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device 70 according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
As can be seen from the background, when the discourse relation is identified with the existing Brown-clustering-based method, the accuracy of discourse relation identification is not high.
In this scheme, the pre-trained language model BERT is used to encode the two sections of text, and the CLS vector obtained from the text semantic representation, the Brown cluster vector and the argument vector is used to predict the discourse relation of the two sections of text, thereby improving the accuracy of discourse relation identification.
As shown in Fig. 1, which is a schematic flow chart of a discourse relation identification method according to an embodiment of the present invention, the method mainly includes the following steps:
step S101: two pieces of text describing the object are obtained.
In step S101, the objects may be the same object or different objects, and the present invention is not limited thereto.
For example, there are two pieces of text: the screen of the computer is large, and the screen of the mobile phone is small. At this time, the objects in the two text sections are computers and mobile phones and are different objects.
As another example, there are two pieces of text: the computer has a large screen and a large memory. At this time, the objects in the two texts are computers and are the same object.
In the process of implementing the step S101, when performing chapter relationship identification, two pieces of text for performing chapter relationship identification need to be determined, and therefore, two pieces of text for describing an object need to be obtained.
Step S102: input the two sections of text into the pre-trained language model BERT for encoding, and output a text semantic representation.
In step S102, the pre-trained language model BERT is a bidirectional Transformer encoding model that can use the information of the preceding and following words simultaneously.
The text semantic representation may be written as h = (h_0, h_1, h_2, ..., h_N), where N is the total length of the two sections of text, that is, the number of words the two sections of text contain.
It should be noted that in the original pre-trained language model BERT, the input representation of each word in the text is the sum of two parts, a word vector and a position vector, as shown in formula (1):
e(a) = w(a) + p(a)     (1),
where a is a word, e(a) is the input vector of a, w(a) is the word vector of a, and p(a) is the position vector of a.
In implementing step S102, the two acquired sections of text are taken as the input of the pre-trained language model BERT and encoded to obtain the text semantic representation, which is then output.
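Formula (1) can be sketched as a toy computation. The 4-dimensional random embeddings below are purely illustrative; real BERT learns its word and position embeddings during pre-training.

```python
# Toy sketch of formula (1): e(a) = w(a) + p(a).
# The random embeddings are illustrative, not BERT's learned ones.
import random

DIM = 4
random.seed(0)

def make_table(keys):
    """One random DIM-dimensional vector per key."""
    return {k: [random.uniform(-1.0, 1.0) for _ in range(DIM)] for k in keys}

words = list("今天天气很好")
w = make_table(set(words))          # word vectors w(a)
p = make_table(range(len(words)))   # position vectors p(i), one per position

# e(a) = w(a) + p(a), element-wise, for each position i
e = [[wv + pv for wv, pv in zip(w[a], p[i])] for i, a in enumerate(words)]
```

In real BERT the two tables are trained jointly with the encoder; only the element-wise addition of formula (1) is shown here.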
Step S103: perform Brown cluster mapping on the two sections of text to obtain a Brown cluster vector.
In step S103, if a is a word, the Brown cluster vector of a can be denoted b(a).
In implementing step S103, the two acquired sections of text are mapped to obtain their Brown cluster vector, which is derived from the category indexes of the Brown clusters corresponding to all words in the two sections of text.
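The mapping stage can be sketched as a table lookup, assuming a pre-computed Brown clustering is available as a word-to-index table. The table, the tokens, and the fallback index for unknown words below are all illustrative assumptions.

```python
# Look up each token's Brown cluster category index in a pre-computed
# clustering table. Real Brown clusters are induced from a corpus; this
# toy table is made up for illustration.
brown_clusters = {"今": 1, "天": 2, "气": 3, "很": 4, "好": 5}

def cluster_indices(tokens, table, unk=0):
    # Unknown tokens fall back to a reserved index (an assumption here).
    return [table.get(t, unk) for t in tokens]

idx = cluster_indices(list("今天天气很好"), brown_clusters)
# idx == [1, 2, 2, 3, 4, 5]
```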
Optionally, step S103 of performing Brown cluster mapping on the two sections of text proceeds as shown in Fig. 2, a schematic flow chart for obtaining a Brown cluster vector according to an embodiment of the present invention, and mainly includes the following steps:
Step S201: for each section of text, establish the category indexes pointing to the Brown clusters of the text.
In implementing step S201, the Brown clustering of the text is introduced first, and the category index of the Brown cluster to which each word of the text belongs is then determined, so that the category indexes pointing to the Brown clusters of the text can be established.
Step S202: obtain a vector corresponding to each category index based on the category index of each Brown cluster.
In implementing step S202, the word vectors sharing the same category index in the text are obtained according to the category index of each Brown cluster, and the vector corresponding to each category index is computed from those word vectors.
Optionally, step S202 of obtaining a vector corresponding to each category index proceeds as shown in Fig. 3, a schematic flow chart for obtaining the vector corresponding to each category index according to an embodiment of the present invention, and mainly includes the following steps:
Step S301: based on the category index of each Brown cluster, obtain the word vectors sharing the same category index in the text.
In implementing step S301, the word vectors sharing the same category index are found in the text according to the category index of each Brown cluster.
For example, suppose the text contains the three words "you", "I" and "he", and the category index of the Brown cluster corresponding to each of "you", "I" and "he" is 1. The words sharing the same category index in this text are therefore "you", "I" and "he", whose word vectors are w(you), w(I) and w(he), respectively.
Step S302: sum the word vectors sharing the same category index in the text, and take the average of the sum as the vector corresponding to each category index.
In implementing step S302, all word vectors in the text that share the same category index are added together, and the average of the sum is taken as the vector corresponding to that category index.
Continuing the example in step S301, the word vectors of the words sharing category index 1 ("you", "I" and "he") are added together and averaged, and the average is taken as the vector corresponding to category index 1, as shown in formula (2):
b_1 = (w(you) + w(I) + w(he)) / 3     (2),
where w(you), w(I) and w(he) are the word vectors corresponding to "you", "I" and "he", respectively.
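Steps S301 and S302 can be sketched together as a group-and-average operation. The 2-dimensional word vectors and the cluster assignments below are made up for illustration.

```python
# Group word vectors by Brown cluster category index, then average each
# group, as in formula (2). Vectors and assignments are illustrative.
from collections import defaultdict

word_vectors = {"you": [1.0, 2.0], "I": [3.0, 4.0], "he": [5.0, 0.0]}
cluster_index = {"you": 1, "I": 1, "he": 1}  # all three share index 1

groups = defaultdict(list)
for word, idx in cluster_index.items():
    groups[idx].append(word_vectors[word])

# b_k = element-wise mean of all word vectors with category index k
b = {idx: [sum(dims) / len(vecs) for dims in zip(*vecs)]
     for idx, vecs in groups.items()}
# b[1] == [(1+3+5)/3, (2+4+0)/3] == [3.0, 2.0]
```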
Step S203: combine the vectors corresponding to the category indexes to obtain the Brown cluster vector.
In implementing step S203, the vectors corresponding to the category indexes obtained above are combined to obtain the Brown cluster vector.
The vectors corresponding to the category indexes are combined on the basis of the hierarchical clustering algorithm underlying Brown clustering to obtain the Brown cluster vector.
Step S104: mark the two sections of text as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of text.
In step S104, if a is a word, the argument vector of a can be denoted r(a).
In implementing step S104, the first of the two acquired sections of text is marked as the first argument to obtain the first argument vector, and the second section of text is marked as the second argument to obtain the second argument vector.
For example, suppose the first section of text is "Today the weather is good" and the second section is "Yesterday there was also lightning and thunder". The first section of text is marked as the first argument; its words are denoted "a" and distinguished by subscripts, giving the first argument vector r(a_1, a_2, a_3, a_4, a_5, a_6) corresponding to "Today the weather is good". The second section of text is marked as the second argument; its words are denoted "b" and likewise distinguished by subscripts, giving the second argument vector r(b_1, b_2, b_3, b_4, b_5, b_6, b_7) corresponding to "Yesterday there was also lightning and thunder".
Step S105: combine the first argument vector and the second argument vector to obtain the argument vector of the two sections of text.
In implementing step S105, the first argument vector corresponding to the first section of text and the second argument vector corresponding to the second section of text are merged to obtain the final argument vector, i.e., the argument vector of the two sections of text.
Continuing the example of step S104, the first argument vector r(a_1, a_2, a_3, a_4, a_5, a_6) and the second argument vector r(b_1, b_2, b_3, b_4, b_5, b_6, b_7) are merged to obtain the argument vector of the two sections of text "Today the weather is good; yesterday there was also lightning and thunder".
[The original publication shows the merged argument vector in an image at this point.]
It should be noted that the final argument vector of the two sections of text contains the argument vector of every word in the first section of text and every word in the second section of text; the "0" entries in the merged argument vector are placeholders, used to describe the final argument vector more intuitively.
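Steps S104–S105 can be sketched as follows: tag every word of the first section as an "a" argument slot and every word of the second section as a "b" slot, then concatenate the two sequences. The (tag, position) encoding is an illustrative assumption; the patent does not fix a concrete encoding for the argument vectors.

```python
# Mark each section's words as first/second argument slots and merge them
# into one argument sequence. The (tag, position) pairs are illustrative.
seg1 = list("今天天气很好")     # first argument, 6 words
seg2 = list("昨天还电闪雷鸣")   # second argument, 7 words

arg1 = [("a", i + 1) for i in range(len(seg1))]   # a_1 ... a_6
arg2 = [("b", i + 1) for i in range(len(seg2))]   # b_1 ... b_7
merged = arg1 + arg2   # argument vector of the two sections of text
```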
Step S106: obtain a CLS vector based on the text semantic representation, the Brown cluster vector and the argument vector.
In implementing step S106, the text semantic representation, the Brown cluster vector and the argument vector are encoded together with the pre-trained language model BERT to obtain the CLS vector.
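The patent does not spell out how the three representations are combined. One plausible reading, sketched below under that assumption, is that each token's input embedding is augmented with its Brown cluster vector b(a) and argument vector r(a) before BERT encoding, after which the vector at the CLS position serves as the CLS vector.

```python
# Hedged sketch: extend formula (1) to e(a) = w(a) + p(a) + b(a) + r(a).
# This combination rule is an assumption, not stated in the patent;
# the 2-dimensional vectors are illustrative.
def combine(word_vec, pos_vec, brown_vec, arg_vec):
    # element-wise sum of the four per-token representations
    return [w + p + b + r for w, p, b, r in zip(word_vec, pos_vec, brown_vec, arg_vec)]

e = combine([0.1, 0.2], [0.0, 0.1], [0.3, 0.0], [0.1, 0.1])
# e ≈ [0.5, 0.4]
```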
Step S107: predict the discourse relation of the two sections of text by using the CLS vector to obtain the discourse relation of the two sections of text.
Optionally, in step S107, predicting the discourse relation of the two sections of text by using the CLS vector includes:
taking the CLS vector as the input of a logistic regression (Softmax) model to predict the discourse relation of the two sections of text, and outputting the discourse relation of the two sections of text.
Specifically, the obtained CLS vector is input into the logistic regression (Softmax) model, the discourse relation of the two sections of text is predicted, and the discourse relation of the two sections of text is output.
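Step S107 can be sketched as a linear layer followed by Softmax over a set of discourse relations. The relation label set, the weights, and the CLS vector below are all illustrative, not values from the patent.

```python
# Linear layer + Softmax over discourse relations; all values illustrative.
import math

RELATIONS = ["contrast", "causal", "parallel", "expansion"]

def softmax(logits):
    m = max(logits)                         # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [v / s for v in exps]

def predict(cls_vec, weight, bias):
    # one logit per relation: logit_r = w_r · cls + b_r
    logits = [sum(w * x for w, x in zip(w_r, cls_vec)) + b_r
              for w_r, b_r in zip(weight, bias)]
    probs = softmax(logits)
    return RELATIONS[probs.index(max(probs))], probs

cls_vec = [0.2, -0.1, 0.4]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
bias = [0.0, 0.0, 0.0, 0.0]
label, probs = predict(cls_vec, weight, bias)
```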
Based on the discourse relation identification method provided by the embodiment of the present invention, two sections of text describing an object are acquired; the two sections of text are input into the pre-trained language model BERT for encoding, and a text semantic representation is output; Brown cluster mapping is performed on the two sections of text to obtain a Brown cluster vector; the two sections of text are marked as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector; the two argument vectors are combined to obtain the argument vector of the two sections of text; a CLS vector is obtained based on the text semantic representation, the Brown cluster vector and the argument vector; and the discourse relation of the two sections of text is predicted using the CLS vector. In this scheme, the pre-trained language model BERT encodes the two sections of text, and the CLS vector obtained from the text semantic representation, the Brown cluster vector and the argument vector is used to predict the discourse relation, thereby improving the accuracy of discourse relation identification.
Based on the discourse relation identification method shown in Fig. 1, Fig. 4 is a schematic flow chart of another discourse relation identification method according to an embodiment of the present invention, which mainly includes the following steps:
step S401: two pieces of text describing the object are obtained.
Step S402: and inputting the two sections of texts into a pre-training language model BERT for coding, and outputting text semantic representation.
Step S403: and performing Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector.
Step S404: and respectively marking the two sections of texts as a first argument and a second argument to obtain a first argument vector and a second argument vector corresponding to the two sections of texts.
Step S405: and combining the first argument vector and the second argument vector to obtain the argument vectors of the two sections of texts.
Step S406: and obtaining a CLS vector based on the text semantic representation, the Brownian clustering vector and the argument vector.
Step S407: and predicting the discourse relation of the two sections of texts by utilizing the CLS vector to obtain the discourse relation of the two sections of texts.
The execution principle and process of steps S401 to S407 above are the same as those of steps S101 to S107 disclosed in Fig. 1, which may be referred to and are not repeated here.
Step S408: take the discourse relation of the two sections of text and the standard class label as the input of a loss function, and calculate a loss value of the discourse relation of the two sections of text.
In implementing step S408, the predicted discourse relation of the two sections of text and the standard class label are input into the loss function, and the loss value of the discourse relation of the two sections of text is computed.
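The patent does not name the loss function; a common choice for this setup is cross-entropy between the predicted relation distribution and the standard class label, sketched here under that assumption.

```python
# Hedged sketch of step S408: cross-entropy loss against the gold label.
# The predicted distribution and gold index are illustrative.
import math

def cross_entropy(probs, gold_index):
    # negative log-probability assigned to the standard class label
    return -math.log(probs[gold_index])

probs = [0.7, 0.1, 0.1, 0.1]   # predicted distribution over relations
loss = cross_entropy(probs, gold_index=0)
```

A smaller loss here means the model assigned more probability to the standard class label, which is the "error degree" the patent's step S408 measures.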
Based on the discourse relation identification method provided by this embodiment of the present invention, two sections of text describing an object are acquired; the two sections of text are input into the pre-trained language model BERT for encoding, and a text semantic representation is output; Brown cluster mapping is performed on the two sections of text to obtain a Brown cluster vector; the two sections of text are marked as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector, which are combined into the argument vector of the two sections of text; a CLS vector is obtained based on the text semantic representation, the Brown cluster vector and the argument vector; and the discourse relation of the two sections of text is predicted using the CLS vector. In this scheme, the pre-trained language model BERT encodes the two sections of text, the CLS vector obtained from the text semantic representation, the Brown cluster vector and the argument vector is used to predict the discourse relation, and a loss value is further computed for the obtained discourse relation to measure its degree of error, thereby improving the accuracy of discourse relation identification.
To better understand the discourse relation identification method provided by the above embodiments of the present invention, an example is given below.
Suppose section 1 is "He is late today" and section 2 is "So late".
First, sections 1 and 2 are acquired and input into the pre-trained language model BERT for encoding, obtaining and outputting the text semantic representation h = (h_0, h_1, h_2, ..., h_11) of the two sections of text.
Next, category indexes of brownian clusters pointing to chapter 1 and chapter 2 are respectively established.
Assume that the category indexes of the brownian clusters corresponding to each word in chapter 1 are 1, 2, 3, 4, 5, and 6, respectively, and the category indexes of the brownian clusters corresponding to each word in chapter 2 are 1, 2, 3, 4, and 6, respectively.
According to the category index of each Brown cluster, the word vectors with the same category index in discourse 1 and discourse 2 are looked up. Category indices 1, 2, 3, 4 and 6 each match one word in discourse 1 and one word in discourse 2, while category index 5 matches only the word "late" in discourse 1. Writing w(x) for the word vector of word x, and wk and w'k for the discourse-1 word and the discourse-2 word carrying category index k, the matched pairs are (w1, w'1), (w2, w'2), (w3, w'3), (w4, w'4) and (w6, w'6).
The sum of the word vectors with the same category index in discourse 1 and discourse 2 is calculated, and the average of that sum is taken as the vector corresponding to the category index, specifically:
the vector corresponding to category index 1 is b1 = (w1 + w'1)/2;
the vector corresponding to category index 2 is b2 = (w2 + w'2)/2;
the vector corresponding to category index 3 is b3 = (w3 + w'3)/2;
the vector corresponding to category index 4 is b4 = (w4 + w'4)/2;
the vector corresponding to category index 5 is b5 = w(late)/1;
the vector corresponding to category index 6 is b6 = (w6 + w'6)/2.
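The per-index sum-and-average rule above can be written compactly. A small sketch assuming word vectors are plain NumPy arrays; the index assignment (1 through 6 for discourse 1, then 1, 2, 3, 4, 6 for discourse 2) mirrors the example, while the toy vector values are made up.

```python
import numpy as np
from collections import defaultdict

def brown_cluster_vector(word_vectors, category_indices):
    """Average all word vectors that share a Brown category index,
    then stack the averages in index order (b1, b2, ...)."""
    groups = defaultdict(list)
    for vec, idx in zip(word_vectors, category_indices):
        groups[idx].append(vec)
    return np.stack([np.mean(groups[i], axis=0) for i in sorted(groups)])

# Eleven toy word vectors: six for discourse 1, five for discourse 2.
vecs = [np.array([float(i), 1.0]) for i in range(11)]
idx = [1, 2, 3, 4, 5, 6,   # discourse 1
       1, 2, 3, 4, 6]      # discourse 2
b = brown_cluster_vector(vecs, idx)
```

Index 5 occurs only once, so its "average" is just that single word vector, matching b5 = w(late)/1 in the example.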
The vectors corresponding to the category indices are combined to obtain the Brown clustering vector, specifically: b = (b1, b2, b3, b4, b5, b6).
Then, discourse 1 is marked as the first argument; each word in the first argument is represented by "a", with subscripts distinguishing the words, giving the first argument vector r = (a1, a2, a3, a4, a5, a6) corresponding to discourse 1. Discourse 2 is marked as the second argument; each word in the second argument is likewise represented by "b" with subscripts, giving the second argument vector r' = (b1, b2, b3, b4, b5) corresponding to discourse 2.
The first argument vector and the second argument vector are merged to obtain the argument vector of discourses 1 and 2. (The merged argument vector is given as a formula image in the original document.)
It should be noted that the final argument vector of the two sections of texts includes the argument vector of each word in the first section of text and of each word in the second section of text; the "0" entries in the merged argument vector are placeholders, used to describe the final argument vector more intuitively.
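One plausible reading of the placeholder layout is a two-row alignment over the concatenated token sequence, with "0" filling the positions a given argument does not cover. The layout below is an assumption for illustration; the patent presents the merged vector only as a formula image.

```python
import numpy as np

def merge_argument_vectors(n_first, n_second):
    """Align the two argument markers over the concatenated token sequence;
    '0' entries are placeholders, as described in the text.
    Row 0 marks first-argument positions, row 1 second-argument positions."""
    total = n_first + n_second
    merged = np.zeros((2, total), dtype=int)
    merged[0, :n_first] = np.arange(1, n_first + 1)    # a1 .. a_n
    merged[1, n_first:] = np.arange(1, n_second + 1)   # b1 .. b_m
    return merged

# Six words in the first argument, five in the second, as in the example.
m = merge_argument_vectors(6, 5)
```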
Then, the text semantic representation, the Brown clustering vector and the argument vector are encoded together by the pre-training language model BERT to obtain the CLS vector.
Finally, the CLS vector is taken as the input of the logistic regression model Softmax to predict the discourse relation of the two sections of texts; the discourse relation of discourses 1 and 2 is found to be causal, and this discourse relation is output.
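The final prediction step can be sketched as a Softmax (logistic regression) layer over the CLS vector. The label set and the weight values below are illustrative assumptions; only the Softmax-over-CLS structure comes from the text.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_relation(cls_vec, weight, bias, labels):
    """Logistic-regression (Softmax) layer over the CLS vector."""
    probs = softmax(weight @ cls_vec + bias)
    return labels[int(probs.argmax())], probs

labels = ("causal", "contrast", "expansion")      # illustrative label set
W = np.array([[1.0, 0.5], [0.2, -0.3], [-0.5, 0.1]])  # toy trained weights
cls = np.array([2.0, 1.0])                            # toy 2-dim CLS vector
label, probs = predict_relation(cls, W, np.zeros(3), labels)
```

With these toy weights the first row scores highest, so the predicted relation is "causal", matching the example.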
The discourse relation identification method provided by the embodiment of the invention uses the pre-training language model BERT to encode discourse 1 and discourse 2, and uses the CLS vector obtained from the text semantic representation, the Brown clustering vector and the argument vector to predict the discourse relation of the two sections of texts, thereby improving the accuracy of discourse relation identification.
Corresponding to the discourse relation identification method shown in the embodiment of the present invention, the embodiment of the present invention also provides a discourse relation identification system, as shown in fig. 5, where the discourse relation identification system includes: an acquisition module 51, an encoding module 52, a mapping module 53, a labeling module 54, a combining module 55, an obtaining module 56 and a prediction module 57.
The acquisition module 51 is configured to acquire two sections of texts describing an object.
The encoding module 52 is configured to input the two sections of texts into the pre-training language model BERT for encoding and to output the text semantic representation.
The mapping module 53 is configured to perform Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector.
The labeling module 54 is configured to mark the two sections of texts as a first argument and a second argument respectively, obtaining a first argument vector and a second argument vector corresponding to the two sections of texts.
The combining module 55 is configured to combine the first argument vector and the second argument vector to obtain the argument vector of the two sections of texts.
The obtaining module 56 is configured to obtain a CLS vector based on the text semantic representation, the Brown clustering vector and the argument vector.
The prediction module 57 is configured to predict the discourse relation of the two sections of texts by using the CLS vector.
It should be noted that the specific principle and execution process of each module in the discourse relation identification system disclosed in the embodiment of the present invention are the same as those of the discourse relation identification method disclosed in the embodiment of the present invention; reference may be made to the corresponding parts of the method embodiment, which are not described herein again.
Based on the discourse relation identification system provided by the embodiment of the invention, two sections of texts describing an object are obtained; the two sections of texts are input into the pre-training language model BERT for encoding, which outputs a text semantic representation; Brown clustering mapping is performed on the two sections of texts to obtain a Brown clustering vector; the two sections of texts are marked as a first argument and a second argument respectively to obtain a corresponding first argument vector and second argument vector; the first argument vector and the second argument vector are combined to obtain the argument vector of the two sections of texts; a CLS vector is obtained based on the text semantic representation, the Brown clustering vector and the argument vector; and the CLS vector is used to predict the discourse relation of the two sections of texts. In this scheme, the pre-training language model BERT is used to encode the two sections of texts, and the CLS vector obtained from the text semantic representation, the Brown clustering vector and the argument vector is used to predict their discourse relation, thereby improving the accuracy of discourse relation identification.
Optionally, based on the mapping module 53 shown in fig. 5, the mapping module 53 is specifically configured to:
for each section of text, establishing a category index pointing to the Brown clusters of the text; obtaining a vector corresponding to each category index based on the category index of each Brown cluster; and combining the vector corresponding to each category index to obtain a Brown clustering vector.
Based on the discourse relation identification system provided by the embodiment of the invention, the Brown clustering vector is obtained by performing Brown clustering mapping on the two sections of texts, thereby improving the accuracy of discourse relation identification.
Optionally, based on the mapping module 53 shown in fig. 5, the mapping module 53, in obtaining the vector corresponding to each category index based on the category index of each Brown cluster, is specifically configured for:
based on the category index of each Brown cluster, obtaining the word vectors with the same category index in the text; and calculating the sum of the word vectors with the same category index in the text, and taking the average of that sum as the vector corresponding to each category index.
Based on the discourse relation identification system provided by the embodiment of the invention, the vector corresponding to each category index is obtained by using the category index of each Brown cluster, thereby improving the accuracy of discourse relation identification.
Optionally, based on the prediction module 57 shown in fig. 5, the prediction module 57 is specifically configured to:
and (3) predicting the chapter relationship of the two sections of texts by taking the CLS vector as the input of the logistic regression model Softmax, and outputting the chapter relationship of the two sections of texts.
Based on the discourse relation identification system provided by the embodiment of the invention, the obtained CLS vector is used to predict the discourse relation of the two sections of texts, thereby improving the accuracy of discourse relation identification.
Based on the discourse relation identification system shown in fig. 5, and as shown in fig. 6, the discourse relation identification system is further provided with a loss value calculation module 58.
The loss value calculation module 58 is configured to perform loss value calculation by taking the discourse relation of the two sections of texts and the standard category label as the input of a loss function, so as to obtain the loss value of the discourse relation of the two sections of texts.
Based on the discourse relation identification system provided by the embodiment of the invention, a loss value is calculated for the obtained discourse relation of the two sections of texts, thereby measuring the degree of error or loss of the predicted discourse relation and improving the accuracy of discourse relation identification.
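The loss value calculation can be sketched with the standard cross-entropy loss between the predicted relation distribution and the standard category label. The text does not name a specific loss function, so cross-entropy is an assumption, as are the toy probabilities below.

```python
import numpy as np

def cross_entropy(probs, gold_index, eps=1e-12):
    """Negative log-probability of the standard (gold) category label;
    eps guards against log(0)."""
    return -np.log(probs[gold_index] + eps)

probs = np.array([0.7, 0.2, 0.1])   # toy predicted relation distribution
loss = cross_entropy(probs, 0)      # gold label: class 0 (e.g. "causal")
```

A confident correct prediction gives a small loss; the loss grows as the probability mass assigned to the gold label shrinks.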
Based on the discourse relation identification system disclosed in the embodiment of the present invention, the modules may be implemented by a hardware device composed of a processor and a memory. Specifically, the modules are stored in the memory as program units, and the processor executes the program units stored in the memory to realize discourse relation identification.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels can be set, and discourse relation identification is realized by adjusting the kernel parameters.
An embodiment of the present invention provides a computer storage medium, where the storage medium includes a stored discourse relation identification program, and the program is executed by a processor to implement the discourse relation identification method of any of the foregoing embodiments.
The embodiment of the invention provides a processor, which is used for running a program, wherein the discourse relation identification method disclosed in figure 1 is executed when the program runs.
An embodiment of the present invention provides an electronic device 70. Fig. 7 is a schematic structural diagram of the electronic device 70 according to an embodiment of the present invention.
The electronic device in the embodiment of the invention can be a server, a PC, a PAD, a mobile phone and the like.
The electronic device comprises at least one processor 701, and at least one memory 702 connected to the processor, and a bus 703.
The processor 701 and the memory 702 communicate with each other via a bus 703. A processor 701 for executing the program stored in the memory 702.
A memory 702 for storing a program for at least: acquiring two sections of texts for describing an object; inputting the two sections of texts into a pre-training language model BERT for coding, and outputting text semantic representation; performing Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector; marking the two sections of texts as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of texts; combining the first argument vector and the second argument vector to obtain argument vectors of the two sections of texts; obtaining a CLS vector based on the text semantic representation, the Brownian clustering vector and the argument vector; and predicting the discourse relation of the two sections of texts by utilizing the CLS vector to obtain the discourse relation of the two sections of texts.
The present application further provides a computer program product adapted to perform a program for initializing the following method steps when executed on an electronic device:
acquiring two sections of texts describing an object; inputting the two sections of texts into the pre-training language model BERT for encoding, and outputting a text semantic representation; performing Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector; marking the two sections of texts as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of texts; combining the first argument vector and the second argument vector to obtain the argument vector of the two sections of texts; obtaining a CLS vector based on the text semantic representation, the Brown clustering vector and the argument vector; and predicting the discourse relation of the two sections of texts by using the CLS vector to obtain the discourse relation of the two sections of texts.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or nonvolatile memory such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement the information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A discourse relation identification method is characterized by comprising the following steps:
acquiring two sections of texts for describing an object;
inputting the two sections of texts into a pre-training language model BERT for coding, and outputting text semantic representation;
performing Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector;
marking the two sections of texts as a first argument and a second argument respectively to obtain a first argument vector and a second argument vector corresponding to the two sections of texts;
combining the first argument vector and the second argument vector to obtain argument vectors of the two sections of texts;
obtaining a classification (CLS) vector based on the text semantic representation, the Brown clustering vector and the argument vector;
and predicting the discourse relation of the two sections of texts by using the CLS vector to obtain the discourse relation of the two sections of texts.
2. The method according to claim 1, wherein performing Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector comprises:
for each piece of text, establishing a category index pointing to a Brown cluster of the text;
obtaining a vector corresponding to each category index based on the category index of each Brown cluster;
and combining the vector corresponding to each category index to obtain a Brown clustering vector.
3. The method according to claim 2, wherein the obtaining a vector corresponding to each category index based on the category index of each Brown cluster comprises:
based on the category index of each Brownian cluster, obtaining word vectors in the text with the same category index;
and calculating sum values of word vectors with the same category index in the text, and taking the average value of the sum values as a vector corresponding to each category index.
4. The method of claim 1, wherein the predicting the discourse relation of the two pieces of text by using the CLS vector to obtain the discourse relation of the two pieces of text comprises:
and taking the CLS vector as the input of a logistic regression model Softmax to predict the discourse relation of the two sections of texts, and outputting the discourse relation of the two sections of texts.
5. The method of claim 1, further comprising:
and calculating the loss value by taking the discourse relation of the two sections of texts and the standard category label as the input of a loss function to obtain the loss value of the discourse relation of the two sections of texts.
6. A system for identifying discourse relation, the system comprising:
the acquisition module is used for acquiring two sections of texts for describing the object;
the coding module is used for inputting the two sections of texts into a pre-training language model BERT for coding and outputting text semantic representation;
the mapping module is used for carrying out Brown clustering mapping on the two sections of texts to obtain a Brown clustering vector;
the marking module is used for respectively marking the two sections of texts as a first argument and a second argument to obtain a first argument vector and a second argument vector corresponding to the two sections of texts;
the combining module is used for combining the first argument vector and the second argument vector to obtain argument vectors of the two sections of texts;
the obtaining module is used for obtaining a CLS vector based on the text semantic representation, the Brownian clustering vector and the argument vector;
and the prediction module is used for predicting the discourse relation of the two sections of texts by using the CLS vector to obtain the discourse relation of the two sections of texts.
7. The system of claim 6, wherein the mapping module is specifically configured to:
for each piece of text, establishing a category index pointing to the Brown clusters of the text; obtaining a vector corresponding to each category index based on the category index of each Brown cluster; and combining the vector corresponding to each category index to obtain a Brown clustering vector.
8. The system of claim 6, wherein the prediction module is specifically configured to:
and predicting the discourse relation of the two sections of texts by taking the CLS vector as the input of a logistic regression model Softmax, and outputting the discourse relation of the two sections of texts.
9. An electronic device, wherein the electronic device is configured to run a program, and wherein the program executes the discourse relation identification method according to any one of claims 1 to 5.
10. A computer storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, the device on which the storage medium is located is controlled to execute the discourse relation identification method according to any one of claims 1 to 5.
CN202111325913.7A 2021-11-10 2021-11-10 Discourse relation identification method, system, equipment and computer storage medium Pending CN115936000A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111325913.7A CN115936000A (en) 2021-11-10 2021-11-10 Discourse relation identification method, system, equipment and computer storage medium


Publications (1)

Publication Number Publication Date
CN115936000A true CN115936000A (en) 2023-04-07

Family

ID=86647798


Country Status (1)

Country Link
CN (1) CN115936000A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination