CN111401035A - Zero-reference resolution method, device, equipment and medium based on big data

Zero-reference resolution method, device, equipment and medium based on big data

Info

Publication number
CN111401035A
Authority
CN
China
Prior art keywords
word
sentence
probability
resolved
item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010099118.XA
Other languages
Chinese (zh)
Inventor
楼星雨
许开河
王少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010099118.XA
Publication of CN111401035A
Priority to PCT/CN2020/123173 (WO2021164293A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a zero-reference resolution method based on big data, which comprises the following steps: performing vectorization on a sentence to be resolved and its preceding text to obtain a context vector representation of each word in the sentence to be resolved and the preceding text; inputting the context vector representation of each word into a bidirectional long short-term memory network to obtain an enhanced context vector representation of each word; traversing the enhanced context vector representation of each word and predicting, from the parameter vectors in the BERT model, each word's probability of being the start word of an antecedent and its probability of being the end word of an antecedent; traversing the words to construct continuous text segments and calculating the antecedent probability of each segment; and selecting the continuous text segment with the highest antecedent probability as the antecedent of the sentence to be resolved. The method addresses the problems that existing zero-reference resolution techniques depend too heavily on a candidate antecedent set, and that their resolution results are inaccurate and unstable.

Description

Zero-reference resolution method, device, equipment and medium based on big data
Technical Field
The invention relates to the field of information technology, and in particular to a big-data-based zero-reference resolution method, device, equipment and medium.
Background
Reference resolution is one of the longest-studied technologies in natural language processing, with wide application scenarios. In customer service robots, conversation robots and intelligent outbound platforms, resolution is among the most central technologies. Reference resolution comprises zero-reference resolution and coreference resolution.
In pro-drop languages such as Chinese, parts of a sentence that can be inferred from the context are often omitted. The omitted parts still carry their syntactic roles in the sentence and refer back to some language unit in the preceding text; they are called zero-reference items (zero anaphors). Zero-reference resolution is the task of finding, for each zero-reference item, the corresponding language unit in the preceding text. It is generally divided into two subtasks: zero-reference position detection and resolution.
The purpose of the resolution subtask is, on the basis of the zero-reference position detection result, to identify the specific antecedent of each zero-reference item in the preceding text. A traditional resolution model generally constructs a candidate antecedent set and then uses a classification or ranking method to select the most probable candidate from the set as the final recognition result. The candidate antecedent set is usually composed of the maximal noun phrases and modified noun phrases in the two sentences preceding the zero-reference item. The accuracy of this approach depends on the quality of the candidate set: if the set does not contain the correct antecedent, subsequent recognition inevitably fails. Because the candidate antecedent set consists of only a few simple noun phrases, the traditional resolution method is unstable and has low accuracy.
Therefore, finding a method that overcomes the existing zero-reference resolution technology's excessive dependence on the candidate antecedent set, and the resulting low accuracy and instability of its resolution results, has become an urgent technical problem for those skilled in the art.
Disclosure of Invention
The embodiments of the invention provide a big-data-based zero-reference resolution method, device, equipment and medium, aiming to solve the problems that the existing zero-reference resolution technology depends excessively on a candidate antecedent set and that its resolution results have low accuracy and are unstable.
A zero-reference resolution method based on big data comprises the following steps:
acquiring a sentence to be resolved and its preceding text, and performing vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and a context vector representation of each word in the preceding text;
inputting the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
traversing the enhanced context vector representation of each word and predicting, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
traversing each word to construct continuous text segments, and calculating the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and selecting, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
Further, performing vectorization on the sentence to be resolved and its preceding text to obtain the context vector representation of each word in the sentence to be resolved and in the preceding text includes:
representing each word in the sentence to be resolved and its preceding text in one-hot form to obtain a high-dimensional discrete word representation matrix for the sentence to be resolved and one for the preceding text;
embedding, by a word embedding method, the two high-dimensional discrete word representation matrices into low-dimensional dense representation matrices;
and inputting the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text into a preset BERT model for bidirectional encoding, obtaining the context vector representation of each word in the sentence to be resolved and the preceding text.
Further, traversing the enhanced context vector representation of each word and predicting each word's antecedent start-word probability and antecedent end-word probability from the parameter vectors in the BERT model includes:
acquiring the start-word parameter vector and end-word parameter vector from the BERT model;
taking the dot product of each word's enhanced context vector representation with the start-word parameter vector and applying softmax to the results to obtain each word's antecedent start-word probability;
and taking the dot product of each word's enhanced context vector representation with the end-word parameter vector and applying softmax to the results to obtain each word's antecedent end-word probability.
Further, traversing each word, constructing continuous text segments, and calculating the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities includes:
traversing each word, taking the word as the start word of a candidate antecedent, taking the word itself and each following word as the end word, and constructing the continuous text segments;
and calculating the product of the start word's antecedent start-word probability and the end word's antecedent end-word probability for each continuous text segment to obtain the segment's antecedent probability.
Further, selecting the continuous text segment with the highest antecedent probability as the antecedent of the sentence to be resolved includes:
filtering out, from the continuous text segments, those that intersect the sentence to be resolved;
and selecting, from the filtered continuous text segments, the one with the highest antecedent probability as the antecedent of the sentence to be resolved.
A big-data-based zero-reference resolution device comprises:
a vectorization module, configured to acquire a sentence to be resolved and its preceding text, and perform vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and in the preceding text;
an enhancement module, configured to input the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
a prediction module, configured to traverse the enhanced context vector representation of each word and predict, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
a construction module, configured to traverse each word, construct continuous text segments, and calculate the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and a selection module, configured to select, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
Further, the vectorization module includes:
a representation unit, configured to represent each word in the sentence to be resolved and its preceding text in one-hot form, obtaining a high-dimensional discrete word representation matrix for the sentence to be resolved and one for the preceding text;
an embedding unit, configured to embed, by a word embedding method, the two high-dimensional discrete word representation matrices into low-dimensional dense representation matrices;
and an encoding unit, configured to input the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text into a preset BERT model for bidirectional encoding, obtaining the context vector representation of each word in the sentence to be resolved and the preceding text.
Further, the prediction module includes:
an acquisition unit, configured to acquire the start-word parameter vector and end-word parameter vector from the BERT model;
a start-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the start-word parameter vector and apply softmax to the results, obtaining each word's antecedent start-word probability;
and an end-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the end-word parameter vector and apply softmax to the results, obtaining each word's antecedent end-word probability.
A computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above big-data-based zero-reference resolution method when executing the computer program.
A computer-readable storage medium stores a computer program that, when executed by a processor, implements the above big-data-based zero-reference resolution method.
The invention performs vectorization on a sentence to be resolved and its preceding text to obtain a context vector representation of each word in the sentence to be resolved and in the preceding text; enhances each word's contextual expression and position information through a bidirectional long short-term memory network to obtain an enhanced context vector representation of each word; traverses the enhanced context vector representation of each word and predicts each word's antecedent start-word probability and antecedent end-word probability from the parameter vectors in the BERT model; constructs continuous text segments from the words and calculates each segment's antecedent probability from the words' antecedent start-word and end-word probabilities; and finally selects, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved. Because the invention is based on an extractive reading-comprehension model, every continuous segment of the preceding text can serve as a candidate antecedent; no candidate antecedent set needs to be constructed in advance with rules, the number and coverage of candidate antecedents are much greater, and the accuracy and reliability of the zero-reference resolution result are effectively improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the big-data-based zero-reference resolution method according to an embodiment of the invention;
FIG. 2 is a flowchart of step S101 of the big-data-based zero-reference resolution method according to another embodiment of the invention;
FIG. 3 is a flowchart of step S103 of the big-data-based zero-reference resolution method according to another embodiment of the invention;
FIG. 4 is a flowchart of step S104 of the big-data-based zero-reference resolution method according to another embodiment of the invention;
FIG. 5 is a flowchart of step S105 of the big-data-based zero-reference resolution method according to another embodiment of the invention;
FIG. 6 is a schematic block diagram of the big-data-based zero-reference resolution device according to an embodiment of the invention;
FIG. 7 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The zero-reference resolution method based on big data provided by the embodiment of the invention is applied to a server. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers. In one embodiment, as shown in fig. 1, a zero-reference resolution method based on big data is provided, which includes the following steps:
in step S101, a sentence to be resolved and its upper text information are obtained, and vectorization processing is performed on the sentence to be resolved and its upper text information to obtain a context vector representation of each word in the sentence to be resolved and a context vector representation of each word in the upper text information.
Here, the above information is text information before a paragraph where the sentence to be resolved is located, and may be one or more sentences before the sentence to be resolved. In an application scenario of the customer service robot, the sentence to be resolved may be a text input by the customer when performing query and chat or a result text after the phonetic transcription. The contextual information may be all text that the customer asks and chats.
Each word corresponds to a context vector representation, which refers to a feature vector for each word. As a preferred example of the present invention, the embodiment of the present invention obtains the context vector representation of each word by vectorizing the sentence to be resolved and the context information thereof, and introduces the correlation between words through the wordemeading method and the bert model, so that the context vector representation of each word is more accurate, and the feature dimension in the context vector representation of each word is reduced. Optionally, as shown in fig. 2, the performing vectorization processing on the sentence to be resolved and the upper text information thereof in step S101 to obtain a context vector representation of each word in the sentence to be resolved, and the context vector representation of each word in the upper text information includes:
in step S201, a one-hot form is adopted to represent each word in the sentence to be resolved and the text information thereof, and a high-dimensional discrete word representation matrix corresponding to the sentence to be resolved and a high-dimensional discrete word representation matrix corresponding to the text information thereof are obtained.
In the embodiment of the invention, the sentence to be resolved and the upper information thereof are converted into mathematical representation in a one-hot form, and a high-dimensional discrete word representation matrix corresponding to the sentence to be resolved and a high-dimensional discrete word representation matrix corresponding to the upper information thereof are obtained. Specifically, a dictionary is constructed in advance, the dictionary at least comprises all words in a sentence to be resolved, each word is assigned with a number, and when the sentence to be resolved is coded, each word contained in the sentence to be resolved is converted into a one-hot form in which the corresponding position of the number of the word in the dictionary is 1. The processing logic for the above information is the same, and is not described in detail here. The One-hot representation mode is very intuitive, the length of the One-hot form of each word is the length of the dictionary, if the dictionary contains 10000 words, the One-hot form corresponding to each word is a vector of 1 x 10000, only One position of the vector is 1, and the rest positions are 0, so that the space is wasted, and the calculation is not facilitated; in addition, the relationship between each word cannot be represented by a one-hot form.
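As an illustration only, a minimal sketch of this one-hot step might look as follows; the dictionary and sentence below are placeholders, not from the patent:

```python
import numpy as np

# Toy dictionary mapping each character to an index; a real system would
# build this from the training corpus.
vocab = {ch: i for i, ch in enumerate("我去北京他来了")}

def one_hot_matrix(sentence: str, vocab: dict) -> np.ndarray:
    """Return a (len(sentence), len(vocab)) high-dimensional discrete matrix."""
    m = np.zeros((len(sentence), len(vocab)), dtype=np.float32)
    for row, ch in enumerate(sentence):
        m[row, vocab[ch]] = 1.0   # 1 at the character's dictionary index
    return m

print(one_hot_matrix("他来了", vocab).shape)   # (3, 7)
```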
In view of this, the embodiment of the invention further reduces the dimensionality of the high-dimensional discrete word representation matrices of the sentence to be resolved and its preceding text, and introduces correlations between words.
In step S202, a word embedding method is used to embed the high-dimensional discrete word representation matrix of the sentence to be resolved and that of the preceding text into low-dimensional dense representation matrices.
Optionally, the embodiment of the invention adopts the word2vec variant of word embedding. Specifically, a preset shallow neural network is trained to learn a dense feature vector for each word, yielding word vector representations that reflect the relation between any two words. The one-hot form of each word in the sentence to be resolved is traversed, each word is converted into its dense feature vector by the trained shallow network, and the dense feature vectors of all words are combined into the low-dimensional dense representation matrix of the sentence to be resolved. The processing logic for the preceding text is the same and is not repeated here.
Here, the dense feature vector gives each word a fixed-length vector representation; the length can be chosen freely, e.g. 300, which is much smaller than the dictionary length of the one-hot form. The relation between two words can be expressed by the angle between their vectors, in particular by a simple cosine function. It can be seen that this embodiment introduces correlations between words through the dense feature vectors and reduces the feature dimensionality of the sentence to be resolved and its preceding text.
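A minimal sketch of this embedding step, assuming the gensim library as a stand-in for the patent's shallow network; the corpus and vector dimension are illustrative:

```python
import numpy as np
from gensim.models import Word2Vec

corpus = [list("我去北京"), list("他也去北京")]      # character-level toy corpus
w2v = Word2Vec(corpus, vector_size=300, min_count=1, window=3)

def dense_matrix(sentence: str) -> np.ndarray:
    """Stack each character's dense vector into a (len(sentence), 300) matrix."""
    return np.stack([w2v.wv[ch] for ch in sentence])

# The relation between two characters is expressed by a cosine function.
a, b = w2v.wv["去"], w2v.wv["北"]
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```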
In step S203, the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text are input into a preset BERT model for bidirectional encoding, obtaining a context vector representation of each word in the sentence to be resolved and the preceding text.
Here, a context vector representation is a feature vector that incorporates the surrounding text. The BERT model encodes each word vector deeply and bidirectionally, turning a word vector that originally contained no contextual information into a context vector that incorporates the word's surrounding context. Specifically, the BERT model maps the input low-dimensional dense representation matrix into a hidden space through 24 Transformer blocks. Each Transformer block consists, in order, of a multi-head attention module, a residual connection, layer normalization, a feed-forward network, and similar components. In the multi-head attention module, the low-dimensional dense representation matrix learns the interaction information between contexts and adds positional encoding, so every word vector in the hidden space produced by the BERT model is a context-based vector representation. Each word corresponds to one context vector representation, making the vector representation of each word more accurate.
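A sketch of this encoding step, assuming the HuggingFace transformers library as a stand-in for the patent's preset BERT model; the patent describes 24 Transformer blocks, i.e. a BERT-large-scale model, while the 12-layer bert-base-chinese checkpoint is used here only to keep the example small:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

preceding_text = "他昨天去了北京。"     # preceding text
sentence = "见到了老朋友。"             # sentence to be resolved (subject omitted)

inputs = tokenizer(preceding_text, sentence, return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)

# One context vector per token: shape (1, seq_len, hidden_size).
context_vectors = outputs.last_hidden_state
```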
In step S102, the context vector representations of the words in the sentence to be resolved and the preceding text are input into the bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word.
To further strengthen each word's contextual expression and position information, the context vector representations of the words in the sentence to be resolved and the preceding text are input into a bidirectional LSTM network, which directly learns the dependency relationships between the words.
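A minimal sketch of this enhancement step in PyTorch; the hidden sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

hidden = 768                                    # assumed BERT hidden size
bilstm = nn.LSTM(input_size=hidden, hidden_size=hidden // 2,
                 bidirectional=True, batch_first=True)

context_vectors = torch.randn(1, 20, hidden)    # (batch, seq_len, hidden)
enhanced, _ = bilstm(context_vectors)           # forward and backward states concatenated
assert enhanced.shape == (1, 20, hidden)
```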
In step S103, the enhanced context vector representation of each word is traversed, and each word's antecedent start-word probability and antecedent end-word probability are predicted from the parameter vectors in the BERT model.
In step S203, the BERT model obtained the context vector representation of each word in the sentence to be resolved and its preceding text; it also establishes two parameter vectors to be learned, namely the start-word parameter vector and the end-word parameter vector. The embodiment of the invention further uses these parameter vectors in the BERT model to predict each word's antecedent start-word probability and antecedent end-word probability. Optionally, as shown in fig. 3, traversing the enhanced context vector representation of each word in step S103 and predicting the antecedent start-word probability and antecedent end-word probability from the parameter vectors in the BERT model includes:
in step S301, a head word parameter vector and a tail word parameter vector in the bert model are obtained.
Here, the head word parameter vector and the tail word parameter vector are both two vectors initialized randomly in the bert model, and can be continuously learned by optimizing an objective function.
In step S302, the enhanced context vector representation of each word is dot-product-operated with the parameter vector of the first word, and the result of the dot-product-operation is softmax-processed to obtain the probability of the first word of the finger-back item of each word.
After the header word parameter vector is obtained, each word is traversed, and the dot product of the context vector representation after the word enhancement and the header word parameter vector is calculated. In the bert model, a plurality of head word parameter vectors are provided, and each head word parameter vector corresponds to one dot product, so that a plurality of dot products corresponding to each word are obtained. The present embodiment further performs numerical processing on the multiple dot products through a Softmax function, converts the multiple dot products into relative probabilities, and selects a maximum value among the relative probabilities as the probability of the return finger item head word of the word.
In step S303, the enhanced context vector representation of each word is dot-product-operated with the parameter vector of the end word, and the result of dot-product-operation is softmax-processed to obtain the probability of the end word of the finger-back item of each word.
The calculation process of the probability of the end word is the same as that of the probability of the head word. After the tail word vector parameters are obtained, each word is traversed, and the dot product of the context vector representation after the word enhancement and the tail word parameter vector is calculated. In the bert model, a plurality of tail word parameter vectors are provided, and each tail word parameter vector corresponds to one dot product, so that a plurality of dot products corresponding to each word are obtained. The embodiment further performs numerical processing on the multiple dot products through a Softmax function, converts the multiple dot products into relative probabilities, and selects a maximum value of the relative probabilities as the probability of the last word of the ring finger of the word.
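A sketch of the probability computation in its standard extractive-QA form, with a single start and a single end parameter vector (an assumption made to keep the example small; the softmax here normalizes over the sequence):

```python
import torch
import torch.nn.functional as F

hidden, seq_len = 768, 20
enhanced = torch.randn(1, seq_len, hidden)          # enhanced context vectors

w_start = torch.nn.Parameter(torch.randn(hidden))   # start-word parameter vector
w_end = torch.nn.Parameter(torch.randn(hidden))     # end-word parameter vector

# Dot product of every enhanced vector with the parameter vector, then softmax.
start_probs = F.softmax(enhanced @ w_start, dim=-1)  # (1, seq_len)
end_probs = F.softmax(enhanced @ w_end, dim=-1)      # (1, seq_len)
```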
In step S104, each word is traversed, continuous text segments are constructed, and the antecedent probability of each continuous text segment is calculated from the words' antecedent start-word and end-word probabilities.
Here, for each word, a continuous text segment is the contiguous span running from that word, taken as the antecedent start word, to another word, taken as the antecedent end word. The segment's antecedent probability is the product of the start word's antecedent start-word probability and the end word's antecedent end-word probability. Optionally, as shown in fig. 4, traversing each word in step S104 to construct continuous text segments and calculating their antecedent probabilities from the words' antecedent start-word and end-word probabilities includes:
in step S401, each word is traversed, the word is used as a beginning word of the fingerback item, and the word and the following words are used as end words of the fingerback item, so as to construct a continuous text segment.
The continuous text segment can be a single word, or a word, a sentence and a text segment, so that a new way for creating the candidate set of the return finger item is provided, any segment in the above information of the sentence to be digested can be used as the candidate item of the return finger item, the candidate set of the return finger item does not need to be manually created, and the range of the candidate item of the return finger item is effectively expanded.
In step S402, the product of the start-word probability of the segment's first word and the end-word probability of its last word is calculated to obtain the segment's antecedent probability.
For each continuous segment, this embodiment computes the antecedent probability from the segment's first and last words: its value is the product of the first word's antecedent start-word probability and the last word's antecedent end-word probability. The larger this product, the larger the segment's antecedent probability; the smaller the product, the smaller the probability.
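A sketch of the span construction and scoring, continuing the notation of the previous sketch; the probabilities here are random placeholders:

```python
import torch

start_probs = torch.rand(20)    # per-word antecedent start-word probabilities
end_probs = torch.rand(20)      # per-word antecedent end-word probabilities

# Every pair (i, j) with j >= i is a candidate antecedent span, scored by
# start_probs[i] * end_probs[j].
spans = {}
for i in range(len(start_probs)):
    for j in range(i, len(end_probs)):
        spans[(i, j)] = float(start_probs[i] * end_probs[j])
```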
In step S105, the continuous text segment with the highest antecedent probability is selected from the continuous text segments as the antecedent of the sentence to be resolved.
Optionally, as shown in fig. 5, selecting the continuous text segment with the highest antecedent probability in step S105 as the antecedent of the sentence to be resolved includes:
in step S501, continuous text segments intersecting with the sentence to be digested are filtered out from the continuous text segments.
If any part of the continuous text segment is overlapped with the sentence to be digested, deleting the continuous text segment to complete the filtering of the continuous text segment. Specifically, the continuous text segment with the head characters and/or the tail characters falling in the sentence to be resolved can be deleted by judging whether the head characters and/or the tail characters of the continuous text segment fall in the sentence to be resolved, so that the head characters and the tail characters of the continuous text segment which is reserved are not in the current sentence to be resolved. Because the number of the continuous text segments is very large, and the referent usually cannot overlap with the sentence to be resolved, the continuous text segments which are definitely not referent are filtered out by deleting the continuous text segments which have intersection with the sentence to be resolved, so that the continuous text segments which are taken as candidate items are reduced, the efficiency of selecting the referent of the sentence to be resolved is improved, and the zero-reference resolution efficiency is further improved.
In step S502, the continuous text segment with the highest antecedent probability is selected from the filtered segments as the antecedent of the sentence to be resolved.
Then, according to the antecedent probabilities computed in step S104, the segment with the highest antecedent probability among the retained continuous text segments is selected as the antecedent of the sentence to be resolved.
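A sketch of the filtering and selection, under the assumption that the sentence to be resolved occupies positions sent_start..sent_end of the concatenated text; the offsets and scores below are placeholders:

```python
spans = {(0, 2): 0.31, (3, 5): 0.47, (13, 15): 0.62}  # (start, end) -> probability
sent_start, sent_end = 12, 19                          # span of the sentence itself

def overlaps(span, lo=sent_start, hi=sent_end):
    i, j = span
    return not (j < lo or i > hi)   # any intersection with the sentence

# Drop spans intersecting the sentence, then take the highest-probability one.
candidates = {s: p for s, p in spans.items() if not overlaps(s)}
antecedent = max(candidates, key=candidates.get)       # -> (3, 5)
```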
The embodiment of the invention provides a new method for determining the antecedent. Based on an extractive reading-comprehension model, every continuous segment of the preceding text can serve as a candidate antecedent, and no candidate antecedent set needs to be constructed in advance with rules. The number and coverage of candidate antecedents are therefore much greater, the recognition failures that occur in the prior art when the candidate set does not contain the correct antecedent are effectively avoided, the existing zero-reference resolution methods' excessive dependence on the candidate antecedent set is overcome, and the accuracy and reliability of the zero-reference resolution result are improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, a big-data-based zero-reference resolution device is provided; it corresponds one-to-one with the big-data-based zero-reference resolution method of the above embodiments. As shown in fig. 6, the device includes a vectorization module 61, an enhancement module 62, a prediction module 63, a construction module 64 and a selection module 65. The functional modules are described in detail as follows:
a vectorization module 61, configured to acquire a sentence to be resolved and its preceding text, and perform vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and in the preceding text;
an enhancement module 62, configured to input the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
a prediction module 63, configured to traverse the enhanced context vector representation of each word and predict, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
a construction module 64, configured to traverse each word, construct continuous text segments, and calculate the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and a selection module 65, configured to select, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
Optionally, the vectorization module 61 includes:
a representation unit, configured to represent each word in the sentence to be resolved and its preceding text in one-hot form, obtaining a high-dimensional discrete word representation matrix for the sentence to be resolved and one for the preceding text;
an embedding unit, configured to embed, by a word embedding method, the two high-dimensional discrete word representation matrices into low-dimensional dense representation matrices;
and an encoding unit, configured to input the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text into a preset BERT model for bidirectional encoding, obtaining the context vector representation of each word in the sentence to be resolved and the preceding text.
Optionally, the prediction module 63 includes:
an acquisition unit, configured to acquire the start-word parameter vector and end-word parameter vector from the BERT model;
a start-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the start-word parameter vector and apply softmax to the results, obtaining each word's antecedent start-word probability;
and an end-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the end-word parameter vector and apply softmax to the results, obtaining each word's antecedent end-word probability.
Optionally, the construction module 64 includes:
a construction unit, configured to traverse each word, take the word as the start word of a candidate antecedent, take the word itself and each following word as the end word, and construct the continuous text segments;
and a calculation unit, configured to calculate the product of the start-word probability of each segment's first word and the end-word probability of its last word, obtaining the segment's antecedent probability.
Optionally, the selection module 65 includes:
a filtering unit, configured to filter out, from the continuous text segments, those that intersect the sentence to be resolved;
and a selection unit, configured to select, from the filtered continuous text segments, the one with the highest antecedent probability as the antecedent of the sentence to be resolved.
For the specific definition of the big-data-based zero-reference resolution device, reference may be made to the above definition of the big-data-based zero-reference resolution method, which is not repeated here. The modules of the big-data-based zero-reference resolution device may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded in hardware in, or independent of, a processor in the computer device, or stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server; its internal structure may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by a processor, implements a big-data-based zero-reference resolution method.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; the processor implements the following steps when executing the computer program:
acquiring a sentence to be resolved and its preceding text, and performing vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and a context vector representation of each word in the preceding text;
inputting the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
traversing the enhanced context vector representation of each word and predicting, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
traversing each word to construct continuous text segments, and calculating the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and selecting, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
It will be understood by those of ordinary skill in the art that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A zero-reference resolution method based on big data, characterized by comprising the following steps:
acquiring a sentence to be resolved and its preceding text, and performing vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and a context vector representation of each word in the preceding text;
inputting the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
traversing the enhanced context vector representation of each word and predicting, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
traversing each word to construct continuous text segments, and calculating the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and selecting, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
2. The big-data-based zero-reference resolution method according to claim 1, wherein performing vectorization on the sentence to be resolved and its preceding text to obtain the context vector representation of each word in the sentence to be resolved and in the preceding text comprises:
representing each word in the sentence to be resolved and its preceding text in one-hot form to obtain a high-dimensional discrete word representation matrix for the sentence to be resolved and one for the preceding text;
embedding, by a word embedding method, the two high-dimensional discrete word representation matrices into low-dimensional dense representation matrices;
and inputting the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text into a preset BERT model for bidirectional encoding, obtaining the context vector representation of each word in the sentence to be resolved and the preceding text.
3. The big-data-based zero-reference resolution method according to claim 1 or 2, wherein traversing the enhanced context vector representation of each word and predicting each word's antecedent start-word probability and antecedent end-word probability from the parameter vectors in the BERT model comprises:
acquiring the start-word parameter vector and end-word parameter vector from the BERT model;
taking the dot product of each word's enhanced context vector representation with the start-word parameter vector and applying softmax to the results to obtain each word's antecedent start-word probability;
and taking the dot product of each word's enhanced context vector representation with the end-word parameter vector and applying softmax to the results to obtain each word's antecedent end-word probability.
4. The big-data-based zero-reference resolution method according to claim 3, wherein traversing each word, constructing continuous text segments, and calculating the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities comprises:
traversing each word, taking the word as the start word of a candidate antecedent, taking the word itself and each following word as the end word, and constructing the continuous text segments;
and calculating the product of the start word's antecedent start-word probability and the end word's antecedent end-word probability for each continuous text segment to obtain the segment's antecedent probability.
5. The big-data-based zero-reference resolution method according to claim 4, wherein selecting the continuous text segment with the highest antecedent probability as the antecedent of the sentence to be resolved comprises:
filtering out, from the continuous text segments, those that intersect the sentence to be resolved;
and selecting, from the filtered continuous text segments, the one with the highest antecedent probability as the antecedent of the sentence to be resolved.
6. A big-data-based zero-reference resolution device, characterized by comprising:
a vectorization module, configured to acquire a sentence to be resolved and its preceding text, and perform vectorization on both to obtain a context vector representation of each word in the sentence to be resolved and in the preceding text;
an enhancement module, configured to input the context vector representations of the words in the sentence to be resolved and the preceding text into a bidirectional long short-term memory network to enhance each word's contextual expression and position information, obtaining an enhanced context vector representation of each word;
a prediction module, configured to traverse the enhanced context vector representation of each word and predict, from the parameter vectors in the BERT model, each word's antecedent start-word probability and antecedent end-word probability;
a construction module, configured to traverse each word, construct continuous text segments, and calculate the antecedent probability of each continuous text segment from the words' antecedent start-word and end-word probabilities;
and a selection module, configured to select, from the continuous text segments, the segment with the highest antecedent probability as the antecedent of the sentence to be resolved.
7. The big-data-based zero-reference resolution device according to claim 6, wherein the vectorization module comprises:
a representation unit, configured to represent each word in the sentence to be resolved and its preceding text in one-hot form, obtaining a high-dimensional discrete word representation matrix for the sentence to be resolved and one for the preceding text;
an embedding unit, configured to embed, by a word embedding method, the two high-dimensional discrete word representation matrices into low-dimensional dense representation matrices;
and an encoding unit, configured to input the low-dimensional dense representation matrices of the sentence to be resolved and its preceding text into a preset BERT model for bidirectional encoding, obtaining the context vector representation of each word in the sentence to be resolved and the preceding text.
8. The big-data-based zero-reference resolution device according to claim 6 or 7, wherein the prediction module comprises:
an acquisition unit, configured to acquire the start-word parameter vector and end-word parameter vector from the BERT model;
a start-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the start-word parameter vector and apply softmax to the results, obtaining each word's antecedent start-word probability;
and an end-word probability calculation unit, configured to take the dot product of each word's enhanced context vector representation with the end-word parameter vector and apply softmax to the results, obtaining each word's antecedent end-word probability.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the big-data-based zero-reference resolution method according to any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the big-data-based zero-reference resolution method according to any one of claims 1 to 5.
CN202010099118.XA 2020-02-18 2020-02-18 Zero-reference resolution method, device, equipment and medium based on big data Pending CN111401035A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010099118.XA CN111401035A (en) 2020-02-18 2020-02-18 Zero-reference resolution method, device, equipment and medium based on big data
PCT/CN2020/123173 WO2021164293A1 (en) 2020-02-18 2020-10-23 Big-data-based zero anaphora resolution method and apparatus, and device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010099118.XA CN111401035A (en) 2020-02-18 2020-02-18 Zero-reference resolution method, device, equipment and medium based on big data

Publications (1)

Publication Number Publication Date
CN111401035A (en) 2020-07-10

Family

ID=71430335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010099118.XA Pending CN111401035A (en) 2020-02-18 2020-02-18 Zero-reference resolution method, device, equipment and medium based on big data

Country Status (2)

Country Link
CN (1) CN111401035A (en)
WO (1) WO2021164293A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511798B (en) * 2021-12-10 2024-04-26 安徽大学 Driver distraction detection method and device based on transformer

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294322A (en) * 2016-08-04 2017-01-04 哈尔滨工业大学 A kind of Chinese based on LSTM zero reference resolution method
CN109165386A (en) * 2017-08-30 2019-01-08 哈尔滨工业大学 A kind of Chinese empty anaphora resolution method and system
US10606931B2 (en) * 2018-05-17 2020-03-31 Oracle International Corporation Systems and methods for scalable hierarchical coreference
CN110162785A (en) * 2019-04-19 2019-08-23 腾讯科技(深圳)有限公司 Data processing method and pronoun clear up neural network training method
CN111401035A (en) * 2020-02-18 2020-07-10 平安科技(深圳)有限公司 Zero-reference resolution method, device, equipment and medium based on big data

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021164293A1 (en) * 2020-02-18 2021-08-26 平安科技(深圳)有限公司 Big-data-based zero anaphora resolution method and apparatus, and device and medium
CN112256868A (en) * 2020-09-30 2021-01-22 华为技术有限公司 Zero-reference resolution method, method for training zero-reference resolution model and electronic equipment
WO2022123400A1 (en) * 2020-12-10 2022-06-16 International Business Machines Corporation Anaphora resolution for enhanced context switching
US11645465B2 (en) 2020-12-10 2023-05-09 International Business Machines Corporation Anaphora resolution for enhanced context switching
GB2616805A (en) * 2020-12-10 2023-09-20 Ibm Anaphora resolution for enhanced context switching
CN112463942A (en) * 2020-12-11 2021-03-09 深圳市欢太科技有限公司 Text processing method and device, electronic equipment and computer readable storage medium
CN112633014A (en) * 2020-12-11 2021-04-09 厦门渊亭信息科技有限公司 Long text reference resolution method and device based on neural network
CN112633014B (en) * 2020-12-11 2024-04-05 厦门渊亭信息科技有限公司 Neural network-based long text reference resolution method and device

Also Published As

Publication number Publication date
WO2021164293A1 (en) 2021-08-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination