CN114330355A - Text processing method and device, electronic equipment and storage medium

Info

Publication number: CN114330355A
Application number: CN202011074414.0A
Authority: CN (China)
Legal status: Pending
Prior art keywords: text content, vector, word vector, text, word
Inventors: 冯帅, 李涛, 方高林, 邹红建
Applicant and assignee: Tencent Technology Shenzhen Co Ltd
Classification: Information Retrieval, Db Structures And Fs Structures Therefor
Abstract

The embodiments of the present application disclose a text processing method and apparatus, an electronic device, and a storage medium, belonging to the technical field of natural language processing. The method comprises the following steps: acquiring first text content and a first word vector; acquiring second text content and a second word vector; acquiring a target probability based on an attention mechanism, the first word vector, and the second word vector; and if the target probability is greater than a probability threshold, determining that the semantics of the first text content match the semantics of the second text content. Because the word vector of the second text content used for semantic matching with the first text content is stored in advance, the second text content does not need to be input into a model to obtain its word vector during actual matching; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.

Description

Text processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a text processing method and apparatus, an electronic device, and a storage medium.
Background
With the development of natural language processing technology, a computer can process acquired text content to find text content with similar or identical semantics. This requires acquiring the semantic similarity between texts, but in related methods for acquiring semantic similarity between texts, the efficiency of acquisition still needs to be improved.
Disclosure of Invention
In view of the foregoing, the present application provides a text processing method, apparatus, electronic device and storage medium to improve the foregoing problems.
In a first aspect, the present application provides a text processing method, including: acquiring first text content and a first word vector, wherein the first word vector is a word vector corresponding to the first text content; acquiring second text content and a second word vector, wherein the second text content is the text content which is currently used for semantic matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content; obtaining a target probability based on an attention mechanism, the first word vector and the second word vector, the target probability representing a semantic matching degree between the first text content and the second text content; and if the target probability value is larger than a probability threshold value, determining that the first text content is semantically matched with the second text content.
In a second aspect, the present application provides a text processing method, including: acquiring a text input by an information query interface as a first text content, and acquiring a first word vector, wherein the first word vector is a word vector corresponding to the first text content; acquiring second text content and a second word vector, wherein the second text content is used for being matched with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content; obtaining a target probability based on an attention mechanism, the first word vector and the second word vector, the target probability characterizing a semantic matching degree between the first text content and the second text content; if the target probability value is larger than a probability threshold value, determining that the first text content is semantically matched with the second text content; outputting information associated with the second text content to the information query interface.
In a third aspect, the present application provides a text processing apparatus, comprising: a first data acquisition unit, a second data acquisition unit, a probability acquisition unit, and a text matching unit. The first data acquisition unit is configured to acquire first text content and a first word vector, the first word vector being a word vector corresponding to the first text content. The second data acquisition unit is configured to acquire second text content and a second word vector, where the second text content is the text content currently used for semantic matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content. The probability acquisition unit is configured to obtain a target probability based on an attention mechanism, the first word vector, and the second word vector, the target probability representing the semantic matching degree between the first text content and the second text content. The text matching unit is configured to determine that the first text content semantically matches the second text content if the target probability is greater than a probability threshold.
In a fourth aspect, the present application provides a text processing apparatus, comprising: the device comprises a query data acquisition unit, a second data acquisition unit, a probability acquisition unit, a text matching unit and an information output unit. The query data acquisition unit is used for acquiring a text input by an information query interface as a first text content and acquiring a first word vector, wherein the first word vector is a word vector corresponding to the first text content; a second data obtaining unit, configured to obtain a second text content and a second word vector, where the second text content is a text content used for matching with the first text content, and the second word vector is a word vector corresponding to the second text content and stored in advance; a probability obtaining unit, configured to obtain a target probability based on an attention mechanism, the first word vector, and the second word vector, where the target probability represents a semantic matching degree between the first text content and the second text content; the text matching unit is used for determining semantic matching between the first text content and the second text content if the target probability value is greater than a probability threshold; and the information output unit is used for outputting the information associated with the second text content to the information query interface.
In a fifth aspect, the present application provides an electronic device comprising a processor and a memory; one or more programs are stored in the memory and configured to be executed by the processor to implement the methods described above.
In a sixth aspect, the present application provides a computer readable storage medium having program code stored therein, wherein the program code performs the above-mentioned method when executed by a processor.
According to the text processing method and apparatus, electronic device, and storage medium provided by the present application, first text content and its corresponding word vector are obtained as a first word vector; second text content used for semantic matching with the first text content and a corresponding second word vector are obtained; a target probability is then obtained based on an attention mechanism, the first word vector, and the second word vector, the target probability representing the semantic matching degree between the first text content and the second text content; and if the target probability is greater than a probability threshold, it is determined that the first text content semantically matches the second text content. Because the word vector of the second text content used for semantic matching with the first text content is stored in advance, the second text content does not need to be input into a model to obtain its word vector during actual matching; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application; other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application environment to which embodiments of the present application relate;
FIG. 2 is a flow chart illustrating a method of text processing according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating a method of text processing according to another embodiment of the present application;
FIG. 4 shows a flowchart of the steps involved in S250 of FIG. 3;
FIG. 5 is a flow chart illustrating a method of text processing according to yet another embodiment of the present application;
FIG. 6 is a flow chart illustrating a method of text processing according to yet another embodiment of the present application;
FIG. 7 is a schematic diagram illustrating three stages involved in a text processing method as proposed in the present application;
FIG. 8 is a flow chart illustrating a method of text processing according to yet another embodiment of the present application;
FIG. 9 is a schematic diagram showing an information query interface in an embodiment of the present application;
FIG. 10 is a schematic diagram illustrating information associated with a second text output in an information query interface in an embodiment of the present application;
FIG. 11 is a block diagram showing the structure of a text processing apparatus according to an embodiment of the present application;
FIG. 12 is a block diagram showing the structure of a text processing apparatus according to another embodiment of the present application;
FIG. 13 is a block diagram showing the structure of an electronic device for executing a text processing method according to an embodiment of the present application;
FIG. 14 illustrates a storage unit for storing or carrying program code for implementing a text processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines can perceive, reason, and make decisions.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Its infrastructure generally includes sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Its software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Among them, with the development of text processing technology in artificial intelligence technology, many scenes related to text recognition based on text processing technology as well as natural language processing technology appear. Such as a smart question and answer scenario. In the intelligent question-answering scene, a user can input a question which the user desires to know in a text mode, and the intelligent question-answering system can query a corresponding answer according to the question input by the user and feed the answer back to the user. For another example, in a news information search scenario, a user may input a topic desired to be known in a text form, and the search system may search for corresponding information according to the topic and feed the corresponding information back to the user.
However, after studying related text recognition methods in such scenarios, the inventors found that their recognition efficiency still needs to be improved. In related methods, the specific meaning of the text input by the user is recognized through semantic matching: the two texts to be matched (the text input by the user and the text used for matching against it) are both input into a model, and whether the two texts are semantically similar is then calculated from the model's output vectors. Consequently, every match requires running both texts through the model to obtain the output vectors, and the model consumes a certain amount of time in this operation, which delays text recognition and affects recognition efficiency. The delay grows further when many texts need to be compared: for example, to find the text that semantically matches the user's input among 100 texts, the user's input must be semantically matched against each of the 100 texts in turn.
Therefore, in order to improve upon the above problems, the inventors propose the text processing method, apparatus, electronic device, and storage medium provided by the present application. In the method, first text content and its corresponding word vector are obtained as a first word vector, and second text content currently used for semantic matching with the first text content and a corresponding second word vector are obtained; a target probability is then obtained based on an attention mechanism, the first word vector, and the second word vector, the target probability representing the semantic matching degree between the first text content and the second text content; and if the target probability is greater than the probability threshold, it is determined that the semantics of the first text content match the semantics of the second text content.
Thus, because the word vector of the second text content used for semantic matching with the first text content is stored in advance, the second text content does not need to be input into a model to obtain its word vector during actual matching; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.
Before further detailed description of the embodiments of the present application, an application environment related to the embodiments of the present application will be described.
As shown in fig. 1, fig. 1 is a schematic diagram of an application environment according to an embodiment of the present application. The system includes a client 110 and a server 120. The client 110 is configured to collect text input by a user and send the collected text to the server 120. After receiving the text input by the user, the server 120 further obtains the first word vector, the second text content, and the second word vector so as to execute the text processing method provided by the embodiment of the present application. If the text processing method identifies a semantic match between the first text content and the second text content, the server 120 returns information associated with the second text content to the client 110, and the client 110 displays the returned information after receiving it.
It should be noted that fig. 1 is an exemplary application environment, and the method provided in the embodiment of the present application may also be executed in other application environments. For example, the text processing methods provided by the embodiments of the present application may all be executed by the client 110.
It should be noted that the server 120 may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The electronic device on which the client 110 runs may be, but is not limited to, a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, or smart watch.
The following description will be made of terms related to the embodiments of the present application.
Word vector: the Word vector (Word embedding) is a vector corresponding to a Word or a phrase in the text content. Where a word vector characterizes the meaning of a single word or phrase itself.
Sentence vector: the office vector (sequence Embedding) is a vector corresponding to the entire text content, and represents the meaning expressed by the entire text content.
For example, for the text content "we go exercise" (a character-by-character rendering of a Chinese sentence such as "我们去运动"), the word vectors include one vector per character: vectors for "我" and "们" ("we"), a vector for "去" ("go"), and vectors for "运" and "动" ("exercise"). The sentence vector corresponding to "we go exercise" is obtained by performing a linear transformation on the word vectors of the individual characters, and this linearly transformed vector can represent the overall meaning of the text. The overall meaning can be understood as the meaning the text expresses when all the characters of the text content are combined as a whole.
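As a concrete illustration, here is a minimal NumPy sketch with made-up 4-dimensional word vectors; mean pooling stands in for the linear transformation that produces the sentence vector, since the application does not specify which linear transformation is used:

    import numpy as np

    # Hypothetical word vectors, one per character of a five-character
    # text such as "我们去运动" ("we go exercise").
    word_vectors = np.array([
        [0.2, 0.1, 0.5, 0.3],   # 1st character
        [0.1, 0.4, 0.2, 0.6],   # 2nd character
        [0.7, 0.3, 0.1, 0.2],   # 3rd character
        [0.3, 0.5, 0.4, 0.1],   # 4th character
        [0.2, 0.2, 0.6, 0.4],   # 5th character
    ])

    # One simple linear transformation yielding a sentence vector is
    # mean pooling over the word vectors.
    sentence_vector = word_vectors.mean(axis=0)
    print(sentence_vector.shape)    # (4,) -- one vector for the whole text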
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2, fig. 2 is a flowchart illustrating a text processing method according to an embodiment of the present application, where the method includes:
S110: Acquiring first text content and a first word vector, wherein the first word vector is a word vector corresponding to the first text content.
It should be noted that, in the embodiment of the present application, the first text content is a text content that needs to be semantically matched. In the embodiment of the present application, there may be multiple ways of obtaining the first text content.
As a mode, the text processing method provided in this embodiment may be triggered and executed by a semantic matching request triggered by a client, and if the semantic matching request carries text content that needs to be semantically matched, the semantic matching request may be analyzed to obtain the text content carried in the semantic matching request as a first text content.
Alternatively, the text content to be semantically matched may not be carried in the semantic matching request. In this way, after receiving the semantic matching request, response information may be returned to the client that sent the semantic matching request, and the client sends the text content that needs to be semantically matched after receiving the response information, so as to use the received text content that needs to be semantically matched as the first text content.
After the first text content is obtained, the word vector corresponding to the first text content may further be obtained as the first word vector. In this embodiment, the word vector corresponding to the first text content includes a word vector corresponding to each character in the first text content. For example, if the first text content is "apply for turnover" (a character-by-character rendering of a Chinese phrase such as "申请周转"), the corresponding first word vector includes one word vector for each of the characters "申" ("apply"), "请" ("please"), "周" ("week"), and "转" ("turn").
S120: and acquiring second text content and a second word vector, wherein the second text content is the text content which is currently used for semantic matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content.
In this embodiment, the second text content is the text content used for semantic matching with the first text content. During semantic matching, matching is performed based on the word vector of the first text content and the word vector of the second text content, so the word vector corresponding to the second text content must be obtained along with the second text content. Because this word vector is stored in advance, when it is needed it can be read directly from the specified storage area, without computing it in real time.
S130: and acquiring a target probability based on the attention mechanism, the first word vector and the second word vector, wherein the target probability represents the semantic matching degree between the first text content and the second text content.
It should be noted that the attention mechanism can enable the machine to pay more attention to the key content affecting the text semantics in the text content, and further update the word vector corresponding to the text content, so that the updated word vector can accurately express the meaning of the corresponding text content. In this embodiment, obtaining the target probability based on the attention mechanism, the first word vector, and the second word vector may be understood as inputting the first word vector and the second word vector into a model (attention model) based on the attention mechanism, and then updating the first word vector and the second word vector based on a vector output by the model based on the attention mechanism, so that the updated first word vector can more accurately express a meaning of the first text content, and correspondingly, the updated second word vector can more accurately express a meaning of the second text content.
S140: and if the target probability value is greater than the probability threshold value, determining that the semantics of the first text content are matched with the semantics of the second text content.
As a mode, a probability threshold value representing semantic matching of two texts may be preset, and after a target probability representing a semantic matching degree of a first text content and a second text content is obtained, the target probability is compared with the probability threshold value, and if the target probability value is greater than the probability threshold value, semantic matching of the first text content and the second text content is determined.
In the text processing method provided by this embodiment, first text content and its corresponding word vector are obtained as a first word vector; second text content currently used for semantic matching with the first text content and a corresponding second word vector are obtained; a target probability is then obtained based on an attention mechanism, the first word vector, and the second word vector, the target probability representing the semantic matching degree between the first text content and the second text content; and if the target probability is greater than the probability threshold, it is determined that the semantics of the first text content match the semantics of the second text content. Because the word vector of the second text content is stored in advance, the second text content does not need to be input into a model during actual matching; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.
Referring to fig. 3, fig. 3 is a flowchart illustrating a text processing method according to an embodiment of the present application, where the method includes:
S210: Acquiring first text content and a first word vector, wherein the first word vector is a word vector corresponding to the first text content.
Optionally, in this embodiment, the word vector of the first text content may be obtained through a specified model, for example a Bert model, an Albert model, or a RoBERTa model. As one manner, in the process of obtaining the first word vector through the specified model, each character in the first text content is first converted into an initial one-dimensional vector, which can be understood as the initial word vector of that character. The initial one-dimensional vectors corresponding to the characters are then input into the specified model to obtain the word vector corresponding to each character output by the model, and the word vectors output for the characters are combined to obtain the first word vector corresponding to the first text content. Optionally, the first word vector obtained by this combination may be in the form of a sequence.
It should be noted that, compared with the initial one-dimensional vectors, the word vectors output by the specified model fuse more of the context of the first text content, and can therefore express the meaning of each character more accurately. Illustratively, if the first text content is "I like to use Apple", the word "apple" could denote a kind of fruit or a mobile phone brand; combined with the context, the action "use" rather than "eat" indicates that the actually expressed intention refers to the phone brand. Accordingly, the initial one-dimensional vector for "apple" may represent the fruit, while the word vector output by the specified model after processing can represent the brand.
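For illustration, a sketch of how per-character word vectors could be obtained from a pretrained Bert model via the Hugging Face transformers library; the library choice and the bert-base-chinese checkpoint are assumptions for this example and are not specified by the application:

    import torch
    from transformers import BertModel, BertTokenizer

    # Assumed checkpoint; the application only says "a specified model".
    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    model = BertModel.from_pretrained("bert-base-chinese")
    model.eval()

    text = "申请周转"  # the first text content from the example above
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # One contextualized vector per token (including [CLS] and [SEP]);
    # these play the role of the first word vector.
    word_vectors = outputs.last_hidden_state[0]
    print(word_vectors.shape)       # (sequence_length, hidden_size)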
S220: and acquiring second text content and a second word vector, wherein the second text content is the text content which is currently used for semantic matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content.
The second text content is used for semantic matching with the first text content. It should be noted that, in this embodiment, the second text content may be obtained based on the first text content, and the semantics of the second text content and the first text content are then matched.
In this embodiment, there may be various ways of obtaining the second text content based on the first text content.
As one way, in the process of acquiring the second text content based on the first text content, a text whose content is the same as or similar to that of the first text content may be taken as the second text content. It should be noted that two texts with the same or similar content may not actually have the same semantics, and whether the first text content and the second text content actually match semantically is determined in the subsequent steps. For example, the two texts "apply for turnover" and "turnover cannot be applied for" share the content "apply" and "turnover", but their semantics differ; the subsequent steps of this embodiment can identify whether two such texts are actually semantically the same.
S230: each element in the attention matrix is derived based on the first word vector and the second word vector.
It should be noted that the attention matrix may be included in the model based on the attention mechanism, and in this embodiment the attention matrix may be constructed based on the first word vector and the second word vector. Illustratively, the first word vector and the second word vector are in the form of sequences: the first word vector is $a = (a_1, a_2, \ldots, a_{l_a})$ and the second word vector is $b = (b_1, b_2, \ldots, b_{l_b})$, where each $a_i$ characterizes the word vector of the corresponding word in the first text content. For example, $a_1$ characterizes the word vector corresponding to the first word in the first text content, $a_2$ the word vector corresponding to the second word, and $a_{l_a}$ the word vector corresponding to the $l_a$-th word. Similarly, $b_1$ characterizes the word vector corresponding to the first word in the second text content, $b_2$ the word vector corresponding to the second word, and $b_{l_b}$ the word vector corresponding to the $l_b$-th word in the second text content.
Alternatively, each element in the attention matrix may be obtained based on the following formula:
$$e_{ij} := F'(a_i, b_j) := (a_i)^{\mathrm{T}} b_j$$
where $e_{ij}$ characterizes the element in the $i$-th row and $j$-th column of the attention matrix. Correspondingly, this element is the product of the transpose of the $i$-th word vector in the sequence-form first word vector and the $j$-th word vector in the sequence-form second word vector.
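A minimal NumPy sketch of this step, with random matrices standing in for real word vectors; the sizes la, lb, and m are illustrative assumptions:

    import numpy as np

    la, lb, m = 4, 6, 8         # illustrative sequence lengths and vector dimension
    a = np.random.rand(la, m)   # first word vector, one row per word
    b = np.random.rand(lb, m)   # second word vector, one row per word

    # e[i, j] = (a_i)^T b_j: the dot product between the i-th word vector
    # of the first text content and the j-th of the second text content.
    e = a @ b.T
    print(e.shape)              # (la, lb) -- the attention matrix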
S240: a reference vector is derived based on each element in the attention matrix and the second word vector.
The reference vector can be understood as a vector output by the model in which the attention matrix is located. The reference vector characterizes the attention relation between each word in the first text content and each word in the second text content, and can therefore be used to update the first word vector and the second word vector, so that the updated first word vector more accurately expresses the actual meaning of the first text content, and the updated second word vector more accurately expresses the actual meaning of the second text content.
Alternatively, the reference vector generated based on the attention mechanism may be calculated based on the following formula:
$$\mathrm{lhs\_aligned}_i = \sum_{j=1}^{l_b} \frac{\exp(e_{ij})}{\sum_{k=1}^{l_b} \exp(e_{ik})} \, b_j$$
where $\mathrm{lhs\_aligned}_i$ characterizes the obtained reference vector, and $\exp$ is the exponential function with the natural constant $e$ as its base; for example, $\exp(e_{ij})$ is $e$ raised to the power $e_{ij}$. In this formula, the ratio of $\exp(e_{ij})$ to $\sum_{k} \exp(e_{ik})$ is calculated first, this ratio is then multiplied by the $j$-th word vector $b_j$ in the second word vector, and the resulting products are accumulated over $j$ to obtain the reference vector.
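Continuing the NumPy sketch above, the reference vector is the softmax-weighted accumulation of the second word vectors:

    # Row-wise softmax of the attention matrix: the weight of b_j for word i
    # is exp(e_ij) divided by the sum of exp(e_ik) over k.
    weights = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)

    # lhs_aligned[i] accumulates weights[i, j] * b[j] over j.
    lhs_aligned = weights @ b
    print(lhs_aligned.shape)    # (la, m)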
S250: and calculating to obtain the target probability based on the reference vector, the first word vector and the second word vector.
As one way, as shown in fig. 4, calculating the target probability based on the reference vector, the first word vector and the second word vector includes:
S251: Splicing the reference vector and the first word vector to obtain a first vector to be processed.
The reference vector and the first word vector are spliced, and it can be understood that the first word vector is updated based on the reference vector, and then the corresponding first to-be-processed vector can be understood as a vector after the first word vector is updated. Because of the processing of the aforementioned attention mechanism, the first vector to be processed can express the semantic content actually expressed by the first text content more accurately than the first word vector.
It should be noted that, in the present embodiment, when performing vector splicing, the vector elements at corresponding positions in the two vectors are spliced, where a vector comprises a plurality of vector elements. For example, $a_1, a_2, \ldots, a_{l_a}$ are all vector elements of the first word vector $a$, and $b_1, b_2, \ldots, b_{l_b}$ are all vector elements of the second word vector $b$; positionally corresponding vector elements are then two vector elements that occupy the same position in their respective vectors. For example, $a_1$ is the vector element at the first position of the first word vector; if the vector element at the first position of the reference vector is $c_1$, then $a_1$ and $c_1$ are corresponding vector elements, and in S251 the vector element $c_1$ is spliced with the vector element $a_1$. The first vector to be processed is thus obtained by splicing the reference vector with the first word vector, element by element, at corresponding positions.
It should be noted that splicing two vector elements may be understood as taking the content of each of the two vector elements as the content of the spliced vector element. Illustratively, if one vector element is $[a_{11}, a_{12}, a_{13}, a_{14}]$ and another vector element is $[b_{11}, b_{12}, b_{13}, b_{14}]$, then the vector element resulting from the concatenation is $[a_{11}, a_{12}, a_{13}, a_{14}, b_{11}, b_{12}, b_{13}, b_{14}]$.
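Continuing the same sketch, and assuming for illustration that the reference vector has one entry per word of the first text content (so that lhs_aligned from the earlier sketch plays its role), the position-wise splicing becomes a row-wise concatenation:

    # Position i of the result is the concatenation of a[i] and lhs_aligned[i],
    # e.g. [a11..a14] spliced with [c11..c14] gives [a11..a14, c11..c14].
    first_to_process = np.concatenate([a, lhs_aligned], axis=1)
    print(first_to_process.shape)   # (la, 2 * m)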
S252: and splicing the vector after the reference vector is rotated and the second word vector to obtain a second vector to be processed.
Correspondingly, the reference vector and the second word vector are spliced, which can be understood as that the second word vector is updated based on the reference vector, and then the corresponding second to-be-processed vector can be understood as the updated vector of the second word vector. Because of the processing of the aforementioned attention mechanism, the second vector to be processed can express the semantic content actually expressed by the second text content more accurately than the second word vector. In the process of generating the second to-be-processed vector, the generated reference vector is transposed, and then the transposed vector and the second word vector are spliced to obtain the second to-be-processed vector.
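A matching sketch for S252 under the same simplifying assumptions; here the transposed attention matrix is used to align each word of the second text content with the first text content before splicing, a common pattern in attention-based matching models rather than a literal transcription of the patent's transposition step:

    # Transpose the attention matrix so each word of the second text content
    # attends over the words of the first, then splice with b.
    weights_t = np.exp(e.T) / np.exp(e.T).sum(axis=1, keepdims=True)
    rhs_aligned = weights_t @ a                        # (lb, m)
    second_to_process = np.concatenate([b, rhs_aligned], axis=1)
    print(second_to_process.shape)  # (lb, 2 * m)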
S253: pooling the first vector to be processed and the second vector to be processed, and carrying out nonlinear processing on the result of the pooling processing to obtain a target probability.
It should be noted that, the first to-be-processed vector and the second to-be-processed vector still include word vectors corresponding to each word in the text content. For example, the word vector corresponding to each word in the first text content is still included in the first to-be-processed vector, but the word vector corresponding to each word included in the first to-be-processed vector can express the actual meaning of each word in the context of the first text content more accurately than the word vector corresponding to each word included in the first word vector. Correspondingly, the word vector corresponding to each word in the second text content is still included in the second to-be-processed vector, but the actual meaning of each word in the context of the second text content can be more accurately expressed than the word vector corresponding to each word included in the second word vector.
In this embodiment, the pooling processing is performed on the first to-be-processed vector to integrate the features of each word in the first text content into the features of the first text content as a whole, so as to obtain a vector capable of representing the whole meaning of the first text content. Correspondingly, the pooling processing is performed on the second to-be-processed vector so as to integrate the features of each word in the second text content into the overall features of the second text content, and further obtain a vector capable of representing the overall meaning of the second text content.
Optionally, pooling the first to-be-processed vector and the second to-be-processed vector, and performing nonlinear processing on a result of the pooling to obtain a target probability, includes: performing first pooling on the first vector to be processed to obtain a first vector to be calculated; performing second pooling on the second vector to be processed to obtain a second vector to be calculated; carrying out nonlinear processing on the first vector to be calculated to obtain a first probability value, and carrying out nonlinear processing on the second vector to be calculated to obtain a second probability value; an average of the first probability value and the second probability value is calculated as a target probability.
Wherein the first pooling process and the second pooling process may be the same. Optionally, performing a first pooling process on the first to-be-processed vector to obtain a first to-be-calculated vector, including: carrying out global maximum pooling on the first vector to be processed to obtain a first maximum pooling vector; carrying out global average pooling on the first vector to be processed to obtain a first average pooling vector; and splicing the first maximum pooling vector and the first average pooling vector to obtain a first vector to be calculated.
Optionally, performing second pooling on the second to-be-processed vector to obtain a second to-be-calculated vector, including: carrying out global maximum pooling on the second vector to be processed to obtain a second maximum pooling vector; carrying out global average pooling on the second vector to be processed to obtain a second average pooling vector; and splicing the second maximum pooling vector and the second average pooling vector to obtain a second vector to be calculated.
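A sketch of the pooling step, continuing the same assumptions: global max pooling and global average pooling collapse the word dimension, and their results are spliced into the vector to be calculated:

    def pool(v):
        # v: (num_words, feature_dim) -- a vector to be processed
        max_pooled = v.max(axis=0)    # global max pooling
        avg_pooled = v.mean(axis=0)   # global average pooling
        return np.concatenate([max_pooled, avg_pooled])

    first_to_calculate = pool(first_to_process)
    second_to_calculate = pool(second_to_process)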
S260: and comparing the target probability with a probability threshold.
S261: and if the target probability value is greater than the probability threshold value, determining that the semantics of the first text content are matched with the semantics of the second text content.
S262: and if the target probability value is not greater than the probability threshold, acquiring the next second text content, and performing semantic matching on the next second text content and the first text content.
It should be noted that each time a second text content is semantically matched with the first text content, the matching may be performed based on the foregoing S230 to S260.
In the text processing method provided by this embodiment, the word vector of the second text content used for semantic matching with the first text content is stored in advance, so that in the actual semantic matching process the second text content does not need to be input into the model to obtain its word vector; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching. In addition, in this embodiment, each element in the attention matrix is first constructed based on the first word vector and the second word vector, and the target probability representing the similarity between the first text content and the second text content is then obtained based on the constructed attention matrix. Because the attention mechanism focuses on key content, the calculated target probability better represents the probability that the key content of the first text content is similar to the key content of the second text content, which further improves the accuracy of semantic matching between the first text content and the second text content.
Referring to fig. 5, fig. 5 is a flowchart illustrating a text processing method according to an embodiment of the present application, where the method includes:
S310: Acquiring first text content and a first word vector, wherein the first word vector is a word vector corresponding to the first text content.
S320: and acquiring text content to be matched and word vectors which are stored in advance and correspond to the text content to be matched from the specified storage area based on the sentence vectors of the first text content.
It should be noted that, in the embodiment of the present application, the second text content may be obtained from the text contents to be matched, so the text contents to be matched are acquired first. Optionally, the designated storage area is a storage area corresponding to a pre-established knowledge base, and the text contents to be matched may be obtained from the knowledge base by way of recall. Optionally, acquiring the text contents to be matched and their pre-stored word vectors from the specified storage area, with the sentence vector of the first text content as the reference for the recall, includes: performing similarity matching between the sentence vector of the first text content and the sentence vectors of the text contents in the designated storage area; and taking the successfully matched text contents as the text contents to be matched, and acquiring their corresponding word vectors from the specified storage area.
In the process of similarity matching between the sentence vector of the first text content and the sentence vectors of the text contents in the designated storage area, the degree of similarity between texts can be calculated based on cosine similarity or Manhattan distance, and the texts with the highest similarity are recalled as the texts to be matched. For example, if the 100 texts with the highest similarity are configured to be recalled, then 100 texts are obtained as the texts to be matched, and each of the 100 texts is used in turn as the second text content so as to obtain the target probability between each of them and the first text content.
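A self-contained sketch of the recall step, assuming sentence vectors for the knowledge base are precomputed and cosine similarity is used (Manhattan distance would work analogously); the function name and k = 100 are illustrative:

    import numpy as np

    def recall_candidates(query_sentence_vec, kb_sentence_vecs, k=100):
        # Return indices of the k stored texts whose sentence vectors are
        # most cosine-similar to the query's sentence vector.
        q = query_sentence_vec / np.linalg.norm(query_sentence_vec)
        kb = kb_sentence_vecs / np.linalg.norm(kb_sentence_vecs, axis=1, keepdims=True)
        similarities = kb @ q            # one cosine similarity per stored text
        return np.argsort(-similarities)[:k]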
In this embodiment, the word vector corresponding to each text content to be matched is also pre-stored in the designated storage area; for example, the text content to be matched and its word vector are both stored in the knowledge base. The word vector can therefore be obtained at the same time the text content to be matched is recalled, without computing it through a model in real time and without reading it from a storage area outside the knowledge base. The recalled word vector can be cached in local memory together with the text content to be matched, so that when the second text content is determined, the second word vector can be read directly from local memory. This improves the rate at which the second word vector is obtained, and further improves the speed of the complete semantic matching process.
It should be noted that, although the sentence vector of the first text content is used in the recall of the texts to be matched, the sentence vector used in recall expresses the overall meaning of the first text content only roughly, whereas the vector obtained through the pooling in S253 (for example, the first vector to be calculated) expresses the overall meaning of the first text content relatively accurately. The first vector to be calculated can therefore express the overall meaning of the first text content more accurately than the sentence vector used in the recall process.
As a mode, before the text content to be matched is acquired from the specified storage area based on the sentence vector of the first text content, the method further includes: acquiring the identification of each word in the first text content based on the dictionary; and obtaining a word vector and a sentence vector of the first text content based on the Bert model and the identification of each word.
S330: and taking the text content which is used for matching with the first text content currently in the text contents to be matched as second text content, and acquiring a word vector of the second text content as a second word vector.
S340: and acquiring a target probability based on the attention mechanism, the first word vector and the second word vector, wherein the target probability represents the semantic matching degree between the first text content and the second text content.
S350: and if the target probability value is greater than the probability threshold value, determining that the semantics of the first text content are matched with the semantics of the second text content.
In the text processing method provided by this embodiment, the word vector of the second text content used for semantic matching with the first text content is stored in advance, so the second text content does not need to be input into a model during actual matching; the pre-stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching. In this embodiment, the second text content is selected from the texts to be matched recalled from the designated storage area based on the first text content, and the word vector and sentence vector of each text content are stored in the designated storage area in advance. The word vector corresponding to a text to be matched can therefore be obtained at the same time that the text is recalled, so that when the pre-stored word vector corresponding to the second text content is needed, it not only requires no model inference but can also be read directly from local storage, further improving the efficiency of semantic matching.
Referring to fig. 6, fig. 6 shows a text processing method according to an embodiment of the present application, the method includes:
S410: Predefined textual content is obtained.
It should be noted that the predefined text content is a text content for matching with the first text content input by the user. In the process of generating the predefined text contents, information associated with each predefined text content can be further generated, so that when the predefined text contents are identified to be matched with the semantics of the first text content, the information associated with the predefined text contents is fed back to the user.
For example, in a smart question-and-answer scenario, a predefined question may be obtained as predefined textual content and an answer corresponding to the predefined question may be generated. Wherein the answer may be understood as information associated with a predefined question. In the intelligent question-answering scenario, the question input by the user serves as first text content, and then an answer associated with a predefined question semantically matched with the question input by the user can be fed back to the user. For another example, in a news information search scenario, predefined news topics may be obtained as predefined content, and consultation content corresponding to each predefined news topic may be generated. Wherein the advisory content can be understood as information associated with a predefined news topic. And after the news theme input by the user is taken as the first text content, the consultation content corresponding to the predefined news theme semantically matched with the news theme input by the user can be fed back to the user.
S420: and acquiring a word vector and a sentence vector corresponding to the predefined text content.
Optionally, the word vector and the sentence vector corresponding to the predefined text content may be obtained through the Bert model indicated in the foregoing embodiment.
S430: the predefined text content, corresponding word vectors and sentence vectors are stored in a knowledge base.
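A sketch of this offline stage, with a plain Python dict standing in for the knowledge base; the field names and the encode callback are assumptions, since the application only requires that each predefined text, its word vectors, its sentence vector, and its associated information be stored together:

    knowledge_base = {}

    def index_predefined_text(text, associated_info, encode):
        # `encode` is assumed to return (word_vectors, sentence_vector),
        # for example from the Bert sketch shown earlier.
        word_vectors, sentence_vector = encode(text)
        knowledge_base[text] = {
            "word_vectors": word_vectors,        # reused at match time, no model call
            "sentence_vector": sentence_vector,  # used by the recall step (S460)
            "associated_info": associated_info,  # e.g. the answer to a predefined question
        }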
S440: first text content is obtained.
S450: and acquiring a first word vector of the first text content and a sentence vector of the first text content.
S460: and recalling the text content to be matched and the word vector corresponding to the text content to be matched from the knowledge base based on the sentence vector of the first text content.
The text content recall from the knowledge base can be understood as the acquisition of the text content from the storage area corresponding to the knowledge base. In this embodiment, the text content in the knowledge base is the predefined text content. Recalling the text content to be matched from the knowledge base can be understood as recalling the text content to be matched from the predefined text content.
S470: and acquiring second text content from the text content to be matched and acquiring a second word vector.
S480: acquiring a target probability based on an attention mechanism, the first word vector and the second word vector, wherein the target probability represents the semantic matching degree between the first text content and the second text content;
S490: And if the target probability value is greater than the probability threshold value, determining that the semantics of the first text content are matched with the semantics of the second text content.
The flow of processing according to the present embodiment will be described again with reference to fig. 7.
Three phases may be included for the process flow shown in fig. 7, phase 80, phase 81, and phase 82, respectively. The first stage 80 is executed, and the stage 80 corresponds to the aforementioned S410 to S430. Further, in stage 80, predefined text content is input into a model (e.g., Bert model), and word vectors 12 and sentence vectors of the predefined text content output by the model are obtained and stored in a knowledge base.
Stage 81 is then performed, corresponding to S450 through S470 described above. In stage 81, the first text content is obtained and input into a model (for example, a Bert model) to obtain the first word vector 10 and the sentence vector corresponding to the first text content. Optionally, the second text content may be recalled from the knowledge base based on the sentence vector of the first text content, and the second word vector 12 may be obtained along with it.
During the execution of stage 82, the first word vector 10 is input into the attention matrix based on the direction indicated by arrow 13, and the second word vector 12 is input into the attention matrix based on the direction indicated by arrow 14, so as to obtain the reference vector 15 output by the model where the attention matrix is located.
Then, the first word vector 10 is input to the splicing layer along the direction indicated by arrow 20, and the reference vector 15 is input to the splicing layer along the direction indicated by arrow 21, to obtain the first vector to be processed of the foregoing embodiment. Likewise, the second word vector 12 is input to the splicing layer along the direction indicated by arrow 22, and the vector 16 obtained by transposing the reference vector 15 is input along the direction indicated by arrow 23, to obtain the second vector to be processed of the foregoing embodiment.
Optionally, if the size of the first word vector 10 is $S_a \times M$ and the size of the second word vector 12 is $S_b \times M$, then the size of the reference vector 15 may be $S_a \times S_b$, and the size of the vector 16 obtained by transposing the reference vector 15 may be $S_b \times S_a$. After the first word vector 10 and the reference vector 15 are spliced, the size of the resulting first vector to be processed may be $S_a \times (M + S_b)$; correspondingly, the size of the second vector to be processed may be $S_b \times (M + S_a)$.
For the first vector to be processed, the first vector to be processed is input into the global maximum pooling layer 40 for pooling based on the direction of the arrow 30 to obtain a first maximum pooling vector, and is input into the global average pooling layer 41 for pooling based on the direction of the arrow 31 to obtain a first average pooling vector. For the second vector to be processed, the global maximum pooling layer 40 is inputted based on the direction shown by the arrow 32 for pooling to obtain a second maximum pooling vector, and the global average pooling layer 41 is inputted based on the direction shown by the arrow 33 for pooling to obtain a second average pooling vector.
After the first maximum pooling vector is input to the splicing layer for splicing based on the direction indicated by the arrow 42 and the first average pooling vector is input to the splicing layer based on the direction indicated by the arrow 48, the first vector to be calculated obtained by splicing is input to the aggregation layer (Aggregate) 50 and the aggregation layer (Aggregate) 51 along the directions indicated by the arrow 43 and the arrow 49, respectively. After the second maximum pooling vector is input to the splicing layer for splicing based on the direction indicated by the arrow 46 and the second average pooling vector is input to the splicing layer based on the direction indicated by the arrow 44, the second vector to be calculated obtained by splicing is also input to the aggregation layer 50 and the aggregation layer 51, along the directions indicated by the arrow 45 and the arrow 47, respectively. In the aggregation layer 50, aggregation is performed with the first vector to be calculated in front and the second vector to be calculated behind. In the aggregation layer 51, aggregation is performed with the second vector to be calculated in front and the first vector to be calculated behind.
The vector obtained by the aggregation layer 50 sequentially passes through the fully connected layer (Dense) 60 and the nonlinear layer (Sigmoid) 70 to obtain the first probability value, and the vector obtained by the aggregation layer 51 sequentially passes through the fully connected layer 60 and the nonlinear layer 70 to obtain the second probability value. The first probability value and the second probability value are averaged to obtain the target probability in the foregoing embodiment; if the target probability is greater than the probability threshold, it is determined that the first text content and the second text content are semantically matched. The fully connected layer 60 is configured to fully connect and reduce the dimensions of the received vectors and output the reduced vectors to the nonlinear layer 70.
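A minimal sketch of this pooling and scoring head is given below. The tensor sizes and the single fully connected layer shared by both branches are illustrative assumptions drawn from the description above, not a definitive implementation of the application.

```python
# Hedged sketch of the pooling, aggregation, and scoring head.
import torch
import torch.nn as nn

S_a, S_b, M = 6, 4, 768
first_to_process = torch.rand(S_a, M + S_b)
second_to_process = torch.rand(S_b, M + S_a)

def pool(v):
    # Global max pooling and global average pooling over the word axis,
    # spliced into one vector to be calculated.
    return torch.cat([v.max(dim=0).values, v.mean(dim=0)], dim=-1)

first_to_calc = pool(first_to_process)     # size 2 * (M + S_b)
second_to_calc = pool(second_to_process)   # size 2 * (M + S_a)

# Aggregation layer 50: first vector in front; layer 51: second in front.
agg_50 = torch.cat([first_to_calc, second_to_calc])
agg_51 = torch.cat([second_to_calc, first_to_calc])

dense = nn.Linear(agg_50.numel(), 1)       # fully connected layer 60
p1 = torch.sigmoid(dense(agg_50))          # first probability value
p2 = torch.sigmoid(dense(agg_51))          # second probability value
target_probability = float((p1 + p2) / 2)  # averaged target probability
```

Averaging the two branch outputs makes the score symmetric with respect to the order of the two texts, which is one plausible reading of why both aggregation orders are used.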
Referring to fig. 8, fig. 8 shows a text processing method according to an embodiment of the present application. The method includes:
S510: acquiring a text input through an information query interface as first text content, and acquiring a first word vector, wherein the first word vector is a word vector corresponding to the first text content.
The information queried through the information query interface may differ in different scenes. For example, in an intelligent question answering scenario, the information query interface may be an interface for question input, and the queried information may be the answer corresponding to a question. For another example, in a news search scenario, the information query interface may be an interface for inputting a news topic, and the queried information may be the news content corresponding to the news topic.
Illustratively, in the information query interface shown in fig. 9, the user inputs "why has the product I redeemed not been credited", and "why has the product I redeemed not been credited" is then taken as the first text content.
S520: and acquiring second text content and a second word vector, wherein the second text content is used for matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content.
S530: and acquiring a target probability based on the attention mechanism, the first word vector and the second word vector, wherein the target probability represents the semantic matching degree between the first text content and the second text content.
S540: and if the target probability value is greater than the probability threshold value, determining that the semantics of the first text content are matched with the semantics of the second text content.
S550: and outputting the information associated with the second text content to an information query interface.
For example, as shown in fig. 10, if the current second text content is "check-out time" and it is determined that "check-out time" is semantically matched with the first text content, the information associated with "check-out time" is output to the information query interface shown in fig. 9, and the interface changes to the display state shown in fig. 10. In the interface shown in fig. 10, the answer corresponding to "check-out time" is displayed.
In the text processing method provided by this embodiment, after the input first text content is acquired from the information query interface, the word vector of the second text content used for semantic matching with the first text content has already been stored in advance. Therefore, in the actual semantic matching process, the second text content does not need to be input into the model to obtain its word vector; the previously stored word vector can be used directly. This reduces the time consumed by text processing and improves the efficiency of semantic matching, so that after the user inputs the content to be queried in the information query interface, the corresponding feedback information can be displayed more quickly.
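To tie S510 through S550 together, the following hedged sketch composes the earlier snippets into a single query flow. Here encode, recall, and match_probability are hypothetical wrappers around the BERT, recall, and attention-head sketches given earlier, and the "answer" field and the 0.5 threshold are assumptions for illustration only.

```python
# Hedged composition of S510-S550; all helper names are hypothetical:
#   encode(text)            -> (word_vectors, sentence_vector), see BERT sketch
#   recall(sentence_vector) -> candidate knowledge-base entries, see recall sketch
#   match_probability(a, b) -> target probability, see attention-head sketch
def answer_query(query_text, threshold=0.5):
    word_vectors, sentence_vector = encode(query_text)              # S510
    best_prob, best_entry = 0.0, None
    for entry in recall(sentence_vector):                           # S520 (recall)
        prob = match_probability(word_vectors, entry["word_vectors"])  # S530
        if prob > best_prob:
            best_prob, best_entry = prob, entry
    if best_entry is not None and best_prob > threshold:            # S540
        return best_entry.get("answer")                             # S550
    return None  # no semantic match above the probability threshold
```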
Referring to fig. 11, fig. 11 is a diagram illustrating a text processing apparatus 600 according to an embodiment of the present application, where the apparatus 600 includes:
the first data obtaining unit 610 is configured to obtain a first text content and a first word vector, where the first word vector is a word vector corresponding to the first text content.
The second data obtaining unit 620 is configured to obtain a second text content and a second word vector, where the second text content is a text content currently used for performing semantic matching with the first text content, and the second word vector is a word vector corresponding to the second text content and stored in advance.
As a mode, the second data obtaining unit 620 is specifically configured to obtain, based on a sentence vector of the first text content, a text content to be matched and a word vector, which is stored in advance and corresponds to the text content to be matched, from a specified storage area; and taking the text content which is currently used for matching with the first text content in the text contents to be matched as second text content.
Optionally, the second data obtaining unit 620 is specifically configured to perform similar matching on the sentence vector of the first text content and the sentence vector of the text content in the specified storage area; and taking the successfully matched text content as the text content to be matched, and acquiring a word vector corresponding to the text content to be matched from the specified storage area.
In addition, the first data obtaining unit 610 is further configured to obtain an identifier of each word in the first text content based on the dictionary; and obtaining a word vector and a sentence vector of the first text content based on the Bert model and the identification of each word.
A probability obtaining unit 630, configured to obtain a target probability based on the attention mechanism, the first word vector, and the second word vector, where the target probability represents a semantic matching degree between the first text content and the second text content.
And the text matching unit 640 is configured to determine that the first text content is semantically matched with the second text content if the target probability value is greater than the probability threshold.
As one mode, the probability obtaining unit 630 is specifically configured to obtain each element in the attention matrix based on the first word vector and the second word vector; obtaining a reference vector based on each element in the attention matrix and the second word vector; and calculating to obtain the target probability based on the reference vector, the first word vector and the second word vector.
Optionally, the probability obtaining unit 630 is specifically configured to splice the reference vector and the first word vector to obtain a first vector to be processed; splice the vector obtained by transposing the reference vector with the second word vector to obtain a second vector to be processed; and pool the first vector to be processed and the second vector to be processed and perform nonlinear processing on the result of the pooling processing to obtain the target probability.
Optionally, the probability obtaining unit 630 is specifically configured to perform a first pooling process on the first to-be-processed vector to obtain a first to-be-calculated vector; performing second pooling on the second vector to be processed to obtain a second vector to be calculated; carrying out nonlinear processing on the first vector to be calculated to obtain a first probability value, and carrying out nonlinear processing on the second vector to be calculated to obtain a second probability value; an average of the first probability value and the second probability value is calculated as a target probability.
As one mode, the probability obtaining unit 630 is specifically configured to perform global maximum pooling on the first vector to be processed to obtain a first maximum pooled vector; carrying out global average pooling on the first vector to be processed to obtain a first average pooling vector; and splicing the first maximum pooling vector and the first average pooling vector to obtain a first vector to be calculated.
As a mode, the probability obtaining unit 630 is specifically configured to perform global maximum pooling on the second vector to be processed to obtain a second maximum pooled vector; carrying out global average pooling on the second vector to be processed to obtain a second average pooling vector; and splicing the second maximum pooling vector and the second average pooling vector to obtain a second vector to be calculated.
Referring to fig. 12, fig. 12 is a diagram illustrating a text processing apparatus 700 according to an embodiment of the present application, where the apparatus 700 includes:
the query data obtaining unit 710 is configured to obtain a text input by the information query interface as a first text content, and obtain a first word vector, where the first word vector is a word vector corresponding to the first text content.
The second data obtaining unit 620 is configured to obtain a second text content and a second word vector, where the second text content is a text content used for matching with the first text content, and the second word vector is a word vector that is pre-stored and corresponds to the second text content.
A probability obtaining unit 630, configured to obtain a target probability based on the attention mechanism, the first word vector, and the second word vector, where the target probability represents a semantic matching degree between the first text content and the second text content.
And the text matching unit 640 is configured to determine that the first text content is semantically matched with the second text content if the target probability value is greater than the probability threshold.
And an information output unit 720, configured to output information associated with the second text content to the information query interface.
According to the text processing apparatus described above, in the process of performing semantic matching between the first text content and the second text content, the word vector of the second text content used for semantic matching with the first text content is stored in advance. Therefore, in the actual semantic matching process, the second text content does not need to be input into the model to obtain its word vector; the previously stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.
It should be noted that the device embodiment and the method embodiment in the present application correspond to each other, and specific principles in the device embodiment may refer to the contents in the method embodiment, which is not described herein again.
An electronic device provided by the present application will be described below with reference to fig. 13.
Referring to fig. 13, based on the foregoing text processing method, an embodiment of the present application further provides an electronic device 200 including a processor 102 capable of executing the foregoing text processing method. The electronic device 200 may be a smart phone, a tablet computer, a portable computer, or the like. The electronic device 200 also includes a memory 104 and a network module 106. The memory 104 stores programs that can execute the content of the foregoing embodiments, and the processor 102 can execute the programs stored in the memory 104.
The processor 102 may include one or more cores for processing data and a message matrix unit. The processor 102 connects various components throughout the electronic device 200 using various interfaces and lines, and performs various functions of the electronic device 200 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 104 and invoking data stored in the memory 104. Alternatively, the processor 102 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 102 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing display content; and the modem is used to handle wireless communication. It is understood that the modem may also not be integrated into the processor 102 and may instead be implemented by a separate communication chip.
The memory 104 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 104 may be used to store instructions, programs, code sets, or instruction sets. The memory 104 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may also store data created by the electronic device 200 in use, such as a phonebook, audio and video data, and chat log data.
The network module 106 is configured to receive and transmit electromagnetic waves, and implement interconversion between the electromagnetic waves and the electrical signals, so as to communicate with a communication network or other devices, for example, the network module 106 may transmit broadcast data, and may also analyze broadcast data transmitted by other devices. The network module 106 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. The network module 106 may communicate with various networks, such as the internet, an intranet, a wireless network, or with other devices via a wireless network. The wireless network may comprise a cellular telephone network, a wireless local area network, or a metropolitan area network. For example, the network module 106 may interact with a base station.
Referring to fig. 14, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 1100 stores program code that can be called by a processor to perform the methods described in the foregoing method embodiments.
The computer-readable storage medium 1100 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 1100 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 1100 has storage space for program code 1110 that performs any of the method steps described above. The program code can be read from or written into one or more computer program products. The program code 1110 may be compressed in a suitable form, for example.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the text processing method described above.
Two tasks, question-pair matching and semantic similarity calculation (for Chinese and English, respectively), were tested, and the results are as follows:
LCQMC (Chinese) test (ACC):
Google BERT-base native model: 86.90%
Google BERT-base Siamese network: 84.08%
Text processing method in the embodiment of the present application: 86.30%
STS-B (English) test (Pearson/Spearman):
Google BERT-medium native model: 81.9% / 80.6%
Google BERT-medium Siamese network: 78.4% / 77.6%
Text processing method in the embodiment of the present application: 81.3% / 80.1%
Wherein LCQMC is the Chinese data set used in the test, and STS-B is the English data set used in the test. In the test based on the LCQMC data set, the accuracy of the text processing method in the embodiment of the present application may reach 86.30%. In the test based on the STS-B data set, the accuracy may reach 81.3% in the case where the Pearson (Pearson) correlation coefficient is the criterion, and the accuracy may reach 80.1% in the case where the Spearman (Spearman) correlation coefficient is the criterion.
In a time-consumption test, a semantic similarity calculation operation is performed between a question from one user and 100 candidate questions (which can be understood as predefined questions) in an intelligent customer service environment; the time consumed is as follows:
Model or algorithm: time consumed (ms)
ALBERT-zh-tiny native model: 1048
Text processing method in the embodiment of the present application: 234
ALBERT-zh-tiny Siamese network: 26
To sum up, according to the text processing method and apparatus, the electronic device, and the storage medium provided by the present application, first text content and the word vector corresponding to the first text content are first acquired as the first word vector, and second text content currently used for semantic matching with the first text content and the second word vector corresponding to the second text content are acquired. A target probability is then obtained based on the attention mechanism, the first word vector, and the second word vector, where the target probability represents the semantic matching degree between the first text content and the second text content; if the target probability value is greater than the probability threshold, it is determined that the first text content is semantically matched with the second text content. In this way, because the word vector of the second text content used for semantic matching with the first text content is stored in advance, the second text content does not need to be input into the model to obtain its word vector during actual semantic matching; the previously stored word vector can be used directly, which reduces the time consumed by text processing and improves the efficiency of semantic matching.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (14)

1. A method of text processing, the method comprising:
acquiring first text content and a first word vector, wherein the first word vector is a word vector corresponding to the first text content;
acquiring second text content and a second word vector, wherein the second text content is the text content which is currently used for semantic matching with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content;
obtaining a target probability based on an attention mechanism, the first word vector and the second word vector, the target probability representing a semantic matching degree between the first text content and the second text content;
and if the target probability value is larger than a probability threshold value, determining that the first text content is semantically matched with the second text content.
2. The method of claim 1, wherein obtaining a target probability based on the attention mechanism, the first word vector, and the second word vector comprises:
obtaining each element in an attention matrix based on the first word vector and the second word vector;
obtaining a reference vector based on each element in the attention matrix and the second word vector;
and calculating to obtain a target probability based on the reference vector, the first word vector and the second word vector.
3. The method of claim 2, wherein calculating a target probability based on the reference vector, the first word vector, and the second word vector comprises:
splicing the reference vector and the first word vector to obtain a first vector to be processed;
splicing the vector obtained by transposing the reference vector with the second word vector to obtain a second vector to be processed;
pooling the first vector to be processed and the second vector to be processed, and carrying out nonlinear processing on the result of the pooling processing to obtain a target probability.
4. The method according to claim 3, wherein pooling the first vector to be processed and the second vector to be processed and performing nonlinear processing on the result of the pooling to obtain a target probability comprises:
performing first pooling on the first vector to be processed to obtain a first vector to be calculated;
performing second pooling on the second vector to be processed to obtain a second vector to be calculated;
carrying out nonlinear processing on the first vector to be calculated to obtain a first probability value, and carrying out nonlinear processing on the second vector to be calculated to obtain a second probability value;
calculating an average of the first probability value and the second probability value as a target probability.
5. The method of claim 4, wherein the performing the first pooling on the first vector to be processed to obtain a first vector to be calculated comprises:
carrying out global maximum pooling on the first vector to be processed to obtain a first maximum pooling vector;
carrying out global average pooling on the first vector to be processed to obtain a first average pooling vector;
and splicing the first maximum pooling vector and the first average pooling vector to obtain a first vector to be calculated.
6. The method according to claim 4, wherein the performing a second pooling process on the second vector to be processed to obtain a second vector to be calculated comprises:
performing global maximum pooling on the second vector to be processed to obtain a second maximum pooling vector;
carrying out global average pooling on the second vector to be processed to obtain a second average pooling vector;
and splicing the second maximum pooling vector and the second average pooling vector to obtain a second vector to be calculated.
7. The method according to any one of claims 1-6, wherein said obtaining the second text content and the second word vector comprises:
based on the sentence vector of the first text content, acquiring text content to be matched and a word vector which is stored in advance and corresponds to the text content to be matched from a specified storage area;
and taking the text content which is currently used for matching with the first text content in the text contents to be matched as second text content.
8. The method according to claim 7, wherein the obtaining of the text content to be matched and the word vector pre-stored in correspondence with the text content to be matched from a specified storage area based on the sentence vector of the first text content comprises:
performing similarity matching on the sentence vector of the first text content and the sentence vectors of the text content in the specified storage area;
and taking the successfully matched text content as the text content to be matched, and acquiring a word vector corresponding to the text content to be matched from the specified storage area.
9. The method of claim 7, wherein before the obtaining the text content to be matched from the designated storage area based on the sentence vector of the first text content, the method further comprises:
acquiring the identification of each word in the first text content based on a dictionary;
and obtaining a word vector and a sentence vector of the first text content based on the Bert model and the identification of each word.
10. A method of text processing, the method comprising:
acquiring a text input by an information query interface as a first text content, and acquiring a first word vector, wherein the first word vector is a word vector corresponding to the first text content;
acquiring second text content and a second word vector, wherein the second text content is used for being matched with the first text content, and the second word vector is a pre-stored word vector corresponding to the second text content;
obtaining a target probability based on an attention mechanism, the first word vector and the second word vector, the target probability characterizing a semantic matching degree between the first text content and the second text content;
if the target probability value is larger than a probability threshold value, determining that the first text content is semantically matched with the second text content;
outputting information associated with the second text content to the information query interface.
11. A text processing apparatus, characterized in that the apparatus comprises:
the device comprises a first data acquisition unit, a second data acquisition unit and a word processing unit, wherein the first data acquisition unit is used for acquiring first text content and a first word vector, and the first word vector is a word vector corresponding to the first text content;
a second data obtaining unit, configured to obtain a second text content and a second word vector, where the second text content is a text content currently used for performing semantic matching with the first text content, and the second word vector is a word vector corresponding to the second text content and stored in advance;
a probability obtaining unit, configured to obtain a target probability based on an attention mechanism, the first word vector, and the second word vector, where the target probability represents a semantic matching degree between the first text content and the second text content;
and the text matching unit is used for determining semantic matching between the first text content and the second text content if the target probability value is greater than a probability threshold.
12. A text processing apparatus, characterized in that the apparatus comprises:
the query data acquisition unit is used for acquiring a text input by an information query interface as first text content and acquiring a first word vector, wherein the first word vector is a word vector corresponding to the first text content;
a second data obtaining unit, configured to obtain a second text content and a second word vector, where the second text content is a text content used for matching with the first text content, and the second word vector is a word vector corresponding to the second text content and stored in advance;
a probability obtaining unit, configured to obtain a target probability based on an attention mechanism, the first word vector, and the second word vector, where the target probability represents a semantic matching degree between the first text content and the second text content;
the text matching unit is used for determining semantic matching between the first text content and the second text content if the target probability value is greater than a probability threshold;
and the information output unit is used for outputting the information associated with the second text content to the information query interface.
13. An electronic device comprising a processor and a memory; one or more programs are stored in the memory and configured to be executed by the processor to implement the method of any of claims 1-9 or to implement the method of claim 10.
14. A computer-readable storage medium, in which a program code is stored, wherein the program code performs the method of any one of claims 1-9 or performs the method of claim 10 when executed by a processor.