CN111581335A - Text representation method and device - Google Patents

Text representation method and device

Info

Publication number: CN111581335A
Application number: CN202010406112.2A
Authority: CN (China)
Prior art keywords: vector representation, word segment, vector, text
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN111581335B (en)
Inventor: 李伟康
Current and original assignee: Tencent Technology Shenzhen Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Tencent Technology Shenzhen Co Ltd; the application was subsequently granted and published as CN111581335B.

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 — Information retrieval of unstructured textual data
    • G06F 16/31 — Indexing; Data structures therefor; Storage structures
    • G06F 40/00 — Handling natural language data
    • G06F 40/30 — Semantic analysis


Abstract

The present application relates to the field of computer technology, and in particular to a text representation method and apparatus. The method obtains a character vector representation of each character in a text to be processed and an original word vector representation of each word segment in that text, fuses the character vector representations with the original word vector representation of the corresponding word segment to obtain a fused vector representation of each word segment, and obtains a text vector representation of the text to be processed from those fused vector representations. By fusing character and word information in this way, the representation carries richer information and the accuracy of the text representation is improved.

Description

Text representation method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text representation method and apparatus.
Background
A text representation method is a method for vectorizing text. Representing text as a vector that carries semantic information supports applications such as classification, retrieval and recommendation, so representing text accurately is essential.
In the related art, text representation methods mainly take a character or a word directly as the smallest unit (the meta unit), convert that unit into a vector representation, and then obtain the vector representation of the whole sentence with a suitable network.
Disclosure of Invention
The embodiments of the present application provide a text representation method and apparatus to improve the accuracy of text representation.
The embodiments of the present application provide the following specific technical solutions:
One embodiment of the present application provides a text representation method, including:
obtaining a character vector representation of each character in a text to be processed;
obtaining an original word vector representation of each word segment in the text to be processed;
fusing the character vector representation of each character with the original word vector representation of the corresponding word segment to obtain a fused vector representation of each word segment;
and obtaining a text vector representation of the text to be processed according to the fused vector representations of the word segments.
Another embodiment of the present application provides a text representation apparatus, including:
a first obtaining module, configured to obtain a character vector representation of each character in a text to be processed;
a second obtaining module, configured to obtain an original word vector representation of each word segment in the text to be processed;
a fusion module, configured to fuse the character vector representation of each character with the original word vector representation of the corresponding word segment to obtain a fused vector representation of each word segment;
and a third obtaining module, configured to obtain a text vector representation of the text to be processed according to the fused vector representations of the word segments.
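The four modules above can be sketched as a minimal class. This is an illustrative sketch, not the patent's implementation: the lookup tables, the class and method names, and the choice of summation and averaging as the fusion are all assumptions made for the example.

```python
import numpy as np

class TextRepresentationApparatus:
    """Toy sketch of the four modules: character lookup, word lookup,
    character-word fusion, and pooling into a text vector."""

    def __init__(self, char_vectors, word_vectors):
        self.char_vectors = char_vectors   # table used by the first obtaining module
        self.word_vectors = word_vectors   # table used by the second obtaining module

    def char_vecs(self, chars):
        # First obtaining module: character vector representation of each character.
        return [self.char_vectors[c] for c in chars]

    def word_vec(self, word):
        # Second obtaining module: original word vector of a word segment.
        return self.word_vectors[word]

    def fuse(self, word):
        # Fusion module: sum the character vectors (intermediate vector),
        # then average with the original word vector (one possible fusion).
        intermediate = np.sum(self.char_vecs(list(word)), axis=0)
        return (self.word_vec(word) + intermediate) / 2.0

    def text_vector(self, segments):
        # Third obtaining module: pool the fused vectors into a text vector.
        return np.mean([self.fuse(w) for w in segments], axis=0)
```

Usage follows the four claimed steps directly: build the two tables, instantiate the apparatus, and call `text_vector` on the list of word segments.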
Another embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any one of the text representation methods described above when executing the program.
Another embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, performs the steps of any of the above-mentioned text representation methods.
In the embodiments of the present application, the character vector representation of each character and the original word vector representation of each word segment in the text to be processed are obtained, each character vector representation is fused with the original word vector representation of the corresponding word segment to obtain a fused vector representation of each word segment, and the text vector representation of the text to be processed is then obtained from those fused vector representations. Because character and word information are fused, the text representation carries richer information and its accuracy is improved.
Drawings
FIG. 1 is a schematic diagram of an application architecture of a text representation method in an embodiment of the present application;
FIG. 2 is a flow chart of a text representation method in an embodiment of the present application;
FIG. 3 is a schematic diagram of a vector subtraction operation in an embodiment of the present application;
FIG. 4 is a diagram illustrating a vector multiplication operation in an embodiment of the present application;
FIG. 5 is a diagram illustrating a vector addition operation in an embodiment of the present application;
FIG. 6 is a schematic diagram of vector parallel operation according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the fusion operation by an RNN model in an embodiment of the present application;
FIG. 8 is a schematic diagram of the fusion operation by a CNN model in an embodiment of the present application;
FIG. 9 is a schematic diagram of the fusion operation by a feedforward neural network model in an embodiment of the present application;
FIG. 10 is a diagram illustrating the operation of tensor inner product computation in the embodiment of the present application;
FIG. 11 is a schematic structural diagram of a text representation apparatus in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For the purpose of facilitating an understanding of the embodiments of the present application, a brief introduction of several concepts is provided below:
Short text: a title or comment of very short length, for example in the video field; it may also be a short text in other application fields, such as a question or answer sentence in a smart device.
Word fusion: fully mining and exploiting the information carried by both characters and words.
Meta unit: the smallest unit that constitutes a sentence, e.g., a character or a word.
Text representation: the vectorization of a text, i.e., representing the text as a vector that carries semantic information.
Artificial Intelligence (AI) comprises the theories, methods, techniques and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of intelligent machines, so that the machines can perceive, reason and make decisions.
Artificial intelligence is a comprehensive discipline that involves a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language, combining linguistics, computer science and mathematics; since it deals with natural language, the language people use every day, it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like. The embodiments of the present application mainly involve natural language processing: the text to be processed is segmented into words, the characters and word segments are encoded, the character and word information is fused (optionally with a neural network model), and the whole text is then modeled from the fused unit representations to obtain a text vector representation. In addition, after the text vector representation of the text to be processed is obtained, techniques such as sentence classification and machine translation may be used to recognize, classify, translate from, or generate from the text to be processed.
With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart medical care, and smart customer service.
The scheme provided by the embodiment of the application mainly relates to an artificial intelligence natural language processing technology, and is specifically explained by the following embodiment:
the vector representation of the text is very important in classification, retrieval, recommendation and other business applications, and in the related technology, the text representation method mainly takes characters or words as element units, then converting the code into a one-hot (one-hot) code form through a dictionary or converting the code into a dense vector by means of word2vec and the like, further utilizing a neural network, the sentence is modeled integrally to obtain the final text vector representation, but in the related technology, a word or a word is directly used as a unit to represent the text, the information between the words is ignored, and a single word is also ambiguous and cannot accurately represent text information, for example, a word may be a composition of a plurality of words, if a single word of a user is represented as a unit, current context information cannot be accurately represented, especially for short text, because the content is limited, it is very important to sufficiently mine fusion information between words.
Therefore, in order to solve the above problems, an embodiment of the present application provides a new text representation method, which obtains a word vector representation of each word and an original word vector representation of each participle in a to-be-processed text, and fuses the word vector representation of each word and the original word vector representation of each corresponding participle to obtain a fused vector representation of each participle, and further obtains a text vector representation of the to-be-processed text according to the fused vector representation of each participle.
Fig. 1 is a schematic diagram of an application architecture of the text representation method in the embodiment of the present application, including a terminal 100 and a server 200.
The terminal 100 may be any smart device, such as a smartphone, a tablet computer, a portable personal computer, a desktop computer, a smart television or a smart robot, and various applications (APPs) may be installed on it. For example, when a user wants to search for a video, the user inputs a search text through the video APP on the terminal 100, and the terminal 100 sends the search text to the server 200. The server 200 fuses character and word information based on the text representation method in the embodiments of the present application, obtains the fused vector representation of each word segment of the search text and from it the text vector representation of the search text, searches for matching and associated videos according to that text vector representation, and returns the found videos to the terminal 100, which displays them. Because the search text is represented more accurately, the accuracy and effect of retrieval can be improved.
The server 200 can provide various network services for the terminal 100; for different applications, the server 200 may be regarded as the corresponding background server. The server 200 may be a single server, a server cluster composed of several servers, or a cloud computing center.
The terminal 100 and the server 200 may be connected via the Internet to communicate with each other. Optionally, the network uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network, including but not limited to a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using techniques and/or formats such as Hypertext Markup Language (HTML) and Extensible Markup Language (XML). All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Networks (VPN), and Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may be used in place of, or in addition to, the techniques described above.
It should be noted that in the embodiments of the present application, the text representation method may be executed by the server 200, by the terminal 100, or by both, which is not limited here. For example, the server 200 obtains a text to be processed from the terminal 100, determines the character vector representation of each character and the original word vector representation of each word segment in the text, performs character-word fusion to obtain the fused vector representation of each word segment, obtains the text vector representation of the text according to those fused vector representations, and may then perform related business processing based on that text vector representation.
It should also be noted that the application architecture diagram in the embodiments of the present application illustrates the technical solution more clearly and does not limit it: the solution is not restricted to representing short text, nor to the business fields of video, intelligent customer service, translation and the like; it is equally applicable to similar problems under other application architectures and business applications.
In the embodiments of the present application, the text representation method is schematically illustrated as being applied to the application architecture shown in fig. 1.
Based on the foregoing embodiment, referring to fig. 2, a flowchart of a text representation method in the embodiment of the present application is shown, where the method is described by being executed by a server as an example, and specifically the method includes:
step 200: and obtaining the word vector representation of each word in the text to be processed.
Specifically, the text to be processed is subjected to word segmentation processing, each word is encoded based on a trained machine learning model, and each word is mapped into a word vector representation containing context information.
The machine learning model can be a word2vec model, a golve model and the like, the embodiment of the application is not limited, the machine learning model is pre-trained, and each word is mapped into a word vector to be represented by a word vector table obtained by a pre-training method. The word2vec model can map words or participles to a K-dimensional vector space by using context information to obtain vector representations of the words or the participles, the glove model is more inclined to analyze the co-occurrence relation between contexts before and after analysis, word vectors are abstracted through the co-occurrence relation, co-occurrence representations are co-occurrence, namely whether one word is present or not in the vicinity of the other word is seen, namely the vicinity is a concept of a moving window, and after the radius of the window (the distance from a central word to an edge) is defined, the number of words appearing in the range of a square circle is determined, namely the co-occurrence.
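The moving-window co-occurrence statistic described above can be sketched as a simple count. This is a toy illustration of the statistic GloVe builds on, not GloVe itself; the function name is an assumption.

```python
from collections import Counter

def cooccurrence_counts(tokens, radius):
    """Count how often each ordered pair of tokens co-occurs within a moving
    window of the given radius (distance from the centre token to the edge)."""
    counts = Counter()
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - radius), min(len(tokens), i + radius + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(center, tokens[j])] += 1
    return counts
```

GloVe then fits the embedding vectors so that their dot products approximate the logarithms of these counts; only the counting step is shown here.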
Step 210: obtain an original word vector representation of each word segment in the text to be processed.
Specifically, step 210 includes: performing word segmentation on the text to be processed with a word segmentation tool to obtain the word segments of the text, and encoding each word segment based on a trained machine learning model to obtain its original word vector representation.
The word segmentation tool may be, for example, the jieba segmenter. jieba mainly builds a prefix dictionary from a statistical dictionary, scans the text with the prefix dictionary to obtain all possible cuts, constructs a directed acyclic graph from the cut positions, and then computes the maximum-probability path through a dynamic programming algorithm to obtain the final segmentation.
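The prefix-dictionary, DAG and dynamic-programming steps that jieba performs can be sketched in miniature as follows. The dictionary and its frequencies are invented for illustration (real jieba ships a large statistical dictionary), and the function names are assumptions.

```python
import math

# Toy statistical dictionary: word -> frequency (values invented for illustration).
FREQ = {"南京": 10, "市": 20, "长江": 8, "大桥": 6, "南京市": 12,
        "长江大桥": 5, "市长": 7, "江": 3, "大": 9, "桥": 4,
        "南": 2, "京": 2, "长": 3}
TOTAL = sum(FREQ.values())

def build_dag(text):
    """For each start position, list every end position where a dictionary
    word begins there (the directed acyclic graph of candidate cuts)."""
    dag = {}
    for i in range(len(text)):
        ends = [j + 1 for j in range(i, len(text)) if text[i:j + 1] in FREQ]
        dag[i] = ends or [i + 1]  # fall back to a single character
    return dag

def segment(text):
    """Dynamic programming over the DAG: pick the maximum-probability path."""
    dag, n = build_dag(text), len(text)
    route = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(FREQ.get(text[i:j], 1) / TOTAL) + route[j][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = route[i][1]
        words.append(text[i:j])
        i = j
    return words
```

For example, `segment("南京市长江大桥")` picks the cut whose word probabilities multiply to the largest value rather than greedily taking the longest first word.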
The machine learning model used for encoding may again be a word2vec model, a GloVe model or the like, which is not limited in the embodiments of the present application; the model is pre-trained, and each word segment is mapped into a word vector containing context information using the vector table obtained in pre-training.
Of course, for obtaining the character vector representation of each character and the original word vector representation of each word segment, the embodiments of the present application do not limit the specific implementation; related techniques may be used.
Step 220: fuse the character vector representation of each character with the original word vector representation of the corresponding word segment to obtain the fused vector representation of each word segment.
When step 220 is executed, the method specifically includes:
S1: for each word segment, fuse the character vector representations of the characters that make up the segment to obtain an intermediate vector representation of the segment.
S2: for each word segment, fuse its original word vector representation with its intermediate vector representation to obtain the fused vector representation of the segment.
That is to say, the main purpose in the embodiments of the present application is to fuse character and word information: the character vector representations belonging to a word segment are first fused into a new, character-based intermediate vector representation of the segment, and the original word vector representation is then fused with that intermediate vector representation to obtain the final fused vector representation. In this way, the character and word information in the text is fully mined, the information carried by each meta unit is enriched, and the accuracy of the meta unit representation is improved.
Step 230: obtain the text vector representation of the text to be processed according to the fused vector representations of the word segments.
In the embodiments of the present application, after the fused vector representation of each word segment of the text to be processed is obtained, each fused vector representation serves as one unit representation, from which the text vector representation of the whole text is obtained.
The neural network model used here may be a recurrent neural network (RNN), a convolutional neural network (CNN), a recursive neural network, an attention network, a graph neural network, or the like, which is not limited in the embodiments of the present application; the text to be processed is modeled with the neural network model to obtain its overall text vector representation.
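One possible sketch of this modelling step: mean pooling and a minimal vanilla RNN stand in for the networks listed above. The function names are assumptions, and in practice the weights would be trained rather than random stand-ins.

```python
import numpy as np

def mean_pool_text_vector(fused_vectors):
    """Average the per-word-segment fused vectors into one text vector."""
    return np.mean(np.stack(fused_vectors), axis=0)

def rnn_text_vector(fused_vectors, Wx, Wh, b):
    """Run a minimal vanilla RNN over the fused vectors and use the last
    hidden state as the text vector."""
    h = np.zeros(Wh.shape[0])
    for x in fused_vectors:
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h
```

The mean pool is order-insensitive; the RNN variant keeps word order, which is usually why a sequence model is preferred for the final text vector.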
In the embodiments of the present application, the character vector representation of each character and the original word vector representation of each word segment in the text to be processed are obtained and fused into a fused vector representation of each word segment, and the text vector representation of the text is then obtained from those fused vector representations, so that character and word information are fully exploited and the accuracy of the representation is improved.
Based on the foregoing embodiments, the implementation of the intermediate vector representation and the fused vector representation in step 220 is described below in two parts:
First part: for each word segment, fuse the character vector representations of its characters to obtain the intermediate vector representation of the segment.
Specifically: S1, fusing the character vector representations of the characters of a word segment includes performing one or more of the following operations on them: vector subtraction, vector multiplication, vector addition, vector parallel-connection (concatenation), and fusion after input into a neural network model.
For example, referring to fig. 3, a schematic diagram of the vector subtraction operation in an embodiment of the present application: the character vector representations of the characters of a word segment are subtracted element by element to obtain the subtracted vector. For instance, for a word segment whose characters are "you" and "good", with character vector representations a1 and a2 respectively, subtracting the corresponding elements of a1 and a2 yields the new vector.
For example, referring to fig. 4, a schematic diagram of the vector multiplication operation in an embodiment of the present application: the character vector representations of the characters of a word segment are multiplied element by element to obtain a new vector.
For example, referring to fig. 5, a schematic diagram of the vector addition operation in an embodiment of the present application: the character vector representations of the characters of a word segment are added element by element to obtain a new vector.
For example, referring to fig. 6, a schematic diagram of the vector parallel-connection (concatenation) operation in an embodiment of the present application: the character vector representations are connected end to end to obtain a new vector. For instance, for the word segment "hello", the characters are "you" and "good"; if the character vector representation of "you" is a1 and that of "good" is a2, connecting a1 and a2 in sequence yields the new vector a1a2.
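The four element-wise operations of figs. 3-6 can be sketched in a few lines (the function name is an assumption):

```python
import numpy as np

def fuse_chars(char_vecs, op):
    """Combine the character vectors of one word segment with a single
    element-wise operation: 'sub', 'mul', 'add', or 'concat' (figs. 3-6)."""
    vecs = [np.asarray(v) for v in char_vecs]
    out = vecs[0]
    for v in vecs[1:]:
        if op == "sub":
            out = out - v
        elif op == "mul":
            out = out * v
        elif op == "add":
            out = out + v
        elif op == "concat":
            out = np.concatenate([out, v])
        else:
            raise ValueError(op)
    return out
```

Note that subtraction, multiplication and addition preserve the vector dimension, while concatenation grows it with the number of characters; any downstream layer must account for this.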
As for fusion through a trained neural network model: the character vector representations of the characters of each word segment are input into the trained neural network, which outputs a new vector for the segment.
In the embodiments of the present application, the neural network model may be an RNN, a CNN, a feedforward neural network, or the like, without limitation; modeling with the neural network model fuses the character vector representations into a new, character-based vector for the word segment.
For example, referring to fig. 7, a schematic diagram of the fusion operation by an RNN model in an embodiment of the present application: the character vector representations of the characters of a word segment are input into the RNN model, which outputs the fused vector, i.e., the new vector of the segment.
For example, referring to fig. 8, a schematic diagram of the fusion operation by a CNN model in an embodiment of the present application: the character vector representations of the characters of a word segment are input into the CNN model, which outputs the fused vector, i.e., the new vector of the segment.
For example, referring to fig. 9, a schematic diagram of the fusion operation by a feedforward neural network model in an embodiment of the present application: the character vector representations of the characters of a word segment are input into the feedforward neural network model, which outputs the fused vector, i.e., the new vector of the segment.
It should be noted that a word segment may consist of more than two characters; the number is not limited to the two in the examples above, and word segments of more than two characters are common. The neural network models handle a variable number of input vectors well, and vector subtraction, addition, multiplication and concatenation likewise apply to more than two vectors, so any of these operations can fuse several character vectors into one vector and thereby fuse the information of the characters of the word segment.
Of course, the fusion is not limited to the operations above; other methods may also be used, which is not limited in the embodiments of the present application.
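A minimal feedforward-style fusion (in the spirit of fig. 9) that accepts any number of character vectors might look like this. Averaging before the dense layer is one simple way to handle variable-length input; `W` and `b` stand in for trained parameters, and all names are assumptions.

```python
import numpy as np

def ffn_fuse(char_vectors, W, b):
    """Average the character vectors, then apply one dense layer with tanh;
    the output is the fused (new) vector of the word segment."""
    x = np.mean(np.stack(char_vectors), axis=0)
    return np.tanh(W @ x + b)
```

An RNN or CNN would instead consume the character vectors in order, which matters when character order carries meaning.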
And S2, obtaining intermediate vector representations corresponding to the participles.
Specifically, two situations can be distinguished:
in the first case: when one operation is performed, the operated word vector corresponding to each participle is represented as an intermediate vector of the corresponding participle.
That is to say, in the embodiment of the present application, only one operation may be used to implement word information fusion, for example, only a vector addition operation is used, a vector addition operation is performed on the word vector representation of the word corresponding to the participle, and a vector is obtained after the addition operation, that is, the vector is used as an intermediate vector representation of the corresponding participle.
In the second case: if at least two operations are performed, the word vector representation obtained after each operation is acquired separately for each participle, and a parallel connection operation is performed on these representations to obtain the intermediate vector representation of the corresponding participle.
That is, in the embodiment of the present application, a plurality of operations may be performed simultaneously during the word vector fusion processing, and the vectors obtained from these operations are finally fused in parallel. For example, with vector addition and vector subtraction, one new vector is obtained through the addition operation and another through the subtraction operation, and the two new vectors are then connected in parallel to obtain the intermediate vector representation of the corresponding participle.
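The two cases above can be sketched as follows; this is an illustrative example and the vector values are invented:

```python
import numpy as np

c1 = np.array([1.0, 2.0, 3.0])  # word vectors of a two-character participle
c2 = np.array([0.5, 0.5, 0.5])

# case 1: a single operation (here, vector addition) yields the
# intermediate vector representation directly
intermediate_single = c1 + c2

# case 2: several operations, with results connected in parallel
# (i.e. concatenated) to form the intermediate vector representation
added = c1 + c2
subtracted = c1 - c2
intermediate_multi = np.concatenate([added, subtracted])

print(intermediate_single)  # [1.5 2.5 3.5]
print(intermediate_multi)   # [1.5 2.5 3.5 0.5 1.5 2.5]
```

Note that the parallel connection in case 2 doubles the dimensionality, which a downstream layer would need to account for.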
In addition, if a participle corresponds to only one word, that word's vector representation may directly serve as the intermediate vector representation. For example, if the text to be processed is "I love the sky", the participles obtained after word segmentation are "I", "love", and "sky"; the participles "I" and "love" each correspond to only one word, so the word vector representation of that single word directly serves as the intermediate vector representation of the corresponding participle.
A second part: and respectively carrying out fusion processing on the original word vector representation and the intermediate vector representation of each participle to obtain the fusion vector representation of each participle.
Specifically, the method comprises the following steps:
S1, fusing the original word vector representation and the intermediate vector representation of the participle, which specifically comprises: performing one or more of the following operations on the original word vector representation and the intermediate vector representation of the participle: a vector subtraction operation, a vector multiplication operation, a vector addition operation, a vector parallel connection operation, and a tensor inner product calculation operation.
When fusion processing is performed on the original word vector representation and the intermediate vector representation of a participle, the vector subtraction, vector multiplication, vector addition, and vector parallel connection operations follow the same principles as fig. 3 to fig. 6 above, except that here the fusion processing is performed on two vectors: the original word vector representation and the intermediate vector representation.
Moreover, for the tensor inner product calculation operation, fig. 10 is a schematic diagram of the tensor inner product calculation operation in the embodiment of the present application: the tensor inner product of the original word vector representation and the intermediate vector representation of the participle is calculated, and the calculation result is the fused vector representation of the participle.
It should be noted that since the fusion here is performed on exactly two vectors, namely the original word vector representation and the intermediate vector representation, and the tensor inner product is usually simple and efficient to compute for two vectors, it is well suited to computing the fused vector representation of a participle; the vector subtraction, vector multiplication, vector addition, and vector parallel connection operations are likewise suitable for two vectors.
In addition, the embodiment of the present application is not limited to the above fusion operation processing, and other manners may also be used to perform fusion operation on the original word vector representation and the intermediate vector representation, which is not limited in the embodiment of the present application.
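One plausible reading of a "tensor inner product" between two vectors, as in a neural tensor layer, is that each slice of a 3-way tensor produces one component of the fused vector. The sketch below assumes that reading; the tensor itself, the dimensions, and the variable names are all illustrative, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 3, 2  # hypothetical input vector size and fused output size

w = rng.standard_normal(d)          # original word vector representation
m = rng.standard_normal(d)          # intermediate vector representation
T = rng.standard_normal((k, d, d))  # 3-way tensor: one d x d slice per output

# fused[i] = w^T T[i] m  -- one bilinear "tensor inner product" per slice
fused = np.einsum('d,kde,e->k', w, T, m)
print(fused.shape)
```

Unlike plain addition or concatenation, this form lets every pair of components from the two vectors interact, at the cost of a learnable d x d matrix per output dimension.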
S2, obtaining the fused vector representation of each participle.
Specifically, two cases can be distinguished:
In the first case: if one operation is performed, the vector obtained by operating on the original word vector representation and the intermediate vector representation of each participle serves as the fused vector representation of the corresponding participle.
In the second case: if at least two operations are performed, the vector obtained after each operation on the original word vector representation and the intermediate vector representation is acquired separately for each participle, and a parallel connection operation is performed on these vectors to obtain the fused vector representation of the corresponding participle.
That is, in the embodiment of the present application, one or more operation modes may be adopted to perform fusion processing on the original word vector representation and the intermediate vector representation, which is not limited in the embodiment of the present application.
Therefore, in the embodiment of the application, a plurality of mechanisms are adopted for information fusion: the words of a participle are fused first, and the resulting intermediate vector representation is then fused with the original word vector representation to obtain the final fused vector representation of the participle. In this way, the word information in the text can be fully mined, a richer element representation can be obtained, and the accuracy of the element representation is improved.
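The two-stage fusion described above can be sketched end to end for one participle; the choice of operations (addition and subtraction in stage 1, parallel connection in stage 2) and all values are illustrative:

```python
import numpy as np

def fuse_participle(char_vecs, word_vec):
    """Sketch of the two-stage fusion: words -> intermediate -> fused."""
    # stage 1: fuse the word vectors of the participle's characters
    # (addition and subtraction, connected in parallel)
    added = char_vecs[0].copy()
    subtracted = char_vecs[0].copy()
    for c in char_vecs[1:]:
        added = added + c
        subtracted = subtracted - c
    intermediate = np.concatenate([added, subtracted])

    # stage 2: fuse the original word vector representation with the
    # intermediate vector representation (parallel connection)
    return np.concatenate([word_vec, intermediate])

chars = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
word = np.array([0.5, 0.5])
fused = fuse_participle(chars, word)
print(fused)  # [0.5, 0.5, 1.0, 1.0, 1.0, -1.0]
```

Any of the other operations named above (multiplication, tensor inner product, a neural network) could be substituted at either stage.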
Further, for different application scenarios, other information may also be fused when obtaining the text vector representation of the text to be processed, so as to enrich the information representation of the text. Specifically, the embodiment of the present application provides several possible implementation manners:
the first embodiment: obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle, which specifically comprises the following steps:
acquiring user portrait characteristic information of a user corresponding to a text to be processed; and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the user portrait characteristic information.
For example, a short text in the video field is often related to its corresponding video content, and different users hold different attitudes, such as liking or disliking, toward the same video. Therefore, when performing vector representation on a short text, user portrait information may be fused into its representation. The user portrait characteristic information is, for example, age, occupation, hobby, gender, and the like, which is not limited in the embodiment of the present application.
Specifically, for example, the user portrait feature information is obtained and modeled through a neural network model, so that the various items of user portrait feature information are fused into one vector; this vector and the fused vector representations of the participles are then input into a neural network model together to obtain the final text vector representation of the text to be processed.
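A hypothetical sketch of folding user portrait features into the text vector; the feature encoding, projection, and mean-pooling used here are stand-ins invented for illustration, not the patent's specified model:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4

# fused vector representations of the participles of one text (hypothetical)
participle_vecs = rng.standard_normal((3, dim))

# user portrait features, e.g. age bucket / occupation / gender encoded as
# a one-hot-style vector, projected into the text space by a hypothetical
# (ordinarily learned) weight matrix
portrait = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
W_user = rng.standard_normal((dim, portrait.size))
user_vec = W_user @ portrait

# simplest stand-in for "input both into a neural network": mean-pool the
# participle vectors and connect the user vector in parallel
text_vec = np.concatenate([participle_vecs.mean(axis=0), user_vec])
print(text_vec.shape)  # (8,)
```

In practice the pooling and projection would be replaced by the jointly trained neural network model the text describes.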
Therefore, since the text vector representation of the text to be processed is fused with the user portrait characteristic information, a personalized text representation method can be constructed, so that more accurate content can be provided when a user searches for videos or when the platform recommends content to users, optimizing the user experience.
The second embodiment: obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle, which specifically comprises the following steps:
obtaining context multi-modal information of a text to be processed; and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the context multi-modal information.
The context multi-modal information may be pictures, videos, audio, and the like, which is not limited in the embodiment of the present application.
In the embodiment of the application, multi-modal information existing in the context environment of the text to be processed may also be considered. For example, the context multi-modal information and the fused vector representations of the participles are all input into a neural network model for modeling, and the text vector representation of the text to be processed is finally output. In this way, the context multi-modal information is also fused into the text vector representation, further enriching the vectorized representation of the text.
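In the same hedged spirit, a sketch where context multi-modal features (e.g. image and audio embeddings) are projected into the text space and connected in parallel with the pooled participle vectors; every encoder and weight matrix here is a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 4

participle_vecs = rng.standard_normal((5, dim))  # fused participle vectors
image_feat = rng.standard_normal(8)              # e.g. from an image encoder
audio_feat = rng.standard_normal(6)              # e.g. from an audio encoder

# project each modality into the text space with hypothetical weights
W_img = rng.standard_normal((dim, 8))
W_aud = rng.standard_normal((dim, 6))

text_vec = np.concatenate([
    participle_vecs.mean(axis=0),  # pooled text representation
    W_img @ image_feat,            # image context
    W_aud @ audio_feat,            # audio context
])
print(text_vec.shape)  # (12,)
```

A trained model would learn the projections and likely use attention rather than mean pooling, but the parallel connection of modalities is the same idea.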
Further, based on the above embodiments, a specific application scenario is described below. The text representation method in the embodiment of the present application may be applied to different service fields, such as video recommendation, classification, generation, identification, and retrieval, without limitation. Accordingly, after the text vector representation of the text to be processed is obtained, the embodiment of the present application provides a possible implementation manner: performing corresponding service processing on the text to be processed according to its text vector representation and a specified service target.
Therefore, the text representation method in the embodiment of the application can enrich the text representation information and improve the accuracy and quality of the text representation. When related service processing is performed based on the final text vector representation, the recognition and generation capabilities of the related services can be further optimized, and the accuracy and effect of text classification, search, recommendation, and the like can be improved.
Based on the same inventive concept, the embodiment of the present application further provides a text presentation apparatus, which may be a hardware structure, a software module, or a hardware structure plus a software module. Based on the above embodiments, referring to fig. 11, a text presentation apparatus in an embodiment of the present application specifically includes:
a first obtaining module 1100, configured to obtain a word vector representation of each word in a text to be processed;
a second obtaining module 1110, configured to obtain an original word vector representation of each participle in the text to be processed;
a fusion module 1120, configured to fuse the word vector representation of each word with the original word vector representation of each corresponding participle to obtain a fusion vector representation of each participle;
a third obtaining module 1130, configured to obtain a text vector representation of the text to be processed according to the fused vector representation of each participle.
Optionally, when the word vector representation of each word is fused with the original word vector representation of each corresponding participle to obtain a fused vector representation of each participle, the fusion module 1120 is specifically configured to:
respectively fusing the character vector representations of the characters corresponding to the participles aiming at each participle to obtain intermediate vector representations corresponding to the participles;
and respectively carrying out fusion processing on the original word vector representation and the intermediate vector representation of each participle to obtain the fusion vector representation of each participle.
Optionally, when performing fusion processing on the word vector representations of the words corresponding to the participles, the fusion module 1120 is specifically configured to:
and perform one or more of the following operations on the word vector representations of the words corresponding to the participle: a vector subtraction operation, a vector multiplication operation, a vector addition operation, a vector parallel connection operation, and a fusion operation of inputting the vectors into a neural network model.
Optionally, when obtaining the intermediate vector representation corresponding to each participle, the fusion module 1120 is specifically configured to:
if one operation is carried out, representing the operated character vector corresponding to each participle as the intermediate vector representation of the corresponding participle;
and if at least two operations are performed, respectively obtain the word vector representation after each operation corresponding to each participle, and perform a parallel connection operation on these representations to obtain the intermediate vector representation of the corresponding participle.
Optionally, when performing fusion processing on the original word vector representation and the intermediate vector representation of the participle, the fusion module 1120 is specifically configured to:
and performing one or more of the following operations on the original word vector representation and the intermediate vector representation of the participle: vector subtraction operation, vector multiplication operation, vector addition operation, vector parallel connection operation and tensor inner product calculation operation.
Optionally, when obtaining the fused vector representation of each participle, the fusion module 1120 is specifically configured to:
if one operation is carried out, the operated original word vector representation and the operated intermediate vector representation corresponding to each participle are taken as the fusion vector representation of the corresponding participle;
and if at least two operations are performed, respectively obtain the operated original word vector representation and the operated intermediate vector representation corresponding to each participle, and perform a parallel connection operation on these vectors to obtain the fused vector representation of the corresponding participle.
Optionally, when obtaining the text vector representation of the text to be processed according to the fused vector representation of each participle, the third obtaining module 1130 is specifically configured to:
acquiring user portrait characteristic information of a user corresponding to the text to be processed;
and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the user portrait feature information.
Optionally, when obtaining the text vector representation of the text to be processed according to the fused vector representation of each participle, the third obtaining module 1130 is specifically configured to:
obtaining context multi-modal information of the text to be processed;
and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the context multi-modal information.
In the embodiment of the application, by fusing the information of each word and each participle in the text to be processed, a richer element representation of the text to be processed can be obtained, and thus a more accurate text vector representation. This enriches the text representation, improves the text representation quality, and benefits service applications such as the identification and generation of the text to be processed.
Based on the above embodiments, fig. 12 is a schematic structural diagram of an electronic device in an embodiment of the present application.
The present embodiment provides an electronic device, which may be a terminal or a server and is described here by way of example. The electronic device may include a processor 1210 (CPU), a memory 1220, an input device 1230, an output device 1240, and the like.
Memory 1220 may include Read Only Memory (ROM) and Random Access Memory (RAM), and provides processor 1210 with program instructions and data stored in memory 1220. In the embodiment of the present application, the memory 1220 may be used for storing a program of any one of the text representation methods in the embodiment of the present application.
Processor 1210 is configured to execute any of the text presentation methods of the embodiments of the present application according to the obtained program instructions by calling the program instructions stored in memory 1220.
Based on the above embodiments, in the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the text representation method in any of the above method embodiments.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalents, the present application is also intended to encompass such modifications and variations.

Claims (15)

1. A method of text representation, comprising:
obtaining word vector representation of each word in the text to be processed;
obtaining the original word vector representation of each participle in the text to be processed;
fusing the character vector representation of each character with the original word vector representation of each corresponding participle to obtain fused vector representation of each participle;
and obtaining the text vector representation of the text to be processed according to the fusion vector representation of each participle.
2. The method of claim 1, wherein fusing the word vector representation of each word with the corresponding original word vector representation of each participle to obtain a fused vector representation of each participle, specifically comprises:
respectively fusing the character vector representations of the characters corresponding to the participles aiming at each participle to obtain intermediate vector representations corresponding to the participles;
and respectively carrying out fusion processing on the original word vector representation and the intermediate vector representation of each participle to obtain the fusion vector representation of each participle.
3. The method according to claim 2, wherein the fusing the word vector representations of the words corresponding to the participles specifically comprises:
and performing one or more operations on the word vector representation of the word corresponding to the participle: vector subtraction operation, vector multiplication operation, vector addition operation, vector parallel connection operation and vector fusion operation after being input into the neural network model.
4. The method of claim 3, wherein obtaining the intermediate vector representation corresponding to each participle specifically comprises:
if one operation is carried out, representing the operated character vector corresponding to each participle as the intermediate vector representation of the corresponding participle;
and if at least two operations are performed, respectively obtaining the word vector representation after each operation corresponding to each participle, and performing a parallel connection operation on these representations to obtain the intermediate vector representation of the corresponding participle.
5. The method of claim 2, wherein fusing the original word vector representation and the intermediate vector representation of the participle comprises:
and performing one or more of the following operations on the original word vector representation and the intermediate vector representation of the participle: vector subtraction operation, vector multiplication operation, vector addition operation, vector parallel connection operation and tensor inner product calculation operation.
6. The method according to claim 5, wherein the obtaining a fused vector representation of each participle specifically comprises:
if one operation is carried out, the operated original word vector representation and the operated intermediate vector representation corresponding to each participle are taken as the fusion vector representation of the corresponding participle;
and if at least two operations are performed, respectively obtaining the operated original word vector representation and the operated intermediate vector representation corresponding to each participle, and performing a parallel connection operation on these vectors to obtain the fused vector representation of the corresponding participle.
7. The method according to any one of claims 1 to 6, wherein obtaining the text vector representation of the text to be processed according to the fused vector representation of each participle specifically comprises:
acquiring user portrait characteristic information of a user corresponding to the text to be processed;
and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the user portrait feature information.
8. The method according to any one of claims 1 to 6, wherein obtaining the text vector representation of the text to be processed according to the fused vector representation of each participle specifically comprises:
obtaining context multi-modal information of the text to be processed;
and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the context multi-modal information.
9. A text presentation device, comprising:
the first obtaining module is used for obtaining the word vector representation of each word in the text to be processed;
a second obtaining module, configured to obtain an original word vector representation of each participle in the text to be processed;
the fusion module is used for fusing the character vector representation of each character with the corresponding original word vector representation of each participle to obtain the fusion vector representation of each participle;
and the third obtaining module is used for obtaining the text vector representation of the text to be processed according to the fusion vector representation of each participle.
10. The apparatus of claim 9, wherein the fusion module is specifically configured to:
respectively fusing the character vector representations of the characters corresponding to the participles aiming at each participle to obtain intermediate vector representations corresponding to the participles;
and respectively carrying out fusion processing on the original word vector representation and the intermediate vector representation of each participle to obtain the fusion vector representation of each participle.
11. The apparatus of claim 10, wherein the fusion module is specifically configured to:
and performing one or more operations on the word vector representation of the word corresponding to the participle: vector subtraction operation, vector multiplication operation, vector addition operation, vector parallel connection operation and vector fusion operation after being input into the neural network model.
12. The apparatus of claim 10, wherein the fusion module is specifically configured to:
and performing one or more of the following operations on the original word vector representation and the intermediate vector representation of the participle: vector subtraction operation, vector multiplication operation, vector addition operation, vector parallel connection operation and tensor inner product calculation operation.
13. The apparatus of any one of claims 9-12, wherein the third obtaining module is specifically configured to:
acquiring user portrait characteristic information of a user corresponding to the text to be processed;
and obtaining text vector representation of the text to be processed according to the fusion vector representation of each participle and the user portrait feature information.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any of claims 1-8 are implemented when the program is executed by the processor.
15. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 8.
CN202010406112.2A 2020-05-14 2020-05-14 Text representation method and device Active CN111581335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010406112.2A CN111581335B (en) 2020-05-14 2020-05-14 Text representation method and device


Publications (2)

Publication Number Publication Date
CN111581335A true CN111581335A (en) 2020-08-25
CN111581335B CN111581335B (en) 2023-11-24

Family

ID=72112186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010406112.2A Active CN111581335B (en) 2020-05-14 2020-05-14 Text representation method and device

Country Status (1)

Country Link
CN (1) CN111581335B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330357A (en) * 2021-08-04 2022-04-12 腾讯科技(深圳)有限公司 Text processing method and device, computer equipment and storage medium

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107423284A (en) * 2017-06-14 2017-12-01 中国科学院自动化研究所 Merge the construction method and system of the sentence expression of Chinese language words internal structural information
CN108287820A (en) * 2018-01-12 2018-07-17 北京神州泰岳软件股份有限公司 A kind of generation method and device of text representation
CN108595590A (en) * 2018-04-19 2018-09-28 中国科学院电子学研究所苏州研究院 A kind of Chinese Text Categorization based on fusion attention model
CN108763325A (en) * 2018-05-04 2018-11-06 北京达佳互联信息技术有限公司 A kind of network object processing method and processing device
CN109359196A (en) * 2018-10-22 2019-02-19 北京百度网讯科技有限公司 Text Multimodal presentation method and device
US20190087086A1 (en) * 2017-08-29 2019-03-21 Samsung Electronics Co., Ltd. Method for providing cognitive semiotics based multimodal predictions and electronic device thereof
CN109918500A (en) * 2019-01-17 2019-06-21 平安科技(深圳)有限公司 File classification method and relevant device based on convolutional neural networks
CN109992788A (en) * 2019-04-10 2019-07-09 北京神州泰岳软件股份有限公司 Depth text matching technique and device based on unregistered word processing
CN110020438A (en) * 2019-04-15 2019-07-16 上海冰鉴信息科技有限公司 Enterprise or tissue Chinese entity disambiguation method and device based on recognition sequence
CN110096575A (en) * 2019-03-25 2019-08-06 国家计算机网络与信息安全管理中心 Psychological profiling method towards microblog users
CN110245719A (en) * 2019-03-27 2019-09-17 中国海洋大学 A kind of Feature fusion of entity-oriented and user's portrait
CN110569508A (en) * 2019-09-10 2019-12-13 重庆邮电大学 Method and system for classifying emotional tendencies by fusing part-of-speech and self-attention mechanism
CN110598206A (en) * 2019-08-13 2019-12-20 平安国际智慧城市科技股份有限公司 Text semantic recognition method and device, computer equipment and storage medium
CN110717325A (en) * 2019-09-04 2020-01-21 北京三快在线科技有限公司 Text emotion analysis method and device, electronic equipment and storage medium
CN111008293A (en) * 2018-10-06 2020-04-14 上海交通大学 Visual question-answering method based on structured semantic representation


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
WANG DONGZHE et al.: "Learning Semantic Text Features for Web Text-Aided Image Classification", IEEE TRANSACTIONS ON MULTIMEDIA, vol. 21, no. 12, pages 2985, XP011757327, DOI: 10.1109/TMM.2019.2920620 *
CUI XINYANG et al.: "Chinese text sentiment classification based on parallel bidirectional gated recurrent units and a self-attention mechanism", Journal of Beijing University of Chemical Technology (Natural Science Edition), vol. 47, no. 02, pages 115 - 123 *
LI WEIKANG et al.: "A study of ways to combine Chinese character embeddings and word embeddings in deep learning", Journal of Chinese Information Processing, vol. 31, no. 06, pages 140 - 146 *
MA JIANHONG et al.: "Text classification of science and technology videos based on an improved Labeled LDA model", Computer Engineering, vol. 44, no. 09, pages 274 - 279 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330357A (en) * 2021-08-04 2022-04-12 腾讯科技(深圳)有限公司 Text processing method and device, computer equipment and storage medium
CN114330357B (en) * 2021-08-04 2024-05-10 腾讯科技(深圳)有限公司 Text processing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111581335B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN112182166B (en) Text matching method and device, electronic equipment and storage medium
CN111444340B (en) Text classification method, device, equipment and storage medium
CN112131366B (en) Method, device and storage medium for training text classification model and text classification
CN112084331A (en) Text processing method, text processing device, model training method, model training device, computer equipment and storage medium
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN113627447B (en) Label identification method, label identification device, computer equipment, storage medium and program product
CN109740158B (en) Text semantic parsing method and device
CN112231569A (en) News recommendation method and device, computer equipment and storage medium
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN111324773A (en) Background music construction method and device, electronic equipment and storage medium
CN113705191A (en) Method, device and equipment for generating sample statement and storage medium
CN110895656A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN115269828A (en) Method, apparatus, and medium for generating comment reply
CN112749556B (en) Multi-language model training method and device, storage medium and electronic equipment
CN111581335B (en) Text representation method and device
CN112307738A (en) Method and device for processing text
CN114398903B (en) Intention recognition method, device, electronic equipment and storage medium
Ermatita et al. Sentiment Analysis of COVID-19 using Multimodal Fusion Neural Networks.
CN113741759B (en) Comment information display method and device, computer equipment and storage medium
CN116432705A (en) Text generation model construction method, text generation device, equipment and medium
CN110851629A (en) Image retrieval method
CN116991412A (en) Code processing method, device, electronic equipment and storage medium
CN115130461A (en) Text matching method and device, electronic equipment and storage medium
CN114090778A (en) Retrieval method and device based on knowledge anchor point, electronic equipment and storage medium
CN113821610A (en) Information matching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK
Ref legal event code: DE
Ref document number: 40027923
Country of ref document: HK

GR01 Patent grant