CN112035662B - Text processing method and device, computer equipment and storage medium


Publication number
CN112035662B
Authority
CN
China
Prior art keywords
text
target
phrase
feature
weight
Prior art date
Legal status
Active
Application number
CN202010872702.4A
Other languages
Chinese (zh)
Other versions
CN112035662A (en)
Inventor
叶志豪
文瑞
陈曦
张子恒
李智勇
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Priority to CN202010872702.4A
Publication of CN112035662A
Application granted
Publication of CN112035662B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data; database structures therefor; file system structures therefor
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of this application disclose a text processing method and apparatus, a computer device, and a storage medium, applicable to the field of artificial intelligence. The method includes: acquiring a target text containing N target phrases; determining the topic context feature between each target phrase and K text topics according to a topic-phrase weight feature set between the K text topics and V vocabulary phrases; identifying matching weight features between the target text and the K text topics, and determining the extended topic feature of the target text according to the topic-phrase weight feature set, the matching weight features, and the topic context feature of each target phrase; and combining the extended topic feature and the topic context features of the N target phrases into a target text feature, which is then classified to obtain the business text type to which the target text belongs. The method and apparatus improve text classification efficiency.

Description

Text processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text processing method and apparatus, a computer device, and a storage medium.
Background
Text classification refers to classifying and labeling texts according to a given classification system or standard. The classification result can provide a data basis for downstream text tasks, for example, performing semantic understanding of a text or making accurate recommendations according to its classification result.
At present, text classification is mainly performed manually; that is, after a person reads and understands the whole text, a corresponding type label is set for the text according to a preset classification standard. Because manual classification requires a person to understand the text and to set labels by hand, processes that consume a great deal of time, the efficiency of text classification is low.
Disclosure of Invention
The embodiments of this application provide a text processing method and apparatus, a computer device, and a storage medium, which can improve text classification efficiency.
An embodiment of the present application provides a text processing method, including:
acquiring a target text, where the target text includes N target phrases, N being a positive integer;
determining the topic context feature between each target phrase and K text topics according to a topic-phrase weight feature set between the K text topics and V vocabulary phrases, K and V being positive integers;
identifying matching weight features between the target text and the K text topics, and determining the extended topic feature of the target text according to the topic-phrase weight feature set, the matching weight features, and the topic context feature of each target phrase;
and combining the extended topic feature and the topic context features of the N target phrases into a target text feature, and classifying the target text feature to obtain the business text type to which the target text belongs.
An aspect of the embodiments of this application provides a text processing apparatus, including:
an acquisition module, configured to acquire a target text, where the target text includes N target phrases, N being a positive integer;
a first determining module, configured to determine the topic context feature between each target phrase and K text topics according to a topic-phrase weight feature set between the K text topics and V vocabulary phrases, K and V being positive integers;
a first identification module, configured to identify matching weight features between the target text and the K text topics;
a second determining module, configured to determine the extended topic feature of the target text according to the topic-phrase weight feature set, the matching weight features, and the topic context feature of each target phrase;
a combination module, configured to combine the extended topic feature and the topic context features of the N target phrases into a target text feature;
and a second identification module, configured to classify the target text feature to obtain the business text type to which the target text belongs.
An aspect of the embodiments of the present application provides a computer device, including a memory and a processor, where the memory stores a computer program, and when the computer program is executed by the processor, the processor is caused to execute the method in the foregoing embodiments.
An aspect of the embodiments of the present application provides a computer storage medium, in which a computer program is stored, where the computer program includes program instructions, and when the program instructions are executed by a processor, the method in the foregoing embodiments is performed.
An aspect of the embodiments of the present application provides a computer program product or a computer program, where the computer program product or the computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium, and when the computer instructions are executed by a processor of a computer device, the computer instructions perform the methods in the embodiments described above.
According to the method and the device, no manual participation is needed: the terminal device automatically extracts the topic context feature of each phrase in the text and the extended topic feature of the text, and thereby determines the text type of the text. This avoids the inefficiency of manual classification, improves text classification efficiency, and enriches the available text classification approaches. Moreover, compared with extended topic features determined from context-free word vectors, the extended topic features determined from the topic context features of each phrase effectively avoid the errors and noise produced when polysemous words are matched to extended topic features, further improving the accuracy of text classification.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a system architecture diagram of text processing provided by an embodiment of the present application;
FIG. 2 is a schematic view of a text processing scenario provided in an embodiment of the present application;
FIG. 3 is a flowchart of text processing provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a classification model provided by an embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for determining topic context features according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a BERT model provided in an embodiment of the present application;
FIG. 7 is an overall architecture diagram of a text processing method according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines can perceive, reason, and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Its basic technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
The solution provided by the embodiments of this application belongs to natural language processing, a branch of the field of artificial intelligence.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics.
This application mainly concerns recognizing the text type of a text based on natural language processing; once its type is determined, the text can be used for subsequent tasks such as accurate recommendation and text summary generation.
The application can be applied to the following scenarios. In a medical spoken-language intent query scenario (such as intelligent triage), in order to recognize a user's intent (for example, a disease-judgment intent, a doctor-finding intent, a department-finding intent, a non-self-diagnosis intent, and the like), the target text input by the user can be acquired, the topic context feature of each phrase in the target text is determined using the scheme of this application, the extended topic feature of the target text is determined based on those topic context features, and the intent type of the target text is determined from the extended topic feature. Accurate recommendations of medical business data can subsequently be made based on the determined intent type.
For another example, in a disease prediction scenario, in order to identify the type of a medical chief complaint (e.g., symptom description, physical examination, past medical history), a medical diagnosis text (the target text) can be acquired, the topic context feature of each phrase in the target text is determined using the scheme of this application, the extended topic feature of the target text is determined based on those topic context features, and the chief complaint type of the target text is determined from the extended topic feature. Disease prediction can subsequently be carried out based on the determined chief complaint type, improving the performance of disease prediction.
For another example, in the field of comment sentiment analysis, in order to identify the sentiment type of a comment text (e.g., negative evaluation, positive evaluation, neutral), a comment text (the target text) can be acquired, the topic context feature of each phrase in the target text is determined using the scheme of this application, the extended topic feature of the target text is determined based on those topic context features, and the sentiment type of the target text is determined from the extended topic feature.
Fig. 1 is a system architecture diagram of text processing according to an embodiment of the present application. The application involves a server 10d and a terminal device cluster, and the terminal device cluster may include: terminal device 10a, terminal device 10b, and terminal device 10c.
Taking the terminal device 10a as an example: the terminal device 10a obtains a target text to be classified and sends it to the server 10d. The server 10d determines the topic context feature between each target phrase in the target text and K text topics according to the topic-phrase weight feature set between the K text topics and V vocabulary phrases in the neural topic model. The server 10d calls the neural topic model to determine the matching weight features between the target text and the K text topics, determines the extended topic feature of the target text according to the topic-phrase weight feature set, the matching weight features, and the topic context feature of each target phrase, combines the extended topic feature and the topic context features of all the target phrases into the target text feature, and classifies the target text feature to obtain the text type to which the target text belongs.
Subsequently, the server 10d may return the recognized text type to the terminal device 10a, which may output it; or the terminal device 10a may further post-process the target text according to the recognized text type and output the post-processing result.
Of course, extracting the topic context feature of each target phrase and determining the text type to which the target text belongs may also be performed by the terminal device itself.
The server 10d shown in fig. 1 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like.
The terminal device 10a, the terminal device 10b, the terminal device 10c, and the like shown in fig. 1 may be an intelligent device having a text processing function, such as a mobile phone, a tablet computer, a notebook computer, a palm computer, a Mobile Internet Device (MID), a wearable device, and the like. The terminal device cluster and the server 10d may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
The following is a detailed description of how the server 10d determines the text type of a text:
please refer to fig. 2, which is a schematic view of a text processing scenario provided in an embodiment of the present application. As shown in fig. 2, the server 10d obtains a text 20a to be recognized, where the text 20a includes N phrases, which are phrase 1, phrase 2, and phrase N. Server 10d first converts text 20a into a bag-of-words vector, where each component in the bag-of-words vector represents the frequency of occurrence of each phrase in the vocabulary in text 20 a. Inputting the bag-of-words vector of the text 20a into a neural topic model 20b, wherein the neural topic model predicts the matching probability between the text and a plurality of topics based on a neural network, the neural topic model 20b includes a weight matrix 20c between the topics and the phrases, and each row of the weight matrix 20c represents the matching weight between a certain topic and a plurality of phrases in the vocabulary. Based on the neural topic model 20b, the matching probabilities between the text 20a and the plurality of topics can be predicted, and the plurality of matching probabilities are combined into a matching probability feature 20 e.
Server 10d extracts the local context features of each phrase in text 20a, where the local context features of each phrase may be determined using a self-attention mechanism or based on a BERT model.
The process of determining the local context characteristics of each phrase based on the self-attention mechanism is as follows:
the server 10d converts each phrase into a word vector, and for any phrase, the server 10d calculates a similarity weight between the word vector of any phrase and the word vector of each phrase in the text 20a, where the similarity weight between two word vectors may be calculated in a dot product, concatenation, or perceptron manner. And normalizing the N similarity weights, weighting the normalized similarity weights and word vectors of all phrases, and superposing the weighted N word vectors as the local context characteristics of any phrase. Server 10d may determine the local context vector for each phrase in text 20a in the same manner.
The process of determining the local context characteristics of each phrase based on the BERT model is as follows:
the server 10d obtains a word vector of each phrase, obtains a sentence position vector of a sentence in which each phrase is located in the text 20a, and obtains a phrase position vector of each phrase in the text 20 a. Combining the word vector, the sentence position vector and the phrase position vector of each phrase into an input vector of each phrase, inputting N input vectors into a trained BERT model, carrying out multi-attention coding on the input vector of each phrase by the BERT model, and taking the feature of each phrase output by the last hidden layer of the BERT model as the local context feature of each phrase.
After determining the local context feature of each phrase in text 20a, the server 10d computes, for any given phrase, the similarity between that phrase's local context feature and each topic-phrase weight vector in the neural topic model 20b (i.e., each row of the weight matrix 20c), obtaining the similarity between the phrase and each topic. The similarities are used to weight the corresponding topic-phrase weight vectors, and the weighted vectors are summed as the global topic context feature of the phrase. The server 10d determines the global topic context vector of every phrase in text 20a in the same way.
At this point, the server 10d has determined the local context feature and the global topic context feature of each phrase in text 20a, and it superimposes the two into the topic context feature of each phrase.
The server 10d combines the topic context features of all the phrases in text 20a into a topic context feature set 20d (shown in fig. 2).
The server 10d inputs the topic context feature set 20d, the matching probability feature 20e, and the weight matrix 20c into the extended knowledge model 20f, which determines a source topic knowledge matrix and a target topic knowledge matrix from the weight matrix 20c. The server 10d determines topic similarity weights between the source topic knowledge matrix and the topic context feature set 20d with the help of the matching probability feature 20e, and performs a weighted summation of the determined topic similarity weights and the target topic knowledge matrix to obtain the extended topic feature 20g of text 20a.
The server 10d concatenates the extended topic feature 20g with each topic context feature in the topic context feature set 20d and inputs the result into the classification model 20i, which outputs the matching probabilities between text 20a and multiple text types; these matching probabilities sum to 1.
As shown in fig. 2, the matching probability between text 20a and text type 1 is 0.1, with text type 2 it is 0.2, with text type 3 it is 0.6, and with text type 4 it is 0.1. The server 10d selects the maximum of the 4 matching probabilities; text type 3, which has the maximum matching probability 0.6, is the text type to which text 20a belongs.
At this point, the server 10d has recognized the text type to which text 20a belongs and may set a corresponding type tag for text 20a.
The specific process of obtaining a target text (e.g., the text 20a in the foregoing embodiment), determining topic context features of each target phrase (e.g., N topic context features in the topic context feature set 20d in the foregoing embodiment), obtaining matching weight features (e.g., the matching probability features 20e in the foregoing embodiment), and determining a business text type (e.g., the text type 3 in the foregoing embodiment) to which the text belongs may refer to the following embodiments corresponding to fig. 3 to fig. 7.
Please refer to fig. 3, which is a schematic flowchart of a text processing method provided in an embodiment of the present application. Taking a server as the execution subject, the following describes how to identify the business text type to which a text belongs. The text processing method may include the following steps:
step S101, a target text is obtained, wherein the target text comprises N target phrases, and N is a positive integer.
Specifically, a server (e.g., the server 10d in the embodiment corresponding to fig. 2) obtains a text to be recognized (referred to as a target text, e.g., the text 20a in the embodiment corresponding to fig. 2), where the target text includes N phrases, each phrase is referred to as a target phrase (e.g., N phrases in the embodiment corresponding to fig. 2), and N is a positive integer.
It should be noted that the target text in this application may specifically be a short text, that is, the number of characters in the target text is smaller than a preset character threshold.
Because a short text contains few characters (and thus little information), its text features are generally sparse, which reduces the accuracy of text recognition. To improve recognition accuracy, this application introduces additional topic knowledge to generate extended topic features for the text, which avoids the feature-sparsity problem and effectively improves text classification accuracy.
Step S102, determining the topic context feature between each target phrase and K text topics according to a topic-phrase weight feature set between the K text topics and V vocabulary phrases, where K and V are positive integers.
Specifically, the server invokes a word vector model (word2vec) to convert each phrase into a word vector feature, i.e., each phrase is represented as a numerical vector. The local context feature of each target phrase is then determined from the word vector features of the N target phrases, using a self-attention mechanism or a BERT model. The local context feature of each target phrase is a vector of dimension V.
A phrase's local context feature means that the feature representation of the phrase is determined not only by the phrase itself but also by the phrases before and after it in the text; the local context feature expresses the phrase's local semantics.
The server obtains a topic-phrase weight feature set (e.g., the weight matrix 20c between topics and phrases in the embodiment corresponding to fig. 2) between K text topics and V vocabulary phrases in a neural topic model (e.g., the neural topic model 20b in the embodiment corresponding to fig. 2). The topic-phrase weight feature set contains K topic-phrase weight features, each representing the matching weights between one text topic and the V vocabulary phrases; the set is a model parameter determined when the neural topic model is trained. The topic-phrase weight feature set can therefore be regarded as a feature matrix with K rows and V columns, each row representing one topic-phrase weight feature.
For example, suppose there are 5 vocabulary phrases (car, train, airplane, mobile phone, and tablet computer) and 2 text topics (transportation and technology). The matching weights between the topic "transportation" and the phrases "car", "train", and "airplane" should then be greater than its matching weights with the phrases "mobile phone" and "tablet computer"; likewise, the matching weights between the topic "technology" and the phrases "mobile phone" and "tablet computer" should be greater than its matching weights with the phrases "car", "train", and "airplane".
The server determines the global topic context feature between each target phrase and the K text topics according to the topic-phrase weight feature set and the local context feature of each target phrase; each global topic context feature is a vector of dimension V.
A phrase's global topic context feature means that the feature representation of the phrase is determined not only by the phrase itself but also by the other phrases in the text; the global topic context feature expresses the phrase's global semantics.
Regarding the problem of polysemous words in short texts: because determining the global topic context feature involves the K text topics, it helps represent the specific meaning of a polysemous word in different sentences. Specifically, each meaning of a polysemous word has a corresponding topic, and this topic information characterizes the word's meaning in different sentences; that is, the global topic context feature can effectively represent the true meaning a polysemous word takes in different sentences.
The server thus obtains the local context feature and the global topic context feature of each target phrase, and superimposes the two into the topic context feature between each target phrase and the K text topics (e.g., the N topic context features in the topic context feature set 20d in the embodiment corresponding to fig. 2); each topic context feature is a vector of dimension V.
Step S103, identifying the matching weight features between the target text and the K text topics, and determining the extended topic feature of the target text according to the topic-phrase weight feature set, the matching weight features, and the topic context feature of each target phrase.
Specifically, the server converts the target text into a bag-of-words feature according to the order of the V vocabulary phrases. The bag-of-words feature is a vector of dimension V, and each of its components represents the frequency with which one vocabulary phrase occurs in the target text.
For example, suppose there are 5 vocabulary phrases: phrase 1, phrase 2, phrase 3, phrase 4, and phrase 5, and the current target text consists of phrase 1, phrase 3, and phrase 1; that is, phrase 1 occurs twice in the target text and phrase 3 once. The bag-of-words feature of the target text can then be expressed as [2,0,1,0,0]. Evidently, the bag-of-words feature expresses only the frequency with which phrases occur in the text and discards their position information. A minimal sketch of this conversion is given below.
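A minimal Python sketch of the bag-of-words conversion just described; the vocabulary list and example text are illustrative stand-ins, not taken from the patent.

```python
import numpy as np

vocabulary = ["phrase1", "phrase2", "phrase3", "phrase4", "phrase5"]

def to_bow(target_phrases, vocabulary):
    """Count how often each vocabulary phrase occurs in the target text."""
    index = {p: i for i, p in enumerate(vocabulary)}
    bow = np.zeros(len(vocabulary))
    for phrase in target_phrases:
        if phrase in index:          # phrases outside the vocabulary are ignored
            bow[index[phrase]] += 1
    return bow

print(to_bow(["phrase1", "phrase3", "phrase1"], vocabulary))  # [2. 0. 1. 0. 0.]
```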
The server then calls the encoder in the trained neural topic model to encode the bag-of-words feature of the target text, obtaining the text encoding feature of the target text.
The specific process can be expressed by the following formula (1) and formula (2):
the encoding process of the encoder comprises two processes of a priori parameter estimation and latent variable estimation, and the process of a priori parameter estimation can be represented by the following formula (1):
μ(x) = l1(f_MLP(x_bow)), σ(x) = l2(f_MLP(x_bow)) (1)

where x_bow denotes the bag-of-words feature, μ(x) and σ(x) are the two prior parameters, and l1, l2, and f_MLP denote fully connected networks with ReLU as the activation function.
The process of latent variable estimation can be represented by the following equation (2):
z = f_ReLU(μ(x) + ε·σ(x)) (2)

As formula (2) shows, after the two prior parameters are obtained, the neural topic model adds them together, with ε as a parameter weight, and then applies a layer of neural network with ReLU as the activation function to obtain the latent variable z.
The latent variable z is the text encoding feature obtained after the encoder encodes the bag-of-words feature.
The server then calls the decoder in the neural topic model to reconstruct the text encoding feature z, obtaining the matching weight features (e.g., the matching probability feature 20e in the embodiment corresponding to fig. 2) between the target text and the K text topics.
The calculation process of the decoder can be described by the following formula (3):
θ = softmax(g(z)) (3)

where g(·) denotes the decoder, softmax(·) denotes the normalization function, and θ is the matching weight feature between the target text and the K text topics.
Thus, the server obtains the matching weight characteristics between the target text and the K text topics, wherein the matching weight characteristics are a vector, and the dimensionality of the vector is K.
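A hedged PyTorch sketch of the encoder and decoder in formulas (1) through (3). The hidden size, the layer shapes, and the treatment of ε as sampled noise (the reparameterization trick common in variational topic models; the patent calls ε a parameter weight) are assumptions; the patent fixes only the overall structure: MLP, two prior parameters, a ReLU layer producing the latent variable z, and a softmax decoder over the K topics.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    def __init__(self, vocab_size_v, num_topics_k, hidden=256):
        super().__init__()
        self.f_mlp = nn.Sequential(nn.Linear(vocab_size_v, hidden), nn.ReLU())
        self.l1 = nn.Linear(hidden, num_topics_k)       # mu(x),    formula (1)
        self.l2 = nn.Linear(hidden, num_topics_k)       # sigma(x), formula (1)
        self.f_z = nn.Sequential(nn.Linear(num_topics_k, num_topics_k), nn.ReLU())
        self.g = nn.Linear(num_topics_k, num_topics_k)  # decoder g(.)

    def forward(self, x_bow):
        h = self.f_mlp(x_bow)
        mu, sigma = self.l1(h), self.l2(h)
        eps = torch.randn_like(sigma)             # assumed noise for epsilon
        z = self.f_z(mu + eps * sigma)            # latent variable, formula (2)
        theta = F.softmax(self.g(z), dim=-1)      # matching weights, formula (3)
        return theta, z
```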
To further alleviate the feature sparsity of short texts, the topic context features of the target phrases are used to match topic knowledge and obtain the extended topic feature; this improves the accuracy of the extended topic feature and reduces matching noise. The specific process of determining the extended topic feature is as follows:
the server inputs the subject phrase weight characteristic set into a first neural sensor in a trained extended knowledge model, the first neural sensor compresses the subject phrase weight characteristic set to obtain a source subject knowledge characteristic matrix, the row number of the source subject knowledge characteristic matrix is K, the column number of the source subject knowledge characteristic matrix is E, wherein E is a positive integer, and E is smaller than V.
And inputting the subject phrase weight characteristic set into a second neural sensor in the trained extended knowledge model, and compressing the subject phrase weight characteristic set by the second neural sensor to obtain a target subject knowledge characteristic matrix, wherein the row number and the column number of the target subject knowledge characteristic matrix are K and E, and E is a positive integer.
That is, the source topic knowledge feature matrix and the target topic knowledge feature matrix are the same size.
The server matches the source topic knowledge feature matrix against the topic context features of the target phrases to obtain a memory weight feature, a vector of dimension K. The memory weight feature and the matching weight feature are added to form the integrated weight feature; as noted above, the matching weight feature is also a K-dimensional vector, so the resulting integrated weight feature is likewise a K-dimensional vector.
The specific process of the server for determining the memory weight characteristics is as follows:
the source topic knowledge characteristic matrix comprises K source topic knowledge characteristics, namely, each row in the source topic knowledge characteristic matrix represents one source topic knowledge characteristic. Taking a source subject knowledge characteristic as an example, how to determine an attention weight coefficient, K attention weight coefficients are determined in the same manner, and the K attention weight coefficients are combined into a memory weight characteristic.
The server firstly splices the source subject knowledge characteristics with the subject context characteristics of N target word groups respectively to obtain N spliced subject context characteristics of the source subject knowledge characteristics, and determines the attention weight coefficient of the source subject knowledge characteristics based on model parameters in the extended knowledge model and the N spliced subject context characteristics of the source subject knowledge characteristics. The process of determining the attention weight coefficient can be described by the following equation (4):
P_k = Σ_{i=1}^{N} f(W·con(S_k, U_i) + b) (4)

where con(·) denotes the concatenation function, S_k the kth source topic knowledge feature in the source topic knowledge feature matrix, U_i the topic context feature of the ith target phrase, W and b the model parameters of the extended knowledge model with activation function f, and P_k the kth attention weight coefficient; that is, P_k is a scalar.
Having determined one attention weight coefficient in this way, the server determines the K attention weight coefficients in the same manner and then combines them into the memory weight feature, as sketched below.
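A minimal NumPy sketch of formula (4): each attention weight coefficient P_k scores one source topic knowledge feature S_k against all N topic context features and aggregates the scores. The scoring parameters W and b, the sigmoid activation, and the summation over phrases are assumptions about the unspecified model parameters of the extended knowledge model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def memory_weights(S, U, W, b):
    """S: (K, E) source topic knowledge features; U: (N, D) topic context
    features; W: (E + D,) and b: scalar are assumed scoring parameters."""
    K, N = S.shape[0], U.shape[0]
    P = np.zeros(K)
    for k in range(K):
        scores = [sigmoid(np.concatenate([S[k], U[i]]) @ W + b) for i in range(N)]
        P[k] = np.sum(scores)        # aggregate over the N spliced features
    return P                         # the K-dimensional memory weight feature
```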
The superposition of the matching weight features and the memory weight features into the integrated weight features can be expressed as the following formula (5):
θ̂_k = θ_k + γ·P_k (5)

where θ_k denotes the kth matching weight coefficient in the matching weight feature, P_k the kth attention weight coefficient in the memory weight feature, θ̂_k the kth integrated feature weight coefficient in the integrated weight feature, and γ a parameter weight. The K integrated feature weight coefficients are combined into the integrated weight feature.
The server performs a weighted summation of the K-dimensional integrated weight feature and the target topic knowledge feature matrix to obtain the extended topic feature of the target text (e.g., the extended topic feature 20g in the embodiment corresponding to fig. 2). The specific process is as follows:
The target topic knowledge feature matrix contains K target topic knowledge features, i.e., each of its rows represents one target topic knowledge feature. The K-dimensional integrated weight feature contains K integrated feature weight coefficients, in one-to-one correspondence with the target topic knowledge features. The server multiplies each of the K target topic knowledge features by its integrated feature weight coefficient, obtaining K knowledge features to be superimposed, and sums them into the extended topic feature of the target text.
For example, suppose there are 3 target topic knowledge features, [1,2,3], [2,4,1], and [0,3,2], and the 3-dimensional integrated weight feature is [2.0,1.0,1.5]. Weighting the 3 target topic knowledge features by the integrated weight feature gives 3 knowledge features to be superimposed: [2,4,6], [2,4,1], and [0,4.5,3]; superimposing them gives the extended topic feature [4,12.5,10].
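The worked example above as a one-line NumPy computation: the K integrated feature weight coefficients weight the rows of the target topic knowledge feature matrix, and the weighted rows are summed into the extended topic feature.

```python
import numpy as np

T = np.array([[1, 2, 3], [2, 4, 1], [0, 3, 2]])  # target topic knowledge (K=3, E=3)
w = np.array([2.0, 1.0, 1.5])                    # integrated weight feature
extended = w @ T                                 # == (w[:, None] * T).sum(axis=0)
print(extended)                                  # [ 4.  12.5 10. ]
```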
The extended topic features matched in this application are more accurate than in previous schemes: for a polysemous word, a single context-free word vector representation can match noisy knowledge irrelevant to the word's meaning in the sentence, whereas the topic context features match extended topic features consistent with the word's true meaning in the sentence.
Step S104, combining the extended topic feature and the topic context features of the N target phrases into a target text feature, and classifying the target text feature to obtain the business text type to which the target text belongs.
Specifically, the server has obtained the extended topic feature of the target text and the topic context feature of each target phrase, and combines them into the target text feature.
There are two ways to combine the extended topic feature and the topic context features of the target phrases into the target text feature.
First: the server concatenates the extended topic feature with the topic context feature of each target phrase, obtaining one target text feature per phrase (N in total); when concatenating, the topic context feature may be placed before or after the extended topic feature.
Second: the server compresses the topic context features of the N target phrases into a single text context feature, a vector, for example by pooling; in other words, a feature matrix is compressed into a feature vector. The server then concatenates the extended topic feature and the text context feature into the target text feature; again, either may be placed first. A sketch of this second strategy follows.
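A minimal NumPy sketch of the second combination strategy; mean pooling is an assumption, since the text allows any pooling method.

```python
import numpy as np

def combine_second_way(topic_context_feats, extended_topic_feat):
    """topic_context_feats: (N, V) matrix; extended_topic_feat: vector."""
    text_context = topic_context_feats.mean(axis=0)  # pool N rows into one vector
    return np.concatenate([text_context, extended_topic_feat])
```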
The server determines the target text features, and then can identify the text type based on the target text features, and the specific process is as follows:
the method and the device adopt the trained classification model to identify the text type, the classification model comprises a convolution pooling layer and a full connection layer, the target text features are input into the classification model, and the convolution pooling layer in the classification model performs convolution pooling processing on the target text features to obtain the convolution features. The convolution pooling layer comprises a convolution layer and a pooling layer, the convolution layer comprises 3 convolution kernels, the sizes of the 3 convolution kernels are 1 xd, 2 xd and 3 xd respectively, and d is the characteristic dimension of the target text characteristic.
And the full connection layer in the classification model performs full connection processing on the convolution characteristics to obtain matching probabilities between the target text and a plurality of service text types in the classification model, and the service text type corresponding to the maximum matching probability in the matching probabilities is used as the service text type of the target text (such as the text type 3 in the embodiment corresponding to the above fig. 2).
The service text type can be an intention type in a medical scene, a chief complaint type in the medical scene, and an emotion type in a comment emotion analysis scene.
Referring to fig. 4, fig. 4 is a schematic diagram of a classification model according to an embodiment of the present application. As shown in fig. 4, the classification model includes a convolution layer, a pooling layer, and a fully connected layer. The target text features are input into the classification model; the convolution layer convolves them, the result is passed to the pooling layer, and the pooling layer pools it into the convolution features (the pooling may be average pooling or maximum pooling). The convolution features are input into the fully connected layer, which produces the matching probabilities between the target text and the business text types in the classification model; among these, the business text type with the maximum matching probability is taken as the business text type of the target text.
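A hedged PyTorch sketch of the classifier in fig. 4: parallel convolutions whose kernel heights are 1, 2, and 3 over the N×d target text features, pooling, and a fully connected layer producing the matching probabilities. The channel count, the choice of max pooling, and the ReLU are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassificationModel(nn.Module):
    def __init__(self, feat_dim_d, num_types, channels=64):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, channels, (h, feat_dim_d)) for h in (1, 2, 3)])
        self.fc = nn.Linear(3 * channels, num_types)

    def forward(self, x):                      # x: (batch, N, d), N >= 3
        x = x.unsqueeze(1)                     # add the channel axis
        pooled = [F.relu(conv(x)).squeeze(3).max(dim=2).values
                  for conv in self.convs]      # max pooling over positions
        probs = F.softmax(self.fc(torch.cat(pooled, dim=1)), dim=-1)
        return probs                           # matching probabilities, sum to 1
```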
According to this method, no manual participation is needed: the terminal device automatically extracts the topic context feature of each phrase in the text and the extended topic feature of the text, and thereby determines the text type of the text. This avoids the inefficiency of manual classification, improves text classification efficiency, and enriches the available text classification approaches. Moreover, the extended topic features matched here are more accurate than in previous schemes: for a polysemous word, a single word vector representation matches noisy knowledge irrelevant to the word's meaning in the sentence, whereas the topic context features match extended topic features consistent with the word's true meaning in the sentence, which improves the accuracy of text classification.
Please refer to fig. 5, which is a flowchart illustrating a method for determining topic context features according to an embodiment of the present application; the method includes the following steps:
step S201, obtaining the word vector feature of each target phrase.
Specifically, the server calls a word vector model (word2vec) to convert each target phrase into a word vector feature, i.e., each phrase is represented as a numerical vector.
Step S202, according to the word vector characteristics of the N target phrases, determining the local context characteristics of each target phrase.
Specifically, the server may determine the local context feature of each target phrase using a self-attention mechanism or a BERT model. The self-attention approach is described first.
For any one of the N target phrases, its local context feature is determined from the word vector features of the N target phrases as follows. The server determines the feature similarity (called the first feature similarity) between the word vector feature of the given phrase and the word vector features of the N target phrases, yielding N first feature similarities; the similarity between two features can be measured by dot product, concatenation, or a perceptron. The server normalizes the N first feature similarities into N standard first feature similarities, each in the range 0 to 1 and summing to 1. The server then performs a weighted summation of the N standard first feature similarities and the word vector features of the N target phrases, obtaining the local context feature of the given phrase.
The self-attention mechanism can be described by the following equation (6):
H = Attention(Q, K, A) = ATT(Q, K)·A (6)

where Q denotes the word vector feature of the given target phrase, K = A, and K and A denote the N word vector features of the N target phrases. As formula (6) shows, the self-attention similarity calculation ATT(Q, K) produces similarity weights, which are then used in a weighted summation with the corresponding word vector features to obtain the local context feature of one target phrase.
For example, suppose the word vector features of 3 target phrases are [1,2,3], [2,4,1], and [0,3,2], and the 3 standard first feature similarities are 0.2, 0.2, and 0.6. The weighted summation of the 3 standard first feature similarities and the 3 word vectors gives the local context feature: [0.2×1 + 0.2×2 + 0.6×0, 0.2×2 + 0.2×4 + 0.6×3, 0.2×3 + 0.2×1 + 0.6×2] = [0.6, 3.0, 2.0].
The server may determine the local context characteristics of each of the N target phrases in the same manner.
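A NumPy sketch of the self-attention computation above, using dot product as the similarity and softmax as the normalization (both named as options in the text); the final line reproduces the worked example with its pre-normalized weights.

```python
import numpy as np

word_vecs = np.array([[1, 2, 3], [2, 4, 1], [0, 3, 2]], dtype=float)  # N=3, d=3

def local_context(i, word_vecs):
    sims = word_vecs @ word_vecs[i]              # dot-product first similarities
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax normalization
    return weights @ word_vecs                   # weighted summation, formula (6)

# With the example's pre-normalized weights [0.2, 0.2, 0.6]:
print(np.array([0.2, 0.2, 0.6]) @ word_vecs)     # [0.6 3.  2. ]
```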
The following is a description of the manner in which the BERT model is used to determine local context characteristics:
the server obtains the phrase position characteristics of each target phrase in the target text, and it can be known that the number of the obtained phrase position characteristics is also N. For example, the target text includes 4 target phrases, and the phrase position feature of the second target phrase may be: [0,1,0,0].
The server obtains the sentence position characteristics of the sentence where each target phrase is located in the target text, and it can be known that the number of the obtained sentence position characteristics is also N. For example, if the target phrase a is in the third sentence of the target text, and the target text includes a total of 4 sentences, the sentence position feature of the target phrase a may be: [0,0,1,0].
The server concatenates the word vector feature, phrase position feature, and sentence position feature of each target phrase into that phrase's input feature; there are likewise N input features. The N input features are input into the trained BERT model, which applies multi-head self-attention encoding to them to obtain the local context feature of each target phrase.
The BERT model is an advanced pre-training model: after training on a large-scale data set, it can be fine-tuned for different tasks, or its sentence vectors and word vectors can be used as supplementary features.
In the present application, the feature output by the last hidden layer of the BERT model may be used as the local context feature of each target phrase.
Please refer to fig. 6, which is a schematic structural diagram of a BERT model provided in an embodiment of the present application. As shown in fig. 6, the server obtains the input feature of each phrase (E1, E2, ...), each combining the word vector feature, phrase position feature, and sentence position feature. The input features of all the phrases are input into the BERT model, and the last hidden layer of the BERT model outputs the local context feature of each phrase.
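A hedged sketch of taking the last hidden layer of a BERT model as the local context features, using the Hugging Face transformers package as a stand-in for the trained BERT model of fig. 6; the checkpoint name and input text are illustrative. The tokenizer and model internally add the word, position, and segment embeddings corresponding to the input features E1, E2, ... described above.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")

inputs = tokenizer("An example target text.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
local_context = outputs.last_hidden_state  # (1, sequence_length, hidden_size)
```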
Step S203, determining the global topic context feature between each target phrase and the K text topics according to the topic-phrase weight feature set and the local context feature of each target phrase.
Specifically, for any one of the N target phrases, the global topic context feature between that phrase and the K text topics is determined from the topic-phrase weight feature set and the phrase's local context feature as follows.
The topic-phrase weight feature set contains K topic-phrase weight features. Regarded as a K×V feature matrix, each of its rows is one topic-phrase weight feature, representing the matching weights between one text topic and the V vocabulary phrases; each topic-phrase weight feature therefore has dimension V.
The server determines the feature similarity (called the second feature similarity) between the local context feature of the given target phrase and each topic-phrase weight feature, yielding K second feature similarities. These are normalized into K standard second feature similarities, each in the range 0 to 1 and summing to 1.
The K standard second feature similarities are then used to weight and sum the K topic-phrase weight features, yielding the global topic context feature between the given target phrase and the K text topics.
The process of determining the global topic context feature can be represented by the following equation (7):
g_i = Σ_{k=1}^{K} α_ik·W_k (7)

where α_ik denotes the standard second feature similarity between the ith target phrase and the kth topic-phrase weight feature W_k, and g_i denotes the global topic context feature of the ith target phrase.
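A NumPy sketch of formula (7); using the dot product as the second feature similarity and softmax as the normalization are assumptions consistent with the surrounding description.

```python
import numpy as np

def global_topic_context(l_i, W):
    """l_i: (V,) local context feature; W: (K, V) topic-phrase weight features."""
    sims = W @ l_i                               # K second feature similarities
    alpha = np.exp(sims) / np.exp(sims).sum()    # standard second similarities
    return alpha @ W                             # g_i, formula (7)
```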
Step S204, superimposing the local context feature and the global topic context feature of each target phrase into the topic context feature between that phrase and the K text topics.
Specifically, the server has obtained the local context feature and the global topic context feature of each target phrase, and adds the two together, obtaining the topic context feature between each target phrase and the K text topics.
The determination of the topic context feature can be described by formula (8):

c_i = l_i + g_i (8)

where c_i denotes the topic context feature of the ith target phrase, l_i its local context feature, and g_i its global topic context feature.
Optionally, the models involved in this application include the neural topic model, the extended knowledge model, and the classification model. The training process of these 3 models is described in detail below.
A training text (referred to as a sample text) is obtained, containing a plurality of sample phrases. The server obtains a sample neural topic model containing a sample topic-phrase weight feature set between K text topics and V vocabulary phrases; regarded as a feature matrix with K rows and V columns, its values may initially be random numbers.
The server determines the sample topic context feature of each sample phrase according to the sample topic-phrase weight feature set; the process is the same as determining the topic context features of the target phrases described above, and is not repeated here.
The server calls the sample neural topic model to determine the sample matching weight features between the sample text and the K text topics; the process is the same as determining the matching weight features of the target text, and is not repeated here.
The server inputs the sample topic-phrase weight feature set, the sample matching weight features, and the sample topic context features of the sample phrases into the sample extended knowledge model, which outputs the sample extended topic feature; the process is the same as determining the extended topic feature.
The server combines the sample extended topic feature and the sample topic context features of the sample phrases into the sample text feature, using either of the two combination ways described for the target text feature.
The server inputs the sample text features into a sample classification model, the sample classification model determines a text type (called as a sample text type) corresponding to the sample text features, obtains a text type label corresponding to the sample text, and determines a classification error according to the sample text type and the sample type label.
The server reconstructs the sample matching weight features according to the sample topic phrase weight feature set in the sample neural topic model to obtain the sample bag-of-words features of the sample text, obtains the bag-of-words feature label of the sample text, and determines a reconstruction error according to the bag-of-words feature label and the sample bag-of-words features. The classification error and the reconstruction error are superposed into a model error.
Wherein, the calculation formula of the model error can be represented by the following formula (9):
L_loss = L_NTM-R + λ·L_CLS + C, where L_NTM-R = D_KL(q(z) ∥ p(z|x)) − E_p(z|x)[log p(x|z)]    (9)

wherein L_loss represents the model error, L_NTM-R denotes the reconstruction error, L_CLS represents the classification error (also the cross-entropy loss function), λ represents the weight hyperparameter, C represents the regularization term, q(z) is the standard normal prior, and p(z|x) and p(x|z) are the probabilities of the encoding process and the decoding process, respectively.
The sample neural topic model, the sample extended knowledge model, and the sample classification model are trained based on the model error. Because the sample topic phrase weight feature set is a model parameter of the sample neural topic model, it is trained along with that model. When the number of training iterations reaches a threshold, or the difference between the model parameters before and after training becomes small, the trained sample neural topic model can be used as the neural topic model, the trained sample extended knowledge model as the extended knowledge model, the trained sample classification model as the classification model, and the trained sample topic phrase weight feature set as the topic phrase weight feature set.
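A non-authoritative sketch of one joint training step follows; the module interfaces (ntm, knowledge_model, classifier and the quantities they return) are assumptions made for illustration, and the regularization term C of formula (9) is omitted:

```python
import torch
import torch.nn.functional as F

def train_step(ntm, knowledge_model, classifier, optimizer,
               bow, word_vecs, labels, lam=1.0):
    """One joint update of the three sample models on a batch of sample texts.
    bow: (batch, V) bag-of-words features; labels: (batch,) text type labels."""
    theta, recon_bow, kl = ntm(bow)             # matching weights, reconstruction, KL term
    ctx = ntm.topic_context(word_vecs)          # sample topic context features (hypothetical API)
    expanded = knowledge_model(ntm.topic_word, theta, ctx)
    logits = classifier(expanded, ctx)
    # reconstruction error of the bag-of-words features (negative ELBO form)
    rec_loss = -(bow * torch.log(recon_bow + 1e-10)).sum(dim=1).mean() + kl.mean()
    cls_loss = F.cross_entropy(logits, labels)  # classification (cross-entropy) error
    loss = rec_loss + lam * cls_loss            # superposed model error, cf. formula (9)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice this step would be repeated over the sample texts until the iteration threshold or the parameter-convergence condition described above is met.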
The following table 1 shows the classification effect of the present application and other comparative models on four different data sets:
TABLE 1
[Table 1 is reproduced as an image in the original publication; it lists Acc and F1 for each compared model on the SearchSnippets, StackOverflow, Biomedical, and Weibo data sets.]
Where Acc represents accuracy and F1 represents the harmonic mean of precision and recall. cs-TMN-self and cs-TMN-BERT are the proposed solutions, where cs-TMN-self obtains the local context features using the self-attention mechanism and cs-TMN-BERT obtains them using the BERT model. Training and testing on 4 commonly used data sets (SearchSnippets, StackOverflow, Biomedical, and Weibo) show that the proposed method yields a significant improvement over traditional short-text classification models on all 4 data sets. Compared with the state-of-the-art method TMN, the present scheme improves by more than 5% on StackOverflow and by 2% to 3% on Biomedical and Weibo.
Referring to fig. 7, fig. 7 is an overall architecture diagram of a text processing method provided in an embodiment of the present application, where the text processing method involves 5 modules, which are a neural topic module, a local context representation module, a global topic context representation module, a topic matching module, and a classification module.
The neural topic module can correspond to the neural topic model in the present application. After the target text is converted into the bag-of-words feature x, an encoder in the neural topic module determines two prior parameters according to formula (1), and then determines a latent variable z according to formula (2). A decoder in the neural topic module reconstructs the latent variable z according to formula (3) to obtain the matching weight feature θ between the target text and the K text topics. The neural topic module contains the topic phrase weight feature set W. The specific process of determining the matching weight feature θ can be referred to step S103 in the embodiment corresponding to fig. 3.
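A minimal VAE-style sketch of such a neural topic module is shown below; the single-layer encoder, the Gaussian reparameterization for the latent variable z, and the softmax decoder are illustrative assumptions rather than the exact formulas (1)-(3):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModule(nn.Module):
    def __init__(self, vocab_size, n_topics, hidden=256):
        super().__init__()
        self.enc = nn.Linear(vocab_size, hidden)
        self.mu = nn.Linear(hidden, n_topics)       # first prior parameter
        self.logvar = nn.Linear(hidden, n_topics)   # second prior parameter
        # topic phrase weight feature set W: vocabulary weights per text topic
        self.W = nn.Linear(n_topics, vocab_size, bias=False)

    def forward(self, bow):                          # bow: (batch, vocab_size)
        h = F.relu(self.enc(bow))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # latent variable z
        theta = F.softmax(z, dim=1)                  # matching weight feature θ
        recon = F.softmax(self.W(theta), dim=1)      # decoder reconstructs the bag of words
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)
        return theta, recon, kl
```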
The local context representation module is configured to determine a local context feature of each target phrase in the target text, where the local context feature may be determined by using a self-attention mechanism, and a specific process of determining the local context feature may refer to step S202 in the embodiment corresponding to fig. 5.
The global topic context representation module is used for determining global topic context characteristics of each target phrase in the target text, and specifically comprises: and determining the global subject context characteristics of each target phrase according to the local context characteristics of each target phrase and the subject phrase weight characteristic set W. The specific process of determining the global topic context feature can be seen in step S203 in the embodiment corresponding to fig. 5. And superposing the local context characteristics and the global theme context characteristics of each target phrase into the theme context characteristics of each target phrase.
And inputting the theme phrase weight characteristic set W, the theme context characteristic of each target phrase and the matching weight characteristic theta into a theme matching module, wherein the theme matching module can correspond to the extended knowledge model in the application. The topic matching module outputs the expanded topic features of the target text, wherein the specific process of determining the expanded topic features can refer to step S103 in the corresponding embodiment of fig. 3. And combining the expansion subject feature and the subject context feature of each target phrase into a target text feature of the target text.
The target text features are input into the classification module, which can correspond to the classification model in the present application. The classification module outputs the matching probabilities between the target text and a plurality of service text types, and the service text type corresponding to the maximum matching probability among the plurality of matching probabilities is taken as the service text type of the target text.
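The classification module can be sketched as a small convolution-pooling network followed by a fully connected layer; the kernel size, filter count, and max-pooling choice below are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassificationModule(nn.Module):
    def __init__(self, feat_dim, n_types, n_filters=128, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(feat_dim, n_filters, kernel, padding=1)
        self.fc = nn.Linear(n_filters, n_types)

    def forward(self, feats):                         # feats: (batch, N, feat_dim)
        h = F.relu(self.conv(feats.transpose(1, 2)))  # convolution features
        h = h.max(dim=2).values                       # pooling over the N phrase positions
        probs = F.softmax(self.fc(h), dim=1)          # matching probabilities per text type
        return probs.argmax(dim=1)                    # type with the maximum matching probability
```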
According to the method, the terminal equipment automatically extracts the theme context features of each phrase in the text and the extended theme features of the text without manual participation, and thereby determines the text type. This avoids the inefficiency of manual classification, improves text classification efficiency, and enriches the available text classification modes. Moreover, the matched extended theme features are more accurate than in previous schemes: if a polysemous word is represented by a single word vector, the matching may retrieve noise knowledge irrelevant to its meaning in the sentence, whereas using the theme context features allows extended theme features consistent with the word's real meaning in the sentence to be matched, which improves the accuracy of text classification. Furthermore, the application provides multiple ways of determining the theme context features of each phrase, enriching the manner in which these features are determined.
Further, please refer to fig. 8, where fig. 8 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present application. As shown in fig. 8, the text processing apparatus 1 can be applied to the terminal device in the embodiments corresponding to fig. 3 to 7 described above. The text processing means may be a computer program (comprising program code) running on a computer device, for example an application software; the apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application.
The text processing apparatus 1 may include: the device comprises an acquisition module 11, a first determination module 12, a first identification module 13, a second determination module 14, a combination module 15 and a second identification module 16.
An obtaining module 11, configured to obtain a target text, where the target text includes N target phrases, and N is a positive integer;
a first determining module 12, configured to determine, according to a theme phrase weight feature set between K text themes and V vocabulary phrases, theme context features of each target phrase and K text themes, where K and V are positive integers;
a first identification module 13, configured to identify matching weight features between the target text and the K text topics;
a second determining module 14, configured to determine an extended topic feature of the target text according to the topic phrase weight feature set, the matching weight feature, and the topic context feature of each target phrase;
a combination module 15, configured to combine the extended topic features and the topic context features of the N target phrases into target text features;
and the second identification module 16 is configured to identify the target text feature to obtain a service text type to which the target text belongs.
The first identification module 13 is specifically configured to:
converting the target text into bag-of-words characteristics according to the arrangement sequence of the V vocabulary phrases;
calling an encoder in the neural topic model to encode the word bag characteristics to obtain text encoding characteristics;
and calling a decoder in the neural topic model to reconstruct the text coding features to obtain matching weight features between the target text and the K text topics.
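As a small sketch of the first of these steps, assuming the V vocabulary phrases are available as an ordered Python list:

```python
from collections import Counter

def to_bow(target_phrases, vocabulary):
    """Convert a segmented target text into a bag-of-words feature whose
    entries follow the arrangement order of the V vocabulary phrases."""
    counts = Counter(target_phrases)
    return [counts.get(phrase, 0) for phrase in vocabulary]

# e.g. to_bow(["fever", "cough", "fever"], ["cough", "fever", "rash"]) -> [1, 2, 0]
```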
The combination module 15 is specifically configured to:
splicing the extended subject characteristics with the subject context characteristics of each target phrase respectively to obtain the target text characteristics of each target phrase; or,
compressing the theme context characteristics of the N target phrases into text context characteristics, and splicing the extended theme characteristics and the text context characteristics into the target text characteristics.
In one embodiment, the text processing apparatus 1 further includes: a training module 17.
A training module 17, configured to: obtain a sample text, where the sample text includes a plurality of sample phrases; determine a sample topic context feature of each sample phrase according to a sample topic phrase weight feature set between K text topics and V vocabulary phrases in a sample neural topic model; obtain sample matching weight features between the sample text and the K text topics; determine sample extended topic features based on a sample extended knowledge model, the sample topic phrase weight feature set, the sample matching weight features, and the plurality of sample topic context features; combine the sample extended topic features and the plurality of sample topic context features into sample text features; invoke a sample classification model to determine the sample text type corresponding to the sample text features, and determine a classification error according to the sample text type; invoke the sample neural topic model to reconstruct the sample matching weight features into sample bag-of-words features, and determine a reconstruction error according to the sample bag-of-words features; and train the sample neural topic model, the sample extended knowledge model, and the sample classification model according to the reconstruction error and the classification error to obtain the neural topic model, the extended knowledge model, and the classification model.
For specific functional implementation manners of the obtaining module 11, the first determining module 12, the first identifying module 13, the second determining module 14, the combining module 15, and the second identifying module 16, reference may be made to steps S101 to S104 in the embodiment corresponding to fig. 3, and for specific functional implementation manners of the training module 17, reference may be made to step S204 in the embodiment corresponding to fig. 5, which is not described herein again.
Referring to fig. 8, the first determining module 12 may include: an acquisition unit 121, a first determination unit 122, and a second determination unit 123.
An obtaining unit 121, configured to obtain a word vector feature of each target phrase;
a first determining unit 122, configured to determine, according to the word vector features of the N target phrases, local context features of each target phrase;
a second determining unit 123, configured to determine global subject context features of each target phrase and K text subjects according to the subject phrase weight feature set and the local context feature of each target phrase;
the obtaining unit 121 is further configured to superimpose the local context feature and the global topic context feature of each target phrase into the topic context feature of each target phrase and K text topics.
In an embodiment, for any target phrase in the N target phrases, when the first determining unit 122 is configured to determine the local context feature of the any target phrase according to the word vector features of the N target phrases, it is specifically configured to:
respectively determining first feature similarity between the word vector features of any one target phrase and the word vector features of the N target phrases;
carrying out normalization processing on the N first feature similarities to obtain N standard first feature similarities;
and carrying out weighted summation on the N standard first feature similarities and the word vector features of the N target phrases to obtain the local context features of any target phrase.
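These three steps amount to a plain self-attention computation; a sketch under that reading (the scaling factor is an assumption) is given below:

```python
import numpy as np

def local_context(E):
    """E: (N, d) word vector features of the N target phrases.
    Row i of the result is the local context feature of the i-th phrase."""
    sim = E @ E.T / np.sqrt(E.shape[1])      # first feature similarities
    sim -= sim.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)  # standard first similarities
    return attn @ E                          # weighted summation over the word vectors
```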
In an embodiment, when the first determining unit 122 is configured to determine the local context feature of each target phrase according to the word vector features of the N target phrases, it is specifically configured to:
acquiring phrase position characteristics of each target phrase in the target text, and acquiring sentence position characteristics of each target phrase in the target text;
splicing the word vector characteristics, the phrase position characteristics and the sentence position characteristics of each target phrase into the input characteristics of each target phrase;
and carrying out multi-attention coding on the N input characteristics to obtain the local context characteristics of each target phrase.
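A sketch of this position-aware variant using a standard multi-head attention layer; the embedding sizes, the single attention layer, and the divisibility of the spliced dimension by the number of heads are assumptions:

```python
import torch
import torch.nn as nn

class PositionalLocalContext(nn.Module):
    def __init__(self, d_word, max_pos, max_sent, d_pos=32, heads=4):
        super().__init__()
        self.pos_emb = nn.Embedding(max_pos, d_pos)    # phrase position features
        self.sent_emb = nn.Embedding(max_sent, d_pos)  # sentence position features
        d_in = d_word + 2 * d_pos                      # spliced input feature size
        assert d_in % heads == 0, "spliced dimension must be divisible by the head count"
        self.attn = nn.MultiheadAttention(d_in, heads, batch_first=True)

    def forward(self, word_vecs, pos_ids, sent_ids):
        # splice word vector, phrase position and sentence position features
        x = torch.cat([word_vecs, self.pos_emb(pos_ids), self.sent_emb(sent_ids)], dim=-1)
        out, _ = self.attn(x, x, x)                    # multi-attention coding
        return out                                     # local context feature per target phrase
```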
In one embodiment, the theme phrase weight feature set includes K theme phrase weight features, and any theme phrase weight feature represents a matching weight between any text theme and V vocabulary phrases;
for any target phrase in the N target phrases, when the second determining unit 123 is configured to determine the global subject context features of the any target phrase and the K text subjects according to the subject phrase weight feature set and the local context features of the any target phrase, specifically configured to:
determining a second feature similarity between the local context feature of any target phrase and the weight feature of each subject phrase;
carrying out normalization processing on the K second feature similarities to obtain K standard second feature similarities;
and carrying out weighted summation on the similarity of the K standard second characteristics and the weight characteristics of the K subject word groups to obtain the global subject context characteristics of any target word group and the K text subjects.
For specific functional implementation manners of the obtaining unit 121, the first determining unit 122, and the second determining unit 123, reference may be made to step S201 to step S204 in the embodiment corresponding to fig. 5, which is not described again here.
Referring again to fig. 8, the second determining module 14 may include: a calling unit 141, a matching unit 142 and a weighting unit 143.
A calling unit 141, configured to call a first neural sensor in an extended knowledge model, compress the subject phrase weight feature set into a source subject knowledge feature matrix, call a second neural sensor in the extended knowledge model, and compress the subject phrase weight feature set into a target subject knowledge feature matrix;
a matching unit 142, configured to match the source topic knowledge feature matrix with the topic context feature of each target phrase, so as to obtain a memory weight feature;
the invoking unit 141 is further configured to superimpose the matching weight features and the memory weight features into integrated weight features;
and the weighting unit 143 is configured to perform weighted summation on the integrated weight feature and the target topic knowledge feature matrix to obtain an extended topic feature of the target text.
In one embodiment, the source topic knowledge feature matrix comprises K source topic knowledge features;
the matching unit 142 is specifically configured to:
splicing the knowledge characteristics of each source theme with the theme context characteristics of the N target word groups respectively to obtain N spliced theme context characteristics of each text theme;
and determining an attention weight coefficient of each text topic according to the context characteristics of the N splicing topics of each text topic, and combining the K attention weight coefficients into the memory weight characteristics.
In one embodiment, the target topic knowledge feature matrix comprises K target topic knowledge features, and the integrated weight features comprise K integrated feature weight coefficients;
the weighting unit 143 is specifically configured to:
weighting the K integrated feature weight coefficients and the K target subject knowledge features to obtain K knowledge features to be superimposed;
and overlapping the K knowledge features to be overlapped into the expansion theme features of the target text.
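Putting the calling, matching, and weighting units together, a hedged sketch of the extended knowledge model follows; the two linear "neural perceptrons", the scoring layer, and the pooling of the spliced features into per-topic attention coefficients are assumptions, since the text fixes only the data flow:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicMatchingModule(nn.Module):
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.src = nn.Linear(vocab_size, d_model)  # first neural perceptron (source matrix)
        self.tgt = nn.Linear(vocab_size, d_model)  # second neural perceptron (target matrix)
        self.score = nn.Linear(2 * d_model, 1)     # scores spliced topic/context features

    def forward(self, W, theta, ctx):
        """W: (K, V) topic phrase weight feature set; theta: (K,) matching weights
        of one target text; ctx: (N, d_model) topic context features of its phrases."""
        S, T = self.src(W), self.tgt(W)            # source / target topic knowledge matrices
        K, N = S.size(0), ctx.size(0)
        # splice each source topic knowledge feature with the N topic context features
        pairs = torch.cat([S.unsqueeze(1).expand(K, N, -1),
                           ctx.unsqueeze(0).expand(K, N, -1)], dim=-1)
        scores = self.score(pairs).squeeze(-1).mean(dim=1)  # one score per text topic
        mem = F.softmax(scores, dim=0)             # memory weight features (attention coefficients)
        integrated = theta + mem                   # superpose matching and memory weights
        return integrated @ T                      # extended topic feature of the target text
```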
The specific functional implementation manners of the calling unit 141, the matching unit 142, and the weighting unit 143 may refer to step S203 in the embodiment corresponding to fig. 5, which is not described herein again.
Further, please refer to fig. 9, which is a schematic structural diagram of a computer device according to an embodiment of the present application. The server in the above embodiments corresponding to fig. 3-7 may be a computer device 1000, and as shown in fig. 9, the computer device 1000 may include: a user interface 1002, a processor 1004, an encoder 1006, and a memory 1008. Signal receiver 1016 is used to receive or transmit data via cellular interface 1010, WIFI interface 1012. The encoder 1006 encodes the received data into a computer-processed data format. The memory 1008 has stored therein a computer program by which the processor 1004 is arranged to perform the steps of any of the method embodiments described above. The memory 1008 may include volatile memory (e.g., dynamic random access memory DRAM) and may also include non-volatile memory (e.g., one time programmable read only memory OTPROM). In some instances, the memory 1008 can further include memory located remotely from the processor 1004, which can be connected to the computer device 1000 via a network. The user interface 1002 may include: a keyboard 1018, and a display 1020.
In the computer device 1000 shown in fig. 9, the processor 1004 may be configured to invoke the computer program stored in the memory 1008 to implement:
acquiring a target text, wherein the target text comprises N target word groups, and N is a positive integer;
determining the theme context characteristics of each target phrase and K text topics according to a theme phrase weight characteristic set between K text topics and V vocabulary phrases, wherein K and V are positive integers;
identifying matching weight characteristics between the target text and the K text topics, and determining expansion topic characteristics of the target text according to the topic phrase weight characteristic set, the matching weight characteristics and topic context characteristics of each target phrase;
and combining the extended subject feature and the subject context features of the N target word groups into a target text feature, and identifying the target text feature to obtain the service text type to which the target text belongs.
In one embodiment, when the processor 1004 determines the topic context characteristics of each target phrase and K text topics according to the topic phrase weight characteristic set between K text topics and V vocabulary phrases, the following steps are specifically performed:
acquiring the word vector characteristics of each target phrase;
determining local context characteristics of each target phrase according to the word vector characteristics of the N target phrases;
determining global subject context characteristics of each target phrase and K text subjects according to the subject phrase weight characteristic set and the local context characteristics of each target phrase;
and superposing the local context characteristics and the global subject context characteristics of each target phrase into the subject context characteristics of each target phrase and K text subjects.
In one embodiment, for any target phrase in the N target phrases, when the processor 1004 determines the local context feature of the target phrase according to the word vector feature of the N target phrases, the following steps are specifically performed:
respectively determining first feature similarity between the word vector features of any one target phrase and the word vector features of the N target phrases;
carrying out normalization processing on the N first feature similarities to obtain N standard first feature similarities;
and carrying out weighted summation on the N standard first feature similarities and the word vector features of the N target phrases to obtain the local context features of any target phrase.
In one embodiment, when the processor 1004 determines the local context feature of each target phrase according to the word vector features of the N target phrases, the following steps are specifically performed:
acquiring phrase position characteristics of each target phrase in the target text, and acquiring sentence position characteristics of each target phrase in the target text;
splicing the word vector characteristics, the phrase position characteristics and the sentence position characteristics of each target phrase into the input characteristics of each target phrase;
and carrying out multi-attention coding on the N input characteristics to obtain the local context characteristics of each target phrase.
In one embodiment, the theme phrase weight feature set includes K theme phrase weight features, and any theme phrase weight feature represents a matching weight between any text theme and V vocabulary phrases;
for any target phrase in the N target phrases, when the processor 1004 determines the global subject context features of the any target phrase and the K text topics according to the subject phrase weight feature set and the local context features of the any target phrase, the following steps are specifically performed:
determining a second feature similarity between the local context feature of any target phrase and the weight feature of each subject phrase;
carrying out normalization processing on the K second feature similarities to obtain K standard second feature similarities;
and carrying out weighted summation on the similarity of the K standard second characteristics and the weight characteristics of the K subject word groups to obtain the global subject context characteristics of any target word group and the K text subjects.
In one embodiment, the processor 1004, when performing the step of identifying the matching weight features between the target text and the K text topics, specifically performs the following steps:
converting the target text into bag-of-words characteristics according to the arrangement sequence of the V vocabulary phrases;
calling an encoder in the neural topic model to encode the word bag characteristics to obtain text encoding characteristics;
and calling a decoder in the neural topic model to reconstruct the text coding features to obtain matching weight features between the target text and the K text topics.
In one embodiment, when the processor 1004 determines the expanded subject feature of the target text according to the subject phrase weight feature set, the matching weight feature and the subject context feature of each target phrase, the following steps are specifically performed:
calling a first nerve perceptron in an extended knowledge model, and compressing the subject phrase weight characteristic set into a source subject knowledge characteristic matrix;
calling a second neural perceptron in the extended knowledge model, and compressing the theme phrase weight characteristic set into a target theme knowledge characteristic matrix;
matching the source topic knowledge characteristic matrix with the topic context characteristics of each target phrase to obtain memory weight characteristics;
superposing the matching weight features and the memory weight features into integrated weight features;
and carrying out weighted summation on the integrated weight characteristics and the target subject knowledge characteristic matrix to obtain the extended subject characteristics of the target text.
In one embodiment, the source topic knowledge feature matrix comprises K source topic knowledge features;
when the processor 1004 performs matching on the source topic knowledge characteristic matrix and the topic context characteristics of each target phrase to obtain the memory weight characteristics, the following steps are specifically performed:
splicing the knowledge characteristics of each source theme with the theme context characteristics of the N target word groups respectively to obtain N spliced theme context characteristics of each text theme;
and determining an attention weight coefficient of each text topic according to the context characteristics of the N splicing topics of each text topic, and combining the K attention weight coefficients into the memory weight characteristics.
In one embodiment, the target topic knowledge feature matrix comprises K target topic knowledge features, and the integrated weight features comprise K integrated feature weight coefficients;
when the processor 1004 performs weighted summation on the integrated weight features and the target topic knowledge feature matrix to obtain the extended topic features of the target text, the following steps are specifically performed:
weighting the K integrated feature weight coefficients and the K target subject knowledge features to obtain K knowledge features to be superimposed;
and overlapping the K knowledge features to be overlapped into the expansion theme features of the target text.
In one embodiment, when the processor 1004 performs the combination of the expansion topic feature and the topic context features of the N target phrases into the target text feature, the following steps are specifically performed:
splicing the extended subject characteristics with the subject context characteristics of each target phrase respectively to obtain the target text characteristics of each target phrase; or,
compressing the theme context characteristics of the N target phrases into text context characteristics, and splicing the extended theme characteristics and the text context characteristics into the target text characteristics.
In an embodiment, when the processor 1004 performs the step of identifying the target text feature to obtain the service text type to which the target text belongs, the following steps are specifically performed:
calling a classification model to carry out convolution pooling on the target text features to obtain convolution features;
calling the classification model to carry out full-connection processing on the convolution characteristics to obtain the matching probability between the target text and a plurality of service text types;
and taking the service text type corresponding to the maximum matching probability in the multiple matching probabilities as the service text type of the target text.
In one embodiment, the processor 1004 further performs the following steps:
obtaining a sample text, wherein the sample text comprises a plurality of sample word groups;
determining sample theme context characteristics of each sample phrase according to a sample theme phrase weight characteristic set between K text themes and V vocabulary phrases in a sample neural theme model;
acquiring sample matching weight characteristics between the sample text and K text topics;
determining sample expansion topic features based on a sample expansion knowledge model, the sample topic phrase weight feature set, the sample matching weight features and a plurality of sample topic context features;
combining the sample expanded topic feature and the plurality of sample topic context features into a sample text feature;
calling a sample classification model to determine a sample text type corresponding to the sample text characteristics, and determining a classification error according to the sample text type;
calling the sample neural topic model, reconstructing the sample matching weight characteristics to obtain sample bag-of-words characteristics, and determining a reconstruction error according to the sample bag-of-words characteristics;
and training the sample neural topic model, the sample extended knowledge model and the sample classification model according to the reconstruction error and the classification error to obtain a neural topic model, an extended knowledge model and a classification model.
It should be understood that the computer device 1000 described in this embodiment of the present application may perform the description of the text processing method in the embodiment corresponding to fig. 3 to fig. 7, and may also perform the description of the text processing apparatus 1 in the embodiment corresponding to fig. 8, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
Further, it should be noted that an embodiment of the present application also provides a computer storage medium storing the aforementioned computer program executed by the text processing apparatus 1. The computer program includes program instructions which, when executed by a processor, can perform the method in the embodiments corresponding to fig. 3 to 7; details are therefore not repeated here, nor are the beneficial effects of the same method. For technical details not disclosed in the computer storage medium embodiments of the present application, reference is made to the description of the method embodiments of the present application. By way of example, the program instructions may be deployed to be executed on one computer device, or on multiple computer devices at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain system.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instruction from the computer-readable storage medium, and executes the computer instruction, so that the computer device can perform the method in the embodiment corresponding to fig. 3 to fig. 7, and therefore, the detailed description thereof will not be repeated here.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting its scope; equivalent variations and modifications made in accordance with the present application shall still fall within the scope of the present application.

Claims (14)

1. A method of text processing, comprising:
acquiring a target text, wherein the target text comprises N target word groups, and N is a positive integer;
determining the theme context characteristics of each target phrase and K text topics according to a theme phrase weight characteristic set between K text topics and V vocabulary phrases, wherein K and V are positive integers;
identifying matching weight characteristics between the target text and the K text topics, calling a first neural perceptron in an extended knowledge model, and compressing the topic phrase weight characteristic set into a source topic knowledge characteristic matrix;
calling a second neural perceptron in the extended knowledge model, and compressing the theme phrase weight characteristic set into a target theme knowledge characteristic matrix;
matching the source topic knowledge characteristic matrix with the topic context characteristics of each target phrase to obtain memory weight characteristics;
superposing the matching weight features and the memory weight features into integrated weight features;
carrying out weighted summation on the integrated weight characteristics and the target subject knowledge characteristic matrix to obtain the extended subject characteristics of the target text;
and combining the extended subject feature and the subject context features of the N target word groups into a target text feature, and identifying the target text feature to obtain the service text type to which the target text belongs.
2. The method of claim 1, wherein determining the subject context characteristics of each target phrase and K text topics from the set of subject phrase weight characteristics between K text topics and V vocabulary phrases comprises:
acquiring the word vector characteristics of each target phrase;
determining local context characteristics of each target phrase according to the word vector characteristics of the N target phrases;
determining global subject context characteristics of each target phrase and K text subjects according to the subject phrase weight characteristic set and the local context characteristics of each target phrase;
and superposing the local context characteristics and the global subject context characteristics of each target phrase into the subject context characteristics of each target phrase and K text subjects.
3. The method according to claim 2, wherein the process of determining the local context feature of any one of the N target phrases according to the word vector feature of the N target phrases comprises, for any one of the N target phrases:
respectively determining first feature similarity between the word vector features of any one target phrase and the word vector features of the N target phrases;
carrying out normalization processing on the N first feature similarities to obtain N standard first feature similarities;
and carrying out weighted summation on the N standard first feature similarities and the word vector features of the N target phrases to obtain the local context features of any target phrase.
4. The method according to claim 2, wherein the determining the local context feature of each target phrase according to the word vector features of the N target phrases comprises:
acquiring phrase position characteristics of each target phrase in the target text, and acquiring sentence position characteristics of each target phrase in the target text;
splicing the word vector characteristics, the phrase position characteristics and the sentence position characteristics of each target phrase into the input characteristics of each target phrase;
and carrying out multi-attention coding on the N input characteristics to obtain the local context characteristics of each target phrase.
5. The method of claim 2, wherein the set of subject phrase weight features includes K subject phrase weight features, any subject phrase weight feature representing a matching weight between any text subject and V vocabulary phrases;
for any target phrase in the N target phrases, the process of determining the global subject context characteristics of the any target phrase and the K text subjects according to the subject phrase weight characteristic set and the local context characteristics of the any target phrase comprises the following steps:
determining a second feature similarity between the local context feature of any target phrase and the weight feature of each subject phrase;
carrying out normalization processing on the K second feature similarities to obtain K standard second feature similarities;
and carrying out weighted summation on the similarity of the K standard second characteristics and the weight characteristics of the K subject word groups to obtain the global subject context characteristics of any target word group and the K text subjects.
6. The method of claim 1, wherein the identifying matching weight features between the target text and the K text topics comprises:
converting the target text into bag-of-words characteristics according to the arrangement sequence of the V vocabulary phrases;
calling an encoder in the neural topic model to encode the word bag characteristics to obtain text encoding characteristics;
and calling a decoder in the neural topic model to reconstruct the text coding features to obtain matching weight features between the target text and the K text topics.
7. The method of claim 1, wherein the source topic knowledge feature matrix comprises K source topic knowledge features;
the matching the source topic knowledge characteristic matrix and the topic context characteristics of each target phrase to obtain the memory weight characteristics comprises:
splicing the knowledge characteristics of each source theme with the theme context characteristics of the N target word groups respectively to obtain N spliced theme context characteristics of each text theme;
and determining an attention weight coefficient of each text topic according to the context characteristics of the N splicing topics of each text topic, and combining the K attention weight coefficients into the memory weight characteristics.
8. The method of claim 7, wherein the target topic knowledge feature matrix comprises K target topic knowledge features, and the integrated weight features comprise K integrated feature weight coefficients;
the weighted summation of the integrated weight features and the target topic knowledge feature matrix to obtain the extended topic features of the target text comprises the following steps:
weighting the K integrated feature weight coefficients and the K target subject knowledge features to obtain K knowledge features to be superimposed;
and overlapping the K knowledge features to be overlapped into the expansion theme features of the target text.
9. The method of claim 1, wherein said combining the expanded subject feature and the subject context features of the N target phrases into a target text feature comprises:
splicing the extended subject characteristics with the subject context characteristics of each target phrase respectively to obtain the target text characteristics of each target phrase; or,
compressing the theme context characteristics of the N target phrases into text context characteristics, and splicing the extended theme characteristics and the text context characteristics into the target text characteristics.
10. The method of claim 1, wherein the identifying the target text feature to obtain a service text type to which the target text belongs comprises:
calling a classification model to carry out convolution pooling on the target text features to obtain convolution features;
calling the classification model to carry out full-connection processing on the convolution characteristics to obtain the matching probability between the target text and a plurality of service text types;
and taking the service text type corresponding to the maximum matching probability in the multiple matching probabilities as the service text type of the target text.
11. The method of any one of claims 1-10, further comprising:
obtaining a sample text, wherein the sample text comprises a plurality of sample word groups;
determining sample theme context characteristics of each sample phrase according to a sample theme phrase weight characteristic set between K text themes and V vocabulary phrases in a sample neural theme model;
acquiring sample matching weight characteristics between the sample text and K text topics;
determining sample expansion topic features based on a sample expansion knowledge model, the sample topic phrase weight feature set, the sample matching weight features and a plurality of sample topic context features;
combining the sample expanded topic feature and the plurality of sample topic context features into a sample text feature;
calling a sample classification model to determine a sample text type corresponding to the sample text characteristics, and determining a classification error according to the sample text type;
calling the sample neural topic model, reconstructing the sample matching weight characteristics to obtain sample bag-of-words characteristics, and determining a reconstruction error according to the sample bag-of-words characteristics;
and training the sample neural topic model, the sample extended knowledge model and the sample classification model according to the reconstruction error and the classification error to obtain a neural topic model, an extended knowledge model and a classification model.
12. A text processing apparatus, comprising:
the acquisition module is used for acquiring a target text, wherein the target text comprises N target word groups, and N is a positive integer;
the first determining module is used for determining the theme context characteristics of each target phrase and K text topics according to a theme phrase weight characteristic set between K text topics and V vocabulary phrases, wherein K and V are positive integers;
the first identification module is used for identifying matching weight characteristics between the target text and the K text topics;
a second determining module, configured to determine an extended topic feature of the target text according to the topic phrase weight feature set, the matching weight feature, and the topic context feature of each target phrase;
the combination module is used for combining the expansion subject characteristics and the subject context characteristics of the N target phrases into target text characteristics;
the second identification module is used for identifying the target text characteristics to obtain the service text type of the target text;
wherein the second determining module comprises:
the calling unit is used for calling a first nerve perceptron in an extended knowledge model, compressing the subject phrase weight characteristic set into a source subject knowledge characteristic matrix, calling a second nerve perceptron in the extended knowledge model, and compressing the subject phrase weight characteristic set into a target subject knowledge characteristic matrix;
the matching unit is used for matching the source topic knowledge characteristic matrix with the topic context characteristics of each target phrase to obtain memory weight characteristics;
the calling unit is further used for superposing the matching weight features and the memory weight features into integrated weight features;
and the weighting unit is used for weighting and summing the integrated weight characteristics and the target subject knowledge characteristic matrix to obtain the extended subject characteristics of the target text.
13. A computer arrangement comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of the method according to any one of claims 1-11.
14. A computer storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method of any one of claims 1-11.
CN202010872702.4A 2020-08-26 2020-08-26 Text processing method and device, computer equipment and storage medium Active CN112035662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010872702.4A CN112035662B (en) 2020-08-26 2020-08-26 Text processing method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010872702.4A CN112035662B (en) 2020-08-26 2020-08-26 Text processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112035662A CN112035662A (en) 2020-12-04
CN112035662B true CN112035662B (en) 2021-06-08

Family

ID=73580029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010872702.4A Active CN112035662B (en) 2020-08-26 2020-08-26 Text processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112035662B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342940B (en) * 2021-06-24 2023-12-08 中国平安人寿保险股份有限公司 Text matching analysis method and device, electronic equipment and storage medium
CN114565928B (en) * 2022-03-01 2024-07-23 北京字节跳动网络技术有限公司 Text recognition method, device, equipment and storage medium
CN118016227A (en) * 2024-04-10 2024-05-10 天津医科大学第二医院 Electronic medical record identification and retrieval system and method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060069678A1 (en) * 2004-09-30 2006-03-30 Wu Chou Method and apparatus for text classification using minimum classification error to train generalized linear classifier
CN101794311B (en) * 2010-03-05 2012-06-13 南京邮电大学 Fuzzy data mining based automatic classification method of Chinese web pages
CN102567464B (en) * 2011-11-29 2015-08-05 西安交通大学 Based on the knowledge resource method for organizing of expansion thematic map
CN111368079B (en) * 2020-02-28 2024-06-25 腾讯科技(深圳)有限公司 Text classification method, model training method, device and storage medium

Also Published As

Publication number Publication date
CN112035662A (en) 2020-12-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant