CN110913354A - Short message classification method and device and electronic equipment


Info

Publication number
CN110913354A
Authority
CN
China
Prior art keywords
short message
message text
word
network
acquiring
Prior art date
Legal status
Pending
Application number
CN201811084292.6A
Other languages
Chinese (zh)
Inventor
高喆
康杨杨
周笑添
孙常龙
刘晓钟
司罗
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201811084292.6A priority Critical patent/CN110913354A/en
Publication of CN110913354A publication Critical patent/CN110913354A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/12 Messaging; Mailboxes; Announcements
    • H04W 4/14 Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W 12/02 Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W 12/12 Detection or prevention of fraud

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a short message classification method and device, a short message category prediction model construction method and device and electronic equipment. The short message classification method comprises the following steps: acquiring a short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; and acquiring the multi-class predicted value of the short message text according to the characteristics through the multi-class prediction subnetwork. By adopting the processing mode, the deep multi-label learning model is combined to improve the expression capability of the features; therefore, the accuracy of short message classification can be effectively improved.

Description

Short message classification method and device and electronic equipment
Technical Field
The application relates to the technical field of text classification, and in particular to a short message classification method and device, a short message category prediction model construction method and device, and electronic equipment.
Background
A typical scenario for sending a short message is that a merchant sends a short message to a consumer through a network platform, so as to deliver information such as sales promotions to the consumer in time, thereby ensuring effective implementation of the merchant's sales plan and improving user experience. However, along with these benefits, a large amount of spam has also emerged. The flood of spam messages seriously affects consumers' normal lives, the image of the network platform, and even social stability.
With the continuous development of internet technology, more and more network platforms utilize short message content security systems to perform content analysis on short messages of Business-to-Customer (B2C), and perform intelligent short message interception and channel optimization. The short message classification is an important function of a short message content safety system, and by classifying the short messages, each attribute dimension of the short messages can be effectively analyzed, so that a short message sending channel is reasonably scheduled, the service is safer, and the whole sending cost is reduced.
Short message classification methods fall mainly into three categories: methods based on a two-classification (binary) model, methods based on a multi-classification model, and methods based on multi-label learning. The method based on the two-classification model requires a separate model to be trained for each class, so the number of models to be trained is very large; the method based on the multi-classification model does not consider samples that overlap among attribute classes, so a large amount of noise is introduced during training, and a sample appearing in multiple classes becomes indistinguishable; the method based on multi-label learning, such as FastXML or SLEEC, solves to a certain extent the problem that a traditional single-label classification model cannot adapt to multi-label classification, and has therefore become the most common short message classification method at present.
However, in the process of implementing the present invention, the inventors found that the existing short message classification scheme based on multi-label learning has at least the following problem: when facing a large number of categories, the existing short message feature representation is too simple, so the accuracy of the overall short message classification is low.
Disclosure of Invention
The application provides a short message classification method to solve the problem of low short message classification accuracy in the prior art. The application further provides a short message classification device, a short message category prediction model construction method and device, and electronic equipment.
The application provides a short message classification method, which comprises the following steps:
acquiring a short message text to be processed;
performing word embedding on the short message text to obtain a word vector included in the short message text;
taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network;
and acquiring the multi-class predicted value of the short message text according to the characteristics through the multi-class prediction subnetwork.
Optionally, the network structure of the short message feature extraction sub-network comprises a Bi-directional long-short term memory network structure Bi-LSTM;
the using the word vector included in the short message text as the input data of the short message category prediction model comprises the following steps:
taking a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM.
Optionally, the method further includes:
acquiring signature information corresponding to the short message text;
performing word embedding on the signature information to obtain a word vector of the signature information;
the obtaining the multi-class prediction value of the short message text according to the characteristics through the multi-class prediction sub-network comprises:
and acquiring the multi-class predicted value according to the feature and the word vector of the signature information through the multi-class prediction subnetwork.
Optionally, the performing word embedding on the signature information to obtain a word vector of the signature information includes:
acquiring word vectors of the words in the signature information;
and determining the word vector of the signature information according to these word vectors.
Optionally, the performing word embedding on the short message text to obtain the word vectors included in the short message text includes:
acquiring a first word vector of a word included in the short message text; and acquiring character vectors of the characters in the short message text;
and determining the word vectors included in the short message text according to the first word vector and the character vectors.
Optionally, the method further includes:
acquiring a training short message text set comprising category marking information;
and learning the short message type prediction model from the short message text set for training.
Optionally, the method further includes:
acquiring signature information corresponding to the short message text for training;
the learning of the short message type prediction model from the short message text set for training comprises the following steps:
and learning to obtain the short message type prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
Optionally, the loss function of the multi-class prediction sub-network includes a binary cross entropy function.
The present application further provides a short message classification device, including:
the short message text acquisition unit is used for acquiring a short message text to be processed;
the first word embedding unit is used for executing word embedding on the short message text to obtain a word vector included by the short message text;
the feature extraction unit is used for taking word vectors contained in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network, and features of the short message text are extracted according to the word vectors contained in the short message text through the short message feature extraction sub-network;
and the multi-class prediction unit is used for acquiring a multi-class prediction value of the short message text according to the characteristics through the multi-class prediction sub-network.
Optionally, the network structure of the short message feature extraction sub-network comprises a Bi-directional long-short term memory network structure Bi-LSTM;
the feature extraction unit is specifically configured to use a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM.
Optionally, the method further includes:
the signature information acquisition unit is used for acquiring signature information corresponding to the short message text;
the second word embedding unit is used for executing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction unit is specifically configured to obtain, through the multi-class prediction subnetwork, the multi-class prediction value according to the feature and the word vector of the signature information.
Optionally, the second word embedding unit includes:
a word vector acquiring subunit, configured to acquire word vectors of the words in the signature information;
and a word vector determining subunit, configured to determine the word vector of the signature information according to these word vectors.
Optionally, the first word embedding unit includes:
a first word vector acquiring subunit, configured to acquire a first word vector of a word included in the short message text;
a character vector acquiring subunit, configured to acquire character vectors of the characters in the short message text;
and a word vector determining subunit, configured to determine the word vectors included in the short message text according to the first word vector and the character vectors.
Optionally, the method further includes:
the training sample acquisition unit is used for acquiring a training short message text set comprising category marking information;
and the model training unit is used for learning the short message type prediction model from the short message text set for training.
Optionally, the method further includes:
acquiring signature information corresponding to the short message text for training;
the learning of the short message type prediction model from the short message text set for training comprises the following steps:
and learning to obtain the short message type prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
The present application further provides an electronic device, comprising:
a processor; and
the memory is used for storing a program for realizing the short message classification method, and after the device is powered on and the program for realizing the short message classification method is run by the processor, the following steps are executed: acquiring a short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; and acquiring the multi-class predicted value of the short message text according to the characteristics through the multi-class prediction subnetwork.
The application also provides a short message category prediction model construction method, which comprises the following steps:
acquiring a training short message text set comprising category marking information;
constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the characteristics;
performing word embedding on the short message text to obtain a word vector included in the short message text;
and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
Optionally, the method further includes:
acquiring signature information corresponding to the short message text;
performing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
Optionally, the loss function of the multi-class prediction sub-network includes a binary cross entropy function.
The present application further provides a short message category prediction model construction device, including:
the training sample acquisition unit is used for acquiring a training short message text set comprising category marking information;
the deep neural network construction unit is used for constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the characteristics;
the first word embedding unit is used for executing word embedding on the short message text to obtain a word vector included by the short message text;
and the model training unit is used for taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
Optionally, the method further includes:
the signature information acquisition unit is used for acquiring signature information corresponding to the short message text;
the second word embedding unit is used for executing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
The present application further provides an electronic device, comprising:
a processor; and
the memory is used for storing a program for realizing the short message type prediction model building method, and after the device is electrified and runs the program for realizing the short message type prediction model building method through the processor, the following steps are executed: acquiring a training short message text set comprising category marking information;
constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the characteristics; performing word embedding on the short message text to obtain a word vector included in the short message text; and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
The present application also provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the various methods described above.
The present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the various methods described above.
Compared with the prior art, the method has the following advantages:
the short message classification method provided by the embodiment of the application obtains the short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network based on a depth model, extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; acquiring a multi-class predicted value of the short message text according to the characteristics through the multi-class prediction sub-network; the processing mode is combined with a deep multi-label learning model to improve the expressive power of the features; therefore, the accuracy of short message classification can be effectively improved. In addition, the processing mode only needs to train one deep multi-label learning model to learn the relationship between the short message text and thousands of short message categories, and the model does not need to be trained independently aiming at all the categories; therefore, the number of models can be effectively reduced.
The method for constructing the short message category prediction model comprises the steps of obtaining a short message text set for training, wherein the short message text set comprises category marking information; constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the characteristics; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model; the processing mode is combined with a deep multi-label learning model to improve the expressive power of the features; therefore, the prediction accuracy of the short message type prediction model can be effectively improved. Meanwhile, the relation between the short message text and thousands of short message categories can be learned only by training one deep multi-label learning model, and the model does not need to be trained independently aiming at all categories; therefore, the number of models can be effectively reduced.
Drawings
Fig. 1 is a flowchart of an embodiment of a short message classification method provided in the present application;
fig. 2 is a schematic diagram of a short message classification prediction model according to an embodiment of a short message classification method provided in the present application;
fig. 3 is a detailed flowchart of an embodiment of a short message classification method provided in the present application;
FIG. 4 is a diagram of another short message classification prediction model according to an embodiment of the short message classification method provided in the present application;
fig. 5 is a detailed flowchart of an embodiment of a short message classification method provided in the present application;
fig. 6 is a schematic diagram of an embodiment of a short message classification device provided in the present application;
fig. 7 is a detailed schematic diagram of an embodiment of a short message classification device provided in the present application;
fig. 8 is a detailed schematic diagram of an embodiment of a short message classification device provided in the present application;
FIG. 9 is a schematic diagram of an embodiment of an electronic device provided herein;
FIG. 10 is a flowchart of an embodiment of a method for constructing a short message category prediction model according to the present application;
fig. 11 is a specific flowchart of an embodiment of a short message category prediction model construction method provided in the present application;
fig. 12 is a schematic diagram of an embodiment of a short message category prediction model construction device provided in the present application;
fig. 13 is a schematic diagram of an embodiment of a short message category prediction model construction device provided in the present application;
FIG. 14 is a schematic diagram of an embodiment of an electronic device provided herein.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
The application provides a short message classification method and device, a short message category prediction model construction method and device and electronic equipment. Each of the schemes is described in detail in the following examples.
First embodiment
Please refer to fig. 1, which is a flowchart illustrating an embodiment of a method for classifying short messages according to the present application, wherein an executing entity of the method includes a short message classification apparatus. The short message classification method provided by the application comprises the following steps:
step S101: and acquiring a short message text to be processed.
The short message text, also called the short message content, includes but is not limited to the text of a mobile phone short message (SMS); it may also be an instant message or another form of short text message.
In specific implementation, the short message classification device can intercept short message texts sent by a short message sender, and perform multi-class identification on the short message texts, so that intelligent short message interception, channel optimization and other processing are facilitated.
Step S103: and performing word embedding on the short message text to obtain a word vector included by the short message text.
After the short message text to be processed is obtained, the short message text can be organized according to word vectors in a word embedding mode, and the word vectors capable of expressing the semantics of the short message text are obtained, so that the text features can be conveniently mined according to the word vectors.
In one example, step S103 may include the following sub-steps: 1) obtaining words included in the short message text through a word segmentation algorithm to serve as short message words; 2) and performing word embedding on the short message words to obtain word vectors of the short message words.
1) And acquiring words included in the short message text as short message words through a word segmentation algorithm.
In specific implementation, an existing word segmentation algorithm can be adopted to perform word segmentation on the short message text. Existing word segmentation algorithms can be divided into three categories: word segmentation methods based on character string matching, word segmentation methods based on understanding, and word segmentation methods based on statistics. According to whether word segmentation is combined with part-of-speech tagging, they can also be divided into pure word segmentation methods and integrated methods that combine segmentation and tagging. Word segmentation algorithms belong to the mature prior art and are not repeated herein; any existing word segmentation algorithm can be selected according to actual requirements.
For example, a spam message reads 'Dear customer, apply for an XX credit card in May and redeem a 2080-yuan Crown trolley case for free …', and the word segmentation result includes the following words: dear, customer, May, apply, XX credit, card, free, redeem, 2080 yuan, Crown, trolley case, and so on.
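As a concrete illustration only, a minimal sketch of this segmentation step is given below; the jieba library and the sample sentence are assumptions made for illustration, since the patent does not prescribe a specific segmenter.

```python
# Minimal word-segmentation sketch. jieba is only one possible off-the-shelf
# segmenter; the patent does not prescribe any particular library or algorithm,
# and the sample sentence below is a hypothetical spam message.
import jieba

sms_text = "尊敬的客户，5月办理XX信用卡即可免费兑换2080元皇冠拉杆箱"
sms_words = jieba.lcut(sms_text)   # dictionary- and statistics-based segmentation
print(sms_words)                   # e.g. ['尊敬', '的', '客户', '，', '5月', ...]
```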
2) And performing word embedding on the short message words to obtain word vectors of the short message words.
In one example, the step of performing word embedding on the short message word to obtain a word vector of the short message word may include the following sub-steps: 2.1) obtaining a first word vector of the short message word, and acquiring character vectors of the characters in the short message word; 2.2) determining the word vector of the short message word according to the first word vector and the character vectors.
1) Acquiring a first word vector of the short message word; and acquiring a character vector of the character in the short message word.
The first word vector includes, but is not limited to, the word vector derived by Skip-Gram.
In specific implementation, the method may first compute, offline or online, the embedding (word vector) of every word appearing in a preset short message set using a word-based language model such as N-Gram or Skip-Gram, or using CBOW, GloVe, or similar approaches, so as to determine the first word vector of the short message word. This processing manner can effectively improve the accuracy of the word vectors. For example, short message A reads 'buying and selling invoices, add my WeChat', where 'invoice' is a common word; short message B conveys the same content but replaces 'invoice' with a rarely used variant word. Although this variant has a low word frequency, the embedding describes the contexts in which a word frequently occurs, so the embedding of the variant is similar to that of 'invoice'.
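A minimal sketch of pre-computing such embeddings with gensim's Word2Vec in Skip-Gram mode follows; gensim, the toy corpus, and every hyperparameter are illustrative assumptions rather than requirements of the patent.

```python
# Offline Skip-Gram embedding sketch; gensim and all hyperparameters here are
# assumptions for illustration, not requirements of the patent.
from gensim.models import Word2Vec

# segmented_sms: one token list per short message in the preset short message set
segmented_sms = [
    ["buy", "sell", "invoice", "add", "my", "wechat"],
    ["dear", "customer", "apply", "credit", "card", "in", "may"],
]

w2v = Word2Vec(
    sentences=segmented_sms,
    vector_size=128,   # embedding dimension (assumed)
    sg=1,              # 1 = Skip-Gram; 0 = CBOW
    window=5,
    min_count=1,
)
first_word_vector = w2v.wv["invoice"]   # "first word vector" of the token "invoice"
```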
2) And determining the word vector of the short message word according to the first word vector and the character vectors.
In this embodiment, the short message text includes the short message word "abc", where the first word vector of abc is [1,2,3,4], the character vector of a is [1,1,1,1], the character vector of b is [2,2,2,2], and the character vector of c is [3,3,3,3]. The final word vector of abc is [(1+(1+2+3)/3)/2, (2+(1+2+3)/3)/2, (3+(1+2+3)/3)/2, (4+(1+2+3)/3)/2] = [1.5, 2, 2.5, 3], i.e., the element-wise average of the first word vector and the mean of the character vectors. By adopting this processing manner, the accuracy of the word vector can be further improved.
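The combination described above amounts to averaging the word-level vector with the mean of the character vectors; a small sketch of that arithmetic, reusing the values from the example, is shown below.

```python
import numpy as np

first_word_vector = np.array([1, 2, 3, 4], dtype=float)   # word-level vector of "abc"
char_vectors = np.array([
    [1, 1, 1, 1],   # character vector of a
    [2, 2, 2, 2],   # character vector of b
    [3, 3, 3, 3],   # character vector of c
], dtype=float)

char_mean = char_vectors.mean(axis=0)                      # [2, 2, 2, 2]
final_word_vector = (first_word_vector + char_mean) / 2    # [1.5, 2, 2.5, 3]
```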
After the word vectors included in the short message text are obtained, the next step can be entered: the features of the short message text are extracted, according to the word vectors included in the short message text, through the short message feature extraction sub-network included in the short message category prediction model.
Step S105: and taking the word vectors included in the short message text as input data of a short message type prediction model, extracting the characteristics of the short message text according to the word vectors included in the short message text through a short message characteristic extraction sub-network included in the short message type prediction model.
The short message category prediction model is a deep multi-label learning model. The short message type prediction model comprises a short message characteristic extraction sub-network and a multi-type prediction sub-network. The short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; and the multi-class prediction sub-network is used for acquiring the multi-class prediction value of the short message text according to the characteristics.
The short message feature extraction sub-network can adopt various deep neural network structures including but not limited to: convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and so on.
Please refer to fig. 2, which is a schematic diagram of a short message classification prediction model according to an embodiment of a short message classification method provided in the present application. In one example, the short message feature extraction sub-network adopts a Bi-directional long-short term memory network structure Bi-LSTM; correspondingly, the word vector included in the short message text is used as input data of a short message category prediction model, and the following processing modes can be adopted: taking a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM; accordingly, the outputs of the hidden layers of the two LSTMs are connected as inputs to the multi-class prediction subnetwork.
The forward sequence of the word vectors included in the short message text refers to a short message word sequence of short message words arranged from left to right in the short message text. The reverse sequence of the word vector included in the short message text refers to a short message word sequence of short message words arranged from right to left in the short message text.
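A minimal PyTorch sketch of such a Bi-LSTM feature extraction sub-network is shown below; PyTorch, the dimensions, and the use of the final hidden states are assumptions, since the patent only specifies the bidirectional structure and the connection of the hidden-layer outputs to the multi-class prediction sub-network.

```python
import torch
import torch.nn as nn

class SmsFeatureExtractor(nn.Module):
    """Bi-LSTM sub-network: reads the word-vector sequence in both directions and
    concatenates the final hidden states of the two directions as the SMS feature."""
    def __init__(self, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, word_vectors):                 # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.bilstm(word_vectors)      # h_n: (2, batch, hidden_dim)
        return torch.cat([h_n[0], h_n[1]], dim=-1)   # (batch, 2 * hidden_dim)
```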
According to the method provided by the embodiment of the application, the Bi-LSTM-based short message feature extraction sub-network can model long-distance dependencies between words, and can model them from both directions; since these long-distance dependencies often determine the category of the short message text, the accuracy of category prediction can be effectively improved.
After the characteristics of the short message text are extracted, the next step can be entered, and the types of the short message text are predicted according to the characteristics through the multi-type prediction sub-network.
Step S107: and acquiring the multi-class predicted value of the short message text according to the characteristics through the multi-class prediction subnetwork.
The multi-class prediction sub-network comprises a fully connected layer and an output layer. The fully connected layer integrates the various features of the short message text to calculate the probability of each category, and every category whose probability is greater than a probability threshold (for example, 0.5) is taken as a category of the short message text.
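For illustration, such a multi-class prediction sub-network can be sketched as a fully connected layer with independent sigmoid outputs and a 0.5 probability threshold; the framework and layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

class MultiCategoryHead(nn.Module):
    """Fully connected layer plus sigmoid outputs: one independent probability per category."""
    def __init__(self, feature_dim=512, num_categories=1000):
        super().__init__()
        self.fc = nn.Linear(feature_dim, num_categories)

    def forward(self, features):                  # (batch, feature_dim)
        return torch.sigmoid(self.fc(features))   # (batch, num_categories)

# Every category whose probability exceeds the threshold is assigned to the message.
head = MultiCategoryHead()
probs = head(torch.randn(1, 512))
predicted_categories = (probs[0] > 0.5).nonzero(as_tuple=True)[0].tolist()
```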
The short message categories may include industry categories, content categories, service categories, and the like. For example: industry categories may be finance, real estate, education, medical care, entertainment, and the like; content categories may be overseas study and immigration, logistics and package pickup, marriage and dating, and the like; service categories may be verification codes, notifications, marketing, and the like.
Please refer to fig. 3, which is a flowchart illustrating an embodiment of a short message classification method according to the present application. In one example, the method further comprises the steps of:
step S301: and acquiring signature information corresponding to the short message text.
The signature information can be used for distinguishing a short message sending party, namely a short message attribution party. Multiple short message senders can send short messages to consumer users by means of the same network platform. The network platform can identify different merchants according to the short message sending party identification, and the short message sending party identification is used as the signature information of the short message. When the sender sets the sending content, the signature and the short message content need to be set.
It should be noted that the signature information includes, but is not limited to, information such as a sender identifier of the short message, and may also be other information that can identify the short message.
For example, the short message text reads 'Dear customer, apply for an XX credit card in May and redeem a 2080-yuan Crown trolley case for free …'; the sender of the short message is 'XX Credit', so the signature of the short message can be set to 'XX Credit'.
Step S303: and performing word embedding on the signature information to obtain a word vector of the signature information.
In one example, step S303 may include the following sub-steps: 1) acquiring word vectors of the words in the signature information; 2) determining the word vector of the signature information according to these word vectors.
In the present embodiment, the word vector of the signature information is the element-wise average of the word vectors of its constituent words. For example, if the signature information is "abc", the word vector of a is [1,1,1,1], the word vector of b is [2,2,2,2], and the word vector of c is [3,3,3,3], then the word vector of abc is [(1+2+3)/3, (1+2+3)/3, (1+2+3)/3, (1+2+3)/3] = [2,2,2,2].
Please refer to fig. 4, which is a detailed diagram of another short message classification prediction model according to an embodiment of the short message classification method provided in the present application. In the case shown in fig. 3, the obtaining of the multi-class prediction value of the short message text according to the feature through the multi-class prediction subnetwork may be implemented as follows: and acquiring the multi-class predicted value according to the feature and the word vector of the signature information through the multi-class prediction subnetwork.
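A sketch of this variant, in which the signature word vector is concatenated with the extracted short message feature before the fully connected layer, is given below; the fusion by concatenation and all dimensions are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class MultiCategoryHeadWithSignature(nn.Module):
    """Prediction head that also consumes the signature embedding
    (e.g. the mean of the signature's word vectors)."""
    def __init__(self, feature_dim=512, sig_dim=128, num_categories=1000):
        super().__init__()
        self.fc = nn.Linear(feature_dim + sig_dim, num_categories)

    def forward(self, sms_features, signature_vector):
        fused = torch.cat([sms_features, signature_vector], dim=-1)   # SMS feature + signature
        return torch.sigmoid(self.fc(fused))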
According to the method provided by the embodiment of the application, the word vector of the signature information is obtained by obtaining the signature information corresponding to the short message text and performing word embedding on the signature information, and the multi-class predicted value is obtained according to the feature and the word vector of the signature information through the multi-class prediction subnetwork; the processing mode also introduces signature text information on the basis of predicting the short message type according to the short message text characteristics, and performs auxiliary judgment on the short message type according to the signature characteristics; therefore, the accuracy of the category prediction can be effectively improved.
It should be noted that, to implement the method provided by the embodiment of the present application, a short message category prediction model is first constructed, and the short message category prediction model may be learned from training data.
Please refer to fig. 5, which is a detailed flowchart of an embodiment of the short message classification method provided in the present application, showing how the short message category prediction model is obtained. In this embodiment, the method further includes the steps of:
step S501: and acquiring a short message text set for training, which comprises category marking information.
The short message text set for training comprises a plurality of short message texts and their corresponding category labeling information.
Step S503: and learning the short message type prediction model from the short message text set for training.
After the short message text set for training is obtained, the short message category prediction model can be obtained by learning from the short message text set for training through a deep learning algorithm. Since the deep learning algorithm belongs to the mature prior art, it is not described herein again.
In one example, the method for constructing the short message category prediction model further comprises the following steps: acquiring signature information corresponding to the short message text for training; accordingly, step S503 may adopt the following manner: and learning to obtain the short message type prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
When the short message type prediction model is trained, the loss function of the multi-type prediction sub-network can adopt a binary cross entropy function or a common cross entropy function.
According to the scheme provided by the embodiment of the application, a binary cross-entropy loss function is adopted when training the deep multi-label learning model. On the one hand, this alleviates, to a certain extent, the sparsity of short message samples in some classes when a large number of classes are involved; on the other hand, samples whose attributes overlap among classes are taken into account, so no large amount of noise is introduced during model training, and the problem that one sample appearing in multiple classes becomes indistinguishable is avoided. Therefore, the accuracy of class prediction can be effectively improved.
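A minimal sketch of this multi-label binary cross-entropy loss in PyTorch (an assumed framework choice) is shown below; each category output is treated as an independent binary decision against a multi-hot label vector.

```python
import torch
import torch.nn as nn

num_categories = 1000
logits = torch.randn(4, num_categories)    # raw (pre-sigmoid) outputs for 4 messages
labels = torch.zeros(4, num_categories)    # multi-hot category labels
labels[0, [3, 17]] = 1.0                   # message 0 carries categories 3 and 17

# Binary cross entropy applied independently to every category output;
# BCEWithLogitsLoss folds the sigmoid into the loss for numerical stability.
loss = nn.BCEWithLogitsLoss()(logits, labels)
```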
As can be seen from the foregoing embodiments, the short message classification method provided in the embodiments of the present application obtains a short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network based on a depth model, extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; acquiring a multi-class predicted value of the short message text according to the characteristics through the multi-class prediction sub-network; the processing mode is combined with a deep multi-label learning model to improve the expressive power of the features; therefore, the accuracy of short message classification can be effectively improved. In addition, the processing mode only needs to train one deep multi-label learning model to learn the relationship between the short message text and thousands of short message categories, and the model does not need to be trained independently aiming at all the categories; therefore, the number of models can be effectively reduced.
In the above embodiment, a short message classification method is provided, and correspondingly, a short message classification device is also provided. The apparatus corresponds to an embodiment of the method described above.
Second embodiment
Please refer to fig. 6, which is a schematic diagram of an embodiment of a short message classification apparatus according to the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The present application additionally provides a short message classification device, comprising:
a short message text acquisition unit 601, configured to acquire a short message text to be processed;
a first word embedding unit 602, configured to perform word embedding on the short message text to obtain a word vector included in the short message text;
a feature extraction unit 603, configured to use word vectors included in the short message text as input data of a short message category prediction model, where the short message category prediction model includes a short message feature extraction sub-network and a multi-category prediction sub-network, and extract features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network;
a multi-class prediction unit 604, configured to obtain, through the multi-class prediction subnetwork, a multi-class prediction value of the short message text according to the feature.
Optionally, the network structure of the short message feature extraction sub-network comprises a Bi-directional long-short term memory network structure Bi-LSTM;
the feature extraction unit 603 is specifically configured to use a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM.
Please refer to fig. 7, which is a detailed schematic diagram of an embodiment of a short message classification apparatus according to the present application. Optionally, the method further includes:
a signature information obtaining unit 701, configured to obtain signature information corresponding to the short message text;
a second word embedding unit 702, configured to perform word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction unit 604 is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information through the multi-class prediction subnetwork.
Optionally, the second word embedding unit 702 includes:
a word vector acquiring subunit, configured to acquire word vectors of the words in the signature information;
and a word vector determining subunit, configured to determine the word vector of the signature information according to these word vectors.
Optionally, the first word embedding unit 602 includes:
a first word vector acquiring subunit, configured to acquire a first word vector of a word included in the short message text;
a character vector acquiring subunit, configured to acquire character vectors of the characters in the short message text;
and a word vector determining subunit, configured to determine the word vectors included in the short message text according to the first word vector and the character vectors.
Please refer to fig. 8, which is a detailed schematic diagram of an embodiment of a short message classification apparatus according to the present application. Optionally, the method further includes:
a training sample acquisition unit 801, configured to acquire a training short message text set including category label information;
a model training unit 802, configured to learn the short message category prediction model from the short message text set for training.
Optionally, the method further includes:
acquiring signature information corresponding to the short message text for training;
the learning of the short message type prediction model from the short message text set for training comprises the following steps:
and learning to obtain the short message type prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
Third embodiment
Please refer to fig. 9, which is a schematic diagram of an embodiment of an electronic device according to the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor 901 and a memory 902; the memory is used for storing a program for realizing the short message classification method, and after the device is powered on and the program for realizing the short message classification method is run by the processor, the following steps are executed: acquiring a short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-category prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; and acquiring the multi-class predicted value of the short message text according to the characteristics through the multi-class prediction subnetwork.
In the above embodiment, a short message classification method is provided, and correspondingly, the application also provides a short message classification prediction model construction method. The method corresponds to the embodiment of the method described above.
Fourth embodiment
Please refer to fig. 10, which is a flowchart illustrating an embodiment of a method for constructing a short message category prediction model according to the present application, wherein an executing entity of the method includes a short message category prediction model construction apparatus. Since parts of this method embodiment have already been described in the method embodiment of the first embodiment, the description is relatively simple, and for relevant points reference may be made to the corresponding parts of that method embodiment. The method embodiments described below are merely illustrative.
The short message category prediction model construction method provided by the application comprises the following steps:
step S1001: and acquiring a short message text set for training, which comprises category marking information.
Step S1003: and constructing a deep neural network according to a plurality of categories to be predicted.
The categories to be predicted refer to the short message categories that the short message category prediction model can predict. They may include industry categories, content categories, service categories, and the like. For example: industry categories may be finance, real estate, education, medical care, entertainment, and the like; content categories may be overseas study and immigration, logistics and package pickup, marriage and dating, and the like; service categories may be verification codes, notifications, marketing, and the like.
The deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; and the multi-class prediction sub-network is used for acquiring the multi-class prediction value of the short message text according to the characteristics.
Step S1005: and performing word embedding on the short message text to obtain a word vector included by the short message text.
Step S1007: and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
The loss function of the multi-class prediction sub-network comprises a binary cross-entropy function.
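A compact sketch of such a training loop is given below; PyTorch, the Adam optimizer, the batching helper, and the assumption that the model returns raw logits are all illustrative choices, not part of the patent.

```python
import torch
import torch.nn as nn

def train_sms_category_model(model, embedded_batches, epochs=5, lr=1e-3):
    """embedded_batches: iterable of (word_vector_sequences, multi_hot_labels) tensors
    built from the training short message text set after word embedding (assumed helper)."""
    criterion = nn.BCEWithLogitsLoss()   # binary cross entropy loss of the prediction sub-network
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for word_vectors, labels in embedded_batches:
            optimizer.zero_grad()
            logits = model(word_vectors)   # feature extraction + multi-class prediction sub-networks
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()
    return model
```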
Please refer to fig. 11, which is a flowchart illustrating an embodiment of a method for constructing a short message classification prediction model according to the present application. In this embodiment, the method further includes the steps of:
step S1101: and acquiring signature information corresponding to the short message text.
Step S1103: and performing word embedding on the signature information to obtain a word vector of the signature information.
In this case, the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
As can be seen from the above embodiments, the short message category prediction model construction method provided in the embodiments of the present application obtains a short message text set for training including category labeling information; constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the characteristics; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model; the processing mode is combined with a deep multi-label learning model to improve the expressive power of the features; therefore, the prediction accuracy of the short message type prediction model can be effectively improved. Meanwhile, the relation between the short message text and thousands of short message categories can be learned only by training one deep multi-label learning model, and the model does not need to be trained independently aiming at all categories; therefore, the number of models can be effectively reduced.
In the above embodiment, a short message category prediction model construction method is provided, and correspondingly, the present application also provides a short message category prediction model construction device. The apparatus corresponds to an embodiment of the method described above.
Fifth embodiment
Please refer to fig. 12, which is a schematic diagram of an embodiment of a short message classification prediction model construction apparatus according to the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The present application further provides a short message category prediction model construction device, including:
a training sample obtaining unit 1201, configured to obtain a training short message text set including category labeling information;
a deep neural network construction unit 1202, configured to construct a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the features;
a first word embedding unit 1203, configured to perform word embedding on the short message text to obtain a word vector included in the short message text;
a model training unit 1204, configured to use word vectors included in the short message text as input data of the deep neural network, use the category labeling information as output data of the deep neural network, and train the deep neural network according to the short message text set for training, so as to obtain a short message category prediction model.
Please refer to fig. 13, which is a detailed schematic diagram of an embodiment of a short message classification prediction model construction apparatus according to the present application. In this embodiment, the apparatus further includes:
a signature information obtaining unit 1301, configured to obtain signature information corresponding to the short message text;
a second word embedding unit 1302, configured to perform word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
Sixth embodiment
Please refer to fig. 14, which is a diagram illustrating an embodiment of an electronic device according to the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a processor 1401 and a memory 1402; the memory is used for storing a program for realizing the short message category prediction model construction method, and after the device is powered on and runs the program for realizing the short message category prediction model construction method through the processor, the following steps are executed: acquiring a training short message text set comprising category marking information; constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the features; performing word embedding on the short message text to obtain a word vector included in the short message text; and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
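The training procedure executed by such a device can be sketched as follows, reusing the SmsCategoryModel class from the earlier sketch; the random tensors merely stand in for a real training short message text set with multi-hot category labels, and the optimizer, learning rate and batch shape are assumptions for the example, not part of this specification.

import torch

# Assumes the SmsCategoryModel sketch shown earlier in this document.
model = SmsCategoryModel(vocab_size=50000, num_categories=1000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.BCEWithLogitsLoss()

word_ids = torch.randint(1, 50000, (32, 40))      # a batch of 32 messages, 40 word ids each
labels = torch.randint(0, 2, (32, 1000)).float()  # multi-hot category labeling information

for _ in range(3):                                # a few illustrative gradient steps
    optimizer.zero_grad()
    loss = loss_fn(model(word_ids), labels)       # word vectors in, category labels as targets
    loss.backward()
    optimizer.step()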
Although the present application has been described with reference to the preferred embodiments, these embodiments are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of protection of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory media (transient media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (22)

1. A short message classification method is characterized by comprising the following steps:
acquiring a short message text to be processed;
performing word embedding on the short message text to obtain a word vector included in the short message text;
taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-class prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network;
and acquiring the multi-class prediction value of the short message text according to the features through the multi-class prediction subnetwork.
2. The method of claim 1,
the network structure of the short message feature extraction sub-network comprises a bidirectional long short-term memory network structure Bi-LSTM;
the using the word vector included in the short message text as the input data of the short message category prediction model comprises the following steps:
taking a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM.
3. The method of claim 1, further comprising:
acquiring signature information corresponding to the short message text;
performing word embedding on the signature information to obtain a word vector of the signature information;
the acquiring of the multi-class prediction value of the short message text according to the features through the multi-class prediction sub-network comprises:
and acquiring the multi-class prediction value according to the feature and the word vector of the signature information through the multi-class prediction subnetwork.
4. The method of claim 3, wherein performing word embedding on the signature information to obtain a word vector of the signature information comprises:
acquiring a word vector of a word in the signature information;
and determining a word vector of the signature information according to the word vector.
5. The method of claim 1, wherein the performing word embedding on the short message text to obtain a word vector included in the short message text comprises:
acquiring a first word vector included in the short message text; and acquiring a word vector of the word in the short message text;
and determining word vectors included in the short message text according to the first word vector and the word vectors.
6. The method of claim 1, further comprising:
acquiring a training short message text set comprising category marking information;
and learning the short message category prediction model from the short message text set for training.
7. The method of claim 6, further comprising:
acquiring signature information corresponding to the short message text for training;
the learning of the short message category prediction model from the short message text set for training comprises the following steps:
and learning to obtain the short message category prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
8. The method of claim 6, wherein the loss function of the multi-class prediction sub-network comprises a binary cross-entropy function.
9. A short message classification device is characterized by comprising:
the short message text acquisition unit is used for acquiring a short message text to be processed;
the first word embedding unit is used for executing word embedding on the short message text to obtain a word vector included by the short message text;
the feature extraction unit is used for taking word vectors contained in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-class prediction sub-network, and features of the short message text are extracted according to the word vectors contained in the short message text through the short message feature extraction sub-network;
and the multi-class prediction unit is used for acquiring a multi-class prediction value of the short message text according to the features through the multi-class prediction sub-network.
10. The apparatus of claim 9,
the network structure of the short message feature extraction sub-network comprises a bidirectional long short-term memory network structure Bi-LSTM;
the feature extraction unit is specifically configured to use a forward sequence of word vectors included in the short message text as input data of a first LSTM; and taking the reverse sequence of the word vector included in the short message text as input data of a second LSTM.
11. The apparatus of claim 9, further comprising:
the signature information acquisition unit is used for acquiring signature information corresponding to the short message text;
the second word embedding unit is used for executing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction unit is specifically configured to obtain, through the multi-class prediction subnetwork, the multi-class prediction value according to the feature and the word vector of the signature information.
12. The apparatus of claim 11, wherein the second word embedding unit comprises:
a word vector obtaining subunit, configured to obtain a word vector of a word in the signature information;
and the word vector determining subunit is used for determining the word vector of the signature information according to the word vector.
13. The apparatus of claim 9, wherein the first word embedding unit comprises:
the first word vector acquiring subunit is used for acquiring a first word vector included in the short message text;
the word vector acquiring subunit is used for acquiring the word vector of the word in the short message text;
and the word vector determining subunit is used for determining the word vectors included in the short message text according to the first word vector and the word vectors.
14. The apparatus of claim 9, further comprising:
the training sample acquisition unit is used for acquiring a training short message text set comprising category marking information;
and the model training unit is used for learning the short message category prediction model from the short message text set for training.
15. The apparatus of claim 14, further comprising:
a signature information acquisition unit, configured to acquire signature information corresponding to the short message text for training;
wherein the model training unit is specifically configured to learn the short message category prediction model according to the short message text set for training and the signature information corresponding to the short message text for training.
16. An electronic device, comprising:
a processor; and
the memory is used for storing a program for realizing the short message classification method, and after the device is powered on and the program for realizing the short message classification method is run by the processor, the following steps are executed: acquiring a short message text to be processed; performing word embedding on the short message text to obtain a word vector included in the short message text; taking word vectors included in the short message text as input data of a short message category prediction model, wherein the short message category prediction model comprises a short message feature extraction sub-network and a multi-class prediction sub-network, and extracting the features of the short message text according to the word vectors included in the short message text through the short message feature extraction sub-network; and acquiring the multi-class prediction value of the short message text according to the features through the multi-class prediction subnetwork.
17. A short message category prediction model construction method is characterized by comprising the following steps:
acquiring a training short message text set comprising category marking information;
constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the features;
performing word embedding on the short message text to obtain a word vector included in the short message text;
and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
18. The method of claim 17, further comprising:
acquiring signature information corresponding to the short message text;
performing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
19. The method of claim 17, wherein the loss function of the multi-class prediction sub-network comprises a binary cross-entropy function.
20. A short message category prediction model construction device, characterized by comprising:
the training sample acquisition unit is used for acquiring a training short message text set comprising category marking information;
the deep neural network construction unit is used for constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the features;
the first word embedding unit is used for executing word embedding on the short message text to obtain a word vector included by the short message text;
and the model training unit is used for taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
21. The apparatus of claim 20, further comprising:
the signature information acquisition unit is used for acquiring signature information corresponding to the short message text;
the second word embedding unit is used for executing word embedding on the signature information to obtain a word vector of the signature information;
the multi-class prediction subnetwork is specifically configured to obtain the multi-class prediction value according to the feature and the word vector of the signature information.
22. An electronic device, comprising:
a processor; and
the memory is used for storing a program for realizing the short message category prediction model construction method, and after the device is powered on and runs the program for realizing the short message category prediction model construction method through the processor, the following steps are executed: acquiring a training short message text set comprising category marking information; constructing a deep neural network according to a plurality of categories to be predicted; the deep neural network comprises a short message feature extraction sub-network and a multi-class prediction sub-network based on a deep model; the short message feature extraction sub-network is used for extracting the features of the short message text according to the word vectors included in the short message text; the multi-class prediction sub-network is used for acquiring multi-class prediction values of the short message text according to the features; performing word embedding on the short message text to obtain a word vector included in the short message text; and taking word vectors included in the short message text as input data of the deep neural network, taking the category marking information as output data of the deep neural network, and training the deep neural network according to the short message text set for training to obtain a short message category prediction model.
CN201811084292.6A 2018-09-17 2018-09-17 Short message classification method and device and electronic equipment Pending CN110913354A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811084292.6A CN110913354A (en) 2018-09-17 2018-09-17 Short message classification method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN110913354A (en) 2020-03-24

Family

ID=69812792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811084292.6A Pending CN110913354A (en) 2018-09-17 2018-09-17 Short message classification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110913354A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105516499A (en) * 2015-12-14 2016-04-20 北京奇虎科技有限公司 Method and device for classifying short messages, communication terminal and server
CN107734131A (en) * 2016-08-11 2018-02-23 中兴通讯股份有限公司 A kind of short message sorting technique and device
CN107256245A (en) * 2017-06-02 2017-10-17 河海大学 Improved and system of selection towards the off-line model that refuse messages are classified
CN107526785A (en) * 2017-07-31 2017-12-29 广州市香港科大霍英东研究院 File classification method and device
CN108566627A (en) * 2017-11-27 2018-09-21 浙江鹏信信息科技股份有限公司 A kind of method and system identifying fraud text message using deep learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210406049A1 (en) * 2020-06-30 2021-12-30 Microsoft Technology Licensing, Llc Facilitating message composition based on absent context
CN112163088A (en) * 2020-09-02 2021-01-01 中国人民解放军战略支援部队信息工程大学 Method, system and equipment for mining short message user information of telecommunication network based on DenseNet
CN114390137A (en) * 2020-10-20 2022-04-22 阿里巴巴集团控股有限公司 Short message processing method, device, equipment and storage medium
CN115580841A (en) * 2022-12-05 2023-01-06 安徽创瑞信息技术有限公司 Method for reducing short message sending delay
CN115687944A (en) * 2022-12-27 2023-02-03 荣耀终端有限公司 Short message acquisition method and related equipment
CN115687944B (en) * 2022-12-27 2023-09-15 荣耀终端有限公司 Short message acquisition method and related equipment

Similar Documents

Publication Publication Date Title
US11494648B2 (en) Method and system for detecting fake news based on multi-task learning model
CN110913354A (en) Short message classification method and device and electronic equipment
CN110377759B (en) Method and device for constructing event relation graph
CN110909540B (en) Method and device for identifying new words of short message spam and electronic equipment
JP6335898B2 (en) Information classification based on product recognition
US20180374141A1 (en) Information pushing method and system
US20180293294A1 (en) Similar Term Aggregation Method and Apparatus
CN106776897B (en) User portrait label determination method and device
CN110362689A (en) A kind of methods of risk assessment, device, storage medium and server
CN110727761B (en) Object information acquisition method and device and electronic equipment
CN111612284B (en) Data processing method, device and equipment
CN112287100A (en) Text recognition method, spelling error correction method and voice recognition method
CN113011889A (en) Account abnormity identification method, system, device, equipment and medium
Khun et al. Visualization of Twitter sentiment during the period of US banned huawei
CN114637850A (en) Abnormal behavior recognition and model training method, device, equipment and storage medium
Bhoj et al. LSTM powered identification of clickbait content on entertainment and news websites
CN107203567A (en) Method and apparatus for searching for word string
CN113095723A (en) Coupon recommendation method and device
CN113779276A (en) Method and device for detecting comments
Silpa et al. Detection of Fake Online Reviews by using Machine Learning
CN113761184A (en) Text data classification method, equipment and storage medium
CN110795537B (en) Method, device, equipment and medium for determining improvement strategy of target commodity
CN115357711A (en) Aspect level emotion analysis method and device, electronic equipment and storage medium
CN114722954A (en) Content exception handling method and device for evaluation information
CN113886543A (en) Method, apparatus, medium, and program product for generating an intent recognition model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200324