CN112100390A - Scene-based text classification model, text classification method and device


Info

Publication number
CN112100390A
Authority
CN
China
Prior art keywords
scene
text
short
layer
short text
Prior art date
Legal status
Granted
Application number
CN202011291516.8A
Other languages
Chinese (zh)
Other versions
CN112100390B (en)
Inventor
李博
徐英杰
Current Assignee
Zhizhe Sihai Beijing Technology Co Ltd
Original Assignee
Zhizhe Sihai Beijing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhizhe Sihai Beijing Technology Co Ltd filed Critical Zhizhe Sihai Beijing Technology Co Ltd
Priority to CN202011291516.8A priority Critical patent/CN112100390B/en
Publication of CN112100390A publication Critical patent/CN112100390A/en
Application granted granted Critical
Publication of CN112100390B publication Critical patent/CN112100390B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a scene-based text classification model, a text classification method, and a device, belongs to the technical field of natural language processing, and can realize text classification in multiple scenes while improving the accuracy of the text classification results. The model comprises: an input layer for acquiring a short text and a scene label, the scene label indicating the scene category of the short text; an embedding layer for adding a scene code and position codes to the short text to obtain a coded representation of the short text in the scene; an encoding layer, comprising multi-head self-attention layers, for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain a feature vector of the short text; a decoding layer, comprising a multilayer perceptron, for outputting a score for each service type according to the feature vector obtained by the encoding layer; and an output layer for outputting the classification result according to the scores of all the service types.

Description

Scene-based text classification model, text classification method and device
Technical Field
The invention relates to the technical field of natural language processing, in particular to a text classification model based on scenes, a text classification method and a text classification device.
Background
At present, the business of various social platforms (such as Weibo, Zhihu, and the like) contains a large number of situations with similar scenes and essentially the same task. The tasks are similar but not completely consistent across scenes, and may differ in data source, object category, and discrimination criteria. Multi-scene similar tasks have two main characteristics. 1. There are many scenes: for example, the scenes may include questions, answers, comments, articles, bullet comments, ideas, and so on. 2. The tasks in the various scenes are similar but not completely consistent: for example, for an unfriendliness-detection service, although unfriendly content must be identified in every scene, the definition of "unfriendly" differs slightly between scenes, the target labels differ, and the data distribution varies greatly from scene to scene.
In the prior art, a model is trained, deployed, and optimized separately for each scene. This is cumbersome to develop and maintain, ignores the commonality of the tasks across scenes, and, when the number of samples in a single scene is severely insufficient, makes it difficult to train an effective model, so that text classification based on such a model is inaccurate.
Disclosure of Invention
In view of this, embodiments of the present invention provide a scene-based text classification model, a text classification method, and a device, which can implement text classification in multiple scenes and improve accuracy of text classification results.
In a first aspect of the embodiments of the present invention, a scene-based text classification model is provided, where the text classification model is implemented by a computer, and the model comprises: an input layer for acquiring a short text and a scene label, the scene label indicating the scene category of the short text; an embedding layer for adding a scene code and position codes to the short text to obtain a coded representation of the short text in the scene; an encoding layer, comprising multi-head self-attention layers, for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain a feature vector of the short text; a decoding layer, comprising a multilayer perceptron, for outputting a score for each service type according to the feature vector obtained by the encoding layer; and an output layer for outputting a classification result according to the scores of all the service types.
In one possible embodiment, the embedding layer is specifically configured to map the scene label and the short text into vectors and add position codes to obtain the coded representation of the short text in the scene.
In one possible embodiment, the scene label is located at the start position or the end position of the coded representation, and the coded representation of the short text in the scene is:

E = [s_m; x_1; x_2; …; x_n] + PE

or

E = [x_1; x_2; …; x_n; s_m] + PE

where s_m is the scene code, PE denotes the position code, and x_i represents the vector corresponding to each character.
In a second aspect of the embodiments of the present invention, a method for training a text classification model based on a scene is provided, where the method includes: acquiring a training set, wherein the training set comprises scene labels and short texts, and the scene labels are used for representing scene categories where the short texts are located; inputting the scene labels and the short texts into a text classification model for iterative training to determine parameters of the text classification model; and constructing the text classification model according to the text classification model parameters.
In one possible embodiment, the text classification model comprises at least: an embedding layer, an encoding layer, and a decoding layer, wherein: the embedded layer is used for adding scene codes and position codes to the short text to obtain coded representation of the short text under the scene; the coding layer comprises a multi-head self-attention layer and is used for extracting text features in the short text according to the coding representation obtained by the embedding layer to obtain feature vectors of the short text; and the decoding layer comprises a multilayer perceptron and is used for outputting scores of all service types according to the feature vectors of the short texts obtained by the coding layer.
In one possible embodiment, inputting the scene label and the short text into the text classification model for iterative training to determine the parameters of the text classification model comprises: inputting the scene label and the short text into the text classification model to train the model parameters; and, when the loss value and the harmonic mean value on the verification set meet set conditions, stopping the training and determining the parameters of the text classification model.
In a third aspect of the embodiments of the present invention, a text classification method is provided, where the method includes: acquiring a short text to be recognized and a scene label to which the short text belongs; inputting the short text and the scene label to a trained text classification model; and outputting the classification result of the short text to be recognized under the scene label.
In one possible embodiment, the classification result includes at least two of the following: unfriendly, somewhat unfriendly, and normal.
In a fourth aspect of the embodiments of the present invention, a scene-based text classification model training apparatus is provided, where the apparatus includes: the acquisition module is configured to acquire a training set, wherein the training set comprises scene labels and short texts, and the scene labels are used for representing scene categories where the short texts are located; a training module configured to input the scene labels and the short texts into a text classification model to perform iterative training to determine parameters of the text classification model; a construction module configured to construct the text classification model according to the text classification model parameters.
In one possible embodiment, the text classification model comprises at least: an embedding layer, an encoding layer, and a decoding layer, wherein: the embedded layer is used for adding scene codes and position codes to the short text to obtain coded representation of the short text under the scene; the coding layer comprises a multi-head self-attention layer and is used for extracting text features in the short text according to the coding representation obtained by the embedding layer to obtain feature vectors of the short text; and the decoding layer comprises a multilayer perceptron and is used for outputting scores of all service types according to the feature vectors of the short texts obtained by the coding layer.
In one possible embodiment, the training module is specifically configured to: input the scene labels and the short texts into the text classification model to train the model parameters; and, when the loss value and the harmonic mean value on the verification set meet set conditions, stop the training and determine the parameters of the text classification model.
In a fifth aspect of the embodiments of the present invention, there is provided a text classification apparatus, including: the system comprises an acquisition module, a recognition module and a recognition module, wherein the acquisition module is configured to acquire a short text to be recognized and a scene label to which the short text belongs; an input module configured to input the short text and the scene label to a trained text classification model; and the output module is configured to output the classification result of the short text to be recognized under the scene label.
In one possible embodiment, the classification result includes at least two of the following: unfriendly, somewhat unfriendly, and normal.
In a sixth aspect of the embodiments of the present invention, there is provided an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method according to the second or third aspect.
A seventh aspect of embodiments of the present invention provides a computer-readable storage medium having stored thereon executable instructions, which when executed by a processor, cause the processor to perform the method according to the second or third aspect.
The scene-based text classification model, text classification method, and device provided by the embodiments of the invention comprise: an input layer for acquiring a short text and a scene label, the scene label indicating the scene category of the short text; an embedding layer for adding a scene code and position codes to the short text to obtain a coded representation of the short text in the scene; an encoding layer, comprising multi-head self-attention layers, for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain a feature vector of the short text; a decoding layer, comprising a multilayer perceptron, for outputting a score for each service type according to the feature vector obtained by the encoding layer; and an output layer for outputting the classification result according to the scores of all the service types. A scene label representing the scene of the short text is acquired at the input layer, and a scene code and position codes are then added at the embedding layer to distinguish different scenes, so that for the same service the classification results differ by scene. In addition, because the scenes are already distinguished at the encoding layer, a multilayer perceptron can be used for shared decoding at the decoding layer, making model training simple, development and maintenance convenient, and the effect better. Furthermore, because scene labels are added to the model, the text classification model can classify text in multiple scenes, and the classification results are accurate.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The foregoing and other objects, features, and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not intended to be drawn to scale; emphasis is instead placed upon illustrating the subject matter of the present application.
Fig. 1 is a schematic structural diagram illustrating a text classification model based on a scene according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for training a scene-based text classification model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a text classification method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram illustrating a training apparatus for a scene-based text classification model according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram illustrating a text classification apparatus according to an embodiment of the present invention;
fig. 6 shows a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. It is to be understood that such description is merely illustrative and not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms "a", "an", and "the" as used herein are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, the terms "comprises", "comprising", and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Currently, a text classification model is usually trained based on the following four ways.
1. Scene-by-scene training: this scheme ignores the commonality of the data and tasks across scenes. First, in real services the labeled samples in each scene are limited, so it is harder to fit the model in each scene sufficiently; second, the task-commonality information across scenes is lost, and the data of other scenes is difficult to utilize; third, the first two points directly result in poor model performance.
2. Single-model multi-scene training: this scheme ignores the differences in data and tasks across scenes. Because the scenes are distinguished neither in the data nor in the model, the model must first guess which scene's task a piece of data belongs to and then determine the output for that task, which directly leads to poor prediction performance.
3. Fine-tuning migration across scenes: pre-trained models are the current state-of-the-art approach to text classification and enable transfer learning through fine-tuning. However, this scheme produces one model per scene, the training process is long, and development and maintenance are difficult.
4. Multi-task learning: in this scheme, different scenes share the same underlying network, and each scene has its own upper sub-network. Although the shared underlying network can learn a good representation, the upper sub-network of a scene with very little data is difficult to fit sufficiently, so the task in that scene cannot leverage the data of the other scenes for learning.
In view of the problems of the above training and learning methods, embodiments of the present invention provide a scene-based text classification model, a text classification method, and a device, which can realize text classification in multiple scenes and improve the accuracy of text classification results, while keeping model training simple, development and maintenance convenient, and the effect better.
The technical content of the embodiments of the invention is described in detail below with reference to Figs. 1-6.
Fig. 1 is a schematic structural diagram of a scene-based text classification model according to an embodiment of the present invention. The text classification model is implemented by a computer, the model comprising:
and the input layer is used for acquiring the short text and the scene label.
As an alternative embodiment, the scene label is used to indicate the scene category in which the short text is located; for example, scene labels include, but are not limited to: questions, answers, comments, articles, bullet comments, ideas, and so on. The short text may be one or more words, a sentence, or a short article.
The embedding layer is used for adding a scene code and position codes to the short text to obtain a coded representation of the short text in the scene.
As an optional implementation, the embedding layer is specifically configured to map the scene label and the short text into vectors and add position codes to obtain the coded representation of the short text in the scene. The scene label here is located at the start or end position of the coded representation.
For example, if the short text is a sentence X = (w_1, w_2, …, w_n) containing n words, the embedding layer maps the words to vectors, adds position codes, and additionally adds a scene code in order to identify the scene. The implementation mainly comprises the following three parts:
a. Scene coding: the scene code identifies the scene corresponding to the sentence; for the m-th scene, s_m is used to represent its scene code. For simplicity, a special character may be used directly as the scene code and added at the start or end position of the sentence vector X.
b. Position coding: to capture the sequence information of the sentence, a position code is added to each word. A random vector, learned by the model and denoted PE, is used as the position code.
c. Word vectors: the word vector is the vector corresponding to each character; random vectors, learned by the model and denoted x_i, are used as the word vectors.
Finally, the coded representation of sentence X under scene m is:

E = [s_m; x_1; x_2; …; x_n] + PE

or

E = [x_1; x_2; …; x_n; s_m] + PE

where s_m is the scene code, PE denotes the position code, and x_i represents the vector corresponding to each character.
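As a minimal sketch of this embedding step (all sizes and the lookup-table names are illustrative assumptions, not the patent's implementation), prepending a learned scene code to the character vectors and then adding learned position codes might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, N_SCENES, MAX_LEN, DIM = 100, 3, 16, 8  # assumed sizes

# Learned lookup tables (random here; trained jointly with the model in practice).
char_emb = rng.normal(size=(VOCAB, DIM))      # word vectors x_i
scene_emb = rng.normal(size=(N_SCENES, DIM))  # scene codes s_m
pos_emb = rng.normal(size=(MAX_LEN, DIM))     # position codes PE (learned)

def encode(char_ids, scene_id, scene_at_start=True):
    """Coded representation E = [s_m; x_1..x_n] + PE (or with s_m at the end)."""
    x = char_emb[char_ids]                    # (n, DIM)
    s = scene_emb[scene_id][None, :]          # (1, DIM)
    seq = np.concatenate([s, x] if scene_at_start else [x, s], axis=0)
    return seq + pos_emb[: len(seq)]          # add the position code per position

E = encode(np.array([5, 17, 42]), scene_id=1)
print(E.shape)  # (4, 8): 3 characters plus 1 scene code
```

The same short text produces a different E for each scene_id, which is what lets the shared encoder downstream tell the scenes apart.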
The encoding layer comprises multi-head self-attention layers and is used for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain the feature vector of the short text.
As an alternative implementation, a Transformer is used as the encoder. It consists mainly of multi-head self-attention layers, which effectively extract sentence features and yield a dynamic vector representation for each word. In each multi-head self-attention layer, scaled dot-product attention is used to model the interactions between the words in a sentence.
First, given a sequence vector H ∈ R^{N×d}, where N and d represent the length and dimension of the input vector sequence, the self-attention layer projects H to three different matrices: a query matrix, a key matrix, and a value matrix:

Q = H W_Q,  K = H W_K,  V = H W_V

Scaled dot-product attention is then used to obtain the output representation:

Attention(Q, K, V) = softmax(Q K^T / √d) V

where W_Q, W_K, and W_V are learnable parameters. To enhance the effect of the self-attention layer, multi-head attention is adopted to jointly model multiple interactions from different representation subspaces:

head_i = Attention(H W_i^Q, H W_i^K, H W_i^V)

MultiHead(H) = [head_1; head_2; …; head_h] W_O

where W_O is a trainable parameter. The encoder further stacks multiple multi-head self-attention layers with a fully connected layer; assuming the input of a self-attention layer is H, its output H′ is:

Z = LN(H + MultiHead(H))

H′ = LN(Z + FFN(Z))

where LN denotes layer normalization and FFN the fully connected (feed-forward) layer.
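Assuming standard Transformer conventions, the scaled dot-product and multi-head attention computations above can be sketched in NumPy; the random matrices stand in for the learnable parameters W_Q, W_K, W_V, and W_O:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def multi_head(H, Wq, Wk, Wv, Wo, heads):
    """MultiHead(H) = [head_1; ...; head_h] W_O with per-head projections."""
    outs = [attention(H @ Wq[i], H @ Wk[i], H @ Wv[i]) for i in range(heads)]
    return np.concatenate(outs, axis=-1) @ Wo

N, D, HEADS = 4, 8, 2          # sequence length, model dim, number of heads
Dh = D // HEADS                # per-head dimension
H = rng.normal(size=(N, D))
Wq, Wk, Wv = (rng.normal(size=(HEADS, D, Dh)) for _ in range(3))
Wo = rng.normal(size=(D, D))

out = multi_head(H, Wq, Wk, Wv, Wo, HEADS)
print(out.shape)  # (4, 8): one dynamic vector per input position
```

The residual connections, layer normalization, and feed-forward sublayer of the stacked encoder are omitted here to keep the attention step itself visible.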
In this way, the same encoder can be used for the data of multiple scenes. Moreover, as long as different scene codes are included, the encoder can extract an effective representation for each scene, namely the coded representation of the short text in that scene.
The decoding layer comprises a multilayer perceptron and is used for outputting scores for all service types according to the feature vector of the short text obtained by the encoding layer.
In a common multi-task learning framework, each task has a separate sub-network to predict its labels. The present model instead uses a single multilayer perceptron (MLP) as a shared decoder. Since the encoder can distinguish different scenes (i.e., different tasks) and the obtained representation contains the scene information, a shared decoder is used here, which on the one hand is more compact and on the other hand can capture the common features of similar tasks.
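A shared decoder of this kind — a single MLP mapping the sentence feature vector to one score per service type, reused across all scenes — could be sketched as follows (the layer sizes and the ReLU activation are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

DIM, HIDDEN, N_TYPES = 8, 16, 3  # feature dim, hidden width, number of service types

# Shared decoder parameters: the same weights serve every scene.
W1, b1 = rng.normal(size=(DIM, HIDDEN)), np.zeros(HIDDEN)
W2, b2 = rng.normal(size=(HIDDEN, N_TYPES)), np.zeros(N_TYPES)

def mlp_decoder(feature):
    """Shared MLP: sentence feature vector -> one score per service type."""
    h = np.maximum(feature @ W1 + b1, 0.0)   # ReLU hidden layer
    return h @ W2 + b2                        # unnormalized scores

scores = mlp_decoder(rng.normal(size=DIM))
print(scores.shape)  # (3,): one score per service type
```

Because the scene information is already baked into the feature vector by the encoder, no per-scene sub-network is needed here, which is the compactness argument made above.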
The output layer is used for outputting the classification result according to the scores of all the service types.
Referring to Fig. 1, the same piece of text X generates different prediction results in different scenes (represented by differently shaded boxes), and because a shared encoder and decoder are used, the model easily learns the similar features of the task across the three scenes. An input scene label drawn with a given shading pattern produces an output drawn with the same shading pattern, and that output is the classification result under that scene label.
As an alternative implementation, the classification result includes at least two of the following: unfriendly, somewhat unfriendly, and normal.
For example, if the scene label input at the input layer is a question, the classification result obtained at the output layer for the question scene is: unfriendly or normal; if the scene label input at the input layer is a comment, the classification result obtained at the output layer for the comment scene is: unfriendly, somewhat unfriendly, or normal.
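The scene-dependent output described above can be illustrated with a hypothetical mapping from each scene to its valid label set; the label names and the masking strategy are assumptions for illustration, not the patent's output layer:

```python
# Hypothetical per-scene label sets (label names taken from the example above).
SCENE_LABELS = {
    "question": ["unfriendly", "normal"],
    "comment": ["unfriendly", "somewhat unfriendly", "normal"],
}
ALL_LABELS = ["unfriendly", "somewhat unfriendly", "normal"]

def classify(scores, scene):
    """Pick the highest-scoring service type among the scene's valid labels."""
    valid_idx = [ALL_LABELS.index(label) for label in SCENE_LABELS[scene]]
    return ALL_LABELS[max(valid_idx, key=lambda i: scores[i])]

scores = [0.1, 0.7, 0.2]             # scores from the decoding layer
print(classify(scores, "question"))  # -> normal ("somewhat unfriendly" not valid here)
print(classify(scores, "comment"))   # -> somewhat unfriendly
```

The same score vector yields a different result per scene, mirroring how Fig. 1's shared model produces scene-specific predictions.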
The embodiment of the invention uses a single model (a single encoder and a single decoder) to handle the learning of similar tasks across multiple scenes, and fully exploits the characteristic that the several similar tasks are neither completely consistent nor completely unrelated. A single model is simple to train, convenient to develop and maintain, and more effective.
Compared with scene-by-scene training and single-model multi-scene training, this method fully exploits the characteristics of similar tasks: it neither cuts off the connection between different tasks (as scene-by-scene training does) nor ignores the differences between them (as single-model multi-scene training does). Compared with scene-by-scene training, the training is simpler, development and maintenance are more convenient, and the model fits more sufficiently; compared with single-model multi-scene training, scene information is additionally acquired, and the effect is better.
Compared with a fine-tuning transfer method for multiple scenes, this scheme uses a single model to solve multiple tasks under multiple scenes directly, so training is simpler and development and maintenance are more convenient. Moreover, a fine-tuned model can obtain information from the data of at most two scenes, whereas this model directly obtains information from the data of all scenes.
Compared with a multi-task learning method, this scheme imposes no hard isolation between the scenes during training, so a scene with insufficient data can also learn from the data of other scenes; the model mainly relies on the fitting capability of the transformer to isolate the scenes softly. In addition, the model has a single input and a single output, is smaller, and is simple to train and use for prediction.
As shown in fig. 2, a flowchart of a scene-based text classification model training method according to an embodiment of the present invention is provided. The method comprises the following steps: 201. acquiring a training set, wherein the training set comprises scene labels and short texts, and the scene labels are used for representing the scene categories of the short texts; 202. inputting the scene labels and the short texts into a text classification model for iterative training to determine the parameters of the text classification model; 203. constructing the text classification model according to the text classification model parameters.
As an alternative implementation, step 202 includes the following steps: inputting the scene labels and the short texts into the text classification model to train the model parameters; and when the loss value and the harmonic mean value on the verification set meet set conditions, stopping training and determining the parameters of the text classification model.
In the training process, the training data are divided into a training set, a verification set and a test set, and reasonable values are first set for parameters such as the number of data iterations (epochs), the number of samples per training step (batch_size) and the early-stopping criterion. The training set, constructed from data of multiple scenes, is then fed in to train the parameters of the model. The verification set is used to monitor the state and convergence of the model during training: the loss value (loss) and the harmonic mean of precision and recall (f1) on the verification set determine which set of model parameters works best, and the early-stopping strategy decides when to stop training to prevent overfitting. Finally, the classification performance of the model is judged by its results on the test set, which evaluates the generalization ability of the model; that is, since the verification set was used to choose the hyper-parameters, the test set ultimately judges whether the model works.
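The selection-by-validation loop described above can be sketched as follows; `train_one_epoch` and `evaluate` are hypothetical stand-ins for real training code, and the patience value is an illustrative choice:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def fit(train_one_epoch, evaluate, max_epochs=50, patience=3):
    """Keep the parameters with the best validation f1; stop after
    `patience` epochs without improvement (early stopping)."""
    best_f1, best_state, stale = -1.0, None, 0
    for epoch in range(max_epochs):
        state = train_one_epoch(epoch)       # returns candidate model parameters
        _val_loss, val_f1 = evaluate(state)  # loss and f1 on the verification set
        if val_f1 > best_f1:
            best_f1, best_state, stale = val_f1, state, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_state, best_f1
```

After `fit` returns, the retained parameters would be evaluated once on the held-out test set, which plays no role in the selection itself.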
As an optional implementation, the text classification model constructed in step 203 includes at least an embedding layer, an encoding layer, and a decoding layer, wherein: the embedding layer is used for adding a scene code and a position code to the short text to obtain the coded representation of the short text under the scene; the encoding layer comprises a multi-head self-attention layer and is used for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain the feature vector of the short text; and the decoding layer comprises a multilayer perceptron and is used for outputting a score for each service type according to the feature vector of the short text obtained by the encoding layer. For details of the text classification model, refer to the corresponding description of fig. 1 above.
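The embedding → self-attention → perceptron pipeline can be illustrated with a toy numpy sketch. The single attention head, identity query/key/value projections, pre-computed embedding vectors, and mean pooling are all simplifying assumptions of ours, not the patent's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def embed(token_vecs, scene_vec, pos_vecs):
    """Embedding layer: character vectors + scene code + position code."""
    return token_vecs + pos_vecs + scene_vec  # scene code broadcast over positions

def self_attention(x):
    """Toy single-head self-attention (identity Q/K/V projections)."""
    scores = softmax(x @ x.T / np.sqrt(x.shape[1]))
    return scores @ x

def classify(token_vecs, scene_vec, pos_vecs, w_out):
    """Encode, pool the feature vectors, and score each service type."""
    h = self_attention(embed(token_vecs, scene_vec, pos_vecs)).mean(axis=0)
    return h @ w_out  # one score per service type

rng = np.random.default_rng(0)
toks = rng.normal(size=(5, 8))   # 5 characters, 8-dim vectors
pos = rng.normal(size=(5, 8))
w = rng.normal(size=(8, 3))      # 3 service types
print(classify(toks, rng.normal(size=8), pos, w).shape)  # (3,)
```

Because the scene code enters at the embedding layer, changing it changes the encoder input and hence the output scores, which is how the shared encoder/decoder still produces scene-specific results.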
Fig. 3 is a flowchart of a text classification method according to an embodiment of the present invention. The method comprises the following steps: 301. acquiring a short text to be recognized and the scene label to which it belongs; 302. inputting the short text and the scene label into the trained text classification model; 303. outputting the classification result of the short text to be recognized under the scene label.
As a possible implementation, the classification result includes at least two of the following: unkind, unfriendly, and normal.
It should be noted that the present invention is not limited to dividing the service types into the three categories unkind, unfriendly, and normal; the categories may also be expanded according to actual requirements, with the service types classified into multiple categories according to a predetermined standard.
For example, if the input scene tag is question, the corresponding output is the classification result of the short text to be recognized under the question tag; if the input scene tag is comment, the corresponding output is the classification result of the short text to be recognized under the comment tag.
In order to create a friendlier community environment, different measures are taken against unfriendly content in different scenes: comments use the categories unkind, unfriendly and normal; private messages use unkind and normal; and bullet screens use unkind and normal. Although all of these describe the degree of unfriendliness of a text, the same unkind category is not measured on exactly the same scale in the three scenes. For example, the same content, "You have no culture; go read some books," may be classified as unfriendly in the comment scene; in the private-message scene, which has only two categories, it is classified as unkind; and because the definition of unfriendly content is more relaxed in the bullet-screen scene, it is classified as normal there.
As a preferred implementation, the text classification model predicts the result of a short text directly; a long text is split into short texts that are input into the model separately, and all the short-text prediction results are combined as the prediction result of the long text.
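A minimal sketch of this split-and-merge strategy follows. Chunking by a fixed character length and merging by the most severe label are both our own assumptions; the patent does not fix a splitting or merge rule:

```python
# Assumed severity ordering used to merge chunk-level predictions.
SEVERITY = {"normal": 0, "unkind": 1, "unfriendly": 2}

def split_text(text: str, max_len: int = 64) -> list:
    """Split a long text into short chunks of at most `max_len` characters."""
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

def classify_long(text, classify_short, max_len=64):
    """Classify each chunk with the short-text model, then merge by
    keeping the most severe chunk-level label."""
    preds = [classify_short(chunk) for chunk in split_text(text, max_len)]
    return max(preds, key=lambda p: SEVERITY[p])
```

Other merge rules (e.g. majority vote or score averaging) are equally compatible with the patent's description; taking the most severe label is simply a conservative moderation choice.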
Further, the classification model needs to be optimized. In business development there are cases of single-scene optimization. Testing shows that when the scene model is optimized with data from a single scene, performance in the other scenes does not degrade significantly, but the output scores change noticeably. When optimizing for a certain scene, if the amount of data for that scene is of the same order of magnitude as the original training set, data from the other scenes can be added at the same time to suppress the degree of performance degradation in those scenes. For example, if the original multi-scene model serves comments and answers, then when optimizing for answers it is better to also add some comment data, so that comment performance degrades less while answers are optimized.
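The data-mixing idea above — when fine-tuning for one scene, blend in a comparable amount of data from the other scenes — can be sketched as follows; the dataset structure and the 1:1 mix ratio are illustrative assumptions:

```python
import random

def build_finetune_set(target_data, other_data, mix_ratio=1.0, seed=0):
    """Return the target-scene data plus up to mix_ratio * len(target_data)
    samples drawn from the other scenes, to suppress degradation there."""
    rng = random.Random(seed)
    k = min(len(other_data), int(mix_ratio * len(target_data)))
    mixed = list(target_data) + rng.sample(list(other_data), k)
    rng.shuffle(mixed)
    return mixed
```

Matching the mixed-in volume to the target-scene volume keeps the fine-tuning set on the same order of magnitude as the original multi-scene training set, which is the condition the passage above describes.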
A text classification model training device based on a scene according to an embodiment of the present invention will be described below based on the related description in the embodiment of the text classification model training method based on a scene corresponding to fig. 2. Technical terms, concepts, and the like related to the above-described embodiments in the following embodiments may be described with reference to the above-described embodiments.
Fig. 4 is a schematic structural diagram of a text classification model training apparatus based on a scene according to an embodiment of the present invention. The device 4 comprises: an obtaining module 401, a training module 402, and a constructing module 403, wherein:
an obtaining module 401 configured to obtain a training set, where the training set includes a scene tag and a short text, and the scene tag is used to represent a scene category where the short text is located; a training module 402 configured to input the scene labels and the short texts into a text classification model to iteratively train and determine parameters of the text classification model; a building module 403 configured to build the text classification model according to the text classification model parameters.
As a possible implementation, the training module 402 is specifically configured to input the scene labels and the short texts into the text classification model to train the model parameters, and to stop training and determine the parameters of the text classification model when the loss value and the harmonic mean value on the verification set meet set conditions.
In the training process, the training data are divided into a training set, a verification set and a test set, and reasonable values are first set for parameters such as the number of data iterations (epochs), the number of samples per training step (batch_size) and the early-stopping criterion. The training set, constructed from data of multiple scenes, is then fed in to train the parameters of the model. The verification set is used to monitor the state and convergence of the model during training: the loss value (loss) and the harmonic mean of precision and recall (f1) on the verification set determine which set of model parameters works best, and the early-stopping strategy decides when to stop training to prevent overfitting. Finally, the classification performance of the model is judged by its results on the test set, which evaluates the generalization ability of the model; that is, since the verification set was used to choose the hyper-parameters, the test set ultimately judges whether the model works.
A text classification device provided in an embodiment of the present invention will be described below based on the related description in the embodiment of the text classification method corresponding to fig. 3. Technical terms, concepts, and the like related to the above-described embodiments in the following embodiments may be described with reference to the above-described embodiments.
As shown in fig. 5, which is a schematic structural diagram of a text classification apparatus according to an embodiment of the present invention, the apparatus 5 includes: an obtaining module 501, an input module 502, and an output module 503, wherein:
an obtaining module 501, configured to obtain a short text to be recognized and a scene tag to which the short text belongs; an input module 502 configured to input the short text and the scene label to the trained text classification model; and an output module 503 configured to output the classification result of the short text to be recognized under the scene label.
As a possible implementation, the classification result includes at least two of the following: unkind, unfriendly, and normal.
It should be noted that the present invention is not limited to dividing the service types into the three categories unkind, unfriendly, and normal; the categories may also be expanded according to actual requirements, with the service types classified into multiple categories according to a predetermined standard.
For example, if the input scene tag is question, the corresponding output is the classification result of the short text to be recognized under the question tag; if the input scene tag is comment, the corresponding output is the classification result of the short text to be recognized under the comment tag.
The scene-based text classification model, text classification method and device provided by the embodiments of the present invention comprise: an input layer for acquiring a short text and a scene label, the scene label representing the scene category of the short text; an embedding layer for adding a scene code and a position code to the short text to obtain the coded representation of the short text under the scene; an encoding layer, comprising a multi-head self-attention layer, for extracting text features from the short text according to the coded representation obtained by the embedding layer to obtain the feature vector of the short text; a decoding layer, comprising a multilayer perceptron, for outputting a score for each service type according to the feature vector obtained by the encoding layer; and an output layer for outputting the classification result according to the scores of the service types. A scene label representing the scene of the short text is acquired at the input layer, and a scene code and a position code are then added at the embedding layer to distinguish the scenes, so that for the same service the classification results of the corresponding scenes differ. In addition, because the scenes are already distinguished at the encoding layer, the multilayer perceptron performs shared decoding at the decoding layer, making the model simple to train, convenient to develop and maintain, and more effective. Furthermore, because scene labels are added to the model, the text classification model can be used for text classification under multiple scenes, and the text classification results are accurate.
As shown in fig. 6, in order to provide a schematic structural diagram of an electronic device according to an embodiment of the present invention, the electronic device 600 includes a Central Processing Unit (CPU) 601, which can execute various suitable actions and processes shown in fig. 2 or 3 according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, only the division of the functional modules is illustrated, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A scene-based text classification model implemented by a computer, the model comprising:
an input layer, configured to acquire a short text and a scene tag, wherein the scene tag is used for representing the scene category of the short text;
the embedded layer is used for adding scene codes and position codes to the short text to obtain coded representation of the short text under the scene;
the coding layer comprises a multi-head self-attention layer and is used for extracting text features in the short text according to the coding representation obtained by the embedding layer to obtain feature vectors of the short text;
the decoding layer comprises a multilayer perceptron and is used for outputting scores of all service types according to the feature vectors of the short texts obtained by the coding layer;
and the output layer is used for outputting a classification result according to the scores of all the service types.
2. The model of claim 1, wherein the embedding layer is specifically configured to: and mapping the scene label and the short text into a vector, and adding a position code to obtain a code representation of the short text in the scene.
3. The model of claim 2, wherein the scene tag is located at the beginning or end of the coded representation, and the coded representation of the short text under the scene is:
x = [s, c_1, c_2, ..., c_n] + p

or

x = [c_1, c_2, ..., c_n, s] + p

wherein s denotes the scene code, p denotes the position code, and c_i denotes the vector corresponding to each character.
4. A method for training a text classification model based on a scene is characterized by comprising the following steps:
acquiring a training set, wherein the training set comprises scene labels and short texts, and the scene labels are used for representing scene categories where the short texts are located;
inputting the scene labels and the short texts into a text classification model for iterative training to determine parameters of the text classification model;
and constructing the text classification model according to the text classification model parameters.
5. A method of text classification, the method comprising:
acquiring a short text to be recognized and a scene label to which the short text belongs;
inputting the short text and the scene label to a trained text classification model;
and outputting the classification result of the short text to be recognized under the scene label.
6. The method of claim 5, wherein the classification results include at least two of: unkind, unfriendly and normal.
7. A scene-based text classification model training apparatus, the apparatus comprising:
the acquisition module is configured to acquire a training set, wherein the training set comprises scene labels and short texts, and the scene labels are used for representing scene categories where the short texts are located;
a training module configured to input the scene labels and the short texts into a text classification model to perform iterative training to determine parameters of the text classification model;
a construction module configured to construct the text classification model according to the text classification model parameters.
8. An apparatus for classifying text, the apparatus comprising:
an acquisition module, configured to acquire a short text to be recognized and the scene label to which the short text belongs;
an input module configured to input the short text and the scene label to a trained text classification model;
and the output module is configured to output the classification result of the short text to be recognized under the scene label.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 4-6 when executing the program.
10. A computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the method of any one of claims 4-6.
CN202011291516.8A 2020-11-18 2020-11-18 Scene-based text classification model, text classification method and device Active CN112100390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011291516.8A CN112100390B (en) 2020-11-18 2020-11-18 Scene-based text classification model, text classification method and device


Publications (2)

Publication Number Publication Date
CN112100390A true CN112100390A (en) 2020-12-18
CN112100390B CN112100390B (en) 2021-05-07

Family

ID=73785271


Country Status (1)

Country Link
CN (1) CN112100390B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046269A (en) * 2015-06-19 2015-11-11 鲁东大学 Multi-instance multi-label scene classification method based on multinuclear fusion
CN110188827A (en) * 2019-05-29 2019-08-30 创意信息技术股份有限公司 A kind of scene recognition method based on convolutional neural networks and recurrence autocoder model
CN111274382A (en) * 2018-11-20 2020-06-12 北京京东尚科信息技术有限公司 Text classification method, device, equipment and storage medium
CN111523420A (en) * 2020-04-14 2020-08-11 南京烽火星空通信发展有限公司 Header classification and header list semantic identification method based on multitask deep neural network
US20200311207A1 (en) * 2019-03-28 2020-10-01 Adobe Inc. Automatic text segmentation based on relevant context


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114020922A (en) * 2022-01-06 2022-02-08 智者四海(北京)技术有限公司 Text classification method, device and storage medium
CN114020922B (en) * 2022-01-06 2022-03-22 智者四海(北京)技术有限公司 Text classification method, device and storage medium

Also Published As

Publication number Publication date
CN112100390B (en) 2021-05-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant