CN113254645B - Text classification method and device, computer equipment and readable storage medium


Info

Publication number
CN113254645B
CN113254645B (application CN202110635464.XA)
Authority
CN
China
Prior art keywords
current
training
model
operation type
text classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110635464.XA
Other languages
Chinese (zh)
Other versions
CN113254645A (en)
Inventor
顾凌云
陈波
江峰
陈国豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Bingjian Information Technology Co ltd
Original Assignee
Nanjing Bingjian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Bingjian Information Technology Co ltd
Priority to CN202110635464.XA
Publication of CN113254645A
Application granted
Publication of CN113254645B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques

Abstract

The application discloses a text classification method and device, computer equipment and a readable storage medium. The text classification method comprises the following steps: obtaining a plurality of training text data including corresponding text category labels; inputting one training text data into an initial text classification model and training to obtain a current text classification model; configuring a first operation type and a second operation type for each layer of current model primitives included in the current text classification model to construct a current search network space; inputting another training text data into the current search network space and training to obtain an intermediate text classification model; and taking the intermediate text classification model as the current text classification model and continuing in this way until all the training text data have been used for training, obtaining a target text classification model capable of classifying text data. Compared with the related art, in which a separate model must be trained for each text classification task, the scheme provided by the application can complete multiple text classification tasks with only one model.

Description

Text classification method and device, computer equipment and readable storage medium
Technical Field
The present application relates to the field of text classification technologies, and in particular, to a text classification method, apparatus, computer device, and readable storage medium.
Background
With more and more data in real life, people hope that an artificial-intelligence neural network model can learn continuously like the human brain. However, existing neural networks cannot overcome the problem of catastrophic forgetting: after new knowledge is learned, previously learned knowledge is almost completely forgotten. In the field of text classification and recognition, because a traditional model obtained by neural network training suffers from catastrophic forgetting, a separate model needs to be trained for each text classification task. As the number of text classification tasks grows, the number of models to be trained grows with it, which not only occupies a large amount of memory on the computer equipment but also wastes a large amount of time training the models.
Disclosure of Invention
The application provides a text classification method and device, computer equipment and a readable storage medium, which can realize a plurality of text classification tasks with only one model, saving storage space and the time otherwise spent training a plurality of models.
In a first aspect, an embodiment of the present application provides a text classification method applied to a computer device, where the method includes:
acquiring a plurality of training text data, wherein the plurality of training text data comprise corresponding text type labels;
inputting training text data into an initial text classification model, and training to obtain a current text classification model, wherein the current text classification model comprises a plurality of layers of current model elements;
configuring a first operation type and a second operation type for each layer of current model primitives, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, wherein the first operation type characterizes a retention operation performed on the current model primitives and the second operation type characterizes a modification operation performed on the current model primitives;
selecting any training text data from the rest training text data, inputting the training text data into the current search network space, and training to obtain an intermediate text classification model;
and taking the intermediate text classification model as a current text classification model, returning to the step of configuring a first operation type and a second operation type for each layer of current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type until all training text data are input, and training to obtain a target text classification model for classifying the text data.
In a possible implementation, the first operation type is a hold operation, and the second operation type is a new operation;
configuring a first operation type and a second operation type for each layer of current model primitives, and constructing a current search network space based on a current text classification model, the first operation type and the second operation type, wherein the method comprises the following steps:
configuring a maintaining operation and a new building operation for each layer of current model primitive;
and constructing and obtaining, based on the current text classification model, a current search network space comprising a plurality of layers of current search network nodes, wherein each layer of current search network nodes comprises the corresponding current model primitive configured with the holding operation and the new building operation.
In a possible implementation manner, selecting any training text data from the remaining training text data to input into the current search network space, and training to obtain an intermediate text classification model, including:
selecting any training text data from the rest training text data and inputting the training text data into the current search network space;
determining a characteristic value corresponding to each layer of the current search network node according to the training text data;
determining the operation type corresponding to each layer of the current search network node as a holding operation or a new operation according to the characteristic value;
according to the operation type corresponding to each layer of current searching network node, corresponding operation is executed on the current model element corresponding to each layer of current searching network node;
obtaining an initial intermediate text classification model according to each layer of current model elements executing corresponding operations;
and training to obtain an intermediate text classification model according to the training text data.
In a possible implementation manner, according to the determined operation type corresponding to each layer of current search network node, performing a corresponding operation on a current model primitive corresponding to each layer of current search network node includes:
and when the operation type corresponding to the target current searching network node is a holding operation, holding the corresponding target current model primitive, wherein the target current searching network node is any one of the multilayer current searching network nodes.
In a possible implementation manner, according to the determined operation type corresponding to each layer of current search network node, performing a corresponding operation on a current model primitive corresponding to each layer of current search network node includes:
when the operation type corresponding to the target current searching network node is a new building operation, a target new model primitive with the same size as the current model primitive is newly added, and the target current searching network node is any one of the multilayer current searching network nodes;
and training a target new model element according to the training text data.
In one possible implementation, configuring a first operation type and a second operation type for each layer of the current model primitive, further includes:
configuring a third operation type for each layer of current model primitive, wherein the third operation type is fine tuning operation;
according to the operation type corresponding to each layer of the current searching network node, corresponding operation is executed to the current model element corresponding to each layer of the current searching network node, and the method further comprises the following steps:
when the operation type corresponding to the target current search network node is fine tuning operation, the corresponding target current model element is maintained, and a fine tuning parameter model element is established according to the fine tuning operation;
and merging the fine tuning parameter model primitive and the target current model primitive to be used as a target current model primitive.
In a possible implementation manner, determining a feature value corresponding to each layer of the current search network node according to the training text data includes:
acquiring an intermediate test classification text with the same type as the training text data;
and training based on the intermediate test classification text by using a logistic regression model to obtain a characteristic value corresponding to each layer of the current search network node.
In a second aspect, an embodiment of the present application provides a text classification apparatus, which is applied to a computer device, and includes:
the acquisition module is used for acquiring a plurality of training text data, and the plurality of training text data comprise corresponding text type labels.
The training module is used for inputting training text data into the initial text classification model and training to obtain a current text classification model, where the current text classification model comprises a plurality of layers of current model primitives; configuring a first operation type and a second operation type for each layer of current model primitives, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, wherein the first operation type characterizes a retention operation performed on the current model primitives and the second operation type characterizes a modification operation performed on the current model primitives; selecting any training text data from the remaining training text data, inputting it into the current search network space, and training to obtain an intermediate text classification model; and taking the intermediate text classification model as the current text classification model, returning to the step of configuring a first operation type and a second operation type for each layer of current model primitives and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, until all training text data have been input, and training to obtain a target text classification model for classifying text data.
In a third aspect, an embodiment of the present application provides a computer device, where the computer device includes a processor and a non-volatile memory storing computer instructions, and when the computer instructions are executed by the processor, the computer device performs the text classification method in at least one possible implementation manner of the first aspect.
In a fourth aspect, the present application provides a readable storage medium, where the readable storage medium includes a computer program, and the computer program controls, when running, a computer device in which the readable storage medium is located to perform the text classification method in at least one possible implementation manner of the first aspect.
Compared with the prior art, the beneficial effects provided by the application include the following. The application discloses a text classification method and device, computer equipment and a readable storage medium, where the text classification method comprises: obtaining a plurality of training text data including corresponding text category labels; inputting one training text data into an initial text classification model for training to obtain a current text classification model; configuring a first operation type and a second operation type for each layer of current model primitives included in the current text classification model to construct a current search network space; inputting another training text data into the current search network space and training to obtain an intermediate text classification model; and taking the intermediate text classification model as the current text classification model and continuing until all the training text data have been used for training, obtaining a target text classification model capable of classifying text data. Compared with the related art, in which a plurality of models need to be trained for different text classification tasks, the scheme provided by the application can complete a plurality of text classification tasks with only one model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments will be briefly described below. It is appreciated that the following drawings depict only certain embodiments of the application and are therefore not to be considered limiting of its scope. For a person skilled in the art, it is possible to derive other relevant figures from these figures without inventive effort.
Fig. 1 is a schematic flowchart illustrating steps of a text classification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a loss function formula provided in an embodiment of the present application;
fig. 3 is a schematic diagram of a feature value calculation formula provided in an embodiment of the present application;
fig. 4 is a schematic diagram of a morphological change of a current text classification model in a training process according to an embodiment of the present application;
fig. 5 is a block diagram schematically illustrating a structure of a text classification apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
In the description of the present application, it is also to be noted that, unless otherwise explicitly stated or limited, the terms "disposed" and "connected" are to be interpreted broadly, for example, "connected" may be a fixed connection, a detachable connection, or an integral connection; can be mechanically or electrically connected; the connection may be direct or indirect via an intermediate medium, and may be a communication between the two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
The following detailed description of embodiments of the present application will be made with reference to the accompanying drawings.
In the training process of a traditional neural network, one model generally needs to be trained for each classification task. For example, in the field of text classification, news texts need to be classified, that is, sports news texts, entertainment news texts, financial news texts, and the like are determined from a plurality of obtained news texts; this may be referred to as one classification task. Yet another classification task may be to classify education texts, i.e., to determine higher mathematics texts, signals and systems texts, linear algebra texts, etc. from a plurality of acquired education texts. In the related art, in order to avoid the catastrophic forgetting that may occur during neural network learning, a different model is trained for each task.
In order to solve the technical problem in the foregoing background art, fig. 1 is a schematic flowchart of a step of a text classification method provided in an embodiment of the present application, and the text classification method is described in detail below.
Step S201, a plurality of training text data are acquired.
Wherein the plurality of training text data includes corresponding text category labels.
In order to invoke only one model when processing the individual text classification tasks, the training data may be composed of a plurality of training text data, each corresponding to a text category label. For example, the text category label corresponding to one of the training text data may be a news category, and the training text data corresponding to the news category may include a plurality of sports news texts, entertainment news texts, financial news texts, and the like. It should be understood that the training text data corresponding to each text category label corresponds to one classification task, not to one of the multiple categories under the same classification task.
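As a minimal sketch of one possible layout for this training data (the field names and example texts are invented for illustration, not taken from the patent):

```python
# Each text category label corresponds to one classification task, and the
# texts under that label span the classes of that task.
training_text_data = [
    {
        "text_category_label": "news",          # one classification task
        "samples": [
            ("The home team won the final 3-1.", "sports"),
            ("The new blockbuster opens Friday.", "entertainment"),
            ("Shares rallied after the report.", "finance"),
        ],
    },
    {
        "text_category_label": "education",     # another classification task
        "samples": [
            ("Limits and continuity of functions ...", "higher mathematics"),
            ("Fourier transform of a periodic signal ...", "signals and systems"),
            ("Eigenvalues of a symmetric matrix ...", "linear algebra"),
        ],
    },
]
```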
Step S202, inputting training text data into an initial text classification model, and training to obtain a current text classification model.
Wherein the current text classification model includes a plurality of layers of current model primitives.
Any training text data can be selected from the plurality of training text data and input into the initial text classification model for training, obtaining a current text classification model capable of classifying the currently input training text data. The number of layers of the initial text classification model can be adjusted according to the required model precision and is not limited herein; the trained current text classification model comprises multiple layers of current model primitives with the same number of layers. In the embodiment of the present application, the model primitive may be an LSTM (Long Short-Term Memory) network, a Transformer, or BERT, which is not limited herein.
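As a hedged illustration of what such a stack of model primitives could look like if the LSTM option were chosen (all class names, dimensions, and layer counts here are assumptions for the sketch, not the patent's implementation):

```python
import torch
import torch.nn as nn

class StackedPrimitiveClassifier(nn.Module):
    """One possible instantiation: each 'model primitive' is one LSTM layer,
    kept as a separate module so that individual layers can later be held,
    fine-tuned, or duplicated."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_layers=5, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.primitives = nn.ModuleList([
            nn.LSTM(embed_dim if i == 0 else hidden_dim, hidden_dim,
                    batch_first=True)
            for i in range(num_layers)
        ])
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)          # (batch, time, embed_dim)
        for lstm in self.primitives:
            x, _ = lstm(x)                     # pass through each primitive
        return self.head(x[:, -1, :])          # classify from last time step
```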
In the embodiment of the present application, the current text classification model may be trained using the loss function shown in fig. 2, which can be written as

$$L_t(\theta) = \frac{1}{n_t} \sum_{(x_i^t,\, y_i^t) \in D_t^{train}} l_t\big(f(x_i^t; \theta),\, y_i^t\big)$$

where $L_t$ is the overall softmax (logistic regression) cross-entropy loss, $l_t$ is the softmax cross-entropy loss of a single sample of the current training text, $\theta$ denotes the parameters of the current text classification model, $t$ indexes the training texts, $n_t$ is the number of samples in the data set corresponding to the current training text, $f$ is the predicted value of the current text classification model, $y$ is the true value, $x_i^t$ is the $i$th sample in the current training text $t$, and $D_t^{train}$ is the training set corresponding to training text $t$.
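As an illustrative sketch of this loss (the use of PyTorch, the batching, and the helper names are assumptions, not from the patent), one training step on a batch from $D_t^{train}$ might look like:

```python
import torch
import torch.nn.functional as F

def task_loss(model, batch_tokens, batch_labels):
    """L_t(theta): mean softmax cross-entropy over a batch from task t."""
    logits = model(batch_tokens)                  # f(x_i^t; theta)
    return F.cross_entropy(logits, batch_labels)  # averages l_t over samples

def train_step(model, optimizer, batch_tokens, batch_labels):
    """One optimization step; the optimizer choice is an assumption."""
    optimizer.zero_grad()
    loss = task_loss(model, batch_tokens, batch_labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```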
Step S203, configuring a first operation type and a second operation type for each layer of current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type.
Wherein the first operation type is used for characterizing the retention operation performed on the current model primitive and the second operation type is used for characterizing the modification operation performed on the current model primitive.
The first operation type and the second operation type may be configured for each layer of current model primitives included in the trained current text classification model, and it should be understood that the configuration of the first operation type and the second operation type is used as a basis for constructing a subsequent intermediate text classification model, and the first operation type and the second operation type may be adjusted based on a degree of adaptation of the current text classification model to subsequently input training data.
And step S204, selecting any training text data from the rest training text data, inputting the training text data into the current search network space, and training to obtain an intermediate text classification model.
It should be understood that the current text classification model obtained by this training can classify the training text data input the first time. In order to implement different text classification tasks with the same model, any training text data can be selected from the remaining training text data and trained in the current search network space constructed on the basis of the current text classification model, obtaining an intermediate text classification model capable of classifying both types of training text, that is, an intermediate text classification model that can be used for two different text classification tasks.
And S205, taking the intermediate text classification model as a current text classification model, returning to the step of configuring the first operation type and the second operation type for each layer of current model primitive, and constructing the current search network space based on the current text classification model, the first operation type and the second operation type until all the input of the training text data is completed, and training to obtain a target text classification model for classifying the text data.
The intermediate text classification model obtained by training can be used as a current text classification model, a corresponding current search network is constructed, and then the rest training text data are sequentially used as input for training, so that a target text classification model capable of classifying the text data corresponding to the related text class labels can be obtained.
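A high-level sketch of the overall loop of steps S202 to S205 follows; the helper callables (build_search_space, decide_operations, apply_operations, train_on_task) are hypothetical stand-ins for the sub-steps described in this application, not APIs defined by the patent:

```python
def train_target_model(tasks, initial_model, train_on_task,
                       build_search_space, decide_operations,
                       apply_operations):
    # S202: train the initial model on the first training text data
    current_model = train_on_task(initial_model, tasks[0])
    for task in tasks[1:]:
        # S203: configure hold / new-build (/ fine-tune) operations per
        # layer and build the current search network space
        search_space = build_search_space(current_model)
        # S204: decide an operation per layer from the feature values alpha,
        # apply them, then train the resulting model on this task
        operations = decide_operations(search_space, task)
        intermediate = apply_operations(current_model, operations)
        current_model = train_on_task(intermediate, task)
    # S205: once all training text data are consumed, this is the target model
    return current_model
```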
Through the scheme, the constructed search network and the configured first and second operation types establish the relation between different training text data, and a target text classification model usable for the text classification tasks corresponding to the different text category labels can finally be obtained. This solves the catastrophic forgetting problem caused by using a traditional neural network model in the related art, avoids training a plurality of models, saves computer memory, and saves training time.
In one possible embodiment, the first operation type is a hold operation and the second operation type is a new operation. The foregoing step S203 can be implemented by the following specific implementation.
And a substep S203-1 of configuring a hold operation and a new operation for each layer of the current model primitive.
And a substep S203-2, constructing and obtaining a current searching network space comprising a plurality of layers of current searching network nodes based on the current text classification model.
Each layer of current search network nodes comprises the corresponding current model primitive configured with the holding operation and the new building operation.
In the embodiment of the present application, the first operation type for characterizing the retention operation performed on the current model primitive may be a holding operation, and the second operation type for characterizing the modification operation performed on the current model primitive may be a new building operation.
In a possible implementation, the aforementioned step S204 may be implemented in the following manner.
And a substep S204-1, selecting any training text data from the rest training text data and inputting the selected training text data into the current search network space.
And a substep S204-2, determining a characteristic value corresponding to each layer of the current search network node according to the training text data.
And a substep S204-3, determining the operation type corresponding to each layer of the current search network node as a holding operation or a new operation according to the characteristic value.
And a substep S204-4, executing corresponding operation on the current model element corresponding to each layer of current searching network node according to the determined operation type corresponding to each layer of current searching network node.
And a substep S204-5, obtaining an initial intermediate text classification model according to each layer of current model elements executing corresponding operations.
And a substep S204-6, training to obtain an intermediate text classification model according to the training text data.
In this embodiment of the present application, the current search network space may be formed by multiple layers of current search network nodes, and the basis for choosing between the holding operation and the new building operation at each layer of current search network nodes may be the feature value α corresponding to that current search network node. According to what the feature value α characterizes, whether each layer of current search network nodes executes the holding operation or the new building operation can be determined. The model primitives obtained after the corresponding operations are executed on each layer of current search network nodes form an initial intermediate text classification model, which is then trained with the training text data to finally obtain the intermediate text classification model.
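As a sketch of how the feature value α could drive the per-layer decision (encoding α as one weight vector per layer is an assumption consistent with fig. 3, not spelled out in the text):

```python
import torch

def decide_layer_operations(alphas, op_names=("hold", "new")):
    """Pick, for each layer, the operation whose alpha weight is largest.
    alphas: list of per-layer tensors of shape (K,), K = number of
    configured operation types (an assumed encoding)."""
    decisions = []
    for layer_alpha in alphas:
        k = int(torch.argmax(layer_alpha).item())
        decisions.append(op_names[k])
    return decisions

# With a third operation type configured (see below), one would call e.g.
# decide_layer_operations(alphas, ("hold", "new", "fine-tune"))
```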
In a possible embodiment, the foregoing sub-step S204-4 can be implemented by the following specific embodiments.
(1) And when the operation type corresponding to the target current searching network node is a holding operation, holding the corresponding target current model primitive, wherein the target current searching network node is any one of the multilayer current searching network nodes.
It should be understood that, when the operation type corresponding to the target current search network node is the hold operation, the target current model primitive corresponding to the target current search network node obtained by the previous training may be considered to be suitable for the training text data provided this time, and therefore, the target current model primitive may be maintained without performing modification related operations.
In a possible embodiment, the foregoing sub-step S204-4 can be implemented by the following specific embodiments.
(2) And when the operation type corresponding to the target current searching network node is the new building operation, newly adding a target newly-built model primitive with the same size as the current model primitive, wherein the target current searching network node is any one of the multilayer current searching network nodes.
(3) And training a target new model element according to the training text data.
It should be understood that, when the operation type corresponding to the target current search network node is a new building operation, the target current model primitive corresponding to that node, obtained in the previous training, may be considered unsuitable for the training text data provided this time, and a model primitive needs to be trained anew. A target new model primitive having the same size as the current model primitive may be provided and trained using the training text data input this time; after training is completed, the current model primitive and the new model primitive coexist in the same layer of the trained model.
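A sketch of the new building operation under these assumptions (the re-initialization scheme and the freezing of the previously trained primitive are assumed details, not stated in the patent):

```python
import copy
import torch.nn as nn

def new_build_primitive(layer_primitive: nn.Module) -> nn.Module:
    """Create a fresh primitive with the same size/architecture as the
    existing one; it will coexist with the old primitive in the same layer."""
    new_primitive = copy.deepcopy(layer_primitive)
    for p in new_primitive.parameters():
        nn.init.uniform_(p, -0.1, 0.1)   # re-initialize, discard old weights
    # freeze the previously trained primitive; only the new one is trained
    # on the training text data input this time
    for p in layer_primitive.parameters():
        p.requires_grad = False
    return new_primitive
```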
In a possible embodiment, the foregoing step S203 may also have the following embodiments.
And a substep S203-3, configuring a third operation type for each layer of the current model primitive, wherein the third operation type is a fine tuning operation.
Accordingly, the aforementioned sub-step S204-4 can be implemented by the following specific embodiments.
(4) When the operation type corresponding to the target current search network node is fine tuning operation, the corresponding target current model element is maintained, and a fine tuning parameter model element is established according to the fine tuning operation;
and merging the fine tuning parameter model primitive and the target current model primitive to be used as a target current model primitive.
In addition to the holding operation and the new building operation mentioned above, a third operation type may be configured for each layer of current model primitives, and the third operation type may be a fine-tuning operation. When the operation to be executed is a fine-tuning operation, the target current model primitive corresponding to the target current search network node obtained in the previous training may be considered suitable for the training text data provided this time to some extent, so the model primitive only needs to be fine-tuned. A fine-tuning parameter model primitive can be constructed for the small points of difference and merged into the target current model primitive so that the two act jointly.
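One way the fine-tuning operation could be realized is an adapter-style residual primitive; the additive merge below is an assumption about the merging step, which the text leaves unspecified:

```python
import torch.nn as nn

class FineTunedPrimitive(nn.Module):
    """Sketch: keep the trained primitive frozen and add a small trainable
    'fine-tuning parameter' primitive whose output is merged (here: added)
    with the original output, so the pair acts as one primitive."""
    def __init__(self, frozen_primitive: nn.Module, hidden_dim: int):
        super().__init__()
        self.frozen = frozen_primitive
        for p in self.frozen.parameters():
            p.requires_grad = False
        self.delta = nn.Linear(hidden_dim, hidden_dim)  # small trainable part

    def forward(self, x):
        y, _ = self.frozen(x)      # assumes an LSTM-style primitive
        return y + self.delta(y)   # merged output of the joint primitive
```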
It should be noted that, in the embodiment of the present application, if a new building operation was involved in a previous round of training, the layer corresponding to that operation contains two model primitives. In the next round of training on new training text data, a holding operation or a fine-tuning operation acts on both of the model primitives obtained in the previous round, whereas a new building operation creates a primitive independent of the two model primitives obtained in the previous round.
In one possible embodiment, the aforementioned sub-step S204-2 may be implemented in the following manner.
(1) And acquiring an intermediate test classification text with the same type as the training text data.
(2) And training based on the intermediate test classification text by using a logistic regression model to obtain a feature value corresponding to each layer of current search network nodes.
Referring to fig. 3, in the process of determining the feature value α corresponding to each layer of current search network nodes, a logistic regression model may be used for training. For example, the search space of each layer may have size K, with L layers in total; taking the lth layer of current search network nodes as an example, the form may be as shown in fig. 3, where g_l^k(x) denotes the kth model primitive of the lth layer, and the feature value α may be trained in combination with the formula shown in fig. 2. It should be understood that the intermediate test classification text D_t^val employed here may be separated in advance from the corresponding training text data.
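In the spirit of fig. 3 (whose exact form is not reproduced in the text), a layer's output during the search could be a softmax-weighted mixture of its K candidate primitives g_l^k(x), with α trained via the fig. 2 loss on the held-out split D_t^val; the mixture form below is an assumption:

```python
import torch
import torch.nn.functional as F

def mixed_layer_output(candidate_outputs, alpha):
    """Layer output as a softmax(alpha)-weighted sum of the K candidate
    primitive outputs g_l^k(x); all candidates share one output shape.
    candidate_outputs: list of K tensors; alpha: tensor of shape (K,)."""
    weights = F.softmax(alpha, dim=0)                 # K mixing weights
    stacked = torch.stack(candidate_outputs, dim=0)   # (K, ...)
    shape = (-1,) + (1,) * (stacked.dim() - 1)        # broadcast weights
    return (weights.view(shape) * stacked).sum(dim=0)
```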
In order to describe the embodiments of the present application more clearly, the following takes as an example an initial text classification model comprising five layers of model primitives. Referring to fig. 4, training text data (input data) is input for the first time, training is performed using the loss function described above, a predicted classification result Task1 is output through the five layers of model primitives Model1, Model2, Model3, Model4 and Model5, and after training is completed, the current text classification model is obtained. A current search network space is then constructed and new training text data is input into it. If the feature values α corresponding to Model1, Model2, Model3, Model4 and Model5 respectively characterize "holding operation", "new building operation", "holding operation", "fine-tuning operation" and "new building operation", then the intermediate text classification model obtained by training contains the newly built model primitives Model2' and Model5'. The predicted classification result Task2 output by the intermediate text classification model may be the classification result for the category label corresponding to the previously input training text data or for the category label corresponding to the training text data input this time.
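The fig. 4 example can be summarized as the following per-layer decision table (a plain restatement of the figure; the primed names denote newly built primitives coexisting with the originals):

```python
# Per-layer decisions after the second task in the fig. 4 example.
decisions = {
    "Model1": "hold",
    "Model2": "new",        # adds Model2' alongside Model2
    "Model3": "hold",
    "Model4": "fine-tune",  # Model4 merged with a fine-tuning primitive
    "Model5": "new",        # adds Model5' alongside Model5
}
for name, op in decisions.items():
    print(f"{name}: {op}")
```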
An embodiment of the present application provides a text classification apparatus 110, which is applied to a computer device, please refer to fig. 5, where the text classification apparatus 110 includes:
the obtaining module 1101 is configured to obtain a plurality of training text data, where the plurality of training text data includes corresponding text category labels.
The training module 1102 is configured to input training text data into an initial text classification model and train to obtain a current text classification model, where the current text classification model comprises multiple layers of current model primitives; configure a first operation type and a second operation type for each layer of current model primitives, and construct a current search network space based on the current text classification model, the first operation type and the second operation type, wherein the first operation type characterizes a retention operation performed on the current model primitives and the second operation type characterizes a modification operation performed on the current model primitives; select any training text data from the remaining training text data, input it into the current search network space, and train to obtain an intermediate text classification model; and take the intermediate text classification model as the current text classification model, return to the step of configuring a first operation type and a second operation type for each layer of current model primitives and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, until all training text data have been input, and train to obtain a target text classification model for classifying text data.
In a possible implementation, the first operation type is a hold operation, and the second operation type is a new operation; the training module 1102 is specifically configured to:
configuring a maintaining operation and a new building operation for each layer of current model primitive; and constructing and obtaining a current search network space comprising a plurality of layers of current search network nodes based on the current text classification model, wherein each layer of current search network node comprises corresponding current model element configuration holding operation and new construction operation.
In one possible implementation, the training module 1102 is specifically configured to:
selecting any training text data from the rest training text data and inputting the training text data into the current search network space; determining a characteristic value corresponding to each layer of the current search network node according to the training text data; determining the operation type corresponding to each layer of the current search network node as a holding operation or a new operation according to the characteristic value; according to the operation type corresponding to each layer of current searching network node, corresponding operation is executed on the current model element corresponding to each layer of current searching network node; obtaining an initial intermediate text classification model according to each layer of current model elements executing corresponding operations; and training to obtain an intermediate text classification model according to the training text data.
In a possible implementation, the training module 1102 is further specifically configured to:
and when the operation type corresponding to the target current searching network node is a holding operation, holding the corresponding target current model primitive, wherein the target current searching network node is any one of the multilayer current searching network nodes.
In a possible implementation, the training module 1102 is further specifically configured to:
when the operation type corresponding to the target current searching network node is a new building operation, a target new model primitive with the same size as the current model primitive is newly added, the target current searching network node being any one of the multilayer current searching network nodes; and the target new model primitive is trained according to the training text data.
In one possible implementation, the training module 1102 is further configured to:
and configuring a third operation type for each layer of the current model primitive, wherein the third operation type is a fine tuning operation.
When the operation type corresponding to the target current search network node is fine tuning operation, the corresponding target current model element is maintained, and a fine tuning parameter model element is established according to the fine tuning operation;
and merging the fine tuning parameter model primitive and the target current model primitive to be used as a target current model primitive.
In one possible implementation, the training module 1102 is further configured to:
acquiring an intermediate test classification text of the same type as the training text data; and training based on the intermediate test classification text by using a logistic regression model to obtain a characteristic value corresponding to each layer of the current search network node.
It should be noted that, for the implementation principle of the text classification apparatus 110, reference may be made to the implementation principle of the text classification method, which is not repeated here. It should be understood that the division of the modules of the above apparatus is only a logical division; in an actual implementation they may be wholly or partially integrated into one physical entity or physically separated. These modules may all be implemented as software invoked by a processing element, or all in hardware, or some modules as software invoked by a processing element and others in hardware. For example, the text classification apparatus 110 may be a separately established processing element, or may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code that a processing element of the apparatus calls to execute the functions of the text classification apparatus 110. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit having signal processing capability. In implementation, each step of the above method, or each of the above modules, may be implemented by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call program code. As yet another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
The embodiment of the application provides a computer device comprising a processor and a non-volatile memory storing computer instructions; when the computer instructions are executed by the processor, the computer device performs the foregoing text classification method. The computer device comprises the text classification apparatus 110, a memory, a processor, and a communication unit.
To enable the transfer or interaction of data, the memory, the processor and the communication unit are electrically connected to one another, directly or indirectly. For example, these components may be electrically connected to each other via one or more communication buses or signal lines. The text classification apparatus 110 comprises at least one software functional module that can be stored in the memory in the form of software or firmware or solidified in the operating system (OS) of the computer device. The processor is used to execute the executable modules stored in the memory, such as the software functional modules and computer programs included in the text classification apparatus 110.
An embodiment of the present application provides a readable storage medium, where the readable storage medium includes a computer program, and the computer program controls a computer device where the readable storage medium is located to execute the foregoing text classification method when the computer program runs.
In summary, the present application discloses a text classification method and device, computer equipment and a readable storage medium. The method comprises: obtaining a plurality of training text data including corresponding text category labels; inputting one training text data into an initial text classification model for training to obtain a current text classification model; configuring a first operation type and a second operation type for each layer of current model primitives included in the current text classification model to construct a current search network space; inputting another training text data into the current search network space and training to obtain an intermediate text classification model; and taking the intermediate text classification model as the current text classification model and continuing training on all the training text data to obtain a target text classification model capable of classifying text data. Compared with the related art, in which a plurality of models need to be trained for different text classification tasks, the scheme provided by the application can complete a plurality of text classification tasks with only one model.
The foregoing description, for purposes of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the application to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical application, to thereby enable others skilled in the art to best utilize the application and various embodiments with various modifications as are suited to the particular use contemplated.

Claims (8)

1. A text classification method applied to a computer device, the method comprising:
acquiring a plurality of training text data, wherein the training text data comprise corresponding text type labels;
inputting the training text data into an initial text classification model, and training to obtain a current text classification model, wherein the current text classification model comprises a plurality of layers of current model elements;
configuring a first operation type and a second operation type for each layer of the current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, wherein the first operation type is a maintaining operation, and the second operation type is a new building operation;
selecting any training text data from the rest training text data, inputting the training text data into the current search network space, and training to obtain an intermediate text classification model;
taking the intermediate text classification model as a current text classification model of a next stage, returning to the step of configuring a first operation type and a second operation type for each layer of current model elements, and constructing a current search network space based on the current text classification model of the next stage, the first operation type and the second operation type until all training text data are input, and training to obtain a target text classification model for classifying the text data;
configuring a first operation type and a second operation type for each layer of the current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, including:
configuring the maintaining operation and the new building operation for each layer of the current model primitive;
constructing and obtaining a current search network space comprising a plurality of layers of current search network nodes based on the current text classification model, wherein each layer of the current search network nodes comprises the corresponding current model primitive configured with the maintaining operation and the new building operation;
selecting any training text data from the rest training text data, inputting the selected training text data into the current search network space, and training to obtain an intermediate text classification model, wherein the training comprises the following steps:
selecting any training text data from the rest training text data to input the current search network space;
determining a characteristic value corresponding to each layer of the current search network node according to the training text data;
determining the operation type corresponding to each layer of the current search network node as the maintaining operation or the new building operation according to the characteristic value;
according to the determined operation type corresponding to each layer of the current searching network node, corresponding operation is executed on the current model element corresponding to each layer of the current searching network node;
obtaining an initial intermediate text classification model according to each layer of the current model primitive executing corresponding operation;
and training to obtain the intermediate text classification model according to the training text data and the initial intermediate text classification model.
2. The method according to claim 1, wherein said performing corresponding operations on the current model primitive corresponding to the current searching network node of each layer according to the determined operation type corresponding to the current searching network node of each layer comprises:
and when the operation type corresponding to the target current searching network node is the maintaining operation, holding the corresponding target current model primitive, wherein the target current searching network node is any one of the multilayer current searching network nodes.
3. The method according to claim 1, wherein said performing corresponding operations on the current model primitive corresponding to the current searching network node of each layer according to the determined operation type corresponding to the current searching network node of each layer comprises:
when the operation type corresponding to the target current searching network node is the new building operation, newly adding a target newly-built model primitive with the same size as the current model primitive, wherein the target current searching network node is any one of the multiple layers of current searching network nodes;
and training the target new model primitive according to the training text data.
4. The method of claim 1, wherein configuring a first operation type and a second operation type for each layer of the current model primitive further comprises:
configuring a third operation type for each layer of the current model primitive, wherein the third operation type is a fine tuning operation;
the executing corresponding operation to the current model primitive corresponding to each layer of the current searching network node according to the determined operation type corresponding to each layer of the current searching network node further comprises:
when the operation type corresponding to the target current search network node is fine tuning operation, maintaining the corresponding target current model primitive, and establishing a fine tuning parameter model primitive according to the fine tuning operation;
merging the fine-tuning parametric model primitive with the target current model primitive as the target current model primitive.
5. The method of claim 1, wherein the determining feature values corresponding to the current search network node for each layer according to the training text data comprises:
acquiring an intermediate test classification text with the same type as the training text data;
and training based on the intermediate test classification text by using a logistic regression model to obtain a characteristic value corresponding to each layer of the current search network node.
6. A text classification device applied to a computer device, the device comprising:
the acquisition module is used for acquiring a plurality of training text data, and the training text data comprise corresponding text category labels;
the training module is used for inputting the training text data into an initial text classification model and training to obtain a current text classification model, and the current text classification model comprises a plurality of layers of current model elements; configuring a first operation type and a second operation type for each layer of the current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type, wherein the first operation type is a maintaining operation, and the second operation type is a new building operation; selecting any training text data from the rest training text data, inputting the training text data into the current search network space, and training to obtain an intermediate text classification model; taking the intermediate text classification model as the current text classification model, returning to the step of configuring a first operation type and a second operation type for each layer of the current model primitive, and constructing a current search network space based on the current text classification model, the first operation type and the second operation type until all training text data are input, and training to obtain a target text classification model for classifying the text data;
the training module is specifically configured to:
configuring the maintaining operation and the new building operation for each layer of the current model primitive; constructing and obtaining a current search network space comprising a plurality of layers of current search network nodes based on the current text classification model, wherein each layer of the current search network nodes comprises the corresponding current model primitive configured with the maintaining operation and the new building operation;
the training module is specifically further configured to:
selecting any training text data from the rest training text data to input the current search network space; determining a characteristic value corresponding to each layer of the current search network node according to the training text data; determining the operation type corresponding to each layer of the current search network node as the maintaining operation or the new building operation according to the characteristic value; according to the determined operation type corresponding to each layer of the current searching network node, corresponding operation is executed on the current model primitive corresponding to each layer of the current searching network node; obtaining an initial intermediate text classification model according to each layer of the current model primitive executing corresponding operation; and training to obtain the intermediate text classification model according to the training text data and the initial intermediate text classification model.
7. A computer device comprising a processor and a non-volatile memory having computer instructions stored thereon that, when executed by the processor, cause the computer device to perform the method of text classification of any of claims 1-5.
8. A readable storage medium, characterized in that the readable storage medium comprises a computer program which, when running, controls a computer device on which the readable storage medium is located to perform the text classification method according to any one of claims 1-5.
CN202110635464.XA 2021-06-08 2021-06-08 Text classification method and device, computer equipment and readable storage medium Active CN113254645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110635464.XA CN113254645B (en) 2021-06-08 2021-06-08 Text classification method and device, computer equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110635464.XA CN113254645B (en) 2021-06-08 2021-06-08 Text classification method and device, computer equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN113254645A CN113254645A (en) 2021-08-13
CN113254645B (en) 2021-09-28

Family

ID=77187105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110635464.XA Active CN113254645B (en) 2021-06-08 2021-06-08 Text classification method and device, computer equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113254645B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114510572B (en) * 2022-04-18 2022-07-12 佛山科学技术学院 Lifelong learning text classification method and system
CN116468037A (en) * 2023-03-17 2023-07-21 北京深维智讯科技有限公司 NLP-based data processing method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106909654B (en) * 2017-02-24 2020-07-21 北京时间股份有限公司 Multi-level classification system and method based on news text information
CN109344911B (en) * 2018-10-31 2022-04-12 北京国信云服科技有限公司 Parallel processing classification method based on multilayer LSTM model
CN109902178A (en) * 2019-02-28 2019-06-18 云孚科技(北京)有限公司 A kind of multistage file classification method and system
CN110377727B (en) * 2019-06-06 2022-06-17 深思考人工智能机器人科技(北京)有限公司 Multi-label text classification method and device based on multi-task learning

Also Published As

Publication number Publication date
CN113254645A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
US11741361B2 (en) Machine learning-based network model building method and apparatus
WO2019100723A1 (en) Method and device for training multi-label classification model
US20200334520A1 (en) Multi-task machine learning architectures and training procedures
CN113254645B (en) Text classification method and device, computer equipment and readable storage medium
CN111008640A (en) Image recognition model training and image recognition method, device, terminal and medium
CN111914085A (en) Text fine-grained emotion classification method, system, device and storage medium
CN112465840B (en) Semantic segmentation model training method, semantic segmentation method and related device
CN110598869B (en) Classification method and device based on sequence model and electronic equipment
WO2020232840A1 (en) Vehicle multi-attribute identification method and device employing neural network structure search, and medium
CN113095370A (en) Image recognition method and device, electronic equipment and storage medium
CN113657483A (en) Model training method, target detection method, device, equipment and storage medium
CN113158685A (en) Text semantic prediction method and device, computer equipment and storage medium
CN115294397A (en) Classification task post-processing method, device, equipment and storage medium
US10614031B1 (en) Systems and methods for indexing and mapping data sets using feature matrices
CN113870863A (en) Voiceprint recognition method and device, storage medium and electronic equipment
CN112100509B (en) Information recommendation method, device, server and storage medium
CN112308149B (en) Optimization method and device for image information identification based on machine learning
CN112307048B (en) Semantic matching model training method, matching method, device, equipment and storage medium
CN112989843A (en) Intention recognition method and device, computing equipment and storage medium
CN117349763A (en) Method, computing device, and medium for determining kernel functions for artificial intelligence applications
CN108681490B (en) Vector processing method, device and equipment for RPC information
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN115905293A (en) Switching method and device of job execution engine
CN112560463B (en) Text multi-labeling method, device, equipment and storage medium
CN114611609A (en) Graph network model node classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant