CN116166798A - Text classification model training method, device and equipment based on hierarchical Softmax - Google Patents

Text classification model training method, device and equipment based on hierarchical Softmax

Info

Publication number
CN116166798A
CN116166798A
Authority
CN
China
Prior art keywords
text
node
category
binary
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211556578.6A
Other languages
Chinese (zh)
Inventor
肖威 (Xiao Wei)
莫凡 (Mo Fan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DBAPPSecurity Co Ltd filed Critical DBAPPSecurity Co Ltd
Priority to CN202211556578.6A priority Critical patent/CN116166798A/en
Publication of CN116166798A publication Critical patent/CN116166798A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a text classification model training method, device and equipment based on hierarchical Softmax. The text classification model training method based on hierarchical Softmax comprises the following steps: constructing a binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child category nodes; if a node of the binary category tree has only one child node, merging the child node with the node; coding the nodes of the binary category tree according to the node paths to obtain category codes; calculating the probability distribution of the text over the categories according to the category codes; and training a model based on the probability distribution of the text over the categories and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model. The method and device take the hierarchical structure of text categories into account when training the text classification model, improving the computational efficiency of the model in hierarchical classification scenarios.

Description

Text classification model training method, device and equipment based on hierarchical Softmax
Technical Field
The application relates to the technical field of data classification, in particular to a text classification model training method, device and equipment based on hierarchical Softmax.
Background
Data is a core asset of an enterprise, and a data classification catalog is usually organized in multiple levels. Classified, graded protection of data is therefore a pressing concern for enterprises. Currently, machine learning methods based on natural language understanding are the primary means of classifying data stored in structured databases. A common approach assigns a piece of data to a category at one fixed level of the catalog, for example only to a category in the third-level catalog or only to a category in the fourth-level catalog. Such a model is applicable to only one level of the catalog and cannot infer a category at an arbitrary level. Inferring the probability of an arbitrary category at an arbitrary level of the catalog therefore often requires multiple models, which are inconvenient to maintain.
In the prior art, a text classification model can be trained with hierarchical Softmax, but traditional hierarchical Softmax ignores the category hierarchy of the text: the output categories have no parent-child relationship, so the probabilities of all subclasses must be computed first and then summed level by level to obtain the probability of a parent class. The probability of a given category therefore cannot be computed directly, and model computation is inefficient.
Disclosure of Invention
The embodiments of the application provide a text classification model training method, device and equipment based on hierarchical Softmax, which at least solve the problem in the related art that model computation is inefficient because text classification model training does not take the category hierarchy of the text into account.
In a first aspect, an embodiment of the present application provides a text classification model training method based on hierarchical Softmax, where the method includes:
constructing a binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child category nodes;
if a node of the binary category tree has only one child node, merging the child node with the node;
coding nodes of the binary class tree according to the node paths to obtain class codes;
calculating probability distribution of the text on each category according to the category codes;
training a model based on the probability distribution of the text over the categories and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model.
In one embodiment, constructing the binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child nodes, comprises:
constructing an original category tree based on the hierarchical classification catalog of the text, wherein the number of child nodes of each node of the original category tree is equal to the number of child categories of the corresponding text category;
constructing a binary category tree based on the original category tree, each category node of the binary category tree being the nearest common ancestor node of all of its child nodes.
In one embodiment, the encoding the nodes of the binary class tree according to the node path, and obtaining the class code includes:
the left child node code of the binary class tree is recorded as 1, and the right child node code is recorded as 0;
and coding the nodes of the binary class tree according to the node paths to obtain class codes.
In one embodiment, said calculating a probability distribution of said text over the respective categories according to said category codes comprises:
preprocessing a text to obtain a text sentence vector;
and calculating the probability of selecting the left child node and the probability of selecting the right child node of the text sentence vector, and calculating the probability distribution of the text on each category.
The formula for calculating the probability distribution of the text over the categories according to the category codes is

$$P(l \mid v)=\prod_{i \in \mathrm{left}_{l}} p_{i}(v) \prod_{i \in \mathrm{right}_{l}}\bigl(1-p_{i}(v)\bigr)$$

where $l$ denotes the category of the text, $v$ the text sentence vector, $i$ a node of the binary category tree, $\mathrm{left}_{l}$ the set of nodes on the path from the root node to the category node at which the left child is selected, $\mathrm{right}_{l}$ the set of nodes on that path at which the right child is selected, and $p_{i}(v)$ the probability that the text sentence vector selects the left child at node $i$.
In a second aspect, the embodiment of the application also provides a text classification model training device based on the hierarchical Softmax. The device comprises:
the binary category tree construction module is used for constructing a binary category tree based on the hierarchical classification catalog of the text, each category node of the binary category tree being the nearest common ancestor node of all of its child category nodes, and, if a node of the binary category tree has only one child node, merging the child node with the node;
the class code acquisition module is used for coding the nodes of the binary class tree according to the node paths to acquire class codes;
the probability calculation module is used for calculating probability distribution of the text on each category according to the category codes;
the model obtaining module is used for training the model based on the probability distribution of the text on each category and the known text category, and continuously optimizing the learning parameters of each node to obtain the text classification model.
In a third aspect, embodiments of the present application further provide a text classification method based on hierarchical Softmax. The method comprises the following steps:
training to obtain a text classification model based on the text classification model training method based on the hierarchical Softmax according to the first aspect;
inputting the text to be classified into the trained text classification model to obtain a classification result.
In a fourth aspect, embodiments of the present application further provide a text classification device based on hierarchical Softmax. The device comprises:
the model training module is used for training to obtain a text classification model;
and the text classification module is used for inputting the text to be classified into the trained text classification model to obtain a classification result.
In a fifth aspect, embodiments of the present application further provide a computer device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the hierarchical Softmax-based text classification model training method according to the first aspect or the hierarchical Softmax-based text classification method according to the third aspect.
In a sixth aspect, an embodiment of the present application further provides a storage medium, where a computer program is stored, where the computer program when executed by a processor implements the text classification model training method based on hierarchical Softmax according to the first aspect or the text classification method based on hierarchical Softmax according to the third aspect.
Compared with the related art, the text classification model training method, device, equipment and storage medium based on hierarchical Softmax provided by the embodiments of the application construct a binary category tree based on the hierarchical classification catalog of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child nodes, and if a node of the binary category tree has only one child node, the child node is merged with the node. The nodes of the binary category tree are encoded according to the node paths to obtain category codes, the probability distribution of the text over the categories is calculated according to the category codes, and a model is trained based on this probability distribution and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model. This solves the problem in the related art that model computation is inefficient because the category hierarchy of the text is not considered when training a text classification model, and improves the computational efficiency of the model in hierarchical classification scenarios.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below; other features, objects, and advantages of the application will become apparent from the description, the drawings, and the claims.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is an application environment diagram of a hierarchical Softmax-based text classification model training method according to an embodiment of the application;
FIG. 2 is a flow chart of a hierarchical Softmax based text classification model training method in accordance with an embodiment of the application;
FIG. 3 is a schematic diagram of an original category tree according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a binary class tree according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a merged binary class tree according to an embodiment of the present application;
FIG. 6 is a block diagram of a hierarchical Softmax based text classification model training device in accordance with an embodiment of the present application;
FIG. 7 is a flow chart of a hierarchical Softmax based text classification method according to an embodiment of the application;
FIG. 8 is a block diagram of a hierarchical Softmax based text classification device according to an embodiment of the application;
fig. 9 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described and illustrated below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden on the person of ordinary skill in the art based on the embodiments provided herein, are intended to be within the scope of the present application.
It is apparent that the drawings in the following description are only some examples or embodiments of the present application, and those of ordinary skill in the art may apply the present application to other similar situations according to these drawings without inventive effort. Moreover, it should be appreciated that while such a development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill having the benefit of this disclosure, and thus should not be construed as a deficiency in the disclosure of this application.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It can be understood, explicitly and implicitly, by those of ordinary skill in the art that the embodiments described herein can be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar terms herein do not denote a limitation of quantity, but may denote the singular or plural. The terms "comprising," "including," "having," and any variations thereof are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to only those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein refers to two or more. "And/or" describes an association relationship between associated objects and covers three cases; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. The terms "first," "second," "third," and the like, as used herein, merely distinguish between similar objects and do not represent a particular ordering of objects.
The text classification model training method based on hierarchical Softmax provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the terminal 101 communicates with the server 102 via a network. The data storage system may store data that the server 102 needs to process; it may be integrated on the server 102 or located on a cloud or other network server. The terminal 101 may be, but is not limited to, various personal computers, notebook computers, and tablet computers. The server 102 may be implemented as a stand-alone server or as a server cluster of multiple servers.
The embodiment provides a text classification model training method based on a hierarchical Softmax, and fig. 2 is a flowchart of the text classification model training method based on the hierarchical Softmax according to the embodiment of the application, as shown in fig. 2, and the method includes:
step S201, constructing a binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all the sub-category nodes of the binary category tree.
The text in this embodiment has a category hierarchy, for example, a certain text belongs to a certain sub-category under a secondary directory under a primary directory. Constructing a binary class tree according to the hierarchical classification directory of the text, wherein the number of child nodes of each node of the binary class tree cannot exceed two, and each class node of the binary class tree is the nearest common ancestor node of all the child nodes.
Step S202, if a node of the binary category tree has only one child node, merging the child node with the node. Every internal node of the merged binary category tree then has exactly two child nodes, a left child node and a right child node.
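To make steps S201 and S202 concrete, the following is a minimal Python sketch of binarizing an original category tree and merging single-child nodes. It is illustrative only: the Node class, the helper names, and the left-first grouping strategy are assumptions rather than the patent's prescribed construction; any splitting that keeps each category node the nearest common ancestor of its sub-categories would serve.

```python
class Node:
    """A node in the category tree; every node corresponds to a category."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []  # at most two after binarization
        self.W = None                   # per-node learning parameter W_i

def binarize(node):
    """Step S201 (sketch): split nodes with more than two children by
    inserting intermediate nodes, so that each category node remains the
    nearest common ancestor of all of its sub-category nodes."""
    node.children = [binarize(c) for c in node.children]
    while len(node.children) > 2:
        # Grouping the two leftmost children is an assumption; the patent
        # does not prescribe a particular splitting order.
        node.children = [Node(node.name + "*", node.children[:2])] + node.children[2:]
    return node

def merge_single_children(node):
    """Step S202: if a node has only one child node, merge that child into it."""
    node.children = [merge_single_children(c) for c in node.children]
    if len(node.children) == 1:
        only = node.children[0]
        node.name, node.children = node.name + "/" + only.name, only.children
    return node
```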
Step S203, coding the nodes of the binary category tree according to the node paths to obtain category codes.
Wherein the node path is a path from a root node of the binary class tree to a designated node.
And step S204, calculating probability distribution of the text on each category according to the category codes.
Step S205, training the model based on the probability distribution of the text on each category and the known text category, and continuously optimizing the learning parameters of each node to obtain a text classification model. The model may be a classification model constructed based on RNN, LSTM or other neural network algorithms.
Through steps S201 to S205, a binary category tree is constructed based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its sub-category nodes; if a node of the binary category tree has only one child node, the child node is merged with the node; the nodes of the binary category tree are coded according to the node paths to obtain category codes; the probability distribution of the text over the categories is calculated according to the category codes; and a model is trained based on this probability distribution and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model. The method and device merge the hierarchical classification catalog of the text into the hierarchical Softmax structure, guarantee that each category node in the binary category tree is the nearest common ancestor node of all of its sub-categories, and perform category coding on the merged binary category tree. This solves the problem in the related art that model computation is inefficient because the category hierarchy of the text is not considered when training a text classification model, and improves the computational efficiency of the model in hierarchical classification scenarios.
In one embodiment, constructing the binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child nodes, comprises: constructing an original category tree based on the hierarchical classification catalog of the text, wherein the number of child nodes of each node of the original category tree is equal to the number of child categories of the corresponding text category; and constructing a binary category tree based on the original category tree, wherein no node of the binary category tree has more than two child nodes and each category node is the nearest common ancestor node of all of its child nodes.
In this embodiment, the hierarchical classification directory of the text may be one having primary and secondary categories. Figs. 3-5 illustrate how a merged binary category tree is constructed from the hierarchical classification directory of the text: Fig. 3 shows the original category tree, Fig. 4 the binary category tree, and Fig. 5 the merged binary category tree.
By merging the hierarchical classification catalog of the text into the hierarchical Softmax structure, guaranteeing that each category node in the binary category tree is the nearest common ancestor node of all of its sub-categories, and performing category coding on the merged binary category tree, the method, device and system solve the problem in the related art that model computation is inefficient because the category hierarchy of the text is not considered when training a text classification model, and improve the computational efficiency of the model in hierarchical classification scenarios.
In one embodiment, calculating the probability distribution of the text over the categories according to the category codes comprises: preprocessing the text to obtain a text sentence vector. The text sentence vector representation converts the text into a vector in a high-dimensional continuous space; its benefit is that the semantic similarity of texts can be measured by the distance between their vectors, which helps the model process text.
Specifically, an RNN, LSTM, BiLSTM, or other neural network algorithm may be used to extract the text sentence vector of the input text.
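As a stand-in for the RNN/LSTM/BiLSTM encoder mentioned above (which the patent leaves open), a minimal sketch that simply averages pretrained word embeddings; the function name and the embedding-averaging choice are illustrative assumptions, not the patent's method:

```python
import numpy as np

def sentence_vector(tokens, embeddings, dim=128):
    """Map a tokenized text to a sentence vector v; unknown tokens map to
    zeros. A real implementation would run an RNN/LSTM/BiLSTM over the text."""
    vecs = [embeddings.get(tok, np.zeros(dim)) for tok in tokens]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)
```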
Then the probability that the text sentence vector selects the left child node and the probability that it selects the right child node are calculated, from which the probability distribution of the text over the categories is obtained.
Specifically, let the text sentence vector be $v$ and the current node index be $i$, and denote by $p_{i}(v)$ the probability that the current node selects its left child node. This probability lies between 0 and 1, so the probability that the current node selects its right child node is $1-p_{i}(v)$. Here $p_{i}(v)$ is calculated as in formula (1):

$$p_{i}(v)=\sigma\left(W_{i}^{\top} v\right) \tag{1}$$

where $\sigma(\cdot)$ is the Sigmoid function and $W_{i}$ is the learning parameter of node $i$.
Assuming the probability calculations at the nodes are mutually independent, the probability that the text sentence vector $v$ is classified into category $l$ is given by formula (2):

$$P(l \mid v)=\prod_{i \in \mathrm{left}_{l}} p_{i}(v) \prod_{i \in \mathrm{right}_{l}}\bigl(1-p_{i}(v)\bigr) \tag{2}$$

where $l$ denotes the category of the text, $v$ the text sentence vector, $i$ a node of the binary category tree, $\mathrm{left}_{l}$ the set of nodes on the path from the root node to the designated node at which the left child is selected, $\mathrm{right}_{l}$ the set of nodes on that path at which the right child is selected, and $p_{i}(v)$ the probability that the text sentence vector selects the left child at node $i$.
In one embodiment, coding the nodes of the binary category tree according to the node paths to obtain category codes comprises: recording a left child node of the binary category tree as code 1 and a right child node as code 0, and coding the nodes of the binary category tree according to the node paths to obtain the category codes.
Here a node path is the path from the root node of the binary category tree to a designated node. In this embodiment, coding the binary category tree yields category codes that can be used to rapidly locate the category to which a text belongs, improving the computational efficiency of the model in hierarchical classification scenarios.
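A minimal sketch of this encoding step (the helper name encode_paths is illustrative): walk the merged binary category tree and record 1 for each left turn and 0 for each right turn.

```python
def encode_paths(root):
    """Return {category name: path code}. Internal nodes are parent
    categories and receive codes too, so a category at any level of the
    catalog can be scored directly."""
    codes = {}
    def walk(node, code):
        codes[node.name] = code
        for child, bit in zip(node.children, ("1", "0")):  # left -> 1, right -> 0
            walk(child, code + bit)
    walk(root, "")
    return codes
```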
In one embodiment, with a left child node coded as 1 and a right child node coded as 0, the category code of a known text category may be determined, for example, to be 101. The known text is preprocessed to obtain its text sentence vector; from the category code 101 it can be determined that the path from the root node to the designated node passes through three nodes, at which the left child node, the right child node and the left child node are selected in turn to reach the correct classification result. Based on the probability distribution of the text sentence vector over the categories and the known text categories, a large number of samples are used for training, and each node parameter $W_{i}$ is continuously optimized during training until the overall classification accuracy no longer improves, yielding the trained text classification model.
In this embodiment, the learning parameter $W_{i}$ of the $i$-th node may be updated by gradient descent, $W_{i} \leftarrow W_{i}-\eta \nabla_{W_{i}} L$, where $\eta$ is the learning rate and $L$ is the training loss; the learning rate may be 0.01, 0.02, 0.05, or another value.
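The patent specifies gradient descent with learning rate η but not the exact loss; a plausible instantiation minimizes $-\log P(l \mid v)$ per labeled sample, for which the sigmoid yields the closed-form gradients used below. This is a sketch under that assumption, and init_params and train_step are hypothetical helper names:

```python
import numpy as np

def init_params(node, dim, rng=None):
    """Give every internal node a learnable parameter vector W_i."""
    rng = rng if rng is not None else np.random.default_rng(0)
    if node.children:
        node.W = rng.normal(scale=0.01, size=dim)
        for child in node.children:
            init_params(child, dim, rng)

def train_step(root, code, v, lr=0.01):
    """One gradient-descent step on -log P(l | v) for one labeled sample.
    With p = sigmoid(W_i . v):
      left turn:  d log p / dW_i       = (1 - p) * v
      right turn: d log (1 - p) / dW_i = -p * v
    """
    node = root
    for bit in code:
        p = p_left(node, v)
        grad = (1.0 - p) * v if bit == "1" else -p * v
        node.W = node.W + lr * grad  # ascend log-likelihood = descend the loss
        node = node.children[0] if bit == "1" else node.children[1]
```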
In summary, the hierarchical classification directory of the text is merged into the hierarchical Softmax structure so that each category node in the binary category tree is guaranteed to be the nearest common ancestor node of all of its sub-categories, and category coding is performed on the merged binary category tree. This solves the problem in the related art that model computation is inefficient because the category hierarchy of the text is not considered when training a text classification model, and improves the computational efficiency of the model in hierarchical classification scenarios.
The embodiment provides a text classification model training device based on a hierarchical Softmax, and fig. 6 is a structural block diagram of the text classification model training device based on the hierarchical Softmax according to the embodiment of the application, as shown in fig. 6, and the device includes:
a binary category tree construction module 610 for constructing a binary category tree based on the hierarchical classification directory of text, each category node of the binary category tree being the nearest common ancestor node of all its child nodes; if a node of the binary class tree comprises a child node, merging the child node with the node.
The category code obtaining module 620 is configured to encode the nodes of the binary category tree according to the node paths, and obtain category codes.
And the probability calculation module 630 is used for calculating probability distribution of the text on each category according to the category codes.
A model obtaining module 640 is configured to train the model based on the probability distribution of the text over the categories and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model.
The binary category tree construction module 610 is further configured to construct an original category tree based on the hierarchical classification directory of the text, where the number of child nodes of the original category tree is equal to the number of child categories of the text.
The binary category tree construction module 610 is further configured to construct a binary category tree based on the original category tree, each category node of the binary category tree being a nearest common ancestor node of all its child nodes.
The category code acquisition module 620 is further configured to record a left child node code of the binary category tree as 1 and a right child node code as 0;
and coding the nodes of the binary class tree according to the node paths to obtain class codes.
The probability calculation module 630 is further configured to pre-process the text to obtain a text sentence vector;
and calculating the probability of selecting the left child node and the probability of selecting the right child node of the text sentence vector, and calculating the probability distribution of the text on each category.
The probability calculation module 630 is further configured to calculate the probability distribution of the text over the categories as

$$P(l \mid v)=\prod_{i \in \mathrm{left}_{l}} p_{i}(v) \prod_{i \in \mathrm{right}_{l}}\bigl(1-p_{i}(v)\bigr)$$

where $l$ denotes the category of the text, $v$ the text sentence vector, $i$ a node of the binary category tree, $\mathrm{left}_{l}$ the set of nodes on the path from the root node to the category node at which the left child is selected, $\mathrm{right}_{l}$ the set of nodes on that path at which the right child is selected, and $p_{i}(v)$ the probability that the text sentence vector selects the left child at node $i$.
It should be noted that, specific examples in this embodiment may refer to examples described in the foregoing embodiments and alternative implementations, and this embodiment is not repeated herein.
The present embodiment provides a text classification method based on a hierarchical Softmax, and fig. 7 is a flowchart of the text classification method based on the hierarchical Softmax according to an embodiment of the present application, as shown in fig. 7, where the method includes:
step S701, training to obtain a text classification model based on a text classification model training method of the hierarchical Softmax;
step S702, inputting the text to be classified into the trained text classification model to obtain a classification result.
Specifically, the text to be classified is preprocessed to obtain a text sentence vector. The text sentence vector is input into the trained text classification model, the probability of selecting the left child node and the probability of selecting the right child node are calculated for the input text sentence vector, and the probability distribution of the input text sentence vector over the categories is calculated according to the category codes. The category with the highest probability is the classification result of the text to be classified.
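Putting the pieces together, a hypothetical end-to-end inference sketch using the helpers defined earlier; scoring every path code and taking the argmax mirrors steps S701-S702, and build_tree_from_catalog is an undefined placeholder for the catalog-loading step:

```python
def classify(root, codes, v):
    """Return the category whose encoded path has the highest P(l | v)."""
    scores = {cat: category_probability(root, code, v)
              for cat, code in codes.items() if code}  # skip the root's empty code
    return max(scores, key=scores.get)

# Hypothetical usage:
# tree = merge_single_children(binarize(build_tree_from_catalog(catalog)))
# init_params(tree, dim=128)
# codes = encode_paths(tree)
# v = sentence_vector(tokens, embeddings)
# print(classify(tree, codes, v))
```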
The embodiment also provides a text classification device based on hierarchical Softmax, and Fig. 8 is a structural block diagram of the hierarchical Softmax-based text classification device according to an embodiment of the application; as shown in Fig. 8, the device includes:
model training module 810 is used for training to obtain text classification model.
The text classification module 820 is configured to input a text to be classified into the trained text classification model, and obtain a classification result.
In addition, in combination with the hierarchical Softmax-based text classification model training method or the hierarchical Softmax-based text classification method in the above embodiments, the embodiments of the present application may be implemented by providing a computer device.
Fig. 9 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present application. The computer device may include a processor 910 and a memory 920 storing computer program instructions.
In particular, the processor 910 described above may include a Central Processing Unit (CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC), or may be configured to implement one or more integrated circuits of embodiments of the present application.
Memory 920 may include, among other things, mass storage for data or instructions. By way of example and not limitation, memory 920 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 920 may include removable or non-removable (or fixed) media, where appropriate. Memory 920 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, memory 920 is Non-Volatile memory. In a particular embodiment, memory 920 includes Read-Only Memory (ROM) and Random Access Memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), an Electrically Alterable ROM (EAROM), or FLASH memory, or a combination of two or more of these. Where appropriate, the RAM may be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM), where the DRAM may be Fast Page Mode DRAM (FPMDRAM), Extended Data Out DRAM (EDODRAM), Synchronous DRAM (SDRAM), or the like.
Memory 920 may be used to store or cache various data files that are required for processing and/or communication, as well as possible computer program instructions for execution by processor 910.
The processor 910 implements any of the above embodiments of the hierarchical Softmax-based text classification model training method or the hierarchical Softmax-based text classification method by reading and executing computer program instructions stored in the memory 920.
In addition, in combination with the hierarchical Softmax-based text classification model training method or the hierarchical Softmax-based text classification method in the above embodiments, the embodiments of the present application may be implemented by providing a storage medium. The storage medium has a computer program stored thereon; when executed by a processor, the computer program implements the hierarchical Softmax-based text classification model training method or the hierarchical Softmax-based text classification method of any of the above embodiments.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of these technical features, they should be considered within the scope of this specification.
The above examples merely express several embodiments of the present application, and their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the present application, and these all fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. The text classification model training method based on the hierarchical Softmax is characterized by comprising the following steps of:
constructing a binary category tree based on the hierarchical classification directory of the text, wherein each category node of the binary category tree is the nearest common ancestor node of all of its child category nodes;
if a node of the binary category tree has only one child node, merging the child node with the node;
coding nodes of the binary class tree according to the node paths to obtain class codes;
calculating probability distribution of the text on each category according to the category codes;
training a model based on the probability distribution of the text over the categories and the known text categories, continuously optimizing the learning parameters of each node to obtain a text classification model.
2. The hierarchical Softmax-based text classification model training method of claim 1, wherein constructing the binary category tree based on the hierarchical classification directory of the text, each category node of the binary category tree being the nearest common ancestor node of all of its child nodes, comprises:
constructing an original category tree based on the hierarchical classification catalog of the text, wherein the number of child nodes of each node of the original category tree is equal to the number of child categories of the corresponding text category;
constructing a binary class tree based on the original class tree, each class node of the binary class tree being a nearest common ancestor node of all its child nodes.
3. The hierarchical Softmax-based text classification model training method of claim 1, wherein the encoding the nodes of the binary class tree according to the node path, obtaining class codes comprises:
the left child node code of the binary class tree is recorded as 1, and the right child node code is recorded as 0;
and coding the nodes of the binary class tree according to the node paths to obtain class codes.
4. The hierarchical Softmax based text classification model training method of claim 1, wherein the calculating the probability distribution of the text over the respective categories based on the category codes comprises:
preprocessing a text to obtain a text sentence vector;
and calculating the probability of selecting the left child node and the probability of selecting the right child node of the text sentence vector, and calculating to obtain the probability distribution of the text on each category.
5. The method according to claim 4, wherein the formula for calculating the probability distribution of the text over the categories according to the category codes is

$$P(l \mid v)=\prod_{i \in \mathrm{left}_{l}} p_{i}(v) \prod_{i \in \mathrm{right}_{l}}\bigl(1-p_{i}(v)\bigr)$$

where $l$ denotes the category of the text, $v$ the text sentence vector, $i$ a node of the binary category tree, $\mathrm{left}_{l}$ the set of nodes on the path from the root node to the category node at which the left child is selected, $\mathrm{right}_{l}$ the set of nodes on that path at which the right child is selected, and $p_{i}(v)$ the probability that the text sentence vector selects the left child at node $i$.
6. A text classification model training device based on hierarchical Softmax, the device comprising:
the binary category tree construction module is used for constructing a binary category tree based on the hierarchical classification catalog of the text, each category node of the binary category tree being the nearest common ancestor node of all of its child category nodes, and, if a node of the binary category tree has only one child node, merging the child node with the node;
the class code acquisition module is used for coding the nodes of the binary class tree according to the node paths to acquire class codes;
the probability calculation module is used for calculating probability distribution of the text on each category according to the category codes;
the model obtaining module is used for training the model based on the probability distribution of the text on each category and the known text category, and continuously optimizing the learning parameters of each node to obtain the text classification model.
7. A text classification method based on hierarchical Softmax, comprising:
training to obtain a text classification model based on the method of any one of claims 1 to 5;
inputting the text to be classified into the trained text classification model to obtain a classification result.
8. A hierarchical Softmax-based text classification device, the device comprising:
the model training module is used for training to obtain a text classification model;
and the text classification module is used for inputting the text to be classified into the trained text classification model to obtain a classification result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the hierarchical Softmax based text classification model training method of any of claims 1 to 5 or the hierarchical Softmax based text classification method of claim 7 when executing the computer program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the hierarchical Softmax based text classification model training method according to any one of claims 1 to 5 or the hierarchical Softmax based text classification method according to claim 7.
CN202211556578.6A 2022-12-06 2022-12-06 Text classification model training method, device and equipment based on hierarchical Softmax Pending CN116166798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211556578.6A CN116166798A (en) 2022-12-06 2022-12-06 Text classification model training method, device and equipment based on hierarchical Softmax

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211556578.6A CN116166798A (en) 2022-12-06 2022-12-06 Text classification model training method, device and equipment based on hierarchical Softmax

Publications (1)

Publication Number Publication Date
CN116166798A 2023-05-26

Family

ID=86419050

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211556578.6A Pending CN116166798A (en) 2022-12-06 2022-12-06 Text classification model training method, device and equipment based on hierarchical Softmax

Country Status (1)

Country Link
CN (1) CN116166798A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination