CN112966110A - Text type identification method and related equipment - Google Patents

Text type identification method and related equipment

Info

Publication number
CN112966110A
CN112966110A
Authority
CN
China
Prior art keywords
text
probability set
classification model
inclusionary
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110286227.7A
Other languages
Chinese (zh)
Inventor
李明凡
周凯捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202110286227.7A
Publication of CN112966110A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a text type identification method and related equipment, applied to an electronic device, wherein the method comprises the following steps: obtaining a training sample; training a preset text classification model to be trained by adopting the training sample to obtain a text classification model; acquiring a text to be classified, and inputting the text to be classified into the text classification model to obtain a category prediction probability set, wherein the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories; and determining, based on the category prediction probability set, a target text category to which the text to be classified belongs among the preset text categories. By adopting the embodiments of the application, text classification efficiency is improved.

Description

Text type identification method and related equipment
Technical Field
The present application relates to the field of electronic technologies, and in particular, to a text type recognition method and a related device.
Background
With the development of the internet, large amounts of text data are generated continuously, so text classification occupies an important position in information processing. Because text data carries a large amount of information, failing to manage and extract that information quickly and effectively causes significant losses for enterprises and for society. Therefore, how to identify texts with an effective and fast method so as to classify them is a key problem that urgently needs to be solved.
Disclosure of Invention
The embodiment of the application provides a text type identification method and related equipment, which are beneficial to quickly and effectively classifying texts.
In a first aspect, an embodiment of the present application provides a text category identification method, where the method includes:
obtaining a training sample;
training a preset text classification model to be trained by adopting the training sample to obtain a text classification model;
acquiring a text to be classified, and inputting the text to be classified into the text classification model to obtain a category prediction probability set, wherein the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories;
and determining a target text category to which the text to be classified belongs in the preset text categories based on the category prediction probability set.
In a second aspect, an embodiment of the present application provides a text type identification apparatus, including:
a first obtaining unit for obtaining a training sample;
the training unit is used for training a preset text classification model to be trained by adopting the training sample to obtain a text classification model;
the second acquisition unit is used for acquiring texts to be classified;
the input unit is used for inputting the text to be classified into the text classification model to obtain a category prediction probability set, and the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories;
and the determining unit is used for determining a target text category to which the text to be classified belongs in the preset text categories based on the category prediction probability set.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in the method according to the first aspect of the embodiment of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, where the computer program causes a computer to perform some or all of the steps described in the method according to the first aspect of the present application.
In a fifth aspect, the present application provides a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps described in the method according to the first aspect of the present application. The computer program product may be a software installation package.
It can be seen that, in the embodiment of the application, the electronic device first obtains a training sample and trains a preset text classification model to be trained with it to obtain a text classification model. It then obtains a text to be classified and inputs it into the text classification model to obtain a category prediction probability set, which comprises the prediction probabilities that the text to be classified belongs to the preset text categories. Finally, based on the category prediction probability set, it determines the target text category to which the text to be classified belongs among the preset text categories. Because the text classification model to be trained is trained first and the trained model is then used to classify texts, the scheme is beneficial to classifying texts quickly and effectively.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a text category identification method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of another electronic device provided in the embodiment of the present application;
fig. 4 is a schematic structural diagram of a text type identification device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
The following are detailed below.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Hereinafter, some terms in the present application are explained to facilitate understanding by those skilled in the art.
The electronic device may include a computing device or other processing device connected to a wireless modem, or the like.
As shown in fig. 1, fig. 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. The electronic device includes at least one of: a processor, a memory, a signal processor, a transceiver, a random access memory (RAM), sensors, and so on. The memory, the signal processor, the RAM, and the sensors are connected to the processor, and the transceiver is connected to the signal processor.
Wherein the sensor comprises at least one of: light-sensitive sensors, gyroscopes, infrared proximity sensors, fingerprint sensors, pressure sensors, etc. Among them, the light sensor, also called an ambient light sensor, is used to detect the ambient light brightness. The light sensor may include a light sensitive element and an analog to digital converter. The photosensitive element is used for converting collected optical signals into electric signals, and the analog-to-digital converter is used for converting the electric signals into digital signals. Optionally, the light sensor may further include a signal amplifier, and the signal amplifier may amplify the electrical signal converted by the photosensitive element and output the amplified electrical signal to the analog-to-digital converter. The photosensitive element may include at least one of a photodiode, a phototransistor, a photoresistor, and a silicon photocell.
The processor is the control center of the electronic device. It connects all parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory and calling the data stored in the memory, thereby monitoring the electronic device as a whole.
The processor may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor.
The memory is used for storing software programs and/or modules, and the processor executes the various functional applications and data processing of the electronic device by running the software programs and/or modules stored in the memory. The memory mainly comprises a program storage area and a data storage area, wherein the program storage area can store an operating system, a software program required by at least one function, and the like, and the data storage area can store data created according to the use of the electronic device, and the like. Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The following describes embodiments of the present application in detail.
As shown in fig. 2, which is a flowchart of a text category identification method provided in an embodiment of the present application, the method is applied to the above electronic device and specifically includes the following steps:
step 201: training samples are obtained.
The training samples may include inclusionary training samples and exclusionary training samples, where an inclusionary training sample is a sample that belongs to a certain sample category, and an exclusionary training sample is a sample that does not belong to a certain sample category.
Step 202: training the preset text classification model to be trained by adopting the training samples to obtain a text classification model.
The text classification model to be trained may include an exclusionary text classification model to be trained and an inclusionary text classification model to be trained; the exclusionary text classification model to be trained corresponds to the exclusionary training samples, and the inclusionary text classification model to be trained corresponds to the inclusionary training samples.

The text classification model comprises an exclusionary text classification model and an inclusionary text classification model; the exclusionary text classification model corresponds to the exclusionary text classification model to be trained, and the inclusionary text classification model corresponds to the inclusionary text classification model to be trained.
Step 203: acquiring a text to be classified, and inputting the text to be classified into the text classification model to obtain a category prediction probability set, wherein the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories.
The text to be classified may be an inclusionary text or an exclusionary text.
The preset text categories comprise at least one text category, and the category prediction probability set comprises the probability that the text to be classified belongs to each preset text category.
The class prediction probabilities of the texts to be classified on different text classes may be the same or different.
Where the text category may be news topics, spam, user comments, and the like.
Step 204: determining, based on the category prediction probability set, a target text category to which the text to be classified belongs among the preset text categories.

Specifically, the text category corresponding to the maximum prediction probability in the category prediction probability set is determined as the target text category, as in the sketch below.
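As a minimal sketch (not part of the patent disclosure; the category names, probability values, and variable names are illustrative assumptions), this argmax-style selection can be expressed as follows:

```python
# Minimal illustrative sketch of step 204: selecting the target text
# category as the category with the maximum prediction probability.
# The category names and probability values are hypothetical examples.
category_prediction_probability_set = {
    "news topic": 0.62,
    "spam": 0.13,
    "user comment": 0.25,
}

# The target text category is the key with the largest predicted probability.
target_text_category = max(
    category_prediction_probability_set,
    key=category_prediction_probability_set.get,
)
print(target_text_category)  # -> news topic
```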
It can be seen that, in the embodiment of the application, the electronic device first obtains a training sample and trains a preset text classification model to be trained with it to obtain a text classification model. It then obtains a text to be classified and inputs it into the text classification model to obtain a category prediction probability set, which comprises the prediction probabilities that the text to be classified belongs to the preset text categories. Finally, based on the category prediction probability set, it determines the target text category to which the text to be classified belongs among the preset text categories. Because the text classification model to be trained is trained first and the trained model is then used to classify texts, the scheme is beneficial to classifying texts quickly and effectively.
In a possible implementation manner, the training a preset text classification model to be trained by using the training sample to obtain a text classification model includes:
dividing the training samples into inclusionary training samples and exclusionary training samples based on sample identifications;
training an exclusionary text classification model to be trained based on the exclusionary training sample to obtain an exclusionary text classification model;
and training the inclusionary text classification model to be trained based on the inclusionary training sample to obtain the inclusionary text classification model.
The sample identifications of all inclusionary training samples are the same, the sample identifications of all exclusionary training samples are the same, and the sample identifications of the inclusionary training samples differ from those of the exclusionary training samples.

Wherein, there is at least one inclusionary training sample and at least one exclusionary training sample.
The number of inclusionary training samples may be greater than or equal to the number of exclusionary training samples, or the number of inclusionary training samples may be less than the number of exclusionary training samples.
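A minimal sketch of this division step is given below, assuming each training sample carries a sample identification field; the field names and identification values are illustrative assumptions, not definitions from the patent.

```python
# Sketch: dividing training samples into inclusionary and exclusionary
# subsets based on a per-sample identification. "sample_id" and its
# values are assumed names for illustration only.
INCLUSIONARY_ID = 1
EXCLUSIONARY_ID = 0

training_samples = [
    {"text": "a sample that belongs to some text category", "sample_id": INCLUSIONARY_ID},
    {"text": "a sample that belongs to no text category", "sample_id": EXCLUSIONARY_ID},
]

# All inclusionary samples share one identification, all exclusionary
# samples share another, and the two identifications differ.
inclusionary_samples = [s for s in training_samples if s["sample_id"] == INCLUSIONARY_ID]
exclusionary_samples = [s for s in training_samples if s["sample_id"] == EXCLUSIONARY_ID]
```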
It can be seen that, in the embodiment of the application, the inclusionary training samples are used to train the inclusionary text classification model to be trained, and the exclusionary training samples are used to train the exclusionary text classification model to be trained, so that a problematic text to be classified can be classified as exclusionary text, which ensures the accuracy of text classification.
In a possible implementation manner, the training the exclusionary text classification model to be trained based on the exclusionary training sample to obtain the exclusionary text classification model includes:
inputting the exclusionary training samples into a first preset classification model to obtain an inclusionary prediction probability set and an exclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the exclusionary training samples belong to the preset text category, and the exclusionary prediction probability set comprises the probability that the exclusionary training samples are excluded from the preset text category;
and training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model.
The preset text categories comprise at least one text category, and the number of exclusionary training samples is greater than or equal to one. The first preset classification model performs normalization processing on the exclusionary training samples, so that an inclusionary prediction probability set and an exclusionary prediction probability set corresponding to each exclusionary training sample can be obtained; the inclusionary prediction probability set comprises the probability that the corresponding exclusionary training sample belongs to each text category, and the exclusionary prediction probability set comprises the probability that the corresponding exclusionary training sample is excluded from each text category.
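The patent does not disclose the internal structure of the first preset classification model. The sketch below is only one hedged possibility, in which two independent per-category score heads are normalized with a sigmoid; this matches the worked example further down, where the inclusion and exclusion probabilities for a category need not sum to one. All names in the sketch are assumptions.

```python
import math

def sigmoid(x: float) -> float:
    """Normalize a raw score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def first_preset_classification_model(inclusion_scores, exclusion_scores):
    """Hypothetical sketch: for each preset text category, produce the
    probability that an exclusionary training sample belongs to the
    category (inclusionary prediction probability set) and the
    probability that it is excluded from the category (exclusionary
    prediction probability set). Deriving the two sets from independent
    per-category scores is an assumption, not the patent's definition."""
    inclusionary_prediction_probability_set = [sigmoid(s) for s in inclusion_scores]
    exclusionary_prediction_probability_set = [sigmoid(s) for s in exclusion_scores]
    return inclusionary_prediction_probability_set, exclusionary_prediction_probability_set
```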
In a possible implementation manner, the training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model includes:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the exclusionary training sample, and determining, among the exclusionary prediction probability sets, the exclusionary prediction probability set corresponding to the exclusionary training sample;
obtaining a first loss value based on a first loss function, the inclusionary prediction probability set corresponding to the exclusionary training sample, and the exclusionary prediction probability set corresponding to the exclusionary training sample;
and under the condition that the first loss value is greater than or equal to a first preset loss value, adjusting a first preset model parameter based on the first loss value and a back propagation algorithm until the first loss value is smaller than the first preset loss value, and obtaining the exclusionary text classification model.
For example, an exclusionary training sample A is randomly selected from the exclusionary training samples, and the first loss value is determined according to the inclusionary prediction probability set corresponding to exclusionary training sample A, the exclusionary prediction probability set corresponding to exclusionary training sample A, and the first loss function. If the first loss value is greater than or equal to the first preset loss value, the first preset model parameters are adjusted, and one exclusionary training sample other than exclusionary training sample A is randomly selected from the exclusionary training samples to serve as the new exclusionary training sample A; this process is repeated until the first loss value is smaller than the first preset loss value.
Wherein the first loss function is

L_1 = -\sum_i p_i \log(q_i)

where p_i represents the probability of the training sample belonging to the i-th text category, and q_i represents the probability of the training sample being excluded from the i-th text category.
For example, if there are 3 text categories (text category A, text category B, and text category C), the probability that training sample A belongs to text category A is 0.8, the probability that it belongs to text category B is 0.7, and the probability that it belongs to text category C is 0.5, while the probability that training sample A is excluded from text category A is 0.8, from text category B is 0.6, and from text category C is 0.3, then the loss value corresponding to training sample A is -(0.8·log(0.8) + 0.7·log(0.6) + 0.5·log(0.3)).
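The arithmetic of this example can be checked directly; the short sketch below reproduces only the numbers given above and assumes the natural logarithm.

```python
import math

# p[i]: probability that training sample A belongs to text category i.
# q[i]: probability that training sample A is excluded from text category i.
p = [0.8, 0.7, 0.5]
q = [0.8, 0.6, 0.3]

# First loss function: L1 = -sum_i p_i * log(q_i) (natural log assumed).
first_loss_value = -sum(p_i * math.log(q_i) for p_i, q_i in zip(p, q))
print(round(first_loss_value, 4))  # 1.1381
```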
The adjustment of the current first preset model parameter is performed on the basis of the first preset model parameter obtained by the last adjustment.
The first preset model parameter may be a positive number or a negative number.
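Putting the pieces above together, the training loop for the exclusionary text classification model can be sketched as follows. The model interface (forward, first_loss, backward_update) and the sampling helpers are assumptions for illustration; the patent itself only fixes the loss function, the use of a back propagation algorithm, and the stopping condition.

```python
import random

def train_exclusionary_model(model, exclusionary_samples, first_preset_loss_value):
    """Hedged sketch of the loop described above; not the patent's exact
    procedure. `model` is assumed to expose:
      forward(sample) -> (inclusionary_set, exclusionary_set)
      first_loss(p_set, q_set) -> float  (the L1 above)
      backward_update(loss)  (back-propagation parameter adjustment)
    """
    sample_a = random.choice(exclusionary_samples)
    while True:
        p_set, q_set = model.forward(sample_a)
        first_loss_value = model.first_loss(p_set, q_set)
        if first_loss_value < first_preset_loss_value:
            # Stopping condition met: the exclusionary text
            # classification model is obtained.
            return model
        # Loss still too large: adjust the first preset model parameters
        # and randomly pick a different exclusionary training sample A.
        model.backward_update(first_loss_value)
        others = [s for s in exclusionary_samples if s is not sample_a]
        sample_a = random.choice(others or exclusionary_samples)
```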
It can be seen that, in the embodiment of the present application, the exclusionary text classification model is obtained under the condition that the first loss value is smaller than the first preset loss value, which is beneficial to improving the accuracy of text category determination.
In a possible implementation manner, the training the inclusionary text classification model to be trained based on the inclusionary training sample to obtain an inclusionary text classification model includes:
inputting the inclusionary training sample into a second preset classification model to obtain an inclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the inclusionary training sample belongs to the preset text categories;
determining a labeling probability set based on the labeling of the inclusionary training sample, wherein the labeling probability set comprises the probability that the inclusionary training sample is labeled as the preset text categories;
and training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model.
The preset text categories comprise at least one text category, and the number of inclusionary training samples is greater than or equal to one. The second preset classification model performs normalization processing on the inclusionary training samples, so that an inclusionary prediction probability set and a labeling probability set corresponding to each inclusionary training sample can be obtained; the inclusionary prediction probability set comprises the probability that the corresponding inclusionary training sample belongs to each text category, and the labeling probability set comprises the probability that the corresponding inclusionary training sample is labeled as each text category.
Wherein different text categories correspond to different labels.
Which text category an inclusionary training sample is labeled as is determined through the labeling of the training text: if the inclusionary training sample is labeled as a certain text category, the labeling probability corresponding to that text category is 1, and the labeling probabilities corresponding to the other text categories are 0.
For example, if there are 3 text categories (text category A, text category B, and text category C) and the training sample is labeled as belonging to text category A, the labeling probability set corresponding to the training sample is (1, 0, 0).
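A short sketch of building such a one-hot labeling probability set from an annotation (category names and ordering are assumptions):

```python
# Sketch: one-hot labeling probability set for the example above, with
# preset text categories (A, B, C) and a sample labeled as category A.
preset_text_categories = ["A", "B", "C"]
annotation = "A"

labeling_probability_set = [
    1.0 if category == annotation else 0.0 for category in preset_text_categories
]
print(labeling_probability_set)  # [1.0, 0.0, 0.0]
```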
In a possible implementation manner, the training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model includes:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the inclusionary training sample, and determining, among the labeling probability sets, the labeling probability set corresponding to the inclusionary training sample;
obtaining a second loss value based on a second loss function, the inclusionary prediction probability set corresponding to the inclusionary training sample, and the labeling probability set corresponding to the inclusionary training sample;
and under the condition that the second loss value is greater than or equal to a second preset loss value, adjusting the second preset model parameters based on the second loss value and a back propagation algorithm until the second loss value is smaller than the second preset loss value, and obtaining the inclusionary text classification model.
For example, an inclusionary training sample B is randomly selected from the inclusionary training samples, and the second loss value is determined based on the inclusionary prediction probability set corresponding to inclusionary training sample B, the labeling probability set corresponding to inclusionary training sample B, and the second loss function. If the second loss value is greater than or equal to the second preset loss value, the second preset model parameters are adjusted, and one inclusionary training sample other than inclusionary training sample B is selected from the inclusionary training samples to serve as the new inclusionary training sample B; this process is repeated until the second loss value is smaller than the second preset loss value.
And adjusting the current second preset model parameter on the basis of the second preset model parameter obtained by the last adjustment.
The second preset model parameter may be a positive number or a negative number.
Wherein the second loss function is

L_2 = -\sum_i \hat{y}_i \log(p_i)

where p_i represents the probability that the training sample belongs to the i-th text category, and \hat{y}_i is the probability that the training sample is labeled as the i-th text category.
For example, if there are 3 text categories (text category A, text category B, and text category C), the probability that training sample A belongs to text category A is 0.8, to text category B is 0.7, and to text category C is 0.5, and training sample A is labeled as text category A, then the loss value corresponding to training sample A is -1·log(0.8).
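Again the arithmetic can be verified directly; as before, the natural logarithm is assumed.

```python
import math

# p[i]: probability that training sample A belongs to text category i.
# y_hat[i]: one-hot labeling probability set (sample labeled category A).
p = [0.8, 0.7, 0.5]
y_hat = [1.0, 0.0, 0.0]

# Second loss function: L2 = -sum_i y_hat_i * log(p_i).
second_loss_value = -sum(y_i * math.log(p_i) for y_i, p_i in zip(y_hat, p))
print(round(second_loss_value, 4))  # 0.2231, i.e. -1*log(0.8)
```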
It can be seen that, in the embodiment of the present application, the inclusionary text classification model is obtained under the condition that the second loss value is smaller than the second preset loss value, which is beneficial to improving the accuracy of text category determination.
In a possible implementation manner, after determining the target text category to which the text to be classified belongs based on the category prediction probability set, the method further includes:
and outputting the target text category and the prediction probability corresponding to the target text category.
Wherein, the target text category and the corresponding prediction probability may be output in association with each other.
It can be seen that, in the embodiment of the present application, outputting the target text category and its prediction probability is beneficial to understanding the accuracy of the text classification.
Referring to fig. 3, in accordance with the embodiment shown in fig. 2, fig. 3 is a schematic structural diagram of another electronic device provided in an embodiment of the present application, as shown, the electronic device includes a processor, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for performing the following steps:
obtaining a training sample;
training a preset text classification model to be trained by adopting the training sample to obtain a text classification model;
acquiring a text to be classified, and inputting the text to be classified into the text classification model to obtain a category prediction probability set, wherein the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories;
and determining a target text category to which the text to be classified belongs in the preset text categories based on the category prediction probability set.
In an implementation manner of the present application, in terms of training a preset text classification model to be trained by using the training samples to obtain a text classification model, the above-mentioned program is specifically used for executing instructions of the following steps:
dividing the training samples into inclusionary training samples and exclusionary training samples based on sample identifications;
training an exclusionary text classification model to be trained based on the exclusionary training sample to obtain an exclusionary text classification model;
and training the inclusionary text classification model to be trained based on the inclusionary training sample to obtain the inclusionary text classification model.
In an implementation manner of the present application, in terms of training an exclusionary text classification model to be trained based on the exclusionary training samples to obtain the exclusionary text classification model, the above-mentioned program is specifically configured to execute instructions of the following steps:
inputting the exclusionary training samples into a first preset classification model to obtain an inclusionary prediction probability set and an exclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the exclusionary training samples belong to the preset text category, and the exclusionary prediction probability set comprises the probability that the exclusionary training samples are excluded from the preset text category;
and training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model.
In an implementation manner of the present application, in terms of training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model, the above-mentioned program is specifically configured to execute instructions of the following steps:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the exclusionary training sample, and determining, among the exclusionary prediction probability sets, the exclusionary prediction probability set corresponding to the exclusionary training sample;
obtaining a first loss value based on a first loss function, the inclusionary prediction probability set corresponding to the exclusionary training sample, and the exclusionary prediction probability set corresponding to the exclusionary training sample;
and under the condition that the first loss value is greater than or equal to a first preset loss value, adjusting a first preset model parameter based on the first loss value and a back propagation algorithm until the first loss value is smaller than the first preset loss value, and obtaining the exclusionary text classification model.
In an implementation manner of the present application, in terms of training an inclusionary text classification model to be trained based on the inclusionary training sample to obtain an inclusionary text classification model, the above-mentioned program is specifically used for executing instructions of the following steps:
inputting the inclusionary training sample into a second preset classification model to obtain an inclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the inclusionary training sample belongs to the preset text categories;
determining a labeling probability set based on the labeling of the inclusionary training sample, wherein the labeling probability set comprises the probability that the inclusionary training sample is labeled as the preset text categories;
and training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model.
In an implementation manner of the present application, in terms of training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model, the above-mentioned program is specifically configured to execute instructions of the following steps:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the inclusionary training sample, and determining, among the labeling probability sets, the labeling probability set corresponding to the inclusionary training sample;
obtaining a second loss value based on a second loss function, the inclusionary prediction probability set corresponding to the inclusionary training sample, and the labeling probability set corresponding to the inclusionary training sample;
and under the condition that the second loss value is greater than or equal to a second preset loss value, adjusting the second preset model parameters based on the second loss value and a back propagation algorithm until the second loss value is smaller than the second preset loss value, and obtaining the inclusionary text classification model.
In an implementation manner of the present application, after determining a target text category to which the text to be classified belongs in the preset text categories, the program is further specifically configured to execute instructions of the following steps:
and outputting the target text category and the prediction probability corresponding to the target text category.
It should be noted that, for the specific implementation process of the present embodiment, reference may be made to the specific implementation process described in the above method embodiment, and a description thereof is omitted here.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a text type recognition apparatus according to an embodiment of the present application, where the apparatus includes:
a first obtaining unit 401, configured to obtain a training sample;
a training unit 402, configured to train a preset text classification model to be trained by using the training sample, so as to obtain a text classification model;
a second obtaining unit 403, configured to obtain a text to be classified;
an input unit 404, configured to input the text to be classified into the text classification model to obtain a category prediction probability set, where the category prediction probability set includes the prediction probabilities that the text to be classified belongs to the preset text categories;
a determining unit 405, configured to determine, based on the category prediction probability set, a target text category to which the text to be classified belongs in the preset text categories.
In an implementation manner of the present application, in terms of training a preset text classification model to be trained by using the training samples to obtain a text classification model, the training unit 402 is configured to execute instructions of the following steps:
dividing the training samples into inclusionary training samples and exclusionary training samples based on sample identifications;
training an exclusionary text classification model to be trained based on the exclusionary training sample to obtain an exclusionary text classification model;
and training the inclusionary text classification model to be trained based on the inclusionary training sample to obtain the inclusionary text classification model.
In an implementation manner of the present application, in terms of training an exclusionary text classification model to be trained based on the exclusionary training samples to obtain the exclusionary text classification model, the training unit 402 is configured to execute the following instructions:
inputting the exclusionary training samples into a first preset classification model to obtain an inclusionary prediction probability set and an exclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the exclusionary training samples belong to the preset text category, and the exclusionary prediction probability set comprises the probability that the exclusionary training samples are excluded from the preset text category;
and training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model.
In an implementation manner of the present application, in terms of training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model, the training unit 402 is configured to execute the following steps:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the exclusionary training sample, and determining, among the exclusionary prediction probability sets, the exclusionary prediction probability set corresponding to the exclusionary training sample;
obtaining a first loss value based on a first loss function, the inclusionary prediction probability set corresponding to the exclusionary training sample, and the exclusionary prediction probability set corresponding to the exclusionary training sample;
and under the condition that the first loss value is greater than or equal to a first preset loss value, adjusting a first preset model parameter based on the first loss value and a back propagation algorithm until the first loss value is smaller than the first preset loss value, and obtaining the exclusionary text classification model.
In an implementation manner of the present application, in terms of training a to-be-trained inclusionary text classification model based on the inclusionary training sample to obtain an inclusionary text classification model, the training unit 402 is configured to execute instructions of the following steps:
inputting the inclusionary training sample into a second preset classification model to obtain an inclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the inclusionary training sample belongs to the preset text categories;
determining a labeling probability set based on the labeling of the inclusionary training sample, wherein the labeling probability set comprises the probability that the inclusionary training sample is labeled as the preset text categories;
and training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model.
In an implementation manner of the present application, in terms of training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model, the training unit 402 is configured to execute the following steps:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the inclusionary training sample, and determining, among the labeling probability sets, the labeling probability set corresponding to the inclusionary training sample;
obtaining a second loss value based on a second loss function, the inclusionary prediction probability set corresponding to the inclusionary training sample, and the labeling probability set corresponding to the inclusionary training sample;
and under the condition that the second loss value is greater than or equal to a second preset loss value, adjusting the second preset model parameters based on the second loss value and a back propagation algorithm until the second loss value is smaller than the second preset loss value, and obtaining the inclusionary text classification model.
In a possible implementation, the text category identification apparatus further comprises an output unit 406.
In an implementation manner of the present application, after determining a target text category to which the text to be classified belongs in the preset text categories, the output unit 406 is configured to execute the following steps:
and outputting the target text category and the prediction probability corresponding to the target text category.
It should be noted that the first obtaining unit 401, the training unit 402, the second obtaining unit 403, the input unit 404, the determining unit 405, and the output unit 406 may be implemented by a processor.
Embodiments of the present application further provide a computer storage medium, where the computer storage medium stores a computer program, and the computer program is executed by a processor to implement part or all of the steps of any one of the text category identification methods as described in the above method embodiments.
Embodiments of the present application also provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform some or all of the steps described for the electronic device in the above method embodiments. The computer program product may be a software installation package.
The steps of a method or algorithm described in the embodiments of the present application may be implemented in hardware, or may be implemented by a processor executing software instructions. The software instructions may be composed of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Additionally, the ASIC may reside in an access network device, a target network device, or a core network device. Of course, the processor and the storage medium may also reside as discrete components in an access network device, a target network device, or a core network device.
Those skilled in the art will appreciate that, in one or more of the examples described above, the functionality described in the embodiments of the present application may be implemented, in whole or in part, by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless connection (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., Digital Video Disk (DVD)), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the embodiments of the present application in further detail, and it should be understood that the above-mentioned embodiments are only specific embodiments of the present application, and are not intended to limit the scope of the embodiments of the present application, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the embodiments of the present application should be included in the scope of the embodiments of the present application.

Claims (10)

1. A text category identification method is applied to an electronic device, and comprises the following steps:
obtaining a training sample;
training a preset text classification model to be trained by adopting the training sample to obtain a text classification model;
acquiring a text to be classified, and inputting the text to be classified into the text classification model to obtain a category prediction probability set, wherein the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories;
and determining a target text category to which the text to be classified belongs in the preset text categories based on the category prediction probability set.
2. The method according to claim 1, wherein the training a preset text classification model to be trained by using the training samples to obtain a text classification model comprises:
dividing the training samples into inclusionary training samples and exclusionary training samples based on sample identifications;
training an exclusionary text classification model to be trained based on the exclusionary training sample to obtain an exclusionary text classification model;
and training the inclusionary text classification model to be trained based on the inclusionary training sample to obtain the inclusionary text classification model.
3. The method of claim 2, wherein training the exclusionary text classification model to be trained based on the exclusionary training samples to obtain an exclusionary text classification model comprises:
inputting the exclusionary training samples into a first preset classification model to obtain an inclusionary prediction probability set and an exclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the exclusionary training samples belong to the preset text category, and the exclusionary prediction probability set comprises the probability that the exclusionary training samples are excluded from the preset text category;
and training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model.
4. The method according to claim 3, wherein the training the exclusionary text classification model to be trained based on the inclusionary prediction probability set and the exclusionary prediction probability set to obtain the exclusionary text classification model comprises:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the exclusionary training sample, and determining, among the exclusionary prediction probability sets, the exclusionary prediction probability set corresponding to the exclusionary training sample;
obtaining a first loss value based on a first loss function, the inclusionary prediction probability set corresponding to the exclusionary training sample, and the exclusionary prediction probability set corresponding to the exclusionary training sample;
and under the condition that the first loss value is greater than or equal to a first preset loss value, adjusting a first preset model parameter based on the first loss value and a back propagation algorithm until the first loss value is smaller than the first preset loss value, and obtaining the exclusionary text classification model.
5. The method of claim 2, wherein training the inclusionary text classification model to be trained based on the inclusionary training samples to obtain an inclusionary text classification model comprises:
inputting the inclusionary training sample into a second preset classification model to obtain an inclusionary prediction probability set, wherein the inclusionary prediction probability set comprises the probability that the inclusionary training sample belongs to the preset text categories;
determining a labeling probability set based on the labeling of the inclusionary training sample, wherein the labeling probability set comprises the probability that the inclusionary training sample is labeled as the preset text categories;
and training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model.
6. The method of claim 5, wherein the training the inclusionary text classification model to be trained based on the inclusionary prediction probability set and the labeling probability set to obtain the inclusionary text classification model comprises:
determining, among the inclusionary prediction probability sets, the inclusionary prediction probability set corresponding to the inclusionary training sample, and determining, among the labeling probability sets, the labeling probability set corresponding to the inclusionary training sample;
obtaining a second loss value based on a second loss function, the inclusionary prediction probability set corresponding to the inclusionary training sample, and the labeling probability set corresponding to the inclusionary training sample;
and under the condition that the second loss value is greater than or equal to a second preset loss value, adjusting the second preset model parameters based on the second loss value and a back propagation algorithm until the second loss value is smaller than the second preset loss value, and obtaining the inclusionary text classification model.
7. The method according to claim 4 or 6, wherein after determining a target text category to which the text to be classified belongs from the preset text categories, the method further comprises:
and outputting the target text category and the prediction probability corresponding to the target text category.
8. A text category identification apparatus, characterized in that the apparatus comprises:
a first obtaining unit for obtaining a training sample;
the training unit is used for training a preset text classification model to be trained by adopting the training sample to obtain a text classification model;
the second acquisition unit is used for acquiring texts to be classified;
the input unit is used for inputting the text to be classified into the text classification model to obtain a category prediction probability set, and the category prediction probability set comprises the prediction probabilities that the text to be classified belongs to the preset text categories;
and the determining unit is used for determining a target text category to which the text to be classified belongs in the preset text categories based on the category prediction probability set.
9. An electronic device, comprising a processor, memory, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the steps in the method according to any one of claims 1-7.
CN202110286227.7A 2021-03-17 2021-03-17 Text type identification method and related equipment Pending CN112966110A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110286227.7A CN112966110A (en) 2021-03-17 2021-03-17 Text type identification method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110286227.7A CN112966110A (en) 2021-03-17 2021-03-17 Text type identification method and related equipment

Publications (1)

Publication Number Publication Date
CN112966110A 2021-06-15

Family

ID=76279029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110286227.7A Pending CN112966110A (en) 2021-03-17 2021-03-17 Text type identification method and related equipment

Country Status (1)

Country Link
CN (1) CN112966110A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170323202A1 (en) * 2016-05-06 2017-11-09 Fujitsu Limited Recognition apparatus based on deep neural network, training apparatus and methods thereof
WO2019019860A1 (en) * 2017-07-24 2019-01-31 华为技术有限公司 Method and apparatus for training classification model
CN110059647A (en) * 2019-04-23 2019-07-26 杭州智趣智能信息技术有限公司 A kind of file classification method, system and associated component
CN111814810A (en) * 2020-08-11 2020-10-23 Oppo广东移动通信有限公司 Image recognition method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113806542A (en) * 2021-09-18 2021-12-17 上海幻电信息科技有限公司 Text analysis method and system
CN113806542B (en) * 2021-09-18 2024-05-17 上海幻电信息科技有限公司 Text analysis method and system

Similar Documents

Publication Publication Date Title
CN112041815B (en) Malware detection
CN109872162B (en) Wind control classification and identification method and system for processing user complaint information
US20120136812A1 (en) Method and system for machine-learning based optimization and customization of document similarities calculation
CN111814923B (en) Image clustering method, system, device and medium
US20170185913A1 (en) System and method for comparing training data with test data
WO2017173093A1 (en) Method and device for identifying spam mail
CN109766496B (en) Content risk identification method, system, device and medium
WO2023272850A1 (en) Decision tree-based product matching method, apparatus and device, and storage medium
CN115221516B (en) Malicious application program identification method and device, storage medium and electronic equipment
CN112966102A (en) Classification model construction and text sentence classification method, equipment and storage medium
CN115935344A (en) Abnormal equipment identification method and device and electronic equipment
CN112181835A (en) Automatic testing method and device, computer equipment and storage medium
CN115080972A (en) Method and device for detecting abnormal access of interface of electric mobile terminal
CN112966110A (en) Text type identification method and related equipment
CN113051911B (en) Method, apparatus, device, medium and program product for extracting sensitive words
CN113918949A (en) Recognition method of fraud APP based on multi-mode fusion
CN108021713B (en) Document clustering method and device
CN111444364B (en) Image detection method and device
CN110855740B (en) Information pushing method and related equipment
CN112669850A (en) Voice quality detection method and device, computer equipment and storage medium
US11934421B2 (en) Unified extraction platform for optimized data extraction and processing
CN111353039A (en) File class detection method and device
CN110727759A (en) Method and device for determining theme of voice information
CN115171136A (en) Method, equipment and storage medium for classifying and identifying content of banking business material
CN111966339B (en) Buried point parameter input method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination