CN114281991A - Text classification method and device, electronic equipment and storage medium - Google Patents

Publication number
CN114281991A
Authority
CN
China
Prior art keywords
text
sample
vector
character
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111565838.1A
Other languages
Chinese (zh)
Inventor
刘欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202111565838.1A
Publication of CN114281991A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of artificial intelligence, and discloses a text classification method, which comprises the following steps: performing coding processing on the text category set and the sample set to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set; performing correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character, and determining a third vector corresponding to each sample in the sample set; performing classification processing on the third vector to obtain a prediction probability value of each sample in each text category; inputting the text type labels and the corresponding prediction probability values into a loss function to obtain loss values, and determining a target label matrix and a trained text classification model by minimizing the loss values; and inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category. The invention also provides a text classification device, electronic equipment and a storage medium. The invention improves the text classification accuracy.

Description

Text classification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to a text classification method and device, electronic equipment and a storage medium.
Background
Text classification is widely used in everyday applications such as news classification, mail classification, and intent classification, and classifying texts accurately is a key concern.
Currently, text classification is generally performed with a model obtained by supervised training, and accuracy is usually improved by increasing the number of samples. However, this approach does not consider the correlation between the characters in a sample and its label (for example, "missile" in a text is strongly correlated with the label "military"), so the classification accuracy is limited; moreover, increasing the number of samples requires a large investment in manual labeling. Therefore, a text classification method is needed that improves classification accuracy while saving labor cost.
Disclosure of Invention
In view of the above, there is a need to provide a text classification method, aiming at improving the text classification accuracy.
The text classification method provided by the invention comprises the following steps:
acquiring a sample set carrying text category labels, and determining a text category set corresponding to the sample set based on the text category labels;
respectively performing coding processing on the text category set and the sample set based on a coding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
performing correlation analysis on the first vector and the initial label matrix based on a correlation analysis network of the text classification model to obtain a second vector corresponding to each character;
splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
based on the classification network of the text classification model, performing classification processing on the third vector to obtain a prediction probability value of each sample in the sample set in each text category;
inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
Optionally, the performing correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character includes:
performing correlation analysis between characters based on the first vector to obtain a fourth vector corresponding to each character;
performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character;
and adding the fourth vector and the fifth vector to obtain a second vector corresponding to each character.
Optionally, the initial label matrix includes an initial label vector corresponding to each text category in the text category set, and the performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character includes:
calculating a correlation value between each character and each text category in the text category set based on the first vector and the initial label vector;
and calculating a fifth vector corresponding to each character based on the correlation value and the initial label vector.
Optionally, the calculation formula of the correlation value is:
$\alpha_{ni,j} = \operatorname{cosin}(h_{ni}, t_j)$
wherein $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, $h_{ni}$ is the first vector corresponding to the i-th character of the n-th sample in the sample set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set;
the calculation formula of the fifth vector is as follows:
[Formula image BDA0003419894950000021]
wherein $h'_{ni}$ is the fifth vector corresponding to the i-th character of the n-th sample in the sample set, $k$ is the total number of text categories in the text category set, $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set.
Optionally, the correlation analysis network includes a plurality of attention layers connected in series, the classification network includes a fully connected layer and an activation layer, and the coding network includes a plurality of coding layers connected in series.
Optionally, the loss function is:
[Formula image BDA0003419894950000031]
wherein the symbols of the formula (images BDA0003419894950000032 to BDA0003419894950000034) denote, respectively, the loss value, the label value of the i-th sample in the sample set for the j-th text category in the text category set, and the prediction probability value of the i-th sample in the sample set for the j-th text category in the text category set; N is the total number of samples in the sample set, and k is the total number of text categories in the text category set.
Optionally, after the trained text classification model is obtained based on the structural parameters, the method further includes:
extracting a new sample set at preset intervals, and updating the target label matrix and the structural parameters of the trained text classification model based on the new sample set to obtain a new target label matrix and a newly trained text classification model.
In order to solve the above problem, the present invention also provides a text classification apparatus, including:
the acquisition module is used for acquiring a sample set carrying text category labels and determining a text category set corresponding to the sample set based on the text category labels;
the encoding module is used for respectively performing encoding processing on the text category set and the sample set based on an encoding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
the analysis module is used for performing correlation analysis on the first vector and the initial label matrix based on a correlation analysis network of the text classification model to obtain a second vector corresponding to each character;
the splicing module is used for splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
the prediction module is used for performing classification processing on the third vector based on the classification network of the text classification model to obtain a prediction probability value of each sample in the sample set in each text category;
the training module is used for inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and the classification module is used for inputting the texts to be classified and the target label matrix into the trained text classification model to obtain the target text category.
In order to solve the above problem, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a text classification program executable by the at least one processor to enable the at least one processor to perform the text classification method described above.
In order to solve the above problem, the present invention also provides a computer-readable storage medium having a text classification program stored thereon, the text classification program being executable by one or more processors to implement the above text classification method.
Compared with the prior art, the invention first performs coding processing on the text category set and the sample set respectively to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set; then performs correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character, and determines a third vector corresponding to each sample; next performs classification processing on the third vector to obtain a prediction probability value of each sample in each text category; then inputs the text category labels and the corresponding prediction probability values into a loss function to obtain loss values, and determines a target label matrix and a trained text classification model by minimizing the loss values; and finally inputs the text to be classified and the target label matrix into the trained text classification model to obtain the target text category. Because the correlation analysis on the first vector and the initial label matrix learns the correlation between each character and each text category, the trained text classification model classifies text more accurately. Therefore, the invention improves the text classification accuracy.
Drawings
Fig. 1 is a schematic flowchart of a text classification method according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of a text classification apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device implementing a text classification method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the descriptions relating to "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that the combination can be realized by a person skilled in the art; when the technical solutions are contradictory or cannot be realized, such a combination should be considered not to exist and is not within the protection scope of the present invention.
The invention provides a text classification method. Fig. 1 is a schematic flow chart of a text classification method according to an embodiment of the present invention. The method may be performed by an electronic device, which may be implemented by software and/or hardware.
In this embodiment, the text classification method includes:
S1, obtaining a sample set carrying text category labels, and determining a text category set corresponding to the sample set based on the text category labels.
In this embodiment, news text classification is taken as an example. A preset number of samples labeled with text category labels are extracted from a news text library to serve as the sample set. If the categories of the news texts include finance, sports, education and entertainment, the text category set corresponding to the sample set is {finance, sports, education, entertainment}.
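As a minimal sketch of step S1, assume each sample is a (text, label) pair. This data layout, the function name `build_category_set`, and the toy samples are illustrative assumptions, not taken from the source. Under that assumption, the text category set can be derived from the labels like this:

```python
def build_category_set(samples):
    """Return the distinct labels of a labeled sample set, in first-seen order.

    `samples` is assumed to be a list of (text, label) pairs; this layout is a
    hypothetical choice made for illustration.
    """
    categories = []
    for _text, label in samples:
        if label not in categories:
            categories.append(label)
    return categories


# Toy news samples, invented for illustration only.
samples = [
    ("Stocks rallied on Monday", "finance"),
    ("The home team won the final", "sports"),
    ("New curriculum announced", "education"),
    ("The film premiered last night", "entertainment"),
    ("Bond yields fell", "finance"),
]
categories = build_category_set(samples)
```

Keeping the categories in a fixed order matters later, because row j of the label matrix and column j of the predicted probabilities both refer to the j-th category.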
S2, respectively executing coding processing on the text category set and the sample set based on a coding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set.
In this embodiment, the text classification model is used for text classification and includes a coding network, a correlation analysis network, and a classification network connected in series. The coding network encodes each character of an input text. The correlation analysis network analyzes the correlation between the input coding vectors and assigns corresponding weights to them to obtain a weighted vector for each character. The classification network classifies the input text according to the weighted vectors of the characters.
The coding network comprises a plurality of coding layers (Transformer Encoder layers) connected in series, the correlation analysis network comprises a plurality of attention layers (Attention layers) connected in series, and the classification network comprises a fully connected layer and an activation layer.
The sample set is input into the coding network for coding processing to obtain a first vector corresponding to each character in each sample. Assuming that the coding dimension is 100, each first vector is a 1 × 100 array.
The 4 text categories in the text category set (namely finance, sports, education and entertainment) are respectively input into the coding network for coding, and the resulting initial label matrix is a 4 × 100 matrix.
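The shapes described above can be checked with a small sketch. The function `stub_encode` is a hypothetical stand-in for the coding network: it maps a string to a deterministic pseudo-random vector, whereas a real model would use trained Transformer Encoder layers. The sample string is also an invented example.

```python
import random

DIM = 100  # coding dimension assumed in the embodiment


def stub_encode(token, dim=DIM):
    """Hypothetical stand-in for the coding network.

    Maps a string to a deterministic pseudo-random vector; it carries no
    learned meaning and only illustrates the shapes involved.
    """
    rng = random.Random(token)  # string seeds give reproducible values
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


categories = ["finance", "sports", "education", "entertainment"]
sample = "rates"  # toy sample text

# One first vector (1 x 100) per character of the sample.
first_vectors = [stub_encode(ch) for ch in sample]
# One initial label vector per category: a 4 x 100 initial label matrix.
initial_label_matrix = [stub_encode(c) for c in categories]
```

Because characters and categories are encoded into the same 100-dimensional space, a character vector and a label vector can be compared directly in the later correlation step.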
S3, based on the correlation analysis network of the text classification model, carrying out correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character.
Different from the prior art in which only the correlation between the characters of the input text is analyzed, in this embodiment, the correlation between each character and each text category is also analyzed, so that the correlation between the second vector corresponding to each character obtained through the correlation analysis and the text category is stronger, and a more accurate text classification result can be obtained.
Performing correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character, including:
A11, performing correlation analysis between characters based on the first vector to obtain a fourth vector corresponding to each character;
The attention layer can learn the association between the characters of the input text, determine the importance of the first vector corresponding to each character, and assign a corresponding weight value; the fourth vector corresponding to each character is then determined from the weight value and the corresponding first vector. This part belongs to the prior art and is not described again here.
A12, performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character;
In this embodiment, the correlation analysis result between a character and a text category is determined by comparing the similarity between the first vector of each character and the initial label vector of each text category; steps B11-B12 below describe this process in detail.
A13, summing the fourth vector and the fifth vector to obtain a second vector corresponding to each character.
The second vector corresponding to each character fuses the correlation analysis results among characters and between characters and text categories, so the second vector carries richer features.
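Steps A11 and A13 can be sketched as follows. The source treats the attention layer as prior art and gives no formula, so the single-head scaled dot-product attention below, without learned projection matrices, is an assumed simplification. The function names and toy vectors are illustrative, and the fifth vectors are placeholders supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(first_vectors):
    """Step A11 (assumed form): one fourth vector per character, computed as
    an attention-weighted average of all first vectors. Scores are scaled
    dot products; no learned projections are used in this sketch."""
    dim = len(first_vectors[0])
    fourth_vectors = []
    for q in first_vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in first_vectors]
        weights = softmax(scores)
        fourth_vectors.append([
            sum(w * v[d] for w, v in zip(weights, first_vectors))
            for d in range(dim)
        ])
    return fourth_vectors


def fuse(fourth_vectors, fifth_vectors):
    """Step A13: element-wise sum of the fourth and fifth vectors."""
    return [[a + b for a, b in zip(f4, f5)]
            for f4, f5 in zip(fourth_vectors, fifth_vectors)]


first = [[1.0, 0.0], [0.0, 1.0]]   # two toy characters, dimension 2
fourth = self_attention(first)
fifth = [[0.5, 0.5], [0.5, 0.5]]   # placeholder fifth vectors for the demo
second = fuse(fourth, fifth)
```

The element-wise sum requires the fourth and fifth vectors to have the same dimension, which holds when characters and labels are encoded into the same space.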
Performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character, including:
B11, calculating a correlation value between each character and each text category in the text category set based on the first vector and the initial label vector;
The calculation formula of the correlation value is as follows:
$\alpha_{ni,j} = \operatorname{cosin}(h_{ni}, t_j)$
wherein $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, $h_{ni}$ is the first vector corresponding to the i-th character of the n-th sample in the sample set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set;
B12, calculating a fifth vector corresponding to each character based on the correlation value and the initial label vector.
The calculation formula of the fifth vector is as follows:
[Formula image BDA0003419894950000071]
wherein $h'_{ni}$ is the fifth vector corresponding to the i-th character of the n-th sample in the sample set, $k$ is the total number of text categories in the text category set, $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set.
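Steps B11 and B12 can be sketched under two stated assumptions. First, "cosin" is read as the cosine similarity of two vectors, which matches the description of comparing vector similarity. Second, the fifth-vector formula is not reproduced in this text, so the sketch assumes the simplest form consistent with the listed symbols: a sum of the k label vectors weighted by the correlation values, h' = sum over j of alpha_j * t_j. The original formula may differ, for example by a normalization factor.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation_values(h, label_matrix):
    """Step B11: correlation value of one character with every category."""
    return [cosine(h, t) for t in label_matrix]


def fifth_vector(h, label_matrix):
    """Step B12 under the ASSUMED form h' = sum_j alpha_j * t_j."""
    alphas = correlation_values(h, label_matrix)
    dim = len(label_matrix[0])
    return [sum(a * t[d] for a, t in zip(alphas, label_matrix))
            for d in range(dim)]


label_matrix = [[1.0, 0.0], [0.0, 1.0]]  # toy label vectors, k = 2
h = [3.0, 4.0]                           # toy first vector
alphas = correlation_values(h, label_matrix)
h_prime = fifth_vector(h, label_matrix)
```

With these toy values the character is more strongly correlated with the second category, so the second label vector contributes more to its fifth vector.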
S4, splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set.
In this embodiment, the second vectors of the characters in each sample are spliced (concatenated) in the order of the character positions to obtain the third vector corresponding to each sample in the sample set.
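A minimal sketch of the splicing step, assuming that splicing means plain concatenation in character order. The function name `splice` and the toy vectors are illustrative.

```python
def splice(second_vectors):
    """Concatenate per-character vectors, in character order, into one flat
    third vector."""
    third = []
    for vec in second_vectors:
        third.extend(vec)
    return third


second_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 characters, dim 2
third_vector = splice(second_vectors)
```

The length of the third vector is the number of characters times the vector dimension. The source does not state how samples of different lengths are handled; padding or truncating to a fixed length would be one option.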
S5, performing classification processing on the third vector based on the classification network of the text classification model to obtain a prediction probability value of each sample in the sample set in each text category.
The third vector of each sample in the sample set is input into the classification network of the text classification model, which outputs the prediction probability value of each sample in each text category.
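The classification network is described as a fully connected layer followed by an activation layer. The sketch below assumes the activation is a softmax, which the source does not specify. The weights, bias, and input are made-up values rather than trained parameters.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax (assumed activation layer)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(third_vector, weights, bias):
    """Return one prediction probability value per text category."""
    return softmax(dense(third_vector, weights, bias))


third_vector = [1.0, 2.0]                     # toy third vector
weights = [[1.0, 0.0], [0.0, 1.0],
           [0.0, 0.0], [-1.0, -1.0]]          # 4 categories x 2 inputs
bias = [0.0, 0.0, 0.0, 0.0]
probs = classify(third_vector, weights, bias)
```

The output has one entry per text category and sums to 1, so each entry can be read as the prediction probability value for that category.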
S6, inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters.
Different from the prior art, in which only the structural parameters of the model are used as the parameters to be optimized, in this embodiment the label matrix is also used as a parameter to be optimized, and the target label matrix and the structural parameters of the model can be determined by minimizing the loss function.
The loss function is:
[Formula image BDA0003419894950000072]
wherein the symbols of the formula (images BDA0003419894950000073 to BDA0003419894950000075) denote, respectively, the loss value, the label value of the i-th sample in the sample set for the j-th text category in the text category set, and the prediction probability value of the i-th sample in the sample set for the j-th text category in the text category set; N is the total number of samples in the sample set, and k is the total number of text categories in the text category set.
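The loss formula itself is not reproduced in this text. The listed symbols (label values, prediction probability values, N samples, k categories) are consistent with a categorical cross-entropy, so the sketch below assumes the sample-averaged form loss = -(1/N) * sum_i sum_j y_ij * log(p_ij). The original may differ, for example by omitting the 1/N factor. The function name, the `eps` clamp, and the toy data are illustrative.

```python
import math


def cross_entropy(labels, probs, eps=1e-12):
    """Mean categorical cross-entropy over N samples and k categories.

    ASSUMED form: loss = -(1/N) * sum_i sum_j y_ij * log(p_ij).
    `eps` clamps probabilities away from zero to avoid log(0).
    """
    n = len(labels)
    total = 0.0
    for y_row, p_row in zip(labels, probs):
        for y, p in zip(y_row, p_row):
            total += y * math.log(max(p, eps))
    return -total / n


# Two toy samples, four categories; labels are one-hot.
labels = [[1, 0, 0, 0], [0, 1, 0, 0]]
probs = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]
loss = cross_entropy(labels, probs)
```

In training, this value would be minimized with respect to both the structural parameters of the model and the entries of the label matrix, which is what distinguishes the embodiment from optimizing the model parameters alone.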
S7, inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
After the model training is finished, when a text needs to be classified, each character in the text to be classified can be associated with each text category based on the target label matrix, so that more accurate feature vectors are obtained and the text classification accuracy is higher.
Inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category includes:
inputting the text to be classified into the coding network of the trained text classification model to perform coding processing, and obtaining a sixth vector corresponding to each character in the text to be classified; inputting the sixth vector and the target label matrix into the trained text classification model to perform correlation analysis between characters and text categories to obtain a seventh vector corresponding to each character in the text to be classified; and inputting the seventh vector into the classification network of the trained text classification model to perform classification processing, so as to obtain a target text category.
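The final selection of the target text category can be sketched as follows. The source only says that a target text category is obtained; choosing the category with the highest prediction probability is an assumption, and the function name and probability values are illustrative.

```python
def predict_category(probs, categories):
    """Return the category with the highest predicted probability.

    Argmax selection is an assumption; the source does not state how the
    target text category is chosen from the classification output.
    """
    best = max(range(len(probs)), key=lambda j: probs[j])
    return categories[best]


categories = ["finance", "sports", "education", "entertainment"]
probs = [0.05, 0.80, 0.10, 0.05]  # made-up output of the classification network
target = predict_category(probs, categories)
```

The order of `categories` must match the row order of the target label matrix used during training, so that index j refers to the same category in both places.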
After the trained text classification model is obtained based on the structural parameters, the method further includes:
extracting a new sample set at preset intervals, and updating the target label matrix and the structural parameters of the trained text classification model based on the new sample set to obtain a new target label matrix and a newly trained text classification model.
This operation ensures the accuracy of the subsequent text classification.
As can be seen from the foregoing embodiments, the text classification method provided by the present invention first performs coding processing on the text category set and the sample set respectively to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set; then performs correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character, and determines a third vector corresponding to each sample; next performs classification processing on the third vector to obtain a prediction probability value of each sample in each text category; then inputs the text category labels and the corresponding prediction probability values into a loss function to obtain loss values, and determines a target label matrix and a trained text classification model by minimizing the loss values; and finally inputs the text to be classified and the target label matrix into the trained text classification model to obtain the target text category. Because the correlation analysis on the first vector and the initial label matrix learns the correlation between each character and each text category, the trained text classification model classifies text more accurately. Therefore, the text classification accuracy is improved.
Fig. 2 is a schematic block diagram of a text classification apparatus according to an embodiment of the present invention.
The text classification apparatus 100 according to the present invention may be installed in an electronic device. According to the implemented functions, the text classification apparatus 100 may include an obtaining module 110, an encoding module 120, an analyzing module 130, a splicing module 140, a predicting module 150, a training module 160, and a classifying module 170. The module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a fixed function, and that are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
The obtaining module 110 is configured to obtain a sample set carrying text category labels, and determine, based on the text category labels, a text category set corresponding to the sample set.
The encoding module 120 is configured to perform encoding processing on the text category set and the sample set respectively based on an encoding network of a text classification model, so as to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set.
The coding network comprises a plurality of coding layers (Transformer Encoder layers) connected in series, the correlation analysis network comprises a plurality of attention layers (Attention layers) connected in series, and the classification network comprises a fully connected layer and an activation layer.
The analysis module 130 is configured to perform correlation analysis on the first vector and the initial label matrix based on a correlation analysis network of the text classification model, so as to obtain a second vector corresponding to each character.
Performing correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each character, including:
A21, performing correlation analysis between characters based on the first vector to obtain a fourth vector corresponding to each character;
A22, performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character;
A23, summing the fourth vector and the fifth vector to obtain a second vector corresponding to each character.
The initial label matrix includes an initial label vector corresponding to each text category in the text category set, and the performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character includes:
B21, calculating a correlation value between each character and each text category in the text category set based on the first vector and the initial label vector;
The calculation formula of the correlation value is as follows:
$\alpha_{ni,j} = \operatorname{cosin}(h_{ni}, t_j)$
wherein $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, $h_{ni}$ is the first vector corresponding to the i-th character of the n-th sample in the sample set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set;
B22, calculating a fifth vector corresponding to each character based on the correlation value and the initial label vector.
The calculation formula of the fifth vector is as follows:
[Formula image BDA0003419894950000101]
wherein $h'_{ni}$ is the fifth vector corresponding to the i-th character of the n-th sample in the sample set, $k$ is the total number of text categories in the text category set, $\alpha_{ni,j}$ is the correlation value between the i-th character of the n-th sample in the sample set and the j-th text category in the text category set, and $t_j$ is the initial label vector corresponding to the j-th text category in the text category set.
The splicing module 140 is configured to splice the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set.
The prediction module 150 is configured to perform classification processing on the third vector based on the classification network of the text classification model to obtain a prediction probability value of each sample in the sample set in each text category.
The training module 160 is configured to input the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain a loss value, determine a target label matrix and structural parameters of the text classification model by minimizing the loss value, and obtain a trained text classification model based on the structural parameters.
The loss function is:
[Formula image BDA0003419894950000102]
wherein the symbols of the formula (images BDA0003419894950000103 to BDA0003419894950000105) denote, respectively, the loss value, the label value of the i-th sample in the sample set for the j-th text category in the text category set, and the prediction probability value of the i-th sample in the sample set for the j-th text category in the text category set; N is the total number of samples in the sample set, and k is the total number of text categories in the text category set.
And the classification module 170 is configured to input the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
After the trained text classification model is obtained based on the structural parameters, the training module 160 is further configured to:
extract a new sample set at preset intervals, and update the target label matrix and the structural parameters of the trained text classification model based on the new sample set to obtain a new target label matrix and a newly trained text classification model.
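As a non-limiting illustration, the periodic update may be sketched as follows. The description does not specify how the preset interval is measured or how the new sample set is extracted, so the sketch assumes a simple elapsed-time check and takes the extraction and update steps as caller-supplied callables. The names `update_if_due`, `fetch_new_samples` and `fine_tune` are hypothetical.

```python
def update_if_due(last_update, now, interval, fetch_new_samples, fine_tune):
    """Run one update when the preset interval has elapsed.

    last_update       -- time of the previous update
    now               -- current time, in the same unit as last_update
    interval          -- preset interval between updates
    fetch_new_samples -- callable returning the new sample set
    fine_tune         -- callable that updates the target label matrix and
                         the structural parameters from the new sample set
    Returns (time of the most recent update, whether an update ran).
    """
    if now - last_update < interval:
        return last_update, False
    samples = fetch_new_samples()
    fine_tune(samples)
    return now, True
```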
Fig. 3 is a schematic structural diagram of an electronic device for implementing a text classification method according to an embodiment of the present invention.
The electronic device 1 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or pre-stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a type of distributed computing in which a group of loosely coupled computers forms a super virtual computer.
In the present embodiment, the electronic device 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13, which are communicatively connected to each other through a system bus, wherein the memory 11 stores a text classification program 10, and the text classification program 10 is executable by the processor 12. While fig. 3 shows only the electronic device 1 with components 11-13 and the text classification program 10, those skilled in the art will appreciate that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may include fewer or more components than shown, or some components in combination, or a different arrangement of components.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1; in other embodiments, the readable storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk provided on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like. In this embodiment, the readable storage medium of the memory 11 is generally used for storing an operating system and various application software installed in the electronic device 1, for example, the code of the text classification program 10 in an embodiment of the present invention. Further, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is generally configured to control the overall operation of the electronic device 1, such as performing control and processing related to data interaction or communication with other devices. In this embodiment, the processor 12 is configured to run the program code stored in the memory 11 or process data, for example, run the text classification program 10.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is used for establishing a communication connection between the electronic device 1 and a client (not shown).
Optionally, the electronic device 1 may further include a user interface, the user interface may include a Display (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface may further include a standard wired interface and a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The memory 11 in the electronic device 1 stores a text classification program 10 that is a combination of instructions that, when executed in the processor 12, perform the steps of:
acquiring a sample set carrying text category labels, and determining a text category set corresponding to the sample set based on the text category labels;
respectively performing coding processing on the text category set and the sample set based on a coding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
performing relevance analysis on the first vector and the initial label matrix based on a relevance analysis network of the text classification model to obtain a second vector corresponding to each character;
splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
based on the classification network of the text classification model, performing classification processing on the third vector to obtain a prediction probability value of each sample in the sample set in each text category;
inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
Specifically, for the implementation of the text classification program 10 by the processor 12, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
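As a non-limiting illustration, the final step of obtaining the target text category may be sketched as follows. The sketch assumes that the target text category is the category with the highest prediction probability value, which the description does not state explicitly. The function name `target_category` and the example category names are hypothetical.

```python
def target_category(probabilities, categories):
    """Return the category whose prediction probability value is largest.

    probabilities -- k prediction probability values for one text
    categories    -- the k text categories, in the same order
    """
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return categories[best]
```

For example, with the prediction probability values 0.1, 0.7 and 0.2 for three categories, the second category is returned.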
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
The computer-readable storage medium stores a text classification program 10, and the text classification program 10 is executable by one or more processors to implement the following steps:
acquiring a sample set carrying text category labels, and determining a text category set corresponding to the sample set based on the text category labels;
respectively performing coding processing on the text category set and the sample set based on a coding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
performing relevance analysis on the first vector and the initial label matrix based on a relevance analysis network of the text classification model to obtain a second vector corresponding to each character;
splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
based on the classification network of the text classification model, performing classification processing on the third vector to obtain a prediction probability value of each sample in the sample set in each text category;
inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; the division of the modules is only one logical functional division, and other divisions may be used in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method of text classification, the method comprising:
acquiring a sample set carrying text category labels, and determining a text category set corresponding to the sample set based on the text category labels;
respectively performing coding processing on the text category set and the sample set based on a coding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
performing relevance analysis on the first vector and the initial label matrix based on a relevance analysis network of the text classification model to obtain a second vector corresponding to each character;
splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
based on the classification network of the text classification model, performing classification processing on the third vector to obtain a prediction probability value of each sample in the sample set in each text category;
inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and inputting the text to be classified and the target label matrix into the trained text classification model to obtain the target text category.
2. The method of claim 1, wherein the performing a correlation analysis on the first vector and the initial label matrix to obtain a second vector corresponding to each of the characters comprises:
performing correlation analysis between characters based on the first vector to obtain a fourth vector corresponding to each character;
performing correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character;
and adding the fourth vector and the fifth vector to obtain a second vector corresponding to each character.
3. The method of classifying text according to claim 2, wherein the initial label matrix includes an initial label vector corresponding to each text category in the set of text categories, and the performing a correlation analysis between characters and text categories based on the first vector and the initial label matrix to obtain a fifth vector corresponding to each character comprises:
calculating a correlation value for each of the characters to each of the set of text categories based on the first vector and the initial tag vector;
and calculating a fifth vector corresponding to each character based on the correlation value and the initial label vector.
4. The text classification method according to claim 3, characterized in that the correlation value is calculated by the formula:
$\alpha_{ni,j} = \operatorname{cosin}(h_{ni}, t_j)$
where $\alpha_{ni,j}$ is the correlation value between the $i$-th character of the $n$-th sample in the sample set and the $j$-th text category in the text category set, $h_{ni}$ is the first vector corresponding to the $i$-th character of the $n$-th sample in the sample set, and $t_j$ is the initial label vector corresponding to the $j$-th text category in the text category set;
the calculation formula of the fifth vector is as follows:
Figure FDA0003419894940000021
where $h'_{ni}$ is the fifth vector corresponding to the $i$-th character of the $n$-th sample in the sample set, $k$ is the total number of text categories in the text category set, $\alpha_{ni,j}$ is the correlation value between the $i$-th character of the $n$-th sample in the sample set and the $j$-th text category in the text category set, and $t_j$ is the initial label vector corresponding to the $j$-th text category in the text category set.
5. The text classification method of claim 1, wherein the relevance analysis network includes a plurality of attention layers connected in series, the classification network includes a fully connected layer and an active layer, and the encoding network includes a plurality of encoding layers connected in series.
6. The text classification method of claim 1, wherein the loss function is:
Figure FDA0003419894940000022
where the symbol shown in Figure FDA0003419894940000023 is the loss value, the symbol shown in Figure FDA0003419894940000024 is the label value of the i-th sample in the sample set for the j-th text category in the text category set, the symbol shown in Figure FDA0003419894940000025 is the prediction probability value of the i-th sample in the sample set for the j-th text category in the text category set, N is the total number of samples in the sample set, and k is the total number of text categories in the text category set.
7. The text classification method according to claim 1, characterized in that after the trained text classification model is obtained based on the structural parameters, the method further comprises:
extracting a new sample set at preset intervals, and updating the target label matrix and the structural parameters of the trained text classification model based on the new sample set to obtain a new target label matrix and a newly trained text classification model.
8. An apparatus for classifying text, the apparatus comprising:
the acquisition module is used for acquiring a sample set carrying text category labels and determining a text category set corresponding to the sample set based on the text category labels;
the encoding module is used for respectively performing encoding processing on the text category set and the sample set based on an encoding network of a text classification model to obtain an initial label matrix and a first vector corresponding to each character of each sample in the sample set;
the analysis module is used for performing correlation analysis on the first vector and the initial label matrix based on a correlation analysis network of the text classification model to obtain a second vector corresponding to each character;
the splicing module is used for splicing the second vector corresponding to each character in each sample to obtain a third vector corresponding to each sample in the sample set;
the prediction module is used for performing classification processing on the third vector based on the classification network of the text classification model to obtain a prediction probability value of each sample in the sample set in each text category;
the training module is used for inputting the text category labels and the corresponding prediction probability values into a predetermined loss function to obtain loss values, determining a target label matrix and structural parameters of the text classification model by minimizing the loss values, and obtaining a trained text classification model based on the structural parameters;
and the classification module is used for inputting the texts to be classified and the target label matrix into the trained text classification model to obtain the target text category.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a text classification program executable by the at least one processor to enable the at least one processor to perform the text classification method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a text classification program executable by one or more processors to implement the text classification method of any one of claims 1 to 7.
CN202111565838.1A 2021-12-20 2021-12-20 Text classification method and device, electronic equipment and storage medium Pending CN114281991A (en)

Publications (1)

Publication Number Publication Date
CN114281991A true CN114281991A (en) 2022-04-05

Family

ID=80873257

Country Status (1)

Country Link
CN (1) CN114281991A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150625A (en) * 2023-03-08 2023-05-23 华院计算技术(上海)股份有限公司 Training method and device for text search model and computing equipment
CN116150625B (en) * 2023-03-08 2024-03-29 华院计算技术(上海)股份有限公司 Training method and device for text search model and computing equipment
CN117786104A (en) * 2023-11-17 2024-03-29 中信建投证券股份有限公司 Model training method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination