CN115048524A - Text classification display method and device, electronic equipment and computer readable medium - Google Patents


Info

Publication number
CN115048524A
Authority
CN
China
Prior art keywords
text
sample
vector
classified
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210971410.5A
Other languages
Chinese (zh)
Other versions
CN115048524B (en)
Inventor
赵祥
李建华
王静宇
张昆鹏
马亚中
王辉
郭宝松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongguancun Smart City Co Ltd
Original Assignee
Zhongguancun Smart City Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongguancun Smart City Co Ltd filed Critical Zhongguancun Smart City Co Ltd
Priority to CN202210971410.5A priority Critical patent/CN115048524B/en
Publication of CN115048524A publication Critical patent/CN115048524A/en
Application granted granted Critical
Publication of CN115048524B publication Critical patent/CN115048524B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the disclosure disclose a text classification display method and apparatus, an electronic device, and a computer readable medium. One embodiment of the method comprises: acquiring a text to be classified; inputting the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a set of sub-texts to be classified; inputting the set of sub-texts to be classified into a vector generation layer to obtain a set of sub-text vectors to be classified; inputting the set of sub-text vectors to be classified into a vector fusion layer to obtain a text vector to be classified; inputting the text vector to be classified into a classification output layer to obtain a text category label; and displaying the text to be classified in the text display area, included in the text display window, that corresponds to the text category label. This embodiment improves the accuracy of text classification.

Description

Text classification display method and device, electronic equipment and computer readable medium
Technical Field
The embodiment of the disclosure relates to the technical field of text classification, in particular to a text classification display method and device, electronic equipment and a computer readable medium.
Background
With the development of the internet, large amounts of text are generated in a variety of scenarios. Before text is used or processed, it often needs to be classified. For example, news text needs to be further classified into sports news, social news, and so on, and then presented by category to users viewing the news. Currently, a BERT model may be employed for text classification.
However, when text classification is performed in the above manner, the following technical problems often arise:
First, for a long text, the BERT model applies truncation and discards the part of the text exceeding a preset text length threshold, so the text category label cannot be generated from the whole text and the classification accuracy is low.
Second, the model to be trained and its loss function are specified in advance; a model and loss function with better observed performance cannot be selected by comparison, which further reduces the classification accuracy of the model.
Third, no soft negative examples are introduced and the text classification model cannot be trained on constructed noise, so the generalization ability of the model is poor and the classification accuracy is low.
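As a minimal illustration of the first problem, a fixed-length model that truncates its input computes the label from a prefix only, so a category cue appearing late in the document is never seen. The 512-character limit and the sample text below are assumptions for the sketch, not the patented method.

```python
# Minimal sketch of the truncation problem: everything past the model's
# input limit (512 characters here, mirroring BERT's usual 512-token cap)
# is simply discarded before classification.
MAX_LEN = 512

text = "x" * 600 + " the decisive sports-news cue appears only here"
truncated = text[:MAX_LEN]

# The cue survives in the full text but is lost after truncation.
assert "decisive" in text
assert "decisive" not in truncated
```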
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a text classification presentation method, apparatus, electronic device and computer readable medium to solve one or more of the technical problems set forth in the background section above.
In a first aspect, some embodiments of the present disclosure provide a text classification display method, including: acquiring a text to be classified, where the text length of the text to be classified is greater than or equal to a preset text length threshold; inputting the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a set of sub-texts to be classified, where the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer; inputting the set of sub-texts to be classified into the vector generation layer to obtain a set of sub-text vectors to be classified; inputting the set of sub-text vectors to be classified into the vector fusion layer to obtain a text vector to be classified; inputting the text vector to be classified into the classification output layer to obtain a text category label; and displaying the text to be classified in the text display area, included in the text display window, that corresponds to the text category label.
In a second aspect, some embodiments of the present disclosure provide a text classification display apparatus, including: an acquisition unit configured to acquire a text to be classified, where the text length of the text to be classified is greater than or equal to a preset text length threshold; a first input unit configured to input the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a set of sub-texts to be classified, where the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer; a second input unit configured to input the set of sub-texts to be classified into the vector generation layer to obtain a set of sub-text vectors to be classified; a third input unit configured to input the set of sub-text vectors to be classified into the vector fusion layer to obtain a text vector to be classified; a fourth input unit configured to input the text vector to be classified into the classification output layer to obtain a text category label; and a display unit configured to display the text to be classified in the text display area, included in the text display window, that corresponds to the text category label.
In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect.
In a fifth aspect, some embodiments of the present disclosure provide a computer program product comprising a computer program that, when executed by a processor, implements the method described in any of the implementations of the first aspect above.
The above embodiments of the present disclosure have the following advantages: the text classification display method of some embodiments of the disclosure improves the accuracy of text classification. Specifically, classification accuracy is low because, for a long text, the BERT model applies truncation, discards the part of the text exceeding a preset text length threshold, and cannot generate a text category label from the whole text. Based on this, in the text classification display method of some embodiments of the present disclosure, first, a text to be classified is obtained, namely a yet-unclassified text whose length is greater than or equal to a preset text length threshold. Then, the text to be classified is input into the dynamic programming segmentation layer included in a pre-trained text classification model to obtain a set of sub-texts to be classified, where the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer. In this way, no part of the text exceeding the preset text length threshold is discarded. Next, the set of sub-texts to be classified is input into the vector generation layer, yielding a vector representation of each sub-text to be classified. The set of sub-text vectors is then input into the vector fusion layer, which fuses the sub-text vectors into a single text vector to be classified.
The text vector to be classified is then input into the classification output layer to obtain a text category label characterizing the category of the text to be classified. Finally, the text to be classified is displayed in the text display area, included in the text display window, that corresponds to the text category label. Because no part of the text exceeding the preset text length threshold is discarded, the text category label is generated from the whole text to be classified, which improves the accuracy of classifying the text.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow diagram of some embodiments of a text classification presentation method according to the present disclosure;
FIG. 2 is an architectural diagram of a text classification model of a text classification presentation method according to the present disclosure;
FIG. 3 is a schematic structural diagram of some embodiments of a text classification presentation apparatus according to the present disclosure;
FIG. 4 is a schematic block diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the relevant invention are shown in the drawings. The embodiments and the features of the embodiments in the present disclosure may be combined with each other in the absence of conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting; those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates a flow 100 of some embodiments of a text classification presentation method according to the present disclosure. The text classification display method comprises the following steps:
step 101, obtaining a text to be classified.
In some embodiments, an executing subject (e.g., a computing device) of the text classification display method may obtain the text to be classified, through a wired or wireless connection, from a terminal storing it. The text length of the text to be classified is greater than or equal to a preset text length threshold; that is, the text to be classified is a yet-unclassified text whose length is at least the threshold. For example, the text to be classified may be news text. The text length may be the number of characters included in the text to be classified, and the preset text length threshold may be, for example, 512 characters. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connections now known or developed in the future. In this way, a text to be classified whose length is greater than or equal to the preset text length threshold can be obtained.
The computing device may be hardware or software. When it is hardware, it may be implemented as a distributed cluster of multiple servers or terminal devices, or as a single server or terminal device. When it is software, it may be installed in the hardware devices enumerated above and implemented, for example, as multiple pieces of software or software modules providing distributed services, or as a single piece of software or software module; no specific limitation is made here. It should be understood that there may be any number of computing devices, as required by the implementation.
Step 102, inputting the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a set of sub-texts to be classified.
In some embodiments, the executing entity may input the text to be classified into the dynamic programming segmentation layer included in the pre-trained text classification model to obtain a set of sub-texts to be classified. As shown in fig. 2, the text classification model 200 includes the dynamic programming segmentation layer 201, a vector generation layer 202, a vector fusion layer 203, and a classification output layer 204. The pre-trained text classification model may be a pre-trained neural network model, for example a BERT model or a SimBERT model. The dynamic programming segmentation layer 201 segments the input text 2011 to be classified according to a dynamic programming algorithm, yielding the set 2012 of sub-texts to be classified. The vector generation layer 202 converts each input sub-text to be classified into a sub-text vector, yielding the set of sub-text vectors to be classified; it may include a pre-trained vector generation model 2021. The vector fusion layer 203 fuses the sub-text vectors in the input set into a single text vector to be classified; it may include a graph neural network model, which, as shown in fig. 2, may be a Graph Attention Network (GAT) model 2031. The classification output layer 204 feeds the input text vector to be classified into its classifier 2041 to obtain and output a text category label. In this way, no part of the text to be classified exceeding the preset text length threshold is discarded.
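The four-layer flow just described can be sketched end to end as follows. Every function body here is a trivial stand-in (a sentence-packing splitter, a toy encoder, mean fusion, and a dummy two-label classifier) used only to show how the layers compose; none of it is the patented implementation.

```python
# Hypothetical sketch of the four-layer text classification pipeline.
# All function names, the 256-character cap, and the label set are
# illustrative assumptions.

def dynamic_segmentation_layer(text, max_len=256):
    # Pack whole sentences into sub-texts of at most max_len characters.
    # (Single sentences longer than max_len are kept whole here; the real
    # layer would segment more carefully via dynamic programming.)
    sentences = [s for s in text.split(".") if s.strip()]
    subs, cur = [], ""
    for s in sentences:
        if cur and len(cur) + len(s) + 1 > max_len:
            subs.append(cur)
            cur = ""
        cur += s + "."
    if cur:
        subs.append(cur)
    return subs

def vector_generation_layer(sub_texts):
    # Stand-in for a BERT/SimBERT encoder: one fixed-size vector per sub-text.
    return [[float(len(s)), float(sum(map(ord, s)) % 97)] for s in sub_texts]

def vector_fusion_layer(sub_vectors):
    # Stand-in for the GAT fusion: a simple elementwise mean of sub-text vectors.
    n = len(sub_vectors)
    return [sum(v[i] for v in sub_vectors) / n for i in range(len(sub_vectors[0]))]

def classification_output_layer(text_vector, labels=("sports news", "social news")):
    # Stand-in classifier: derive a label index from the fused vector.
    return labels[int(text_vector[0]) % len(labels)]

def classify(text):
    subs = dynamic_segmentation_layer(text)
    vecs = vector_generation_layer(subs)
    fused = vector_fusion_layer(vecs)
    return classification_output_layer(fused)
```

The point of the sketch is the data flow: the whole text reaches the classifier through the fused vector, rather than being truncated at the encoder's input limit.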
Optionally, the text classification model may be obtained by training through the following steps:
First, an initial sample set is obtained. Each initial sample in the initial sample set comprises a sample text and a sample text category label corresponding to the sample text, where the sample text length is greater than or equal to the preset text length threshold. The sample text may be a text used for model training; for example, it may be news text. The sample text category label may be a label that characterizes the category of the sample text; its specific value is not limited here. For example, the sample text category label may be "sports news".
And secondly, extracting at least one initial sample from the initial sample set to serve as a sample set.
Third, for each sample comprised by the set of samples, the following vector generation step may be performed:
In a first sub-step, the sample text included in the sample is input into the dynamic programming segmentation layer, which performs dynamic programming segmentation on the sample text to obtain a sample sub-text set. In practice, the dynamic programming segmentation layer segments the input sample text according to a dynamic programming algorithm to obtain the sample sub-text set.
In a second sub-step, for each sample sub-text in the sample sub-text set, the sample sub-text is input into the pre-trained vector generation model included in the vector generation layer to obtain a sample sub-text vector, thereby generating a sample sub-text vector set corresponding to the sample sub-text set. The pre-trained vector generation model may be a BERT model or a SimBERT model.
In a third sub-step, the sample sub-text vector set may be input to the graph neural network model included in the vector fusion layer to obtain a sample text vector. The graph neural network model may be a GAT (Graph Attention Network) model.
Optionally, the graph neural network model may include a linear transformation layer, an attention layer, a combined splicing layer, and a dimensionality reduction layer.
In some optional implementations of some embodiments, first, the execution body may input the sample sub-text vector set to the linear transformation layer to obtain a sample sub-text transformation vector set. The linear transformation layer performs a linear transformation on each sample sub-text vector in the input set according to a target linear transformation matrix, yielding the sample sub-text transformation vector set. The target linear transformation matrix is a matrix whose elements are network parameters that are adjustable when the text classification model is trained. Then, the sample sub-text transformation vector set may be input into the attention layer to obtain an attention coefficient set, where the attention layer determines the attention coefficient of every two sample sub-text transformation vectors. In practice, the attention coefficient of every two sample sub-text transformation vectors may be determined by:

\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{N} \exp(e_{ik})}

where \alpha_{ij} denotes the attention coefficient of the i-th and j-th sample sub-text transformation vectors, e_{ij} denotes the relevance of the i-th and j-th sample sub-text transformation vectors, e_{ik} denotes the relevance of the i-th and k-th sample sub-text transformation vectors, and N denotes the number of sample sub-text transformation vectors included in the set. The relevance e_{ij} may be determined by the following formula:
e_{ij} = \mathrm{LeakyReLU}\left(a^{\top}\left[W h_i \,\Vert\, W h_j\right]\right)

where h_i denotes the i-th sample sub-text vector, h_j denotes the j-th sample sub-text vector, W denotes the target linear transformation matrix, \Vert denotes vector concatenation, and a denotes a weight vector, each element of which is a network parameter adjustable in the text classification model.
Secondly, the attention coefficient set and the sample sub-text transformation vector set may be input to the combined splicing layer to obtain a to-be-dimension-reduced sample text vector. The combined splicing layer activates each input sample sub-text transformation vector according to the input attention coefficients to obtain sample sub-text activation vectors, and splices the resulting activation vectors into one vector, the to-be-dimension-reduced sample text vector. In practice, each sample sub-text activation vector may be obtained by:

h'_i = \sigma\left(\sum_{j=1}^{N} \alpha_{ij} W h_j\right)

where h'_i denotes the i-th sample sub-text activation vector and \sigma denotes a nonlinear activation function.
Finally, the to-be-dimension-reduced sample text vector is input to the dimensionality reduction layer to obtain a sample text vector. The dimensionality reduction layer reduces the dimension of the input to-be-dimension-reduced sample text vector; it may be a linear layer. In practice, the dimensionality reduction layer may reduce the input vector to 768 dimensions to obtain the sample text vector. Thus, a vector representation of the entire sample text can be obtained.
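Put together, the optional implementation above (linear transformation, attention coefficients, activation, splicing, dimensionality reduction) can be sketched in plain Python. All sizes, the tanh nonlinearity, and the random parameters are illustrative assumptions, not the patented values.

```python
import math
import random

# Pure-Python sketch of the fusion steps described above.
random.seed(0)

def matvec(M, v):
    # Multiply matrix M (list of rows) by vector v.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

D = 4                                                            # sub-text vector dimension
H = [[random.gauss(0, 1) for _ in range(D)] for _ in range(3)]   # 3 sample sub-text vectors
W = [[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]   # target linear transformation matrix
a = [random.gauss(0, 1) for _ in range(2 * D)]                   # attention weight vector

Z = [matvec(W, h) for h in H]      # sample sub-text transformation vectors

# e[i][j]: relevance of transformation vectors i and j
e = [[leaky_relu(sum(w * x for w, x in zip(a, Z[i] + Z[j]))) for j in range(len(Z))]
     for i in range(len(Z))]

# alpha[i][j]: softmax-normalised attention coefficients (each row sums to 1)
alpha = [[math.exp(eij) / sum(math.exp(eik) for eik in row) for eij in row] for row in e]

# activation vectors: attention-weighted sums passed through a nonlinearity
act = [[math.tanh(sum(alpha[i][j] * Z[j][d] for j in range(len(Z)))) for d in range(D)]
       for i in range(len(Z))]

# splice all activation vectors into the to-be-dimension-reduced sample text vector
to_reduce = [x for vec in act for x in vec]

# linear dimensionality-reduction layer (down to 2 dimensions here, not 768)
W_out = [[random.gauss(0, 1) for _ in range(len(to_reduce))] for _ in range(2)]
sample_text_vector = matvec(W_out, to_reduce)
```

Each row of `alpha` is a probability distribution over the other sub-texts, which is what lets the fused vector weight every sub-text of the long document instead of truncating.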
Fourthly, model training is performed on the text classification model to be trained based on the sample set and the obtained sample text vectors, and the trained text classification model is taken as the text classification model. In practice, based on the sample set and the obtained sample text vectors, the text classification model to be trained can be trained in various ways. Thus, training of the text classification model can be completed.
Optionally, the performing model training on the text classification model to be trained includes:
firstly, inputting each obtained sample text vector to a target neural network model included in the classification output layer to obtain a prediction sample text category label corresponding to each sample in a sample set. The target neural network model may be a preset neural network model. For example, the target neural network model may be a BERT model or a SimBERT model.
And secondly, comparing the predicted sample text type label corresponding to each sample in the sample set with the sample text type label included in the sample.
And thirdly, determining whether the text classification model reaches a preset optimization target according to the comparison result. The optimization target may refer to that the accuracy of the predicted sample text category label generated by the target neural network is greater than a preset accuracy threshold.
And fourthly, in response to the fact that the text classification model reaches the optimization target, taking the text classification model reaching the optimization target as a trained text classification model.
Fifthly, in response to determining that the text classification model does not reach the optimization target, the network parameters of the text classification model are adjusted, a new sample set is formed from the initial samples not yet extracted from the initial sample set, the adjusted text classification model is taken as the text classification model to be trained, and the vector generation step and the model training are executed again. As an example, a back propagation algorithm (BP algorithm) and a gradient descent method (e.g., a stochastic mini-batch gradient descent algorithm) may be used to adjust the network parameters of the text classification model. Thus, training of the text classification model can be completed.
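The five steps above form a predict / compare / adjust loop that stops once the preset optimization target is reached. A hedged sketch, with a toy one-parameter threshold classifier standing in for the patented model; the data, accuracy target, and adjustment step are all illustrative assumptions.

```python
import random

# Toy train-until-accuracy-target loop mirroring the steps above.
random.seed(1)
samples = [(random.uniform(-1, 1),) for _ in range(200)]
labels = [1 if x[0] > 0.25 else 0 for x in samples]   # stand-in category labels

threshold = -1.0            # the single adjustable "network parameter"
accuracy_target = 0.95      # preset optimization target

for step in range(1000):
    # Step 1-2: generate predicted labels and compare them with the true labels.
    predictions = [1 if x[0] > threshold else 0 for x in samples]
    accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
    # Step 3-4: if the optimization target is reached, training is done.
    if accuracy > accuracy_target:
        break
    # Step 5: otherwise adjust the parameter and iterate again
    # (a crude stand-in for gradient-descent parameter adjustment).
    threshold += 0.01
```

A real implementation would compute gradients of a loss and update many parameters per step; the control flow, though, is the same early-stopping loop.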
In some optional implementations of some embodiments, first, the dynamic programming segmentation layer may split the sample text included in the sample at the punctuation marks it contains, and take the resulting sample sub-segments as a sample sub-segment set, where the sample sub-segments are arranged in their order of appearance in the sample text. In practice, the execution subject may divide the sample text into a number of sample sub-segments at the commas, periods, exclamation marks, and question marks included in the sample text. Next, a candidate sample sub-text set can be generated from the sample sub-segment set. In practice, candidate sample sub-texts can be generated from the sample sub-segment set using fast and slow pointers, yielding a candidate sample sub-text set. Each candidate sample sub-text in the set comprises at least one sample sub-segment, and the text length of each candidate sample sub-text is smaller than a preset sub-text length threshold, i.e., a preset upper limit on the text length of a sub-text. For example, the preset sub-text length threshold may be half of the preset text length threshold. Two adjacent candidate sample sub-texts may include the same sample sub-segments. The candidate sample sub-texts in the set are arranged according to the order of their sample sub-segments.
As an example, the sample sub-segments in the sample sub-segment set may be: Sent1, Sent2, Sent3, Sent4, Sent5, and Sent6. The candidate sample sub-texts in the candidate sample sub-text set may be: Para1, Para2, Para3, and Para4, where Para1 includes Sent1, Sent2, and Sent3; Para2 includes Sent2, Sent3, and Sent4; Para3 includes Sent3, Sent4, and Sent5; and Para4 includes Sent4, Sent5, and Sent6.
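One possible reading of the fast/slow-pointer candidate generation is a sliding window over the ordered sample sub-segments, producing overlapping candidates. The window size of 3 reproduces the Sent1..Sent6 to Para1..Para4 example; the actual layer bounds candidates by text length rather than by a fixed segment count.

```python
# Sliding-window generation of overlapping candidate sub-texts (assumed
# interpretation of the fast/slow-pointer step, for illustration only).

def generate_candidates(segments, window=3):
    return [segments[i:i + window] for i in range(len(segments) - window + 1)]

segments = ["Sent1", "Sent2", "Sent3", "Sent4", "Sent5", "Sent6"]
candidates = generate_candidates(segments)
```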
Then, a candidate-node directed acyclic graph can be constructed from the candidate sample sub-text set. The candidate-node directed acyclic graph comprises at least one candidate node and one virtual node, and each candidate node corresponds to a candidate sample sub-text. In practice, in the first step, a directed edge may be established between every two candidate nodes whose corresponding candidate sample sub-texts include the same sample sub-segment. In the second step, the square of the length of the sample sub-segment shared by the two candidate sample sub-texts may be used as the weight of the directed edge. In the third step, a virtual node is added after the last candidate node, and a directed edge is established between the last candidate node and the virtual node, yielding a candidate-node directed acyclic graph including each directed edge. The last candidate node is the candidate node corresponding to the last candidate sample sub-text in the candidate sample sub-text set.
As an example, the above-described directed acyclic graph may include a candidate node Para1, a candidate node Para2, a candidate node Para3, a candidate node Para4, and a virtual node Para r. Wherein, a directed edge exists between the candidate node Para1 and the candidate node Para 2. There is a directed edge between the candidate node Para1 and the candidate node Para 3. There is a directed edge between the candidate node Para2 and the candidate node Para 3. There is a directed edge between the candidate node Para2 and the candidate node Para 4. There is a directed edge between the candidate node Para3 and the candidate node Para 4. There is a directed edge between the candidate node Para4 and the virtual node Para r.
Then, at least one candidate path can be generated according to the candidate node directed acyclic graph. In practice, starting from the last candidate node, the path from the first candidate node to the virtual node may be solved recursively backwards toward the first candidate node. The first candidate node may be the candidate node corresponding to the first candidate sample sub-text in the candidate sample sub-text set. Each path includes at least one directed edge. Then, a candidate path satisfying a preset path condition may be selected from the at least one candidate path as a target path. The preset path condition may be that the sum of the weights of the directed edges included in the candidate path is minimal. Next, each candidate node corresponding to the target path may be determined as a target node set. Finally, the candidate sample sub-text corresponding to each target node in the target node set is determined as a sample sub-text, obtaining a sample sub-text set. In this way, the degree of repetition among the sample sub-texts can be balanced as much as possible.
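The graph construction and minimum-weight path selection described above can be sketched in Python. The function and variable names are illustrative, and reading the edge weight as the squared number of shared sentences is an assumption about "the square of the length of the same sample sub-segment":

```python
def build_dag(paras):
    """paras: dict mapping paragraph name -> list of sentences, in
    document order. A directed edge links an earlier paragraph to a
    later one that shares sentences with it; the weight is the squared
    number of shared sentences (an assumed reading of 'square of the
    length of the same sample sub-segment')."""
    names = list(paras)
    edges = {n: [] for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = set(paras[a]) & set(paras[b])
            if shared:
                edges[a].append((b, len(shared) ** 2))
    edges[names[-1]].append(("VIRTUAL", 0))  # virtual terminal node
    edges["VIRTUAL"] = []
    return names, edges

def min_weight_path(names, edges):
    """Cheapest path from the first paragraph to the virtual node,
    found by dynamic programming over the document order."""
    order = names + ["VIRTUAL"]
    best = {n: (float("inf"), None) for n in order}  # (cost, predecessor)
    best[order[0]] = (0, None)
    for n in order:
        cost, _ = best[n]
        if cost == float("inf"):
            continue
        for nxt, w in edges[n]:
            if cost + w < best[nxt][0]:
                best[nxt] = (cost + w, n)
    path, node = [], "VIRTUAL"               # backtrack from the virtual node
    while node is not None:
        path.append(node)
        node = best[node][1]
    return [n for n in reversed(path) if n != "VIRTUAL"]

paras = {
    "Para1": ["Sent1", "Sent2", "Sent3"],
    "Para2": ["Sent2", "Sent3", "Sent4"],
    "Para3": ["Sent3", "Sent4", "Sent5"],
    "Para4": ["Sent4", "Sent5", "Sent6"],
}
print(min_weight_path(*build_dag(paras)))  # ['Para1', 'Para2', 'Para4']
```

On the Para1-Para4 example, the edge set matches the one listed in the text (no edge between Para1 and Para4, since they share no sentence), and the recovered path keeps the total squared overlap minimal.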
Optionally, the loss function of the target neural network model is a target loss function.
Optionally, the target neural network model and the target loss function may be determined by the following steps:
Firstly, a format text set is obtained. The text format of each format text in the format text set is a preset text format. For example, the preset text format may be: "Sentence1 || Sentence2 || Score", where Sentence1 represents the first piece of text included in the format text, Sentence2 represents the second piece of text included in the format text, and Score represents the similarity score of the first piece of text and the second piece of text. A format text in the format text set may then be: "Three people are playing chess. || The two of us are playing chess. || 2".
And secondly, according to the format text set, the following label construction steps can be executed:
And in the first substep, the format text set is divided according to a preset batch size to obtain a format text group set. The preset batch size may be a preset number of format texts per group.
And in the second substep, expansion processing is performed on each format text group included in the format text group set so as to update the format text group set. In practice, for each format text group, each format text in the group may be copied so that every format text appears twice.
And a third substep, inputting the format text to the vector generation model for the format text included in each format text group in the updated format text group set to obtain a format text vector so as to generate a format text vector group set corresponding to the format text group set.
And a fourth substep of constructing a format text positive label for each format text vector group in the format text vector group set to obtain a format text positive label set corresponding to the format text vector group set. In practice, for each format text vector included in each format text vector group, the sequence number of the positive sample of that format text vector is used as its label, thereby obtaining the format text positive label of each format text vector group. The positive sample of a format text vector is the format text vector obtained by copying it. As an example, the preset batch size may be 32, so the format text vector group includes 64 format text vectors, with sequence numbers 0, 1, 2, ..., 63. The format text positive label may then be [1, 0, 3, 2, ..., 63, 62].
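The duplication of each format text and the construction of the positive labels described in the substeps above can be sketched as follows (function names are illustrative; placing the two copies adjacently is assumed, which is consistent with the example label [1, 0, 3, 2, ...]):

```python
def duplicate_batch(texts):
    """Expansion processing: each text appears twice, copies adjacent."""
    out = []
    for t in texts:
        out.extend([t, t])
    return out

def positive_labels(n_pairs):
    """The label of each vector is the sequence number of its copy:
    for n_pairs = 32 this yields [1, 0, 3, 2, ..., 63, 62]."""
    labels = []
    for i in range(n_pairs):
        labels.extend([2 * i + 1, 2 * i])
    return labels

print(duplicate_batch(["text_a", "text_b"]))  # ['text_a', 'text_a', 'text_b', 'text_b']
print(positive_labels(2))                     # [1, 0, 3, 2]
```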
Thirdly, based on the format text set and the format text positive label set, the following target neural network model determining steps can be executed:
The first substep: obtain an initial neural network model set and a preset loss function set. An initial neural network model in the initial neural network model set may be a neural network model used for determining the target neural network model. As an example, the initial neural network model set may include an initial neural network model that is a BERT model and an initial neural network model that is a SimBERT model. As an example, the preset loss function set may include a first preset loss function, a second preset loss function, and a third preset loss function. The first preset loss function is a cross entropy loss function. The second preset loss function is a combination function of a cross entropy loss function (CE loss) and a contrast loss function (Info loss). The third preset loss function is a combination function of a cross entropy loss function, a contrast loss function, and a soft negative example loss function (Negative loss). A combination function is the function formed by summing the functions it includes.
And a second substep, for each initial neural network model in the initial neural network model set, determining each preset loss function in the preset loss function set as the loss function of the initial neural network model to construct each alternative neural network model corresponding to the initial neural network model, so as to obtain an alternative neural network model set corresponding to the initial neural network model set. As an example, the set of alternative neural network models may include 6 alternative neural network models: the loss function is a BERT model of the first preset loss function, a BERT model of the second preset loss function, a BERT model of the third preset loss function, a SimBERT model of the first preset loss function, a SimBERT model of the second preset loss function, and a SimBERT model of the third preset loss function.
And a third substep of performing training with gradient backpropagation on each alternative neural network model in the alternative neural network model set according to the format text set and the format text positive label set to obtain the trained neural network models.
And a fourth substep of determining the index accuracy of each obtained trained neural network model on each model test data set according to at least one model test data set and a preset index. A model test data set in the at least one model test data set may be a data set used for testing each of the obtained trained neural network models. As an example, the at least one model test data set may include, but is not limited to: the STS-B (Semantic Textual Similarity Benchmark) dataset, the LCQMC (Large-scale Chinese Question Matching Corpus) dataset, the BQ (Bank Question) dataset, the ATEC semantic similarity dataset, and the PAWS-X (Paraphrase Adversaries from Word Scrambling, cross-lingual) dataset. The preset index may be the Spearman correlation coefficient, and the index accuracy may be the accuracy measured by the Spearman correlation coefficient. As an example, the index accuracy of each trained neural network model on each model test data set can be as shown in the following table:
[Table: index accuracy of each trained neural network model on each model test data set]
and a fifth substep of comparing the accuracy of each index of each trained neural network model to obtain an accuracy comparison result. In practice, the accuracy of each index of each trained neural network model can be compared in various ways to obtain an accuracy comparison result. The specific form of the accuracy comparison result is not limited. For example, for each model test data set, the index accuracy results of each trained neural network model for the model test data set may be ranked, and the ranked sum of each trained neural network model for each model test data set may be used as the accuracy comparison result. For another example, the accuracy comparison result may be obtained by comparing whether the accuracy of the index reaches a preset index threshold.
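The rank-sum comparison given as the first example can be sketched as follows (the model names and scores are hypothetical):

```python
def rank_sum_comparison(accuracies):
    """accuracies: dict mapping model name -> list of per-dataset index
    accuracies. For each data set, models are ranked by accuracy
    (rank 0 = most accurate); the comparison result is each model's
    summed rank across data sets, lower meaning better overall."""
    models = list(accuracies)
    n_datasets = len(next(iter(accuracies.values())))
    sums = {m: 0 for m in models}
    for d in range(n_datasets):
        ordered = sorted(models, key=lambda m: -accuracies[m][d])
        for rank, m in enumerate(ordered):
            sums[m] += rank
    return sums

scores = {
    "BERT + CE loss": [0.70, 0.68],
    "SimBERT + CE + Info + Negative loss": [0.78, 0.74],
}
print(rank_sum_comparison(scores))
```

The model with the smallest rank sum wins the comparison; in the hypothetical scores above, the SimBERT variant ranks first on both data sets and so gets rank sum 0.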
And a sixth substep of determining a target neural network model and a target loss function according to the initial neural network model set, the preset loss function set and the accuracy comparison result. In practice, the target neural network model and the target loss function may be determined in various ways according to the initial neural network model set, the preset loss function set, and the accuracy comparison result. For example, the initial neural network model and the preset loss function corresponding to the trained neural network model ranked as the first two in the above accuracy comparison results may be determined as the target neural network model and the target loss function, respectively. For another example, the initial neural network model and the preset loss function corresponding to the trained neural network model with the index accuracy reaching the preset index threshold may be determined as the target neural network model and the target loss function, respectively.
The above content serves as an invention point of the embodiments of the present disclosure, and solves the second technical problem mentioned in the background art: the model to be trained and its loss function are specified in advance, so the model and loss function with better performance cannot be selected by comparison, which further reduces the classification accuracy of the model for text. If this factor is addressed, the classification accuracy of the model for text can be improved. To achieve this effect, the target neural network model and the target loss function are determined from the initial neural network model set and the preset loss function set according to the format text set and the format text positive label set, so that a target neural network model with higher text classification accuracy can be obtained and used as the classification output layer of the text classification model, improving the text classification accuracy of the model.
Optionally, the target loss function is a combination function. The target loss function may include a cross entropy loss function, a contrast loss function, and a soft negative example loss function.
Optionally, the performing model training on the text classification model to be trained includes:
and step one, dividing each obtained sample text vector according to the preset batch size to obtain a sample text vector group set.
Secondly, for each sample text vector group in the sample text vector group set, the following specific training steps of the text classification model can be executed:
The first substep: expand the sample text vector group in order to update it. In practice, each sample text vector in the sample text vector group may be copied to update the sample text vector group.
And a second substep of generating a sample text vector group label according to the updated sample text vector group. In practice, the specific step of generating the sample text vector group tag according to the updated sample text vector group may refer to the specific step of constructing the positive tag of the formatted text for each formatted text vector group, which is not described again here.
And a third substep of determining the similarity of every two sample text vectors in the sample text vector group to obtain a sample text vector similarity set. The specific method for determining the similarity is not limited here. For example, the similarity of every two sample text vectors can be determined by the Pearson correlation coefficient.
And a fourth substep of generating a sample text similarity matrix according to the sample text vector similarity set. In practice, the sample text similarity matrix may be constructed according to the order of the sample text vectors in the sample text vector group, so that each element of the matrix is the similarity between the two sample text vectors indexed by that element's row number and column number. As an example, the element in the first row and first column of the sample text similarity matrix is the similarity between the first sample text vector and itself, and the element in the first row and second column is the similarity between the first sample text vector and the second sample text vector.
And a fifth substep of adjusting the values of the diagonal elements of the sample text similarity matrix to a preset diagonal element value. For example, the preset diagonal element value may be 10 to the power of -12.
And a sixth substep of dividing the adjusted sample text similarity matrix by a preset factor coefficient to obtain a target sample text similarity matrix. The preset factor coefficient may be a preset coefficient. For example, the preset factor coefficient may be 0.05.
And a seventh substep of generating a contrast loss value according to the target sample text similarity matrix, the sample text vector group label and the contrast loss function. In practice, a cross entropy loss function may be applied to the target sample text similarity matrix and the sample text vector group label to generate the contrast loss value.
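Taken together, the fourth to seventh substeps amount to an InfoNCE-style contrastive loss: build the pairwise cosine-similarity matrix, mask the diagonal, divide by a temperature, and apply a row-wise cross entropy against the positive-label indices. A minimal NumPy sketch, with illustrative names, the diagonal value 1e-12, and the factor coefficient 0.05 from the text:

```python
import numpy as np

def contrastive_loss(vectors, labels, tau=0.05, diag_val=1e-12):
    """vectors: (2N, d) array of duplicated sample text vectors.
    labels[i] is the row index of vector i's positive (its duplicate).
    Builds the cosine-similarity matrix, replaces the diagonal with
    diag_val, divides by the factor coefficient tau, then takes a
    row-wise softmax cross entropy against the positive indices."""
    norm = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = norm @ norm.T                           # cosine similarity matrix
    np.fill_diagonal(sim, diag_val)               # mask self-similarity
    logits = sim / tau                            # temperature scaling
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    rows = np.arange(len(labels))
    return float(-np.mean(np.log(probs[rows, labels])))

# Two texts duplicated: rows 0/1 and rows 2/3 are copies of each other.
v = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
print(contrastive_loss(v, [1, 0, 3, 2]))
```

When the duplicated vectors coincide and the remaining pairs are orthogonal, each row's positive dominates after temperature scaling, so the loss is near zero.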
An eighth substep of generating at least one gaussian noise vector. In practice, at least one 256-dimensional gaussian noise vector may be randomly initialized to obtain at least one gaussian noise vector.
A ninth substep, for each sample text vector in the set of sample text vectors, determining a sum of the sample text vector and any gaussian noise vector in the at least one gaussian noise vector as a negative example sample text vector.
A tenth substep of generating a negative interval loss value based on the sample text vector group, the determined negative example sample text vectors, and the soft negative example loss function. The soft negative example loss function is shown as follows:

L_neg = α · max(0, Δ + β)

where L_neg represents the negative interval loss value and Δ represents the positive-negative example difference. α and β are both preset parameter terms; for example, α may be 1 and β may be 0.3. The positive-negative example difference Δ can be generated by the following formula:

Δ = cos(x_i, x_i^-) − cos(x_i, x_i^+)

where x_i denotes the i-th sample text vector, x_i^- denotes the negative example text vector corresponding to the i-th sample text vector, and x_i^+ denotes the positive sample vector corresponding to the i-th sample text vector. The positive sample vector is obtained by copying the sample text vector. cos(·, ·) denotes cosine similarity.
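A minimal sketch of the soft negative example loss for a single sample text vector, assuming the hinge reading max(0, Δ + β) of the formula above, with a copied positive and a Gaussian-noise negative as described (all names are illustrative):

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def soft_negative_loss(x, alpha=1.0, beta=0.3):
    """Hinged soft negative example loss for one sample text vector x.
    The positive example is a copy of x; the negative example is x plus
    a Gaussian noise vector. The max(0, ...) hinge form is an assumed
    reading of the formula in the text."""
    noise = [random.gauss(0.0, 1.0) for _ in x]   # Gaussian noise vector
    x_neg = [xi + ni for xi, ni in zip(x, noise)]  # negative example
    x_pos = list(x)                                # copy of x, cosine = 1
    delta = cosine(x, x_neg) - cosine(x, x_pos)
    return alpha * max(0.0, delta + beta)

random.seed(0)
print(soft_negative_loss([1.0] * 256))
```

Since the positive is an exact copy (cosine similarity 1), Δ is never positive, so with β = 0.3 the loss always falls in [0, 0.3]; the closer the noisy negative stays to x, the larger the penalty.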
And an eleventh substep of generating an initial loss value according to the sample text vector group and the cross entropy loss function. The cross entropy loss function is shown as follows:

L_ce = −(1/N) · Σ_{i=1}^{N} Σ_{j=1}^{M} y_ij · log(p_ij)

where L_ce represents the resulting loss value, N represents the preset batch size, and M represents the number of sample text category labels of different categories. y_ij characterizes whether the category of the i-th sample text vector is the same as the category of the j-th sample text category label: y_ij is 1 when the category of the i-th sample text vector is the same as the j-th sample text category label, and −1 when it is not. p_ij characterizes the probability that the category of the i-th sample text vector is the category represented by the j-th sample text category label.
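A minimal sketch of the cross entropy loss as described above, including the text's convention that y_ij is −1 (rather than the usual 0) for a category mismatch:

```python
import math

def cross_entropy_loss(y, p):
    """y[i][j] is 1 if sample i has the category of label j, else -1
    (the convention stated in the text); p[i][j] is the predicted
    probability of that category for sample i.
    Returns -(1/N) * sum_i sum_j y_ij * log(p_ij)."""
    total = 0.0
    for yi, pi in zip(y, p):
        for yij, pij in zip(yi, pi):
            total += yij * math.log(pij)
    return -total / len(y)

# One sample, two categories: the true category gets probability 0.9.
print(cross_entropy_loss([[1, -1]], [[0.9, 0.1]]))  # ≈ -2.1972
```

Note that with y_ij in {1, −1}, a confident wrong-category probability adds a large negative log term with sign flipped, so the value can be negative, unlike the standard formulation with y_ij in {1, 0}.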
A twelfth substep of determining the sum of the contrast loss value, the negative interval loss value and the initial loss value as a target loss value.
And a thirteenth substep of, in response to the target loss value being greater than or equal to the preset loss value, taking the text classification model as the trained text classification model. The preset loss value may be a preset upper limit on the loss value.
And a fourteenth substep, in response to the target loss value being smaller than the preset loss value, adjusting network parameters of a text classification model, forming a sample set by using unused samples in the at least one sample, using the adjusted text classification model as a text classification model to be trained, and executing the vector generation step and model training on the text classification model to be trained again.
The above content serves as an invention point of the embodiments of the present disclosure, and solves the technical problem mentioned in the background art: without introducing soft negative examples, a text classification model cannot be trained on the basis of constructed noise, resulting in poor generalization capability of the model and low text classification accuracy. If this factor is addressed, the generalization capability of the text classification model and the accuracy of text classification can be improved. To achieve this effect, soft negative examples are introduced through the negative example sample text vectors, so that the text classification model can be trained on the basis of constructed noise, thereby improving the generalization capability of the text classification model and the accuracy of text classification.
Step 103, inputting the sub-text set to be classified into the vector generation layer to obtain the sub-text vector set to be classified.
In some embodiments, the executing entity may input the sub-text set to be classified into the vector generation layer to obtain the sub-text vector set to be classified. Thus, a vector representation of each sub-text to be classified can be obtained.
And 104, inputting the vector set of the sub-texts to be classified into a vector fusion layer to obtain the text vectors to be classified.
In some embodiments, the executing entity may input the set of sub-text vectors to be classified into the vector fusion layer to obtain a text vector to be classified. Therefore, the sub-text vectors to be classified can be merged into the text vector to be classified.
And 105, inputting the text vector to be classified into a classification output layer to obtain a text class label.
In some embodiments, the execution subject may input the text vector to be classified to the classification output layer to obtain a text category tag. In practice, the executing body may input the text vector to be classified to the classifier included in the classification output layer to obtain the text category label. Therefore, the classification of the texts to be classified can be realized, and the text category labels representing the text categories to be classified are obtained.
And 106, displaying the text to be classified in a text display area corresponding to the text category label and included in the text display window.
In some embodiments, the execution subject may display the text to be classified in a text display area corresponding to the text category tag included in a text display window. The text display window may be a window for displaying text. The text presentation window may include at least one text presentation area. The text type label corresponding to each text display area in the at least one text display area is different. Therefore, the texts to be classified can be displayed in the corresponding text display areas.
Optionally, the text to be classified is a city work order text to be classified. The city work order text to be classified may be text generated in a city service scene. For example, it may be text recording the complaint content submitted by a user to city staff (for example, city management staff), or text recording repair requests for city buildings submitted by the user.
Optionally, first, in response to the text category tag representing a repair category, the executing body may send the city work order text to be classified to an associated serviceman terminal. The associated serviceman terminal may be the terminal of a serviceman responsible for maintenance. Here, the terminal may be one of the following: a smart phone or a tablet computer. In practice, the city work order text to be classified may be sent to the associated serviceman terminal through a wired or wireless connection. Then, in response to the text category tag representing a report category, name extraction processing may be performed on the city work order text to be classified to obtain name information. In practice, name extraction may be performed on the city work order text to be classified according to the MMSEG (Maximum Matching Segmentation) algorithm to obtain name information. Next, according to the name information, employee information satisfying an employee name condition corresponding to the name information may be selected from an employee information set as target employee information, obtaining a target employee information set. The employee name condition may be that the employee information includes a name identical to the name included in the name information. The employee information in the employee information set may include, but is not limited to, an employee name and an employee identifier. Finally, the city work order text to be classified and the target employee information set may be sent to an associated report processing terminal. The report processing terminal may be the terminal of a person responsible for processing report transactions. In practice, the city work order text to be classified and the target employee information set may be sent to the associated report processing terminal through a wired or wireless connection.
Therefore, the speed of processing the reported affairs by the personnel in charge of processing the reported affairs can be increased.
The above embodiments of the present disclosure have the following advantages: by the text classification display method of some embodiments of the disclosure, the classification accuracy of the text is improved. Specifically, the reason why the classification accuracy of the text is low is that: for a text with a long length, the BERT model adopts truncation processing, discards a part of the text exceeding a preset text length threshold, and cannot generate a text category label according to the whole text, so that the classification accuracy of the text is low. Based on this, in the text classification display method according to some embodiments of the present disclosure, first, a text to be classified is obtained. The text to be classified may be a text whose text length is greater than or equal to a preset text length threshold and is to be classified. Therefore, the text to be classified with the text length being greater than or equal to the preset text length threshold can be obtained. And then, inputting the text to be classified into a dynamic planning cutting layer included in a pre-trained text classification model to obtain a sub-text set to be classified. The text classification model comprises the dynamic planning cutting layer, a vector generation layer, a vector fusion layer and a classification output layer. Therefore, the discarding of the texts exceeding the preset text length threshold in the texts to be classified can be avoided. Secondly, inputting the sub-text set to be classified into a vector generation layer to obtain the sub-text vector set to be classified. Thus, a vector representation of each sub-text to be classified can be obtained. And then, inputting the vector set of the sub-texts to be classified into the vector fusion layer to obtain the text vectors to be classified. Therefore, the sub-text vectors to be classified can be merged into the text vector to be classified. 
And then, inputting the text vector to be classified into the classification output layer to obtain a text class label. Therefore, the classification of the texts to be classified can be realized, and the text category labels representing the text categories to be classified are obtained. And finally, displaying the text to be classified in a text display area corresponding to the text category label and included in the text display window. Therefore, the texts to be classified can be displayed in the corresponding text display areas. Due to the fact that the text exceeding the preset text length threshold value in the text to be classified is prevented from being discarded, the text category label can be generated according to the whole text to be classified, and the accuracy rate of classifying the text to be classified is improved.
With continuing reference to fig. 3, as an implementation of the methods illustrated in the above figures, the present disclosure provides some embodiments of a text classification presentation apparatus, which correspond to those of the method embodiments illustrated in fig. 1, and which may be applied in various electronic devices.
As shown in fig. 3, the text classification presentation apparatus 300 of some embodiments includes: an acquisition unit 301, a first input unit 302, a second input unit 303, a third input unit 304, a fourth input unit 305, and a presentation unit 306. The obtaining unit 301 is configured to obtain a text to be classified, where the text length of the text to be classified is greater than or equal to a preset text length threshold; the first input unit 302 is configured to input the text to be classified into a dynamic planning cutting layer included in a pre-trained text classification model to obtain a sub-text set to be classified, wherein the text classification model includes the dynamic planning cutting layer, a vector generation layer, a vector fusion layer and a classification output layer; the second input unit 303 is configured to input the sub-text set to be classified into the vector generation layer to obtain a sub-text vector set to be classified; the third input unit 304 is configured to input the sub-text vector set to be classified into the vector fusion layer to obtain a text vector to be classified; the fourth input unit 305 is configured to input the text vector to be classified into the classification output layer to obtain a text category label; the presentation unit 306 is configured to present the text to be classified in a text presentation area included in the text presentation window and corresponding to the text category label.
It will be understood that the units described in the apparatus 300 correspond to the various steps in the method described with reference to fig. 1. Thus, the operations, features and resulting advantages described above with respect to the method are also applicable to the apparatus 300 and the units included therein, and are not described herein again.
Referring now to FIG. 4, shown is a schematic block diagram of an electronic device (e.g., computing device) 400 suitable for use in implementing some embodiments of the present disclosure. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 4, electronic device 400 may include a processing device (e.g., central processing unit, graphics processor, etc.) 401 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage device 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic apparatus 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other through a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate with other devices, either wirelessly or by wire, to exchange data. While fig. 4 illustrates an electronic device 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be alternatively implemented or provided. Each block shown in fig. 4 may represent one device or may represent multiple devices as desired.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network through the communication device 409, or from the storage device 408, or from the ROM 402. The computer program, when executed by the processing device 401, performs the above-described functions defined in the methods of some embodiments of the present disclosure.
It should be noted that the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and a peer-to-peer network (e.g., an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a text to be classified, wherein the text length of the text to be classified is greater than or equal to a preset text length threshold; input the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a sub-text set to be classified, wherein the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer; input the sub-text set to be classified into the vector generation layer to obtain a sub-text vector set to be classified; input the sub-text vector set to be classified into the vector fusion layer to obtain a text vector to be classified; input the text vector to be classified into the classification output layer to obtain a text category label; and display the text to be classified in a text display area corresponding to the text category label and included in a text display window.
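The sequence of operations the carried programs perform can be sketched as follows. This is a minimal illustration only: every layer is replaced by a trivial stand-in, and the label names, thresholds, and helper functions are invented for the example rather than taken from the patent.

```python
# Illustrative sketch of the four-layer pipeline described above:
# segmentation -> per-sub-text vectors -> fusion -> classification.
# All layer internals are simplified stand-ins, not the patented layers.

def segment(text, max_len=64):
    """Stand-in for the dynamic programming segmentation layer:
    here, simply fixed-size chunking."""
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

def embed(sub_text):
    """Stand-in for the vector generation layer: a 26-dimensional
    letter-frequency vector instead of a trained encoder."""
    vec = [0.0] * 26
    for ch in sub_text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def fuse(vectors):
    """Stand-in for the vector fusion layer: element-wise mean
    instead of a graph neural network."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def classify(text_vector, threshold=5.0):
    """Stand-in for the classification output layer: a fixed rule
    instead of a trained classifier head."""
    return "long-report" if sum(text_vector) > threshold else "short-note"

def classify_text(text, min_len=10):
    """End-to-end pipeline: enforce the preset length threshold,
    then run the four layers in order."""
    if len(text) < min_len:
        raise ValueError("text shorter than the preset length threshold")
    sub_texts = segment(text)
    sub_vectors = [embed(s) for s in sub_texts]
    text_vector = fuse(sub_vectors)
    return classify(text_vector)
```

A call such as `classify_text("the quick brown fox jumps over the lazy dog")` runs all four stand-in layers and returns one of the two illustrative labels.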
Computer program code for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquisition unit, a first input unit, a second input unit, a third input unit, a fourth input unit, and a display unit. In some cases, the names of these units do not limit the units themselves; for example, the display unit may also be described as "a unit that displays the text to be classified in the text display area, corresponding to the text category label, included in the text display window".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
Some embodiments of the present disclosure also provide a computer program product comprising a computer program which, when executed by a processor, implements any of the text classification presentation methods described above.
The foregoing description is merely of preferred embodiments of the present disclosure and is illustrative of the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above features, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (10)

1. A text classification display method comprises the following steps:
acquiring a text to be classified, wherein the text length of the text to be classified is greater than or equal to a preset text length threshold;
inputting the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a sub-text set to be classified, wherein the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer;
inputting the sub-text set to be classified into the vector generation layer to obtain a sub-text vector set to be classified;
inputting the sub-text vector set to be classified into the vector fusion layer to obtain a text vector to be classified;
inputting the text vector to be classified into the classification output layer to obtain a text category label;
and displaying the text to be classified in a text display area corresponding to the text category label and included in a text display window.
2. The method of claim 1, wherein the text classification model is trained by:
acquiring an initial sample set, wherein an initial sample in the initial sample set comprises a sample text and a sample text category label corresponding to the sample text, and the sample text length of the sample text is greater than or equal to the preset text length threshold;
extracting at least one initial sample from the initial sample set as a sample set;
for each sample comprised by the set of samples, the following vector generation steps are performed:
inputting the sample text included in the sample into the dynamic programming segmentation layer to perform dynamic programming segmentation processing on the sample text, to obtain a sample sub-text set;
for each sample sub-text in the sample sub-text set, inputting the sample sub-text into a pre-trained vector generation model included in the vector generation layer to obtain a sample sub-text vector so as to generate a sample sub-text vector set corresponding to the sample sub-text set;
inputting the sample sub-text vector set to a graph neural network model included in the vector fusion layer to obtain a sample text vector;
and performing model training on the text classification model to be trained based on the sample set and the obtained sample text vectors to obtain a trained text classification model as the text classification model.
3. The method of claim 2, wherein the model training of the text classification model to be trained comprises:
inputting each obtained sample text vector to a target neural network model included in the classification output layer to obtain a prediction sample text category label corresponding to each sample in a sample set;
comparing the predicted sample text category label corresponding to each sample in the sample set with the sample text category label included in that sample;
determining, according to the comparison results, whether the text classification model reaches a preset optimization target;
in response to determining that the text classification model reaches the optimization goal, taking the text classification model reaching the optimization goal as a trained text classification model;
and in response to determining that the text classification model does not reach the optimization target, adjusting network parameters of the text classification model, forming a sample set from initial samples not yet extracted from the initial sample set, using the adjusted text classification model as the text classification model to be trained, and performing the vector generation step and the model training of the text classification model to be trained again.
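The iterative procedure of claims 2 and 3 can be sketched roughly as follows. This is a schematic under simplifying assumptions: `predict` and `adjust` stand in for the model's forward pass and parameter update, and the "optimization target" is modeled as a plain accuracy threshold; none of these names appear in the patent.

```python
def train(initial_samples, predict, adjust, target_accuracy=0.9, max_rounds=10):
    """Sketch of the loop in claims 2-3: extract a sample set, predict
    category labels, compare them with the true labels, and either stop
    (optimization target reached) or adjust parameters and repeat on
    samples not yet extracted."""
    remaining = list(initial_samples)
    params = {"bias": 0.0}                 # stand-in for network parameters
    for _ in range(max_rounds):
        if not remaining:
            break
        batch, remaining = remaining[:4], remaining[4:]   # extracted sample set
        predictions = [predict(text, params) for text, _ in batch]
        correct = sum(p == label for p, (_, label) in zip(predictions, batch))
        if correct / len(batch) >= target_accuracy:       # target reached
            return params
        params = adjust(params)                           # adjust parameters
    return params
```

With a toy `predict` that only becomes accurate after one `adjust` step, the loop terminates on the second batch once the accuracy threshold is met.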
4. The method of claim 2, wherein the inputting the sample text included in the sample into the dynamic programming segmentation layer comprises:
segmenting the sample text according to punctuation marks in the sample text included in the sample, to obtain segmented sample sub-texts as a sample sub-text set;
generating a candidate sample sub-text set according to the sample sub-text set;
constructing a candidate node directed acyclic graph according to the candidate sample sub-text set, wherein the candidate node directed acyclic graph comprises at least one candidate node and one virtual node, and each candidate node corresponds to one candidate sample sub-text;
generating at least one candidate path according to the candidate node directed acyclic graph;
selecting a candidate path meeting a preset path condition from the at least one candidate path as a target path;
determining each candidate node corresponding to the target path as a target node set;
and determining the candidate sample sub-texts corresponding to each target node in the target node set as sample sub-texts to obtain a sample sub-text set.
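One way to read claim 4 is as a shortest-path problem over candidate merges of punctuation-delimited fragments. The sketch below is a plausible rendering under that reading, not the patented algorithm: the "candidate nodes" are runs of adjacent fragments merged into chunks no longer than `max_len`, and the "path condition" is minimizing the number of chunks.

```python
import re

def dp_segment(text, max_len=20):
    """Illustrative dynamic-programming segmentation: split on punctuation,
    then choose, among all ways of merging adjacent fragments into chunks
    of at most max_len characters, the covering with the fewest chunks
    (a shortest path through the implicit candidate DAG)."""
    frags = [f for f in re.split(r"[,.;!?]", text) if f.strip()]
    n = len(frags)
    INF = float("inf")
    best = [INF] * (n + 1)     # best[i]: fewest chunks covering frags[:i]
    back = [0] * (n + 1)       # back[i]: start index of the last chunk
    best[0] = 0
    for i in range(n):
        if best[i] == INF:
            continue
        merged = ""
        for j in range(i, n):               # candidate node: frags[i..j] merged
            merged += frags[j]
            if len(merged) > max_len and j > i:
                break                       # too long to extend further
            if best[i] + 1 < best[j + 1]:
                best[j + 1] = best[i] + 1
                back[j + 1] = i
    # Recover the target path as the final chunk list.
    chunks, i = [], n
    while i > 0:
        chunks.append("".join(frags[back[i]:i]))
        i = back[i]
    return list(reversed(chunks))
```

For example, `dp_segment("ab,cd,ef", max_len=4)` merges the last two fragments into one chunk, yielding two chunks instead of three.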
5. The method of claim 2, wherein the graph neural network model comprises a linear transformation layer, an attention layer, a combined splicing layer, and a dimension reduction layer; and
inputting the sample sub-text vector set into the graph neural network model included in the vector fusion layer to obtain a sample text vector includes:
inputting the sample sub-text vector set into the linear transformation layer to obtain a sample sub-text transformation vector set, wherein the linear transformation layer is used for performing linear transformation on each sample sub-text vector in the input sample sub-text vector set according to a target linear transformation matrix to obtain the sample sub-text transformation vector set;
inputting the sample sub-text transformation vector set into the attention layer to obtain an attention coefficient set, wherein the attention layer is used for determining an attention coefficient between every two sample sub-text transformation vectors;
inputting the attention coefficient set and the sample sub-text transformation vector set into the combined splicing layer to obtain a to-be-reduced sample text vector, wherein the combined splicing layer is used for activating each input sample sub-text transformation vector according to each input attention coefficient to obtain sample sub-text activation vectors, and splicing the obtained sample sub-text activation vectors to obtain the to-be-reduced sample text vector;
and inputting the to-be-reduced sample text vector to the dimension reduction layer to obtain a sample text vector, wherein the dimension reduction layer is used for performing dimension reduction processing on the input to-be-reduced sample text vector to obtain a sample text vector.
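A toy rendering of the claim 5 fusion, with every design choice (identity default for the transformation matrix, softmax over dot products as the attention coefficients, mean pooling as the "dimension reduction") chosen for illustration rather than taken from the patent:

```python
import math

def fuse_vectors(sub_vectors, W=None):
    """Illustrative fusion: linear transform, pairwise softmax attention,
    attention-weighted combination per vector, concatenation, then a
    mean-pooling reduction back to the original dimension."""
    d = len(sub_vectors[0])
    if W is None:  # default target linear transformation matrix: identity
        W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    transformed = [matvec(W, v) for v in sub_vectors]   # linear transformation layer

    activated = []
    for h_i in transformed:                             # attention layer
        scores = [math.exp(dot(h_i, h_j)) for h_j in transformed]
        z = sum(scores)
        coeffs = [s / z for s in scores]                # attention coefficients
        # combined splicing layer, step 1: weighted combination per vector
        activated.append([sum(c * h[j] for c, h in zip(coeffs, transformed))
                          for j in range(d)])

    concatenated = [x for vec in activated for x in vec]  # step 2: splicing
    # dimension reduction layer: average the per-sub-text blocks back to d dims
    n = len(activated)
    return [sum(concatenated[k * d + j] for k in range(n)) / n for j in range(d)]
```

With two orthogonal unit vectors, the softmax weights of the two attention rows are mirror images, so the pooled output is the symmetric midpoint.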
6. The method of claim 1, wherein the text to be classified is a city work order text to be classified; and
the method further comprises the following steps:
in response to the text category label representing a repair category, sending the city work order text to be classified to an associated maintenance personnel terminal;
in response to the text category label representing a report category, performing name extraction processing on the city work order text to be classified to obtain name information;
according to the name information, selecting employee information meeting employee name conditions corresponding to the name information from an employee information set as target employee information to obtain a target employee information set;
and sending the city work order text to be classified and the target employee information set to a related report processing terminal.
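The dispatch logic of claim 6 can be sketched as follows; the category labels, terminal names, and the trivial substring-based name extractor are all invented for the example and are not part of the patent.

```python
def route_work_order(text, category_label, employees, extract_names=None):
    """Illustrative routing: repair orders go to a maintenance terminal;
    report orders are matched against an employee roster by extracted
    name and forwarded with the matching employee records."""
    if category_label == "repair":
        return ("maintenance_terminal", text)
    if category_label == "report":
        if extract_names is None:
            # Hypothetical stand-in for real name extraction: substring match.
            extract_names = lambda t: [e["name"] for e in employees
                                       if e["name"] in t]
        names = extract_names(text)
        targets = [e for e in employees if e["name"] in names]  # target employees
        return ("report_terminal", text, targets)
    return ("default_queue", text)
```

A report mentioning an employee by name is routed to the report terminal together with that employee's record; a repair order bypasses name extraction entirely.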
7. A text classification presentation apparatus comprising:
an acquisition unit configured to acquire a text to be classified, wherein the text length of the text to be classified is greater than or equal to a preset text length threshold;
a first input unit configured to input the text to be classified into a dynamic programming segmentation layer included in a pre-trained text classification model to obtain a sub-text set to be classified, wherein the text classification model comprises the dynamic programming segmentation layer, a vector generation layer, a vector fusion layer, and a classification output layer;
a second input unit configured to input the sub-text set to be classified into the vector generation layer to obtain a sub-text vector set to be classified;
a third input unit configured to input the sub-text vector set to be classified into the vector fusion layer to obtain a text vector to be classified;
a fourth input unit configured to input the text vector to be classified into the classification output layer to obtain a text category label;
and a display unit configured to display the text to be classified in a text display area corresponding to the text category label and included in a text display window.
8. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
9. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202210971410.5A 2022-08-15 2022-08-15 Text classification display method and device, electronic equipment and computer readable medium Active CN115048524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210971410.5A CN115048524B (en) 2022-08-15 2022-08-15 Text classification display method and device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN115048524A true CN115048524A (en) 2022-09-13
CN115048524B CN115048524B (en) 2022-10-28

Family

ID=83167510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210971410.5A Active CN115048524B (en) 2022-08-15 2022-08-15 Text classification display method and device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN115048524B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832456A (en) * 2017-11-24 2018-03-23 云南大学 A kind of parallel KNN file classification methods based on the division of critical Value Data
US20200110777A1 (en) * 2017-06-28 2020-04-09 Zhejiang University System and Method of Graph Feature Extraction Based on Adjacency Matrix
CN112380862A (en) * 2021-01-18 2021-02-19 武汉千屏影像技术有限责任公司 Method, apparatus and storage medium for automatically acquiring pathological information
CN113868419A (en) * 2021-09-29 2021-12-31 中国平安财产保险股份有限公司 Text classification method, device, equipment and medium based on artificial intelligence
US20220138423A1 (en) * 2020-11-02 2022-05-05 Chengdu Wang'an Technology Development Co., Ltd. Deep learning based text classification
CN114579743A (en) * 2022-03-04 2022-06-03 合众新能源汽车有限公司 Attention-based text classification method and device and computer readable medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU Ximo et al.: "Design and Implementation of a Multi-Text Classification System for Security-Related Articles", Information Technology and Network Security *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116630840A (en) * 2023-04-07 2023-08-22 中关村科学城城市大脑股份有限公司 Classification information generation method, device, electronic equipment and computer readable medium
CN116630840B (en) * 2023-04-07 2024-02-02 中关村科学城城市大脑股份有限公司 Classification information generation method, device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN115048524B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
WO2020155423A1 (en) Cross-modal information retrieval method and apparatus, and storage medium
KR20210070891A (en) Method and apparatus for evaluating translation quality
CN107145485B (en) Method and apparatus for compressing topic models
JP2008052732A (en) Method and program for calculating similarity, and method and program for deriving context model
WO2023138188A1 (en) Feature fusion model training method and apparatus, sample retrieval method and apparatus, and computer device
CN109933217B (en) Method and device for pushing sentences
JP2023017910A (en) Semantic representation model pre-training method, device, and electronic apparatus
CN111090753B (en) Training method of classification model, classification method, device and computer storage medium
CN110275962B (en) Method and apparatus for outputting information
CN111738010A (en) Method and apparatus for generating semantic matching model
WO2024099171A1 (en) Video generation method and apparatus
CN115048524B (en) Text classification display method and device, electronic equipment and computer readable medium
WO2023280106A1 (en) Information acquisition method and apparatus, device, and medium
WO2023029354A1 (en) Text information extraction method and apparatus, and storage medium and computer device
CN112582073B (en) Medical information acquisition method, device, electronic equipment and medium
CN108038109A (en) Method and system, the computer program of Feature Words are extracted from non-structured text
CN112836035A (en) Method, device, equipment and computer readable medium for matching data
CN112200183A (en) Image processing method, device, equipment and computer readable medium
JP7106647B2 (en) Quantum Superposition and Entanglement in Social Emotion and Natural Language Generation
JP2022541832A (en) Method and apparatus for retrieving images
CN115062119A (en) Government affair event handling recommendation method and device
CN112199954B (en) Disease entity matching method and device based on voice semantics and computer equipment
CN111754984B (en) Text selection method, apparatus, device and computer readable medium
CN113987118A (en) Corpus acquisition method, apparatus, device and storage medium
WO2022141855A1 (en) Text regularization method and apparatus, and electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant