CN110929033A - Long text classification method and device, computer equipment and storage medium - Google Patents
Long text classification method and device, computer equipment and storage medium Download PDFInfo
- Publication number
- CN110929033A (application number CN201911172175.XA)
- Authority
- CN
- China
- Prior art keywords
- long text
- classified
- sentence
- vector
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/35 — Information retrieval of unstructured textual data; Clustering; Classification
- G06F16/3344 — Information retrieval; Querying; Query execution using natural language analysis
- G06N3/045 — Computing arrangements based on biological models; Neural networks; Combinations of networks
- G06N3/08 — Computing arrangements based on biological models; Neural networks; Learning methods
Abstract
The invention discloses a long text classification method, a long text classification device, computer equipment and a storage medium. The method comprises the following steps: if a long text to be classified sent by a terminal is received, converting words of each sentence of the long text to be classified into word vectors; determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified; determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified; and sending the classification to which the long text to be classified belongs to a terminal. By applying the scheme provided by the embodiment of the invention, the process of manually extracting the features can be avoided, and the long text can be rapidly and accurately classified.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a long text classification method and device, computer equipment and a storage medium.
Background
Long text classification is an important application of natural language processing. Traditional long text classification methods require manual feature extraction: features are extracted by hand and then fed to a classification algorithm to classify the long text. Such traditional methods suffer from low efficiency and low accuracy.
Disclosure of Invention
The embodiment of the invention provides a long text classification method, a long text classification device, computer equipment and a storage medium, and aims to solve the problems of low efficiency and low accuracy of the conventional long text classification.
In a first aspect, an embodiment of the present invention provides a long text classification method, which includes:
if a long text to be classified sent by a terminal is received, converting words of each sentence of the long text to be classified into word vectors;
determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified;
determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified;
and sending the classification to which the long text to be classified belongs to a terminal.
The further technical scheme is that the converting words of each sentence of the long text to be classified into word vectors comprises:
and sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the neural network model based on hierarchical attention.
The further technical scheme is that words of each sentence of the long text to be classified are converted into word vectors, and the method further comprises the following steps:
inputting an output result of the embedding layer of the hierarchical attention based neural network model into a word self-attention layer of the hierarchical attention based neural network model.
A further technical solution is that the determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified includes:
inputting an output result of the word self-attention layer of the neural network model based on the hierarchical attention into a sentence encoder layer of the neural network model based on the hierarchical attention.
A further technical solution is that the determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified further includes:
inputting an output result of a sentence encoder layer of the neural network model based on the hierarchical attention into a sentence self-attention layer of the neural network model based on the hierarchical attention.
A further technical solution is that, the determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification to which the long text to be classified belongs according to the long text vector of the long text to be classified, includes:
inputting an output result of a sentence self-attention layer of the hierarchical attention-based neural network model into an output layer of the hierarchical attention-based neural network model.
In a second aspect, an embodiment of the present invention further provides a long text classification apparatus, which includes:
the conversion unit is used for converting words of each sentence of the long text to be classified into word vectors if the long text to be classified sent by the terminal is received;
the first determining unit is used for determining sentence vectors of the sentences of the long text to be classified according to the word vectors of the sentences of the long text to be classified;
the second determining unit is used for determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification of the long text to be classified according to the long text vector of the long text to be classified;
and the sending unit is used for sending the classification to which the long text to be classified belongs to the terminal.
The further technical scheme is that the conversion unit comprises:
and the first input unit is used for sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the neural network model based on hierarchical attention.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the above method when executing the computer program.
In a fourth aspect, the present invention also provides a computer-readable storage medium, which stores a computer program; the computer program implements the above method when executed by a processor.
By applying the technical scheme of the embodiment of the invention, if the long text to be classified sent by the terminal is received, the words of each sentence of the long text to be classified are converted into word vectors; determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified; determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified; and sending the classification to which the long text to be classified belongs to a terminal. By applying the scheme provided by the embodiment of the invention, the process of manually extracting the features can be avoided, and the long text can be rapidly and accurately classified.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a long text classification method according to an embodiment of the present invention;
fig. 2 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Referring to fig. 1, fig. 1 is a flowchart illustrating a long text classification method according to an embodiment of the present invention. As shown, the method includes the following steps S1-S4.
And S1, if the long text to be classified sent by the terminal is received, converting words of each sentence of the long text to be classified into word vectors.
In specific implementation, if a long text to be classified sent by a terminal is received (the long text to be classified may be a long text in the national economic industry, such as a news text in the national economic industry, etc.), words of each sentence of the long text to be classified are converted into word vectors.
In one embodiment, the above step S1 specifically includes the following steps S11-S12.
And S11, sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the neural network model based on hierarchical attention.
In this embodiment, a neural network model based on hierarchical attention is constructed in advance, and the neural network model based on hierarchical attention includes an embedding layer, which is also called a word encoder layer. The embedding layer is used for embedding each word of each sentence of the long text to be classified into a word vector. The resulting word vector calculation can be expressed by the following formula (1) and formula (2).
Formula (1): W_e ∈ R^(d_emb × V); where W_e represents a word vector matrix with V words, each word vector having dimension d_emb.
Formula (2): x_it = W_e w_it; where x_it represents the word vector corresponding to the t-th word in the i-th sentence, and w_it represents the (one-hot) index of the word in the word vector matrix.
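As a minimal sketch of formulas (1) and (2): with a one-hot index vector w_it, multiplying by the word-vector matrix W_e reduces to selecting one column of W_e. All sizes and the random matrix below are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Illustrative sizes (assumptions): a vocabulary of V words, each
# embedded into a d_emb-dimensional word vector.
V, d_emb = 10, 4
rng = np.random.default_rng(0)
W_e = rng.normal(size=(d_emb, V))   # word-vector matrix of formula (1)

def embed_sentence(word_indices, W_e):
    """Formula (2): x_it = W_e w_it. With a one-hot index vector w_it,
    the product reduces to selecting the corresponding column of W_e."""
    return W_e[:, np.asarray(word_indices)].T   # one row per word

sentence = [3, 1, 7]                # word indices w_it of one sentence
X = embed_sentence(sentence, W_e)
print(X.shape)  # (3, 4): one d_emb-dimensional vector per word
```

A full embedding layer would learn W_e during training; here it is a fixed random stand-in.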
S12, inputting the output result of the embedding layer of the neural network model based on the hierarchical attention into the word self-attention layer of the neural network model based on the hierarchical attention.
The word self-attention layer is mainly based on a self-attention model. The words in a sentence are interdependent; a recurrent neural network expresses this dependency through the information that the current word carries about the previous words. Since each word contributes differently to the meaning of the sentence, an attention model built on a recurrent neural network assigns different weights to the words in the text. A recursion mechanism is added to the self-attention model so that the model can discover both the interdependence between words within a sentence and the interdependence between words across the text; the dependence across the text is captured mainly by recursively memorizing content.
This layer is mainly composed of multiple heads, each of which is a self-attention, as shown in the following formula (3):
Formula (3): MultiHead = Concat(head_1, head_2, ..., head_h) W_o; where head_h denotes one of the heads, Concat denotes the splicing operation performed on the multiple self-attention outputs, and W_o is an output parameter matrix.
The self-attention operation adopts a dot-product operation over three parts, Query, Key and Value, as shown in the following formula (3) and formula (4):
Formula (3): head_{i+1} = Attention(Q_{i+1}, K_{i+1}, V_{i+1}), i = 1, 2, ..., h; where head_{i+1} denotes one self-attention head, and Q_{i+1}, K_{i+1}, V_{i+1} denote the three input matrices of self-attention.
Formula (4): Attention(Q_{i+1}, K_{i+1}, V_{i+1}) = softmax(Q_{i+1} K_{i+1}^T / sqrt(d_k)) V_{i+1}; where Attention denotes the output matrix of the self-attention calculation, d_k denotes the dimension of K_{i+1}, and T denotes matrix transposition.
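Formulas (3) and (4) can be sketched as follows; the sequence length, per-head dimension, head count, and random inputs are all illustrative assumptions:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Formula (4): Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(heads, W_o):
    """Formula (3): MultiHead = Concat(head_1, ..., head_h) W_o."""
    return np.concatenate(heads, axis=-1) @ W_o

rng = np.random.default_rng(1)
T, d_k, h = 5, 8, 2          # sequence length, per-head dim, head count (assumed)
heads = [attention(rng.normal(size=(T, d_k)),
                   rng.normal(size=(T, d_k)),
                   rng.normal(size=(T, d_k))) for _ in range(h)]
W_o = rng.normal(size=(h * d_k, d_k))  # output projection back to d_k
out = multi_head(heads, W_o)
print(out.shape)  # (5, 8)
```

In a trained model the Q, K, V inputs per head come from learned projections of the word vectors rather than independent random matrices.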
The recursion mechanism mainly passes the memory content of the previous step on to the next step, as shown in the following formula (5).
Formula (5): x̃_{i+1} = Concat(SG(x_i), x_{i+1}); where x̃_{i+1} denotes the word vectors of the (i+1)-th sentence in the text to be processed after the splicing operation, formed from the word vectors x_i of the previous sentence and the word vectors x_{i+1} of the (i+1)-th sentence, and SG (stop-gradient) indicates that no back propagation is performed through the memory.
Q_{i+1}, K_{i+1} and V_{i+1} in formula (3) are calculated by the following formula (6).
Formula (6): Q_{i+1} = x_{i+1} W_q^T, K_{i+1} = x̃_{i+1} W_k^T, V_{i+1} = x̃_{i+1} W_v^T; where W_q^T, W_k^T and W_v^T represent three parameter matrices, d_k represents the dimension of Q_{i+1} and K_{i+1}, and d_v represents the dimension of V_{i+1}.
S2, determining sentence vectors of the sentences of the long text to be classified according to the word vectors of the sentences of the long text to be classified.
In specific implementation, the sentence vector of each sentence of the long text to be classified is determined according to the word vector of each sentence of the long text to be classified.
In one embodiment, the above step S2 specifically includes the following steps S21-S22.
S21, inputting the output result of the word self-attention layer of the neural network model based on the hierarchical attention into the sentence encoder layer of the neural network model based on the hierarchical attention.
In specific implementation, the output result of the word self-attention layer of the neural network model based on the hierarchical attention is input into the sentence encoder layer of the neural network model based on the hierarchical attention.
The sentence encoder layer mainly embeds the word vectors of a sentence into a sentence vector, as shown in the following formula (7).
Formula (7): S_{i+1} = Dense(First(MultiHead)); where S_{i+1} represents the sentence vector, First represents taking the value at the first position of the MultiHead output, and Dense represents a fully-connected layer.
S22, inputting the output result of the sentence encoder layer of the neural network model based on hierarchical attention into the sentence self-attention layer of the neural network model based on hierarchical attention.
In a specific implementation, the output result of the sentence encoder layer of the neural network model based on the hierarchical attention is input into the sentence self-attention layer of the neural network model based on the hierarchical attention.
The sentence self-attention layer, also called the attention layer, follows processing rules similar to those of the word self-attention layer. The difference is that the input of the sentence self-attention layer is the sentence vectors obtained in step S21, so the embodiment of the present invention does not describe it in detail again here.
S3, determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification of the long text to be classified according to the long text vector of the long text to be classified.
In specific implementation, the long text vector of the long text to be classified is determined according to the sentence vector of each sentence of the long text to be classified, and the classification to which the long text to be classified belongs is determined according to the long text vector of the long text to be classified.
In an embodiment, the step S3 includes the following steps: inputting an output result of a sentence self-attention layer of the hierarchical attention-based neural network model into an output layer of the hierarchical attention-based neural network model.
The output layer mainly processes the sentence vectors of the whole text by again adopting formula (7); its output result is denoted v. Finally, the output layer calculates the classification of the long text to be classified using softmax, as shown in formula (8).
Formula (8): p = softmax(W_c v + b_c); where p denotes the probabilities of the different classification labels for the text to be classified, W_c represents a parameter matrix, and b_c denotes the bias.
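A minimal sketch of the output layer of formula (8); the text-vector width, label count, and parameter values are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(v, W_c, b_c):
    """Formula (8): p = softmax(W_c v + b_c); p holds the probability
    of each classification label for the long text."""
    return softmax(W_c @ v + b_c)

rng = np.random.default_rng(3)
d, num_classes = 8, 4                 # text-vector width, label count (assumed)
v = rng.normal(size=d)                # long-text vector from the sentence self-attention layer
W_c = rng.normal(size=(num_classes, d))
b_c = np.zeros(num_classes)
p = classify(v, W_c, b_c)
print(p.shape)          # (4,): one probability per classification label
print(int(p.argmax()))  # index of the predicted classification
```

The predicted classification sent back to the terminal is the label with the largest probability in p.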
The loss function for training the neural network model based on hierarchical attention is calculated by the following formula (9).
And S4, sending the classification to which the long text to be classified belongs to the terminal.
In specific implementation, the classification to which the long text to be classified belongs is sent to the terminal, so that the long text can be classified quickly.
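The overall flow of steps S1-S3 can be sketched end to end. This toy version uses a single-head self-attention with identity projections as a stand-in for the patent's multi-head layers, omits the recursion memory, and uses made-up sizes throughout:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # Single-head self-attention with identity Q/K/V projections,
    # standing in for the word and sentence self-attention layers.
    return softmax(X @ X.T / np.sqrt(X.shape[-1])) @ X

def classify_long_text(sentences, W_e, W_c, b_c):
    """Steps S1-S3: word vectors -> sentence vectors -> long-text
    vector -> classification probabilities."""
    sent_vecs = []
    for idx in sentences:                       # S1: embed each sentence's words
        X = W_e[np.asarray(idx)]
        sent_vecs.append(self_attention(X)[0])  # S2: summarize into a sentence vector
    S = np.stack(sent_vecs)
    v = self_attention(S)[0]                    # S3: summarize sentences into a text vector
    return softmax(W_c @ v + b_c)               # probabilities per label

rng = np.random.default_rng(4)
W_e = rng.normal(size=(20, 6))                  # toy vocabulary of 20 words, width 6
W_c, b_c = rng.normal(size=(3, 6)), np.zeros(3) # 3 hypothetical labels
p = classify_long_text([[1, 4, 2], [7, 7, 9, 0]], W_e, W_c, b_c)
print(p.shape)  # (3,): one probability per label
```

Step S4 then simply sends `p.argmax()` (the predicted label) back to the terminal.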
By applying the technical scheme of the embodiment of the invention, if the long text to be classified sent by the terminal is received, the words of each sentence of the long text to be classified are converted into word vectors; determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified; determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified; and sending the classification to which the long text to be classified belongs to a terminal. By applying the scheme provided by the embodiment of the invention, the process of manually extracting the features can be avoided, and the long text can be rapidly and accurately classified.
The invention also provides a long text classification device corresponding to the long text classification method. The long text classification apparatus comprises means for performing the long text classification method described above. Specifically, the long text classification device comprises a conversion unit, a first determination unit, a second determination unit and a sending unit.
The conversion unit is used for converting words of each sentence of the long text to be classified into word vectors if the long text to be classified sent by the terminal is received;
the first determining unit is used for determining sentence vectors of the sentences of the long text to be classified according to the word vectors of the sentences of the long text to be classified;
the second determining unit is used for determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification of the long text to be classified according to the long text vector of the long text to be classified;
and the sending unit is used for sending the classification to which the long text to be classified belongs to the terminal.
In one embodiment, the conversion unit includes a first input unit and a second input unit.
And the first input unit is used for sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the neural network model based on the level attention.
And the second input unit is used for inputting the output result of the embedding layer of the neural network model based on the hierarchical attention into the word self-attention layer of the neural network model based on the hierarchical attention.
In an embodiment, the first determination unit includes a third input unit and a fourth input unit.
A third input unit, configured to input an output result of the word self-attention layer of the neural network model based on the hierarchical attention into a sentence encoder layer of the neural network model based on the hierarchical attention.
A fourth input unit, configured to input an output result of the sentence encoder layer of the neural network model based on the hierarchical attention into the sentence self-attention layer of the neural network model based on the hierarchical attention.
In an embodiment, the second determination unit comprises a fifth input unit.
A fifth input unit, configured to input an output result of the sentence self-attention layer of the neural network model based on the hierarchical attention into an output layer of the neural network model based on the hierarchical attention.
It should be noted that, as can be clearly understood by those skilled in the art, the specific implementation processes of the long text classification device and each unit may refer to the corresponding descriptions in the foregoing method embodiments, and for convenience and brevity of description, no further description is provided herein.
The above-described long text classification apparatus may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 2.
Referring to fig. 2, fig. 2 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a terminal or a server, where the terminal may be an electronic device with a communication function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device. The server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 2, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, causes the processor 502 to perform a long text classification method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of the computer program 5032 in the non-volatile storage medium 503, and when the computer program 5032 is executed by the processor 502, the processor 502 can be enabled to execute a long text classification method.
The network interface 505 is used for network communication with other devices. Those skilled in the art will appreciate that the configuration shown in fig. 2 is a block diagram of only a portion of the configuration associated with the present application and does not constitute a limitation of the computer device 500 to which the present application may be applied, and that a particular computer device 500 may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to run the computer program 5032 stored in the memory to implement the following steps:
if a long text to be classified sent by a terminal is received, converting words of each sentence of the long text to be classified into word vectors;
determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified;
determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified;
and sending the classification to which the long text to be classified belongs to a terminal.
In an embodiment, when the step of converting the words of each sentence of the long text to be classified into word vectors is implemented by the processor 502, the following steps are specifically implemented:
inputting each sentence of the long text to be classified into a preset embedding layer of a neural network model based on hierarchical attention in sequence;
inputting an output result of the embedding layer of the hierarchical attention based neural network model into a word self-attention layer of the hierarchical attention based neural network model.
In an embodiment, when implementing the step of determining a sentence vector of each sentence of the long text to be classified according to a word vector of each sentence of the long text to be classified, the processor 502 specifically implements the following steps:
inputting an output result of a word self-attention layer of the neural network model based on the hierarchical attention into a sentence encoder layer of the neural network model based on the hierarchical attention;
inputting an output result of a sentence encoder layer of the neural network model based on the hierarchical attention into a sentence self-attention layer of the neural network model based on the hierarchical attention.
In an embodiment, when implementing the steps of determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification to which the long text to be classified belongs according to the long text vector of the long text to be classified, the processor 502 specifically implements the following steps:
inputting an output result of a sentence self-attention layer of the hierarchical attention-based neural network model into an output layer of the hierarchical attention-based neural network model.
It should be understood that, in the embodiment of the present application, the processor 502 may be a Central Processing Unit (CPU), and the processor 502 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It will be understood by those skilled in the art that all or part of the flow of the method implementing the above embodiments may be implemented by a computer program instructing associated hardware. The computer program may be stored in a storage medium, which is a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program. The computer program, when executed by a processor, causes the processor to perform the steps of:
if a long text to be classified sent by a terminal is received, converting words of each sentence of the long text to be classified into word vectors;
determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified;
determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified;
and sending, to the terminal, the classification to which the long text to be classified belongs.
In an embodiment, when the step of converting words of each sentence of the long text to be classified into word vectors is implemented by the processor executing the computer program, the following steps are specifically implemented:
inputting each sentence of the long text to be classified into a preset embedding layer of a neural network model based on hierarchical attention in sequence;
inputting an output result of the embedding layer of the hierarchical attention based neural network model into a word self-attention layer of the hierarchical attention based neural network model.
In an embodiment, when the processor executes the computer program to implement the step of determining a sentence vector of each sentence of the long text to be classified according to a word vector of each sentence of the long text to be classified, the processor specifically implements the following steps:
inputting an output result of the word self-attention layer of the hierarchical attention-based neural network model into a sentence encoder layer of the hierarchical attention-based neural network model;
inputting an output result of the sentence encoder layer of the hierarchical attention-based neural network model into a sentence self-attention layer of the hierarchical attention-based neural network model.
In an embodiment, when the processor executes the computer program to implement the steps of determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining the classification to which the long text to be classified belongs according to the long text vector of the long text to be classified, the following steps are specifically implemented:
inputting an output result of the sentence self-attention layer of the hierarchical attention-based neural network model into an output layer of the hierarchical attention-based neural network model.
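The four claimed steps (convert words to word vectors, build sentence vectors, build the long text vector, determine and return the classification) can be wired into a small driver. This is a minimal sketch under stated assumptions: the regex sentence splitter, the whitespace tokeniser, the `vocab` and `labels` mappings, and the injected `predict_probs` model are all illustrative placeholders, not the disclosed implementation.

```python
import re
import numpy as np

def split_sentences(text):
    """Naive splitter on sentence-ending punctuation (Latin and CJK);
    a real system would use a proper sentence segmenter."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", text) if s.strip()]

def classify_long_text(text, vocab, predict_probs, labels):
    """Mirror of the claimed steps: split the long text to be classified
    into sentences, convert the words of each sentence into ids (stand-ins
    for word vectors), let the injected model produce class probabilities,
    and return the classification the long text belongs to."""
    sentences = [[vocab.get(w, 0) for w in s.split()]
                 for s in split_sentences(text)]
    probs = predict_probs(sentences)       # e.g. a hierarchical attention model
    return labels[int(np.argmax(probs))]   # highest-probability class
```

In a deployment matching the embodiments, `predict_probs` would be the hierarchical attention-based neural network model, and the returned label would be sent back to the terminal that submitted the long text.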
The storage medium may be a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or any other computer-readable medium that can store a computer program.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the components and steps of the examples have been described above in terms of their functions. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function; in actual implementation there may be other manners of division. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be merged, divided and deleted according to actual needs. In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, while the invention has been described with respect to the above-described embodiments, it will be understood that the invention is not limited thereto but may be embodied with various modifications and changes.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A method for classifying long texts, comprising:
if a long text to be classified sent by a terminal is received, converting words of each sentence of the long text to be classified into word vectors;
determining a sentence vector of each sentence of the long text to be classified according to the word vector of each sentence of the long text to be classified;
determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified, and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified;
and sending, to the terminal, the classification to which the long text to be classified belongs.
2. The method of claim 1, wherein converting words of each sentence of the long text to be classified into a word vector comprises:
and sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the hierarchical attention-based neural network model.
3. The method of claim 2, wherein converting words of each sentence of the long text to be classified into a word vector further comprises:
inputting an output result of the embedding layer of the hierarchical attention based neural network model into a word self-attention layer of the hierarchical attention based neural network model.
4. The method according to claim 3, wherein the determining a sentence vector for each sentence of the long text to be classified according to the word vector for each sentence of the long text to be classified comprises:
inputting an output result of the word self-attention layer of the hierarchical attention-based neural network model into a sentence encoder layer of the hierarchical attention-based neural network model.
5. The method of claim 4, wherein the determining a sentence vector for each sentence of the long text to be classified according to a word vector for each sentence of the long text to be classified further comprises:
inputting an output result of the sentence encoder layer of the hierarchical attention-based neural network model into a sentence self-attention layer of the hierarchical attention-based neural network model.
6. The method according to claim 5, wherein the determining a long text vector of the long text to be classified according to a sentence vector of each sentence of the long text to be classified and determining a classification to which the long text to be classified belongs according to the long text vector of the long text to be classified comprises:
inputting an output result of the sentence self-attention layer of the hierarchical attention-based neural network model into an output layer of the hierarchical attention-based neural network model.
7. A long text classification apparatus, comprising:
the conversion unit is used for converting words of each sentence of the long text to be classified into word vectors if the long text to be classified sent by the terminal is received;
the first determining unit is used for determining sentence vectors of the sentences of the long text to be classified according to the word vectors of the sentences of the long text to be classified;
the second determining unit is used for determining the long text vector of the long text to be classified according to the sentence vector of each sentence of the long text to be classified, and determining the classification of the long text to be classified according to the long text vector of the long text to be classified;
and the sending unit is used for sending, to the terminal, the classification to which the long text to be classified belongs.
8. The apparatus according to claim 7, wherein the conversion unit includes:
and the first input unit is used for sequentially inputting each sentence of the long text to be classified into a preset embedding layer of the hierarchical attention-based neural network model.
9. A computer device, characterized in that the computer device comprises a memory storing a computer program and a processor that implements the method according to any one of claims 1-6 when executing the computer program.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911172175.XA CN110929033A (en) | 2019-11-26 | 2019-11-26 | Long text classification method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110929033A true CN110929033A (en) | 2020-03-27 |
Family
ID=69851948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911172175.XA Pending CN110929033A (en) | 2019-11-26 | 2019-11-26 | Long text classification method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110929033A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113240605A (en) * | 2021-05-21 | 2021-08-10 | 南开大学 | Image enhancement method for forward and backward bidirectional learning based on symmetric neural network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018097468A (en) * | 2016-12-09 | 2018-06-21 | 日本電信電話株式会社 | Sentence classification learning device, sentence classification device, sentence classification learning method and sentence classification learning program |
CN108595632A (en) * | 2018-04-24 | 2018-09-28 | 福州大学 | A hybrid neural network text classification method fusing abstract and body features
CN109299262A (en) * | 2018-10-09 | 2019-02-01 | 中山大学 | A text entailment relation recognition method fusing multi-granularity information
CN109635109A (en) * | 2018-11-28 | 2019-04-16 | 华南理工大学 | Sentence classification method based on LSTM, combining part of speech and multiple attention mechanisms
CN109902175A (en) * | 2019-02-20 | 2019-06-18 | 上海方立数码科技有限公司 | A text classification method and classification system based on a neural network structure model
Non-Patent Citations (2)
Title |
---|
SHELLEYHLX: "Hierarchical Attention Networks for document classification", https://blog.csdn.net/qq_27009517/article/details/82893885 * |
ZENRRAN: "A survey of attention mechanisms in natural language processing", https://blog.csdn.net/qq_27590277/article/details/106263148 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200327 |