WO2021008037A1 - 基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备 - Google Patents

基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备 Download PDF

Info

Publication number
WO2021008037A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
node
feature representation
backward
depth feature
Prior art date
Application number
PCT/CN2019/118083
Other languages
English (en)
French (fr)
Inventor
占小杰
方豪
王少军
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021008037A1 publication Critical patent/WO2021008037A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a text classification method, storage medium and computer equipment based on an A-BiLSTM neural network.
  • LSTM (Long Short-Term Memory) is a long short-term memory network, a type of recurrent neural network suited to processing and predicting important events with relatively long intervals and delays in a time series.
  • LSTM already has many applications in science and technology. LSTM-based systems can learn to translate languages, control robots, analyze images, summarize documents, recognize speech and handwriting, run chatbots, predict diseases, click-through rates and stock prices, compose music, and more.
  • The embodiments of the present application provide a text classification method, device, storage medium, and computer equipment based on an A-BiLSTM neural network, to solve the problem of poor accuracy when a long short-term memory network classifies text.
  • In one aspect, an embodiment of the present application provides a text classification method based on an A-BiLSTM neural network. The method includes: obtaining a target text; performing word segmentation on the target text to obtain N words, where N is a natural number greater than 2; calculating the word vector corresponding to each of the N words; using the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node of the A-BiLSTM neural network, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and calculating the forward depth feature representation vector of the target text; using the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (t+T)-th node of the A-BiLSTM neural network, and calculating the backward depth feature representation vector of the target text; and classifying the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • In one aspect, an embodiment of the present application provides a text classification device based on an A-BiLSTM neural network. The device includes: an acquiring unit for acquiring a target text; a word segmentation processing unit for performing word segmentation on the target text to obtain N words, where N is a natural number greater than 2; a first calculation unit for calculating the word vector corresponding to each of the N words; a second calculation unit for using the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and calculating the forward depth feature representation vector of the target text; a third calculation unit for using the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes as the input of the backward (t+T)-th node, and calculating the backward depth feature representation vector of the target text; and a classification unit for classifying the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • In one aspect, an embodiment of the present application provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above text classification method based on the A-BiLSTM neural network are implemented.
  • In one aspect, an embodiment of the present application provides a storage medium storing a computer program that, when executed by a processor, implements the steps of the above text classification method based on the A-BiLSTM neural network.
  • In the embodiments of the present application, the target text is segmented into N words and the word vector corresponding to each of the N words is calculated; the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network are used as the input of the forward (t+T)-th node to calculate the forward depth feature representation vector of the target text; the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes are used as the input of the backward (t+T)-th node to calculate the backward depth feature representation vector of the target text; and the target text is classified according to the forward and backward depth feature representation vectors. Because the word vectors of the T words preceding a given word of the target text are fed into the A-BiLSTM neural network when that word is processed, the meanings of those preceding T words are taken into account, which reduces the possibility of missing important information, helps grasp the meaning of the text as a whole, and improves the accuracy of text classification.
  • FIG. 1 is a flowchart of an optional text classification method based on the A-BiLSTM neural network provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of an optional process for training the A-BiLSTM neural network provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of an optional way of inputting the vectors corresponding to words into the A-BiLSTM neural network according to an embodiment of the present application;
  • FIG. 4 is a schematic diagram of an optional text classification device based on the A-BiLSTM neural network provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of an optional computer device provided by an embodiment of the present application.
  • Although the terms first, second, third, etc. may be used in the embodiments of the present application to describe preset ranges and the like, these preset ranges should not be limited by these terms; these terms are only used to distinguish the preset ranges from each other.
  • For example, without departing from the scope of the embodiments of the present application, the first preset range may also be referred to as the second preset range, and similarly, the second preset range may also be referred to as the first preset range.
  • Depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", "in response to determining", or "in response to detecting".
  • Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
  • A-BiLSTM: Advanced Bilateral Long Short-Term Memory, an advanced bidirectional long short-term memory network.
  • FIG. 1 shows a flowchart of a text classification method based on an A-BiLSTM neural network according to an embodiment of the present application, including:
  • Step S101: Obtain the target text.
  • Step S102: Perform word segmentation on the target text to obtain N words, where N is a natural number greater than 2.
  • Step S103: Calculate the word vector corresponding to each of the N words.
  • Step S104: Use the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node of the A-BiLSTM neural network, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and calculate the forward depth feature representation vector of the target text.
  • Step S105: Use the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (t+T)-th node of the A-BiLSTM neural network, and calculate the backward depth feature representation vector of the target text.
  • As an optional implementation, before the backward depth feature representation vector of the target text is calculated in step S105, the method further includes: training the A-BiLSTM neural network.
  • FIG. 2 shows a flowchart of training an A-BiLSTM neural network according to an embodiment of the present application, including:
  • Step S201: Obtain multiple training samples.
  • Step S202: For a first training sample among the multiple training samples, use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the forward s-th to forward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (s+T)-th node of the A-BiLSTM neural network, and calculate the forward depth feature representation vector of the first training sample, where s is a natural number and the first training sample is any one of the multiple training samples.
  • Step S203: Use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the backward s-th to backward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (s+T)-th node of the A-BiLSTM neural network, and calculate the backward depth feature representation vector of the first training sample.
  • Step S204: Determine the classification prediction result of the first training sample according to the forward depth feature representation vector and the backward depth feature representation vector of the first training sample.
  • Step S205: Calculate the accuracy of text classification according to the classification prediction result and category label of each training sample in the multiple training samples.
  • Step S206: Determine whether to stop training the A-BiLSTM neural network according to the accuracy of text classification and the value of the loss function.
  • As an optional implementation, the training samples are input to the A-BiLSTM neural network in batches. In this case, step S205 and step S206 (calculating the accuracy of text classification according to the classification prediction result and category label of each training sample, and determining whether to stop training the A-BiLSTM neural network according to the accuracy of text classification and the trend of the loss function value) may specifically include: calculating the accuracy of text classification according to the classification prediction results and category labels of the training samples in the same batch; and, if the accuracy of text classification corresponding to that batch does not reach the preset accuracy, continuing to train the A-BiLSTM neural network with the next batch of training samples.
  • If the accuracy of text classification corresponding to that batch reaches the preset accuracy, it is judged whether the value of the loss function is less than or equal to a preset threshold; if the value of the loss function is less than or equal to the preset threshold, training of the A-BiLSTM neural network is stopped.
  • Step S106: Classify the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • As an optional implementation, classifying the target text according to the forward depth feature representation vector and the backward depth feature representation vector in step S106 includes: concatenating the forward depth feature representation vector and the backward depth feature representation vector, and using the resulting vector as the depth feature representation vector of the target text; inputting the depth feature representation vector of the target text into a classifier function, which classifies the target text to obtain a classification result; and using the classification result as the category of the target text.
  • During training, the development set is divided evenly into k parts; one part is used as the validation set and the remaining k-1 parts are used as the training set.
  • One epoch is equivalent to training once using all samples in the training set.
  • For example, suppose the development set contains 2000 texts and is divided evenly into 5 parts of 400 texts each: the first part contains texts 1 to 400, the second part texts 401 to 800, the third part texts 801 to 1200, the fourth part texts 1201 to 1600, and the fifth part texts 1601 to 2000.
  • The ways of dividing the training set and validation set then include: Case 1, the 400 texts of the first part are used as the validation set and the remaining 1600 texts as the training set; Case 2, the 400 texts of the second part are used as the validation set and the remaining 1600 texts as the training set; and so on for the remaining parts.
  • The model trained on the grouping whose validation set gives the best classification result is used as the base model for the next epoch; for example, if the grouping of Case 2 gives the best result, model M2 is used as the base model for the next epoch.
  • FIG. 3 shows the input mode of the vectors provided by the embodiment of the present application, where C(t) denotes the cell state unit of an LSTM node, and O(t+1) and O′(T+1) denote the forward and backward depth feature output representations, respectively.
  • W_C is the weight matrix of the update gate unit; x_t is the input word vector corresponding to the t-th moment; W_i is the update gate weight matrix and b_i is the update gate bias vector; h_{t-1} is the hidden unit output corresponding to the word vector x_{t-1}.
  • C′_T and h′_T are calculated according to formulas (4)-(7).
  • The input data of the LSTM is a sequence. In a short-text classification task (texts within 200 characters), for example, the text data is first vectorized: each word is converted by the word2vec algorithm into a 200-dimensional vector (the vector dimension can be customized; 200 dimensions are used as an example in this application). Segmenting the text "记忆模块对最近的内容记忆比较清晰" in this way yields the nine 200-dimensional vectors in Table 2: x(0), x(1), ..., x(8).
  • Similarly, segmenting the text "复习节点的间距不同,前端复习节点密集,后端复习节点疏松" and converting each word into a 200-dimensional vector with word2vec yields the ten 200-dimensional vectors in Table 3: x(0), x(1), ..., x(9).
  • word2vec is a family of related models used to produce word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words: the network takes words as input and learns to guess words in adjacent positions. Under the bag-of-words assumption in word2vec, word order is not important. After training, the word2vec model can be used to map each word to a vector that represents the relationships between words; this vector is the hidden layer of the neural network.
  • The A-BiLSTM is fed word by word in sentence order (from front to back, or from back to front).
  • The text classification method based on the A-BiLSTM neural network considers the context of each word, reduces the possibility of missing important information, helps grasp the meaning of the text as a whole, and improves the accuracy of text classification.
  • FIG. 4 is a schematic diagram of a text classification device based on the A-BiLSTM neural network provided by an embodiment of the application.
  • The device includes: an acquisition unit 41, a word segmentation processing unit 42, a first calculation unit 43, a second calculation unit 44, a third calculation unit 45, and a classification unit 46.
  • the obtaining unit 41 is used to obtain the target text.
  • the word segmentation processing unit 42 is configured to perform word segmentation processing on the target text to obtain N words, where N is a natural number greater than 2.
  • the first calculation unit 43 is configured to calculate a word vector corresponding to each word in the N words.
  • The second calculation unit 44 is configured to use the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node of the A-BiLSTM neural network, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and to calculate the forward depth feature representation vector of the target text.
  • The third calculation unit 45 is configured to use the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (t+T)-th node of the A-BiLSTM neural network, and to calculate the backward depth feature representation vector of the target text.
  • the classification unit 46 is configured to classify the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • the classification unit 46 includes: a connection processing subunit, an input subunit, and a first determining subunit.
  • the connection processing subunit is used to perform connection processing on the forward depth feature representation vector and the backward depth feature representation vector, and the obtained vector is used as the depth feature representation vector of the target text.
  • the input subunit is used to input the depth feature representation vector of the target text into the classifier function, and the classifier function classifies the target text to obtain the classification result.
  • the first determining subunit is used to use the classification result as the category of the target text.
  • the device further includes: a training unit.
  • the training unit is used to train the A-BiLSTM neural network before the third calculation unit 45 calculates the backward depth feature representation vector of the target text.
  • the training unit includes: an acquisition subunit, a first calculation subunit, a second calculation subunit, a second determination subunit, a third calculation subunit, and a third determination subunit.
  • the acquisition subunit is used to acquire multiple training samples.
  • The first calculation subunit is configured, for a first training sample among the multiple training samples, to use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the forward s-th to forward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (s+T)-th node of the A-BiLSTM neural network, and to calculate the forward depth feature representation vector of the first training sample, where s is a natural number and the first training sample is any one of the multiple training samples.
  • The second calculation subunit is configured to use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the backward s-th to backward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (s+T)-th node of the A-BiLSTM neural network, and to calculate the backward depth feature representation vector of the first training sample.
  • the second determining subunit is used to determine the classification prediction result of the first training sample according to the forward depth feature representation vector of the first training sample and the backward depth feature representation vector of the first training sample.
  • the third calculation subunit is used to calculate the accuracy of text classification according to the classification prediction result and the category label of each training sample in the multiple training samples.
  • the third determining subunit is used to determine whether to stop training the A-BiLSTM neural network according to the accuracy of text classification and the value of the loss function.
  • the training samples are input to the A-BiLSTM neural network in batches, and the third calculation subunit is used to calculate the accuracy of text classification according to the classification prediction results and category labels of each training sample in the same batch.
  • the third determination subunit is used to continue training the A-BiLSTM neural network using the training samples of the next batch if the accuracy of the text classification corresponding to the same batch does not reach the preset accuracy.
  • the device further includes: a judging unit and a stopping unit.
  • the judging unit is configured to judge whether the value of the loss function is less than or equal to the preset threshold if the accuracy of the text classification corresponding to the same batch reaches the preset accuracy.
  • the stop unit is used to stop training the A-BiLSTM neural network if the value of the loss function is less than or equal to the preset threshold.
  • In one aspect, an embodiment of the present application provides a storage medium. The storage medium includes a stored program which, when run, controls the device where the storage medium is located to perform the following steps: obtain a target text; perform word segmentation on the target text to obtain N words, where N is a natural number greater than 2; calculate the word vector corresponding to each of the N words; use the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and calculate the forward depth feature representation vector of the target text; use the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes as the input of the backward (t+T)-th node, and calculate the backward depth feature representation vector of the target text; and classify the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: concatenate the forward depth feature representation vector and the backward depth feature representation vector, and use the resulting vector as the depth feature representation vector of the target text; input the depth feature representation vector of the target text into a classifier function, which classifies the target text to obtain a classification result; and use the classification result as the category of the target text.
  • Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: obtain multiple training samples; for a first training sample among the multiple training samples, use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the forward s-th to forward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (s+T)-th node, and calculate the forward depth feature representation vector of the first training sample, where s is a natural number and the first training sample is any one of the multiple training samples; use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the backward s-th to backward (s+T-1)-th nodes as the input of the backward (s+T)-th node, and calculate the backward depth feature representation vector of the first training sample; determine the classification prediction result of the first training sample according to the forward and backward depth feature representation vectors of the first training sample; calculate the accuracy of text classification according to the classification prediction result and category label of each of the multiple training samples; and determine whether to stop training the A-BiLSTM neural network according to the accuracy of text classification and the value of the loss function.
  • Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: calculate the accuracy of text classification according to the classification prediction results and category labels of the training samples in the same batch; if the accuracy of text classification corresponding to that batch does not reach the preset accuracy, continue to train the A-BiLSTM neural network with the next batch of training samples.
  • Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: if the accuracy of text classification corresponding to that batch reaches the preset accuracy, judge whether the value of the loss function is less than or equal to the preset threshold; if the value of the loss function is less than or equal to the preset threshold, stop training the A-BiLSTM neural network.
  • In one aspect, an embodiment of the present application provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the following steps are implemented: obtain a target text; perform word segmentation on the target text to obtain N words, where N is a natural number greater than 2; calculate the word vector corresponding to each of the N words; use the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node, where T is a preset natural number with T ≥ 2 and t is a natural number with t+T ≤ N, and calculate the forward depth feature representation vector of the target text; use the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes as the input of the backward (t+T)-th node, and calculate the backward depth feature representation vector of the target text; and classify the target text according to the forward depth feature representation vector and the backward depth feature representation vector.
  • Optionally, when the processor executes the computer program, the following steps are also implemented: concatenate the forward depth feature representation vector and the backward depth feature representation vector, and use the resulting vector as the depth feature representation vector of the target text; input the depth feature representation vector of the target text into a classifier function, which classifies the target text to obtain a classification result; and use the classification result as the category of the target text.
  • Optionally, when the processor executes the computer program, the following steps are also implemented: obtain multiple training samples; for a first training sample among the multiple training samples, use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the forward s-th to forward (s+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (s+T)-th node, and calculate the forward depth feature representation vector of the first training sample, where s is a natural number and the first training sample is any one of the multiple training samples; use the word vector corresponding to the (s+T)-th word of the first training sample and the outputs of the backward s-th to backward (s+T-1)-th nodes as the input of the backward (s+T)-th node, and calculate the backward depth feature representation vector of the first training sample; determine the classification prediction result of the first training sample according to the forward and backward depth feature representation vectors of the first training sample; calculate the accuracy of text classification according to the classification prediction result and category label of each of the multiple training samples; and determine whether to stop training the A-BiLSTM neural network according to the accuracy of text classification and the value of the loss function.
  • Optionally, when the processor executes the computer program, the following steps are also implemented: calculate the accuracy of text classification according to the classification prediction results and category labels of the training samples in the same batch; if the accuracy of text classification corresponding to that batch does not reach the preset accuracy, continue to train the A-BiLSTM neural network with the next batch of training samples.
  • Optionally, when the processor executes the computer program, the following steps are also implemented: if the accuracy of text classification corresponding to that batch reaches the preset accuracy, judge whether the value of the loss function is less than or equal to the preset threshold; if the value of the loss function is less than or equal to the preset threshold, stop training the A-BiLSTM neural network.
  • Fig. 5 is a schematic diagram of a computer device provided by an embodiment of the present application.
  • the computer device 50 of this embodiment includes: a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and running on the processor 51.
  • When the computer program 53 is executed by the processor 51, the text classification method based on the A-BiLSTM neural network in the embodiments is implemented; to avoid repetition, the details are not repeated here.
  • Alternatively, when the computer program is executed by the processor 51, the functions of each model/unit in the text classification device based on the A-BiLSTM neural network in the embodiments are realized; to avoid repetition, the details are not repeated here.
  • the computer device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device may include, but is not limited to, a processor 51 and a memory 52.
  • FIG. 5 is only an example of the computer device 50 and does not constitute a limitation on the computer device 50; it may include more or fewer components than shown in the figure, combine certain components, or have different components.
  • computer equipment may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 51 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 52 may be an internal storage unit of the computer device 50, such as a hard disk or memory of the computer device 50.
  • The memory 52 may also be an external storage device of the computer device 50, such as a plug-in hard disk equipped on the computer device 50, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card), and so on.
  • the memory 52 may also include both an internal storage unit of the computer device 50 and an external storage device.
  • the memory 52 is used to store computer programs and other programs and data required by the computer equipment.
  • the memory 52 can also be used to temporarily store data that has been output or will be output.
  • the above-mentioned integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium.
  • The above-mentioned software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

A text classification method based on an A-BiLSTM neural network, a storage medium, and a computer device, relating to the field of artificial intelligence. The method comprises: performing word segmentation on a target text to obtain N words (S102); calculating a word vector corresponding to each of the N words (S103); using the word vector corresponding to the (t+T)-th word and the outputs of the forward t-th to forward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the forward (t+T)-th node of the A-BiLSTM neural network, and calculating a forward depth feature representation vector of the target text (S104); using the word vector corresponding to the (t+T)-th word and the outputs of the backward t-th to backward (t+T-1)-th nodes of the A-BiLSTM neural network as the input of the backward (t+T)-th node of the A-BiLSTM neural network, and calculating a backward depth feature representation vector of the target text (S105); and classifying the target text according to the forward depth feature representation vector and the backward depth feature representation vector (S106). The method can solve the problem of poor accuracy when a long short-term memory network classifies text.

Description

基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备
本申请要求于2019年07月15日提交中国专利局、申请号为201910633814.1、申请名称为“基于A-BiLSTM神经网络的文本分类方法和装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
【技术领域】
本申请涉及人工智能领域,尤其涉及一种基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备。
【背景技术】
LSTM(Long Short-Term Memory)是长短期记忆网络,是一种时间递归神经网络,适合于处理和预测时间序列中间隔和延迟相对较长的重要事件。
LSTM已经在科技领域有了多种应用。基于LSTM的系统可以学习翻译语言、控制机器人、图像分析、文档摘要、语音识别、手写识别、控制聊天机器人、预测疾病、点击率和股票、合成音乐等等。
在将LSTM应用于文本分类的过程中时,文本的上下文之间往往具有一定联系,例如“我生长在中国,……,我会说中文”,显然,“中文”与“中国”之间具有一定联系,而LSTM处理文本的某一个词语时,不考虑该词语的上下文环境,从而可能遗漏重要信息,导致文本分类的准确性差。
【申请内容】
有鉴于此,本申请实施例提供了一种基于A-BiLSTM神经网络的文本分类方法、装置、存储介质及计算机设备,用以解决长短期记忆网络对文本进行分类的准确性差的问题。
一方面,本申请实施例提供了一种基于A-BiLSTM神经网络的文本分类方法,所述方法包括:获取目标文本;将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;计算所述N个词语中每个词语对应的词向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向 第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算所述目标文本的前向深度特征表示向量;将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标文本的后向深度特征表示向量;根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
一方面,本申请实施例提供了一种基于A-BiLSTM神经网络的文本分类装置,所述装置包括:获取单元,用于获取目标文本;分词处理单元,用于将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;第一计算单元,用于计算所述N个词语中每个词语对应的词向量;第二计算单元,用于将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算所述目标文本的前向深度特征表示向量;第三计算单元,用于将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标文本的后向深度特征表示向量;分类单元,用于根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
一方面,本申请实施例提供了一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现上述基于A-BiLSTM神经网络的文本分类方法的步骤。
一方面,本申请实施例提供了一种存储介质,所述存储介质存储有计算机程序,所述计算机程序被处理器执行时实现上述基于A-BiLSTM神经网络的文本分类方法的步骤。
在本申请实施例中,将目标文本进行分词处理,得到N个词语,计算N个词语中每个词语对应的词向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,计算目标文本的前向深度特征表示向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深度特征表示向量;根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类,由于在对目标文本的某一词语进行处理时,将该词语的前面T个词语对应的词向量输入到A-BiLSTM神经网络,即考虑了该词语的前面T个词语的含义,降低了遗漏重要信息的可能性,有助于从整体上把握文本的含义,提升文本分类的准确性。
【附图说明】
为了更清楚地说明本申请实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其它的附图。
图1是本申请实施例提供的一种可选的基于A-BiLSTM神经网络的文本分类方法的流程图;
图2是本申请实施例提供的一种可选的训练A-BiLSTM神经网络的流程图;
图3是本申请是实施例提供的一种可选的将词语对应的向量输入A-BiLSTM神经网络的示意图;
图4是本申请实施例提供的一种可选的基于A-BiLSTM神经网络的文本分类装置的示意图;
图5是本申请实施例提供的一种可选的计算机设备的示意图。
【具体实施方式】
为了更好的理解本申请的技术方案,下面结合附图对本申请实施例进行详细描述。
应当明确,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本申请保护的范围。
在本申请实施例中使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本申请。在本申请实施例和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。
应当理解,本文中使用的术语“和/或”仅仅是一种描述关联对象的相同的字段,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
应当理解,尽管在本申请实施例中可能采用术语第一、第二、第三等来描述预设范围等,但这些预设范围不应限于这些术语。这些术语仅用来将预设范围彼此区分开。例如,在不脱离本申请实施例范围的情况下,第一预设范围也可以被称为第二预设范围,类似地,第二预设范围也可以被称为第一预设范围。
取决于语境,如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”或“响应于检测”。类似地,取决于语境,短语“如果确定”或“如果检测(陈述的条件或事件)”可以被解释成 为“当确定时”或“响应于确定”或“当检测(陈述的条件或事件)时”或“响应于检测(陈述的条件或事件)”。
技术术语解释:
A-BiLSTM:Advanced Bilateral Long Short-Term Memory,高级双向长短期记忆网络。
请参见图1,所示为根据本申请实施例的一种基于A-BiLSTM神经网络的文本分类方法的流程图,包括:
步骤S101,获取目标文本。
步骤S102,将目标文本进行分词处理,得到N个词语,N为大于2的自然数。
步骤S103,计算N个词语中每个词语对应的词向量。
步骤S104,将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算目标文本的前向深度特征表示向量。
步骤S105,将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深度特征表示向量。
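To make the data flow of steps S104 and S105 concrete, the following is a minimal Python sketch of how the input of the forward (t+T)-th node could be assembled from the current word vector and the outputs of the previous T forward nodes. The patent text does not specify how these quantities are combined, so the concatenation used here, the 0-based indexing, and the function name are illustrative assumptions only.

```python
import numpy as np

def forward_node_input(word_vecs, node_outputs, idx, T):
    """Assemble the input of the forward node at 0-based position idx (idx >= T).

    word_vecs    -- word vectors x(0)..x(N-1), each of shape (d,)
    node_outputs -- outputs already produced by forward nodes 0..idx-1, each of shape (h,)
    The word vector at idx is combined with the outputs of the previous T nodes;
    simple concatenation is an assumption, the patent only states that these
    quantities together form the node's input.
    """
    previous = node_outputs[idx - T: idx]                 # outputs of nodes idx-T .. idx-1
    return np.concatenate([word_vecs[idx]] + list(previous))
```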
作为一种可选的实施方式,在步骤S105计算目标文本的后向深度特征表示向量之前,方法还包括:训练A-BiLSTM神经网络。
请参见图2,所示为根据本申请实施例训练A-BiLSTM神经网络的流程图,包括:
步骤S201,获取多个训练样本。
步骤S202,对于多个训练样本中的第一训练样本,将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为A-BiLSTM神经网络的前向第s+T个节点的输入,计算第一训练样本的前向深度特征表示向量,s为自然数,第一训练样本为多个训练样本中的任意一个训练样本。
步骤S203,将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为A-BiLSTM神经网络的后向第s+T个节点的输入,计算第一训练样本的后向深度特征表示向量。
步骤S204,根据第一训练样本的前向深度特征表示向量和第一训练样本的后向深度特征表示向量确定第一训练样本的分类预测结果。
步骤S205,根据多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度。
步骤S206,根据文本分类的准确度和损失函数的值确定是否停止对A-BiLSTM神经网络进行训练。
作为一种可选的实施方式,训练样本分批输入A-BiLSTM神经网络,步骤S205和步骤S206根据多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据文本分类的准确度和损失函数的值的变化趋势确定是否停止对A-BiLSTM神经网络进行训练,具体可以包括:根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;如果同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对A-BiLSTM神经网络进行训练。如果同一个批次对应的文本分类的准确度达到预设准确度,则判断损失函数的值是否小于或等于预设阈值;如果损失函数的值小于或等于预设阈值,则停止对A-BiLSTM神经网络进行训练。
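As a rough illustration of the batch-wise stopping rule described in the preceding paragraph (per-batch classification accuracy checked against a preset accuracy, then the loss checked against a preset threshold), here is a minimal Python sketch; `model`, `batches`, `loss_fn`, and the two thresholds are placeholders and not names from the original text.

```python
def train_until_converged(model, batches, loss_fn, preset_accuracy, preset_threshold):
    """Hypothetical training loop for the stopping rule described above."""
    for batch in batches:                               # training samples arrive in batches
        inputs = [x for x, _ in batch]
        labels = [y for _, y in batch]
        preds = [model.predict(x) for x in inputs]      # classification prediction per sample
        accuracy = sum(p == y for p, y in zip(preds, labels)) / len(batch)
        loss = loss_fn(preds, labels)
        if accuracy >= preset_accuracy and loss <= preset_threshold:
            break                                       # both criteria met: stop training
        model.fit(batch)                                # otherwise keep training on batches
    return model
```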
步骤S106,根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类。
作为一种可选的实施方式,步骤S106中根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类,包括:将前向深度特征表示向量和后向深度特征表示向量进行连接处理,得到的向量作为目标文本的深度特征表示向量;将目标文本的深度特征表示向量输入分类器函数,分类器函数对目标文本进行分类得到分类结果;将分类结果作为目标文本的类别。
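The classification step described above can be illustrated with a short sketch: the forward and backward depth feature representation vectors are concatenated and fed to a classifier function. A softmax linear classifier is used here purely as an example of such a function; the weights `W`, bias `b`, and label list are illustrative and not from the original text.

```python
import numpy as np

def classify(forward_vec, backward_vec, W, b, labels):
    """Concatenate forward/backward depth features and apply a classifier function."""
    depth_feature = np.concatenate([forward_vec, backward_vec])   # depth feature of the target text
    logits = W @ depth_feature + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                          # softmax as an example classifier
    return labels[int(np.argmax(probs))]                          # classification result = text category
```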
在本申请实施例中,将目标文本进行分词处理,得到N个词语,计算N个词语中每个词语对应的词向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,计算目标文本的前向深度特征表示向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深度特征表示向量;根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类,由于在对目标文本的某一词语进行处理时,将该词语的前面T个词语对应的词向量输入到A-BiLSTM神经网络,即考虑了该词语的前面T个词语的含义,降低了遗漏重要信息的可能性,有助于从整体上把握文本的含义,提升文本分类的准确性。
下面对本申请实施例提供的训练过程进行详细说明。
将所有训练样本按照预设比例划分为开发集和测试集,对于训练过程中的第一个epoch,将开发集平均划分为k份,将其中一份作为验证集,其余k-1份作为训练集,训练中设定批大小。初始化网络模型设定为M0,在M0的基础上分别取5份训练得到5个模型M01,M02,M03,M04,M05,取其中在对应验证集上最好的模型作为下一个epoch的基础模型。
对于训练过程中的第一个epoch,将开发集平均划分为k份,其中一份作为验证集,其余k-1份作为训练集。
epoch:1个epoch等于使用训练集中的全部样本训练一次。
例如,假设开发集有2000条文本,假设开发集平均划分为5份,每份有400条文本,其中,第一份包括的文本有:文本1、文本2、……、文本400;第二份包括的文本有:文本401、文本402、……、文本800;第三份包括的文本有:文本801、文本802、……、文本1200;第四份包括的文本有:文本1201、文本1202、……、文本1600;第五份包括的文本有:文本1601、文本1602、……、文本2000。
如表1所示,划分训练集和验证集的方式可以包括:
情况一:将第一份包括的400条文本作为验证集,将其余1600条文本作为训练集。
情况二:将第二份包括的400条文本作为验证集,将其余1600条文本作为训练集。
情况三:将第三份包括的400条文本作为验证集,将其余1600条文本作为训练集。
情况四:将第四份包括的400条文本作为验证集,将其余1600条文本作为训练集。
情况五:将第五份包括的400条文本作为验证集,将其余1600条文本作为训练集。
表1
(表1原文以图像形式给出,其内容即上述情况一至情况五所列的验证集/训练集划分方式。)
使用分组情况一中的训练集对M0进行训练,得到模型M1,使用分组情况一中的验证集对模型M1的分类效果进行验证。
使用分组情况二中的训练集对M0进行训练,得到模型M2;使用分组情况二中的验证集对模型M2的分类效果进行验证。
使用分组情况三中的训练集对M0进行训练,得到模型M3;使用分组情况三中的验证集对模型M3的分类效果进行验证。
使用分组情况四中的训练集对M0进行训练,得到模型M4;使用分组情况四中的验证集对模型M4的分类效果进行验证。
使用分组情况五中的训练集对M0进行训练,得到模型M5;使用分组情况五中的验证集对模型M5的分类效果进行验证。
取对应的验证集分类效果最佳的分组情况训练得到的模型作为下一次epoch的基础模型,例如,假设分组情况二的分类效果最佳,则将模型M2作为下一次epoch的基础模型。
每次将对应的验证集上分类效果最佳的分组情况训练得到的模型作为下一次epoch的基础模型,直至epoch的次数达到预设次数,或者分类效果达到预设效果,则停止对模型的训练。
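The per-epoch procedure just described (split the development set into k folds, train k candidate models, and carry the best one forward) might look like the following Python sketch; `train_fn` and `evaluate_fn` stand in for the actual training and validation-scoring routines and are not names from the original text.

```python
import copy

def run_epoch(base_model, folds, train_fn, evaluate_fn):
    """Train k candidate models, one per validation fold, and keep the best one."""
    best_score, best_model = None, base_model
    for i, val_set in enumerate(folds):
        train_set = [sample for j, fold in enumerate(folds) if j != i for sample in fold]
        candidate = train_fn(copy.deepcopy(base_model), train_set)   # e.g. M01..M05 in the text
        score = evaluate_fn(candidate, val_set)                      # classification result on the fold's validation set
        if best_score is None or score > best_score:
            best_score, best_model = score, candidate
    return best_model                                                # base model for the next epoch
```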
请参见图3,示出了本申请实施例提供的向量的输入方式:其中C(t)表示一个LSTM节点的细胞状态单元,O(t+1)和O′(T+1)分别表示前向和后向的深度特征输出表示。
A-BiLSTM神经网络的参数的计算公式如(1)-(3)所示:
(公式(1)和公式(2)在原文中以图像形式给出,此处未复原。)
h_T = o_t ⊙ tanh(C′_T)   公式(3)
对公式中涉及的参数进行解释:
(紧随其后的一项参数说明在原文中以图像形式给出。)
W_C:更新门单元权重矩阵
x_t:第t个时刻对应的输入词向量
b_C:更新门单元偏置向量
C_t:词向量x_t对应的单元状态输出
f_t=σ(W_f·[h_{t-1},x_t]+b_f),W_f为遗忘门权重矩阵,b_f为遗忘门偏置向量,σ为sigmoid函数,其计算公式为σ(z)=1/(1+e^(-z))
⊙:向量按元素相乘(哈达玛乘积)
i_t=σ(W_i·[h_{t-1},x_t]+b_i),W_i为更新门权重矩阵,b_i为更新门偏置向量
h_{t-1}:词向量x_{t-1}对应的隐藏单元输出
h_t:词向量x_t对应的隐藏单元输出
o_t:第t个输出门单元的输出
其中,C′_T、h′_T根据以下公式(4)-(7)计算。(公式(4)-(7)在原文中以图像形式给出,此处未复原;原文指明其中一项根据公式(5)计算,另一项根据公式(7)计算。)
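Formulas (1)-(2) and (4)-(7) are given only as images in the source and cannot be reproduced here. As a point of reference, the sketch below implements the standard LSTM gate computations that the parameter definitions above correspond to (forget gate f_t, update gate i_t, output gate o_t, candidate cell state, h_t = o_t ⊙ tanh(C_t)); the output-gate parameters `W_o`, `b_o` are not listed in the text and are added as an assumption, and the patent's own combination over the previous T node outputs is not reproduced.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One standard LSTM step using the gates defined above (not the patent's modified formulas)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)               # forget gate
    i_t = sigmoid(W_i @ z + b_i)               # update (input) gate
    o_t = sigmoid(W_o @ z + b_o)               # output gate (assumed standard form)
    C_tilde = np.tanh(W_C @ z + b_C)           # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde         # new cell state
    h_t = o_t * np.tanh(C_t)                   # h_t = o_t ⊙ tanh(C_t)
    return h_t, C_t
```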
LSTM的输入数据为序列,比如在短文本(200个字以内)分类任务中,首先将文本数据向量化。
例如,将文本“记忆模块对最近的内容记忆比较清晰”进行分词处理,并通过word2vec算法将每一个词转换成一个200维(向量维度可以自定义,本申请实施例中以200维举例)的向量,则得到表2中的9个200维向量:x(0)、x(1)、……、x(8)。
表2
记忆 模块 最近 内容 记忆 比较 清晰
x(0) x(1) x(2) x(3) x(4) x(5) x(6) x(7) x(8)
再例如,将文本“复习节点的间距不同,前端复习节点密集,后端 复习节点疏松”进行分词处理,并通过word2vec算法将每一个词转换成一个200维的向量,则得到表3中的10个200维向量:x(0)、x(1)、……、x(9)。
表3
(表3原文以图像形式给出,内容为该句的分词结果及对应的10个200维向量x(0)至x(9)。)
word2vec,是用来产生词向量的相关模型。这些模型为浅而双层的神经网络,用来训练以重新建构语言学之词文本。网络以词表现,并且需猜测相邻位置的输入词,在word2vec中词袋模型假设下,词的顺序是不重要的。训练完成之后,word2vec模型可用来映射每个词到一个向量,可用来表示词对词之间的关系,该向量为神经网络之隐藏层。
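A brief sketch of producing 200-dimensional word vectors with the word2vec algorithm, using the gensim library as one possible implementation (the `vector_size` parameter name assumes gensim 4.x; the training corpus here is only the single segmented example sentence, which is illustrative only).

```python
from gensim.models import Word2Vec

# Each training "sentence" is a list of already-segmented words.
sentences = [["记忆", "模块", "对", "最近", "的", "内容", "记忆", "比较", "清晰"]]

model = Word2Vec(sentences, vector_size=200, window=5, min_count=1)  # 200-dim word vectors
x = [model.wv[w] for w in sentences[0]]   # x(0) .. x(8), one 200-dimensional vector per word
```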
对于第一句话,有9个词,即可以转换成9*200的矩阵;
对于第二句话,有10个词,即可以转换成10*200的矩阵。
A-BiLSTM的输入是逐个词按句子顺序输入(从前往后或者从后往前)。
上述算法解析:假设T=5,即最多考虑前5个词和最多考虑后5个词:
输入“记忆”,前面的词数量为0,则用5个200维的预定义向量代替;输入“模块”,前面有一个词,其余四个使用4个200维预定义向量代替;……;输入“清晰”,考虑前面5个词。
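The padding behaviour described above (when fewer than T preceding words exist, the missing ones are made up with predefined vectors) could be sketched as follows; zero vectors are used here as stand-ins for the predefined vectors, which is an assumption, since the text does not say what the predefined vectors contain.

```python
import numpy as np

def padded_window(word_vecs, idx, T=5, dim=200):
    """Collect the vectors of the T words before position idx, padding the front
    with predefined vectors (here zero vectors, as an assumption) when fewer
    than T preceding words are available."""
    start = max(0, idx - T)
    previous = word_vecs[start:idx]
    missing = T - len(previous)
    padding = [np.zeros(dim)] * missing        # stand-ins for the predefined vectors
    return padding + previous
```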
本申请实施例提供的基于A-BiLSTM神经网络的文本分类方法考虑了词语的上下文环境,降低了遗漏重要信息的可能性,有助于从整体上把握文本的含义,提升了文本分类的准确性。
请参阅图4,所示为本申请实施例提供的基于A-BiLSTM神经网络的文本分类装置的示意图,该装置包括:获取单元41、分词处理单元42、第一计算单元43、第二计算单元44、第三计算单元45、分类单元46。
获取单元41,用于获取目标文本。
分词处理单元42,用于将目标文本进行分词处理,得到N个词语,N为大于2的自然数。
第一计算单元43,用于计算N个词语中每个词语对应的词向量。
第二计算单元44,用于将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算目标文本的前向深度特征表示向量。
第三计算单元45,用于将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深 度特征表示向量。
分类单元46,用于根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类。
可选地,分类单元46包括:连接处理子单元、输入子单元、第一确定子单元。连接处理子单元,用于将前向深度特征表示向量和后向深度特征表示向量进行连接处理,得到的向量作为目标文本的深度特征表示向量。输入子单元,用于将目标文本的深度特征表示向量输入分类器函数,分类器函数对目标文本进行分类得到分类结果。第一确定子单元,用于将分类结果作为目标文本的类别。
可选地,装置还包括:训练单元。训练单元,用于在第三计算单元45计算目标文本的后向深度特征表示向量之前,训练A-BiLSTM神经网络。训练单元包括:获取子单元、第一计算子单元、第二计算子单元、第二确定子单元、第三计算子单元、第三确定子单元。获取子单元,用于获取多个训练样本。第一计算子单元,用于对于多个训练样本中的第一训练样本,将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为A-BiLSTM神经网络的前向第s+T个节点的输入,计算第一训练样本的前向深度特征表示向量,s为自然数,第一训练样本为多个训练样本中的任意一个训练样本。第二计算子单元,用于将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为A-BiLSTM神经网络的后向第s+T个节点的输入,计算第一训练样本的后向深度特征表示向量。第二确定子单元,用于根据第一训练样本的前向深度特征表示向量和第一训练样本的后向深度特征表示向量确定第一训练样本的分类预测结果。第三计算子单元,用于根据多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度。第三确定子单元,用于根据文本分类的准确度和损失函数的值确定是否停止对A-BiLSTM神经网络进行训练。
可选地,训练样本分批输入A-BiLSTM神经网络,第三计算子单元用于根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度。第三确定子单元用于如果同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对A-BiLSTM神经网络进行训练。
可选地,装置还包括:判断单元、停止单元。判断单元,用于如果同一个批次对应的文本分类的准确度达到预设准确度,则判断损失函数的值是否小于或等于预设阈值。停止单元,用于如果损失函数的值小于或等于预设阈值,则停止对A-BiLSTM神经网络进行训练。
一方面,本申请实施例提供了一种存储介质,存储介质包括存储的程序,其中,在程序运行时控制存储介质所在设备执行以下步骤:获取目标文本;将目标文本进行分词处理,得到N个词语,N为大于2的自然数; 计算N个词语中每个词语对应的词向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算目标文本的前向深度特征表示向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深度特征表示向量;根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类。
可选地,在程序运行时控制存储介质所在设备还执行以下步骤:将前向深度特征表示向量和后向深度特征表示向量进行连接处理,得到的向量作为目标文本的深度特征表示向量;将目标文本的深度特征表示向量输入分类器函数,分类器函数对目标文本进行分类得到分类结果;将分类结果作为目标文本的类别。
可选地,在程序运行时控制存储介质所在设备还执行以下步骤:获取多个训练样本;对于多个训练样本中的第一训练样本,将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为A-BiLSTM神经网络的前向第s+T个节点的输入,计算第一训练样本的前向深度特征表示向量,s为自然数,第一训练样本为多个训练样本中的任意一个训练样本;将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为A-BiLSTM神经网络的后向第s+T个节点的输入,计算第一训练样本的后向深度特征表示向量;根据第一训练样本的前向深度特征表示向量和第一训练样本的后向深度特征表示向量确定第一训练样本的分类预测结果;根据多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据文本分类的准确度和损失函数的值确定是否停止对A-BiLSTM神经网络进行训练。
可选地,在程序运行时控制存储介质所在设备还执行以下步骤:根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;如果同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对A-BiLSTM神经网络进行训练。
可选地,在程序运行时控制存储介质所在设备还执行以下步骤:如果同一个批次对应的文本分类的准确度达到预设准确度,则判断损失函数的值是否小于或等于预设阈值;如果损失函数的值小于或等于预设阈值,则停止对A-BiLSTM神经网络进行训练。
一方面,本申请实施例提供了一种计算机设备,包括存储器、处理器以及存储在存储器中并可在处理器上运行的计算机程序,其特征在于,处理器执行计算机程序时实现以下步骤:获取目标文本;将目标文本进行分词处理,得到N个词语,N为大于2的自然数;计算N个词语中每个词语对应的词向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的 前向第t个节点至前向第t+T-1个节点的输出作为A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算目标文本的前向深度特征表示向量;将第t+T个词语对应的词向量、A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为A-BiLSTM神经网络的后向第t+T个节点的输入,计算目标文本的后向深度特征表示向量;根据前向深度特征表示向量和后向深度特征表示向量对目标文本进行分类。
可选地,处理器执行计算机程序时还实现以下步骤:将前向深度特征表示向量和后向深度特征表示向量进行连接处理,得到的向量作为目标文本的深度特征表示向量;将目标文本的深度特征表示向量输入分类器函数,分类器函数对目标文本进行分类得到分类结果;将分类结果作为目标文本的类别。
可选地,处理器执行计算机程序时还实现以下步骤:获取多个训练样本;对于多个训练样本中的第一训练样本,将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为A-BiLSTM神经网络的前向第s+T个节点的输入,计算第一训练样本的前向深度特征表示向量,s为自然数,第一训练样本为多个训练样本中的任意一个训练样本;将第一训练样本的第s+T个词语对应的词向量、A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为A-BiLSTM神经网络的后向第s+T个节点的输入,计算第一训练样本的后向深度特征表示向量;根据第一训练样本的前向深度特征表示向量和第一训练样本的后向深度特征表示向量确定第一训练样本的分类预测结果;根据多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据文本分类的准确度和损失函数的值确定是否停止对A-BiLSTM神经网络进行训练。
可选地,处理器执行计算机程序时还实现以下步骤:根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;如果同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对A-BiLSTM神经网络进行训练。
可选地,处理器执行计算机程序时还实现以下步骤:如果同一个批次对应的文本分类的准确度达到预设准确度,则判断损失函数的值是否小于或等于预设阈值;如果损失函数的值小于或等于预设阈值,则停止对A-BiLSTM神经网络进行训练。
图5是本申请实施例提供的一种计算机设备的示意图。如图5所示,该实施例的计算机设备50包括:处理器51、存储器52以及存储在存储器52中并可在处理器51上运行的计算机程序53,该计算机程序53被处理器51执行时实现实施例中的基于A-BiLSTM神经网络的文本分类方法,为避免重复,此处不一一赘述。或者,该计算机程序被处理器51执行时实现实施例中基于A-BiLSTM神经网络的文本分类装 置中各模型/单元的功能,为避免重复,此处不一一赘述。
计算机设备50可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。计算机设备可包括,但不仅限于,处理器51、存储器52。本领域技术人员可以理解,图5仅仅是计算机设备50的示例,并不构成对计算机设备50的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如计算机设备还可以包括输入输出设备、网络接入设备、总线等。
所称处理器51可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
存储器52可以是计算机设备50的内部存储单元,例如计算机设备50的硬盘或内存。存储器52也可以是计算机设备50的外部存储设备,例如计算机设备50上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,存储器52还可以既包括计算机设备50的内部存储单元也包括外部存储设备。存储器52用于存储计算机程序以及计算机设备所需的其他程序和数据。存储器52还可以用于暂时地存储已经输出或者将要输出的数据。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
上述以软件功能单元的形式实现的集成的单元,可以存储在一个计算机可读取存储介质中。上述软件功能单元存储在一个存储介质中,包括若干指令用以使得一台计算机装置(可以是个人计算机,服务器,或者网络装置等)或处理器(Processor)执行本申请各个实施例所述方法的部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述仅为本申请的较佳实施例而已,并不用以限制本申请,凡在本申请的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本申请保护的范围之内。

Claims (20)

  1. 一种基于A-BiLSTM神经网络的文本分类方法,其特征在于,所述方法包括:
    获取目标文本;
    将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;
    计算所述N个词语中每个词语对应的词向量;
    将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算所述目标文本的前向深度特征表示向量;
    将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标文本的后向深度特征表示向量;
    根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类,包括:
    将所述前向深度特征表示向量和所述后向深度特征表示向量进行连接处理,得到的向量作为所述目标文本的深度特征表示向量;
    将所述目标文本的深度特征表示向量输入分类器函数,所述分类器函数对所述目标文本进行分类得到分类结果;
    将所述分类结果作为所述目标文本的类别。
  3. 根据权利要求1所述的方法,其特征在于,在所述计算所述目标文本的后向深度特征表示向量之前,所述方法还包括:训练所述A-BiLSTM神经网络,
    所述训练所述A-BiLSTM神经网络,包括:
    获取多个训练样本;
    对于所述多个训练样本中的第一训练样本,将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第s+T个节点的输入,计算所述第一训练样本的前向深度特征表示向量,s为自然数,所述第一训练样本为所述多个训练样本中的任意一个训练样本;
    将所述第一训练样本的第s+T个词语对应的词向量、所述A- BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第s+T个节点的输入,计算所述第一训练样本的后向深度特征表示向量;
    根据所述第一训练样本的前向深度特征表示向量和所述第一训练样本的后向深度特征表示向量确定所述第一训练样本的分类预测结果;
    根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    根据所述文本分类的准确度和损失函数的值确定是否停止对所述A-BiLSTM神经网络进行训练。
  4. 根据权利要求3所述的方法,其特征在于,所述训练样本分批输入所述A-BiLSTM神经网络,所述根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据所述文本分类的准确度和损失函数的值的变化趋势确定是否停止对所述A-BiLSTM神经网络进行训练,包括:
    根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    如果所述同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对所述A-BiLSTM神经网络进行训练。
  5. 根据权利要求4所述的方法,其特征在于,所述方法还包括:
    如果所述同一个批次对应的文本分类的准确度达到所述预设准确度,则判断所述损失函数的值是否小于或等于预设阈值;
    如果所述损失函数的值小于或等于所述预设阈值,则停止对所述A-BiLSTM神经网络进行训练。
  6. 一种基于A-BiLSTM神经网络的文本分类装置,其特征在于,所述装置包括:
    获取单元,用于获取目标文本;
    分词处理单元,用于将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;
    第一计算单元,用于计算所述N个词语中每个词语对应的词向量;
    第二计算单元,用于将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算所述目标文本的前向深度特征表示向量;
    第三计算单元,用于将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标 文本的后向深度特征表示向量;
    分类单元,用于根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
  7. 根据权利要求6所述的装置,其特征在于,所述分类单元包括:
    连接处理子单元,用于将所述前向深度特征表示向量和所述后向深度特征表示向量进行连接处理,得到的向量作为所述目标文本的深度特征表示向量;
    输入子单元,用于将所述目标文本的深度特征表示向量输入分类器函数,所述分类器函数对所述目标文本进行分类得到分类结果;
    第一确定子单元,用于将所述分类结果作为所述目标文本的类别。
  8. 根据权利要求6所述的装置,其特征在于,所述装置还包括:
    训练单元,用于在所述第三计算单元计算所述目标文本的后向深度特征表示向量之前,训练所述A-BiLSTM神经网络,
    所述训练单元包括:
    获取子单元,用于获取多个训练样本;
    第一计算子单元,用于对于所述多个训练样本中的第一训练样本,将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第s+T个节点的输入,计算所述第一训练样本的前向深度特征表示向量,s为自然数,所述第一训练样本为所述多个训练样本中的任意一个训练样本;
    第二计算子单元,用于将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第s+T个节点的输入,计算所述第一训练样本的后向深度特征表示向量;
    第二确定子单元,用于根据所述第一训练样本的前向深度特征表示向量和所述第一训练样本的后向深度特征表示向量确定所述第一训练样本的分类预测结果;
    第三计算子单元,用于根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    第三确定子单元,用于根据所述文本分类的准确度和损失函数的值确定是否停止对所述A-BiLSTM神经网络进行训练。
  9. 根据权利要求8所述的装置,其特征在于,所述第三计算子单元,还用于根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    所述第三确定子单元,还用于如果所述同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对所述A-BiLSTM神经网络进行训练。
  10. 根据权利要求9所述的装置,其特征在于,所述装置还包括:
    判断单元,用于如果所述同一个批次对应的文本分类的准确度达到所述预设准确度,则判断所述损失函数的值是否小于或等于预设阈值;
    停止单元,用于如果所述损失函数的值小于或等于所述预设阈值,则停止对所述A-BiLSTM神经网络进行训练。
  11. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现以下步骤:
    获取目标文本;
    将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;
    计算所述N个词语中每个词语对应的词向量;
    将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且t+T≤N,计算所述目标文本的前向深度特征表示向量;
    将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标文本的后向深度特征表示向量;
    根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
  12. 根据权利要求11所述的计算机设备,其特征在于,所述处理器执行所述计算机程序时实现所述根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类的步骤,包括:
    将所述前向深度特征表示向量和所述后向深度特征表示向量进行连接处理,得到的向量作为所述目标文本的深度特征表示向量;
    将所述目标文本的深度特征表示向量输入分类器函数,所述分类器函数对所述目标文本进行分类得到分类结果;
    将所述分类结果作为所述目标文本的类别。
  13. 根据权利要求11所述的计算机设备,其特征在于,所述处理器执行所述计算机程序时在实现所述计算所述目标文本的后向深度特征表示向量之前,还实现以下步骤:
    训练所述A-BiLSTM神经网络,
    所述训练所述A-BiLSTM神经网络,包括:
    获取多个训练样本;
    对于所述多个训练样本中的第一训练样本,将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的 前向第s+T个节点的输入,计算所述第一训练样本的前向深度特征表示向量,s为自然数,所述第一训练样本为所述多个训练样本中的任意一个训练样本;
    将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第s+T个节点的输入,计算所述第一训练样本的后向深度特征表示向量;
    根据所述第一训练样本的前向深度特征表示向量和所述第一训练样本的后向深度特征表示向量确定所述第一训练样本的分类预测结果;
    根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    根据所述文本分类的准确度和损失函数的值确定是否停止对所述A-BiLSTM神经网络进行训练。
  14. 根据权利要求13所述的计算机设备,其特征在于,所述训练样本分批输入所述A-BiLSTM神经网络,所述处理器执行所述计算机程序时在实现所述根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据所述文本分类的准确度和损失函数的值的变化趋势确定是否停止对所述A-BiLSTM神经网络进行训练,包括:
    根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    如果所述同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对所述A-BiLSTM神经网络进行训练。
  15. 根据权利要求14所述的计算机设备,其特征在于,所述处理器执行所述计算机程序时还实现以下步骤:
    如果所述同一个批次对应的文本分类的准确度达到所述预设准确度,则判断所述损失函数的值是否小于或等于预设阈值;
    如果所述损失函数的值小于或等于所述预设阈值,则停止对所述A-BiLSTM神经网络进行训练。
  16. 一种存储介质,所述存储介质包括存储的程序,其特征在于,在所述程序运行时控制所述存储介质所在设备执行以下步骤:
    获取目标文本;
    将所述目标文本进行分词处理,得到N个词语,N为大于2的自然数;
    计算所述N个词语中每个词语对应的词向量;
    将第t+T个词语对应的词向量、A-BiLSTM神经网络的前向第t个节点至前向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第t+T个节点的输入,T为预设自然数并且T≥2,t为自然数并且 t+T≤N,计算所述目标文本的前向深度特征表示向量;
    将第t+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第t个节点至后向第t+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第t+T个节点的输入,计算所述目标文本的后向深度特征表示向量;
    根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类。
  17. 根据权利要求16所述的存储介质,其特征在于,在所述程序运行时控制所述存储介质所在设备执行所述根据所述前向深度特征表示向量和所述后向深度特征表示向量对所述目标文本进行分类的步骤,包括:
    将所述前向深度特征表示向量和所述后向深度特征表示向量进行连接处理,得到的向量作为所述目标文本的深度特征表示向量;
    将所述目标文本的深度特征表示向量输入分类器函数,所述分类器函数对所述目标文本进行分类得到分类结果;
    将所述分类结果作为所述目标文本的类别。
  18. 根据权利要求16所述的存储介质,其特征在于,在所述程序运行时控制所述存储介质所在设备在执行所述计算所述目标文本的后向深度特征表示向量之前,还执行以下步骤:
    训练所述A-BiLSTM神经网络,
    所述训练所述A-BiLSTM神经网络,包括:
    获取多个训练样本;
    对于所述多个训练样本中的第一训练样本,将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的前向第s个节点至前向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的前向第s+T个节点的输入,计算所述第一训练样本的前向深度特征表示向量,s为自然数,所述第一训练样本为所述多个训练样本中的任意一个训练样本;
    将所述第一训练样本的第s+T个词语对应的词向量、所述A-BiLSTM神经网络的后向第s个节点至后向第s+T-1个节点的输出作为所述A-BiLSTM神经网络的后向第s+T个节点的输入,计算所述第一训练样本的后向深度特征表示向量;
    根据所述第一训练样本的前向深度特征表示向量和所述第一训练样本的后向深度特征表示向量确定所述第一训练样本的分类预测结果;
    根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    根据所述文本分类的准确度和损失函数的值确定是否停止对所述A-BiLSTM神经网络进行训练。
  19. 根据权利要求18所述的存储介质,其特征在于,所述训练样本 分批输入所述A-BiLSTM神经网络,在所述程序运行时控制所述存储介质所在设备执行所述根据所述多个训练样本中每个训练样本的分类预测结果和类别标签计算文本分类的准确度;根据所述文本分类的准确度和损失函数的值的变化趋势确定是否停止对所述A-BiLSTM神经网络进行训练,包括:
    根据同一个批次中的每个训练样本的分类预测结果和类别标签计算文本分类的准确度;
    如果所述同一个批次对应的文本分类的准确度未达到预设准确度,则继续使用下一个批次的训练样本对所述A-BiLSTM神经网络进行训练。
  20. 根据权利要求19所述的存储介质,其特征在于,在所述程序运行时控制所述存储介质所在设备还执行以下步骤:
    如果所述同一个批次对应的文本分类的准确度达到所述预设准确度,则判断所述损失函数的值是否小于或等于预设阈值;
    如果所述损失函数的值小于或等于所述预设阈值,则停止对所述A-BiLSTM神经网络进行训练。
PCT/CN2019/118083 2019-07-15 2019-11-13 基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备 WO2021008037A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910633814.1A CN110457471A (zh) 2019-07-15 2019-07-15 基于A-BiLSTM神经网络的文本分类方法和装置
CN201910633814.1 2019-07-15

Publications (1)

Publication Number Publication Date
WO2021008037A1 true WO2021008037A1 (zh) 2021-01-21

Family

ID=68482803

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118083 WO2021008037A1 (zh) 2019-07-15 2019-11-13 基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备

Country Status (2)

Country Link
CN (1) CN110457471A (zh)
WO (1) WO2021008037A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011533A (zh) * 2021-04-30 2021-06-22 平安科技(深圳)有限公司 文本分类方法、装置、计算机设备和存储介质
CN113177119A (zh) * 2021-05-07 2021-07-27 北京沃东天骏信息技术有限公司 文本分类模型训练、分类方法和系统及数据处理系统
CN113516198A (zh) * 2021-07-29 2021-10-19 西北大学 一种基于记忆网络和图神经网络的文化资源文本分类方法
CN116595978A (zh) * 2023-07-14 2023-08-15 腾讯科技(深圳)有限公司 对象类别识别方法、装置、存储介质及计算机设备

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112580820A (zh) * 2020-12-01 2021-03-30 遵义师范学院 一种间歇式机器学习训练方法

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829719A (zh) * 2018-05-07 2018-11-16 中国科学院合肥物质科学研究院 一种非事实类问答答案选择方法及系统

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3454260A1 (en) * 2017-09-11 2019-03-13 Tata Consultancy Services Limited Bilstm-siamese network based classifier for identifying target class of queries and providing responses thereof
CN108874896B (zh) * 2018-05-22 2020-11-06 大连理工大学 一种基于神经网络和幽默特征的幽默识别方法

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829719A (zh) * 2018-05-07 2018-11-16 中国科学院合肥物质科学研究院 一种非事实类问答答案选择方法及系统

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
STYMNE SARA, LOÁICIGA SHARID, CAP FABIENNE: "A BiLSTM-based System for Cross-lingual Pronoun Prediction", PROCEEDINGS OF THE THIRD WORKSHOP ON DISCOURSE IN MACHINE TRANSLATION, ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, STROUDSBURG, PA, USA, 8 September 2017 (2017-09-08), Stroudsburg, PA, USA, pages 47 - 53, XP055775684, DOI: 10.18653/v1/W17-4805 *
小文的数据之旅 (NON-OFFICIAL TRANSLATION: LEARNING DATA ANALYSIS WITH XIAOWEN): "轻松入门机器学习-线性回归实战 (Non-official translation: Machine Learning for Beginners: Practicing Linear Regression)", HTTPS://BLOG.CSDN.NET/D345389812/ARTICLE/DETAILS/93206773, 21 June 2019 (2019-06-21) *
王恰 (WANG, QIA): "基于Attention Bi-LSTM的文本分类方法研究 (Research on Text Classification Based on Attention Bi-LSTM)", 中国优秀硕士学位论文全文数据库 (电子期刊) (CHINESE MASTER’S THESES FULL-TEXT DATABASE (ELECTRONIC JOURNAL)), no. 01, 15 January 2019 (2019-01-15) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011533A (zh) * 2021-04-30 2021-06-22 平安科技(深圳)有限公司 文本分类方法、装置、计算机设备和存储介质
CN113011533B (zh) * 2021-04-30 2023-10-24 平安科技(深圳)有限公司 文本分类方法、装置、计算机设备和存储介质
CN113177119A (zh) * 2021-05-07 2021-07-27 北京沃东天骏信息技术有限公司 文本分类模型训练、分类方法和系统及数据处理系统
CN113177119B (zh) * 2021-05-07 2024-02-02 北京沃东天骏信息技术有限公司 文本分类模型训练、分类方法和系统及数据处理系统
CN113516198A (zh) * 2021-07-29 2021-10-19 西北大学 一种基于记忆网络和图神经网络的文化资源文本分类方法
CN113516198B (zh) * 2021-07-29 2023-09-22 西北大学 一种基于记忆网络和图神经网络的文化资源文本分类方法
CN116595978A (zh) * 2023-07-14 2023-08-15 腾讯科技(深圳)有限公司 对象类别识别方法、装置、存储介质及计算机设备
CN116595978B (zh) * 2023-07-14 2023-11-14 腾讯科技(深圳)有限公司 对象类别识别方法、装置、存储介质及计算机设备

Also Published As

Publication number Publication date
CN110457471A (zh) 2019-11-15

Similar Documents

Publication Publication Date Title
WO2021008037A1 (zh) 基于A-BiLSTM神经网络的文本分类方法、存储介质及计算机设备
WO2022022163A1 (zh) 文本分类模型的训练方法、装置、设备及存储介质
WO2022007823A1 (zh) 一种文本数据处理方法及装置
CN109992780B (zh) 一种基于深度神经网络特定目标情感分类方法
Luo et al. Online learning of interpretable word embeddings
JP7266674B2 (ja) 画像分類モデルの訓練方法、画像処理方法及び装置
CN109766557B (zh) 一种情感分析方法、装置、存储介质及终端设备
US10747961B2 (en) Method and device for identifying a sentence
WO2021135457A1 (zh) 基于循环神经网络的情绪识别方法、装置及存储介质
Nihal et al. Bangla sign alphabet recognition with zero-shot and transfer learning
US20220269939A1 (en) Graph-based labeling rule augmentation for weakly supervised training of machine-learning-based named entity recognition
Amidi et al. Vip cheatsheet: Recurrent neural networks
CN114419378B (zh) 图像分类的方法、装置、电子设备及介质
CN111178082A (zh) 一种句向量生成方法、装置及电子设备
WO2021147405A1 (zh) 客服语句质检方法及相关设备
CN113963205A (zh) 基于特征融合的分类模型训练方法、装置、设备及介质
WO2021212681A1 (zh) 语义角色标注方法、装置、计算机设备及存储介质
WO2024037526A1 (zh) 药物与靶标的相互作用预测方法、装置及存储介质
CN113065350A (zh) 一种基于注意力神经网络的生物医学文本词义消岐方法
CN113011532A (zh) 分类模型训练方法、装置、计算设备及存储介质
WO2023116572A1 (zh) 一种词句生成方法及相关设备
CN112445914A (zh) 文本分类方法、装置、计算机设备和介质
CN112364912A (zh) 信息分类方法、装置、设备及存储介质
Verma et al. VAGA: a novel viscosity-based accelerated gradient algorithm: Convergence analysis and applications
CN113011689A (zh) 软件开发工作量的评估方法、装置及计算设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19937667

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19937667

Country of ref document: EP

Kind code of ref document: A1