CN110263152B - Text classification method, system and computer equipment based on neural network - Google Patents

Text classification method, system and computer equipment based on neural network

Info

Publication number
CN110263152B
CN110263152B (application CN201910374240.0A)
Authority
CN
China
Prior art keywords
word
word segmentation
text
target
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910374240.0A
Other languages
Chinese (zh)
Other versions
CN110263152A (en)
Inventor
Yu Fengying (于凤英)
Wang Jianzong (王健宗)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910374240.0A priority Critical patent/CN110263152B/en
Priority to PCT/CN2019/102785 priority patent/WO2020224106A1/en
Publication of CN110263152A publication Critical patent/CN110263152A/en
Application granted granted Critical
Publication of CN110263152B publication Critical patent/CN110263152B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Abstract

The embodiment of the invention provides a text classification method based on a neural network, which includes the following steps: performing a word segmentation operation on the text to be classified to obtain L segmented words; performing word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector; performing a convolution operation on the L×d-dimensional word vector matrix through a convolution layer to obtain M convolution feature maps, wherein the convolution layer includes M f×d convolution kernels; configuring the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, wherein 1 ≤ j ≤ (L-f+1); and sequentially inputting the (L-f+1) input vectors into a long short-term memory network model and calculating the classification vector of the text to be classified. The text classification method provided by the embodiment of the invention effectively avoids text misclassification and thereby improves classification accuracy.

Description

Text classification method, system and computer equipment based on neural network
Technical Field
Embodiments of the present invention relate to the field of computer data processing, and in particular, to a text classification method, system, computer device and computer readable storage medium based on a neural network.
Background
Text classification is one of the important tasks of natural language processing; tasks such as the industry classification of articles and sentiment analysis are, in essence, text classification. The text classifiers commonly used at present fall into two main categories: text classifiers based on prior rules and text classifiers based on models. The classification rules of a rule-based text classifier must be mined manually or accumulated from prior knowledge. A model-based text classifier classifies text with a model such as an LDA (Latent Dirichlet Allocation, a document topic generation model) topic model.
However, the above classification methods often misclassify text, resulting in low classification accuracy.
Disclosure of Invention
Accordingly, an object of the embodiments of the present invention is to provide a text classification method, system, computer device and computer-readable storage medium based on a neural network, so as to solve the problems of text misclassification and low classification accuracy.
In order to achieve the above object, an embodiment of the present invention provides a text classification method based on a neural network, including the following steps:
performing a word segmentation operation on the text to be classified to obtain L segmented words;
performing word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector;
performing a convolution operation on the L×d-dimensional word vector matrix through a convolution layer to obtain M convolution feature maps, wherein the convolution layer includes M f×d convolution kernels;
configuring the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, wherein the order of the elements in the j-th input vector is determined by the value i of the convolution feature map in which each element is located, i being the convolution kernel index, 1 ≤ i ≤ M; and
sequentially inputting the (L-f+1) input vectors into a long short-term memory network model, and calculating the classification vector of the text to be classified.
Preferably, the step of performing the word segmentation operation on the text to be classified to obtain L segmented words includes:
acquiring a plurality of pieces of user attribute information of a plurality of users browsing the text to be classified;
analyzing, according to the plurality of pieces of user attribute information, the target group browsing the text to be classified;
obtaining, according to the historical user portraits of the target group, the prediction probability of the text to be classified corresponding to each topic;
screening out a plurality of target topics whose prediction probability is greater than a preset threshold according to the prediction probability of each topic; and
performing the word segmentation operation on the text to be classified based on the plurality of target topics.
Preferably, the step of performing the word segmentation operation on the text to be classified based on the plurality of target topics includes:
performing the word segmentation operation on the text to be classified according to the topic word libraries of the plurality of target topics.
Preferably, the step of performing the word segmentation operation on the text to be classified based on the plurality of target topics includes:
performing the word segmentation operation on the text to be classified according to the topic word library associated with each target topic to obtain a plurality of word segmentation sets;
comparing whether the segmented words of the respective word segmentation sets in the corresponding character position region are the same;
if they are the same, putting the segmented word of the corresponding character position region into a target word segmentation set; and
if they are not the same, selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set.
Preferably, the step of selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set includes:
analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region; and
selecting the segmented word with the highest division probability and putting it into the target word segmentation set.
Preferably, the step of selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set includes:
analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region;
calculating a comprehensive weight coefficient for the segmented words of each word segmentation set in the corresponding character position region according to their division probability and the prediction probability of the target topic associated with each word segmentation set; and
selecting the segmented word with the highest comprehensive weight coefficient and adding it to the target word segmentation set.
Preferably, the step of sequentially inputting the (L-f+1) input vectors into a long short-term memory network model and calculating the classification vector of the text to be classified includes:
obtaining (L-f+1) output vectors through the long short-term memory network model; and
inputting the (L-f+1) output vectors to a classification layer, and outputting the classification vector through the classification layer.
To achieve the above object, an embodiment of the present invention further provides a text classification system based on a neural network, including:
the word segmentation module is used for carrying out word segmentation operation on the text to be classified to obtain L word segments;
the word vector mapping module is used for performing word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector;
the convolution module is used for performing a convolution operation on the L×d-dimensional word vector matrix through a convolution layer to obtain M convolution feature maps, the convolution layer including M f×d convolution kernels;
the feature mapping module is used for configuring the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, wherein the order of the elements in the j-th input vector is determined by the value i of the convolution feature map in which each element is located, i being the convolution kernel index, 1 ≤ i ≤ M; and
the prediction module is used for sequentially inputting the (L-f+1) input vectors into a long short-term memory network model and calculating the classification vector of the text to be classified.
To achieve the above object, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the neural-network-based text classification method as described above.
To achieve the above object, an embodiment of the present invention also provides a computer-readable storage medium having stored therein a computer program executable by at least one processor to cause the at least one processor to perform the steps of the neural network-based text classification method as described above.
The text classification method, system, computer device and computer-readable storage medium based on a neural network provided by the embodiments of the present invention combine a convolution layer and a long short-term memory network model into a CNN+LSTM text classification model, which effectively takes into account both the local context features of the text and the dependency relations between long-span words. The problems of text misclassification and low classification accuracy can thus be solved, and the method is particularly suitable for classifying long texts.
Drawings
Fig. 1 is a flowchart of a text classification method based on a neural network according to an embodiment of the present invention.
Fig. 2 is a schematic flowchart of step S100 in fig. 1.
Fig. 3 is a schematic flowchart of step S1008 in fig. 2.
Fig. 4 is a schematic diagram of a program module of a text classification system according to a second embodiment of the invention.
Fig. 5 is a schematic diagram of a hardware structure of a third embodiment of the computer device of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the descriptions "first", "second", etc. in the present invention are for descriptive purposes only and shall not be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered absent and outside the scope of protection claimed by the present invention.
The following embodiments are described exemplarily with the computer device 2 as the execution subject.
Example 1
Referring to fig. 1, a flowchart illustrating steps of a text classification method based on a neural network according to an embodiment of the invention is shown. It will be appreciated that the flow charts in the method embodiments are not intended to limit the order in which the steps are performed. Specifically, the following is described.
Step S100, word segmentation operation is carried out on the text to be classified to obtain L word segments.
The word segmentation operation may use a dictionary-based segmentation algorithm, such as the forward maximum matching method, the reverse maximum matching method or the bi-directional matching method, or may be based on algorithms such as the hidden Markov model (HMM), CRF, SVM or deep learning.
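As an illustration of the dictionary-based option, the following is a minimal sketch of forward maximum matching; the toy dictionary, the maximum word length of 4 and the function name are assumptions for the example, not part of the embodiment:

```python
# Minimal sketch of dictionary-based forward maximum matching, one of the
# segmentation strategies mentioned above. Dictionary and max_len are assumed.
def forward_max_match(text, dictionary, max_len=4):
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words

# Example: segmenting with a toy topic word library.
print(forward_max_match("南京市长江大桥", {"南京市", "长江大桥", "长江", "大桥"}))
# -> ['南京市', '长江大桥']
```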
In an exemplary embodiment, referring to fig. 2, the step S100 may further include steps S1000 to S1008:
step S1000, obtaining a plurality of user attribute information of a plurality of users browsing the text to be classified. Exemplary user attribute information includes, but is not limited to, age, gender, occupation, territory, hobbies, and the like.
Step S1002, analyzing and obtaining a target group browsing the text to be classified according to the plurality of user attribute information of the plurality of users.
Step S1004, obtaining the prediction probability of the text to be classified corresponding to each topic according to the historical user portraits of the target group.
The historical user portrait is used for obtaining, according to the historical behavior information of the target group, the interest coefficient of the target group for each topic; there is a correspondence between the interest coefficient and the prediction probability.
Step S1006, screening out a plurality of target topics whose prediction probability is greater than a preset threshold according to the prediction probability of each topic.
Step S1008, performing the word segmentation operation on the text to be classified based on the plurality of target topics.
In an exemplary embodiment, the step S1008 may include: performing the word segmentation operation on the text to be classified according to the topic word libraries of the plurality of target topics. Specifically, the steps are as follows:
referring to fig. 3, the step S1008 may further include steps S1008A to S1008D:
step S1008A, performing the word segmentation operation on the text to be classified according to the topic word library associated with each target topic to obtain a plurality of word segmentation sets;
step S1008B, comparing whether the segmented words of the respective word segmentation sets in the corresponding character position region are the same;
step S1008C, if they are the same, putting the segmented word of the corresponding character position region into a target word segmentation set; and
step S1008D, if they are not the same, selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set.
In an exemplary embodiment, the step S1008D may further include:
step 1, analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region; and
step 2, selecting the segmented word with the highest division probability and putting it into the target word segmentation set.
In another exemplary embodiment, the step S1008D may further include the following steps, sketched in code below:
step 1, analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region;
step 2, calculating a comprehensive weight coefficient for the segmented words of each word segmentation set in the corresponding character position region according to their division probability and the prediction probability of the target topic associated with each word segmentation set; and
step 3, selecting the segmented word with the highest comprehensive weight coefficient and adding it to the target word segmentation set.
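A minimal sketch of steps 1 to 3 above, assuming the comprehensive weight coefficient is the product of the HMM division probability and the topic prediction probability; the embodiment requires only that both quantities enter the calculation, so the product rule and all names below are illustrative:

```python
# Hedged sketch of choosing among competing segmentations of the same
# character position region. The product combination is an assumption.
def pick_segmentation(candidates):
    """candidates: list of (word, hmm_division_prob, topic_prediction_prob)."""
    def comprehensive_weight(c):
        _, p_division, p_topic = c
        return p_division * p_topic  # assumed combination rule
    return max(candidates, key=comprehensive_weight)[0]

# Example: two topic word libraries segment the same region differently.
print(pick_segmentation([("长江大桥", 0.8, 0.6), ("长江", 0.7, 0.9)]))  # -> 长江
```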
Step S102, performing word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector.
In an exemplary embodiment, 128-dimensional word vectors for each word segment may be obtained through a word2vec model or the like.
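For instance, the mapping could be realized with gensim's word2vec implementation, as in the following sketch; gensim is an assumed choice (the embodiment only says "a word2vec model or the like"), and the toy corpus, variable names and d = 128 are illustrative:

```python
# Hedged sketch of the word-vector mapping step using gensim's word2vec.
import numpy as np
from gensim.models import Word2Vec

corpus = [["文本", "分类", "方法"], ["神经", "网络", "模型"]]       # toy training corpus
w2v = Word2Vec(sentences=corpus, vector_size=128, min_count=1)  # d = 128

segments = ["文本", "分类", "方法"]                   # the L segmented words
matrix = np.stack([w2v.wv[w] for w in segments])    # L x d word vector matrix
print(matrix.shape)                                 # (3, 128)
```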
Step S104, performing a convolution operation on the L×d-dimensional word vector matrix through a convolution layer to obtain M convolution feature maps, wherein the convolution layer comprises M f×d convolution kernels.
In an exemplary embodiment, the convolution layer includes a number of f×d convolution kernels with a stride of 1, and the convolution operation performed on the L×d word vector matrix through the convolution layer yields a number of convolution feature maps of dimension (L-f+1)×1. That is, each convolution feature map has a width of 1 and a length of L-f+1, where f is the length of the convolution kernel and L is the number of segmented words; L is a positive integer greater than 1.
Each convolution feature map contains (L-f+1)×1 elements, computed as

$c_{ij} = f(w_{ij} \odot m_i + b_i)$

where $c_{ij}$ is the feature value of the j-th of the (L-f+1) elements in the i-th convolution feature map, $w_{ij}$ is the portion of the word vector matrix covered by the convolution kernel corresponding to that element, $\odot$ denotes element-wise matrix multiplication, $m_i$ is the convolution kernel used to compute the i-th convolution feature map, $b_i$ is the bias term used to compute the i-th convolution feature map, and $f$ is a nonlinear activation function such as the ReLU function.
Specifically, the number of convolution kernels may be 4, so that four (L-f+1)×1 convolution feature maps are obtained.
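The convolution step can be sketched in PyTorch as follows; the sizes L = 10, d = 128, M = 4 and f = 3 are illustrative assumptions, and nn.Conv2d with kernel size (f, d) and stride 1 reproduces the (L-f+1)×1 feature-map shape described above:

```python
# Hedged PyTorch sketch of step S104: M kernels of size f x d slide with
# stride 1 over the L x d word-vector matrix, each producing an
# (L - f + 1) x 1 feature map.
import torch
import torch.nn as nn

L, d, M, f = 10, 128, 4, 3
word_matrix = torch.randn(1, 1, L, d)               # (batch, channel, L, d)
conv = nn.Conv2d(in_channels=1, out_channels=M, kernel_size=(f, d))
feature_maps = torch.relu(conv(word_matrix))        # ReLU as the activation f
print(feature_maps.shape)                           # torch.Size([1, 4, 8, 1])
```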
Step S106, configuring the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, wherein 1 ≤ j ≤ (L-f+1).
The order of the elements in the j-th input vector is determined by the value i of the convolution feature map in which each element is located, where i is the convolution kernel index and 1 ≤ i ≤ M.
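Step S106 amounts to transposing the stack of feature maps, as the following self-contained sketch shows; the random tensor stands in for the convolution output, and the sizes are the same illustrative ones as above:

```python
# Sketch of step S106: the j-th row collects the j-th element of every
# feature map, ordered by kernel index i.
import torch

M, L, f = 4, 10, 3
feature_maps = torch.randn(1, M, L - f + 1, 1)      # stand-in for the conv output
inputs = feature_maps.squeeze(-1).squeeze(0)        # (M, L - f + 1)
inputs = inputs.transpose(0, 1)                     # (L - f + 1, M): row j = j-th input vector
print(inputs.shape)                                 # torch.Size([8, 4])
```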
Step S108, sequentially inputting the (L-f+1) input vectors into a long short-term memory (LSTM) network model, and calculating the classification vector of the text to be classified.
The long short-term memory network model handles sequence dependencies across long spans and is therefore suitable for tasks involving long-range dependencies in long texts.
In an exemplary embodiment, the step S108 may further include steps S1080 to S1082:
step S1080, obtaining (L-f+1) output vectors through the long-short-term memory network model; and
Step S1082, inputting the (L-f+1) output vectors to a classification layer, and outputting classification vectors through the classification layer.
Illustratively, the classification vector of the text to be classified is calculated as follows:

(1) According to the output $h_{t-1}$ of the previous time step and the current input $x_t$, obtain the value $f_t$ that determines whether the previously learned information $C_{t-1}$ passes through fully or partially:

$f_t = \sigma(W_f[x_t, h_{t-1}] + b_f)$

where $f_t \in [0,1]$ represents the selection weight that the node at time t assigns to the cell memory at time t-1, $W_f$ is the weight matrix of the forget gate, $b_f$ is the bias term of the forget gate, $h_{t-1}$ is the hidden state information of node t-1, and the nonlinear function is $\sigma(x) = 1/(1+e^{-x})$.

(2) A sigmoid determines which values to update, and a tanh layer generates a new candidate value $q_t$ that may be added to the memory cell state as the candidate value generated by the current layer; the two generated values are combined to perform the update:

$i_t = \sigma(W_i[x_t, h_{t-1}] + b_i)$

where $i_t \in [0,1]$ represents the selection weight that the node at time t assigns to the current node information, $b_i$ is the bias term of the input gate, and $W_i$ is the weight matrix of the input gate.

The current node input information is

$q_t = \tanh(W_q[h_{t-1}, x_t] + b_q)$

where $b_q$ is a bias term, $W_q$ is the weight matrix of the information to be updated, tanh is the hyperbolic tangent activation function, $x_t$ is the input vector of the LSTM neural network node at time t, and $h_{t-1}$ is the hidden state information of node t-1.

The state of the old memory cell is then updated and the new information is added; the currently output memory information is

$C_t = f_t * C_{t-1} + i_t * q_t$

where $C_{t-1}$ is the memory information of node t-1, $f_t$ is the selection weight the node at time t assigns to the cell memory at time t-1, and $i_t$ is the selection weight the node at time t assigns to the current node information.

(3) Output of the LSTM model:

$o_t = \sigma(W_o[x_t, h_{t-1}] + b_o)$

where $o_t \in [0,1]$ represents the selection weight of the node at time t for the cell memory information, $b_o$ is the bias of the output gate, $W_o$ is the weight matrix of the output gate, and $[x_t, h_{t-1}]$ denotes the concatenation of $x_t$ and $h_{t-1}$, i.e. a vector of dimension $|x_t| + |h_{t-1}|$.

$h_t = o_t \cdot \tanh(C_t)$

Here $x_t$ represents the input data of the LSTM neural network node at time t, i.e. one of the (L-f+1) input vectors of this embodiment, and $h_t$ is the output vector of the LSTM neural network node at time t.
Through the above formulas, the LSTM model outputs (L-f+1) output vectors in total; these are input to a softmax layer, and the classification vector is output through that layer. Each parameter in the classification vector represents the confidence of the corresponding text category.
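The LSTM and softmax stages can be sketched end-to-end as follows; the hidden size of 64, the five text categories, and mean pooling over the (L-f+1) output vectors before the softmax layer are assumptions, since the embodiment does not fix how the output vectors are aggregated:

```python
# Hedged end-to-end sketch of step S108 plus the softmax classification layer.
import torch
import torch.nn as nn

M, L, f, num_classes = 4, 10, 3, 5
inputs = torch.randn(L - f + 1, M)                  # the (L - f + 1) input vectors from S106
lstm = nn.LSTM(input_size=M, hidden_size=64, batch_first=True)
classifier = nn.Linear(64, num_classes)

outputs, _ = lstm(inputs.unsqueeze(0))              # (1, L - f + 1, 64): the output vectors
pooled = outputs.mean(dim=1)                        # aggregate all output vectors (assumed)
class_vector = torch.softmax(classifier(pooled), dim=-1)
print(class_vector)                                 # one confidence per text category
```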
Example two
With continued reference to fig. 4, a program module diagram of the second embodiment, the text classification system of the invention, is shown. In this embodiment, the text classification system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete the present invention and implement the neural-network-based text classification method described above. A program module as referred to in the embodiments of the present invention is a series of computer program instruction segments capable of performing a specific function, and is better suited than the program itself to describing the execution of the text classification system 20 on a storage medium. The following description specifically introduces the functions of each program module of this embodiment:
the word segmentation module 200 is configured to perform word segmentation operation on the text to be classified to obtain L segmented words.
In an exemplary embodiment, the word segmentation module 200 may include an acquisition module, an analysis module, a topic prediction module, a screening module and a word segmentation module, specifically as follows:
The acquisition module is used for acquiring a plurality of pieces of user attribute information of a plurality of users browsing the text to be classified.
The analysis module is used for analyzing, according to the plurality of pieces of user attribute information, the target group browsing the text to be classified.
The topic prediction module is used for obtaining, according to the historical user portraits of the target group, the prediction probability of the text to be classified corresponding to each topic.
The analysis module is further used for acquiring target attribute information from the plurality of pieces of user attribute information.
The topic prediction module is further used for inputting the target attribute information into a pre-configured neural network model to obtain the prediction probability of each topic.
The screening module is used for screening out a plurality of target topics whose prediction probability is greater than a preset threshold according to the prediction probability of each topic.
The word segmentation module is used for performing the word segmentation operation on the text to be classified based on the plurality of target topics. The word segmentation module is further used for performing the word segmentation operation on the text to be classified according to the topic word libraries of the plurality of target topics, specifically: performing the word segmentation operation on the text to be classified according to the topic word library associated with each target topic to obtain a plurality of word segmentation sets; comparing whether the segmented words of the respective word segmentation sets in the corresponding character position region are the same; if they are the same, putting the segmented word of the corresponding character position region into a target word segmentation set; and if they are not the same, selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set.
In an exemplary embodiment, selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set further includes: analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region; and selecting the segmented word with the highest division probability and putting it into the target word segmentation set.
In another exemplary embodiment, selecting the segmented word of one of the word segmentation sets in the corresponding character position region and putting it into the target word segmentation set further includes: analyzing, through a hidden Markov model, the division probability of the segmented words of each word segmentation set in the corresponding character position region; calculating a comprehensive weight coefficient for the segmented words of each word segmentation set in the corresponding character position region according to their division probability and the prediction probability of the target topic associated with each word segmentation set; and selecting the segmented word with the highest comprehensive weight coefficient and adding it to the target word segmentation set.
The word vector mapping module 202 is configured to perform word vector mapping on the L segmented words respectively to obtain an L×d-dimensional word vector matrix, where each segmented word is mapped into a d-dimensional word vector.
In an exemplary embodiment, 128-dimensional word vectors for each word segment may be obtained through a word2vec model or the like.
The convolution module 204 is configured to perform a convolution operation on the L×d-dimensional word vector matrix through a convolution layer to obtain M convolution feature maps, where the convolution layer includes M f×d convolution kernels.
In an exemplary embodiment, the convolution layer includes a number of f×d convolution kernels with a stride of 1, and the convolution operation performed on the L×d word vector matrix through the convolution layer yields a number of convolution feature maps of dimension (L-f+1)×1. That is, each convolution feature map has a width of 1 and a length of L-f+1, where f is the length of the convolution kernel and L is the number of segmented words.
Each convolution feature map contains (L-f+1)×1 elements, computed as

$c_{ij} = f(w_{ij} \odot m_i + b_i)$

where $c_{ij}$ is the feature value of the j-th of the (L-f+1) elements in the i-th convolution feature map, $w_{ij}$ is the portion of the word vector matrix covered by the convolution kernel corresponding to that element, $\odot$ denotes element-wise matrix multiplication, $m_i$ is the convolution kernel used to compute the i-th convolution feature map, $b_i$ is the bias term used to compute the i-th convolution feature map, and $f$ is a nonlinear activation function such as the ReLU function.
Specifically, the number of convolution kernels may be 4, so that four (L-f+1)×1 convolution feature maps are obtained.
The feature mapping module 206 is configured to configure the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, where 1 ≤ j ≤ (L-f+1).
The order of the elements in the j-th input vector is determined by the value i of the convolution feature map in which each element is located, where i is the convolution kernel index and 1 ≤ i ≤ M.
The prediction module 208 is configured to sequentially input the (L-f+1) input vectors into the long short-term memory network model and calculate the classification vector of the text to be classified.
In an exemplary embodiment, the prediction module 208 is further configured to: obtain (L-f+1) output vectors through the long short-term memory network model; and input the (L-f+1) output vectors to a classification layer, outputting the classification vector through the classification layer.
Example III
Fig. 5 is a schematic hardware architecture of a computer device according to a third embodiment of the invention. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a PC, a rack server, a blade server, a tower server or a cabinet server (including a stand-alone server or a server cluster made up of multiple servers), or the like. As shown, the computer device 2 includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and the text classification system 20, communicatively coupled to each other via a system bus. Wherein:
in this embodiment, the memory 21 includes at least one type of computer-readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the computer device 20. Of course, the memory 21 may also include both internal storage units of the computer device 2 and external storage devices. In this embodiment, the memory 21 is typically used to store an operating system and various types of application software installed on the computer device 2, such as program codes of the text classification system 20 of the second embodiment. Further, the memory 21 may be used to temporarily store various types of data that have been output or are to be output.
The processor 22 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or process data, for example, execute the text classification system 20, to implement the text classification method based on the neural network of the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, which network interface 23 is typically used for establishing a communication connection between the computer apparatus 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, establish a data transmission channel and a communication connection between the computer device 2 and the external terminal, and the like. The network may be an Intranet (Intranet), the Internet (Internet), a global system for mobile communications (Global System of Mobile communication, GSM), wideband code division multiple access (Wideband Code Division Multiple Access, WCDMA), a 4G network, a 5G network, bluetooth (Bluetooth), wi-Fi, or other wireless or wired network.
It is noted that fig. 5 only shows a computer device 2 having components 20-23, but it is understood that not all of the illustrated components are required to be implemented, and that more or fewer components may alternatively be implemented.
In this embodiment, the text classification system 20 stored in the memory 21 may be further divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete the present invention.
For example, FIG. 4 shows a schematic diagram of the program modules implementing the second embodiment of the text classification system 20, in which the text classification system 20 may be divided into a word segmentation module 200, a word vector mapping module 202, a convolution module 204, a feature mapping module 206 and a prediction module 208. A program module in the present invention is understood to mean a series of computer program instruction segments capable of performing a specific function, better suited than a program to describing the execution of the text classification system 20 in the computer device 2. The specific functions of the program modules 200-208 are described in detail in the second embodiment and are not repeated here.
Example IV
The present embodiment also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by a processor, performs the corresponding functions. The computer readable storage medium of the present embodiment is used for storing the text classification system 20, and when executed by a processor, implements the neural network-based text classification method of the first embodiment.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or alternatively by hardware, although in many cases the former is the preferred implementation.
The foregoing description covers only preferred embodiments of the present invention and is not intended to limit its scope; any equivalent structure or equivalent process transformation made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (7)

1. A neural network-based text classification method, the method comprising:
word segmentation operation is carried out on the text to be classified to obtain L word segments;
word vector mapping is respectively carried out on the L segmented words so as to obtain an L x d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector;
performing convolution operation on the L x d dimension word vector matrix through a convolution layer to obtain M convolution feature graphs, wherein the convolution layer comprises M f x d convolution kernels;
configuring the jth element in each convolution feature map into a jth input vector to obtain (L-f+1) input vectors, wherein the order of the elements in the jth input vector is determined by the value i of the convolution feature map where each element is located, i is a convolution kernel identifier, and 1 ≤ i ≤ M; and
Sequentially inputting the (L-f+1) input vectors into a long-short-term memory network model, and calculating classification vectors of the text to be classified;
the step of performing word segmentation operation on the text to be classified to obtain L word segments comprises the following steps:
acquiring a plurality of user attribute information of a plurality of users browsing the text to be classified;
analyzing and obtaining a target group for browsing the text to be classified according to the attribute information of the plurality of users;
obtaining the prediction probability of the text to be classified corresponding to each theme according to the historical user portraits of the target group;
screening a plurality of target topics with the prediction probability larger than a preset threshold according to the prediction probability of each topic; and
Word segmentation operation is carried out on the text to be classified based on the target topics;
the step of word segmentation operation on the text to be classified based on the target topics comprises the following steps:
performing word segmentation operation on the text to be classified according to the topic word library associated with each target topic to obtain a plurality of word segmentation sets;
comparing whether the word segmentation of each word segmentation set in the corresponding character position area is the same or not;
if the character position areas are the same, putting the word segmentation of the corresponding character position areas into a target word segmentation set; and
If the character position areas are different, selecting word segmentation of one word segmentation set in the corresponding character position areas to be put into the target word segmentation set;
the step of selecting to put the word segment of one word segment set in the corresponding character position area into the target word segment set comprises the following steps:
analyzing the divided probabilities of the word segmentation of each word segmentation set in the corresponding character position area through a hidden Markov model;
according to the division probability of the word of each word set in the corresponding character position area and the prediction probability of the target subject associated with each word set, calculating the comprehensive weight coefficient of the word of each word set in the corresponding character position area; and
And selecting the word segmentation with the highest comprehensive weight coefficient to be added into the target word segmentation set.
2. The text classification method based on the neural network according to claim 1, wherein the step of performing a word segmentation operation on the text to be classified based on the plurality of target subjects comprises:
and performing word segmentation operation on the text to be classified according to a plurality of topic word libraries of the target topics.
3. The neural network-based text classification method of claim 1, wherein said selecting a word segment of one of the word segment sets in a corresponding character position region into said target word segment set comprises:
analyzing the divided probabilities of the word segmentation of each word segmentation set in the corresponding character position area through a hidden Markov model; and
And selecting the word segmentation with the highest probability of being divided to be put into the target word segmentation set.
4. The neural network-based text classification method of claim 1, wherein the step of sequentially inputting the (L-f+1) input vectors into a long-short-term memory network model, and calculating the classification vector of the text to be classified, comprises:
obtaining (L-f+1) output vectors through the long short-term memory network model; and
The (L-f+1) output vectors are input to a classification layer, through which classification vectors are output.
5. A neural network-based text classification system, comprising:
the word segmentation module is used for carrying out word segmentation operation on the text to be classified to obtain L word segments;
the word vector mapping module is used for respectively carrying out word vector mapping on the L segmented words to obtain an L-d-dimensional word vector matrix, wherein each segmented word is mapped into a d-dimensional word vector;
the convolution module is used for performing convolution operation on the L-d-dimensional word vector matrix through a convolution layer to obtain M convolution feature graphs, and the convolution layer comprises M f-d convolution kernels;
the feature mapping module is used for configuring the j-th element in each convolution feature map into the j-th input vector to obtain (L-f+1) input vectors, wherein the order of the elements in the j-th input vector is determined by the value i of the convolution feature map where each element is located, i is a convolution kernel identifier, and 1 ≤ i ≤ M; and
The prediction module is used for inputting the (L-f+1) input vectors into a long-short-term memory network model in sequence and calculating the classification vector of the text to be classified;
wherein, the word segmentation module is further used for:
acquiring a plurality of user attribute information of a plurality of users browsing the text to be classified;
analyzing and obtaining a target group for browsing the text to be classified according to the attribute information of the plurality of users;
obtaining the prediction probability of the text to be classified corresponding to each theme according to the historical user portraits of the target group;
screening a plurality of target topics with the prediction probability larger than a preset threshold according to the prediction probability of each topic; and
Word segmentation operation is carried out on the text to be classified based on the target topics;
the word segmentation operation on the text to be classified based on the target topics comprises the following steps:
performing word segmentation operation on the text to be classified according to the topic word library associated with each target topic to obtain a plurality of word segmentation sets;
comparing whether the word segmentation of each word segmentation set in the corresponding character position area is the same or not;
if the character position areas are the same, putting the word segmentation of the corresponding character position areas into a target word segmentation set; and
If the character position areas are different, selecting word segmentation of one word segmentation set in the corresponding character position areas to be put into the target word segmentation set;
the selecting step of putting the word segment of one word segment set in the corresponding character position area into the target word segment set comprises the following steps:
analyzing the divided probabilities of the word segmentation of each word segmentation set in the corresponding character position area through a hidden Markov model;
according to the division probability of the word of each word set in the corresponding character position area and the prediction probability of the target subject associated with each word set, calculating the comprehensive weight coefficient of the word of each word set in the corresponding character position area; and
And selecting the word segmentation with the highest comprehensive weight coefficient to be added into the target word segmentation set.
6. A computer device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the neural network-based text classification method of any of claims 1 to 4.
7. A computer-readable storage medium, in which a computer program is stored, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the neural network-based text classification method of any of claims 1 to 4.
CN201910374240.0A 2019-05-07 2019-05-07 Text classification method, system and computer equipment based on neural network Active CN110263152B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910374240.0A CN110263152B (en) 2019-05-07 2019-05-07 Text classification method, system and computer equipment based on neural network
PCT/CN2019/102785 WO2020224106A1 (en) 2019-05-07 2019-08-27 Text classification method and system based on neural network, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910374240.0A CN110263152B (en) 2019-05-07 2019-05-07 Text classification method, system and computer equipment based on neural network

Publications (2)

Publication Number Publication Date
CN110263152A CN110263152A (en) 2019-09-20
CN110263152B true CN110263152B (en) 2024-04-09

Family

ID=67914250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910374240.0A Active CN110263152B (en) 2019-05-07 2019-05-07 Text classification method, system and computer equipment based on neural network

Country Status (2)

Country Link
CN (1) CN110263152B (en)
WO (1) WO2020224106A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717330A (en) * 2019-09-23 2020-01-21 哈尔滨工程大学 Word-sentence level short text classification method based on deep learning
CN111178070B (en) * 2019-12-25 2022-11-25 深圳平安医疗健康科技服务有限公司 Word sequence obtaining method and device based on word segmentation and computer equipment
CN112597764B (en) * 2020-12-23 2023-07-25 青岛海尔科技有限公司 Text classification method and device, storage medium and electronic device
CN113204698B (en) * 2021-05-31 2023-12-26 平安科技(深圳)有限公司 News subject term generation method, device, equipment and medium
CN114579752B (en) * 2022-05-09 2023-05-26 中国人民解放军国防科技大学 Feature importance-based long text classification method and device and computer equipment
CN117473095B (en) * 2023-12-27 2024-03-29 合肥工业大学 Short text classification method and system based on theme enhancement word representation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169035A (en) * 2017-04-19 2017-09-15 华南理工大学 A kind of file classification method for mixing shot and long term memory network and convolutional neural networks
CN107301246A (en) * 2017-07-14 2017-10-27 河北工业大学 Chinese Text Categorization based on ultra-deep convolutional neural networks structural model
CN107729311A (en) * 2017-08-28 2018-02-23 云南大学 A kind of Chinese text feature extracting method of the fusing text tone
CN108763216A (en) * 2018-06-01 2018-11-06 河南理工大学 A kind of text emotion analysis method based on Chinese data collection
CN109299268A (en) * 2018-10-24 2019-02-01 河南理工大学 A kind of text emotion analysis method based on dual channel model
CN109543029A (en) * 2018-09-27 2019-03-29 平安科技(深圳)有限公司 File classification method, device, medium and equipment based on convolutional neural networks
CN109684476A (en) * 2018-12-07 2019-04-26 中科恒运股份有限公司 A kind of file classification method, document sorting apparatus and terminal device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9484015B2 (en) * 2013-05-28 2016-11-01 International Business Machines Corporation Hybrid predictive model for enhancing prosodic expressiveness
CN109213868A (en) * 2018-11-21 2019-01-15 中国科学院自动化研究所 Entity level sensibility classification method based on convolution attention mechanism network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169035A (en) * 2017-04-19 2017-09-15 华南理工大学 A kind of file classification method for mixing shot and long term memory network and convolutional neural networks
CN107301246A (en) * 2017-07-14 2017-10-27 河北工业大学 Chinese Text Categorization based on ultra-deep convolutional neural networks structural model
CN107729311A (en) * 2017-08-28 2018-02-23 云南大学 A kind of Chinese text feature extracting method of the fusing text tone
CN108763216A (en) * 2018-06-01 2018-11-06 河南理工大学 A kind of text emotion analysis method based on Chinese data collection
CN109543029A (en) * 2018-09-27 2019-03-29 平安科技(深圳)有限公司 File classification method, device, medium and equipment based on convolutional neural networks
CN109299268A (en) * 2018-10-24 2019-02-01 河南理工大学 A kind of text emotion analysis method based on dual channel model
CN109684476A (en) * 2018-12-07 2019-04-26 中科恒运股份有限公司 A kind of file classification method, document sorting apparatus and terminal device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Chinese Text Classification Based on a Hybrid VDCNN and LSTM Model; Peng Yuqing, Song Chubai, Yan Qian, Zhao Xiaosong, Wei Ming; Computer Engineering (计算机工程); 2017-11-13 (No. 11); pp. 196-202 *

Also Published As

Publication number Publication date
WO2020224106A1 (en) 2020-11-12
CN110263152A (en) 2019-09-20

Similar Documents

Publication Publication Date Title
CN110263152B (en) Text classification method, system and computer equipment based on neural network
CN110347835B (en) Text clustering method, electronic device and storage medium
CN110750965B (en) English text sequence labeling method, english text sequence labeling system and computer equipment
US11568315B2 (en) Systems and methods for learning user representations for open vocabulary data sets
CN108536800B (en) Text classification method, system, computer device and storage medium
CN108563722B (en) Industry classification method, system, computer device and storage medium for text information
US10762283B2 (en) Multimedia document summarization
JP5031206B2 (en) Fit exponential model
CN108520041B (en) Industry classification method and system of text, computer equipment and storage medium
CN110598206A (en) Text semantic recognition method and device, computer equipment and storage medium
CN111461637A (en) Resume screening method and device, computer equipment and storage medium
US20160012351A1 (en) Information processing device, information processing method, and program
CN110659667A (en) Picture classification model training method and system and computer equipment
CN111611374A (en) Corpus expansion method and device, electronic equipment and storage medium
CN112328909B (en) Information recommendation method and device, computer equipment and medium
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN114780746A (en) Knowledge graph-based document retrieval method and related equipment thereof
CN113254649B (en) Training method of sensitive content recognition model, text recognition method and related device
CN112861692B (en) Method and device for constructing room classification model, and method and device for classifying rooms
CN112508177A (en) Network structure searching method and device, electronic equipment and storage medium
CN112685656A (en) Label recommendation method and electronic equipment
CN110781404A (en) Friend relationship chain matching method, system, computer equipment and readable storage medium
CN115062619B (en) Chinese entity linking method, device, equipment and storage medium
CN113378866B (en) Image classification method, system, storage medium and electronic device
CN112989022B (en) Intelligent virtual text selection method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant