CN113704475A - Text classification method and device based on deep learning, electronic equipment and medium
- Publication number: CN113704475A (application CN202111011763.2A)
- Authority: CN (China)
- Prior art keywords: text, classification, classification result, hyperplane, feature vector
- Legal status: Pending
Classifications
- G06F16/35 — Information retrieval of unstructured textual data; Clustering; Classification
- G06F18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24155 — Classification techniques based on parametric or probabilistic models; Bayesian classification
- G06F40/30 — Handling natural language data; Semantic analysis
Abstract
The invention relates to the field of artificial intelligence, and discloses a text classification method based on deep learning, which comprises the following steps: performing feature extraction on a text set to be classified to obtain a feature vector set; performing a first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result; constructing a plurality of hyperplane functions, determining an optimal hyperplane by using the Lagrange multiplier method, and performing a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result; if the first classification result is inconsistent with the second classification result, adjusting the parameters of the naive Bayes classifier and the selection of the optimal hyperplane, and executing the first classification and the second classification again; and if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result. The invention also provides a text classification device based on deep learning, an electronic device and a storage medium. The invention can improve the accuracy of text classification based on deep learning.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to a text classification method and device based on deep learning, electronic equipment and a readable storage medium.
Background
With the rapid development of internet technology, application scenarios for text classification based on deep learning are very wide, and large amounts of data and information can be organized and managed through it. For example, the extensive use of various kinds of APPs relies on text classification based on deep learning: users often have questions about operation flows while using an APP, so they choose to feed their questions back to the artificial intelligence customer service in the APP's customer service system. When the artificial intelligence customer service receives a user's question, it can automatically classify the question text and then select and send the corresponding answer text to the user. If the text classification is inaccurate, the accuracy and efficiency of obtaining the answer text are greatly reduced, so a method for improving the accuracy of text classification is urgently needed.
Disclosure of Invention
The invention provides a text classification method and device based on deep learning, electronic equipment and a computer readable storage medium, and mainly aims to improve the accuracy of text classification based on deep learning.
In order to achieve the above object, the text classification method based on deep learning provided by the present invention includes:
acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
carrying out first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result;
constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and performing a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
judging whether the first classification result is consistent with the second classification result;
if the first classification result is inconsistent with the second classification result, adjusting the parameters of the naive Bayes classifier and the selection of the optimal hyperplane, and executing the first classification and the second classification again;
and if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result of the text to be classified.
Optionally, the constructing a plurality of hyperplane functions of the feature vector set includes:
mapping the vectors in the feature vector set to a plane coordinate system to obtain a plurality of plane normal vectors;
and obtaining a plurality of hyperplane functions according to the plurality of plane normal vectors and a preset regression function.
Optionally, the determining an optimal hyperplane among the plurality of hyperplane functions by using the Lagrange multiplier method includes:
determining two parallel hyperplane functions among the hyperplane functions by using a preset geometric margin, and performing formula conversion on the two parallel hyperplane functions to obtain a constraint condition;
and converting the constrained problem into an unconstrained one by using the Lagrange multiplier method, and solving it to obtain the optimal hyperplane between the two parallel hyperplane functions.
Optionally, the performing, by using a pre-constructed naive bayes classifier, a first classification on the feature vector set to obtain a first classification result includes:
dividing the feature vector set into a first training set and a first verification set;
classifying the first training set by using a pre-constructed naive Bayes classifier to obtain a prior probability of the first training set;
predicting the text categories of a plurality of feature vectors in the first training set according to the prior probability to obtain the predicted text categories of the first training set;
inputting the first verification set into the naive Bayes classifier to obtain a predicted text category of the first verification set;
if the predicted text category of the first verification set is inconsistent with the predicted text category of the first training set, adjusting the parameters of the naive Bayes classifier, and executing the operation of classifying the first training set by the naive Bayes classifier again;
and if the predicted text category of the first verification set is consistent with the predicted text category of the first training set, taking the predicted text category of the first verification set and/or the predicted text category of the first training set as a first classification result.
Optionally, the performing, by using the optimal hyperplane, a second classification on the feature vector set to obtain a second classification result, including:
dividing the feature vector set into a second training set and a second verification set;
acquiring a function of the optimal hyperplane, and classifying the second training set by using the function of the optimal hyperplane to obtain a text category to which the second training set belongs;
classifying the second verification set by using the function of the optimal hyperplane to obtain a text category to which the second verification set belongs;
judging whether the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs;
if the text category to which the second verification set belongs is inconsistent with the text category to which the second training set belongs, adjusting the parameters of the hyperplane function, and executing the operation of classifying the second training set by using the function of the optimal hyperplane again;
and if the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs, taking the text category to which the second verification set belongs and/or the text category to which the second training set belongs as a second classification result of the feature vector set.
Optionally, the performing feature extraction on the text set to be classified to obtain a feature vector set includes:
performing vectorization operation on the text set to be classified by using a pre-constructed word vector conversion model to generate a plurality of word vectors;
pre-training the word vectors to obtain a plurality of pre-trained word vectors;
and matching the text set to be classified with a plurality of word vectors to obtain matched word vectors, and forming the matched word vectors into a feature vector set.
Optionally, the matching the text set to be classified with the word vectors to obtain matched word vectors includes:
matching the text set to be classified with the plurality of word vectors respectively;
and converting the texts in the text set to be classified that match the word vectors into vectors to obtain the matched word vectors.
In order to solve the above problems, the present invention further provides a text classification device based on deep learning, wherein the device includes:
the text feature extraction module is used for acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
the naive Bayes classification module is used for carrying out first classification on the feature vector set by utilizing a pre-constructed naive Bayes classifier to obtain a first classification result;
the optimal hyperplane classification module is used for constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and performing a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
and the text final classification module is used for judging whether the first classification result is consistent with the second classification result, adjusting the parameters of the naive Bayes classifier and the selection of the optimal hyperplane if the first classification result is inconsistent with the second classification result, executing the first classification and the second classification again, and taking the first classification result or the second classification result as the classification result of the text to be classified if the first classification result is consistent with the second classification result.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one computer program; and
and a processor executing the computer program stored in the memory to implement the text classification method based on deep learning.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the text classification method based on deep learning described above.
The method comprises the steps of firstly carrying out first classification on a feature vector set through a naive Bayes classifier to obtain a first classification result, further determining an optimal hyperplane of the feature vector set, carrying out second classification on the feature vector set through the optimal hyperplane to obtain a second classification result, finally judging the consistency of the first classification result and the second classification result, and continuously adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane when the first classification result is inconsistent with the second classification result to improve the accuracy of model classification, thereby improving the accuracy of text classification based on deep learning. Therefore, the text classification method and device based on deep learning, the electronic device and the readable storage medium provided by the embodiment of the invention can improve the accuracy of text classification based on deep learning.
Drawings
Fig. 1 is a schematic flowchart of a text classification method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for text classification based on deep learning according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an internal structure of an electronic device implementing a text classification method based on deep learning according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a text classification method based on deep learning. The execution subject of the text classification method based on deep learning includes, but is not limited to, at least one of electronic devices such as a server and a terminal, which can be configured to execute the method provided by the embodiments of the present application. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like. In other words, the text classification method based on deep learning may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
Referring to fig. 1, which is a schematic flow diagram of a text classification method based on deep learning according to an embodiment of the present invention, in an embodiment of the present invention, the text classification method based on deep learning includes:
s1, obtaining a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set.
In the embodiment of the invention, the text set to be classified comprises a plurality of texts of different types. The text set to be classified can be obtained from the customer service question feedback of various banking APPs and wealth management APPs on the market, and the obtained texts can include the question text fed back by the user, the text length, sentiment analysis results and other information.
The feature vector set represents word features in the text set to be classified through vectors.
In detail, the performing feature extraction on the text set to be classified to obtain a feature vector set includes:
performing vectorization operation on the text set to be classified by using a pre-constructed word vector conversion model to generate a plurality of word vectors;
pre-training the word vectors to obtain a plurality of pre-trained word vectors;
and matching the text set to be classified with a plurality of word vectors to obtain matched word vectors, and forming the matched word vectors into a feature vector set.
Since the pre-trained word vectors in word embedding can reduce the output parameters in the word embedding process, the word vectors are trained in advance in the embodiment, so that the pre-trained word vectors are obtained. In addition, words with similar semantics can be gathered in a word vector space by training the word vector in advance, and convenience is provided for subsequent text classification operation based on deep learning.
Further, the matching the text set to be classified with the word vectors to obtain matched word vectors includes:
matching the text set to be classified with a plurality of word vectors respectively;
and converting the texts matched with the word vectors in the text set to be classified into vectors to obtain matched word vectors.
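As an illustration of this step only, the following sketch assumes gensim's Word2Vec as the pre-constructed word vector conversion model and jieba for Chinese word segmentation; both library choices, and all function and variable names, are assumptions for illustration rather than the patent's actual implementation. Matched word vectors are averaged into one feature vector per text.

```python
import jieba
import numpy as np
from gensim.models import Word2Vec

def extract_feature_vectors(texts, vector_size=100):
    # Segment each text to be classified into words.
    tokenized = [list(jieba.cut(text)) for text in texts]
    # Pre-train the word vectors on the corpus (the pre-training step above).
    model = Word2Vec(sentences=tokenized, vector_size=vector_size,
                     window=5, min_count=1, epochs=10)
    # Match each text against the trained vocabulary and combine the matched
    # word vectors into one feature vector per text by averaging.
    features = []
    for tokens in tokenized:
        matched = [model.wv[token] for token in tokens if token in model.wv]
        features.append(np.mean(matched, axis=0) if matched
                        else np.zeros(vector_size))
    return np.vstack(features)  # the feature vector set
```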
And S2, performing first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result.
In the embodiment of the invention, the naive Bayes classifier is a classifier that classifies the feature vector set based on the Bayes algorithm and is mainly used for text classification.
In detail, the performing a first classification on the feature vector set by using a pre-constructed naive bayes classifier to obtain a first classification result includes:
dividing the feature vector set into a first training set and a first verification set;
classifying the first training set by using a pre-constructed naive Bayes classifier to obtain a prior probability of the first training set;
predicting the text categories of a plurality of feature vectors in the first training set according to the prior probability to obtain the predicted text categories of the first training set;
inputting the first verification set into the naive Bayes classifier to obtain a predicted text category of the first verification set;
if the predicted text category of the first verification set is inconsistent with the predicted text category of the first training set, adjusting the parameters of the naive Bayes classifier, and executing the operation of classifying the first training set by the naive Bayes classifier again;
and if the predicted text category of the first verification set is consistent with the predicted text category of the first training set, taking the predicted text category of the first verification set and/or the predicted text category of the first training set as a first classification result.
In the embodiment of the invention, the prior probability represents the probability that a feature vector set obtained according to experience belongs to a certain text category; the posterior probability represents the probability that a feature vector set belongs to a certain text category when the feature vector set is classified.
For example, if "A" denotes a feature vector set and "B" denotes the text category corresponding to the feature vector set, the posterior probability is P(B|A), which Bayes' formula computes from the likelihood P(A|B) and the class prior P(B).
In the embodiment of the present invention, the classification of the feature vector set may be implemented by the conditional probability formula of the naive Bayes classifier.

Specifically, for example, let the set of text categories be $C = \{c_1, c_2, \ldots, c_n\}$ and the feature vector set be $D = \{d_1, d_2, \ldots, d_m\}$. The probability that the feature vector set $D$ belongs to text category $c_i$ can be calculated by the following formula:

$$P(c_i \mid D) = \frac{P(D \mid c_i)\,P(c_i)}{P(D)} = \frac{P(c_i) \prod_{j=1}^{m} P(d_j \mid c_i)}{P(D)}$$

where $P(D)$ is a preset fixed value; $P(c_i \mid D)$ is the probability of each text category under the condition $D$, and the text category corresponding to the item with the highest probability is taken as the text category of the feature vector set $D$; $P(D \mid c_i)$ is the probability of $D$ occurring under the condition that $c_i$ occurs; $P(c_i)$ is the prior probability of the $i$-th text category corresponding to the feature vectors; $P(d_j \mid c_i)$ is the probability of the $j$-th feature vector $d_j$ occurring when the text category is $c_i$; and $m$ is the number of feature vectors.

Further, if some $P(d_j \mid c_i) = 0$, then $P(c_i \mid D) = 0$ and the formula becomes meaningless. To avoid $P(d_j \mid c_i) = 0$, Laplace calibration (additive smoothing) can be used; the specific calibration method can be obtained from the prior art and is not described herein again.

Similarly, if some $P(c_i) = 0$, Laplace calibration may also be used.
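A minimal sketch of the first classification under stated assumptions: scikit-learn's MultinomialNB over term counts stands in for the pre-constructed naive Bayes classifier, and its alpha=1.0 setting is exactly the Laplace calibration mentioned above. Using counts rather than the dense feature vectors is a simplification for illustration; the names and split ratio are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def first_classification(texts, labels):
    # Term counts yield the P(d_j | c_i) estimates of the formula above.
    counts = CountVectorizer().fit_transform(texts)
    # Divide the data into a first training set and a first verification set.
    X_train, X_val, y_train, y_val = train_test_split(
        counts, labels, test_size=0.2, random_state=0)
    # alpha=1.0 is Laplace smoothing: no P(d_j | c_i) can become zero.
    classifier = MultinomialNB(alpha=1.0).fit(X_train, y_train)
    # Predicted text categories of the training and verification sets,
    # whose agreement is checked as described in the steps above.
    return classifier.predict(X_train), classifier.predict(X_val), classifier
```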
In the embodiment of the present invention, the first classification of the feature vector set may be implemented based on an artificial intelligence algorithm.
S3, constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and performing a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result.
In this embodiment of the present invention, the hyperplane function may be regarded as a decision boundary between the predicted text categories of the feature vector set.
In detail, the constructing the plurality of hyperplane functions of the feature vector set includes:
mapping the vectors in the feature vector set to a plane coordinate system to obtain a plurality of plane normal vectors;
and obtaining a plurality of hyperplane functions according to the plurality of plane normal vectors and a preset regression function.
In the embodiment of the present invention, the expression of the regression function may be $h_\theta(z) = g(\theta^{T} z)$.

For example, the feature vector set $D$ may be divided into $D_0$ and $D_1$. If there exist an $n$-dimensional vector $w$ and a real number $b$ such that, taking $\theta^{T} z = w^{T} x_i + b$, $h_\theta(z)$ can be rewritten as $h_{w,b}(x_i) = g(w^{T} x_i + b)$; every point $x_i$ belonging to $D_0$ satisfies $w^{T} x_i + b > 0$; and every point $x_i$ belonging to $D_1$ satisfies $w^{T} x_i + b < 0$, then the hyperplane function that completely separates $D_0$ and $D_1$ is $w^{T} x_i + b = 0$.
In the embodiment of the invention, the optimal hyperplane is the hyperplane that maximizes the distance to the sample points of the feature vector set closest to it on either side.

For example, the feature vector set $D$ may be divided into $D_0$ and $D_1$, whose sample points are distributed on the two sides of a hyperplane; the hyperplane for which the distance to the closest sample points of $D_0$ and $D_1$ is the largest is the optimal hyperplane.
In detail, the determining an optimal hyperplane among the plurality of hyperplane functions by using the Lagrange multiplier method includes:

determining two parallel hyperplane functions among the hyperplane functions by using a preset geometric margin, and performing formula conversion on the two parallel hyperplane functions to obtain a constraint condition;

and converting the constrained problem into an unconstrained one by using the Lagrange multiplier method, and solving it to obtain the optimal hyperplane between the two parallel hyperplane functions.

In the embodiment of the invention, the largest distance between the two parallel hyperplane functions is the maximum margin, and the constraint condition can be obtained from the maximum margin; the constraint condition restricts the search for the optimal value of the objective function to a limited space.
For example, the size of the geometric margin is:

$$\gamma^{(i)} = \frac{w^{T} x^{(i)} + b}{\lVert w \rVert}$$

where $w$ is the normal vector of the hyperplane and $x^{(i)}$ is the coordinate of sample point $A$, so the distance from sample point $A$ to the boundary line is $\lvert w^{T} x^{(i)} + b \rvert / \lVert w \rVert$.

Substituting the geometric margin into the hyperplane function $w^{T} x_i + b = 0$ yields two parallel hyperplane functions:

$$w^{T} x + b \geq 1, \quad y = 1$$

$$w^{T} x + b \leq -1, \quad y = -1$$

where $w^{T}$ is the transposed hyperplane normal vector, $x$ is a feature vector, and $b$ is the real-valued displacement term.

Combining the two parallel hyperplane functions gives:

$$y\,(w^{T} x + b) \geq 1$$

The distance from a sample point to the hyperplane can then be expressed as:

$$r = \frac{\lvert w^{T} x + b \rvert}{\lVert w \rVert}$$

Converting the distance from the sample point to the hyperplane yields the constraint condition:

$$\min_{w,b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{s.t. } y_i\,(w^{T} x_i + b) \geq 1,\ i = 1, \ldots, m$$

The Lagrange function is:

$$L(w, b, \gamma) = \frac{1}{2}\lVert w \rVert^{2} + \sum_{i=1}^{m} \gamma_i \left( 1 - y_i\,(w^{T} x_i + b) \right)$$

where $x_i$ is a feature vector, $y_i$ is the class label of $x_i$, $b$ is the real-valued displacement term, $\gamma_i$ is the Lagrange multiplier of the $i$-th constraint, and $\lVert w \rVert$ is the norm of the hyperplane normal vector.

Setting the partial derivatives of the Lagrange function to zero gives:

$$w = \sum_{i=1}^{m} \gamma_i y_i x_i, \qquad \sum_{i=1}^{m} \gamma_i y_i = 0$$

where $m$ denotes the number of sample points (only the support vectors lying on the two parallel hyperplanes carry nonzero multipliers), $x_i$ is a feature vector, $y_i$ is the class label of $x_i$, and $\gamma_i$ are the Lagrange multipliers.

The optimal hyperplane is then obtained:

$$f(x) = w^{T} x + b$$

where $f(x)$ is the optimal hyperplane function, $w^{T}$ is the transposed hyperplane normal vector, $x$ is the feature vector, and $b$ is the real-valued displacement term.
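A sketch of the second classification under the assumption that scikit-learn's linear-kernel SVC may serve as the optimal-hyperplane solver: internally it solves the same maximum-margin problem derived above through its dual Lagrangian form and exposes the resulting $w$ and $b$. The function name and the soft-margin parameter C are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def second_classification(features, labels):
    # A linear SVC solves min 1/2 ||w||^2 s.t. y_i (w^T x_i + b) >= 1
    # (softened by C) via the Lagrange-multiplier dual derived above.
    classifier = SVC(kernel="linear", C=1.0).fit(features, labels)
    w = classifier.coef_[0]       # hyperplane normal vector (binary case)
    b = classifier.intercept_[0]  # displacement term
    # Decision rule f(x) = sign(w^T x + b), as in the formula above.
    predictions = classifier.predict(features)
    return predictions, w, b
```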
In detail, the performing, by using the optimal hyperplane, a second classification on the feature vector set to obtain a second classification result includes:
dividing the feature vector set into a second training set and a second verification set;
acquiring a function of the optimal hyperplane, and classifying the second training set by using the function of the optimal hyperplane to obtain a text category to which the second training set belongs;
classifying the second verification set by using the function of the optimal hyperplane to obtain a text category to which the second verification set belongs;
judging whether the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs;
if the text category to which the second verification set belongs is inconsistent with the text category to which the second training set belongs, adjusting the parameters of the hyperplane function, and executing the operation of classifying the second training set by using the function of the optimal hyperplane again;
and if the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs, taking the text category to which the second verification set belongs and/or the text category to which the second training set belongs as a second classification result of the feature vector set.
In the embodiment of the present invention, the optimal hyperplane function is $f(x) = \mathrm{sign}(w^{T} x + b)$,

where $\mathrm{sign}$ is the step function

$$\mathrm{sign}(z) = \begin{cases} +1, & z \geq 0 \\ -1, & z < 0 \end{cases}$$
In the embodiment of the invention, the related data can be acquired and processed based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results.
S4, judging whether the first classification result is consistent with the second classification result.
In this embodiment of the present invention, the determining whether the first classification result is consistent with the second classification result includes: calculating the association degree of the first classification result and the second classification result; if the association degree is greater than a preset association degree, the first classification result is consistent with the second classification result, and if the association degree is not greater than the preset association degree, the first classification result is inconsistent with the second classification result.

In an optional embodiment, the calculation of the association degree between the first classification result and the second classification result may be implemented by a similarity algorithm, such as the cosine similarity algorithm or the Jaccard similarity coefficient algorithm, and the preset association degree may be set to 0.95 or set according to the actual service scenario.
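A minimal sketch of the consistency judgment, assuming the two classification results are per-text category labels and the association degree is computed as their agreement ratio; the 0.95 threshold is the one named above, while the agreement-ratio choice (rather than a specific cosine or Jaccard variant) is a simplifying assumption.

```python
import numpy as np

def results_consistent(first_result, second_result, threshold=0.95):
    first, second = np.asarray(first_result), np.asarray(second_result)
    # Association degree: the share of texts on which the two results agree.
    association = np.mean(first == second)
    return association > threshold
```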
S5, if the first classification result is inconsistent with the second classification result, adjusting the parameters of the naive Bayes classifier and the selection of the optimal hyperplane, and executing the first classification and the second classification again.
In this embodiment of the present invention, if the first classification result is inconsistent with the second classification result, a target text category cannot be output. The parameters of the naive Bayes classifier and the selection of the optimal hyperplane of the feature vector set are therefore adjusted, and the first classification and the second classification are executed again; that is, the naive Bayes classifier is used again to perform the first classification on the feature vector set, the optimal hyperplane of the feature vector set is reselected, and the reselected optimal hyperplane is used to perform the second classification on the feature vector set, until the first classification result is consistent with the second classification result.
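Tying the pieces together, a sketch of this re-execution loop under the previous snippets' assumptions; the parameter grid standing in for "adjusting the parameters" is purely illustrative and not the patent's adjustment strategy.

```python
from itertools import product
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

def classify_until_consistent(texts, features, labels):
    counts = CountVectorizer().fit_transform(texts)
    # Try adjusted parameter combinations until both results agree (S5),
    # then return either result as the final classification (S6).
    for alpha, C in product((1.0, 0.5, 2.0), (1.0, 10.0, 0.1)):
        first = MultinomialNB(alpha=alpha).fit(counts, labels).predict(counts)
        second = SVC(kernel="linear", C=C).fit(features, labels).predict(features)
        if results_consistent(first, second):  # from the S4 sketch above
            return first
    raise RuntimeError("no adjustment made the two classification results consistent")
```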
S6, if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result of the text to be classified.
For example, in the embodiment of the present invention, if the text to be classified is "I want to apply for credit card transaction", the obtained first classification result is "transaction acceptance for credit card" and the second classification result is also "transaction acceptance for credit card", then "transaction acceptance for credit card" is used as the classification result of the text to be classified.
In the embodiment of the invention, the classification result of the text to be classified can be obtained by using artificial intelligence and a preset judgment rule.
The method comprises the steps of firstly carrying out first classification on a feature vector set through a naive Bayes classifier to obtain a first classification result, further determining an optimal hyperplane of the feature vector set, carrying out second classification on the feature vector set through the optimal hyperplane to obtain a second classification result, finally judging the consistency of the first classification result and the second classification result, and continuously adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane when the first classification result is inconsistent with the second classification result to improve the accuracy of model classification, thereby improving the accuracy of text classification based on deep learning. Therefore, the text classification method based on deep learning provided by the embodiment of the invention can improve the accuracy of text classification based on deep learning.
Fig. 2 is a functional block diagram of the text classification apparatus based on deep learning according to the present invention.
The text classification device 100 based on deep learning according to the present invention can be installed in an electronic device. According to the implemented functions, the text classification apparatus based on deep learning may include a text feature extraction module 101, a naive Bayes classification module 102, an optimal hyperplane classification module 103 and a text final classification module 104. A module, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device to perform a fixed function, and that are stored in the memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the text feature extraction module 101 is configured to obtain a text set to be classified, and perform feature extraction on the text set to be classified to obtain a feature vector set.
In the embodiment of the invention, the text set to be classified comprises a plurality of texts of different types. The text set to be classified can be obtained from the customer service question feedback of various banking APPs and wealth management APPs on the market, and the obtained texts can include the question text fed back by the user, the text length, sentiment analysis results and other information.
The feature vector set represents word features in the text set to be classified through vectors.
In detail, the text feature extraction module 101 performs feature extraction on the text set to be classified by performing the following operations to obtain a feature vector set, including:
performing vectorization operation on the text set to be classified by using a pre-constructed word vector conversion model to generate a plurality of word vectors;
pre-training the word vectors to obtain a plurality of pre-trained word vectors;
and matching the text set to be classified with a plurality of word vectors to obtain matched word vectors, and forming the matched word vectors into a feature vector set.
Since the pre-trained word vectors in word embedding can reduce the output parameters in the word embedding process, the word vectors are trained in advance in the embodiment, so that the pre-trained word vectors are obtained. In addition, words with similar semantics can be gathered in a word vector space by training the word vector in advance, and convenience is provided for subsequent text classification operation based on deep learning.
Further, the matching the text set to be classified with the word vectors to obtain matched word vectors includes:
matching the text set to be classified with a plurality of word vectors respectively;
and converting the texts matched with the word vectors in the text set to be classified into vectors to obtain matched word vectors.
The naive bayes classification module 102 is configured to perform a first classification on the feature vector set by using a pre-constructed naive bayes classifier to obtain a first classification result.
In the embodiment of the invention, the naive Bayes classifier is a classifier that classifies the feature vector set based on the Bayes algorithm and is mainly used for text classification.
In detail, the naive Bayes classification module 102 performs a first classification on the feature vector set by using a pre-constructed naive Bayes classifier through the following operations to obtain a first classification result, including:
dividing the feature vector set into a first training set and a first verification set;
classifying the first training set by using a pre-constructed naive Bayes classifier to obtain a prior probability of the first training set;
predicting the text categories of a plurality of feature vectors in the first training set according to the prior probability to obtain the predicted text categories of the first training set;
inputting the first verification set into the naive Bayes classifier to obtain a predicted text category of the first verification set;
if the predicted text category of the first verification set is inconsistent with the predicted text category of the first training set, adjusting the parameters of the naive Bayes classifier, and executing the operation of classifying the first training set by the naive Bayes classifier again;
and if the predicted text category of the first verification set is consistent with the predicted text category of the first training set, taking the predicted text category of the first verification set and/or the predicted text category of the first training set as a first classification result.
In the embodiment of the invention, the prior probability represents the probability that a feature vector set obtained according to experience belongs to a certain text category; the posterior probability represents the probability that a feature vector set belongs to a certain text category when the feature vector set is classified.
For example, if "A" denotes a feature vector set and "B" denotes the text category corresponding to the feature vector set, the posterior probability is P(B|A), which Bayes' formula computes from the likelihood P(A|B) and the class prior P(B).
In the embodiment of the present invention, the classification of the feature vector set may be implemented by the conditional probability formula of the naive Bayes classifier.

Specifically, for example, let the set of text categories be $C = \{c_1, c_2, \ldots, c_n\}$ and the feature vector set be $D = \{d_1, d_2, \ldots, d_m\}$. The probability that the feature vector set $D$ belongs to text category $c_i$ is calculated by the following formula:

$$P(c_i \mid D) = \frac{P(D \mid c_i)\,P(c_i)}{P(D)} = \frac{P(c_i) \prod_{j=1}^{m} P(d_j \mid c_i)}{P(D)}$$

where $P(D)$ is a preset fixed value; $P(c_i \mid D)$ is the probability of each text category under the condition $D$, and the text category corresponding to the item with the highest probability is taken as the text category of the feature vector set $D$; $P(D \mid c_i)$ is the probability of $D$ occurring under the condition that $c_i$ occurs; $P(c_i)$ is the prior probability of the $i$-th text category corresponding to the feature vectors; $P(d_j \mid c_i)$ is the probability of the $j$-th feature vector $d_j$ occurring when the text category is $c_i$; and $m$ is the number of feature vectors.

Further, if some $P(d_j \mid c_i) = 0$, then $P(c_i \mid D) = 0$ and the formula becomes meaningless. To avoid $P(d_j \mid c_i) = 0$, Laplace calibration (additive smoothing) can be used; the specific calibration method can be obtained from the prior art and is not described herein again.

Similarly, if some $P(c_i) = 0$, Laplace calibration may also be used.
In the embodiment of the present invention, the first classification of the feature vector set may be implemented based on an artificial intelligence algorithm.
The optimal hyperplane classification module 103 is configured to construct a plurality of hyperplane functions of the feature vector set, determine an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and perform a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result.
In this embodiment of the present invention, the hyperplane function may be regarded as a decision boundary between the predicted text categories of the feature vector set.
In detail, the optimal hyperplane classification module 103 may construct a plurality of hyperplane functions of the feature vector set by performing the following operations, including:
mapping the vectors in the feature vector set to a plane coordinate system to obtain a plurality of plane normal vectors;
and obtaining a plurality of hyperplane functions according to the plurality of plane normal vectors and a preset regression function.
In the embodiment of the present invention, the expression of the regression function may be $h_\theta(z) = g(\theta^{T} z)$.

For example, the feature vector set $D$ may be divided into $D_0$ and $D_1$. If there exist an $n$-dimensional vector $w$ and a real number $b$ such that, taking $\theta^{T} z = w^{T} x_i + b$, $h_\theta(z)$ can be rewritten as $h_{w,b}(x_i) = g(w^{T} x_i + b)$; every point $x_i$ belonging to $D_0$ satisfies $w^{T} x_i + b > 0$; and every point $x_i$ belonging to $D_1$ satisfies $w^{T} x_i + b < 0$, then the hyperplane function that completely separates $D_0$ and $D_1$ is $w^{T} x_i + b = 0$.
In the embodiment of the invention, the optimal hyperplane is the hyperplane that maximizes the distance to the sample points of the feature vector set closest to it on either side.

For example, the feature vector set $D$ may be divided into $D_0$ and $D_1$, whose sample points are distributed on the two sides of a hyperplane; the hyperplane for which the distance to the closest sample points of $D_0$ and $D_1$ is the largest is the optimal hyperplane.
In detail, the optimal hyperplane classification module 103 may determine an optimal hyperplane among the plurality of hyperplane functions by using the Lagrange multiplier method through the following operations, including:

determining two parallel hyperplane functions among the hyperplane functions by using a preset geometric margin, and performing formula conversion on the two parallel hyperplane functions to obtain a constraint condition;

and converting the constrained problem into an unconstrained one by using the Lagrange multiplier method, and solving it to obtain the optimal hyperplane between the two parallel hyperplane functions.

In the embodiment of the invention, the largest distance between the two parallel hyperplane functions is the maximum margin, and the constraint condition can be obtained from the maximum margin; the constraint condition restricts the search for the optimal value of the objective function to a limited space.
For example, the size of the geometric margin is:

$$\gamma^{(i)} = \frac{w^{T} x^{(i)} + b}{\lVert w \rVert}$$

where $w$ is the normal vector of the hyperplane and $x^{(i)}$ is the coordinate of sample point $A$, so the distance from sample point $A$ to the boundary line is $\lvert w^{T} x^{(i)} + b \rvert / \lVert w \rVert$.

Substituting the geometric margin into the hyperplane function $w^{T} x_i + b = 0$ yields two parallel hyperplane functions:

$$w^{T} x + b \geq 1, \quad y = 1$$

$$w^{T} x + b \leq -1, \quad y = -1$$

where $w^{T}$ is the transposed hyperplane normal vector, $x$ is a feature vector, and $b$ is the real-valued displacement term.

Combining the two parallel hyperplane functions gives:

$$y\,(w^{T} x + b) \geq 1$$

The distance from a sample point to the hyperplane can then be expressed as:

$$r = \frac{\lvert w^{T} x + b \rvert}{\lVert w \rVert}$$

Converting the distance from the sample point to the hyperplane yields the constraint condition:

$$\min_{w,b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{s.t. } y_i\,(w^{T} x_i + b) \geq 1,\ i = 1, \ldots, m$$

The Lagrange function is:

$$L(w, b, \gamma) = \frac{1}{2}\lVert w \rVert^{2} + \sum_{i=1}^{m} \gamma_i \left( 1 - y_i\,(w^{T} x_i + b) \right)$$

where $x_i$ is a feature vector, $y_i$ is the class label of $x_i$, $b$ is the real-valued displacement term, $\gamma_i$ is the Lagrange multiplier of the $i$-th constraint, and $\lVert w \rVert$ is the norm of the hyperplane normal vector.

Setting the partial derivatives of the Lagrange function to zero gives:

$$w = \sum_{i=1}^{m} \gamma_i y_i x_i, \qquad \sum_{i=1}^{m} \gamma_i y_i = 0$$

where $m$ denotes the number of sample points (only the support vectors lying on the two parallel hyperplanes carry nonzero multipliers), $x_i$ is a feature vector, $y_i$ is the class label of $x_i$, and $\gamma_i$ are the Lagrange multipliers.

The optimal hyperplane is then obtained:

$$f(x) = w^{T} x + b$$

where $f(x)$ is the optimal hyperplane function, $w^{T}$ is the transposed hyperplane normal vector, $x$ is the feature vector, and $b$ is the real-valued displacement term.
In detail, the optimal hyperplane classification module 103 may perform a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result, where the second classification result includes:
dividing the feature vector set into a second training set and a second verification set;
acquiring a function of the optimal hyperplane, and classifying the second training set by using the function of the optimal hyperplane to obtain a text category to which the second training set belongs;
classifying the second verification set by using the function of the optimal hyperplane to obtain a text category to which the second verification set belongs;
judging whether the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs;
if the text category to which the second verification set belongs is inconsistent with the text category to which the second training set belongs, adjusting the parameters of the hyperplane function, and executing the operation of classifying the second training set by using the function of the optimal hyperplane again;
and if the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs, taking the text category to which the second verification set belongs and/or the text category to which the second training set belongs as a second classification result of the feature vector set.
In the embodiment of the present invention, the optimal hyperplane function is $f(x) = \mathrm{sign}(w^{T} x + b)$,

where $\mathrm{sign}$ is the step function

$$\mathrm{sign}(z) = \begin{cases} +1, & z \geq 0 \\ -1, & z < 0 \end{cases}$$
In the embodiment of the invention, the related data can be acquired and processed based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results.
The text final classification module 104 is configured to determine whether the first classification result is consistent with the second classification result, adjust parameters of the naive bayes classifier and the selection of the optimal hyperplane if the first classification result is inconsistent with the second classification result, execute the first classification and the second classification again, and take the first classification result or the second classification result as a classification result of the text to be classified if the first classification result is consistent with the second classification result.
In this embodiment of the present invention, the step of the text final classification module 104 determining whether the first classification result is consistent with the second classification result includes: calculating the association degree of the first classification result and the second classification result; if the association degree is greater than a preset association degree, the first classification result is consistent with the second classification result, and if the association degree is not greater than the preset association degree, the first classification result is inconsistent with the second classification result.

In an optional embodiment, the calculation of the association degree between the first classification result and the second classification result may be implemented by a similarity algorithm, such as the cosine similarity algorithm or the Jaccard similarity coefficient algorithm, and the preset association degree may be set to 0.95 or set according to the actual service scenario.
In this embodiment of the present invention, if the first classification result is inconsistent with the second classification result, a target text category cannot be output. The parameters of the naive Bayes classifier and the selection of the optimal hyperplane of the feature vector set are therefore adjusted, and the first classification and the second classification are executed again; that is, the naive Bayes classifier is used again to perform the first classification on the feature vector set, the optimal hyperplane of the feature vector set is reselected, and the reselected optimal hyperplane is used to perform the second classification on the feature vector set, until the first classification result is consistent with the second classification result.
For example, in the embodiment of the present invention, if the text to be classified is "I want to apply for credit card transaction", the obtained first classification result is "transaction acceptance for credit card" and the second classification result is also "transaction acceptance for credit card", then "transaction acceptance for credit card" is used as the classification result of the text to be classified.
In the embodiment of the invention, the classification result of the text to be classified can be obtained by using artificial intelligence and a preset judgment rule.
The method comprises the steps of firstly carrying out first classification on a feature vector set through a naive Bayes classifier to obtain a first classification result, further determining an optimal hyperplane of the feature vector set, carrying out second classification on the feature vector set through the optimal hyperplane to obtain a second classification result, finally judging the consistency of the first classification result and the second classification result, and continuously adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane when the first classification result is inconsistent with the second classification result to improve the accuracy of model classification, thereby improving the accuracy of text classification based on deep learning. Therefore, the text classification device based on deep learning provided by the embodiment of the invention can improve the accuracy of text classification based on deep learning.
Fig. 3 is a schematic structural diagram of an electronic device implementing the text classification method based on deep learning according to the present invention.
The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program, such as a deep learning based text classification program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, for example a removable hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in the electronic device and various types of data, such as codes of a text classification program based on deep learning, etc., but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules (e.g., text classification programs based on deep learning, etc.) stored in the memory 11 and calling data stored in the memory 11.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
Fig. 3 shows only an electronic device having components, and those skilled in the art will appreciate that the structure shown in fig. 3 does not constitute a limitation of the electronic device, and may include fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Optionally, the communication interface 13 may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), which is generally used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further include a user interface, which may be a Display (Display), an input unit (such as a Keyboard (Keyboard)), and optionally, a standard wired interface, or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable, among other things, for displaying information processed in the electronic device and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The deep-learning-based text classification program stored in the memory 11 of the electronic device is a combination of a plurality of computer programs which, when executed by the processor 10, implement:
acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
carrying out a first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result;
constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and carrying out a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
judging whether the first classification result is consistent with the second classification result;
if the first classification result is inconsistent with the second classification result, adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane, and executing the first classification and the second classification again;
and if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result of the text set to be classified.
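By way of illustration only — not part of the claimed method — the dual-classifier agreement loop above can be sketched in Python. A minimal sketch, assuming scikit-learn's MultinomialNB and LinearSVC as stand-ins for the naive Bayes classifier and the optimal-hyperplane classifier; the parameter-adjustment rule and all values are invented for the example:

```python
# Minimal sketch of the agreement loop, assuming scikit-learn stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def classify_with_agreement(train_texts, train_labels, texts, max_rounds=5):
    vectorizer = TfidfVectorizer()           # feature extraction -> feature vector set
    X_train = vectorizer.fit_transform(train_texts)
    X = vectorizer.transform(texts)
    alpha, C = 1.0, 1.0                      # parameters to adjust on disagreement
    for _ in range(max_rounds):
        nb = MultinomialNB(alpha=alpha).fit(X_train, train_labels)
        svm = LinearSVC(C=C).fit(X_train, train_labels)
        first = nb.predict(X)                # first classification result
        second = svm.predict(X)              # second classification result
        if (first == second).all():          # consistent -> accept either result
            return first
        alpha *= 0.5                         # adjust the naive Bayes smoothing ...
        C *= 2.0                             # ... and the hyperplane selection, then retry
    return second                            # fall back after max_rounds disagreements
```

In practice the loop either converges to an agreed label set or exhausts its retry budget; the patent leaves the concrete adjustment rule unspecified.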
For the specific implementation of the above computer program by the processor 10, reference may be made to the description of the relevant steps in the embodiment corresponding to Fig. 1, which is not repeated here.
Further, if the integrated modules/units of the electronic device are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. The computer-readable medium may be non-volatile or volatile, and may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Embodiments of the present invention also provide a computer-readable storage medium which stores a computer program; when executed by a processor of an electronic device, the computer program implements:
acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
carrying out a first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result;
constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and carrying out a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
judging whether the first classification result is consistent with the second classification result;
if the first classification result is inconsistent with the second classification result, adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane, and executing the first classification and the second classification again;
and if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result of the text set to be classified.
Further, the computer-usable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created from the use of blockchain nodes, and the like.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a logical functional division, and other divisions may be adopted in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another by cryptographic methods, each containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and so on.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or devices recited in the system claims may also be implemented by a single unit or device through software or hardware. Terms such as "first" and "second" are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solutions of the present invention may be modified or equivalently substituted without departing from their spirit and scope.
Claims (10)
1. A text classification method based on deep learning, which is characterized by comprising the following steps:
acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
carrying out a first classification on the feature vector set by using a pre-constructed naive Bayes classifier to obtain a first classification result;
constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and carrying out a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
judging whether the first classification result is consistent with the second classification result;
if the first classification result is inconsistent with the second classification result, adjusting parameters of the naive Bayes classifier and selection of the optimal hyperplane, and executing the first classification and the second classification again;
and if the first classification result is consistent with the second classification result, taking the first classification result or the second classification result as the classification result of the text set to be classified.
2. The method for deep learning-based text classification according to claim 1, wherein constructing the plurality of hyperplane functions of the feature vector set comprises:
mapping the vectors in the feature vector set to a plane coordinate system to obtain a plurality of plane normal vectors;
and obtaining a plurality of hyperplane functions according to the plurality of plane normal vectors and a preset regression function.
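In conventional notation, each plane normal vector w together with an intercept b determines one hyperplane function f(x) = w·x + b, whose zero set is the hyperplane itself. A minimal numpy sketch, purely illustrative (the values of w and b below are invented):

```python
import numpy as np

def make_hyperplane(w, b):
    """Return the hyperplane function f(x) = w . x + b for a given normal vector w."""
    w = np.asarray(w, dtype=float)
    return lambda x: float(np.dot(w, x) + b)

# Hypothetical normal vector and intercept, for illustration only.
f = make_hyperplane(w=[0.8, -0.3], b=0.1)
print(f([1.0, 2.0]))  # signed value; its sign says which side of the hyperplane x lies on
```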
3. The method for deep learning-based text classification according to claim 1, wherein determining the optimal hyperplane among the plurality of hyperplane functions by using the Lagrange multiplier method comprises:
determining two parallel hyperplane functions among the hyperplane functions by using a preset geometric margin, and performing formula conversion on the two parallel hyperplane functions to obtain a constraint condition;
and converting the constrained problem into an unconstrained one by using the Lagrange multiplier method, and solving it to obtain the optimal hyperplane of the two parallel hyperplane functions.
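For reference, the textbook formulation behind claim 3 (standard SVM notation, not the patent's own symbols): the two parallel hyperplanes w·x + b = ±1 bound the margin, the formula conversion yields the constraint y_i(w·x_i + b) ≥ 1, and the Lagrange multiplier method turns the constrained problem into an unconstrained one:

```latex
% Constrained (primal) problem: maximize the margin 2/||w||.
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad \text{subject to} \quad y_i\,(w^{\top}x_i + b) \ge 1,\quad i = 1,\dots,n

% Unconstrained form via Lagrange multipliers \alpha_i \ge 0:
L(w, b, \alpha) = \tfrac{1}{2}\lVert w\rVert^{2}
  - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i\,(w^{\top}x_i + b) - 1 \,\bigr]
```

Minimizing L over w and b while maximizing over α recovers the optimal hyperplane.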
4. The method for deep learning-based text classification according to claim 1, wherein performing the first classification on the feature vector set by using the pre-constructed naive Bayes classifier to obtain the first classification result comprises:
dividing the feature vector set into a first training set and a first verification set;
classifying the first training set by using a pre-constructed naive Bayes classifier to obtain a prior probability of the first training set;
predicting the text categories of a plurality of feature vectors in the first training set according to the prior probability to obtain the predicted text categories of the first training set;
inputting the first verification set into the naive Bayes classifier to obtain a predicted text category of the first verification set;
if the predicted text category of the first verification set is inconsistent with the predicted text category of the first training set, adjusting the parameters of the naive Bayes classifier, and executing the operation of classifying the first training set by the naive Bayes classifier again;
and if the predicted text category of the first verification set is consistent with the predicted text category of the first training set, taking the predicted text category of the first verification set and/or the predicted text category of the first training set as a first classification result.
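A minimal sketch of this train/verification procedure, assuming scikit-learn's MultinomialNB (whose fitted class priors play the role of the claim's prior probabilities). The claim does not say how the two sets' predicted categories are compared, so the agreement criterion below is an assumption:

```python
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def first_classification(X, y, alpha=1.0, max_rounds=5):
    # Divide the feature vector set into a first training set and a first verification set.
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    for _ in range(max_rounds):
        nb = MultinomialNB(alpha=alpha).fit(X_tr, y_tr)  # class_log_prior_ holds the learned priors
        pred_tr = nb.predict(X_tr)    # predicted text categories of the first training set
        pred_val = nb.predict(X_val)  # predicted text categories of the first verification set
        # Assumed criterion: training and verification accuracy agree to within 5 points.
        if abs((pred_tr == y_tr).mean() - (pred_val == y_val).mean()) < 0.05:
            return nb.predict(X)      # first classification result
        alpha *= 0.5                  # adjust the classifier's parameters and re-run
    return nb.predict(X)
```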
5. The method for deep learning based text classification according to claim 1, wherein the second classification of the feature vector set by using the optimal hyperplane to obtain a second classification result comprises:
dividing the feature vector set into a second training set and a second verification set;
acquiring a function of the optimal hyperplane, and classifying the second training set by using the function of the optimal hyperplane to obtain a text category to which the second training set belongs;
classifying the second verification set by using the function of the optimal hyperplane to obtain a text category to which the second verification set belongs;
judging whether the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs;
if the text category to which the second verification set belongs is inconsistent with the text category to which the second training set belongs, adjusting the parameters of the hyperplane function, and executing again the operation of classifying the second training set by using the function of the optimal hyperplane;
and if the text category to which the second verification set belongs is consistent with the text category to which the second training set belongs, taking the text category to which the second verification set belongs and/or the text category to which the second training set belongs as a second classification result of the feature vector set.
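Once the optimal hyperplane's function is known, the second classification reduces to taking the sign of that function on each set. A minimal two-class numpy sketch (w and b are assumed to have been obtained already, e.g. via the Lagrange multiplier method of claim 3):

```python
import numpy as np

def second_classification(X_train, X_val, w, b):
    """Label both sets with the optimal hyperplane function f(x) = w . x + b."""
    train_categories = np.sign(X_train @ w + b)  # text categories of the second training set
    val_categories = np.sign(X_val @ w + b)      # text categories of the second verification set
    return train_categories, val_categories
```

The consistency judgment of claim 5 can then compare the two returned category arrays; the claim leaves the exact comparison rule open.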
6. The text classification method based on deep learning of any one of claims 1 to 5, wherein the feature extraction of the text set to be classified to obtain a feature vector set comprises:
performing vectorization operation on the text set to be classified by using a pre-constructed word vector conversion model to generate a plurality of word vectors;
pre-training the word vectors to obtain a plurality of pre-trained word vectors;
and matching the text set to be classified with a plurality of word vectors to obtain matched word vectors, and forming the matched word vectors into a feature vector set.
7. The method for text classification based on deep learning of claim 6, wherein the matching the text set to be classified with a plurality of word vectors to obtain matched word vectors comprises:
matching the text set to be classified with a plurality of word vectors respectively;
and converting the texts matched with the word vectors in the text set to be classified into vectors to obtain matched word vectors.
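A minimal sketch of the extraction described in claims 6 and 7, assuming gensim's Word2Vec as the word vector conversion model; the toy corpus and the choice to average matched vectors are illustrative assumptions (the claims only say that the matched word vectors form the feature vector set):

```python
import numpy as np
from gensim.models import Word2Vec

corpus = [["good", "service"], ["bad", "product"]]   # toy tokenized text set to be classified
model = Word2Vec(sentences=corpus, vector_size=50,
                 window=5, min_count=1)              # pre-trained word vectors

def text_to_feature_vector(tokens, model):
    # Match the text's words against the pre-trained vocabulary and convert matches to vectors.
    matched = [model.wv[t] for t in tokens if t in model.wv]
    # One common aggregation (an assumption here): average the matched word vectors.
    return np.mean(matched, axis=0) if matched else np.zeros(model.vector_size)

feature_vector_set = [text_to_feature_vector(t, model) for t in corpus]
```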
8. A text classification apparatus based on deep learning, comprising:
the text feature extraction module is used for acquiring a text set to be classified, and performing feature extraction on the text set to be classified to obtain a feature vector set;
the naive Bayes classification module is used for carrying out first classification on the feature vector set by utilizing a pre-constructed naive Bayes classifier to obtain a first classification result;
the optimal hyperplane classification module is used for constructing a plurality of hyperplane functions of the feature vector set, determining an optimal hyperplane among the hyperplane functions by using the Lagrange multiplier method, and carrying out a second classification on the feature vector set by using the optimal hyperplane to obtain a second classification result;
and the text final classification module is used for judging whether the first classification result is consistent with the second classification result; if they are inconsistent, adjusting the parameters of the naive Bayes classifier and the selection of the optimal hyperplane and executing the first classification and the second classification again; and if they are consistent, taking the first classification result or the second classification result as the classification result of the text set to be classified.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores computer program instructions executable by the at least one processor to enable the at least one processor to perform the method of deep learning based text classification of any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, implements a method for deep learning based text classification according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111011763.2A CN113704475A (en) | 2021-08-31 | 2021-08-31 | Text classification method and device based on deep learning, electronic equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111011763.2A CN113704475A (en) | 2021-08-31 | 2021-08-31 | Text classification method and device based on deep learning, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113704475A (en) | 2021-11-26 |
Family
ID=78657956
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111011763.2A Pending CN113704475A (en) | 2021-08-31 | 2021-08-31 | Text classification method and device based on deep learning, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113704475A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060047616A1 (en) * | 2004-08-25 | 2006-03-02 | Jie Cheng | System and method for biological data analysis using a bayesian network combined with a support vector machine |
CN110427959A (en) * | 2019-06-14 | 2019-11-08 | 合肥工业大学 | Complain classification method, system and the storage medium of text |
CN111000553A (en) * | 2019-12-30 | 2020-04-14 | 山东省计算中心(国家超级计算济南中心) | Intelligent classification method for electrocardiogram data based on voting ensemble learning |
CN112116006A (en) * | 2020-09-18 | 2020-12-22 | 青海师范大学 | Underwater sound target classification method based on dual space optimization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114663198A (en) | Product recommendation method, device and equipment based on user portrait and storage medium | |
CN113656690B (en) | Product recommendation method and device, electronic equipment and readable storage medium | |
CN115002200A (en) | User portrait based message pushing method, device, equipment and storage medium | |
CN113626606B (en) | Information classification method, device, electronic equipment and readable storage medium | |
CN112269875B (en) | Text classification method, device, electronic equipment and storage medium | |
CN113570286B (en) | Resource allocation method and device based on artificial intelligence, electronic equipment and medium | |
CN116523268B (en) | Person post matching analysis method and device based on big data portrait | |
CN114491047A (en) | Multi-label text classification method and device, electronic equipment and storage medium | |
CN114880449B (en) | Method and device for generating answers of intelligent questions and answers, electronic equipment and storage medium | |
CN114781832A (en) | Course recommendation method and device, electronic equipment and storage medium | |
CN114550870A (en) | Prescription auditing method, device, equipment and medium based on artificial intelligence | |
CN112885423A (en) | Disease label detection method and device, electronic equipment and storage medium | |
CN113868529A (en) | Knowledge recommendation method and device, electronic equipment and readable storage medium | |
US20240311931A1 (en) | Method, apparatus, device, and storage medium for clustering extraction of entity relationships | |
CN113313211B (en) | Text classification method, device, electronic equipment and storage medium | |
CN111652282A (en) | Big data based user preference analysis method and device and electronic equipment | |
CN114840684A (en) | Map construction method, device and equipment based on medical entity and storage medium | |
CN116741358A (en) | Inquiry registration recommendation method, inquiry registration recommendation device, inquiry registration recommendation equipment and storage medium | |
CN115982454A (en) | User portrait based questionnaire pushing method, device, equipment and storage medium | |
CN115082736A (en) | Garbage identification and classification method and device, electronic equipment and storage medium | |
CN115098644A (en) | Image and text matching method and device, electronic equipment and storage medium | |
CN114676307A (en) | Ranking model training method, device, equipment and medium based on user retrieval | |
CN113704475A (en) | Text classification method and device based on deep learning, electronic equipment and medium | |
CN114219367A (en) | User scoring method, device, equipment and storage medium | |
CN113822215A (en) | Equipment operation guide file generation method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20211126 |