CN111428028A - Information classification method based on deep learning and related equipment - Google Patents

Information classification method based on deep learning and related equipment

Info

Publication number
CN111428028A
Authority
CN
China
Prior art keywords
classification
information
data
identified
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010142300.9A
Other languages
Chinese (zh)
Inventor
金美芝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202010142300.9A
Publication of CN111428028A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of data analysis, in particular to an information classification method based on deep learning and related equipment, which comprises the following steps: acquiring the data quantity of information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classified data; performing word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data; inputting the word vectors of the pre-classified data into a deep learning model for text feature extraction to obtain a plurality of text features; classifying each text feature to obtain a classification result of the text features; and scoring the classification result by applying a voting mechanism, and determining the classification label of the information to be identified according to the scoring result. The method and the device effectively solve the problem that the content of the original information cannot be accurately reflected when text feature extraction is carried out with a deep learning model.

Description

Information classification method based on deep learning and related equipment
Technical Field
The application relates to the technical field of data analysis, in particular to an information classification method based on deep learning and related equipment.
Background
Usually, a person can clearly understand the multiple intents expressed in a text, but it is difficult for a robot to recognize all of them. As a result, the answers given by the robot are incomplete, the client cannot obtain a complete and satisfactory answer from the robot, and an incorrect answer may even be returned because the multiple intents expressed by the client were not understood. This brings an extremely poor experience and reduces client satisfaction, so recognizing multiple intents is an important problem to be solved in robot customer service.
At present, multi-intent recognition is mainly performed by text classification. However, data imbalance during text classification means that the content of the original information cannot be accurately reflected when text features are extracted with a deep learning model.
Disclosure of Invention
Based on the above, an information classification method based on deep learning and related equipment are provided for solving the problem that the content of original information cannot be accurately reflected when text feature extraction is performed by applying a deep learning model due to the problem of data imbalance during text classification at present.
An information classification method based on deep learning comprises the following steps:
acquiring the data quantity of information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classified data;
performing word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data;
inputting the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features;
classifying the text features to obtain a classification result of the text features;
and scoring the classification result by using a preset voting mechanism to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result.
In one possible embodiment, the obtaining the data quantity of the information to be identified, determining a clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classified data includes:
comparing the data quantity with a preset data quantity threshold, if the data quantity is greater than the data quantity threshold, determining that the information to be identified is large sample data, otherwise, determining that the information to be identified is small sample data;
if the information to be identified is large sample data, clustering the large sample data by applying a clustering algorithm after removing noise points and isolated points in the large sample data to obtain pre-classified data;
if the information to be identified is small sample data, clustering similar samples in the small sample data by using a clustering algorithm to generate a plurality of clusters, and processing the data in each cluster by respectively adopting a genetic crossover algorithm to obtain the pre-classification data.
In one possible embodiment, the performing word vector conversion on the pre-classified data to obtain a word vector of the pre-classified data includes:
acquiring a preset word vector embedding model, and dividing the pre-classified data into a plurality of sentences according to the attribute of the word vector embedding model;
inputting the sentence into the word vector embedding model for mapping to obtain an initial text word vector;
and calculating the characteristic value of the initial text word vector, deleting the initial text word vector with the characteristic value of zero, summarizing the rest initial text word vectors, and obtaining the word vector of the pre-classified data.
In one possible embodiment, the inputting of the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features includes:
inputting a preset standard word vector into an input layer in a preset recurrent neural network model, performing probability prediction on the word vector processed by the input layer through a hidden layer in the recurrent neural network model to obtain a probability prediction result, and converting the probability prediction result by using an output layer in the recurrent neural network model to obtain a predicted keyword;
and comparing the predicted keyword with the keyword corresponding to the standard word vector, if the predicted keyword is consistent with the keyword corresponding to the standard word vector, adding the word vectors of the pre-classified data into the recurrent neural network model for feature extraction, and otherwise, changing the parameters in the hidden layer for re-prediction until the predicted keyword is consistent with the keyword corresponding to the standard word vector.
In one possible embodiment, the classifying the text features to obtain a classification result of the text features includes:
obtaining classifiers of different categories, and establishing a classifier sub-tree according to the hierarchical relation among the classifiers;
inputting the text features into a root node of the classifier subtree, performing primary classification to obtain a primary classification result, and inputting the primary classification result into a next-level leaf node of the root node;
taking the next-level leaf node as a new root node to continue classifying until the next-level leaf node is the minimum leaf node;
and summarizing the classification result of the minimum leaf node to obtain the classification result of the text features.
In one possible embodiment, the scoring the classification result by using a preset voting mechanism to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result includes:
obtaining the classification accuracy of the end classifier corresponding to each minimum leaf node, and taking the classification accuracy as the weight of the end classifier;
voting and scoring the classification labels output by the end classifier by applying the voting mechanism by taking the weights as auxiliary parameters;
and extracting the classification label with the voting score larger than a score threshold value as the classification label of the information to be identified.
In one possible embodiment, the continuing the classification with the next-level leaf node as a new root node until the next-level leaf node is a minimum leaf node includes:
acquiring the similarity among the output results of all leaf nodes at any level, and extracting target leaf nodes corresponding to a plurality of output results with the similarity larger than a similarity threshold value as root nodes of next-level classification;
and acquiring a node label corresponding to the target leaf node, and inputting the node label into a next-level classifier for classification until a root node of the next-level classification is the minimum leaf node.
An information classification device based on deep learning comprises the following modules:
the pre-classification module is used for acquiring the data quantity of the information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classification data;
the word vector module is used for carrying out word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data;
the feature extraction module is used for inputting the word vectors of the pre-classified data into a preset deep learning model to extract text features so as to obtain a plurality of text features;
the result generation module is used for classifying the text features to obtain the classification result of the text features;
and the label generation module is set to score the classification result by applying a preset voting mechanism to obtain a scoring result, and determine the classification label of the information to be identified according to the scoring result.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the above deep learning based information classification method.
A storage medium having stored thereon computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above-described deep learning-based information classification method.
Compared with existing mechanisms, the method and device acquire the data quantity of the information to be identified, determine the clustering mode of the information to be identified according to the data quantity, and preprocess the information to be identified by applying the clustering mode to obtain pre-classified data; perform word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data; input the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features; classify the text features to obtain a classification result of the text features; and score the classification result by using a preset voting mechanism to obtain a scoring result, and determine the classification label of the information to be identified according to the scoring result. This effectively solves the problem that, due to data imbalance, the content of the original information cannot be accurately reflected when text feature extraction is carried out with a deep learning model.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application.
FIG. 1 is a flowchart illustrating an overall method for deep learning based information classification according to an embodiment of the present application;
FIG. 2 is a diagram illustrating a pre-classification process in an information classification method based on deep learning according to an embodiment of the present application;
FIG. 3 is a diagram illustrating a result generation process in an information classification method based on deep learning according to an embodiment of the present application;
fig. 4 is a block diagram of an information classification apparatus based on deep learning according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 is an overall flowchart of an information classification method based on deep learning according to an embodiment of the present application, where the information classification method based on deep learning includes the following steps:
s1, acquiring the data quantity of the information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classified data;
specifically, when information identification is performed, whether information to be identified is a simple graph or a multiple graph is to be determined, where a simple graph means that a text sentence only includes one graph, such as: "I want to listen to Zhou Jieren's song", this language intent can be attributed to a musical intent, while multiple intentions means that multiple intentions can be contained in the text utterance, such as: the intention of buying apple can be concluded as fruit intention, and the intention of buying fruit can also be concluded as electronic intention, i think of buying apple mobile phone, then the intention of the sentence in the conversation is judged according to the past search record or context information of the user, and the best answer is returned preferentially. And then, when the data amount of the information to be identified is counted, a natural language identification algorithm is needed to classify the single sentence intentions and the multiple sentence intention diagrams in the information to be identified, the single sentence corresponding to each single sentence intention is used as 1 data, and the multiple sentences corresponding to each multiple sentence intention diagram are used as 1 data together. When clustering is carried out, the used clustering algorithm is mainly a K-mean algorithm and a coacervation hierarchical clustering algorithm.
S2, performing word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data;
specifically, word vector conversion commonly used in Wordvec2, which uses BERT model as word vector embedding model in this step, can convert word vectors of text by using BERT model, 1, installing BERT model, (1) installing BERT on server side, (2) installing BERT on client side, 2, starting service, executing the following codes of BERT-providing-start-model _ dir/tmp/engli sh _ L-12 _ H-768_ A-12-num _ worker ═ 4, where/tmp/englishh _ L-12 _ H-768_ A-12/path of downloaded model 3, text vectorization using python script, executing code of from BERT _ providing
S3, inputting the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features;
Specifically, the deep learning model selected for text feature extraction is usually a convolutional neural network model or a recurrent neural network model. Before text features are extracted from the word vectors, the deep learning model is trained first; only when the feature extraction accuracy of the trained deep learning model is greater than a preset threshold can it be used to extract text features from the pre-classified word vectors. In this step, the term frequency-inverse document frequency (TF-IDF) algorithm can also be adopted for text feature extraction.
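A minimal TF-IDF sketch follows, assuming scikit-learn; the example texts are illustrative and not taken from the application.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["buy an apple phone", "listen to a song", "buy some apples"]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(texts)    # document-term TF-IDF matrix
print(vectorizer.get_feature_names_out())
print(tfidf.toarray().round(3))
```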
S4, classifying the text features to obtain a classification result of the text features;
specifically, the text feature classification uses a computer to automatically classify and mark a text set (or other entities or objects) according to a certain classification system or standard. Commonly used text feature classification methods are a naive bayes classification method, a decision tree method, an SVM support vector machine and the like. When the text features are classified, a classifier needs to be trained, and only the result classified by the verified classifier can be used as a reliable result for application.
The decision tree method is to classify text features by applying a decision tree model, firstly input the text features into a root node of the decision tree model for first classification, then perform second classification by using first leaf nodes, and so on until the minimum leaf node of the decision tree model. The decision tree model can classify the text features from coarse to fine step by step, so that a more accurate classification result is obtained. If the text characteristics are: and (3) the root node of the orange in the decision tree model is a 'creature', the first leaf node is a 'plant', and the minimum leaf node is a 'fruit'.
And S5, scoring the classification result by using a preset voting mechanism to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result.
The voting mechanism (voting) is a combination strategy for classification problems in ensemble learning. Its basic idea is to select the class output most often among all the machine learning algorithms used. A machine learning classification algorithm produces one of two kinds of output: class labels directly, or class probabilities. Voting on class labels is called hard voting (majority voting), and voting on class probabilities is called soft voting. In this step, a soft voting mechanism is adopted, and weights are added to the vote as auxiliary parameters, so that the classification label can be obtained more reliably.
The soft voting mechanism may take the following steps: first, obtain the class probabilities output by each machine learning algorithm used, such as 50% for class A, 30% for class B and 20% for class C; then, after the weight of each classifier is obtained, calculate the weighted average value of each class, such as 0.3 for class A, 0.5 for class B and 0.2 for class C, and select the class with the largest value (here, class B).
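A minimal sketch of the weighted soft vote described above, assuming three hypothetical classifiers whose class probabilities and accuracy-derived weights are made-up values:

```python
import numpy as np

# Rows: one classifier each; columns: classes A, B, C (hypothetical probabilities).
class_probs = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.5, 0.2],
])
# Weights taken from each classifier's validation accuracy (hypothetical values).
weights = np.array([0.8, 0.9, 0.7])

# Weighted average probability of each class; the largest one wins.
weighted = np.average(class_probs, axis=0, weights=weights)
labels = ["A", "B", "C"]
print(dict(zip(labels, weighted.round(3))), "->", labels[int(np.argmax(weighted))])
```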
In the embodiment, information needing to be subjected to intention identification is subjected to pre-classification processing, and different processing modes are adopted according to different types, so that the problem that the content of original information cannot be accurately reflected when text feature extraction is carried out by applying a deep learning model due to the unbalanced data can be effectively solved.
Fig. 2 is a schematic diagram of a pre-classification process in an information classification method based on deep learning in an embodiment of the present application, as shown in the drawing, in step S1, acquiring a data quantity of information to be identified, determining a clustering mode of the information to be identified according to the data quantity, and applying the clustering mode to pre-process the information to be identified to obtain pre-classification data, where the pre-classification process includes:
s11, comparing the data quantity with a preset data quantity threshold, if the data quantity is larger than the data quantity threshold, determining that the information to be identified is large sample data, otherwise, determining that the information to be identified is small sample data;
in general, indexes such as median, mode, average value, and the like are used as the data volume threshold, and if the sample data volume is larger than the threshold, the sample is determined as a large sample, otherwise, the sample is determined as a small sample.
S12, if the information to be identified is large sample data, clustering the large sample data by applying a clustering algorithm after removing noise points and isolated points in the large sample data to obtain pre-classified data;
specifically, the large sample data is data conforming to normal distribution, and when the large sample data is clustered, special symbols such as a special symbol in the large sample data are required; ",". And removing non-character noise points such as 'and the like', and if points which do not conform to normal distribution exist in the large sample data, taking the points as isolated points, removing the points in advance and then clustering the points.
And S13, if the information to be identified is small sample data, clustering similar samples in the small sample data by using a clustering algorithm to generate a plurality of clusters, and processing the data in each cluster by respectively using a genetic crossover algorithm to obtain the pre-classified data.
Specifically, small sample data contains less feature information than large sample data, the machine learns less feature information from a small sample, and the calculated joint probability is relatively smaller. Therefore, a clustering algorithm such as K-means cannot be directly adopted to classify and count the target objects, as this would seriously affect the classification accuracy. When the genetic crossover algorithm is applied, the crossover operator adopted is a uniform crossover operator: crossover points are randomly generated in two parent individuals A and B, and genes are then exchanged according to randomly generated integers 0, 1 and 2, thereby forming two new individuals and completing the crossover.
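The following is a minimal sketch of a uniform crossover operator for augmenting a small-sample cluster; encoding each individual as a token list and using a fixed swap probability are assumptions not specified in the application.

```python
import random

def uniform_crossover(parent_a, parent_b, swap_prob=0.5):
    """Swap corresponding genes of two parents with probability swap_prob."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(min(len(child_a), len(child_b))):
        if random.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b

a = ["buy", "an", "apple", "phone"]
b = ["buy", "some", "fresh", "apples"]
print(uniform_crossover(a, b))             # two new individuals for the cluster
```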
In the embodiment, the sample classification is performed on the data to be analyzed, and different clustering algorithms are adopted, so that the inaccuracy of information classification caused by data imbalance is avoided.
In one embodiment, the performing word vector conversion on the pre-classified data to obtain a word vector of the pre-classified data includes:
acquiring a preset word vector embedding model, and dividing the pre-classified data into a plurality of sentences according to the attribute of the word vector embedding model;
in particular, different word vector embedding models have different limitations on the length of the characters to be word vector converted.
Inputting the sentence into the word vector embedding model for mapping to obtain an initial text word vector;
and calculating the characteristic value of the initial text word vector, deleting the initial text word vector with the characteristic value of zero, summarizing the rest initial text word vectors, and obtaining the word vector of the pre-classified data.
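A minimal sketch of the zero-feature-value filtering step, assuming the "characteristic value" of a word vector is its L2 norm (the application does not define it precisely):

```python
import numpy as np

def filter_word_vectors(initial_vectors):
    vectors = np.asarray(initial_vectors, dtype=float)
    feature_values = np.linalg.norm(vectors, axis=1)   # one characteristic value per vector
    return vectors[feature_values > 0]                  # drop vectors whose value is zero

print(filter_word_vectors([[0.2, 0.1], [0.0, 0.0], [0.4, 0.3]]))
```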
In this embodiment, the word vectors are embedded by using BERT; the model adopts the Transformer sequence architecture, is bidirectional, can obtain semantic representations at the sentence level rather than only at the word level, and has the advantages of strong generality and good effect.
In one embodiment, the inputting of the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features includes:
inputting a preset standard word vector into an input layer in a preset recurrent neural network model, performing probability prediction on the word vector processed by the input layer through a hidden layer in the recurrent neural network model to obtain a probability prediction result, and converting the probability prediction result by using an output layer in the recurrent neural network model to obtain a predicted keyword;
the recurrent neural network model comprises an input layer, a hidden layer and an output layer, wherein the input layer is used for receiving data, the hidden layer is used for processing the data, and the output layer is used for outputting the result. Wherein, a series of processing will be carried out to the data in the hidden layer, mainly including: gradient truncation, regularization, gating, etc. The data is effectively processed through the hidden layer.
And comparing the predicted keyword with the keyword corresponding to the standard word vector, if the predicted keyword is consistent with the keyword corresponding to the standard word vector, adding the word vectors of the pre-classified data into the recurrent neural network model for feature extraction, and otherwise, changing the parameters in the hidden layer for re-prediction until the predicted keyword is consistent with the keyword corresponding to the standard word vector.
In the embodiment, text feature extraction is performed on the word vectors through the deep learning model, so that the accuracy of the text features is ensured.
Fig. 3 is a schematic diagram of a result generation process in an information classification method based on deep learning according to an embodiment of the present application, where as shown in the drawing, the S4 classifies the text features to obtain a classification result of the text features, where the classification result includes:
s41, obtaining classifiers of different categories, and establishing a classifier sub-tree according to the hierarchical relationship among the classifiers;
specifically, a probability calculation method may be adopted in hierarchical classification, as follows:
P(n_k | d_i) = ∏_j p(n_j | d_i) × a_k
wherein P(n_k|d_i) represents the probability that document d_i finally arrives at node n_k after classification, p(n_j|d_i) represents the probability that document d_i passes through the ancestor node n_j of n_k on the way, and a_k represents a penalty factor.
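As a toy numeric check of this formula, with made-up path probabilities and penalty factor:

```python
ancestor_probs = [0.9, 0.8, 0.7]   # p(n_j | d_i) along the path to node n_k (hypothetical)
penalty = 0.95                      # a_k (hypothetical)

p_final = 1.0
for p in ancestor_probs:
    p_final *= p
p_final *= penalty
print(round(p_final, 4))            # 0.9 * 0.8 * 0.7 * 0.95 = 0.4788
```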
S42, inputting the text features to a root node of the classifier subtree, carrying out primary classification to obtain a primary classification result, and inputting the primary classification result to a next-level leaf node of the root node;
s43, continuing to classify by taking the next-level leaf node as a new root node until the next-level leaf node is the minimum leaf node;
specifically, the similarity between the output results of all leaf nodes at any level is obtained, and the target leaf nodes corresponding to a plurality of output results with the similarity greater than a similarity threshold are extracted as the root nodes of the next-level classification;
and acquiring a node label corresponding to the target leaf node, and inputting the node label into a next-level classifier for classification until a root node of the next-level classification is the minimum leaf node.
The similarity threshold is set according to the type of classifier. For example, a first-level classifier judges the probabilities that the text belongs to the task type, the chat type and the FAQ type; if the calculated probabilities are 0.8, 0.9 and 0.3 respectively and the set threshold is 0.5, the text data is judged to belong to the task type and the chat type. Since the chat type is already a leaf node, no further judgment is made for it. The next-level classifier then calculates the probabilities that the text belongs to the customer-service task type and the maintenance task type; if these are 0.88 and 0.21 respectively, the final result is the customer-service task type and the chat type.
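A minimal sketch of this threshold-based routing; the two-level hierarchy and the hard-coded probabilities stand in for real classifier outputs and are assumptions for illustration only.

```python
def route(probabilities, threshold=0.5):
    """Return the labels whose probability exceeds the threshold."""
    return [label for label, p in probabilities.items() if p > threshold]

level_one = {"task": 0.8, "chat": 0.9, "FAQ": 0.3}
results = []
for label in route(level_one):
    if label == "task":                                  # non-leaf: descend to the next level
        level_two = {"customer-service task": 0.88, "maintenance task": 0.21}
        results.extend(route(level_two))
    else:                                                # leaf node (e.g. chat): keep as-is
        results.append(label)
print(results)                                           # ['customer-service task', 'chat']
```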
And S44, summarizing the classification result of the minimum leaf node to obtain the classification result of the text feature.
In this embodiment, by adopting the improved hierarchical classification method, the problem that a classification error at an upper level makes all subsequent classifications wrong is avoided.
In an embodiment, the scoring the classification result by using a preset voting mechanism to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result includes:
obtaining the classification accuracy of the end classifier corresponding to each minimum leaf node, and taking the classification accuracy as the weight of the end classifier;
the calculation formula of the classification accuracy rate is as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
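A short sketch of this accuracy computation, with illustrative confusion counts:

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(tp=80, tn=90, fp=10, fn=20))   # 0.85, used as the end classifier's weight
```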
Voting and scoring the classification labels output by the end classifier by applying the voting mechanism by taking the weights as auxiliary parameters;
and extracting the classification label with the voting score larger than a score threshold value as the classification label of the information to be identified.
According to the embodiment, the classification result is effectively scored by using a voting mechanism, so that the accuracy of the classification label is greatly improved.
The technical features mentioned in any of the above corresponding embodiments or implementations are also applicable to the embodiment corresponding to fig. 4 in the present application, and the details of the subsequent similarities are not repeated.
In the above, the information classification method based on deep learning of the present application has been described; an information classification apparatus for performing the above deep learning-based classification is described below.
A structure diagram of an information classification apparatus based on deep learning, which is applicable to information classification based on deep learning, is shown in fig. 4. The deep learning-based information classification apparatus in the embodiment of the present application can implement the steps corresponding to the deep learning-based information classification method performed in the embodiment corresponding to fig. 1 described above. The functions realized by the deep learning-based information classification device can be realized by hardware, and can also be realized by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions, which may be software and/or hardware.
In one embodiment, an information classification apparatus based on deep learning is provided, as shown in fig. 4, including the following modules:
the pre-classification module 10 is configured to acquire the data quantity of the information to be identified, determine the clustering mode of the information to be identified according to the data quantity, and pre-process the information to be identified by applying the clustering mode to obtain pre-classification data;
a word vector module 20 configured to perform word vector conversion on the pre-classified data to obtain a word vector of the pre-classified data;
the feature extraction module 30 is configured to add the word vectors of the pre-classified data into a preset deep learning model to perform text feature extraction, so as to obtain a plurality of text features;
a result generation module 40 configured to classify the text features to obtain a classification result of the text features;
and the label generating module 50 is configured to score the classification result by applying a preset voting mechanism to obtain a scoring result, and determine the classification label of the information to be identified according to the scoring result.
In one embodiment, a computer device is provided, the computer device includes a memory and a processor, the memory stores computer readable instructions, and when executed by the processor, the processor executes the steps of the deep learning based information classification method in the above embodiments.
In one embodiment, a storage medium storing computer-readable instructions is provided, which when executed by one or more processors, cause the one or more processors to perform the steps of the deep learning based information classification method in the above embodiments. The storage medium may be a nonvolatile storage medium or a volatile storage medium, and the present application is not limited in particular.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-described embodiments merely express some embodiments of the present application and are described in relative detail, but they are not to be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An information classification method based on deep learning is characterized by comprising the following steps:
acquiring the data quantity of information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classified data;
performing word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data;
inputting the word vectors of the pre-classified data into a preset deep learning model for text feature extraction to obtain a plurality of text features;
classifying the text features to obtain a classification result of the text features;
and scoring the classification result by using a preset voting mechanism to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result.
2. The information classification method based on deep learning of claim 1, wherein the obtaining of the data quantity of the information to be identified, the determining of the clustering mode of the information to be identified according to the data quantity, and the preprocessing of the information to be identified by applying the clustering mode to obtain pre-classification data comprises:
comparing the data quantity with a preset data quantity threshold, if the data quantity is greater than the data quantity threshold, determining that the information to be identified is large sample data, otherwise, determining that the information to be identified is small sample data;
if the information to be identified is large sample data, clustering the large sample data by applying a clustering algorithm after removing noise points and isolated points in the large sample data to obtain pre-classified data;
if the information to be identified is small sample data, clustering similar samples in the small sample data by using a clustering algorithm to generate a plurality of clusters, and processing the data in each cluster by respectively adopting a genetic crossover algorithm to obtain the pre-classification data.
3. The method for information classification based on deep learning of claim 1, wherein the performing word vector transformation on the pre-classified data to obtain a word vector of the pre-classified data comprises:
acquiring a preset word vector embedding model, and dividing the pre-classified data into a plurality of sentences according to the attribute of the word vector embedding model;
inputting the sentence into the word vector embedding model for mapping to obtain an initial text word vector;
and calculating the characteristic value of the initial text word vector, deleting the initial text word vector with the characteristic value of zero, summarizing the rest initial text word vectors, and obtaining the word vector of the pre-classified data.
4. The information classification method based on deep learning of claim 1, wherein the step of inputting the word vector of the pre-classification data into a preset deep learning model for text feature extraction to obtain a plurality of text features comprises:
inputting a preset standard word vector into an input layer in a preset recurrent neural network model, performing probability prediction on the word vector processed by the input layer through a hidden layer in the recurrent neural network model to obtain a probability prediction result, and converting the probability prediction result by using an output layer in the recurrent neural network model to obtain a predicted keyword;
and comparing the predicted keyword with the keyword corresponding to the standard word vector, if the predicted keyword is consistent with the keyword corresponding to the standard word vector, adding the word vectors of the pre-classified data into the recurrent neural network model for feature extraction, and otherwise, changing the parameters in the hidden layer for re-prediction until the predicted keyword is consistent with the keyword corresponding to the standard word vector.
5. The information classification method based on deep learning according to any one of claims 1 to 4, wherein the classifying the text features to obtain a classification result of the text features includes:
obtaining classifiers of different categories, and establishing a classifier sub-tree according to the hierarchical relation among the classifiers;
inputting the text features into a root node of the classifier subtree, performing primary classification to obtain a primary classification result, and inputting the primary classification result into a next-level leaf node of the root node;
taking the next-level leaf node as a new root node to continue classifying until the next-level leaf node is the minimum leaf node;
and summarizing the classification result of the minimum leaf node to obtain the classification result of the text features.
6. The information classification method based on deep learning of claim 5, wherein the applying a preset voting mechanism to score the classification result to obtain a scoring result, and determining the classification label of the information to be identified according to the scoring result comprises:
obtaining the classification accuracy of the end classifier corresponding to each minimum leaf node, and taking the classification accuracy as the weight of the end classifier;
voting and scoring the classification labels output by the end classifier by applying the voting mechanism by taking the weights as auxiliary parameters;
and extracting the classification label with the voting score larger than a score threshold value as the classification label of the information to be identified.
7. The method for classifying information based on deep learning according to claim 5, wherein the classifying continues with the next-level leaf node as a new root node until the next-level leaf node is a minimum leaf node, including:
acquiring the similarity among the output results of all leaf nodes at any level, and extracting target leaf nodes corresponding to a plurality of output results with the similarity larger than a similarity threshold value as root nodes of next-level classification;
and acquiring a node label corresponding to the target leaf node, and inputting the node label into a next-level classifier for classification until a root node of the next-level classification is the minimum leaf node.
8. An information classification device based on deep learning is characterized by comprising the following modules:
the pre-classification module is used for acquiring the data quantity of the information to be identified, determining the clustering mode of the information to be identified according to the data quantity, and preprocessing the information to be identified by applying the clustering mode to obtain pre-classification data;
the word vector module is used for carrying out word vector conversion on the pre-classified data to obtain word vectors of the pre-classified data;
the feature extraction module is used for inputting the word vectors of the pre-classified data into a preset deep learning model to extract text features so as to obtain a plurality of text features;
the result generation module is used for classifying the text features to obtain the classification result of the text features;
and the label generation module is set to score the classification result by applying a preset voting mechanism to obtain a scoring result, and determine the classification label of the information to be identified according to the scoring result.
9. A computer device comprising a memory and a processor, the memory having stored therein computer-readable instructions, which, when executed by the processor, cause the processor to carry out the steps of the deep learning based information classification method according to any one of claims 1 to 7.
10. A storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the deep learning based information classification method according to any one of claims 1 to 7.
CN202010142300.9A 2020-03-04 2020-03-04 Information classification method based on deep learning and related equipment Pending CN111428028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010142300.9A CN111428028A (en) 2020-03-04 2020-03-04 Information classification method based on deep learning and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010142300.9A CN111428028A (en) 2020-03-04 2020-03-04 Information classification method based on deep learning and related equipment

Publications (1)

Publication Number Publication Date
CN111428028A true CN111428028A (en) 2020-07-17

Family

ID=71547408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010142300.9A Pending CN111428028A (en) 2020-03-04 2020-03-04 Information classification method based on deep learning and related equipment

Country Status (1)

Country Link
CN (1) CN111428028A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112133306B (en) * 2020-08-03 2023-10-03 浙江百世技术有限公司 Response method and device based on express delivery user and computer equipment
CN112133306A (en) * 2020-08-03 2020-12-25 浙江百世技术有限公司 Response method and device based on express delivery user and computer equipment
CN112632274A (en) * 2020-10-29 2021-04-09 中科曙光南京研究院有限公司 Abnormal event classification method and system based on text processing
CN112632274B (en) * 2020-10-29 2024-04-26 中科曙光南京研究院有限公司 Abnormal event classification method and system based on text processing
CN112287084A (en) * 2020-10-30 2021-01-29 国网江苏省电力有限公司营销服务中心 Question-answering method and system based on ensemble learning
CN112329877A (en) * 2020-11-16 2021-02-05 山西三友和智慧信息技术股份有限公司 Voting mechanism-based web service classification method and system
CN112632222A (en) * 2020-12-25 2021-04-09 海信视像科技股份有限公司 Terminal equipment and method for determining data belonging field
CN112632222B (en) * 2020-12-25 2023-02-03 海信视像科技股份有限公司 Terminal equipment and method for determining data belonging field
CN112988954A (en) * 2021-05-17 2021-06-18 腾讯科技(深圳)有限公司 Text classification method and device, electronic equipment and computer-readable storage medium
CN116204645A (en) * 2023-03-02 2023-06-02 北京数美时代科技有限公司 Intelligent text classification method, system, storage medium and electronic equipment
CN116204645B (en) * 2023-03-02 2024-02-20 北京数美时代科技有限公司 Intelligent text classification method, system, storage medium and electronic equipment
CN116738343A (en) * 2023-08-08 2023-09-12 云筑信息科技(成都)有限公司 Material data identification method and device for construction industry and electronic equipment
CN116738343B (en) * 2023-08-08 2023-10-20 云筑信息科技(成都)有限公司 Material data identification method and device for construction industry and electronic equipment

Similar Documents

Publication Publication Date Title
CN111428028A (en) Information classification method based on deep learning and related equipment
CN108304372B (en) Entity extraction method and device, computer equipment and storage medium
CN106951422B (en) Webpage training method and device, and search intention identification method and device
CN108197282B (en) File data classification method and device, terminal, server and storage medium
CN112417863B (en) Chinese text classification method based on pre-training word vector model and random forest algorithm
CN112800170A (en) Question matching method and device and question reply method and device
CN110619051B (en) Question sentence classification method, device, electronic equipment and storage medium
CN107229627B (en) Text processing method and device and computing equipment
CN112732871B (en) Multi-label classification method for acquiring client intention labels through robot induction
CN108027814B (en) Stop word recognition method and device
CN112035599B (en) Query method and device based on vertical search, computer equipment and storage medium
CN112735383A (en) Voice signal processing method, device, equipment and storage medium
CN113254643B (en) Text classification method and device, electronic equipment and text classification program
CN112347223B (en) Document retrieval method, apparatus, and computer-readable storage medium
CN110287314B (en) Long text reliability assessment method and system based on unsupervised clustering
CN113821605B (en) Event extraction method
Bouguila A model-based approach for discrete data clustering and feature weighting using MAP and stochastic complexity
CN112131876A (en) Method and system for determining standard problem based on similarity
CN111985228A (en) Text keyword extraction method and device, computer equipment and storage medium
CN113505200A (en) Sentence-level Chinese event detection method combining document key information
CN115048464A (en) User operation behavior data detection method and device and electronic equipment
CN111325033B (en) Entity identification method, entity identification device, electronic equipment and computer readable storage medium
CN114491079A (en) Knowledge graph construction and query method, device, equipment and medium
CN112579781B (en) Text classification method, device, electronic equipment and medium
CN110413770B (en) Method and device for classifying group messages into group topics

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination