CN114547305A - Text classification system based on natural language processing

Info

Publication number
CN114547305A
Authority
CN
China
Prior art keywords
text
data
classification
module
text data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210172720.0A
Other languages
Chinese (zh)
Inventor
韩天
张竹
江晓林
任明远
董长春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinhua Institute Of Higher Learning Office Of Leading Group For Preparation Of Jinhua Institute Of Technology
Original Assignee
Jinhua Institute Of Higher Learning Office Of Leading Group For Preparation Of Jinhua Institute Of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinhua Institute Of Higher Learning Office Of Leading Group For Preparation Of Jinhua Institute Of Technology
Priority to CN202210172720.0A
Publication of CN114547305A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a text classification system based on natural language processing, comprising a data acquisition module, a data preprocessing module, a data post-processing module, a text classification module, a classification result verification module and a visualization module, wherein the data acquisition module acquires the text. The system preprocesses the original text data in a manner suited to natural language processing, which facilitates uniform handling of the data. Weighting key information reduces the influence of word frequency on the classification result, improves classification accuracy, and addresses the inability of traditional algorithms to reflect word position information. A convolutional neural network is combined with a support vector machine classifier, and an attention mechanism is added to the model: the convolutional neural network extracts features from the text data, and a classifier based on the support vector machine replaces the normalized exponential (softmax) function of the convolutional neural network, whose generalization capability is insufficient, thereby simplifying the model parameters and improving the efficiency and accuracy of text classification.

Description

Text classification system based on natural language processing
Technical Field
The invention relates to the technical field of text classification, in particular to a text classification system based on natural language processing.
Background
Text classification refers to the automatic classification and labeling of a text set by a computer according to a given classification system or standard. A relation model between document features and document categories is learned from a labeled training document set, and the learned model is then used to judge the category of new documents. With the development of science and technology, text classification has gradually shifted from knowledge-based methods to methods based on statistics and machine learning. Text classification generally comprises text representation, classifier selection and training, and evaluation and feedback of classification results, where text representation can be subdivided into text preprocessing, indexing and statistics, and feature extraction.
Natural language processing refers to technology that enables interactive communication between machines and the natural language used for human communication; through natural language processing, a computer can read and understand natural language. Research on natural language processing began with the exploration of machine translation. Although natural language processing involves operations in several dimensions, such as speech, grammar, semantics and pragmatics, its basic task is, put simply, to segment the corpus to be processed into words based on an ontology dictionary, word frequency statistics, contextual semantic analysis and the like, forming semantically rich term units whose unit is the smallest part of speech. Natural language processing is mainly applied in machine translation, public opinion monitoring, automatic summarization and viewpoint extraction.
Traditional text classification relies mostly on manual operation of a computer, which wastes time and labor and cannot guarantee the classification effect; with the explosive growth of text document data, manual operation can no longer meet the requirements of text classification work. However, research on applying natural language processing technology to text classification is not yet mature. The importance of a word in a text is mostly measured only by how often the word occurs and cannot be evaluated according to where the word appears in the article, which reduces the accuracy of text classification; classification efficiency is also low, so some tasks that require precise text classification cannot be served. In addition, the classification results of existing text classification systems are hard to read and do not give the user an intuitive view. The present invention therefore provides a text classification system based on natural language processing to solve these problems in the prior art.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a text classification system based on natural language processing that preprocesses the original text data to facilitate its uniform handling, reduces the influence of word frequency on classification results by weighting key information, improves the accuracy of text classification, and solves the problem that conventional algorithms cannot reflect word position information.
To achieve this object, the invention adopts the following technical scheme. A text classification system based on natural language processing comprises a data acquisition module, a data preprocessing module, a data post-processing module, a text classification module, a classification result verification module and a visualization module. The data acquisition module acquires the original text data to be classified and sends it to the data preprocessing module. The data preprocessing module comprises a data screening unit, a formatting unit and a normalization unit. The data post-processing module comprises a text word segmentation unit, which decomposes the preprocessed data into word-segmented text data, and an information weight unit, which applies weight processing to the word-segmented text data to produce a text data set. The text classification module comprises a text classification model for classifying the text data and a model training unit for training that model; the model is built on a convolutional neural network and a support vector machine classifier and incorporates an attention mechanism. The classification result verification module tests and verifies the text classification result of the text classification module, and the visualization module visually displays both the text classification result and the verification result.
A further improvement is that the data screening unit screens the original text data, removing invalid text data and retaining valid text data, where invalid text data comprises missing-value data, abnormal-value data, inconsistent-value data and repeated text data.
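The screening step can be illustrated with a minimal sketch. The patent does not define how missing, abnormal or repeated records are detected, so the rules below (empty strings count as missing, a length range stands in for "abnormal", exact string equality stands in for "repeated") and the name `screen_texts` are assumptions made only for illustration; inconsistent-value detection is not modelled.

```python
from typing import List, Optional


def screen_texts(
    records: List[Optional[str]],
    min_chars: int = 2,        # assumed lower bound for a "normal" record
    max_chars: int = 10_000,   # assumed upper bound for a "normal" record
) -> List[str]:
    """Keep valid records; drop missing, abnormal-length and repeated ones."""
    seen = set()
    valid = []
    for record in records:
        if record is None:
            continue                      # missing value
        text = record.strip()
        if not text:
            continue                      # empty after trimming: treated as missing
        if not (min_chars <= len(text) <= max_chars):
            continue                      # abnormal value (assumed length rule)
        if text in seen:
            continue                      # repeated text data
        seen.add(text)
        valid.append(text)
    return valid


sample = ["good text", None, "   ", "good text", "x", "another"]
kept = screen_texts(sample)
```

In this sketch the order of the surviving records is preserved, which keeps later stages deterministic.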
A further improvement is that the formatting unit converts the valid text data retained by the data screening unit into a uniform format. The normalization unit then splits the uniformly formatted text data into sentences and creates a normalization tag for each sentence, yielding normalized text data and completing the preprocessing of the original text data.
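As a rough sketch of the formatting and normalization units: the patent does not say what the "uniform format" or the "normalization tag" looks like, so the code below assumes Unicode NFKC normalization plus whitespace collapsing as the uniform format, and a `"<doc id>-s<index>"` string as the tag. Both choices, and the function names, are hypothetical. The sentence splitter is deliberately naive and would also split on a decimal point.

```python
import re
import unicodedata
from typing import List, Tuple

# One sentence = a run of non-terminator characters plus an optional terminator.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]?")


def unify_format(text: str) -> str:
    """Assumed uniform format: NFKC-normalized text with single spaces."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def normalize(doc_id: str, text: str) -> List[Tuple[str, str]]:
    """Split a document into sentences and attach a tag to each one."""
    sentences = [s.strip() for s in _SENTENCE.findall(unify_format(text))]
    sentences = [s for s in sentences if s]
    return [(f"{doc_id}-s{i}", s) for i, s in enumerate(sentences, start=1)]


tagged = normalize("d1", "Ｈello   world！ Second one.")
```

The full-width characters in the sample input are folded to their ASCII forms by NFKC before splitting.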
A further improvement is that the text word segmentation unit segments the preprocessed text data into words and removes inflectives and stop words to obtain word-segmented text data. The information weight unit assigns different weights to words according to the position at which they appear in the word-segmented text data, so that key information is weighted, and then maps each word to its corresponding word vector using one-hot encoding or word embedding to obtain the text data set.
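The patent does not give the weighting formula, so the following is only one possible reading: words near the start or end of a text receive a higher weight than words in the middle, and each word's one-hot vector is scaled by that weight. The stop-word list, the whitespace tokenizer (a stand-in for a real word segmenter), the 20% edge window and the boost factor of 2.0 are all assumptions.

```python
from typing import Dict, List

STOP_WORDS = {"the", "a", "of", "is", "and"}   # illustrative list only


def segment(text: str) -> List[str]:
    """Whitespace tokenizer standing in for a real word segmenter."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def position_weight(index: int, length: int, edge: float = 0.2, boost: float = 2.0) -> float:
    """Assumed rule: boost words in the first or last `edge` fraction of the text."""
    if length <= 1:
        return boost
    rel = index / (length - 1)
    return boost if rel <= edge or rel >= 1.0 - edge else 1.0


def build_vocab(docs: List[List[str]]) -> Dict[str, int]:
    vocab: Dict[str, int] = {}
    for tokens in docs:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int]) -> List[List[float]]:
    """One position-weighted one-hot row per token."""
    rows = []
    for i, token in enumerate(tokens):
        row = [0.0] * len(vocab)
        row[vocab[token]] = position_weight(i, len(tokens))
        rows.append(row)
    return rows


tokens = segment("Oil price rising markets react")
vocab = build_vocab([tokens])
matrix = encode(tokens, vocab)
weights = [position_weight(i, len(tokens)) for i in range(len(tokens))]
```

Replacing the one-hot rows with rows from a trained embedding table would give the word-embedding variant that the text also allows.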
A further improvement is that the model training unit trains the text classification model using a machine-learning-based or a deep-learning-based model algorithm, and after training the text classification model takes the text data set as input and classifies it.
A further improvement is that the convolutional neural network in the text classification model comprises an input layer, a hidden layer and an output layer, the hidden layer comprising a convolutional layer, a pooling layer, an attention layer and a fully connected layer. The input layer imports the text data set; the convolutional and pooling layers perform feature extraction, during which the attention layer applies the attention mechanism; and the fully connected layer performs the text classification.
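To make the layer order concrete, here is a forward-pass sketch in plain Python: a 1D convolution with ReLU, max pooling over time, a dot-product attention layer that condenses the pooled features into one vector, and a linear scoring layer evaluated with a multi-class hinge loss (the usual SVM objective) in place of a softmax output. The patent does not specify layer sizes, the attention form or the SVM formulation, so every dimension, the random weights and all function names here are assumptions, and no training is performed.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def conv1d_relu(x: Matrix, filters: Matrix, width: int) -> Matrix:
    """Slide each flattened filter over `width` consecutive token vectors."""
    out = []
    for start in range(len(x) - width + 1):
        window = [v for row in x[start:start + width] for v in row]
        out.append([max(0.0, dot(window, f)) for f in filters])
    return out


def max_pool(h: Matrix, size: int = 2) -> Matrix:
    """Max over non-overlapping windows of `size` time steps, per feature."""
    return [[max(col) for col in zip(*h[i:i + size])] for i in range(0, len(h), size)]


def attention(p: Matrix, query: Vector) -> Vector:
    """Softmax-weighted sum of pooled rows, scored against a query vector."""
    scores = [dot(row, query) for row in p]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [sum(e / total * row[j] for e, row in zip(exps, p)) for j in range(len(query))]


def svm_scores(feature: Vector, class_w: Matrix, class_b: Vector) -> Vector:
    """One linear decision value per class (one-vs-rest style)."""
    return [dot(w, feature) + b for w, b in zip(class_w, class_b)]


def hinge_loss(scores: Vector, label: int, margin: float = 1.0) -> float:
    """Multi-class hinge loss: penalize classes within `margin` of the true one."""
    return sum(max(0.0, margin + s - scores[label])
               for j, s in enumerate(scores) if j != label)


rng = random.Random(0)
rand_vec = lambda n: [rng.uniform(-1.0, 1.0) for _ in range(n)]
T, D, WIDTH, F, CLASSES = 6, 4, 3, 5, 3          # illustrative sizes

tokens = [rand_vec(D) for _ in range(T)]          # stand-in for word vectors
conv = conv1d_relu(tokens, [rand_vec(WIDTH * D) for _ in range(F)], WIDTH)
pooled = max_pool(conv)
context = attention(pooled, rand_vec(F))
scores = svm_scores(context, [rand_vec(F) for _ in range(CLASSES)], [0.0] * CLASSES)
predicted = max(range(CLASSES), key=scores.__getitem__)
loss = hinge_loss(scores, label=0)
```

Note that the softmax inside `attention` only normalizes attention weights; the class decision itself comes from the linear margin scores, which is the substitution the text describes.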
A further improvement is that the classification result verification module comprises a data set dividing unit and a classification result analysis unit. The data set dividing unit randomly divides the text data set into a training set and a test set, which are input into the text classification model for training and testing respectively; the classification result analysis unit compares the test result with the text classification result to verify the accuracy of text classification.
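The comparison performed by the classification result analysis unit is not detailed in the patent. A minimal sketch, assuming it amounts to comparing predicted labels with reference labels, is an accuracy figure plus a table of (expected, predicted) counts; the function names are hypothetical.

```python
from collections import Counter
from typing import List


def accuracy(predicted: List[str], expected: List[str]) -> float:
    """Fraction of samples whose predicted label equals the expected label."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def confusion(predicted: List[str], expected: List[str]) -> Counter:
    """Count each (expected, predicted) pair to show where errors occur."""
    return Counter(zip(expected, predicted))


pred = ["a", "b", "a", "c"]
gold = ["a", "b", "c", "c"]
acc = accuracy(pred, gold)
table = confusion(pred, gold)
```

The counts in `table` are also a natural input for a visualization step, since they can be drawn directly as a confusion matrix.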
A further improvement is that the data set dividing unit shuffles the text data set using a set random seed and randomly divides it into a training set and a test set at a ratio of 9:1 or 8:2.
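A seeded shuffle followed by a cut at the chosen ratio can be sketched as follows. The function name, the default seed of 42 and the rounding rule are assumptions; the 9:1 and 8:2 ratios come from the text.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    samples: Sequence[T], train_ratio: float = 0.9, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle with a fixed seed, then cut into training and test sets."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)                 # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)    # same seed gives the same order
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


data = list(range(100))
train_91, test_91 = split_dataset(data, train_ratio=0.9)   # 9:1 split
train_82, test_82 = split_dataset(data, train_ratio=0.8)   # 8:2 split
```

Using a private `random.Random(seed)` instance rather than the module-level generator keeps the split reproducible even if other code draws random numbers.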
A further improvement is that the visualization module comprises a data conversion unit and a data visualization unit. The data conversion unit converts the text classification result of the text classification module and the verification result of the classification result verification module into visual data, and the data visualization unit sends the visual data to an external display for presentation to the user.
The beneficial effects of the invention are as follows. The invention screens, format-unifies and normalizes the original text data in a manner suited to natural language processing, which facilitates uniform handling of the data. Word segmentation, key-information weighting and vectorization decompose the text data into basic processing units, making feature extraction during classification more convenient and reducing the cost of subsequent processing. Weighting key information reduces the influence of word frequency on the classification result, improves classification accuracy, and solves the problem that traditional algorithms cannot reflect word position information. A convolutional neural network is combined with a support vector machine classifier, and an attention mechanism is added to the model, so that features are extracted from the text data by the convolutional neural network while a classifier based on the support vector machine replaces the normalized exponential (softmax) function of the convolutional neural network, whose generalization capability is insufficient; this simplifies the model parameters and improves the efficiency and accuracy of text classification. Finally, the visualization module improves the readability of the text classification result, so that the result can be viewed more intuitively.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of the system architecture of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Referring to FIG. 1, this embodiment provides a text classification system based on natural language processing, comprising a data acquisition module, a data preprocessing module, a data post-processing module, a text classification module, a classification result verification module and a visualization module. The data acquisition module acquires the original text data to be classified and sends it to the data preprocessing module. The data preprocessing module comprises a data screening unit, a formatting unit and a normalization unit, which screen, format-unify and normalize the original text data in a manner suited to natural language processing, facilitating uniform handling of the data. The data post-processing module comprises a text word segmentation unit, which decomposes the preprocessed data into word-segmented text data, and an information weight unit, which applies weight processing to the word-segmented text data to produce a text data set. The text classification module comprises a text classification model for classifying text data and a model training unit for training that model. The text classification model is built on a convolutional neural network and a support vector machine classifier and incorporates an attention mechanism: the convolutional neural network extracts features from the text data, and a classifier based on the support vector machine replaces the normalized exponential (softmax) function of the convolutional neural network, whose generalization capability is insufficient, which simplifies the model parameters while improving the efficiency and accuracy of text classification. The classification result verification module tests and verifies the text classification result of the text classification module, and the visualization module visually displays both the text classification result and the verification result, improving the readability of the text classification result so that it can be viewed more intuitively.
The data screening unit screens the original text data, screens invalid text data in the original text data, and meanwhile retains the valid text data, wherein the invalid text data comprises missing value data, abnormal value data, inconsistent value data and repeated text data.
The formatting unit formats the effective text data screened by the data screening unit into a uniform format to obtain text data with the uniform format, the normalizing unit splits the text data with the uniform format by taking a sentence as a unit and creates a normalization label for the split sentence to obtain normalized text data and finish preprocessing of the original text data.
The text word segmentation unit segments the preprocessed text data into words and removes inflectives and stop words to obtain word-segmented text data. The information weight unit assigns different weights to words according to the position at which they appear in the word-segmented text data, so that key information is weighted, and then uses one-hot encoding to map each word to its corresponding word vector, yielding the text data set. Word segmentation, key-information weighting and vectorization decompose the text data into basic processing units, which makes feature extraction during classification more convenient and reduces the cost of subsequent processing. Weighting key information reduces the influence of word frequency on the classification result, improves classification accuracy, and solves the problem that traditional algorithms cannot reflect word position information.
The model training unit trains the text classification model using a machine-learning-based model algorithm; after training, the text classification model takes the text data set as input and classifies it.
The convolutional neural network in the text classification model comprises an input layer, a hidden layer and an output layer, wherein the hidden layer comprises a convolutional layer, a pooling layer, an attention layer and a full connection layer, the input layer imports a text data set, the convolutional layer and the pooling layer complete feature extraction work and an attention mechanism is introduced by the attention layer in the extraction process, and the full connection layer realizes text classification work.
The classification result verification module comprises a data set dividing unit and a classification result analysis unit, wherein the data set dividing unit randomly divides the text data set into a training set and a testing set, then inputs the training set and the testing set into a text classification model for training and testing, and the classification result analysis unit compares and analyzes the testing result and the text classification result and verifies the accuracy of text classification.
The data set dividing unit shuffles the text data set using a set random seed and randomly divides it into a training set and a test set at a ratio of 9:1.
The visualization module comprises a data conversion unit and a data visualization unit. The data conversion unit converts the text classification result of the text classification module and the verification result of the classification result verification module into visual data, and the data visualization unit sends the visual data to an external display for presentation to the user.
Example two
Referring to FIG. 1, this embodiment provides a text classification system based on natural language processing, comprising a data acquisition module, a data preprocessing module, a data post-processing module, a text classification module, a classification result verification module and a visualization module. The data acquisition module acquires the original text data to be classified and sends it to the data preprocessing module. The data preprocessing module comprises a data screening unit, a formatting unit and a normalization unit, which screen, format-unify and normalize the original text data in a manner suited to natural language processing, facilitating uniform handling of the data. The data post-processing module comprises a text word segmentation unit, which decomposes the preprocessed data into word-segmented text data, and an information weight unit, which applies weight processing to the word-segmented text data to produce a text data set. The text classification module comprises a text classification model for classifying text data and a model training unit for training that model. The text classification model is built on a convolutional neural network and a support vector machine classifier and incorporates an attention mechanism: the convolutional neural network extracts features from the text data, and a classifier based on the support vector machine replaces the normalized exponential (softmax) function of the convolutional neural network, whose generalization capability is insufficient, which simplifies the model parameters while improving the efficiency and accuracy of text classification. The classification result verification module tests and verifies the text classification result of the text classification module, and the visualization module visually displays both the text classification result and the verification result, improving the readability of the text classification result so that it can be viewed more intuitively.
The data screening unit screens the original text data, screens invalid text data in the original text data, and meanwhile retains the valid text data, wherein the invalid text data comprises missing value data, abnormal value data, inconsistent value data and repeated text data.
The formatting unit formats the effective text data screened by the data screening unit into a uniform format to obtain text data with the uniform format, the normalizing unit splits the text data with the uniform format by taking a sentence as a unit and creates a normalization label for the split sentence to obtain normalized text data and finish preprocessing of the original text data.
The text word segmentation unit segments the preprocessed text data into words and removes inflectives and stop words to obtain word-segmented text data. The information weight unit assigns different weights to words according to the position at which they appear in the word-segmented text data, so that key information is weighted, and then uses one-hot encoding to map each word to its corresponding word vector, yielding the text data set. Word segmentation, key-information weighting and vectorization decompose the text data into basic processing units, which makes feature extraction during classification more convenient and reduces the cost of subsequent processing. Weighting key information reduces the influence of word frequency on the classification result, improves classification accuracy, and solves the problem that traditional algorithms cannot reflect word position information.
The model training unit trains the text classification model using a deep-learning-based model algorithm; after training, the text classification model takes the text data set as input and classifies it.
The convolutional neural network in the text classification model comprises an input layer, a hidden layer and an output layer, wherein the hidden layer comprises a convolutional layer, a pooling layer, an attention layer and a full connection layer, the input layer imports a text data set, the convolutional layer and the pooling layer complete feature extraction work and an attention mechanism is introduced by the attention layer in the extraction process, and the full connection layer realizes text classification work.
The classification result verification module comprises a data set dividing unit and a classification result analysis unit, wherein the data set dividing unit randomly divides the text data set into a training set and a testing set, then inputs the training set and the testing set into a text classification model for training and testing, and the classification result analysis unit compares and analyzes the testing result and the text classification result and verifies the accuracy of text classification.
The data set dividing unit shuffles the text data set using a set random seed and randomly divides it into a training set and a test set at a ratio of 8:2.
The visualization module comprises a data conversion unit and a data visualization unit. The data conversion unit converts the text classification result of the text classification module and the verification result of the classification result verification module into visual data, and the data visualization unit sends the visual data to an external display for presentation to the user.
To classify original text data, the data acquisition module first collects the original text data to be classified. The data preprocessing module then screens, format-unifies and normalizes the collected data. The data post-processing module performs word segmentation, key-information weighting and vectorization on the preprocessed text data to obtain a text data set. The text classification model trained by the model training unit classifies the text data set, and the classification result verification module verifies the text classification result. Finally, the visualization module visually displays the text classification result of the text classification module and the verification result of the classification result verification module to the user.
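The order of operations in this workflow can be expressed as a chain of stages, each consuming the previous stage's output. The sketch below shows only the chaining; the three stage functions are trivial placeholders rather than the patent's modules, the verification and visualization stages are omitted, and all names are hypothetical.

```python
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(raw: Any, stages: List[Stage]) -> Tuple[Any, List[str]]:
    """Feed `raw` through each stage in order and record the stage names run."""
    data, trace = raw, []
    for name, stage in stages:
        data = stage(data)
        trace.append(name)
    return data, trace


# Placeholder stages (assumptions for illustration only).
def preprocess(texts: List[str]) -> List[str]:
    return [t.strip() for t in texts if t and t.strip()]


def postprocess(texts: List[str]) -> List[List[str]]:
    return [t.lower().split() for t in texts]


def classify(token_lists: List[List[str]]) -> List[str]:
    return ["long" if len(tokens) > 3 else "short" for tokens in token_lists]


STAGES: List[Stage] = [
    ("preprocessing", preprocess),
    ("post-processing", postprocess),
    ("classification", classify),
]

labels, trace = run_pipeline(["  A b c d e ", "", "Hi there"], STAGES)
```

Keeping the stages in a list makes it straightforward to append further stages, such as verification or visualization, without changing the runner.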
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A text classification system based on natural language processing, characterized by: the system comprises a data acquisition module, a data preprocessing module, a data post-processing module, a text classification module, a classification result verification module and a visualization module, wherein the data acquisition module acquires original text data to be classified and sends the original text data to the data preprocessing module, the data preprocessing module comprises a data screening unit, a formatting unit and a normalization unit, the data post-processing module comprises a text word segmentation unit and an information weight unit, the text word segmentation unit is used for decomposing the preprocessed data into word-segmented text data, the information weight unit is used for applying weight processing to the word-segmented text data to produce a text data set, the text classification module comprises a text classification model used for classifying the text data and a model training unit used for training the text classification model, the text classification model is constructed based on a convolutional neural network and a support vector machine classifier and incorporates an attention mechanism, the classification result verification module tests and verifies the text classification result of the text classification module, and the visualization module visually displays the text classification result of the text classification module and the verification result of the classification result verification module.
2. A system for natural language processing based text classification as claimed in claim 1, wherein: the data screening unit screens original text data, screens invalid text data in the original text data, and meanwhile retains valid text data, wherein the invalid text data comprise missing value data, abnormal value data, inconsistent value data and repeated text data.
3. A natural language processing based text classification system according to claim 2, characterized in that: the formatting unit converts the valid text data retained by the data screening unit into a uniform format to obtain uniformly formatted text data; the normalization unit splits the uniformly formatted text data into sentences and creates a normalization tag for each split sentence to obtain normalized text data, thereby completing the preprocessing of the original text data.
4. A natural language processing based text classification system according to claim 1, characterized in that: the text word segmentation unit performs word segmentation on the preprocessed text data and removes modal particles and stop words to obtain word segmentation text data; the information weight unit assigns different weights to words appearing at different positions in the word segmentation text data, so that the word segmentation text data is weighted by key information; the words in the word segmentation text data are then mapped to their corresponding word vectors using one-hot encoding or a word embedding technique to obtain a text data set.
5. A natural language processing based text classification system according to claim 1, characterized in that: the model training unit trains the text classification model using a machine learning based or deep learning based model algorithm, and after training the text classification model takes the text data set as input to perform text classification.
6. A natural language processing based text classification system according to claim 1, characterized in that: the convolutional neural network in the text classification model comprises an input layer, a hidden layer and an output layer, wherein the hidden layer comprises a convolutional layer, a pooling layer, an attention layer and a fully connected layer; the input layer receives the text data set; the convolutional layer and the pooling layer perform feature extraction, with the attention layer introducing an attention mechanism during extraction; and the fully connected layer performs the text classification.
7. A natural language processing based text classification system according to claim 1, characterized in that: the classification result verification module comprises a data set dividing unit and a classification result analysis unit; the data set dividing unit randomly divides the text data set into a training set and a test set, which are then input into the text classification model for training and testing; the classification result analysis unit compares the test result with the text classification result to verify the accuracy of the text classification.
8. A natural language processing based text classification system according to claim 7, characterized in that: the data set dividing unit shuffles the text data set using a set random seed and randomly divides it into a training set and a test set at a ratio of 9:1 or 8:2.
9. A natural language processing based text classification system according to claim 1, characterized in that: the visualization module comprises a data conversion unit and a data visualization unit; the data conversion unit converts the text classification result of the text classification module and the verification result of the classification result verification module into visual data, and the data visualization unit outputs the visual data to an external display for presentation to a user.
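
As an illustration of the preprocessing described in claims 2 and 3, the following is a minimal sketch and not the claimed implementation. The function names, the choice of Unicode NFKC normalization as the "uniform format", the punctuation-based sentence splitter and the `document-sentence` tag layout are all assumptions made for the example. Only missing values and repeated texts are screened here; abnormal and inconsistent values would need domain-specific rules that the claims do not specify.

```python
import re
import unicodedata


def screen(records):
    """Keep valid texts: drop missing values and repeated texts (assumed rules)."""
    seen = set()
    valid = []
    for text in records:
        if text is None or not text.strip():
            continue  # missing value
        key = text.strip()
        if key in seen:
            continue  # repeated text
        seen.add(key)
        valid.append(key)
    return valid


def to_uniform_format(text):
    """Assumed uniform format: NFKC-normalized characters, single spaces."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Split after sentence-ending punctuation (Western and Chinese)."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [p for p in parts if p]


def preprocess(records):
    """Screen, format, split into sentences, and tag each sentence."""
    normalized = []
    for doc_id, text in enumerate(screen(records)):
        sentences = split_sentences(to_uniform_format(text))
        for sent_id, sentence in enumerate(sentences):
            # Hypothetical normalization tag: "<document index>-<sentence index>"
            normalized.append({"tag": f"{doc_id}-{sent_id}", "sentence": sentence})
    return normalized
```

Calling `preprocess` on a list of raw strings returns one tagged record per sentence, which a later word segmentation step could consume. The splitter is deliberately naive: it would also split at decimal points, for example.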
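
The post-processing in claim 4 (word segmentation, stop word removal, position-dependent weights, one-hot mapping) can be sketched as follows. This is a hedged example under stated assumptions: whitespace tokenization stands in for a real word segmenter, the stop word list is hypothetical, the weighting scheme (leading words count double) is invented for illustration, and scaling the one-hot vector by the weight is only one possible way to combine the two steps, since the claim does not say how they are combined.

```python
# Hypothetical stop word list, including a filler word standing in for modal particles.
STOP_WORDS = {"the", "a", "of", "is", "oh", "um"}


def segment(sentence):
    """Whitespace tokenization as a stand-in for a real word segmenter."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def position_weights(tokens, head=2, head_weight=2.0, base_weight=1.0):
    """Assumed scheme: the first `head` words weigh more than the rest."""
    return [head_weight if i < head else base_weight for i in range(len(tokens))]


def build_vocab(token_lists):
    """Assign each distinct word an index in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def weighted_one_hot(tokens, weights, vocab):
    """One vector per word: zeros except at the word's index, scaled by its weight."""
    vectors = []
    for token, weight in zip(tokens, weights):
        vec = [0.0] * len(vocab)
        vec[vocab[token]] = weight
        vectors.append(vec)
    return vectors
```

One-hot vectors grow with the vocabulary size, which is why the claim also allows word embeddings; a learned embedding table would replace `weighted_one_hot` in that case.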
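
The data set division in claims 7 and 8 and the accuracy check in claim 7 can be sketched with the Python standard library alone. The function names, the default seed and the use of plain accuracy as the verification metric are assumptions for this example; the claims only state that a random seed is set, that the split ratio is 9:1 or 8:2, and that accuracy is verified.

```python
import random


def split_dataset(samples, train_ratio=0.9, seed=42):
    """Shuffle with a fixed seed, then cut into training and test sets."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)
```

Passing `train_ratio=0.9` gives the 9:1 split and `train_ratio=0.8` the 8:2 split. Reusing the same seed reproduces the same partition, which makes verification runs comparable.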
CN202210172720.0A 2022-02-24 2022-02-24 Text classification system based on natural language processing Pending CN114547305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210172720.0A CN114547305A (en) 2022-02-24 2022-02-24 Text classification system based on natural language processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210172720.0A CN114547305A (en) 2022-02-24 2022-02-24 Text classification system based on natural language processing

Publications (1)

Publication Number Publication Date
CN114547305A true CN114547305A (en) 2022-05-27

Family

ID=81678504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210172720.0A Pending CN114547305A (en) 2022-02-24 2022-02-24 Text classification system based on natural language processing

Country Status (1)

Country Link
CN (1) CN114547305A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033964A (en) * 2011-01-13 2011-04-27 北京邮电大学 Text classification method based on block partition and position weight
CN109472024A (en) * 2018-10-25 2019-03-15 安徽工业大学 A kind of file classification method based on bidirectional circulating attention neural network
CN109857864A (en) * 2019-01-07 2019-06-07 平安科技(深圳)有限公司 Text sentiment classification method, device, computer equipment and storage medium
CN111125366A (en) * 2019-12-25 2020-05-08 三角兽(北京)科技有限公司 Text classification method and device
CN111309901A (en) * 2020-01-19 2020-06-19 北京海鑫科金高科技股份有限公司 Short text classification method and device
CN111709235A (en) * 2020-05-28 2020-09-25 上海发电设备成套设计研究院有限责任公司 Text data statistical analysis system and method based on natural language processing
CN113946677A (en) * 2021-09-14 2022-01-18 中北大学 Event identification and classification method based on bidirectional cyclic neural network and attention mechanism

Similar Documents

Publication Publication Date Title
CN109829159B (en) Integrated automatic lexical analysis method and system for ancient Chinese text
CN109960800A (en) Weakly supervised file classification method and device based on Active Learning
CN110232439B (en) Intention identification method based on deep learning network
CN111709235A (en) Text data statistical analysis system and method based on natural language processing
CN110489750A (en) Burmese participle and part-of-speech tagging method and device based on two-way LSTM-CRF
CN111462752B (en) Customer intention recognition method based on attention mechanism, feature embedding and Bi-LSTM
CN108363691A (en) A kind of field term identifying system and method for 95598 work order of electric power
TWI828928B (en) Highly scalable, multi-label text classification methods and devices
CN110781681A (en) Translation model-based elementary mathematic application problem automatic solving method and system
CN111145903A (en) Method and device for acquiring vertigo inquiry text, electronic equipment and inquiry system
CN108536673B (en) News event extraction method and device
CN114239579A (en) Electric power searchable document extraction method and device based on regular expression and CRF model
CN108229565A (en) A kind of image understanding method based on cognition
CN111859032A (en) Method and device for detecting character-breaking sensitive words of short message and computer storage medium
CN110765107A (en) Question type identification method and system based on digital coding
CN114547305A (en) Text classification system based on natural language processing
CN116976321A (en) Text processing method, apparatus, computer device, storage medium, and program product
US20220156611A1 (en) Method and apparatus for entering information, electronic device, computer readable storage medium
CN112488593B (en) Auxiliary bid evaluation system and method for bidding
CN112541075B (en) Standard case sending time extraction method and system for alert text
CN113672734A (en) Long text classification method based on deep learning composite model
CN114298041A (en) Network security named entity identification method and identification device
CN113537802A (en) Open source information-based geopolitical risk deduction method
CN111930947A (en) System and method for identifying authors of modern Chinese written works
CN115563311B (en) Document labeling and knowledge base management method and knowledge base management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination