CN110516064A - A kind of Aeronautical R&D paper classification method based on deep learning - Google Patents
A deep-learning-based method for classifying aeronautical research papers
- Publication number
- CN110516064A (application CN201910625454.0A)
- Authority
- CN
- China
- Prior art keywords
- aeronautical
- paper
- data set
- training
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a deep-learning-based method for classifying aeronautical research papers, comprising the following steps: S1: collect aeronautical research paper data to obtain a paper data set; S2: clean the paper data set to obtain a first data set; S3: apply text preprocessing to the first data set to obtain a second data set; S4: construct an aeronautical research paper classification model based on the Text-CNN text classification algorithm; S5: train the classification model with the second data set; S6: classify aeronautical research papers with the trained model. Compared with existing classification techniques such as random forest and support vector machines, the method offers advantages such as high speed and high accuracy, which helps improve the working efficiency of researchers.
Description
Technical field
The invention belongs to the field of data mining and relates to a method for classifying aeronautical research papers, and in particular to a deep-learning-based method for classifying aeronautical research papers.
Background art
In recent years the number of academic researchers has grown steadily and scientific results such as papers are published at an increasing pace. The number of scientific papers is growing explosively across every discipline, which has given rise to a variety of user needs. For example, scholars need to find the latest relevant literature in their research field, so classifying the literature is highly desirable. Classifying scientific papers, that is, automatically labelling each paper with the category it belongs to, can significantly improve the efficiency of paper retrieval and speed up research work. Building a paper classification model also supports related research tasks such as paper matching, paper retrieval, and expert recommendation. Faced with massive volumes of paper data, traditional text classification algorithms such as naive Bayes suffer from problems including low classification efficiency and low accuracy.
Natural language processing has developed rapidly in recent years, and natural language processing combined with deep learning algorithms has gradually entered the field of document classification. The technical terminology of aeronautical research papers is highly specialized, so compared with literature in other fields, less text is needed to classify an aeronautical research paper. At present there is still no dedicated classification model for screening aeronautical research papers, and conventional machine learning algorithms such as logistic regression, random forest, support vector machines, and the k-nearest-neighbor algorithm do not achieve particularly good classification results in this field.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of the prior art by providing a deep-learning-based method for classifying aeronautical research papers.
The object of the present invention can be achieved through the following technical solution:
A deep-learning-based method for classifying aeronautical research papers, comprising the following steps:
S1: collect aeronautical research paper data to obtain a paper data set;
S2: clean the paper data set to obtain a first data set;
S3: apply text preprocessing to the first data set to obtain a second data set;
S4: construct an aeronautical research paper classification model based on the Text-CNN text classification algorithm;
S5: train the aeronautical research paper classification model with the second data set;
S6: classify aeronautical research papers with the trained classification model.
Further, collecting the paper data in step S1 specifically comprises: crawling papers from a paper repository with a crawler program, the crawler program using the Python programming language, the PyCharm editing environment, and the Scrapy crawler framework.
Further, the cleaning in step S2 specifically comprises: removing abnormal data and duplicate data from the paper data set, the abnormal data including garbled characters.
Further, the text preprocessing in step S3 specifically comprises: applying jieba word segmentation and stop-word removal to the first data set and organizing the result into abstract-category pairs.
Further, the training process in step S5 specifically comprises:
S501: divide the second data set into a training set, a validation set, and a test set, and perform feature extraction on each data set;
S502: train the aeronautical research paper classification model on the training set to fit the model parameters; tune the hyperparameters of the training process on the validation set; and check the generalization ability of the trained classification model on the test set.
Further, the feature extraction process is: generate a vocabulary from the data set, and generate a numerical matrix by One-Hot encoding.
Further, the aeronautical research paper classification model based on the Text-CNN text classification algorithm consists of four sequentially connected layers: an input layer, a convolutional layer, a pooling layer, and a fully connected layer. The input layer receives the word vectors, the convolutional and pooling layers extract high-level features, and the fully connected layer performs the classification. The number of classes is 2, the number of convolution kernels is 128, the convolution kernel size is 5, the pooling layer is Max-pool, and the fully connected layer has 128 neurons.
The convolution operation is as follows:
$$c_j = f(W \cdot X_{j:j+h-1} + b) \tag{1}$$
where $f$ is the ReLU activation function, $c_j$ is the result of the convolution, $W$ is the weight matrix, $X_{j:j+h-1}$ is the word-vector matrix of the window from $j$ to $j+h-1$, and $b$ is the bias.
The pooling operation is as follows:
$$c_{\max} = \max(c_j) \tag{2}$$
where $c_{\max}$ is the result of the max-pooling operation and $c_j$ ($j = 1, 2, \ldots, n-h+1$) are the results of the convolution operation.
Compared with the prior art, the present invention has the following beneficial effects:
1) The aeronautical research paper classification model of the invention is based on the Text-CNN text classification algorithm. Compared with conventional machine learning algorithms, TextCNN is very good at extracting shallow text features. For papers with a distinct research direction, such as aeronautical research papers, the keywords are highly domain-specific and only a very short text is needed for recognition. Using the TextCNN text classification algorithm, which performs better on short texts, therefore gives good classification results at high speed;
2) The invention applies jieba word segmentation and stop-word removal to the text of the paper data set and organizes it into abstract-category pairs, which screens out and removes redundant text information and improves the efficiency and accuracy of classification;
3) The classification model constructed by the invention can perform data-mining analysis on the key information obtained, derive keyword groups, and retrieve the corresponding aeronautical research papers. The keywords can also serve as the stored representation of an aeronautical research paper, enabling accurate retrieval and storage of relevant literature in the aeronautical research field;
4) The classification model constructed by the invention can serve as a basis for other algorithms and supports other research tasks such as aeronautical research paper matching, paper retrieval, and expert recommendation;
5) The classification method of the invention can also be applied to the classification of other texts such as patents; it is highly extensible and has a certain value for wider adoption.
Brief description of the drawings
Fig. 1 is a flow chart of the deep-learning-based classification of aeronautical research papers;
Fig. 2 is a structure diagram of the Text-CNN convolutional neural network;
Fig. 3 is a flow diagram of aeronautical research paper classification based on conventional machine learning algorithms;
Fig. 4 is a combined comparison of Fig. 1 and Fig. 3;
Fig. 5 is a schematic of the experimental environment;
Fig. 6 is a comparison of the classification results of the different algorithms.
Detailed description of embodiments
The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and specific operating process are given, but the scope of protection of the invention is not limited to the following embodiment.
The present invention provides a deep-learning-based method for classifying aeronautical research papers which, as shown in Fig. 1, comprises the following steps:
S1: collect aeronautical research paper data to obtain a paper data set. Specifically, a crawler program crawls papers from CNKI (China National Knowledge Infrastructure) and the data are stored in a MySQL database. The data used in this embodiment are the abstract of each paper and the category the paper belongs to. The crawler program uses the Python programming language, the PyCharm editing environment, and the Scrapy crawler framework.
S2: clean the paper data set to obtain a first data set;
The cleaning in step S2 specifically comprises: removing abnormal data and duplicate data from the paper data set, the abnormal data including garbled characters.
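The cleaning step can be illustrated with a minimal sketch. The patent does not say how garbled characters are detected, so the check below (the Unicode replacement character U+FFFD or a NUL byte) is only an assumption, as are the function name `clean_records` and the (abstract, category) tuple layout.

```python
from typing import List, Tuple

# Assumption: a record counts as "garbled" if it contains one of these markers.
GARBLED_MARKERS = ("\ufffd", "\x00")


def clean_records(records: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop empty, garbled, and duplicate (abstract, category) records."""
    seen = set()
    cleaned = []
    for abstract, category in records:
        text = abstract.strip()
        if not text:
            continue  # empty abstract: treated as abnormal data
        if any(marker in text for marker in GARBLED_MARKERS):
            continue  # garbled characters: treated as abnormal data
        if text in seen:
            continue  # duplicate abstract
        seen.add(text)
        cleaned.append((text, category))
    return cleaned


if __name__ == "__main__":
    raw = [
        ("Turbine blade cooling study", "aero-engine"),
        ("Turbine blade cooling study", "aero-engine"),  # duplicate
        ("Wing flutter \ufffd\ufffd analysis", "aircraft"),  # garbled
        ("Wing flutter analysis", "aircraft"),
    ]
    print(clean_records(raw))
```

Deduplicating on the abstract text alone is a design choice of this sketch; a real pipeline might key on title or DOI instead.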
S3: apply text preprocessing to the first data set to obtain a second data set;
The text preprocessing in step S3 specifically comprises: applying jieba word segmentation and stop-word removal to the first data set and organizing the result into abstract-category pairs. The jieba segmentation toolkit splits the text of each abstract into words, and a stop-word list is used to filter out characters without practical meaning, such as special characters in the abstract.
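A minimal sketch of this preprocessing step is given below. To keep it runnable without third-party packages, the tokenizer is passed in as a function and the demo uses `str.split` as a stand-in; with jieba installed, one would pass `jieba.lcut` instead. The function name `preprocess` and the sample stop-word list are hypothetical.

```python
from typing import Callable, List, Set, Tuple


def preprocess(
    abstract: str,
    category: str,
    tokenize: Callable[[str], List[str]],
    stopwords: Set[str],
) -> Tuple[List[str], str]:
    """Segment an abstract, drop stop words, and return a (tokens, category) pair."""
    tokens = [
        tok
        for tok in tokenize(abstract)
        if tok.strip() and tok not in stopwords  # drop blanks and stop words
    ]
    return tokens, category


if __name__ == "__main__":
    # Stand-in tokenizer for the demo; for Chinese text use jieba.lcut instead.
    sample_stopwords = {"the", "of", "%", "#"}
    print(preprocess("the turbine blade of the engine #", "aero-engine",
                     str.split, sample_stopwords))
```

Passing the tokenizer as a parameter keeps the stop-word filtering logic independent of the segmentation library.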
S4: construct an aeronautical research paper classification model based on the Text-CNN text classification algorithm;
The structure of the classification model comprises one input layer, one convolutional layer, one pooling layer, and a fully connected layer, as shown in Fig. 2.
The input layer, also called the word embedding layer, receives the word vectors: after feature extraction the text is converted into word vectors and fed to the input layer. In this embodiment the word-vector dimension is 64, the sequence length is 600, and the number of classes is 2.
The convolutional and pooling layers extract high-level features. In text classification the convolution kernel slides in one dimension. In the deep-learning-based method of the invention, the convolution kernel size kernel_size is 5 and the number of convolution kernels num_filters is 128;
The convolution operation is as follows:
$$c_j = f(W \cdot X_{j:j+h-1} + b) \tag{1}$$
where $f$ is the ReLU activation function, $c_j$ is the result of the convolution, $W$ is the weight matrix, $X_{j:j+h-1}$ is the word-vector matrix of the window from $j$ to $j+h-1$, and $b$ is the bias.
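Equation (1) can be sketched in plain Python for a single convolution kernel. The sketch assumes the usual TextCNN reading of the product, namely that the kernel weights and the window of word vectors are multiplied element-wise and summed, with stride 1 and no padding; the function names are illustrative only.

```python
from typing import List

Matrix = List[List[float]]


def relu(x: float) -> float:
    """ReLU activation f in equation (1)."""
    return x if x > 0.0 else 0.0


def conv1d_feature_map(X: Matrix, W: Matrix, b: float) -> List[float]:
    """Slide one kernel W (h x d) over the word-vector matrix X (n x d).

    Returns c_1 .. c_{n-h+1}, where each c_j = ReLU(sum(W * X[j:j+h]) + b).
    """
    n, h = len(X), len(W)
    feature_map = []
    for j in range(n - h + 1):
        total = b
        for r in range(h):  # rows of the window
            for k in range(len(W[r])):  # word-vector dimensions
                total += W[r][k] * X[j + r][k]
        feature_map.append(relu(total))
    return feature_map


if __name__ == "__main__":
    # Toy input: n = 4 words, d = 2 dimensions, kernel height h = 2.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    W = [[1.0, 0.0], [0.0, 1.0]]
    print(conv1d_feature_map(X, W, b=-1.0))  # n - h + 1 = 3 values
```

In the embodiment the same operation would run with h = 5, d = 64, and 128 different kernels.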
In this embodiment the pooling layer of the Text-CNN model uses Max-pool, that is, max pooling. Sentences of different lengths become fixed-length after the pooling layer and the number of model parameters is reduced, which helps improve classification efficiency.
The pooling operation is as follows:
$$c_{\max} = \max(c_j) \tag{2}$$
where $c_{\max}$ is the result of the max-pooling operation and $c_j$ ($j = 1, 2, \ldots, n-h+1$) are the results of the convolution operation.
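The effect of equation (2), turning feature maps of varying length into a fixed-length vector, can be shown with a short sketch. The helper name `max_over_time` and the toy feature maps are assumptions for illustration.

```python
from typing import List


def max_over_time(feature_maps: List[List[float]]) -> List[float]:
    """Apply c_max = max(c_j) to each kernel's feature map.

    The output has one value per kernel, regardless of sentence length.
    """
    return [max(feature_map) for feature_map in feature_maps]


if __name__ == "__main__":
    # Three kernels applied to a short and a long sentence (toy values).
    short_sentence = [[0.1, 0.7], [0.0, 0.2], [0.5, 0.4]]
    long_sentence = [[0.3, 0.1, 0.9, 0.2], [0.6, 0.0, 0.1, 0.4], [0.2, 0.8, 0.3, 0.1]]
    print(max_over_time(short_sentence))  # 3 values
    print(max_over_time(long_sentence))   # also 3 values
```

With 128 kernels, as in the embodiment, every abstract would be reduced to a 128-dimensional vector before the fully connected layer.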
Finally, the fully connected layer performs the classification and outputs the probability of each class. In this embodiment the fully connected layer has 128 neurons, the activation function is ReLU, and the dropout keep probability is set to 0.5 to prevent overfitting.
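As a sanity check on the hyperparameters given above, the layer output sizes and weight counts can be worked out with simple arithmetic. This is a sketch under stated assumptions: stride 1 with no padding in the convolution, one bias per kernel and per neuron, a trainable embedding table, and a hypothetical vocabulary size of 5000 (the patent does not state the vocabulary size).

```python
from typing import Dict, Tuple


def textcnn_summary(
    vocab_size: int,
    seq_len: int = 600,
    embed_dim: int = 64,
    num_filters: int = 128,
    kernel_size: int = 5,
    hidden_units: int = 128,
    num_classes: int = 2,
) -> Tuple[Dict[str, tuple], Dict[str, int]]:
    """Return per-layer output shapes and parameter counts for the described model."""
    conv_len = seq_len - kernel_size + 1  # n - h + 1 window positions
    shapes = {
        "embedding": (seq_len, embed_dim),
        "conv": (conv_len, num_filters),
        "max_pool": (num_filters,),
        "dense_hidden": (hidden_units,),
        "output": (num_classes,),
    }
    params = {
        "embedding": vocab_size * embed_dim,
        "conv": kernel_size * embed_dim * num_filters + num_filters,
        "dense_hidden": num_filters * hidden_units + hidden_units,
        "output": hidden_units * num_classes + num_classes,
    }
    return shapes, params


if __name__ == "__main__":
    shapes, params = textcnn_summary(vocab_size=5000)  # vocab size is hypothetical
    for name in shapes:
        print(f"{name:13s} shape={shapes[name]} params={params.get(name, 0)}")
```

Max pooling and dropout add no trainable weights, which is why they do not appear in the parameter table.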
S5: train the aeronautical research paper classification model with the second data set, specifically:
S501: divide the second data set into a training set, a validation set, and a test set in a 6:2:2 ratio. The training set is used to fit the model parameters, the validation set is used to tune the hyperparameters during training, and the test set is used to check the generalization ability of the model after training.
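A 6:2:2 split can be sketched as follows. Shuffling with a fixed seed before splitting is an assumption of this sketch (the patent only gives the ratio), and the function name `split_6_2_2` is hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_6_2_2(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle and split samples into training, validation, and test sets (6:2:2)."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # assumption: shuffle before splitting
    n = len(items)
    n_train = n * 6 // 10  # integer arithmetic avoids float rounding
    n_val = n * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_6_2_2(range(10))
    print(len(train), len(val), len(test))
```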
S502: generate a vocabulary from the training set, validation set, and test set, and generate a numerical matrix by One-Hot encoding; the numerical matrix is fed to the convolutional neural network for training, validation, and testing.
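The vocabulary and One-Hot step can be sketched in plain Python. Truncating or zero-padding each document to a fixed sequence length is an assumption made here so that all matrices share one shape; the helper names are illustrative.

```python
from typing import Dict, List


def build_vocab(documents: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tokens in documents:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def one_hot_matrix(tokens: List[str], vocab: Dict[str, int], seq_len: int) -> List[List[int]]:
    """Encode tokens as a seq_len x |vocab| matrix of 0/1 values.

    Assumption: longer inputs are truncated, shorter ones padded with zero rows.
    """
    rows = []
    for token in tokens[:seq_len]:
        row = [0] * len(vocab)
        if token in vocab:  # unknown tokens stay all-zero
            row[vocab[token]] = 1
        rows.append(row)
    while len(rows) < seq_len:
        rows.append([0] * len(vocab))
    return rows


if __name__ == "__main__":
    docs = [["turbine", "blade", "engine"], ["wing", "flutter"]]
    vocab = build_vocab(docs)
    for row in one_hot_matrix(docs[1], vocab, seq_len=3):
        print(row)
```

In practice, frameworks usually store the token indices and let an embedding layer do the lookup, which is equivalent to multiplying the one-hot matrix by the embedding table.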
S6: classify aeronautical research papers with the trained classification model.
The present invention also provides an automatic classification system that implements the above classification method, comprising: a data input module, which provides a data input interface and collects aeronautical research paper data to obtain a paper data set; a preprocessing module, which cleans the paper data set and applies text preprocessing to obtain the second data set; a training and validation module, which trains and validates the constructed Text-CNN-based classification model on the second data set; and an application module, which classifies aeronautical research papers to be classified using the trained classification model.
To examine the classification performance of the Text-CNN-based approach, this embodiment adopts comparative analysis: four conventional machine learning algorithms, namely logistic regression, random forest, support vector machine, and the k-nearest-neighbor algorithm, are also used to classify aeronautical research papers so that the classification performance of each algorithm can be compared. The detailed process is shown in Fig. 3. Steps S1-S3 are the same as in the method of this embodiment; steps S4-S5 differ in the classification model built and in the feature extraction method.
The conventional machine learning algorithms implement step S4 of the method through the following step 1):
1) Build the conventional machine learning classification models;
LogisticRegression, RandomForestClassifier, KNeighborsClassifier, and SVC from the sklearn toolkit are used to train a logistic regression classifier, a random forest classifier, a k-nearest-neighbor classifier, and a support vector machine classifier, respectively. After repeated training, validation, and testing, aeronautical research paper classification models based on conventional machine learning are obtained.
The conventional machine learning algorithms implement step S5 of the method through the following step 2):
2) Text feature extraction based on TF-IDF;
Term frequency-inverse document frequency (TF-IDF) consists mainly of the term frequency TF and the inverse document frequency IDF. It is computed in the following steps:
201): compute TF
$$TF_{ij} = \frac{m_{ij}}{\sum_t m_{tj}} \tag{3}$$
where $m_{ij}$ is the number of times a given word occurs in the document and $\sum_t m_{tj}$ is the total number of occurrences of all words.
202): compute IDF
$$IDF_i = \log\frac{|D|}{|\{j : w_i \in d_j\}| + 1} \tag{4}$$
where $|D|$ is the total number of documents and $|\{j : w_i \in d_j\}| + 1$ represents the number of documents containing the word $w_i$.
203):
$$TF\text{-}IDF = TF \times IDF \tag{5}$$
that is, TF-IDF is the product of the term frequency TF and the inverse document frequency IDF.
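The three TF-IDF steps can be sketched directly from the definitions above. The natural logarithm is an assumption (the text does not give the base), and the function names and toy corpus are illustrative.

```python
import math
from typing import List


def term_frequency(term: str, document: List[str]) -> float:
    """TF: occurrences of the term divided by the total token count of the document."""
    return document.count(term) / len(document)


def inverse_document_frequency(term: str, corpus: List[List[str]]) -> float:
    """IDF: log of |D| over (number of documents containing the term, plus one)."""
    containing = sum(1 for document in corpus if term in document)
    return math.log(len(corpus) / (containing + 1))  # natural log is an assumption


def tf_idf(term: str, document: List[str], corpus: List[List[str]]) -> float:
    """TF-IDF = TF x IDF."""
    return term_frequency(term, document) * inverse_document_frequency(term, corpus)


if __name__ == "__main__":
    corpus = [["engine", "turbine"], ["aircraft", "wing"], ["engine", "wing"]]
    for term in ("turbine", "engine"):
        print(term, tf_idf(term, corpus[0], corpus))
```

With the plus-one term in the denominator, a word that appears in all but one document gets an IDF of zero, and a word that appears in every document gets a negative IDF; this is a property of the formula as stated.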
A comparison of the deep-learning-based classification method with the classification method based on conventional machine learning is shown in Fig. 4.
The experimental environment and toolkits used for the classification experiments of this embodiment are shown in Fig. 5: the PyCharm editing environment, the Python programming language, and the deep learning framework TensorFlow.
The classification metrics used in this embodiment are precision (Pr), recall (Re), and their harmonic mean F1. Precision Pr characterizes the correctness of the classification results, recall Re characterizes the completeness of the classification, and the harmonic mean F1 combines precision and recall.
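The three metrics can be sketched with their standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and F1 is the harmonic mean 2 * Pr * Re / (Pr + Re). The TP/FP/FN notation and the function name are not from the patent, and the toy labels below are made up for illustration.

```python
from typing import List, Tuple


def precision_recall_f1(y_true: List[str], y_pred: List[str], positive: str) -> Tuple[float, float, float]:
    """Compute Pr, Re, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    pr = tp / (tp + fp) if tp + fp else 0.0
    re = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pr * re / (pr + re) if pr + re else 0.0  # harmonic mean of Pr and Re
    return pr, re, f1


if __name__ == "__main__":
    y_true = ["aircraft", "aircraft", "engine", "engine", "aircraft"]
    y_pred = ["aircraft", "engine", "engine", "aircraft", "aircraft"]
    print(precision_recall_f1(y_true, y_pred, positive="aircraft"))
```

Computing the metrics per class, as here, matches the per-class figures reported for the aircraft and aero-engine categories.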
In this embodiment the comprehensive metric F1 is mainly used to evaluate and compare the classification accuracy of the five algorithms above. The classification results are shown in Fig. 6. For the aircraft class, the recall, precision, and F1 of the deep-learning-based classification method reach 97%, 98%, and 97%, respectively; for the two classes aero-engine and aircraft, the classification accuracy of Text-CNN reaches 0.95 and 0.97, respectively. Compared with conventional machine-learning classification methods, the Text-CNN algorithm can automatically extract and learn more classification features and trains faster. The Text-CNN algorithm used in this embodiment therefore classifies aeronautical research papers well.
The preferred embodiment of the present invention has been described in detail above. It should be understood that those of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning, or limited experimentation on the basis of the prior art under the concept of this invention shall fall within the scope of protection determined by the claims.
Claims (8)
1. A deep-learning-based method for classifying aeronautical research papers, characterized by comprising the following steps:
S1: collect aeronautical research paper data to obtain a paper data set;
S2: clean the paper data set to obtain a first data set;
S3: apply text preprocessing to the first data set to obtain a second data set;
S4: construct an aeronautical research paper classification model based on the Text-CNN text classification algorithm;
S5: train the aeronautical research paper classification model with the second data set;
S6: classify aeronautical research papers with the trained classification model.
2. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that collecting the paper data in step S1 specifically comprises: crawling papers from a paper repository with a crawler program, the crawler program using the Python programming language, the PyCharm editing environment, and the Scrapy crawler framework.
3. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that the aeronautical research paper data in step S1 comprise the abstracts of the aeronautical research papers and the categories to which the papers belong.
4. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that the cleaning in step S2 specifically comprises: removing abnormal data and duplicate data from the paper data set.
5. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that the text preprocessing in step S3 specifically comprises: applying jieba word segmentation and stop-word removal to the first data set and organizing the result into abstract-category pairs.
6. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that the aeronautical research paper classification model in step S5 comprises a sequentially connected input layer, convolutional layer, pooling layer, and fully connected layer.
7. The deep-learning-based method for classifying aeronautical research papers according to claim 1, characterized in that the training process in step S5 specifically comprises:
S501: divide the second data set into a training set, a validation set, and a test set, and perform feature extraction on each data set;
S502: train the aeronautical research paper classification model on the training set to fit the model parameters; tune the hyperparameters of the training process on the validation set; and check the generalization ability of the trained classification model on the test set.
8. The deep-learning-based method for classifying aeronautical research papers according to claim 7, characterized in that the feature extraction process is: generate a vocabulary from the data set, and generate a numerical matrix by One-Hot encoding.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910625454.0A CN110516064A (en) | 2019-07-11 | 2019-07-11 | A kind of Aeronautical R&D paper classification method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910625454.0A CN110516064A (en) | 2019-07-11 | 2019-07-11 | A kind of Aeronautical R&D paper classification method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110516064A true CN110516064A (en) | 2019-11-29 |
Family
ID=68623059
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910625454.0A Pending CN110516064A (en) | 2019-07-11 | 2019-07-11 | A kind of Aeronautical R&D paper classification method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516064A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111241283A (en) * | 2020-01-15 | 2020-06-05 | 电子科技大学 | Rapid characterization method for portrait of scientific research student |
CN111651605A (en) * | 2020-06-04 | 2020-09-11 | 电子科技大学 | Lung cancer leading edge trend prediction method based on multi-label classification |
CN113342975A (en) * | 2021-06-11 | 2021-09-03 | 江苏卓易信息科技股份有限公司 | Information catalog topic library classification method for data resources |
CN113837240A (en) * | 2021-09-03 | 2021-12-24 | 南京昆虫软件有限公司 | Classification system and classification method for education department |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10095992B1 (en) * | 2016-07-01 | 2018-10-09 | Intraspexion, Inc. | Using classified text, deep learning algorithms and blockchain to identify risk in low-frequency, high value situations, and provide early warning |
CN108681610A (en) * | 2018-05-28 | 2018-10-19 | 山东大学 | Production takes turns more and chats dialogue method, system and computer readable storage medium |
CN109033402A (en) * | 2018-08-02 | 2018-12-18 | 上海应用技术大学 | The classification method of security fields patent text |
CN109062958A (en) * | 2018-06-26 | 2018-12-21 | 华中师范大学 | It is a kind of based on the primary school of TextRank and convolutional neural networks write a composition automatic classification method |
CN109189926A (en) * | 2018-08-28 | 2019-01-11 | 中山大学 | A kind of construction method of technical paper corpus |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111241283A (en) * | 2020-01-15 | 2020-06-05 | 电子科技大学 | Rapid characterization method for portrait of scientific research student |
CN111241283B (en) * | 2020-01-15 | 2023-04-07 | 电子科技大学 | Rapid characterization method for portrait of scientific research student |
CN111651605A (en) * | 2020-06-04 | 2020-09-11 | 电子科技大学 | Lung cancer leading edge trend prediction method based on multi-label classification |
CN111651605B (en) * | 2020-06-04 | 2022-07-05 | 电子科技大学 | Lung cancer leading edge trend prediction method based on multi-label classification |
CN113342975A (en) * | 2021-06-11 | 2021-09-03 | 江苏卓易信息科技股份有限公司 | Information catalog topic library classification method for data resources |
CN113837240A (en) * | 2021-09-03 | 2021-12-24 | 南京昆虫软件有限公司 | Classification system and classification method for education department |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516064A (en) | A kind of Aeronautical R&D paper classification method based on deep learning | |
CN109189926B (en) | Construction method of scientific and technological paper corpus | |
CN101819601B (en) | Method for automatically classifying academic documents | |
CN110175224B (en) | Semantic link heterogeneous information network embedding-based patent recommendation method and device | |
Sundus et al. | A deep learning approach for arabic text classification | |
CN102194013A (en) | Domain-knowledge-based short text classification method and text classification system | |
CN114048305B (en) | Class case recommendation method of administrative punishment document based on graph convolution neural network | |
CN105260437A (en) | Text classification feature selection method and application thereof to biomedical text classification | |
CN107194617A (en) | A kind of app software engineers soft skill categorizing system and method | |
CN109255029A (en) | A method of automatic Bug report distribution is enhanced using weighted optimization training set | |
Basnet et al. | Improving Nepali news recommendation using classification based on LSTM recurrent neural networks | |
Nguyen et al. | An ensemble of shallow and deep learning algorithms for Vietnamese sentiment analysis | |
CN114265935A (en) | Science and technology project establishment management auxiliary decision-making method and system based on text mining | |
Ali et al. | A probabilistic framework for short text classification | |
Kundana | Data Driven Analysis of Borobudur Ticket Sentiment Using Naïve Bayes. | |
CN112784919A (en) | Intelligent manufacturing multi-mode data oriented classification method | |
Swami et al. | Resume classifier and summarizer | |
Ai | Predicting Titanic Survivors by Using Machine Learning | |
CN117235253A (en) | Truck user implicit demand mining method based on natural language processing technology | |
Mantika et al. | Sentiment Analysis on Twitter Using Naïve Bayes and Logistic Regression for the 2024 Presidential Election | |
Almutairi et al. | A Comparative Analysis for Arabic Sentiment Analysis Models In E-Marketing Using Deep Learning Techniques | |
Sameh et al. | Behaviour analysis voting model using social media data | |
Rajasekar et al. | Comparison of machine learning algorithms in domain specific information extraction | |
Shanthi et al. | Machine learning based twitter sentiment analysis on COVID-19 | |
Shifullah et al. | Classification of Hotel Reviews Using Sentiment Analysis and Machine Learning |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191129