CN104035996A - Domain concept extraction method based on Deep Learning - Google Patents

Info

Publication number
CN104035996A
Authority
CN
China
Prior art keywords
training
classification
degree
concept
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410259300.1A
Other languages
Chinese (zh)
Other versions
CN104035996B (en)
Inventor
吕钊
张青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN201410259300.1A priority Critical patent/CN104035996B/en
Publication of CN104035996A publication Critical patent/CN104035996A/en
Application granted granted Critical
Publication of CN104035996B publication Critical patent/CN104035996B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Abstract

The invention discloses a domain concept extraction method based on Deep Learning. The method extracts samples from a training corpus and uses term frequency, document frequency, inverse document frequency, word length, term frequency variance and domain consensus as the feature vector. Using Deep Learning, it trains a deep network model that captures the complex mapping between the multi-dimensional feature vectors of single-word (word-type) domain concepts and their class labels. In the testing step, the deep network model is compared with an optimized BP neural network model and with the mainstream KNN and SVM models. In these tests, the deep network model built with Deep Learning achieves the best results.

Description

Domain concept extraction method based on Deep Learning
Technical field
The present invention relates to the technical fields of domain concepts, automatic domain concept extraction, artificial neural networks, Deep Learning and deep belief networks. Specifically, it proposes a Deep Learning based feature extraction method suited to the features of single-word (word-type) domain concepts.
Background art
A domain concept is one form in which domain knowledge is expressed; people use domain concepts to describe objects in a domain and to exchange domain information. For example, "short message" and "CRBT" are concepts of the mobile communications domain, while "data structure" and "computer network" are concepts of the computer domain. In a sense, a domain concept is a human abstraction of things formed during cognition. It is a form in which domain knowledge appears in text, and it reflects to some extent how the domain develops and changes. Domain concepts are usually used frequently within their own domain and rarely in other domains.
According to whether they consist of two or more words, domain concepts fall into two classes: single-word (word-type) and compound. Most existing research addresses compound domain concepts, and little work addresses single-word domain concepts specifically. Existing single-word domain concept extraction methods generally suffer from low accuracy and a narrow choice of features: researchers often rely on only one or two features to separate domain concepts from non-domain concepts, so the ability to reject noise is weak. In addition, feature weights and thresholds are not set in a principled way. Suitable values usually have to be chosen from repeated experiments, which requires considerable manual intervention, and the weights and thresholds must be revised whenever the corpus size changes, so portability is poor. The extraction of single-word domain concepts therefore urgently needs improvement.
Neural networks are a mature class of machine learning methods. They offer a practical and effective way to learn real-valued or vector-valued functions from input data, and they are robust to noise in the data. Neural networks are therefore well suited to learning the mapping between the multi-dimensional feature vectors of single-word domain concepts and their classes. A neural network with multiple hidden layers has stronger expressive power, and Deep Learning mainly addresses the problem of training neural networks with many hidden layers.
Summary of the invention
The object of the invention is to provide a domain concept extraction method based on Deep Learning that addresses the weak learning ability of traditional unsupervised methods and the poor results of domain concept extraction. Domain concept extraction is cast as a binary classification problem and uses richer statistical features. The method combines Deep Learning with the domain concept extraction task: a deep belief network is built for unsupervised pre-training, and a traditional neural network model then performs supervised fine-tuning. Compared with the KNN and SVM models, the trained deep network model obtains the highest F value on the test data set.
The specific technical scheme that achieves the object of the invention is as follows.
A domain concept extraction method based on Deep Learning, comprising the following steps:
A) Training stage
First, positive and negative samples are extracted from the training corpus and labeled. Next, the training corpus and a background corpus are combined to extract features from the positive and negative samples and construct the feature vector set. Finally, the feature vector set and the corresponding labels are used to train the deep network (DN) model in the Deep Learning toolbox environment of MATLAB.
B) Test stage
The goal is to use the DN model obtained in the training stage to check the classification effect on the test corpus. First, candidate extraction and feature extraction are performed in turn on the test corpus to construct the feature vector set. The feature vector set is then input to the DN model, which automatically judges and identifies each feature vector and thereby classifies the candidates of the test corpus. Finally, the correct domain concept set is obtained from the classification result together with manual review.
The feature vector set is constructed from the following features:
1) term frequency (TF);
2) document frequency (DF);
3) inverse document frequency (IDF);
4) word length (LEN);
5) term frequency variance (TV);
6) domain consensus (DC).
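The six features can be sketched in a few lines of Python. The patent names the features but does not give their formulas, so everything below is an assumption: the function name `term_features`, the smoothed IDF over the combined corpora, the variance of per-document counts for TV, and the entropy-style definition of DC are common textbook choices used here only for illustration.

```python
import math
from collections import Counter

def term_features(term, domain_docs, background_docs):
    """Return [TF, DF, IDF, LEN, TV, DC] for one candidate term.

    domain_docs and background_docs are lists of tokenized documents
    (lists of str). All formulas are assumed definitions, not taken
    from the patent text.
    """
    counts = [Counter(doc)[term] for doc in domain_docs]  # per-document counts
    tf = sum(counts)                                      # term frequency
    df = sum(1 for c in counts if c > 0)                  # document frequency
    # IDF over domain + background corpora, add-one smoothed (assumption).
    all_docs = domain_docs + background_docs
    n_containing = sum(1 for doc in all_docs if term in doc)
    idf = math.log(len(all_docs) / (1 + n_containing))
    length = len(term)                                    # word length
    mean = tf / len(domain_docs)
    tv = sum((c - mean) ** 2 for c in counts) / len(domain_docs)
    # Domain consensus as the entropy of the term's spread over domain docs.
    dc = -sum((c / tf) * math.log(c / tf) for c in counts if c > 0) if tf else 0.0
    return [tf, df, idf, length, tv, dc]

# Hypothetical toy corpora, for illustration only.
domain = [["radar", "commander", "radar"], ["commander"], ["tank"]]
background = [["weather", "news"], ["commander", "speech"]]
features = term_features("commander", domain, background)
```

A term that is spread evenly over many domain documents gets a high DC under this definition, while a term concentrated in one document gets a DC of zero.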
Training the deep network (DN) model in step a) specifically comprises:
I) Using only the feature vectors of the training data, unsupervised learning builds a deep belief network (Deep Belief Nets, DBN).
A feature vector is fed to the input layer to train the first-layer restricted Boltzmann machine (RBM). The parameters of the first-layer RBM are then fixed, and its output serves as the input for training the second-layer RBM. Similarly, with the parameters of the first two RBM layers fixed, the output of the second-layer RBM is used to train the third-layer RBM. Once all feature vectors have been learned, training of the whole DBN is complete.
II) The parameters of the DBN are used to initialize the deep network DN. The back-propagation algorithm then fine-tunes the DN parameters under supervision, using the class labels of the training samples. Parameter adjustment in this second part ends after a set number of iterations or when the error falls into the range 0.001 to 0.005. This completes the training stage of the DN model.
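The greedy layer-by-layer pre-training of step I) can be illustrated with a minimal pure-Python sketch. The patent trains its model with a MATLAB Deep Learning toolbox and does not specify the RBM update rule; the sketch below assumes Bernoulli units, one-step contrastive divergence (CD-1), and input features already scaled to [0, 1]. The names `RBM` and `pretrain_dbn`, the layer sizes and the learning rate are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)   # reconstruction of the input
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])

def pretrain_dbn(data, hidden_sizes, epochs=20, lr=0.1, seed=0):
    """Train RBMs bottom-up; each trained layer is frozen and feeds the next."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        # The trained layer is no longer updated; its output is the next input.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms

# Three stacked RBMs on 6-dimensional inputs (hypothetical sizes and data).
toy_data = [[1.0, 0.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]]
dbn = pretrain_dbn(toy_data, hidden_sizes=[4, 3, 2], epochs=5)
```

The weights and hidden biases of each trained RBM would then serve as the initial parameters of the corresponding DN layer in step II).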
The classification of the candidates of the test corpus in step b) treats domain concept extraction as binary classification with the classes "domain concept" and "non-domain concept". From the output value of the DN model, the co-occurrence probability p(x, y) of candidate feature x and class y is obtained; it measures the confidence that a candidate concept with feature x belongs to class y. Here x is the feature vector of the candidate concept and y is one of the two classes. The classifier obtained from the training corpus is used to determine the class of each candidate concept in the test data set automatically.
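As a rough illustration of this decision step, the sketch below turns two raw output activations into a class label and a confidence value. The patent only states that the confidence is derived from the DN output; the two-unit output layer, the softmax normalization and the names `LABELS` and `classify` are assumptions made for this example.

```python
import math

LABELS = ("domain concept", "non-domain concept")

def classify(outputs):
    """Map two raw output activations to (label, confidence).

    Softmax normalization is an assumption; the source only says the
    confidence is obtained from the output value of the DN model.
    """
    m = max(outputs)                           # subtract max for stability
    exps = [math.exp(o - m) for o in outputs]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

label, confidence = classify([2.0, 0.0])       # hypothetical activations
```

A candidate is accepted as a domain concept when the first class receives the larger probability; candidates with a confidence near 0.5 are natural targets for the manual review mentioned in the test stage.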
The invention provides a domain concept extraction method based on Deep Learning, comprising the formulation of domain concept extraction as a classification problem and the proposed Deep Learning based extraction algorithm. For single-word domain concepts, the method recognizes domain words better than a traditional neural network model, the classical KNN model and the SVM model on the same experimental data set.
The invention combines Deep Learning with the domain concept extraction task: a deep belief network is built for unsupervised pre-training, and a traditional neural network model then performs supervised fine-tuning. The resulting deep network model achieves higher precision on the test data set while maintaining a reasonable recall, and its overall recognition performance is the best.
With the present invention, extraction results for single-word domain concepts can be obtained effectively using Deep Learning, which benefits research in information retrieval, machine translation, ontology learning and related areas.
Brief description of the drawings
Fig. 1 is the overall flow chart of the invention;
Fig. 2 is the training flow chart of the invention;
Fig. 3 is the test flow chart of the invention;
Fig. 4 is the structure diagram of the deep network model of the invention;
Fig. 5 compares the experimental metrics of the different classification models.
Detailed description
The present invention is a domain concept extraction method based on Deep Learning. It comprises the classification formulation of domain concept extraction and the Deep Learning based domain concept extraction itself. In the classification formulation, domain concept extraction is treated as binary classification with the two classes "domain concept" and "non-domain concept". Following the machine learning approach, features are collected from training samples to build a classifier, and the classifier automatically determines the class of each candidate concept in the test data set. Specifically, classification estimates the co-occurrence probability p(x, y) of candidate concept feature x and class y, which measures the confidence that a candidate concept with feature x belongs to class y. Here x is the feature vector of the candidate concept and y is one of the two classes.
The Deep Learning based domain concept extraction (Deep Learning based Domain Concept Extraction Algorithm, DLDoC) is divided into two stages, training and testing, as shown in Fig. 1. The training module first learns a deep network (Deep Nets, DN) model from the training data; the test module then uses this DN model to classify the test data automatically. The classification results are reviewed manually to obtain the final, correct domain concept set. The specific steps are as follows:
I) Training stage: the training stage builds the deep network model. As shown in Fig. 2, positive and negative samples are first extracted from the training corpus and labeled. The training corpus and the background corpus are then combined to extract features from these samples and construct the feature vector set. Finally, the feature vector set and the corresponding labels are used to train the model. The whole training process can be understood as a mapping from the training corpus to the model that passes in turn through the sample space and the feature space.
II) Test stage: the test stage uses the DN model obtained from training to check the recognition effect on the test data set. As shown in Fig. 3, and similarly to training, candidate extraction and feature extraction are first performed in turn on the test corpus to construct the feature vector set. The feature vector set is then input to the DN model, which automatically judges and identifies each feature vector and thereby classifies the candidates. Finally, the classification results are compared with manual annotations to compute the overall recognition effect.
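The comparison with manual annotations at the end of the test stage is typically expressed as precision, recall and the F value. The following minimal sketch uses the standard definitions; the function name `precision_recall_f1` and the example terms are hypothetical and do not come from the patent's data.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted domain concepts with manually annotated ones.

    predicted, gold: sets of terms labelled as domain concepts.
    """
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical classifier output and manual annotation.
predicted = {"radar", "commander", "weather"}
gold = {"radar", "commander", "missile", "tank"}
p, r, f = precision_recall_f1(predicted, gold)
```

Here two of the three predicted terms are correct and two of the four annotated terms are found, so precision is 2/3, recall is 1/2 and the F value is 4/7.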
Construction of the feature vector set:
In view of the TF-IDF method adopted by most researchers, the present invention selects the following features:
1) term frequency (TF);
2) document frequency (DF);
3) inverse document frequency (IDF);
4) word length (LEN);
5) term frequency variance (TV);
6) domain consensus (DC).
The construction of the deep network DN model, as shown in Fig. 4:
I) Using only the feature vectors of the training data, unsupervised learning builds a deep belief network (Deep Belief Nets, DBN). A feature vector is fed to the input layer to train the first-layer RBM. The parameters of the first-layer RBM are then fixed, and its output serves as the input for training the second-layer RBM. Similarly, with the parameters of the first two RBM layers fixed, the output of the second-layer RBM is used to train the third-layer RBM. Once all feature vectors have been learned, training of the whole DBN is complete.
II) The parameters of the DBN are used to initialize the DN. The back-propagation algorithm then fine-tunes the network under supervision, using the class labels of the training samples. Parameter adjustment in this second part ends after a set number of iterations or when the error falls into the range 0.001 to 0.005. At this point the training of the DN model is complete, and the model can be used to predict the class of unknown samples.
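The supervised fine-tuning of step II), including its two stopping conditions, can be sketched as follows. This is an illustration under assumptions, not the patent's MATLAB implementation: sigmoid units, a squared-error loss, per-sample updates, and the names `random_layer` and `fine_tune` are all hypothetical. In a real run the hidden layers would be initialized from the pre-trained DBN; the random initialization here is only a stand-in so that the example is self-contained.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_layer(n_in, n_out, rng):
    """Stand-in for a layer whose parameters would come from the DBN."""
    return ([[rng.gauss(0, 0.1) for _ in range(n_out)] for _ in range(n_in)],
            [0.0] * n_out)

def forward(layers, x):
    """Return the activations of every layer, input included."""
    acts = [x]
    for w, b in layers:
        prev = acts[-1]
        acts.append([sigmoid(b[j] + sum(prev[i] * w[i][j] for i in range(len(prev))))
                     for j in range(len(b))])
    return acts

def fine_tune(layers, samples, labels, lr=0.5, max_iters=5000, tol=0.005):
    """Back-propagation; stops after max_iters passes or when error <= tol.

    layers is a list of (weights, biases) pairs and is updated in place.
    Returns (passes_run, mean_squared_error_of_last_pass).
    """
    it, error = 0, float("inf")
    for it in range(1, max_iters + 1):
        error = 0.0
        for x, y in zip(samples, labels):
            acts = forward(layers, x)
            out = acts[-1]
            error += sum((o - t) ** 2 for o, t in zip(out, y)) / len(y)
            # Output delta for sigmoid units with squared error.
            delta = [(o - t) * o * (1 - o) for o, t in zip(out, y)]
            for k in range(len(layers) - 1, -1, -1):
                w, b = layers[k]
                prev = acts[k]
                # Delta of the layer below, computed before w is changed.
                below = [prev[i] * (1 - prev[i])
                         * sum(w[i][j] * delta[j] for j in range(len(delta)))
                         for i in range(len(prev))]
                for i in range(len(prev)):
                    for j in range(len(delta)):
                        w[i][j] -= lr * prev[i] * delta[j]
                for j in range(len(delta)):
                    b[j] -= lr * delta[j]
                delta = below
        error /= len(samples)
        if error <= tol:          # error has fallen into the target range
            break
    return it, error

# Hypothetical toy problem: the label equals the first input.
rng = random.Random(0)
net = [random_layer(2, 3, rng), random_layer(3, 1, rng)]
inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [[0.0], [0.0], [1.0], [1.0]]
passes, final_error = fine_tune(net, inputs, targets, max_iters=2000, tol=0.005)
```

The `tol` argument corresponds to the upper end of the 0.001 to 0.005 error range named in the text, and `max_iters` to the fixed iteration budget; whichever condition is met first ends the parameter adjustment.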
Embodiment
The invention is further described below with reference to the accompanying drawings, taking a military domain corpus as an example.
Referring to Fig. 1, samples are first extracted from the training corpus, features are extracted from the samples, and feature vectors are selected to train the model, namely the DN model. The DN model then automatically classifies the test data. The classification results can be reviewed manually to obtain the final, correct domain concept set.
In this embodiment, as shown in Fig. 2, the training corpus is converted to the sample space. The invention constructs the feature vectors from the features listed above; Table 1 lists the feature values of some of the training samples extracted from the military domain corpus.
Table 1. Features of some military domain training samples
Model training uses the positive and negative sample sets and the corresponding feature vector sets obtained in the previous two steps. It learns the relation between the feature vectors and the sample labels and trains the deep network (DN) model. For each sample the model maps the feature vector to a label; training thus yields the parameters of the DN model.
In this embodiment, as shown in Fig. 3, the test sample "commandant" is selected, with feature vector (29, 6, 4.8078, 2, 208.9667, 1.4144). The test identifies this feature vector as a positive example, which illustrates that the DN model has good recognition ability on the sample set.
The invention also compares the DN model, built by combining the DBN with a neural network, against the traditional KNN and SVM models. As shown in Fig. 5, the DBN+NN model, which uses DBN pre-training, obtains relatively good and stable precision, exceeding the KNN and SVM models by 13.05 and 23.09 percentage points respectively. On the F value, which reflects overall performance, the DBN+NN model built by the invention obtains the highest value, exceeding the SVM model by 2.53 percentage points, while the F values of the basic NN2 model and the KNN model differ little.

Claims (4)

1. A domain concept extraction method based on Deep Learning, characterized in that the method comprises the following steps:
A) Training stage
First, positive and negative samples are extracted from the training corpus and labeled. Next, the training corpus and a background corpus are combined to extract features from the positive and negative samples and construct the feature vector set. Finally, the feature vector set and the corresponding labels are used to train the deep network (DN) model in the Deep Learning toolbox environment of MATLAB.
B) Test stage
First, candidate extraction and feature extraction are performed in turn on the test corpus to construct the feature vector set. The feature vector set is then input to the DN model, which automatically judges and identifies each feature vector and thereby classifies the candidates of the test corpus. Finally, the correct domain concept set is obtained from the classification result together with manual review.
2. The method according to claim 1, characterized in that the feature vector set is constructed from the following features:
term frequency (TF);
document frequency (DF);
inverse document frequency (IDF);
word length (LEN);
term frequency variance (TV);
domain consensus (DC).
3. The method according to claim 1, characterized in that training the deep network (DN) model in step a) specifically comprises:
I) Using only the feature vectors of the training data, unsupervised learning builds a deep belief network DBN.
A feature vector is fed to the input layer to train the first-layer restricted Boltzmann machine RBM. The parameters of the first-layer RBM are then fixed, and its output serves as the input for training the second-layer RBM. Similarly, with the parameters of the first two RBM layers fixed, the output of the second-layer RBM is used to train the third-layer RBM. Once all feature vectors have been learned, training of the whole DBN is complete.
II) The parameters of the DBN are used to initialize the deep network DN. The back-propagation algorithm then fine-tunes the DN parameters under supervision, using the class labels of the training samples. Parameter adjustment in this second part ends after a set number of iterations or when the error falls into the range 0.001 to 0.005. This completes the training stage of the DN model.
4. The method according to claim 1, characterized in that the classification of the candidates of the test corpus in step b) treats domain concept extraction as binary classification with the classes "domain concept" and "non-domain concept". From the output value of the DN model, the co-occurrence probability p(x, y) of candidate feature x and class y is obtained; it measures the confidence that a candidate concept with feature x belongs to class y. Here x is the feature vector of the candidate concept and y is one of the two classes. The classifier obtained from the training corpus is used to determine the class of each candidate concept in the test data set automatically.
CN201410259300.1A 2014-06-11 2014-06-11 Field concept abstracting method based on Deep Learning Expired - Fee Related CN104035996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410259300.1A CN104035996B (en) 2014-06-11 2014-06-11 Field concept abstracting method based on Deep Learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410259300.1A CN104035996B (en) 2014-06-11 2014-06-11 Field concept abstracting method based on Deep Learning

Publications (2)

Publication Number Publication Date
CN104035996A true CN104035996A (en) 2014-09-10
CN104035996B CN104035996B (en) 2017-06-16

Family

ID=51466766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410259300.1A Expired - Fee Related CN104035996B (en) 2014-06-11 2014-06-11 Field concept abstracting method based on Deep Learning

Country Status (1)

Country Link
CN (1) CN104035996B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106055560A (en) * 2016-05-18 2016-10-26 上海申腾信息技术有限公司 Method for collecting data of word segmentation dictionary based on statistical machine learning method
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device
CN106599577A (en) * 2016-12-13 2017-04-26 重庆邮电大学 ListNet learning-to-rank method combining RBM with feature selection
CN106650806A (en) * 2016-12-16 2017-05-10 北京大学深圳研究生院 Cooperative type deep network model method for pedestrian detection
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
CN106980873A (en) * 2017-03-09 2017-07-25 南京理工大学 Fancy carp screening technique and device based on deep learning
CN108959375A (en) * 2018-05-24 2018-12-07 南京网感至察信息科技有限公司 A kind of rule-based Knowledge Extraction Method with deep learning
WO2019015461A1 (en) * 2017-07-18 2019-01-24 中国银联股份有限公司 Risk identification method and system based on transfer deep learning
CN109543046A (en) * 2018-11-16 2019-03-29 重庆邮电大学 A kind of robot data interoperability Methodologies for Building Domain Ontology based on deep learning
CN109597946A (en) * 2018-12-05 2019-04-09 国网江西省电力有限公司信息通信分公司 A kind of bad webpage intelligent detecting method based on deepness belief network algorithm
CN109871896A (en) * 2019-02-26 2019-06-11 北京达佳互联信息技术有限公司 Data classification method, device, electronic equipment and storage medium
CN114626520A (en) * 2022-03-01 2022-06-14 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for training model
CN115357691A (en) * 2022-10-21 2022-11-18 成都数之联科技股份有限公司 Semantic retrieval method, system, equipment and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290626A (en) * 2008-06-12 2008-10-22 昆明理工大学 Text categorization feature selection and weight computation method based on field knowledge
CN101739430A (en) * 2008-11-21 2010-06-16 中国科学院计算技术研究所 Method for training and classifying text emotion classifiers based on keyword
CN103365997A (en) * 2013-07-12 2013-10-23 华东师范大学 Opinion mining method based on ensemble learning
CN103793510A (en) * 2014-01-29 2014-05-14 苏州融希信息科技有限公司 Classifier construction method based on active learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭云龙 等: "基于证据理论的多分类器中文微博观点句识别", 《计算机工程》 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106055560A (en) * 2016-05-18 2016-10-26 上海申腾信息技术有限公司 Method for collecting data of word segmentation dictionary based on statistical machine learning method
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device
CN106228980B (en) * 2016-07-21 2019-07-05 百度在线网络技术(北京)有限公司 Data processing method and device
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
CN106686403B (en) * 2016-12-07 2019-03-08 腾讯科技(深圳)有限公司 A kind of video preview drawing generating method, device, server and system
CN106599577A (en) * 2016-12-13 2017-04-26 重庆邮电大学 ListNet learning-to-rank method combining RBM with feature selection
CN106650806B (en) * 2016-12-16 2019-07-26 北京大学深圳研究生院 A kind of cooperating type depth net model methodology for pedestrian detection
CN106650806A (en) * 2016-12-16 2017-05-10 北京大学深圳研究生院 Cooperative type deep network model method for pedestrian detection
CN106980873A (en) * 2017-03-09 2017-07-25 南京理工大学 Fancy carp screening technique and device based on deep learning
WO2019015461A1 (en) * 2017-07-18 2019-01-24 中国银联股份有限公司 Risk identification method and system based on transfer deep learning
CN108959375A (en) * 2018-05-24 2018-12-07 南京网感至察信息科技有限公司 A kind of rule-based Knowledge Extraction Method with deep learning
CN109543046A (en) * 2018-11-16 2019-03-29 重庆邮电大学 A kind of robot data interoperability Methodologies for Building Domain Ontology based on deep learning
CN109597946A (en) * 2018-12-05 2019-04-09 国网江西省电力有限公司信息通信分公司 A kind of bad webpage intelligent detecting method based on deepness belief network algorithm
CN109597946B (en) * 2018-12-05 2022-04-12 国网江西省电力有限公司信息通信分公司 Bad webpage intelligent detection method based on deep belief network algorithm
CN109871896A (en) * 2019-02-26 2019-06-11 北京达佳互联信息技术有限公司 Data classification method, device, electronic equipment and storage medium
CN114626520A (en) * 2022-03-01 2022-06-14 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for training model
CN115357691A (en) * 2022-10-21 2022-11-18 成都数之联科技股份有限公司 Semantic retrieval method, system, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN104035996B (en) 2017-06-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170616

Termination date: 20210611