CN110263166A - Public sentiment file classification method based on deep learning - Google Patents

Publication number
CN110263166A
Authority
CN
China
Prior art keywords
sample
data
positive
negative
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910525459.6A
Other languages
Chinese (zh)
Inventor
肖翔
黄泓
周家木
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sea - Induced Star Map Technology Co Ltd
Original Assignee
Beijing Sea - Induced Star Map Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a public sentiment text classification method based on deep learning, comprising the following steps: 1, crawl enterprise public sentiment text from Baidu on the internet, obtaining a small number of positive samples and a large number of unlabeled samples; 2, construct an initial training data set using PU-learning; 3, train three deep models (fasttext, CNN, RNN) on the data set from step 2 using multi-model co-training; 4, classify the test data set with the CNN trained on the data set expanded in step 3. By crawling data purposefully to build the positive sample data, this patent makes the quality of the positive samples higher; more reliable negative samples that lie farther from the positive samples can be obtained from the unlabeled samples; and the problems and events that business personnel care about are identified in public sentiment data with higher accuracy and pushed as timely early warnings, which substantially improves the working efficiency of business personnel.

Description

Public sentiment text classification method based on deep learning
Technical field
The present invention relates to a public sentiment text classification method, and in particular to a public sentiment text classification method based on deep learning that yields higher-quality positive samples, obtains more reliable negative samples lying farther from the positive samples, and achieves high accuracy and high working efficiency.
Background art
At present, the classification of public sentiment text data from company news is still at the stage of manual processing combined with simple rules. It is inefficient, and the classification quality cannot be guaranteed.
Summary of the invention
To solve the above problems, the present invention provides a public sentiment text classification method based on deep learning that yields higher-quality positive samples, obtains more reliable negative samples lying farther from the positive samples, and achieves high accuracy and high working efficiency.
The public sentiment text classification method based on deep learning includes the following steps:
1. Crawl enterprise public sentiment text from Baidu on the internet, obtaining a small number of positive samples and a large number of unlabeled samples.
2. Construct an initial training data set using PU-learning.
3. Train three deep models (fasttext, CNN, RNN) on the data set from step 2 using multi-model co-training, and classify the unlabeled sample data with each of the three models. If all three classifiers judge a sample to be positive and its sentiment is negative, the sample is determined to be positive and added to the positive sample set; if all three classifiers judge it to be negative and its sentiment is positive, it is determined to be negative and added to the negative sample set; other cases are left unprocessed for the time being.
4. Classify the test data set with the CNN trained on the data expanded in step 3. If the classification accuracy is below the threshold, iterate the operation in step 3; otherwise the process ends.
The specific method is as follows:
(1) Data crawling and preprocessing
In news public sentiment event classification, the data are not as ideal as one might expect. Because the cost of labeling data is too high, among other reasons, it is difficult to accumulate a rich set of positive and negative samples. How to obtain a large number of accurately labeled positive and negative samples therefore has a large influence on classification quality.
In this patent, we use keyword combinations (for example LeEco + capital chain, that is, combinations of the names of enterprises that have had funding problems with words describing funding problems) to crawl news about enterprises that have had funding problems, and label these as funding-problem positive samples. At the same time, we use the names of "good" enterprises that have not had funding problems, such as Tencent and Alibaba, as keywords to crawl related news, which serves as the unlabeled samples (some of this news may still involve funding problems, so it cannot be called negative samples; it should be called unknown samples, also referred to as unlabeled samples). In this way we obtain a small number of positive samples (web crawling plus partial manual labeling confirmation) and a large number of unlabeled samples.
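The keyword-combination step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the crawler of this patent: the function name `build_queries`, the `+` separator, and the term "debt default" are hypothetical, and the actual fetching of search results is left out.

```python
from itertools import product


def build_queries(company_names, problem_terms, sep="+"):
    """Pair every company name with every funding-problem term."""
    return [f"{name}{sep}{term}" for name, term in product(company_names, problem_terms)]


# Hypothetical seed lists; only "LeEco" and "capital chain" come from the text.
troubled_companies = ["LeEco"]
problem_terms = ["capital chain", "debt default"]

# Results of these queries are candidate positive samples, still subject to
# partial manual confirmation.
positive_queries = build_queries(troubled_companies, problem_terms)

# Names of "good" enterprises are used on their own; their news stays unlabeled.
unlabeled_queries = ["Tencent", "Alibaba"]

print(positive_queries)
```

Keeping the two query lists separate mirrors the text: one list feeds the positive set, the other feeds the unlabeled set, and nothing is ever labeled negative at this stage.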
(2) Training set construction
Positive and unlabeled learning (PU-learning) is used to iteratively find, in the large unlabeled sample set from (1), the samples whose cosine distance from the positive samples is as large as possible. These are treated as relatively reliable negative samples and, together with the positive samples, form the training set.
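As a rough illustration of "as far as possible in cosine distance", the sketch below ranks unlabeled vectors by their cosine distance from the centroid of the positive vectors. Comparing against the centroid, the cutoff `k`, and the toy 2-D vectors are all assumptions made for the example; the text does not say how the distance to the positive set is aggregated.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def farthest_from_positives(positives, unlabeled, k):
    """Indices of the k unlabeled vectors farthest from the positive centroid."""
    center = centroid(positives)
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: cosine_distance(unlabeled[i], center),
        reverse=True,
    )
    return sorted(ranked[:k])


# Toy 2-D "document vectors" (hypothetical values).
positives = [[1.0, 0.1], [0.9, 0.2]]
unlabeled = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9], [0.8, 0.3]]
candidates = farthest_from_positives(positives, unlabeled, k=2)
```

Here the two vectors pointing away from the positive cluster (indices 1 and 2) are the negative candidates; the two that point in nearly the same direction as the positives stay unlabeled.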
PU-learning applies when we can clearly identify positive samples but cannot identify negative samples, because a sample that looks negative may in fact be positive and simply has not been confirmed yet. We call these uncertain samples the unlabeled samples U and build the model from them together with the positive samples P.
The PU-learning procedure is divided into two stages:
First stage: select a set of reliable negative examples RN from the unlabeled examples, as follows:
a. Randomly select a subset S of the positive examples in P and add it to U. The two data sets are now P-S and U+S, denoted ps and us respectively. Train a binary classification model on ps and us, where the data in us carry label 0 and the data in ps carry label 1.
b. Apply this classifier to the unlabeled sample set U and compute the probability that each sample belongs to the negative class. Set a threshold a; if a sample's probability is greater than a, the sample is regarded as a relatively reliable negative sample.
Second stage: using the positive examples P and the reliable negative examples RN, train a conventional machine learning classification model for predicting new samples.
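The two stages can be sketched as below. The binary classifier is replaced by a toy nearest-centroid scorer purely to keep the example self-contained; that stand-in, the helper names, the 1-D toy features, and the values `spy_count=1` and `a=0.5` are assumptions, not part of the patent.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def train_centroid_model(label1_points, label0_points):
    """Toy stand-in for the binary classifier: returns P(label 0 | x)."""
    c1 = centroid(label1_points)
    c0 = centroid(label0_points)

    def prob_label0(x):
        # The closer x is to the label-0 centroid, the nearer the score is to 1.
        margin = math.dist(x, c1) - math.dist(x, c0)
        return 1.0 / (1.0 + math.exp(-margin))

    return prob_label0


def select_reliable_negatives(P, U, spy_count, a, rng):
    """Stage one: indices of U whose negative-class probability exceeds a."""
    spy_idx = set(rng.sample(range(len(P)), spy_count))
    S = [p for i, p in enumerate(P) if i in spy_idx]
    ps = [p for i, p in enumerate(P) if i not in spy_idx]  # P-S, label 1
    us = U + S                                             # U+S, label 0
    prob_negative = train_centroid_model(ps, us)
    return [i for i, x in enumerate(U) if prob_negative(x) > a]


# Toy 1-D features (hypothetical): positives cluster near 5, most of U near 0,
# and U hides one unconfirmed positive at 5.0.
P = [[5.0], [5.2], [4.8], [5.1]]
U = [[0.0], [0.2], [-0.1], [5.0], [0.1]]

rn_idx = select_reliable_negatives(P, U, spy_count=1, a=0.5, rng=random.Random(0))
RN = [U[i] for i in rn_idx]

# Stage two: train the final model on P (label 1) and RN (label 0).
final_prob_negative = train_centroid_model(P, RN)
```

In this toy run the hidden positive at index 3 is not selected as a reliable negative, which is the point of stage one: only samples the classifier confidently places in the negative class enter RN.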
(3) Multi-model co-training
This is divided into three steps:
a. Classify the unlabeled data with each of the three classifier models fasttext, CNN, and RNN. If all three models judge a sample to be in the positive class (a funding problem exists), it is added directly to the training set as a positive sample. If all three judge it to be in the negative class (no funding problem exists), it is taken as a negative sample. If two classifiers judge it positive and one judges it negative, the sample is retained for manual labeling. If two classifiers judge it negative and one judges it positive, it is left alone and continues to be treated as unlabeled data.
b. After the operation in a, update the training set, then continue training the three models and compute the classification accuracy on the test set.
c. Iterate the operations in a and b until the accuracy on the test set reaches the threshold, then stop iterating and save the models.
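The voting rule in step a can be written down compactly. In the sketch below the three classifiers are hypothetical keyword checks standing in for the trained fasttext, CNN, and RNN models, and the action names are invented for illustration.

```python
def route_by_votes(votes):
    """Map three binary votes (1 = positive class, 0 = negative class) to an action."""
    if len(votes) != 3:
        raise ValueError("expected exactly three classifier votes")
    positives = sum(votes)
    if positives == 3:
        return "add_positive"      # unanimous positive: join the training set
    if positives == 0:
        return "add_negative"      # unanimous negative: join the training set
    if positives == 2:
        return "manual_review"     # two positive, one negative: label by hand
    return "keep_unlabeled"        # two negative, one positive: leave in U


def co_training_round(unlabeled, classifiers):
    """Apply the voting rule once; each classifier is a callable returning 0 or 1."""
    buckets = {
        "add_positive": [],
        "add_negative": [],
        "manual_review": [],
        "keep_unlabeled": [],
    }
    for sample in unlabeled:
        votes = [clf(sample) for clf in classifiers]
        buckets[route_by_votes(votes)].append(sample)
    return buckets


# Hypothetical stand-ins for the three trained models.
stand_ins = [
    lambda text: int("default" in text),
    lambda text: int("debt" in text),
    lambda text: int("frozen" in text),
]
news = ["debt default, assets frozen", "record profit", "debt default", "assets frozen"]
result = co_training_round(news, stand_ins)
```

The rule is deliberately asymmetric: a 2-to-1 split in favor of the positive class is escalated to a person, while a 2-to-1 split in favor of the negative class is simply deferred to a later round.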
(4) Model classification
Using the updated training data obtained in (3), the deep convolutional neural network (CNN) trained in (3) classifies the test data set. If the classification accuracy is below the threshold (0.8), the operation in (3) is executed again; otherwise the process ends.
By crawling data purposefully to build the positive sample data, this patent makes the quality of the positive samples higher. Combined with PU-learning, it can obtain from the unlabeled samples more reliable negative samples that lie farther from the positive samples. Combining PU-learning with multi-model co-training makes it possible to obtain good results in the situation common in industry, where there are few positive samples and a large amount of unlabeled data. The problems and events that business personnel care about are identified in public sentiment data with higher accuracy and pushed as timely early warnings, which substantially improves their working efficiency, and analysis of the recognition results helps business personnel take risk management measures.
Brief description of the drawings
Fig. 1 is a schematic diagram of the workflow of this patent.
Fig. 2 is the model architecture diagram of the character-level convolutional neural network (char-CNN) of this patent.
Detailed description of the embodiments
As shown in Fig. 1 and Fig. 2, the public sentiment text classification method based on deep learning includes the following steps:
1. Crawl enterprise public sentiment text from Baidu on the internet, obtaining a small number of positive samples and a large number of unlabeled samples.
2. Construct an initial training data set using PU-learning.
3. Train three deep models (fasttext, CNN, RNN) on the data set from step 2 using multi-model co-training, and classify the unlabeled sample data with each of the three models. If all three classifiers judge a sample to be positive and its sentiment is negative, the sample is determined to be positive and added to the positive sample set; if all three classifiers judge it to be negative and its sentiment is positive, it is determined to be negative and added to the negative sample set; other cases are left unprocessed for the time being.
4. Classify the test data set with the CNN trained on the data set expanded in step 3. If the classification accuracy is below the threshold, iterate the operation in step 3; otherwise the process ends.
The embodiments described above merely illustrate preferred embodiments of the present invention and do not limit its scope. Without departing from the spirit of the design of the present invention, all variations and improvements made to the technical solution of the present invention by persons of ordinary skill in the art shall fall within the scope of protection determined by the claims of the present invention.

Claims (1)

1. A public sentiment text classification method based on deep learning, comprising the following steps:
1) crawling enterprise public sentiment text from Baidu on the internet, obtaining a small number of positive samples and a large number of unlabeled samples;
2) constructing an initial training data set using PU-learning;
3) training three deep models, fasttext, CNN, and RNN, on the data set from 2) using multi-model co-training, and classifying the unlabeled sample data with each of the three models; if all three classifiers judge a sample to be positive and its sentiment is negative, the sample is determined to be positive and added to the positive sample set; if all three classifiers judge it to be negative and its sentiment is positive, it is determined to be negative and added to the negative sample set; other cases are left unprocessed for the time being;
4) classifying the test data set with the CNN trained on the data set expanded in 3); if the classification accuracy is below a threshold, iterating the operation in 3), otherwise ending the process;
the specific method being as follows:
(1) Data crawling and preprocessing
In news public sentiment event classification, the data are not as ideal as one might expect; because the cost of labeling data is too high, among other reasons, it is difficult to accumulate a rich set of positive and negative samples, so how to obtain a large number of accurately labeled positive and negative samples has a large influence on classification quality;
in this patent, keyword combinations are used to crawl news about enterprises that have had funding problems, which is labeled as funding-problem positive sample data; at the same time, the names of "good" enterprises that have not had funding problems are used as keywords to crawl related news, which serves as the unlabeled samples; in this way a small number of positive samples and a large number of unlabeled samples are obtained;
(2) Training set construction
Positive and unlabeled learning is used to iteratively find, in the large unlabeled sample set from (1), the samples whose cosine distance from the positive samples is as large as possible; these are treated as relatively reliable negative samples and, together with the positive samples, form the training set;
PU-learning applies when positive samples can be clearly identified but negative samples cannot, because a sample that looks negative may in fact be positive and simply has not been confirmed yet; these uncertain samples are called the unlabeled samples U, and the model is built from them together with the positive samples P;
the PU-learning procedure is divided into two stages:
first stage: selecting a set of reliable negative examples RN from the unlabeled examples, as follows:
a. randomly selecting a subset S of the positive examples in P and adding it to U, so that the two data sets are P-S and U+S, denoted ps and us respectively, and training a binary classification model on ps and us, where the data in us carry label 0 and the data in ps carry label 1;
b. applying this classifier to the unlabeled sample set U and computing the probability that each sample belongs to the negative class; setting a threshold a, and regarding a sample as a relatively reliable negative sample if its probability is greater than a;
second stage: using the positive examples P and the reliable negative examples RN, training a conventional machine learning classification model for predicting new samples;
(3) Multi-model co-training
This is divided into three steps:
a. classifying the unlabeled data with each of the three classifier models fasttext, CNN, and RNN; if all three models judge a sample to be in the positive class, it is added directly to the training set as a positive sample; if all three judge it to be in the negative class, it is taken as a negative sample; if two classifiers judge it positive and one judges it negative, the sample is retained for manual labeling; if two classifiers judge it negative and one judges it positive, it is left alone and continues to be treated as unlabeled data;
b. after the operation in a, updating the training set, then continuing to train the three models and computing the classification accuracy on the test set;
c. iterating the operations in a and b until the accuracy on the test set reaches the threshold, then stopping the iteration and saving the models;
(4) Model classification
Using the updated training data obtained in (3), the deep convolutional neural network (CNN) trained in (3) classifies the test data set; if the classification accuracy is below the threshold (0.8), the operation in (3) is executed again; otherwise the process ends.
CN201910525459.6A 2019-06-18 2019-06-18 Public sentiment file classification method based on deep learning Pending CN110263166A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910525459.6A CN110263166A (en) 2019-06-18 2019-06-18 Public sentiment file classification method based on deep learning

Publications (1)

Publication Number Publication Date
CN110263166A true CN110263166A (en) 2019-09-20

Family

ID=67919008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910525459.6A Pending CN110263166A (en) 2019-06-18 2019-06-18 Public sentiment file classification method based on deep learning

Country Status (1)

Country Link
CN (1) CN110263166A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704661A (en) * 2019-10-12 2020-01-17 腾讯科技(深圳)有限公司 Image classification method and device
CN110826320A (en) * 2019-11-28 2020-02-21 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN111078879A (en) * 2019-12-09 2020-04-28 北京邮电大学 Method and device for detecting text sensitive information of satellite internet based on deep learning
CN111177507A (en) * 2019-12-31 2020-05-19 支付宝(杭州)信息技术有限公司 Method and device for processing multi-label service
CN111310014A (en) * 2020-02-21 2020-06-19 深圳中兴网信科技有限公司 Scenic spot public opinion monitoring system, method, device and storage medium based on deep learning
CN111666414A (en) * 2020-06-12 2020-09-15 上海观安信息技术股份有限公司 Method for detecting cloud service by sensitive data and cloud service platform
CN111931912A (en) * 2020-08-07 2020-11-13 北京推想科技有限公司 Network model training method and device, electronic equipment and storage medium
CN111966944A (en) * 2020-08-17 2020-11-20 中电科大数据研究院有限公司 Model construction method for multi-level user comment security audit
CN112115264A (en) * 2020-09-14 2020-12-22 中国科学院计算技术研究所苏州智能计算产业技术研究院 Text classification model adjusting method facing data distribution change
CN112597141A (en) * 2020-12-24 2021-04-02 国网山东省电力公司 Network flow detection method based on public opinion analysis
CN112819023A (en) * 2020-06-11 2021-05-18 腾讯科技(深圳)有限公司 Sample set acquisition method and device, computer equipment and storage medium
CN113139381A (en) * 2021-04-29 2021-07-20 平安国际智慧城市科技股份有限公司 Unbalanced sample classification method and device, electronic equipment and storage medium
CN113269229A (en) * 2021-04-22 2021-08-17 中国科学院信息工程研究所 Training method for enhancing generalization ability of deep learning classification model
CN113361585A (en) * 2021-06-02 2021-09-07 浪潮软件科技有限公司 Method for optimizing and screening clues based on supervised learning algorithm
CN113609298A (en) * 2021-08-23 2021-11-05 南京擎盾信息科技有限公司 Data processing method and device for court public opinion corpus extraction
CN113641888A (en) * 2021-03-31 2021-11-12 昆明理工大学 Event-related news filtering learning method based on fusion topic information enhanced PU learning
CN113849645A (en) * 2021-09-28 2021-12-28 平安科技(深圳)有限公司 Mail classification model training method, device, equipment and storage medium
CN114223012A (en) * 2019-10-31 2022-03-22 深圳市欢太科技有限公司 Push object determination method and device, terminal equipment and storage medium
CN114254588A (en) * 2021-12-16 2022-03-29 马上消费金融股份有限公司 Data tag processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980202A (en) * 2010-11-04 2011-02-23 西安电子科技大学 Semi-supervised classification method of unbalance data
CN105468713A (en) * 2015-11-19 2016-04-06 西安交通大学 Multi-model fused short text classification method
CN107239529A (en) * 2017-05-27 2017-10-10 中国矿业大学 A kind of public sentiment hot category classification method based on deep learning
CN109299162A (en) * 2018-11-08 2019-02-01 南京航空航天大学 A kind of Active Learning Method classified for positive class and data untagged

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
何远生: "基于深度学习多模型融合的中文短文本情感分类算法研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
张璞,刘畅,李逍: "基于PU学习的建议语句分类方法", 《计算机应用》 *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704661A (en) * 2019-10-12 2020-01-17 腾讯科技(深圳)有限公司 Image classification method and device
CN110704661B (en) * 2019-10-12 2021-04-13 腾讯科技(深圳)有限公司 Image classification method and device
CN114223012A (en) * 2019-10-31 2022-03-22 深圳市欢太科技有限公司 Push object determination method and device, terminal equipment and storage medium
CN110826320A (en) * 2019-11-28 2020-02-21 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN110826320B (en) * 2019-11-28 2023-10-13 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN111078879A (en) * 2019-12-09 2020-04-28 北京邮电大学 Method and device for detecting text sensitive information of satellite internet based on deep learning
CN111177507A (en) * 2019-12-31 2020-05-19 支付宝(杭州)信息技术有限公司 Method and device for processing multi-label service
CN111177507B (en) * 2019-12-31 2023-06-23 支付宝(杭州)信息技术有限公司 Method and device for processing multi-mark service
CN111310014A (en) * 2020-02-21 2020-06-19 深圳中兴网信科技有限公司 Scenic spot public opinion monitoring system, method, device and storage medium based on deep learning
CN112819023A (en) * 2020-06-11 2021-05-18 腾讯科技(深圳)有限公司 Sample set acquisition method and device, computer equipment and storage medium
CN112819023B (en) * 2020-06-11 2024-02-02 腾讯科技(深圳)有限公司 Sample set acquisition method, device, computer equipment and storage medium
CN111666414B (en) * 2020-06-12 2023-10-17 上海观安信息技术股份有限公司 Method for detecting cloud service by sensitive data and cloud service platform
CN111666414A (en) * 2020-06-12 2020-09-15 上海观安信息技术股份有限公司 Method for detecting cloud service by sensitive data and cloud service platform
CN111931912A (en) * 2020-08-07 2020-11-13 北京推想科技有限公司 Network model training method and device, electronic equipment and storage medium
CN111966944A (en) * 2020-08-17 2020-11-20 中电科大数据研究院有限公司 Model construction method for multi-level user comment security audit
CN111966944B (en) * 2020-08-17 2024-04-09 中电科大数据研究院有限公司 Model construction method for multi-level user comment security audit
CN112115264B (en) * 2020-09-14 2024-03-22 中科苏州智能计算技术研究院 Text classification model adjustment method for data distribution change
CN112115264A (en) * 2020-09-14 2020-12-22 中国科学院计算技术研究所苏州智能计算产业技术研究院 Text classification model adjusting method facing data distribution change
CN112597141A (en) * 2020-12-24 2021-04-02 国网山东省电力公司 Network flow detection method based on public opinion analysis
CN112597141B (en) * 2020-12-24 2022-07-15 国网山东省电力公司 Network flow detection method based on public opinion analysis
CN113641888A (en) * 2021-03-31 2021-11-12 昆明理工大学 Event-related news filtering learning method based on fusion topic information enhanced PU learning
CN113641888B (en) * 2021-03-31 2023-08-29 昆明理工大学 Event-related news filtering learning method based on fusion topic information enhanced PU learning
CN113269229A (en) * 2021-04-22 2021-08-17 中国科学院信息工程研究所 Training method for enhancing generalization ability of deep learning classification model
CN113139381A (en) * 2021-04-29 2021-07-20 平安国际智慧城市科技股份有限公司 Unbalanced sample classification method and device, electronic equipment and storage medium
CN113139381B (en) * 2021-04-29 2023-11-28 平安国际智慧城市科技股份有限公司 Unbalanced sample classification method, unbalanced sample classification device, electronic equipment and storage medium
CN113361585A (en) * 2021-06-02 2021-09-07 浪潮软件科技有限公司 Method for optimizing and screening clues based on supervised learning algorithm
CN113609298A (en) * 2021-08-23 2021-11-05 南京擎盾信息科技有限公司 Data processing method and device for court public opinion corpus extraction
CN113849645A (en) * 2021-09-28 2021-12-28 平安科技(深圳)有限公司 Mail classification model training method, device, equipment and storage medium
CN113849645B (en) * 2021-09-28 2024-06-04 平安科技(深圳)有限公司 Mail classification model training method, device, equipment and storage medium
CN114254588B (en) * 2021-12-16 2023-10-13 马上消费金融股份有限公司 Data tag processing method and device
CN114254588A (en) * 2021-12-16 2022-03-29 马上消费金融股份有限公司 Data tag processing method and device

Similar Documents

Publication Publication Date Title
CN110263166A (en) Public sentiment file classification method based on deep learning
WO2016033907A1 (en) Statistical machine learning-based internet hidden link detection method
CN107092596A (en) Text emotion analysis method based on attention CNNs and CCR
CN110134849A (en) A kind of network public-opinion monitoring method and system
CN109635108B (en) Man-machine interaction based remote supervision entity relationship extraction method
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN106709754A (en) Power user grouping method based on text mining
CN108563638B (en) Microblog emotion analysis method based on topic identification and integrated learning
CN112214610A (en) Entity relation joint extraction method based on span and knowledge enhancement
CN106528528A (en) A text emotion analysis method and device
CN107239439A (en) Public sentiment sentiment classification method based on word2vec
CN109918505B (en) Network security event visualization method based on text processing
CN103064971A (en) Scoring and Chinese sentiment analysis based review spam detection method
CN103984943A (en) Scene text identification method based on Bayesian probability frame
CN106021361A (en) Sequence alignment-based self-adaptive application layer network protocol message clustering method
CN106294324A (en) A kind of machine learning sentiment analysis device based on natural language parsing tree
CN110909542B (en) Intelligent semantic serial-parallel analysis method and system
CN110851593B (en) Complex value word vector construction method based on position and semantics
CN113742733B (en) Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type
CN107885849A (en) A kind of mood index analysis system based on text classification
CN111984790B (en) Entity relation extraction method
CN107368526A (en) A kind of data processing method and device
CN112434163A (en) Risk identification method, model construction method, risk identification device, electronic equipment and medium
CN115545437A (en) Financial enterprise operation risk early warning method based on multi-source heterogeneous data fusion
CN114265931A (en) Big data text mining-based consumer policy perception analysis method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190920