CN110909545A - Black guide detection method based on gradient lifting algorithm - Google Patents

Info

Publication number
CN110909545A
Authority
CN
China
Prior art keywords
word
black
training
news
detection method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911173486.8A
Other languages
Chinese (zh)
Inventor
詹瑾瑜
余佳雨
江维
李响
杨瑞
刘昌澍
李博智
蔡玉舒
周巧瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Division Big Data Research Institute Co Ltd
University of Electronic Science and Technology of China
Original Assignee
Division Big Data Research Institute Co Ltd
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Division Big Data Research Institute Co Ltd, University of Electronic Science and Technology of China filed Critical Division Big Data Research Institute Co Ltd
Priority to CN201911173486.8A priority Critical patent/CN110909545A/en
Publication of CN110909545A publication Critical patent/CN110909545A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/14 Travel agencies

Abstract

The invention discloses a black tour guide detection method based on a gradient boosting algorithm, applied to the field of data detection and aimed at the problem of lagging supervision in the existing tourism industry. The method first acquires website news URL data and obtains a word vector model through word embedding training; then, based on the obtained word vector model, it trains a black tour guide category prediction model with a gradient boosting algorithm; finally, a complaint text is input into the obtained black tour guide category prediction model to obtain a predicted category. Compared with existing manual data detection, the detection efficiency is markedly improved.

Description

Black tour guide detection method based on gradient boosting algorithm
Technical Field
The invention belongs to the field of big data processing, and particularly relates to a data detection technology based on a gradient boosting algorithm.
Background
Recently, news reports about customers being ripped off, rogue shops and black tour guides in the domestic tourism market have been frequent. They expose problems such as malicious fraud in the domestic tourism market and reflect the fact that supervision of the existing tourism market lags behind. As machine learning matures, using it to address this supervision lag, through the collection, cleaning and analysis of massive data, has become an inevitable trend and a hot topic in research on intelligent supervision of the tourism market.
In order to solve the above problems, black tour guide detection technology, which specifically refers to detection based on the content of complaint texts, has been developed, but the prior art lacks an effective detection method.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a black tour guide detection method based on a gradient boosting algorithm, which uses the gradient boosting algorithm to classify a large number of texts into several predefined categories, thereby effectively promoting the tourism market.
The technical scheme adopted by the invention is as follows: a black tour guide detection method based on a gradient boosting algorithm, comprising the following steps:
A. acquiring website news URL data, and obtaining a word vector model based on word embedding training;
B. based on the word vector model of step A, training with a gradient boosting algorithm to obtain a black tour guide category prediction model;
C. inputting a complaint text into the black tour guide category prediction model obtained in step B to obtain a predicted category.
Further, step A comprises the following substeps:
A1, initiating a request to a travel news website to obtain news URL data;
A2, crawling news content from the news URL data;
A3, performing word segmentation on the news content obtained in step A2 to obtain a word segmentation corpus;
A4, training on the word segmentation corpus to obtain a word vector model.
Further, step A1 specifically includes: simulating an HTTP request with Postman, setting the request parameters so that all results are returned, setting the content type to application/x-www-form-urlencoded, parsing the returned results, and storing each day's news URL data line by line.
Further, step A2 specifically includes: reading a news URL and initiating an HTTP request, parsing the returned HTML content, obtaining the content of the title tag and the content of the body-text tag respectively, storing the title content directly as one line, splitting the body text into sentences at full stops, and then writing them to a file line by line.
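The parsing and sentence-splitting part of step A2 can be sketched roughly as follows. This is only an illustration using Python's standard `html.parser`; the choice of `<p>` as the body-text tag, the sample HTML string and the helper names `NewsParser` and `html_to_lines` are assumptions and do not come from the patent. The HTTP request itself is left out.

```python
from html.parser import HTMLParser


class NewsParser(HTMLParser):
    """Collect text inside <title> and <p> tags (the tag names are assumptions)."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag currently being read, if it is of interest
        self.title = ""
        self.body = ""

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data.strip()
        elif self._current == "p":
            self.body += data.strip()


def html_to_lines(html):
    """Return the title as the first line, then the body split at full stops."""
    parser = NewsParser()
    parser.feed(html)
    sentences = [s for s in parser.body.split("。") if s]
    return [parser.title] + sentences


# Hypothetical page content standing in for a crawled news article.
sample = ("<html><head><title>旅游新闻</title></head>"
          "<body><p>导游强制购物。游客投诉。</p></body></html>")
lines = html_to_lines(sample)
print("\n".join(lines))
```

Each element of `lines` would then be written to the corpus file as one line, with the title first.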
Further, step B comprises the following substeps:
B1, obtaining complaint texts and dividing them into two parts, one used as a training set and the other as a test set;
B2, reading the local word vector model file and parsing it line by line, using each word as a key and the corresponding word vector as the value, and storing them in a dictionary variable to obtain a word embedding dictionary;
B3, converting each sentence in the training set and the test set into a training sentence vector and a test sentence vector, respectively, using the word embedding dictionary;
B4, training with the gradient boosting algorithm on the training sentence vectors to generate a black tour guide category prediction model;
B5, verifying the effect of the trained model with the test data, wherein the evaluation indexes comprise precision, recall and F1 value.
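Substeps B2 and B3 can be illustrated with a short sketch. It assumes that each line of the model file holds a word followed by space-separated vector components, that sentences have already been segmented into words, and that words missing from the dictionary are skipped; none of these details is fixed by the text, and the names `load_embeddings` and `sentence_vector` are hypothetical.

```python
def load_embeddings(lines):
    """Parse 'word v1 v2 ...' lines into a {word: vector} dictionary."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        table[parts[0]] = [float(x) for x in parts[1:]]
    return table


def sentence_vector(words, table):
    """Average the vectors of the words that are present in the dictionary."""
    found = [table[w] for w in words if w in table]
    if not found:
        return None  # no known word; the caller decides how to handle this
    dim = len(found[0])
    return [sum(vec[i] for vec in found) / len(found) for i in range(dim)]


# Made-up two-dimensional vectors, for illustration only.
model_lines = ["导游 1.0 0.0", "购物 0.0 1.0", "强制 1.0 1.0"]
table = load_embeddings(model_lines)
vec = sentence_vector(["导游", "强制", "购物"], table)
print(vec)
```

Averaging rather than summing keeps sentence vectors on a comparable scale regardless of sentence length.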
Further, step C comprises the following substeps:
C1, reading the local word vector model file and parsing it line by line, using each word as a key and the corresponding word vector as the value, and storing them in a dictionary variable to obtain a word embedding dictionary;
C2, converting the input text into a sentence vector using the word embedding dictionary;
C3, inputting the sentence vector obtained in step C2 into the black tour guide category prediction model trained in step B, and outputting the predicted category.
The invention has the following beneficial effects: the method is based on a black tour guide prediction model trained with the gradient boosting algorithm, can effectively identify black tour guide categories, and markedly improves the efficiency of classifying and detecting tourism data compared with the existing manual processing of complaint texts.
Drawings
Fig. 1 is a schematic flow chart of a black tour guide detection method based on a gradient boosting algorithm according to the present invention.
FIG. 2 is a flow diagram of the crawler module of the present invention.
FIG. 3 is a flow diagram of the word vector model module of the present invention.
FIG. 4 is a flow diagram of the word segmentation module of the present invention.
Fig. 5 is a flow diagram of the sentence vector conversion module of the present invention.
FIG. 6 is a flow diagram of a predictive model training module of the present invention.
FIG. 7 is a flow chart of the gradient boost algorithm module of the present invention.
FIG. 8 is a flow diagram of the prediction module of the present invention.
FIG. 9 is a flow diagram of the space vector model construction module of the present invention.
Detailed Description
In order to facilitate the understanding of the technical contents of the present invention by those skilled in the art, the present invention will be further explained with reference to the accompanying drawings.
The technical scheme of the invention is as follows: a black tour guide detection method based on a gradient boosting algorithm mainly comprises three main flow modules, as shown in FIG. 1: a space vector model construction module, a prediction model training module and a black tour guide category prediction module; and two tool modules: a word segmentation module and a sentence vector conversion module. The flow of the space vector model construction module is composed of smaller modules: a crawler module, a word segmentation module and a word vector model module. The prediction model training module contains one core algorithm module, namely the gradient boosting algorithm module.
The invention classifies tour guides with the following five characteristics or behaviors as black tour guides: 1) forced shopping/consumption; 2) modifying/terminating the trip; 3) catering/accommodation violations; 4) the tour guide has no qualification/tour guide certificate; 5) assault. When black tour guide prediction is performed on an input complaint text, the text is matched against these categories, and the predicted categories are output ranked by probability.
As shown in FIG. 1, the black tour guide detection method based on a gradient boosting algorithm mainly includes three modules: a space vector model construction module, a prediction model training module and a black tour guide category prediction module. In addition, the invention includes two reusable tool modules: a word segmentation module (FIG. 4) and a sentence vector conversion module (FIG. 5).
As shown in FIG. 9, step A of the present invention: training a space vector model by word embedding.
Word embedding may also be called neural-network-based distributed representation: a neural network word vector representation models the context, and the relationship between the context and the target word, with a neural network. Since neural networks are flexible, complex contexts can be represented. If n-grams containing word order information are used as context, the total number of n-grams grows exponentially as n increases, and the curse of dimensionality is encountered. When a neural network represents an n-gram, however, the n words can be combined in certain ways, and the number of parameters grows only linearly. With this advantage, a neural network model can model more complex contexts and include richer semantic information in the word vectors.
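To make the idea concrete, the following is a minimal sketch of the forward pass of a neural word-embedding model, assuming a skip-gram style network with a one-hot input, a linear hidden layer and a softmax output. The vocabulary, the dimension and the random weights are made up for illustration, and training by backpropagation is omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(word_index, w_in, w_out):
    """One-hot input -> linear hidden layer -> softmax over the vocabulary."""
    # Multiplying a one-hot vector by w_in simply selects one row of w_in.
    # That row is the word vector; no nonlinear activation is applied to it.
    hidden = w_in[word_index]
    scores = [sum(h * w for h, w in zip(hidden, out_row)) for out_row in w_out]
    return hidden, softmax(scores)


vocab = ["导游", "强制", "购物", "投诉"]   # toy vocabulary (assumption)
dim = 3                                   # toy embedding size (assumption)
rng = random.Random(0)
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]

vector, probs = forward(vocab.index("强制"), w_in, w_out)
print(vector)
print(probs)
```

The parameter count here is two matrices of size vocabulary × dimension, so it grows linearly with the vocabulary instead of exponentially with the context length.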
In step A, news text is crawled and a space vector model is then constructed through the word vector model, as follows:
Step A1: as shown in FIG. 2, a request is sent to the travel news website to obtain a series of news URL data: an HTTP request is simulated with Postman, the request parameters are set so that all results are returned, the content type is application/x-www-form-urlencoded, the returned results are parsed, and each day's news URLs are stored line by line;
Step A2: news content is crawled through the news URLs and stored one sentence per line: a news URL is read and an HTTP request is initiated, the returned HTML content is parsed, the content of the title tag and the content of the body-text tag are obtained respectively, the title content is stored directly as one line, and the body text is split at full stops and written to the file line by line;
Step A3: the invention is used in a Chinese-language environment. Unlike English, where words are naturally separated by spaces, Chinese text has no word delimiters, so the Chinese travel news content must be segmented into words. As shown in FIG. 4, efficient word graph scanning is implemented based on a prefix dictionary, generating a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence: a prefix tree is built from the twenty thousand words trained in the jieba open source library, and the possible segmentations of the sentence are generated against this prefix tree. Dynamic programming is then used to search for the maximum probability path and find the most probable segmentation based on word frequency. For words that cannot be found in the prefix tree, an HMM model based on the word-forming capability of Chinese characters is used: each character is labelled B (begin), E (end), M (middle) or S (single, a character that forms a word on its own), the Viterbi algorithm yields the BEMS sequence with the maximum probability, and the sentence is regrouped into words that start at B and end at E, giving the final segmentation result;
Step A4: a word vector model is trained on the word segmentation corpus obtained in the previous step. As shown in FIG. 3, a fully connected neural network with only one hidden layer is constructed: first, a sentence is fed to the input layer, where each word is converted into a one-hot vector; next, the hidden layer applies a simple linear mapping, with no nonlinear activation function; finally, the third layer is a classifier that uses Softmax regression and outputs the probability corresponding to each word. The trained word vector model is then saved as a local file, with one word followed by its word vector on each line.
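The dictionary-based part of step A3 (building the DAG and searching for the maximum probability path with dynamic programming) can be sketched as below. The tiny frequency dictionary is invented for the example and a plain dict stands in for the prefix tree; the HMM/Viterbi handling of unknown words is not shown, so an unknown character simply becomes a one-character segment. This is a simplified illustration, not the jieba implementation.

```python
import math

# Toy word-frequency dictionary; the entries and counts are made up.
FREQ = {"导": 2, "游": 2, "导游": 10, "强": 1, "制": 1, "强制": 8,
        "购": 1, "物": 1, "购物": 9}
TOTAL = sum(FREQ.values())


def build_dag(sentence):
    """Map each start index to the end indices of dictionary words."""
    dag = {}
    for start in range(len(sentence)):
        ends = [end for end in range(start, len(sentence))
                if sentence[start:end + 1] in FREQ]
        # An unknown character still forms a one-character segment.
        dag[start] = ends or [start]
    return dag


def segment(sentence):
    """Pick the maximum log-probability path through the DAG."""
    dag = build_dag(sentence)
    n = len(sentence)
    best = {n: (0.0, n)}  # position -> (best log-prob to the end, word end index)
    for start in range(n - 1, -1, -1):  # fill the table from right to left
        best[start] = max(
            (math.log(FREQ.get(sentence[start:end + 1], 1) / TOTAL)
             + best[end + 1][0], end)
            for end in dag[start]
        )
    words, pos = [], 0
    while pos < n:  # follow the stored end indices from left to right
        end = best[pos][1]
        words.append(sentence[pos:end + 1])
        pos = end + 1
    return words


print(segment("导游强制购物"))  # ['导游', '强制', '购物']
```

Working from the end of the sentence backwards means each position's best score is computed once, so the search is linear in the number of DAG edges.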
As shown in FIG. 6, step B of the present invention: training a black tour guide category prediction model with the gradient boosting algorithm. Specifically, the gradient boosting algorithm is applied to black tour guide complaint texts (obtained with the crawler module by crawling the comments of tourism websites) to obtain a trained model, as follows:
Step B1: the complaint texts are split 70/30: 70% are used as the training set and 30% as the test set, a proportion that allows the test set to evaluate the performance of the model properly;
Step B2: the word vector model is loaded: the local word vector model file is read and parsed line by line, with each word as a key and its word vector as the value, and stored in a dictionary variable;
Step B3: each sentence of the training set and the test set is converted into a training sentence vector and a test sentence vector, respectively, using the word embedding dictionary. As shown in FIG. 5, a sentence is converted into a sentence vector by first segmenting it into words (the detailed steps are the same as in step A3) and then looking up each word as a key in the dictionary to obtain its vector. The word vectors obtained are summed and finally divided by the number of words to eliminate deviation, that is, the sentence vector is the mean of its word vectors;
Step B4: a black tour guide category prediction model is generated by training with the gradient boosting algorithm.
As shown in FIG. 7, the gradient boosting algorithm is as follows. First, a regression tree class is created. The tree stores a root node, the height of the tree and its rules; each node stores a predicted value, a left child, a right child, a feature and a split point. The split points are computed as follows. Split MSE: the MSE after splitting is calculated from the independent variable col, the dependent variable label and the split point split. Best split point: all distinct values in a given feature column are traversed and the point with the minimum MSE is taken as the best split point; None is returned if the column contains no two distinct values. Best feature: all features are traversed, the MSE of each feature's best split point is calculated, and the feature with the minimum MSE is selected together with its split point and the mean values of the left and right child nodes; None is returned if no feature contains two distinct values. Rules: all rules of the regression tree are written out in text form using a queue and breadth-first search, so that the whole tree can be inspected.
The gradient boosting model is then initialized, storing the regression trees, the learning rate, the initial predicted value and the transformation function. In each round, the leaf node of the regression tree to which each training sample belongs is determined; all leaf nodes of the tree are collected and each leaf node is stored in a dictionary together with its training samples; the predicted value of each leaf node is then recalculated and updated, and the residuals to be fitted in the current round are computed. The function for round m is obtained from the predicted values and residuals of round m-1, and is optimized further in this way.
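The split search for a single tree node can be sketched as follows. The argument names `col`, `label` and `split` follow the text; the function names, the convention that samples with a value below the split point go left, and the tuple layout of the results are assumptions made for this example.

```python
def split_mse(col, label, split):
    """MSE after splitting at col < split, plus the left and right means."""
    left = [y for x, y in zip(col, label) if x < split]
    right = [y for x, y in zip(col, label) if x >= split]
    avg_left = sum(left) / len(left)
    avg_right = sum(right) / len(right)
    sse = (sum((y - avg_left) ** 2 for y in left)
           + sum((y - avg_right) ** 2 for y in right))
    return sse / len(label), avg_left, avg_right


def best_split(col, label):
    """Try every distinct value as a split point; None if the column is constant."""
    # Dropping the minimum guarantees that the left side is never empty.
    points = sorted(set(col))[1:]
    if not points:
        return None
    return min((split_mse(col, label, p) + (p,) for p in points),
               key=lambda item: item[0])


def best_feature(rows, label):
    """Return (mse, avg_left, avg_right, split, feature_index), or None."""
    best = None
    for j in range(len(rows[0])):
        found = best_split([row[j] for row in rows], label)
        if found and (best is None or found[0] < best[0]):
            best = found + (j,)
    return best


# Toy data: feature 0 separates the labels, feature 1 is constant.
rows = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
label = [0.0, 0.0, 1.0, 1.0]
print(best_feature(rows, label))  # (0.0, 0.0, 1.0, 3.0, 0)
```

A full regression tree would apply `best_feature` recursively to the left and right subsets until a depth or sample-count limit is reached.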
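The boosting rounds can be illustrated with a deliberately reduced sketch. It assumes a binary (one category versus the rest) problem with log loss, a single numeric feature with at least two distinct values, depth-1 trees, and leaf values set to the mean residual instead of a refined leaf update; these simplifications and all function names are assumptions for illustration and are not the patent's exact procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Depth-1 regression tree on one feature: (split, left_mean, right_mean)."""
    best = None
    for split in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < split]
        right = [r for x, r in zip(xs, residuals) if x >= split]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    return best[1:]


def stump_predict(stump, x):
    split, lm, rm = stump
    return lm if x < split else rm


def fit_gbdt(xs, ys, rounds=20, lr=0.5):
    """Each round fits a stump to the residuals y - sigmoid(F) of the previous round."""
    pos = sum(ys) / len(ys)
    init = math.log(pos / (1.0 - pos))  # initial prediction: log-odds of the labels
    scores = [init] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - sigmoid(f) for y, f in zip(ys, scores)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        scores = [f + lr * stump_predict(stump, x)
                  for f, x in zip(scores, xs)]
    return init, lr, stumps


def predict_proba(model, x):
    """Initial value plus the scaled tree outputs, passed through the sigmoid."""
    init, lr, stumps = model
    return sigmoid(init + lr * sum(stump_predict(s, x) for s in stumps))


# Toy one-dimensional data: small values are class 0, large values class 1.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
model = fit_gbdt(xs, ys)
print(predict_proba(model, 0.15), predict_proba(model, 0.85))
```

With sentence vectors as input, `fit_stump` would be replaced by a multi-feature regression tree, and one such model could be trained per complaint category.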
Step B5: the test data are used to validate the trained model; the evaluation metrics are precision, recall and F1 value (F1-score). Precision is $P = TP/(TP + FP)$, the proportion of correctly retrieved results (TP) among all results actually retrieved (TP + FP). Recall is $R = TP/(TP + FN)$, the proportion of correctly retrieved results (TP) among all results that should have been retrieved (TP + FN). The F1 value is the harmonic mean of precision and recall, $F1 = 2 \times P \times R/(P + R)$; it combines P and R, and a higher F1 indicates a more effective method. This test verifies the validity of the model; if the result is poor, the amount of training data is insufficient and more training data should be added.
The evaluation results are shown in Table 1. The method predicts the "forced shopping/consumption" and "modifying/terminating the trip" categories well, with recognition errors of at most 0.1 for both. These two categories are also the most common complaint categories, so the scheme of the invention can identify black tour guides quickly and accurately.
TABLE 1 Model evaluation data
Category                          Precision   Recall   F1-score
Forced shopping/consumption       0.91        0.93     0.92
Modifying/terminating the trip    0.90        0.90     0.90
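As a quick check of the formulas in step B5, the F1 values in Table 1 can be recomputed from the reported precision and recall. The helper names and the TP/FP/FN counts in the second call are hypothetical; only the precision and recall figures come from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive and false-negative counts."""
    return tp / (tp + fp), tp / (tp + fn)


# Precision and recall as reported in Table 1.
print(round(f1_score(0.91, 0.93), 2))  # 0.92
print(round(f1_score(0.90, 0.90), 2))  # 0.9

# Hypothetical counts, only to show how P and R are derived.
p, r = precision_recall(tp=90, fp=10, fn=10)
print(p, r)  # 0.9 0.9
```

Both recomputed values agree with the F1-score column of the table.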
As shown in FIG. 8, step C of the present invention: a complaint text is input and its black tour guide category is predicted. Specifically, the complaint category is predicted with the trained model obtained in step B, as follows:
Step C1: the word vector model is loaded: the local word vector model file is read and parsed line by line, with each word as a key and its word vector as the value, and stored in a dictionary variable;
Step C2: the input text is converted into a sentence vector using the word embedding dictionary. As shown in FIG. 5, a sentence is converted into a sentence vector by first segmenting it into words (the detailed steps are the same as in step A3) and then looking up each word as a key in the dictionary variable to obtain its vector. The word vectors obtained are summed and finally divided by the number of words to eliminate deviation;
Step C3: the black tour guide category prediction model is used to predict the violation category, and the predicted category is output. Prediction uses the mean value of the best split region: the initial value and the predicted values of the m-1 regression trees are added, and the Sigmoid of the sum gives the prediction y. The violation categories in this step are: 1) forced shopping/consumption; 2) modifying/terminating the trip; 3) catering/accommodation violations; 4) the tour guide has no qualification/tour guide certificate; 5) assault.
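The final scoring and ranking of step C3 might look like the sketch below. It assumes one boosted model per category (a one-versus-rest arrangement, which the text does not state) and uses made-up initial values and tree outputs in place of trained models; the function name `rank_categories` is hypothetical.

```python
import math

CATEGORIES = [
    "forced shopping/consumption",
    "modifying/terminating the trip",
    "catering/accommodation violations",
    "no tour guide qualification/certificate",
    "assault",
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rank_categories(init_values, tree_outputs):
    """Add each category's initial value and tree outputs, then rank by probability."""
    ranked = []
    for name, init, outputs in zip(CATEGORIES, init_values, tree_outputs):
        ranked.append((name, sigmoid(init + sum(outputs))))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Made-up numbers standing in for the outputs of five trained models.
init_values = [0.0, 0.0, 0.0, 0.0, 0.0]
tree_outputs = [[1.2, 0.8], [0.3, -0.1], [-0.9, -0.4], [-1.5, -0.5], [-2.0, -1.0]]
ranking = rank_categories(init_values, tree_outputs)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

The first entry of `ranking` is the predicted category, and the full list corresponds to the probability ranking mentioned in the description.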
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (7)

1. A black tour guide detection method based on a gradient boosting algorithm, characterized by comprising the following steps:
A. acquiring website news URL data, and obtaining a word vector model based on word embedding training;
B. based on the word vector model of step A, training with a gradient boosting algorithm to obtain a black tour guide category prediction model;
C. inputting a complaint text into the black tour guide category prediction model obtained in step B to obtain a predicted category.
2. The black tour guide detection method based on the gradient boosting algorithm according to claim 1, wherein step A comprises the following substeps:
A1, initiating a request to a travel news website to obtain news URL data;
A2, crawling news content from the news URL data;
A3, performing word segmentation on the news content obtained in step A2 to obtain a word segmentation corpus;
A4, training on the word segmentation corpus to obtain a word vector model.
3. The black tour guide detection method based on the gradient boosting algorithm according to claim 2, wherein step A1 specifically comprises: simulating an HTTP request with Postman, setting the request parameters so that all results are returned, setting the content type to application/x-www-form-urlencoded, parsing the returned results, and storing each day's news URL data line by line.
4. The black tour guide detection method based on the gradient boosting algorithm according to claim 2, wherein step A2 specifically comprises: reading a news URL and initiating an HTTP request, parsing the returned HTML content, obtaining the content of the title tag and the content of the body-text tag respectively, storing the title content directly as one line, splitting the body text into sentences at full stops, and then writing them to a file line by line.
5. The black tour guide detection method based on the gradient boosting algorithm according to claim 1, wherein step B comprises the following substeps:
B1, obtaining complaint texts, and taking a part of the complaint texts as a training set;
B2, reading the local word vector model file and parsing it line by line, using each word as a key and the corresponding word vector as the value, and storing them in a dictionary variable to obtain a word embedding dictionary;
B3, converting each sentence of the training set into a training sentence vector using the word embedding dictionary;
B4, training with the gradient boosting algorithm on the training sentence vectors to generate a black tour guide category prediction model.
6. The method according to claim 5, wherein the training set accounts for 70% of the complaint texts.
7. The black tour guide detection method based on the gradient boosting algorithm according to claim 1, wherein step C comprises the following substeps:
C1, reading the local word vector model file and parsing it line by line, using each word as a key and the corresponding word vector as the value, and storing them in a dictionary variable to obtain a word embedding dictionary;
C2, converting the input text into a sentence vector using the word embedding dictionary;
C3, inputting the sentence vector obtained in step C2 into the black tour guide category prediction model trained in step B, and outputting the predicted category.
CN201911173486.8A 2019-11-26 2019-11-26 Black guide detection method based on gradient lifting algorithm Pending CN110909545A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911173486.8A CN110909545A (en) 2019-11-26 2019-11-26 Black guide detection method based on gradient lifting algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911173486.8A CN110909545A (en) 2019-11-26 2019-11-26 Black guide detection method based on gradient lifting algorithm

Publications (1)

Publication Number Publication Date
CN110909545A true CN110909545A (en) 2020-03-24

Family

ID=69819468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911173486.8A Pending CN110909545A (en) 2019-11-26 2019-11-26 Black guide detection method based on gradient lifting algorithm

Country Status (1)

Country Link
CN (1) CN110909545A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597306A (en) * 2020-12-24 2021-04-02 电子科技大学 Travel comment suggestion mining method based on BERT

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399545A (en) * 2017-02-06 2018-08-14 北京京东尚科信息技术有限公司 E-commerce platform quality determining method and device
CN107220732A (en) * 2017-05-31 2017-09-29 福州大学 A kind of power failure complaint risk Forecasting Methodology based on gradient boosted tree
CN107944014A (en) * 2017-12-11 2018-04-20 河海大学 A kind of Chinese text sentiment analysis method based on deep learning
CN108416616A (en) * 2018-02-05 2018-08-17 阿里巴巴集团控股有限公司 The sort method and device of complaints and denunciation classification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MODIFYBLOG: "机器学习算法GBDT的面试要点总结-上篇", 《HTTPS://WWW.CNBLOGS.COM/MODIFYRONG/P/7744987.HTML》 *
XINDONG WU ET.AL: "Top 10 algorithms in data mining", 《DOI 10.1007/S10115-007-0114-2》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200324