CN106202481A - Perception data evaluation method and system - Google Patents

Perception data evaluation method and system

Info

Publication number
CN106202481A
Authority
CN
China
Prior art keywords
data
perception data
training
word
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610565797.9A
Other languages
Chinese (zh)
Inventor
李甫
汪洋泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quantum Cloud Future (beijing) Mdt Infotech Ltd
Wuxi Liangziyun Digital New Media Technology Co Ltd
Original Assignee
Quantum Cloud Future (beijing) Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quantum Cloud Future (beijing) Mdt Infotech Ltd filed Critical Quantum Cloud Future (beijing) Mdt Infotech Ltd
Priority to CN201610565797.9A priority Critical patent/CN106202481A/en
Publication of CN106202481A publication Critical patent/CN106202481A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/08Logistics, e.g. warehousing, loading or distribution; Inventory or stock management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)
  • Development Economics (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)

Abstract

The invention relates to a method and system for evaluating perception data. Perception data is obtained as a training corpus; the training corpus is preprocessed and manually labeled to obtain a training word bank; features are extracted from the words in the training word bank to obtain a feature dictionary, feature vectors are generated from the feature dictionary, and training samples are constructed; a classifier is created and trained with the training samples; perception data to be evaluated is obtained and preprocessed, a perception data vector is constructed and input into the trained classifier, and the category of the perception data is determined; finally, the evaluation score of the perception data to be evaluated is calculated. The evaluation method makes it possible to establish a unified certification system standard for a given service industry.

Description

Perception data evaluation method and system
Technical Field
The invention relates to the technical field of data analysis, in particular to a perception data evaluation method and system.
Background
With the development of e-commerce, the express logistics industry has grown rapidly. There are many express companies, and choosing a suitable one among them requires an objective evaluation standard. At present, however, no system on the market can quantitatively analyze customers' perception comments well; the difficulties lie in the selection of scoring indexes, natural-language semantic analysis, and the like. An evaluation method is therefore needed that evaluates customer perception data automatically, quantitatively, and in a standardized way, so that different express service providers can be compared.
The express logistics industry is a typical producer service industry, and an evaluation method developed with it as the example can serve as a demonstration for certification in other service industries.
Disclosure of Invention
In view of the foregoing analysis, the present invention aims to provide a method and a system for evaluating perception data, so as to solve the problem that the service industry lacks a unified certification system standard for evaluation.
The purpose of the invention is mainly achieved by the following technical solution:
A method for evaluating perception data comprises the following steps:
S1, obtaining perception data as a training corpus;
S2, performing data preprocessing and manual labeling on the training corpus to obtain a training word bank;
S3, extracting features from the words in the training word bank to obtain a feature dictionary, generating feature vectors based on the feature dictionary, and constructing training samples;
S4, creating a classifier and training the classifier with the training samples;
S5, obtaining perception data to be evaluated, preprocessing it, constructing a perception data vector, inputting the perception data vector into the trained classifier, and determining the category of the perception data;
S6, calculating the evaluation score of the perception data to be evaluated.
The preprocessing in steps S2 and S5 further includes formatting and word segmentation, with the following specific steps:
S21, formatting each piece of perception data in the training corpus and converting it into the same structured format, wherein the structured format comprises at least four fields: perception data content, topic domain, keywords, and company name; there is at least one topic domain, and each topic domain defines at least one category;
S22, segmenting the perception data content into words: a Chinese word segmenter is used for Chinese perception data; English perception data is split on spaces, and after segmentation, tense and singular/plural forms are normalized by stemming.
The preprocessing in steps S2 and S5 further includes stop-word and synonym processing, with the following specific steps:
a. processing the word segmentation result with a pre-established stop-word list and removing the stop words;
b. replacing synonyms using a pre-established synonym table.
The manual labeling in step S2 labels the topic domain and the category under the topic domain.
In step S3, the feature extraction method is: counting the word frequency of each word in the training word bank, sorting the words by word frequency, and selecting the first N words to form the feature dictionary.
The method for generating the feature vector is as follows: the number of words in the feature dictionary is taken as the total dimensionality of the feature vector, each word in the feature dictionary corresponds to one feature dimension, and a feature vector is established for each piece of perception data on this basis. If a word in the feature dictionary appears in the preprocessed perception data, the TF-IDF value of that word is taken as the value of the corresponding dimension; if it does not appear, the value of the corresponding dimension is 0. The TF-IDF value is TF multiplied by IDF, where TF is the word frequency and IDF is the inverse document frequency, IDF = log(D/n), in which n is the number of pieces of perception data in which the word appears and D is the total number of pieces of perception data.
A training sample may be constructed for each topic domain in step S3, and a classifier may be created for each topic domain in step S4, with the training sample of each topic domain used to train its own classifier. The classifier may employ a naive Bayes model.
The evaluation score is calculated by the formula $K_{CI} = 5 - \frac{4\,(\mathrm{Max} - R_{CI})}{\Delta}$, where $R_{CI} = \frac{1}{X_{CI}} \sum_{h=1}^{n} \alpha_h x_{Ch}$. In the formula, $n$ represents the number of categories under the topic domain; Max represents the highest score of the evaluation coefficient; $\Delta$ is the highest value minus the lowest value of the evaluation coefficient; $h$ represents each category, $h = 1$ to $n$; $\alpha_h$ is the evaluation coefficient of each category; $x_{Ch}$ is the number of items belonging to each category under each topic domain, satisfying $\sum_{h=1}^{n} x_{Ch} = X_{CI}$; and $X_{CI}$ represents the number of pieces of perception data assigned to a given topic domain.
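As a quick check on the scale of the score, assume the formula takes the form used in the worked example of the detailed description, $K_{CI} = 5 - 4(\mathrm{Max} - R_{CI})/\Delta$, with $R_{CI}$ read as the coefficient-weighted mean of the category counts (that reading of $R_{CI}$ is an assumption). The two extreme cases then bound the score:

```latex
\begin{aligned}
\text{all items in the best category:}\quad
  & R_{CI} = \mathrm{Max}
  &&\Rightarrow\; K_{CI} = 5 - \frac{4 \cdot 0}{\Delta} = 5,\\
\text{all items in the worst category:}\quad
  & R_{CI} = \mathrm{Max} - \Delta
  &&\Rightarrow\; K_{CI} = 5 - \frac{4\,\Delta}{\Delta} = 1.
\end{aligned}
```

Under that reading, every score lies between 1 and 5, whatever coefficient values are chosen.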
The invention also provides a system for evaluating perception data, comprising:
a training corpus module, used for obtaining perception data as a training corpus;
a preprocessing module, used for preprocessing the corpus;
a training word bank module, used for calling the preprocessing module to preprocess the training corpus, followed by manual labeling, to obtain a training word bank;
a training sample module, used for extracting features from the words in the training word bank to obtain a feature dictionary, generating feature vectors based on the feature dictionary, and constructing training samples;
a training module, used for creating a classifier and training the classifier with the training samples;
a judging module, used for obtaining the perception data to be evaluated, calling the preprocessing module to preprocess it, constructing a perception data vector, inputting the perception data vector into the trained classifier, and determining the category of the perception data;
and an evaluation module, used for calculating the evaluation score of the perception data to be evaluated.
The invention has the following beneficial effects:
The invention performs evaluation based on customer perception data, and its evaluation and certification system differs from traditional product and system certification in index selection, evaluation technique, and certification mode. A service quality score is obtained through statistics and calculation on user perception data, thereby establishing a unified certification system standard for a given service industry.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, wherein like reference numerals are used to designate like parts throughout.
Fig. 1 is a flowchart of an evaluation method of perception data.
Detailed Description
The preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and which together with the embodiments of the invention serve to explain the principles of the invention.
According to a specific embodiment of the invention, a method and a system evaluate perception data in the field of express delivery service. The sources of the perception data include, but are not limited to, microblogs, posts, public comments, postal bureau evaluation websites, and the review content of various e-commerce websites; the data may also be derived from user behavior logs, user behavior analysis, and the like. In this embodiment, perception data refers to users' comments on express delivery services.
The evaluation method specifically comprises the following steps:
S1, crawling comment content on the network through URL links to serve as the training corpus.
Specifically, an open-source nwebrowler program can be used to crawl HTML files, from which the evaluation information is then extracted.
The more widely distributed the sources of the training comments and the more comprehensive the data types collected, the more accurate the trained classifier, the more accurate the subsequent category predictions, and the better the final score reflects the actual condition of the express company.
S2, performing data preprocessing and manual labeling on the training corpus to obtain a training word bank. The data preprocessing includes formatting, word segmentation, stop-word removal, synonym processing, and the like, as follows:
S21, formatting each comment in the training corpus and converting the comments into the same structured format, which can be JSON, XML, or the like. The fields of the structured format include comment content, topic domain, keywords, company name, etc. There may be multiple topic domains, with multiple level classes defined under each topic domain. The content of the keyword field is extracted from the original comment content.
Take a certain express company in the field of express service as an example. Six topic domains are identified: functionality, economy, safety, timeliness, comfort, and civilization, as shown in Table 1. Functionality reflects individualized services; economy reflects price; safety reflects privacy protection, insurance, and cargo integrity; timeliness reflects delivery speed; comfort reflects whether consulting is convenient, whether pickup and delivery are convenient, whether reminders are timely, and so on; civilization reflects the company's service attitude. Under each topic domain, 4 level classes are defined: very good, good, bad, and very bad. The 4 level classes may also be expressed differently per domain: for timeliness, very fast, fast, slow, and very slow; for economy, very cheap, cheap, expensive, and very expensive.
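To make the structured format concrete, the sketch below shows one comment as a JSON record. The patent only names the kinds of field, so the key names, the sample comment, and the company name `ExampleExpress` are all illustrative assumptions.

```python
import json

# The six topic domains of the embodiment.
TOPIC_DOMAINS = [
    "functionality", "economy", "safety",
    "timeliness", "comfort", "civilization",
]

# Key names are assumptions; the patent lists only the four kinds of field.
comment_record = {
    "content": "The parcel arrived two days late but was intact.",
    "topic_domains": ["timeliness", "safety"],
    "keywords": ["late", "intact"],
    "company": "ExampleExpress",  # hypothetical company name
}

def to_json(record):
    """Serialize one structured comment, rejecting unknown topic domains."""
    unknown = [d for d in record["topic_domains"] if d not in TOPIC_DOMAINS]
    if unknown:
        raise ValueError(f"unknown topic domains: {unknown}")
    return json.dumps(record, ensure_ascii=False, sort_keys=True)

serialized = to_json(comment_record)
```

Validating the topic domains at serialization time is a design choice of this sketch, not something the patent requires.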
S22, segmenting the structured comment content into words. If the comment is in Chinese, a Chinese word segmenter is used; if it is in English, the text is split on spaces, and after segmentation, tense and singular/plural forms are normalized by stemming. Specifically, word segmentation tools such as ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) and IK Analyzer can be used as the Chinese word segmenter.
S23, processing the word segmentation result with a pre-established stop-word list and removing the stop words. Stop words include characters or words without practical meaning, such as "and", "have", and "not only ... but also", as well as some uncommon words and special symbols.
S24, replacing synonyms in the training word bank using a pre-established synonym table, so that all synonyms are represented by a single word.
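Steps S22 to S24 can be sketched for English comments as follows. The stop-word list, the synonym table, and the `naive_stem` suffix stripper are all hypothetical stand-ins: a real system would use pre-built tables and a proper stemmer, and Chinese text would go through a word segmenter instead.

```python
import re

# Hypothetical tables; in practice these are established in advance.
STOP_WORDS = {"and", "the", "a", "is", "was", "but", "not", "only", "also"}
SYNONYMS = {"quick": "fast", "rapid": "fast", "courier": "deliverer"}

def naive_stem(word):
    """Very rough suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Tokenize, stem, drop stop words, then unify synonyms."""
    # Splitting on non-letters approximates space tokenization
    # while also discarding punctuation.
    tokens = re.findall(r"[a-z]+", text.lower())
    result = []
    for token in tokens:
        stem = naive_stem(token)
        if stem in STOP_WORDS:
            continue
        result.append(SYNONYMS.get(stem, stem))
    return result

tokens = preprocess("The courier was quick and the parcels arrived.")
```

Because stemming runs before the table lookups, the stop-word list and synonym table in this sketch must hold stemmed forms.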
Table 1. Topic domains and their level classes
S25, manually labeling the topic domains that each comment in the training corpus relates to and the level class under each topic domain. Note that a comment may relate to multiple topic domains, but it can correspond to only one level class under each topic domain. For example, if a comment involves the safety and timeliness topic domains, it is manually labeled, according to its semantics, "good" on safety and "slow" on timeliness. If a comment does not relate to any topic domain, it is deleted.
S26, storing the words obtained after word segmentation, stop-word removal, and synonym processing, divided by topic domain, in vector form into the training word bank corresponding to each topic domain.
S3, extracting features from the words in the training word bank to obtain a feature dictionary, generating the feature vector of each comment with the feature dictionary, and forming the training sample of each topic domain from the feature vectors involved in that topic domain and their manually labeled level classes.
The feature extraction method is: counting the word frequency of each word in the training word bank and selecting the N (N ≥ 1) most frequent words as the feature dictionary.
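A minimal sketch of this top-N selection, assuming the training word bank is available as a list of tokenized comments (the function name and the toy corpus are illustrative):

```python
from collections import Counter

def build_feature_dictionary(tokenized_comments, n):
    """Return the n most frequent words across all comments."""
    if n < 1:
        raise ValueError("n must be at least 1")
    counts = Counter(
        word for comment in tokenized_comments for word in comment
    )
    # most_common orders by count; equal counts keep first-seen order.
    return [word for word, _ in counts.most_common(n)]

corpus = [
    ["fast", "delivery", "good"],
    ["slow", "delivery"],
    ["fast", "delivery", "cheap"],
]
feature_dictionary = build_feature_dictionary(corpus, 2)
```

The position of a word in the returned list can then serve as its feature dimension.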
The feature vector is generated as follows: the number of words in the feature dictionary gives the total dimensionality, and each word corresponds to one feature dimension. On this basis, a feature vector is established for each comment. If a word in the feature dictionary appears in the preprocessed comment, the TF-IDF value of the word is taken as the value of the corresponding dimension; if it does not appear, the value of the corresponding dimension is 0.
For example, suppose 3 words from the feature dictionary appear in a preprocessed comment, corresponding to the 1st, 32nd, and 80th dimensions of the feature dictionary. The values of the comment's feature vector in the 1st, 32nd, and 80th dimensions are then the TF-IDF values of those 3 words, namely 0.1, 0.4, and 0.32, and the values in all other dimensions are 0, where 0 indicates that the dictionary word of that dimension does not appear in the comment.
The TF-IDF value is TF multiplied by IDF, where TF is the word frequency and IDF is the inverse document frequency, IDF = log(D/n), in which n is the number of comments in which the word appears and D is the total number of comments.
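The vector construction can be sketched as below. Two details are assumptions, since the text does not fix them: TF is taken as the word count divided by the comment length, and the logarithm is the natural logarithm.

```python
import math

def idf_table(tokenized_comments, feature_dictionary):
    """IDF = log(D / n) for each dictionary word seen in the corpus."""
    total = len(tokenized_comments)
    table = {}
    for word in feature_dictionary:
        n = sum(1 for comment in tokenized_comments if word in comment)
        if n > 0:
            table[word] = math.log(total / n)
    return table

def feature_vector(tokens, feature_dictionary, idf):
    """One dimension per dictionary word: TF * IDF if present, else 0."""
    vector = []
    for word in feature_dictionary:
        count = tokens.count(word)
        if count == 0 or word not in idf:
            vector.append(0.0)
        else:
            tf = count / len(tokens)  # assumption: length-normalized TF
            vector.append(tf * idf[word])
    return vector

corpus = [["fast", "delivery"], ["slow", "delivery"], ["fast", "cheap", "fast"]]
dictionary = ["delivery", "fast", "slow"]
idf = idf_table(corpus, dictionary)
vec = feature_vector(corpus[2], dictionary, idf)
```

One consequence of IDF = log(D/n) is that a word appearing in every comment has IDF 0, so its dimension is 0 even when the word is present.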
S4, creating a classifier for each topic domain and training it with the training samples of that topic domain. The classifier is used to predict the level class of a comment within the topic domain.
This embodiment uses a naive Bayes model as the classifier. Its classification principle is to estimate the probability that the features belong to each class and then take the class with the highest probability as the classification result. The invention is not limited to the naive Bayes model; other classifiers, such as an SVM (support vector machine) classifier, can also be used.
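The classification principle can be illustrated with a small multinomial naive Bayes classifier written from scratch. The patent does not say which naive Bayes variant is used, so the multinomial form, the Laplace smoothing parameter `alpha`, and the two-class toy data are assumptions of this sketch.

```python
import math
from collections import defaultdict

class NaiveBayes:
    """Multinomial naive Bayes over non-negative feature weights."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing (assumption)

    def fit(self, vectors, labels):
        dims = len(vectors[0])
        class_counts = defaultdict(int)
        weight_sums = defaultdict(lambda: [0.0] * dims)
        for vector, label in zip(vectors, labels):
            class_counts[label] += 1
            sums = weight_sums[label]
            for i, value in enumerate(vector):
                sums[i] += value
        total = len(labels)
        self.log_prior = {
            c: math.log(k / total) for c, k in class_counts.items()
        }
        self.log_likelihood = {}
        for c, sums in weight_sums.items():
            denom = sum(sums) + self.alpha * dims
            self.log_likelihood[c] = [
                math.log((s + self.alpha) / denom) for s in sums
            ]
        return self

    def predict(self, vector):
        """Return the class with the highest posterior log-probability."""
        def score(c):
            return self.log_prior[c] + sum(
                v * w for v, w in zip(vector, self.log_likelihood[c])
            )
        return max(self.log_prior, key=score)

# One classifier per topic domain; dimensions stand for ["fast", "slow"].
train_x = [[1.0, 0.0], [0.8, 0.0], [0.0, 1.0], [0.0, 0.9]]
train_y = ["good", "good", "bad", "bad"]
classifiers = {"timeliness": NaiveBayes().fit(train_x, train_y)}
prediction = classifiers["timeliness"].predict([0.5, 0.0])
```

Keeping the classifiers in a dictionary keyed by topic domain mirrors the one-classifier-per-domain arrangement; the embodiment itself uses four level classes rather than the two shown here.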
S5, crawling the comments about a given company through URL links, preprocessing them, constructing comment vectors, inputting the comment vectors into the trained classifiers, and determining the level class of each comment in the topic domains it relates to, which yields the company's level-class distribution in each topic domain.
The data preprocessing includes formatting, word segmentation, stop-word removal, and the like.
S51, converting the crawled comments into the same structured format, which can be JSON, XML, or the like. The fields of the structured format include comment content, topic domain, keywords, company name, etc.
S52, segmenting the structured comment content into words, using the same word segmentation method as in step S22.
S53, processing the word segmentation result with a pre-established stop-word list and removing the stop words. The stop-word list is the same as that used in step S23.
S54, constructing the comment feature vector: the preprocessed words are compared with the feature dictionary; if a word in the feature dictionary appears among the preprocessed words, its TF-IDF value in the training sample is taken as the feature value at the corresponding position of the feature vector; if it does not appear, the feature value at the corresponding position is 0.
Taking an express company on the timeliness topic domain as an example, the distribution over the level classes determined by the classifier is shown in Table 2.
Table 2. Distribution of the express company over the level classes on the timeliness topic domain
S6, calculating the company's evaluation score on each topic domain.
The score $K_{CI}$ of the company on a topic domain is calculated from the company's level-class distribution on that topic domain. The calculation formula is

$$K_{CI} = 5 - \frac{4\,(\mathrm{Max} - R_{CI})}{\Delta}, \qquad \text{where } R_{CI} = \frac{1}{X_{CI}} \sum_{h=1}^{n} \alpha_h x_{Ch}.$$

In the formula, $n$ represents the number of level classes under the topic domain;
Max represents the highest score of the evaluation coefficient;
$\Delta$ is the highest value minus the lowest value of the evaluation coefficient;
$h$ represents each level class, $h = 1$ to $n$;
$\alpha_h$ is the evaluation coefficient of each level class, whose value can be changed as required;
$x_{Ch}$ is the number of the company's comments belonging to each level class in the topic domain;
$X_{CI}$ is the number of company C's comments assigned to topic domain I.
Taking the level-class distribution of the express company on the timeliness topic domain in Table 2 as an example, the evaluation score is calculated as follows. Each topic domain has 4 level classes, i.e. $n = 4$ and $h = 1, 2, 3, 4$, and the evaluation coefficients of the level classes are set to $\alpha_1 = 1.2$, $\alpha_2 = 1$, $\alpha_3 = -1$, $\alpha_4 = -1.2$. The formula thus becomes

$$K_{CI} = 5 - \frac{4\,(1.2 - R_{CI})}{2.4}$$

where $R_{CI}$ is computed from the level-class counts in Table 2. Substituting the value of $R_{CI}$ into the formula for $K_{CI}$ gives

$$K_{CI} = 5 - \frac{4\,(1.2 - R_{CI})}{2.4} = 3.1167.$$

This means that the express company scores 3.1167 on the topic domain "timeliness".
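The score calculation can be sketched as below, reading the general formula as K = 5 - 4(Max - R)/Δ, which matches the worked expression 5 - 4(1.2 - R)/2.4, and reading R as the coefficient-weighted mean of the level-class counts (an assumption). The counts are hypothetical, not the values of Table 2; they are chosen so that R = 0.07, which reproduces the score 3.1167.

```python
def evaluation_score(counts, coefficients, top=5.0, span=4.0):
    """K = top - span * (Max - R) / delta, with R a weighted mean."""
    if len(counts) != len(coefficients):
        raise ValueError("one count per level class is required")
    total = sum(counts)
    if total == 0:
        raise ValueError("no comments in this topic domain")
    # R: mean evaluation coefficient over all comments (assumed reading).
    r = sum(a * x for a, x in zip(coefficients, counts)) / total
    highest = max(coefficients)
    delta = highest - min(coefficients)
    return top - span * (highest - r) / delta

coefficients = [1.2, 1.0, -1.0, -1.2]
counts = [20, 33, 32, 15]  # hypothetical counts, not taken from Table 2
score = evaluation_score(counts, coefficients)
```

With these coefficients the score runs from 1, when every comment falls in the lowest level class, to 5, when every comment falls in the highest.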
Another specific embodiment of the invention provides a perception data evaluation system for implementing the above perception data evaluation method, comprising:
a training corpus module, configured to implement step S1 and obtain perception data as a training corpus;
a preprocessing module, used for preprocessing the corpus; the preprocessing may include formatting and word segmentation, and may further include stop-word processing, synonym processing, etc., as described in steps S21 to S24 above;
a training word bank module, used for calling the preprocessing module to preprocess the training corpus, followed by manual labeling, to obtain a training word bank; the manual labeling can label the topic domains that each comment in the training corpus relates to and the level class under each topic domain;
a training sample module, used for extracting features from the words in the training word bank to obtain a feature dictionary, generating feature vectors based on the feature dictionary, and constructing training samples; specifically, the method of step S3 above may be used;
a training module, used for creating a classifier and training it with the training samples; specifically, the method of step S4 may be used;
a judging module, used for obtaining the perception data to be evaluated, calling the preprocessing module to preprocess it, constructing a perception data vector, inputting the perception data vector into the trained classifier, and determining the category of the perception data; specifically, the method of step S5 may be used;
and an evaluation module, used for calculating the evaluation score of the perception data to be evaluated, with the calculation method as described in step S6.
In summary, the embodiments of the invention provide a method and system for evaluating perception data in the field of express service, which classify and quantify user evaluations and provide a scoring method for express company services, so as to establish a unified evaluation standard for the express service industry.
Those skilled in the art will appreciate that all or part of the flow of the method in the above embodiments may be implemented by a computer program instructing the relevant hardware, the program being stored in a computer-readable storage medium such as a magnetic disk, an optical disk, a read-only memory, or a random access memory.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (10)

1. A method for evaluating perception data, characterized by comprising the following steps:
S1, obtaining perception data as a training corpus;
S2, performing data preprocessing and manual labeling on the training corpus to obtain a training word bank;
S3, extracting features from the words in the training word bank to obtain a feature dictionary, generating feature vectors based on the feature dictionary, and constructing training samples;
S4, creating a classifier and training the classifier with the training samples;
S5, obtaining perception data to be evaluated, preprocessing it, constructing a perception data vector, inputting the perception data vector into the trained classifier, and determining the category of the perception data;
S6, calculating the evaluation score of the perception data to be evaluated.
2. The method for evaluating perception data according to claim 1, wherein the preprocessing in steps S2 and S5 further comprises formatting and word segmentation, with the following specific steps:
S21, formatting the perception data and converting it into the same structured format, wherein the structured format comprises at least four fields: perception data content, topic domain, keywords, and company name; there is at least one topic domain, and each topic domain defines at least one category;
S22, segmenting the perception data content into words.
3. The method for evaluating perception data according to claim 2, wherein a Chinese word segmenter is used for Chinese perception data; English perception data is split on spaces, and after segmentation, tense and singular/plural forms are normalized by stemming.
4. The method for evaluating perception data according to claim 2, wherein the preprocessing in steps S2 and S5 further comprises stop-word and synonym processing, with the following specific steps:
a. processing the word segmentation result with a pre-established stop-word list and removing the stop words;
b. replacing synonyms using a pre-established synonym table.
5. The method for evaluating perception data according to claim 2, wherein the manual labeling in step S2 labels the topic domain and the category under the topic domain.
6. The method for evaluating perception data according to claim 1, wherein the feature extraction method in step S3 is: counting the word frequency of each word in the training word bank, sorting the words by word frequency, and selecting the first N words to form the feature dictionary.
7. The method for evaluating perception data according to claim 1, wherein the method for generating the feature vector in step S3 is: taking the number of words in the feature dictionary as the total dimensionality of the feature vector, each word in the feature dictionary corresponding to one feature dimension, and establishing a feature vector for the perception data on this basis; if a word in the feature dictionary appears in the preprocessed perception data, taking the TF-IDF value of that word as the value of the corresponding dimension of the feature vector; if it does not appear, the value of the corresponding dimension is 0; the TF-IDF value is TF multiplied by IDF, where TF is the word frequency and IDF is the inverse document frequency, IDF = log(D/n), in which n is the number of pieces of perception data in which the word appears and D is the total number of pieces of perception data.
8. The method for evaluating perception data according to claim 2, wherein a training sample is constructed for each topic domain in step S3, a classifier is created for each topic domain in step S4, and the training sample of each topic domain is used to train its own classifier.
9. The method for evaluating perception data according to claim 2, wherein the evaluation score is calculated by the formula $K_{CI} = 5 - \frac{4\,(\mathrm{Max} - R_{CI})}{\Delta}$, where $R_{CI} = \frac{1}{X_{CI}} \sum_{h=1}^{n} \alpha_h x_{Ch}$; in the formula, $n$ represents the number of categories under the topic domain; Max represents the highest score of the evaluation coefficient; $\Delta$ is the highest value of the evaluation coefficient minus the lowest value; $h$ represents each category, $h = 1$ to $n$; $\alpha_h$ is the evaluation coefficient of each category; $x_{Ch}$ is the number of items belonging to each category under each topic domain, satisfying $\sum_{h=1}^{n} x_{Ch} = X_{CI}$; and $X_{CI}$ represents the number of pieces of perception data assigned to a given topic domain.
10. An evaluation system for implementing the perception data evaluation method of any one of claims 1 to 9, comprising:
a training corpus module, configured to acquire perception data as a training corpus;
a preprocessing module, configured to perform data preprocessing on a corpus;
a training word library module, configured to call the preprocessing module to preprocess the training corpus and then obtain a training word library through manual labeling;
a training sample module, configured to perform feature extraction on the words in the training word library to obtain a feature dictionary, generate feature vectors based on the feature dictionary, and construct training samples;
a training module, configured to create a classifier and train the classifier with the training samples;
a judging module, configured to acquire the perception data to be evaluated, call the preprocessing module to preprocess the perception data to be evaluated, construct a perception data vector, input the perception data vector into the trained classifier, and determine the category of the perception data;
and an evaluation module, configured to calculate the evaluation score of the perception data to be evaluated.
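As an illustration of the feature vector construction recited in claim 7, the following is a minimal Python sketch and not the applicant's implementation. The function names `idf_table` and `feature_vector`, the sample corpus, and the sample dictionary are hypothetical. Three points are assumptions because the claim does not fix them: TF is taken as the relative frequency of a word within one perception data item, the logarithm is the natural logarithm, and a dictionary word that occurs in no item is given an IDF of 0.

```python
import math
from collections import Counter


def idf_table(corpus, dictionary):
    """Compute IDF = log(D / n) for every word in the feature dictionary.

    D is the total number of perception data items and n is the number of
    items in which the word appears (natural log is an assumption).
    """
    total = len(corpus)
    table = {}
    for word in dictionary:
        n = sum(1 for tokens in corpus if word in tokens)
        # Assumption: a word found in no item gets IDF 0 (claim is silent on n = 0).
        table[word] = math.log(total / n) if n else 0.0
    return table


def feature_vector(tokens, dictionary, idf):
    """Build one vector with a dimension per dictionary word.

    A dimension holds TF * IDF if the word appears in the item, else 0.
    """
    counts = Counter(tokens)
    length = len(tokens)
    vector = []
    for word in dictionary:
        if counts[word] and length:
            tf = counts[word] / length  # assumption: relative term frequency
            vector.append(tf * idf[word])
        else:
            vector.append(0.0)
    return vector


# Hypothetical preprocessed (tokenized) perception data items.
corpus = [
    ["picture", "clear", "sound", "good"],
    ["picture", "blurry"],
    ["sound", "good", "good"],
]
# Hypothetical feature dictionary; its size is the vector dimensionality.
dictionary = ["picture", "sound", "good", "blurry"]

idf = idf_table(corpus, dictionary)
vec = feature_vector(corpus[0], dictionary, idf)
print(vec)
```

In this sketch the first item yields a four-dimensional vector whose last dimension is 0, because "blurry" does not appear in that item. A vector built this way could serve as the training sample input for the per-topic classifiers of claim 8.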
CN201610565797.9A 2016-07-18 2016-07-18 The evaluation methodology of a kind of perception data and system Pending CN106202481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610565797.9A CN106202481A (en) 2016-07-18 2016-07-18 The evaluation methodology of a kind of perception data and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610565797.9A CN106202481A (en) 2016-07-18 2016-07-18 The evaluation methodology of a kind of perception data and system

Publications (1)

Publication Number Publication Date
CN106202481A true CN106202481A (en) 2016-12-07

Family

ID=57493783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610565797.9A Pending CN106202481A (en) 2016-07-18 2016-07-18 The evaluation methodology of a kind of perception data and system

Country Status (1)

Country Link
CN (1) CN106202481A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066442A (en) * 2017-02-15 2017-08-18 阿里巴巴集团控股有限公司 Detection method, device and the electronic equipment of mood value
CN107067143A (en) * 2016-12-30 2017-08-18 山东鲁能软件技术有限公司 A kind of equipment safety grade separation method
CN107168987A (en) * 2017-03-24 2017-09-15 联想(北京)有限公司 A kind of data processing method and its device
CN107608964A (en) * 2017-09-13 2018-01-19 上海六界信息技术有限公司 Screening technique, device, equipment and the storage medium of live content based on barrage
CN107657284A (en) * 2017-10-11 2018-02-02 宁波爱信诺航天信息有限公司 A kind of trade name sorting technique and system based on Semantic Similarity extension
CN108280198A (en) * 2018-01-29 2018-07-13 口碑(上海)信息技术有限公司 List generation method and device
CN108520012A (en) * 2018-03-21 2018-09-11 北京航空航天大学 Mobile Internet user comment method for digging based on machine learning
CN108537428A (en) * 2018-03-28 2018-09-14 校宝在线(杭州)科技股份有限公司 A kind of cloud service provider service quality evaluation method based on official website situation of change
CN111415176A (en) * 2018-12-19 2020-07-14 杭州海康威视数字技术股份有限公司 Satisfaction evaluation method and device and electronic equipment
WO2023240858A1 (en) * 2022-06-16 2023-12-21 四川大学 Pca-e-based product kansei semantic word extraction method
US11868432B1 (en) 2022-06-16 2024-01-09 Sichuan University Method for extracting kansei adjective of product based on principal component analysis and explanation (PCA-E)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103123633A (en) * 2011-11-21 2013-05-29 阿里巴巴集团控股有限公司 Generation method of evaluation parameters and information searching method based on evaluation parameters
WO2013107031A1 (en) * 2012-01-20 2013-07-25 华为技术有限公司 Method, device and system for determining video quality parameter based on comment
CN103793503A (en) * 2014-01-24 2014-05-14 北京理工大学 Opinion mining and classification method based on web texts
CN104750844A (en) * 2015-04-09 2015-07-01 中南大学 Method and device for generating text characteristic vectors based on TF-IGM, method and device for classifying texts
CN104951558A (en) * 2015-06-30 2015-09-30 北京奇艺世纪科技有限公司 Video to-be-improved item determining method and device
CN104965867A (en) * 2015-06-08 2015-10-07 南京师范大学 Text event classification method based on CHI feature selection

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107067143A (en) * 2016-12-30 2017-08-18 山东鲁能软件技术有限公司 A kind of equipment safety grade separation method
CN107066442A (en) * 2017-02-15 2017-08-18 阿里巴巴集团控股有限公司 Detection method, device and the electronic equipment of mood value
CN107168987A (en) * 2017-03-24 2017-09-15 联想(北京)有限公司 A kind of data processing method and its device
CN107608964B (en) * 2017-09-13 2021-01-12 上海六界信息技术有限公司 Live broadcast content screening method, device, equipment and storage medium based on barrage
CN107608964A (en) * 2017-09-13 2018-01-19 上海六界信息技术有限公司 Screening technique, device, equipment and the storage medium of live content based on barrage
CN107657284A (en) * 2017-10-11 2018-02-02 宁波爱信诺航天信息有限公司 A kind of trade name sorting technique and system based on Semantic Similarity extension
CN108280198A (en) * 2018-01-29 2018-07-13 口碑(上海)信息技术有限公司 List generation method and device
CN108280198B (en) * 2018-01-29 2021-03-02 口碑(上海)信息技术有限公司 List generation method and apparatus
CN108520012A (en) * 2018-03-21 2018-09-11 北京航空航天大学 Mobile Internet user comment method for digging based on machine learning
CN108520012B (en) * 2018-03-21 2022-02-18 北京航空航天大学 Mobile internet user comment mining method based on machine learning
CN108537428A (en) * 2018-03-28 2018-09-14 校宝在线(杭州)科技股份有限公司 A kind of cloud service provider service quality evaluation method based on official website situation of change
CN111415176A (en) * 2018-12-19 2020-07-14 杭州海康威视数字技术股份有限公司 Satisfaction evaluation method and device and electronic equipment
WO2023240858A1 (en) * 2022-06-16 2023-12-21 四川大学 Pca-e-based product kansei semantic word extraction method
US11868432B1 (en) 2022-06-16 2024-01-09 Sichuan University Method for extracting kansei adjective of product based on principal component analysis and explanation (PCA-E)

Similar Documents

Publication Publication Date Title
CN106202481A (en) The evaluation methodology of a kind of perception data and system
Liu et al. Assessing product competitive advantages from the perspective of customers by mining user-generated content on social media
Boumans et al. Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars
Mukherjee et al. Effect of negation in sentences on sentiment analysis and polarity detection
Bharathi et al. Sentiment analysis for effective stock market prediction
Mandal et al. Unsupervised approaches for measuring textual similarity between legal court case reports
Zhao et al. Adding redundant features for CRFs-based sentence sentiment classification
Esuli et al. Machines that learn how to code open-ended survey data
AU2019219746A1 (en) Artificial intelligence based corpus enrichment for knowledge population and query response
CN107491531A (en) Chinese network comment sensibility classification method based on integrated study framework
Kumari et al. Sentiment analysis of smart phone product review using SVM classification technique
US20230069935A1 (en) Dialog system answering method based on sentence paraphrase recognition
Nagar et al. Using text and data mining techniques to extract stock market sentiment from live news streams
JP2022552421A (en) Techniques for dynamically creating representations for regulations
Rahate et al. Feature selection for sentiment analysis by using svm
Haque et al. Opinion mining from bangla and phonetic bangla reviews using vectorization methods
Somprasertsri et al. Automatic product feature extraction from online product reviews using maximum entropy with lexical and syntactic features
Das et al. Sentiment analysis of movie reviews using POS tags and term frequencies
Brottrager et al. Modeling and predicting literary reception
GB2572320A (en) Hate speech detection system for online media content
Addepalli et al. A proposed framework for measuring customer satisfaction and product recommendation for ecommerce
Sjaif Sentiment Analysis using Term based Method for Customers’ Reviews in Amazon Product
Yang et al. Feature-based Product Review Summarization Utilizing User Score.
Velmurugan et al. Mining implicit and explicit rules for customer data using natural language processing and apriori algorithm
KR20240110453A (en) Personal information detection device, system, method and recording medium in unstructured data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20170215

Address after: 100079 Beijing City, Haidian District cloud layer 6235 Li Jin Ya Yuan Shanghai 6

Applicant after: Quantum cloud future (Beijing) Mdt InfoTech Ltd

Applicant after: WUXI LIANGZIYUN DIGITAL NEW MEDIA TECHNOLOGY CO., LTD.

Address before: 100000 Beijing City, Haidian District cloud layer 6235 Li Jin Ya Yuan Shanghai 6

Applicant before: Quantum cloud future (Beijing) Mdt InfoTech Ltd

RJ01 Rejection of invention patent application after publication

Application publication date: 20161207
