CN107766316B - Evaluation data analysis method, device and system - Google Patents

Evaluation data analysis method, device and system

Info

Publication number
CN107766316B
Authority
CN
China
Prior art keywords
evaluation
score
user
elements
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610670873.2A
Other languages
Chinese (zh)
Other versions
CN107766316A (en)
Inventor
姜珊珊
郑继川
董滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to CN201610670873.2A
Publication of CN107766316A
Application granted
Publication of CN107766316B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Abstract

The invention provides an evaluation data analysis method, device and system, and belongs to the field of natural language processing. The evaluation data analysis method comprises the following steps: acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score given by the user to the evaluation elements; clustering the evaluation elements; and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users. The technical scheme of the invention yields an evaluation that is closer to what users actually think of the product or of a certain characteristic of the product.

Description

Evaluation data analysis method, device and system
Technical Field
The present invention relates to the field of natural language processing, and in particular, to a method, an apparatus, and a system for analyzing evaluation data.
Background
Currently, a user's evaluation of a product is usually expressed as text. To better understand and analyze the opinions in evaluation text, opinion mining of evaluation elements has become a main subject in the field of evaluation analysis. Opinion mining of evaluation elements mainly includes two steps: extracting the evaluation elements and judging the corresponding emotional tendency.
The evaluation element may be a certain feature of the product being evaluated; for example, in the field of mobile phone products, "battery" and "screen" may be evaluation elements. The emotional tendency of the user toward an evaluation element may generally be represented by an evaluation score, with +N representing a positive evaluation, 0 representing a neutral evaluation, and -N representing a negative evaluation, where N is a positive integer. When analyzing users' views of a product or of a certain characteristic of a product, it is more meaningful to analyze the evaluation scores of a plurality of users in combination than to analyze the evaluation score of a single user. Therefore, in the prior art, the evaluation scores of a plurality of users for the same evaluation element are generally obtained, and their average is used as the true evaluation score of the evaluation element.
However, a user's evaluation score for an evaluation element tends to be biased. For example, for the same mobile phone product, user A cares more about the screen and less about battery life, so when evaluating the product user A scores the screen harshly (-4 points) and the battery leniently (+5 points). Conversely, user B cares more about battery life and less about the screen, so user B scores the battery harshly (-3 points) and the screen leniently (+4 points). An arithmetic mean of the evaluation scores of a plurality of users for the same evaluation element therefore cannot represent the users' actual evaluation of a product or of a certain characteristic of the product.
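To make the limitation concrete, the scores from the example of users A and B can be averaged directly. The snippet below is only an illustration of the prior-art arithmetic mean; the variable and function names are hypothetical and do not come from the source.

```python
# First evaluation scores from the example: user A is strict about the screen,
# user B is strict about the battery.
scores = {
    "screen": {"A": -4, "B": 4},
    "battery": {"A": 5, "B": -3},
}


def arithmetic_mean(values):
    """Plain unweighted mean, as used in the prior art."""
    values = list(values)
    return sum(values) / len(values)


means = {element: arithmetic_mean(by_user.values())
         for element, by_user in scores.items()}
print(means)  # {'screen': 0.0, 'battery': 1.0}
```

The unweighted mean treats a harsh score and a lenient score as equally informative, which is the shortcoming the scoring weights are meant to address.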
Disclosure of Invention
The invention aims to provide an evaluation data analysis method, device and system that can obtain an evaluation closer to what users actually think of a product or of a certain characteristic of the product.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
in one aspect, an evaluation data analysis method is provided, including:
acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
clustering the evaluation elements;
and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average value obtained according to the first evaluation score of the evaluation element of each cluster and the scoring weight of the user.
Further, the acquiring of the evaluation data of the user specifically includes:
capturing an evaluation text of a user;
and identifying the evaluation text, extracting the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements contained in the evaluation text, and generating the evaluation data according to the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements.
Further, clustering the evaluation elements and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster includes:
dividing the obtained evaluation data into a plurality of sets, wherein the evaluation elements in each set belong to the same cluster of the same product;
in each set, establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters;
and iterating the second evaluation score estimate and the user's scoring weight estimate until the objective value of the maximum likelihood estimation equation is maximized and converges, and taking the resulting second evaluation score estimate as the second evaluation score of the evaluation element cluster corresponding to the set.
Further, after obtaining the second evaluation score of each evaluation element cluster, the method further includes:
screening out a plurality of products that include evaluation elements of the same cluster;
and ranking the plurality of products according to the second evaluation score of the evaluation element cluster.
Further, after obtaining the second evaluation score of each evaluation element cluster, the method further includes:
calculating an average of the second evaluation scores of different evaluation element clusters of the same product to obtain a third evaluation score of the product;
and ranking a plurality of products by using the third evaluation score.
Further, the evaluation data further includes a fourth evaluation score of the product, and after obtaining the third evaluation score of the product, the method further includes:
calculating an average of the fourth evaluation score and the third evaluation score of the same product to obtain a fifth evaluation score of the product;
and ranking a plurality of products by using the fifth evaluation score.
The embodiment of the invention also provides an analysis device for evaluation data, which comprises:
an acquisition module, configured to acquire evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
a clustering module, configured to cluster the evaluation elements;
and a calculating module, configured to perform parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
Further, the apparatus further comprises:
the grabbing module is used for grabbing an evaluation text of a user;
and the identification module is used for identifying the evaluation text, extracting the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements contained in the evaluation text, and generating the evaluation data according to the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements.
Further, the calculation module includes:
the dividing unit is used for dividing the acquired evaluation data into a plurality of sets, and the evaluation elements in each set belong to the same cluster of the same product;
the equation establishing unit is used for establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters in each set;
and the solving unit is used for iterating the second evaluation score estimate and the user's scoring weight estimate until the objective value of the maximum likelihood estimation equation is maximized and converges, and taking the resulting second evaluation score estimate as the second evaluation score of the evaluation element cluster corresponding to the set.
The embodiment of the invention also provides an analysis system for evaluation data, which comprises:
a network interface, a processor, an input device, a memory, and a display device interconnected by a bus architecture, wherein:
the network interface is used for connecting to a network;
the input device is used for receiving an input instruction and sending the input instruction to the processor for execution;
the memory is used for storing an operating system, an application program and intermediate data in the calculation process of the processor;
the display device is used for displaying the result obtained by the processor;
the processor is used for acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements; clustering the evaluation elements; and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
The embodiment of the invention has the following beneficial effects:
In the above scheme, parameter estimation is performed on the evaluation elements of each cluster by using the first evaluation score to obtain the second evaluation score of each evaluation element cluster, and the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
Drawings
FIG. 1 is a schematic flow chart of an analysis method for evaluation data according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for analyzing evaluation data according to an embodiment of the present invention;
FIG. 3 is a block diagram of an apparatus for analyzing evaluation data according to another embodiment of the present invention;
FIG. 4 is a block diagram of a computing module according to an embodiment of the invention;
FIG. 5 is a block diagram of an analysis system for evaluating data according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating an analysis method of evaluation data according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages to be solved by the embodiments of the present invention clearer, the following detailed description will be given with reference to the accompanying drawings and specific embodiments.
In the prior art, the evaluation scores of a plurality of users for the same evaluation element are simply averaged, and the resulting arithmetic mean cannot represent the users' real evaluation of a product or of a certain characteristic of the product. To address this problem, the embodiments of the present invention provide an evaluation data analysis method, device and system that can obtain an evaluation closer to what users actually think of the product or of a certain characteristic of the product.
Example one
The present embodiment provides an evaluation data analysis method, as shown in fig. 1, the present embodiment includes:
Step 101: acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
Preferably, the evaluation data may include evaluation elements of all products and a first evaluation score of the user for the evaluation elements.
Step 102: clustering the evaluation elements;
Step 103: performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
In this embodiment, the bias in a user's scoring is captured by the user's scoring weight. Parameter estimation is performed on the evaluation elements of each cluster by using the first evaluation score to obtain the second evaluation score of each evaluation element cluster, and the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
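As an illustration of steps 101 and 102, the evaluation data can be held as simple records and each evaluation element mapped to a cluster. This is a minimal sketch: the source does not fix a clustering method at this point, so the lookup table `ELEMENT_TO_CLUSTER`, the `Evaluation` record and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    product: str
    user: str
    element: str   # evaluation element as written in the text
    score: int     # first evaluation score


# Hypothetical cluster table (an assumption; any clustering method could fill it).
ELEMENT_TO_CLUSTER = {
    "battery": "battery",
    "battery life": "battery",
    "screen": "screen",
    "display": "screen",
}


def cluster_of(element: str) -> str:
    """Map an element to its cluster; unknown elements form their own cluster."""
    key = element.strip().lower()
    return ELEMENT_TO_CLUSTER.get(key, key)


data = [
    Evaluation("phone-1", "A", "Screen", -4),
    Evaluation("phone-1", "B", "display", 4),
    Evaluation("phone-1", "A", "battery life", 5),
]
clusters = [cluster_of(e.element) for e in data]
print(clusters)  # ['screen', 'screen', 'battery']
```

With the elements clustered, "Screen" and "display" contribute to the same second evaluation score in step 103.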
As an example, the acquiring of the evaluation data of the user includes:
capturing an evaluation text of a user;
and identifying the evaluation text, extracting the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements contained in the evaluation text, and generating the evaluation data according to the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements.
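The identification step can be pictured with a toy lexicon-based extractor. This is only a sketch under strong assumptions: the source does not say how elements and scores are recognized, and the word lists `ELEMENTS` and `SENTIMENT` and the clause-splitting rule are invented for illustration.

```python
import re

# Hypothetical lexicons (assumptions for illustration only).
ELEMENTS = {"battery", "screen"}
SENTIMENT = {"great": 2, "good": 1, "poor": -1, "terrible": -2}


def extract(text):
    """Return (element, first_score) pairs found in each clause of the text."""
    pairs = []
    for clause in re.split(r"[.,;!?]", text.lower()):
        words = re.findall(r"[a-z]+", clause)
        elements = [w for w in words if w in ELEMENTS]
        # The clause score is the sum of the sentiment words it contains.
        score = sum(SENTIMENT.get(w, 0) for w in words)
        pairs.extend((element, score) for element in elements)
    return pairs


print(extract("The screen is great, but the battery is poor."))
# [('screen', 2), ('battery', -1)]
```

A practical system would need a far more robust recognizer; the point here is only the shape of the output, one first evaluation score per evaluation element.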
As an example, clustering the evaluation elements and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain the second evaluation score of each evaluation element cluster includes:
dividing the obtained evaluation data into a plurality of sets, wherein the evaluation elements in each set belong to the same cluster of the same product;
in each set, establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters;
and iterating the second evaluation score estimate and the user's scoring weight estimate until the objective value of the maximum likelihood estimation equation is maximized and converges, and taking the resulting second evaluation score estimate as the second evaluation score of the evaluation element cluster corresponding to the set.
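The dividing step can be sketched as a grouping by (product, cluster). The record layout and sample values below are assumptions for illustration.

```python
from collections import defaultdict

# (product, user, cluster, first_score) records; the values are made up.
records = [
    ("phone-1", "A", "screen", -4),
    ("phone-1", "B", "screen", 4),
    ("phone-1", "A", "battery", 5),
    ("phone-2", "A", "screen", 2),
]


def divide_into_sets(records):
    """Group records so that each set holds one cluster of one product."""
    sets = defaultdict(list)
    for product, user, cluster, score in records:
        sets[(product, cluster)].append((user, score))
    return dict(sets)


sets = divide_into_sets(records)
print(sets[("phone-1", "screen")])  # [('A', -4), ('B', 4)]
```

Each resulting set contains exactly the first evaluation scores that feed one second evaluation score.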
As an example, the maximum likelihood estimation equation is:
[Equation image BDA0001078790540000061, not reproduced in this text]
The iteration parameters are:
[Equation images BDA0001078790540000062 and BDA0001078790540000063, not reproduced in this text]
where r_{ijk} is the first evaluation score given by user j to the evaluation elements of cluster k of product i; the symbol in image BDA0001078790540000064 is the scoring weight of a user who evaluates the elements of cluster k; the symbol in image BDA0001078790540000065 is the estimate of that scoring weight; q_{ik} is the second evaluation score of evaluation element cluster k of product i; and the symbol in image BDA0001078790540000066 is the estimate of that second evaluation score.
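The equations themselves are given only as images that are not reproduced in this text, so the following is not the patent's formula. It is a sketch under an assumed model: each first evaluation score r_ijk is treated as Gaussian around the second evaluation score q_ik with precision equal to the user's scoring weight w_jk. Under that assumption, maximizing the likelihood alternately gives q_ik as a weighted average of the first scores and w_jk as the inverse of the user's mean squared deviation. The function name, the constant `eps` (added to keep weights finite, which departs from a pure maximum likelihood update) and the sample data are all hypothetical.

```python
def estimate_cluster_scores(ratings, iterations=100, tol=1e-9, eps=1e-3):
    """Alternate between weighted-average scores and per-user weights.

    ratings: dict mapping (product, user) -> first evaluation score,
             all for one evaluation element cluster k.
    Returns (q, w): q[product] is the second evaluation score estimate,
    w[user] is the scoring weight estimate.
    """
    products = sorted({p for p, _ in ratings})
    users = sorted({u for _, u in ratings})
    w = {u: 1.0 for u in users}   # start from equal weights
    q = {p: 0.0 for p in products}
    for _ in range(iterations):
        # Score update: weighted average of first scores for each product.
        new_q = {}
        for p in products:
            num = sum(w[u] * r for (pp, u), r in ratings.items() if pp == p)
            den = sum(w[u] for (pp, u) in ratings if pp == p)
            new_q[p] = num / den
        # Weight update: inverse of the user's mean squared deviation.
        for u in users:
            devs = [(r - new_q[p]) ** 2
                    for (p, uu), r in ratings.items() if uu == u]
            w[u] = len(devs) / (sum(devs) + eps)
        converged = max(abs(new_q[p] - q[p]) for p in products) < tol
        q = new_q
        if converged:
            break
    return q, w


# Made-up data: users A and B roughly agree, user C deviates strongly.
ratings = {
    ("p1", "A"): 4, ("p1", "B"): 4, ("p1", "C"): -2,
    ("p2", "A"): 1, ("p2", "B"): 2, ("p2", "C"): 5,
}
q, w = estimate_cluster_scores(ratings)
print(q, w)
```

In this sketch a user whose scores sit far from the consensus receives a small weight, so the second evaluation score moves away from the plain arithmetic mean toward the scores of the more consistent users.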
Further, after obtaining the second evaluation score of each evaluation element cluster, the method further includes:
screening out a plurality of products that include evaluation elements of the same cluster;
and ranking the plurality of products according to the second evaluation score of the evaluation element cluster.
Thus, when a user faces a plurality of products that include evaluation elements of the same cluster, the user can select one of them according to the ranking result: the higher a product's second evaluation score, the better that evaluation element cluster is rated for the product. For example, suppose the user cares about the evaluation element "battery", and mobile phone products A, B and C all include evaluation elements of the "battery" cluster. Products A, B and C are ranked according to the obtained second evaluation scores for the user's reference, and the user can select the product with the highest second evaluation score, because the highest second evaluation score indicates that the product performs best on that evaluation element cluster.
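The screening and ranking can be sketched as a filter followed by a sort. The product names and score values are made up, and the function name is hypothetical.

```python
# Hypothetical second evaluation scores per product and cluster.
second_scores = {
    "phone A": {"battery": 3.2, "screen": 1.0},
    "phone B": {"battery": 4.1},
    "phone C": {"screen": 2.5},
}


def rank_by_cluster(second_scores, cluster):
    """Keep products that have the cluster, then sort by its score, best first."""
    candidates = [(product, scores[cluster])
                  for product, scores in second_scores.items()
                  if cluster in scores]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


print(rank_by_cluster(second_scores, "battery"))
# [('phone B', 4.1), ('phone A', 3.2)]
```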
Further, after obtaining the second evaluation score of each evaluation element cluster, the method further includes:
calculating an average of the second evaluation scores of different evaluation element clusters of the same product to obtain a third evaluation score of the product;
and ranking a plurality of products by using the third evaluation score.
The third evaluation score of a product does not depend on a single evaluation element cluster; it integrates the results obtained for multiple evaluation element clusters. The products are ranked by their third evaluation scores, so that a user facing a plurality of products can choose according to the third evaluation score: the higher a product's third evaluation score, the better its overall performance.
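A minimal sketch of the third evaluation score, assuming the "average value" is a plain arithmetic mean over a product's cluster scores (the source does not specify otherwise). The names and values are hypothetical.

```python
# Hypothetical second evaluation scores per product and cluster.
second_scores = {
    "phone A": {"battery": 3.0, "screen": 1.0},
    "phone B": {"battery": 4.0, "screen": -1.0},
}


def third_scores(second_scores):
    """Average the second evaluation scores of each product's clusters."""
    return {product: sum(scores.values()) / len(scores)
            for product, scores in second_scores.items() if scores}


third = third_scores(second_scores)
ranking = sorted(third, key=third.get, reverse=True)
print(third)    # {'phone A': 2.0, 'phone B': 1.5}
print(ranking)  # ['phone A', 'phone B']
```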
Further, the evaluation data further includes a fourth evaluation score of the product. Just as the first evaluation score is the user's initial evaluation score for an evaluation element in the evaluation data, the fourth evaluation score is the user's initial evaluation score for the product as a whole in the evaluation data. After obtaining the third evaluation score of the product, the method further includes:
calculating an average of the fourth evaluation score and the third evaluation score of the same product to obtain a fifth evaluation score of the product;
and ranking a plurality of products by using the fifth evaluation score.
The evaluation score of a product can thus be calculated not only from the second evaluation scores of its different evaluation element clusters but also in combination with the users' initial evaluation scores for the product as a whole, which can further improve the accuracy of the product's evaluation score. The resulting fifth evaluation score does not depend on a single evaluation element cluster; it integrates the results of multiple evaluation element clusters and the users' overall evaluation. The products are ranked by their fifth evaluation scores, so that a user facing multiple products can choose according to the fifth evaluation score: the higher a product's fifth evaluation score, the better its overall performance.
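The fifth evaluation score can be sketched in the same way, again assuming a plain arithmetic mean of the third and fourth evaluation scores and that both are on the same scale. The score values are made up.

```python
# Hypothetical third scores (from cluster scores) and fourth scores
# (users' initial scores for the whole product).
third = {"phone A": 2.0, "phone B": 1.5}
fourth = {"phone A": 1.0, "phone B": 3.5}


def fifth_scores(third, fourth):
    """Average the third and fourth scores for products present in both."""
    return {product: (third[product] + fourth[product]) / 2
            for product in third if product in fourth}


fifth = fifth_scores(third, fourth)
ranking = sorted(fifth, key=fifth.get, reverse=True)
print(fifth)    # {'phone A': 1.5, 'phone B': 2.5}
print(ranking)  # ['phone B', 'phone A']
```

In this made-up case the overall user score reverses the order that the third evaluation score alone would give.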
Example two
The present embodiment further provides an evaluation data analysis device, as shown in fig. 2, the present embodiment includes:
an acquisition module 21, configured to acquire evaluation data of a user, where the evaluation data includes evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
a clustering module 22, configured to cluster the evaluation elements;
a calculating module 23, configured to perform parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, where the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
In this embodiment, the bias in a user's scoring is captured by the user's scoring weight. Parameter estimation is performed on the evaluation elements of each cluster by using the first evaluation score to obtain the second evaluation score of each evaluation element cluster, and the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
Further, as shown in fig. 2, the evaluation data analysis device further includes:
an input module 20, configured to provide evaluation data of a user to the obtaining module 21;
and the output module 24 is used for outputting the calculation result of the calculation module 23.
Further, as shown in fig. 3, the apparatus further includes:
a grabbing module 25, configured to capture an evaluation text of a user;
the identification module 26 is configured to identify the evaluation text, extract an evaluation element of a product and a first evaluation score of the user for the evaluation element included in the evaluation text, and generate the evaluation data according to the evaluation element of the product and the first evaluation score of the user for the evaluation element.
Further, as shown in fig. 4, the calculating module 23 includes:
a dividing unit 231, configured to divide the acquired evaluation data into multiple sets, where evaluation elements in each set belong to the same cluster of the same product;
an equation establishing unit 232, configured to establish, in each set, a maximum likelihood estimation equation using the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters;
and a solving unit 233, configured to iterate the second evaluation score estimate and the user's scoring weight estimate until the objective value of the maximum likelihood estimation equation is maximized and converges, and to use the resulting second evaluation score estimate as the second evaluation score of the evaluation element cluster corresponding to the set.
EXAMPLE III
The present embodiment further provides an evaluation data analysis system 50, as shown in fig. 5, the present embodiment includes:
a network interface 51, a processor 52, an input device 53, a memory 54, a hard disk 55, and a display device 56. The various interfaces and devices described above may be interconnected by a bus architecture. A bus architecture may be any architecture that may include any number of interconnected buses and bridges. Various circuits of one or more Central Processing Units (CPUs), represented in particular by processor 52, and one or more memories, represented by memory 54, are coupled together. The bus architecture may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like. It will be appreciated that a bus architecture is used to enable communications among the components. The bus architecture includes a power bus, a control bus, and a status signal bus, in addition to a data bus, all of which are well known in the art and therefore will not be described in detail herein.
The network interface 51 may be connected to a network (e.g., the internet, a local area network, etc.), and may acquire relevant data from the network, such as a user's rating text, and may store the relevant data in the hard disk 55.
The input device 53 may receive various commands input by an operator and send the commands to the processor 52 for execution. The input device 53 may include a keyboard or a pointing device (e.g., a mouse, a trackball, a touch pad, a touch screen, or the like).
The display device 56 may display the result of the instructions executed by the processor 52. The display device 56 may include a display, a projection device, and the like.
The memory 54 is used for storing programs and data necessary for operating the operating system, and data such as intermediate results in the calculation process of the processor 52.
It will be appreciated that memory 54 in embodiments of the invention may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), or a flash memory. Volatile memory can be Random Access Memory (RAM), which acts as external cache memory. The memory 54 of the apparatus and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 54 stores the following elements, executable modules or data structures, or a subset or an extended set thereof: an operating system 541 and application programs 542.
The operating system 541 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs 542 include various application programs such as a Browser (Browser) and the like for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in the application program 542.
When invoking and executing the application programs and data stored in the memory 54, specifically the programs or instructions stored in the application programs 542, the processor 52 may obtain evaluation data of the user, cluster the evaluation elements, and perform parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, where the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of each cluster and the scoring weights of the users.
The method disclosed by the above embodiment of the present invention can be applied to the processor 52, or implemented by the processor 52. Processor 52 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 52. The processor 52 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as random access memory, flash memory, read only memory, programmable read only memory, erasable programmable read only memory, or registers. The storage medium is located in the memory 54, and the processor 52 reads the information in the memory 54 and performs the steps of the above method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Further, the processor 52 is also configured to capture the evaluation text of the user, identify the evaluation text, and extract the evaluation elements of the product contained in the evaluation text and the first evaluation scores of the user on the evaluation elements, so as to generate the evaluation data.
Specifically, the processor 52 divides the acquired evaluation data into a plurality of sets, where the evaluation elements in each set belong to the same cluster of the same product; in each set, establishes a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters; and iterates the estimated second evaluation score and the estimated scoring weight of the user to maximize the target value of the maximum likelihood estimation equation until convergence, taking the resulting estimate as the second evaluation score of the evaluation element cluster corresponding to the set.
Optionally, the processor 52 screens out a plurality of products that include the same evaluation element cluster, and ranks the plurality of products according to the second evaluation score of that evaluation element cluster.
Optionally, the processor 52 calculates an average of the second evaluation scores of the different evaluation element clusters of the same product to obtain a third evaluation score of the product, and ranks a plurality of products using the third evaluation score.
Optionally, the processor 52 calculates an average of the fourth evaluation score and the third evaluation score of the same product to obtain a fifth evaluation score of the product, and ranks a plurality of products using the fifth evaluation score.
In this embodiment, the bias in user scoring is represented by the scoring weight of the user. Parameter estimation is performed on the evaluation elements of each cluster using the first evaluation scores to obtain the second evaluation score of each evaluation element cluster, where the second evaluation score is a weighted average obtained from the first evaluation scores of the evaluation elements of the cluster and the scoring weights of the users.
Example four
As shown in Fig. 6, the evaluation data analysis method of this embodiment includes the following steps:
Step 601: capturing an evaluation text of a user;
Specifically, the evaluation text of the user may be captured from a variety of data sources, including but not limited to product information pages, communities, blog articles, product news, forums, and the like. The evaluation text refers to a product or a certain feature of the product, i.e., an evaluation element, and also contains the content of the evaluation of that element and an emotional tendency, which may be expressed as the initial evaluation score (i.e., the first evaluation score) of the user on the evaluation element. In addition, the evaluation text may include other information such as user information, time information, and evaluation usefulness information.
Illustratively, one captured piece of user evaluation text is: "The buttons of the mobile phone are good and responsive. However, the battery heating problem is serious, and the memory card is not universally compatible; purchase is not recommended. The corresponding star ratings are: memory card -5, system fluency +4, operability +1." Here, +N represents a positive evaluation, 0 represents a neutral evaluation, and -N represents a negative evaluation, where N is a positive integer. When analyzing users' views on an evaluation element, comprehensively analyzing the initial evaluation scores of many users is clearly more meaningful than analyzing the initial evaluation score of a single user. Therefore, as much user evaluation text as possible should be captured for analysis.
Step 602: identifying an evaluation text of a user, and extracting evaluation elements of a product and initial evaluation values of the user on the evaluation elements;
the evaluation element analysis is carried out on the evaluation text of the user, and the extracted evaluation data can be { product, evaluation element, emotion tendency } or { evaluation element, emotion tendency }, wherein the product and the evaluation element are nouns, the emotion tendency is a score, the value range of the score is { -1,0, +1} or { -5, -4 … +4, +5} and the like, the meaning of the emotion tendency is positive emotion, negative emotion and neutral emotion (also called mixed emotion), for example, { -1,0, +1} wherein, +1 represents positive emotion, -1 represents negative emotion, and 0 represents neutral emotion; of { -5, -4 … +4, +5} -5 represents an extremely negative emotion and-1 represents a slightly negative emotion, etc. This embodiment collectively refers to the above-described scores as initial evaluation scores of evaluation elements by the user.
The method for extracting the evaluation elements from the evaluation text of the user includes, but is not limited to: sequence annotation based methods, topic model based methods, dictionary based methods, syntax based methods, and the like. The method for extracting the initial evaluation score of the user on the evaluation element from the evaluation text of the user includes but is not limited to: supervised learning methods, dictionary-based methods, document set-based methods, etc.
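As a rough illustration of the dictionary-based option mentioned above, the following sketch builds {product, evaluation element, emotional tendency} records from one review. The lexicons `ELEMENT_TERMS` and `SENTIMENT_TERMS`, the record type, and the sentence-level matching rule are all hypothetical choices made for this example; the patent does not prescribe them.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationRecord:
    product: str
    element: str
    score: int  # initial (first) evaluation score, here limited to {-1, 0, +1}


# Hypothetical lexicons; a real system would use far larger dictionaries.
ELEMENT_TERMS = ("buttons", "battery", "memory card")
SENTIMENT_TERMS = {"good": 1, "responsive": 1, "serious": -1}


def extract_records(product, text):
    """Pair each element term with the clamped sentiment of its sentence."""
    records = []
    for sentence in re.split(r"[.!?;]", text.lower()):
        elements = [term for term in ELEMENT_TERMS if term in sentence]
        polarity = sum(v for term, v in SENTIMENT_TERMS.items() if term in sentence)
        score = max(-1, min(1, polarity))  # clamp to the range {-1, 0, +1}
        records.extend(EvaluationRecord(product, e, score) for e in elements)
    return records
```

Substring matching is deliberately naive here; sequence labeling or syntax-based extraction, also listed above, would handle negation and multi-word expressions more reliably.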
Step 603: clustering the evaluation elements;
When analyzing the simulated real evaluation score (i.e., the second evaluation score) of the evaluation elements, the extracted evaluation elements are first clustered, with the evaluation elements in each cluster belonging to the same product, and the simulated real evaluation score of each cluster is then analyzed cluster by cluster. For example, if the evaluation elements extracted for a mobile phone product include "battery endurance" and "battery heating problem", two small clusters "battery endurance" and "battery heating problem" may be established according to the application purpose; alternatively, both evaluation elements may be grouped into a larger cluster "battery". Illustratively, the evaluation element clusters of a digital camera are "appearance", "battery", "attachment", "image quality", "lens", "functionality", "operability", "cost performance", and "memory card".
The bias of a user in evaluating evaluation elements (i.e., the scoring weight of the user) is typically reflected in a certain characteristic of products rather than in products as a whole. For example, mobile phones and digital cameras are both electronic products and share evaluation element clusters such as "battery", "appearance", and "operability". If user A attaches more importance to the battery performance of electronic products, user A attaches more importance to the "battery" of both mobile phones and digital cameras, but not necessarily to their "appearance", "operability", and the like. Therefore, clustering the evaluation elements and analyzing the simulated real evaluation scores by cluster makes it possible to obtain an evaluation closer to the user's real opinion of a product or of a certain characteristic of the product.
The method for clustering the evaluation elements includes, but is not limited to: a domain prior knowledge based method, a topic model based method, a general text clustering method such as a K-Means algorithm, etc. The clustered result can be regarded as a plurality of sets, each set comprises a plurality of evaluation data, and evaluation elements of the evaluation data in each set belong to the same cluster of the same product.
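The partition into sets can be sketched as follows, assuming the simplest of the listed options: a hand-written prior-knowledge mapping from evaluation element to cluster. The mapping `ELEMENT_TO_CLUSTER` and the tuple layout of the records are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical domain prior knowledge: evaluation element -> cluster name.
ELEMENT_TO_CLUSTER = {
    "battery endurance": "battery",
    "battery heating problem": "battery",
    "buttons": "operability",
}


def partition_by_cluster(records):
    """Group (product, user, element, score) records into one set per
    (product, cluster) pair, as described in the text."""
    sets = defaultdict(list)
    for product, user, element, score in records:
        # Elements missing from the mapping form a cluster of their own.
        cluster = ELEMENT_TO_CLUSTER.get(element, element)
        sets[(product, cluster)].append((user, score))
    return dict(sets)
```

A topic model or K-Means over text features could replace the lookup table without changing the shape of the output.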
Step 604: calculating the simulated real evaluation score of each evaluation element cluster.
The initial evaluation score given by user $u_j$ to evaluation element $a_k$ of product $o_i$ is denoted $r_{ijk}$. $r_{ijk}$ is modeled as a random variable whose parameters are the "simulated real evaluation score" $q_{ik}$ and the scoring weight of the user (symbol given in Figure BDA0001078790540000131).
The present embodiment assumes that $r_{ijk}$ obeys a normal distribution:
[Equation: Figure BDA0001078790540000141]
The maximum likelihood estimation equation for the parameters $q$ ($q_{ik}$) and $\sigma$ ($\sigma_{jk}$) is:
[Equation: Figure BDA0001078790540000142]
To solve the equation, the parameter estimates (Figures BDA0001078790540000143 and BDA0001078790540000144) are iterated to maximize the target value until convergence, and the resulting estimate of the simulated real evaluation score $q_{ik}$ (Figure BDA0001078790540000145) is taken as the simulated real evaluation score of the evaluation element cluster.
Here, the estimate in Figure BDA0001078790540000146 is the weighted average of the scores of all users on evaluation element cluster $a_k$:
[Equation: Figure BDA0001078790540000147]
Meanwhile, the scoring weight of the user is a normalization of the user's bias, expressed as the variance of the initial evaluation score relative to the "real evaluation score":
[Equation: Figure BDA0001078790540000148]
In the above parameters, the set $r_{i*k}$ represents the scores of all users on evaluation element cluster $a_k$ of product $o_i$; the set $r_{*jk}$ represents all scores given by user $u_j$ on evaluation element cluster $a_k$; and the quantity in Figure BDA0001078790540000149 is the estimated scoring weight of user $u_j$ for the $k$-th evaluation element cluster.
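Because the equation images are not reproduced in this text, the following is only a hedged reconstruction of what the stated assumptions imply, not a transcription of the patent's formulas. It assumes that $r_{ijk}$ is normally distributed with mean $q_{ik}$ and a per-user, per-cluster standard deviation $\sigma_{jk}$; the patent's figures may use a different but equivalent parameterization of the scoring weight.

```latex
% Assumed model: each initial score is a noisy observation of the cluster score.
r_{ijk} \sim \mathcal{N}\!\left(q_{ik},\, \sigma_{jk}^{2}\right)

% Log-likelihood over all observed scores (additive constants dropped).
\ell(q, \sigma) = \sum_{(i,j,k)} \left[ -\log \sigma_{jk}
    - \frac{\left(r_{ijk} - q_{ik}\right)^{2}}{2\,\sigma_{jk}^{2}} \right]

% Setting \partial \ell / \partial q_{ik} = 0: an inverse-variance weighted
% average over the users j whose scores appear in r_{i*k}.
\hat{q}_{ik} = \frac{\sum_{j} r_{ijk} / \hat{\sigma}_{jk}^{2}}
                    {\sum_{j} 1 / \hat{\sigma}_{jk}^{2}}

% Setting \partial \ell / \partial \sigma_{jk} = 0: the mean squared deviation
% of user j's scores in r_{*jk} from the estimated cluster scores.
\hat{\sigma}_{jk}^{2} = \frac{1}{\lvert r_{*jk} \rvert}
    \sum_{i} \left(r_{ijk} - \hat{q}_{ik}\right)^{2}
```

Under this reading, a user whose scores deviate strongly from the estimated cluster scores receives a large variance and therefore a small weight in the weighted average, which matches the description of the scoring weight given in the text.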
Through the steps 601-604, the simulated real evaluation score of each evaluation element cluster can be obtained.
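The iteration in step 604 can be sketched as alternating updates, assuming the normal model stated above with a per-user, per-cluster variance. The function name `estimate_cluster_scores`, the tuple layout of `ratings`, and the variance floor `eps` are assumptions made for this sketch. The floor in particular is not from the source: it is added because the likelihood can grow without bound as a user's variance approaches zero.

```python
from collections import defaultdict


def estimate_cluster_scores(ratings, n_iter=100, tol=1e-9, eps=1e-6):
    """ratings: list of (product, user, cluster, score) tuples.

    Returns (q, var), where q[(product, cluster)] is the estimated simulated
    real evaluation score and var[(user, cluster)] is the estimated variance
    of that user's scores around the cluster scores.
    """
    if not ratings:
        return {}, {}
    by_ik = defaultdict(list)  # (product, cluster) -> [(user, score)]
    by_jk = defaultdict(list)  # (user, cluster) -> [(product, score)]
    for product, user, cluster, score in ratings:
        by_ik[(product, cluster)].append((user, float(score)))
        by_jk[(user, cluster)].append((product, float(score)))

    # Start from the unweighted mean of each (product, cluster) set.
    q = {ik: sum(s for _, s in rs) / len(rs) for ik, rs in by_ik.items()}
    var = {jk: 1.0 for jk in by_jk}

    for _ in range(n_iter):
        # Update each user's variance given the current cluster scores.
        for (user, cluster), rs in by_jk.items():
            mean_sq = sum((s - q[(p, cluster)]) ** 2 for p, s in rs) / len(rs)
            var[(user, cluster)] = max(mean_sq, eps)  # floor is an assumption
        # Update each cluster score as an inverse-variance weighted average.
        new_q = {}
        for (product, cluster), rs in by_ik.items():
            num = sum(s / var[(u, cluster)] for u, s in rs)
            den = sum(1.0 / var[(u, cluster)] for u, _ in rs)
            new_q[(product, cluster)] = num / den
        shift = max(abs(new_q[ik] - q[ik]) for ik in q)
        q = new_q
        if shift < tol:
            break
    return q, var
```

With two users who agree on every product and a third who scores erratically, the estimate moves away from the plain mean toward the consistent users, because the erratic user's variance grows and that user's weight shrinks.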
Further, after the simulated real evaluation score of each evaluation element cluster is obtained, it can be applied in the following scenarios:
Scenario one: screening out a plurality of products that include the same evaluation element cluster, and ranking the products according to the simulated real evaluation score of that cluster.
For example, if mobile phone product A, mobile phone product B, and mobile phone product C all include the evaluation element cluster "battery", the simulated real evaluation score of the "battery" of each mobile phone product can be calculated through steps 601-604, and the three products can be ranked according to the obtained scores for the user's reference.
As a further example, if mobile phone product A, mobile phone product B, and digital camera product D all include the evaluation element cluster "battery", the simulated real evaluation score of the "battery" of each electronic product can be calculated through steps 601-604, and the three products can be ranked according to the obtained scores for the user's reference.
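Scenario one amounts to filtering by cluster and sorting. A minimal sketch, assuming the simulated real evaluation scores are held in a dictionary keyed by (product, cluster); the function name and the sample scores in the usage note are hypothetical.

```python
def rank_products_by_cluster(q, cluster):
    """Return (product, score) pairs for one cluster, best score first.

    q maps (product, cluster) to a simulated real evaluation score.
    """
    scored = [(product, score) for (product, c), score in q.items() if c == cluster]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For instance, with made-up scores `{("A", "battery"): 3.2, ("B", "battery"): 4.1, ("D", "battery"): -1.0}`, the call `rank_products_by_cluster(q, "battery")` places product B first and product D last.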
Scenario two: calculating an average of the simulated real evaluation scores of the different evaluation element clusters of the same product to obtain the simulated real evaluation score of the product, and ranking the products according to their simulated real evaluation scores.
For example, the evaluation element clusters of mobile phone product A include "screen", "battery", and "operation fluency". The simulated real evaluation scores of these three clusters can be calculated through steps 601-604, and their arithmetic or weighted average gives the simulated real evaluation score of mobile phone product A. The simulated real evaluation scores of mobile phone product B and mobile phone product C can be obtained in the same way, and the three products can then be ranked by these scores for the user's reference.
Specifically, the arithmetic mean may be calculated by the following formula:
[Equation: Figure BDA0001078790540000151]
The weighted average may be calculated by the following formula:
[Equation: Figure BDA0001078790540000152]
where the set $q_{i*}$ represents the simulated real evaluation scores of product $o_i$ on its individual evaluation element clusters.
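Both averages can be sketched in a few lines. The function name `product_score` and the per-cluster `weights` dictionary are hypothetical; the source does not state how the weights of the weighted average are chosen, so they are simply passed in here.

```python
def product_score(q, product, weights=None):
    """Average a product's cluster scores into one product-level score.

    q maps (product, cluster) to a simulated real evaluation score.
    weights optionally maps cluster to a non-negative weight (an assumption
    of this sketch); without it, the arithmetic mean is returned.
    """
    scores = {c: s for (p, c), s in q.items() if p == product}
    if not scores:
        raise ValueError(f"no cluster scores for product {product!r}")
    if weights is None:
        return sum(scores.values()) / len(scores)
    total = sum(weights[c] for c in scores)
    return sum(weights[c] * s for c, s in scores.items()) / total
```

Raising an error for a product with no cluster scores is a design choice of the sketch; returning a neutral score would be an equally plausible convention.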
Scenario three: calculating an average of the initial evaluation score and the simulated real evaluation score of the same product to obtain a mixed evaluation score of the product, and ranking the products according to their mixed evaluation scores.
For example, after the simulated real evaluation scores of the mobile phone product a, the mobile phone product B and the mobile phone product C are obtained, the mixed evaluation scores of the mobile phone product a, the mobile phone product B and the mobile phone product C can be obtained by calculating the average value by combining the initial evaluation scores of the mobile phone product a, the mobile phone product B and the mobile phone product C by the user, and then the mobile phone product a, the mobile phone product B and the mobile phone product C are sorted according to the mixed evaluation scores for the user to refer to.
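A minimal sketch of scenario three, assuming each product already has one overall initial score and one simulated real evaluation score. The mixing factor `alpha` is a hypothetical parameter: the default of 0.5 gives the plain average, and other values correspond to a weighted average that the source does not specify.

```python
def rank_by_mixed_score(initial, simulated, alpha=0.5):
    """Blend two product-level scores and rank products, best first.

    initial and simulated both map product -> score; only products present
    in both are ranked. alpha is the weight of the initial score.
    """
    common = initial.keys() & simulated.keys()
    mixed = {p: alpha * initial[p] + (1 - alpha) * simulated[p] for p in common}
    return sorted(mixed.items(), key=lambda pair: pair[1], reverse=True)
```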
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. An analysis method of evaluation data, comprising:
acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
clustering the evaluation elements;
performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average value obtained according to the first evaluation score of the evaluation element of each cluster and the scoring weight of the user;
wherein clustering the evaluation elements and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster includes:
dividing the obtained evaluation data into a plurality of sets, wherein the evaluation elements in each set belong to the same cluster of the same product;
in each set, establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters;
and iterating the estimated second evaluation score and the estimated scoring weight of the user to maximize the target value of the maximum likelihood estimation equation until convergence, and taking the obtained estimate of the second evaluation score as the second evaluation score of the evaluation element cluster corresponding to the set.
2. The method for analyzing evaluation data according to claim 1, wherein the acquiring evaluation data of the user specifically includes:
capturing an evaluation text of a user;
and identifying the evaluation text, and extracting the evaluation elements of the products contained in the evaluation text and the first evaluation scores of the user on the evaluation elements to generate the evaluation data.
3. The method for analyzing evaluation data according to claim 1 or 2, wherein after the second evaluation score of each evaluation element cluster is obtained, the method further comprises:
screening out a plurality of products comprising the same evaluation element cluster;
and ranking the products comprising the same evaluation element cluster according to the second evaluation score of the evaluation element cluster.
4. The method for analyzing evaluation data according to claim 1 or 2, wherein after the second evaluation score of each evaluation element cluster is obtained, the method further comprises:
calculating an average value according to second evaluation scores of different evaluation element clusters of the same product to obtain a third evaluation score of the product;
and ranking a plurality of products by using the third evaluation score.
5. The method for analyzing evaluation data according to claim 4, wherein the evaluation data further includes a fourth evaluation score of the product, and further includes, after obtaining the third evaluation score of the product:
calculating an average value according to a fourth evaluation score and a third evaluation score of the same product to obtain a fifth evaluation score of the product;
and ranking a plurality of products by using the fifth evaluation score.
6. An analysis apparatus for evaluating data, comprising:
the acquisition module is used for acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements;
the clustering module is used for clustering the evaluation elements;
the calculation module is used for carrying out parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, and the second evaluation score is a weighted average value obtained according to the first evaluation score of the evaluation elements of each cluster and the scoring weight of the user;
wherein the calculation module comprises:
the dividing unit is used for dividing the acquired evaluation data into a plurality of sets, and the evaluation elements in each set belong to the same cluster of the same product;
the equation establishing unit is used for establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters in each set;
and the solving unit is used for iterating the estimated second evaluation score and the estimated scoring weight of the user to maximize the target value of the maximum likelihood estimation equation until convergence, and taking the obtained estimate of the second evaluation score as the second evaluation score of the evaluation element cluster corresponding to the set.
7. The apparatus for analyzing evaluation data according to claim 6, further comprising:
the grabbing module is used for grabbing an evaluation text of a user;
and the identification module is used for identifying the evaluation text, extracting the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements contained in the evaluation text, and generating the evaluation data according to the evaluation elements of the product and the first evaluation scores of the user on the evaluation elements.
8. An analytical system for evaluating data, comprising: a network interface, a processor, an input device, a memory, and a display device interconnected by a bus architecture; wherein
the network interface is used for connecting to a network;
the input device is used for receiving an input instruction and sending the input instruction to the processor for execution;
the memory is used for storing an operating system, an application program and intermediate data in the calculation process of the processor;
the display device is used for displaying the result obtained by the processor;
the processor is used for acquiring evaluation data of a user, wherein the evaluation data comprises evaluation elements of a product and a first evaluation score of the user on the evaluation elements, clustering the evaluation elements, and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster, wherein the second evaluation score is a weighted average value obtained according to the first evaluation score of the evaluation elements of each cluster and the scoring weight of the user;
wherein clustering the evaluation elements and performing parameter estimation on the evaluation elements of each cluster by using the first evaluation score to obtain a second evaluation score of each evaluation element cluster includes:
dividing the obtained evaluation data into a plurality of sets, wherein the evaluation elements in each set belong to the same cluster of the same product;
in each set, establishing a maximum likelihood estimation equation with the second evaluation score of the corresponding evaluation element cluster and the scoring weight of the user as parameters;
and iterating the estimated second evaluation score and the estimated scoring weight of the user to maximize the target value of the maximum likelihood estimation equation until convergence, and taking the obtained estimate of the second evaluation score as the second evaluation score of the evaluation element cluster corresponding to the set.
CN201610670873.2A 2016-08-15 2016-08-15 Evaluation data analysis method, device and system Active CN107766316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610670873.2A CN107766316B (en) 2016-08-15 2016-08-15 Evaluation data analysis method, device and system

Publications (2)

Publication Number Publication Date
CN107766316A CN107766316A (en) 2018-03-06
CN107766316B true CN107766316B (en) 2021-03-30

Family

ID=61259906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610670873.2A Active CN107766316B (en) 2016-08-15 2016-08-15 Evaluation data analysis method, device and system

Country Status (1)

Country Link
CN (1) CN107766316B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant