CN106708868A - Method and system for analyzing internet data - Google Patents
Method and system for analyzing internet data
- Publication number
- CN106708868A CN201510784361.4A
- Authority
- CN
- China
- Prior art keywords
- product
- comment
- attribute
- weighted value
- eigenvalue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method and a system for analyzing internet data. The method includes: acquiring the attributes of products on the internet and the comments corresponding to the products; for each product, determining a first weight value for each comment according to the degree-of-attention information of the comments corresponding to the product; determining a second weight value for each attribute of the product according to the results of sentiment classification of the comments corresponding to that attribute; and combining the first weight value of each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product. The method and system address the problem that existing star-rating methods, which summarize comment data only by average values, produce inaccurate analysis results.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to an internet data analysis method and system.
Background
With the rapid development of electronic information, e-commerce websites of all kinds provide platforms for posting online comments. An e-commerce website typically uses star ratings to show the consumer's evaluation of the product as a whole, or of its different attributes, as expressed in an online comment. It then computes the average of the star ratings of all comments on the product and uses that average to represent the product's online evaluation result.
Representing the online evaluation result by the average star rating ignores the information contained in the comment text and the differences in usefulness between comments. The text of an online comment often refers to several product attributes, and an overall star rating alone does not reveal how the consumer evaluates each attribute. Consumers are also heterogeneous in their preferences: different consumers value different product attributes. With only the current average star rating, consumers cannot easily and quickly select products according to their preferences, and manufacturers cannot easily use the result to guide product improvement.
In summary, existing star-rating methods, which report comment data only as an average value, suffer from inaccurate analysis results.
Summary of the invention
Embodiments of the present invention provide an internet data analysis method and device, to solve the problem that existing star-rating methods, which report comment data only as an average value, produce inaccurate analysis results.
An embodiment of the present invention provides an internet data analysis method. The method includes: obtaining the attributes of products on the internet and the comments corresponding to the products; for a product, determining a first weight value for each comment according to the degree-of-attention information of each comment corresponding to the product, and determining a second weight value for each attribute of the product according to the result of sentiment classification of the comments corresponding to each attribute of the product; and combining the first weight value of each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
Based on the same inventive concept, an embodiment of the present invention further provides an internet data analysis system. The system includes: an acquiring unit, configured to obtain the attributes of products on the internet and the comments corresponding to the products; a first determining unit, configured to, for a product, determine a first weight value for each comment according to the degree-of-attention information of each comment corresponding to the product, and determine a second weight value for each attribute of the product according to the result of sentiment classification of the comments corresponding to each attribute of the product; and a second determining unit, configured to combine the first weight value of each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
The embodiment of the present invention obtains the attributes of products on the internet and the comments corresponding to the products. For a product, it determines, on the one hand, a first weight value for each comment according to the degree-of-attention information of each comment corresponding to the product and, on the other hand, a second weight value for each attribute of the product according to the result of sentiment classification of the comments corresponding to each attribute. Finally, it combines the first weight value of each comment with the second weight value of each attribute to determine a data analysis result for the comments on the product. The embodiment thus assigns different levels of weight to different comments, parses the comment text to derive a weight for each attribute of each product, and combines the two weights on the basis of the existing comment data. This makes the analysis result of the comment data more accurate, which helps guide consumers in selecting products and manufacturers in improving them.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an internet data analysis method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a vector machine model provided by an embodiment of the present invention;
Fig. 3 is a schematic architecture diagram of an internet data analysis system provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, an embodiment of the present invention provides a schematic flowchart of an internet data analysis method. The method includes:
Step S101: obtain the attributes of products on the internet and the comments corresponding to the products.
Step S102: for a product, determine a first weight value for each comment according to the degree-of-attention information of each comment corresponding to the product, and determine a second weight value for each attribute of the product according to the result of sentiment classification of the comments corresponding to each attribute of the product.
Step S103: combine the first weight value of each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
For example, an internet e-commerce merchant sells goods online through a network platform, and consumers can post comments on their own orders. Comment content usually touches on several aspects of the product, such as quality, size, and logistics, and the merchant ultimately labels the order as either a positive or a negative review according to the consumer's star rating. In step S101, the embodiment of the present invention actively obtains all product types in the data to be analyzed and the attributes of each product (for a mobile phone, for example, price, model, and battery standby capability), together with the comment data of all orders for the product.
Some order comments are detailed and have great reference value for other users, so such comments should be given a certain weight. For example, a consumer surnamed Li posts a comment online on a mobile phone order; the comment covers the trial experience of the phone, its cost-performance ratio, its radiation power, and other aspects, and also includes pictures. This comment therefore has high reference value, and other consumers who find it helpful can give it a thumbs-up. To use degree-of-attention information, such as the number of thumbs-up and the total number of comments, as a factor in the product's evaluation result, the embodiment of the present invention introduces the first weight value. Specifically, for a product, the degree-of-attention information includes the total number of comments on the product and the support score of each comment, and the first weight value satisfies the following formula:
[Formula 1; equation missing from this text]
where the left-hand side of Formula 1 is the first weight value of the i-th comment, $HVs(v_i)$ is the support score of the i-th comment, $p$ is the total number of comments on the product, and $\lambda$ is the weighting factor of the i-th comment, usually set to 1.
After a weight has been assigned to each comment with the above formula, a row matrix of size $1 \times p$ is obtained.
Alternatively, the first weight of each comment can be determined directly from a correspondence between the support score and the first weight for the product.
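Because Formula 1 does not survive in this text, the following is only a minimal sketch of how a first weight could be derived from support scores. It assumes an additively smoothed, normalized form, $(HVs(v_i) + \lambda) / (\sum_k HVs(v_k) + \lambda p)$, which is a guess consistent with the symbols defined above and not the patent's confirmed formula; the function name `first_weights` is hypothetical.

```python
def first_weights(support_scores, lam=1.0):
    """Return one weight per comment (a 1 x p row); the weights sum to 1.

    ASSUMPTION: smoothed normalization of support scores. The patent's
    Formula 1 is missing from the text, so this form is illustrative only.
    """
    p = len(support_scores)
    if p == 0:
        return []
    total = sum(support_scores) + lam * p
    if total == 0:
        # No votes and no smoothing: fall back to equal weights.
        return [1.0 / p] * p
    return [(s + lam) / total for s in support_scores]


# Three comments with 12, 0 and 3 thumbs-up respectively.
weights = first_weights([12, 0, 3])
```

With $\lambda = 1$, a comment with no thumbs-up still receives a small non-zero weight, which is one reason a smoothing term of this kind might be used.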
In addition, comments posted by users of different levels differ in usefulness, so a weight value can also be assigned according to the user's level, which distinguishes the usefulness of comments posted by different users.
An online comment on a product usually involves several attributes of the product, and a comment may approve of one attribute while disapproving of another. Judging a comment as positive or negative from the product's star rating alone is therefore inappropriate. The embodiment of the present invention accordingly generates an evaluation matrix for the product from the attributes of the product and the comments on each attribute.
Specifically, text mining is used to extract the product attributes mentioned in the comments. Let $A = (A_1, A_2, \ldots, A_n)$ be the set of product attributes. Semantic analysis is then applied together with the star rating of the comment: three stars and above represent a positive review, one star represents a negative review, and the rest are treated as neutral. The sentiment orientation of a comment is thus divided into three levels (positive, neutral, and negative), represented by 1, 0, and -1 respectively. This yields the attribute evaluation matrix of a product, as shown in Table 1.
Table 1: [table missing from this text]
As Table 1 shows, some comment texts do not mention certain attributes of the product, so those attributes have null values in the evaluation matrix. To facilitate subsequent matrix operations involving other factors, the embodiment of the present invention fills the null values with default values. Specifically, if some attributes of the product are not commented on, a default evaluation value is derived from the comment on the product and used as the default value for the uncommented attributes, so that the evaluation matrix for the product can be generated from the attributes of the product and the comments on each attribute.
For example, null values are filled as follows: a mapping function maps the star rating given by the comment (1 to 5 stars) into the range -1 to 1, and the mapped value is inserted into all missing entries of the same row. The mapping function is:
[Formula 2; equation missing from this text]
where Score is the mapped value of the star rating and Rating is the original star rating. After this processing, the product attribute evaluation matrix is complete.
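The null-filling step can be sketched as follows. Formula 2 is not available in this text, so the sketch assumes the simplest linear map that sends 1 star to -1 and 5 stars to 1, namely Score = (Rating - 3) / 2; the function names and the sample matrix are hypothetical.

```python
def star_to_score(rating):
    """Map a 1-5 star rating linearly onto [-1, 1].

    ASSUMPTION: a linear map; the patent's Formula 2 may differ.
    """
    return (rating - 3) / 2


def fill_missing(matrix, star_ratings):
    """Replace None entries in each row with that comment's mapped star rating."""
    return [
        [star_to_score(stars) if cell is None else cell for cell in row]
        for row, stars in zip(matrix, star_ratings)
    ]


# Rows are comments, columns are attributes (e.g. price, model, battery).
# None marks an attribute the comment did not mention.
raw = [
    [1, None, -1],
    [None, 0, None],
]
filled = fill_missing(raw, star_ratings=[4, 1])
```

Here the first comment (4 stars) gets 0.5 in its missing cell, and the second comment (1 star) gets -1.0 in both of its missing cells.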
To further highlight the differences between the attributes of a product, the embodiment of the present invention assigns different weight values to the different attributes of each product on the basis of the evaluation matrix above. Specifically, for an attribute, a first feature value of each attribute for each comment on the product is determined according to the result of sentiment classification of the comments corresponding to each attribute of the product;
according to the first feature values, a second feature value of each attribute is determined for each positively rated comment, and a second feature value of each attribute is determined for each negatively rated comment;
the second weight value of each attribute of the product is determined according to the second feature values.
In practice, determining the second weight value involves the support vector machine model in Fig. 2 and a weight analysis model. The process is as follows:
Step 1: Text mining and semantic analysis yield the user's overall rating of the product, $y_i$ (the star rating of the comment), and the user's rating of each product attribute, $x_{ij}$. The relationship between the two is given by Formula 3:
[Formula 3; equation missing from this text]
where $y_i$ is the user's overall rating of the product, $w_j$ is the weight of each attribute of the product, and $x_{ij}$ is the user's rating of each product attribute.
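Formula 3 itself does not survive in this text. Given the three symbols defined above, one plausible reading, offered here only as an assumption, is that the overall rating of the i-th comment is modeled as a weighted sum of its attribute ratings over the $n$ attributes in $A$ and the $p$ comments on the product:

```latex
y_i = \sum_{j=1}^{n} w_j \, x_{ij}, \qquad i = 1, \ldots, p
```

Under this reading, an attribute with a larger $w_j$ contributes more to the overall rating, which matches the later use of $w_j$ as the second weight value.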
Step 2: The vector machine model in Fig. 2 is a two-class model (labels +1 and -1), whereas the existing data may have multiple classes, for example five. The five classes are therefore converted into two classes; the conversion rule is given by Formula 4.
if $y_i = y_j$: $(x_i - x_j, -1)$ and $y = -1$ (Formula 4; only this fragment of the rule survives in this text)
The evaluation matrix before conversion is shown in Table 2.
Table 2: [table missing from this text]
The evaluation matrix after conversion is shown in Table 3.
Table 3: [table missing from this text]
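Since only a fragment of Formula 4 and neither table is available, the sketch below shows one common way to turn multi-level ratings into two-class samples: pairwise differences, as used in ranking-style support vector machines. This is a swapped-in, assumed technique rather than the patent's confirmed rule; in particular, skipping pairs with equal ratings is a choice made for this sketch, and `to_pairwise` is a hypothetical name.

```python
from itertools import combinations


def to_pairwise(X, y):
    """Build binary samples (x_i - x_j, +1/-1) from multi-level ratings.

    ASSUMPTION: ranking-style pairwise conversion; the patent's Formula 4
    is garbled in the source and may define the rule differently.
    """
    samples, labels = [], []
    for i, j in combinations(range(len(y)), 2):
        if y[i] == y[j]:
            continue  # equal ratings carry no ordering information here
        samples.append([a - b for a, b in zip(X[i], X[j])])
        labels.append(1 if y[i] > y[j] else -1)
    return samples, labels


# Three comments rated 3, 5 and 1 stars, each scored on three attributes.
X = [[1, 0, -1], [1, 1, 1], [-1, 0, 0]]
y = [3, 5, 1]
pairs, signs = to_pairwise(X, y)
```

Each output row is the difference between two comments' attribute ratings, labeled by which of the two comments had the higher overall rating.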
Step 3: $w_j$ is solved on the basis of the above relationship. To solve for the weights $\omega$, a weight analysis model is established; it is obtained by refining Formula 3 in combination with the vector machine model in Fig. 2 and the support vector machine algorithm.
Each comment can be regarded as a sample: $y$ serves as the sample's class label, and $x$ as the sample's value in each dimension. The weight analysis model is as follows:
[objective missing from this text], $\omega \ge 0$, $\xi_i \ge 0$ (Formula 5)
where $1/C$ is the penalty coefficient (equivalent to multiplying the slack variables by $C$), which guards against excessive outliers; $\xi_i$ is the slack variable, which ensures that the problem has a feasible solution; $\omega$ is the column vector of weight values of the product features, with $\omega \ge 0$; $x_i$ is the sample's value in each dimension; and $y_i$ is the sample's class label.
The support vector machine algorithm can be used to solve classification problems for different samples: it finds the maximum margin between samples of different classes so that the classification is as accurate as possible, with $wx + b = 0$ as the decision function. The $w$ in this formula is the second weight value sought by the embodiment of the present invention. The support vector machine formulation is as follows:
[objective missing from this text] s.t. $y(\omega^{T} x_i + b) \ge 1,\ i = 1, \ldots, n$ (Formula 6)
where $\omega$ is the column vector of weight values, with $\omega \ge 0$; $\xi_i$ is the slack variable; $b$ is the constant term; and $y$ is the sample's class label.
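The objectives of Formulas 5 and 6 are missing from this text, so the following is only an illustrative sketch of a soft-margin linear classifier with non-negative weights, which is what the surviving constraints ($\omega \ge 0$, $\xi_i \ge 0$) suggest. It assumes the standard hinge-loss objective $\tfrac{1}{2}\lVert\omega\rVert^2 + C\sum_i \xi_i$ without a bias term and solves it by projected subgradient descent; the solver, the hyperparameters, and the name `fit_nonneg_svm` are all assumptions.

```python
def fit_nonneg_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Minimise 0.5*||w||^2 + C*sum(hinge losses) subject to w >= 0.

    ASSUMPTION: standard soft-margin objective, no bias term; the patent's
    exact model (Formula 5) is not recoverable from the source text.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = list(w)  # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # margin violated, so the hinge term is active
                for j in range(n_features):
                    grad[j] -= C * yi * xi[j]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, wj - lr * gj) for wj, gj in zip(w, grad)]
    return w


# Toy data: the first attribute decides the label, the second is noise.
X = [[1, 0], [1, 1], [-1, 0], [-1, 1]]
y = [1, 1, -1, -1]
w = fit_nonneg_svm(X, y)
```

On this toy data the learned weight of the decisive attribute ends up near 1 while the noise attribute stays at 0, which illustrates how such weights could indicate each attribute's influence on the overall rating.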
In summary, once the first weight values, the second weight values, and the evaluation matrix have been determined by the above method, the overall evaluation result of the product can be determined from the product of the evaluation matrix, the first weight of each comment, and the second weight of each attribute of the product. From this overall evaluation result, a manufacturer can see how much each attribute influences the overall evaluation of the product, identify the attributes that most consumers care about, and invest more resources in researching, developing, and improving those attributes to better meet consumer needs. The overall evaluation of each individual attribute can also be measured directly, which makes it easy to find the weak points that drag down the overall evaluation of the product; targeted improvement and management can then achieve a more significant effect.
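The final combination can be sketched as a matrix product, assuming the first weights form a $1 \times p$ row vector, the evaluation matrix is $p \times n$, and the second weights form an $n \times 1$ column vector. The function name and the sample numbers are hypothetical.

```python
def overall_score(first_w, eval_matrix, second_w):
    """Compute first_w (1 x p) * eval_matrix (p x n) * second_w (n x 1).

    Returns the overall score and the intermediate per-attribute scores.
    """
    # Comment-weighted score of each attribute: a 1 x n row vector.
    attr_scores = [
        sum(fw * row[j] for fw, row in zip(first_w, eval_matrix))
        for j in range(len(second_w))
    ]
    total = sum(a * w for a, w in zip(attr_scores, second_w))
    return total, attr_scores


first_w = [0.5, 0.25, 0.25]                          # one weight per comment
eval_matrix = [[1, 0, -1], [1, 1, 0], [-1, 0, 1]]    # comments x attributes
second_w = [0.5, 0.25, 0.25]                         # one weight per attribute
total, attr_scores = overall_score(first_w, eval_matrix, second_w)
```

The intermediate `attr_scores` vector corresponds to the per-attribute evaluation mentioned above, which can be inspected to find the attributes that pull the overall result down.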
Based on the same technical concept, an embodiment of the present invention also provides an internet data analysis system, which can execute the method embodiment above. As shown in Fig. 3, the internet data analysis system provided by the embodiment of the present invention includes an acquiring unit 301, a first determining unit 302, and a second determining unit 303, where:
the acquiring unit 301 is configured to obtain the attributes of products on the internet and the comments corresponding to the products;
the first determining unit 302 is configured to, for a product, determine a first weight value for each comment according to the degree-of-attention information of each comment corresponding to the product, and determine a second weight value for each attribute of the product according to the result of sentiment classification of the comments corresponding to each attribute of the product;
the second determining unit 303 is configured to combine the first weight value of each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
For example, an internet e-commerce merchant sells goods online through a network platform, and consumers can post comments on their own orders. Comment content usually touches on several aspects of the product, such as quality, size, and logistics, and the merchant ultimately labels the order as either a positive or a negative review according to the consumer's star rating. In step S101, the embodiment of the present invention actively obtains all product types in the data to be analyzed and the attributes of each product (for a mobile phone, for example, price, model, and battery standby capability), together with the comment data of all orders for the product.
Some order comments are detailed and have great reference value for other users, so such comments should be given a certain weight. For example, a consumer surnamed Li posts a comment online on a mobile phone order; the comment covers the trial experience of the phone, its cost-performance ratio, its radiation power, and other aspects, and also includes pictures. This comment therefore has high reference value, and other consumers who find it helpful can give it a thumbs-up. To use degree-of-attention information, such as the number of thumbs-up and the total number of comments, as a factor in the product's evaluation result, the embodiment of the present invention introduces the first weight value. Specifically, for a product, the degree-of-attention information includes the total number of comments on the product and the support score of each comment. The first weight value satisfies Formula 1, whose details are as described in the method above and are not repeated here.
Alternatively, the first weight of each comment can be determined directly from a correspondence between the support score and the first weight for the product.
In addition, comments posted by users of different levels differ in usefulness, so a weight value can also be assigned according to the user's level, which distinguishes the usefulness of comments posted by different users.
An online comment on a product usually involves several attributes of the product, and a comment may approve of one attribute while disapproving of another. Judging a comment as positive or negative from the product's star rating alone is therefore inappropriate. The embodiment of the present invention accordingly uses an evaluation matrix generation unit to generate the evaluation matrix. The evaluation matrix generation unit 304 is configured to generate the evaluation matrix for the product from the attributes of the product and the comments on each attribute.
Specifically, text mining is used to extract the product attributes mentioned in the comments. Let $A = (A_1, A_2, \ldots, A_n)$ be the set of product attributes. Semantic analysis is then applied together with the star rating of the comment: three stars and above represent a positive review, one star represents a negative review, and the rest are treated as neutral. The sentiment orientation of a comment is thus divided into three levels (positive, neutral, and negative), represented by 1, 0, and -1 respectively. This yields the attribute evaluation matrix of a product, as shown in Table 1.
As Table 1 shows, some comment texts do not mention certain attributes of the product, so those attributes have null values in the evaluation matrix. To facilitate subsequent matrix operations involving other factors, the embodiment of the present invention fills the null values with default values. Specifically, the evaluation matrix generation unit 304 is configured to: if some attributes of the product are not commented on, derive a default evaluation value from the comment on the product and use it as the default value for the uncommented attributes, so that the evaluation matrix for the product can be generated from the attributes of the product and the comments on each attribute.
For example, null values are filled as follows: a mapping function maps the star rating given by the comment (1 to 5 stars) into the range -1 to 1, and the mapped value is inserted into all missing entries of the same row. The mapping function is as given in Formula 2 and is not repeated here.
To further highlight the differences between the attributes of a product, the embodiment of the present invention assigns different weight values to the different attributes of each product on the basis of the evaluation matrix above. The first determining unit is specifically configured to: for an attribute, determine a first feature value of each attribute for each comment on the product according to the result of sentiment classification of the comments corresponding to each attribute of the product;
determine, according to the first feature values, a second feature value of each attribute for each positively rated comment, and a second feature value of each attribute for each negatively rated comment;
determine the second weight value of each attribute of the product according to the second feature values.
In practice, determining the second weight value involves the support vector machine model in Fig. 2 and a weight analysis model. The process is as follows:
Step 1: Text mining and semantic analysis yield the user's overall rating of the product, $y_i$ (the star rating of the comment), and the user's rating of each product attribute, $x_{ij}$. The relationship between the two is given by Formula 3.
Step 2: The vector machine model in Fig. 2 is a two-class model (labels +1 and -1), whereas the existing data may have multiple classes, for example five. The five classes are therefore converted into two classes; the conversion rule is given by Formula 4.
Step 3: $w_j$ is solved on the basis of the above relationship. To solve for the weights $\omega$, a weight analysis model is established; it is obtained by refining Formula 3 in combination with the vector machine model in Fig. 2 and the support vector machine algorithm.
Each comment can be regarded as a sample: $y$ serves as the sample's class label, and $x$ as the sample's value in each dimension. The weight analysis model is given by Formula 5. The support vector machine algorithm can be used to solve classification problems for different samples: it finds the maximum margin between samples of different classes so that the classification is as accurate as possible, with $wx + b = 0$ as the decision function. The $w$ in this formula is the second weight value sought by the embodiment of the present invention. The support vector machine formulation is given by Formula 6.
In summary, once the first weight values, the second weight values, and the evaluation matrix have been determined by the above method, the overall evaluation result of the product can be determined from the product of the evaluation matrix, the first weight of each comment, and the second weight of each attribute of the product. The embodiment of the present invention thus assigns different levels of weight to different comments, parses the comment text to derive a weight for each attribute of each product, and combines the two weights on the basis of the existing comment data, which makes the analysis result of the comment data more accurate and helps improve the precision of marketing. Using the analysis result, a manufacturer can quickly locate the prominent attribute features of a product and promote them in a targeted way when formulating a marketing strategy. This strengthens consumers' impression of those attribute features and builds the product's core competitiveness, so that consumers who value those attributes pay more attention to the product and its sales increase. The method of the embodiment is people-oriented: it takes full account of user needs and treats meeting them as an important goal. A related e-commerce website can add a ranking for each attribute, produced with the embodiment of the present invention, on top of its existing rankings by popularity, sales volume, and price. Consumers with different preferences can then search and rank by the overall evaluation of the attributes they value, without browsing the text of numerous online comments to guess the approximate evaluation of each attribute, which significantly reduces search time and transaction risk.
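The per-attribute ranking described above can be sketched as a simple sort. The product names, attribute names, and scores below are made up for illustration, and `rank_by_attribute` is a hypothetical helper rather than part of the described system.

```python
def rank_by_attribute(product_scores, attribute):
    """Return product names sorted by their score on one attribute, best first."""
    return sorted(
        product_scores,
        key=lambda name: product_scores[name][attribute],
        reverse=True,
    )


# Hypothetical per-attribute overall evaluations for two products.
scores = {
    "phone_a": {"price": 0.4, "battery": -0.1},
    "phone_b": {"price": 0.1, "battery": 0.6},
}
by_battery = rank_by_attribute(scores, "battery")
by_price = rank_by_attribute(scores, "price")
```

A consumer who cares most about battery life would see phone_b first, while one who cares most about price would see phone_a first.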
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention.
It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions.
These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these changes and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (10)
1. An internet data analysis method, characterized in that the method comprises:
acquiring attributes of products on the internet and the comments corresponding to the products;
for a product, determining a first weight value corresponding to each comment according to degree-of-concern information of each comment corresponding to the product; and determining a second weight value of each attribute of the product according to the result of sentiment classification performed on the comments corresponding to each attribute of the product;
combining the first weight value corresponding to each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
2. The method of claim 1, characterized in that, for a product, the degree-of-concern information includes the total number of comments on the product and the support score of each comment;
the first weight value satisfies the following formula:
Formula 1:
where the first weight value of the i-th comment is the quantity defined by Formula 1, HVs(v_i) denotes the support score of the i-th comment, p denotes the total number of comments on the product, and λ denotes the weighting factor of the i-th comment.
3. The method of claim 1, characterized in that, after acquiring the attributes of the products on the internet and the comments corresponding to the products, and before determining the first weight value corresponding to each comment, the method further comprises:
generating an evaluation matrix for the product according to the attributes corresponding to the product and the comments on each attribute of the product.
4. The method of claim 3, characterized in that the method further comprises:
if some attributes of the product have no comments, presetting an evaluation value according to the comments on the product, and using the evaluation value as the default value of the uncommented attributes, so that the evaluation matrix for the product is generated according to the attributes corresponding to the product and the comments on each attribute of the product.
5. The method of claim 1, characterized in that determining the second weight value of each attribute of the product comprises:
for an attribute, determining a first eigenvalue of each attribute for each comment on the product according to the result of sentiment classification of the comments corresponding to each attribute of the product;
determining, according to the first eigenvalue, a second eigenvalue of each attribute for each comment that is a positive evaluation; and determining, according to the first eigenvalue, a second eigenvalue of each attribute for each comment that is a negative evaluation;
determining the second weight value of each attribute of the product according to the second eigenvalues.
6. An internet data analysis system, characterized in that the system comprises:
an acquiring unit, configured to acquire attributes of products on the internet and the comments corresponding to the products;
a first determining unit, configured to, for a product, determine a first weight value corresponding to each comment according to degree-of-concern information of each comment corresponding to the product; and determine a second weight value of each attribute of the product according to the result of sentiment classification performed on the comments corresponding to each attribute of the product;
a second determining unit, configured to combine the first weight value corresponding to each comment with the second weight value of each attribute of the product to determine a data analysis result for the comments on the product.
7. The system of claim 6, characterized in that, for a product, the degree-of-concern information includes the total number of comments on the product and the support score of each comment;
the first weight value satisfies the following formula:
Formula 1:
where the first weight value of the i-th comment is the quantity defined by Formula 1, HVs(v_i) denotes the support score of the i-th comment, p denotes the total number of comments on the product, and λ denotes the weighting factor of the i-th comment.
8. The system of claim 6, characterized in that the system further comprises:
an evaluation matrix generating unit, configured to generate an evaluation matrix for the product according to the attributes corresponding to the product and the comments on each attribute of the product.
9. The system of claim 8, characterized in that the evaluation matrix generating unit is specifically configured to:
if some attributes of the product have no comments, preset an evaluation value according to the comments on the product, and use the evaluation value as the default value of the uncommented attributes, so that the evaluation matrix for the product is generated according to the attributes corresponding to the product and the comments on each attribute of the product.
10. The system of claim 6, characterized in that the first determining unit is specifically configured to:
for an attribute, determine a first eigenvalue of each attribute for each comment on the product according to the result of sentiment classification of the comments corresponding to each attribute of the product;
determine, according to the first eigenvalue, a second eigenvalue of each attribute for each comment that is a positive evaluation; and determine, according to the first eigenvalue, a second eigenvalue of each attribute for each comment that is a negative evaluation;
determine the second weight value of each attribute of the product according to the second eigenvalues.
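As a rough illustration of how the steps in claims 1, 3, and 4 could fit together, the sketch below builds an evaluation matrix (one row per comment, one column per attribute), fills uncommented attributes with a default value, and combines per-comment first weights with per-attribute second weights. Several points are assumptions rather than the patent's method: the first and second weights are passed in already computed, because Formula 1 is not reproduced in this text; the default value is taken to be the mean of the scores that are present; and the combination rule (a first-weight-weighted average per attribute, then a second-weight-weighted sum) is a stand-in. The function names are hypothetical.

```python
from statistics import mean
from typing import Optional


def build_evaluation_matrix(scores: list[list[Optional[float]]]) -> list[list[float]]:
    """Fill uncommented attributes (None) with a preset default value.

    Assumption: the default is the mean of all scores that are present.
    """
    present = [s for row in scores for s in row if s is not None]
    default = mean(present) if present else 0.0
    return [[default if s is None else s for s in row] for row in scores]


def combine(matrix, first_weights, second_weights):
    """Combine per-comment and per-attribute weights (assumed rule, see lead-in)."""
    total_first = sum(first_weights)
    # Per-attribute score: average over comments, weighted by each comment's first weight.
    per_attribute = [
        sum(w * row[j] for w, row in zip(first_weights, matrix)) / total_first
        for j in range(len(second_weights))
    ]
    # Overall score: per-attribute scores weighted by each attribute's second weight.
    overall = sum(w2 * s for w2, s in zip(second_weights, per_attribute))
    return per_attribute, overall


# Illustrative data only: two comments, two attributes, one attribute uncommented.
raw = [[1.0, None], [0.5, 0.0]]
matrix = build_evaluation_matrix(raw)
per_attribute, overall = combine(matrix, first_weights=[3.0, 1.0], second_weights=[0.5, 0.5])
print(matrix, per_attribute, overall)
```

Filling the missing cell before weighting keeps every row the same length, so the weighted averages never have to special-case an uncommented attribute.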
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510784361.4A CN106708868B (en) | 2015-11-16 | 2015-11-16 | Internet data analysis method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510784361.4A CN106708868B (en) | 2015-11-16 | 2015-11-16 | Internet data analysis method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106708868A (en) | 2017-05-24 |
CN106708868B (en) | 2020-02-21 |
Family ID: 58931580
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201510784361.4A | Active | CN106708868B (en) | 2015-11-16 | 2015-11-16 | Internet data analysis method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106708868B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909401A (en) * | 2017-11-14 | 2018-04-13 | 阮敬 | A kind of satisfaction measuring method based on big data technology |
CN108595562A (en) * | 2018-04-12 | 2018-09-28 | 西安邮电大学 | User's evaluation data analysing method based on accurate sex determination |
CN109284373A (en) * | 2018-09-06 | 2019-01-29 | 合肥工业大学 | The acquisition methods and device of product up-gradation strategy based on text mining driving |
CN109376888A (en) * | 2018-10-09 | 2019-02-22 | 长安大学 | A kind of Forum on College Eating-room management system and management method based on cell phone application |
CN110837739A (en) * | 2019-10-24 | 2020-02-25 | 支付宝(杭州)信息技术有限公司 | Service processing method and device and electronic equipment |
CN111767725A (en) * | 2020-06-24 | 2020-10-13 | 中国平安财产保险股份有限公司 | Data processing method and device based on emotion polarity analysis model |
CN112559685A (en) * | 2020-12-11 | 2021-03-26 | 芜湖汽车前瞻技术研究院有限公司 | Automobile forum spam comment identification method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102945268A (en) * | 2012-10-25 | 2013-02-27 | 北京腾逸科技发展有限公司 | Method and system for excavating comments on characteristics of product |
CN103399916A (en) * | 2013-07-31 | 2013-11-20 | 清华大学 | Internet comment and opinion mining method and system on basis of product features |
CN103914783A (en) * | 2014-04-13 | 2014-07-09 | 北京工业大学 | E-commerce website recommending method based on similarity of users |
CN104156390A (en) * | 2014-07-07 | 2014-11-19 | 乐视网信息技术(北京)股份有限公司 | Comment recommendation method and system |
- 2015-11-16: CN application CN201510784361.4A (CN106708868B), status: Active
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909401A (en) * | 2017-11-14 | 2018-04-13 | 阮敬 | A kind of satisfaction measuring method based on big data technology |
CN108595562A (en) * | 2018-04-12 | 2018-09-28 | 西安邮电大学 | User's evaluation data analysing method based on accurate sex determination |
CN109284373A (en) * | 2018-09-06 | 2019-01-29 | 合肥工业大学 | The acquisition methods and device of product up-gradation strategy based on text mining driving |
CN109376888A (en) * | 2018-10-09 | 2019-02-22 | 长安大学 | A kind of Forum on College Eating-room management system and management method based on cell phone application |
CN110837739A (en) * | 2019-10-24 | 2020-02-25 | 支付宝(杭州)信息技术有限公司 | Service processing method and device and electronic equipment |
CN111767725A (en) * | 2020-06-24 | 2020-10-13 | 中国平安财产保险股份有限公司 | Data processing method and device based on emotion polarity analysis model |
CN111767725B (en) * | 2020-06-24 | 2023-06-20 | 中国平安财产保险股份有限公司 | Data processing method and device based on emotion polarity analysis model |
CN112559685A (en) * | 2020-12-11 | 2021-03-26 | 芜湖汽车前瞻技术研究院有限公司 | Automobile forum spam comment identification method |
Also Published As
Publication number | Publication date |
---|---|
CN106708868B (en) | 2020-02-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106708868A (en) | Method and system for analyzing internet data | |
EP3893154A1 (en) | Recommendation model training method and related apparatus | |
Jiao et al. | Profit maximization mechanism and data management for data analytics services | |
Sangeetha et al. | Service quality models in banking: a review | |
CN109034973B (en) | Commodity recommendation method, commodity recommendation device, commodity recommendation system and computer-readable storage medium | |
US20140172642A1 (en) | Analyzing commodity evaluations | |
CN105229721A (en) | When client device is in the lock state to the dynamic arrangements of the content presented | |
JP2010079657A (en) | Information processor, information processing method, and program | |
JP2018077615A (en) | Advertising image generation device, advertising image generation method and program for advertising image generation device | |
CN106651544A (en) | Conversational recommendation system for minimum user interaction | |
CN109816134A (en) | Shipping address prediction technique, device and storage medium | |
CN107885784A (en) | The method and apparatus for extracting user characteristic data | |
CN110706028A (en) | Commodity evaluation emotion analysis system based on attribute characteristics | |
CN110033324A (en) | Data processing method, device, electronic equipment and computer readable storage medium | |
CN111654714B (en) | Information processing method, apparatus, electronic device and storage medium | |
Shayaa et al. | Social media sentiment analysis of consumer purchasing behavior vs consumer confidence index | |
KR20220117425A (en) | Marketability analysis and commercialization methodology analysis system using big data | |
CN109636530B (en) | Product determination method, product determination device, electronic equipment and computer-readable storage medium | |
Dargahi et al. | Co-production or DIY: an analytical model of consumer choice and social preferences | |
US20140372207A1 (en) | Profit index value generation system and profit index value generation method | |
CN107679887A (en) | A kind for the treatment of method and apparatus of trade company's scoring | |
Carter et al. | When do I profit? Uncovering boundary conditions on reputation effects in online auctions | |
Al-Zadjali et al. | Assessing customer satisfaction of m-banking in Oman using SERVQUAL model | |
Javidnia et al. | Identifying factors affecting acceptance of new technology in the industry using hybrid model of UTAUT and FUZZY DEMATEL | |
CN103902380B (en) | A kind of method, apparatus and equipment determining resource allocation using sandbox |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||