CN112070543B - Method for detecting comment quality in E-commerce website - Google Patents

Info

Publication number
CN112070543B
Authority
CN
China
Prior art keywords
merchant
feature
comment
group
avg
Prior art date
Legal status
Active
Application number
CN202010944581.XA
Other languages
Chinese (zh)
Other versions
CN112070543A (en)
Inventor
刘嘉辉
李喆
Current Assignee
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Application filed by Harbin University of Science and Technology
Priority to CN202010944581.XA
Publication of CN112070543A
Application granted
Publication of CN112070543B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention provides a comment evaluation model based on individual and group commentators, together with a fuzzy clustering method, for detecting the quality of comments on an e-commerce website. The method comprises the following steps: extracting relevant features of commentators and merchants; normalizing the target values of each bounded feature set to the interval 0-1 with a series convergence model; constructing the target feature vectors and the training set data; establishing trueness logistic regression models for commentators and merchants from the features and the training set; dividing commentators into individuals and groups according to a criterion and constructing an individual comment evaluation model and a group comment evaluation model; iterating the fuzzy C-means clustering algorithm to obtain each comment's membership degrees in the real class and the false class; and comparing the membership degrees to detect comment quality. By building a more complete commentator-based evaluation model from both the individual and the group perspective and applying fuzzy clustering analysis, the invention improves the trueness and intuitiveness of comment quality detection results and meets the needs of e-commerce comment quality detection.

Description

Method for detecting comment quality in E-commerce website
Technical Field
The invention relates to a comment evaluation model based on individual and group commentators and to a fuzzy clustering method, and in particular to a method for detecting comment quality on an e-commerce website. It belongs to the field of computer technology applications.
Background
With the development and spread of Internet technology, an era dominated by information and data has gradually entered everyday life, and shopping through online electronic transactions has become widely popular. As the main channel for online shopping, e-commerce websites have accumulated a large amount of product comment information from purchasing users. This information is one form of online word of mouth, and its quality has an obvious influence on users' purchasing decisions. However, as comments accumulate and commercial interests exert their pull, a large number of meaningless and even false comments have appeared on these websites. Detecting comment quality by technical means, separating the false from the true, and thereby improving the reputation of online shopping platforms has become an urgent task.
The purpose of detecting e-commerce website comments is to identify the false comments among them. A false comment is one in which the author, driven by improper motives and personal gain, fabricates an unrealistic consumption experience in order to inflate or defame the quality of the evaluated object and thereby mislead consumers' purchasing behavior. Two example cases of false comments on an e-commerce website follow:
Case 1: On an e-commerce platform, user A has purchased some goods and published a number of real comments. User A then temporarily accepts commissions from merchants to make fake purchases and posts some false comments. User A is referred to here as an individual false commentator: most of user A's comments are true and only a small share are false. False comments published by such users are temporary and concealed.
Case 2: On an e-commerce platform, merchant B pays to hire a group of "water army" users C (paid posters) to post fake evaluations of a target product, and these users generate a certain number of false comments within a specified period. The "water army" users C are referred to here as group false commentators: most or even all of their comments are false. False comments published by such users are sudden and easily exposed.
Currently, research on false comment detection starts from two types of detection object: the comment text and the commentator. Text-based detection methods mostly rely on vocabulary, sentence patterns, grammar, and sentiment. They target comment content well, but the complex and changeable collocations of Chinese vocabulary, its rich grammatical structures, and its semantic associations limit how widely they can be applied; they are also subjective, and their detection results in practice are poor. Detection methods based on the commentator are therefore particularly important. Following the cases above, commentators are divided into individuals and groups according to the way their false comments are generated, so that comment evaluation features can be extracted effectively and evaluation models can be built reasonably.
To extract the relevant comment features, the evaluation mechanism of e-commerce comments is analyzed and a mutual scoring model between commentators and merchants is introduced; the calculated commentator score and merchant score serve as the first class of features.
In feature normalization, the discussion and screening of distribution values in a feature set must satisfy the requirements of both statistical analysis and computer processing, so the values in the feature set need to be normalized. Because a bounded feature set has an upper and a lower limit and thus meets the basic requirements of a series model, the convergence property of a series of functions is used to normalize values in the feature range to the interval 0-1, which makes it convenient to mine data meeting a given target value from a given data set.
The trueness of users and merchants is calculated from the features by logistic regression. As a generalized linear model, logistic regression introduces a nonlinearity on top of linear regression through the sigmoid function. It is commonly used for binary classification, can express the probability of an event when specific constraints are met, and is widely used for discrimination and prediction. A logistic regression equation can therefore be established from the relevant features to calculate user and merchant trueness.
Cluster analysis is a multivariate statistical method that classifies objectively and quantitatively determines how closely data points are related. When user and merchant trueness are used as the classification standard for comment quality, the attributes have no strictly clear boundaries, so it is unreasonable to assume that a comment belongs wholly to one class. A fuzzy dynamic clustering analysis method is therefore adopted, which relaxes this hard constraint in the form of class membership: the closer a comment is to a class, the higher its membership degree in that class. This makes the quality detection result more truthful, interpretable, and intuitive.
The fuzzy C-means clustering algorithm obtains the membership degree of each data point with respect to each class center by optimizing an objective function, and thereby determines the class to which each data point belongs. It is a theoretically well-developed fuzzy dynamic clustering algorithm and is widely used today.
The basic idea of the method for detecting comment quality in an e-commerce website is as follows. First, the first class of features is calculated from the mutual scoring model of commentators and merchants, the other relevant features of commentators and merchants are extracted, the bounded feature sets are normalized to the interval 0-1 with a series normalization convergence model, and logistic regression models are established from the features to express the trueness of commentators and merchants. Second, commentators are divided into individuals and groups according to the criterion that commentators who have jointly commented on three or more merchants form a group. Then an individual commentator evaluation model is constructed: from the perspective of the individual commentator, merchants are sampled by the price interval of the purchased goods, the trueness of the individual commentator and of the corresponding merchants is calculated with the logistic regression model, a sample feature evaluation matrix is constructed, and the optimal class membership function of the comments is obtained by iterating the fuzzy C-means clustering algorithm. A group commentator evaluation model is constructed likewise: from the perspective of the merchant, group commentators are sampled by the comment time interval of the goods sold, the trueness of the merchant and of the corresponding group commentators is calculated with the logistic regression model, a sample feature evaluation matrix is constructed, and the optimal class membership function of the comments is obtained by iterating the fuzzy C-means clustering algorithm. Finally, comment quality is judged from the class membership degrees of the individual comments and the group comments.
Disclosure of Invention
(I) Technical problem to be solved
To detect the comment quality of an e-commerce website and identify the false comments in it, the invention combines a mutual scoring method, a series convergence method, a logistic regression method, and a fuzzy clustering method to establish a commentator-based evaluation model built on comment features. First, a mutual scoring model is established, and the scores of the commentator and the merchant are calculated as the first class of features; the other relevant features of the commentator and the merchant are extracted, then quantized and standardized to obtain dimensionless values that are convenient to analyze and calculate. Because the quantized feature sets must be normalized to the interval 0-1 and different feature sets have different value ranges, a series normalization statistical analysis convergence model is adopted, and feature sets with different ranges are normalized by series models with different convergence rates. Second, maximum likelihood estimation is used to obtain a set of feature weights from the training set data, and the trueness of the commentators and merchants to be detected is calculated by substituting these weights into the logistic regression equation. Then, considering the different ways in which false comments are generated on the website, a classification criterion is proposed that divides commentators into individual commentators and group commentators, and two comment quality evaluation models are established.
Further, the individual commentator evaluation model is constructed: the trueness of the individual commentator is calculated, the purchased goods are divided by price interval into a goods-number sequence, the goods in the sequence are randomly sampled with a Poisson distribution of parameter λ, and the mean trueness of the corresponding merchants is calculated, giving the trueness of the individual commentator and of the corresponding merchants. The group commentator evaluation model is constructed: the trueness of the merchant is calculated, the received comments are divided by time interval into a group comment-number sequence, the comments in the sequence are randomly sampled with a Poisson distribution of parameter μ, and the mean trueness of the corresponding group commentators is calculated, giving the trueness of the merchant and of the corresponding group commentators. Finally, the trueness of the individual commentator and the trueness of the corresponding merchants are combined into an individual comment quality feature vector, and the trueness of the merchant and the trueness of the corresponding group commentators are combined into a group comment quality feature vector. Individual and group feature evaluation matrices are established, and the fuzzy C-means clustering algorithm is iterated on each to obtain the optimal class center matrix and class membership matrix. The quality of the comments is distinguished by the class membership degrees and false comments are identified, so that the comment quality of the e-commerce website can be detected.
(II) technical scheme
To detect the comment quality of an e-commerce website and identify the false comments in it, the invention provides a method for detecting comment quality in an e-commerce website, comprising the following steps:
(1) Calculate the first class of features from the mutual scoring model of the commentator and the merchant, extract the other relevant features of the commentator and the merchant, normalize the feature sets to the interval (0, 1) with the series normalization statistical analysis convergence model, establish the target feature vectors, and label a training data set.
(2) Using the labeled training set data, establish trueness logistic regression models of the commentator and the merchant from the features.
(3) Divide the commentators into individual commentators and group commentators according to a classification criterion, and establish the corresponding comment quality evaluation models.
(4) Individual evaluation model: from the perspective of the individual commentator, sample merchants by the price interval of the purchased goods, and calculate the trueness of the individual commentator and of the corresponding merchants with the logistic regression equation. Group evaluation model: from the perspective of the merchant, sample group commentators by the time interval of the received comments, and calculate the trueness of the merchant and of the corresponding group commentators with the logistic regression equation.
(5) Establish the individual and group trueness feature evaluation matrices, and iterate the fuzzy C-means clustering algorithm on each to obtain the final real-class and false-class membership degrees of the comments, thereby detecting comment quality and identifying the false comments.
A method for detecting comment quality in an e-commerce website comprises the following:
The mechanism by which the merchant scores the commentator: calculation rules are defined for three behaviors of the commentator, namely registration (remote registration or frequent registration), browsing (browsing similar goods before purchase), and commenting (evaluating goods after purchase).
In each transaction, the merchant marks each of the commentator's three behaviors as 1 if it occurs and 0 if it does not, which gives 8 possible combinations. The combinations 001, 101, and 111 are defined as suspicious transactions and the rest as normal transactions.
A score of X points is defined, and the merchant score is calculated from the probability of normal transactions among n transactions and used as the first class of feature; the other relevant features of the commentator are extracted, and the commentator feature vector is constructed.
Specifically, the commentator features are denoted u = (1, u_feature_1, u_feature_2, …, u_feature_k), where the parameters in the vector are the quantized feature values of the commentator, including "merchant score", "registration time", "purchase rate", "number of comments", and so on;
The mechanism by which the commentator scores the merchant: calculation rules are defined for three attributes of the merchant, namely goods (whether they match the description), service (service attitude), and logistics (logistics situation).
In each transaction, the commentator marks each of the merchant's three attributes as 1 if it is good and 0 if it is poor, which gives 8 possible combinations. The combinations 000, 001, 010, and 100 are defined as poor transactions and the rest as good transactions.
A score of X points is defined, and the commentator score is calculated from the probability of good transactions among n transactions and used as the first class of feature; the other relevant features of the merchant are extracted, and the merchant feature vector is constructed.
Specifically, the merchant features are denoted m = (1, m_feature_1, m_feature_2, …, m_feature_k), where the parameters in the vector are the quantized feature values of the merchant, including "commentator score", "registration time", "rating ratio", "sales quantity", and so on;
the discussion and screening of the distribution values in the feature sets of commentators and merchants provide reference values for statistical analysis, so the values in the feature sets need to be normalized before the processed data are input into the logistic regression model;
because a feature set has an upper and a lower limit and thus meets the basic requirements of a series model, a convergence model $S_n(x) = x^n / (x^n + \mathrm{Range})$ is established according to the uniform convergence property of a series of functions, where Range is the maximum value of the feature's range and the argument x is a feature value within that range; $S_n(x)$ is the partial sum of the series S(x) and converges uniformly to 1 on the argument interval (0, Range);
an initial value of n is then selected according to the value of Range so that all feature values in the range are mapped into the normalized interval (0, 1), giving the processed feature data;
the normalized target feature values are combined into target feature vectors, namely the feature vector of each commentator and each merchant;
using a statistical analysis labeling method, feature vectors that satisfy the conditions are mined from the given set of feature vectors and labeled with categories to form the training set for logistic regression.
In the logistic regression training set data for commentators, the independent variable is the quantized and normalized commentator feature vector, and the dependent variable is the labeled commentator category, which follows a Bernoulli distribution; a real commentator is labeled 1 and a false commentator is labeled 0;
the feature matrix of the commentator training set is denoted U = {u_1, u_2, …, u_n}, where u_i is the (k+1)-dimensional feature vector of the i-th sample, and the labels of the training set are represented by an n-dimensional vector of 0s and 1s;
a (k+1)-dimensional regression coefficient vector α is obtained using the maximum likelihood estimation principle and batch gradient descent, and the trueness of a commentator is expressed as $\mathrm{URE} = 1/[1 + \exp(-\alpha^{T} u)]$.
In the logistic regression training set data for merchants, the independent variable is the quantized and normalized merchant feature vector, and the dependent variable is the labeled merchant category, which follows a Bernoulli distribution; a real merchant is labeled 1 and a false merchant is labeled 0;
the feature matrix of the merchant training set is denoted M = {m_1, m_2, …, m_n}, where m_i is the (k+1)-dimensional feature vector of the i-th sample, and the labels of the training set are represented by an n-dimensional vector of 0s and 1s;
a (k+1)-dimensional regression coefficient vector β is obtained using the maximum likelihood estimation principle and batch gradient descent, and the trueness of a merchant is expressed as $\mathrm{MRE} = 1/[1 + \exp(-\beta^{T} m)]$.
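The trueness model described above can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the training rows, the learning rate, and the number of epochs are made-up assumptions, and the function names are hypothetical. Each feature vector carries a leading 1 for the intercept, as in the (k+1)-dimensional vectors of the text.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def trueness(coef, features):
    """URE or MRE: sigmoid of the inner product coef^T * features."""
    return sigmoid(sum(c * f for c, f in zip(coef, features)))


def fit_batch_gradient_descent(samples, labels, rate=0.5, epochs=2000):
    """Maximise the Bernoulli log-likelihood with full-batch gradient steps."""
    coef = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(coef)
        for x, y in zip(samples, labels):
            error = y - trueness(coef, x)
            for i, xi in enumerate(x):
                grad[i] += error * xi
        coef = [c + rate * g / len(samples) for c, g in zip(coef, grad)]
    return coef


# Made-up training set: (1, feature_1, feature_2), already normalized to (0, 1).
U = [(1, 0.9, 0.8), (1, 0.8, 0.7), (1, 0.7, 0.9),
     (1, 0.2, 0.3), (1, 0.1, 0.2), (1, 0.3, 0.1)]
labels = [1, 1, 1, 0, 0, 0]          # 1 = real commentator, 0 = false commentator
alpha = fit_batch_gradient_descent(U, labels)
ure_true = trueness(alpha, (1, 0.85, 0.75))
ure_false = trueness(alpha, (1, 0.15, 0.25))
print(round(ure_true, 3), round(ure_false, 3))
```

The same two functions would serve for merchants by passing merchant feature vectors and labels, which yields β and MRE instead of α and URE.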
The commentators to be detected are divided into individual commentators and group commentators according to the criterion that commentators who have jointly commented on three or more merchants form a group.
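One possible reading of this criterion is sketched below. It assumes that two commentators are linked when they share at least three commented merchants and that links are merged transitively with a union-find structure; both the transitive merging and the names used (`split_commentators`, `min_shared`) are assumptions for illustration, since the text states only the threshold.

```python
from itertools import combinations


def split_commentators(reviewed, min_shared=3):
    """Return (groups, individuals) from a {commentator: set_of_merchants} map."""
    parent = {c: c for c in reviewed}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    # Link every pair that shares at least min_shared merchants.
    for a, b in combinations(reviewed, 2):
        if len(reviewed[a] & reviewed[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for c in reviewed:
        clusters.setdefault(find(c), []).append(c)
    groups = [sorted(m) for m in clusters.values() if len(m) > 1]
    individuals = sorted(m[0] for m in clusters.values() if len(m) == 1)
    return groups, individuals


# Made-up review history: commentator id -> merchants commented on.
reviewed = {
    "c1": {"m1", "m2", "m3", "m4"},
    "c2": {"m1", "m2", "m3"},
    "c3": {"m2", "m3", "m4", "m9"},
    "c4": {"m5", "m6"},
    "c5": {"m1", "m7"},
}
groups, individuals = split_commentators(reviewed)
print(groups, individuals)
```

Here c1 shares three merchants with c2 and three with c3, so the three form one group even though c2 and c3 share only two; c4 and c5 remain individuals.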
Individual commentator evaluation model: an individual commentator is referred to as an individual, and the model is as follows:
(1) The individual trueness URE is calculated with the logistic regression model above;
(2) The numbers of all goods purchased by the individual are divided according to specified price intervals (p_1, p_2, …, p_n) to obtain the initial goods-number sequence (c_1, c_2, …, c_n);
(3) The initial goods-number sequence is taken to follow a Poisson distribution with parameter c_avg, and random('poisson', c_avg, 1, n) is used to generate a sequence of Poisson-distributed random numbers, where 'poisson' denotes Poisson sampling and c_avg is the mean of the sequence values; the sampled goods-number sequence is denoted (sc_1, sc_2, …, sc_n);
(4) The mean trueness (MRE_avg_1, MRE_avg_2, …, MRE_avg_n) of the merchants corresponding to the sampled goods-number sequence is calculated with the logistic regression model above.
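Steps (2) to (4) can be sketched as below. The call random('poisson', c_avg, 1, n) in the text is reproduced here with Knuth's multiplication method, because the Python standard library has no Poisson sampler. The sketch assumes that sc_i is the number of merchants drawn from price interval i before their MRE values are averaged; that reading, the helper names, and all numbers are illustrative assumptions.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_counts(initial_counts, seed=0):
    """Mirror random('poisson', c_avg, 1, n): n draws with mean c_avg."""
    rng = random.Random(seed)
    c_avg = sum(initial_counts) / len(initial_counts)
    return [poisson_sample(c_avg, rng) for _ in initial_counts]


def mean_trueness(merchant_mre, how_many, rng):
    """Average the MRE of up to how_many merchants drawn without replacement."""
    picked = rng.sample(merchant_mre, min(how_many, len(merchant_mre)))
    return sum(picked) / len(picked) if picked else None


c = [4, 7, 2, 5]                       # goods bought per price interval (made up)
sc = sample_counts(c)                  # sampled goods-number sequence
mre_by_interval = [[0.9, 0.8, 0.7, 0.95],
                   [0.6, 0.65, 0.7, 0.5, 0.8, 0.9, 0.4],
                   [0.3, 0.2],
                   [0.85, 0.9, 0.8, 0.75, 0.7]]
rng = random.Random(1)
mre_avg = [mean_trueness(m, k, rng) for m, k in zip(mre_by_interval, sc)]
print(sc, mre_avg)
```

An interval whose sampled count is zero contributes no merchants, which the sketch reports as None rather than guessing a value.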
Group commentator evaluation model: a group of commentators is referred to as a group, and the model is as follows:
(1) The merchant trueness MRE is calculated with the logistic regression model above;
(2) The numbers of all comments received by the merchant are divided according to specified time intervals (t_1, t_2, …, t_n), ensuring that each interval contains the comments of only one group, to obtain the initial group comment-number sequence (r_1, r_2, …, r_n);
(3) The initial group comment-number sequence is taken to follow a Poisson distribution with parameter r_avg, and random('poisson', r_avg, 1, n) is used to generate a sequence of Poisson-distributed random numbers, where 'poisson' denotes Poisson sampling and r_avg is the mean of the sequence values; the sampled group comment-number sequence is denoted (sr_1, sr_2, …, sr_n);
(4) The mean trueness (URE_avg_1, URE_avg_2, …, URE_avg_n) of the groups corresponding to the sampled group comment-number sequence is calculated with the logistic regression model above.
The individual comment sample feature evaluation matrix is denoted X = {x_1, x_2, …, x_n}, where x_j = (x_j_URE, x_j_MRE_avg) holds the trueness of the j-th individual and of its corresponding merchants;
the group comment sample feature evaluation matrix is denoted Y = {y_1, y_2, …, y_n}, where y_j = (y_j_MRE, y_j_URE_avg) holds the trueness of the j-th merchant and of its corresponding groups;
the algorithm below is described for detecting individual comment quality and applies in the same way to detecting group comment quality:
Step_1: feature evaluation vectors c_1 and c_2 are randomly selected as the class center of real comments and the class center of false comments, respectively, dividing the samples into two classes. The Euclidean distance between sample x_j and class center c_i is d_ij = ||x_j - c_i||, u_ij denotes the membership of the j-th sample in the i-th class, U denotes the fuzzy classification matrix, and V denotes the class center matrix;
Step_2: the objective function and constraint of fuzzy C-means clustering are as follows:
$J(U,V) = \sum_{i=1}^{2} \sum_{j=1}^{n} (u_{ij})^{m} (d_{ij})^{2}$
$u_{1j} + u_{2j} = 1, \quad j = 1, 2, \ldots, n$
Step_3: the membership function and the class centers are derived as:
$u_{ij} = \left[ (d_{ij}/d_{1j})^{2/(m-1)} + (d_{ij}/d_{2j})^{2/(m-1)} \right]^{-1}, \quad i = 1, 2; \; j = 1, 2, \ldots, n$
$c_i = \dfrac{\sum_{j=1}^{n} (u_{ij})^{m} x_j}{\sum_{j=1}^{n} (u_{ij})^{m}}, \quad i = 1, 2$
Step_4: with threshold ε = 0.001 and m = 2, the iteration stops when ||Δc_i|| < ε, and the algorithm outputs the optimal fuzzy classification matrix U and class center matrix V;
From the fuzzy classification matrix U, the membership degree of each comment in the real class and in the false class is known. These values determine which of an individual's comments on the evaluated merchants are judged to be false and, likewise, which groups' comments on a merchant are judged to be false.
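The four steps above can be sketched in plain Python as follows, using the stated membership and center updates with m = 2 and ε = 0.001. The sample rows are made up, and the initial centers are fixed to the first two rows instead of being chosen at random so that the run is reproducible; both are assumptions for illustration.

```python
import math


def fuzzy_c_means(samples, centers, m=2.0, eps=1e-3, max_iter=100):
    """Fuzzy C-means; returns (membership rows U, class centers V)."""
    power = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1)).
        u = []
        for c in centers:
            row = []
            for x in samples:
                d_i = max(math.dist(x, c), 1e-12)   # guard against zero distance
                total = sum((d_i / max(math.dist(x, other), 1e-12)) ** power
                            for other in centers)
                row.append(1.0 / total)
            u.append(row)
        # Center update: weighted mean of the samples with weights u_ij ** m.
        new_centers = []
        for row in u:
            weights = [w ** m for w in row]
            norm = sum(weights)
            new_centers.append(tuple(
                sum(w * x[d] for w, x in zip(weights, samples)) / norm
                for d in range(len(samples[0]))))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:                              # ||delta c_i|| < epsilon
            break
    return u, centers


# Made-up rows (URE, MRE_avg); rows 0 and 1 serve as the initial centers.
X = [(0.92, 0.88), (0.15, 0.22), (0.85, 0.90), (0.80, 0.79),
     (0.20, 0.10), (0.25, 0.30)]
U, V = fuzzy_c_means(X, centers=[X[0], X[1]])
labels = ["real" if t > f else "false" for t, f in zip(U[0], U[1])]
print(labels)
```

Comparing the two rows of U column by column is the membership comparison described in the text: a comment is assigned to whichever class has the larger membership degree.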
The quality of the e-commerce website comments is detected by combining the two methods.
(III) advantageous effects
The beneficial effects are as follows. A mutual scoring model of commentators and merchants is established to calculate the first class of features; the other relevant features of commentators and merchants are then extracted, and the feature values are preprocessed with the series normalization convergence model; two comment quality evaluation models, one based on individual commentators and one on commentator groups, are established through regression and sampling; and comment quality is distinguished by fuzzy clustering analysis on the trueness features of commentators and merchants. The relevant features can thus be processed uniformly and reasonably to build a more practical model, and solving the model by fuzzy partitioning makes the detection results more truthful, interpretable, and intuitive.
Drawings
FIG. 1 is a plot of the series normalization statistical analysis convergence model for the feature sets.
FIG. 2 is a flow chart of the method for detecting comment quality in an e-commerce website.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Example 1: Establishing the mutual scoring model to calculate the first class of features.
The merchant scores the commentator: calculation rules are defined for three behaviors of the commentator, namely registration (remote registration or frequent registration), browsing (browsing similar goods before purchase), and commenting (evaluating goods after purchase).
In each transaction, the merchant marks each of the commentator's three behaviors as 1 if it occurs and 0 if it does not, which gives 8 possible combinations. The combinations 001, 101, and 111 are defined as suspicious transactions and the rest as normal transactions.
Taking a score of X points and n transactions as an example, if k of the transactions are suspicious, the merchant score is calculated as X (n-k).
The commentator scores the merchant: calculation rules are defined for three attributes of the merchant, namely goods (whether they match the description), service (service attitude), and logistics (logistics situation).
In each transaction, the commentator marks each of the merchant's three attributes as 1 if it is good and 0 if it is poor, which gives 8 possible combinations. The combinations 000, 001, 010, and 100 are defined as poor transactions and the rest as good transactions.
Taking a score of X points and n transactions as an example, if k of the transactions are poor, the commentator score is calculated as X (n-k).
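The two scoring rules in this example can be sketched as below. The text gives the score as X (n-k); the sketch additionally divides by n, reading the "probability" of unflagged transactions as a proportion, and that scaling is an assumption. The order of the three flags within each code, the full score of 100, and the transaction lists are likewise made up for illustration.

```python
# Flag patterns taken from the text. The flag order is assumed to be
# (registration, browsing, commenting) and (goods, service, logistics).
SUSPICIOUS = {"001", "101", "111"}       # merchant's view of a commentator
POOR = {"000", "001", "010", "100"}      # commentator's view of a merchant


def score(transactions, flagged, full_score=100.0):
    """Return full_score * (n - k) / n, where k counts flagged transactions.

    Dividing by n is an assumption: it scales the score by the share of
    unflagged transactions.
    """
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")
    k = sum(1 for t in transactions if t in flagged)
    return full_score * (n - k) / n


# Score a merchant gives a commentator: 2 of 5 transactions are suspicious.
merchant_score = score(["000", "001", "110", "010", "111"], SUSPICIOUS)
# Score a commentator gives a merchant: 1 of 4 transactions is poor.
commentator_score = score(["111", "110", "001", "011"], POOR)
print(merchant_score, commentator_score)
```

With these inputs the first call counts k = 2 of n = 5 and the second k = 1 of n = 4, so the two scores are 60.0 and 75.0 under the assumed scaling.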
Example 2: and (5) normalizing the feature set, and constructing a feature vector and a training set.
Take the "registration time" and "number of comments" of a commentator as an example: the registration time is a bounded feature set, denoted $x_1$, with a value range of 0 to 10, and $S_n(x_1)$ is the normalization interval within the value range of $x_1$; the number of comments is a bounded feature set, denoted $x_2$, with a value range of 0 to 100, and $S_n(x_2)$ is the normalization interval within the value range of $x_2$.
Establish the convergence model $S_n(x) = x^n / (x^n + \mathrm{Range})$. When $x = x_1$, take n = 2.0 and Range = 10; when $x = x_2$, take n = 1.5 and Range = 100. As can be seen from Fig. 1, $S_n(x_1)$ converges to the interval 0 to 1 within the value range of the $x_1$ feature set, and $S_n(x_2)$ likewise converges to the interval 0 to 1 within the value range of the $x_2$ feature set.
Similarly, for the other features of the commentators and the merchants, such as the purchase quantity per unit time and the sales quantity per unit time, an initial n value is selected according to the value interval of each feature set and substituted into the convergence model to obtain the normalized feature value.
From the above, the normalization result for a commentator who has been registered for 5 years and has written 50 comments is (0.83, 0.78); this result, together with the annotated commentator class, is put into the training set data, thereby establishing the commentator training set.
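A minimal sketch of the convergence model, assuming only the formula and the parameter values stated in this example; the function name `converge` is hypothetical. With these parameters the sketch gives about 0.78 for 50 comments, in line with the text, and about 0.71 for a registration time of 5, so the 0.83 quoted above presumably rests on a different n or input value.

```python
def converge(x, n, value_range):
    """Convergence model S_n(x) = x**n / (x**n + Range).

    Maps any x >= 0 into the interval [0, 1).
    """
    if x < 0 or value_range <= 0:
        raise ValueError("x must be >= 0 and Range must be > 0")
    xn = x ** n
    return xn / (xn + value_range)


# Parameters from the text: n = 2.0, Range = 10 for registration time;
# n = 1.5, Range = 100 for the number of comments.
registration = converge(5, 2.0, 10)
comments = converge(50, 1.5, 100)
normalized = (round(registration, 2), round(comments, 2))
```

A larger n pushes values near the upper end of the range closer to 1, which is why n is chosen per feature according to its value interval.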
The commentator feature vector u = (1, u1, u2, …, uk) and the merchant feature vector m = (1, m1, m2, …, mk) are thus obtained for the commentators and the merchants.
Example 3: constructing the trueness logistic regression models of the commentators and the merchants
Select n commentator training samples with the feature vector u, marking a real commentator as 1 and a false commentator as 0; similarly, select n merchant training samples with the feature vector m, marking a real merchant as 1 and a false merchant as 0.
Taking the commentators as an example, the probabilities that a sample belongs to class 1 and to class 0 can be expressed according to the logistic regression model as follows:
$$p(y=1 \mid u, \alpha) = h_\alpha(u) = \frac{1}{1 + \exp(-\alpha^T u)}$$
$$p(y=0 \mid u, \alpha) = 1 - h_\alpha(u)$$
These can further be written in the general form:
$$p(y \mid u, \alpha) = \left(h_\alpha(u)\right)^{y} \left(1 - h_\alpha(u)\right)^{1-y}$$
Following the maximum likelihood estimation principle, the loss function of the logistic regression is calculated on the labeled training set as:
$$J(h_\alpha(u), y) = -\frac{1}{n}\left[ y_1 \ln h_\alpha(u_1) + (1-y_1)\ln(1-h_\alpha(u_1)) + y_2 \ln h_\alpha(u_2) + (1-y_2)\ln(1-h_\alpha(u_2)) + \dots + y_n \ln h_\alpha(u_n) + (1-y_n)\ln(1-h_\alpha(u_n)) \right]$$
The loss function is minimized by the batch gradient descent method to obtain the final weight coefficient vector α of the commentator features, and the trueness of a commentator can be expressed as $\mathrm{URE} = 1/[1 + \exp(-\alpha^T u)]$.
The same method yields the final weight coefficient vector β of the merchant features, and the trueness of a merchant can be expressed as $\mathrm{MRE} = 1/[1 + \exp(-\beta^T m)]$.
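The training procedure of Example 3 can be sketched in plain Python as follows. This is a sketch under stated assumptions rather than the patented implementation: the learning rate, the number of epochs, the zero initialization and the eight training samples are all hypothetical, and the function names are illustrative. The gradient used is the standard one for the loss above, namely the average of (h_α(u_i) − y_i)·u_i over the training set.

```python
import math


def trueness(alpha, u):
    """h_alpha(u) = 1 / (1 + exp(-alpha^T u)); u begins with the constant 1."""
    z = sum(a * x for a, x in zip(alpha, u))
    return 1.0 / (1.0 + math.exp(-z))


def loss(alpha, samples, labels):
    """Average negative log-likelihood J over the labeled training set."""
    total = 0.0
    for u, y in zip(samples, labels):
        h = trueness(alpha, u)
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(samples)


def fit(samples, labels, rate=0.5, epochs=2000):
    """Batch gradient descent: each update uses the whole training set."""
    n = len(samples)
    alpha = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(alpha)
        for u, y in zip(samples, labels):
            err = trueness(alpha, u) - y
            for j, x in enumerate(u):
                grad[j] += err * x / n
        alpha = [a - rate * g for a, g in zip(alpha, grad)]
    return alpha


# Hypothetical normalized commentator vectors (1, u1, u2) with labels
# 1 = real commentator, 0 = false commentator.
samples = [[1, 0.9, 0.8], [1, 0.8, 0.7], [1, 0.7, 0.9], [1, 0.5, 0.6],
           [1, 0.2, 0.3], [1, 0.3, 0.1], [1, 0.1, 0.2], [1, 0.6, 0.5]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
alpha = fit(samples, labels)
ure = trueness(alpha, [1, 0.9, 0.8])  # trueness URE of one commentator
```

The merchant model would be trained in the same way on merchant vectors m, giving β and MRE.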
Example 4: classifying the commentators into individual commentators and group commentators, and constructing the corresponding evaluation models
The commentators to be detected are divided into individual commentators and group commentators according to the criterion that commentators who have commented on three or more merchants in common form a group.
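One possible reading of the grouping criterion is sketched below. It assumes that two commentators are linked when they have commented on at least three merchants in common and that linked commentators are merged transitively; both the reading and the names `split_commentators` and `reviewed` are assumptions, since the text does not spell out the procedure.

```python
from itertools import combinations


def split_commentators(reviewed, min_common=3):
    """Split commentators into individuals and groups.

    reviewed maps each commentator to the set of merchants commented on.
    """
    parent = {c: c for c in reviewed}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Link every pair that shares at least min_common merchants.
    for a, b in combinations(reviewed, 2):
        if len(reviewed[a] & reviewed[b]) >= min_common:
            parent[find(a)] = find(b)

    clusters = {}
    for c in reviewed:
        clusters.setdefault(find(c), set()).add(c)
    groups = [members for members in clusters.values() if len(members) > 1]
    individuals = {c for members in clusters.values()
                   if len(members) == 1 for c in members}
    return individuals, groups


# Hypothetical comment records.
reviewed = {
    "u1": {"m1", "m2", "m3", "m4"},
    "u2": {"m1", "m2", "m3"},
    "u3": {"m2", "m3", "m4", "m5"},
    "u4": {"m1", "m6"},
    "u5": {"m7"},
}
individuals, groups = split_commentators(reviewed)
```

Here u1 shares three merchants with u2 and three with u3, so the three of them form one group, while u4 and u5 remain individuals.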
Individual commentator evaluation model:
For convenience of explanation, an individual commentator is referred to as an individual.
Step 1: obtain the feature values of the individual, construct the individual feature vector (1, u1, u2, …, uk), and substitute it into the commentator logistic regression equation to obtain the individual trueness URE = $1/[1 + \exp(-(\alpha_0 + \alpha_1 u_1 + \dots + \alpha_k u_k))]$;
Step 2: suppose that all commodities purchased by the individual are divided according to the commodity price intervals { (1, 50], (50, 100], (100, 150], (150, 200], (200, 250], (250, 300], (300, 350], (350, 400], (400, 450], (450, 500] }, giving the initial commodity number sequence (5, 6, 8, 10, 4, 5, 6, 8, 4), denoted c, where 5 indicates that the individual purchased 5 commodities in the price range of 1 to 50 yuan, and so on.
Step 3: given that the sequence c obeys a Poisson distribution with parameter λ = 6, generate 10 Poisson-distributed random numbers with random('poisson', 6, 1, 10) and use them to sample the initial commodity number sequence; provided that the selected random numbers do not exceed the original data, the sampled commodity number sequence (4, 5, 4, 6, 4, 6, 5, 4) is denoted sc.
Step 4: calculate the mean merchant trueness corresponding to each commodity number in the sequence sc. Specifically, suppose that the first commodity number, 4, corresponds to two merchants, merchant 1 and merchant 2. Obtain the feature values of merchant 1, construct its feature vector (1, m1, m2, …, mk), and substitute it into the merchant logistic regression equation to obtain the trueness of merchant 1, MRE_1 = $1/[1 + \exp(-(\beta_0 + \beta_1 m_1 + \dots + \beta_k m_k))]$; the trueness of merchant 2, MRE_2, is obtained by the same process, so that MRE_avg_1 = (MRE_1 + MRE_2)/2. Proceeding in the same way, the merchant trueness means corresponding to the sequence sc are expressed as (MRE_avg_1, MRE_avg_2, …, MRE_avg_10).
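Steps 2 and 3 above can be sketched as follows. The original uses the MATLAB-style call random('poisson', ...); this sketch substitutes Knuth's multiplication method so that it needs only the Python standard library. Redrawing a number until it does not exceed the original count is an assumed reading of the sampling condition, the parameter is taken as the mean of the counts as in the claims, and the price list is hypothetical.

```python
import math
import random


def count_by_interval(prices, edges):
    """Count prices falling in the half-open intervals (edges[i], edges[i+1]]."""
    counts = [0] * (len(edges) - 1)
    for p in prices:
        for i in range(len(counts)):
            if edges[i] < p <= edges[i + 1]:
                counts[i] += 1
                break
    return counts


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_counts(counts, lam, rng):
    """Redraw each Poisson number until it does not exceed the original count."""
    sampled = []
    for c in counts:
        draw = poisson(lam, rng)
        while draw > c:
            draw = poisson(lam, rng)
        sampled.append(draw)
    return sampled


# Price intervals from the text: (1, 50], (50, 100], ..., (450, 500].
edges = [1] + list(range(50, 501, 50))
# Hypothetical purchase prices of one individual.
prices = [12, 30, 49.9, 50, 75, 120, 480, 500]
c = count_by_interval(prices, edges)
c_avg = sum(c) / len(c)
sc = sample_counts(c, c_avg, random.Random(0))
```

Fixing the seed of `random.Random` only makes the sketch repeatable; any seed satisfies the condition that no sampled value exceeds its original count.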
Group commentator evaluation model:
For convenience of explanation, a group commentator is referred to as a group.
Step 1: obtain the feature values of the merchant, construct the merchant feature vector (1, m1, m2, …, mk), and substitute it into the merchant logistic regression equation to obtain the merchant trueness MRE = $1/[1 + \exp(-(\beta_0 + \beta_1 m_1 + \dots + \beta_k m_k))]$;
Step 2: suppose that all group comments received by a merchant are divided according to the comment time intervals { (1, 9], (5, 12], (10, 15], (15, 25], (16, 30], (25, 40], (40, 50], (45, 55], (55, 70], (60, 75] }, giving the initial group comment number sequence (4, 5, 6, 5, 4, 5), denoted r, where 4 indicates that the merchant received 4 comments from one group in the time range of 1 to 9 days, and so on.
Step 3: given that the sequence r obeys a Poisson distribution with parameter λ = 5, generate 10 Poisson-distributed random numbers with random('poisson', 5, 1, 10) and use them to sample the initial group comment number sequence; provided that the selected random numbers do not exceed the original data, the sampled group comment number sequence (4, 5, 4, 5, 4, 4, 5, 5, 3, 4) is denoted sr.
Step 4: calculate the mean group trueness corresponding to each group comment number in the sequence sr. Specifically, suppose that the first group comment number, 4, corresponds to two commentators, commentator 1 and commentator 2. Obtain the feature values of commentator 1, construct its feature vector (1, u1, u2, …, uk), and substitute it into the commentator logistic regression equation to obtain the trueness of commentator 1, URE_1 = $1/[1 + \exp(-(\alpha_0 + \alpha_1 u_1 + \dots + \alpha_k u_k))]$; the trueness of commentator 2, URE_2, is obtained by the same process, so that URE_avg_1 = (URE_1 + URE_2)/2. Proceeding in the same way, the group trueness means corresponding to the sequence sr are expressed as (URE_avg_1, URE_avg_2, …, URE_avg_10).
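The averaging in Step 4 can be sketched as below. The coefficient values and the feature vectors are hypothetical, as are the function names; the same computation would serve for the merchant means MRE_avg of the individual model by passing β and merchant features instead.

```python
import math


def trueness(coeffs, features):
    """1 / (1 + exp(-(c0 + c1*f1 + ... + ck*fk))).

    features excludes the leading constant 1 of the feature vector.
    """
    z = coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def mean_trueness(coeffs, slots):
    """Average the trueness of the parties attached to each sequence position."""
    return [sum(trueness(coeffs, f) for f in parties) / len(parties)
            for parties in slots]


# Hypothetical regression coefficients (alpha_0, alpha_1, alpha_2).
alpha = [-1.0, 2.0, 1.0]
# Hypothetical commentators behind the first two entries of the sequence sr:
# two commentators for the first entry, one for the second.
slots = [[[0.5, 0.0], [1.0, 1.0]], [[0.5, 0.0]]]
ure_avg = mean_trueness(alpha, slots)
```

Each entry of `ure_avg` plays the role of URE_avg_k in the evaluation vectors of Example 5.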
Example 5: establishing the comment evaluation matrices, detecting the comment quality, and identifying false comments
The individual comment evaluation matrix is X = {x_1, x_2, …, x_n}, where x_j = (x_j_URE, x_j_MRE_avg_k), k ∈ [1, 10], represents the trueness of the jth individual and of its corresponding class of merchants. The group comment evaluation matrix is Y = {y_1, y_2, …, y_n}, where y_j = (y_j_MRE, y_j_URE_avg_k), k ∈ [1, 10], represents the trueness of the jth merchant and of its corresponding class of groups.
The detection of individual comment quality is taken as the example below; the same method also applies to the detection of group comment quality.
Randomly select feature evaluation vectors c_1 and c_2 as the category center of real comments and the category center of false comments respectively, dividing the samples into two categories. The Euclidean distance between a sample x_j and a category center c_i is $d_{ij} = \|x_j - c_i\|$; $u_{ij}$ denotes the membership of the jth sample in the ith category, U denotes the fuzzy classification matrix, and V denotes the category center matrix.
The objective function and constraint of the fuzzy C-means clustering are as follows:
$$J(U,V) = u_{11}^{m} d_{11}^{2} + u_{12}^{m} d_{12}^{2} + \dots + u_{1n}^{m} d_{1n}^{2} + u_{21}^{m} d_{21}^{2} + u_{22}^{m} d_{22}^{2} + \dots + u_{2n}^{m} d_{2n}^{2}$$
$$u_{1j} + u_{2j} = 1, \quad j = 1, 2, \dots, n$$
From these, the membership function and the category centers are derived:
$$u_{ij} = \left[ \left(\frac{d_{ij}}{d_{1j}}\right)^{2/(m-1)} + \left(\frac{d_{ij}}{d_{2j}}\right)^{2/(m-1)} \right]^{-1}, \quad i = 1, 2;\ j = 1, 2, \dots, n$$
$$c_i = \frac{u_{i1}^{m} x_1 + u_{i2}^{m} x_2 + \dots + u_{in}^{m} x_n}{u_{i1}^{m} + u_{i2}^{m} + \dots + u_{in}^{m}}, \quad i = 1, 2$$
Taking the threshold ε = 0.001 and m = 2, the algorithm stops iterating when $\|\Delta c_i\| < \varepsilon$ is satisfied and outputs the optimal fuzzy classification matrix U and the category center matrix V.
Suppose that a certain vector in U is (0.6, 0.4): the comment belongs to the real category with membership 0.6 and to the false category with membership 0.4, so it is classified as a real comment. Conversely, if a certain vector is (0.4, 0.6), the comment belongs to the real category with membership 0.4 and to the false category with membership 0.6, so it is classified as a false comment. When a certain vector is (0.5, 0.5), the comment is equally likely to be real or false, and it is not assigned to either category.
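The two-category fuzzy C-means iteration described above can be sketched in plain Python as follows. The sample evaluation vectors are hypothetical, and the initial centers are fixed to two of the samples instead of being chosen at random so that the sketch is repeatable; the iteration cap `max_iter` and the handling of a sample that coincides with a center are also assumptions. Note that this sketch stores U with one row per sample, that is, row j holds (u_1j, u_2j).

```python
import math


def fcm_two_class(samples, centers, m=2.0, eps=0.001, max_iter=100):
    """Two-category fuzzy C-means; returns (U, V) with U[j] = (u_1j, u_2j)."""
    power = 2.0 / (m - 1.0)
    dim = len(samples[0])
    U = []
    for _ in range(max_iter):
        # Membership update from the current centers.
        U = []
        for x in samples:
            d = [math.dist(x, c) for c in centers]
            if min(d) == 0.0:
                # The sample coincides with a center: give it full membership.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                row = [r / sum(row) for r in row]
            else:
                row = [1.0 / sum((di / dk) ** power for dk in d) for di in d]
            U.append(row)
        # Center update: membership-weighted mean of the samples.
        new_centers = []
        for i in range(2):
            w = [row[i] ** m for row in U]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * x[k] for wj, x in zip(w, samples)) / total
                for k in range(dim)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return U, centers


def label(row):
    """Apply the decision rule: compare the two membership degrees."""
    if row[0] > row[1]:
        return "real"
    if row[0] < row[1]:
        return "false"
    return "undivided"


# Hypothetical evaluation vectors (URE, MRE_avg) of six individual comments.
X = [(0.9, 0.85), (0.8, 0.9), (0.85, 0.8),
     (0.2, 0.3), (0.15, 0.25), (0.3, 0.2)]
U, V = fcm_two_class(X, centers=[X[0], X[3]])
labels = [label(row) for row in U]
```

Because the first center is started on a high-trueness sample, the first column of U is read as the membership in the real category.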
Finally, it should be noted that: the above examples are intended only to illustrate the technical process of the invention, and not to limit it; although the invention has been described in detail with reference to the foregoing examples, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing examples can be modified, or some technical features can be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding embodiments of the present invention.

Claims (1)

1. A method for detecting comment quality in an E-commerce website is characterized by comprising the following steps:
the mechanism by which the merchant scores the commentator is as follows: calculation rules are defined for three types of behavior reflecting the characteristics of the commentator: logging in (logging in at different places and logging in frequently); browsing (browsing similar commodities before purchase); and commenting (evaluating the commodity after purchase);
in each transaction, the merchant marks each of the above three types of behavior of the commentator as 1 if it occurs and 0 if it does not, giving 8 combinations, wherein 001, 101 and 111 are defined as suspicious transactions and the rest are normal transactions;
an X score is defined, the merchant score X·p_1 is calculated as a first-class feature according to the probability p_1 of a normal transaction in n_E61 transactions, and the commentator feature vector is constructed from this feature;
the commentator is represented by a vector: u = (1, u_feature_1, u_feature_2, …, u_feature_k), the parameters in the vector represent the quantized commentator feature values, and the initial values include: u_feature_1 = "merchant score", u_feature_2 = "registration time", u_feature_k = null value;
the mechanism by which the commentator scores the merchant is as follows: calculation rules are defined for three types of attributes reflecting the characteristics of the merchant: goods (whether the description is consistent); service (service attitude); and logistics (logistics situation);
in each transaction, the commentator marks each of the above three types of attributes of the merchant as 1 if it is good and 0 if it is poor, giving 8 combinations, wherein 000, 001, 010 and 100 are defined as inferior transactions and the rest are high-quality transactions;
an X score is defined, the commentator score X·p_2 is calculated as a first-class feature according to the probability p_2 of a high-quality transaction in n_E62 transactions, and the merchant feature vector is constructed from this feature;
the merchant is represented by a vector: m_E7 = (1, m_feature_1, m_feature_2, …, m_feature_k), the parameters in the vector represent the quantized merchant feature values, and the initial values include: m_feature_1 = "commentator score", m_feature_2 = "registration time", m_feature_k = null value;
the distribution values in the feature sets of the commentators and the merchants are discussed and screened as reference values for statistical analysis, so the values in the feature sets need to be normalized, and the processed data are input into the logistic regression model;
because each feature set has an upper limit and a lower limit, it satisfies the basic characteristics of a series, and a convergence model $S_{n\_E1}(x) = x^{n\_E1} / (x^{n\_E1} + \mathrm{Range})$ is established according to the uniform convergence property of series of functions, wherein Range is defined as the maximum value of the value range of the feature and the argument x is a feature set with a defined range; $S_{n\_E1}(x)$ and the sequence S_PartSum(x) converge uniformly to 1 over the argument interval (0, Range);
then, according to the Range value, an initial n_E1 value is selected so that all feature values within the value range approach the normalized interval (0, 1), and the result of the feature data processing is obtained;
the normalized target feature values are combined into target feature vectors, namely the feature vectors of each commentator and each merchant;
a statistical analysis labeling method is adopted to mine the feature vectors meeting the conditions from the established feature vector set and to label their categories, forming the training set of the logistic regression;
in the commentator logistic regression training set data, the independent variables are the quantized and normalized commentator features, the dependent variable is the annotated commentator category and obeys a Bernoulli distribution, a real commentator is marked as 1, and a false commentator is marked as 0;
the feature matrix of the commentator training set is represented as U_E4 = {u_1, u_2, …, u_nE4}, wherein u_i is the (k + 1)-dimensional feature vector of the ith sample, and the labeling result of the training set is represented by an nE4-dimensional 0,1 vector;
a (k + 1)-dimensional regression coefficient vector α is obtained by adopting the maximum likelihood estimation principle and the batch gradient descent method, and the trueness of a commentator is expressed as URE = $1/[1 + \exp(-\alpha^T u\_E2)]$;
in the merchant logistic regression training set data, the independent variables are the quantized and normalized merchant features, the dependent variable is the annotated merchant category and obeys a Bernoulli distribution, a real merchant is marked as 1, and a false merchant is marked as 0;
the feature matrix of the merchant training set is represented as M = {m_1, m_2, …, m_nProSet}, wherein m_i is the (k + 1)-dimensional feature vector of the ith sample, and the labeling result of the training set is represented by an nProSet-dimensional 0,1 vector;
a (k + 1)-dimensional regression coefficient vector β is obtained by adopting the maximum likelihood estimation principle and the batch gradient descent method, and the trueness of a merchant is expressed as MRE = $1/[1 + \exp(-\beta^T u\_E2mre)]$;
the commentators to be detected are divided into individual commentators and group commentators according to the criterion that commentators who have commented on three or more merchants in common form a group;
individual commentator evaluation model: an individual commentator is defined as an individual, and the model is as follows:
(1) calculating the individual trueness URE by adopting the logistic regression model;
(2) dividing all commodity numbers purchased by an individual according to specified price intervals (p_1, p_2, …, p_n) to obtain an initial commodity number sequence (GSet_c_1, GSet_c_2, …, GSet_c_n);
(3) the initial commodity number sequence obeys a Poisson distribution with parameter c_avg, and random('poisson', c_avg, 1, n_E31) is adopted to generate a Poisson-distributed random number sequence, wherein poisson denotes sampling from the Poisson distribution and c_avg is defined as the average of the distribution values; the sampled commodity number sequence is expressed as (sc_1, sc_2, …, sc_n);
(4) calculating the merchant trueness means (MRE_avg_1, MRE_avg_2, …, MRE_avg_n) corresponding to the sampled commodity number sequence by adopting the logistic regression model;
group commentator evaluation model: a group commentator is defined as a group, and the model is as follows:
(1) calculating the merchant trueness MRE by adopting the logistic regression model;
(2) dividing all the comment numbers received by a merchant according to specified time intervals (t_1, t_2, …, t_n), ensuring that each interval contains the comments of only one group, to obtain an initial group comment number sequence (r_1, r_2, …, r_n);
(3) the initial group comment number sequence obeys a Poisson distribution with parameter r_avg, and random('poisson', r_avg, 1, n_E32) is adopted to generate a Poisson-distributed random number sequence, wherein poisson denotes sampling from the Poisson distribution and r_avg is defined as the average of the distribution values; the sampled group comment number sequence is expressed as (sr_1, sr_2, …, sr_n);
(4) calculating the group trueness means (URE_avg_1, URE_avg_2, …, URE_avg_n) corresponding to the sampled group comment number sequence by adopting the logistic regression model;
the individual comment sample feature evaluation matrix is denoted as X = {x_1, x_2, …, x_n}, wherein x_j = (x_j_URE, x_j_MRE_avg) is the trueness of the individual of the jth sample and of its corresponding class of merchants;
the group comment sample feature evaluation matrix is denoted as Y = {y_1, y_2, …, y_n}, wherein y_j = (y_j_MRE, y_j_URE_avg) is the trueness of the merchant of the jth sample and of its corresponding class of groups;
the algorithm for detecting the individual comment quality applies in the same way to the detection of the group comment quality, and is as follows:
step_1, randomly selecting feature evaluation vectors c_1 and c_2 as the category center of real comments and the category center of false comments respectively, and dividing the samples into two categories, wherein the Euclidean distance between a sample x_j and a category center c_i is $d_{ij} = \|x_j - c_i\|$, $u_{ij}$ represents the membership function of the jth sample to the ith category, U represents the fuzzy classification matrix, and V represents the category center matrix;
step_2, the objective function and constraint of the fuzzy C-means clustering are as follows:
$$J(U,V) = u_{11}^{m} d_{11}^{2} + u_{12}^{m} d_{12}^{2} + \dots + u_{1n}^{m} d_{1n}^{2} + u_{21}^{m} d_{21}^{2} + u_{22}^{m} d_{22}^{2} + \dots + u_{2n}^{m} d_{2n}^{2}$$
$$u_{1j} + u_{2j} = 1, \quad j = 1, 2, \dots, n$$
step_3, deriving the membership function and the category centers:
$$u_{ij} = \left[ \left(\frac{d_{ij}}{d_{1j}}\right)^{2/(m-1)} + \left(\frac{d_{ij}}{d_{2j}}\right)^{2/(m-1)} \right]^{-1}, \quad i = 1, 2;\ j = 1, 2, \dots, n$$
$$c_i = \frac{u_{i1}^{m} x_1 + u_{i2}^{m} x_2 + \dots + u_{in}^{m} x_n}{u_{i1}^{m} + u_{i2}^{m} + \dots + u_{in}^{m}}, \quad i = 1, 2$$
step_4, taking the threshold ε = 0.001, the algorithm stops iterating when $\|\Delta c_i\| < \varepsilon$ is satisfied and outputs the optimal fuzzy classification matrix U and the category center matrix V;
according to the fuzzy classification matrix U, the membership degree u1 of an individual comment in the real comment category and its membership degree u2 in the false comment category are known, and u1 and u2 are taken as the indexes for dividing the comment quality: when u1 > u2, the comment belongs to the real comment category; when u1 < u2, the comment belongs to the false comment category; and when u1 = u2, the comment is not divided;
the quality of group comments is detected by the same method as the quality of individual comments.
CN202010944581.XA 2020-09-10 2020-09-10 Method for detecting comment quality in E-commerce website Active CN112070543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010944581.XA CN112070543B (en) 2020-09-10 2020-09-10 Method for detecting comment quality in E-commerce website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010944581.XA CN112070543B (en) 2020-09-10 2020-09-10 Method for detecting comment quality in E-commerce website

Publications (2)

Publication Number Publication Date
CN112070543A CN112070543A (en) 2020-12-11
CN112070543B true CN112070543B (en) 2023-04-07

Family

ID=73663633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010944581.XA Active CN112070543B (en) 2020-09-10 2020-09-10 Method for detecting comment quality in E-commerce website

Country Status (1)

Country Link
CN (1) CN112070543B (en)

Also Published As

Publication number Publication date
CN112070543A (en) 2020-12-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant