CN108053268A - Commodity clustering confirmation method and device

Commodity clustering confirmation method and device

Info

Publication number
CN108053268A
Authority
CN
China
Prior art keywords
commodity
user
matrix
data
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711484831.0A
Other languages
Chinese (zh)
Inventor
范芳铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Pinwei Software Co Ltd
Original Assignee
Guangzhou Pinwei Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Pinwei Software Co Ltd
Priority to CN201711484831.0A
Publication of CN108053268A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the invention disclose a commodity clustering confirmation method and device. The method includes: sampling the purchasing users that correspond to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user; generating a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows; calculating the pairwise similarity of the target commodity types from the data of the first matrix; and performing hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result. The clustering result reveals which commodity types are highly correlated within users' orders, so that multiple commodities can be bundled and recommended accurately, which further improves the effective sales of the e-commerce business.

Description

Commodity clustering confirmation method and device
Technical field
The present invention relates to the field of commodity consumption data analysis, and in particular to a commodity clustering confirmation method and device.
Background
With the rapid development of network technology, the e-commerce industry has risen quickly, and online shopping has penetrated every corner of daily life.
E-commerce companies that have long focused on categories such as women's clothing, shoes and hats have accumulated a great deal of valuable data. The data shows that people of similar age and income may dress in entirely different styles, so that distinctive "clothing groups" can be identified, and that commodities which are often bought together can be promoted and recommended as bundles.
When analyzing such data, e-commerce companies with few commodities typically use last-place elimination: commodities that sell poorly are removed and replaced with new ones, which is simple and intuitive. Some e-commerce companies instead use Pearson correlation to compute the relationships between commodities, but with large-scale data they usually cannot estimate the commodity covariance accurately, and therefore cannot promote and recommend commodities accurately.
Summary of the invention
Embodiments of the present invention provide a commodity clustering confirmation method and device, which solve the technical problem that e-commerce companies currently cannot estimate commodity covariance accurately and therefore cannot promote and recommend commodities accurately.
An embodiment of the present invention provides a commodity clustering confirmation method, including:
sampling the purchasing users corresponding to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user;
generating a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows, calculating the pairwise similarity of the target commodity types from the data of the first matrix, and performing hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result.
Preferably, before sampling the purchasing users corresponding to the sales quantities of the target commodity types to obtain the user consumption data to be clustered, the method further includes:
calculating the standard deviation of the commodity sales quantities from the obtained sales quantities of all commodity types, and removing the entries whose sales quantity is not greater than the standard deviation, to obtain the sales quantities of the target commodity types.
Preferably, after generating the first matrix, calculating the pairwise similarity of the target commodity types and performing hierarchical clustering to obtain the commodity clustering result, the method further includes:
transposing the first matrix to obtain a second matrix, calculating the similarity between users from the data of the second matrix, and performing hierarchical clustering on all users according to the similarity between users to obtain a user clustering result.
Preferably, calculating the pairwise similarity of the target commodity types from the data of the first matrix specifically is:
calculating the pairwise distance of the target commodity types from the data of the first matrix by Euclidean distance, Manhattan distance or Pearson correlation, where the pairwise distance of the target commodity types corresponds to their pairwise similarity.
Preferably, an embodiment of the present invention further provides a commodity clustering confirmation device, including:
a sampling unit, configured to sample the purchasing users corresponding to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user;
a clustering unit, configured to generate a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows, calculate the pairwise similarity of the target commodity types from the data of the first matrix, and perform hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result.
Preferably, the commodity clustering confirmation device provided by the embodiment of the present invention further includes:
a removal unit, configured to calculate the standard deviation of the commodity sales quantities from the obtained sales quantities of all commodity types, and remove the entries whose sales quantity is not greater than the standard deviation, to obtain the sales quantities of the target commodity types.
Preferably, the commodity clustering confirmation device provided by the embodiment of the present invention further includes:
a transposition unit, configured to transpose the first matrix to obtain a second matrix, calculate the similarity between users from the data of the second matrix, and perform hierarchical clustering on all users according to the similarity between users to obtain a user clustering result.
Preferably, the clustering unit specifically includes:
a generation subunit, configured to generate the first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows;
a calculation subunit, configured to calculate the pairwise distance of the target commodity types from the data of the first matrix by Euclidean distance, Manhattan distance or Pearson correlation, where the pairwise distance of the target commodity types corresponds to their pairwise similarity;
a clustering subunit, configured to perform hierarchical clustering on the target commodity types according to the similarity to obtain the commodity clustering result.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:
Embodiments of the present invention provide a commodity clustering confirmation method and device. The method includes: sampling the purchasing users corresponding to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user; generating a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows; calculating the pairwise similarity of the target commodity types from the data of the first matrix; and performing hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result. The invention confirms the sales quantities of the target commodity types, samples the corresponding users to obtain the user consumption data to be clustered, arranges that data as a matrix, calculates the similarity between every two target commodity types, and finally performs hierarchical clustering on the commodities according to the similarity. The resulting clusters reveal which commodity types are highly correlated within users' orders, so that multiple commodities can be bundled and recommended accurately, which further improves the effective sales of the e-commerce business.
Description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of an embodiment of the commodity clustering confirmation method provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of an embodiment of the commodity clustering confirmation device provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the hierarchical clustering process in an embodiment of the commodity clustering confirmation method provided by an embodiment of the present invention.
Detailed description
Embodiments of the present invention provide a commodity clustering confirmation method and device, which solve the technical problem that e-commerce companies currently cannot estimate commodity covariance accurately and therefore cannot promote and recommend commodities accurately.
To make the objects, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The embodiments disclosed below are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, an embodiment of the commodity clustering confirmation method provided by the present invention includes:
101. Calculate the standard deviation of the commodity sales quantities from the obtained sales quantities of all commodity types, and remove the entries whose sales quantity is not greater than the standard deviation, to obtain the sales quantities of the target commodity types.
The collected data contains the sales quantities of many commodities, and these quantities roughly follow a power-law distribution. To retain only useful data, this embodiment calculates the standard deviation of the commodity sales quantities over all commodity types and removes the entries whose sales quantity is not greater than that standard deviation. For example, given four commodities a, b, c and d, the standard deviation of their sales quantities is calculated; the commodities whose sales quantity exceeds the standard deviation are kept as the target commodity types for the subsequent calculation, and the rest are removed. The standard deviation is calculated as:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}$$
where x_i is the sales quantity of the i-th commodity type, N is the number of commodity types, and μ is the arithmetic mean of the sales quantities of all commodity types.
Removing this data retains the commodity types, and their sales quantities, that are useful and worth studying, and prepares the input for the subsequent matrix calculation.
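Step 101 can be sketched as follows. This is a minimal illustration, assuming the population standard deviation (division by N) and a hypothetical dictionary of sales quantities; the function name `filter_by_std` and the sample figures are not from the original text.

```python
from math import sqrt

def filter_by_std(sales):
    """Keep commodities whose sales quantity exceeds the standard deviation.

    `sales` maps a commodity name to its sales quantity (hypothetical input).
    Returns the retained commodities and the standard deviation used.
    """
    n = len(sales)
    mean = sum(sales.values()) / n
    # Population standard deviation of the sales quantities
    std = sqrt(sum((x - mean) ** 2 for x in sales.values()) / n)
    kept = {name: qty for name, qty in sales.items() if qty > std}
    return kept, std

# Hypothetical sales quantities for four commodities a, b, c and d
sales = {"a": 100, "b": 60, "c": 20, "d": 20}
kept, std = filter_by_std(sales)
print(kept, round(std, 2))
```

With these figures the mean is 50 and the standard deviation is about 33.17, so a and b are kept as target commodity types while c and d are removed.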
102. Sample the purchasing users corresponding to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user.
In an e-commerce company the user base may reach several hundred million. Because the amount of user data is huge, the data must be filtered further. After the sales quantities of the target commodity types are obtained, the users corresponding to those sales quantities are sampled; the sampling ratio may be, for example, one tenth or one twentieth. Sampling yields data of a manageable scale, namely the user consumption data to be clustered.
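The sampling in step 102 might look like the sketch below. It assumes uniform random sampling without replacement over the users who bought at least one target commodity; the text does not say how the sample is drawn, and the function name, record layout and seed are illustrative assumptions.

```python
import random

def sample_consumption(records, target_items, ratio, seed=0):
    """Return the consumption records of a random sample of purchasing users.

    `records` is a list of (user_id, commodity, quantity) tuples.
    Only users who bought a target commodity are eligible for sampling.
    """
    buyers = sorted({user for user, item, _ in records if item in target_items})
    if not buyers:
        return []
    k = max(1, round(len(buyers) * ratio))
    sampled = set(random.Random(seed).sample(buyers, k))
    return [(user, item, qty) for user, item, qty in records
            if user in sampled and item in target_items]

# Hypothetical data: 20 users each bought commodity A, one user bought only Z
records = [(u, "A", 1) for u in range(20)] + [(99, "Z", 5)]
to_cluster = sample_consumption(records, {"A"}, ratio=0.1)
print(len(to_cluster))
```

A fixed seed keeps the sample reproducible between runs, which makes later clustering results easier to compare.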
103. Generate a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows, calculate the pairwise similarity of the target commodity types from the data of the first matrix, and perform hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result.
In this embodiment, the pairwise similarity of the target commodity types is calculated from the data of the first matrix as follows: the pairwise distance of the target commodity types is calculated by Euclidean distance, Manhattan distance or Pearson correlation, and this pairwise distance corresponds to the pairwise similarity of the target commodity types.
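Two of the distance measures named above can be sketched as follows, assuming each commodity is represented by a vector of the quantities bought by each user. The function names and the two sample vectors are illustrative assumptions.

```python
from math import sqrt

def euclidean(v1, v2):
    """Straight-line distance between two equal-length quantity vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def manhattan(v1, v2):
    """Sum of absolute per-user differences between two quantity vectors."""
    return sum(abs(a - b) for a, b in zip(v1, v2))

# Hypothetical purchase quantities of two commodities across five users
item_x = [9, 4, 0, 0, 0]
item_y = [2, 3, 5, 0, 0]
print(euclidean(item_x, item_y), manhattan(item_x, item_y))
```

In both cases a smaller distance means the two commodities are bought in more similar patterns, so the distance can be used directly as the inverse of similarity in the clustering step.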
It should be noted that user consumption data generally takes the form: user ID, commodity ID (type), quantity. For example, user 1 bought 9 of commodity A, user 1 bought 2 of commodity B, user 2 bought 4 of commodity A, user 2 bought 3 of commodity B, and so on. This consumption data generates the following first matrix:
First matrix
              User 1  User 2  User 3  User 4  User 5
Commodity A        9       4       0       0       0
Commodity B        2       3       5       0       0
Commodity C        0       0       2       0       0
Commodity D        0       0       0       7      11
Commodity E        0       0       0       9       8
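The construction of the first matrix from (user ID, commodity ID, quantity) records can be sketched as below. The function name and the list-of-lists representation are assumptions for illustration; the records reproduce the example table.

```python
def build_first_matrix(records, users, items):
    """Build a matrix with one row per commodity and one column per user.

    `records` is a list of (user_id, commodity, quantity) tuples;
    missing user/commodity pairs are filled with 0.
    """
    col = {user: j for j, user in enumerate(users)}
    row = {item: i for i, item in enumerate(items)}
    matrix = [[0] * len(users) for _ in items]
    for user, item, qty in records:
        matrix[row[item]][col[user]] += qty
    return matrix

records = [
    (1, "A", 9), (1, "B", 2), (2, "A", 4), (2, "B", 3), (3, "B", 5),
    (3, "C", 2), (4, "D", 7), (4, "E", 9), (5, "D", 11), (5, "E", 8),
]
first_matrix = build_first_matrix(records, users=[1, 2, 3, 4, 5], items="ABCDE")
for name, quantities in zip("ABCDE", first_matrix):
    print(name, quantities)
```

Quantities are accumulated with `+=`, so repeated purchases of the same commodity by the same user are summed into one cell.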
Hierarchical clustering builds a hierarchy of groups by repeatedly merging the two most similar groups. Each group starts as a single element. In each iteration, the hierarchical clustering algorithm calculates the distance between every two groups and merges the two nearest groups into a new group. This process repeats until only one group remains. The users or commodities that appear together can then be read off the groups; they share common characteristics.
In this embodiment, the core of the calculation on the first matrix in the clustering algorithm is computing the similarity between the purchases of one commodity and another (that is, the distance between two commodities), which can be done with Euclidean distance, Manhattan distance or Pearson correlation. The calculation may proceed as follows. As shown in Fig. 3, commodities A, B, C, D and E each start as an independent group, and the pairwise distances between commodities are calculated. From the data above, taking commodity A as the reference, commodity B is the closest to commodity A, so A and B are clustered into a new group A'. Continuing with the data of user 3, the distance between commodity C and commodity B is small, so A' and C are clustered into A''. From the data of users 4 and 5, commodities D and E are each far from A'' but close to each other, so D and E are clustered into D', and finally D' and A'' are clustered into the final result. According to the final result, the commodity combinations (A, B) and (D, E) can be given priority for bundled sale, the combination (A, B, C) comes second, and the combination (A, B, C, D, E) is considered last.
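The merging loop can be sketched as follows on the example first matrix. This is a minimal illustration, assuming Pearson distance and representing a merged group by the element-wise average of its two members' vectors; the text specifies neither choice, and the function names are hypothetical. The order of the early merges depends on these choices and need not match the walkthrough above.

```python
from math import sqrt

def pearson_distance(v1, v2):
    """Return 1 - Pearson correlation; smaller means more similar."""
    n = len(v1)
    s1, s2 = sum(v1), sum(v2)
    num = sum(a * b for a, b in zip(v1, v2)) - s1 * s2 / n
    den = sqrt((sum(a * a for a in v1) - s1 * s1 / n) *
               (sum(b * b for b in v2) - s2 * s2 / n))
    # Zero variance: treated as distance 0 here, an arbitrary convention
    return 0.0 if den == 0 else 1.0 - num / den

def hcluster(rows, distance=pearson_distance):
    """Agglomerative clustering; returns the merges in the order performed."""
    clusters = [((name,), list(vec)) for name, vec in rows.items()]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of groups
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        (m1, v1), (m2, v2) = clusters[i], clusters[j]
        # The merged group is represented by the average of the two vectors
        merged = (m1 + m2, [(a + b) / 2 for a, b in zip(v1, v2)])
        merges.append((m1, m2, d))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

rows = {
    "A": [9, 4, 0, 0, 0],
    "B": [2, 3, 5, 0, 0],
    "C": [0, 0, 2, 0, 0],
    "D": [0, 0, 0, 7, 11],
    "E": [0, 0, 0, 9, 8],
}
merges = hcluster(rows)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

Under these assumptions the last merge joins the group containing A, B and C with the group containing D and E, which agrees with the top-level split in the walkthrough.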
When calculating similarity, the Pearson similarity can be used, since it avoids the bias that individual abnormal values would otherwise introduce into the whole. Example code is as follows:

from math import sqrt

# Weighted Pearson similarity. The more similar the inputs, the smaller the returned value.
# v1 and v2 are the two sequences of values whose similarity is compared.
def pearson_similary(v1, v2):
    # Sums and sums of squares of each sequence
    sum1 = sum(v1)
    sum2 = sum(v2)
    sum1Sq = sum([pow(v, 2) for v in v1])
    sum2Sq = sum([pow(v, 2) for v in v2])
    # Sum of the element-wise products
    pSum = sum([v1[i] * v2[i] for i in range(len(v1))])
    # Calculate the Pearson similarity
    num = pSum - (sum1 * sum2 / len(v1))
    den = sqrt((sum1Sq - pow(sum1, 2) / len(v1)) * (sum2Sq - pow(sum2, 2) / len(v1)))
    if den == 0:
        return 0
    # Post-process the result: the more similar, the smaller the returned value
    return 1.0 - num / den
104. Transpose the first matrix to obtain a second matrix, calculate the similarity between users from the data of the second matrix, and perform hierarchical clustering on all users according to the similarity between users to obtain a user clustering result.
In this embodiment, the rows and columns of the first matrix can be swapped. The code is as follows:
# data is a two-dimensional matrix
Transposing the first matrix yields the second matrix. Performing hierarchical clustering on the second matrix, in the same way as on the data of the first matrix in step 103, yields the user clustering result. This result identifies groups of users who buy similar commodities, for whom personalized marketing strategies can be devised.
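The transposition in step 104 can be sketched as below. The original listing is not preserved beyond its first comment, so this is an illustrative reconstruction under the assumption that the matrix is a list of equal-length lists; the function name `transpose` is hypothetical.

```python
def transpose(data):
    """Swap rows and columns of a two-dimensional matrix (list of lists)."""
    return [list(column) for column in zip(*data)]

# First matrix from the example: rows are commodities A-E, columns are users 1-5
first_matrix = [
    [9, 4, 0, 0, 0],
    [2, 3, 5, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 0, 7, 11],
    [0, 0, 0, 9, 8],
]
# Second matrix: rows are users 1-5, columns are commodities A-E
second_matrix = transpose(first_matrix)
for user_id, quantities in enumerate(second_matrix, start=1):
    print("User", user_id, quantities)
```

Each row of the second matrix is one user's purchase vector, so the same distance functions and clustering procedure used for commodities can be applied to users without modification.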
Embodiments of the present invention provide a commodity clustering method. The sales quantities of the target commodity types are confirmed, the corresponding users are sampled to obtain the user consumption data to be clustered, the user consumption data is arranged as a matrix, and the similarity between every two target commodity types is calculated. Finally, hierarchical clustering is performed on the commodities according to the similarity to obtain the commodity clustering result, from which commodity combinations suitable for being marketed together can be derived. Furthermore, the invention can also find groups of users with common consumption habits, so that the consumption behavior and habits of those groups can be analyzed.
The above is a detailed description of an embodiment of the commodity clustering method provided by the present invention. An embodiment of the commodity clustering confirmation device provided by the present invention is described below. Referring to Fig. 2, the device includes:
a sampling unit 202, configured to sample the purchasing users corresponding to the sales quantities of the target commodity types to obtain user consumption data to be clustered, where the user consumption data is the quantity of each commodity purchased by each user;
a clustering unit 203, configured to generate a first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows, calculate the pairwise similarity of the target commodity types from the data of the first matrix, and perform hierarchical clustering on the target commodity types according to the similarity to obtain a commodity clustering result.
In this embodiment, the commodity clustering confirmation device further includes:
a removal unit 201, configured to calculate the standard deviation of the commodity sales quantities from the obtained sales quantities of all commodity types, and remove the entries whose sales quantity is not greater than the standard deviation, to obtain the sales quantities of the target commodity types.
In this embodiment, the commodity clustering confirmation device further includes:
a transposition unit 204, configured to transpose the first matrix to obtain a second matrix, calculate the similarity between users from the data of the second matrix, and perform hierarchical clustering on all users according to the similarity between users to obtain a user clustering result.
In this embodiment, the clustering unit 203 specifically includes:
a generation subunit 2031, configured to generate the first matrix from the user consumption data to be clustered, with user IDs as columns and commodity types as rows;
a calculation subunit 2032, configured to calculate the pairwise distance of the target commodity types from the data of the first matrix by Euclidean distance, Manhattan distance or Pearson correlation, where the pairwise distance of the target commodity types corresponds to their pairwise similarity;
a clustering subunit 2033, configured to perform hierarchical clustering on the target commodity types according to the similarity to obtain the commodity clustering result.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiment, and are not repeated here.
In the embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
The above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. a kind of commercial articles clustering confirmation method, which is characterized in that including:
Couple corresponding with the sales volume of targeted species commodity purchase user is sampled, and obtains customer consumption number to be clustered According to, wherein, customer consumption data buy the quantity of extensive stock for user;
Using User ID as row, type of merchandize is row, the first matrix is generated according to customer consumption data to be clustered, according to the first square The data of battle array calculate the similarity of targeted species commodity between any two, and targeted species commodity are carried out with classification according to similarity and is gathered Class obtains commercial articles clustering result.
2. commercial articles clustering confirmation method according to claim 1, which is characterized in that the sale number pair with targeted species commodity It measures corresponding purchase user to be sampled, obtains customer consumption data to be clustered, wherein, customer consumption data are bought for user It is further included before the quantity of extensive stock:
The standard deviation of commodity sales number is calculated according to the sales volume of all kinds commodity got, rejects all kinds business It is not more than the data of standard deviation in the sales volume of product, obtains the sales volume of targeted species commodity.
3. commercial articles clustering confirmation method according to claim 2, which is characterized in that using User ID as row, type of merchandize is Row, the first matrix is generated according to customer consumption data to be clustered, and targeted species commodity two are calculated according to the data of the first matrix Similarity between two, and hierarchical clustering is carried out to targeted species commodity according to similarity, it obtains going back after commercial articles clustering result Including:
Transposition is carried out to the first matrix, obtains the second matrix, according to similar between each user of the data of the second matrix calculating Degree, and hierarchical clustering is carried out to all users according to the similarity between each user, obtain user clustering result.
4. The commodity clustering confirmation method according to claim 1, characterized in that calculating the pairwise similarity between the target commodity categories from the data of the first matrix is specifically:
Calculating, from the data of the first matrix, the pairwise distance between the target commodity categories using the Euclidean distance, the Manhattan distance, or the Pearson correlation, wherein the pairwise distance between the target commodity categories corresponds to the pairwise similarity between the target commodity categories.
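The three measures named in this claim, followed by a naive agglomerative clustering over the resulting distances, might look like the sketch below. Several points are assumptions rather than details from the claims: the Pearson correlation is turned into a distance as `1 - r`, clusters are merged by average linkage, the hierarchy is cut at a fixed number of clusters (`n_clusters`), and the category vectors are invented toy data.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    """1 - Pearson correlation (this conversion to a distance is an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # correlation is undefined for a constant vector
    return 1.0 - cov / (norm_a * norm_b)


def hierarchical_clusters(vectors, metric, n_clusters):
    """Naive average-linkage agglomerative clustering of labelled vectors."""
    clusters = [[label] for label in vectors]

    def linkage(c1, c2):
        total = sum(metric(vectors[p], vectors[q]) for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the two closest clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical columns of the first matrix: quantities bought by four users.
category_vectors = {
    "bread":   [4, 5, 0, 0],
    "butter":  [3, 4, 0, 0],
    "diapers": [0, 0, 6, 5],
    "wipes":   [0, 1, 5, 4],
}
result = hierarchical_clusters(category_vectors, euclidean, n_clusters=2)
```

On this toy data all three measures group bread with butter and diapers with wipes; on real purchase data the measures can disagree, since Pearson-based distance ignores overall purchase volume while the other two do not.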
5. A commodity clustering confirmation device, characterized by comprising:
A sampling unit, configured to sample the purchasing users corresponding to the sales volumes of the target commodity categories to obtain user consumption data to be clustered, wherein the user consumption data are the quantities of the various commodities purchased by the users;
A clustering unit, configured to take user IDs as rows and commodity categories as columns, generate a first matrix from the user consumption data to be clustered, calculate the pairwise similarity between the target commodity categories from the data of the first matrix, and perform hierarchical clustering on the target commodity categories according to the similarity to obtain a commodity clustering result.
6. The commodity clustering confirmation device according to claim 5, characterized by further comprising:
A rejection unit, configured to calculate the standard deviation of the commodity sales volumes from the acquired sales volumes of all commodity categories, and to remove from the sales volumes of all commodity categories the data that do not exceed the standard deviation, to obtain the sales volumes of the target commodity categories.
7. The commodity clustering confirmation device according to claim 6, characterized by further comprising:
A transposition unit, configured to transpose the first matrix to obtain a second matrix, calculate the similarity between users from the data of the second matrix, and perform hierarchical clustering on all users according to the similarity between users to obtain a user clustering result.
8. The commodity clustering confirmation device according to claim 5, characterized in that the clustering unit specifically comprises:
A generation subunit, configured to take user IDs as rows and commodity categories as columns and generate the first matrix from the user consumption data to be clustered;
A calculation subunit, configured to calculate, from the data of the first matrix, the pairwise distance between the target commodity categories using the Euclidean distance, the Manhattan distance, or the Pearson correlation, wherein the pairwise distance between the target commodity categories corresponds to the pairwise similarity between the target commodity categories;
A clustering subunit, configured to perform hierarchical clustering on the target commodity categories according to the similarity to obtain the commodity clustering result.
CN201711484831.0A 2017-12-29 2017-12-29 A kind of commercial articles clustering confirmation method and device Pending CN108053268A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711484831.0A CN108053268A (en) 2017-12-29 2017-12-29 A kind of commercial articles clustering confirmation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711484831.0A CN108053268A (en) 2017-12-29 2017-12-29 A kind of commercial articles clustering confirmation method and device

Publications (1)

Publication Number Publication Date
CN108053268A true CN108053268A (en) 2018-05-18

Family

ID=62129313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711484831.0A Pending CN108053268A (en) 2017-12-29 2017-12-29 A kind of commercial articles clustering confirmation method and device

Country Status (1)

Country Link
CN (1) CN108053268A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104111969A (en) * 2014-06-04 2014-10-22 百度移信网络技术(北京)有限公司 Method and system for measuring similarity
CN105095306A (en) * 2014-05-20 2015-11-25 阿里巴巴集团控股有限公司 Operating method and device based on associated objects
CN105335368A (en) * 2014-06-06 2016-02-17 阿里巴巴集团控股有限公司 Product clustering method and apparatus
CN105809474A (en) * 2016-02-29 2016-07-27 深圳市未来媒体技术研究院 Hierarchical commodity information filtering and recommending method
US20160379288A1 (en) * 2015-06-29 2016-12-29 Wal-Mart Stores, Inc. Integrated Meal Plan Generation and Supply Chain Management
CN107292713A (en) * 2017-06-19 2017-10-24 武汉科技大学 A kind of rule-based individual character merged with level recommends method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
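Zhang Caixia (张彩霞) et al.: "In the Big Data Era, Enterprises Successfully Transform and Upgrade with the Help of the Internet" (《大数据时代,企业借助互联网成功转型升级》), 31 July 2015, China Fortune Press (中国财富出版社) *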

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255645B (en) * 2018-07-20 2021-09-14 创新先进技术有限公司 Consumption prediction method and device and electronic equipment
CN109255645A (en) * 2018-07-20 2019-01-22 阿里巴巴集团控股有限公司 A kind of consumption predictions method, apparatus and electronic equipment
CN109902706A (en) * 2018-11-09 2019-06-18 华为技术有限公司 Recommended method and device
CN109902706B (en) * 2018-11-09 2023-08-22 华为技术有限公司 Recommendation method and device
CN109740063A (en) * 2019-01-17 2019-05-10 北京奇艺世纪科技有限公司 Information recalls, information cluster method, device and equipment
CN111523918A (en) * 2019-02-02 2020-08-11 北京极智嘉科技有限公司 Commodity clustering method, commodity clustering device, commodity clustering equipment and storage medium
CN111523918B (en) * 2019-02-02 2023-09-19 北京极智嘉科技股份有限公司 Commodity clustering method, device, equipment and storage medium
CN111814944A (en) * 2019-04-12 2020-10-23 北京百度网讯科技有限公司 Vertex-to-community distribution method and device and terminal
CN112434154A (en) * 2019-08-26 2021-03-02 北京星选科技有限公司 Object processing method and device, electronic equipment and storage medium
CN112329838A (en) * 2020-11-02 2021-02-05 上海明略人工智能(集团)有限公司 Method and device for determining category label of target set
CN112329838B (en) * 2020-11-02 2024-02-02 上海明略人工智能(集团)有限公司 Method and device for determining target set category label
CN113240453A (en) * 2021-04-21 2021-08-10 福建神笔马良智能科技股份有限公司 Commodity sales dynamic pushing management system based on block chain
CN113240453B (en) * 2021-04-21 2024-05-28 福建神笔马良智能科技股份有限公司 Dynamic commodity sales promotion management system based on block chain
CN114648364A (en) * 2022-03-30 2022-06-21 李艳华 Method and system for analyzing sales data of electronic commerce website
CN116796910A (en) * 2023-08-21 2023-09-22 青岛中德智能技术研究院 Order batch optimization method based on goods allocation strategy
CN116796910B (en) * 2023-08-21 2023-11-21 青岛中德智能技术研究院 Order batch optimization method based on goods allocation strategy

Similar Documents

Publication Publication Date Title
CN108053268A (en) A kind of commercial articles clustering confirmation method and device
Syakur et al. Integration k-means clustering method and elbow method for identification of the best customer profile cluster
Aryuni et al. Customer segmentation in XYZ bank using K-means and K-medoids clustering
CN106651546B (en) Electronic commerce information recommendation method oriented to smart community
US20150294336A1 (en) Multidimensional personal behavioral tomography
CN108960992A (en) A kind of information recommendation method and relevant device
CN109559208A (en) A kind of information recommendation method, server and computer-readable medium
CN107516246B (en) User type determination method, user type determination device, medium and electronic equipment
Chen et al. Predicting default risk on peer-to-peer lending imbalanced datasets
CN106529968A (en) Customer classification method and system thereof based on transaction data
CN110489642A (en) Method of Commodity Recommendation, system, equipment and the medium of Behavior-based control signature analysis
CN104715409A (en) Method and system for electronic commerce user purchasing power classification
CN102521605A (en) Wave band selection method for hyperspectral remote-sensing image
CN109948724A (en) A kind of electric business brush single act detection method based on improvement LOF algorithm
Tan et al. Time series clustering: A superior alternative for market basket analysis
CN106127493A (en) A kind of method and device analyzing customer transaction behavior
CN105956122A (en) Object attribute determining method and device
Farooqi et al. Effectiveness of Data mining in Banking Industry: An empirical study
CN109461083B (en) Method and device for determining preferential exchange rate
CN114626925A (en) Recommendation method and device for financial products, electronic equipment and storage medium
Xue et al. Intelligent mining on purchase information and recommendation system for e-commerce
Regmi et al. Customer Market Segmentation using Machine Learning Algorithm
CN116362236A (en) Target word mining method and device and storage medium
Rezaeian et al. Measuring Customers Satisfaction of ECommerce Sites Using Clustering Techniques: Case Study of Nyazco Website.
Zhao Marketing Segmentation in Consumer Product Industry

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20180518