CN115983921A - Offline store commodity association combination method, device, equipment and storage medium - Google Patents
- Publication number
- CN115983921A (application number CN202211709578.5A)
- Authority
- CN
- China
- Prior art keywords
- commodity
- association
- rule
- commodities
- confidence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to an offline store commodity association combination method, device, equipment and storage medium. The method obtains offline shopping data, matches the elements of each data set according to the commodity information in each order using association rules and a continuous bag-of-words model, calculates the matching degree between commodities and the confidence of those matches, and finally determines association rules, commodity strong relation rules and chain rules among the commodities. Potential relations among the commodities are thereby found, including but not limited to substitution and bundling. The method overcomes the defect that an association relation cannot be calculated when commodities have no co-occurrence relation.
Description
Technical Field
The invention relates to the technical field of data intelligent algorithms, in particular to the field of offline store commodity association combination optimization methods.
Background
With the progress of society, new commodities continuously emerge in our lives, and people need to spend more time selecting products. For enterprises this trend is both a challenge and an opportunity: they need to capture customer demand in time and adopt suitable sales strategies to win more customers and improve commodity sales volume. One important sales strategy is combined commodity sales: the similarity among commodities in shopping behavior is mined from purchase data, and commodity combinations of interest are offered to customers to stimulate their desire to purchase. For customers, this first helps them quickly find interesting, high-quality commodities and improves the shopping experience, and second reduces the effort spent finding the commodities they need. At present, combined sales in offline stores mainly rely on human experience and association rule algorithms. Human experience guesses the association between commodities from simple rules, such as bundling or adjacently displaying toothpaste and toothbrushes. This can increase sales to a certain extent, but the rules found manually are few or not very accurate. An association rule algorithm finds relations between items in a data set. Its main flow is to collect customer shopping basket data from offline stores, set a support threshold, find the frequent item sets of commodities in the data, set a confidence threshold, and calculate and output association rules from the frequent item sets.
The advantage of this approach is that, where frequent co-occurrence relations exist, clear and useful results can be mined, such as the well-known beer and diaper combination, and data that takes a long time to process can be supported. However, only relations between commodities that commonly occur together can be mined; relations between commodities that never co-occur but are potentially associated cannot.
Disclosure of Invention
Accordingly, an object of the present invention is to provide an offline store commodity association combination method, apparatus, device, and storage medium. The method collects customer shopping basket data from offline stores, outputs commodity association combinations by linking and fusing the rules produced by the association rule algorithm and the continuous bag-of-words model, and scientifically and reasonably guides the optimization of strategies such as commodity marketing, bundled sales and display combination. The method overcomes the defect that an association relation cannot be calculated when commodities have no co-occurrence relation, and is highly general.
The invention relates to a commodity association combination optimization method for an offline store, which comprises the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support degree is greater than or equal to the minimum support degree, and generating a commodity association rule; wherein Ck represents an item set containing exactly k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
taking the sample data set as a training sample, training with the word embedding vector technology of a continuous bag-of-words model, and outputting commodity vectors in distributed representation;
calculating the most similar commodity of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
obtaining the confidence coefficient of the chain rule according to the confidence coefficient of the commodity association rule comprising the same commodity and the similarity of the commodity strong relation rule; and if the confidence of the chain rule is greater than a preset confidence threshold, forming a chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
Further, finding the frequent item sets L1-Lk that meet the minimum support degree by scanning the data set with the search function, and generating the association rule, specifically comprises the following steps:
combining the 1-item sets in L1 in pairs and calculating the support degree of each combination; when the support degree is greater than the set threshold, the combination is regarded as a frequent item set and taken as an element of L2; the calculation is repeated until all combinations have been evaluated;
in the most frequent item sets found, searching for item sets whose confidence is greater than or equal to a given threshold, and generating the association rule, where the confidence is calculated as:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both element x and element y.
Further, the specific steps of training by the word embedding vector technology of the continuous bag-of-words model and outputting the commodity vector represented in a distributed manner are as follows:
initializing the model parameters θ and the commodity vectors $x_k$, the dimension of the commodity vector being N; for each record sample, the central commodity is $p_0$ and the window is c, so the surrounding context contains 2c commodities, denoted $\mathrm{context}(p_0)$, and the real positive example is denoted $(\mathrm{context}(p_0), p_0)$; obtaining, by a negative sampling method, neg central words $p_i$ (i = 1, 2, …, neg) different from $p_0$;
for each sample $(\mathrm{context}(p_0), p_0, p_1, \ldots, p_{neg})$ in the training set, performing gradient-ascent iterations to update the parameters and vectors;
if the gradient has converged, ending the gradient iteration with the updated parameters θ and commodity vectors $x_k$; otherwise, continuing the iteration.
Further, the negative sampling method is calculated by the following formula,
wherein $P(p_i)$ represents a weight and $f(p_i)$ represents the frequency of occurrence of the word.
Further, performing gradient-ascent iterations to update the parameters and vectors for each sample $(\mathrm{context}(p_0), p_0, p_1, \ldots, p_{neg})$ in the training set specifically comprises the following steps:
summing and averaging the 2c commodity vectors around the central commodity $p_0$, i.e.:
$$x_{p_0} = \frac{1}{2c} \sum_{k=1}^{2c} x_k$$
for i = 0 to neg, calculate:
for each word vector $x_k$ in $\mathrm{context}(p_0)$ (2c in total), update:
wherein η is the learning rate; $y_i$ is 1 for i = 0 and 0 for i = 1, 2, …, neg; σ is the sigmoid function.
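The update formulas for this step did not survive in the text. The following is a minimal sketch of one gradient-ascent step, assuming the standard continuous bag-of-words update with negative sampling used by word2vec; it is not confirmed to be the exact formulation of the invention. The function names, the accumulator `e` and the helper `objective` are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def average(context_vecs):
    """Average of the 2c context commodity vectors."""
    dim = len(context_vecs[0])
    return [sum(v[d] for v in context_vecs) / len(context_vecs) for d in range(dim)]


def objective(context_vecs, thetas):
    """Log-likelihood of one sample: thetas[0] is the positive example p0."""
    x_avg = average(context_vecs)
    total = 0.0
    for i, theta in enumerate(thetas):
        s = sigmoid(sum(a * b for a, b in zip(x_avg, theta)))
        total += math.log(s if i == 0 else 1.0 - s)
    return total


def cbow_neg_step(context_vecs, thetas, eta=0.05):
    """One gradient-ascent step for (context(p0), p0, p1, ..., p_neg).

    context_vecs: the 2c context vectors x_k, updated in place.
    thetas: neg + 1 parameter vectors, updated in place; index 0 has
            label y = 1 (positive), all others have label y = 0 (negative).
    """
    x_avg = average(context_vecs)
    dim = len(x_avg)
    e = [0.0] * dim  # accumulated update for the context vectors
    for i, theta in enumerate(thetas):
        y = 1.0 if i == 0 else 0.0
        f = sigmoid(sum(a * b for a, b in zip(x_avg, theta)))
        g = eta * (y - f)
        for d in range(dim):
            e[d] += g * theta[d]      # uses theta before it is updated
            theta[d] += g * x_avg[d]  # update the parameter vector
    for vec in context_vecs:
        for d in range(dim):
            vec[d] += e[d]            # same update applied to every x_k
```

Repeating this step over all samples until the objective stops improving corresponds to the convergence check described above.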
Further, the support degree is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support degree of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of samples in N that contain commodity x.
In another aspect, the present application provides an offline store commodity association combination optimization apparatus, including:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support degree is greater than or equal to the minimum support degree, and generate a commodity association rule; wherein Ck represents an item set containing exactly k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to take the sample data set as a training sample, train with the word embedding vector technology of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate the most similar commodity of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of the chain rule according to the confidence of the commodity association rule and the similarity of the commodity strong relation rule that include the same commodity; if the confidence of the chain rule is greater than a preset confidence threshold, a chain rule is formed, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
In another aspect, the present application further provides an electronic device, including:
at least one memory and at least one processor;
the memory for storing one or more programs;
when executed by the at least one processor, the one or more programs cause the at least one processor to implement the steps of a method for offline store merchandise association portfolio optimization as described in any one of the above.
In another aspect, the present application further provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the steps of any one of the offline store commodity association combination optimization methods described above.
With the offline store commodity association combination optimization method, a store can perform commodity complementation or substitution according to the commodity association combination rule base, and can scientifically and reasonably optimize strategies such as commodity marketing, bundled sales and display combination, for example bundled sales within a brand and cross sales of commodities of different brands. The method can discover potential strong associations among offline store commodities that have never co-occurred, thereby improving commodity sales volume.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a flowchart of a method for optimizing commodity association combinations of offline stores according to an embodiment of the present application;
Fig. 2 is a block diagram of a structure of an offline store commodity association combination optimization apparatus according to an embodiment of the present application;
Fig. 3 is a block diagram illustrating a structure of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of the associated objects and means that three relationships are possible; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
As new commodities continuously appear, people need to spend time choosing among them, but in today's fast-paced social environment they often cannot spend much time doing so, which makes it difficult to launch and sell new products. Existing stores bundle selected products for sale so that customers can quickly find interesting, high-quality products and spend less time choosing, or display complementary commodities next to each other to promote sales among them. However, the existing associations between commodities are usually discovered through the daily habits and common sense of most people. They are relatively subjective and strongly shaped by personal perception, so the association rules are few and not very accurate, and some potential associations between commodities cannot be discovered.
Based on the above background, with reference to Fig. 1, the present embodiment provides a commodity association combination optimization method applied to an offline store. The method collects offline store shopping basket data, fuses association rules with a continuous bag-of-words model to calculate and output commodity association combinations, and guides the optimization of strategies such as combined commodity sales and bundled sales. It specifically comprises the following steps:
S10: acquiring shopping data, and establishing a sample data set according to the shopping data.
The shopping data is the list of commodities purchased by a customer, and each commodity represents an element. The sample data set stores the customer shopping data collected over a given period, and the group of shopping data of each customer is taken as one sample.
S20: establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support degree is greater than or equal to the minimum support degree, and generating a commodity association rule; wherein Ck represents an item set containing exactly k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule includes a first association relationship between a set of commodities.
The C1-Ck item sets range from the item set containing exactly 1 element to the item set containing exactly k elements; the C1 item set is the 1-item set. The L1-Lk frequent item sets range from the frequent item set containing 1 data set to the frequent item set containing k data sets. For example, with a C1 item set {toothpaste; iced black tea; toothpaste} and a C2 item set {lemon tea/cola; green tea/toothpaste}, there are five elements in total. The L1 of the toothpaste is 0.6.
S30: taking the sample data set as a training sample, training with the word embedding vector technology of a continuous bag-of-words model, and outputting commodity vectors in distributed representation.
A continuous bag-of-words model is constructed based on the negative sampling method. The model parameters θ and the commodity vectors $x_k$ are initialized, and the dimension of the commodity vector is assumed to be N. For each record sample, the central commodity is $p_0$ and the window is c, so the surrounding context contains 2c commodities in total, denoted $\mathrm{context}(p_0)$. The negative sampling method yields neg central words $p_i$ different from $p_0$, where i = 1, 2, …, neg. Each negative example is denoted $(\mathrm{context}(p_0), p_i)$, and the true positive example is $(\mathrm{context}(p_0), p_0)$.
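As an illustration of how the neg central words might be drawn, the sketch below weights each commodity by its frequency raised to the 3/4 power, which is the conventional word2vec choice. The exponent, the function names and the use of `random.Random` are assumptions; the weighting formula of the invention is not reproduced in the text.

```python
import random
from collections import Counter


def negative_sampling_weights(baskets, power=0.75):
    """Sampling weight P(p_i) from the commodity frequency f(p_i).

    The 3/4 exponent is an assumed, conventional value.
    """
    freq = Counter(item for basket in baskets for item in basket)
    total = sum(count ** power for count in freq.values())
    return {item: (count ** power) / total for item, count in freq.items()}


def draw_negatives(weights, center, neg, rng):
    """Draw neg commodities different from the central commodity."""
    candidates = [item for item in weights if item != center]
    probs = [weights[item] for item in candidates]
    return rng.choices(candidates, weights=probs, k=neg)
```

For example, `draw_negatives(negative_sampling_weights(baskets), "green tea", 5, random.Random(0))` returns five commodities other than green tea, with frequently purchased commodities drawn more often.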
S40: calculating the most similar commodity of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities.
Cosine similarity measures the degree of similarity between vectors by the angle between them in space. In the present application it is applied to the commodity vectors in distributed representation that are output by training with the word embedding vector technology of the continuous bag-of-words model.
S50: obtaining the confidence coefficient of the chain rule according to the confidence coefficient of the commodity association rule comprising the same commodity and the similarity of the commodity strong relation rule; and if the confidence of the chain rule is greater than a preset confidence threshold, forming the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
In a specific embodiment, the technical scheme is realized by the following scheme.
Shopping basket data of offline stores are collected, including the shopping basket order number and the list of commodities in each shopping basket order, and the data are stored in ClickHouse. Sample data are as follows:
Order number | Commodity list |
LZ_B_M_220715_123554 | Green tea/crystal sugar snow pear/ice black tea/high calcium milk/yogurt/waffle biscuit |
LZ_B_M_220716_091521 | Blackcurrant/natural mineral water/green tea/soda biscuit/caramel toast |
LZ_B_M_220716_111617 | Chocolate/green tea/ice black tea/chewing gum/sausage/yoghurt |
LZ_B_M_220718_201522 | Green tea/toothpaste/toothbrush |
LZ_B_M_220720_153018 | Ice cream cone/lemon tea/cola |
LZ_B_M_220725_102522 | Fruit orange/soda cracker/garlic-flavored peanut/sesame cake/breakfast biscuit |
An association rule algorithm is constructed. In this embodiment, the minimum support degree is set to 0.05 and the minimum confidence is set to 0.6, and the frequent item sets are calculated.
1) A 1-item set C1 is created, the data set is scanned with a search function, the support degree is calculated, and the frequent item set L1 whose support degree is greater than or equal to the minimum support degree of 0.05 is found. The support degree is calculated according to the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support degree of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of samples in N that contain commodity x.
Taking 6 sample data as an example, the support degree of green tea is 0.67, which is greater than or equal to the minimum support degree, and the green tea is judged as a frequent set.
2) The 1-item sets in L1 are combined in pairs, step 1) is repeated, and the frequent item set L2 is found.
Taking the 6 sample records as an example, the support degree of (green tea, ice black tea) is 0.33, which is greater than or equal to the minimum support degree, so it is judged to be a frequent set.
The above operations are repeated until the k-item set has been processed. Taking the 6 sample records as an example, the k-item set is the 3-item set (green tea, ice black tea, yogurt), and its support degree is 0.33.
3) Generating association rules: in the most frequent item sets found, item sets whose confidence is greater than or equal to 0.6 are searched for, and association rules are generated. The confidence is calculated according to the following formula:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both element x and element y.
Taking the 6 sample records as an example, one calculated rule is confidence(green tea, ice black tea → yogurt) = 2/2 = 1, which is greater than or equal to the minimum confidence of 0.6, so the rule is determined to be an association rule.
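The support and confidence figures in this example can be checked with a short sketch over the six sample orders. This is a simple level-wise search written for illustration, not the implementation of the invention; the function names are assumptions, and "yoghurt" is normalized to "yogurt" so that both spellings in the sample table count as one commodity.

```python
from itertools import combinations

# The six sample baskets; "yoghurt" is normalized to "yogurt" (an assumption).
BASKETS = [frozenset(b) for b in (
    ["green tea", "crystal sugar snow pear", "ice black tea",
     "high calcium milk", "yogurt", "waffle biscuit"],
    ["blackcurrant", "natural mineral water", "green tea",
     "soda biscuit", "caramel toast"],
    ["chocolate", "green tea", "ice black tea", "chewing gum",
     "sausage", "yogurt"],
    ["green tea", "toothpaste", "toothbrush"],
    ["ice cream cone", "lemon tea", "cola"],
    ["fruit orange", "soda cracker", "garlic-flavored peanut",
     "sesame cake", "breakfast biscuit"],
)]


def support(itemset, baskets):
    """Fraction of baskets that contain every commodity in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def frequent_itemsets(baskets, min_support):
    """Level-wise search: build L1, then L2 from L1, and so on."""
    items = sorted({item for b in baskets for item in b})
    level = [frozenset([i]) for i in items
             if support([i], baskets) >= min_support]
    found = {}
    k = 1
    while level:
        for s in level:
            found[s] = support(s, baskets)
        k += 1
        # Join pairs of frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k}
        level = [c for c in candidates
                 if support(c, baskets) >= min_support]
    return found


def confidence(antecedent, consequent, baskets):
    """N(x & y) / N(x), counted over the baskets."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    n_x = sum(1 for b in baskets if x <= b)
    n_xy = sum(1 for b in baskets if xy <= b)
    return n_xy / n_x if n_x else 0.0


print(round(support({"green tea"}, BASKETS), 2))  # 0.67
print(confidence({"green tea", "ice black tea"}, {"yogurt"}, BASKETS))  # 1.0
```

With a minimum support of 0.05 and only six baskets, every commodity that appears at least once is frequent, so the search enumerates every subset of every basket; a larger data set would prune far more candidates.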
A continuous bag-of-words model based on the negative sampling method is constructed.
The order commodity lists of the offline shopping baskets are taken as training samples, the model is trained with the word embedding vector technology of the continuous bag-of-words model, and commodity vectors in distributed representation are output, where the dimension N of the commodity vector is 10. Sample commodity vector data are as follows (for convenience of presentation, the vector components are shown to three decimal places).
Commodity | vec1 | vec2 | vec3 | vec4 | vec5 | vec6 | vec7 | vec8 | vec9 | vec10 |
Green tea | 0.139 | -0.259 | -0.047 | -0.244 | -0.724 | -0.581 | 0.213 | -0.395 | -0.514 | 0.029 |
Iced black tea | 0.018 | 0.042 | 0.079 | 0.521 | -0.729 | -0.544 | -0.245 | -0.281 | -0.102 | -0.168 |
Soda biscuit | 0.469 | 0.833 | 0.689 | 0.0769 | -0.441 | -0.614 | -0.012 | -0.426 | -0.676 | 0.184 |
Yogurt | 0.542 | -0.724 | 0.624 | 0.589 | 0.094 | -0.576 | -0.113 | -0.343 | -0.183 | 0.142 |
... |
Using cosine similarity, with the similarity threshold set to 0.7 in this embodiment, the most similar commodities of each commodity are calculated to form a commodity similarity table:
Commodity | Similar commodities (similarity greater than or equal to 0.7) |
Green tea | Soda biscuit: 0.993; iced black tea: 0.811 |
Iced black tea | Soda biscuit: 0.954; green tea: 0.811 |
Soda biscuit | Green tea: 0.993; iced black tea: 0.954; yogurt: 0.736 |
Yogurt | Soda biscuit: 0.736 |
Each record of the commodity similarity table is judged to be a strong relationship rule, and the numerical part is the strong relationship similarity.
The yogurt and the soda biscuit never have a co-occurrence relationship in shopping basket data, and are a potential relationship mined by a distributed representation method, so that the algorithm generalization capability is enhanced.
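The similarity table can be rebuilt from the sample vectors with the sketch below. The names `VECTORS`, `dot`, `cosine` and `similarity_table` are illustrative assumptions. One observation, offered with caution: with the rounded sample vectors shown, the scores listed in the table appear to coincide with the unnormalized dot products, so the sketch accepts the scoring function as a parameter; normalized cosine values computed from these rounded vectors come out lower.

```python
import math

# Sample commodity vectors as printed in the embodiment (rounded values).
VECTORS = {
    "green tea": [0.139, -0.259, -0.047, -0.244, -0.724,
                  -0.581, 0.213, -0.395, -0.514, 0.029],
    "iced black tea": [0.018, 0.042, 0.079, 0.521, -0.729,
                       -0.544, -0.245, -0.281, -0.102, -0.168],
    "soda biscuit": [0.469, 0.833, 0.689, 0.0769, -0.441,
                     -0.614, -0.012, -0.426, -0.676, 0.184],
    "yogurt": [0.542, -0.724, 0.624, 0.589, 0.094,
               -0.576, -0.113, -0.343, -0.183, 0.142],
}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def similarity_table(vectors, score, threshold=0.7):
    """For each commodity, list the others scoring at or above threshold."""
    table = {}
    for name, vec in vectors.items():
        hits = [(other, score(vec, other_vec))
                for other, other_vec in vectors.items() if other != name]
        hits = [(other, round(s, 3)) for other, s in hits if s >= threshold]
        table[name] = sorted(hits, key=lambda h: -h[1])
    return table
```

Calling `similarity_table(VECTORS, dot)` gives one row per commodity in the same shape as the table above, and `similarity_table(VECTORS, cosine)` applies the normalized measure instead.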
Rule chaining is then performed by combining the association rules with the strong relationship rules obtained from the continuous bag-of-words model. The chain rule confidence threshold is set to 0.7: a candidate whose confidence is greater than or equal to 0.7 is accepted as a chain rule. The chain rule confidence equals the association rule confidence multiplied by the strong relationship similarity. The specific method is as follows:
the association rule calculated through the above steps is (green tea, iced black tea → yogurt); linking it with the calculated strong relationship rule (yogurt → soda biscuit) generates the rule (green tea, iced black tea → yogurt → soda biscuit).
This result suggests the rule (green tea, iced black tea → soda biscuit), that is, green tea and iced black tea are associated with soda biscuits, and customers who buy green tea and iced black tea also tend to want soda biscuits.
Chain rule confidence = association rule confidence × strong relationship similarity = 1 × 0.736 = 0.736, which is greater than or equal to 0.7, so the rule is accepted as a chain rule.
The chain rule shows that soda biscuits can be sold in combination as a substitute commodity when an intermediate commodity such as yogurt is out of stock in the offline store.
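The linking step can be illustrated with a short sketch. The function name `link_chain_rules` and the tuple and dictionary layouts are assumptions made for illustration; the confidence product, the 0.7 threshold and the example values (confidence 1, similarity 0.736) are taken from the embodiment.

```python
def link_chain_rules(association_rules, strong_rules, threshold=0.7):
    """Link association rules with strong relationship rules.

    association_rules: list of (antecedent frozenset, consequent, confidence).
    strong_rules: dict mapping a commodity to {similar commodity: similarity}.
    Returns tuples (antecedent, intermediate, substitute, chain confidence).
    """
    chains = []
    for antecedent, consequent, confidence in association_rules:
        for substitute, similarity in strong_rules.get(consequent, {}).items():
            if substitute in antecedent:
                continue  # skip chains that lead back into the antecedent
            chain_confidence = confidence * similarity
            # The embodiment accepts a chain at >= 0.7; the claims say "greater than".
            if chain_confidence >= threshold:
                chains.append((antecedent, consequent, substitute, chain_confidence))
    return chains

association_rules = [(frozenset({"green tea", "iced black tea"}), "yogurt", 1.0)]
strong_rules = {"yogurt": {"soda biscuit": 0.736}}
chains = link_chain_rules(association_rules, strong_rules)
```

With these inputs the single resulting chain is (green tea, iced black tea → yogurt → soda biscuit) with confidence 1 × 0.736 = 0.736; raising the threshold above 0.736 would reject it.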
(5) The association rules, the strong relationship rules and the chain rules are deduplicated, their union is taken, and a commodity association combination rule base is output. Examples are as follows:
Serial number | Commodity association combination | Rule type |
1 | Green tea, iced black tea, soda biscuit | Chain rule |
2 | Green tea, iced black tea, yogurt | Association rule / strong relationship rule |
3 | Yogurt, soda biscuit | Strong relationship rule |
4 | Soda biscuit, green tea, iced black tea, yogurt | Strong relationship rule |
5 | ... | ... |
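One possible way to carry out the deduplication and union is sketched below. It assumes, for illustration only, that a combination is identified by its unordered set of commodities and that a combination produced by several rule types keeps all of their labels, as row 2 of the table suggests. The function name `build_rule_base` and the input layout are hypothetical, and the input lists are invented examples rather than the output of the embodiment.

```python
def build_rule_base(rules_by_type):
    """Merge rules of several types into one rule base.

    rules_by_type: dict mapping a rule type name to an iterable of
    commodity collections. Combinations containing the same commodities
    are merged, and the rule types that produced them are recorded.
    """
    base = {}
    for rule_type, rules in rules_by_type.items():
        for commodities in rules:
            key = frozenset(commodities)  # order-insensitive identity (assumption)
            types = base.setdefault(key, [])
            if rule_type not in types:
                types.append(rule_type)
    return base

rule_base = build_rule_base({
    "association rule": [("green tea", "iced black tea", "yogurt")],
    "strong relationship rule": [
        ("yogurt", "soda biscuit"),
        ("soda biscuit", "yogurt"),  # duplicate of the previous combination
        ("green tea", "iced black tea", "yogurt"),
    ],
    "chain rule": [("green tea", "iced black tea", "soda biscuit")],
})
```

Treating a rule as an unordered set discards its direction; a rule base that must preserve antecedent and consequent would need a different key.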
With the offline store commodity association combination optimization method provided here, an offline store can carry out commodity complement or substitution operations according to the commodity association combination rule base, and can optimize strategies such as commodity marketing, bundled sales and display combinations on a sound basis, for example by bundling commodities of one brand or cross-selling commodities of different brands. The advantage of the method is that latent strong associations between offline store commodities that never co-occur can be discovered, which helps to improve commodity sales.
With reference to fig. 3, the present invention further provides an offline store commodity association combination optimization apparatus, including:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to take the sample data set as training samples, train through the word embedding vector technology of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relationship rules, and the commodity strong relationship rules comprise second association relations among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of a chain rule according to the confidence of a commodity association rule and the similarity of a commodity strong relationship rule that include the same commodity; if the confidence of the chain rule is greater than a preset confidence threshold, the chain rule is formed, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relationship rule.
In this application, the first association rule generation module includes a confidence calculation unit, which is configured to search the largest frequent item set found for item sets whose confidence is greater than or equal to a given threshold and to generate association rules, where:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both element x and element y.
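The support and confidence calculations can be sketched with a small Apriori-style routine. This is an illustrative sketch under stated assumptions: counts are taken over shopping basket records, the function names `support`, `frequent_itemsets` and `association_rules` are hypothetical, and the four baskets are invented toy data rather than the sample data set of the embodiment.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: L1 from single items, then Lk from joins of L(k-1)."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support([i], transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        current = [c for c in candidates
                   if support(c, transactions) >= min_support]
    return frequent

def association_rules(frequent, min_confidence):
    """Rules (antecedent -> single consequent) whose confidence reaches the threshold."""
    rules = []
    for itemset, itemset_support in frequent.items():
        if len(itemset) < 2:
            continue
        for consequent in itemset:
            antecedent = itemset - {consequent}
            # support(x & y) / support(x) equals N(x & y) / N(x)
            confidence = itemset_support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules

# Hypothetical toy baskets.
transactions = [
    frozenset({"green tea", "iced black tea", "yogurt"}),
    frozenset({"green tea", "iced black tea", "yogurt"}),
    frozenset({"green tea", "soda biscuit"}),
    frozenset({"iced black tea"}),
]
frequent = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.7)
```

In this toy data every basket containing both green tea and iced black tea also contains yogurt, so the rule (green tea, iced black tea → yogurt) has confidence 1, while soda biscuit falls below the minimum support and takes part in no rule.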
In another embodiment, the iterative training module comprises a bag-of-words model unit, which is configured to initialize the model parameters θ and the commodity vectors x_k, and to obtain, by a negative sampling method, neg central words p_i (i = 1, 2, ..., neg) different from p_0. The dimension of the commodity vector is N; for each record sample, the central commodity is p_0, the window is c, and the surrounding context contains 2c commodities in total, denoted context(p_0).
For each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, gradient ascent iterations are performed to update the parameters and vectors.
The negative sampling method is specifically as follows:
wherein P(p_i) represents a weight, and f(p_i) represents the frequency of occurrence of the word.
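A frequency-weighted negative sampler can be sketched as follows. The exponent 0.75 is the convention of the original word2vec implementation and is an assumption here, since the text above only states that the weight P(p_i) depends on the frequency f(p_i). The function names `negative_sampling_weights` and `draw_negatives` and the toy baskets are hypothetical.

```python
import random
from collections import Counter

def negative_sampling_weights(baskets, power=0.75):
    """Sampling weight of each commodity: f(p)^power normalised to sum to 1.
    power=0.75 follows word2vec and is an assumption, not taken from the source."""
    counts = Counter(item for basket in baskets for item in basket)
    scaled = {item: count ** power for item, count in counts.items()}
    total = sum(scaled.values())
    return {item: value / total for item, value in scaled.items()}

def draw_negatives(weights, center, neg, rng=random):
    """Draw neg commodities different from the central commodity,
    with probability proportional to their weights (with replacement)."""
    candidates = [item for item in weights if item != center]
    probabilities = [weights[item] for item in candidates]
    return rng.choices(candidates, weights=probabilities, k=neg)

# Hypothetical toy baskets.
baskets = [
    ["green tea", "iced black tea", "yogurt"],
    ["green tea", "iced black tea", "yogurt"],
    ["green tea", "soda biscuit"],
    ["iced black tea"],
]
weights = negative_sampling_weights(baskets)
negatives = draw_negatives(weights, center="yogurt", neg=3, rng=random.Random(0))
```

Raising the frequency to a power below 1 flattens the distribution, so rarely bought commodities are drawn as negatives somewhat more often than their raw frequency would give.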
The iterative parameter and vector updating process specifically includes: the 2c commodity vectors around the central commodity p_0 are summed and averaged, i.e.:
for i = 0 to neg, calculate:
for each word vector x_k in context(p_0) (2c in total), update:
wherein η is the learning rate; y_i is 1 when i = 0, and y_i is 0 when i = 1, 2, ..., neg; σ is the sigmoid function.
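For orientation, one gradient ascent step of the standard continuous bag-of-words model with negative sampling is sketched below. It follows the widely used word2vec formulation, which matches the symbols η, y_i and σ defined above, but it is an assumption that the embodiment uses exactly these update formulas. The names `cbow_negative_sampling_step`, `x0`, `e`, `f` and `g`, and the toy vectors, are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cbow_negative_sampling_step(vectors, theta, context, center, negatives, eta=0.05):
    """One in-place gradient ascent step.

    vectors: commodity -> input vector x_k (list of floats).
    theta:   commodity -> output parameter vector (list of floats).
    context: the 2c commodities around the central commodity.
    """
    dim = len(vectors[context[0]])
    # Average of the context vectors.
    x0 = [sum(vectors[p][d] for p in context) / len(context) for d in range(dim)]
    e = [0.0] * dim  # accumulated update for the context vectors
    for i, p in enumerate([center] + list(negatives)):
        y = 1.0 if i == 0 else 0.0  # label: 1 for the true centre, 0 for negatives
        f = sigmoid(sum(a * b for a, b in zip(x0, theta[p])))
        g = (y - f) * eta
        for d in range(dim):
            e[d] += g * theta[p][d]   # uses theta before it is updated
            theta[p][d] += g * x0[d]
    for p in context:
        for d in range(dim):
            vectors[p][d] += e[d]

# Hypothetical toy values; output parameters start at zero as in word2vec.
vectors = {"green tea": [0.1, 0.2], "iced black tea": [0.3, -0.1]}
theta = {"yogurt": [0.0, 0.0], "soda biscuit": [0.0, 0.0]}
cbow_negative_sampling_step(
    vectors, theta,
    context=["green tea", "iced black tea"],
    center="yogurt",
    negatives=["soda biscuit"],
    eta=0.05,
)
```

After this first step the parameters of the true central commodity move toward the averaged context vector and those of the negative sample move away from it; the context vectors themselves stay unchanged because the output parameters were still zero when `e` was accumulated.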
Fig. 2 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure. The electronic device includes a processor 910 and a memory 920. The main control chip may contain one or more processors 910; one processor 910 is taken as an example in fig. 2. The main control chip may likewise contain one or more memories 920; one memory 920 is taken as an example in fig. 2.
As a computer-readable storage medium, the memory 920 can store software programs, computer-executable programs and modules, such as the program of the offline store commodity association combination optimization method according to any embodiment of the present application and the corresponding program instructions/modules. The memory 920 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to use of the device, and the like. Further, the memory 920 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 920 may further include memory located remotely from the processor 910, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 910 executes the software programs, instructions and modules stored in the memory 920 to perform the various functional applications and data processing of the device, that is, to implement the offline store commodity association combination optimization method described in any of the above embodiments.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for optimizing the offline store commodity association combination according to any one of the above embodiments is implemented.
The present invention may take the form of a computer program product embodied on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having program code embodied therein. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The above embodiments express only several implementations of the present invention; their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make changes and modifications without departing from the spirit of the present invention, and the present invention is intended to encompass such changes and modifications.
Claims (9)
1. An offline store commodity association combination optimization method is characterized by comprising the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating a commodity association rule; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
taking the sample data set as a training sample, training by a word embedding vector technology of a continuous bag-of-word model, and outputting a commodity vector represented in a distributed mode;
calculating the most similar commodity of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
obtaining the confidence of the chain rule according to the confidence of the commodity association rule comprising the same commodity and the similarity of the commodity strong relationship rule; and if the confidence of the chain rule is greater than a preset confidence threshold, forming a chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
2. The offline store commodity association combination optimization method according to claim 1, wherein frequent item sets L1-Lk satisfying not less than a minimum support degree are found by searching function scan data sets, and association rules are generated, specifically comprising the following steps:
combining every two C1 item sets in the L1, calculating the support degree, considering the item as a frequent item when the support degree is greater than a set threshold value, taking the item as an element in the L2, and repeatedly calculating until all combination calculation is completed;
searching the largest frequent item set found for item sets whose confidence is greater than or equal to a given threshold, and generating an association rule, wherein:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing the element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both the element x and the element y.
3. The offline store commodity association combination optimization method according to claim 2, wherein the specific steps of training through a word embedding vector technology of a continuous bag-of-words model and outputting a commodity vector represented in a distributed manner are as follows:
initializing model parameters θ and commodity vectors x_k, and obtaining, by a negative sampling method, neg central words p_i (i = 1, 2, ..., neg) different from p_0; the dimension of the commodity vector is N; for each record sample, the central commodity is p_0, the window is c, and the surrounding context contains 2c commodities in total, denoted context(p_0); the true positive example is (context(p_0), p_0);
for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, performing gradient ascent iterations to update the parameters and vectors;
5. The offline store commodity association combination optimization method according to claim 4, wherein the step of performing gradient ascent iterations to update the parameters and vectors for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set specifically comprises:
the 2c commodity vectors around the central commodity p_0 are summed and averaged, i.e.:
for i = 0 to neg, calculate:
for each word vector x_k in context(p_0) (2c in total), update:
wherein η is the learning rate; y_i is 1 when i = 0, and y_i is 0 when i = 1, 2, …, neg; σ is the sigmoid function.
6. The offline store commodity association combination optimization method according to claim 1, wherein the support is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of those samples that contain commodity x.
7. An offline store commodity association combination optimization device, characterized by comprising:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generate a commodity association rule; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to take the sample data set as training samples, train through the word embedding vector technology of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relationship rules, and the commodity strong relationship rules comprise second association relations among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of a chain rule according to the confidence of a commodity association rule and the similarity of a commodity strong relationship rule that include the same commodity; and if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relationship rule.
8. An electronic device, comprising:
at least one memory and at least one processor;
the memory for storing one or more programs;
when executed by the at least one processor, the one or more programs cause the at least one processor to perform the steps of the offline store commodity association combination optimization method of any one of claims 1 to 6.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the offline store commodity association combination optimization method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211709578.5A CN115983921B (en) | 2022-12-29 | 2022-12-29 | Off-line store commodity association combination method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211709578.5A CN115983921B (en) | 2022-12-29 | 2022-12-29 | Off-line store commodity association combination method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115983921A true CN115983921A (en) | 2023-04-18 |
CN115983921B CN115983921B (en) | 2023-11-14 |
Family
ID=85971873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211709578.5A Active CN115983921B (en) | 2022-12-29 | 2022-12-29 | Off-line store commodity association combination method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115983921B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254242A1 (en) * | 2011-03-31 | 2012-10-04 | Infosys Technologies Limited | Methods and systems for mining association rules |
CN103473262A (en) * | 2013-07-17 | 2013-12-25 | 北京航空航天大学 | Automatic classification system and automatic classification method for Web comment viewpoint on the basis of association rule |
CN103700005A (en) * | 2013-12-17 | 2014-04-02 | 南京信息工程大学 | Association-rule recommending method based on self-adaptive multiple minimum supports |
US20150278350A1 (en) * | 2014-03-27 | 2015-10-01 | Microsoft Corporation | Recommendation System With Dual Collaborative Filter Usage Matrix |
CN107491988A (en) * | 2017-08-09 | 2017-12-19 | 浙江工商大学 | A kind of wisdom retail data method for digging based on genetic algorithm and improvement interest-degree |
CN108122126A (en) * | 2016-11-29 | 2018-06-05 | 财团法人工业技术研究院 | Method for extending association rule, device using same and computer readable medium |
US20180276688A1 (en) * | 2017-03-24 | 2018-09-27 | International Business Machines Corporation | System and method for a scalable recommender system using massively parallel processors |
CN110196904A (en) * | 2018-02-26 | 2019-09-03 | 佛山市顺德区美的电热电器制造有限公司 | A kind of method, apparatus and computer readable storage medium obtaining recommendation information |
CN110362670A (en) * | 2019-07-19 | 2019-10-22 | 中国联合网络通信集团有限公司 | Item property abstracting method and system |
CN110851571A (en) * | 2019-11-14 | 2020-02-28 | 拉扎斯网络科技(上海)有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN111914163A (en) * | 2020-06-20 | 2020-11-10 | 武汉海云健康科技股份有限公司 | Medicine combination recommendation method and device, electronic equipment and storage medium |
CN111915400A (en) * | 2020-07-30 | 2020-11-10 | 广州大学 | Personalized clothing recommendation method and device based on deep learning |
CN113988638A (en) * | 2021-10-29 | 2022-01-28 | 深圳壹账通智能科技有限公司 | Method and device for measuring and calculating strength of general association relationship, electronic equipment and medium |
CN114418663A (en) * | 2021-12-10 | 2022-04-29 | 珠海格力电器股份有限公司 | Commodity information processing method and device, computer equipment and storage medium |
CN115099857A (en) * | 2022-06-24 | 2022-09-23 | 广州华多网络科技有限公司 | Advertisement commodity combined publishing method and device, equipment, medium and product thereof |
Non-Patent Citations (1)
Title |
---|
裘立波; 姜元春; 林文龙: "Research on bundled commodities in the e-commerce environment" (电子商务环境下捆绑商品研究), 商业研究, no. 09, pages 186-188 * |
Also Published As
Publication number | Publication date |
---|---|
CN115983921B (en) | 2023-11-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||