CN115983921A - Offline store commodity association combination method, device, equipment and storage medium - Google Patents

Publication number
CN115983921A
Authority
CN
China
Prior art keywords
commodity
association
rule
commodities
confidence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211709578.5A
Other languages
Chinese (zh)
Other versions
CN115983921B (en)
Inventor
关梓文
林沛欣
许洁斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xuanwu Wireless Technology Co Ltd
Original Assignee
Guangzhou Xuanwu Wireless Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xuanwu Wireless Technology Co Ltd
Priority to CN202211709578.5A
Publication of CN115983921A
Application granted
Publication of CN115983921B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to an offline store commodity association combination method, device, equipment and storage medium. In the method, offline shopping data are obtained; based on association rules and a continuous bag-of-words (CBOW) model, the commodities in each order are matched as elements of each data set, the degree of matching between commodities and the corresponding confidence are calculated, and the association rules, commodity strong relation rules and chain rules among the commodities are determined. In this way, potential relations among commodities are found, including but not limited to substitution and bundling. The method overcomes the defect that an association cannot be calculated for commodities that have no co-occurrence relation.

Description

Offline store commodity association combination method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of intelligent data algorithms, and in particular to methods for optimizing commodity association combinations in offline stores.
Background
As society develops, new commodities continually appear, and people need to spend more time selecting products. For enterprises this trend is both a challenge and an opportunity: they need to capture customer demand promptly and adopt suitable sales strategies to win more customers and increase sales volume. One important sales strategy is combination selling: the similarity among commodities in shopping behavior is mined from purchase data, and commodity combinations of interest are offered to customers to stimulate their desire to purchase. For customers, this first helps them quickly find interesting, high-quality commodities and improves the shopping experience, and second reduces the effort needed to find the commodities they need. At present, combination selling in offline stores relies mainly on human experience and on association rule algorithms. Human experience means guessing the association between commodities from simple rules, such as bundling toothpaste with toothbrushes or displaying them side by side. This can increase sales to some extent, but the rules found manually are few or not very accurate. An association rule algorithm finds relations between items in a data set. Its main flow is to collect the shopping basket data of offline store customers, set a support threshold, find the frequent item sets of commodities in the data, set a confidence threshold, and compute and output association rules from the frequent item sets.
The advantage of this approach is that it targets frequent co-occurrence relations, can mine clear and useful results such as the well-known beer-and-diaper combination, and can process data covering a long time span. However, it can only mine relations between commodities that occur together; it cannot mine relations between commodities that never co-occur but are potentially associated.
Disclosure of Invention
Accordingly, an object of the present invention is to provide an offline store commodity association combination method, apparatus, device, and storage medium. The method collects the shopping basket data of offline store customers, outputs commodity association combinations by linking and fusing the rules of the association rule algorithm and the continuous bag-of-words (CBOW) model algorithm, and guides the optimization of strategies such as commodity marketing, bundled sales and display combination in a scientific and reasonable way. The method overcomes the defect that an association cannot be calculated for commodities that have no co-occurrence relation, and is highly general.
The invention relates to a commodity association combination optimization method for an offline store, which comprises the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating a commodity association rule; wherein Ck represents an item set only containing k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
taking the sample data set as a training sample, training by the word embedding vector technique of a continuous bag-of-words (CBOW) model, and outputting commodity vectors in a distributed representation;
calculating the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
obtaining the confidence of a chain rule from the confidence of a commodity association rule and the similarity of a commodity strong relation rule that contain the same commodity; and if the confidence of the chain rule is greater than a preset confidence threshold, forming the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
Further, scanning the data set with the search function to find the frequent item sets L1-Lk that meet the minimum support, and generating the association rule, specifically comprises the following steps:
combining the C1 item sets in L1 in pairs and calculating the support of each combination; when the support is greater than the set threshold, regarding the combination as a frequent item and taking it as an element of L2; and repeating the calculation until all combinations have been processed;
searching the frequent item sets found for item sets whose confidence is greater than or equal to a given threshold, and generating an association rule, wherein:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \& y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both element x and element y.
Further, the specific steps of training by the word embedding vector technique of the continuous bag-of-words model and outputting the commodity vectors in a distributed representation are as follows:
initializing the model parameters θ and the commodity vectors x_k, and obtaining, by a negative sampling method, neg center words p_i different from p_0; the dimension of the commodity vector is N; for each record sample, the center commodity is p_0 and the window is c, so the surrounding context contains 2c commodities in total, denoted context(p_0), and the true positive example is denoted (context(p_0), p_0);
for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, performing gradient ascent iterations to update the parameters and vectors;
if the gradient converges, the gradient iteration ends, with the parameters
Figure BDA0004026794730000022
and the commodity vectors x_k updated; otherwise, the iteration continues.
Further, the negative sampling method is calculated by the following formula,
Figure BDA0004026794730000031
wherein P(p_i) represents a weight and f(p_i) represents the frequency of occurrence of the word.
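The weighting idea can be sketched as follows. The equation image is not reproduced in this text, so the sketch assumes the common word2vec convention in which the weight is proportional to the frequency raised to the power 0.75; that exponent, the function names `negative_sampling_weights` and `sample_negatives`, and the toy baskets are all assumptions for illustration and are not taken from the patent.

```python
import random
from collections import Counter

def negative_sampling_weights(baskets, power=0.75):
    """Return {commodity: P(commodity)} with P proportional to f(commodity) ** power.

    power=0.75 is the usual word2vec choice and is an assumption here.
    """
    freq = Counter(item for basket in baskets for item in basket)
    scaled = {item: count ** power for item, count in freq.items()}
    total = sum(scaled.values())
    return {item: value / total for item, value in scaled.items()}

def sample_negatives(weights, center, neg, rng):
    """Draw `neg` commodities (with replacement) that differ from the center p0."""
    candidates = [item for item in weights if item != center]
    probs = [weights[item] for item in candidates]
    return rng.choices(candidates, weights=probs, k=neg)

# Toy baskets, hypothetical data for illustration only.
baskets = [
    ["green tea", "iced black tea", "yogurt"],
    ["green tea", "soda biscuit"],
    ["lemon tea", "cola"],
]
weights = negative_sampling_weights(baskets)
negatives = sample_negatives(weights, center="green tea", neg=3, rng=random.Random(0))
```

More frequent commodities receive a larger weight, and the center commodity is excluded before drawing so that every sampled p_i differs from p_0.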
Further, the step of performing gradient ascent iterations to update the parameters and vectors for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set specifically comprises the following steps:
the 2c commodity vectors around the center commodity p_0 are summed and averaged for the center commodity, i.e.:
$$x_{p_0} = \frac{1}{2c} \sum_{k=1}^{2c} x_k$$
for i = 0 to neg, calculate:
Figure BDA0004026794730000033
for each commodity vector x_k in context(p_0) (2c in total), update:
Figure BDA0004026794730000034
wherein η is the learning rate; y_i is 1 for i = 0 and y_i is 0 for i = 1, 2, …, neg; σ is the sigmoid function.
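The update steps above can be sketched in code. Because the per-sample equations are only available as images, this sketch follows the standard CBOW negative-sampling update from word2vec, which uses the same symbols (η, y_i, σ); treating that as the patent's exact update is an assumption, and the names `cbow_step`, `vectors`, `theta` and the toy values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cbow_step(vectors, theta, context, center, negatives, eta=0.05):
    """One gradient-ascent update for the sample (context(p0), p0, p1, ..., p_neg)."""
    dim = len(theta[center])
    # Average the 2c context vectors to represent the center commodity p0.
    hidden = [sum(vectors[c][d] for c in context) / len(context) for d in range(dim)]
    e = [0.0] * dim
    for i, item in enumerate([center] + list(negatives)):
        y = 1.0 if i == 0 else 0.0  # y_0 = 1 (positive example), y_i = 0 otherwise
        g = eta * (y - sigmoid(dot(hidden, theta[item])))
        for d in range(dim):
            e[d] += g * theta[item][d]       # accumulate using theta before its update
            theta[item][d] += g * hidden[d]  # update the parameters of p_i
    for c in context:                        # update each of the 2c context vectors
        for d in range(dim):
            vectors[c][d] += e[d]

# Toy two-dimensional example, hypothetical values for illustration only.
vectors = {"green tea": [0.1, 0.2], "yogurt": [0.3, -0.1]}
theta = {"iced black tea": [0.0, 0.0], "cola": [0.0, 0.0]}
context = ["green tea", "yogurt"]
for _ in range(20):
    cbow_step(vectors, theta, context, "iced black tea", ["cola"])

hidden = [(vectors["green tea"][d] + vectors["yogurt"][d]) / 2 for d in range(2)]
p_positive = sigmoid(dot(hidden, theta["iced black tea"]))
p_negative = sigmoid(dot(hidden, theta["cola"]))
```

After a few iterations the model scores the true center commodity above 0.5 and the sampled negative below 0.5, which is the behaviour gradient ascent on this objective is meant to produce. A real run would loop over all samples and stop on a convergence criterion rather than a fixed count.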
Further, the support is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of samples in N that contain commodity x.
In another aspect, the present application provides an offline store commodity association combination optimization apparatus, including:
a data acquisition module: used for acquiring shopping data and establishing a sample data set according to the shopping data;
a first association rule generation module: used for establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating a commodity association rule; wherein Ck represents an item set only containing k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module: used for taking the sample data set as a training sample, training by the word embedding vector technique of a continuous bag-of-words (CBOW) model, and outputting commodity vectors in a distributed representation;
a second association relation calculation module: used for calculating the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
a third association relation calculation module: used for obtaining the confidence of a chain rule from the confidence of a commodity association rule and the similarity of a commodity strong relation rule that contain the same commodity; and if the confidence of the chain rule is greater than a preset confidence threshold, forming the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
In another aspect, the present application further provides an electronic device, including:
at least one memory and at least one processor;
the memory for storing one or more programs;
when executed by the at least one processor, the one or more programs cause the at least one processor to implement the steps of the offline store commodity association combination optimization method described in any one of the above.
In another aspect, the present application further provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the steps of any one of the offline store commodity association combination optimization methods described above.
With the offline store commodity association combination optimization method, an offline store can complement or substitute commodities according to the commodity association combination rule base, and can optimize strategies such as commodity marketing, bundled sales and display combination in a scientific and reasonable way, for example bundled sales within a brand and cross sales of commodities of different brands. The method can find potential strong associations among offline store commodities that have never co-occurred, and thereby increases commodity sales volume.
For a better understanding and practice, the invention is described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a flowchart of a method for optimizing commodity association combinations of offline stores according to an embodiment of the present application;
Fig. 2 is a block diagram of a structure of an offline store commodity association combination optimization apparatus according to an embodiment of the present application;
Fig. 3 is a block diagram illustrating a structure of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only some embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the embodiments in the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
As new commodities continually appear, people need to spend time choosing among them, but in today's fast-paced society they often cannot spend much time on selection, which makes launching and selling new products difficult. Existing stores therefore bundle selected products so that customers can quickly find interesting, high-quality products and spend less time choosing, or display complementary commodities side by side to promote one another's sales. However, these associations between commodities are usually discovered from everyday habits and common sense. They are relatively subjective and strongly shaped by personal judgment, so the resulting association rules are few and not very accurate, and some potential associations between commodities cannot be discovered.
Based on the above background, and with reference to Fig. 1, this embodiment provides a commodity association combination optimization method applied to an offline store. The method collects offline store shopping basket data, calculates and outputs commodity association combinations by fusing association rules with a continuous bag-of-words (CBOW) model, and guides the optimization of strategies such as combination sales and bundled sales. It specifically comprises the following steps:
S10: acquiring shopping data, and establishing a sample data set according to the shopping data.
The shopping data is the list of commodities purchased by a customer, and each commodity represents one element. The sample data set stores the customers' shopping data for a given period, with each customer's group of purchases taken as one sample.
S20: establishing C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating a commodity association rule; wherein Ck represents an item set only containing k elements, Lk represents a frequent item set containing k data sets, and k is a non-negative integer; the commodity association rule includes a first association relation among a group of commodities.
The C1-Ck item sets range from the item set containing only 1 element to the item set containing only k elements; the C1 item set is the 1-item set. The L1-Lk frequent item sets range from the frequent item set containing 1 data set to the frequent item set containing k data sets. For example, a C1 item set {toothpaste; iced black tea; toothpaste} and a C2 item set {lemon tea/cola; green tea/toothpaste} contain five elements in total. The L1 of the toothpaste is 0.6.
S30: taking the sample data set as a training sample, training by the word embedding vector technique of a continuous bag-of-words (CBOW) model, and outputting commodity vectors in a distributed representation.
A continuous bag-of-words model based on negative sampling is constructed, and the model parameters θ and the commodity vectors x_k are initialized. Assume the dimension of the commodity vector is N. For each record sample, the center commodity is p_0 and the window is c, so the surrounding context contains 2c commodities in total, denoted context(p_0). Negative sampling yields neg center words p_i different from p_0, where i = 1, 2, …, neg. The negative examples are denoted (context(p_0), p_i), and the true positive example is denoted (context(p_0), p_0).
S40: calculating the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities.
Cosine similarity measures how similar two vectors are by the angle between them in space. In this application it is applied to the distributed commodity vectors output by training with the word embedding vector technique of the continuous bag-of-words model.
S50: obtaining the confidence coefficient of the chain rule according to the confidence coefficient of the commodity association rule comprising the same commodity and the similarity of the commodity strong relation rule; and if the confidence of the chain rule is greater than a preset confidence threshold, forming the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
In a specific embodiment, the technical solution is implemented as follows.
Shopping basket data of offline stores are collected, comprising the shopping basket order number and the list of commodities in the order, and the data are stored in ClickHouse. Sample data are as follows:
Order number | Commodity list
LZ_B_M_220715_123554 | Green tea / crystal sugar snow pear / iced black tea / high calcium milk / yogurt / waffle biscuit
LZ_B_M_220716_091521 | Blackcurrant / natural mineral water / green tea / soda biscuit / caramel toast
LZ_B_M_220716_111617 | Chocolate / green tea / iced black tea / chewing gum / sausage / yogurt
LZ_B_M_220718_201522 | Green tea / toothpaste / toothbrush
LZ_B_M_220720_153018 | Ice cream cone / lemon tea / cola
LZ_B_M_220725_102522 | Fruit orange / soda cracker / garlic-flavored peanut / sesame cake / breakfast biscuit
The association rule algorithm is then constructed. In this embodiment, the minimum support is set to 0.05 and the minimum confidence is set to 0.6, and the frequent item sets are calculated.
1) Create the 1-item set C1, scan the data set with the search function, calculate the support, and find the frequent item set L1 whose support is greater than or equal to the minimum support of 0.05.
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of samples in N that contain commodity x.
Taking the 6 sample records as an example, the support of green tea is 4/6 ≈ 0.67, which is greater than or equal to the minimum support, so green tea is judged to be a frequent item set.
2) Combine the 1-item sets in L1 in pairs, repeat step 1), and find the frequent item set L2.
Taking the 6 sample records as an example, the support of (green tea, iced black tea) is 2/6 ≈ 0.33, which is greater than or equal to the minimum support, so it is judged to be a frequent item set.
The above operations are repeated until the k-item sets have been processed. Taking the 6 sample records as an example, one such k-item set is the 3-item set (green tea, iced black tea, yogurt), whose support is 0.33.
3) Generate the association rules: among the frequent item sets found, search for item sets whose confidence is greater than or equal to 0.6 and generate the association rules. The confidence is calculated by the following formula:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \& y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets containing element x in the frequent item set, and N(x & y) is the number of item sets in N(x) that contain both element x and element y.
Taking the 6 sample records as an example, one rule is calculated as confidence(green tea, iced black tea → yogurt) = 2/2 = 1, which is greater than or equal to the minimum confidence of 0.6, so it is determined to be an association rule.
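Steps 1) to 3) can be sketched as a small level-wise (Apriori-style) search over the six sample orders. This is a minimal illustration rather than the patented implementation: the function names `support`, `frequent_itemsets` and `association_rules` are hypothetical, the baskets are transcribed from the sample table, and confidence is computed as a ratio of supports, which equals N(x & y) / N(x) when both are counted over the same orders.

```python
from itertools import combinations

baskets = [
    {"green tea", "crystal sugar snow pear", "iced black tea",
     "high calcium milk", "yogurt", "waffle biscuit"},
    {"blackcurrant", "natural mineral water", "green tea",
     "soda biscuit", "caramel toast"},
    {"chocolate", "green tea", "iced black tea", "chewing gum", "sausage", "yogurt"},
    {"green tea", "toothpaste", "toothbrush"},
    {"ice cream cone", "lemon tea", "cola"},
    {"fruit orange", "soda cracker", "garlic-flavored peanut",
     "sesame cake", "breakfast biscuit"},
]

def support(itemset, baskets):
    """support(x) = N(x) / N: share of orders that contain every item of x."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def frequent_itemsets(baskets, min_support):
    """Level-wise search: build L1, join it into C2, filter to L2, and so on."""
    level = {frozenset([item]) for basket in baskets for item in basket}
    level = {s for s in level if support(s, baskets) >= min_support}
    frequent = {}
    while level:
        for s in level:
            frequent[s] = support(s, baskets)
        size = len(next(iter(level))) + 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        level = {c for c in candidates if support(c, baskets) >= min_support}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules above the threshold."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

frequent = frequent_itemsets(baskets, min_support=0.05)
rules = association_rules(frequent, min_confidence=0.6)
```

With these six orders, `frequent` gives green tea a support of 4/6 and (green tea, iced black tea) a support of 2/6, and `rules` contains (green tea, iced black tea → yogurt) with confidence 1, matching the worked figures above. The exhaustive candidate join is fine for a toy data set; a production system would prune candidates more aggressively.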
A continuous bag-of-words model based on negative sampling is then constructed.
The commodity lists of the offline shopping basket orders are taken as training samples, training is performed by the word embedding vector technique of the continuous bag-of-words model, and commodity vectors in a distributed representation are output, with the commodity vector dimension N set to 10. Sample commodity vector data are as follows (for ease of presentation, values are shown to three decimal places).
Commodity | vec1 | vec2 | vec3 | vec4 | vec5 | vec6 | vec7 | vec8 | vec9 | vec10
Green tea | 0.139 | -0.259 | -0.047 | -0.244 | -0.724 | -0.581 | 0.213 | -0.395 | -0.514 | 0.029
Iced black tea | 0.018 | 0.042 | 0.079 | 0.521 | -0.729 | -0.544 | -0.245 | -0.281 | -0.102 | -0.168
Soda biscuit | 0.469 | 0.833 | 0.689 | 0.0769 | -0.441 | -0.614 | -0.012 | -0.426 | -0.676 | 0.184
Yogurt | 0.542 | -0.724 | 0.624 | 0.589 | 0.094 | -0.576 | -0.113 | -0.343 | -0.183 | 0.142
...
The most similar commodities of each commodity are calculated through cosine similarity, with the similarity threshold set to 0.7 in this embodiment, to form the commodity similarity table:
Commodity | Similar commodities (similarity greater than or equal to 0.7)
Green tea | Soda biscuit: 0.993; iced black tea: 0.811
Iced black tea | Soda biscuit: 0.954; green tea: 0.811
Soda biscuit | Green tea: 0.993; iced black tea: 0.954; yogurt: 0.736
Yogurt | Soda biscuit: 0.736
Each record of the commodity similarity table is judged to be a strong relationship rule, and the numerical part is the strong relationship similarity.
Yogurt and soda biscuits never co-occur in the shopping basket data; their relation is a potential one mined by the distributed representation method, which strengthens the generalization ability of the algorithm.
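Building the similarity table can be sketched as follows. The function names `cosine_similarity` and `similarity_table` and the two-dimensional toy vectors are hypothetical; the toy vectors are not the commodity vectors of the embodiment and are used only so that the result is easy to check by hand.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_table(vectors, threshold=0.7):
    """For each commodity, list the other commodities with similarity >= threshold."""
    table = {}
    for name, vec in vectors.items():
        scored = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in vectors.items()
            if other != name
        ]
        kept = [(other, score) for other, score in scored if score >= threshold]
        table[name] = sorted(kept, key=lambda pair: -pair[1])  # most similar first
    return table

# Hypothetical toy vectors: A and B point in nearly the same direction, C does not.
toy_vectors = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
table = similarity_table(toy_vectors)
```

Each entry of `table` plays the role of one record of the commodity similarity table, that is, one group of strong relationship rules together with their similarities.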
The rules produced by the association rule algorithm and by the continuous bag-of-words model are then linked. The confidence threshold of the chain rule is set to 0.7; if the chain confidence is greater than or equal to 0.7, the linked rule is a chain rule. The chain rule confidence equals the association rule confidence multiplied by the strong relationship similarity. The specific method is as follows:
the association rule calculated in the above steps is (green tea, iced black tea → yogurt); it is linked with the calculated strong relationship rule (yogurt → soda biscuit) to generate the rule (green tea, iced black tea → yogurt → soda biscuit).
This result suggests the rule (green tea, iced black tea → soda biscuit); that is, green tea and iced black tea are correlated with each other, and purchases of them are strongly associated with purchases of soda biscuits.
Chain rule confidence = association rule confidence × strong relationship similarity = 1 × 0.736 = 0.736, which is greater than or equal to 0.7, so the rule is determined to be a chain rule.
The chain rule shows that when an intermediate commodity such as yogurt is out of stock in the store, soda biscuits can be sold in combination as a substitute commodity.
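The linking step can be sketched as below. The data layout (an association rule as an antecedent tuple, a single consequent commodity and a confidence, and the strong relationships as a dictionary of similar commodities) and the function name `chain_rules` are assumptions made for illustration; only the multiplication and the 0.7 threshold come from the text.

```python
def chain_rules(association_rules, strong_relations, threshold=0.7):
    """Link each rule X -> y with each strong relation y -> z and keep X -> y -> z
    when confidence(X -> y) * similarity(y, z) >= threshold."""
    chained = []
    for antecedent, consequent, confidence in association_rules:
        for similar, similarity in strong_relations.get(consequent, []):
            if similar in antecedent:
                continue  # skip links that point back into the antecedent
            chain_confidence = confidence * similarity
            if chain_confidence >= threshold:
                chained.append((antecedent, consequent, similar, chain_confidence))
    return chained

# Values taken from the worked example in the text.
rules = [(("green tea", "iced black tea"), "yogurt", 1.0)]
strong = {"yogurt": [("soda biscuit", 0.736)]}
result = chain_rules(rules, strong)
```

Here `result` holds the single chain (green tea, iced black tea → yogurt → soda biscuit) with confidence 1 × 0.736 = 0.736. Note that the claims state the threshold test as "greater than" while this embodiment uses "greater than or equal to"; the sketch follows the embodiment.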
The association rules, strong relationship rules and chain rules are deduplicated and then merged by taking their union, and the commodity association combination rule base is output. Examples are as follows:
Serial number | Commodity association combination | Rule type
1 | Green tea, iced black tea, soda biscuit | Chain rule
2 | Green tea, iced black tea, yogurt | Association rule / strong relationship rule
3 | Yogurt, soda biscuit | Strong relationship rule
4 | Soda biscuit, green tea, iced black tea, yogurt | Strong relationship rule
5 | ... | ...
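One way to deduplicate and merge the three kinds of rules is sketched below. It assumes that two rules are duplicates when they cover the same set of commodities regardless of order, and it records every rule type that produced a combination; both choices, and the name `build_rule_base`, are assumptions for illustration rather than details given in the text.

```python
def build_rule_base(*typed_rules):
    """Merge (rule_type, combinations) pairs into {commodity set: {rule types}}."""
    base = {}
    for rule_type, combos in typed_rules:
        for combo in combos:
            # frozenset ignores order, so (a, b) and (b, a) collapse into one entry
            base.setdefault(frozenset(combo), set()).add(rule_type)
    return base

rule_base = build_rule_base(
    ("association rule", [("green tea", "iced black tea", "yogurt")]),
    ("strong relationship rule", [("yogurt", "soda biscuit"), ("soda biscuit", "yogurt")]),
    ("chain rule", [("green tea", "iced black tea", "soda biscuit")]),
)
```

The two orderings of (yogurt, soda biscuit) collapse into one entry, so `rule_base` ends up with three combinations, each tagged with the rule types that generated it.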
With the offline store commodity association combination optimization method provided here, an offline store can complement or substitute commodities according to the commodity association combination rule base, and can optimize strategies such as commodity marketing, bundled sales and display combination in a scientific and reasonable way, for example bundled sales within a brand and cross sales of commodities of different brands. The method can find potential strong associations among offline store commodities that have never co-occurred, and thereby increases commodity sales volume.
With reference to Fig. 2, the present invention further provides an offline store commodity association combination optimization apparatus, including:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish the C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k items, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to take the sample data set as training samples, train with the word-embedding vector technique of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate, by cosine similarity, the most similar commodity of each commodity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
a third association calculation module, configured to obtain the confidence of a chain rule according to the confidence of a commodity association rule and the similarity of a commodity strong relation rule that contain the same commodity; and, if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
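As an illustration of how the third association calculation module might link the two kinds of rules, the following minimal sketch joins an association rule x → y with a strong relation rule between y and z that shares commodity y. Combining the two values by multiplication is an assumption made here for illustration; this passage does not state the exact formula. The function name `chain_rules`, the dictionary layouts, the threshold value and the sample data are likewise hypothetical.

```python
def chain_rules(assoc_rules, similar_items, threshold):
    """Link association rules (x -> y) with strong relation rules (y ~ z).

    assoc_rules:   dict mapping (x, y) -> confidence of the rule x -> y
    similar_items: dict mapping y -> (z, cosine similarity between y and z)
    threshold:     preset confidence threshold for keeping a chain rule
    """
    chains = {}
    for (x, y), conf in assoc_rules.items():
        if y not in similar_items:
            continue
        z, sim = similar_items[y]
        if z == x:
            continue  # skip chains that lead back to the starting commodity
        chain_conf = conf * sim  # assumed combination: simple product
        if chain_conf > threshold:
            chains[(x, y, z)] = chain_conf
    return chains


# Hypothetical example data
assoc_rules = {("beer", "chips"): 0.8, ("milk", "bread"): 0.5}
similar_items = {"chips": ("peanuts", 0.9), "bread": ("jam", 0.6)}
result = chain_rules(assoc_rules, similar_items, threshold=0.5)
```

With this sample data only the chain beer → chips → peanuts passes the threshold, even though beer and peanuts never appear together in any rule, which is the kind of indirect association the chain rule is meant to surface.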
In this application, the first association rule generation module includes a confidence calculation unit, configured to search, among the most frequent item sets found, for the item sets whose confidence is greater than or equal to a given threshold and to generate association rules, wherein:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets in the frequent item sets that contain element x, and N(x & y) is the number of those item sets that contain both element x and element y.
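The confidence described above, i.e. the share of item sets containing x that also contain y, can be sketched in a few lines. As a simplifying assumption, the counts here are taken directly over raw shopping baskets rather than over the frequent item sets; the function name `confidence` and the sample baskets are hypothetical.

```python
def confidence(transactions, x, y):
    """confidence(x -> y) = N(x & y) / N(x), counted over the given baskets."""
    n_x = sum(1 for t in transactions if x in t)
    n_xy = sum(1 for t in transactions if x in t and y in t)
    return n_xy / n_x if n_x else 0.0


# Hypothetical shopping baskets
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "milk"},
    {"beer", "milk"},
    {"milk", "bread"},
]
conf_beer_chips = confidence(baskets, "beer", "chips")
```

Here three baskets contain beer and two of them also contain chips, so a rule beer → chips would be kept for any threshold up to 2/3.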
In another embodiment, the iterative training module comprises a bag-of-words model unit, configured to initialize the model parameters θ and the commodity vectors x_k, and to obtain neg central words p_i (i = 1, 2, …, neg) different from p_0 by a negative sampling method. The dimension of the commodity vector is N; for each record sample, the central commodity is p_0, the window is c, and the surrounding context contains 2c commodities in total, denoted context(p_0).
For each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, gradient-ascent iterations are performed to update the parameters and vectors.
The negative sampling method is specifically calculated by the following formula:
Figure BDA0004026794730000101
wherein P(p_i) represents the weight and f(p_i) represents the frequency of occurrence of the word.
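The formula image itself is not reproduced in this text. As a rough sketch only, the code below assumes the weighting commonly used in word2vec-style negative sampling, in which each frequency f(p_i) is raised to the power 3/4 and then normalized; the actual formula in the figure may differ. The function names, the `power` parameter and the sample records are hypothetical.

```python
import random
from collections import Counter


def sampling_weights(records, power=0.75):
    """Return P(p_i) for every commodity: f(p_i)^power normalised to sum to 1."""
    freq = Counter(item for record in records for item in record)
    scaled = {item: count ** power for item, count in freq.items()}
    total = sum(scaled.values())
    return {item: value / total for item, value in scaled.items()}


def draw_negatives(weights, center, neg, rng):
    """Draw `neg` commodities different from the central commodity `center`."""
    candidates = [item for item in weights if item != center]
    probs = [weights[item] for item in candidates]
    return rng.choices(candidates, weights=probs, k=neg)


# Hypothetical purchase records
records = [["beer", "chips"], ["beer", "milk"], ["beer", "bread", "milk"]]
weights = sampling_weights(records)
negatives = draw_negatives(weights, "beer", neg=3, rng=random.Random(0))
```

An exponent below 1 flattens the distribution, so rarely bought commodities are drawn as negatives somewhat more often than their raw frequency would suggest.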
The iterative process of updating the parameters and vectors specifically includes: for the central commodity p_0, summing and averaging the 2c commodity vectors of the surrounding context, i.e.:
Figure BDA0004026794730000102
for i = 0 to neg, calculate:
Figure BDA0004026794730000103
for each word vector x_k in context(p_0) (2c in total), update:
Figure BDA0004026794730000104
wherein η is the learning rate; y_i is 1 when i = 0, and y_i is 0 when i = 1, 2, …, neg; σ is the sigmoid function.
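Because the formula images for this update are not reproduced in the text, the following is only a sketch of one gradient-ascent step as it is usually written for a continuous bag-of-words model with negative sampling; it is assumed, not confirmed, that the figures show the same update. The function name `cbow_step`, the dictionary layout and the sample values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cbow_step(vectors, theta, context, center, negatives, eta):
    """One gradient-ascent update for a sample (context(p_0), p_0, p_1, ..., p_neg).

    vectors: dict commodity -> list[float], the commodity vectors x_k
    theta:   dict commodity -> list[float], the model parameters per commodity
    """
    dim = len(next(iter(vectors.values())))
    # Sum and average the 2c context vectors.
    x0 = [sum(vectors[k][d] for k in context) / len(context) for d in range(dim)]
    e = [0.0] * dim
    # i = 0 is the true central commodity (y = 1); the others are negatives (y = 0).
    for i, p in enumerate([center] + list(negatives)):
        y = 1.0 if i == 0 else 0.0
        f = sigmoid(sum(a * b for a, b in zip(x0, theta[p])))
        g = eta * (y - f)
        # Accumulate the correction with the old theta, then update theta.
        e = [ed + g * td for ed, td in zip(e, theta[p])]
        theta[p] = [td + g * xd for td, xd in zip(theta[p], x0)]
    # Apply the accumulated correction to every context vector.
    for k in context:
        vectors[k] = [xd + ed for xd, ed in zip(vectors[k], e)]


# Hypothetical two-dimensional example
vectors = {"beer": [0.1, 0.2], "chips": [0.0, 0.1],
           "milk": [0.3, -0.1], "bread": [-0.2, 0.4]}
theta = {name: [0.0, 0.0] for name in vectors}
cbow_step(vectors, theta, context=["chips", "milk"], center="beer",
          negatives=["bread"], eta=0.1)
```

In this first step all parameters start at zero, so only the parameters move: the entry for the true central commodity is pushed toward the averaged context vector and the entry for the negative sample is pushed away from it. Training would repeat such steps over all samples until the gradient converges.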
As shown in fig. 2, fig. 2 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure. The electronic device includes a processor 910 and a memory 920. The number of the processors 910 in the main control chip may be one or more, and one processor 910 is taken as an example in fig. 2. The number of the memories 920 in the main control chip may be one or more, and one memory 920 is taken as an example in fig. 2.
The memory 920 is used as a computer-readable storage medium, and can be used for storing a software program, a computer-executable program, and modules, such as a program of the offline store product association optimization method according to any embodiment of the present application, and program instructions/modules corresponding to the offline store product association optimization method according to any embodiment of the present application. The memory 920 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the device, and the like. Further, the memory 920 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 920 may further include memory located remotely from the processor 910, which may be connected to devices over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 910 runs the software programs, instructions and modules stored in the memory 920 to execute the various functional applications and data processing of the device, that is, to implement the offline store commodity association combination optimization method described in any of the above embodiments.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for optimizing the offline store commodity association combination according to any one of the above embodiments is implemented.
The present invention may take the form of a computer program product embodied on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having program code embodied therein. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, to those skilled in the art, changes and modifications may be made without departing from the spirit of the present invention, and it is intended that the present invention encompass such changes and modifications.

Claims (9)

1. An offline store commodity association combination optimization method is characterized by comprising the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing the C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k items, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
taking the sample data set as a training sample, training by a word embedding vector technology of a continuous bag-of-word model, and outputting a commodity vector represented in a distributed mode;
calculating the most similar commodity of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
obtaining the confidence of the chain rule according to the confidence of the commodity association rule comprising the same commodity and the similarity of the commodity strong relationship rule; and if the confidence of the chain rule is greater than a preset confidence threshold, forming a chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
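The commodity similarity table of the cosine-similarity step can be sketched as follows: for each commodity, the other commodity whose vector has the highest cosine similarity is recorded together with that similarity. The function names and the two-dimensional sample vectors are hypothetical and stand in for the trained commodity vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_table(vectors):
    """Map each commodity to (most similar other commodity, cosine similarity)."""
    table = {}
    for p, vp in vectors.items():
        best = max(
            ((q, cosine(vp, vq)) for q, vq in vectors.items() if q != p),
            key=lambda pair: pair[1],
        )
        table[p] = best
    return table


# Hypothetical commodity vectors
vectors = {"chips": [1.0, 0.1], "peanuts": [0.9, 0.2], "milk": [0.0, 1.0]}
table = similarity_table(vectors)
```

Each entry of `table` then plays the role of one record of the similarity table, i.e. one strong relation rule between a commodity and its nearest neighbour in the vector space.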
2. The offline store commodity association combination optimization method according to claim 1, wherein scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating the association rules, specifically comprises the following steps:
combining the item sets in L1 in pairs, calculating the support of each combination, regarding a combination as a frequent item when its support is greater than a set threshold and taking it as an element of L2, and repeating the calculation until all combinations have been calculated;
searching, among the most frequent item sets found, for the item sets whose confidence is greater than or equal to a given threshold, and generating association rules, wherein:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
wherein confidence(x → y) is the confidence, N(x) is the total number of item sets in the frequent item sets that contain element x, and N(x & y) is the number of those item sets that contain both element x and element y.
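The level-by-level search for frequent item sets can be sketched as below. This is a simplified Apriori-style illustration under stated assumptions, not the claimed implementation: it uses "greater than or equal to" for the support test, omits the usual subset-pruning step, and the function names and sample baskets are hypothetical.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return {k: {itemset: support}} for the frequent item sets L1, L2, ..."""
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate set C1
    levels, k = {}, 1
    while current:
        scored = {c: support(transactions, c) for c in current}
        frequent = {c: s for c, s in scored.items() if s >= min_support}
        if not frequent:
            break
        levels[k] = frequent  # frequent item sets Lk
        k += 1
        # Join step: unions of two frequent sets that contain exactly k items.
        current = {a | b for a, b in combinations(frequent, 2) if len(a | b) == k}
    return levels


# Hypothetical shopping baskets
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "milk"},
    {"beer", "milk"},
    {"milk", "bread"},
]
levels = frequent_itemsets(baskets, min_support=0.5)
```

With a minimum support of 0.5, bread drops out at the first level, the pairs {beer, chips} and {beer, milk} survive at the second, and the only three-item candidate fails the support test, so the search stops there.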
3. The offline store commodity association combination optimization method according to claim 2, wherein the specific steps of training through a word embedding vector technology of a continuous bag-of-words model and outputting a commodity vector represented in a distributed manner are as follows:
initializing model parameters θ and commodity vectors x_k, and obtaining neg central words p_i (i = 1, 2, …, neg) different from p_0 by a negative sampling method; the dimension of the commodity vector is N; for each record sample, the central commodity is p_0, the window is c, and the surrounding context contains 2c commodities in total, denoted context(p_0), the true positive example being (context(p_0), p_0);
for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, performing the gradient-ascent iteration that updates the parameters and vectors;
if the gradient is converged, finishing the gradient iteration and updating the parameters
Figure FDA0004026794720000021
and the commodity vectors x_k; otherwise, continuing the iteration.
4. The offline store commodity association combination optimization method according to claim 3, wherein:
the negative sampling method is calculated by the following formula,
Figure FDA0004026794720000022
wherein P(p_i) represents the weight and f(p_i) represents the frequency of occurrence of the word.
5. The offline store commodity association combination optimization method according to claim 4, wherein the step of performing the gradient-ascent iteration to update the parameters and vectors for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set specifically comprises:
for the central commodity p_0, summing and averaging the 2c commodity vectors of the surrounding context, i.e.:
Figure FDA0004026794720000023
for i = 0 to neg, calculate:
Figure FDA0004026794720000024
for each word vector x_k in context(p_0) (2c in total), update:
Figure FDA0004026794720000025
wherein η is the learning rate; y_i is 1 when i = 0, and y_i is 0 when i = 1, 2, …, neg; σ is the sigmoid function.
6. The offline store commodity association combination optimization method according to claim 1, wherein the support degree is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
wherein support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of those samples that contain commodity x.
7. An offline store commodity association combination optimization device, characterized by comprising:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish the C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k items, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to take the sample data set as training samples, train with the word-embedding vector technique of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate, by cosine similarity, the most similar commodity of each commodity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities;
a third association calculation module, configured to obtain the confidence of a chain rule according to the confidence of a commodity association rule and the similarity of a commodity strong relation rule that contain the same commodity; and, if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodity strong relation rule.
8. An electronic device, comprising:
at least one memory and at least one processor;
the memory for storing one or more programs;
when executed by the at least one processor, the one or more programs cause the at least one processor to perform the steps of the offline store commodity association combination optimization method of any one of claims 1 to 6.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the offline store commodity association combination optimization method according to any one of claims 1 to 6.
CN202211709578.5A 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium Active CN115983921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211709578.5A CN115983921B (en) 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211709578.5A CN115983921B (en) 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115983921A true CN115983921A (en) 2023-04-18
CN115983921B CN115983921B (en) 2023-11-14

Family

ID=85971873

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211709578.5A Active CN115983921B (en) 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115983921B (en)

Also Published As

Publication number Publication date
CN115983921B (en) 2023-11-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant