CN115983921B - Off-line store commodity association combination method, device, equipment and storage medium - Google Patents


Info

Publication number
CN115983921B
Authority
CN
China
Prior art keywords
commodity
association
rule
relation
commodities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211709578.5A
Other languages
Chinese (zh)
Other versions
CN115983921A (en
Inventor
关梓文
林沛欣
许洁斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xuanwu Wireless Technology Co Ltd
Original Assignee
Guangzhou Xuanwu Wireless Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xuanwu Wireless Technology Co Ltd
Priority to CN202211709578.5A
Publication of CN115983921A
Application granted
Publication of CN115983921B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a method, device, equipment and storage medium for offline store commodity association combination. The method acquires offline shopping data, matches the commodity elements in each order and in each data set using association rules and a continuous bag-of-words (CBOW) model, calculates the matching degree between commodities and the confidence of those matches, and finally determines the association rules, commodity strong relationship rules and chain rules between commodities. In this way, potential association rules between commodities are found, including but not limited to substitution and bundling. The method overcomes the problem that an association cannot be calculated when commodities have no co-occurrence relationship.

Description

Off-line store commodity association combination method, device, equipment and storage medium
Technical Field
The application relates to the technical field of intelligent data algorithms, and in particular to a method for optimizing commodity association combinations in offline stores.
Background
With the progress of society, new commodities continuously emerge in daily life, and people need more time to choose among them. For enterprises this trend is both a challenge and an opportunity: they need to capture customer demand in time and adopt suitable sales strategies to win more customers and thereby increase commodity sales. One important sales strategy is combination sales, in which the similarity between commodities in shopping behavior is mined from purchase data and commodity combinations of interest are offered to customers to stimulate their desire to purchase. For customers, this helps them quickly find interesting, high-quality commodities, which improves the shopping experience, and it reduces the effort spent searching for the commodities they need. Existing combination sales methods for offline stores rely mainly on human experience and on association rule algorithms. Human experience infers associations between commodities from simple common sense, for example bundling toothpaste with toothbrushes or displaying them side by side. This can increase sales to some extent, but the rules people discover are few or not very accurate. An association rule algorithm finds relationships between items in a data set. Its main process is to collect customer shopping basket data from offline stores, set a support threshold, find the frequent commodity item sets in the data, set a confidence threshold, and calculate and output association rules from the frequent item sets. Its advantages are that, where frequent co-occurrence relationships exist, it can mine clear and useful results, such as the well-known beer and diaper combination, and that it supports variable-length data.
However, such an algorithm can only mine relationships between commodities that appear together; it cannot mine relationships between commodities that never co-occur but are potentially associated.
Disclosure of Invention
Accordingly, an object of the present application is to provide a method, apparatus, device and storage medium for offline store commodity association combination. The method collects customer shopping basket data from offline stores, outputs commodity association combinations by linking and fusing the rules produced by an association rule algorithm and a continuous bag-of-words (CBOW) model algorithm, and provides a scientific and reasonable basis for optimizing strategies such as commodity marketing, bundled sales and display combinations. It overcomes the defect that an association cannot be calculated when commodities have no co-occurrence relationship, and is therefore more generally applicable.
The application provides a commodity association combination optimization method for an offline store, which comprises the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing the C1 to Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1 to Lk whose support is greater than or equal to the minimum support, and generating commodity association rules; wherein Ck denotes the item sets containing exactly k elements, Lk denotes the frequent item sets containing k data items, and k is a non-negative integer; each commodity association rule comprises a first association relation among a group of commodities;
training on the sample data set as training samples with the word-embedding technique of a continuous bag-of-words (CBOW) model, and outputting commodity vectors in a distributed representation;
calculating the most similar commodities of each commodity by cosine similarity to form a commodity similarity table; wherein each record of the commodity similarity table corresponds to a group of commodity strong relationship rules, and each commodity strong relationship rule comprises a second association relation among the group of commodities;
obtaining the confidence of a chain rule from the confidence of a commodity association rule and the similarity of a commodity strong relationship rule that contain the same commodity; and, if the confidence of the chain rule is greater than a preset confidence threshold, forming the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodities in the commodity strong relationship rule.
Further, scanning the data set with a search function to find the frequent item sets L1 to Lk that meet the minimum support and generating association rules specifically comprises the following steps:
combining the 1-item sets in L1 in pairs and calculating the support of each combination; when the support is greater than the set threshold, the combination is regarded as frequent and taken as an element of L2; the calculation is repeated until all combinations have been processed;
searching the largest frequent set found for item sets whose confidence is greater than or equal to a given threshold, and generating association rules, wherein the confidence is calculated as:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
where confidence(x→y) is the confidence, N(x) is the total number of item sets in the frequent item set that contain element x, and N(x & y) is the number of item sets counted in N(x) that contain both element x and element y.
Further, training with the word-embedding technique of the continuous bag-of-words model and outputting the commodity vectors in a distributed representation specifically comprises the following steps:
initializing the model parameters θ and the commodity vectors x_k, and obtaining, through a negative sampling method, neg center words p_i (i = 1, 2, …, neg) different from p_0; wherein the dimension of the commodity vector is N, and for each recorded sample the center commodity is p_0, the window is c, the surrounding context contains 2c commodities in total, denoted context(p_0), and the true positive example is denoted (context(p_0), p_0);
for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set, performing gradient ascent iterations to update the parameters and vectors;
if the gradient converges, ending the gradient iteration and outputting the updated parameters θ and the commodity vector of each commodity; otherwise continuing the iteration, wherein θ is a logistic regression parameter.
Further, the sampling weight used by the negative sampling method is calculated by the following formula:
wherein P(p_i) is the sampling weight of p_i, f(p_i) is the frequency with which the word occurs, and n in the summation is the total number of all training sample items.
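As a hedged illustration only: a widely used weight of this kind is the word2vec form of negative sampling. The 3/4 exponent below is an assumption borrowed from that convention and is not stated in this document; the symbols P, f, p_i and n follow the definitions above.

```latex
P(p_i) = \frac{f(p_i)^{3/4}}{\sum_{j=0}^{n} f(p_j)^{3/4}}
```

Under this assumption, raising the frequency to a power below 1 flattens the distribution, so rarely purchased commodities are drawn as negative examples somewhat more often than their raw frequency would suggest.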
Further, performing gradient ascent iterations to update the parameters and vectors for each sample (context(p_0), p_0, p_1, …, p_neg) in the training set specifically comprises:
summing and averaging the 2c commodity vectors around the center commodity p_0, i.e.:
for i = 0, 1, 2, …, neg, calculating in sequence:
for each of the 2c word vectors x_k in the context, updating:
wherein η is the learning rate; y_i is 1 when i = 0 and 0 when i = 1, 2, …, neg; and σ is the sigmoid function.
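A minimal sketch of what these three calculations could look like, assuming the standard CBOW update with negative sampling. The auxiliary symbols \bar{x}, f, g and e, and the per-word parameter notation \theta^{p_i}, are assumptions introduced here for illustration; η, y_i, σ, x_k and c are as defined above.

```latex
% 1) average of the 2c context vectors
\bar{x} = \frac{1}{2c} \sum_{k=1}^{2c} x_k

% 2) for i = 0, 1, \dots, neg (with e initialised to 0)
f = \sigma\!\left(\bar{x}^{\top} \theta^{p_i}\right), \qquad
g = \eta \,(y_i - f), \qquad
e \leftarrow e + g\,\theta^{p_i}, \qquad
\theta^{p_i} \leftarrow \theta^{p_i} + g\,\bar{x}

% 3) for each of the 2c context vectors
x_k \leftarrow x_k + e
```

In this form the positive example (y_0 = 1) pulls the averaged context vector and the parameter of p_0 together, while each negative example (y_i = 0) pushes them apart.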
Further, the support is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
where support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of those samples that contain commodity x.
In another aspect, the present application provides an offline store commodity association combination optimization apparatus, comprising:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish the C1 to Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1 to Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck denotes the item sets containing exactly k elements, Lk denotes the frequent item sets containing k data items, and k is a non-negative integer; each commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to train on the sample data set as training samples with the word-embedding technique of a continuous bag-of-words (CBOW) model and output commodity vectors in a distributed representation;
a second association relation calculation module, configured to calculate the most similar commodities of each commodity by cosine similarity to form a commodity similarity table; wherein each record of the commodity similarity table corresponds to a group of commodity strong relationship rules, and each commodity strong relationship rule comprises a second association relation among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of a chain rule from the confidence of a commodity association rule and the similarity of a commodity strong relationship rule that contain the same commodity, and, if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodities in the commodity strong relationship rule.
In another aspect, the present application also provides an electronic device, including:
at least one memory and at least one processor;
the memory is used for storing one or more programs;
the one or more programs, when executed by the at least one processor, cause the at least one processor to implement the steps of the offline store commodity association combination optimization method described in any of the above.
In another aspect, the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the offline store commodity association combination optimization method described in any of the above.
The application provides a commodity association combination optimization method for offline stores. Commodity complement or substitution operations can be carried out according to the commodity association combination rule base, so that strategies such as commodity marketing, bundled sales and display combinations, including bundled sales of branded commodities and cross sales of commodities of different brands, can be optimized scientifically and reasonably. The method can find potential strong associations between offline store commodities that have no co-occurrence relationship, and can thereby improve commodity sales.
For a better understanding and implementation, the present application is described in detail below with reference to the drawings.
Drawings
FIG. 1 is a flowchart of a method for optimizing commodity association combinations of an offline store according to an embodiment of the present application;
FIG. 2 is a block diagram of a device for optimizing the association and combination of off-line store commodities according to an embodiment of the present application;
FIG. 3 is a block diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, embodiments of the present application. All other embodiments that a person of ordinary skill in the art can obtain from the present application without creative effort are intended to fall within the scope of the embodiments of the present application.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of embodiments of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application as detailed in the accompanying claims. In the description of the present application, it should be understood that the terms "first," "second," "third," and the like are used merely to distinguish between similar objects and are not necessarily used to describe a particular order or sequence, nor should they be construed to indicate or imply relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
Furthermore, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.
As new products continuously appear, people need to spend time choosing among them, but in today's fast-paced society they often do not have much time to do so, which makes launching and selling new products difficult. Some stores therefore bundle certain products for sale, which helps customers quickly find interesting, high-quality products and reduces the time they spend choosing, or display complementary products side by side so that the products promote one another. However, the associations between commodities used today are mostly derived from everyday habits and common sense. They are relatively subjective and reflect personal judgment, so the resulting association rules are few and not very accurate, and some potential relationships between commodities cannot be found.
Based on the above background, and with reference to FIG. 1, this embodiment proposes a commodity association combination optimization method applied to offline stores. The method collects shopping basket data from offline stores, fuses association rules with a continuous bag-of-words (CBOW) model, and calculates and outputs commodity association combinations, which can be used to guide the optimization of strategies such as combination sales and bundled sales. It specifically comprises the following steps:
S10: shopping data is acquired, and a sample data set is established according to the shopping data.
Shopping data are the shopping lists of customers' purchases, in which each commodity is one element. The sample data set stores the customer shopping data collected over one period, and the group of shopping data of each customer is taken as one sample.
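A minimal sketch of how such a sample data set could be built, assuming each raw record is an (order number, slash-separated commodity list) pair as in the sample table of the embodiment. The function name `build_sample_dataset` and the lower-casing of names are illustrative assumptions, not part of the described method.

```python
# Hypothetical raw records: (order number, slash-separated commodity list).
RAW_ORDERS = [
    ("LZ_B_M_220718_201522", "Green tea/toothpaste/toothbrush"),
    ("LZ_B_M_220720_153018", "Ice cream cone/lemon tea/cola"),
]


def build_sample_dataset(raw_orders):
    """Turn each order into one sample: an ordered, de-duplicated list of names."""
    samples = []
    for _order_no, commodity_list in raw_orders:
        names = [part.strip().lower() for part in commodity_list.split("/")]
        # dict.fromkeys keeps the first occurrence of each name and preserves order.
        items = list(dict.fromkeys(name for name in names if name))
        if items:
            samples.append(items)
    return samples


samples = build_sample_dataset(RAW_ORDERS)
```

Keeping the items in their original order is a deliberate choice in this sketch: the item-set steps can convert each sample to a set, while a window-based embedding step can use the same sample as a sequence.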
S20: according to the sample data set, the C1 to Ck item sets are established, the data set is scanned with a search function to find the frequent item sets L1 to Lk whose support is greater than or equal to the minimum support, and commodity association rules are generated; wherein Ck denotes the item sets containing exactly k elements, Lk denotes the frequent item sets containing k data items, and k is a non-negative integer; each commodity association rule includes a first association relation among a group of commodities.
The C1 to Ck item sets range from the item sets containing exactly 1 element to the item sets containing exactly k elements; the C1 item sets are the 1-item sets. The L1 to Lk frequent item sets range from the frequent item sets containing 1 data item to the frequent item sets containing k data items. For example, given the C1 item sets {toothpaste}, {iced black tea} and {toothpaste} and the C2 item sets {lemon tea, cola} and {green tea, toothpaste}, there are five item sets in total, three of which contain toothpaste, so the support of toothpaste in L1 is 3/5 = 0.6.
S30: the sample data set is used as training samples for the word-embedding technique of a continuous bag-of-words (CBOW) model, and commodity vectors in a distributed representation are output.
A continuous bag-of-words model based on negative sampling is constructed, and the model parameters θ and the commodity vectors x_k are initialized. The dimension of the commodity vector is N. For each recorded sample, the center commodity is p_0, the window is c, and the surrounding context contains 2c commodities in total, denoted context(p_0). Through negative sampling, neg center words p_i different from p_0 are obtained, where i = 1, 2, …, neg. A negative example, which does not actually occur, is denoted (context(p_0), p_i), and the true positive example is denoted (context(p_0), p_0).
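The training step for one sample can be sketched as follows, assuming the standard CBOW update with negative sampling. The function name `cbow_ns_step`, the zero initialization of the output parameters `theta`, and the toy two-dimensional vectors are all assumptions made for illustration; only the roles of the context, the center commodity p_0, the negatives p_i and the learning rate come from the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cbow_ns_step(x, theta, context, center, negatives, eta):
    """One gradient-ascent update for a (context, center, negatives) sample.

    x     : dict commodity -> input vector (updated in place for context items)
    theta : dict commodity -> output parameter vector (updated in place)
    """
    dim = len(theta[center])
    # Average of the context vectors.
    x_bar = [sum(x[c][d] for c in context) / len(context) for d in range(dim)]
    e = [0.0] * dim
    for i, p in enumerate([center] + list(negatives)):
        y = 1.0 if i == 0 else 0.0  # label: 1 for the true center, 0 for negatives
        f = sigmoid(sum(a * b for a, b in zip(x_bar, theta[p])))
        g = eta * (y - f)
        for d in range(dim):
            e[d] += g * theta[p][d]    # accumulate using the old theta value
            theta[p][d] += g * x_bar[d]
    for c in context:
        for d in range(dim):
            x[c][d] += e[d]


def score(x, theta, context, target):
    """Model probability that `target` is the center commodity of `context`."""
    dim = len(theta[target])
    x_bar = [sum(x[c][d] for c in context) / len(context) for d in range(dim)]
    return sigmoid(sum(a * b for a, b in zip(x_bar, theta[target])))


# Toy data (hypothetical values): two context commodities, one center, one negative.
x = {"green tea": [0.1, -0.2], "yoghurt": [0.3, 0.1]}
theta = {"iced black tea": [0.0, 0.0], "cola": [0.0, 0.0]}
context, center, negatives = ["green tea", "yoghurt"], "iced black tea", ["cola"]
cbow_ns_step(x, theta, context, center, negatives, eta=0.1)
```

With zero-initialized `theta`, one step already moves the score of the true center above 0.5 and the score of the negative below 0.5, which is the behavior the update is meant to produce. A production system would more likely rely on an existing word-embedding library than on a hand-written loop.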
S40: calculating the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise second association relations among the group of commodities.
Cosine similarity measures the degree of similarity between two vectors by the angle between them in vector space. In the present application it is applied to the distributed commodity vectors output by training with the word-embedding technique of the continuous bag-of-words model.
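For reference, the standard definition of cosine similarity between two commodity vectors a and b of dimension N can be written as below; the symbols a and b are placeholders introduced here.

```latex
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
           = \frac{\sum_{d=1}^{N} a_d\, b_d}{\sqrt{\sum_{d=1}^{N} a_d^{2}}\;\sqrt{\sum_{d=1}^{N} b_d^{2}}}
```

The value lies between -1 and 1, and values close to 1 indicate that the two commodities appear in similar purchase contexts.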
S50: obtaining the confidence coefficient of the chain rule according to the confidence coefficient of the commodity association rule of the same commodity and the similarity of the commodity strong relation rule; and if the confidence coefficient of the chain rule is larger than a preset confidence coefficient threshold value, forming the chain rule, wherein the chain rule comprises a third association relation between the commodity association rule and the commodity in the commodity strong relation rule.
In a specific embodiment, the present technical solution is implemented by the following solution.
Shopping basket data of the offline store, including the shopping basket order numbers and the commodity list of each shopping basket order, are acquired and stored in ClickHouse. Sample data are as follows:

| Order number | Commodity list |
| --- | --- |
| LZ_B_M_220715_123554 | Green tea/rock sugar snow pear/iced black tea/high-calcium milk/yoghurt/waffle biscuit |
| LZ_B_M_220716_091521 | Blackcurrant/natural mineral water/green tea/soda cracker/caramel toast |
| LZ_B_M_220716_111617 | Chocolate/green tea/iced black tea/chewing gum/sausage/yoghurt |
| LZ_B_M_220718_201522 | Green tea/toothpaste/toothbrush |
| LZ_B_M_220720_153018 | Ice cream cone/lemon tea/cola |
| LZ_B_M_220725_102522 | Fruit orange/soda cake/garlic peanut/sesame cake/breakfast biscuit |
An association rule algorithm is then constructed. In this embodiment, the minimum support is set to 0.05 and the minimum confidence is set to 0.6, and the frequent item sets are calculated.
1) The 1-item sets C1 are created, the data set is scanned with a search function, the support is calculated, and the frequent item sets L1 whose support is greater than or equal to the minimum support of 0.05 are found. The support is calculated as:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
where support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of those samples that contain commodity x.
Taking the 6 samples as an example, green tea appears in 4 of the 6 orders, so its support is 0.67, which is greater than or equal to the minimum support, and green tea is judged to be frequent.
2) The 1-item sets in L1 are combined in pairs, step 1) is repeated, and the frequent item sets L2 are found.
Taking the 6 samples as an example, the support of (green tea, iced black tea) is 0.33, which is greater than or equal to the minimum support, so the pair is judged to be a frequent item set.
The above operation is repeated until the k-item sets have been processed. Taking the 6 samples as an example, the k-item set is the 3-item set (green tea, iced black tea, yoghurt), with a support of 0.33.
3) Generating association rules: the largest frequent set found is searched for item sets whose confidence is greater than or equal to 0.6, and association rules are generated. The confidence is calculated as:
$$\mathrm{confidence}(x \rightarrow y) = \frac{N(x \,\&\, y)}{N(x)}$$
where confidence(x→y) is the confidence, N(x) is the total number of item sets in the frequent item set that contain element x, and N(x & y) is the number of item sets counted in N(x) that contain both element x and element y.
Taking the 6 samples as an example, one rule is calculated as confidence(green tea, iced black tea → yoghurt) = 2/2 = 1. The confidence is greater than or equal to the minimum confidence of 0.6, so the rule is judged to be an association rule.
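Steps 1) to 3) can be sketched in a few lines of Python. This is a simplified level-wise search in the spirit of Apriori, not necessarily the exact search function used in the embodiment. Two assumptions are made and flagged in the comments: supports and confidences are counted over orders, and the demonstration uses a minimum support of 0.3 rather than 0.05, because with only six orders a threshold of 0.05 would make every subset of every order frequent.

```python
from itertools import combinations

# The six sample orders, with commodity names lower-cased.
ORDERS = [
    frozenset({"green tea", "rock sugar snow pear", "iced black tea",
               "high-calcium milk", "yoghurt", "waffle biscuit"}),
    frozenset({"blackcurrant", "natural mineral water", "green tea",
               "soda cracker", "caramel toast"}),
    frozenset({"chocolate", "green tea", "iced black tea", "chewing gum",
               "sausage", "yoghurt"}),
    frozenset({"green tea", "toothpaste", "toothbrush"}),
    frozenset({"ice cream cone", "lemon tea", "cola"}),
    frozenset({"fruit orange", "soda cake", "garlic peanut", "sesame cake",
               "breakfast biscuit"}),
]


def support(itemset, orders):
    """Fraction of orders that contain every commodity in `itemset`."""
    return sum(1 for order in orders if itemset <= order) / len(orders)


def frequent_itemsets(orders, min_support):
    """Level-wise search: L1, then pairwise unions for L2, L3, ... until empty."""
    items = sorted({item for order in orders for item in order})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), orders) >= min_support]
    found, k = {}, 1
    while level:
        for itemset in level:
            found[itemset] = support(itemset, orders)
        k += 1
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        level = [c for c in candidates if support(c, orders) >= min_support]
    return found


def confidence(antecedent, consequent, orders):
    """confidence(x -> y) = N(x & y) / N(x), counted over orders (assumption)."""
    n_x = sum(1 for order in orders if antecedent <= order)
    if n_x == 0:
        return 0.0
    n_xy = sum(1 for order in orders if (antecedent | consequent) <= order)
    return n_xy / n_x


# Demonstration threshold of 0.3 is an assumption chosen for this tiny data set.
frequent = frequent_itemsets(ORDERS, min_support=0.3)
rule_conf = confidence(frozenset({"green tea", "iced black tea"}),
                       frozenset({"yoghurt"}), ORDERS)
```

With this threshold the largest frequent item set is (green tea, iced black tea, yoghurt), and the confidence of the rule (green tea, iced black tea → yoghurt) is 1, matching the figures in the text.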
A continuous bag-of-words model based on negative sampling is then constructed.
The commodity lists of the offline store's shopping basket orders are used as training samples, and commodity vectors in a distributed representation are output by training with the word-embedding technique of the continuous bag-of-words model. In this embodiment the dimension N of the commodity vector is 10. Sample commodity vector data are as follows (for convenience of presentation, the values are rounded to three decimal places).

| Commodity | vec1 | vec2 | vec3 | vec4 | vec5 | vec6 | vec7 | vec8 | vec9 | vec10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Green tea | 0.139 | -0.259 | -0.047 | -0.244 | -0.724 | -0.581 | 0.213 | -0.395 | -0.514 | 0.029 |
| Iced black tea | 0.018 | 0.042 | 0.079 | 0.521 | -0.729 | -0.544 | -0.245 | -0.281 | -0.102 | -0.168 |
| Soda biscuits | 0.469 | 0.833 | 0.689 | 0.0769 | -0.441 | -0.614 | -0.012 | -0.426 | -0.676 | 0.184 |
| Yoghurt | 0.542 | -0.724 | 0.624 | 0.589 | 0.094 | -0.576 | -0.113 | -0.343 | -0.183 | 0.142 |
| ... | | | | | | | | | | |
Using cosine similarity with a similarity threshold of 0.7, this embodiment calculates the most similar commodities of each commodity and forms a commodity similarity table:

| Commodity | Similar commodities (similarity greater than or equal to 0.7) |
| --- | --- |
| Green tea | Soda biscuits: 0.993; iced black tea: 0.811 |
| Iced black tea | Soda biscuits: 0.954; green tea: 0.811 |
| Soda biscuits | Green tea: 0.993; iced black tea: 0.954; yoghurt: 0.736 |
| Yoghurt | Soda biscuits: 0.736 |
Each record of the commodity similarity table is judged to be a strong relationship rule, and its numerical value is the strong relationship similarity.
Yoghurt and soda biscuits never co-occur in the shopping basket data; their relationship is a potential one mined through the distributed representation, which enhances the generalization ability of the algorithm.
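Building such a similarity table can be sketched as follows. The function names and the toy two-dimensional vectors are hypothetical; real commodity vectors would come from the trained model, and the threshold of 0.7 is the one used in this embodiment.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def similarity_table(vectors, threshold):
    """Map each commodity to its similar commodities, most similar first."""
    table = {}
    for name, vec in vectors.items():
        scored = [(other, cosine(vec, other_vec))
                  for other, other_vec in vectors.items() if other != name]
        kept = [(other, sim) for other, sim in scored if sim >= threshold]
        table[name] = sorted(kept, key=lambda pair: pair[1], reverse=True)
    return table


# Toy vectors (hypothetical values, not taken from the sample table).
toy_vectors = {
    "item_a": [1.0, 0.0],
    "item_b": [0.9, 0.1],
    "item_c": [0.0, 1.0],
}
table = similarity_table(toy_vectors, threshold=0.7)
```

Each non-empty entry of `table` corresponds to one record of the commodity similarity table, that is, to a group of strong relationship rules for that commodity.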
The association rules are then linked with the results of the continuous bag-of-words model. The chain rule confidence threshold is set to 0.7, and a chain rule is obtained when its confidence is greater than or equal to 0.7. The chain rule confidence equals the association rule confidence multiplied by the strong relationship similarity. The specific method is as follows:
The association rule calculated in the above steps, (green tea, iced black tea → yoghurt), is linked with the calculated strong relationship rule (yoghurt, soda biscuits) to generate the rule (green tea, iced black tea → yoghurt → soda biscuits).
From this result the rule (green tea, iced black tea, soda biscuits) is derived; that is, green tea and iced black tea are associated with soda biscuits, and a customer who purchases soda biscuits has a strong willingness to purchase green tea and iced black tea.
The chain rule confidence is calculated as association rule confidence × strong relationship rule similarity = 1 × 0.736 = 0.736, which is greater than or equal to 0.7, so the rule is determined to be a chain rule.
The chain rule indicates that when an intermediate commodity such as yoghurt is out of stock in the offline store, soda biscuits can be used as a substitute commodity for combination sales.
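The linking step can be sketched as below. The data layout (tuples for association rules, a nested dictionary for strong relationship rules) and the function name `link_chain_rules` are assumptions for illustration; the multiplication of confidence by similarity and the threshold of 0.7 follow the text, and the comparison uses "greater than or equal to" as in this embodiment.

```python
def link_chain_rules(association_rules, strong_rules, min_chain_confidence):
    """Link association rules with strong relationship rules via a shared commodity.

    association_rules : list of (antecedent frozenset, consequent name, confidence)
    strong_rules      : dict commodity -> {similar commodity: similarity}
    Returns a list of (antecedent, intermediate, substitute, chain confidence).
    """
    chains = []
    for antecedent, consequent, rule_conf in association_rules:
        for similar, similarity in strong_rules.get(consequent, {}).items():
            if similar in antecedent:
                continue  # the substitute must not already be part of the rule
            chain_conf = rule_conf * similarity
            if chain_conf >= min_chain_confidence:
                chains.append((antecedent, consequent, similar, chain_conf))
    return chains


# Values taken from the worked example in the text.
association_rules = [
    (frozenset({"green tea", "iced black tea"}), "yoghurt", 1.0),
]
strong_rules = {"yoghurt": {"soda biscuits": 0.736}}
chains = link_chain_rules(association_rules, strong_rules, 0.7)
```

For the worked example this yields a single chain rule from (green tea, iced black tea) through yoghurt to soda biscuits with a confidence of 0.736.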
(5) The association rules, strong relationship rules and chain rules are deduplicated and then merged into their union, and the commodity association combination rule base is output. Examples are as follows:

| Sequence number | Commodity association combination | Rule type |
| --- | --- | --- |
| 1 | Green tea, iced black tea and soda biscuits | Chain rule |
| 2 | Green tea, iced black tea and yoghurt | Association rule/strong relationship rule |
| 3 | Yoghurt and soda biscuits | Strong relationship rule |
| 4 | Soda biscuits, green tea, iced black tea and yoghurt | Strong relationship rule |
| 5 | ... | ... |
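A minimal sketch of the deduplication and union step, assuming that a commodity combination is treated as an unordered set so that the same combination found by different rule types is stored once with all of its types. The function name `build_rule_base` and this merging policy are illustrative assumptions.

```python
def build_rule_base(association, strong, chain):
    """Merge the three rule lists into one rule base keyed by commodity combination.

    Each argument is a list of commodity-name tuples. Returns a dict mapping a
    frozenset of commodities to the list of rule types that produced it.
    """
    rule_base = {}
    sources = (
        ("association rule", association),
        ("strong relationship rule", strong),
        ("chain rule", chain),
    )
    for rule_type, rules in sources:
        for commodities in rules:
            key = frozenset(commodities)
            types = rule_base.setdefault(key, [])
            if rule_type not in types:
                types.append(rule_type)
    return rule_base


rule_base = build_rule_base(
    association=[("green tea", "iced black tea", "yoghurt")],
    strong=[("yoghurt", "soda biscuits"), ("soda biscuits", "yoghurt")],
    chain=[("green tea", "iced black tea", "soda biscuits")],
)
```

In this example the two orderings of (yoghurt, soda biscuits) collapse into a single entry, so the rule base contains three combinations.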
The application provides a commodity association combination optimization method for offline stores. Commodity complement or substitution operations can be carried out according to the commodity association combination rule base, so that strategies such as commodity marketing, bundled sales and display combinations, including bundled sales of branded commodities and cross sales of commodities of different brands, can be optimized scientifically and reasonably. The method can find potential strong associations between offline store commodities that have no co-occurrence relationship, and can thereby improve commodity sales.
With reference to FIG. 2, the present application further provides an offline store commodity association combination optimization apparatus, including:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish the C1 to Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1 to Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck denotes the item sets containing exactly k elements, Lk denotes the frequent item sets containing k data items, and k is a non-negative integer; each commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to train on the sample data set as training samples with the word-embedding technique of a continuous bag-of-words (CBOW) model and output commodity vectors in a distributed representation;
a second association relation calculation module, configured to calculate the most similar commodities of each commodity by cosine similarity to form a commodity similarity table; wherein each record of the commodity similarity table corresponds to a group of commodity strong relationship rules, and each commodity strong relationship rule comprises a second association relation among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of a chain rule from the confidence of a commodity association rule and the similarity of a commodity strong relationship rule that contain the same commodity, and, if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodities in the commodity strong relationship rule.
In the present application, the first association rule generation module comprises a confidence calculation unit, configured to search the largest frequent item sets found for item sets whose confidence is greater than or equal to a given threshold and to generate association rules, the confidence being calculated as:
$$\mathrm{confidence}(x \to y) = \frac{N(x \,\&\, y)}{N(x)}$$
where confidence(x→y) is the confidence, N(x) is the total number of item sets in the frequent item set that contain element x, and N(x&y) is the number of item sets within N(x) that contain both element x and element y.
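The level-wise search for frequent item sets and the confidence ratio can be sketched as follows. This is a minimal illustration of an Apriori-style search, not the patented implementation. The counts here are taken over the raw baskets rather than over the frequent item set, which is an assumption, and the function names and sample baskets are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Build candidates Ck level by level and keep those meeting min_support as Lk."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]  # C1
    frequent = {}
    k = 1
    while level:
        scored = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in scored.items() if s >= min_support}  # Lk
        frequent.update(kept)
        k += 1
        # Next candidates: unions of two Lk sets that have exactly k items.
        level = list({a | b for a, b in combinations(kept, 2) if len(a | b) == k})
    return frequent

def confidence(x, y, transactions):
    """confidence(x -> y) = N(x & y) / N(x), counted over the transactions."""
    n_x = sum(1 for t in transactions if x <= t)
    n_xy = sum(1 for t in transactions if (x | y) <= t)
    return n_xy / n_x if n_x else 0.0

# Hypothetical shopping baskets.
baskets = [frozenset(b) for b in (
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk"},
)]
freq = frequent_itemsets(baskets, min_support=0.5)
conf_milk_bread = confidence(frozenset({"milk"}), frozenset({"bread"}), baskets)
```

With a minimum support of 0.5, the pairs {milk, bread} and {bread, butter} survive while {milk, butter} does not, and milk -> bread has confidence 2/3 because two of the three baskets containing milk also contain bread.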
In another embodiment, the iterative training module includes a bag-of-words model unit, configured to initialize the model parameters θ and the commodity vectors $x_k$, and to obtain, by a negative sampling method, neg center words $p_i$ different from $p_0$; wherein the dimension of the commodity vector is N and, for each recorded sample, the center commodity is $p_0$, the window is c, and the surrounding context contains 2c commodities in total, denoted $\mathrm{context}(p_0)$.
For each sample $(\mathrm{context}(p_0), p_0, p_1, \ldots, p_{neg})$ in the training set, the parameters and vectors are updated iteratively by gradient ascent.
The negative sampling method specifically comprises the following steps:
where $P(p_i)$ represents the sampling weight of $p_i$ and $f(p_i)$ represents the frequency of occurrence of the word.
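A frequency-weighted negative sampler can be sketched in Python as follows. The exact weighting formula is not reproduced in this text, so the `power` exponent below (0.75 is the value commonly used with word2vec) is an assumption, as are the function names and the sample baskets. Setting `power=1.0` gives weights directly proportional to frequency.

```python
import random
from collections import Counter

def sampling_weights(baskets, power=0.75):
    """Map each commodity to a normalized sampling weight.

    ASSUMPTION: weight is proportional to frequency ** power.
    """
    counts = Counter(item for basket in baskets for item in basket)
    scaled = {p: c ** power for p, c in counts.items()}
    total = sum(scaled.values())
    return {p: s / total for p, s in scaled.items()}

def draw_negatives(weights, center, neg, rng=random):
    """Draw `neg` commodities other than `center`, with replacement."""
    candidates = [p for p in weights if p != center]
    probs = [weights[p] for p in candidates]
    return rng.choices(candidates, weights=probs, k=neg)

# Hypothetical baskets; "a" occurs twice, "b" and "c" once each.
baskets = [["a", "b"], ["a", "c"]]
weights = sampling_weights(baskets, power=1.0)
negatives = draw_negatives(weights, center="a", neg=5, rng=random.Random(0))
```

Because sampling is with replacement, the same negative commodity may be drawn more than once; the center commodity is excluded before drawing so that it never appears as its own negative example.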
The process of iteratively updating the parameters and vectors is specifically as follows: the 2c commodity vectors around the center commodity $p_0$ are summed and averaged, namely $\frac{1}{2c}\sum_{k=1}^{2c} x_k$.
For i = 0, 1, 2, …, neg, the following is calculated in sequence:
Each word vector $x_k$ in the training set (2c in total) is updated:
where η is the learning rate; $y_i$ is 1 when i = 0, and $y_i$ is 0 when i = 1, 2, …, neg; σ is the sigmoid function.
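One gradient-ascent step for a single sample can be sketched in Python as follows. The update formulas are not reproduced in this text, so the sketch follows the standard continuous bag-of-words negative-sampling update from word2vec, which is an assumption about what the omitted formulas contain. The function name, the dictionary layout, and the toy vectors are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cbow_negative_sampling_step(x, theta, context, center, negatives, eta=0.05):
    """Update x and theta in place for one sample (context(p0), p0, p1..p_neg).

    x:     commodity -> input vector (list of floats)
    theta: commodity -> logistic-regression parameter vector
    Returns the averaged context vector used for the step.
    """
    dim = len(next(iter(x.values())))
    # Average the 2c context vectors.
    x0 = [sum(x[p][d] for p in context) / len(context) for d in range(dim)]
    e = [0.0] * dim  # accumulated gradient for the context vectors
    for i, p in enumerate([center] + list(negatives)):
        y = 1.0 if i == 0 else 0.0  # label: 1 for p0, 0 for negatives
        f = sigmoid(sum(a * b for a, b in zip(x0, theta[p])))
        g = eta * (y - f)
        for d in range(dim):
            e[d] += g * theta[p][d]   # uses theta[p][d] before it is updated
            theta[p][d] += g * x0[d]
    for p in context:  # each of the 2c context vectors gets the same update
        for d in range(dim):
            x[p][d] += e[d]
    return x0

# Hypothetical two-dimensional vectors; theta starts at zero.
x = {"a": [0.1, 0.2], "b": [0.3, -0.1], "c": [0.0, 0.0], "d": [0.0, 0.0]}
theta = {p: [0.0, 0.0] for p in x}
x0 = cbow_negative_sampling_step(x, theta, ["a", "b"], "c", ["d"])
```

After one step the parameters of the center commodity move toward the averaged context vector and those of the negative sample move away from it, which is the behavior the labels $y_i$ = 1 and $y_i$ = 0 are meant to produce.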
Fig. 2 is a block diagram illustrating the structure of an electronic device according to an exemplary embodiment of the present application. The electronic device includes a processor 910 and a memory 920. The number of processors 910 in the main control chip may be one or more, and one processor 910 is illustrated in fig. 2. The number of memories 920 in the main control chip may be one or more, and one memory 920 is illustrated in fig. 2.
The memory 920, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the program of the off-line store commodity association combination optimization method according to any embodiment of the present application and the program instructions/modules corresponding to that method. The memory 920 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the device, and the like. In addition, the memory 920 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 920 may further include memory located remotely from the processor 910, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 910 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 920, that is, it implements the off-line store commodity association combination optimization method described in any of the above embodiments.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the method for optimizing the offline store commodity association combination according to any one of the above embodiments.
The present application may take the form of a computer program product embodied on one or more storage media (including, but not limited to, magnetic disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The above examples illustrate only a few embodiments of the application. They are described specifically and in detail, but are not thereby to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make modifications and improvements without departing from the spirit of the application, and the application is intended to encompass such modifications and improvements.

Claims (9)

1. A method for optimizing the off-line store commodity association combination, characterized by comprising the following steps:
acquiring shopping data, and establishing a sample data set according to the shopping data;
establishing the C1-Ck item sets according to the sample data set, scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k items, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
training on the sample data set as training samples through the word embedding vector technique of a continuous bag-of-words model, and outputting commodity vectors in distributed representation;
calculating the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; wherein each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rules comprise a second association relation between the group of commodities;
obtaining the confidence coefficient of a chain rule according to the confidence coefficient of the commodity association rule comprising the same commodity and the similarity of the commodity strong relation rule; and if the confidence coefficient of the chain rule is larger than a preset confidence coefficient threshold value, forming a chain rule, wherein the chain rule comprises a third association relation between the commodity association rule and the commodity in the commodity strong relation rule.
2. The method for optimizing the off-line store commodity association combination according to claim 1, wherein scanning the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generating the commodity association rules, specifically comprises the steps of:
combining the C1 item sets in L1 in pairs and calculating the support of each combination; when the support is greater than a set threshold, regarding the combination as a frequent item and taking it as an element of L2; and repeating the calculation until all combinations have been calculated;
searching the largest frequent item sets found for item sets whose confidence is greater than or equal to a given threshold, and generating association rules, wherein:
$$\mathrm{confidence}(x \to y) = \frac{N(x \,\&\, y)}{N(x)}$$
where confidence(x→y) is the confidence, N(x) is the total number of item sets in the frequent item set that contain element x, and N(x&y) is the number of item sets within N(x) that contain both element x and element y.
3. The method for optimizing the off-line store commodity association combination according to claim 2, wherein training through the word embedding vector technique of the continuous bag-of-words model and outputting the commodity vectors in distributed representation specifically comprises the steps of:
initializing the model parameters θ and the commodity vectors $x_k$, and obtaining, by a negative sampling method, neg center words $p_i$ different from $p_0$; wherein the dimension of the commodity vector is N and, for each recorded sample, the center commodity is $p_0$, the window is c, the surrounding context contains 2c commodities in total, denoted $\mathrm{context}(p_0)$, and the true positive example is denoted $(\mathrm{context}(p_0), p_0)$;
for each sample $(\mathrm{context}(p_0), p_0, p_1, \ldots, p_{neg})$ in the training set, iteratively updating the parameters and vectors by gradient ascent;
if the gradient converges, ending the gradient iteration and updating the parameters θ and the commodity vector of each commodity; otherwise continuing the iteration, wherein θ is a logistic regression parameter.
4. The method for optimizing offline store commodity association according to claim 3, wherein:
the negative sampling method is calculated by the following formula,
where $P(p_i)$ represents the weight, $f(p_i)$ represents the frequency of occurrence of the word, and n in the summation symbol represents the total number of items in all training samples.
5. The method of claim 4, wherein, for each sample $(\mathrm{context}(p_0), p_0, p_1, \ldots, p_{neg})$, the process of iteratively updating the parameters and vectors by gradient ascent specifically comprises:
summing and averaging the 2c commodity vectors around the center commodity $p_0$, namely $\frac{1}{2c}\sum_{k=1}^{2c} x_k$;
for i = 0, 1, 2, …, neg, calculating in sequence:
updating each commodity word vector $x_k$ in the training set:
where η is the learning rate; $y_i$ is 1 when i = 0, and $y_i$ is 0 when i = 1, 2, …, neg; σ is the sigmoid function.
6. The method for optimizing the off-line store commodity association combination according to claim 1, wherein the support is calculated by the following formula:
$$\mathrm{support}(x) = \frac{N(x)}{N}$$
where support(x) is the support of commodity x, N is the total number of samples in the sample data set, and N(x) is the number of those samples that contain commodity x.
7. An off-line store commodity association combination optimizing device, which is characterized by comprising:
a data acquisition module, configured to acquire shopping data and establish a sample data set according to the shopping data;
a first association rule generation module, configured to establish the C1-Ck item sets according to the sample data set, scan the data set with a search function to find the frequent item sets L1-Lk whose support is greater than or equal to the minimum support, and generate commodity association rules; wherein Ck represents an item set containing only k elements, Lk represents a frequent item set containing k items, and k is a non-negative integer; the commodity association rule comprises a first association relation among a group of commodities;
an iterative training module, configured to train on the sample data set as training samples through the word embedding vector technique of a continuous bag-of-words model, and output commodity vectors in distributed representation;
a second association relation calculation module, configured to calculate the most similar commodities of each commodity through cosine similarity to form a commodity similarity table; wherein each record of the commodity similarity table corresponds to a group of commodity strong relation rules, and the commodity strong relation rule comprises a second association relation among the group of commodities;
a third association relation calculation module, configured to obtain the confidence of a chain rule according to the confidence of the commodity association rule and the similarity of the commodity strong relation rule that contain the same commodity; and, if the confidence of the chain rule is greater than a preset confidence threshold, to form the chain rule, wherein the chain rule comprises a third association relation between the commodities in the commodity association rule and the commodities in the commodity strong relation rule.
8. An electronic device, comprising:
at least one memory and at least one processor;
the memory is used for storing one or more programs;
when the one or more programs are executed by the at least one processor, the at least one processor implements the steps of an off-line store commodity association combination optimizing method as claimed in any one of claims 1 to 6.
9. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of an off-line store commodity association combination optimizing method according to any one of claims 1 to 6.
CN202211709578.5A 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium Active CN115983921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211709578.5A CN115983921B (en) 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115983921A CN115983921A (en) 2023-04-18
CN115983921B true CN115983921B (en) 2023-11-14

Family

ID=85971873

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211709578.5A Active CN115983921B (en) 2022-12-29 2022-12-29 Off-line store commodity association combination method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115983921B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473262A (en) * 2013-07-17 2013-12-25 北京航空航天大学 Automatic classification system and automatic classification method for Web comment viewpoint on the basis of association rule
CN103700005A (en) * 2013-12-17 2014-04-02 南京信息工程大学 Association-rule recommending method based on self-adaptive multiple minimum supports
CN107491988A (en) * 2017-08-09 2017-12-19 浙江工商大学 A kind of wisdom retail data method for digging based on genetic algorithm and improvement interest-degree
CN108122126A (en) * 2016-11-29 2018-06-05 财团法人工业技术研究院 Method for extending association rule, device using same and computer readable medium
CN110196904A (en) * 2018-02-26 2019-09-03 佛山市顺德区美的电热电器制造有限公司 A kind of method, apparatus and computer readable storage medium obtaining recommendation information
CN110362670A (en) * 2019-07-19 2019-10-22 中国联合网络通信集团有限公司 Item property abstracting method and system
CN110851571A (en) * 2019-11-14 2020-02-28 拉扎斯网络科技(上海)有限公司 Data processing method and device, electronic equipment and computer readable storage medium
CN111915400A (en) * 2020-07-30 2020-11-10 广州大学 Personalized clothing recommendation method and device based on deep learning
CN111914163A (en) * 2020-06-20 2020-11-10 武汉海云健康科技股份有限公司 Medicine combination recommendation method and device, electronic equipment and storage medium
CN113988638A (en) * 2021-10-29 2022-01-28 深圳壹账通智能科技有限公司 Method and device for measuring and calculating strength of general association relationship, electronic equipment and medium
CN114418663A (en) * 2021-12-10 2022-04-29 珠海格力电器股份有限公司 Commodity information processing method and device, computer equipment and storage medium
CN115099857A (en) * 2022-06-24 2022-09-23 广州华多网络科技有限公司 Advertisement commodity combined publishing method and device, equipment, medium and product thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8812543B2 (en) * 2011-03-31 2014-08-19 Infosys Limited Methods and systems for mining association rules
US9348898B2 (en) * 2014-03-27 2016-05-24 Microsoft Technology Licensing, Llc Recommendation system with dual collaborative filter usage matrix
US10147103B2 (en) * 2017-03-24 2018-12-04 International Business Machines Corproation System and method for a scalable recommender system using massively parallel processors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on bundled commodities in the e-commerce environment (电子商务环境下捆绑商品研究); 裘立波; 姜元春; 林文龙; 商业研究 (Commercial Research), 2009, No. 09, pp. 186-188 *

Also Published As

Publication number Publication date
CN115983921A (en) 2023-04-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant