CN107507028B - User preference determination method, device, equipment and storage medium - Google Patents

User preference determination method, device, equipment and storage medium

Info

Publication number
CN107507028B
Authority
CN
China
Prior art keywords
brand image
user
brand
image word
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710699894.1A
Other languages
Chinese (zh)
Other versions
CN107507028A (en
Inventor
刘朋飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201710699894.1A priority Critical patent/CN107507028B/en
Publication of CN107507028A publication Critical patent/CN107507028A/en
Priority to PCT/CN2018/100688 priority patent/WO2019034087A1/en
Application granted granted Critical
Publication of CN107507028B publication Critical patent/CN107507028B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0203Market surveys; Market polls
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0269Targeted advertisements based on user profile or attribute
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Abstract

The disclosure relates to a user preference determination method, a user preference determination device, an electronic device and a storage medium. The method comprises: counting shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image word bank stores the brand image words corresponding to each brand name; calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and determining the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user. The method and device can efficiently mine the brand images preferred by a user from the user's shopping behavior data, and can further mine the user's consumption psychology and preferences, which facilitates precision marketing and reduces labor cost.

Description

User preference determination method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of big data technologies, and in particular, to a user preference determination method, a user preference determination apparatus, an electronic device, and a computer-readable storage medium.
Background
With the wide application of big data technology, accurate marketing has become an important way for brand merchants to perform marketing activities in the practice of electronic commerce services, and how to perform accurate marketing according to the preference degree of users on brand images becomes an important research direction.
Currently, user preferences are mainly determined through questionnaires. On one hand, this scheme is inefficient because it requires manual operation; on the other hand, the user's answers rest mainly on subjective factors, so the brand image that the user really prefers is difficult to obtain accurately, and precision marketing to the user cannot be realized.
Therefore, it is desirable to provide a user preference determination method and a user preference determination apparatus capable of solving one or more of the above-described problems.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the present disclosure is to provide a user preference determination method, a user preference determination apparatus, an electronic device, and a computer-readable storage medium, thereby overcoming, at least to some extent, one or more problems caused by the limitations and disadvantages of the related art.
According to an aspect of the present disclosure, there is provided a user preference determination method including:
counting shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image words corresponding to each brand name are stored in the brand image word bank;
calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
determining the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
In an exemplary embodiment of the present disclosure, calculating the degree of membership of the user to each brand image word by fuzzy clustering based on the shopping behavior data of the user includes:
calculating the distance between the shopping behavior data of the user and each brand image word;
and calculating the membership degree of the user to each brand image word based on the distance.
In an exemplary embodiment of the present disclosure, the user preference determining method further includes:
matching each item of commodity information in the commodity information database with the brand image words corresponding to each brand name.
In an exemplary embodiment of the present disclosure, the user preference determining method further includes:
generating a frequent itemset about the commodity information and the brand image words based on the matched commodity information and the brand image words;
and adding commodity information in the frequent item set with the support degree larger than a second threshold value into the brand image word stock.
In an exemplary embodiment of the present disclosure, generating a frequent itemset about each item of merchandise information and the brand image words includes:
generating the frequent item set of the commodity information and the brand image words through the FP-growth operation.
In an exemplary embodiment of the present disclosure, the counting shopping behavior data of the user corresponding to each brand image word in the brand image word bank includes:
carrying out normalization processing on shopping behavior data of a user;
and counting the normalized shopping behavior data of the user corresponding to each brand image word in the brand image word bank.
According to an aspect of the present disclosure, there is provided a user preference determination apparatus including:
a statistics unit, configured to count shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image word bank stores the brand image words corresponding to each brand name;
a membership calculation unit, configured to calculate the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
a user preference determining unit, configured to determine the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
In an exemplary embodiment of the present disclosure, calculating the degree of membership of the user to each brand image word by fuzzy clustering based on the shopping behavior data of the user includes:
calculating the distance between the shopping behavior data of the user and each brand image word;
and calculating the membership degree of the user to each brand image word based on the distance.
According to an aspect of the present disclosure, there is provided an electronic device including:
a processor; and
a memory having computer readable instructions stored thereon which, when executed by the processor, implement a user preference determination method according to any of the above.
According to an aspect of the present disclosure, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a user preference determination method according to any one of the above.
The user preference determining method, the user preference determining apparatus, the electronic device, and the computer-readable storage medium in an exemplary embodiment of the present disclosure count shopping behavior data of a user corresponding to each brand image word, calculate a membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user, and determine the brand image word having the membership degree greater than a first threshold as a brand image preferred by the user. On one hand, the shopping behavior data of the user corresponding to each brand image word is counted, and the shopping behavior data of the user can be associated with the brand image words, so that the brand image preferred by the user can be analyzed through the shopping behavior data of the user; on the other hand, the membership degree of the user to each brand image word is calculated through fuzzy clustering based on the shopping behavior data of the user, the brand image word with the membership degree larger than the first threshold value is determined as the preferred brand image of the user, a plurality of preferred brand images of the user can be automatically and efficiently mined from the shopping behavior data of the user, the consumption psychology and preference of the user can be further deeply mined, accurate marketing is facilitated, and the labor cost is reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other features and advantages of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 schematically illustrates a flow chart of a user preference determination method according to an exemplary embodiment of the present disclosure;
FIG. 2 schematically illustrates an architecture diagram of a user preference determination system according to an exemplary embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a user preference determination system according to an exemplary embodiment of the present disclosure;
FIG. 4 schematically illustrates a diagram of a constructed FP-tree according to an exemplary embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a user preference determination apparatus according to an exemplary embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of an electronic device according to an exemplary embodiment of the present disclosure;
FIG. 7 shows an illustrative diagram of a computer-readable storage medium according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the embodiments of the disclosure can be practiced without one or more of the specific details, or with other methods, components, materials, devices, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, or in one or more software-hardened modules, or in different networks and/or processor devices and/or microcontroller devices.
In the present exemplary embodiment, a user preference determination method is first provided. Referring to fig. 1, the user preference determination method may include the steps of:
S110, counting shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image word bank stores the brand image words corresponding to each brand name;
S120, calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
S130, determining the brand image words whose membership degree is greater than a first threshold as the brand images preferred by the user.
According to the user preference determining method in the present exemplary embodiment, on one hand, the shopping behavior data of the user corresponding to each brand image word is counted, and the shopping behavior data of the user can be associated with the brand image word, so that the analysis of the brand image preferred by the user through the shopping behavior data of the user is facilitated; on the other hand, the membership degree of the user to each brand image word is calculated through fuzzy clustering based on the shopping behavior data of the user, the brand image word with the membership degree larger than the first threshold value is determined as the preferred brand image of the user, a plurality of preferred brand images of the user can be automatically and efficiently mined from the shopping behavior data of the user, the consumption psychology and preference of the user can be further deeply mined, accurate marketing is facilitated, and the labor cost is reduced.
Next, the user preference determination method in the present exemplary embodiment will be further explained.
In step S110, shopping behavior data of the user corresponding to each brand image word in a brand image word bank is counted, wherein the brand image word bank stores the brand image words corresponding to each brand name.
In the present exemplary embodiment, brand image words corresponding to each brand name may be stored in advance in the brand image word bank. Referring to fig. 2, in the brand image word cold start module 210, a brand expert or a business domain expert may describe the positioning of a brand image: the expert summarizes the abstract image concept of a brand in as few and as precise words as possible, and then enters the summarized brand image words into the brand image word bank. Taking a computer brand as an example, the basic format of the brand image word bank table may be as shown in Table 1 below:
TABLE 1 Brand image word bank table
(Table 1 is an image in the original publication and is not reproduced here.)
Further, in this exemplary embodiment, each item of commodity information in the commodity information database stored in the back end of the shopping website may be matched with the brand image words corresponding to each brand name, so that users and brands are associated through the users' shopping behavior data in the commodity information database. This may be implemented by the commodity image word matching module 220 in fig. 2, and the specific implementation flow is shown in fig. 3. In the present exemplary embodiment, matching is performed mainly by associating the category words, function words, modifiers, and other commodity- or brand-specific words in the commodity information with the brand image words in the brand image word bank. As shown in Table 2 below, the brand image words in the brand image word bank are matched with the data in the commodity information table through the "brand" field shared by the brand image word bank table and the commodity information table, in preparation for the subsequent processing in the frequent item set feedback module 230 and the fuzzy clustering module 250.
TABLE 2 Commodity information table
(Table 2 is an image in the original publication and is not reproduced here.)
In the present exemplary embodiment, the brand image words for "Lenovo" collected in the brand image word cold start module 210 can be matched to the "Lenovo" commodity information in the commodity information table through the "Lenovo" entry in the brand image word bank table of Table 1 and the "Lenovo" entry in the commodity information table of Table 2, so that a user can be associated with the "Lenovo" brand through the user's shopping behavior data in the commodity information table.
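The brand-field matching described above can be sketched as a simple join. This is only an illustration under stated assumptions: the row layout, the field names `brand`, `image_word`, `sku` and `title`, the sample values, and the helper `match_by_brand` are all hypothetical, since Tables 1 and 2 are not reproduced here.

```python
# Hypothetical rows standing in for Table 1 (brand image word bank)
# and Table 2 (commodity information); all values are made up.
brand_image_bank = [
    {"brand": "BrandA", "image_word": "business"},
    {"brand": "BrandA", "image_word": "reliable"},
    {"brand": "BrandB", "image_word": "fashionable"},
]
commodities = [
    {"sku": "sku-1", "brand": "BrandA", "title": "14-inch business laptop"},
    {"sku": "sku-2", "brand": "BrandB", "title": "slim colourful laptop"},
    {"sku": "sku-3", "brand": "BrandC", "title": "gaming desktop"},
]


def match_by_brand(bank, items):
    """Join commodity rows to brand image words on the shared 'brand' field."""
    words_by_brand = {}
    for row in bank:
        words_by_brand.setdefault(row["brand"], []).append(row["image_word"])
    matched = []
    for item in items:
        # Commodities whose brand has no entry in the word bank are skipped.
        for word in words_by_brand.get(item["brand"], []):
            matched.append(
                {"sku": item["sku"], "brand": item["brand"], "image_word": word}
            )
    return matched


matched = match_by_brand(brand_image_bank, commodities)
```

Each matched row links a commodity to one brand image word, so any order a user places on that commodity can later be attributed to the word.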
Further, in the present exemplary embodiment, referring to fig. 2 and fig. 3, the user-brand image word feature processing module 240 may process the matched commodity information and brand image words. For example, shopping behavior features on each brand image word, such as the number of purchases, the purchase amount, and the order amount, are counted at per-user granularity and, after normalization, are input to the fuzzy clustering module for fuzzy clustering calculation. For example, if a user has shopping behavior on n brand image words, such as an order amount on each of the n brand image words, the shopping behavior features of that user on the n brand image words may be constructed as shown in Table 3 below:
TABLE 3 Shopping behavior features of the user over n brand image words
(Table 3 is an image in the original publication and is not reproduced here.)
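As a rough illustration of the per-user feature statistics, the sketch below sums one feature (order amount) per user and brand image word and arranges the totals as one row per user. The record layout, the sample values, and the helper `build_feature_matrix` are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical order records: (user_id, brand_image_word, order_amount).
orders = [
    ("u1", "business", 5000.0),
    ("u1", "business", 1200.0),
    ("u1", "fashionable", 300.0),
    ("u2", "fashionable", 800.0),
]


def build_feature_matrix(records):
    """Sum the order amount per (user, brand image word) and return
    the row labels, column labels and the resulting matrix."""
    totals = defaultdict(float)
    for user, word, amount in records:
        totals[(user, word)] += amount
    users = sorted({user for user, _, _ in records})
    words = sorted({word for _, word, _ in records})
    # A user with no behavior on a word gets 0.0 in that column.
    matrix = [[totals.get((u, w), 0.0) for w in words] for u in users]
    return users, words, matrix


users, words, matrix = build_feature_matrix(orders)
```

Other features named in the text, such as the number of purchases, could be aggregated the same way and kept as separate matrices or concatenated columns.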
In the present exemplary embodiment, if there are m users in the commodity information database, the m users and the n brand image words form an m × n matrix, and as shown in fig. 2 and fig. 3, the m × n matrix may be input to the fuzzy clustering module 250 for clustering. In addition, if the shopping behavior features on the n brand image words corresponding to a user are measured on inconsistent scales, the shopping behavior feature indexes may be normalized so that their dimensions are uniform. In this exemplary embodiment, the data normalization is mean normalization, which normalizes the data based on the mean and the standard deviation of the raw data. The mean measures the central tendency of the data and is calculated as:
$$\mathrm{mean} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where x1 to xn are the raw data to be normalized, and n is the number of brand image words.
The standard deviation std measures the dispersion of the data and is calculated as:
Figure BDA0001380082560000073
The normalization formula is:
$$X_{new} = \frac{X_{old} - \mathrm{mean}}{\mathrm{std}}$$
where X_old is the data to be normalized and X_new is the normalized data.
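A minimal sketch of this normalization follows. Two points are assumptions rather than part of the disclosure: the standard deviation is taken as the population form (dividing by n), and a constant feature vector is mapped to all zeros to avoid dividing by zero. The function name `mean_normalize` is hypothetical.

```python
import math


def mean_normalize(values):
    """Return (x - mean) / std for every x in values."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    if std == 0.0:
        # Assumption: a constant vector carries no signal, so return zeros.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


# Sample data with mean 5 and population standard deviation 2.
normalized = mean_normalize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After this step every feature has zero mean and unit spread, so features such as purchase counts and order amounts become comparable.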
It should be noted that, in the present exemplary embodiment, the shopping behavior features of the user are not limited to the number of purchases, the purchase amount, and the order amount; for example, they may also be the number of additions to the shopping cart, the number of items added to favorites, and the like, which also fall within the protection scope of the present disclosure.
Next, in step S120, the membership degree of the user to each brand image word is calculated through fuzzy clustering based on the shopping behavior data of the user.
In the example embodiment, the shopping behavior feature indexes of each user on the brand image words, produced by the user-brand image word feature processing module, may be used as input to characterize the user's behavior on the brand image words, and the fuzzy membership degree of the user to each brand image word is calculated by fuzzy clustering.
Unlike traditional hard clustering, fuzzy clustering is a soft partitioning algorithm. In fuzzy clustering, each sample to be clustered can belong to several classes at the same time, and the membership degrees of each sample over all classes sum to 1, so that how strongly the sample belongs to, or how closely it approximates, each class can be known by comparing its membership degrees across the classes. In the example embodiment, which brand images the user prefers can be derived from the membership degree of the user to each brand image word.
Specifically, in this exemplary embodiment, fuzzy clustering may be implemented as follows: the n shopping behavior feature vectors x_j (j = 1, 2, ..., n) are divided into c fuzzy groups, where c may be the number of brand image words. A brand image word may serve as the cluster center of each group, and the cluster centers are determined so that a cost function of a dissimilarity index is minimized. Fuzzy clustering lets each given data point belong to each group, i.e., each brand image word, with a membership degree between 0 and 1. To accommodate this fuzzy partition, the membership matrix U is allowed to have elements with values between 0 and 1. In addition, under the normalization constraint, the membership degrees of each data point always sum to 1, as shown in equation 4 below:
$$\sum_{i=1}^{c} u_{ij} = 1, \quad \forall j = 1, \ldots, n$$
The value function (or objective function) for fuzzy clustering of the n shopping behavior features is then the following generalized form:
$$J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}$$
where u_ij takes values between 0 and 1; c_i is the cluster center of fuzzy group i; d_ij = ||c_i - x_j|| is the Euclidean distance between the i-th cluster center and the j-th data point; and m ∈ [1, ∞) is a weighting exponent.
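To make the roles of u_ij, c_i, d_ij and m concrete, the sketch below evaluates the objective as the sum over groups i and data points j of u_ij^m times the squared Euclidean distance, and checks the normalization constraint on the membership matrix. The function names and the tiny data set are illustrative assumptions.

```python
def fcm_objective(U, centers, X, m=2.0):
    """Sum over groups i and points j of u_ij**m * d_ij**2,
    where d_ij is the Euclidean distance between centers[i] and X[j]."""
    total = 0.0
    for i, center in enumerate(centers):
        for j, point in enumerate(X):
            d_squared = sum((ck - xk) ** 2 for ck, xk in zip(center, point))
            total += (U[i][j] ** m) * d_squared
    return total


def memberships_sum_to_one(U, tol=1e-9):
    """Check the normalization constraint: each column of U sums to 1."""
    n = len(U[0])
    return all(abs(sum(row[j] for row in U) - 1.0) <= tol for j in range(n))


# Two data points, two groups; U[i][j] is the membership of point j in group i.
X = [[0.0, 0.0], [1.0, 0.0]]
centers = [[0.0, 0.0], [1.0, 0.0]]
U = [[0.75, 0.25], [0.25, 0.75]]
J = fcm_objective(U, centers, X, m=2.0)
```

With m = 2, each point contributes 0.25 squared times a squared distance of 1 to the group it is not centred on, so J here is 0.125.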
Further, in this example implementation, referring to fig. 3, a specific process of fuzzy clustering on n shopping behavior features may be divided into the following 3 sub-modules:
Submodule 1: determining the initial parameters. In this exemplary embodiment, there may be two initial parameters: the number of fuzzy clusters, i.e., the number c of brand image words, and the parameter m that controls the softness of the algorithm. In this exemplary embodiment, c may be a positive integer not greater than 20, since too many clusters hinder interpretation and practical business application; alternatively, the optimal number of clusters c may be found through a grid search, which is also within the protection scope of the present disclosure. The softness parameter m should not be too large, since an overly large value degrades the clustering effect; m may be a number between 2 and 5, a positive integer not greater than 10, such as 2, or another suitable number, and the disclosure is not limited herein.
Submodule 2: constructing the fuzzy matrix from the given shopping behavior data samples and the corresponding sample feature vectors, where the c cluster centers may be initialized by random selection and the optimal solution is then approached iteratively:
$$c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}$$
where x_j is a sample data point, and u_ij is the membership degree of sample data point j to cluster center i.
Submodule 3: judging whether the objective function has converged (if so, stopping the iteration and outputting the result):
The objective function of the scheme is
$$J = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}$$
where d_ij is the Euclidean distance from sample data point j to cluster center i. The convergence condition may be that the objective function value calculated in a given iteration is smaller than a certain threshold, or that its change from the objective function value of the previous iteration is smaller than a certain threshold. If the objective function meets the convergence condition, the algorithm stops, and the membership degree of sample data point j to cluster center i is obtained.
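The three submodules can be sketched together as a standard fuzzy c-means loop. This is a hedged illustration, not the disclosed implementation: the membership and center update rules are the textbook fuzzy c-means rules, which are assumed here, and the function `fuzzy_c_means`, its parameters (`tol`, `max_iter`, `init_centers`, `seed`) and the sample points are hypothetical. The exponent m must be greater than 1 for the membership update to be defined.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fuzzy_c_means(X, c, m=2.0, tol=1e-6, max_iter=200, init_centers=None, seed=0):
    n, dim = len(X), len(X[0])
    rng = random.Random(seed)
    # Submodule 2: centers start from randomly chosen samples unless given.
    centers = [list(p) for p in (init_centers or rng.sample(X, c))]
    U = [[0.0] * n for _ in range(c)]
    prev_J = float("inf")
    J = prev_J
    for _ in range(max_iter):
        # Membership update (textbook rule, assumed): since d holds squared
        # distances, the exponent 1/(m-1) equals 2/(m-1) on plain distances.
        for j in range(n):
            d = [sq_dist(centers[i], X[j]) for i in range(c)]
            on_center = [i for i in range(c) if d[i] == 0.0]
            for i in range(c):
                if on_center:
                    U[i][j] = 1.0 / len(on_center) if i in on_center else 0.0
                else:
                    U[i][j] = 1.0 / sum(
                        (d[i] / d[k]) ** (1.0 / (m - 1.0)) for k in range(c)
                    )
        # Center update: mean of the samples weighted by u_ij ** m.
        for i in range(c):
            w = [U[i][j] ** m for j in range(n)]
            total = sum(w)
            if total > 0.0:
                centers[i] = [
                    sum(w[j] * X[j][k] for j in range(n)) / total for k in range(dim)
                ]
        # Submodule 3: stop when the objective changes by less than tol.
        J = sum(
            U[i][j] ** m * sq_dist(centers[i], X[j])
            for i in range(c)
            for j in range(n)
        )
        if abs(prev_J - J) < tol:
            break
        prev_J = J
    return U, centers, J


points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
U, centers, J = fuzzy_c_means(points, c=2)
```

The returned `U[i][j]` is the membership degree of sample j to cluster center i, and each column of `U` sums to 1.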
Next, in step S130, the brand image word whose membership degree is greater than a first threshold is determined as the brand image preferred by the user.
In this exemplary embodiment, the first threshold may be a value determined according to the number of brand image words and the shopping behavior data amount of the user, or may be a value determined according to an actual processing result after the user preference determination method in this exemplary embodiment is adopted, and the disclosure is not particularly limited herein. Referring to fig. 2 and 3, a brand image word having a membership degree greater than a first threshold may be determined as a user-preferred brand image in the user-brand image matching module 260, and the determined user-preferred brand image may be output.
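Step S130 amounts to a filter over one user's membership degrees. The sketch below assumes the memberships are held in a dictionary keyed by brand image word; the helper `preferred_brand_images`, the sample memberships and the threshold value 0.25 are illustrative only.

```python
def preferred_brand_images(memberships, first_threshold):
    """Return the brand image words whose membership degree exceeds
    the first threshold, ordered from highest to lowest membership."""
    chosen = [(word, u) for word, u in memberships.items() if u > first_threshold]
    chosen.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in chosen]


# Hypothetical membership degrees of one user; they sum to 1.
user_memberships = {"business": 0.55, "fashionable": 0.30, "budget": 0.15}
preferred = preferred_brand_images(user_memberships, first_threshold=0.25)
```

Because the comparison is strict, a word whose membership equals the threshold is not selected, and a user may end up with several preferred brand images or none.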
Further, in the present exemplary embodiment, in order to enrich the content of brand images in the brand image thesaurus, brand image words may be added to the brand image thesaurus according to information of each brand good on the shopping platform. Therefore, the user preference determination method may further include: generating a frequent itemset about the commodity information and the brand image words based on the matched commodity information and the brand image words; and adding commodity information in the frequent item set with the support degree larger than a second threshold value into the brand image word stock. In the present example embodiment, the second threshold value is a value set according to the number of article information items in the article information table, the number of brand image words in the brand image word stock, the calculation performance of the computer, and the like.
Specifically, referring to fig. 2 and 3, the frequent item set feedback module may calculate, from commodity sales, the frequent items formed by the selected brand image words in the brand image word bank table and the matched commodity information items in the commodity information table, and add the co-occurring commodity information items that meet the minimum support threshold of the frequent items, i.e., the second threshold, to the brand image word bank table, thereby automatically expanding the brand image word bank and its subsequent coverage of commodities and users.
In the present exemplary embodiment, the generated frequent item set includes two parts: one part comes from the brand image words in the brand image word bank, and the other part comes from the category words, function words, modifiers, and the like in the commodity information table. Frequent item sets are calculated over these two parts, and the words that co-occur most frequently with the predetermined brand image words are output. The commodity information items with high co-occurrence frequency that meet the minimum support threshold of the frequent items, i.e., the second threshold, supplement the initial brand image words and, through successive iterations, are gradually added to the brand image word bank, so that the brand image word bank expands on its own.
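The feedback step can be illustrated with a simplified pair count in place of full frequent item set mining. Assumptions in this sketch: support is measured as an absolute co-occurrence count, only pairs of one brand image word and one commodity word are considered, and the helper `expand_word_bank` and the sample transactions are hypothetical.

```python
from collections import Counter


def expand_word_bank(transactions, image_words, second_threshold):
    """Count how often each commodity word co-occurs with a brand image
    word and keep the pairs whose count exceeds the second threshold."""
    support = Counter()
    for items in transactions:
        present = set(items)
        for image_word in present & image_words:
            for other in present - image_words:
                support[(image_word, other)] += 1
    additions = {}
    for (image_word, other), count in support.items():
        if count > second_threshold:
            additions.setdefault(image_word, set()).add(other)
    return additions


# Hypothetical transactions mixing brand image words and commodity words.
transactions = [
    {"business", "lightweight", "14-inch"},
    {"business", "lightweight"},
    {"business", "gaming"},
    {"fashionable", "lightweight"},
]
additions = expand_word_bank(
    transactions, image_words={"business", "fashionable"}, second_threshold=1
)
```

Here only "lightweight" co-occurs with "business" more than once, so it is the only candidate returned for addition to the word bank.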
Further, in this exemplary embodiment, the frequent item sets of the brand image words and the category words, function words, modifiers, etc. in the commodity information table may be generated by the FP-growth method. The specific implementation flow is as follows: the construction and projection of the FP-tree formed by the brand image words and the various words in the commodity information table is iterated continuously. For each frequent item, a conditional projection database and a projected FP-tree are constructed. This process is repeated for each newly constructed FP-tree until the new FP-tree is empty or contains only one path. When the constructed FP-tree is empty, its prefix is a frequent pattern; when it contains only one path, the frequent patterns are obtained by enumerating all possible combinations of the items on the path and concatenating each with the prefix of the tree.
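A compact sketch of FP-growth is given below, under stated assumptions. It recurses on the conditional pattern base of every frequent item and omits the single-path shortcut described above, which changes the work done but not the patterns found. Support is an absolute count compared with `>=`, ties in frequency are broken alphabetically, and the names `Node`, `build_tree` and `fp_growth` are hypothetical. The sample data are the five item sets of Table 4.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a link to its parent, and a count."""

    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return header links and supports."""
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {item: n for item, n in freq.items() if n >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, most frequent first (alphabetical on ties).
        ordered = sorted(
            (i for i in set(items) if i in freq), key=lambda i: (-freq[i], i)
        )
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, freq


def fp_growth(transactions, min_support, prefix=frozenset()):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    header, freq = build_tree(transactions, min_support)
    patterns = {}
    for item, nodes in header.items():
        new_prefix = prefix | {item}
        patterns[new_prefix] = freq[item]
        # Conditional pattern base: the prefix path of every node of this item.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        patterns.update(fp_growth(base, min_support, new_prefix))
    return patterns


item_sets = ["facdgimp", "abcflmo", "bfhjow", "bcksp", "afcelpmn"]
patterns = fp_growth([(list(s), 1) for s in item_sets], min_support=3)
```

With a minimum support of 3, the mined patterns include the single items f, c, a, b, m, p and combinations such as {c, p} and {f, c, a, m}.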
Referring to fig. 4, the FP-tree is a special prefix tree, and is composed of a frequent entry header tree and an entry prefix tree. A prefix tree is a data structure that stores candidate sets, branches of the tree being identified by item names, nodes of the tree storing suffix items, and paths representing the set of items. The FP-tree is generated as follows:
First, a transaction item set is generated, the format of which is shown in table 4 below:
TABLE 4 Transaction item sets and frequent items
Item set id | Item set | Frequent items
001 | {f,a,c,d,g,i,m,p} | {f,c,a,m,p}
002 | {a,b,c,f,l,m,o} | {f,c,a,b,m}
003 | {b,f,h,j,o,w} | {f,b}
004 | {b,c,k,s,p} | {c,b,p}
005 | {a,f,c,e,l,p,m,n} | {f,c,a,m,p}
In the present exemplary embodiment, for simplicity, the brand image words in the brand image word bank and the commodity information words in the commodity information table are represented by letters, and an item set may represent a set composed of brand image words and the category words, function words, modifiers, and the like in the commodity information table. With a minimum support of 3, the brand image words and the commodity information words in the database are scanned, the occurrence frequency of each single item is calculated, and the items whose occurrence frequency is not less than the minimum support are retained. Thus, in the rightmost column of table 4, only items that occur at least 3 times are retained.
Second, the occurrence frequency of the items meeting the minimum support is calculated, and the frequent items are arranged in descending order of frequency to generate the rearranged frequent items. The occurrence frequency of each frequent item is shown in table 5 below:
TABLE 5 Occurrence frequency of items in the item sets
Item | Frequency
f | 4
c | 4
a | 3
b | 3
m | 3
p | 3
As shown in table 5, among the frequent items calculated in table 4 above, the letter f appears 4 times, the letter c appears 4 times, and the letter a appears 3 times, and the frequencies of the occurrences of the letters are arranged in descending order to obtain rearranged frequent items, as shown in the rightmost column in table 4.
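The first two steps can be reproduced with a short sketch on the data of table 4. Two points are assumptions rather than statements from the source: items are kept when their count is at least the minimum support (which is what the tables show), and ties between equally frequent items are broken in the order listed in table 5 (any fixed tie-break would give a valid ordering). The variable names are illustrative.

```python
from collections import Counter

# Item sets from table 4.
transactions = {
    "001": ["f", "a", "c", "d", "g", "i", "m", "p"],
    "002": ["a", "b", "c", "f", "l", "m", "o"],
    "003": ["b", "f", "h", "j", "o", "w"],
    "004": ["b", "c", "k", "s", "p"],
    "005": ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3
TIE_BREAK = "fcabmp"  # assumed tie-break: the order listed in table 5

# Step 1: count every single item and keep those meeting the minimum support.
counts = Counter(item for items in transactions.values() for item in items)
frequent = {item: n for item, n in counts.items() if n >= MIN_SUPPORT}

def rank(item):
    """Sort key: descending frequency, then the assumed tie-break order."""
    return (-frequent[item], TIE_BREAK.index(item))

# Step 2: order the frequent items, then filter and reorder each item set.
header = sorted(frequent, key=rank)
reordered = {
    tid: sorted((i for i in items if i in frequent), key=rank)
    for tid, items in transactions.items()
}
```

With these inputs, `frequent` matches table 5 and `reordered` matches the rightmost column of table 4.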
Third, the brand image words and the commodity information words in the database are scanned again to construct the FP-tree; the final result is shown in fig. 4. In fig. 4, each solid-line path may represent an item set. The FP-tree is a highly compressed structure that stores all the information needed for mining the frequent item sets. After the FP-tree is generated, the frequent item set of each brand image word can be obtained from it, so that the brand image word bank can be expanded. In addition, the FP-tree algorithm scans the transaction database only twice and does not need to generate a large number of candidate sets, so the data processing efficiency can be improved.
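The tree construction and the recursive mining can be sketched as follows. This is a compact illustration of FP-growth on the item sets of table 4, not the patented implementation: the names `Node`, `build_tree` and `fp_growth` are hypothetical, the tie-break order is assumed from table 5, and the single-path shortcut is omitted (the sketch simply recurses until the conditional tree is empty, which yields the same frequent item sets).

```python
from collections import defaultdict

TIE_BREAK = {item: pos for pos, item in enumerate("fcabmp")}  # assumed order

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted_paths, min_support):
    """Build an FP-tree from (items, count) pairs.
    Returns (root, header), where header maps item -> list of its nodes."""
    counts = defaultdict(int)
    for items, n in weighted_paths:
        for item in items:
            counts[item] += n
    frequent = {i for i, n in counts.items() if n >= min_support}

    def rank(i):  # descending count, then a fixed total order
        return (-counts[i], TIE_BREAK.get(i, len(TIE_BREAK)), i)

    root, header = Node(None, None), defaultdict(list)
    for items, n in weighted_paths:
        node = root
        for item in sorted((i for i in items if i in frequent), key=rank):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += n
            node = child
    return root, header

def fp_growth(weighted_paths, min_support, suffix=()):
    """Yield (frozenset itemset, support) for every frequent item set."""
    _, header = build_tree(weighted_paths, min_support)
    for item, nodes in header.items():
        itemset = (item,) + suffix
        yield frozenset(itemset), sum(node.count for node in nodes)
        # Conditional pattern base: the prefix path of every node of `item`.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_support, itemset)

transactions = [
    list("facdgimp"), list("abcflmo"), list("bfhjow"),
    list("bcksp"), list("afcelpmn"),
]
patterns = dict(fp_growth([(t, 1) for t in transactions], min_support=3))
```

On this data the sketch finds 18 frequent item sets, for example {f, c, a, m} and {c, p}, each with support 3. The database is read twice per tree (once to count, once to insert), in line with the two scans mentioned above.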
It should be noted that although the various steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Further, in the present exemplary embodiment, a user preference determination apparatus is also provided. Referring to fig. 5, the user preference determination apparatus may include: a statistic unit 510, a membership calculation unit 520, and a user preference determination unit 530. Wherein:
the statistic unit 510 is configured to count shopping behavior data of a user corresponding to each brand image word in a brand image word bank, where the brand image word bank stores brand image words corresponding to each brand name;
the membership degree calculating unit 520 is used for calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
the user preference determining unit 530 is configured to determine the brand image word with the membership degree greater than the first threshold as the brand image preferred by the user.
Further, in this example embodiment, calculating the degree of membership of the user to each brand image word by fuzzy clustering based on the shopping behavior data of the user may include:
calculating the distance between the shopping behavior data of the user and each brand image word;
and calculating the membership degree of the user to each brand image word based on the distance.
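The distance-to-membership step can be sketched with the standard fuzzy c-means membership formula. This formula is an assumption here, since this section does not spell one out; the fuzzifier `m = 2`, the Euclidean distance, the representation of each brand image word as a cluster centre in the same feature space as the user data, and all names and numbers are likewise hypothetical.

```python
import math

def memberships(user_vector, centers, m=2.0):
    """Fuzzy c-means style membership of one user in each cluster.
    centers maps a brand image word to its (assumed) cluster centre."""
    distances = {w: math.dist(user_vector, c) for w, c in centers.items()}
    zero = [w for w, d in distances.items() if d == 0.0]
    if zero:  # the user coincides with one or more centres
        return {w: (1.0 / len(zero) if w in zero else 0.0) for w in distances}
    exponent = 2.0 / (m - 1.0)
    return {
        w: 1.0 / sum((d / dk) ** exponent for dk in distances.values())
        for w, d in distances.items()
    }

def preferred_images(membership, first_threshold):
    """Brand image words whose membership exceeds the first threshold."""
    return sorted(w for w, u in membership.items() if u > first_threshold)

# Hypothetical two-feature example.
centers = {"elegant": (0.0, 0.0), "sporty": (3.0, 0.0)}
u = memberships((1.0, 0.0), centers)
chosen = preferred_images(u, first_threshold=0.5)
```

In this example the user lies at distance 1 from "elegant" and 2 from "sporty", giving memberships of 0.8 and 0.2, so only "elegant" exceeds a first threshold of 0.5. The memberships of one user always sum to 1 under this formula.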
Further, in the present exemplary embodiment, the user preference determination apparatus may further include a matching unit, configured to match each item of commodity information in the commodity information database with the brand image words corresponding to each brand name.
Further, in the present exemplary embodiment, the user preference determination apparatus may further include: a frequent item set generating unit, configured to generate a frequent item set regarding each item of commodity information and the brand image words; and an adding unit, configured to add the commodity information in the frequent item set whose support is greater than a second threshold to the brand image word bank.
Further, in the present exemplary embodiment, generating a frequent itemset regarding each item of merchandise information and the brand image word may include:
and generating a frequent item set related to each item of commodity information and brand image words through FP-growth operation.
Further, in the present exemplary embodiment, counting shopping behavior data of the user corresponding to each brand image word in the brand image word bank may include:
carrying out normalization processing on shopping behavior data of a user;
and counting the normalized shopping behavior data of the user corresponding to each brand image word in the brand image word bank.
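As an illustration of the normalization step, the sketch below applies min-max scaling per feature. The choice of min-max scaling, the feature layout (one row per user, one column per behavior measure) and the sample numbers are assumptions; this section does not specify the normalization method.

```python
def min_max_normalize(rows):
    """Scale every column of `rows` to the range [0, 1].
    A constant column is mapped to 0.0 to avoid division by zero."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical data: (number of purchases, amount spent) per user.
raw = [[2, 100.0], [4, 300.0], [6, 200.0]]
normalized = min_max_normalize(raw)
```

Scaling the features to a common range keeps a large-valued measure such as the amount spent from dominating the distance used in the membership calculation.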
Since each functional module of the user preference determining apparatus 400 of the exemplary embodiment of the present disclosure corresponds to the step of the exemplary embodiment of the user preference determining method, it is not described herein again.
It should be noted that although several modules or units of the user preference determination apparatus are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
In an exemplary embodiment of the present disclosure, there is also provided an electronic device capable of implementing the above method.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to such an embodiment of the invention is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
The storage unit stores program code that is executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention as described in the "exemplary methods" section above in this specification. For example, the processing unit 610 may perform the steps shown in fig. 1: S110, counting shopping behavior data of a user corresponding to each brand image word in a brand image word bank, in which brand image words corresponding to each brand name are stored; S120, calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and S130, determining the brand image words with membership degrees greater than a first threshold as the brand images preferred by the user.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, and in some combination, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 670 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by a combination of software and necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above-mentioned "exemplary methods" section of the description, when said program product is run on the terminal device.
Referring to fig. 7, a program product 700 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this respect, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software, and also by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (8)

1. A method for determining user preferences, comprising:
counting shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image words corresponding to each brand name are stored in the brand image word bank, and the brand image words in the brand image word bank are dynamically changed;
calculating the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
determining the brand image words with the membership degrees larger than a first threshold value as the brand images preferred by the users;
wherein the brand image words are added to the brand image thesaurus by:
matching each item of commodity information in the commodity information database with brand image words corresponding to each brand name;
generating a frequent itemset about the commodity information and the brand image words based on the matched commodity information and the brand image words;
and adding commodity information in the frequent item set with the support degree larger than a second threshold value into the brand image word stock.
2. The user preference determination method of claim 1, wherein calculating the degree of membership of the user to each brand image word by fuzzy clustering based on the shopping behavior data of the user comprises:
calculating the distance between the shopping behavior data of the user and each brand image word;
and calculating the membership degree of the user to each brand image word based on the distance.
3. The user preference determination method of claim 1, wherein generating a frequent itemset regarding item information and the brand image word comprises:
and generating a frequent item set related to each item of commodity information and brand image words through FP-growth operation.
4. The user preference determination method of claim 1, wherein the counting shopping behavior data of the user corresponding to each brand image word in the brand image word bank comprises:
carrying out normalization processing on shopping behavior data of a user;
and counting the normalized shopping behavior data of the user corresponding to each brand image word in the brand image word bank.
5. A user preference determination apparatus, comprising:
a statistic unit, configured to count shopping behavior data of a user corresponding to each brand image word in a brand image word bank, wherein the brand image words corresponding to each brand name are stored in the brand image word bank, and the brand image words in the brand image word bank are dynamically changed;
a membership calculation unit, configured to calculate the membership degree of the user to each brand image word through fuzzy clustering based on the shopping behavior data of the user; and
a user preference determining unit, configured to determine the brand image words with membership degrees greater than a first threshold as the brand images preferred by the user;
wherein the statistic unit comprises an image word adding subunit, configured to: match each item of commodity information in the commodity information database with the brand image words corresponding to each brand name; generate a frequent itemset about the commodity information and the brand image words based on the matched commodity information and the brand image words; and add the commodity information in the frequent item set with a support degree greater than a second threshold to the brand image word bank.
6. The apparatus of claim 5, wherein calculating the degree of membership of the user to each brand image word by fuzzy clustering based on the shopping behavior data of the user comprises:
calculating the distance between the shopping behavior data of the user and each brand image word;
and calculating the membership degree of the user to each brand image word based on the distance.
7. An electronic device, comprising:
a processor; and
a memory having stored thereon computer readable instructions which, when executed by the processor, implement the user preference determination method of any of claims 1 to 4.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the user preference determination method according to any one of claims 1 to 4.
CN201710699894.1A 2017-08-16 2017-08-16 User preference determination method, device, equipment and storage medium Active CN107507028B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710699894.1A CN107507028B (en) 2017-08-16 2017-08-16 User preference determination method, device, equipment and storage medium
PCT/CN2018/100688 WO2019034087A1 (en) 2017-08-16 2018-08-15 User preference determination method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710699894.1A CN107507028B (en) 2017-08-16 2017-08-16 User preference determination method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN107507028A CN107507028A (en) 2017-12-22
CN107507028B true CN107507028B (en) 2021-11-30

Family

ID=60690818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710699894.1A Active CN107507028B (en) 2017-08-16 2017-08-16 User preference determination method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN107507028B (en)
WO (1) WO2019034087A1 (en)

Also Published As

Publication number Publication date
CN107507028A (en) 2017-12-22
WO2019034087A1 (en) 2019-02-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant