CN111695941A - Commodity transaction website data analysis method and device and electronic equipment - Google Patents

Info

Publication number
CN111695941A
Authority
CN
China
Prior art keywords
data
user
life cycle
module
website
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010543883.6A
Other languages
Chinese (zh)
Inventor
伍国飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Tiantu Network Technology Co ltd
Original Assignee
Guangzhou Tiantu Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Tiantu Network Technology Co ltd
Priority to CN202010543883.6A
Publication of CN111695941A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0224 Discounts or incentives, e.g. coupons or rebates based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0222 During e-commerce, i.e. online transactions

Abstract

The application relates to a method and an apparatus for analyzing commodity transaction website data, and to an electronic device. The method comprises: extracting index features from user behavior data and order-related data in a data warehouse; normalizing the index feature data and converting it to numerical form to obtain converted user data; processing the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle; assigning a life cycle label to each user according to that data model; and adjusting the content of the website's promotional activities according to the data patterns of all user life cycle labels. Based on full-life-cycle analysis of users, the method analyzes the behavior characteristics of users who have reached the stable period, finds the optimal behavior path by which users develop into stable-period users, and uses that path to guide adjustment of the website's functional structure, optimization of the commodity selection strategy, and adjustment of the website's promotional content, thereby promoting users' transaction behavior on the website.

Description

Commodity transaction website data analysis method and device and electronic equipment
Technical Field
The present application relates to the field of data analysis technologies, and in particular, to a method and an apparatus for analyzing data of a commodity transaction website, and an electronic device.
Background
In the related art, e-commerce platforms commonly manage user relationships with a user value management system. The method most widely used in such systems to classify users by value is RFM model analysis. RFM analysis segments customer value using three indexes, namely the customer's most recent purchase (Recency), purchase frequency (Frequency), and purchase amount (Monetary), and then applies a refined operation strategy to each user segment. The three indexes are as follows:
R (Recency), time since last purchase: the time elapsed between the user's most recent purchase and now. The more recent the purchase, the greater the customer's value.
F (Frequency), purchase frequency: the number of transactions the user makes within a statistical period. A user who buys frequently, that is, a regular customer, is worth more than a user who buys only once every year or two.
M (Monetary), purchase amount: the total amount the user spends within the statistical period. It represents the profit the customer generates for the enterprise; the larger the amount, the higher the user's value.
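As a minimal sketch of how the three RFM indexes could be derived from order data, the Python below computes recency (days since the last order), frequency (order count), and monetary value (total spent) per user. The `Order` record, the `rfm` function, and the sample orders are hypothetical names and values chosen for illustration; they do not come from the application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    # Hypothetical order record; field names are assumptions.
    user_id: str
    order_date: date
    amount: float


def rfm(orders, today):
    """Return {user_id: (recency_days, frequency, monetary)} for one statistical period."""
    result = {}
    for o in orders:
        recency, frequency, monetary = result.get(o.user_id, (None, 0, 0.0))
        days_ago = (today - o.order_date).days
        # Recency keeps the smallest gap, i.e. the most recent purchase.
        recency = days_ago if recency is None else min(recency, days_ago)
        result[o.user_id] = (recency, frequency + 1, monetary + o.amount)
    return result


# Illustrative data only.
orders = [
    Order("u1", date(2020, 6, 1), 30.0),
    Order("u1", date(2020, 6, 10), 20.0),
    Order("u2", date(2020, 5, 1), 100.0),
]
scores = rfm(orders, today=date(2020, 6, 15))
```

Note that a user with no orders never appears in `scores`, which mirrors the limitation of RFM discussed in this background section.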
Every index in the RFM model is calculated from order data, yet placing an order is only one step in a customer's process of considering, purchasing, using, and enjoying a product. RFM analysis therefore focuses only on users who have placed orders and ignores the majority of users who have not, missing opportunities to generate transactions. Moreover, because user value is classified from results within a given statistical period, the RFM model cannot explain why a user became a churned user (for example, whether a commodity in a particular order had a problem) or a high-value user, so the user's transaction behavior cannot be improved in a targeted way. In addition, for objective reasons such as rapid business growth and a short time online, a large share of users (about 50%) have made only one purchase. This skews the model data, and the model offers no good marketing strategy for these single-purchase users.
Disclosure of Invention
In order to solve the problems in the related art, the application provides a method, an apparatus, and an electronic device for analyzing commodity transaction website data. Based on full-life-cycle analysis of users, they analyze the behavior characteristics of users who have reached the stable period, find the optimal behavior path by which users develop into stable-period users, and use that path to guide adjustment of the website's functional structure, optimization of the commodity selection strategy, and adjustment of the website's promotional content, ultimately promoting users' transaction behavior on the website.
A first aspect of the application provides a method for analyzing commodity transaction website data, comprising:
extracting index features from user behavior data and order-related data in a data warehouse;
normalizing the index feature data and converting it to numerical form to obtain converted user data;
processing the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle;
assigning a life cycle label to each user according to the data model based on the full user life cycle;
and adjusting the content of the website's promotional activities according to the data patterns of all user life cycle labels.
A second aspect of the present application provides an apparatus for analyzing commodity transaction website data, comprising: a data extraction module, a data processing module, a data model generation module, a user life cycle label assignment module, and a data application module;
the data extraction module is configured to extract index features from user behavior data and order-related data in the data warehouse;
the data processing module is configured to normalize the index feature data and convert it to numerical form to obtain converted user data;
the data model generation module is configured to process the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle;
the user life cycle label assignment module is configured to assign a life cycle label to each user according to the data model based on the full user life cycle;
and the data application module is configured to adjust the content of the website's promotional activities according to the data patterns of all user life cycle labels.
A third aspect of the present application provides an electronic device comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method as described above.
A fourth aspect of the present application provides a non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform a method as described above.
The technical solution provided by the application may have the following beneficial effects: based on full-life-cycle analysis of users, the method analyzes the behavior characteristics of users who have reached the stable period, finds the optimal behavior path by which users develop into stable-period users, and uses that path to guide adjustment of the website's functional structure, optimization of the commodity selection strategy, and adjustment of the website's promotional content, ultimately promoting users' transaction behavior on the website.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application, as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
Fig. 1 is a schematic flow chart of a data analysis method for a commodity transaction website according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a data analysis device for a commodity transaction website according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Preferred embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In the current related art, e-commerce platforms generally manage user relationships with a user value management system. The method most widely used in such systems to classify users by value is RFM model analysis, which focuses only on users who have placed orders and ignores the majority of users who have not.
In view of the above problems, embodiments of the present application provide a method for analyzing commodity transaction website data. Based on full-life-cycle analysis of users, the method analyzes the behavior characteristics of users who have reached the stable period, finds the optimal behavior path by which users develop into stable-period users, and uses that path to guide adjustment of the website's functional structure, optimization of the commodity selection strategy, and adjustment of the website's promotional content, ultimately promoting users' transaction behavior on the website.
The technical solutions of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a data analysis method for a commodity transaction website according to an embodiment of the present application.
Referring to fig. 1, a method for analyzing data of a commodity transaction website includes:
S101: Extract index features from the user behavior data and order-related data in the data warehouse.
In a preferred embodiment, the method further comprises collecting user behavior data and order-related data and storing them in the data warehouse. This embodiment mainly uses three types of user behavior data: page browsing data, click event data, and dwell time data. The order-related data mainly comprises order transaction data and commodity review data. The server periodically synchronizes the data used in this embodiment to the data warehouse of a big data center for storage. Data is collected through event tracking in the app's SDK, that is, an SDK with a data collection function is added to the application client. Here, app means application program and SDK means software development kit.
In a more specific example, (1) the behavior data may include the following index features:
1. Cumulative total PV (PV: page views): the basic factor in determining which life cycle stage a user is in, comparable to experience points in the gaming industry.
2. Number of commodity searches: a user behavior index that differs across life cycle stages and reflects the strength of the user's desire to place an order. It is used together with the cumulative order amount index (search conversion rate = cumulative order amount / number of commodity searches).
3. PV of commodity list pages: an index of the user behavior path.
4. PV of commodity detail pages: an index of the user behavior path.
5. PV of order payment pages: an index of the user behavior path.
6. PV of payment success pages: an index of the user behavior path.
7. Login days in the last 7 days: an important predictor of whether the user is entering the silent period.
8. Login days in the last 30 days: an important factor in judging whether the user is entering the churn period or the silent period.
9. Average dwell time: the time the user spends in the app, which generally reflects how sticky the app is for the user.
(2) The order transaction data may include the following index features:
First order type: because the company's business is a discount aggregation platform, there are various order types (such as self-operated mall, coupons, recharging, and refueling). The first order type reflects the user's intent and purpose in using the app.
First order amount: reflects the user's spending power. Users with high spending power develop into the mature period of the life cycle relatively easily.
Cumulative order amount: reflects the user's loyalty and trust.
User repurchase cycle: based on the existing repurchase data, the repurchase cycle comprises two sub-cycles, namely from the first purchase to the second purchase and from the first purchase to the sixth purchase.
Order frequency by commodity type: reflects the user's category preferences. Depending on whether the commodities on the platform match those preferences, this index can help explain why a user becomes a stable-period user or a churn-period user.
(3) The commodity review data may include the following index feature:
Number of negative reviews: negative reviews are an important indicator of whether the user is entering the churn period.
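One way to picture the index features listed above is as a single per-user record. The sketch below is only an illustration: the class name `UserFeatures` and all field names are hypothetical, the repurchase-cycle and per-category frequency features are omitted for brevity, and returning 0.0 when a user has never searched is an assumption not stated in the application. The derived property follows the stated ratio, search conversion rate = cumulative order amount / number of commodity searches.

```python
from dataclasses import dataclass


@dataclass
class UserFeatures:
    # Behavior features (hypothetical field names)
    total_pv: int
    search_count: int
    list_page_pv: int
    detail_page_pv: int
    payment_page_pv: int
    payment_success_pv: int
    login_days_7d: int
    login_days_30d: int
    avg_dwell_seconds: float
    # Order transaction features
    first_order_type: str
    first_order_amount: float
    cumulative_order_amount: float
    # Commodity review feature
    negative_review_count: int

    @property
    def search_conversion_rate(self) -> float:
        # Assumption: a user with no searches gets a rate of 0.0.
        if self.search_count == 0:
            return 0.0
        return self.cumulative_order_amount / self.search_count


# Illustrative values only.
user = UserFeatures(
    total_pv=120, search_count=4, list_page_pv=40, detail_page_pv=30,
    payment_page_pv=3, payment_success_pv=2, login_days_7d=5,
    login_days_30d=18, avg_dwell_seconds=95.0,
    first_order_type="Refueling", first_order_amount=200.0,
    cumulative_order_amount=2.0, negative_review_count=0,
)
rate = user.search_conversion_rate
```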
In a preferred embodiment, the collected user behavior data and order-related data are first cleaned and then stored in the data warehouse. Data cleaning removes obviously erroneous data.
S102: Normalize the index feature data and convert it to numerical form to obtain converted user data.
The collected or calculated feature indexes are on different scales. For example, the first purchase amount is numerically larger than the repurchase cycle, so changes in the first purchase amount would more easily dominate the classification of users as a whole. The data therefore needs to be normalized, converting values measured on different scales into the same interval [0, 1]:
$$x' = \frac{x - \min}{\max - \min}$$
In the formula, x' is the normalized value of the feature index; x is the raw collected value, for example the number of days in the repurchase cycle or the first purchase amount; min is the minimum of that feature over all users in the raw data; and max is the maximum of that feature over all users in the raw data.
The meaning of this formula is illustrated by the example in Table 1:
TABLE 1 (image in the original publication; not reproduced here)
Take the repurchase cycle as an example:
For user A, x is 5; for user B, x is 10; in this column, min is 5 and max is 12.
The normalized value of the repurchase cycle column is calculated with this formula. For example, for user A it is (5 - 5)/(12 - 5) = 0.
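The worked example can be checked with a few lines of Python. This is a minimal sketch of min-max normalization, (x - min) / (max - min), applied to one feature column. The function name is hypothetical, user C with a value of 12 is an assumed third row (the text only states that the column maximum is 12), and mapping a constant column to 0.0 is an assumption made to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale a list of numbers into [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Repurchase cycle in days. A and B come from the example; C is assumed.
repurchase_days = {"A": 5, "B": 10, "C": 12}
normalized = dict(zip(repurchase_days, min_max_normalize(list(repurchase_days.values()))))
```

Here user A maps to 0, user B to 5/7 (about 0.71), and the user holding the column maximum maps to 1.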
Because the data processing model of the present application operates on numerical data, text-type data, such as the first order type values (self-operated mall, recharge, refueling, and so on), must be converted to numerical form. This embodiment mainly uses one-hot encoding for this conversion.
The result of this numerical conversion is illustrated by the example in Tables 2 and 3:
User    First order type
A       Self-operated mall
B       Refueling
C       Recharge
D       Self-operated mall
TABLE 2
User    Self-operated mall    Refueling    Recharge
A       1                     0            0
B       0                     1            0
C       0                     0            1
D       1                     0            0
TABLE 3
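The conversion from Table 2 to Table 3 can be sketched in plain Python as follows. The `one_hot` helper is a hypothetical name, and ordering the columns by first appearance is an assumption made so that the output lines up with Table 3.

```python
def one_hot(values):
    """Return (categories, rows): one 0/1 column per distinct value."""
    categories = []
    for v in values:
        if v not in categories:
            categories.append(v)  # keep first-seen order
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Table 2: first order type per user.
first_order_type = {
    "A": "Self-operated mall",
    "B": "Refueling",
    "C": "Recharge",
    "D": "Self-operated mall",
}
categories, rows = one_hot(list(first_order_type.values()))
encoded = dict(zip(first_order_type, rows))
```

Each user ends up with exactly one 1 in their row, so the encoded columns can be combined with the normalized numeric features without implying any ordering among order types.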
S103: Process the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle.
In this embodiment, the distance-based clustering algorithm is the K-means clustering algorithm, which uses distance as the measure of similarity: the closer two objects are, the more similar they are considered. The algorithm treats a cluster as a group of objects that are close to one another, so its final goal is to obtain compact, well-separated clusters.
K-means is an iterative cluster analysis algorithm. It randomly selects K objects as the initial cluster centers, computes the distance between each object and each cluster center, and assigns each object to the nearest center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the center of its cluster is recalculated from the objects currently in the cluster. This process repeats until a termination condition is met. The termination condition may be that no objects (or fewer than a minimum number) are reassigned to different clusters, that no cluster centers (or fewer than a minimum number) change, or that the sum of squared errors reaches a local minimum.
In this embodiment, the K-means algorithm is applied to the converted user data to compute the data model based on the full user life cycle. In a preferred embodiment, K is 5.
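The iteration described above can be sketched in plain Python. This is an illustrative sketch, not the application's implementation: it uses the common batch variant, which recomputes all centers once per pass rather than after every single assignment, it stops when no object changes cluster, and the function names, the `seed` parameter, and the toy data are assumptions. The preferred embodiment uses K = 5; the toy run below uses K = 2 to keep the data small.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster a list of equal-length tuples; return (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # randomly pick K objects as initial centers
    assignment = None
    for _ in range(max_iter):
        # Assign every object to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # termination: no object was reassigned
        assignment = new_assignment
        # Recompute each center as the mean of the objects assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


# Toy normalized feature vectors (illustrative values only).
points = [(0.00, 0.00), (0.01, 0.01), (0.02, 0.02),
          (0.98, 0.98), (0.99, 0.99), (1.00, 1.00)]
centers, assignment = k_means(points, k=2)
```

Because the initial centers are chosen at random, K-means can in general settle in different local minima of the squared-error sum; running it with several seeds and keeping the lowest-error result is a common precaution.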
S104: Define the user life cycle types according to the users' login data and purchase data.
In a preferred embodiment, the user life cycle is defined as follows:
Novice period: users who have not placed an order and have logged in within 7 days; users who have placed an order and have not logged in within 30 days.
Growth period: users whose order count is below 2 and who have logged in within 30 days.
Mature period: users whose order count is more than 5 and who have logged in within 30 days.
Dormant period: users whose order count is more than 2 and who have not logged in within 30 days.
Churn period: users who have not placed an order and have not logged in within 7 days; users who have not placed an order and have not logged in for 90 consecutive days.
It should be noted that S104 may be completed at any time before S105, that is, before or after any of S101, S102, and S103.
S105: Assign a life cycle label to each user according to the data model based on the full user life cycle.
This embodiment analyzes the data of each type of user using the data model based on the full user life cycle and assigns a life cycle label to each user.
In a preferred embodiment, S105 assigns the labels according to the user life cycle types, that is, each user is tagged with one of the 5 life cycle types defined in S104.
S106: Adjust the content of the website's promotional activities according to the data patterns of all user life cycle labels.
In a preferred embodiment, after each user has been given a life cycle label, the users are grouped and counted by the 5 life cycle types, and the data patterns of all user life cycle labels, that is, the user data patterns for each life cycle label, are derived from the statistics.
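As a rough sketch of this grouping and counting step, the Python below tallies users per life cycle label and averages one feature (the search count) within each label. The function name, the label strings, and the sample values are hypothetical; the application does not specify which statistics are computed.

```python
from collections import Counter, defaultdict


def summarize_by_label(users):
    """users: list of (label, search_count) pairs.

    Returns (counts, means): users per label and mean search count per label.
    """
    counts = Counter(label for label, _ in users)
    totals = defaultdict(float)
    for label, search_count in users:
        totals[label] += search_count
    means = {label: totals[label] / counts[label] for label in counts}
    return counts, means


# Illustrative labeled users only.
users = [("novice", 8), ("growth", 3), ("growth", 1), ("mature", 2)]
counts, means = summarize_by_label(users)
```

Comparing such per-label statistics is one way to surface patterns of the kind discussed next, for example whether users in later stages search less than novice users.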
In the preferred embodiment, analysis of the data of users in the development period and the stable period reveals, among others, the following data patterns:
1. These users tend to make purchases on weekends.
2. These users search relatively rarely; most of them end up placing orders through browsing rather than searching. In addition, food commodities more readily prompt users to place orders.
3. It generally takes 18 days for a novice user to develop into a growth-period user, and 26 days to develop from the growth period to the stable period.
The content of the website's promotional activities is then adjusted according to these data patterns as follows:
1. Use food commodities as featured best-sellers to lead novice users to browse more.
2. Reach users in different life cycle stages at the corresponding time nodes.
The method clusters all existing users to find the target user groups, such as development-period and stable-period users, and analyzes their behavior characteristics and behavior paths. Examples include a common behavior path in which the user searches for commodities and finally places an order, or the finding that most first-order users are refueling users. Based on these behavior characteristics, corresponding strategies for adjusting the website's product functions or commodity categories are formulated, for example where certain categories tend to push users into the silent period or the churn period. The aim is to develop a large number of novice users into development-period or stable-period users while reactivating silent-period users and winning back churn-period users. The content of the website's promotional activities is adjusted accordingly, ultimately improving users' transaction behavior on the website.
In summary, the commodity transaction website data analysis method of this embodiment extracts index features from the user behavior data and order-related data in the data warehouse; normalizes the index feature data and converts it to numerical form to obtain converted user data; processes the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle; assigns a life cycle label to each user according to that data model; and adjusts the content of the website's promotional activities according to the data patterns of all user life cycle labels. Based on full-life-cycle analysis of users, this embodiment analyzes the behavior characteristics of users who have reached the stable period, finds the optimal behavior path by which users develop into stable-period users, and uses that path to guide adjustment of the website's functional structure, optimization of the commodity selection strategy, and adjustment of the website's promotional content, ultimately improving users' transaction behavior on the website.
Corresponding to the foregoing method embodiment, the application also provides a commodity transaction website data analysis device, an electronic device, and corresponding embodiments.
Fig. 2 is a schematic structural diagram of a commodity transaction website data analysis device according to an embodiment of the present application.
Referring to fig. 2, a commodity transaction website data analysis apparatus includes: a data extraction module 201, a data processing module 202, a data model generation module 203, a user life cycle label assignment module 204, and a data application module 205.
The data extraction module 201 is configured to extract index features from the user behavior data and order-related data in the data warehouse.
The data processing module 202 is configured to normalize the index feature data and convert it to numerical form to obtain converted user data.
The data model generation module 203 is configured to process the converted user data with a distance-based clustering algorithm to obtain a data model based on the full user life cycle.
The user life cycle label assignment module 204 is configured to assign a life cycle label to each user according to the data model based on the full user life cycle.
The data application module 205 is configured to adjust the content of the website's promotional activities according to the data patterns of all user life cycle labels.
In a preferred embodiment, the apparatus of this embodiment further includes: a data acquisition module 206, a data storage module 207, and a user life cycle type definition module 208.
The data acquisition module 206 is configured to collect user behavior data and order-related data.
The data storage module 207 is configured to store the user behavior data and order-related data in the data warehouse.
The user life cycle type definition module 208 is configured to define the user life cycle types according to the users' login data and purchase data.
How the modules work together is explained below with reference to the method described above.
The data extraction module 201 extracts index features from the user behavior data and order-related data in the data warehouse.
In a preferred embodiment, the data acquisition module 206 of this embodiment collects user behavior data and order-related data, and the data storage module 207 stores them in the data warehouse. This embodiment mainly uses three types of user behavior data: page browsing data, click event data, and dwell time data. The order-related data in this embodiment mainly comprises order transaction data and commodity review data. The server periodically synchronizes all data to the data warehouse of a big data center for storage. Data is collected through event tracking in the app's SDK.
In a more specific example, (1) the behavior class data may be some data indicator features as follows:
1. the accumulated PV total (Page View Page View number) is a basic factor for judging which life cycle the user is in, and is equivalent to an experience value in the game industry.
2. And the commodity search times are user behavior indexes with different life cycles, are used for reflecting the strength of the order placing desire of the user, and are used together with the accumulated order amount index (the search conversion rate is accumulated order amount/commodity search times).
3. And the number of the PV of the commodity list page is an index of a user behavior path.
4. And the number of the commodity detail pages PV is an index of a user behavior path.
5. And the PV number of the order payment pages is an index of a user behavior path.
6. PV number of successful payment pages, namely an index of a user behavior path.
Days logged in 7.7: and judging whether the user enters the important prediction factor of the silent period.
8.30 days login days, an important factor in judging whether the user enters the attrition or silence period.
9. The average stay time is the stay time of the user on the app, and generally reflects the stickiness of the app to the user.
(2) The order transaction data may include the following index features:
1. First order type: because the company's business is a discount aggregation platform, there are various order types (such as self-operated mall, coupons, recharging, refueling and the like), and the first order type reflects the user's intention and purpose in using the app.
2. First order amount: this reflects the user's spending power. Users with high spending power develop to the maturity period of the life cycle relatively easily.
3. Cumulative order amount: this reflects the user's loyalty and trust.
4. User repurchase period: according to the existing repurchase data, the repurchase period comprises two sub-periods, namely the period from the first purchase to the second purchase and the period from the first purchase to the sixth purchase.
5. Order frequency by commodity type: this reflects the user's category preference. Whether the commodities on the platform match that preference can help explain why a user becomes a mature-period user or a loss-period user.
(3) The commodity review data may include the following index feature:
1. Number of negative reviews by the user: negative reviews are an important factor reflecting whether the user has entered the loss period.
In a preferred embodiment, the collected user behavior data and order related data are first cleaned and then stored in the data warehouse. Data cleaning removes data that is obviously erroneous.
After the data extraction module 201 extracts the index features from the user behavior data and the order related data in the data warehouse, the data processing module 202 normalizes the index feature data and converts it to numerical form to obtain converted user data.
The index features extracted by the data extraction module 201 are either collected directly or calculated, and their scales differ. For example, the value of a user's first order amount is typically much larger than the value of the repurchase period, and such differences in magnitude can easily distort the classification of all users. The data processing module 202 therefore normalizes the data, converting values measured on different scales into the same interval [0, 1]. This application uses min-max normalization, with the following formula:
$$x^{*} = \frac{x - \min}{\max - \min}$$
In the formula, $x^{*}$ denotes the normalized value of the feature index, and $x$ denotes the raw collected value, for example the number of days in the repurchase period or the first order amount. $\min$ denotes the minimum of that index over all users in the raw data, and $\max$ denotes the maximum of that index over all users in the raw data.
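As a minimal sketch of the min-max normalization described here, the following Python function rescales one index feature across all users. The function name `min_max_normalize`, the sample values, and the handling of a constant column (all values mapped to 0.0) are assumptions for illustration and are not specified in the source.

```python
def min_max_normalize(values):
    """Rescale one index feature so that its values fall in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when every user has the same value, map all to 0.0
        # to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative first order amounts for three hypothetical users.
first_order_amounts = [20.0, 50.0, 120.0]
print(min_max_normalize(first_order_amounts))
```

Each index feature would be normalized separately, since min and max are taken per feature over all users.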
Because the data model of this application processes numerical data, text data must first be converted into numerical data by the data processing module 202. An example is the first order type, whose values include self-operated mall, recharge, refund and the like. The data processing module 202 mainly uses one-hot encoding to convert text into numerical data.
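One-hot encoding can be sketched in a few lines of Python. The function name `one_hot_encode`, the alphabetical ordering of categories, and the sample order types are assumptions chosen for illustration.

```python
def one_hot_encode(values):
    """Map each distinct text value to a 0/1 vector with a single 1."""
    # Assumption: categories are ordered alphabetically so the output is stable.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


# Illustrative first order types for four hypothetical users.
categories, encoded = one_hot_encode(["mall", "recharge", "mall", "coupon"])
print(categories)
print(encoded)
```

Each text value becomes one column per category, so the encoded feature can be fed to a distance-based algorithm without implying an order between categories.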
After the data processing module 202 normalizes the index feature data and converts it to numerical form to obtain the converted user data, the data model generation module 203 processes the converted user data with a distance-based clustering algorithm to obtain a data model based on the full life cycle of the user.
The data model generation module 203 employs the K-means clustering algorithm. The algorithm uses distance as the measure of similarity: the closer two objects are, the more similar they are considered to be. It treats a cluster as a set of objects that are close to one another, so its final goal is to obtain compact and well-separated clusters.
K-means is an iterative clustering algorithm. It randomly selects K objects as the initial cluster centers, then calculates the distance between each object and each cluster center and assigns each object to the nearest center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the center of its cluster is recalculated from the objects currently in the cluster. This process repeats until a termination condition is met. The termination condition may be that no objects (or fewer than a minimum number) are reassigned to a different cluster, that no cluster centers (or fewer than a minimum number) change, or that the sum of squared errors reaches a local minimum.
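The iteration can be sketched in plain Python as follows. This is an illustrative sketch, not the patented implementation: it uses the common batch variant that recomputes every center after all objects have been assigned, rather than after each single assignment, and it stops when no object changes cluster. The names `k_means`, `squared_distance`, and the optional `initial_centers` parameter are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0, initial_centers=None):
    """Cluster points into k groups; returns (centers, assignments)."""
    if initial_centers is None:
        # Randomly pick k distinct objects as the initial cluster centers.
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in initial_centers]
    assignments = None
    for _ in range(max_iter):
        # Assign every object to its nearest cluster center.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Termination: no object was reassigned to a different cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Recompute each center as the mean of the objects assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, assignments


# Illustrative normalized feature vectors for six hypothetical users.
users = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
centers, labels = k_means(users, k=2, initial_centers=[(0.0, 0.0), (1.0, 1.0)])
print(labels)
```

Because the result depends on the initial centers and only a local minimum of the squared error is guaranteed, practical use often runs the algorithm several times with different seeds and keeps the best result.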
The data model generation module 203 calls the K-means algorithm on the converted user data to compute the data model based on the full life cycle of the user. In a preferred embodiment, K is 5.
Thereafter, the user lifecycle label assignment module 204 assigns a lifecycle label to each user according to the data model based on the user's full life cycle.
In this embodiment, the user lifecycle label assignment module 204 analyzes the data of each class of users produced by the data model and assigns a lifecycle label to each user.
Each user is labeled with exactly one of the 5 life cycle types defined by the user lifecycle type definition module 208.
In a preferred embodiment, the user lifecycle type definition module 208 defines the user life cycle types as follows:
Novice period: users who have not placed an order and have logged in within 7 days; users who have placed an order and have not logged in within 30 days.
Growth period: users whose order count is below 2 and who have logged in within 30 days.
Maturity period: users who have placed more than 5 orders and have logged in within 30 days.
Dormant period: users who have placed more than 2 orders and have not logged in within 30 days.
Loss period: users who have not placed an order and have not logged in within 7 days; users who have not placed an order and have not logged in for 90 consecutive days.
Finally, the data application module 205 adjusts the content of the website promotional activities according to the data rules of all user lifecycle labels.
After the user lifecycle label assignment module 204 has labeled every user, the data application module 205 groups the users by the 5 life cycle types defined by the user lifecycle type definition module 208 and computes statistics for each group. From these statistics it derives the data rules of all user lifecycle labels, that is, the patterns in the user data under each lifecycle label.
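The per-label statistics could be computed along the following lines. This is only a sketch under assumed names: the record fields `label` and `search_count`, the function `summarize_by_label`, and the sample records are hypothetical, and the source does not say which statistics the module actually calculates.

```python
from collections import defaultdict


def summarize_by_label(users):
    """Group labeled users and compute simple statistics per lifecycle label."""
    groups = defaultdict(list)
    for user in users:
        groups[user["label"]].append(user)
    summary = {}
    for label, members in groups.items():
        summary[label] = {
            "user_count": len(members),
            # Assumption: average search count stands in for one "data rule".
            "avg_search_count": sum(m["search_count"] for m in members) / len(members),
        }
    return summary


# Hypothetical labeled users, not data from the source.
labeled_users = [
    {"label": "growth", "search_count": 2},
    {"label": "growth", "search_count": 4},
    {"label": "mature", "search_count": 1},
]
print(summarize_by_label(labeled_users))
```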
In the preferred embodiment, the data application module 205 analyzes the data of users in the growth period and the maturity period and finds, among others, the following data rules:
1. These users tend to make purchases on weekends.
2. These users search relatively rarely; most of them end up placing orders through browsing rather than searching. In addition, food commodities prompt them to place orders more readily.
3. It generally takes 18 days for a novice-period user to develop into a growth-period user, and 26 days to develop from the growth period to the maturity period.
The data application module 205 then adjusts the content of the website promotional activities according to the analyzed data rules, as follows:
1. Use food commodities as hot-selling items to guide novice users toward more browsing.
2. Reach users in different life cycle stages at each time node.
In this embodiment, target user groups, such as growth-period and maturity-period users, are found by clustering all existing users. The behavior characteristics and common behavior paths of the target groups are then analyzed; for example, the users ultimately place orders by searching for commodities, and most of them are refueling users. Based on these characteristics, corresponding adjustments to website product functions or commodity categories are made, for example where certain categories tend to push users into the dormant or loss period. Ultimately, a large number of novice-period users are developed into growth-period or maturity-period users, dormant-period users are reactivated, loss-period users are won back, the content of website promotional activities is adjusted, and users' transaction activity on the website increases.
According to this embodiment, the data extraction module of the commodity transaction website data analysis device extracts the index features from the user behavior data and the order related data in the data warehouse; the data processing module normalizes the index feature data and converts it to numerical form to obtain converted user data; the data model generation module processes the converted user data with a distance-based clustering algorithm to obtain a data model based on the full life cycle of the user; the user lifecycle label assignment module assigns a lifecycle label to each user according to that data model; and the data application module adjusts the website promotional activity content according to the data rules of all user lifecycle labels. Based on this full-life-cycle analysis method, the device analyzes the behavior characteristics of maturity-period users to find the best behavior path by which users develop into maturity-period users, and uses that path to guide adjustments to the website's functional structure and to optimize the commodity selection strategy.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 3 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Referring to fig. 3, the electronic device 300 includes a memory 310 and a processor 320.
The Processor 320 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 310 may include various types of storage units, such as system memory, Read Only Memory (ROM), and persistent storage. The ROM may store static data or instructions for the processor 320 or other modules of the computer. The persistent storage device may be a read-write storage device. The persistent storage may be a non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, the persistent storage device is a mass storage device (e.g., magnetic or optical disk, flash memory). In other embodiments, the persistent storage may be a removable storage device (e.g., floppy disk, optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as a dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 310 may comprise any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic and/or optical disks may also be employed. In some embodiments, memory 310 may include a removable storage device that is readable and/or writable, such as a Compact Disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card, etc.), a magnetic floppy disc, or the like. Computer-readable storage media do not contain carrier waves or transitory electronic signals transmitted by wireless or wired means.
The memory 310 has stored thereon executable code that, when processed by the processor 320, may cause the processor 320 to perform some or all of the methods described above.
The aspects of the present application have been described in detail hereinabove with reference to the accompanying drawings. In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments. Those skilled in the art should also appreciate that the acts and modules referred to in the specification are not necessarily required in the present application. In addition, it can be understood that the steps in the method of the embodiment of the present application may be sequentially adjusted, combined, and deleted according to actual needs, and the modules in the device of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing some or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform part or all of the steps of the above-described method according to the present application.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the applications disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A data analysis method for a commodity transaction website comprises the following steps:
extracting index features in user behavior data and order related data in a data warehouse;
normalizing and carrying out numerical processing on the index characteristic data to obtain converted user data;
calculating the converted user data by adopting a clustering algorithm based on distance to obtain a data model based on the full life cycle of the user;
allocating a life cycle label to each user according to a data model based on the full life cycle of the user;
and adjusting the promotion activity content of the website according to the data rule of all the user life cycle labels.
2. The method of claim 1, further comprising:
collecting user behavior data and order related data;
and storing the user behavior data and the order related data in a data warehouse.
3. The method of claim 1, further comprising: defining user life cycle types according to user login data and purchase data.
4. The method of claim 3, wherein the user life cycle types comprise:
a novice period, a growth period, a maturity period, a dormant period and a loss period.
5. The method of claim 4, wherein assigning a life cycle label to each user according to a data model based on the user's full life cycle comprises:
assigning each user a life cycle label corresponding to one of the user life cycle types, according to the data model based on the user's full life cycle.
6. A commodity transaction website data analysis device, comprising: a data extraction module, a data processing module, a data model generation module, a user life cycle label distribution module and a data application module;
the data extraction module is used for extracting index features in user behavior data and order related data in the data warehouse;
the data processing module is used for carrying out normalization and digitization processing on the index characteristic data to obtain converted user data;
the data model generation module is used for calculating the converted user data by adopting a clustering algorithm based on distance to obtain a data model based on the full life cycle of the user;
the user life cycle label distribution module is used for distributing life cycle labels to each user according to a data model based on the full life cycle of the users;
and the data application module is used for adjusting the website promotion activity content according to the data rules of all the user life cycle labels.
7. The apparatus of claim 6, further comprising: the data acquisition module and the data storage module;
the data acquisition module is used for acquiring user behavior data and order related data;
and the data storage module is used for storing the user behavior data and the order related data in a data warehouse.
8. The apparatus of claim 6 or 7, further comprising: a user life cycle type definition module;
and the user life cycle type definition module is used for defining the user life cycle type according to the user login data and the purchase data.
9. An electronic device, comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method of any one of claims 1-5.
10. A non-transitory machine-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method of any one of claims 1-5.
CN202010543883.6A 2020-06-15 2020-06-15 Commodity transaction website data analysis method and device and electronic equipment Pending CN111695941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010543883.6A CN111695941A (en) 2020-06-15 2020-06-15 Commodity transaction website data analysis method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010543883.6A CN111695941A (en) 2020-06-15 2020-06-15 Commodity transaction website data analysis method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111695941A true CN111695941A (en) 2020-09-22

Family

ID=72481171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010543883.6A Pending CN111695941A (en) 2020-06-15 2020-06-15 Commodity transaction website data analysis method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111695941A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447126A (en) * 2015-11-17 2016-03-30 苏州蜗牛数字科技股份有限公司 Game prop personalized recommendation method
CN107784390A (en) * 2017-10-19 2018-03-09 北京京东尚科信息技术有限公司 Recognition methods, device, electronic equipment and the storage medium of subscriber lifecycle
CN108021929A (en) * 2017-11-16 2018-05-11 华南理工大学 Mobile terminal electric business user based on big data, which draws a portrait, to establish and analysis method and system
CN110503446A (en) * 2018-05-16 2019-11-26 江苏天智互联科技股份有限公司 The client segmentation method and decision-making technique of electric business platform based on clustering algorithm
CN110689355A (en) * 2019-09-03 2020-01-14 浙江数链科技有限公司 Client classification method, device, computer equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465546A (en) * 2020-11-26 2021-03-09 中诚信征信有限公司 User identification method, device and equipment
CN112465546B (en) * 2020-11-26 2024-04-19 中诚信征信有限公司 User identification method, device and equipment
CN112598442A (en) * 2020-12-25 2021-04-02 中国建设银行股份有限公司 Multidimensional operation analysis method and multidimensional operation analysis device for network traffic
CN113297478A (en) * 2021-04-25 2021-08-24 上海淇玥信息技术有限公司 Information pushing method and device based on user life cycle and electronic equipment
CN113297478B (en) * 2021-04-25 2022-06-21 上海淇玥信息技术有限公司 Information pushing method and device based on user life cycle and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination