CN115409577A

CN115409577A - Intelligent container repurchase prediction method and system based on user behavior and environmental information

Info

Publication number: CN115409577A
Application number: CN202211058391.3A
Authority: CN
Inventors: 龚科; 陈子良; 陈添水
Original assignee: Guangzhou Wisdom Technology Guangzhou Co ltd
Current assignee: Guangzhou Wisdom Technology Guangzhou Co ltd
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2022-11-29

Abstract

The invention provides an intelligent container repurchase prediction method and system based on user behavior and environmental information, wherein the method comprises the following steps: S1, data collection and preprocessing: collecting basic information and historical order information of all users from an intelligent container, acquiring the environment information of each historical order, and obtaining from this information user portrait data fused with the environment information, namely environment user portrait data; the environment information comprises weather information, date information and peripheral competitive product information at the time of the order transaction; S2, feature construction and screening; S3, model training and validation; and S4, repurchase behavior prediction. The method and the system fully mine the user portrait in the unmanned retail scenario and introduce the environment information as a contextual supplement to the user's historical purchasing behavior, so that repurchasing users can be accurately identified, which provides positive guidance for marketing strategies.

Description

Intelligent container repurchase prediction method and system based on user behavior and environmental information

Technical Field

The invention relates to the field of machine learning, in particular to an intelligent container repurchase prediction method and system based on user behaviors and environmental information.

Background

In recent years, intelligent containers based on AI unmanned retail technology have gradually become popular in China, and repurchase prediction plays a key guiding role in their marketing strategies. Repurchase behavior is defined as the same user repeatedly purchasing goods within a period of time. The frequency of repurchase behavior can be used as an evaluation index of user stickiness: the higher the repurchase frequency, the higher the stickiness of the user.

In practice, taking an office building as an example, the office population is relatively fixed, so the probability of repurchase behavior is relatively high. In this case, an accurate repurchase prediction system can guide a seller to model the users in this population accurately and to push relevant preferential information, thereby improving sales profit.

The Chinese invention patent with application publication number CN111681055A discloses a commodity repurchase pushing method, which comprises the following steps: receiving the information that a customer follows the merchant's WeChat service account, and recording the personal information of the customer and the commodity purchase records bound to the WeChat service account; calculating the repurchase time of the customer based on the commodity purchase records, and pushing a repurchase reminder to the customer through the WeChat service account at the repurchase time; and processing the repurchase operation performed by the customer who enters the merchant's online shop in response to the repurchase reminder. That method targets online shopping scenarios with relatively dense purchase frequency and cannot be applied to offline shopping scenarios with relatively sparse purchase frequency.

The Chinese patent with grant publication number CN107220845B discloses a user repurchase probability prediction method, which comprises the following steps: learning from a training sample set to obtain a prediction model of the user repurchase probability; acquiring a feature data set of a user to be predicted; and taking the feature data set of the user to be predicted as the input of the prediction model, and obtaining the predicted repurchase probability of the user through the prediction model. However, the repurchase prediction method adopted is based on simple statistical information and manual rules, such as the number of purchases and the order amount of the user; it does not fully consider the environment information specific to offline shopping scenarios or the environment in which the user shops, and the accuracy of its repurchase rate prediction is low.

In summary, most existing repurchase prediction systems target dense repurchase scenarios such as e-commerce, whereas the purchase frequency in offline scenarios is relatively sparse and offline scenarios carry complex context information, so existing repurchase prediction algorithms cannot be directly migrated and reused; meanwhile, existing repurchase prediction systems do not fully consider the environment information specific to offline shopping scenarios, and their modeling granularity cannot meet the requirements of the unmanned retail container scenario.

Disclosure of Invention

The invention aims to address the above problems in the prior art by providing an intelligent container repurchase prediction method and system based on user behavior and environmental information, in which the user's historical purchasing behavior and the environment information are fused by means of machine learning and data mining algorithms, so that users are modeled at a finer granularity and their repurchase intention is accurately identified; by identifying users with a higher repurchase probability, accurate pushing and selling are realized, and the commodity sales volume and profit of the intelligent container are improved.

In order to achieve the purpose, the invention adopts the following technical scheme:

an intelligent container repurchase prediction method based on user behavior and environmental information comprises the following steps:

S1, data collection and preprocessing: collecting basic information and historical order information of all users from an intelligent container, acquiring the environment information of each historical order, and obtaining from this information user portrait data fused with the environment information, namely environment user portrait data; the environment information comprises weather information, date information and peripheral competitive product information at the time of the order transaction;

S2, feature construction and screening: converting the environment user portrait data generated in S1 into continuous or discrete numerical features, and concatenating them to obtain user feature vectors; expanding the user feature vectors through manually designed feature crossing rules, and outputting an expanded user feature vector set; screening the features in the expanded user feature vector set by calculating the feature importance, sorting in descending order, and deleting the features whose importance is lower than a certain threshold, to obtain a final user feature vector set;

S3, model training and validation: establishing a model, fitting the user repurchase behavior based on the features screened in S2, and predicting the user repurchase probability after a certain time; performing model validation and parameter selection through cross-validation or the leave-one-out method to obtain a final model M;

S4, repurchase behavior prediction: calling the final model M to predict the repurchase rate of each user and updating the repurchase rate to a background database of the intelligent container.

Further, the S1 specifically includes the following steps:

S101, extracting basic information and historical order information of all users from the online logs of the intelligent containers;

S102, screening active users according to the registration time and the historical purchase time of the users, and outputting an active user set U;

S103, acquiring the population attributes of the users in the active user set U, and retrieving their historical order information through the user IDs; constructing user portrait data based on this information, and outputting it with the user ID as an index;

S104, crawling, by means of web crawler technology, the weather information, date information and peripheral competitive product information at the time of each historical order transaction; constructing order environment data based on this information, and outputting it with the order ID as an index;

S105, combining the user portrait data and the order environment data according to the (user ID, order ID) correspondence to obtain structured environment user portrait data, and uploading the structured environment user portrait data to a local or cloud storage medium.

Further, the population attributes comprise gender, age, nickname, and whether password-free payment is enabled;

the historical order information comprises the purchase time of each order, a purchased commodity list in the order, and the unit price and the quantity of each purchased commodity;

the date information comprises whether the order transaction date is weekend or legal holiday;

the peripheral competitive product information comprises the number of peripheral convenience stores and the price of the competitive product.

Further, the S102 specifically includes the following steps:

S1021, distinguishing new users from non-new users according to the registration duration of the users;

S1022, distinguishing active users from static users according to whether purchasing behavior exists in the most recent period.

Further, in S1021, the method for distinguishing the new user from the non-new user is as follows:

selecting a certain month as an initial month, calculating the retention rates of the users registered in the initial month over several subsequent months, and, when the retention rate tends to be stable from a certain month onward, taking the difference between that month and the initial month as the new user observation period;

users who still have purchasing behavior after their registration time exceeds the new user observation period are defined as non-new users, and the rest are new users.

Further, in S1022, the method for distinguishing active users from static users among the non-new users is:

calculating the N+1 month purchase rate of the users, wherein the N+1 month purchase rate is defined as: the probability that a user has no purchasing behavior for N consecutive months and has purchasing behavior in the (N+1)-th month;

when N is larger than a certain integer value K and the N+1 month purchase rate always remains at a low level, whether purchasing behavior occurs within K consecutive months is used as the criterion for dividing active users from static users.

Further, the method further comprises S5, information pushing: and pushing new products or preferential information by the intelligent container according to the repurchase rate prediction data of the user.

Further, the S2 specifically includes:

converting the environment user portrait data generated in S1 into continuous or discrete numerical features, and concatenating them to obtain user feature vectors;

generating high-order cross features of the user feature vectors, namely combining the feature values of different dimensions according to manually designed feature crossing rules, or by using a factorization machine or a deep neural network model, so as to expand the user feature vectors, and outputting an expanded user feature vector set;

using the expanded user feature vector set as the input of a single-layer binary-classification logistic regression model to predict the user repurchase behavior, validating the model on offline test data and through organized online A/B tests, and further adjusting the weight parameters of the features in the model; finally, taking the weight parameters of the model as the feature importance, sorting the features in descending order of importance, and deleting the features whose importance is lower than a certain threshold, to obtain a final user feature vector set.
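The crossing and screening steps of S2 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the source: the feature names, the crossing rules (simple pairwise products), the weight values (hypothetical numbers standing in for the weights of a trained logistic regression) and the use of the absolute weight as the importance score.

```python
def cross_features(row, rules):
    """Expand one feature row with pairwise product features.

    `rules` is a manually designed list of (feature_a, feature_b) pairs.
    """
    expanded = dict(row)
    for a, b in rules:
        expanded[f"{a}_x_{b}"] = row[a] * row[b]
    return expanded


def screen_features(weights, threshold):
    """Rank features by |weight| in descending order and drop those below the threshold."""
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, weight in ranked if abs(weight) >= threshold]


# Hypothetical feature row and crossing rules (assumptions for illustration).
row = {"order_count_90d": 6, "is_weekend": 1, "is_rainy": 1, "nearby_stores": 2}
rules = [("is_weekend", "is_rainy"), ("order_count_90d", "nearby_stores")]
expanded_row = cross_features(row, rules)

# Hypothetical logistic-regression weights, one per expanded feature.
weights = {
    "order_count_90d": 0.82,
    "is_weekend": -0.05,
    "is_rainy": 0.02,
    "nearby_stores": -0.31,
    "is_weekend_x_is_rainy": 0.44,
    "order_count_90d_x_nearby_stores": -0.01,
}
kept = screen_features(weights, threshold=0.1)
```

Taking the absolute value treats strongly negative weights as important too; if the features are not on comparable scales, the weights would need to be computed on standardized inputs for the ranking to be meaningful.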

Further, the S3 specifically includes:

selecting and establishing a plurality of base models with different model structures;

dividing the final user feature vector set from S2 into a training set and a validation set;

setting the hyper-parameters of the base models, training the base models on the training set, fitting the user repurchase behavior based on the features screened in S2, and predicting the user repurchase probability after a certain time;

then performing model validation and parameter selection for the base models on the validation set through cross-validation or the leave-one-out method;

and selecting the optimal base model and its parameter combination based on the evaluation metrics to obtain the final model M.
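The selection procedure of S3 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the two "base models" (a majority-class baseline and a single-feature threshold rule), the toy data and the use of accuracy as the metric are all hypothetical and far simpler than the models the text has in mind.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_majority(X, y):
    """Baseline: always predict the majority label of the training data."""
    label = int(sum(y) * 2 >= len(y))
    return lambda x: label


def fit_threshold(X, y):
    """Pick the cutoff on the single feature that best separates the training labels."""
    best_cut, best_acc = 0, -1.0
    for cut in sorted(set(X)):
        acc = sum((x >= cut) == bool(t) for x, t in zip(X, y)) / len(y)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return lambda x: int(x >= best_cut)


def cross_val_accuracy(fit, X, y, k=3):
    """Mean validation accuracy of `fit` over k folds."""
    fold_scores = []
    for train, val in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(model(X[i]) == y[i] for i in val)
        fold_scores.append(correct / len(val))
    return sum(fold_scores) / len(fold_scores)


# Hypothetical data: recent order count per user and whether the user repurchased.
X = [1, 6, 2, 6, 1, 7, 2, 6, 3]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best_name = max(scores, key=scores.get)  # the candidate kept as final model M
```

The same loop extends to hyper-parameter selection by adding one candidate entry per parameter combination, which is in effect a grid search.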

An intelligent container repurchase prediction system based on user behavior and environmental information comprises:

the data collection and preprocessing module: used for collecting basic information and historical order information of all users from the intelligent container, acquiring the environment information of each historical order, and obtaining from this information user portrait data fused with the environment information, namely environment user portrait data; the environment information comprises weather information, date information and peripheral competitive product information at the time of the order transaction;

the feature construction and screening module: used for converting the environment user portrait data generated by the data collection and preprocessing module into continuous or discrete numerical features, and concatenating them to obtain user feature vectors; expanding the user feature vectors through manually designed feature crossing rules, and outputting an expanded user feature vector set; screening the features in the expanded user feature vector set by calculating the feature importance, sorting in descending order, and deleting the features whose importance is lower than a certain threshold, to obtain a final user feature vector set;

the model training and validation module: used for establishing a model, fitting the user repurchase behavior based on the features screened by the feature construction and screening module, and predicting the user repurchase probability after a certain time; performing model validation and parameter selection through cross-validation or the leave-one-out method to obtain a final model M;

the repurchase behavior prediction module: used for calling the final model M to predict the repurchase rate of each user and updating the repurchase rate to the background database of the intelligent container.

The method fully mines the user portrait in the unmanned retail scenario and introduces the environment information as a contextual supplement to the user's historical purchasing behavior, so that repurchasing users can be accurately identified, which provides positive guidance for marketing strategies.

In the intelligent container repurchase prediction method and system based on user behavior and environmental information, user portrait modeling is carried out through complete feature extraction, feature screening and fusion of environmental context information, and the repurchase behavior of users of unmanned retail intelligent containers is predicted accurately by means of machine learning, so that the marketing strategies of container operators can be guided, users can be served accurately, and a truly intelligent, reliable and convenient unmanned retail service is achieved.

Compared with the prior art, the invention has the following advantages:

1. High modeling precision: the invention depicts intelligent container users accurately and at fine granularity through environment information, population attributes, historical behavior and similar information, so that a high degree of personalization can be achieved;

2. Good universality: the system and the method are also suitable for traditional e-commerce and other new retail scenarios, and can bring actual sales gains to merchants;

3. Good user experience: the more often a user interacts with the intelligent container, the more accurate the modeling of the user and the screening of repurchasing users become, so that the operation mode and the user experience can be effectively improved.

Drawings

Fig. 1 is a flowchart of an intelligent container repurchase prediction method based on user behavior and environmental information according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of an intelligent container repurchase prediction system based on user behavior and environmental information according to a second embodiment of the present invention.

Detailed Description

The technical solution of the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.

Example one

As shown in fig. 1, an intelligent container repurchase prediction method based on user behavior and environmental information provided by an embodiment of the present invention includes the following steps:

S4, repurchase behavior prediction: calling the final model M to predict the repurchase rate of each user and updating the repurchase rate to a background database of the intelligent container;

S5, information pushing: pushing new products or preferential information from the intelligent container according to the predicted repurchase rate of each user.

The main objective of S1 is to define and filter a user set suitable for repurchase prediction, and to abstract the users' historical behavior into structured data. It specifically includes the following steps:

S103, for the users in the active user set U, acquiring the population attributes of the users (including gender, age, nickname and whether password-free payment is enabled), and retrieving their historical order information (including the purchase time of each order, the list of purchased commodities in the order, and the unit price and quantity of each purchased commodity) through the user IDs; constructing user portrait data based on this information, organizing it in JSON format, and outputting it with the user ID as an index;

S104, crawling, by means of web crawler technology, the weather information, date information (including whether the order transaction date is a weekend or a statutory holiday) and peripheral competitive product information (including the number of nearby convenience stores and the prices of competitive products) at the time of each historical order transaction; constructing order environment data based on this information, organizing it in JSON format, and outputting it with the order ID as an index;

S105, combining the user portrait data and the order environment data according to the (user ID, order ID) correspondence to obtain structured environment user portrait data, packaging it as a JSON or CSV file, and uploading it to a local or cloud storage medium.
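The join described in S103 to S105 can be sketched in Python as follows. The record layout and every field name (`orders`, `order_id`, `password_free_payment`, `competitor_price` and so on) are assumptions made for illustration; the source only specifies that user portrait data is indexed by user ID, order environment data by order ID, and that both are organized as JSON.

```python
import json

# Hypothetical user portrait data, indexed by user ID (S103).
user_portraits = {
    "u001": {
        "gender": "F",
        "age": 29,
        "password_free_payment": True,
        "orders": [
            {
                "order_id": "o100",
                "purchase_time": "2021-03-06T12:10:00",
                "items": [{"sku": "cola", "unit_price": 3.5, "quantity": 2}],
            },
        ],
    },
}

# Hypothetical order environment data, indexed by order ID (S104).
order_environment = {
    "o100": {
        "weather": "rain",
        "is_weekend": True,
        "is_holiday": False,
        "nearby_convenience_stores": 2,
        "competitor_price": 4.0,
    },
}


def build_environment_user_portrait(portraits, environment):
    """Attach to each order the environment record that shares its order ID (S105)."""
    merged = {}
    for user_id, portrait in portraits.items():
        enriched_orders = []
        for order in portrait["orders"]:
            env = environment.get(order["order_id"], {})
            enriched_orders.append({**order, "environment": env})
        merged[user_id] = {**portrait, "orders": enriched_orders}
    return merged


environment_user_portrait = build_environment_user_portrait(
    user_portraits, order_environment
)

# Package as JSON text, ready to be written to local or cloud storage.
packaged = json.dumps(environment_user_portrait, ensure_ascii=False, indent=2)
```

Orders without a crawled environment record receive an empty `environment` object here; whether such orders should instead be dropped is a design choice the source does not address.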

Wherein the S102 includes the steps of:

S1021, distinguishing new users from non-new users according to the registration duration of the users;

Specifically, in S1021, the method for distinguishing new users from non-new users is:

selecting a certain month as an initial month, calculating the retention rates of the users registered in the initial month over several subsequent months, and, when the retention rate tends to be stable from a certain month onward, taking the difference between that month and the initial month as the new user observation period;

Wherein, the retention rate of a given month is calculated as: retention rate = (number of users registered in the initial month who have purchasing behavior in the given month) / (number of users registered in the initial month) × 100%.

For example: with January 2021 as the initial month, the number of registered users is 100. From February to June 2021, the numbers of users registered in January who have purchasing behavior are 80, 70, 69, 72 and 70 respectively, so the retention rates for February to June are 80%, 70%, 69%, 72% and 70% respectively. From March 2021 onward the fluctuation of the retention rate is less than 3%, so the retention rate can be considered to have stabilized from March 2021, and 3 - 1 = 2 months is selected as the observation period; that is, a user who still has purchasing behavior more than 2 months after registration is defined as a non-new user.
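The worked example above can be reproduced with a short Python sketch. The stability test is an assumption: the source only says the fluctuation is "less than 3%", and the sketch interprets this as every later retention rate staying within 3 percentage points of the rate in the candidate month.

```python
def retention_rates(registered, monthly_buyers):
    """Retention rate per month: buyers from the initial cohort / cohort size."""
    return [buyers / registered for buyers in monthly_buyers]


def observation_period(rates, first_month, start_month=1, tolerance=0.03):
    """Return (stable month - start month) in months.

    The stable month is the first month from which every later rate stays
    within `tolerance` of that month's rate (an assumed reading of "stable").
    `first_month` is the calendar month that rates[0] refers to.
    """
    for i, base in enumerate(rates):
        if all(abs(rate - base) < tolerance for rate in rates[i:]):
            return (first_month + i) - start_month
    return None


# Figures from the example: 100 users registered in January 2021,
# buyers from that cohort in February to June 2021.
rates = retention_rates(100, [80, 70, 69, 72, 70])
period = observation_period(rates, first_month=2)  # 2, i.e. 3 - 1 = 2 months
```

With only a few months of data the last month is trivially "stable", so in practice the series should be long enough that the stable month is followed by several observations.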

The N+1 month purchase rate of the users is then calculated, where the N+1 month purchase rate is defined as the probability that a user has no purchasing behavior for N consecutive months from a certain starting month and has purchasing behavior in the (N+1)-th month, and is calculated by the following formula:

When N is larger than a certain integer value K and the N+1 month purchase rate always remains at a low level, whether purchasing behavior occurs within K consecutive months is used as the criterion for dividing active users from static users. The criterion for judging that the N+1 month purchase rate always remains at a low level is: the N+1 month purchase rate stays below 20% for several consecutive months, or the variance of the N+1 month purchase rate over several consecutive months is less than 0.04. For example: when users have made no purchase for 3 or more consecutive months (N ≥ 3, K = 3), the mean probability of a purchase in the following month is 5% and the variance is less than 0.04, so users who have made no purchase for 3 or more consecutive months are classified as static users, and the rest are active users.
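A minimal Python sketch of this division is given below. Because the formula itself is not reproduced in the text, the estimator is an assumption: the rate is taken as a conditional frequency, namely the share of N-month purchase-free windows that are followed by a month with a purchase. The 20% level comes from the text, the variance criterion is omitted for brevity, and the sample rates are hypothetical.

```python
def n_plus_1_purchase_rate(histories, n):
    """Share of n-month purchase-free windows followed by a purchase month.

    `histories` maps user ID -> list of booleans, one per month
    (True means at least one purchase in that month).
    """
    eligible = 0
    returned = 0
    for months in histories.values():
        for start in range(len(months) - n):
            if not any(months[start:start + n]):
                eligible += 1
                returned += int(months[start + n])
    return returned / eligible if eligible else None


def choose_k(rates_by_n, low_level=0.20):
    """Smallest K such that the rate stays below `low_level` for every N >= K."""
    ns = sorted(rates_by_n)
    for k in ns:
        if all(rates_by_n[n] < low_level for n in ns if n >= k):
            return k
    return None


def is_static(months, k):
    """A user is static if the most recent k months contain no purchase."""
    return len(months) >= k and not any(months[-k:])


# Hypothetical rates for N = 1..5; with these numbers K comes out as 3.
example_rates = {1: 0.45, 2: 0.30, 3: 0.05, 4: 0.04, 5: 0.06}
k = choose_k(example_rates)
```

Users whose history is shorter than K months are never classified as static by `is_static`, which mirrors the idea that recently registered users should not be judged on an incomplete window.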

The purpose of the above process is to evaluate whether a user can form a habit of continuous purchasing within a certain period (K months): if the user generates no new purchasing behavior within this period, the user is regarded as a static user; otherwise, the user is an active user. It should be noted that users who registered within the last K months should be excluded from the statistics of the validation set.

Further, the S2 specifically includes:

converting the environment user portrait data generated in S1 into continuous or discrete numerical features, and concatenating them to obtain user feature vectors;

generating high-order cross features of the user feature vectors, namely combining the feature values of different dimensions according to manually designed feature crossing rules, or by using a factorization machine or a deep neural network model, so as to expand the user feature vectors, and outputting an expanded user feature vector set;
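The conversion of portrait data into a numerical feature vector can be sketched in Python as follows. The field names, the weather vocabulary and the choice of one-hot encoding for categorical values are assumptions for illustration, not details given in the source.

```python
# Assumed weather vocabulary; a real deployment would derive it from the crawled data.
WEATHER_CATEGORIES = ["sunny", "cloudy", "rain"]


def encode_user(profile):
    """Population attributes -> numeric features (age, gender flag, payment flag)."""
    return [
        float(profile["age"]),
        int(profile["gender"] == "F"),
        int(profile["password_free_payment"]),
    ]


def encode_order_environment(env):
    """Order environment -> one-hot weather plus date and competitor features."""
    weather_one_hot = [int(env["weather"] == w) for w in WEATHER_CATEGORIES]
    return weather_one_hot + [
        int(env["is_weekend"]),
        int(env["is_holiday"]),
        float(env["nearby_convenience_stores"]),
        float(env["competitor_price"]),
    ]


def user_feature_vector(profile, env):
    """Concatenate the user features and the environment features."""
    return encode_user(profile) + encode_order_environment(env)


# Hypothetical records.
profile = {"age": 29, "gender": "F", "password_free_payment": True}
env = {
    "weather": "rain",
    "is_weekend": True,
    "is_holiday": False,
    "nearby_convenience_stores": 2,
    "competitor_price": 4.0,
}
vector = user_feature_vector(profile, env)
```

A user with several orders would need the per-order environment features aggregated (for example as counts or means) before concatenation; the sketch encodes a single order for simplicity.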

Further, the S3 specifically includes:

selecting and establishing a plurality of base models with different model structures (such as gradient boosting trees, deep neural networks, support vector machines, naive Bayes and other statistical machine learning models);

dividing the final user feature vector set from S2 into a training set and a validation set according to a certain rule, for example time-ordered partitioning, random-sampling partitioning, or partitioning according to the K-fold cross-validation principle;

selecting the optimal base model and its parameter combination based on the evaluation metrics to obtain the final model M; the evaluation metrics include AUC, MAP, F1, recall, precision and the like, and the parameter combination can also be determined by grid search.
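Some of the evaluation metrics named above can be computed in a few lines of Python. The sketch below covers AUC, precision, recall and F1 for binary labels; the 0.5 decision threshold and the sample labels and scores are assumptions for illustration.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(labels, scores, threshold=0.5):
    """Precision, recall and F1 after thresholding the predicted probabilities."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical validation labels (1 = repurchased) and predicted probabilities.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
model_auc = auc(labels, scores)
precision, recall, f1 = precision_recall_f1(labels, scores)
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based computation on large validation sets.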

Further, S4 specifically includes: calling the final model M to predict the repurchase rates of all users in the set U, giving for each user a classification result (0 or 1) indicating whether repurchase is likely together with the corresponding probability, and updating the result to the background database of the intelligent container with the user ID as an index.
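The write-back step of S4 can be sketched with Python's built-in `sqlite3` module. The table name, column names, the 0.5 classification threshold and the use of SQLite itself are assumptions; the source does not say which database backs the intelligent container.

```python
import sqlite3


def classify(probability, threshold=0.5):
    """Turn a repurchase probability into a 0/1 classification result."""
    return int(probability >= threshold)


def update_repurchase_predictions(conn, predictions, threshold=0.5):
    """Insert or overwrite one prediction row per user ID."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS repurchase_prediction ("
        "user_id TEXT PRIMARY KEY, "
        "probability REAL NOT NULL, "
        "will_repurchase INTEGER NOT NULL)"
    )
    rows = [(uid, p, classify(p, threshold)) for uid, p in predictions.items()]
    conn.executemany(
        "INSERT OR REPLACE INTO repurchase_prediction VALUES (?, ?, ?)", rows
    )
    conn.commit()


# Hypothetical model output: user ID -> predicted repurchase probability.
conn = sqlite3.connect(":memory:")
update_repurchase_predictions(conn, {"u001": 0.83, "u002": 0.12})
stored = conn.execute(
    "SELECT user_id, probability, will_repurchase "
    "FROM repurchase_prediction ORDER BY user_id"
).fetchall()
```

Using the user ID as the primary key means that each prediction run overwrites the previous result for that user, which matches the "update with the user ID as an index" wording.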

Further, in S5, the intelligent container can apply different marketing strategies according to the different repurchase rates of users, so that accurate marketing is realized. For example: for the low repurchase probability customer group, coupons are pushed to cultivate the purchasing habits of the users; for the medium-low repurchase probability customer group, post-purchase lottery activities and coupon pushing are carried out to promote a second purchase; for the medium-high repurchase probability customer group, post-purchase lottery activities and rewards of a certain amount upon reaching a certain number of purchases are carried out to improve user stickiness; for the high repurchase probability customer group, rewards of a certain amount upon reaching a certain number of purchases are offered to improve sales volume, user stickiness and user satisfaction.
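The tiered strategy of S5 amounts to a lookup from predicted probability to a list of marketing actions, as in the Python sketch below. The four tiers are named low, medium-low, medium-high and high here, and the tier boundaries (0.25, 0.50, 0.75) and action identifiers are assumptions; the source does not give numeric cut-offs.

```python
# (upper bound, tier name, actions); boundaries are hypothetical.
TIERS = [
    (0.25, "low", ["coupon"]),
    (0.50, "medium-low", ["post_purchase_lottery", "coupon"]),
    (0.75, "medium-high", ["post_purchase_lottery", "purchase_count_reward"]),
]
TOP_TIER = ("high", ["purchase_count_reward"])


def marketing_plan(probability):
    """Map a predicted repurchase probability to (tier, actions)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper, tier, actions in TIERS:
        if probability < upper:
            return tier, actions
    return TOP_TIER


# Hypothetical predictions for three users.
plans = {uid: marketing_plan(p) for uid, p in {"u001": 0.83, "u002": 0.12, "u003": 0.4}.items()}
```

Keeping the boundaries in one table makes it easy for an operator to retune the tiers after an A/B test without touching the pushing logic.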

Example two

As shown in fig. 2, an intelligent container repurchase prediction system based on user behavior and environmental information provided in an embodiment of the present invention includes:

the data collection and preprocessing module: used for collecting basic information and historical order information of all users from the intelligent container, acquiring the environment information of each historical order, and obtaining from this information user portrait data fused with the environment information, namely environment user portrait data; the environment information comprises weather information, date information and peripheral competitive product information at the time of the order transaction;

The work flows of the above modules correspond to S1 to S4 in the first embodiment one by one, and are not described herein again.

The method and the system fully mine the user portrait in the unmanned retail scenario and introduce the environment information as a contextual supplement to the user's historical purchasing behavior, so that repurchasing users can be accurately identified, which provides positive guidance for marketing strategies.

In the intelligent container repurchase prediction method and system based on user behavior and environmental information, user portrait modeling is carried out through complete feature extraction, feature screening and fusion of environmental context information, and the repurchase behavior of users of unmanned retail intelligent containers is predicted accurately by means of machine learning, so that the marketing strategies of container operators can be guided, users can be served accurately, and a truly intelligent, reliable and convenient unmanned retail service is achieved.

The above embodiments express only several implementations of the present invention, and although their description is specific and detailed, they are not to be understood as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of the present patent shall be subject to the appended claims.

Claims

1. An intelligent container repurchase prediction method based on user behavior and environmental information is characterized by comprising the following steps:

S2, feature construction and screening: converting the environment user portrait data generated in S1 into continuous or discrete numerical features, and concatenating them to obtain user feature vectors; expanding the user feature vectors through manually designed feature crossing rules, and outputting an expanded user feature vector set; screening the features in the expanded user feature vector set by calculating the feature importance, sorting in descending order, and deleting the features whose importance is lower than a certain threshold, to obtain a final user feature vector set;

2. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 1, wherein the S1 specifically comprises the following steps:

S101, extracting basic information and historical order information of all users from the online logs of an intelligent container;

S103, acquiring the population attributes of the users in the active user set U, and retrieving their historical order information through the user IDs; constructing user portrait data based on this information, and outputting it with the user ID as an index;

S104, crawling, by means of web crawler technology, the weather information, date information and peripheral competitive product information at the time of each historical order transaction; constructing order environment data based on this information, and outputting it with the order ID as an index;

3. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 2, wherein the population attributes comprise gender, age, nickname, and whether password-free payment is enabled;

the peripheral competitive product information comprises the number of nearby convenience stores and the prices of competitive products.

4. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 3, wherein the S102 specifically comprises the following steps:

5. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 3, wherein in S1021, the method for distinguishing new users from non-new users is as follows:

6. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 3, wherein in S1022, the method for distinguishing active users and static users from non-new users is:

7. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 3, further comprising S5, information pushing: and pushing new products or preferential information by the intelligent container according to the repurchase rate prediction data of the user.

8. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 3, wherein the S2 specifically comprises:

9. The intelligent container repurchase prediction method based on user behavior and environmental information according to claim 8, wherein the S3 specifically comprises:

10. An intelligent container repurchase prediction system based on user behavior and environmental information, characterized by comprising:

a feature construction and screening module: used for converting the environment user portrait data generated by the data collection and preprocessing module into continuous or discrete numerical features, and concatenating them to obtain user feature vectors; expanding the user feature vectors through manually designed feature crossing rules, and outputting an expanded user feature vector set; screening the features in the expanded user feature vector set by calculating the feature importance, sorting in descending order, and deleting the features whose importance is lower than a certain threshold, to obtain a final user feature vector set;

a model training and validation module: used for establishing a model, fitting the user repurchase behavior based on the features screened by the feature construction and screening module, and predicting the user repurchase probability after a certain time; performing model validation and parameter selection through cross-validation or the leave-one-out method to obtain a final model M;