CN109685537B - User behavior analysis method, device, medium and electronic equipment - Google Patents


Info

Publication number
CN109685537B
Authority
CN
China
Prior art keywords
user
behavior
brand
category
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710971008.6A
Other languages
Chinese (zh)
Other versions
CN109685537A (en)
Inventor
周默
吴劲平
李凯东
张燕锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710971008.6A
Publication of CN109685537A
Application granted
Publication of CN109685537B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The invention provides a user behavior analysis method, a user behavior analysis device, a user behavior analysis medium and electronic equipment. The user behavior analysis method comprises the following steps: acquiring behavior data of a user in a shopping process; determining user behavior characteristics, brand behavior characteristics related to the user, and associated behavior characteristics between the user and the brand based on the behavior data; determining sample data for analyzing user behaviors according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics; determining a positive sample and a negative sample in the sample data based on the actual shopping behavior of the user in a predetermined time period; and analyzing and predicting the user behavior according to the positive sample, the negative sample and a preset prediction model. According to the invention, different types of users associated with a certain brand can be quickly and accurately identified based on the analysis of behavior data of the users in the shopping process, so that accurate message pushing can be realized, and the purchase conversion rate of the users can be improved.

Description

User behavior analysis method, device, medium and electronic equipment
Technical Field
The invention relates to the technical field of data analysis, in particular to a user behavior analysis method, a user behavior analysis device, a user behavior analysis medium and electronic equipment.
Background
At present, internet e-commerce websites are developing rapidly, and both the number of commodities and the number of active users have grown to a massive scale. For a large brand promoting its commodities, it is difficult to accurately locate the target group, and a large number of sales promotion activities usually yield only a small purchase conversion rate.
In order to improve the purchase conversion rate, the purchasing behavior of a user with respect to a certain brand needs to be predicted. The current common practice is to collect, over a long period, user behaviors such as browsing commodities, adding them to the shopping cart, following them and placing orders, to work out which brands the user is interested in, and then to mark the user accordingly. For example, if a user frequently browses model B5 of brand A and adds it to the shopping cart, and the user is found to have previously purchased model B4 of brand A, the user may be marked as a high-quality user of brand A.
Although this practice comprehensively considers some key indexes of users during shopping, such as browsing, adding to the shopping cart, following and placing orders, in an environment of massive data these features still cannot accurately reflect the real intention of users and are not sufficiently discriminative.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present invention and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.
Disclosure of Invention
An object of the present invention is to provide a method, apparatus, medium, and electronic device for analyzing user behavior, thereby overcoming, at least to some extent, one or more of the problems due to the limitations and disadvantages of the related art.
Additional features and advantages of the invention will be set forth in the detailed description which follows, or may be learned by practice of the invention.
According to a first aspect of the present invention, there is provided a method for analyzing user behavior, including: acquiring behavior data of a user in a shopping process; determining user behavior characteristics, brand behavior characteristics related to the user, and associated behavior characteristics between the user and the brand based on the behavior data; determining sample data for analyzing user behaviors according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics; determining a positive sample and a negative sample in the sample data based on the actual shopping behavior of the user in a predetermined time period; and analyzing and predicting the user behavior according to the positive sample, the negative sample and a preset prediction model.
In some embodiments of the present invention, based on the foregoing scheme, determining the user behavior feature based on the behavior data includes: determining a behavior vector of each user in the shopping process and a time vector for representing the occurrence time of the behavior vector according to the behavior data; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain the user behavior characteristics.
In some embodiments of the present invention, based on the foregoing scheme, determining a brand behavior feature related to the user based on the behavior data includes: determining behavior vectors related to various brands and occurring in the shopping process of the user according to the behavior data, and time vectors used for representing the occurrence time of the behavior vectors; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain brand behavior characteristics related to the user.
In some embodiments of the present invention, based on the foregoing scheme, determining the associated behavior feature between the user and the brand based on the behavior data includes: determining behavior vectors which occur in the shopping process of each user and are related to each brand and time vectors used for representing the occurrence time of the behavior vectors according to the behavior data; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain associated behavior characteristics between the user and the brand.
In some embodiments of the present invention, based on the foregoing scheme, determining sample data for analyzing user behavior according to the user behavior features, the brand behavior features and the associated behavior features includes: dividing the user behavior features, the brand behavior features and the associated behavior features according to the sub-categories of the commodities to obtain the user behavior features, the brand behavior features and the associated behavior features under each sub-category; and determining the sample data for analyzing user behavior based on the user behavior features, the brand behavior features and the associated behavior features under each sub-category.
In some embodiments of the present invention, based on the foregoing scheme, determining sample data for analyzing user behavior based on the user behavior features, the brand behavior features and the associated behavior features under each sub-category includes: determining the users under each user type of a target brand according to the user behavior features, the brand behavior features and the associated behavior features under each sub-category; and obtaining the sample data according to the determined users under each user type of the target brand.
In some embodiments of the present invention, based on the foregoing scheme, determining the users under each user type of the target brand according to the user behavior features, the brand behavior features and the associated behavior features under each sub-category includes: selecting users related to the target category of the target brand as users of a first user type of the target brand; selecting users who are related to the target category but not to the target brand as users of a second user type of the target brand; selecting users who are related to an associated category of the target category and do not belong to the first user type or the second user type as users of a third user type of the target brand; and selecting users who are related to other brands under the target category and do not belong to the first user type, the second user type or the third user type as users of a fourth user type of the target brand.
In some embodiments of the present invention, based on the foregoing scheme, the method further includes: if users who browse a first category also browse a second category, the second category ranks in the top N by browsing frequency among the categories browsed by users who browse the first category, and the first category ranks in the top N by browsing frequency among the categories browsed by users who browse the second category, determining that the first category and the second category are associated categories.
In some embodiments of the present invention, based on the foregoing scheme, determining the positive sample and the negative sample in the sample data based on the actual shopping behavior of the user within a predetermined time period includes: taking the data of users in the sample data who placed an order within the predetermined time period as the positive sample, and taking the user data in the sample data other than the positive sample as the negative sample.
In some embodiments of the present invention, based on the foregoing solution, the performing an analysis prediction on the behavior of the user according to the positive sample, the negative sample and a predetermined prediction model includes: inputting the positive samples and the negative samples into the predetermined prediction model, and constructing a PR curve based on an output result of the predetermined prediction model; and analyzing and predicting the behavior of the user based on the PR curve.
According to a second aspect of the present invention, there is provided an analysis apparatus for user behavior, comprising: the acquisition unit is used for acquiring behavior data of a user in a shopping process; a first determination unit, configured to determine, based on the behavior data, a user behavior feature, a brand behavior feature related to the user, and an associated behavior feature between the user and the brand; the second determining unit is used for determining sample data for analyzing the user behavior according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics; a third determining unit, configured to determine a positive sample and a negative sample in the sample data based on an actual shopping behavior of a user within a predetermined time period; and the processing unit is used for analyzing and predicting the user behavior according to the positive sample, the negative sample and a preset prediction model.
According to a third aspect of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method of analyzing user behavior as described in the first aspect of the embodiments above.
According to a fourth aspect of the present invention, there is provided an electronic apparatus comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of analyzing user behavior as described in the first aspect of the embodiments above.
In the technical solutions provided by some embodiments of the present invention, user behavior features, brand behavior features related to the user, and associated behavior features between the user and the brand are determined; sample data for analyzing user behavior are determined according to these features; positive and negative samples in the sample data are then determined based on the actual shopping behavior of the user within a predetermined time period; and user behavior is analyzed and predicted according to the positive samples, the negative samples and a predetermined prediction model. In this way, different types of users associated with a given brand can be quickly and accurately identified from the behavior data of users in the shopping process, so that accurate message pushing can be achieved and the purchase conversion rate of users improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 schematically shows a flow chart of a method of analyzing user behavior according to a first embodiment of the invention;
FIG. 2 schematically shows a flow chart of a method of analyzing user behavior according to a second embodiment of the invention;
FIG. 3 schematically illustrates an overall architecture diagram of an analysis system of user behavior according to an embodiment of the invention;
FIG. 4 schematically shows a block diagram of an apparatus for analyzing user behavior according to an embodiment of the present invention;
FIG. 5 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device to implement an embodiment of the invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations or operations have not been shown or described in detail to avoid obscuring aspects of the invention.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 schematically shows a flow chart of a method of analyzing user behavior according to a first embodiment of the present invention.
Referring to fig. 1, a method for analyzing user behavior according to a first embodiment of the present invention includes:
Step S10, acquiring behavior data of the user in the shopping process.
It should be noted that, in the embodiment of the present invention, the behavior data of the user in the shopping process includes order detail data, product browsing data, data on adding products to and deleting products from the shopping cart, data on following/favoriting products, user data, related product data, and the like.
Step S12, determining user behavior characteristics, brand behavior characteristics related to the user, and associated behavior characteristics between the user and the brand based on the behavior data.
According to an exemplary embodiment of the present invention, the step of determining the user behavior characteristics in step S12 includes: determining a behavior vector of each user in the shopping process and a time vector for representing the occurrence time of the behavior vector according to the behavior data; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain the user behavior characteristics.
According to an exemplary embodiment of the present invention, the step of determining the brand behavior feature related to the user in step S12 includes: determining behavior vectors related to various brands and occurring in the shopping process of the user according to the behavior data, and time vectors used for representing the occurrence time of the behavior vectors; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain brand behavior characteristics related to the user.
According to an exemplary embodiment of the present invention, the step of determining the associated behavioral characteristics between the user and the brand in step S12 includes: determining behavior vectors which occur in the shopping process of each user and are related to each brand and time vectors used for representing the occurrence time of the behavior vectors according to the behavior data; and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain associated behavior characteristics between the user and the brand.
It should be noted that the behavior vector may include purchasing, browsing, following, unfollowing, adding to the shopping cart, deleting from the shopping cart, and the like. The time vector may include the last 1 day, the last 2 days, the last 3 days, the last 5 days, the last 7 days, the last 15 days, the last 30 days, the last half year, and the like. The statistical analysis processing of the operation result may be, for example, averaging or totaling the operation result.
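As a minimal sketch of this step, the following Python reads the "Cartesian orthogonal operation" as pairing every behavior with every time range and then computing a total and a daily average for each pair. The behavior labels, the 180-day approximation of half a year, the window boundaries and the function name `window_features` are all assumptions made for illustration; they are not taken from the patent.

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical labels for the six behaviors named in the text.
BEHAVIORS = ["purchase", "browse", "follow", "unfollow", "cart_add", "cart_remove"]
# Time ranges in days; "half a year" is approximated as 180 days (assumption).
WINDOWS = [1, 2, 3, 5, 7, 15, 30, 180]


def window_features(events, today):
    """Compute totals and daily averages for every (behavior, window) pair.

    events: list of (behavior, event_date) tuples for one entity.
    Returns a dict keyed by (behavior, window_days, statistic).
    """
    features = {}
    for behavior, days in product(BEHAVIORS, WINDOWS):
        start = today - timedelta(days=days)
        # Assumed convention: "last N days" means start < event_date <= today.
        total = sum(1 for b, d in events if b == behavior and start < d <= today)
        features[(behavior, days, "total")] = total
        features[(behavior, days, "daily_avg")] = total / days
    return features


today = date(2017, 10, 17)
events = [
    ("browse", date(2017, 10, 17)),
    ("browse", date(2017, 10, 15)),
    ("cart_add", date(2017, 10, 15)),
    ("purchase", date(2017, 9, 30)),
]
features = window_features(events, today)
```

With six behaviors, eight windows and two statistics, the dictionary holds 96 entries per entity, which matches the feature count given later in the description.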
Step S14, determining sample data for analyzing the user behavior according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics.
According to an exemplary embodiment of the present invention, step S14 includes: dividing the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics according to the sub-categories of the commodities to obtain the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics under each sub-category; and determining sample data for analyzing the user behavior based on the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics under each sub-category.
It should be noted that the sub-categories of the commodities can be divided into secondary categories and tertiary categories, for example, the primary categories of the commodities are household appliances, the secondary categories can be large household appliances, small household appliances and the like, and the tertiary categories of the commodities under the large household appliances can be refrigerators and the like. According to the scheme of the embodiment, the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics are divided according to the subdivided categories of the commodities, so that the behavior characteristics corresponding to the categories can be further analyzed.
In some embodiments of the present invention, based on the foregoing scheme, determining sample data for analyzing user behavior based on the user behavior features, the brand behavior features and the associated behavior features under each sub-category includes: determining the users under each user type of a target brand according to the user behavior features, the brand behavior features and the associated behavior features under each sub-category; and obtaining the sample data according to the determined users under each user type of the target brand.
In some embodiments of the present invention, based on the foregoing scheme, determining the users under each user type of the target brand according to the user behavior features, the brand behavior features and the associated behavior features under each sub-category includes: selecting users related to the target category of the target brand as users of a first user type of the target brand; selecting users who are related to the target category but not to the target brand as users of a second user type of the target brand; selecting users who are related to an associated category of the target category and do not belong to the first user type or the second user type as users of a third user type of the target brand; and selecting users who are related to other brands under the target category and do not belong to the first user type, the second user type or the third user type as users of a fourth user type of the target brand.
It should be noted that: if users who browse a first category also browse a second category, the second category ranks in the top N by browsing frequency among the categories browsed by users who browse the first category, and the first category ranks in the top N by browsing frequency among the categories browsed by users who browse the second category, the first category and the second category are determined to be associated categories. Specifically, for example, if category B is in the top 10 by browsing frequency among the categories browsed by users who browse category A, and category A is in the top 10 by browsing frequency among the categories browsed by users who browse category B, then category A and category B are associated categories.
Step S16, based on the actual shopping behavior of the user in the preset time period, determining the positive sample and the negative sample in the sample data.
According to an exemplary embodiment of the present invention, step S16 includes: taking the data of users in the sample data who placed an order within the predetermined time period as the positive sample, and taking the user data in the sample data other than the positive sample as the negative sample.
Step S18, analyzing and predicting the user behavior according to the positive sample, the negative sample and a predetermined prediction model.
According to an exemplary embodiment of the present invention, the analyzing and predicting the user's behavior according to the positive sample, the negative sample and a predetermined prediction model comprises: inputting the positive sample and the negative sample into the predetermined prediction model, and constructing a PR (Precision Recall) curve based on an output result of the predetermined prediction model; and analyzing and predicting the behavior of the user based on the PR curve.
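To illustrate what constructing a PR curve from model output involves, here is a small pure-Python sketch. It assumes the model emits one score per sample and that a sample is predicted positive when its score is at or above a threshold; the function name `pr_curve` and the example scores are hypothetical, and the patent does not specify how the curve is computed or used.

```python
def pr_curve(labels, scores):
    """Return (threshold, precision, recall) points, one per distinct score.

    labels: 1 for a positive sample, 0 for a negative sample.
    scores: model outputs, higher meaning more likely positive.
    Assumes at least one positive label is present.
    """
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample that shares this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((score, tp / (tp + fp), tp / total_pos))
    return points


labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
curve = pr_curve(labels, scores)
```

Each point shows the trade-off at one threshold: lowering the threshold reaches more of the users who actually ordered (higher recall) at the cost of pushing messages to more users who did not (lower precision).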
It should be noted that the predetermined prediction model may be a GBDT (Gradient Boosting Decision Tree) model or a random forest model.
The technical solution of the embodiment of the invention uses feature engineering and machine learning to identify, from massive user behavior data, the various types of users associated with a certain brand, thereby facilitating accurate message pushing and improving the purchase conversion rate of users.
The technical solution of the embodiment of the present invention is explained in detail by specific examples as follows:
As shown in FIG. 2, the method for analyzing user behavior according to the second embodiment of the present invention includes the following steps: data cleaning, feature selection, sample marking, algorithm training and crowd screening. The flows shown in FIG. 2 are described in detail below:
Data cleaning
First, the required basic data are selected, comprising 6 types of underlying data tables: orders, browsing, shopping carts, follows, commodities and users. Data cleaning is performed on each underlying data table, the valid fields are screened out, and independent intermediate tables are established. The fields contained in the intermediate tables are shown in Table 1:
[Table 1 is provided as an image in the original publication and is not reproduced here.]
TABLE 1
Feature selection
Based on the basic data shown in table 1, feature extraction is performed for 3 dimensions of brands, users, and user-associated brands, respectively.
A. Brand features
Six behaviors (purchasing, browsing, following, unfollowing, adding to the shopping cart, and removing from the shopping cart) are selected as the user behavior vector, and 8 time ranges (the last 1 day, 2 days, 3 days, 5 days, 7 days, 15 days, 1 month, and half a year) are selected as the time vector. The two vectors are combined by Cartesian orthogonal combination, and two indexes, the total number and the average number, are calculated for each combination, giving 6 × 8 × 2 = 96 brand-oriented features. For example, the resulting brand features may be: the total number of purchases of brand A in the last 3 days, and the average number of times per day that brand B was browsed in the last half year. Expressed in vector form as follows:
[Vector expression provided as an image in the original publication.]
B. User features
Similar to the selection of brand features, the 96 features expressed by the above vector can be calculated for each user. For example, the obtained user features may be: the total number of times user A followed commodities in the last 7 days; and the average number of purchases per day by user B in the last 1 month.
C. User-associated brand features
Similarly, considering the user and the brand jointly, the 96 features expressed by the above vector are calculated for each pair of user and brand. For example, the obtained user-associated brand features may be: the total number of times user A followed brand B in the last 3 days, and user C's browsing of brand D in the last 7 days.
In the embodiment of the invention, the purpose of separating the user, the brand and the user-associated brand to construct different feature sets is to describe not only the relationship between the user and the brand but also the respective characteristics of the user and the brand, thereby greatly increasing the distinguishing degree.
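The three feature sets differ only in the key the events are grouped by: the brand, the user, or the (user, brand) pair. The sketch below illustrates that idea for a single behavior and time range; the event log layout, the identifiers and the helper `count_by` are hypothetical and not part of the patent.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical event log: (user, brand, behavior, event_date).
EVENTS = [
    ("user_a", "brand_b", "follow", date(2017, 10, 16)),
    ("user_a", "brand_b", "follow", date(2017, 10, 15)),
    ("user_a", "brand_d", "browse", date(2017, 10, 16)),
    ("user_c", "brand_d", "browse", date(2017, 10, 12)),
    ("user_c", "brand_d", "browse", date(2017, 10, 1)),
]


def count_by(events, key, behavior, days, today):
    """Total occurrences of `behavior` in the last `days` days, grouped by key."""
    start = today - timedelta(days=days)
    counts = Counter()
    for user, brand, b, d in events:
        if b == behavior and start < d <= today:
            counts[key(user, brand)] += 1
    return counts


today = date(2017, 10, 17)
# Brand dimension: how often each brand was browsed in the last 7 days.
brand_browse_7d = count_by(EVENTS, lambda u, b: b, "browse", 7, today)
# User dimension: how often each user followed anything in the last 3 days.
user_follow_3d = count_by(EVENTS, lambda u, b: u, "follow", 3, today)
# User-brand dimension: how often each user followed each brand in the last 3 days.
pair_follow_3d = count_by(EVENTS, lambda u, b: (u, b), "follow", 3, today)
```

Keeping the grouping key as a parameter means the same behavior and time-range definitions can be reused across all three dimensions.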
After the above-described features are obtained, some meaningless or difficult-to-acquire feature combinations may be removed, eventually leaving valid features. And then, splitting treatment is carried out respectively for the secondary class and the tertiary class (in the embodiment of the invention, the primary class is not treated mainly because the subclasses of the primary class are very different, the interference among the classes is very obvious and is not beneficial to being used as the characteristic), and the user characteristic of the secondary class, the brand characteristic of the secondary class, the associated brand characteristic of the user of the secondary class, the user characteristic of the tertiary class, the brand characteristic of the tertiary class and the associated brand characteristic of the user of the tertiary class are obtained. Expressed in vector form as follows:
Figure BDA0001437507550000111
sample marking
In the embodiment of the invention, core users, intended users, potential users and competing users of the target brand can be distinguished according to the features. Specifically, the obtained series of features are combined by association, and the following divisions are defined:
A. Core users: users related to the target category and the target brand.
Described in SQL as:
select pin from t where name = 'target class' and brand = 'target brand'
B. Intended users: users related to the target category but excluding the target brand.
Described in SQL as:
select pin from t where name = 'target class' and pin not in Set(A)
C. Potential users: users related to the related categories, excluding core users and intended users.
Described in SQL as:
select pin from t where name in ('related class 1', 'related class 2') and pin not in Set(A) and pin not in Set(B)
D. Competing users: users related to the target category and other brands, excluding all users already screened above.
Described in SQL as:
select pin from t where name = 'target class' and brand <> 'target brand' and pin not in Set(A) and pin not in Set(B) and pin not in Set(C)
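The four divisions above can be sketched with plain set operations. This is only an illustration under stated assumptions: the rows of table `t` are modeled as hypothetical (pin, category, brand) tuples, and the `segment` function and sample data are not part of the source.

```python
# Hypothetical rows of table t: (pin, category, brand). All values are made up.
ROWS = [
    ("u1", "beer", "X"),
    ("u2", "beer", "Y"),
    ("u3", "diaper", "Z"),
    ("u4", "beer", "Y"),
    ("u4", "beer", "X"),
    ("u5", "snack", "W"),
]

def segment(rows, target_cat, target_brand, related_cats):
    """Return (core, intended, potential, competing) pin sets, mirroring the four queries."""
    core = {p for p, c, b in rows if c == target_cat and b == target_brand}
    intended = {p for p, c, b in rows if c == target_cat} - core
    potential = {p for p, c, b in rows if c in related_cats} - core - intended
    competing = (
        {p for p, c, b in rows if c == target_cat and b != target_brand}
        - core - intended - potential
    )
    return core, intended, potential, competing

core, intended, potential, competing = segment(ROWS, "beer", "X", {"diaper"})
```

Read literally, the queries leave the fourth set empty in this toy data, because every target-category user outside set A already falls into set B; the source does not state how set B is narrowed in practice.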
The related categories are calculated as follows: if category B is among the top 10 categories by browsing frequency for the users who browsed category A, and category A is among the top 10 categories by browsing frequency for the users who browsed category B, then category A and category B are related categories. For example, if diapers are among the top 10 categories by browsing frequency for users who browsed beer, and beer is among the top 10 categories by browsing frequency for users who browsed diapers, then beer and diapers are related categories.
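The mutual top-N rule can be sketched as follows. The sketch assumes a hypothetical browse log of (user, category) events and measures browsing frequency as a raw event count; the source does not define the exact frequency measure, so that choice is an assumption.

```python
from collections import Counter, defaultdict

def related_categories(browse_log, top_n=10):
    """Return pairs of categories that appear in each other's top-N co-browsed lists."""
    by_user = defaultdict(Counter)
    for user, cat in browse_log:
        by_user[user][cat] += 1
    # co[a][b]: how often category b was browsed by users who also browsed a
    co = defaultdict(Counter)
    for counts in by_user.values():
        for a in counts:
            for b, n in counts.items():
                if a != b:
                    co[a][b] += n
    top = {a: {b for b, _ in c.most_common(top_n)} for a, c in co.items()}
    return {
        frozenset((a, b))
        for a in top
        for b in top[a]
        if a in top.get(b, set())
    }

# Toy log; top_n=1 keeps the example small.
LOG = [
    ("u1", "beer"), ("u1", "diaper"), ("u1", "diaper"),
    ("u2", "beer"), ("u2", "diaper"), ("u2", "snack"),
]
related = related_categories(LOG, top_n=1)
```

Requiring the relation to hold in both directions filters out categories that are merely popular with everyone, since such a category appears in many top-N lists without the reverse being true.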
Algorithm training
Taking 7 days as a sliding window, the four types of users are screened at time T using the above user division method, and their purchasing behavior is tracked up to time T+7. Users who actually placed an order are marked as positive samples, and users who were wrongly classified or did not place an order are marked as negative samples. Because negative samples far outnumber positive samples, embodiments of the present invention randomly select negative samples amounting to 6 times the number of positive samples.
The positive and negative samples are input into a GBDT model and a random forest model for training, and the outputs are fused by weighted averaging in a 3:7 ratio to obtain a binary classification model for each of the four target populations, together with a probability value P of correct classification.
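The labeling and negative down-sampling step can be sketched as below. The function name, the fixed random seed and the toy data are assumptions for illustration; only the 7-day tracking window and the 6:1 negative-to-positive ratio come from the text.

```python
import random

def build_samples(screened_users, buyers_within_window, neg_ratio=6, seed=0):
    """Label users screened at time T by whether they ordered by T+7, then down-sample negatives."""
    positives = sorted(u for u in screened_users if u in buyers_within_window)
    negatives = sorted(u for u in screened_users if u not in buyers_within_window)
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    k = min(len(negatives), neg_ratio * len(positives))
    sampled = rng.sample(negatives, k)
    return [(u, 1) for u in positives] + [(u, 0) for u in sampled]

# 100 screened users, of whom two placed an order within the window.
screened = {f"u{i}" for i in range(100)}
buyers = {"u1", "u2", "u999"}  # u999 was never screened, so it is ignored
samples = build_samples(screened, buyers)
```

Capping the negatives at a fixed multiple of the positives keeps the training set from being dominated by non-buyers.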
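The 3:7 fusion can be sketched as a weighted average of the two models' predicted probabilities. The sketch assumes the weight 3 applies to the GBDT output and 7 to the random forest output, following the order in which they are named; the text does not state this explicitly. The `fuse` helper is hypothetical, and the per-user probabilities are assumed to come from two already trained classifiers.

```python
def fuse(p_gbdt, p_rf, w_gbdt=3, w_rf=7):
    """Weighted average of two lists of predicted purchase probabilities."""
    if len(p_gbdt) != len(p_rf):
        raise ValueError("both models must score the same users")
    total = w_gbdt + w_rf
    return [(w_gbdt * a + w_rf * b) / total for a, b in zip(p_gbdt, p_rf)]

# Toy scores for three users from each model.
p = fuse([0.10, 0.80, 0.50], [0.20, 0.60, 0.50])
```

One such fused model would be produced for each of the four user populations.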
Population screening
A corresponding PR (precision-recall) curve is constructed based on the probability value P output by the model, and a reasonable adjustment parameter is chosen according to the desired size of the screened population. For example, when P is 0.005 the population is 100,000, and when P is 0.01 the population becomes 30,000. By selecting different values of P, the size of the population can be effectively controlled while ensuring that users with a high purchase probability are always identified.
Based on the technical solution of the above embodiment, fig. 3 shows the overall architecture of a system for analyzing user behavior according to an embodiment of the present invention.
Referring to fig. 3, the data source provides the underlying data and may be connected to the big data mart of an e-commerce platform. The offline data engine is responsible for data extraction, task scheduling (such as timing management, task dependency management and variable management), alarm monitoring and the like. The task scheduling module organizes the processing flow of the above embodiment of the present invention as a workflow; it supports timed triggering and execution in dependency order, increases the flexibility of the system through variable assignment, and provides corresponding retry and alarm mechanisms when any step fails.
The algorithm calculation engine may be based on a Spark cluster and encapsulates algorithm packages such as GBDT and random forest. It reads the data provided by the offline data engine through the data reading and writing module, generates a PR curve after processing with the GBDT and random forest algorithms, and performs analysis based on the PR curve. The algorithm calculation engine may also include a parameter configuration module for configuring each parameter of the algorithm.
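How the value of P trades population size against precision and recall can be sketched as follows. The functions `pr_points` and `pick_threshold`, the candidate thresholds and the toy scores are assumptions for illustration, not the procedure of the embodiment.

```python
def pr_points(scores, labels, thresholds):
    """For each threshold, return (threshold, population size, precision, recall)."""
    total_pos = sum(labels)
    points = []
    for t in thresholds:
        picked = [y for s, y in zip(scores, labels) if s >= t]
        tp = sum(picked)
        precision = tp / len(picked) if picked else 1.0
        recall = tp / total_pos if total_pos else 0.0
        points.append((t, len(picked), precision, recall))
    return points

def pick_threshold(points, max_size):
    """Lowest threshold whose screened population does not exceed max_size."""
    ok = [p for p in points if p[1] <= max_size]
    return min(ok, key=lambda p: p[0]) if ok else None

scores = [0.9, 0.8, 0.4, 0.2, 0.1]   # fused probabilities P for five users
labels = [1, 0, 1, 0, 0]             # 1 = actually placed an order
points = pr_points(scores, labels, thresholds=[0.1, 0.3, 0.5])
choice = pick_threshold(points, max_size=3)
```

Raising the threshold shrinks the population, as in the 100,000 to 30,000 example in the text, at the cost of recall.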
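The dependency-ordered execution with retry and alarm can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The task names, the `run_workflow` function and the alert callback are hypothetical, and timed triggering is not modeled here.

```python
from graphlib import TopologicalSorter

def run_workflow(tasks, deps, variables, max_retries=2, alert=print):
    """Run tasks in dependency order; retry failures and alert when retries run out.

    tasks: name -> callable(variables); deps: name -> set of prerequisite names.
    Returns the list of task names that completed.
    """
    done = []
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                tasks[name](variables)
                done.append(name)
                break
            except Exception as exc:
                if attempt > max_retries:
                    alert(f"task {name} failed after {attempt} attempts: {exc}")
                    return done  # stop: downstream steps depend on this one
    return done

# Hypothetical three-step flow: extract data, build features, train models.
DEPS = {"extract": set(), "features": {"extract"}, "train": {"features"}}
```

Passing `variables` to every task corresponds to the variable assignment that the text credits with making the system flexible.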
Fig. 4 schematically shows a block diagram of an apparatus for analyzing user behavior according to an embodiment of the present invention.
Referring to fig. 4, an apparatus 400 for analyzing user behavior according to an embodiment of the present invention includes: an acquisition unit 402, a first determination unit 404, a second determination unit 406, a third determination unit 408 and a processing unit 410.
Specifically, the acquisition unit 402 is configured to acquire behavior data of the user during shopping; the first determining unit 404 is configured to determine a user behavior feature, a brand behavior feature related to the user, and an associated behavior feature between the user and the brand based on the behavior data; the second determining unit 406 is configured to determine sample data for analyzing user behavior according to the user behavior feature, the brand behavior feature, and the associated behavior feature; the third determining unit 408 is configured to determine a positive sample and a negative sample in the sample data based on the actual shopping behavior of the user within a predetermined time period; the processing unit 410 is configured to analyze and predict the behavior of the user according to the positive sample, the negative sample and a predetermined prediction model.
It should be noted that the specific details of each module/unit included in the user behavior analysis apparatus 400 are already described in detail in the corresponding user behavior analysis method, and therefore are not described herein again.
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use in implementing an electronic device of an embodiment of the present invention. The computer system 500 of the electronic device shown in fig. 5 is only an example, and should not bring any limitation to the function and the scope of the use of the embodiments of the present invention.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU)501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for system operation are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 501.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be disposed in a processor. The names of the units do not, in some cases, constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method for analyzing user behavior as described in the above embodiments.
For example, the electronic device may implement the following as shown in fig. 1: step S10, behavior data of the user in the shopping process is obtained; step S12, determining user behavior characteristics, brand behavior characteristics related to the user and association behavior characteristics between the user and the brand based on the behavior data; step S14, determining sample data for analyzing user behavior according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics; step S16, determining a positive sample and a negative sample in the sample data based on the actual shopping behavior of the user in a preset time period; and step S18, analyzing and predicting the user behavior according to the positive sample, the negative sample and a preset prediction model.
It should be noted that although several modules or units of the device for performing actions are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiment of the present invention.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (11)

1. A method for analyzing user behavior, comprising:
acquiring behavior data of a user in a shopping process;
determining user behavior characteristics, brand behavior characteristics related to the user, and associated behavior characteristics between the user and the brand based on the behavior data;
determining sample data for analyzing user behavior according to the user behavior feature, the brand behavior feature and the associated behavior feature, including:
selecting users related to a target category of a target brand as users under a first user type of the target brand; selecting users who are related to the target category, excluding the target brand, as users under a second user type of the target brand; selecting users who are related to an associated category of the target category and do not belong to the first user type and the second user type as users under a third user type of the target brand; selecting users who are related to other brands under the target category and do not belong to the first user type, the second user type and the third user type as users under a fourth user type of the target brand;
if a user browsing a first category also browses a second category, and the second category is among the top N categories by browsing frequency in the categories browsed by users browsing the first category, and the first category is among the top N categories by browsing frequency in the categories browsed by users browsing the second category, determining that the first category and the second category are related categories;
separating the user, the brand and the user-associated brand to construct different feature sets;
determining a positive sample and a negative sample in the sample data based on the actual shopping behavior of the user in a predetermined time period;
analyzing and predicting the behavior of the user according to the positive sample, the negative sample and a predetermined prediction model, including:
marking users who actually place orders as the positive samples, and marking users who are wrongly classified or do not place orders as the negative samples;
inputting the positive sample and the negative sample into a training model to obtain a probability value of correct classification;
and constructing a corresponding PR curve based on the probability value output by the training model, and determining an adjustment parameter according to the size of the screened population.
2. The method of claim 1, wherein determining user behavior characteristics based on the behavior data comprises:
determining a behavior vector of each user in the shopping process and a time vector for representing the occurrence time of the behavior vector according to the behavior data;
and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain the user behavior characteristics.
3. The method of analyzing user behavior according to claim 1, wherein determining brand behavior characteristics related to the user based on the behavior data comprises:
determining behavior vectors related to various brands and occurring in the shopping process of the user according to the behavior data, and time vectors used for representing the occurrence time of the behavior vectors;
and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain brand behavior characteristics related to the user.
4. The method for analyzing user behavior according to claim 1, wherein determining the associated behavior characteristics between the user and the brand based on the behavior data comprises:
determining behavior vectors which occur in the shopping process of each user and are related to each brand and time vectors used for representing the occurrence time of the behavior vectors according to the behavior data;
and carrying out Cartesian orthogonal operation on the behavior vector and the time vector, and carrying out statistical analysis processing on an operation result to obtain associated behavior characteristics between the user and the brand.
5. The method according to claim 1, wherein determining sample data for analyzing user behavior according to the user behavior feature, the brand behavior feature and the associated behavior feature comprises:
dividing the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics according to the sub-categories of the commodities to obtain the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics under each sub-category;
and determining sample data for analyzing the user behaviors based on the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics under each sub-category.
6. The method for analyzing user behavior according to claim 5, wherein determining sample data for analyzing user behavior based on the user behavior features, brand behavior features and associated behavior features under each sub-category comprises:
determining users under each user type of the target brand according to the user behavior characteristics, the brand behavior characteristics and the associated behavior characteristics under each sub-category;
and obtaining the sample data according to the determined users of the target brand under each user type.
7. The method according to any one of claims 1 to 6, wherein determining the positive and negative examples in the sample data based on the actual shopping behavior of the user within a predetermined time period comprises:
and taking user data ordered within the preset time period in the sample data as the positive sample, and taking user data except the positive sample in the sample data as the negative sample.
8. The method for analyzing user's behavior according to any one of claims 1 to 6, wherein the analyzing and predicting the user's behavior according to the positive sample, the negative sample and a predetermined prediction model comprises:
inputting the positive samples and the negative samples into the predetermined prediction model, and constructing a PR curve based on an output result of the predetermined prediction model;
and analyzing and predicting the behavior of the user based on the PR curve.
9. An apparatus for analyzing a user behavior, comprising:
an acquisition unit, configured to acquire behavior data of a user in a shopping process;
a first determination unit, configured to determine, based on the behavior data, a user behavior feature, a brand behavior feature related to the user, and an associated behavior feature between the user and the brand;
a second determining unit, configured to determine, according to the user behavior feature, the brand behavior feature, and the associated behavior feature, sample data for analyzing a user behavior, including:
selecting users related to a target category of a target brand as users under a first user type of the target brand; selecting users who are related to the target category, excluding the target brand, as users under a second user type of the target brand; selecting users who are related to an associated category of the target category and do not belong to the first user type and the second user type as users under a third user type of the target brand; selecting users who are related to other brands under the target category and do not belong to the first user type, the second user type and the third user type as users under a fourth user type of the target brand;
if a user browsing a first category also browses a second category, and the second category is among the top N categories by browsing frequency in the categories browsed by users browsing the first category, and the first category is among the top N categories by browsing frequency in the categories browsed by users browsing the second category, determining that the first category and the second category are related categories;
separating the user, the brand and the user-associated brand to construct different feature sets;
a third determining unit, configured to determine a positive sample and a negative sample in the sample data based on an actual shopping behavior of a user within a predetermined time period;
a processing unit, configured to analyze and predict the user behavior according to the positive sample, the negative sample and a predetermined prediction model, including:
marking users who actually place orders as the positive samples, and marking users who are wrongly classified or do not place orders as the negative samples;
inputting the positive sample and the negative sample into a training model to obtain a probability value of correct classification;
and constructing a corresponding PR curve based on the probability value output by the training model, and determining an adjustment parameter according to the size of the screened population.
10. A computer-readable medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out a method of analyzing a user behavior according to any one of claims 1 to 8.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement a method of analyzing user behavior according to any one of claims 1 to 8.
CN201710971008.6A 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment Active CN109685537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710971008.6A CN109685537B (en) 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710971008.6A CN109685537B (en) 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109685537A CN109685537A (en) 2019-04-26
CN109685537B true CN109685537B (en) 2021-02-26

Family

ID=66183948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710971008.6A Active CN109685537B (en) 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109685537B (en)

Also Published As

Publication number Publication date
CN109685537A (en) 2019-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant