CN109685537A - User behavior analysis method, apparatus, medium, and electronic device

Info

Publication number
CN109685537A
Authority
CN
China
Prior art keywords
user
behavior
brand
category
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710971008.6A
Other languages
Chinese (zh)
Other versions
CN109685537B (en)
Inventor
周默
吴劲平
李凯东
张燕锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710971008.6A (CN109685537B)
Publication of CN109685537A
Application granted
Publication of CN109685537B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a user behavior analysis method, apparatus, medium, and electronic device. The method includes: obtaining behavior data of users during shopping; determining, based on the behavior data, user behavior features, brand behavior features related to users, and association behavior features between users and brands; determining sample data for analyzing user behavior according to the user behavior features, the brand behavior features, and the association behavior features; determining positive samples and negative samples in the sample data based on users' actual shopping behavior within a predetermined time period; and analyzing and predicting user behavior according to the positive samples, the negative samples, and a predetermined prediction model. By analyzing users' behavior data during shopping, the invention can quickly and accurately identify the different types of users associated with a given brand, which enables accurate message pushing and thereby improves users' purchase conversion rate.

Description

User behavior analysis method, apparatus, medium, and electronic device
Technical field
The present invention relates to the field of data analysis technology, and in particular to a user behavior analysis method, apparatus, medium, and electronic device.
Background
Internet e-commerce websites are currently developing rapidly, and both the number of products and the number of active users have grown to a massive scale. For brand owners, it is difficult to precisely target the intended audience when promoting their products, and a large volume of advertising campaigns often yields only a very small purchase conversion rate.
To improve the purchase conversion rate, users' purchasing behavior toward a given brand needs to be predicted. The common practice at present is to collect users' behaviors over a long period, such as browsing products, adding them to the shopping cart, following them, and placing orders, to analyze which brands each user is interested in, and then to label the user. For example, if a user frequently views the model B5 product of brand A and has added it to the shopping cart, and the user is also found to have previously bought the model B4 product of brand A, the user can be labeled a high-quality user of brand A.
Although this approach takes into account some key indicators of a user's shopping, such as browsing, adding to cart, following, and placing orders, with massive data these features still cannot accurately reflect the user's true intent and do not discriminate well enough. Predictions made this way often select an overly large target group. In addition, the thresholds for some rules are hard to define, for example how many views make a potential user or how many orders make a core customer; these tend to rely on the subjective experience of business staff.
It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present invention is to provide a user behavior analysis method, apparatus, medium, and electronic device, thereby overcoming, at least to some extent, one or more problems caused by the limitations and defects of the related art.
Other features and advantages of the invention will become apparent from the following detailed description, or may be learned in part through practice of the invention.
According to a first aspect of the invention, a user behavior analysis method is provided, comprising: obtaining behavior data of users during shopping; determining, based on the behavior data, user behavior features, brand behavior features related to users, and association behavior features between users and brands; determining sample data for analyzing user behavior according to the user behavior features, the brand behavior features, and the association behavior features; determining positive samples and negative samples in the sample data based on users' actual shopping behavior within a predetermined time period; and analyzing and predicting user behavior according to the positive samples, the negative samples, and a predetermined prediction model.
In some embodiments of the invention, based on the foregoing scheme, determining the user behavior features based on the behavior data includes: determining, from the behavior data, a behavior vector of each user during shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the user behavior features.
In some embodiments of the invention, based on the foregoing scheme, determining the brand behavior features related to users based on the behavior data includes: determining, from the behavior data, a behavior vector related to each brand that occurs during users' shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the brand behavior features related to users.
In some embodiments of the invention, based on the foregoing scheme, determining the association behavior features between users and brands based on the behavior data includes: determining, from the behavior data, a behavior vector related to each brand that occurs during each user's shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the association behavior features between users and brands.
In some embodiments of the invention, based on the foregoing scheme, determining the sample data for analyzing user behavior according to the user behavior features, the brand behavior features, and the association behavior features includes: dividing the user behavior features, the brand behavior features, and the association behavior features according to the subdivided categories of products, to obtain the user behavior features, brand behavior features, and association behavior features under each subdivided category; and determining the sample data for analyzing user behavior based on the user behavior features, brand behavior features, and association behavior features under each subdivided category.
In some embodiments of the invention, based on the foregoing scheme, determining the sample data for analyzing user behavior based on the user behavior features, brand behavior features, and association behavior features under each subdivided category includes: determining the users under each user type of a target brand according to the user behavior features, brand behavior features, and association behavior features under each subdivided category; and obtaining the sample data from the users so determined under each user type of the target brand.
In some embodiments of the invention, based on the foregoing scheme, determining the users under each user type of the target brand according to the user behavior features, brand behavior features, and association behavior features under each subdivided category includes: selecting users related to the target category of the target brand as users of a first user type of the target brand; selecting users related to the target category but excluding the target brand as users of a second user type of the target brand; selecting users who are related to categories associated with the target category and who do not belong to the first or second user type as users of a third user type of the target brand; and selecting users who are related to other brands under the target category and who do not belong to the first, second, or third user type as users of a fourth user type of the target brand.
In some embodiments of the invention, based on the foregoing scheme, the method further includes: if users who browse a first category also browse a second category, the second category ranks in the top N by browsing frequency among the categories browsed by users who browse the first category, and the first category ranks in the top N by browsing frequency among the categories browsed by users who browse the second category, determining that the first category and the second category are associated categories.
In some embodiments of the invention, based on the foregoing scheme, determining the positive samples and negative samples in the sample data based on users' actual shopping behavior within the predetermined time period includes: taking the data of users in the sample data who placed an order within the predetermined time period as the positive samples, and taking the remaining user data in the sample data as the negative samples.
In some embodiments of the invention, based on the foregoing scheme, analyzing and predicting user behavior according to the positive samples, the negative samples, and the predetermined prediction model includes: inputting the positive samples and the negative samples into the predetermined prediction model, and constructing a PR curve based on the output of the predetermined prediction model; and analyzing and predicting user behavior based on the PR curve.
According to a second aspect of the invention, a user behavior analysis apparatus is provided, comprising: an obtaining unit for obtaining behavior data of users during shopping; a first determination unit for determining, based on the behavior data, user behavior features, brand behavior features related to users, and association behavior features between users and brands; a second determination unit for determining sample data for analyzing user behavior according to the user behavior features, the brand behavior features, and the association behavior features; a third determination unit for determining positive samples and negative samples in the sample data based on users' actual shopping behavior within a predetermined time period; and a processing unit for analyzing and predicting user behavior according to the positive samples, the negative samples, and a predetermined prediction model.
According to a third aspect of the invention, a computer-readable medium is provided, on which a computer program is stored; when executed by a processor, the program implements the user behavior analysis method described in the first aspect of the above embodiments.
According to a fourth aspect of the invention, an electronic device is provided, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the user behavior analysis method described in the first aspect of the above embodiments.
In the technical solutions provided by some embodiments of the invention, user behavior features, brand behavior features related to users, and association behavior features between users and brands are determined, and sample data for analyzing user behavior is derived from them. Positive samples and negative samples in the sample data are then determined based on users' actual shopping behavior within a predetermined time period, and user behavior is analyzed and predicted according to the positive samples, the negative samples, and a predetermined prediction model. This makes it possible, by analyzing users' behavior data during shopping, to quickly and accurately identify the different types of users associated with a given brand, which enables accurate message pushing and thereby improves users' purchase conversion rate.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the invention.
Brief description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain its principles. The drawings described below evidently show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 schematically shows a flowchart of a user behavior analysis method according to a first embodiment of the invention;
Fig. 2 schematically shows a flowchart of a user behavior analysis method according to a second embodiment of the invention;
Fig. 3 schematically shows the overall architecture of a user behavior analysis system according to an embodiment of the invention;
Fig. 4 schematically shows a block diagram of a user behavior analysis apparatus according to an embodiment of the invention;
Fig. 5 shows a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a thorough understanding of embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are merely illustrative; they need not include all contents and operations/steps, nor must they be executed in the order described. For example, some operations/steps may be split, and others may be merged or partially merged, so the actual execution order may change depending on the situation.
Fig. 1 schematically shows a flowchart of a user behavior analysis method according to a first embodiment of the invention.
Referring to Fig. 1, the user behavior analysis method according to the first embodiment of the invention includes:
Step S10: obtain behavior data of users during shopping.
It should be noted that, in embodiments of the invention, the behavior data of users during shopping includes order detail data, product browsing data, data on adding products to and removing them from the shopping cart, product follow/favorite data, user data, related product data, and so on.
Step S12: based on the behavior data, determine user behavior features, brand behavior features related to users, and association behavior features between users and brands.
According to an exemplary embodiment of the invention, determining the user behavior features in step S12 includes: determining, from the behavior data, a behavior vector of each user during shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the user behavior features.
According to an exemplary embodiment of the invention, determining the brand behavior features related to users in step S12 includes: determining, from the behavior data, a behavior vector related to each brand that occurs during users' shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the brand behavior features related to users.
According to an exemplary embodiment of the invention, determining the association behavior features between users and brands in step S12 includes: determining, from the behavior data, a behavior vector related to each brand that occurs during each user's shopping and a time vector indicating when the behaviors in the behavior vector occurred; and performing a Cartesian product operation on the behavior vector and the time vector and statistically processing the result to obtain the association behavior features between users and brands.
It should be noted that the behavior vector above may include purchasing, browsing, following, unfollowing, adding to the shopping cart, and removing from the shopping cart. The time vector may include the last 1 day, 2 days, 3 days, 5 days, 7 days, 15 days, 30 days, half a year, and so on. The statistical processing of the operation result may be, for example, taking the average and the sum of the result.
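The combination of behavior vector, time vector, and statistics described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the event layout `(user_id, behavior, day_index)`, the feature naming scheme, and the choice of 180 days for "half a year" are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product

# The six behaviors and eight look-back windows named in the description.
BEHAVIORS = ["purchase", "browse", "follow", "unfollow", "cart_add", "cart_remove"]
WINDOWS_DAYS = [1, 2, 3, 5, 7, 15, 30, 180]  # "half a year" taken as 180 days (assumption)
STATS = ["sum", "avg"]


def feature_names():
    """Cartesian product of behavior x window x statistic."""
    return [f"{b}_{w}d_{s}" for b, w, s in product(BEHAVIORS, WINDOWS_DAYS, STATS)]


def user_features(events, today):
    """events: iterable of (user_id, behavior, day_index); today: current day index.

    Returns {user_id: {feature_name: value}} holding the total count and the
    per-day average of each behavior inside each look-back window.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, behavior, day in events:
        age = today - day  # 0 means the event happened today
        for w in WINDOWS_DAYS:
            if 0 <= age < w:
                counts[user][(behavior, w)] += 1
    features = {}
    for user, per_key in counts.items():
        row = {}
        for b, w in product(BEHAVIORS, WINDOWS_DAYS):
            total = per_key.get((b, w), 0)
            row[f"{b}_{w}d_sum"] = total
            row[f"{b}_{w}d_avg"] = total / w
        features[user] = row
    return features
```

Keying the same counting logic by brand, or by (user, brand) pair, instead of by user would give the brand features and the user-brand association features in the same way.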
Step S14: determine sample data for analyzing user behavior according to the user behavior features, the brand behavior features, and the association behavior features.
According to an exemplary embodiment of the invention, step S14 includes: dividing the user behavior features, the brand behavior features, and the association behavior features according to the subdivided categories of products, to obtain the user behavior features, brand behavior features, and association behavior features under each subdivided category; and determining the sample data for analyzing user behavior based on the user behavior features, brand behavior features, and association behavior features under each subdivided category.
It should be noted that the subdivided categories of products may be second-level and third-level categories. For example, if the first-level category of a product is household appliances, the second-level categories may be large appliances, small appliances, and so on, and a third-level category under the second-level category of large appliances may be refrigerators. By dividing the user behavior features, brand behavior features, and association behavior features according to subdivided product categories, this embodiment allows the behavior features of each category to be analyzed further.
In some embodiments of the invention, based on the foregoing scheme, determining the sample data for analyzing user behavior based on the user behavior features, brand behavior features, and association behavior features under each subdivided category includes: determining the users under each user type of a target brand according to the user behavior features, brand behavior features, and association behavior features under each subdivided category; and obtaining the sample data from the users so determined under each user type of the target brand.
In some embodiments of the invention, based on the foregoing scheme, determining the users under each user type of the target brand according to the user behavior features, brand behavior features, and association behavior features under each subdivided category includes: selecting users related to the target category of the target brand as users of a first user type of the target brand; selecting users related to the target category but excluding the target brand as users of a second user type of the target brand; selecting users who are related to categories associated with the target category and who do not belong to the first or second user type as users of a third user type of the target brand; and selecting users who are related to other brands under the target category and who do not belong to the first, second, or third user type as users of a fourth user type of the target brand.
It should be understood that if users who browse a first category also browse a second category, the second category ranks in the top N by browsing frequency among the categories browsed by users who browse the first category, and the first category ranks in the top N by browsing frequency among the categories browsed by users who browse the second category, then the first category and the second category are determined to be associated categories. For example, if product category B is in the top 10 by browsing frequency among the categories browsed by users who browse product category A, and product category A is in the top 10 by browsing frequency among the categories browsed by users who browse product category B, then A and B are associated categories.
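The mutual top-N rule above can be sketched as follows. The browse log layout `(user_id, category)` and the way co-browsing frequency is counted (each user contributes the number of times they browsed the other category) are assumptions for illustration; the source does not specify how the frequency is measured.

```python
from collections import Counter, defaultdict


def top_n_cobrowsed(browse_log, n=10):
    """browse_log: iterable of (user_id, category).

    For each category c, returns the set of the n other categories most
    frequently browsed by the users who browsed c.
    """
    per_user = defaultdict(Counter)
    for user, cat in browse_log:
        per_user[user][cat] += 1
    co_counts = defaultdict(Counter)
    for cats in per_user.values():
        for c in cats:
            for other, k in cats.items():
                if other != c:
                    co_counts[c][other] += k
    return {c: {o for o, _ in cnt.most_common(n)} for c, cnt in co_counts.items()}


def associated_pairs(browse_log, n=10):
    """Pairs (a, b) where each category is in the other's top-n set."""
    top = top_n_cobrowsed(browse_log, n)
    return {
        tuple(sorted((a, b)))
        for a, others in top.items()
        for b in others
        if a in top.get(b, set())
    }
```

Requiring the relation to hold in both directions keeps a very popular category from being associated with everything merely because most users browse it.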
Step S16: determine the positive samples and negative samples in the sample data based on users' actual shopping behavior within a predetermined time period.
According to an exemplary embodiment of the invention, step S16 includes: taking the data of users in the sample data who placed an order within the predetermined time period as the positive samples, and taking the remaining user data in the sample data as the negative samples.
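Step S16 amounts to a simple labeling pass over the sample users. A minimal sketch, assuming orders are available as `(user_id, day)` pairs and the predetermined time period is a closed day interval (both are illustrative choices, not taken from the source):

```python
def label_samples(sample_users, orders, window_start, window_end):
    """sample_users: iterable of user ids in the sample data.
    orders: iterable of (user_id, day) order events.

    Returns {user_id: 1 or 0}: 1 (positive) if the user placed an order
    within [window_start, window_end], otherwise 0 (negative).
    """
    buyers = {u for u, day in orders if window_start <= day <= window_end}
    return {u: int(u in buyers) for u in sample_users}
```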
Step S18: analyze and predict user behavior according to the positive samples, the negative samples, and a predetermined prediction model.
According to an exemplary embodiment of the invention, analyzing and predicting user behavior according to the positive samples, the negative samples, and the predetermined prediction model includes: inputting the positive samples and the negative samples into the predetermined prediction model, and constructing a PR (precision-recall) curve based on the output of the predetermined prediction model; and analyzing and predicting user behavior based on the PR curve.
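As a sketch of how a PR curve can be built from a model's output scores, the following pure-Python example computes precision and recall at each distinct score threshold. The source does not say how the curve is then used; picking the lowest threshold that still meets a required precision, as `pick_threshold` does, is one plausible reading and is an assumption here.

```python
def pr_curve(labels, scores):
    """Return (threshold, precision, recall) triples, one per distinct score,
    treating every sample with score >= threshold as predicted positive."""
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points, tp, seen = [], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        seen += 1
        # Emit a point only at the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            recall = tp / total_pos if total_pos else 0.0
            points.append((score, tp / seen, recall))
    return points


def pick_threshold(points, min_precision):
    """Lowest threshold whose precision still meets min_precision, or None."""
    ok = [p for p in points if p[1] >= min_precision]
    return min(ok, key=lambda p: p[0])[0] if ok else None
```

Because recall never decreases as the threshold is lowered, the lowest qualifying threshold is also the one with the highest recall among those that satisfy the precision requirement.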
It should be noted that the predetermined prediction model may be, for example, a GBDT (Gradient Boosting Decision Tree) model or a random forest model.
The technical solution of the above embodiment mainly uses feature engineering and machine learning to identify, from massive user behavior data, the various types of users for a given brand, which facilitates accurate message pushing and improves users' purchase conversion rate.
The technical solution of the embodiments of the invention is described in detail below with a specific example:
As shown in Fig. 2, the user behavior analysis method according to the second embodiment of the invention includes the following stages: data cleaning, feature selection, sample labeling, algorithm training, and audience screening. Each stage shown in Fig. 2 is described in detail below:
Data cleaning:
First, the required basic data is selected, comprising six kinds of underlying data tables: orders, browsing, shopping cart, follows, products, and users. Each table is cleaned separately, and the valid fields are filtered out to build independent intermediate tables. The fields contained in the intermediate tables may be as shown in Table 1:
Table 1
Feature selection:
Based on the basic data shown in Table 1, feature extraction is performed along three dimensions: brand, user, and user-brand association.
A. Brand features
Six behaviors (purchase, browse, follow, unfollow, add to cart, and remove from cart) are chosen as the user behavior vector, and eight time ranges (the last 1 day, 2 days, 3 days, 5 days, 7 days, 15 days, 1 month, and half a year) are chosen as the time vector. The two vectors are combined by Cartesian product, and two indicators, the sum and the average, are computed for each combination, giving 6*8*2=96 brand-oriented features. For example, the resulting brand features may include the total number of purchases of brand A in the last 3 days, or the average number of times per day that brand B was browsed in the last half year. In vector form, this is expressed as follows:
B. User features
As with the brand features, the 96 features described by the vectors above can be computed for each user. For example, the resulting user features may include the total number of times user A followed products in the last 7 days, or the average number of add-to-cart actions per day by user B in the last month.
C. User-brand association features
Similarly, to capture the association between users and brands, the 96 features described by the vectors above are computed for each pair of user and brand. For example, the resulting user-brand association features may include the total number of times user A followed brand B in the last 3 days, or the total number of times user C browsed brand D in the last 7 days.
The reason the embodiment splits users, brands, and user-brand associations into separate feature sets is so that the selected features can describe both the relationship between users and brands and the individual characteristics of users and of brands, which greatly increases discrimination.
After the above features are obtained, meaningless or hard-to-obtain feature combinations can be removed, leaving the valid features. The features are then split by second-level category and by third-level category. (The embodiment does not process first-level categories, mainly because the subcategories within a first-level category differ greatly and interfere with one another noticeably, which makes them unsuitable as features.) This yields second-level category user features, second-level category brand features, second-level category user-brand association features, third-level category user features, third-level category brand features, and third-level category user-brand association features. In vector form, this is expressed as follows:
Sample labeling:
In an embodiment of the present invention, the core users, intent users, potential users and competitor users of a target brand can be distinguished according to the above features. Specifically, the series of features obtained above is joined and merged, and the following divisions are defined:
A. Core users: users related to the target category and the target brand.
Described in SQL as:
select pin from t where cate = 'target category' and brand = 'target brand'
B. Intent users: users related to the target category, excluding the target brand.
Described in SQL as:
select pin from t where cate = 'target category' and pin not in Set(A)
C. Potential users: users related to the associated categories, excluding core users and intent users.
Described in SQL as:
select pin from t where cate in ('associated category 1', 'associated category 2') and pin not in Set(A) and pin not in Set(B)
D. Competitor users: users related to the target category and other brands, excluding all users selected above.
Described in SQL as:
select pin from t where cate = 'target category' and brand <> 'target brand' and pin not in Set(A) and pin not in Set(B) and pin not in Set(C)
The associated categories are computed as follows: if category B is among the top 10 by browse frequency of the categories browsed by users who browse category A, and category A is among the top 10 by browse frequency of the categories browsed by users who browse category B, then category A and category B are associated categories. For example, if diapers are in the top 10 by browse frequency of the categories browsed by users who browse beer, and beer is in the top 10 by browse frequency of the categories browsed by users who browse diapers, then beer and diapers are associated categories.
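The mutual top-N rule for associated categories can be sketched as follows. The input format (a flat list of user and category pairs), the function names, and the decision to leave a category out of its own ranking are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_co_browsed(browse_log, n=10):
    """For each category, find the top-n other categories browsed by its users.

    browse_log: iterable of (user, category) browse events.
    Returns {category: set of co-browsed categories ranked in the top n}.
    """
    per_user = defaultdict(Counter)
    for user, cat in browse_log:
        per_user[user][cat] += 1

    co_counts = defaultdict(Counter)
    for counts in per_user.values():
        for cat in counts:
            for other, freq in counts.items():
                if other != cat:  # assumed: a category is not ranked against itself
                    co_counts[cat][other] += freq

    return {cat: {o for o, _ in cnt.most_common(n)} for cat, cnt in co_counts.items()}


def associated_pairs(browse_log, n=10):
    """Pairs (a, b) where b is in a's top n and a is in b's top n."""
    top = top_co_browsed(browse_log, n)
    return {
        frozenset((a, b))
        for a, others in top.items()
        for b in others
        if a in top.get(b, set())
    }
```

With the beer and diapers example from the text, a log in which beer browsers mostly also browse diapers, and diaper browsers mostly also browse beer, yields the pair `{beer, diapers}`.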
Algorithm training:
Using a 7-day sliding window, the four classes of users at time T are selected with the user division method above, and the purchasing behavior of these users is then tracked up to time T+7. Users who actually placed an order are labeled as positive samples, and users who were misclassified or not assigned are labeled as negative samples. Because negative samples greatly outnumber positive samples, the embodiment of the present invention randomly selects negative samples amounting to 6 times the number of positive samples.
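The labeling and down-sampling step can be sketched as follows, assuming the candidate users at time T and the set of users who ordered within the following 7 days are already available. The function name `build_samples`, its parameters and the fixed random seed are hypothetical and used only for illustration.

```python
import random


def build_samples(candidates, buyers, neg_ratio=6, seed=0):
    """Label candidates and down-sample negatives.

    candidates: user ids selected at time T.
    buyers: set of user ids who placed an order between T and T+7.
    Returns a list of (user_id, label) with label 1 for positive samples.
    """
    positives = [u for u in candidates if u in buyers]
    negatives = [u for u in candidates if u not in buyers]

    # Keep at most neg_ratio negatives per positive, chosen at random.
    rng = random.Random(seed)
    k = min(len(negatives), neg_ratio * len(positives))
    sampled_negatives = rng.sample(negatives, k)

    return [(u, 1) for u in positives] + [(u, 0) for u in sampled_negatives]
```

Sliding the window forward by 7 days and repeating the call gives one labeled batch per window.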
The positive and negative samples are input into a GBDT model and a random forest model for training, and the outputs of the two models are combined by a weighted average in a 3:7 ratio. After fusion, a binary classification model is obtained for each of the four target groups, along with a probability value P that the classification is correct.
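The fusion step can be sketched as a weighted average of the two models' predicted probabilities. The text gives the ratio 3:7 but does not state which model receives which weight; assigning 3 to GBDT and 7 to random forest below follows the order in which the models are named and is an assumption. The function name is hypothetical, and the input lists stand in for the outputs of any two trained classifiers.

```python
def fuse_probabilities(p_gbdt, p_rf, w_gbdt=3, w_rf=7):
    """Weighted average of two models' purchase probabilities, per user.

    p_gbdt, p_rf: equal-length lists of probabilities in [0, 1].
    Assumed weighting: GBDT 3, random forest 7.
    """
    if len(p_gbdt) != len(p_rf):
        raise ValueError("probability lists must have the same length")
    total = w_gbdt + w_rf
    return [(w_gbdt * a + w_rf * b) / total for a, b in zip(p_gbdt, p_rf)]
```

The fused value plays the role of the probability P used in the screening step.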
Crowd screening:
Based on the probability value P output by the above model, the corresponding PR curve is constructed, and a reasonable adjustment parameter is determined according to the size of the selected crowd. For example, the crowd size is 100,000 when P=0.01 and becomes 30,000 when P=0.005. By choosing different values of P, the size of the current crowd can be controlled effectively while ensuring that users with a higher likelihood of purchasing are always identified.
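A PR curve and a size-constrained cutoff can be sketched as follows. The sketch assumes the conventional rule that users whose score is at or above the cutoff are selected; the text does not spell out the selection rule, and both function names are hypothetical.

```python
def pr_curve(scores, labels):
    """Precision/recall points, one per distinct score, from high to low.

    Each point is (threshold, precision, recall, crowd_size), where
    crowd_size is the number of users with score >= threshold.
    """
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, true_pos = [], 0
    for i, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # Emit a point only after the last user of a group of tied scores.
        if i == len(ranked) or ranked[i][0] != score:
            recall = true_pos / total_pos if total_pos else 0.0
            points.append((score, true_pos / i, recall, i))
    return points


def threshold_for_size(points, max_size):
    """Lowest threshold whose selected crowd does not exceed max_size."""
    fitting = [p for p in points if p[3] <= max_size]
    return fitting[-1][0] if fitting else None
```

Reading the precision and recall at the returned threshold shows what is traded away when the crowd is made larger or smaller.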
Based on the technical solution of the above embodiments, Fig. 3 shows the overall architecture of the user behavior analysis system of an embodiment of the present invention.
Referring to Fig. 3, the data source provides the basic data and can rely on the big data marketplace of the e-commerce platform. The offline data engine is responsible for data extraction, task scheduling (such as timing control, task dependency management and variable management), alarm monitoring and so on. The task scheduling module organizes the processing flow of the above embodiments of the present invention as a workflow, supports timed triggering and sequential execution in dependency order, and allows variables to be specified to increase the flexibility of the system. When any step fails, corresponding retry and alarm mechanisms are in place.
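The dependency-ordered execution with retry and alarm described above can be sketched with the Python standard library (`graphlib`, Python 3.9 or later). The function `run_workflow`, its parameters and the task names in the usage note are hypothetical; timed triggering and variable management are not covered by this sketch.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, max_retries=2, alarm=print):
    """Run tasks in dependency order, retrying failures and raising an alarm.

    tasks: {name: zero-argument callable}.
    deps: {name: set of task names that must finish first}.
    Returns the list of task names that completed, in execution order.
    """
    graph = {name: deps.get(name, set()) for name in tasks}
    completed = []
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                tasks[name]()
                completed.append(name)
                break
            except Exception as exc:
                if attempt > max_retries:
                    # Retries exhausted: alarm and stop, since later
                    # steps depend on this one.
                    alarm(f"task {name!r} failed after {attempt} attempts: {exc}")
                    return completed
    return completed
```

For example, with tasks named `extract`, `features` and `train` and dependencies `{"features": {"extract"}, "train": {"features"}}`, the steps run in that order, and a failure in `features` that outlasts the retries stops the run before `train`.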
The algorithm computing engine can be based on a Spark cluster and encapsulates algorithm packages such as GBDT and random forest. It reads the data provided by the offline data engine through a data read/write module, generates PR curves after processing with GBDT, random forest and other algorithms, and performs analysis based on the PR curves. The algorithm computing engine can also include a parameter configuration module for configuring the parameters of the algorithms.
Fig. 4 schematically shows a block diagram of an apparatus for analyzing user behavior according to an embodiment of the present invention.
Referring to Fig. 4, the apparatus 400 for analyzing user behavior according to an embodiment of the present invention comprises: an acquiring unit 402, a first determination unit 404, a second determination unit 406, a third determination unit 408 and a processing unit 410.
Specifically, the acquiring unit 402 is used to obtain behavioral data of users in the shopping process; the first determination unit 404 is used to determine, based on the behavioral data, user behavior characteristics, user-related brand behavior characteristics, and association behavior characteristics between users and brands; the second determination unit 406 is used to determine sample data for analyzing user behavior according to the user behavior characteristics, the brand behavior characteristics and the association behavior characteristics; the third determination unit 408 is used to determine the positive samples and negative samples in the sample data based on the actual shopping behavior of users within a predetermined time period; the processing unit 410 is used to analyze and predict the behavior of users according to the positive samples, the negative samples and a predetermined prediction model.
It should be noted that the details of each module or unit included in the above apparatus 400 for analyzing user behavior have already been described in detail in the corresponding method for analyzing user behavior, and are therefore not repeated here.
Referring now to Fig. 5, it shows a schematic structural diagram of a computer system 500 of an electronic device suitable for implementing embodiments of the present invention. The computer system 500 of the electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 502 or a program loaded from the storage section 508 into random access memory (RAM) 503. The RAM 503 also stores various programs and data required for system operation. The CPU 501, ROM 502 and RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse and the like; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage section 508 as needed.
In particular, according to embodiments of the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above functions defined in the system of the present application are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus or device. Also in the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Flow chart and block diagram in attached drawing are illustrated according to the system of various embodiments of the invention, method and computer journey The architecture, function and operation in the cards of sequence product.In this regard, each box in flowchart or block diagram can generation A part of one module, program segment or code of table, a part of above-mentioned module, program segment or code include one or more Executable instruction for implementing the specified logical function.It should also be noted that in some implementations as replacements, institute in box The function of mark can also occur in a different order than that indicated in the drawings.For example, two boxes succeedingly indicated are practical On can be basically executed in parallel, they can also be executed in the opposite order sometimes, and this depends on the function involved.Also it wants It is noted that the combination of each box in block diagram or flow chart and the box in block diagram or flow chart, can use and execute rule The dedicated hardware based systems of fixed functions or operations is realized, or can use the group of specialized hardware and computer instruction It closes to realize.
The units described in the embodiments of the present invention may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device implements the method for analyzing user behavior described in the above embodiments.
For example, the electronic equipment may be implemented as shown in Figure 1: step S10 obtains user in shopping process Behavioral data;Step S12 is based on the behavioral data, determines user behavior characteristics, brand behavioural characteristic related to user, And the correlation behavior feature between user and brand;Step S14, it is special according to the user behavior characteristics, the brand behavior It seeks peace the correlation behavior feature, determines the sample data for analyzing user behavior;Step S16, based in predetermined amount of time The actual Shopping Behaviors of user determine positive sample and negative sample in the sample data;Step S18, according to the positive sample, The negative sample and scheduled prediction model carry out analysis prediction to the behavior of user.
It should be noted that although being referred to several modules or list for acting the equipment executed in the above detailed description Member, but this division is not enforceable.In fact, embodiment according to the present invention, it is above-described two or more Module or the feature and function of unit can embody in a module or unit.Conversely, an above-described mould The feature and function of block or unit can be to be embodied by multiple modules or unit with further division.
Through the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a touch terminal or a network device) to execute the method according to the embodiments of the present invention.
Those skilled in the art after considering the specification and implementing the invention disclosed here, will readily occur to of the invention its Its embodiment.This application is intended to cover any variations, uses, or adaptations of the invention, these modifications, purposes or Person's adaptive change follows general principle of the invention and including the undocumented common knowledge in the art of the present invention Or conventional techniques.The description and examples are only to be considered as illustrative, and true scope and spirit of the invention are by following Claim is pointed out.
It should be understood that the present invention is not limited to the precise structure already described above and shown in the accompanying drawings, and And various modifications and changes may be made without departing from the scope thereof.The scope of the present invention is limited only by the attached claims.

Claims (13)

1. A method for analyzing user behavior, characterized by comprising:
Obtain behavioral data of the user in shopping process;
Based on the behavioral data, user behavior characteristics, brand behavioural characteristic related to user and user and brand are determined Between correlation behavior feature;
According to the user behavior characteristics, the brand behavioural characteristic and the correlation behavior feature, determine for analyzing user The sample data of behavior;
Based on the actual Shopping Behaviors of user in predetermined amount of time, the positive sample and negative sample in the sample data are determined;
According to the positive sample, the negative sample and scheduled prediction model, analysis prediction is carried out to the behavior of user.
2. The method for analyzing user behavior according to claim 1, characterized in that determining user behavior characteristics based on the behavioral data comprises:
determining, according to the behavioral data, a behavior vector of each user in the shopping process and a time vector indicating when the behaviors in the behavior vector occurred;
performing a Cartesian product operation on the behavior vector and the time vector, and performing statistical analysis on the result of the operation, to obtain the user behavior characteristics.
3. The method for analyzing user behavior according to claim 1, characterized in that determining user-related brand behavior characteristics based on the behavioral data comprises:
determining, according to the behavioral data, a behavior vector related to each brand that occurs in the shopping process of users, and a time vector indicating when the behaviors in the behavior vector occurred;
performing a Cartesian product operation on the behavior vector and the time vector, and performing statistical analysis on the result of the operation, to obtain the user-related brand behavior characteristics.
4. The method for analyzing user behavior according to claim 1, characterized in that determining association behavior characteristics between users and brands based on the behavioral data comprises:
determining, according to the behavioral data, a behavior vector related to each brand that occurs in the shopping process of each user, and a time vector indicating when the behaviors in the behavior vector occurred;
performing a Cartesian product operation on the behavior vector and the time vector, and performing statistical analysis on the result of the operation, to obtain the association behavior characteristics between users and brands.
5. the analysis method of user behavior according to claim 1, which is characterized in that according to the user behavior characteristics, The brand behavioural characteristic and the correlation behavior feature, determine the sample data for analyzing user behavior, comprising:
By the user behavior characteristics, the brand behavioural characteristic and the correlation behavior feature according to commodity subdivision category into Row divides, to obtain user behavior characteristics, brand behavioural characteristic and the correlation behavior feature under each subdivision category;
Based on user behavior characteristics, brand behavioural characteristic and the correlation behavior feature under each subdivision category, determine for analyzing The sample data of user behavior.
6. the analysis method of user behavior according to claim 5, which is characterized in that based on the use under each subdivision category Family behavioural characteristic, brand behavioural characteristic and correlation behavior feature determine the sample data for analyzing user behavior, comprising:
According to user behavior characteristics, brand behavioural characteristic and the correlation behavior feature under each subdivision category, target is determined User under each user type of brand;
According to the user under each user type of the determining target brand, the sample data is obtained.
7. the analysis method of user behavior according to claim 6, which is characterized in that according under each subdivision category User behavior characteristics, brand behavioural characteristic and correlation behavior feature, determine for target brand each user type under User, comprising:
It chooses under the first user type of the user relevant to the target category of the target brand as the target brand User;
Choose second user of the user that is related to the target category and excluding the target brand as the target brand User under type;
It chooses related to the association category of the target category and is not belonging to first user type and the second user class The user of type is as the user under the third user type of the target brand;
It chooses related to other brands under the target category and is not belonging to first user type, the second user class The user of type and the third user type is as the user under the fourth user type of the target brand.
8. The method for analyzing user behavior according to claim 7, characterized by further comprising:
if users who browse a first category have also browsed a second category, the second category ranks in the top N by browse frequency among the categories browsed by users who browse the first category, and the first category ranks in the top N by browse frequency among the categories browsed by users who browse the second category, determining that the first category and the second category are associated categories.
9. the analysis method of user behavior according to any one of claim 1 to 8, which is characterized in that based on pre- timing Between the actual Shopping Behaviors of user in section, determine the positive sample and negative sample in the sample data, comprising:
Using the user data to place an order in the predetermined amount of time in the sample data as the positive sample, by the sample User data in data in addition to the positive sample is as the negative sample.
10. the analysis method of user behavior according to any one of claim 1 to 8, which is characterized in that according to it is described just Sample, the negative sample and scheduled prediction model carry out analysis prediction to the behavior of user, comprising:
The positive sample and the negative sample are input in the scheduled prediction model, and are based on the scheduled prediction mould The output result of type constructs PR curve;
Analysis prediction is carried out to the behavior of user based on the PR curve.
11. a kind of analytical equipment of user behavior characterized by comprising
Acquiring unit, for obtaining behavioral data of the user in shopping process;
First determination unit determines that user behavior characteristics, brand behavior related to user are special for being based on the behavioral data Sign and the correlation behavior feature between user and brand;
Second determination unit is used for according to the user behavior characteristics, the brand behavioural characteristic and the correlation behavior feature, Determine the sample data for analyzing user behavior;
Third determination unit, for determining in the sample data based on the actual Shopping Behaviors of user in predetermined amount of time Positive sample and negative sample;
Processing unit, for dividing the behavior of user according to the positive sample, the negative sample and scheduled prediction model Analysis prediction.
12. a kind of computer-readable medium, is stored thereon with computer program, which is characterized in that described program is held by processor The analysis method of the user behavior as described in any one of claims 1 to 10 is realized when row.
13. a kind of electronic equipment characterized by comprising
One or more processors;
Storage device, for storing one or more programs, when one or more of programs are by one or more of processing When device executes, so that one or more of processors realize the user behavior as described in any one of claims 1 to 10 Analysis method.
CN201710971008.6A 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment Active CN109685537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710971008.6A CN109685537B (en) 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109685537A true CN109685537A (en) 2019-04-26
CN109685537B CN109685537B (en) 2021-02-26

Family

ID=66183948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710971008.6A Active CN109685537B (en) 2017-10-18 2017-10-18 User behavior analysis method, device, medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109685537B (en)

Also Published As

Publication number Publication date
CN109685537B (en) 2021-02-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant