CN109615128A - Method, device and server for predicting the transaction probability of real estate clients - Google Patents
- Publication number
- CN109615128A CN201811478616.4A
- Authority
- CN
- China
- Prior art keywords
- client
- conclusion
- business
- historical behavior
- behavior data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/16—Real estate
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides a method, device and server for predicting the transaction probability of real estate clients. Historical behavior data of a client to be evaluated for a target property are obtained, the data including one or more behavior features and the corresponding first occurrence frequencies. A decision tree structure is obtained by training a random forest algorithm model on a training set; the training set includes the historical behavior data and client types of clients selected from a database, and the historical behavior data of the selected clients include one or more behavior features and the corresponding second occurrence frequencies. The historical behavior data of the client to be evaluated are input into the decision tree structure to obtain that client's transaction probability for the target property. The client's transaction probability is thereby predicted effectively, which makes it easier for sales staff to analyze the purchase intention of real estate clients, helps them provide more targeted follow-up service to clients with different transaction probabilities, and improves sales performance.
Description
Technical field
The present invention relates to the field of real estate, and in particular to a method, device and server for predicting the transaction probability of real estate clients.
Background art
At present, a client's interest in a property project is judged mainly from offline visits and from the information the client expresses while being received by sales staff. Once the client leaves the sales site, it is very difficult for sales staff to keep track of the client. The client's intention-related behavior toward the project therefore cannot be tracked and recorded by any effective means, and collecting client information across multiple dimensions is very difficult.
How to judge a client's true purchase intention has always troubled sales staff. During daily reception, sales staff have limited time in contact with the client, and the conversation usually revolves around introducing the project; it cannot explore home-buying behavior in depth from each client's perspective. It is therefore difficult to judge a client's intention accurately from such brief contact.
Summary of the invention
The technical problem mainly solved by the method, device and server for predicting the transaction probability of real estate clients provided by the present invention is that a real estate client's degree of intention toward a project (the transaction probability) cannot be predicted effectively.
To solve the above technical problem, the present invention provides a method for predicting the transaction probability of real estate clients, comprising:
Obtaining historical behavior data of a client to be evaluated for a target property, the historical behavior data of the client to be evaluated including one or more behavior features and the corresponding first occurrence frequencies;
Obtaining a decision tree structure produced by training a random forest algorithm model on a training set, the training set including the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients including one or more behavior features and the corresponding second occurrence frequencies, and the client types including deal clients (clients who closed a transaction) and non-deal clients;
Inputting the historical behavior data of the client to be evaluated into the decision tree structure to obtain the transaction probability of the client to be evaluated for the target property.
Further, the behavior features include at least one of the following:
Access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, average access duration within the access cycle, average number of clicks, and average number of access days.
Further, inputting the historical behavior data of the client to be evaluated into the decision tree structure to obtain the transaction probability of the client to be evaluated for the target property comprises:
Inputting the historical behavior data of the client to be evaluated into each decision tree structure to obtain the client type to which the client to be evaluated belongs, and calculating the transaction probability of the client to be evaluated from the client types so obtained.
Further, the second occurrence frequency includes an effective occurrence frequency, which is obtained as follows:
Obtaining the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavior features and the corresponding actual occurrence frequencies in different time periods; determining the target weight value for each time period from a preset mapping table of time periods and second weight values; and calculating the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
Further, before the training set is trained with the random forest algorithm model, the method further comprises:
Obtaining a first sample quantity, namely the number of samples in the training set labeled as deal clients; calculating a first ratio of the first sample quantity to the total number of samples in the training set; comparing the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, expanding the deal-client samples using the synthetic minority oversampling technique.
The present invention also provides a device for predicting the transaction probability of real estate clients, comprising:
A data acquisition module, configured to obtain historical behavior data of a client to be evaluated for a target property, the historical behavior data of the client to be evaluated including one or more behavior features and the corresponding first occurrence frequencies; and configured to obtain a decision tree structure produced by training a random forest algorithm model on a training set, the training set including the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients including one or more behavior features and the corresponding second occurrence frequencies, and the client types including deal clients and non-deal clients;
A processing module, configured to input the historical behavior data of the client to be evaluated, as obtained by the data acquisition module, into the decision tree structure and to calculate the transaction probability of the client to be evaluated for the target property.
Further, the behavior features include at least one of the following:
Access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, average access duration within the access cycle, average number of clicks, and average number of access days.
Further, the second occurrence frequency includes an effective occurrence frequency, and the device for predicting the transaction probability of real estate clients further comprises:
A training set processing module, configured to obtain the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavior features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value for each time period from a preset mapping table of time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
Further, the device for predicting the transaction probability of real estate clients further comprises:
A training set expansion module, configured to obtain a first sample quantity, namely the number of samples in the training set labeled as deal clients; to calculate a first ratio of the first sample quantity to the total number of samples in the training set; to compare the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, to expand the deal-client samples using the synthetic minority oversampling technique.
The present invention also provides a server, comprising a processor, a memory and a communication bus;
The communication bus is used to realize connection and communication between the processor and the memory;
The processor is configured to execute one or more programs stored in the memory, so as to implement the steps of the method for predicting the transaction probability of real estate clients described in any of the above.
The beneficial effects of the present invention are as follows:
According to the method, device and server for predicting the transaction probability of real estate clients provided by the present invention, historical behavior data of a client to be evaluated for a target property are obtained, the data including one or more behavior features and the corresponding first occurrence frequencies. A decision tree structure is obtained by training a random forest algorithm model on a training set; the training set includes the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients include one or more behavior features and the corresponding second occurrence frequencies, and the client types include deal clients and non-deal clients. The historical behavior data of the client to be evaluated are input into the decision tree structure to obtain that client's transaction probability for the target property. The client's transaction probability is thereby predicted effectively, which makes it easier for sales staff to analyze the purchase intention of real estate clients, helps them provide more targeted follow-up service to clients with different transaction probabilities, and improves sales performance.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for predicting the transaction probability of real estate clients according to Embodiment one of the present invention;
Fig. 2 is a schematic diagram of K-fold cross-validation according to Embodiment one of the present invention;
Fig. 3 is a schematic diagram of the random forest training process according to Embodiment one of the present invention;
Fig. 4 is a schematic diagram of sample size expansion according to Embodiment one of the present invention;
Fig. 5 is a first structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment two of the present invention;
Fig. 6 is a second structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment two of the present invention;
Fig. 7 is a third structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment two of the present invention;
Fig. 8 is a structural diagram of the server according to Embodiment two of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
Embodiment one:
Referring to Fig. 1, Fig. 1 is a flow diagram of a method for predicting the transaction probability of real estate clients provided by Embodiment one of the present invention. The method mainly includes the following steps:
S101. Obtain historical behavior data of a client to be evaluated for a target property, the historical behavior data of the client to be evaluated including one or more behavior features and the corresponding first occurrence frequencies.
The model takes "client ID + property ID" as its basic unit. That is, the transaction probability of a client to be evaluated is predicted from that client's historical behavior data for the target property; the client's historical behavior data for other properties are not used to predict the transaction probability for the target property. The historical behavior data include behavior features and the first occurrence frequency corresponding to each behavior feature. For example, for target property "1567", see Table 1 below:
Table 1
Client ID | Access duration | Number of clicks | Access days | Pages accessed |
“jm122” | 1552 | 255 | 12 | 56 |
“cs233” | 899 | 123 | 23 | 96 |
The behavior features include, but are not limited to: access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, average access duration within the access cycle, average number of clicks, average number of access days, and so on.
It should be noted that a client's historical behavior data can be stored on the server or in a database. When the client accesses the service through a client application, the historical behavior data are recorded and uploaded to the server, so the server can retrieve the historical behavior data of the relevant clients in the system from its own memory or from the database.
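As an illustration only, the "client ID + property ID" unit described above could be built from raw access logs roughly as follows. The event layout, the field names and the `aggregate` helper are assumptions made for this sketch; the source does not specify how the logs are stored.

```python
from collections import defaultdict

# Hypothetical access events: (client_id, property_id, day, seconds, clicks, pages).
events = [
    ("jm122", "1567", "2018-11-01", 300, 40, 5),
    ("jm122", "1567", "2018-11-01", 200, 10, 3),
    ("jm122", "1567", "2018-11-02", 100, 5, 2),
    ("jm122", "2001", "2018-11-02", 900, 70, 9),  # a different property
]

def aggregate(events):
    """Build one feature record per (client_id, property_id) pair."""
    totals = defaultdict(lambda: {"access_duration": 0, "clicks": 0,
                                  "pages_accessed": 0, "days": set()})
    for client_id, property_id, day, seconds, clicks, pages in events:
        record = totals[(client_id, property_id)]
        record["access_duration"] += seconds
        record["clicks"] += clicks
        record["pages_accessed"] += pages
        record["days"].add(day)
    # Replace the set of days with the number of distinct access days.
    return {
        key: {
            "access_duration": r["access_duration"],
            "clicks": r["clicks"],
            "access_days": len(r["days"]),
            "pages_accessed": r["pages_accessed"],
        }
        for key, r in totals.items()
    }

features = aggregate(events)
target = features[("jm122", "1567")]  # only this record is used for property 1567
print(target)
```

Keying the records by the pair keeps a client's behavior toward other properties out of the prediction for the target property, as the text requires.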
S102. Obtain a decision tree structure produced by training a random forest algorithm model on a training set. The training set includes the historical behavior data and client types of clients selected from a database; the historical behavior data of the selected clients include one or more behavior features and the corresponding second occurrence frequencies, and the client types include deal clients and non-deal clients.
When a user has a need for a certain commodity, the user also pays correspondingly more attention to it, and this attention is reflected fairly accurately in the user's historical access behavior and registration data. Users who close a deal and users who do not therefore also differ significantly in their historical access behavior.
The random forest algorithm can be used to train a model on users' historical behavior data, and a decision tree structure is output after training. To aid understanding of the modeling process, a specific example follows; the example serves mainly to explain the process and should not be construed as limiting the present invention:
For example, the training set includes samples A, B, C and D, with the client type as the data label: 0 denotes a non-deal client and 1 denotes a deal client, as shown in Table 2 below:
Table 2
The training set is trained with the random forest algorithm model to obtain the decision tree structure shown in Fig. 3. In practice, multiple decision trees can be built under different conditions. A sample to be evaluated is then input into each decision tree and passed down level by level according to the decision at each node; the leaf node in which it finally lands determines the class of the client to be evaluated. The order of the decisions within a tree, for example whether behavior feature a or behavior feature b is tested first, can be determined from the information entropy of the different splits. This can be done in a known manner and is not described further here.
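The entropy-based choice of which feature to test first can be sketched as follows. This is a minimal illustration of information gain, not the patent's own procedure; the four samples, their feature values and the thresholds are invented for the example and are not the contents of Table 2.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Hypothetical samples A, B, C, D: label 1 = deal client, 0 = non-deal client.
feature_a = [5, 30, 8, 40]
feature_b = [12, 3, 4, 9]
labels = [0, 1, 0, 1]

gains = {
    "a": information_gain(feature_a, labels, threshold=20),
    "b": information_gain(feature_b, labels, threshold=6),
}
first_feature = max(gains, key=gains.get)  # the feature tested at the root
print(gains, first_feature)
```

In this invented data set, splitting on feature a separates the two classes completely while splitting on feature b does not, so feature a would be tested first.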
Once the decision trees have been built, the decision tree structure can be used to predict the transaction probability of a client to be evaluated: the client is input into each decision tree, and the client takes the client type of the leaf node in which it lands. Suppose the historical behavior data of the client to be evaluated are: behavior feature a = 25 and behavior feature b = 10. Inputting these into the first decision tree structure gives client type 1, that is, a deal client; inputting them into the second decision tree structure also gives client type 1, a deal client. Since both results are "deal client", the client's transaction probability can be taken to be relatively high. When there are more decision trees, the same client may be classified as a deal client by some trees and as a non-deal client by others; the client type can then be chosen by voting. For example, if 10 trees output deal client and 5 output non-deal client, the vote selects deal client.
Optionally, when there are multiple decision trees, the proportion of trees whose output is "deal client" can be counted to determine the transaction probability of the client to be evaluated. For example, with 20 decision trees, the client to be evaluated is input into each of the 20 trees and judged separately; if 15 trees output deal client and 5 output non-deal client, the transaction probability is 15/20 = 75%.
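The two ways of combining tree outputs described above (majority vote and vote proportion) can be written as a short sketch. The `vote` helper is a hypothetical name; the inputs reproduce the 10-versus-5 and 15-versus-5 examples from the text, with label 1 standing for a client who closed a deal.

```python
from collections import Counter

def vote(tree_outputs):
    """Return (majority client type, share of trees voting 1)."""
    counts = Counter(tree_outputs)
    probability = counts[1] / len(tree_outputs)
    majority = counts.most_common(1)[0][0]
    return majority, probability

# 15 trees output 1 (deal), 5 trees output 0 (no deal).
label_20, prob_20 = vote([1] * 15 + [0] * 5)
# 10 trees output 1, 5 trees output 0.
label_15, prob_15 = vote([1] * 10 + [0] * 5)
print(label_20, prob_20)
print(label_15, round(prob_15, 3))
```

With an even split, `most_common` returns whichever label was seen first, so a real implementation would need an explicit tie-breaking rule.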
The random forest modeling process mainly comprises three basic steps: model construction, model training and model optimization. In model construction, the algorithm framework is built on the server following the idea of the random forest algorithm. During model training, the training parameters can be adjusted using K-fold cross-validation or grid search (GridSearchCV). K-fold cross-validation divides the training data into k equal parts, trains on k-1 of them and uses the remaining part for testing; Fig. 2 is a schematic diagram of K-fold cross-validation. GridSearchCV automatically performs cross-validation for a given model and tracks the scoring results as each parameter is adjusted, in effect replacing the loop that a manual parameter search would require. In model optimization, the model hyperparameters are tuned in a targeted way using the control variable method, based on the results on the test set. For example, if there are three hyperparameters a, b and c, then b and c are held fixed while a is varied over a sufficiently wide range between a lower and an upper bound to find its optimum; the other hyperparameters are optimized in the same way. The hyperparameters include, but are not limited to, the number of decision trees and their depth.
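The K-fold split itself is simple enough to sketch without a library. The function below is an illustrative helper (its name and the handling of sizes that do not divide evenly are assumptions); it only produces index sets and leaves training and scoring to the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample is used for testing exactly once and for training in the other k-1 rounds.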
The random forest algorithm can automatically handle data in various forms, such as binary features, categorical features and numerical features. During model training it performs implicit feature selection automatically, and it provides feature importance measures after training. A random forest model can handle continuous numerical variables, so the input features need no additional processing such as discretization or normalization. In addition, the trained random forest model provides a weight parameter for each behavior feature, that is, a feature ranking based on the learned model. Extensive feature engineering is therefore unnecessary; after training, the features only need to be determined by a backward selection search.
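A feature ranking of this kind can be read from a trained scikit-learn forest through its `feature_importances_` attribute. The sketch below uses synthetic data and invented feature names purely for illustration; the labeling rule is an assumption chosen so that only the first two features matter.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["access_duration", "clicks", "access_days", "noise"]
X = rng.random((200, 4))
# Hypothetical labeling rule: the label depends only on the first two features.
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Rank features from most to least important according to the trained model.
ranking = sorted(zip(feature_names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

A backward selection search could then drop the lowest-ranked feature, retrain, and keep the reduced set if the validation score does not fall.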
The model is trained with the random forest algorithm using the sklearn machine learning library. Because a random forest model has many parameters, the training parameters are adjusted during training using k-fold cross-validation combined with grid search (GridSearchCV). The random forest training process is shown in Fig. 3.
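A minimal sketch of this tuning step with scikit-learn is shown below. The data are synthetic, and the parameter grid, fold count and scoring metric are assumptions for illustration; the source does not state which values were searched.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.random((120, 3))
y = (X[:, 0] > 0.5).astype(int)  # hypothetical labels

# Candidate hyperparameters: number of trees and maximum tree depth.
param_grid = {"n_estimators": [10, 30], "max_depth": [2, 4]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation for every combination
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

`GridSearchCV` evaluates every combination in the grid, which is the automated counterpart of varying one hyperparameter at a time by hand.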
Optionally, the following processes may precede model training: data preparation, feature selection and data preprocessing. Data preparation: to make the data easier to process and analyze, the data stored in the database are imported into Series and DataFrame data structures. A Series is a one-dimensional array-like object containing an array of data (of any NumPy data type) and an associated array of data labels, namely the index. In its interactive string representation, a Series shows the index on the left and the values on the right. A DataFrame represents a table, a spreadsheet-like data structure containing an ordered collection of columns, each of which can hold a different value type (numeric, string, Boolean, etc.). A DataFrame has both a row index and a column index, and it can be regarded as a dictionary of Series that all share one index.
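For illustration, the two rows of Table 1 can be loaded into these pandas structures as follows. The column names are English stand-ins chosen for this sketch.

```python
import pandas as pd

# A Series: one column of values plus an index of labels.
clicks = pd.Series([255, 123], index=["jm122", "cs233"], name="clicks")

# A DataFrame: several columns sharing the same row index.
frame = pd.DataFrame(
    {
        "access_duration": [1552, 899],
        "clicks": [255, 123],
        "access_days": [12, 23],
        "pages_accessed": [56, 96],
    },
    index=["jm122", "cs233"],
)

print(clicks)        # index on the left, values on the right
print(frame.dtypes)  # each column has its own data type
column = frame["clicks"]  # selecting a column returns a Series
```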
Feature selection: the model is based mainly on two kinds of data: (1) user historical behavior data; (2) user registration information (including user identity information such as ID, name and telephone number). The model takes "user ID + property ID" as its basic statistical unit, and the selected features include access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, average access duration within the access cycle, average number of clicks, and average number of access days.
First step: based on the client's basic attributes and behavior attributes, and taking the business scenario into account, brainstorm and compile a set F of n potential features;
Second step: split and merge the feature attributes. Splitting: an attribute Fi (0 ≤ i < n) in the original feature set F is split into several features according to specific rules; for example, "access duration" is split into "single-day maximum access duration" and "average access duration within the access cycle". Merging (that is, dimensionality reduction): several strongly correlated indicators are merged using techniques such as PCA, thereby reducing the number of indicator dimensions; for example, "access frequency" is represented by "number of accesses" or "access cycle";
Third step: take the features obtained after splitting and merging as the m-dimensional feature set F' finally used in the model, mainly including: access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, average access duration within the access cycle, average number of clicks, and average number of access days.
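The splitting rule in the second step can be illustrated with a small helper. The function name is hypothetical, and defining the average as total duration divided by the length of the access cycle is an assumption; the source does not give the exact formula.

```python
def split_access_duration(daily_seconds, cycle_days):
    """Split 'access duration' into two derived features.

    daily_seconds: total access seconds for each day with activity.
    cycle_days: length of the access cycle in days (assumed definition).
    """
    return {
        "max_daily_access_duration": max(daily_seconds),
        "avg_access_duration_in_cycle": sum(daily_seconds) / cycle_days,
    }

derived = split_access_duration([500, 100, 300], cycle_days=30)
print(derived)
```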
Data preparation: according to the feature extraction rules worked out during feature engineering (the second step), sample data are extracted from the database to form the sample set S (that is, the training set). S contains the value of each feature (including structured and unstructured data) and the sample label (0 may denote no deal; 1 may denote a deal).
Data preprocessing: after the data are obtained, the data set is preprocessed and dirty data are cleaned to ensure accuracy. This includes removing duplicate data, detecting and removing outliers, handling missing values, converting data types, and any dimensionality reduction that may be needed. In the outlier detection stage, outliers are removed by methods such as univariate anomaly detection and clustering-based anomaly detection.
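The cleaning steps listed above might look as follows in pandas. The sample rows are invented, and both the median fill and the 1.5 × IQR rule are assumptions standing in for the unspecified missing-value and univariate outlier methods.

```python
import pandas as pd

raw = pd.DataFrame(
    {
        "client_id": ["a", "a", "b", "c", "d", "e"],
        "clicks": [10, 10, 12, None, 11, 5000],
        "access_days": ["3", "3", "4", "2", "5", "4"],
    }
)

clean = raw.drop_duplicates().copy()                      # remove duplicate rows
clean["access_days"] = clean["access_days"].astype(int)   # convert data types
clean["clicks"] = clean["clicks"].fillna(clean["clicks"].median())  # fill missing values

# Univariate outlier check (assumed rule): keep values within 1.5 * IQR of the quartiles.
q1, q3 = clean["clicks"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = clean["clicks"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = clean[mask].reset_index(drop=True)
print(clean)
```

The row with 5000 clicks is dropped as an outlier, while the row with a missing click count is kept with the median filled in.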
The training set includes the behavior features of the samples and their corresponding values (that is, the second occurrence frequencies). In this embodiment, the second occurrence frequency refers to the effective occurrence frequency, which is obtained by weighting the actual occurrence frequencies. Specifically: the historical behavior data of several clients are obtained from the database, the data including one or more behavior features and the corresponding actual occurrence frequencies in different time periods; the target weight value for each time period is determined from a preset mapping table of time periods and second weight values; and the effective occurrence frequency is calculated from the actual occurrence frequencies in the different time periods and the target weight values.
For example, suppose a client's clicks on the target property break down as follows: 1000 clicks in the most recent week; 500 clicks in the most recent month (excluding the most recent week); 10000 clicks in the most recent half year (excluding the most recent month); and 6000 clicks more than half a year ago. The effective occurrence frequency can then be computed from the preset mapping table of time periods and second weight values. Assume the mapping table is as follows:
Table 3
Time period | Most recent week | Most recent month | Most recent half year | More than half a year ago |
Second weight value | 1 | 0.8 | 0.5 | 0.1 |
The effective occurrence frequency of the client's clicks is then: 1000*1 + 500*0.8 + 10000*0.5 + 6000*0.1 = 7000.
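The weighted sum can be expressed directly in code. The period names and the `effective_frequency` helper are illustrative; the weights and click counts are the ones given in the example and in Table 3.

```python
# Mapping of time period to second weight value, as in Table 3.
PERIOD_WEIGHTS = {
    "last_week": 1.0,
    "last_month": 0.8,
    "last_half_year": 0.5,
    "earlier": 0.1,
}

def effective_frequency(actual_by_period, weights=PERIOD_WEIGHTS):
    """Weight each period's actual frequency and sum the results."""
    return sum(count * weights[period] for period, count in actual_by_period.items())

clicks = {"last_week": 1000, "last_month": 500, "last_half_year": 10000, "earlier": 6000}
effective_clicks = effective_frequency(clicks)
print(round(effective_clicks))
```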
Features carry different weights at different points in time: a client's access behavior in the past week certainly has more reference value than access behavior from a month ago. Assigning different weights to the client's access behavior in different periods during modeling therefore helps improve prediction accuracy.
Optionally, before the training set is trained with the random forest algorithm model, the first sample quantity (the number of samples in the training set labeled as deal clients) can be obtained, the first ratio of the first sample quantity to the total number of samples in the training set can be calculated, and the first ratio can be compared with a first set ratio. When the first ratio is less than the first set ratio, the deal-client samples are expanded using the Synthetic Minority Oversampling Technique (SMOTE). The SMOTE algorithm randomly generates new minority-class sample points according to certain rules and adds them to the original data set for training. Fig. 4 is a schematic diagram of how the SMOTE method generates new sample points. When the numbers of deal samples and non-deal samples differ greatly, the training set is reconstructed with SMOTE to reduce the impact of severe class imbalance in the training set.
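The ratio check and the oversampling step can be sketched in plain Python. This is a simplified illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not a reference implementation; the helper names, the value k=2, the set ratio of 0.3 and the sample points are all assumptions.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

def expand_if_imbalanced(samples, labels, set_ratio):
    """Oversample label-1 samples when their share is below set_ratio."""
    minority = [s for s, y in zip(samples, labels) if y == 1]
    if len(minority) < 2 or len(minority) / len(samples) >= set_ratio:
        return list(samples), list(labels)
    # Smallest n with (minority + n) / (total + n) >= set_ratio.
    n_new = math.ceil((set_ratio * len(samples) - len(minority)) / (1 - set_ratio))
    new_points = smote(minority, n_new)
    return list(samples) + new_points, list(labels) + [1] * n_new

samples = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (3.0, 1.0), (2.5, 2.5),
           (1.0, 3.0), (2.0, 3.0), (3.0, 3.0), (10.0, 5.0), (12.0, 7.0)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 1 = deal client (minority class)
new_samples, new_labels = expand_if_imbalanced(samples, labels, set_ratio=0.3)
print(len(new_samples), sum(new_labels) / len(new_labels))
```

In practice a maintained library implementation would normally be preferred over a hand-written one.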
This model is built on the random forest algorithm from behavior features derived from users' historical access and from their registration information. Its goal is to determine the user's transaction probability, that is, to predict it from the user's multi-dimensional features.
S103. Input the historical behavior data of the client to be evaluated into the decision tree structure to obtain the transaction probability of the client to be evaluated for the target property.
The historical behavior data of the client to be evaluated are input into each decision tree structure to obtain the client type to which the client belongs, and the transaction probability of the client to be evaluated is calculated from the client types so obtained.
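Steps S102 and S103 can be sketched end to end with scikit-learn. The training data, the two feature columns and the `transaction_probability` helper are assumptions made for the example. Note that scikit-learn's own `predict_proba` averages per-tree class probabilities, so the hard-vote share is computed explicitly here to match the description above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Hypothetical training set: columns are [access_duration, clicks]; label 1 = deal client.
X_train = rng.integers(0, 100, size=(200, 2))
y_train = (X_train[:, 0] > 50).astype(int)

forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_train, y_train)

def transaction_probability(forest, sample):
    """Share of trees that classify the sample as a deal client (label 1).

    Each sub-tree returns an encoded class index, which equals the label
    here because the labels are 0 and 1.
    """
    votes = [tree.predict([sample])[0] for tree in forest.estimators_]
    return float(np.mean(votes))

high = transaction_probability(forest, [90, 10])
low = transaction_probability(forest, [5, 10])
print(high, low)
```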
The real estate client transaction probability prediction method provided by the invention obtains the historical behavior data of a client to be predicted for a target property, the historical behavior data including one or more behavioral features and corresponding first occurrence frequencies; obtains a first weight value for each behavioral feature, produced by training a training set with a random forest algorithm model, where the training set includes historical behavior data of clients selected from a database and that data includes one or more behavioral features and corresponding second occurrence frequencies; and calculates the transaction probability value of the client to be predicted for the target property from the client's historical behavior data and the first weight value of each behavioral feature. This enables effective prediction of client transaction probability, makes it easier for sales staff to gauge a real estate client's purchase intention, and helps them provide more targeted follow-up service to clients with different transaction probabilities, improving sales performance.
Embodiment two:
Building on Embodiment One, this embodiment provides a real estate client transaction probability prediction device. Referring to Fig. 5, the real estate client transaction probability prediction device 50 implements the steps of the real estate client transaction probability prediction method described in Embodiment One, and includes a data acquisition module 51 and a processing module 52.
The data acquisition module 51 is configured to obtain the historical behavior data of a client to be predicted for a target property, the historical behavior data including one or more behavioral features and corresponding first occurrence frequencies, and to obtain the decision tree structures produced by training a training set with a random forest algorithm model. The training set includes historical behavior data and customer types of clients selected from a database; the historical behavior data of the selected clients includes one or more behavioral features and corresponding second occurrence frequencies, and the customer types include transacting clients and non-transacting clients.
The processing module 52 is configured to input the historical behavior data of the client to be predicted, as obtained by the data acquisition module 51, into the decision tree structures to obtain the transaction probability of the client to be predicted for the target property.
Behavioural characteristic includes following at least one: access duration, odd-numbered day maximum access duration, number of clicks, odd-numbered day are maximum
Number of clicks, access day, accession page number, average access duration, average number of clicks, average access day in access cycle
Number.
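As a rough illustration of how such features might be derived from raw access logs, the sketch below aggregates per-day records into the listed quantities. The record layout, the field names, and the reading of the "average" features as totals divided by the number of access cycles are assumptions; the text does not define them.

```python
def behavior_features(daily_records, n_cycles):
    """Aggregate per-day access records into the behavioral features.

    daily_records: one dict per day on which the client accessed the property,
    with hypothetical keys 'duration' (seconds), 'clicks', and 'pages'.
    n_cycles: number of access cycles covered by the records (assumed input).
    """
    total_duration = sum(r["duration"] for r in daily_records)
    total_clicks = sum(r["clicks"] for r in daily_records)
    access_days = len(daily_records)
    return {
        "access_duration": total_duration,
        "max_daily_access_duration": max((r["duration"] for r in daily_records), default=0),
        "clicks": total_clicks,
        "max_daily_clicks": max((r["clicks"] for r in daily_records), default=0),
        "access_days": access_days,
        "pages_accessed": sum(r["pages"] for r in daily_records),
        # Assumed definition: per-cycle averages over the observed cycles.
        "avg_access_duration": total_duration / n_cycles,
        "avg_clicks": total_clicks / n_cycles,
        "avg_access_days": access_days / n_cycles,
    }

records = [
    {"duration": 120, "clicks": 10, "pages": 4},
    {"duration": 300, "clicks": 25, "pages": 9},
    {"duration": 60, "clicks": 5, "pages": 2},
]
features = behavior_features(records, n_cycles=2)
```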
The second occurrence frequency includes an effective occurrence frequency. Referring to Fig. 6, the real estate client transaction probability prediction device 50 further includes a training set processing module 53.
The training set processing module 53 is configured to obtain the historical behavior data of several clients from the database, the historical behavior data including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value of each time period according to a preset mapping table between time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
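The calculation can be illustrated with a weighted sum, in which each time period's actual occurrence frequency is multiplied by that period's target weight value. The weighted-sum form, the period names, and the weight values below are assumptions made for illustration; the text only states that the effective frequency is calculated from the two quantities.

```python
# Hypothetical mapping table: time period -> second weight value.
PERIOD_WEIGHTS = {
    "last_7_days": 1.0,
    "8_to_30_days": 0.5,
    "over_30_days": 0.2,
}

def effective_frequency(actual_by_period, period_weights=PERIOD_WEIGHTS):
    """Weighted sum of actual occurrence frequencies across time periods."""
    total = 0.0
    for period, count in actual_by_period.items():
        # Look up the target weight value for this period in the mapping table.
        total += period_weights[period] * count
    return total

# Clicks for one client, split by how long ago they happened.
clicks_by_period = {"last_7_days": 4, "8_to_30_days": 6, "over_30_days": 10}
effective_clicks = effective_frequency(clicks_by_period)  # 4*1.0 + 6*0.5 + 10*0.2
```

Under these assumed weights, recent activity counts more than old activity, so two clients with the same raw click count can end up with different effective frequencies.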
Optionally, referring to Fig. 7, the real estate client transaction probability prediction device 50 further includes a training set expansion module 54, configured to obtain a first sample quantity, namely the number of samples in the training set labeled as transacting clients; to calculate a first ratio of the first sample quantity to the total number of samples in the training set; to compare the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, to expand the transacting-client samples using the synthetic minority oversampling technique.
This embodiment also provides a server. Referring to Fig. 8, the server includes a processor 81, a memory 82, and a communication bus 83. The communication bus 83 implements the connection and communication between the processor 81 and the memory 82. The processor 81 executes one or more programs stored in the memory 82 to implement the steps of the real estate client transaction probability prediction method described in Embodiment One.
Obviously, those skilled in the art should understand that the modules or steps of the invention described above may be implemented with a general-purpose computing device. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a computer storage medium (ROM/RAM, magnetic disk, optical disc) and executed by a computing device. In some cases the steps shown or described may be performed in an order different from that given here, or the modules or steps may each be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. The invention is therefore not limited to any specific combination of hardware and software.
The above is a further detailed description of the invention in combination with specific embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, all of which shall be regarded as falling within the protection scope of the invention.
Claims (10)
1. A real estate client transaction probability prediction method, characterized by comprising:
obtaining historical behavior data of a client to be predicted for a target property, the historical behavior data of the client to be predicted including one or more behavioral features and corresponding first occurrence frequencies;
obtaining decision tree structures produced by training a training set with a random forest algorithm model, the training set including historical behavior data and customer types of clients selected from a database, the historical behavior data of the selected clients including one or more behavioral features and corresponding second occurrence frequencies, and the customer types including transacting clients and non-transacting clients;
inputting the historical behavior data of the client to be predicted into the decision tree structures to obtain the transaction probability of the client to be predicted for the target property.
2. The real estate client transaction probability prediction method according to claim 1, characterized in that the behavioral features include at least one of the following:
access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and, within an access cycle, average access duration, average number of clicks, and average number of access days.
3. The real estate client transaction probability prediction method according to claim 1, characterized in that inputting the historical behavior data of the client to be predicted into the decision tree structures to obtain the transaction probability of the client to be predicted for the target property comprises:
inputting the historical behavior data of the client to be predicted into each decision tree structure to obtain the customer type to which the client to be predicted is assigned, and calculating the transaction probability of the client to be predicted from the assigned customer types.
4. The real estate client transaction probability prediction method according to any one of claims 1-3, characterized in that the second occurrence frequency includes an effective occurrence frequency, and the effective occurrence frequency is obtained as follows:
obtaining the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; determining the target weight value of each time period according to a preset mapping table between time periods and second weight values; and calculating the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
5. The real estate client transaction probability prediction method according to claim 4, characterized in that, before the training set is trained using the random forest algorithm model, the method further comprises:
obtaining a first sample quantity, namely the number of samples in the training set labeled as transacting clients; calculating a first ratio of the first sample quantity to the total number of samples in the training set; comparing the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, expanding the transacting-client samples using the synthetic minority oversampling technique.
6. A real estate client transaction probability prediction device, characterized by comprising:
a data acquisition module, configured to obtain historical behavior data of a client to be predicted for a target property, the historical behavior data of the client to be predicted including one or more behavioral features and corresponding first occurrence frequencies, and to obtain decision tree structures produced by training a training set with a random forest algorithm model, the training set including historical behavior data and customer types of clients selected from a database, the historical behavior data of the selected clients including one or more behavioral features and corresponding second occurrence frequencies, and the customer types including transacting clients and non-transacting clients;
a processing module, configured to input the historical behavior data of the client to be predicted, as obtained by the data acquisition module, into the decision tree structures to obtain the transaction probability of the client to be predicted for the target property.
7. The real estate client transaction probability prediction device according to claim 6, characterized in that the behavioral features include at least one of the following:
access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and, within an access cycle, average access duration, average number of clicks, and average number of access days.
8. The real estate client transaction probability prediction device according to claim 6, characterized in that the second occurrence frequency includes an effective occurrence frequency, and the real estate client transaction probability prediction device further comprises:
a training set processing module, configured to obtain the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value of each time period according to a preset mapping table between time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
9. The real estate client transaction probability prediction device according to any one of claims 6-8, characterized in that the real estate client transaction probability prediction device further comprises:
a training set expansion module, configured to obtain a first sample quantity, namely the number of samples in the training set labeled as transacting clients; to calculate a first ratio of the first sample quantity to the total number of samples in the training set; to compare the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, to expand the transacting-client samples using the synthetic minority oversampling technique.
10. A server, characterized in that the server includes a processor, a memory, and a communication bus;
the communication bus is configured to implement the connection and communication between the processor and the memory;
the processor is configured to execute one or more programs stored in the memory to implement the steps of the real estate client transaction probability prediction method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811478616.4A CN109615128A (en) | 2018-12-05 | 2018-12-05 | Real estate client's conclusion of the business probability forecasting method, device and server |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109615128A (en) | 2019-04-12 |
Family
ID=66005462
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811478616.4A Pending CN109615128A (en) | 2018-12-05 | 2018-12-05 | Real estate client's conclusion of the business probability forecasting method, device and server |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109615128A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140089908A1 (en) * | 2012-09-25 | 2014-03-27 | Jeffrey S. Dunn | Decision Tree Ensemble Compilation |
DE102013022171A1 (en) * | 2013-03-15 | 2014-09-18 | Nvidia Corporation | Execution of object recognition operations by means of a graphic processing unit |
CN107590688A (en) * | 2017-08-24 | 2018-01-16 | 平安科技(深圳)有限公司 | The recognition methods of target customer and terminal device |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112288117A (en) * | 2019-07-23 | 2021-01-29 | 贝壳技术有限公司 | Target customer deal probability prediction method and device and electronic equipment |
CN110675069A (en) * | 2019-09-26 | 2020-01-10 | 重庆锐云科技有限公司 | Real estate industry client signing risk early warning method, server and storage medium |
CN110675069B (en) * | 2019-09-26 | 2022-05-31 | 重庆锐云科技有限公司 | Real estate industry client signing risk early warning method, server and storage medium |
JP7109804B2 (en) | 2019-10-17 | 2022-08-01 | 株式会社ウチダレック | Output program, output method and output device |
JP2021068437A (en) * | 2019-10-17 | 2021-04-30 | 株式会社ウチダレック | Output program, method for output, and output device |
CN110716979A (en) * | 2019-10-18 | 2020-01-21 | 重庆锐云科技有限公司 | House buying intention client mining method, device and server |
CN110852797A (en) * | 2019-10-29 | 2020-02-28 | 深圳市看见网络科技有限公司 | Method, mobile terminal and computer storage medium for helping broker to judge guests efficiently |
CN110910282A (en) * | 2019-11-27 | 2020-03-24 | 重庆锐云科技有限公司 | System, method and storage medium for earning commission of multi-role customer service for real estate electronic commerce |
CN111158732A (en) * | 2019-12-23 | 2020-05-15 | 中国平安人寿保险股份有限公司 | Access data processing method and device, computer equipment and storage medium |
CN111158732B (en) * | 2019-12-23 | 2024-04-02 | 中国平安人寿保险股份有限公司 | Access data processing method, device, computer equipment and storage medium |
CN111292140A (en) * | 2020-03-19 | 2020-06-16 | 重庆锐云科技有限公司 | Online customer intelligent distribution method |
CN111414542A (en) * | 2020-03-20 | 2020-07-14 | 重庆锐云科技有限公司 | Real estate customer group identification and marketing method |
CN111415199A (en) * | 2020-03-20 | 2020-07-14 | 重庆锐云科技有限公司 | Customer prediction updating method and device based on big data and storage medium |
CN111563628A (en) * | 2020-05-09 | 2020-08-21 | 重庆锐云科技有限公司 | Real estate customer transaction time prediction method, device and storage medium |
CN111695015A (en) * | 2020-06-04 | 2020-09-22 | 重庆锐云科技有限公司 | Customer behavior analysis method and device, computer equipment and storage medium |
CN111815066A (en) * | 2020-07-21 | 2020-10-23 | 上海数鸣人工智能科技有限公司 | User click prediction method based on gradient lifting decision tree |
CN111815066B (en) * | 2020-07-21 | 2021-03-26 | 上海数鸣人工智能科技有限公司 | User click prediction method based on gradient lifting decision tree |
CN112001757A (en) * | 2020-08-26 | 2020-11-27 | 中山世达模型制造有限公司 | Sales order prediction method |
CN112417267A (en) * | 2020-10-10 | 2021-02-26 | 腾讯科技(深圳)有限公司 | User behavior analysis method and device, computer equipment and storage medium |
CN115239400A (en) * | 2022-09-21 | 2022-10-25 | 广州越创智数信息科技有限公司 | House purchasing user intention degree calculation method and system |
CN115309737A (en) * | 2022-10-11 | 2022-11-08 | 深圳市明源云客电子商务有限公司 | Visitor intention analysis method and system, terminal device and readable storage medium |
CN117632905A (en) * | 2023-11-28 | 2024-03-01 | 广州视声智能科技有限公司 | Database management method and system based on cloud use records |
CN117632905B (en) * | 2023-11-28 | 2024-05-17 | 广州视声智能科技有限公司 | Database management method and system based on cloud use records |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190412 |