CN109615128A - Method, device and server for predicting the transaction probability of real estate clients - Google Patents

Method, device and server for predicting the transaction probability of real estate clients

Info

Publication number
CN109615128A
Authority
CN
China
Prior art keywords
client
conclusion
business
historical behavior
behavior data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811478616.4A
Other languages
Chinese (zh)
Inventor
李琦
宋卫东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Rui Yun Technology Co Ltd
Original Assignee
Chongqing Rui Yun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Rui Yun Technology Co Ltd
Priority to CN201811478616.4A
Publication of CN109615128A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/16 Real estate

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method, device and server for predicting the transaction probability of real estate clients. Historical behavior data of a client under evaluation with respect to a target property are obtained; these data comprise one or more behavioral features and the corresponding first occurrence frequencies. A decision tree structure is obtained by training a random forest model on a training set; the training set contains the historical behavior data and client types of clients selected from a database, and the historical behavior data of the selected clients comprise one or more behavioral features and the corresponding second occurrence frequencies. The historical behavior data of the client under evaluation are input into the decision tree structure to obtain that client's transaction probability for the target property. The transaction probability is thereby predicted effectively, which makes it easier for sales staff to assess a real estate client's purchase intent, helps them follow up with clients of differing transaction probability in a more targeted way, and improves sales performance.

Description

Method, device and server for predicting the transaction probability of real estate clients
Technical field
The present invention relates to the real estate field, and in particular to a method, device and server for predicting the transaction probability of real estate clients.
Background
At present, a client's interest in a property project is judged mainly from offline visits and from what the client says while being received by sales staff. Once the client leaves the sales site, it is very difficult for sales staff to keep track of them. The intent-revealing behavior that a client shows toward a project therefore cannot currently be tracked and recorded by any effective means, and collecting multidimensional information about the client is very difficult.
How to judge a client's true purchase intent has long troubled sales staff. During everyday reception, the time a salesperson spends with a client is limited, and the conversation usually just revolves around an introduction to the project. The salesperson cannot explore home-buying behavior in depth from each client's own perspective, so it is hard to determine the client's intent accurately from such a brief contact.
Summary of the invention
The main technical problem addressed by the method, device and server provided by the invention is that a real estate client's degree of intent toward a project (the transaction probability) cannot be predicted effectively.
To solve this technical problem, the present invention provides a method for predicting the transaction probability of real estate clients, comprising:
obtaining historical behavior data of a client under evaluation with respect to a target property, the historical behavior data of the client under evaluation comprising one or more behavioral features and the corresponding first occurrence frequencies;
obtaining a decision tree structure produced by training a random forest model on a training set, wherein the training set comprises the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients comprise one or more behavioral features and the corresponding second occurrence frequencies, and the client types comprise closed clients and non-closed clients;
inputting the historical behavior data of the client under evaluation into the decision tree structure to obtain the client's transaction probability for the target property.
Further, the behavioral features include at least one of the following:
access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access period.
Further, inputting the historical behavior data of the client under evaluation into the decision tree structure to obtain the client's transaction probability for the target property comprises:
inputting the historical behavior data of the client under evaluation into each decision tree structure to obtain the client type to which the client belongs, and calculating the client's transaction probability from the client types so obtained.
Further, the second occurrence frequency comprises an effective occurrence frequency, which is obtained as follows:
the historical behavior data of several clients are obtained from the database, comprising one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; the target weight value for each time period is determined from a preset mapping table between time periods and second weight values; and the effective occurrence frequency is calculated from the actual occurrence frequencies in the different time periods and the target weight values.
Further, before the random forest model is trained on the training set, the method also comprises:
obtaining the first sample count, that is, the number of samples in the training set labeled as closed clients; calculating the first ratio of the first sample count to the total number of samples in the training set; comparing the first ratio with a first set ratio; and, when the first ratio is smaller than the first set ratio, augmenting the closed-client samples using the synthetic minority oversampling technique.
The present invention also provides a device for predicting the transaction probability of real estate clients, comprising:
a data acquisition module, used to obtain historical behavior data of a client under evaluation with respect to a target property, the historical behavior data comprising one or more behavioral features and the corresponding first occurrence frequencies, and also used to obtain a decision tree structure produced by training a random forest model on a training set, wherein the training set comprises the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients comprise one or more behavioral features and the corresponding second occurrence frequencies, and the client types comprise closed clients and non-closed clients;
a processing module, used to input the historical behavior data of the client under evaluation obtained by the data acquisition module into the decision tree structure and to calculate the client's transaction probability for the target property.
Further, the behavioral features include at least one of the following:
access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access period.
Further, the second occurrence frequency comprises an effective occurrence frequency, and the device further comprises:
a training set processing module, used to obtain the historical behavior data of several clients from the database, comprising one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value for each time period from a preset mapping table between time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
Further, the device also comprises:
a training set augmentation module, used to obtain the first sample count, that is, the number of samples in the training set labeled as closed clients; to calculate the first ratio of the first sample count to the total number of samples in the training set; to compare the first ratio with a first set ratio; and, when the first ratio is smaller than the first set ratio, to augment the closed-client samples using the synthetic minority oversampling technique.
The present invention also provides a server comprising a processor, a memory and a communication bus;
the communication bus is used to connect the processor and the memory for communication;
the processor is used to execute one or more programs stored in the memory so as to implement the steps of the method for predicting the transaction probability of real estate clients described in any of the above.
The beneficial effects of the present invention are as follows:
According to the method, device and server provided by the invention, historical behavior data of a client under evaluation with respect to a target property are obtained, comprising one or more behavioral features and the corresponding first occurrence frequencies; a decision tree structure is obtained by training a random forest model on a training set; the training set contains the historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients comprise one or more behavioral features and the corresponding second occurrence frequencies, and the client types comprise closed clients and non-closed clients; and the historical behavior data of the client under evaluation are input into the decision tree structure to obtain the client's transaction probability for the target property. The transaction probability is thereby predicted effectively, which makes it easier for sales staff to assess a real estate client's purchase intent, helps them follow up with clients of differing transaction probability in a more targeted way, and improves sales performance.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for predicting the transaction probability of real estate clients according to Embodiment 1 of the invention;
Fig. 2 is a schematic diagram of K-fold cross-validation according to Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of the random forest training process according to Embodiment 1 of the invention;
Fig. 4 is a schematic diagram of sample augmentation according to Embodiment 1 of the invention;
Fig. 5 is a first structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment 2 of the invention;
Fig. 6 is a second structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment 2 of the invention;
Fig. 7 is a third structural diagram of the device for predicting the transaction probability of real estate clients according to Embodiment 2 of the invention;
Fig. 8 is a structural diagram of the server according to Embodiment 2 of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below through specific embodiments with reference to the drawings. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Embodiment 1:
Referring to Fig. 1, which is a flow diagram of a method for predicting the transaction probability of real estate clients provided by Embodiment 1 of the invention, the method mainly comprises the following steps:
S101: obtain historical behavior data of a client under evaluation with respect to a target property; the historical behavior data comprise one or more behavioral features and the corresponding first occurrence frequencies.
The model takes "client ID + property ID" as its basic unit. The transaction probability of the client under evaluation is predicted from that client's historical behavior data for the target property; the client's historical behavior data for other properties are not used in predicting the transaction probability for the target property. The historical behavior data comprise behavioral features and the first occurrence frequency corresponding to each feature. For example, for target property "1567", see Table 1 below:
Table 1
Client ID    Access duration    Number of clicks    Access days    Pages accessed
“jm122”      1552               255                 12             56
“cs233”      899                123                 23             96
The behavioral features include, but are not limited to: access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access period.
It should be noted that a client's historical behavior data can be stored on the server or in a database. When the client uses a client application to access the service, the historical behavior data can be recorded and uploaded to the server, so the server can retrieve the relevant client's historical behavior data from its own memory or from the database.
S102: obtain a decision tree structure produced by training a random forest model on a training set. The training set contains the historical behavior data and client types of clients selected from a database; the historical behavior data of the selected clients comprise one or more behavioral features and the corresponding second occurrence frequencies, and the client types comprise closed clients and non-closed clients.
When a user has a need for a certain product, the user also pays correspondingly more attention to it, and this attention is reflected fairly accurately in the user's historical access behavior and filed profile data. Closed users and non-closed users therefore also differ markedly in their historical access behavior.
A random forest algorithm can be used to train a model on users' historical behavior data; after training, the decision tree structure is output. To make the modeling process easier to understand, it is illustrated below with a specific example. The example serves only to explain the process and should not be construed as limiting the invention:
For example, the training set includes samples A, B, C and D, with the client type used as the data label: 0 denotes a non-closed client and 1 denotes a closed client, as shown in Table 2 below:
Table 2
Training a random forest model on this training set yields the decision tree structure shown in Fig. 3. In practice, multiple decision trees can be built under different conditions. The sample under evaluation is then input into each decision tree and passed down level by level according to the tree's decisions; the node it finally falls into determines which class the client belongs to. The order in which features are tested in a tree, for example whether behavioral feature a or behavioral feature b is tested first, can be determined from the information entropy of the different splits. This can be done in a known manner and is not described further here.
Once the decision trees have been built, the decision tree structure can be used to predict the transaction probability of the client under evaluation: the client is input into each decision tree, and the client is assigned the client type of the node they fall into. Suppose the historical behavior data of the client under evaluation are: behavioral feature a is 25 and behavioral feature b is 10. Inputting these into the first decision tree gives client type 1, that is, a closed client; inputting them into the second decision tree also gives client type 1, a closed client. Since both results are "closed client", the client's transaction probability is relatively high. When there are more decision trees, the same client may be classified as a closed client by some trees and as a non-closed client by others; the client type can then be chosen by voting. For example, if 10 trees output "closed client" and 5 output "non-closed client", the vote selects "closed client".
Optionally, when there are more decision trees, the proportion of trees whose output is "closed client" can be counted to determine the transaction probability of the client under evaluation. For example, with 20 decision trees, the client is input into each tree and classified separately; if 15 trees output "closed client" and 5 output "non-closed client", the transaction probability is 15/20 = 75%.
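The voting and proportion calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `transaction_probability` and `majority_label` are hypothetical, and each tree is assumed to have already produced a 0/1 label for the client.

```python
from collections import Counter

CLOSED, NOT_CLOSED = 1, 0  # label convention used for the training set


def transaction_probability(tree_votes):
    """Return the fraction of trees that voted 'closed client' (label 1)."""
    if not tree_votes:
        raise ValueError("at least one tree vote is required")
    return sum(1 for vote in tree_votes if vote == CLOSED) / len(tree_votes)


def majority_label(tree_votes):
    """Return the label chosen by most trees."""
    return Counter(tree_votes).most_common(1)[0][0]


# 20 trees: 15 vote "closed client", 5 vote "non-closed client".
votes = [CLOSED] * 15 + [NOT_CLOSED] * 5
print(transaction_probability(votes))  # 0.75
print(majority_label(votes))           # 1
```

The proportion form keeps more information than the majority vote alone, since clients with 55% and 95% of the votes would otherwise receive the same label.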
Random forest modeling mainly comprises three basic steps: model construction, model training and model optimization. In model construction, the algorithm framework is set up on the server following the random forest approach. During model training, the training parameters can be tuned using K-fold cross-validation or grid search (GridSearchCV). K-fold cross-validation divides the training data into k equal parts, trains on k-1 of them and uses the remaining part for testing; Fig. 2 is a schematic diagram of K-fold cross-validation. GridSearchCV automatically performs cross-validation for a given model and tracks the scores as each parameter is varied, replacing the for loop that a manual parameter search would require. Model optimization adjusts the model's hyperparameters in a targeted way, based on the results on the test set, by varying one hyperparameter at a time. For example, if the current hyperparameters are a, b and c, then b and c are held fixed while a is searched over a sufficiently wide range between a lower and an upper bound to find its optimum; the other hyperparameters are optimized in the same way. The hyperparameters include, but are not limited to, the number of decision trees and their depth.
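To make the K-fold split concrete, the following sketch generates the train/test index sets for each fold. It is a simplified illustration: `k_fold_indices` is a hypothetical helper, the folds are contiguous and unshuffled, and fold sizes differ by at most one when the sample count is not divisible by k.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Each of the 5 folds serves once as the test part; the other 4 are used for training.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print(test_idx, len(train_idx))
```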
The random forest algorithm can automatically handle data in a variety of forms, such as binary, categorical and numerical features. During model training it performs implicit feature selection automatically, and after training it provides feature importance measures. Because a random forest model can handle continuous numerical variables, the input features need no additional processing such as discretization or normalization. In addition, once training is complete the model provides a weight for each behavioral feature, that is, a feature ranking based on the learned model. Extensive feature engineering is therefore unnecessary; after training, the features need only be determined using a backward-selection search.
The model is trained with the random forest implementation in the sklearn machine learning library. Because a random forest model has many parameters, the training parameters are tuned during training using k-fold cross-validation combined with grid search (GridSearchCV). The random forest training process is shown in Fig. 3.
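A minimal sketch of this training setup with scikit-learn is shown below. Everything specific in it is an assumption made for illustration: the data are synthetic (`make_classification` stands in for the behavior features), the parameter grid values are arbitrary, and `roc_auc` is an assumed scoring metric, since the source does not name one.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the behavior features; class 1 plays the role of "closed client".
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Illustrative grid over two hyperparameters: number of trees and tree depth.
param_grid = {"n_estimators": [10, 20], "max_depth": [3, 5]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,               # 3-fold cross-validation for every parameter combination
    scoring="roc_auc",  # assumed metric, not specified in the source
)
search.fit(X, y)

best_model = search.best_estimator_
# Column 1 of predict_proba is the estimated probability of class 1.
probabilities = best_model.predict_proba(X[:5])[:, 1]
# Per-feature importance scores, usable as a feature ranking.
importances = best_model.feature_importances_
print(search.best_params_)
```

Note that scikit-learn's `predict_proba` averages the class probabilities predicted by the individual trees, which is close to, but not identical with, the hard-vote proportion described in the text.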
It optionally, can also include: that data preparation, feature selecting and data prediction waited before modeling training Journey, wherein Data Preparation Process: being handled and analyzed to data for convenience, and the data of storage in the database are imported into In Series and Dataframe data structure.Series is an one-dimensional similar array object, the number comprising an array According to (data type of any NumPy) and one and the associated data label of array, that is, index.Seriers interactive display String representation is index on the left side, and value is on the right.Datarame indicates a table, is a kind of similar electrical form Data structure, comprising one by sequence list collection, each of which can have different types values (number, character string, Boolean etc.).Datarame has the index of row and column.It can be counted as the dictionary of a Series, and (each Series is total Enjoy an index).
Feature selection: the model is based mainly on two kinds of data: (1) the user's historical behavior data; (2) the user's filed profile information (including identity information such as ID, name and telephone number). The model takes "user ID + property ID" as its basic statistical unit and uses features such as access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access period.
Step 1: based on the client's basic attributes and behavioral attributes, and taking the business scenario into account, brainstorm and compile a set F of n candidate features;
Step 2: split and merge feature attributes. Splitting: an attribute F_i (0 ≤ i < n) in the original feature set F is split into several features according to specific rules; for example, "access duration" is split into "single-day maximum access duration" and "average access duration within an access period". Merging (dimensionality reduction): several strongly correlated indicators are merged using techniques such as PCA to reduce the number of indicator dimensions; for example, "access frequency" is expressed using "number of accesses" or "access period";
Step 3: the features obtained after splitting and merging form the m-dimensional feature set F′ finally used in the model, which mainly includes: access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access period.
Sample extraction: following the feature extraction rules worked out during feature engineering (Step 2), sample data are extracted from the database to form the sample set S (that is, the training set). S contains the value of each feature (including structured and unstructured data) and the sample label (0 may denote not closed; 1 may denote closed).
Data preprocessing: after the data are obtained, the data set is preprocessed to ensure accuracy by cleaning dirty data. This includes removing duplicate records, detecting and removing outliers, handling missing values, converting data types and, where needed, dimensionality reduction. In the outlier detection step, outliers are removed using methods such as univariate anomaly detection and clustering-based anomaly detection.
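Two of the cleaning steps above, duplicate removal and univariate outlier detection, can be sketched as follows. The source does not say which univariate method is used; the interquartile-range (IQR) rule here is an assumption chosen for illustration, and the helper names and sample rows are hypothetical.

```python
import statistics


def drop_duplicates(rows):
    """Remove exact duplicate rows while keeping first-seen order."""
    return list(dict.fromkeys(rows))


def drop_iqr_outliers(rows, value_index, factor=1.5):
    """Drop rows whose value lies outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    values = [row[value_index] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [row for row in rows if low <= row[value_index] <= high]


# (client ID, number of clicks); one duplicated row and one implausible value.
rows = [
    ("c1", 10), ("c2", 12), ("c2", 12), ("c3", 11),
    ("c4", 13), ("c5", 14), ("c6", 12), ("c7", 500),
]
cleaned = drop_iqr_outliers(drop_duplicates(rows), value_index=1)
print(cleaned)
```

Duplicates are removed before the quartiles are computed so that repeated records do not distort the bounds.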
The training set includes each sample's behavioral features and their corresponding values (that is, the second occurrence frequencies). In this embodiment, the second occurrence frequency refers to the effective occurrence frequency, which is obtained by weighting the actual occurrence frequencies. Specifically: the historical behavior data of several clients are obtained from the database, comprising one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; the target weight value for each time period is determined from a preset mapping table between time periods and second weight values; and the effective occurrence frequency is calculated from the actual occurrence frequencies in the different time periods and the target weight values.
For example, suppose a client's clicks on the target property break down as follows: 1000 clicks in the most recent week, 500 clicks in the most recent month (excluding the most recent week), 10000 clicks in the most recent half year (excluding the most recent month), and 6000 clicks more than half a year ago. Suppose further that the preset mapping table between time periods and second weight values is as follows:
Table 3
Period                 Most recent week    Most recent month    Most recent half year    More than half a year ago
Second weight value    1                   0.8                  0.5                      0.1
The effective occurrence frequency of the client's clicks is then: 1000*1 + 500*0.8 + 10000*0.5 + 6000*0.1 = 7000.
A feature carries different weight at different points in time: a client's access behavior in the most recent week certainly has more reference value than access behavior from a month earlier. Assigning different weights to the client's access behavior in different periods during modeling therefore helps improve prediction accuracy.
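The time-weighted calculation can be written as a small function. The period keys and the names `PERIOD_WEIGHTS` and `effective_frequency` are hypothetical; the weights and click counts are those of the worked example and Table 3.

```python
# Mapping from time period to second weight value (values from Table 3).
PERIOD_WEIGHTS = {
    "last_week": 1.0,
    "last_month": 0.8,
    "last_half_year": 0.5,
    "older": 0.1,
}


def effective_frequency(actual_by_period, weights=PERIOD_WEIGHTS):
    """Weighted sum of the actual occurrence frequencies over all time periods."""
    return sum(count * weights[period] for period, count in actual_by_period.items())


# Click counts per non-overlapping period, as in the worked example.
clicks = {"last_week": 1000, "last_month": 500, "last_half_year": 10000, "older": 6000}
effective_clicks = effective_frequency(clicks)
print(round(effective_clicks))  # 7000
```

Keeping the weights in a mapping separate from the calculation mirrors the preset mapping table in the text, so the weights can be changed without touching the code that applies them.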
Optionally, before the random forest model is trained on the training set, the first sample count, that is, the number of samples in the training set labeled as closed clients, can be obtained and its first ratio to the total number of samples in the training set calculated. The first ratio is compared with a first set ratio; when the first ratio is smaller than the first set ratio, the closed-client samples are augmented using the Synthetic Minority Oversampling Technique (SMOTE). SMOTE generates new minority-class sample points according to certain rules, with a random component, and adds them to the original data set for training. Fig. 4 illustrates how SMOTE generates new sample points. When the numbers of closed and non-closed samples differ greatly, rebalancing the training set with SMOTE reduces the impact of severe class imbalance.
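The ratio check and the oversampling idea can be sketched as below. This is a simplified SMOTE-style illustration rather than a reference implementation: each synthetic point is placed at a random position on the line segment between a randomly chosen minority sample and one of its k nearest minority neighbours. The threshold value, the sample coordinates and the helper names are assumptions.

```python
import math
import random


def minority_ratio(labels, minority_label=1):
    """Share of samples carrying the minority label (the 'first ratio')."""
    return sum(1 for label in labels if label == minority_label) / len(labels)


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating towards near minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


labels = [1] * 4 + [0] * 16          # 4 closed clients out of 20 samples
closed = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
FIRST_SET_RATIO = 0.3                # hypothetical threshold

new_points = smote_like(closed, n_new=6) if minority_ratio(labels) < FIRST_SET_RATIO else []
augmented_closed = closed + new_points
print(len(augmented_closed))  # 10
```

Because every synthetic point lies between two existing closed-client samples, the augmented set stays within the region already occupied by the minority class.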
The model is based on the random forest algorithm and is built from the behavioral features of users' historical visits and from their archived profile information. Its objective is to determine a user's transaction probability, predicting that probability from the user's multi-dimensional features.
S103: input the historical behavior data of the client to be tested into the decision tree structure to obtain the transaction probability of the client to be tested for the target property.
The historical behavior data of the client to be tested is input into each decision tree structure to obtain the client type to which the client to be tested belongs according to that tree, and the transaction probability of the client to be tested is calculated from the client types given by all the trees.
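One common way to turn per-tree client types into a probability is to take the fraction of trees that classify the client as a transacted client. The text does not spell out the exact calculation, so the sketch below is an assumption; the stub trees, feature names and thresholds are hypothetical and stand in for trees learned from the training set.

```python
def transaction_probability(trees, features):
    """Fraction of decision trees that classify the client as 'transacted'."""
    votes = [tree(features) for tree in trees]
    return sum(1 for vote in votes if vote == "transacted") / len(votes)


# Hypothetical stand-ins for trained decision trees: each maps a feature
# dictionary to a client type.
trees = [
    lambda f: "transacted" if f["effective_clicks"] > 5000 else "non_transacted",
    lambda f: "transacted" if f["access_days"] >= 10 else "non_transacted",
    lambda f: "transacted" if f["avg_access_minutes"] > 3 else "non_transacted",
]

client = {"effective_clicks": 7000, "access_days": 4, "avg_access_minutes": 5.5}
probability = transaction_probability(trees, client)
```

Here two of the three stub trees vote "transacted", so `probability` is two thirds. Averaging each tree's leaf-level class proportion instead of counting hard votes would be an equally plausible reading of the text.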
In the real estate client transaction probability prediction method provided by the invention, historical behavior data of the client to be tested with respect to the target property is obtained, the historical behavior data of the client to be tested including one or more behavioral features and the corresponding first occurrence frequencies; the first weight value corresponding to each behavioral feature, obtained by training the training set with the random forest algorithm model, is obtained, the training set including historical behavior data of clients selected from a database, and the historical behavior data of the selected clients including one or more behavioral features and the corresponding second occurrence frequencies; and the transaction probability value of the client to be tested for the target property is calculated from the historical behavior data of the client to be tested and the first weight value corresponding to each behavioral feature. The client's transaction probability is thereby predicted effectively, which makes it easier for sales staff to analyze the purchase intention of real estate clients and to provide more targeted follow-up service to clients with different transaction probabilities, improving sales performance.
Embodiment two:
On the basis of embodiment one, this embodiment provides a real estate client transaction probability prediction device. Referring to Fig. 5, the real estate client transaction probability prediction device 50 is used to implement the steps of the real estate client transaction probability prediction method described in embodiment one, and the prediction device 50 includes a data acquisition module 51 and a processing module 52.
The data acquisition module 51 is used to obtain historical behavior data of the client to be tested with respect to the target property, the historical behavior data of the client to be tested including one or more behavioral features and the corresponding first occurrence frequencies; and to obtain the decision tree structure obtained by training the training set with the random forest algorithm model. The training set includes historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients includes one or more behavioral features and the corresponding second occurrence frequencies, and the client types include transacted clients and non-transacted clients;
The processing module 52 is used to input the historical behavior data of the client to be tested obtained by the data acquisition module 51 into the decision tree structure to obtain the transaction probability of the client to be tested for the target property.
The behavioral features include at least one of the following: access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access cycle.
The second occurrence frequency includes an effective occurrence frequency, and the real estate client transaction probability prediction device 50 further includes a training set processing module 53; refer to Fig. 6:
The training set processing module 53 is used to obtain the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value corresponding to each time period according to the preset mapping table between time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
Optionally, referring to Fig. 7, the real estate client transaction probability prediction device 50 further includes a training set expansion module 54, which is used to obtain the first sample quantity, i.e. the number of samples in the training set labeled as transacted clients, calculate the first ratio of the first sample quantity to the total number of samples in the training set, compare the first ratio with the first set ratio, and, when the first ratio is less than the first set ratio, expand the transacted-client samples using the synthetic minority oversampling technique.
This embodiment also provides a server. Referring to Fig. 8, the server includes a processor 81, a memory 82 and a communication bus 83. The communication bus 83 is used to implement connection and communication between the processor 81 and the memory 82; the processor 81 is used to execute one or more programs stored in the memory 82 to implement the steps of the real estate client transaction probability prediction method described in embodiment one.
Obviously, those skilled in the art should understand that the modules or steps of the invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a computer storage medium (ROM/RAM, magnetic disk, optical disc) and executed by a computing device; in some cases the steps shown or described can be executed in an order different from the one given here. Alternatively, they can be made into individual integrated circuit modules, or several of the modules or steps can be made into a single integrated circuit module. The invention is therefore not limited to any specific combination of hardware and software.
The above is a further detailed description of the invention in connection with specific embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. Those of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, all of which shall be regarded as falling within the scope of protection of the invention.

Claims (10)

1. A real estate client transaction probability prediction method, characterized by comprising:
Obtaining historical behavior data of a client to be tested with respect to a target property, the historical behavior data of the client to be tested including one or more behavioral features and corresponding first occurrence frequencies;
Obtaining a decision tree structure obtained by training a training set with a random forest algorithm model; the training set includes historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients includes one or more behavioral features and corresponding second occurrence frequencies, and the client types include transacted clients and non-transacted clients;
Inputting the historical behavior data of the client to be tested into the decision tree structure to obtain the transaction probability of the client to be tested for the target property.
2. The real estate client transaction probability prediction method according to claim 1, characterized in that the behavioral features include at least one of the following:
Access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access cycle.
3. The real estate client transaction probability prediction method according to claim 1, characterized in that inputting the historical behavior data of the client to be tested into the decision tree structure to obtain the transaction probability of the client to be tested for the target property includes:
Inputting the historical behavior data of the client to be tested into each decision tree structure to obtain the client type to which the client to be tested belongs, and calculating the transaction probability of the client to be tested from each such client type.
4. The real estate client transaction probability prediction method according to any one of claims 1-3, characterized in that the second occurrence frequency includes an effective occurrence frequency, and the effective occurrence frequency is obtained as follows:
Obtaining the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; determining the target weight value corresponding to each time period according to a preset mapping table between time periods and second weight values; and calculating the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
5. The real estate client transaction probability prediction method according to claim 4, characterized in that, before the training set is trained with the random forest algorithm model, the method further includes:
Obtaining the first sample quantity, i.e. the number of samples in the training set labeled as transacted clients; calculating the first ratio of the first sample quantity to the total number of samples in the training set; comparing the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, expanding the transacted-client samples using the synthetic minority oversampling technique.
6. A real estate client transaction probability prediction device, characterized by comprising:
A data acquisition module, used to obtain historical behavior data of a client to be tested with respect to a target property, the historical behavior data of the client to be tested including one or more behavioral features and corresponding first occurrence frequencies; and to obtain a decision tree structure obtained by training a training set with a random forest algorithm model; the training set includes historical behavior data and client types of clients selected from a database, the historical behavior data of the selected clients includes one or more behavioral features and corresponding second occurrence frequencies, and the client types include transacted clients and non-transacted clients;
A processing module, used to input the historical behavior data of the client to be tested obtained by the data acquisition module into the decision tree structure to obtain the transaction probability of the client to be tested for the target property.
7. The real estate client transaction probability prediction device according to claim 6, characterized in that the behavioral features include at least one of the following:
Access duration, single-day maximum access duration, number of clicks, single-day maximum number of clicks, number of access days, number of pages accessed, and the average access duration, average number of clicks and average number of access days within an access cycle.
8. The real estate client transaction probability prediction device according to claim 6, characterized in that the second occurrence frequency includes an effective occurrence frequency, and the real estate client transaction probability prediction device further includes:
A training set processing module, used to obtain the historical behavior data of several clients from the database, the historical behavior data of the several clients including one or more behavioral features and the corresponding actual occurrence frequencies in different time periods; to determine the target weight value corresponding to each time period according to a preset mapping table between time periods and second weight values; and to calculate the effective occurrence frequency from the actual occurrence frequencies in the different time periods and the target weight values.
9. The real estate client transaction probability prediction device according to any one of claims 6-8, characterized in that the real estate client transaction probability prediction device further includes:
A training set expansion module, used to obtain the first sample quantity, i.e. the number of samples in the training set labeled as transacted clients; to calculate the first ratio of the first sample quantity to the total number of samples in the training set; to compare the first ratio with a first set ratio; and, when the first ratio is less than the first set ratio, to expand the transacted-client samples using the synthetic minority oversampling technique.
10. A server, characterized in that the server includes a processor, a memory and a communication bus;
The communication bus is used to implement connection and communication between the processor and the memory;
The processor is used to execute one or more programs stored in the memory to implement the steps of the real estate client transaction probability prediction method according to any one of claims 1 to 5.
CN201811478616.4A 2018-12-05 2018-12-05 Real estate client's conclusion of the business probability forecasting method, device and server Pending CN109615128A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811478616.4A CN109615128A (en) 2018-12-05 2018-12-05 Real estate client's conclusion of the business probability forecasting method, device and server

Publications (1)

Publication Number Publication Date
CN109615128A true CN109615128A (en) 2019-04-12

Family

ID=66005462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811478616.4A Pending CN109615128A (en) 2018-12-05 2018-12-05 Real estate client's conclusion of the business probability forecasting method, device and server

Country Status (1)

Country Link
CN (1) CN109615128A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140089908A1 (en) * 2012-09-25 2014-03-27 Jeffrey S. Dunn Decision Tree Ensemble Compilation
DE102013022171A1 (en) * 2013-03-15 2014-09-18 Nvidia Corporation Execution of object recognition operations by means of a graphic processing unit
CN107590688A (en) * 2017-08-24 2018-01-16 平安科技(深圳)有限公司 The recognition methods of target customer and terminal device

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288117A (en) * 2019-07-23 2021-01-29 贝壳技术有限公司 Target customer deal probability prediction method and device and electronic equipment
CN110675069A (en) * 2019-09-26 2020-01-10 重庆锐云科技有限公司 Real estate industry client signing risk early warning method, server and storage medium
CN110675069B (en) * 2019-09-26 2022-05-31 重庆锐云科技有限公司 Real estate industry client signing risk early warning method, server and storage medium
JP7109804B2 (en) 2019-10-17 2022-08-01 株式会社ウチダレック Output program, output method and output device
JP2021068437A (en) * 2019-10-17 2021-04-30 株式会社ウチダレック Output program, method for output, and output device
CN110716979A (en) * 2019-10-18 2020-01-21 重庆锐云科技有限公司 House buying intention client mining method, device and server
CN110852797A (en) * 2019-10-29 2020-02-28 深圳市看见网络科技有限公司 Method, mobile terminal and computer storage medium for helping broker to judge guests efficiently
CN110910282A (en) * 2019-11-27 2020-03-24 重庆锐云科技有限公司 System, method and storage medium for earning commission of multi-role customer service for real estate electronic commerce
CN111158732A (en) * 2019-12-23 2020-05-15 中国平安人寿保险股份有限公司 Access data processing method and device, computer equipment and storage medium
CN111158732B (en) * 2019-12-23 2024-04-02 中国平安人寿保险股份有限公司 Access data processing method, device, computer equipment and storage medium
CN111292140A (en) * 2020-03-19 2020-06-16 重庆锐云科技有限公司 Online customer intelligent distribution method
CN111414542A (en) * 2020-03-20 2020-07-14 重庆锐云科技有限公司 Real estate customer group identification and marketing method
CN111415199A (en) * 2020-03-20 2020-07-14 重庆锐云科技有限公司 Customer prediction updating method and device based on big data and storage medium
CN111563628A (en) * 2020-05-09 2020-08-21 重庆锐云科技有限公司 Real estate customer transaction time prediction method, device and storage medium
CN111695015A (en) * 2020-06-04 2020-09-22 重庆锐云科技有限公司 Customer behavior analysis method and device, computer equipment and storage medium
CN111815066A (en) * 2020-07-21 2020-10-23 上海数鸣人工智能科技有限公司 User click prediction method based on gradient lifting decision tree
CN111815066B (en) * 2020-07-21 2021-03-26 上海数鸣人工智能科技有限公司 User click prediction method based on gradient lifting decision tree
CN112001757A (en) * 2020-08-26 2020-11-27 中山世达模型制造有限公司 Sales order prediction method
CN112417267A (en) * 2020-10-10 2021-02-26 腾讯科技(深圳)有限公司 User behavior analysis method and device, computer equipment and storage medium
CN115239400A (en) * 2022-09-21 2022-10-25 广州越创智数信息科技有限公司 House purchasing user intention degree calculation method and system
CN115309737A (en) * 2022-10-11 2022-11-08 深圳市明源云客电子商务有限公司 Visitor intention analysis method and system, terminal device and readable storage medium
CN117632905A (en) * 2023-11-28 2024-03-01 广州视声智能科技有限公司 Database management method and system based on cloud use records
CN117632905B (en) * 2023-11-28 2024-05-17 广州视声智能科技有限公司 Database management method and system based on cloud use records

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190412
