CN110288465A - Object determination method and device, storage medium, and electronic device
- Publication number
- CN110288465A (application CN201910533200.6A)
- Authority
- CN
- China
- Prior art keywords
- information
- sample
- business
- score value
- obtains
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
The invention discloses an object determination method and device, a storage medium, and an electronic device. The method comprises: inputting acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, corresponding to the object feature information; and, when the feature score is greater than a preset value, determining the object to be assessed as a target object. The invention solves the technical problem in the related art of low efficiency in mining potential customers.
Description
Technical field
The present invention relates to the field of computers, and in particular to an object determination method and device, a storage medium, and an electronic device.
Background
In recent years the personal financial business of commercial banks has developed steadily and has accumulated a large amount of customer data, which holds great potential value. At the same time, the rapid development of internet finance has had a certain impact on commercial banks: the growth of new customers has slowed and the loss of existing customers has intensified. When developing new customers, traditional commercial banks mostly work in isolation and lack technical systems built on in-depth customer analysis. Finding and mining new personal credit customers is a difficulty for banks, and it bears directly or indirectly on the revenue and performance growth of a commercial bank's credit business. Nevertheless, it remains common for customer marketing to be confined to distributing leaflets, holding offline events, or screening with rules set from experience. Such approaches tend to be costly, inefficient, and narrow in reach, which is detrimental to a bank's business development and long-term competitiveness.
No effective solution to the above problem has yet been proposed.
Summary of the invention
Embodiments of the present invention provide an object determination method and device, a storage medium, and an electronic device, so as to at least solve the technical problem in the related art of low efficiency in mining potential customers.
According to one aspect of the embodiments of the present invention, an object determination method is provided, comprising: inputting acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, corresponding to the object feature information; and, when the feature score is greater than a preset value, determining the object to be assessed as a target object.
Optionally, before inputting the acquired object feature information of the object to be assessed into the target model to obtain the feature score output by the target model and corresponding to the object feature information, the method further comprises: acquiring multiple pieces of sample information and, for the sample object corresponding to each piece of sample information, business information indicating whether that sample object has transacted a target business; and training an original model using the multiple pieces of sample information and that business information to obtain the target model, wherein the multiple pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
Optionally, after acquiring the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, the method further comprises: storing the multiple pieces of sample information in a distributed file system; and performing data processing on the multiple pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
Optionally, performing data processing on the multiple pieces of sample information comprises at least one of: converting the multiple pieces of sample information into a preset format; deleting duplicate information from the multiple pieces of sample information; handling invalid information and/or null information in the multiple pieces of sample information; handling outliers in the multiple pieces of sample information, wherein the outliers include information that does not conform to natural rules; and normalizing each piece of sample information.
Optionally, training the original model using the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, to obtain the target model, comprises: using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extracting feature information from each piece of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1; determining the sample feature information, among the M pieces, of sample objects that have not transacted the target business as first sample feature information, obtaining O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M; determining the sample feature information, among the M pieces, of sample objects that have transacted the target business as second sample feature information, obtaining P pieces of second sample feature information, wherein P is a natural number greater than 1, and P is less than M and less than O; and training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, comprises: dividing the O pieces of first sample feature information equally into Q parts, wherein Q is a natural number greater than 1 and Q is less than or equal to O; combining each of the Q parts with the P pieces of second sample feature information to obtain Q groups of sample feature information; and training the original model Q times, once with each of the Q groups of sample feature information and the business information corresponding to that group, to obtain the target model.
Optionally, after training the original model with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further comprises: determining the predicted feature score output by the original model for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature scores; calculating the average of the Q predicted feature scores to obtain an average predicted feature score for each piece of first sample feature information; determining the average predicted feature score as the predicted feature score of the sample object corresponding to that piece of first sample feature information; and comparing the predicted feature score of the sample object with the feature score corresponding to the actual business information of the sample object, so as to test the original model.
Optionally, training the original model using the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, to obtain the target model, comprises: training the original model with that sample information and business information to obtain a target model that satisfies a preset recall and a preset precision and for which the training of the original model has reached a preset number of iterations.
According to another aspect of the embodiments of the present invention, an object determination device is further provided, comprising: a first determining module, configured to input acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, corresponding to the object feature information; and a second determining module, configured to determine the object to be assessed as a target object when the feature score is greater than a preset value.
According to another aspect of the embodiments of the present invention, a storage medium is further provided, in which a computer program is stored, wherein the computer program is configured to execute the method described above when run.
According to another aspect of the embodiments of the present invention, an electronic device is further provided, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to run the computer program to execute the method described above.
In the embodiments of the present invention, acquired object feature information of an object to be assessed is input into a target model to obtain a feature score, output by the target model, corresponding to the object feature information; when the feature score is greater than a preset value, the object to be assessed is determined as a target object. The target object can thus be determined from the feature score, which solves the technical problem in the related art of low efficiency in mining potential customers.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware block diagram of a mobile terminal for an object determination method according to an embodiment of the present invention;
Fig. 2 is a flowchart of the object determination method according to an embodiment of the present invention;
Fig. 3 is a flowchart of model training according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the object determination device according to an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments may be combined with one another.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present invention are used to distinguish similar objects and not to describe a particular order or sequence.
The method embodiments provided in this application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Fig. 1 is a hardware block diagram of a mobile terminal for an object determination method according to an embodiment of the present invention. As shown in Fig. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in Fig. 1; the processor 102 may include, but is not limited to, a processing unit such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may further include a transmission device 106 for communication and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is merely illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the object determination method in the embodiments of the present invention. By running the computer program stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal 10 through a network. Examples of such a network include, but are not limited to, the internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which communicates with the internet wirelessly.
This embodiment provides an object determination method. Fig. 2 is a flowchart of the object determination method according to an embodiment of the present invention. As shown in Fig. 2, the process includes the following steps:
Step S202: input acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, corresponding to the object feature information;
Step S204: when the feature score is greater than a preset value, determine the object to be assessed as a target object.
Through the present invention, acquired object feature information of an object to be assessed is input into a target model to obtain a feature score, output by the target model, corresponding to the object feature information; when the feature score is greater than a preset value, the object to be assessed is determined as a target object. The target object can thus be determined from the feature score, which solves the technical problem in the related art of low efficiency in mining potential customers.
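Steps S202 and S204 can be sketched in a few lines of Python. This is only an illustration: `determine_targets`, `toy_score`, and the `products_held` feature are hypothetical names that do not appear in the source, and `toy_score` merely stands in for a trained target model.

```python
from typing import Callable, Dict, List

# Hypothetical feature vector: feature name -> numeric value.
Features = Dict[str, float]


def determine_targets(
    candidates: Dict[str, Features],
    score_fn: Callable[[Features], float],
    preset_value: float,
) -> List[str]:
    """Return the IDs of objects whose feature score is greater than preset_value."""
    targets = []
    for object_id, features in candidates.items():
        score = score_fn(features)      # step S202: the model outputs a feature score
        if score > preset_value:        # step S204: strictly greater than the preset value
            targets.append(object_id)
    return targets


def toy_score(features: Features) -> float:
    """Stand-in for the trained target model (illustrative only)."""
    return min(0.99, 0.1 + 0.2 * features.get("products_held", 0.0))


candidates = {
    "c1": {"products_held": 4.0},   # scores about 0.9
    "c2": {"products_held": 1.0},   # scores about 0.3
    "c3": {"products_held": 3.0},   # scores about 0.7
}
selected = determine_targets(candidates, toy_score, preset_value=0.65)
```

Because the comparison is strict, an object whose score equals the preset value is not selected.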
Optionally, the above steps may be executed by a terminal or the like, but this is not limiting.
Optionally, the above may be applied to scenarios in which potential customers need to be mined, including but not limited to mining potential customers of a bank. In this scenario, a potential customer may be a customer who has not transacted a banking service, or a customer who has not taken out further loans.
Optionally, the object feature information includes but is not limited to the information shown in Table 1.
Table 1:
Optionally, the target model may be a neural network model, but is not limited thereto.
In an optional embodiment, before inputting the acquired object feature information of the object to be assessed into the target model to obtain the feature score output by the target model and corresponding to the object feature information, the method further includes:
S1: acquire multiple pieces of sample information and, for the sample object corresponding to each piece of sample information, business information indicating whether that sample object has transacted a target business;
S2: train an original model using the multiple pieces of sample information and that business information to obtain the target model, wherein the multiple pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
Optionally, the sample information may be that of bank customers, extracted from historical data, who have not transacted personal credit business, or that of bank customers who have transacted bank credit business.
In an optional embodiment, after acquiring the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, the method further includes:
S1: store the multiple pieces of sample information in a distributed file system;
S2: perform data processing on the multiple pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
Optionally, performing data processing on the multiple pieces of sample information includes at least one of: converting the multiple pieces of sample information into a preset format; deleting duplicate information from the multiple pieces of sample information; handling invalid information and/or null information in the multiple pieces of sample information; handling outliers in the multiple pieces of sample information, wherein the outliers include information that does not conform to natural rules; and normalizing each piece of sample information.
For example, the customer data stored by the bank card and savings business lines is read; after analysis, data items that cannot be verified or that do not help the target model are removed; and the screened raw data is imported into the Hadoop Distributed File System (HDFS) of a distributed big data cluster. Data formats are then unified (for example, percentages written with "%" are converted to floating-point numbers), duplicate data is deleted, invalid and null values are handled, outliers are handled according to business rules (for example, an age of -1 is clearly abnormal), and code values are normalized.
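The cleaning steps just listed can be sketched with plain Python. The field names (`customer_id`, `age`, `deposit_rate`), the age bound of 120, and the choice to drop rows with invalid or null values are assumptions made for this example; the source only says that such values are "handled".

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Optional[str]]


def to_float(value: Optional[str]) -> Optional[float]:
    """Convert '12.5%' to 0.125 and plain numerals to float; None if empty or invalid."""
    if value is None or value.strip() == "":
        return None
    text = value.strip()
    try:
        if text.endswith("%"):
            return float(text[:-1]) / 100.0
        return float(text)
    except ValueError:
        return None


def preprocess(records: List[Record]) -> List[Dict[str, Any]]:
    seen = set()
    cleaned: List[Dict[str, Any]] = []
    for rec in records:
        key = rec.get("customer_id")
        if key is None or key in seen:       # delete duplicates and rows without an ID
            continue
        seen.add(key)
        age = to_float(rec.get("age"))
        rate = to_float(rec.get("deposit_rate"))
        if age is None or rate is None:      # invalid or null values (dropped here)
            continue
        if not 0 < age < 120:                # business-rule outlier, e.g. age = -1
            continue
        cleaned.append({"customer_id": key, "age": age, "deposit_rate": rate})
    # Min-max normalise age to [0, 1] across the surviving rows.
    ages = [row["age"] for row in cleaned]
    lo, hi = (min(ages), max(ages)) if ages else (0.0, 0.0)
    for row in cleaned:
        row["age_norm"] = (row["age"] - lo) / (hi - lo) if hi > lo else 0.0
    return cleaned


raw = [
    {"customer_id": "A", "age": "30", "deposit_rate": "12.5%"},
    {"customer_id": "A", "age": "30", "deposit_rate": "12.5%"},  # duplicate
    {"customer_id": "B", "age": "-1", "deposit_rate": "3%"},     # outlier age
    {"customer_id": "C", "age": "50", "deposit_rate": ""},       # null value
    {"customer_id": "D", "age": "50", "deposit_rate": "0.02"},
]
rows = preprocess(raw)
```

In practice, imputing missing values instead of dropping the row may be preferable; the choice depends on the business rules, which the source does not specify.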
In an optional embodiment, training the original model using the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, to obtain the target model, includes:
S1: using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extract feature information from each piece of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1;
S2: determine the sample feature information, among the M pieces, of sample objects that have not transacted the target business as first sample feature information, obtaining O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M;
S3: determine the sample feature information, among the M pieces, of sample objects that have transacted the target business as second sample feature information, obtaining P pieces of second sample feature information, wherein P is a natural number greater than 1, and P is less than M and less than O;
S4: train the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, for example, the first sample feature information may be that of bank customers, extracted from historical data, who have not transacted personal credit business. The second sample feature information may be that of agriculture-countryside-farmer customers, extracted from historical data, who have transacted personal credit business and who, three months or more before loan interest settlement, had transacted other business with the bank such as savings or credit cards.
Optionally, the number of a bank's personal credit customers is much smaller than the number of all other customers who have not transacted personal credit business, so the first sample feature information and the second sample feature information need to be balanced.
Optionally, training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, includes:
S1: divide the O pieces of first sample feature information equally into Q parts, wherein Q is a natural number greater than 1 and Q is less than or equal to O;
S2: combine each of the Q parts with the P pieces of second sample feature information to obtain Q groups of sample feature information;
S3: train the original model Q times, once with each of the Q groups of sample feature information and the business information corresponding to that group, to obtain the target model.
Optionally, for example, Q rounds may be performed. In each round, one part of the first sample feature information is taken together with the second sample feature information and input into the original model for training, and the original model is then used to predict the remaining Q-1 parts of sample feature information.
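The partitioning in steps S1 and S2 can be sketched as follows. `build_training_groups`, the random shuffle with a fixed seed, and the 0/1 label encoding are assumptions for illustration; when O is not divisible by Q, the parts here differ in size by at most one.

```python
import random
from typing import List, Sequence, Tuple


def build_training_groups(
    negatives: Sequence[str],
    positives: Sequence[str],
    q: int,
    seed: int = 0,
) -> List[Tuple[List[str], List[int]]]:
    """Split the negatives into q near-equal parts and pair each part with all positives.

    Returns q (samples, labels) groups; label 0 marks a first (negative) sample
    and label 1 marks a second (positive) sample.
    """
    if not 1 < q <= len(negatives):
        raise ValueError("q must satisfy 1 < q <= number of negatives")
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::q] for i in range(q)]   # strided slices: sizes differ by at most 1
    groups = []
    for part in parts:
        samples = part + list(positives)
        labels = [0] * len(part) + [1] * len(positives)
        groups.append((samples, labels))
    return groups


negatives = [f"n{i}" for i in range(10)]
positives = ["p0", "p1"]
groups = build_training_groups(negatives, positives, q=5)
```

With 10 negatives, 2 positives, and Q = 5, every group holds 2 negatives and 2 positives, which is the balancing effect the text describes.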
Optionally, the core Python code of the original model may be as follows:
import xgboost as xgb
# dval, eval, feature and label are defined elsewhere (not shown in the source)
rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
model_orig = rf1.fit(feature, label.label)
In an optional embodiment, after training the original model with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further includes:
S1: determine the predicted feature score output by the original model for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature scores;
S2: calculate the average of the Q predicted feature scores to obtain an average predicted feature score for each piece of first sample feature information;
S3: determine the average predicted feature score as the predicted feature score of the sample object corresponding to that piece of first sample feature information;
S4: compare the predicted feature score of the sample object with the feature score corresponding to the actual business information of the sample object, so as to test the original model.
Optionally, a larger feature score indicates a higher probability that the object is a target object; the feature score may be, for example, 0.65.
In an optional embodiment, training the original model using the multiple pieces of sample information and the business information indicating whether each corresponding sample object has transacted the target business, to obtain the target model, includes:
S1: train the original model with that sample information and business information to obtain a target model that satisfies a preset recall and a preset precision and for which the training of the original model has reached a preset number of iterations.
Optionally, precision reflects the proportion of samples judged positive that are actually positive, and recall reflects the proportion of all positive examples that are correctly judged positive. Precision and recall are conflicting measures: in general, when precision is high, recall tends to be low, and when recall is high, precision tends to be low.
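A small worked example makes the trade-off concrete. The function name `precision_recall` and the sample data are hypothetical; the definitions used are the standard ones, precision = TP / (TP + FP) and recall = TP / (TP + FN), with a score above the threshold counted as a positive prediction.

```python
from typing import Sequence, Tuple


def precision_recall(
    actual: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Compute (precision, recall) when score > threshold is predicted positive."""
    predicted = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


actual = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
strict = precision_recall(actual, scores, threshold=0.8)    # few, confident positives
loose = precision_recall(actual, scores, threshold=0.35)    # many positives
```

Here the strict threshold yields precision 1.0 with recall 1/3, while the loose threshold yields precision 0.75 with recall 1.0, so a preset pair of precision and recall targets constrains where the threshold can sit.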
The present invention is described in detail below with reference to a specific embodiment:
This embodiment takes the potential customers of a bank as an example. As shown in Fig. 3, the embodiment as a whole can be divided into the following steps:
S301: import the raw customer data into a distributed big data cluster.
S302: preprocess the data.
S303: extract features from the customers' basic information and transaction behavior data.
S304: build the data-mining machine learning model, train it, and output the list of potential customers.
S305: collect feedback data, and continue to iterate on and optimize the model according to the feedback data.
The data may be preprocessed as follows:
The customer data stored by the bank card and savings business lines is read; after analysis, data items that cannot be verified or that do not help the target model are removed; and the screened raw data is imported into the Hadoop Distributed File System (HDFS) of a distributed big data cluster.
Data formats are unified (for example, percentages written with "%" are converted to floating-point numbers). Duplicate data is deleted. Invalid and null values are handled. Outliers are handled according to business rules (for example, an age of -1 is clearly abnormal). Code values are normalized.
Features are extracted from the customers' basic information and transaction behavior data as follows:
After preprocessing, the raw data is keyed by the bank customer's identity document number as the unique ID. The analysis focuses on the business line in which the potential target customers sit and on data from the associated business lines, and examines in depth the data characteristics of the savings and credit card systems. By joining the data tables, data items such as those in Table 1 are extracted as customer feature values and used as the input of the target model in the subsequent steps.
The target model is trained as follows:
Bank customers who have not transacted personal credit business are extracted from historical data, preprocessed, and reduced to the required features to serve as negative samples for model training. Agriculture-countryside-farmer customers who have transacted personal credit business and who, three months or more before loan interest settlement, had transacted other business with the bank such as savings or credit cards are extracted from historical data, preprocessed, and reduced to the required features to serve as positive samples for model training.
In general, the number of a bank's personal credit customers is much smaller than the number of all other customers who have not transacted personal credit business, so the positive and negative sample volumes differ greatly. The negative samples are therefore divided at random into n parts so that the ratio of positive to negative samples is more reasonable.
The target model is XGBoost, a boosted-tree algorithm whose core principle is to combine many weak classifiers into one strong classifier. The model has the following advantages: objective optimization uses the second derivative of the loss function with respect to the function being fitted; it supports parallelization and trains quickly; it supports sample weights; it applies several strategies to prevent over-fitting; and it adds handling of sparse data.
Next, n rounds are run. In each round, one part of the negative samples is taken together with the positive samples and input to the XGBoost algorithm for training to form a model, and that model is used to predict the remaining n-1 parts of the negative samples, giving each customer in those negative samples a probability score for handling personal credit business. The score lies in the interval [0, 1).
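The core Python code of the algorithm model is as follows:
    import xgboost as xgb
    # dval and eval hold the maximum tree depth and the number of trees, set elsewhere
    rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
    # feature holds the customer feature values; label.label holds the sample labels
    model_orig = rf1.fit(feature, label.label)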
After the n rounds described above are complete, each negative sample has n-1 probability scores. The mean of these n-1 scores is the final probability, which is used to evaluate how likely the customer is to handle personal credit business; the larger the value, the higher the probability.
Finally, the list of customers whose final probability exceeds some threshold (for example, 0.65) is taken as the list of potential customers for personal credit business.
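The round structure, the averaging of the n-1 scores, and the final threshold can be sketched as follows. The trainer is passed in as a hypothetical callable `fit(samples, labels)` that returns a scoring function; in the setting described here it could wrap an `XGBClassifier` and its `predict_proba` method, but that wiring is an assumption and is left out so the sketch stays independent of any library.

```python
def score_negatives(neg_parts, positives, fit):
    """Run one round per negative part and average each negative sample's scores.

    fit(samples, labels) is a hypothetical trainer returning a function that maps
    a sample to a score in [0, 1). Samples must be hashable and unique.
    """
    totals, counts = {}, {}
    for i, part in enumerate(neg_parts):
        samples = list(part) + list(positives)
        labels = [0] * len(part) + [1] * len(positives)
        predict = fit(samples, labels)        # model trained on part i plus all positives
        for j, other in enumerate(neg_parts):
            if j == i:
                continue                      # skip the part used for training
            for sample in other:
                totals[sample] = totals.get(sample, 0.0) + predict(sample)
                counts[sample] = counts.get(sample, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}


def potential_customers(final_scores, threshold=0.65):
    """Return the samples whose averaged score exceeds the threshold."""
    return [s for s, p in final_scores.items() if p > threshold]
```

With n parts, every negative sample is scored by the n-1 models that did not see it during training, which is why each one ends up with n-1 scores to average.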
Feedback data is collected in the following manner:
After the complete list of potential personal credit customers is obtained, it is divided by the customer's region or bank branch and distributed to the credit account managers of the corresponding regional institutions. The credit account managers then carry out targeted marketing to the customers on the list, mark whether each marketed potential customer went on to handle personal credit business, and these results are collected as feedback.
The algorithm model is continuously iterated and optimized according to the feedback data in the following manner:
Once the feedback data from credit staff marketing to potential customers has been collected, it can be used as a valuable supplement of positive and negative samples to further iterate and optimize the model.
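One way to fold the feedback into the training sets is sketched below. The function name `merge_feedback` and the relabeling rule (a customer who handled the business after marketing moves from the negative set to the positive set) are assumptions; the source only says that the feedback supplements the positive and negative samples.

```python
def merge_feedback(positives, negatives, feedback):
    """Return new positive and negative sample lists extended with marketing outcomes.

    feedback is an iterable of (sample, handled_business) pairs marked by account managers.
    """
    new_pos, new_neg = list(positives), list(negatives)
    for sample, handled in feedback:
        if handled:
            if sample in new_neg:
                new_neg.remove(sample)        # no longer a negative example
            if sample not in new_pos:
                new_pos.append(sample)
        elif sample not in new_neg:
            new_neg.append(sample)            # marketed but did not handle the business
    return new_pos, new_neg
```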
The model involved here mainly addresses a binary classification problem. Combining the predicted class with the true class of a sample gives four cases: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). These four cases are usually described with a confusion matrix, as shown in Table 2.
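Table 2:
Actual class | Predicted positive | Predicted negative |
---|---|---|
Positive | TP | FN |
Negative | FP | TN |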
Precision and recall are defined as follows:
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$
Precision reflects the proportion of samples judged positive by the classifier that are actually positive, and recall reflects the proportion of all positive samples that are correctly judged positive. Precision and recall are conflicting measures: in general, when precision is high, recall tends to be low, and when recall is high, precision tends to be low. The F1 measure is commonly used to consider precision and recall together. F1 is the harmonic mean of precision and recall:
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
It reflects the trade-off between precision and recall: F1 is large only when both precision and recall are relatively large.
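The three measures can be computed directly from predicted and true labels, using the standard definitions Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1 as their harmonic mean. In this sketch, returning 0.0 when a denominator is zero is a convention chosen for the example, not something stated in the source.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0   # assumed convention for empty denominator
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, a classifier that flags every customer as positive gets a recall of 1 but a low precision, and the harmonic mean keeps F1 close to the smaller of the two values.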
Feeding the marketing results back helps the model's classification and prediction capabilities to be iteratively optimized, which raises the value of the potential customer list that the algorithm model produces next time and forms a closed loop.
In conclusion compared to modes such as marketing activities under each series advertisements, line, potential customers acquisition side provided by the invention
Method covering customer range is wide, precision is higher and cost is relatively low.Compared under conventional situation, set according to banking expertise
Fixed strong rule is come the method for screening potential customers, and the present invention can strong with generalization ability and Weak Classifier be promoted by choosing
The machine learning algorithm of ability, by choosing more features (can be strong feature, be also possible to weak feature), what can be found out is latent
It is wider in customer range.And by iterative feedback, it can be achieved that the closed loop of model self-optimization.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is stated as a series of action combinations. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
According to another aspect of the embodiments of the present invention, an object determination apparatus for implementing the above object determination method is further provided. As shown in Fig. 4, the apparatus includes a first determining module 42 and a second determining module 44. The apparatus is described in detail below:
The first determining module 42 is configured to input the acquired object feature information of an object to be assessed into the target model and obtain the feature score value that is output by the target model and corresponds to the object feature information;
The second determining module 44 is configured to determine the object to be assessed as a target object when the feature score value is greater than a preset value.
According to the present invention, the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score value that is output by the target model and corresponds to the object feature information; when the feature score value is greater than the preset value, the object to be assessed is determined as the target object. The purpose of determining the target object by means of the feature score value is thereby achieved, which solves the technical problem in the related art of low efficiency in mining potential customers.
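The two-module structure can be pictured with a small sketch. The class name `ObjectDeterminationDevice`, the method names, and the idea of passing the trained model as a plain callable are illustrative assumptions; only the numerals 42 and 44 and the "greater than a preset value" rule come from the description.

```python
class ObjectDeterminationDevice:
    """Illustrative counterpart of the apparatus with modules 42 and 44."""

    def __init__(self, target_model, preset_value):
        self.target_model = target_model      # callable: feature info -> score (assumed interface)
        self.preset_value = preset_value

    def first_determining_module(self, feature_info):
        """Module 42: feed the feature information to the model and return its score."""
        return float(self.target_model(feature_info))

    def second_determining_module(self, score):
        """Module 44: the object is a target object only if the score exceeds the preset value."""
        return score > self.preset_value

    def is_target(self, feature_info):
        return self.second_determining_module(self.first_determining_module(feature_info))
```

Note that the comparison is strict: an object whose score equals the preset value is not determined as a target object.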
Optionally, the above can be applied to scenarios in which potential customers need to be mined, including but not limited to mining a bank's potential customers. In this scenario, a potential customer may be a customer who has not handled banking business or a customer who has not taken out a loan.
Optionally, the object feature information includes but is not limited to the information shown in Table 1.
Table 1:
Optionally, the target model may be a neural network model, but is not limited thereto.
In an optional embodiment, before the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score value that is output by the target model and corresponds to the object feature information, the method further includes:
S1: obtaining multiple pieces of sample information, and business information indicating whether each sample object corresponding to each piece of sample information has handled a target business;
S2: training an original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, wherein the multiple pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
Optionally, the sample information may be that of bank customers who have not handled personal credit business, extracted from historical data, or that of bank customers who have handled bank credit business.
In an optional embodiment, after obtaining the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, the method further includes:
S1: storing the multiple pieces of sample information in a distributed file system;
S2: performing data processing on the multiple pieces of sample information to obtain N pieces of target sample information, where N is a natural number greater than 1.
Optionally, performing data processing on the multiple pieces of sample information includes at least one of the following: converting the multiple pieces of sample information to a preset format; deleting duplicate information from the multiple pieces of sample information; handling invalid information and/or null information in the multiple pieces of sample information; handling outliers in the multiple pieces of sample information, where an outlier includes information that does not conform to natural laws; and normalizing each piece of sample information in the multiple pieces of sample information.
For example, the customer data stored by the bank card and savings business lines is read. After analysis, data items that cannot be verified or that do not help the target model are rejected, and the screened raw data is imported into the Hadoop Distributed File System (HDFS) of a distributed big data cluster. Data formats are unified (for example, percentages carrying a "%" sign are uniformly converted to floating-point numbers), duplicate records are deleted, invalid values and null values are handled, outliers are handled according to business rules (for example, an age of -1 is obviously abnormal), and code values are normalized.
In an optional embodiment, training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, includes:
S1: using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information to obtain M pieces of sample feature information, where M is a natural number greater than 1;
S2: determining the sample feature information among the M pieces that corresponds to not having handled the target business as first sample feature information, to obtain O pieces of first sample feature information, where O is a natural number greater than 1 and O is less than M;
S3: determining the sample feature information among the M pieces that corresponds to having handled the target business as second sample feature information, to obtain P pieces of second sample feature information, where P is a natural number greater than 1, P is less than M, and P is less than O;
S4: training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, for example, the first sample feature information may be that of bank customers who have not handled personal credit business, extracted from historical data. The second sample feature information may be that of customers in the agriculture, rural areas, and farmers segment who have handled personal credit business and who had already handled other business with the bank, such as savings or credit cards, at least 3 months before the loan's interest start date, also extracted from historical data.
Optionally, the number of bank personal credit customers is far smaller than the number of all other customers who have not handled personal credit business, so the first sample feature information and the second sample feature information need to be balanced.
Optionally, training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, includes:
S1: dividing the O pieces of first sample feature information into Q parts, where Q is a natural number greater than 1 and Q is less than or equal to O;
S2: combining each of the Q parts of first sample feature information with the P pieces of second sample feature information, to obtain Q groups of sample feature information;
S3: training the original model Q times using each of the Q groups of sample feature information and the business information corresponding to that group, to obtain the target model.
Optionally, for example, Q rounds may be run. In each round, one part of the first sample feature information is taken together with the second sample feature information and input into the original model for training, and the original model is then used to predict the remaining Q-1 parts of sample feature information.
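Optionally, the core Python code of the original model may be as follows:
    import xgboost as xgb
    # dval and eval hold the maximum tree depth and the number of trees, set elsewhere
    rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
    # feature holds the sample feature values; label.label holds the sample labels
    model_orig = rf1.fit(feature, label.label)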
In an optional embodiment, after the original model is trained with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further includes:
S1: determining the predicted feature score value that the original model outputs for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature score values;
S2: calculating the average of the Q predicted feature score values to obtain the average predicted feature score value of each piece of first sample feature information;
S3: determining the average predicted feature score value as the predicted feature score value of the sample object corresponding to that piece of first sample feature information;
S4: comparing the predicted feature score value of the sample object with the feature score value corresponding to the actual business information of the sample object, in order to check the original model.
Optionally, a larger feature score value indicates a higher probability of being a target object; the feature score value may be, for example, 0.65.
In an optional embodiment, training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, includes:
S1: training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain a target model that satisfies a preset recall and a preset precision and for which the training of the original model satisfies a preset number of iterations.
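One possible reading of this stopping condition is a loop that keeps training until both metric targets are met, bounded by the preset number of iterations. The sketch below follows that reading; `train_round` and `evaluate` are hypothetical callables supplied by the caller, and the source does not specify how the three preset quantities interact.

```python
def train_until_targets(train_round, evaluate, min_precision, min_recall, max_iterations):
    """Repeat training rounds until precision and recall targets are met or iterations run out.

    train_round(model) returns an updated model (model is None on the first call);
    evaluate(model) returns a (precision, recall) pair. Both are hypothetical hooks.
    """
    model = None
    for iteration in range(1, max_iterations + 1):
        model = train_round(model)
        precision, recall = evaluate(model)
        if precision >= min_precision and recall >= min_recall:
            return model, iteration, True     # targets met within the iteration budget
    return model, max_iterations, False       # budget exhausted without meeting targets
```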
Optionally, precision reflects the proportion of samples judged positive that are actually positive, and recall reflects the proportion of all positive samples that are correctly judged positive. Precision and recall are conflicting measures. In general, when precision is high, recall tends to be low, and when recall is high, precision tends to be low.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (11)
1. An object determination method, characterized by comprising:
inputting acquired object feature information of an object to be assessed into a target model, and obtaining a feature score value that is output by the target model and corresponds to the object feature information;
determining the object to be assessed as a target object when the feature score value is greater than a preset value.
2. The method according to claim 1, characterized in that, before inputting the acquired object feature information of the object to be assessed into the target model and obtaining the feature score value that is output by the target model and corresponds to the object feature information, the method further comprises:
obtaining multiple pieces of sample information, and business information indicating whether each sample object corresponding to each piece of sample information in the multiple pieces of sample information has handled a target business;
training an original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, wherein the multiple pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
3. The method according to claim 2, characterized in that, after obtaining the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information in the multiple pieces of sample information has handled the target business, the method further comprises:
storing the multiple pieces of sample information in a distributed file system;
performing data processing on the multiple pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
4. The method according to claim 3, characterized in that performing data processing on the multiple pieces of sample information comprises at least one of the following:
converting the multiple pieces of sample information to a preset format;
deleting duplicate information from the multiple pieces of sample information;
handling invalid information and/or null information in the multiple pieces of sample information;
handling outliers in the multiple pieces of sample information, wherein the outliers include information that does not conform to natural laws;
normalizing each piece of sample information in the multiple pieces of sample information.
5. The method according to claim 3, characterized in that training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information in the multiple pieces of sample information has handled the target business, to obtain the target model, comprises:
using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1;
determining the sample feature information among the M pieces that corresponds to not having handled the target business as first sample feature information, to obtain O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M;
determining the sample feature information among the M pieces that corresponds to having handled the target business as second sample feature information, to obtain P pieces of second sample feature information, wherein P is a natural number greater than 1, P is less than M, and P is less than O;
training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
6. The method according to claim 5, characterized in that training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, comprises:
dividing the O pieces of first sample feature information into Q parts of first sample feature information, wherein Q is a natural number greater than 1 and Q is less than or equal to O;
combining each of the Q parts of first sample feature information with the P pieces of second sample feature information, to obtain Q groups of sample feature information;
training the original model Q times using each of the Q groups of sample feature information and the business information corresponding to that group, to obtain the target model.
7. The method according to claim 6, characterized in that, after the original model is trained with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further comprises:
determining the predicted feature score value that the original model outputs for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature score values;
calculating the average of the Q predicted feature score values to obtain the average predicted feature score value of each piece of first sample feature information;
determining the average predicted feature score value as the predicted feature score value of the sample object corresponding to that piece of first sample feature information;
comparing the predicted feature score value of the sample object with the feature score value corresponding to the actual business information of the sample object, in order to check the original model.
8. The method according to claim 2, characterized in that training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information in the multiple pieces of sample information has handled the target business, to obtain the target model, comprises:
training the original model using the multiple pieces of sample information and the business information indicating whether each sample object corresponding to each piece of sample information has handled the target business, to obtain a target model that satisfies a preset recall and a preset precision and for which the training of the original model satisfies a preset number of iterations.
9. An object determination apparatus, characterized by comprising:
a first determining module, configured to input acquired object feature information of an object to be assessed into a target model and obtain a feature score value that is output by the target model and corresponds to the object feature information;
a second determining module, configured to determine the object to be assessed as a target object when the feature score value is greater than a preset value.
10. A storage medium storing a computer program, characterized in that the computer program is configured to execute, when run, the method according to any one of claims 1 to 8.
11. An electronic device comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533200.6A CN110288465A (en) | 2019-06-19 | 2019-06-19 | Object determines method and device, storage medium, electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533200.6A CN110288465A (en) | 2019-06-19 | 2019-06-19 | Object determines method and device, storage medium, electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110288465A true CN110288465A (en) | 2019-09-27 |
Family
ID=68003889
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910533200.6A Pending CN110288465A (en) | 2019-06-19 | 2019-06-19 | Object determines method and device, storage medium, electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110288465A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110659985A (en) * | 2019-09-30 | 2020-01-07 | 上海淇玥信息技术有限公司 | Method and device for fishing back false rejection potential user and electronic equipment |
CN111796581A (en) * | 2020-07-06 | 2020-10-20 | 中铁二十局集团有限公司 | Shield data acquisition method and device and computer storage medium |
CN113409096A (en) * | 2021-08-19 | 2021-09-17 | 腾讯科技(深圳)有限公司 | Target object identification method and device, computer equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104702465A (en) * | 2015-02-09 | 2015-06-10 | 桂林电子科技大学 | Parallel network flow classification method |
US20170063902A1 (en) * | 2015-08-31 | 2017-03-02 | Splunk Inc. | Interface Having Selectable, Interactive Views For Evaluating Potential Network Compromise |
CN107230108A (en) * | 2017-06-13 | 2017-10-03 | 北京百分点信息科技有限公司 | The processing method and processing device of business datum |
Non-Patent Citations (1)
Title |
---|
金秋月: "基于偏袒性集成学习的客户流失建模方法研究", 《中国优秀硕士学位论文全文数据库(经济与管理科学辑)》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110659985A (en) * | 2019-09-30 | 2020-01-07 | 上海淇玥信息技术有限公司 | Method and device for fishing back false rejection potential user and electronic equipment |
CN111796581A (en) * | 2020-07-06 | 2020-10-20 | 中铁二十局集团有限公司 | Shield data acquisition method and device and computer storage medium |
CN111796581B (en) * | 2020-07-06 | 2022-03-18 | 中铁二十局集团有限公司 | Shield data acquisition method and device and computer storage medium |
CN113409096A (en) * | 2021-08-19 | 2021-09-17 | 腾讯科技(深圳)有限公司 | Target object identification method and device, computer equipment and storage medium |
CN113409096B (en) * | 2021-08-19 | 2021-11-16 | 腾讯科技(深圳)有限公司 | Target object identification method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190927 |