CN110288465A - Object determination method and device, storage medium, and electronic device - Google Patents

Object determination method and device, storage medium, and electronic device

Info

Publication number
CN110288465A
Authority
CN
China
Prior art keywords
information
sample
business
score value
obtains
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910533200.6A
Other languages
Chinese (zh)
Inventor
王超
张晓波
赵青柏
王灿辉
李亚南
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN201910533200.6A priority Critical patent/CN110288465A/en
Publication of CN110288465A publication Critical patent/CN110288465A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Abstract

The invention discloses an object determination method and device, a storage medium, and an electronic device. The method comprises: inputting acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, that corresponds to the object feature information; and, when the feature score is greater than a preset value, determining the object to be assessed to be a target object. The invention addresses the technical problem in the related art that potential customers are mined with low efficiency.

Description

Object determination method and device, storage medium, and electronic device
Technical field
The present invention relates to the field of computers, and in particular to an object determination method and device, a storage medium, and an electronic device.
Background
In recent years the personal financial business of commercial banks has grown steadily and has accumulated a large amount of customer data, which holds considerable potential value. At the same time, the rapid development of internet finance has put pressure on commercial banks: growth in new customers has slowed and the loss of existing customers has accelerated. Traditional commercial banks mostly develop new customers in an uncoordinated way, without a technical system built on in-depth customer analysis to support them. Finding and mining new personal credit customers is therefore difficult for banks, yet it bears directly or indirectly on the revenue and performance growth of the commercial bank credit business. Customer marketing is still commonly confined to handing out advertising leaflets, holding offline events, or screening customers with experience-based rules. These approaches tend to be costly, inefficient, and narrow in reach, which harms the bank's business development and long-term competitiveness.
No effective solution to the above problem has yet been proposed.
Summary of the invention
Embodiments of the present invention provide an object determination method and device, a storage medium, and an electronic device, to solve at least the technical problem in the related art that potential customers are mined with low efficiency.
According to one aspect of the embodiments of the present invention, an object determination method is provided, comprising: inputting acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, that corresponds to the object feature information; and, when the feature score is greater than a preset value, determining the object to be assessed to be a target object.
Optionally, before the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score corresponding to the object feature information, the method further comprises: acquiring a plurality of pieces of sample information and, for the sample object corresponding to each piece of sample information, business information indicating whether that sample object has handled a target service; and training an original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model, wherein the plurality of pieces of sample information are the input of the original model, and the actual business information that the trained target model outputs for each piece of sample information satisfies an objective function.
Optionally, after the plurality of pieces of sample information and the corresponding business information are acquired, the method further comprises: storing the plurality of pieces of sample information in a distributed file system; and performing data processing on the plurality of pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
Optionally, performing data processing on the plurality of pieces of sample information comprises at least one of: converting the plurality of pieces of sample information into a preset format; deleting duplicate information from the plurality of pieces of sample information; handling invalid information and/or null information in the plurality of pieces of sample information; handling outliers in the plurality of pieces of sample information, wherein the outliers include information that does not conform to natural laws; and normalizing each piece of sample information.
Optionally, training the original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model comprises: using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1; determining the sample feature information, among the M pieces, of sample objects that have not handled the target service to be first sample feature information, obtaining O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M; determining the sample feature information, among the M pieces, of sample objects that have handled the target service to be second sample feature information, obtaining P pieces of second sample feature information, wherein P is a natural number greater than 1, P is less than M, and P is less than O; and training the original model with the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, training the original model with the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information, and the second business information to obtain the target model comprises: dividing the O pieces of first sample feature information equally into Q parts, wherein Q is a natural number greater than 1 and Q is less than or equal to O; combining each of the Q parts with the P pieces of second sample feature information to obtain Q groups of sample feature information; and training the original model Q times, once with each of the Q groups and the business information corresponding to that group, to obtain the target model.
Optionally, after the original model has been trained with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further comprises: determining the predicted feature score that the original model outputs for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature scores; calculating the average of the Q predicted feature scores to obtain an average predicted feature score for each piece of first sample feature information; determining the average predicted feature score to be the predicted feature score of the sample object corresponding to that piece of first sample feature information; and comparing the predicted feature score of the sample object with the feature score corresponding to the actual business information of the sample object, so as to test the original model.
Optionally, training the original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model comprises: training the original model with the plurality of pieces of sample information and, for the sample object corresponding to each piece of sample information, the business information indicating whether that sample object has handled the target service, to obtain a target model that satisfies a preset recall and a preset precision and whose training has reached a preset number of iterations.
According to another aspect of the embodiments of the present invention, an object determination device is further provided, comprising: a first determining module, configured to input acquired object feature information of an object to be assessed into a target model to obtain a feature score, output by the target model, that corresponds to the object feature information; and a second determining module, configured to determine the object to be assessed to be a target object when the feature score is greater than a preset value.
According to another aspect of the embodiments of the present invention, a storage medium is further provided, in which a computer program is stored, wherein the computer program is configured to execute the method described above when run.
According to another aspect of the embodiments of the present invention, an electronic device is further provided, comprising a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program to execute the method described above.
In the embodiments of the present invention, acquired object feature information of an object to be assessed is input into a target model to obtain a feature score, output by the target model, that corresponds to the object feature information; when the feature score is greater than a preset value, the object to be assessed is determined to be a target object. The target object is thus determined from the feature score, which solves the technical problem in the related art that potential customers are mined with low efficiency.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware block diagram of a mobile terminal for an object determination method according to an embodiment of the present invention;
Fig. 2 is a flowchart of an object determination method according to an embodiment of the present invention;
Fig. 3 is a flowchart of model training according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an object determination device according to an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence.
The method embodiments provided in this application can be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Fig. 1 is a hardware block diagram of a mobile terminal for an object determination method according to an embodiment of the present invention. As shown in Fig. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in Fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may further include a transmission device 106 for communication and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 can store computer programs, for example the software programs and modules of application software, such as the computer program corresponding to the object determination method in the embodiments of the present invention. The processor 102 runs the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal 10 through a network. Examples of such networks include, but are not limited to, the internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. A specific example of the network is a wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a radio frequency (RF) module, which communicates with the internet wirelessly.
This embodiment provides an object determination method. Fig. 2 is a flowchart of the object determination method according to an embodiment of the present invention. As shown in Fig. 2, the process includes the following steps:
Step S202: input the acquired object feature information of the object to be assessed into the target model to obtain the feature score, output by the target model, that corresponds to the object feature information;
Step S204: when the feature score is greater than a preset value, determine the object to be assessed to be a target object.
With the present invention, the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score corresponding to the object feature information, and the object to be assessed is determined to be a target object when the feature score is greater than the preset value. The target object is thus determined from the feature score, which solves the technical problem in the related art that potential customers are mined with low efficiency.
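Steps S202 and S204 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the stand-in scoring function `toy_model`, and the candidate data are hypothetical, and the threshold 0.65 is only the example value that the description mentions later.

```python
from typing import Callable, Dict, List

# Hypothetical threshold; the description cites 0.65 only as an example value.
PRESET_VALUE = 0.65


def determine_targets(
    features_by_object: Dict[str, List[float]],
    score_fn: Callable[[List[float]], float],
    preset_value: float = PRESET_VALUE,
) -> List[str]:
    """Return the IDs of objects whose feature score exceeds the preset value."""
    targets = []
    for object_id, features in features_by_object.items():
        score = score_fn(features)      # step S202: the model outputs a score
        if score > preset_value:        # step S204: strictly greater than
            targets.append(object_id)
    return targets


def toy_model(features: List[float]) -> float:
    """Stand-in for the trained target model (an assumption for illustration)."""
    return sum(features) / len(features)


candidates = {"A": [0.9, 0.8], "B": [0.2, 0.4], "C": [0.65, 0.65]}
print(determine_targets(candidates, toy_model))
```

Object C scores exactly 0.65 and is not selected, because the method requires the score to be greater than the preset value.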
Optionally, the above steps may be executed by a terminal or the like, but are not limited thereto.
Optionally, the above method can be applied to scenarios in which potential customers need to be mined, including but not limited to mining the potential customers of a bank. In that scenario, a potential customer may be a customer who has not handled a banking service, or a customer who has not taken out a loan.
Optionally, the object feature information includes, but is not limited to, the information shown in Table 1.
Table 1:
Optionally, the target model may be a neural network model, but is not limited thereto.
In an optional embodiment, before the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score corresponding to the object feature information, the method further comprises:
S1, acquiring a plurality of pieces of sample information and, for the sample object corresponding to each piece of sample information, business information indicating whether that sample object has handled a target service;
S2, training an original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model, wherein the plurality of pieces of sample information are the input of the original model, and the actual business information that the trained target model outputs for each piece of sample information satisfies an objective function.
Optionally, the sample information may concern bank customers, extracted from historical data, who have not handled personal credit business, or bank customers who have handled bank credit business.
In an optional embodiment, after the plurality of pieces of sample information and the corresponding business information are acquired, the method further comprises:
S1, storing the plurality of pieces of sample information in a distributed file system;
S2, performing data processing on the plurality of pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
Optionally, performing data processing on the plurality of pieces of sample information comprises at least one of: converting the plurality of pieces of sample information into a preset format; deleting duplicate information from the plurality of pieces of sample information; handling invalid information and/or null information in the plurality of pieces of sample information; handling outliers in the plurality of pieces of sample information, wherein the outliers include information that does not conform to natural laws; and normalizing each piece of sample information.
For example, the customer data stored by the bank card and savings business lines is read, and data items that cannot be verified or that do not help the target model are rejected after analysis. The screened raw data is then imported into the Hadoop Distributed File System (HDFS) of a distributed big data cluster. The data formats are unified (for example, percentages written with "%" are uniformly converted to floating-point numbers), duplicate data is deleted, invalid values and null values are handled, outliers are handled according to business rules (for example, an age of -1 is clearly abnormal), and code values are normalized.
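The preprocessing steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the field names `age` and `rate`, the 0 to 120 age range, the choice to drop records with missing values, and min-max scaling as the normalization are all hypothetical, since the source does not specify them.

```python
from typing import Dict, List, Optional


def parse_percent(value: str) -> float:
    """Convert a string such as '12.5%' to the float 0.125."""
    return float(value.strip().rstrip("%")) / 100.0


def preprocess(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, float]]:
    cleaned: List[Dict[str, float]] = []
    seen = set()
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:                       # delete duplicate records
            continue
        seen.add(key)
        if rec.get("age") in (None, "") or rec.get("rate") in (None, ""):
            continue                          # handle null values (here: drop)
        age = float(rec["age"])
        if not 0 <= age <= 120:               # business-rule outlier, e.g. age -1
            continue
        cleaned.append({"age": age, "rate": parse_percent(rec["rate"])})
    if not cleaned:
        return cleaned
    # Min-max normalisation of age (an assumed normalisation scheme).
    ages = [r["age"] for r in cleaned]
    lo, hi = min(ages), max(ages)
    for r in cleaned:
        r["age"] = (r["age"] - lo) / (hi - lo) if hi > lo else 0.0
    return cleaned


raw = [
    {"age": "30", "rate": "50%"},
    {"age": "30", "rate": "50%"},   # duplicate
    {"age": "-1", "rate": "10%"},   # outlier
    {"age": "50", "rate": None},    # null value
    {"age": "40", "rate": "25%"},
]
print(preprocess(raw))
```

In a real deployment this logic would more likely run on the distributed cluster than on in-memory lists; the sketch only shows the order and effect of the cleaning rules.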
In an optional embodiment, training the original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model comprises:
S1, using the identity information of the sample object corresponding to each of the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1;
S2, determining the sample feature information, among the M pieces, of sample objects that have not handled the target service to be first sample feature information, obtaining O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M;
S3, determining the sample feature information, among the M pieces, of sample objects that have handled the target service to be second sample feature information, obtaining P pieces of second sample feature information, wherein P is a natural number greater than 1, P is less than M, and P is less than O;
S4, training the original model with the O pieces of first sample feature information, the P pieces of second sample feature information, first business information corresponding to the O pieces of first sample feature information, and second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, for example, the first sample feature information may come from bank customers, extracted from historical data, who have not handled personal credit business. The second sample feature information may come from agriculture-related ("agriculture, rural areas and farmers") customers, extracted from historical data, who have handled personal credit business and who, more than 3 months before the interest start date of the loan, had handled other services of the bank such as savings or credit cards.
Optionally, the number of a bank's personal credit customers is far smaller than the number of all other customers who have not handled personal credit business, so the first sample feature information and the second sample feature information need to be balanced.
Optionally, training the original model with the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information, and the second business information to obtain the target model comprises:
S1, dividing the O pieces of first sample feature information equally into Q parts, wherein Q is a natural number greater than 1 and Q is less than or equal to O;
S2, combining each of the Q parts with the P pieces of second sample feature information to obtain Q groups of sample feature information;
S3, training the original model Q times, once with each of the Q groups and the business information corresponding to that group, to obtain the target model.
Optionally, for example, Q rounds can be carried out. In each round, one part of the first sample feature information is taken together with the second sample feature information and input into the original model for training, and the trained model is then used to predict the remaining Q-1 parts of the first sample feature information.
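Steps S1 and S2 above (splitting the O first samples into Q parts and pairing each part with all P second samples) can be sketched as follows. The function name `build_groups`, the fixed random seed, and the sample identifiers are hypothetical; the source does not say how the near-equal split is performed, so a shuffled stride split is assumed here.

```python
import random
from typing import List, Sequence, Tuple


def build_groups(
    negatives: Sequence[str],
    positives: Sequence[str],
    q: int,
    seed: int = 0,
) -> List[Tuple[List[str], List[str]]]:
    """Split the O first (negative) samples into Q parts and pair each
    part with all P second (positive) samples."""
    if not 1 < q <= len(negatives):
        raise ValueError("Q must satisfy 1 < Q <= O")
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)          # random assignment to parts
    parts = [shuffled[i::q] for i in range(q)]     # Q parts of near-equal size
    return [(part, list(positives)) for part in parts]


negatives = [f"n{i}" for i in range(10)]   # O = 10
positives = ["p0", "p1"]                   # P = 2
groups = build_groups(negatives, positives, q=5)
for part, pos in groups:
    print(part, pos)
```

Each group then holds 2 negatives and 2 positives, which is far closer to balanced than the original 10 to 2 ratio.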
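Optionally, the core Python code of the original model can be as follows:
import xgboost as xgb
# dval and eval hold the tree depth and the number of trees; their values are not given in the source.
rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
# feature is the table of sample features; label.label is presumably the column of 0/1 business labels.
model_orig = rf1.fit(feature, label.label)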
In an optional embodiment, after the original model has been trained with each of the Q groups of sample feature information and the business information corresponding to that group to obtain the target model, the method further comprises:
S1, determining the predicted feature score that the original model outputs for each piece of first sample feature information in each training round, so that each piece of first sample feature information has Q predicted feature scores;
S2, calculating the average of the Q predicted feature scores to obtain an average predicted feature score for each piece of first sample feature information;
S3, determining the average predicted feature score to be the predicted feature score of the sample object corresponding to that piece of first sample feature information;
S4, comparing the predicted feature score of the sample object with the feature score corresponding to the actual business information of the sample object, so as to test the original model.
Optionally, a larger feature score indicates a higher probability that the object is a target object; the preset value against which the feature score is compared may be, for example, 0.65.
In an optional embodiment, training the original model with the plurality of pieces of sample information and the corresponding business information to obtain the target model comprises:
S1, training the original model with the plurality of pieces of sample information and, for the sample object corresponding to each piece of sample information, the business information indicating whether that sample object has handled the target service, to obtain a target model that satisfies a preset recall and a preset precision and whose training has reached a preset number of iterations.
Optionally, precision is the proportion of samples predicted to be positive that are actually positive, and recall is the proportion of all actual positives that are correctly predicted to be positive. Precision and recall tend to conflict: in general, when precision is high, recall is often relatively low, and when recall is high, precision is often relatively low.
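The two metrics and their trade-off can be made concrete with a short Python sketch. The labels, scores, and thresholds below are invented purely for illustration and do not come from the source.

```python
from typing import List, Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def predict_at(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a sample positive when its score is greater than the threshold."""
    return [1 if s > threshold else 0 for s in scores]


# Invented example data: three actual positives, three actual negatives.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.1]

strict = precision_recall(y_true, predict_at(scores, 0.85))
loose = precision_recall(y_true, predict_at(scores, 0.2))
print("threshold 0.85:", strict)
print("threshold 0.20:", loose)
```

With the strict threshold every predicted positive is correct but two of the three actual positives are missed; with the loose threshold all actual positives are found at the cost of two false positives.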
The present invention is described in detail below with reference to a specific embodiment:
This embodiment takes potential bank customers as an example. As shown in Fig. 3, the embodiment as a whole can be divided into the following steps:
S301: Import the raw customer data into a distributed big data cluster.
S302: Preprocess the data.
S303: Extract features based on basic customer information and transaction behavior data.
S304: Build the data mining machine learning model, train the model, and output a list of potential customers.
S305: Collect feedback data, and continue to iterate and optimize the algorithm model according to the feedback data.
Data preprocessing can be carried out in the following manner:
The customer data stored by the bank card and savings business lines is read, and data items that cannot be verified or that do not help the target model are rejected after analysis. The screened raw data is imported into the Hadoop Distributed File System (HDFS) of the distributed big data cluster.
The data formats are unified (for example, percentages written with "%" are uniformly converted to floating-point numbers). Duplicate data is deleted. Invalid values and null values are handled. Outliers are handled according to business rules (for example, an age of -1 is clearly abnormal). Code values are normalized.
Features are extracted from basic customer information and transaction behavior data in the following manner:
After preprocessing, the raw data is organized with the bank customer's identity document number as the unique ID. The analysis focuses on the business line to which the potential target customers belong and on the related data of associated business lines, and examines in depth the data characteristics of the savings and credit card systems. By joining the data tables, data items such as those in Table 1 are extracted as customer feature values, which serve as the input of the target model in the subsequent steps.
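A minimal sketch of this kind of ID-keyed feature extraction is shown below. The contents of Table 1 are not available here, so the feature names (`age`, `txn_count`, `txn_total`), the table layouts, and the customer IDs are hypothetical placeholders, not the features the source actually uses.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def extract_features(
    basic_info: Dict[str, Dict[str, float]],
    transactions: List[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Join basic customer information with per-customer transaction
    aggregates, keyed by the customer's unique ID."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for customer_id, amount in transactions:
        totals[customer_id] += amount
        counts[customer_id] += 1
    features = {}
    for customer_id, info in basic_info.items():
        features[customer_id] = {
            "age": info["age"],                    # from the basic-information table
            "txn_count": counts[customer_id],      # 0 when no transactions exist
            "txn_total": totals[customer_id],
        }
    return features


basic = {"ID001": {"age": 35}, "ID002": {"age": 52}}
txns = [("ID001", 100.0), ("ID001", 50.0), ("ID003", 10.0)]
features = extract_features(basic, txns)
print(features)
```

Transactions for IDs that have no basic-information record (ID003 here) are ignored, which mirrors an inner join driven by the customer table.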
The target model is trained in the following manner:
Bank customers who have not handled personal credit business are extracted from historical data; after data preprocessing and extraction of the required features, they serve as negative samples for model training. Agriculture-related ("agriculture, rural areas and farmers") customers who have handled personal credit business and who, more than 3 months before the interest start date of the loan, had handled other services of the bank such as savings or credit cards are also extracted from historical data; after data preprocessing and extraction of the required features, they serve as positive samples for model training.
In general, the number of a bank's personal credit customers is far smaller than the number of all other customers who have not handled personal credit business, so the amounts of positive and negative sample data differ greatly. The negative samples are therefore divided at random into n parts, so that the ratio of positive to negative samples in each training round is more reasonable.
XGBoost is chosen as the target model. It is a boosted tree algorithm whose core principle is to integrate many weak classifiers into one strong classifier. The model has the following advantages: its objective is optimized using the second derivative of the loss function with respect to the function being fitted; it supports parallelization and trains quickly; it supports sample weights; it adopts several strategies to prevent overfitting; and it adds handling for sparse data.
Next, n rounds are carried out. In each round, one part of the negative samples is taken together with the positive samples and input into the XGBoost algorithm, which is trained to form a model. That model is then used to predict the remaining n-1 parts of the negative samples, giving each customer in the negative samples a probability score for handling personal credit business. The score lies in the interval [0, 1).
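The core Python code of the algorithm model is as follows:
import xgboost as xgb
# dval and eval hold the tree depth and the number of trees (note that eval shadows the Python builtin)
rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
# feature is the feature table; label.label is the label column of the label table
model_orig = rf1.fit(feature, label.label)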
After the n rounds described above are complete, each negative sample has obtained n-1 probability scores. The mean of these n-1 scores is the final probability, which is used to evaluate how likely the customer is to handle personal credit business; the larger the value, the higher the probability.
Finally, the customers whose final probability exceeds a threshold (for example, 0.65) form the list of potential customers for personal credit business.
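The n-round procedure, the averaging of the n-1 scores and the final threshold can be sketched as follows. This is a minimal illustration, not the patented implementation: `score_negatives`, `potential_customers` and the `train_fn` callback are hypothetical names, and `train_fn` stands in for whatever trains a classifier (for example, a wrapper around the XGBoost classifier named in the text) and returns a function that maps features to a probability.

```python
import random
from statistics import mean


def score_negatives(positives, negatives, n, train_fn, seed=0):
    """Return {customer_id: final probability} for every negative sample.

    positives, negatives: lists of (customer_id, features) pairs.
    train_fn(pos_features, neg_features) must return a callable that maps
    one feature vector to a probability score.
    """
    if n < 2 or n > len(negatives):
        raise ValueError("n must satisfy 2 <= n <= number of negative samples")
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)          # random split of negatives
    parts = [shuffled[i::n] for i in range(n)]     # n roughly equal parts
    pos_features = [feats for _, feats in positives]
    scores = {cid: [] for cid, _ in negatives}
    for k, part in enumerate(parts):
        # Round k: train on part k of the negatives plus all positives.
        model = train_fn(pos_features, [feats for _, feats in part])
        # Predict the remaining n-1 parts of negative samples.
        for j, other in enumerate(parts):
            if j == k:
                continue
            for cid, feats in other:
                scores[cid].append(model(feats))
    # Each negative sample now has n-1 scores; their mean is the final value.
    return {cid: mean(vals) for cid, vals in scores.items()}


def potential_customers(final_scores, threshold=0.65):
    """Customers whose final probability exceeds the threshold, sorted by ID."""
    return sorted(cid for cid, s in final_scores.items() if s > threshold)
```

Passing the trainer as a callback keeps the sketch independent of any particular library; the 0.65 default simply mirrors the example threshold given in the text.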
Feedback data is collected as follows:
After the full list of potential customers for personal credit business is obtained, it is divided by the customer's region or bank sub-branch and distributed to the credit account managers of the corresponding regional branch. The credit account managers carry out precision marketing to the listed customers, mark whether each marketed potential customer went on to handle personal credit business, and feed these results back.
The algorithm model is iteratively optimized according to the feedback data as follows:
Once the feedback data from credit staff marketing to potential customers has been collected, it can serve as a valuable supplement to the positive and negative samples for further iterative optimization of the model.
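One simple way to fold the marketing feedback into the sample sets is sketched below. The function name `merge_feedback` and the representation of feedback as a mapping from customer ID to a boolean are assumptions made for illustration; the patent does not describe the data format.

```python
def merge_feedback(positives, negatives, feedback):
    """Return updated (positives, negatives) sets of customer IDs.

    feedback maps customer_id -> True if the marketed customer went on to
    handle personal credit business, False otherwise.
    """
    new_pos = set(positives)
    new_neg = set(negatives)
    for cid, converted in feedback.items():
        if converted:
            # A successful conversion becomes a positive sample.
            new_pos.add(cid)
            new_neg.discard(cid)
        else:
            # An unsuccessful contact is kept (or added) as a negative sample.
            new_neg.add(cid)
    return new_pos, new_neg
```

The updated sets would then be used to rebuild the training data for the next round of model training.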
The model here mainly addresses a binary classification problem. Combining the predicted class with the true class of a sample gives four cases: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). These four cases are usually described by a confusion matrix, as shown in Table 2.
Table 2:
True class | Predicted positive | Predicted negative
Positive | TP | FN
Negative | FP | TN
The precision (Precision) and recall (Recall) are defined as:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$
Precision reflects the proportion of samples judged positive by the classifier that are actually positive, and recall reflects the proportion of all positive samples that are correctly judged positive. Precision and recall are conflicting measures: in general, when precision is high, recall tends to be low, and when recall is high, precision tends to be low. The F1 measure is usually used to consider precision and recall together. F1 is the harmonic mean of precision and recall:
$$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
It reflects the trade-off between precision and recall: the value of F1 is large only when both precision and recall are relatively large.
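As a worked illustration of these measures, the following sketch counts the confusion matrix cells from true and predicted 0/1 labels and computes precision, recall and F1. The function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the patent.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FP, TN, FN) for 0/1 labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and their harmonic mean F1 (0.0 on empty denominators)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with 30 true positives, 10 false positives and 20 false negatives, precision is 30/40 = 0.75, recall is 30/50 = 0.6, and F1 is 2 × 0.75 × 0.6 / 1.35 ≈ 0.667.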
Feeding the marketing results back helps iteratively optimize the model's classification and prediction ability, which improves the value of the potential customer list obtained from the algorithm model the next time and forms a closed loop.
In summary, compared with approaches such as advertising and offline marketing campaigns, the potential customer acquisition method provided by the present invention covers a wider range of customers, is more precise, and costs less. Compared with the conventional approach of screening potential customers with strong rules set from banking expertise, the present invention selects a machine learning algorithm that has strong generalization ability and boosts weak classifiers, and selects more features (which may be strong features or weak features), so the range of potential customers it can find is wider. Through iterative feedback, a closed loop of model self-optimization can also be achieved.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
According to another aspect of the embodiments of the present invention, an object determining device for implementing the above object determining method is further provided. As shown in Fig. 4, the device includes a first determining module 42 and a second determining module 44, which are described in detail below:
The first determining module 42 is configured to input the acquired object feature information of an object to be assessed into a target model, and obtain a feature score value, output by the target model, that corresponds to the object feature information;
The second determining module 44 is configured to determine the object to be assessed as a target object when the feature score value is greater than a preset value.
According to the present invention, the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score value, output by the target model, that corresponds to the object feature information; when the feature score value is greater than the preset value, the object to be assessed is determined as a target object. The target object can thus be determined using the feature score value, which solves the technical problem in the related art of low efficiency in mining potential customers.
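The division of work between the two modules can be illustrated with a short sketch. The function names mirror the module names but are otherwise hypothetical, the model is assumed to be any callable that maps a feature vector to a score, and the 0.65 default is taken from the example threshold mentioned elsewhere in the description.

```python
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], float]


def first_determining_module(model: Model, features: Sequence[float]) -> float:
    """Feed the object feature information to the target model; return its score."""
    return model(features)


def second_determining_module(score: float, preset_value: float) -> bool:
    """An object is a target object only if its score exceeds the preset value."""
    return score > preset_value


def determine_targets(candidates: Dict[str, Sequence[float]],
                      model: Model,
                      preset_value: float = 0.65) -> List[str]:
    """Run both modules over every candidate; return the IDs of target objects."""
    return [oid for oid, feats in candidates.items()
            if second_determining_module(first_determining_module(model, feats),
                                         preset_value)]
```

Keeping scoring and thresholding separate means the preset value can be tuned without retraining or touching the model.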
Optionally, the above may be applied to scenarios in which potential customers need to be mined, including but not limited to mining the potential customers of a bank. In this scenario, a potential customer may be a customer who has not handled banking business, or a customer who has not taken out further loans.
Optionally, the object feature information includes but is not limited to the information shown in Table 1.
Table 1:
Optionally, the target model may be, but is not limited to, a neural network model.
In an optional embodiment, before the acquired object feature information of the object to be assessed is input into the target model to obtain the feature score value, output by the target model, that corresponds to the object feature information, the method further includes:
S1: acquiring a plurality of pieces of sample information, and business information indicating whether the sample object corresponding to each piece of sample information has handled a target business;
S2: training an original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, where the plurality of pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
Optionally, the sample information may be that of bank customers, extracted from historical data, who have not handled personal credit business, or that of bank customers who have handled bank credit business.
In an optional embodiment, after the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business are acquired, the method further includes:
S1: storing the plurality of pieces of sample information in a distributed file system;
S2: performing data processing on the plurality of pieces of sample information to obtain N pieces of target sample information, where N is a natural number greater than 1.
Optionally, performing data processing on the plurality of pieces of sample information includes at least one of the following: converting the plurality of pieces of sample information to a preset format; deleting duplicate information in the plurality of pieces of sample information; handling invalid information and/or null information in the plurality of pieces of sample information; handling outliers in the plurality of pieces of sample information, where the outliers include information that does not conform to natural rules; and normalizing each piece of sample information.
For example, the customer data stored by the bank card and savings business lines is read; after analysis, data items that cannot be verified or that do not help the target model are rejected, and the screened raw data is imported into the Hadoop Distributed File System (HDFS) of a distributed big data cluster. Data formats are converted uniformly (for example, percentages carrying a "%" sign are converted to floating-point numbers), duplicate records are deleted, invalid values and null values are handled, outliers are handled based on business rules (for example, an age of -1 is clearly abnormal), and code values are normalized.
In an optional embodiment, training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, includes:
S1: using the identity information of the sample object corresponding to each piece of target sample information in the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information to obtain M pieces of sample feature information, where M is a natural number greater than 1;
S2: determining the sample feature information, among the M pieces of sample feature information, for which the target business has not been handled as first sample feature information, to obtain O pieces of first sample feature information, where O is a natural number greater than 1 and O is less than M;
S3: determining the sample feature information, among the M pieces of sample feature information, for which the target business has been handled as second sample feature information, to obtain P pieces of second sample feature information, where P is a natural number greater than 1, P is less than M, and P is less than O;
S4: training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
Optionally, for example, the first sample feature information may be that of bank customers, extracted from historical data, who have not handled personal credit business. The second sample feature information may be that of agriculture, rural area and farmer customers, extracted from historical data, who have handled personal credit business and who had handled other business at this bank such as savings or credit cards more than three months before the loan's interest start date.
Optionally, a bank's personal credit customers are far fewer than its customers who have not handled personal credit business, so the first sample feature information and the second sample feature information need to be balanced.
Optionally, training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, includes:
S1: dividing the O pieces of first sample feature information into Q parts of first sample feature information, where Q is a natural number greater than 1 and Q is less than or equal to O;
S2: combining each part of first sample feature information in the Q parts with the P pieces of second sample feature information, to obtain Q groups of sample feature information;
S3: training the original model Q times using each group of sample feature information in the Q groups and the business information corresponding to each group, to obtain the target model.
Optionally, for example, Q rounds may be run. In each round, one part of the first sample feature information is taken together with the second sample feature information and input into the original model for training, and the original model is then used to predict the remaining Q-1 parts of sample feature information.
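The construction of the Q groups in steps S1 and S2 can be sketched as follows. The function name `build_training_groups`, the seeded shuffle and the 0/1 label encoding (0 for target business not handled, 1 for handled) are assumptions made for illustration.

```python
import random


def build_training_groups(first_samples, second_samples, q, seed=0):
    """Split the O first samples into q parts and pair each with all P second samples.

    Returns a list of q (features, labels) tuples, where label 0 marks a first
    sample (target business not handled) and label 1 marks a second sample.
    """
    if not 2 <= q <= len(first_samples):
        raise ValueError("q must satisfy 2 <= q <= number of first samples")
    shuffled = list(first_samples)
    random.Random(seed).shuffle(shuffled)
    groups = []
    for k in range(q):
        part = shuffled[k::q]                      # k-th of q disjoint parts
        features = part + list(second_samples)     # one part + all second samples
        labels = [0] * len(part) + [1] * len(second_samples)
        groups.append((features, labels))
    return groups
```

Each returned pair could then be passed to one training run, giving the Q trainings described in step S3.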
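Optionally, the core Python code of the original model may be as follows:
import xgboost as xgb
# dval and eval hold the tree depth and the number of trees (note that eval shadows the Python builtin)
rf1 = xgb.XGBClassifier(max_depth=dval, n_estimators=eval)
# feature is the feature table; label.label is the label column of the label table
model_orig = rf1.fit(feature, label.label)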
In an optional embodiment, after the original model is trained with each group of sample feature information in the Q groups and the business information corresponding to each group to obtain the target model, the method further includes:
S1: determining the predicted feature score value, output by the original model in each training process, that corresponds to each piece of first sample feature information, so that each piece of first sample feature information corresponds to Q predicted feature score values;
S2: calculating the average of the Q predicted feature score values to obtain the average predicted feature score value of each piece of first sample feature information;
S3: determining the average predicted feature score value as the predicted feature score value of the sample object corresponding to each piece of first sample feature information;
S4: comparing the predicted feature score value of the sample object with the feature score value corresponding to the actual business information of the sample object, to test the original model.
Optionally, a larger feature score value indicates a higher probability that the object is a target object, and the preset value may be, for example, 0.65.
In an optional embodiment, training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, includes:
S1: training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain a target model that satisfies a preset recall, a preset precision, and a preset number of training iterations of the original model.
Optionally, precision reflects the proportion of samples judged positive that are actually positive, and recall reflects the proportion of all positive samples that are correctly judged positive. Precision and recall are conflicting measures: in general, when precision is high, recall tends to be low, and when recall is high, precision tends to be low.
If the integrated unit in the above embodiments is implemented as a software functional unit and sold or used as an independent product, it may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of units or modules through some interfaces, and may be electrical or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. An object determining method, characterized by comprising:
inputting acquired object feature information of an object to be assessed into a target model, and obtaining a feature score value, output by the target model, that corresponds to the object feature information;
determining the object to be assessed as a target object when the feature score value is greater than a preset value.
2. The method according to claim 1, characterized in that, before inputting the acquired object feature information of the object to be assessed into the target model and obtaining the feature score value, output by the target model, that corresponds to the object feature information, the method further comprises:
acquiring a plurality of pieces of sample information, and business information indicating whether the sample object corresponding to each piece of sample information in the plurality of pieces of sample information has handled a target business;
training an original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain the target model, wherein the plurality of pieces of sample information are the input of the original model, and the actual business information corresponding to each piece of sample information output by the trained target model satisfies an objective function.
3. The method according to claim 2, characterized in that, after acquiring the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information in the plurality of pieces of sample information has handled the target business, the method further comprises:
storing the plurality of pieces of sample information in a distributed file system;
performing data processing on the plurality of pieces of sample information to obtain N pieces of target sample information, wherein N is a natural number greater than 1.
4. The method according to claim 3, characterized in that performing data processing on the plurality of pieces of sample information comprises at least one of the following:
converting the plurality of pieces of sample information to a preset format;
deleting duplicate information in the plurality of pieces of sample information;
handling invalid information and/or null information in the plurality of pieces of sample information;
handling outliers in the plurality of pieces of sample information, wherein the outliers include information that does not conform to natural rules;
normalizing each piece of sample information in the plurality of pieces of sample information.
5. The method according to claim 3, characterized in that training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information in the plurality of pieces of sample information has handled the target business, to obtain the target model, comprises:
using the identity information of the sample object corresponding to each piece of target sample information in the N pieces of target sample information as an identifier, extracting the feature information from each piece of target sample information in the N pieces of target sample information to obtain M pieces of sample feature information, wherein M is a natural number greater than 1;
determining the sample feature information, among the M pieces of sample feature information, for which the target business has not been handled as first sample feature information, to obtain O pieces of first sample feature information, wherein O is a natural number greater than 1 and O is less than M;
determining the sample feature information, among the M pieces of sample feature information, for which the target business has been handled as second sample feature information, to obtain P pieces of second sample feature information, wherein P is a natural number greater than 1, P is less than M, and P is less than O;
training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model.
6. The method according to claim 5, characterized in that training the original model using the O pieces of first sample feature information, the P pieces of second sample feature information, the first business information corresponding to the O pieces of first sample feature information, and the second business information corresponding to the P pieces of second sample feature information, to obtain the target model, comprises:
dividing the O pieces of first sample feature information into Q parts of first sample feature information, wherein Q is a natural number greater than 1 and Q is less than or equal to O;
combining each part of first sample feature information in the Q parts of first sample feature information with the P pieces of second sample feature information, to obtain Q groups of sample feature information;
training the original model Q times using each group of sample feature information in the Q groups of sample feature information and the business information corresponding to each group of sample feature information, to obtain the target model.
7. The method according to claim 6, characterized in that, after training the original model with each group of sample feature information in the Q groups of sample feature information and the business information corresponding to each group of sample feature information to obtain the target model, the method further comprises:
determining the predicted feature score value, output by the original model in each training process, that corresponds to each piece of first sample feature information, so that each piece of first sample feature information corresponds to Q predicted feature score values;
calculating the average of the Q predicted feature score values to obtain the average predicted feature score value of each piece of first sample feature information;
determining the average predicted feature score value as the predicted feature score value of the sample object corresponding to each piece of first sample feature information;
comparing the predicted feature score value of the sample object with the feature score value corresponding to the actual business information of the sample object, to test the original model.
8. The method according to claim 2, characterized in that training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information in the plurality of pieces of sample information has handled the target business, to obtain the target model, comprises:
training the original model using the plurality of pieces of sample information and the business information indicating whether the sample object corresponding to each piece of sample information has handled the target business, to obtain a target model that satisfies a preset recall, a preset precision, and a preset number of training iterations of the original model.
9. An object determining device, characterized by comprising:
a first determining module, configured to input acquired object feature information of an object to be assessed into a target model, and obtain a feature score value, output by the target model, that corresponds to the object feature information;
a second determining module, configured to determine the object to be assessed as a target object when the feature score value is greater than a preset value.
10. A storage medium in which a computer program is stored, characterized in that the computer program is configured to execute, when run, the method according to any one of claims 1 to 8.
11. An electronic device, comprising a memory and a processor, characterized in that a computer program is stored in the memory, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 8.
CN201910533200.6A 2019-06-19 2019-06-19 Object determines method and device, storage medium, electronic device Pending CN110288465A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910533200.6A CN110288465A (en) 2019-06-19 2019-06-19 Object determines method and device, storage medium, electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910533200.6A CN110288465A (en) 2019-06-19 2019-06-19 Object determines method and device, storage medium, electronic device

Publications (1)

Publication Number Publication Date
CN110288465A true CN110288465A (en) 2019-09-27

Family

ID=68003889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910533200.6A Pending CN110288465A (en) 2019-06-19 2019-06-19 Object determines method and device, storage medium, electronic device

Country Status (1)

Country Link
CN (1) CN110288465A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659985A (en) * 2019-09-30 2020-01-07 上海淇玥信息技术有限公司 Method and device for fishing back false rejection potential user and electronic equipment
CN111796581A (en) * 2020-07-06 2020-10-20 中铁二十局集团有限公司 Shield data acquisition method and device and computer storage medium
CN113409096A (en) * 2021-08-19 2021-09-17 腾讯科技(深圳)有限公司 Target object identification method and device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104702465A (en) * 2015-02-09 2015-06-10 桂林电子科技大学 Parallel network flow classification method
US20170063902A1 (en) * 2015-08-31 2017-03-02 Splunk Inc. Interface Having Selectable, Interactive Views For Evaluating Potential Network Compromise
CN107230108A (en) * 2017-06-13 2017-10-03 北京百分点信息科技有限公司 The processing method and processing device of business datum

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
金秋月: "基于偏袒性集成学习的客户流失建模方法研究", 《中国优秀硕士学位论文全文数据库(经济与管理科学辑)》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659985A (en) * 2019-09-30 2020-01-07 上海淇玥信息技术有限公司 Method and device for fishing back false rejection potential user and electronic equipment
CN111796581A (en) * 2020-07-06 2020-10-20 中铁二十局集团有限公司 Shield data acquisition method and device and computer storage medium
CN111796581B (en) * 2020-07-06 2022-03-18 中铁二十局集团有限公司 Shield data acquisition method and device and computer storage medium
CN113409096A (en) * 2021-08-19 2021-09-17 腾讯科技(深圳)有限公司 Target object identification method and device, computer equipment and storage medium
CN113409096B (en) * 2021-08-19 2021-11-16 腾讯科技(深圳)有限公司 Target object identification method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109635117A (en) Method and device for recognizing user intention based on a knowledge graph
CN108197532A (en) Face recognition method, apparatus and computer device
CN110309840A (en) Risky transaction identification method, device, server and storage medium
CN108427708A (en) Data processing method, device, storage medium and electronic device
CN109583904A (en) Training method for abnormal operation detection model, abnormal operation detection method and device
CN108537671A (en) Transaction risk assessment method and system
CN110288465A (en) Object determination method and device, storage medium, electronic device
CN109360097A (en) Stock index prediction method, apparatus, equipment and storage medium based on deep learning
CN109918560A (en) Question answering method and device based on a search engine
CN106503006A (en) Method and device for sorting sub-applications within an App
CN105931068A (en) Cardholder consumption profile generation method and device
CN106844407A (en) Label network generation method and system based on data set correlation
CN110110049A (en) Service consultation method, apparatus, system, service robot and storage medium
CN114723966B (en) Multi-task recognition method, training method, device, electronic equipment and storage medium
CN110288350A (en) User value prediction method, device, equipment and storage medium
CN114612251A (en) Risk assessment method, device, equipment and storage medium
CN104346698A (en) Catering member big data analysis and checking system based on cloud computing and data mining
CN112529477A (en) Credit evaluation variable screening method, device, computer equipment and storage medium
CN115965058A (en) Neural network training method, entity information classification method, device and storage medium
CN110197426A (en) Method and device for establishing a credit scoring model, and readable storage medium
CN109493186A (en) Method and apparatus for determining pushed information
CN116049536A (en) Recommendation method and related device
CN106156256A (en) Method and system for classified transmission of user information
CN111368060B (en) Self-learning method, device and system for conversation robot, electronic equipment and medium
CN116502132A (en) Account set identification method, device, equipment, medium and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190927
