CA2403249A1 - Gradient criterion method for neural networks and application to targeted marketing - Google Patents

Gradient criterion method for neural networks and application to targeted marketing

Info

Publication number
CA2403249A1
CA2403249A1
Authority
CA
Canada
Prior art keywords
application server
customer
multithreaded
central application
maximum likelihood
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002403249A
Other languages
French (fr)
Inventor
Yuri Galperin
Vladimir Fishman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marketswitch Corp
Original Assignee
Marketswitch Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketswitch Corp filed Critical Marketswitch Corp
Publication of CA2403249A1 publication Critical patent/CA2403249A1/en
Abandoned legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Abstract

The present invention is drawn to a unique application of the Maximum Likelihood statistical method to commercial neural network technologies. The present invention utilizes the specific nature of the output in target marketing problems and makes it possible to produce more accurate and predictive results by minimizing a gradient criterion to produce model weights to get the maximum likelihood result. It is best used on "noisy" data and when one is interested in determining a distribution's overall accuracy, or best general description of reality.

Description

TITLE OF THE INVENTION: Gradient Criterion Method for Neural Networks and Application to Targeted Marketing

FIELD OF THE INVENTION:
This invention relates generally to the development of neural network models to optimize the effects of targeted marketing programs. More specifically, this invention is an improvement on the Maximum Likelihood method of training neural networks using a gradient criterion, and is specially designed for binary output having strongly uneven proportions, which is typical of direct marketing problems.

BACKGROUND OF THE INVENTION:
The goal of most modeling procedures is to minimize the discrepancy between real results and model outputs. If the discrepancy, or error, can be accumulated on a record-by-record basis, it is suitable for gradient algorithms like Maximum Likelihood.
The goal of target marketing modeling is typically to find a method to calculate the probability that any prospect on the list will respond to an offer. The neural network model is built based on experimental data (a test mailing), and the traditional approach to this problem is to choose a model and compute model parameters with a model-fitting procedure.
The topology of the model (for example, the number of nodes, and the input and transfer functions) defines the formula that expresses the probability of response as a function of attributes.
In a special model fitting procedure, the output of the model is tested against actual output (from the results of a test mailing) and the discrepancy is accumulated in a special error function. Different types of error functions can be used (e.g., mean square, absolute error); model parameters are determined to minimize the error function. The best fitting of model parameters is an implicit indication that the model is good (not necessarily the best) in terms of its original objective.
Thus the model building process is defined by two entities: the type of model and the error (or utility) function. The type of model defines the ability of the model to discern various patterns in the data. For example, increasing the number of nodes results in more complicated formulae, so a model can more accurately discern complicated patterns.
The "goodness" of the model is ultimately defined by the choice of an error function, since it is the error function that is minimized during the model training process.
To reach the goal of modeling, one wants to use a utility function that assigns probabilities that are most in compliance with the results of the experiment (the test mailing). The Maximum Likelihood criterion is the explicit measure of this compliance. However, the modeling process as it exists today has a significant drawback: it uses conventional utility functions (least mean square, cross entropy) only because there is a mathematical apparatus developed for these utility functions.
What would really be useful is a process that builds a response model by directly maximizing the likelihood.
For example, suppose a random variable X exists with the distribution p(X, A), where A is an unknown vector of parameters to be estimated based on the independent observations of X: (x1, x2, ..., xN). The goal is to find a vector A that makes the probability of the output p(x1, A) * p(x2, A) * ... * p(xN, A) maximally possible. Note that the function p(X, A) should be a known function of two variables. The Maximum Likelihood technique provides the mathematical apparatus to solve this optimization problem.
In general, the Maximum Likelihood method can be applied to neural networks as follows. Let the neural network calculate a value of the output variable y based on the input vector X. The observed values (y1, y2, ..., yN) represent the actual output with some error e. Assuming that this error has, for example, a normal distribution, the method can find weights W of the neural network that make the probability of the output p(y1, W) * p(y2, W) * ... * p(yN, W) maximally possible. In the case of a normal probability function, the Maximum Likelihood criterion is equivalent to the Least Mean Square criterion, which is, in fact, the one most widely used for neural network training.
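The equivalence for normal errors can be checked numerically. In the sketch below (an illustration with assumed values, not part of the patent), the Gaussian negative log-likelihood differs from the least-mean-square criterion only by the constant factor 1/(2*sigma^2) once the weight-independent term is dropped, so the two criteria share the same minimizing weights:

```python
import numpy as np

rng = np.random.default_rng(0)
yhat = rng.uniform(size=100)                  # model outputs for some fixed W
y = yhat + rng.normal(0.0, 0.1, size=100)    # observed outputs with normal error

sigma = 0.1
# Gaussian negative log-likelihood, dropping the constant N*ln(sigma*sqrt(2*pi))
# that does not depend on the weights:
nll = np.sum((y - yhat) ** 2) / (2.0 * sigma ** 2)
# Least-mean-square criterion over the same residuals:
lms = np.sum((y - yhat) ** 2)

# nll is exactly lms scaled by 1/(2*sigma^2), so both criteria rank candidate
# weight sets identically.
print(nll / lms)
```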
In the case of target marketing, the observed output X is a binary variable that is equal to 1 if a customer responded to the offer, and is 0 otherwise. The normality assumption is too rough, and leads to a sub-optimal set of neural network weights if used in neural network training. This is a typical direct marketing scenario.
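A small illustration (numbers chosen for this note, not from the patent) of why the normality assumption is rough for binary output: the least-mean-square penalty for a badly mis-scored responder saturates near 1, while a likelihood-based penalty grows without bound, so LMS underweights exactly the confident mistakes that matter most when responders are rare:

```python
import math

# A responder (actual output 1) scored with two candidate probabilities p.
for p in (0.01, 0.2):
    lms = (1.0 - p) ** 2     # least-mean-square penalty, bounded by 1
    nll = -math.log(p)       # likelihood-based penalty, unbounded as p -> 0
    print(p, round(lms, 3), round(nll, 3))
```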

SUMMARY OF THE INVENTION:
The present invention represents a unique application of the Maximum Likelihood statistical method to commercial neural network technologies. The present invention utilizes the specific nature of the output in target marketing problems and makes it possible to produce more accurate and predictive results. It is best used on "noisy" data and when one is interested in determining a distribution's overall accuracy, or best general description of reality.
The present invention provides a competitive advantage over off-the-shelf modeling packages in that it greatly enhances the application of Maximum Likelihood to quantitative marketing applications such as customer acquisition, cross-selling/up-selling, predictive customer profitability modeling, and channel optimization.
Specifically, the superior predictive modeling capability provided by using the present invention means that marketing analysts will be better able to:
    • Predict the propensity of individual prospects to respond to an offer, thus enabling marketers to better identify target markets.
    • Identify customers and prospects who are most likely to default on loans, so that remedial action can be taken, or so that those prospects can be excluded from certain offers.
    • Identify customers or prospects who are most likely to prepay loans, so a better estimate can be made of revenues.
    • Identify customers who are most amenable to cross-sell and up-sell opportunities.
    • Predict claims experience, so that insurers can better establish risk and set premiums appropriately.
    • Identify instances of credit-card fraud.

BRIEF DESCRIPTION OF THE DRAWINGS:

Figure 1 shows the dataflow of the method of training the model of the present invention. Figure 2 illustrates a preferred system architecture for employing the present invention.

The present invention uses the neural network to calculate a propensity score g(X, W), where W is a set of weights of the neural network and X is a vector of customer attributes (the input vector). The probability that a customer with attributes X will respond to an offer can be calculated by the formula:
    p = g(X, W) / (1 + g(X, W))

If there are N independent samples and among them n are responders, the probability of such an output is:
    L = [ ∏_{i∈resp} g(X_i, W) · ∏_{i∈non-resp} (1 - g(X_i, W)) ] / ∏_{i=1..N} (1 + g(X_i, W))

Using the logarithm of L as a training criterion (training error) in the form:
    Err = -ln L = Σ_{i=1..N} ln(1 + g_i) - Σ_{i∈resp} ln g_i - Σ_{i∈non-resp} ln(1 - g_i),  where g_i = g(X_i, W),

the neural network training procedure finds the optimal weights W that minimize Err and thus maximize the likelihood L of the observed output. One can use back propagation or a similar method to perform the training. The gradient criterion that is required by a training procedure is computed as follows:
    ∂Err/∂W = Σ_{i=1..N} g'_i / (1 + g_i) - Σ_{i∈resp} g'_i / g_i + Σ_{i∈non-resp} g'_i / (1 - g_i),  where g'_i = ∂g(X_i, W)/∂W

In order for the training procedure to be robust and stable, the output of the neural network should be in the middle of the working interval [0, 1]. To ensure that, the present invention introduces the normalized propensity score f, which is related to g as:
    g(X, W) = f^{1/τ}(X, W)

Now, let f be the output of the neural network and choose the parameter τ in such a way that f is of the order of 0.5.
Let R be the average response rate in the sample. The above condition is satisfied if:

    τ = 1 / ln((1 - R) / R)
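For an assumed response rate (the 5% figure below is illustrative, not from the patent), τ can be computed directly; a customer at the average response rate, whose odds are g = R/(1 - R), then receives a normalized score f = g^τ = exp(-1) ≈ 0.37, which is indeed of the order of 0.5:

```python
import math

R = 0.05                               # assumed average response rate
tau = 1.0 / math.log((1.0 - R) / R)    # tau = 1 / ln((1 - R) / R)

g_avg = R / (1.0 - R)                  # odds of the average-rate customer
f_avg = g_avg ** tau                   # normalized score for that customer
print(tau, f_avg)                      # f_avg = exp(-1), of the order of 0.5
```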
While training the model, the criterion being optimized is calculated from the output of the neural network using the formula:

    Err = -ln L = Σ_{i=1..N} ln(1 + f_i^{1/τ}) - (1/τ) Σ_{i∈resp} ln f_i - Σ_{i∈non-resp} ln(1 - f_i^{1/τ})

The gradient criterion is computed as follows:
    ∂Err/∂W = (1/τ) [ Σ_{i=1..N} f_i^{1/τ-1} f'_i / (1 + f_i^{1/τ}) - Σ_{i∈resp} f'_i / f_i + Σ_{i∈non-resp} f_i^{1/τ-1} f'_i / (1 - f_i^{1/τ}) ],  where f'_i = ∂f(X_i, W)/∂W
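The un-normalized criterion and its gradient above can be checked numerically. The sketch below uses assumed stand-ins (a single sigmoid unit for the network, synthetic responders), not the patent's implementation, and verifies the analytic gradient against central finite differences:

```python
import numpy as np

def propensity(w, X):
    # Hypothetical stand-in for the network: one sigmoid unit keeps g in (0, 1),
    # so that ln(g_i) and ln(1 - g_i) are both defined.
    return 1.0 / (1.0 + np.exp(-(X @ w)))

def train_error(w, X, resp):
    # Err = sum_i ln(1 + g_i) - sum_resp ln(g_i) - sum_non-resp ln(1 - g_i)
    g = propensity(w, X)
    return (np.sum(np.log1p(g)) - np.sum(np.log(g[resp]))
            - np.sum(np.log(1.0 - g[~resp])))

def train_error_grad(w, X, resp):
    # dErr/dW = sum_i [1/(1+g_i) - resp_i/g_i + (1-resp_i)/(1-g_i)] * dg_i/dW
    g = propensity(w, X)
    coef = (1.0 / (1.0 + g) - np.where(resp, 1.0 / g, 0.0)
            + np.where(resp, 0.0, 1.0 / (1.0 - g)))
    dg = (g * (1.0 - g))[:, None] * X        # derivative of the sigmoid unit
    return coef @ dg

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
resp = rng.uniform(size=50) < 0.2            # roughly 20% responders
w = rng.normal(scale=0.1, size=3)

# Central finite differences should agree with the analytic gradient.
eps = 1e-6
num = np.array([(train_error(w + eps * e, X, resp)
                 - train_error(w - eps * e, X, resp)) / (2 * eps)
                for e in np.eye(3)])
print(np.max(np.abs(num - train_error_grad(w, X, resp))))
```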
The method was tested on a variety of business cases against both the Least Mean Square and Cross-Entropy criteria. In all cases the method gave a 20% to 50% improvement in the lift on the top 20% of the target marketing sample customer pools.
As shown in Figure 1, the method inputs data from modeling database 11 into a selected model 12 to calculate scores 13. The error 14 is calculated from a comparison with the known responses from modeling database 11 and checked for convergence 15 below a desired level. When convergence occurs, a new model 16 results, to be used for targeted marketing 17. Otherwise, the process minimizes the error, solves for a new set of weights at 18, and begins a new iteration.
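The Figure 1 loop can be sketched in code. This is an illustrative reading of the dataflow, not the patent's engine: the single-sigmoid scoring model, learning rate, and stopping rule are assumptions, and the comments map steps to the figure's reference numerals:

```python
import numpy as np

def scores(w, X):
    # Hypothetical stand-in for the selected model 12: one sigmoid unit.
    return 1.0 / (1.0 + np.exp(-(X @ w)))

def error(g, resp):
    # Err = sum ln(1+g) - sum_resp ln(g) - sum_non-resp ln(1-g)
    return (np.sum(np.log1p(g)) - np.sum(np.log(g[resp]))
            - np.sum(np.log(1.0 - g[~resp])))

def train(X, resp, w0, tol=1e-8, max_iter=500, lr=0.01):
    w, w_prev, prev = w0.copy(), w0.copy(), np.inf
    for _ in range(max_iter):
        g = scores(w, X)                         # calculate scores (13)
        e = error(g, resp)                       # error vs. known responses (14)
        if prev - e < tol:                       # convergence check (15)
            break                                # w_prev is the new model (16)
        prev, w_prev = e, w.copy()
        coef = (1.0 / (1.0 + g) - np.where(resp, 1.0 / g, 0.0)
                + np.where(resp, 0.0, 1.0 / (1.0 - g)))
        w = w - lr * (coef @ ((g * (1.0 - g))[:, None] * X))  # new weights (18)
    return w_prev

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))                    # modeling database 11 (synthetic)
resp = rng.uniform(size=200) < 0.2               # known responses (synthetic)
w0 = np.zeros(3)
w = train(X, resp, w0)                           # model for targeted marketing (17)
```

The returned weights always come from the last iteration whose error strictly decreased, so the loop never hands back a model worse than its predecessor.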
The present invention operates on a computer system and is used for targeted marketing purposes. In a preferred embodiment, as shown in Figure 2, the system runs on a three-tier architecture that supports CORBA as an intercommunications protocol. The desktop client software on targeted marketing workstations 20 supports JAVA. The central application server 22 and multithreaded calculation engines 24, 25 run on Windows NT or UNIX. Modeling database 26 is used for training new models to be applied for targeted marketing related to customer database 28. The recommended minimum system requirements for application server 22 and multithreaded calculation engines 24, 25 are as follows:
HP Platform
Processor: HP
Memory: 256 MB
Disk Space: 10 MB*
Operating System: HP/UX 11 (32 Bit)
Protocol: TCP/IP
Daemons: Telnet and FTP (Optional)
Permissions: Read/Write permissions in area of server installation (no root permissions)

*Approximately 100 MB per 1 million records in the customer database. The above assumes the user client is installed on a PC with the recommended configuration found below.

The recommended minimum requirements for the targeted marketing workstations are as follows:

Used in conjunction with a neural network, the present invention provides a user with data indicating the individuals or classes of individuals who are most likely to respond to direct marketing.

Claims

7. A system for training neural networks with a maximum likelihood utility function, comprising:

a central application server;

a modeling database connected to said central application server;

at least one workstation networked to said central application server;

at least one multithreaded calculation engine networked to said central application server; and software instructions on said central application server, at least one workstation and at least one multithreaded calculation engine so as to provide for:

said at least one workstation to select an initial model function for a propensity score g(X,W), where W is a set of weights of the neural network and X is a vector of customer attributes from a modeling database; and said at least one multithreaded calculation engine to calculate propensity scores for the customers in the modeling database;

calculate a training error Err; measure the error to check for convergence below a desired value;
obtain a new model and apply it to new data when convergence occurs;
minimize the error to solve for new weights W by minimizing the gradient criterion defined by the formula:

begin a new iteration of the process by calculating new propensity scores for the customers in the modeling database.

8. The system for training neural networks with a maximum likelihood utility function of claim 7, further comprising:

a customer database connected to said central application server and said at least one multithreaded calculation engine; and software instructions to apply the new model to customer data from said customer database upon being selected by said at least one workstation.

9. The system for training neural networks with a maximum likelihood utility function of claim 7, further comprising:

software instructions on said at least one multithreaded calculation engine to:

define f as a normalized propensity score related to g(X,W) by the formula:

g(X,W) = f^{1/τ}(X,W), where f is the output of the neural network; and choose the parameter τ in such a way that f is of the order of 0.5;

wherein R is an average response rate in the sample and the above condition is satisfied if:

wherein:

and gradient criterion is computed as follows:

10. The system for training neural networks with a maximum likelihood utility function of claim 7, further comprising:

a customer database connected to said central application server and said at least one multithreaded calculation engine; and software instructions to apply the new model to a top 20% of a targeted marketing sample customer pool selected from said customer database by said at least one workstation.

11. The system for training neural networks with a maximum likelihood utility function of claim 9, further comprising:

a customer database connected to said central application server and said at least one multithreaded calculation engine; and software instructions to apply the new model to a top 20% of a targeted marketing sample customer pool selected from said customer database by said at least one workstation.

Claims 7-12 added to define the apparatus of the invention.

All the remaining claims are unchanged.
CA002403249A 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing Abandoned CA2403249A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US12421799P 1999-03-15 1999-03-15
US60/124,217 1999-03-15
PCT/US2000/006735 WO2000055790A2 (en) 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing

Publications (1)

Publication Number Publication Date
CA2403249A1 true CA2403249A1 (en) 2000-09-21

Family

ID=22413524

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002403249A Abandoned CA2403249A1 (en) 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing

Country Status (3)

Country Link
AU (1) AU3884000A (en)
CA (1) CA2403249A1 (en)
WO (1) WO2000055790A2 (en)

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6993493B1 (en) 1999-08-06 2006-01-31 Marketswitch Corporation Method for optimizing net present value of a cross-selling marketing campaign
SE519507C2 (en) * 2000-10-31 2003-03-04 Goesta Granlund Method and apparatus for training associative networks for artificial vision
US8346593B2 (en) 2004-06-30 2013-01-01 Experian Marketing Solutions, Inc. System, method, and software for prediction of attitudinal and message responsiveness
US8732004B1 (en) 2004-09-22 2014-05-20 Experian Information Solutions, Inc. Automated analysis of data to generate prospect notifications based on trigger events
US7711636B2 (en) 2006-03-10 2010-05-04 Experian Information Solutions, Inc. Systems and methods for analyzing data
US8036979B1 (en) 2006-10-05 2011-10-11 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
US8027871B2 (en) 2006-11-03 2011-09-27 Experian Marketing Solutions, Inc. Systems and methods for scoring sales leads
US8606626B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. Systems and methods for providing a direct marketing campaign planning environment
US8606666B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US20080294540A1 (en) 2007-05-25 2008-11-27 Celka Christopher J System and method for automated detection of never-pay data sets
US7996521B2 (en) 2007-11-19 2011-08-09 Experian Marketing Solutions, Inc. Service for mapping IP addresses to user segments
US20100174638A1 (en) 2009-01-06 2010-07-08 ConsumerInfo.com Report existence monitoring
WO2010132492A2 (en) 2009-05-11 2010-11-18 Experian Marketing Solutions, Inc. Systems and methods for providing anonymized user profile data
US9652802B1 (en) 2010-03-24 2017-05-16 Consumerinfo.Com, Inc. Indirect monitoring and reporting of a user's credit data
US9152727B1 (en) 2010-08-23 2015-10-06 Experian Marketing Solutions, Inc. Systems and methods for processing consumer information for targeted marketing applications
US8930262B1 (en) 2010-11-02 2015-01-06 Experian Technology Ltd. Systems and methods of assisted strategy design
US9235728B2 (en) 2011-02-18 2016-01-12 Csidentity Corporation System and methods for identifying compromised personally identifiable information on the internet
US11030562B1 (en) 2011-10-31 2021-06-08 Consumerinfo.Com, Inc. Pre-data breach monitoring
US10255598B1 (en) 2012-12-06 2019-04-09 Consumerinfo.Com, Inc. Credit card account data extraction
US8812387B1 (en) 2013-03-14 2014-08-19 Csidentity Corporation System and method for identifying related credit inquiries
US10102536B1 (en) 2013-11-15 2018-10-16 Experian Information Solutions, Inc. Micro-geographic aggregation system
US10262362B1 (en) 2014-02-14 2019-04-16 Experian Information Solutions, Inc. Automatic generation of code for attributes
US9576030B1 (en) 2014-05-07 2017-02-21 Consumerinfo.Com, Inc. Keeping up with the joneses
US11257117B1 (en) 2014-06-25 2022-02-22 Experian Information Solutions, Inc. Mobile device sighting location analytics and profiling system
US10339527B1 (en) 2014-10-31 2019-07-02 Experian Information Solutions, Inc. System and architecture for electronic fraud detection
US10242019B1 (en) 2014-12-19 2019-03-26 Experian Information Solutions, Inc. User behavior segmentation using latent topic detection
US11151468B1 (en) 2015-07-02 2021-10-19 Experian Information Solutions, Inc. Behavior analysis using distributed representations of event data
US9767309B1 (en) 2015-11-23 2017-09-19 Experian Information Solutions, Inc. Access control system for implementing access restrictions of regulated database records while identifying and providing indicators of regulated database records matching validation criteria
US20180060954A1 (en) 2016-08-24 2018-03-01 Experian Information Solutions, Inc. Sensors and system for detection of device movement and authentication of device user based on messaging service data from service provider
US10699028B1 (en) 2017-09-28 2020-06-30 Csidentity Corporation Identity security architecture systems and methods
US10896472B1 (en) 2017-11-14 2021-01-19 Csidentity Corporation Security and identity verification system and architecture
US11682041B1 (en) 2020-01-13 2023-06-20 Experian Marketing Solutions, Llc Systems and methods of a tracking analytics platform
CN112070593B (en) * 2020-09-29 2023-09-05 中国银行股份有限公司 Data processing method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05346915A (en) * 1992-01-30 1993-12-27 Ricoh Co Ltd Learning machine and neural network, and device and method for data analysis
US5504675A (en) * 1994-12-22 1996-04-02 International Business Machines Corporation Method and apparatus for automatic selection and presentation of sales promotion programs
US5774868A (en) * 1994-12-23 1998-06-30 International Business And Machines Corporation Automatic sales promotion selection system and method

Also Published As

Publication number Publication date
AU3884000A (en) 2000-10-04
WO2000055790A3 (en) 2000-12-14
WO2000055790A2 (en) 2000-09-21
WO2000055790B1 (en) 2001-02-22

Similar Documents

Publication Publication Date Title
CA2403249A1 (en) Gradient criterion method for neural networks and application to targeted marketing
Heiat Comparison of artificial neural network and regression models for estimating software development effort
US7080052B2 (en) Method and system for sample data selection to test and train predictive algorithms of customer behavior
Cho Tourism forecasting and its relationship with leading economic indicators
US6640215B1 (en) Integral criterion for model training and method of application to targeted marketing optimization
US8005833B2 (en) User profile classification by web usage analysis
Lin et al. Forecasting from non‐linear models in practice
US20040236734A1 (en) Rating system and method for identifying desirable customers
US20020188507A1 (en) Method and system for predicting customer behavior based on data network geography
US20060155596A1 (en) Revenue forecasting and sales force management using statistical analysis
Cook Savings rates and income distribution: further evidence from LDCs
EP1110159A1 (en) Enhancing utility and diversifying model risk in a portfolio optimization framework
JP2003523578A (en) System and method for determining the validity of an interaction on a network
CA2590438A1 (en) System and method for predictive product requirements analysis
Antel Costly employment contract renegotiation and the labor mobility of young men
AU2001261702A1 (en) Revenue forecasting and sales force management using statistical analysis
KR100751966B1 (en) rating system and method for identifying desirable customers
AU7351300A (en) Method for modeling market response rates
CN113408908A (en) Multi-dimensional credit evaluation model construction method based on performance ability and behaviors
Dańko Analysis of reporting behavior using the hopit R-package (v0. 11.5)
Langlet et al. An Application of the Bootstrap Variance Estimation Method to the Canadian Participation and Activity Limitation Survey,”
CN116204697A (en) Content determination method, content determination device, computer readable storage medium and computer device
Sarkar Estimating diffusion models using repeated cross-sections: Quantifying the digital divide
Hasker et al. An analysis of strategic behavior in ebay auctions
Jain A guideline to statistical approaches in computer performance evaluation studies

Legal Events

Date Code Title Description
FZDE Dead