WO2000055790A2 - Gradient criterion method for neural networks and application to targeted marketing - Google Patents


Info

Publication number
WO2000055790A2
WO2000055790A2 (PCT/US2000/006735)
Authority
WO
Grant status
Application
Patent type
Prior art keywords: ln, method, maximum likelihood, new, neural networks
Prior art date
Application number
PCT/US2000/006735
Other languages
French (fr)
Other versions
WO2000055790A3 (en)
WO2000055790B1 (en)
Inventor
Yuri Galperin
Vladimir Fishman
Original Assignee
Marketswitch Corp.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06Q: DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce, e.g. shopping or e-commerce
    • G06Q30/02: Marketing, e.g. market research and analysis, surveying, promotions, advertising, buyer profiling, customer management or rewards; price estimation or determination
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computer systems based on biological models
    • G06N3/02: Computer systems based on biological models using neural network models
    • G06N3/08: Learning methods

Abstract

The present invention is drawn to a unique application of the Maximum Likelihood statistical method to commercial neural network technologies. The present invention utilizes the specific nature of the output in target marketing problems and makes it possible to produce more accurate and predictive results by minimizing a gradient criterion to produce model weights to get the maximum likelihood result. It is best used on 'noisy' data and when one is interested in determining a distribution's overall accuracy, or best general description of reality.

Description

TITLE OF THE INVENTION: Gradient Criterion Method for Neural Networks and Application to Targeted Marketing

FIELD OF THE INVENTION: This invention relates generally to the development of neural network models to optimize the effects of targeted marketing programs. More specifically, this invention is an improvement on the Maximum Likelihood method of training neural networks using a gradient criterion, and is specially designed for binary output having a strongly uneven class proportion, which is typical of direct marketing problems.

BACKGROUND OF THE INVENTION: The goal of most modeling procedures is to minimize the discrepancy between real results and model outputs. If the discrepancy, or error, can be accumulated on a record-by-record basis, it is suitable for gradient methods such as Maximum Likelihood.

The goal of target marketing modeling is typically to find a method to calculate the probability that any prospect on a list will respond to an offer. The neural network model is built from experimental data (a test mailing), and the traditional approach to this problem is to choose a model and compute model parameters with a model fitting procedure. The topology of the model (for example, the number of nodes, inputs, and transfer functions) defines the formula that expresses the probability of response as a function of attributes. In a special model fitting procedure, the output of the model is tested against the actual output (from the results of the test mailing) and the discrepancy is accumulated in a special error function. Different types of error functions can be used (e.g., mean square, absolute error); model parameters are determined to minimize the error function. A good fit of the model parameters is an implicit indication that the model is good (though not necessarily the best) in terms of its original objective. Thus the model building process is defined by two entities: the type of model and the error (or utility) function. The type of model defines the ability of the model to discern various patterns in the data. For example, increasing the number of nodes results in more complicated formulae, so a model can more accurately discern complicated patterns. The "goodness" of the model is ultimately defined by the choice of the error function, since it is the error function that is minimized during the model training process. To reach the goal of modeling, one wants to use a utility function that assigns probabilities most in compliance with the results of the experiment (the test mailing). The Maximum Likelihood criterion is the explicit measure of this compliance.
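The record-by-record accumulation of an error function can be made concrete with a small sketch (the data and the two error functions shown are illustrative examples only, not the invention's criterion):

```python
# Record-by-record accumulation of two conventional error functions.
# All data here are hypothetical: 1 = responded to the test mailing, 0 = did not.
actual = [1, 0, 0, 1, 0]
predicted = [0.8, 0.1, 0.3, 0.6, 0.2]  # model-assigned response probabilities

# Each record contributes one term to the accumulated error.
mean_square = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
absolute = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

assert round(mean_square, 3) == 0.068
assert round(absolute, 3) == 0.24
```

Model fitting then amounts to choosing the parameters that minimize whichever accumulated error function was selected.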
However, the modeling process as it exists today has a significant drawback: it uses conventional utility functions (least mean square, cross entropy) only because a mathematical apparatus has been developed for these utility functions. What would really be useful is a process that builds a response model by directly maximizing the likelihood. For example, suppose a random variable X exists with the distribution p(X, A), where A is an unknown vector of parameters to be estimated from the independent observations of X: (x1, x2, ..., xN). The goal is to find a vector A that makes the probability of the output, p(x1, A) * p(x2, A) * ... * p(xN, A), as large as possible. Note that the function p(X, A) must be a known function of two variables. The Maximum Likelihood technique provides the mathematical apparatus to solve this optimization problem. In general, the Maximum Likelihood method can be applied to neural networks as follows. Let the neural network calculate a value of the output variable y based on the input vector X. The observed values (y1, y2, ..., yN) represent the actual output with some error e. Assuming that this error has, for example, a normal distribution, the method can find the weights W of the neural network that make the probability of the output, p(y1, W) * p(y2, W) * ... * p(yN, W), as large as possible. In the case of a normal probability function, the Maximum Likelihood criterion is equivalent to the Least Mean Square criterion, which is, in fact, the one most widely used for neural network training. In the case of target marketing, however, the observed output is a binary variable that is equal to 1 if a customer responded to the offer and 0 otherwise; this is the typical direct marketing scenario. Here the normality assumption is too rough, and leads to a sub-optimal set of neural network weights if used in neural network training.
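The Maximum Likelihood idea for binary output can be illustrated with a small sketch (all data here are synthetic and the names hypothetical): for a constant response probability, the likelihood-maximizing estimate is simply the observed response rate.

```python
import numpy as np

# Synthetic binary responses with a low base rate, as in direct marketing.
rng = np.random.default_rng(0)
y = (rng.random(1000) < 0.03).astype(float)  # roughly 3% responders

def bernoulli_nll(p, y):
    """Negative log-likelihood -ln L = -sum[y ln p + (1-y) ln(1-p)]."""
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Scan candidate constant probabilities; the Bernoulli NLL is convex in p,
# so the grid minimizer is the grid point nearest the sample mean.
grid = np.linspace(0.001, 0.2, 1000)
nll = [bernoulli_nll(p, y) for p in grid]
p_ml = grid[int(np.argmin(nll))]

assert abs(p_ml - y.mean()) < 0.001
```

A least-squares fit of a constant also recovers the mean here; the two criteria diverge once the model assigns record-dependent probabilities to rare binary outcomes, which is the case the patent addresses.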

SUMMARY OF THE INVENTION: The present invention represents a unique application of the Maximum Likelihood statistical method to commercial neural network technologies. The present invention utilizes the specific nature of the output in target marketing problems and makes it possible to produce more accurate and predictive results. It is best used on "noisy" data and when one is interested in determining a distribution's overall accuracy, or best general description of reality. The present invention provides a competitive advantage over off-the-shelf modeling packages in that it greatly enhances the application of Maximum Likelihood to quantitative marketing applications such as customer acquisition, cross-selling/up-selling, predictive customer profitability modeling, and channel optimization. Specifically, the superior predictive modeling capability provided by the present invention means that marketing analysts will be better able to:
• Predict the propensity of individual prospects to respond to an offer, thus enabling marketers to better identify target markets.
• Identify customers and prospects who are most likely to default on loans, so that remedial action can be taken, or so that those prospects can be excluded from certain offers.
• Identify customers or prospects who are most likely to prepay loans, so a better estimate can be made of revenues.
• Identify customers who are most amenable to cross-sell and up-sell opportunities.
• Predict claims experience, so that insurers can better establish risk and set premiums appropriately.
• Identify instances of credit-card fraud.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the dataflow of the method of training the model of the present invention.

Figure 2 illustrates a preferred system architecture for employing the present invention.

DETAILED DESCRIPTION OF THE INVENTION: The present invention uses the neural network to calculate a propensity score g(X, W), where W is a set of weights of the neural network and X is a vector of customer attributes (the input vector). The probability that a customer with attributes X responds to an offer can be calculated by the formula:

P = g(X, W) / (1 + g(X, W))
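A minimal sketch of this mapping from score to probability (the function name is hypothetical; the formula P = g/(1 + g) is the one stated above):

```python
def response_probability(g: float) -> float:
    """Map a non-negative propensity score g(X, W) to P = g / (1 + g)."""
    return g / (1.0 + g)

assert response_probability(1.0) == 0.5   # a score of 1 corresponds to even odds
assert response_probability(3.0) == 0.75  # odds of 3:1
```

Note that P increases monotonically in g and stays within [0, 1) for any non-negative score.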

If there are N independent samples and among them n are responders, the probability of such an output is:

L = ∏_{i∈resp} [g_i / (1 + g_i)] × ∏_{i∈non-resp} [(1 - g_i) / (1 + g_i)]
Using the logarithm of L as a training criterion (training error) in the form of:

Err = -ln L = ∑_{i=1..N} ln(1 + g_i) - ∑_{i∈resp} ln(g_i) - ∑_{i∈non-resp} ln(1 - g_i)

the neural network training procedure finds the optimal weights W that minimize Err and thus maximize the likelihood L of the observed output. One can use back propagation or a similar method to perform the training. The gradient criterion required by the training procedure is computed as follows:

Err'_W = ∑_{i=1..N} g_i' / (1 + g_i) - ∑_{i∈resp} g_i' / g_i + ∑_{i∈non-resp} g_i' / (1 - g_i)

In order for the training procedure to be robust and stable, the output of the neural network should be in the middle of the working interval [0, 1]. To ensure that, the present invention introduces the normalized propensity score f, which is related to g as:

g(X, W) = f^{1/τ}(X, W)

Now let f be the output of the neural network and choose the parameter τ in such a way that f is of the order of 0.5. Let R be the average response rate in the sample. The above condition is satisfied if:

τ = 1 / ln(1/R)

While training the model, the criterion is optimized, so the calculation is based on the output of the neural network using the formula:

Err = -ln L = ∑_{i=1..N} ln(1 + f_i^{1/τ}) - (1/τ) ∑_{i∈resp} ln(f_i) - ∑_{i∈non-resp} ln(1 - f_i^{1/τ})

The gradient criterion is computed as follows:

Err'_W = (1/τ) [ ∑_{i=1..N} (f_i^{1/τ-1} / (1 + f_i^{1/τ})) f_i' - ∑_{i∈resp} f_i'/f_i + ∑_{i∈non-resp} (f_i^{1/τ-1} / (1 - f_i^{1/τ})) f_i' ]

The method was tested on a variety of business cases against both the Least Mean Square and Cross-Entropy criteria. In all cases the method gave a 20% to 50% improvement in lift on the top 20% of the target marketing sample customer pools. As shown in figure 1, the method inputs data from modeling database 11 into a selected model 12 to calculate scores 13. The error 14 is calculated by comparison with the known responses from modeling database 11 and is checked for convergence 15 below a desired level. When convergence occurs, the resulting new model 16 is used for targeted marketing 17. Otherwise, the process minimizes the error, solves for a new set of weights at 18, and begins a new iteration.
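The training flow of figure 1 can be sketched end to end in code. This is an illustrative reimplementation under stated assumptions, not the patented implementation: a single logistic node stands in for the neural network, plain gradient descent stands in for back propagation, the data are synthetic, and the normalization parameter is assumed to be τ = 1/ln(1/R).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic modeling database: one informative attribute, low response rate.
N = 2000
X = rng.normal(size=(N, 2))
true_logit = -3.0 + 1.5 * X[:, 0]
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

R = y.mean()                   # average response rate in the sample
tau = 1.0 / np.log(1.0 / R)    # assumed choice of tau so f is of order 0.5

def forward(w, b):
    """Normalized propensity score f in (0, 1); a one-node logistic model
    stands in for the neural network (an assumption for brevity)."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def err_and_grad(w, b):
    f = np.clip(forward(w, b), 1e-9, 1.0 - 1e-9)
    g = f ** (1.0 / tau)  # g = f^(1/tau), so g stays in (0, 1)
    # Err = sum ln(1+g_i) - sum_resp ln(g_i) - sum_nonresp ln(1-g_i)
    err = (np.sum(np.log1p(g))
           - np.sum(np.log(g[y == 1]))
           - np.sum(np.log1p(-g[y == 0])))
    # dErr/df, term by term; responder/non-responder sums handled via masks.
    d_all = (1.0 / tau) * f ** (1.0 / tau - 1.0) / (1.0 + g)
    d_resp = 1.0 / (tau * f)
    d_non = (1.0 / tau) * f ** (1.0 / tau - 1.0) / (1.0 - g)
    dErr_df = d_all - y * d_resp + (1.0 - y) * d_non
    delta = dErr_df * f * (1.0 - f)          # chain rule through the logistic
    return err, X.T @ delta, delta.sum()

# Iterate: score, measure error, solve for new weights (figure 1 loop).
w, b = np.zeros(2), 0.0
err0, _, _ = err_and_grad(w, b)
for _ in range(500):
    _, gw, gb = err_and_grad(w, b)
    w -= 1e-4 * gw
    b -= 1e-4 * gb
err_final, _, _ = err_and_grad(w, b)
assert err_final < err0
```

On this toy data the criterion decreases across iterations, mirroring the convergence check of figure 1; a real deployment would also hold out data to decide when to stop.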

The present invention operates on a computer system and is used for targeted marketing purposes. In a preferred embodiment, as shown in figure 2, the system runs on a three-tier architecture that supports CORBA as an intercommunications protocol. The desktop client software on targeted marketing workstations 20 supports Java. The central application server 22 and multithreaded calculation engines 24, 25 run on Windows NT or UNIX. Modeling database 26 is used for training new models to be applied for targeted marketing related to customer database 28. The recommended minimum system requirements for application server 22 and multithreaded calculation engines 24, 25 are as follows:

[Table of recommended minimum server and calculation-engine requirements not reproduced in this text.]

*Approximately 100 MB per 1 million records in the customer database. The above assumes the user client is installed on a PC with the recommended configuration found below.

[Table of recommended user-client PC configuration not reproduced in this text.]

The recommended minimum requirements for the targeted marketing workstations 20 are as follows:

[Table of recommended minimum workstation requirements not reproduced in this text.]
Used in conjunction with a neural network, the present invention provides a user with data indicating the individuals, or classes of individuals, who are most likely to respond to direct marketing.

Claims

We Claim:
1. A method of training neural networks using a maximum likelihood utility function, comprising: selecting an initial model function for a propensity score g(X, W), where W is a set of weights of the neural network and X is a vector of customer attributes from a modeling database; calculating propensity scores for the customers in the modeling database; calculating a training error Err, where

Err = -ln L = ∑_{i=1..N} ln(1 + g_i) - ∑_{i∈resp} ln(g_i) - ∑_{i∈non-resp} ln(1 - g_i);

measuring the error to check for convergence below a desired value; obtaining a new model and applying it to new data when convergence occurs; minimizing the error to solve for new weights W by minimizing the gradient criterion defined by the formula:

Err'_W = ∑_{i=1..N} g_i'/(1 + g_i) - ∑_{i∈resp} g_i'/g_i + ∑_{i∈non-resp} g_i'/(1 - g_i); and

beginning a new iteration of the process by calculating new propensity scores for the customers in the modeling database.
2. The method of training neural networks using a maximum likelihood utility function of claim 1, wherein the new model is applied to new customer data for targeted marketing.
3. The method of training neural networks using a maximum likelihood utility function of claim 1, wherein f is a normalized propensity score related to g(X, W) by the formula g(X, W) = f^{1/τ}(X, W), and f is the output of the neural network; the parameter τ is chosen in such a way that f is of the order of 0.5; wherein R is an average response rate in the sample and the above condition is satisfied if:

τ = 1 / ln(1/R);

and wherein:

Err = -ln L = ∑_{i=1..N} ln(1 + f_i^{1/τ}) - (1/τ) ∑_{i∈resp} ln(f_i) - ∑_{i∈non-resp} ln(1 - f_i^{1/τ});

and the gradient criterion is computed as follows:

Err'_W = (1/τ) [ ∑_{i=1..N} (f_i^{1/τ-1} / (1 + f_i^{1/τ})) f_i' - ∑_{i∈resp} f_i'/f_i + ∑_{i∈non-resp} (f_i^{1/τ-1} / (1 - f_i^{1/τ})) f_i' ].
4. The method of training neural networks using a maximum likelihood utility function of claim 1, wherein the new model is applied to the top 20% of a targeted marketing sample customer pool.
5. The method of training neural networks using a maximum likelihood utility function of claim 3, wherein the new model is applied to the top 20% of a targeted marketing sample customer pool.
6. The method of training neural networks using a maximum likelihood utility function of claim 1, wherein the method is performed on a computer network.
PCT/US2000/006735 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing WO2000055790B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12421799 1999-03-15 1999-03-15
US60/124,217 1999-03-15

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA 2403249 CA2403249A1 (en) 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing

Publications (3)

Publication Number Publication Date
WO2000055790A2 (en) 2000-09-21
WO2000055790A3 (en) 2000-12-14
WO2000055790B1 (en) 2001-02-22

Family

Family ID: 22413524

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/006735 WO2000055790B1 (en) 1999-03-15 2000-03-15 Gradient criterion method for neural networks and application to targeted marketing

Country Status (2)

Country Link
CA (1) CA2403249A1 (en)
WO (1) WO2000055790B1 (en)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8036979B1 (en) 2006-10-05 2011-10-11 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
WO2008147918A3 (en) 2007-05-25 2009-01-22 Experian Information Solutions System and method for automated detection of never-pay data sets
WO2010132492A3 (en) 2009-05-11 2014-03-20 Experian Marketing Solutions, Inc. Systems and methods for providing anonymized user profile data
US9152727B1 (en) 2010-08-23 2015-10-06 Experian Marketing Solutions, Inc. Systems and methods for processing consumer information for targeted marketing applications


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0554083A2 (en) * 1992-01-30 1993-08-04 Ricoh Company, Ltd Neural network learning system
US5504675A (en) * 1994-12-22 1996-04-02 International Business Machines Corporation Method and apparatus for automatic selection and presentation of sales promotion programs
US5774868A (en) * 1994-12-23 1998-06-30 International Business And Machines Corporation Automatic sales promotion selection system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SURENDRAN A C ET AL: "Unsupervised, smooth training of feed-forward neural networks for mismatch compensation" 1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING PROCEEDINGS (CAT. NO.97TH8241), 1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING PROCEEDINGS, SANTA BARBARA, CA, USA, 14-17 DEC. 1997, pages 482-489, XP002148180 1997, New York, NY, USA, IEEE, USA ISBN: 0-7803-3698-4 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6993493B1 (en) * 1999-08-06 2006-01-31 Marketswitch Corporation Method for optimizing net present value of a cross-selling marketing campaign
US7499868B2 (en) 1999-08-06 2009-03-03 Marketswitch Corporation Method for optimizing net present value of a cross-selling marketing campaign
US8015045B2 (en) 1999-08-06 2011-09-06 Experian Information Solutions, Inc. Method for optimizing net present value of a cross-selling marketing campaign
US8285577B1 (en) 1999-08-06 2012-10-09 Experian Information Solutions, Inc. Method for optimizing net present value of a cross-selling marketing campaign
WO2002037302A1 (en) * 2000-10-31 2002-05-10 Granlund Goesta Training of associative networks
US8732004B1 (en) 2004-09-22 2014-05-20 Experian Information Solutions, Inc. Automated analysis of data to generate prospect notifications based on trigger events
US8027871B2 (en) 2006-11-03 2011-09-27 Experian Marketing Solutions, Inc. Systems and methods for scoring sales leads
US9916596B1 (en) 2007-01-31 2018-03-13 Experian Information Solutions, Inc. Systems and methods for providing a direct marketing campaign planning environment
US10078868B1 (en) 2007-01-31 2018-09-18 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US9058340B1 (en) 2007-11-19 2015-06-16 Experian Marketing Solutions, Inc. Service for associating network users with profiles
US10019508B1 (en) 2014-05-07 2018-07-10 Consumerinfo.Com, Inc. Keeping up with the joneses
US9767309B1 (en) 2015-11-23 2017-09-19 Experian Information Solutions, Inc. Access control system for implementing access restrictions of regulated database records while identifying and providing indicators of regulated database records matching validation criteria

Also Published As

Publication number Publication date Type
WO2000055790A3 (en) 2000-12-14 application
CA2403249A1 (en) 2000-09-21 application
WO2000055790B1 (en) 2001-02-22 application

Similar Documents

Publication Publication Date Title
Reisinger et al. Structural equation modeling: Critical issues and new developments
Teachman Methodological issues in the analysis of family formation and dissolution
Ward et al. A diffusion approximation for a Markovian queue with reneging
Deshpande et al. Selective Markov models for predicting Web page accesses
Brown et al. Family boundary ambiguity and the measurement of family structure: The significance of cohabitation
US6321179B1 (en) System and method for using noisy collaborative filtering to rank and present items
Daraio et al. Introducing environmental variables in nonparametric frontier models: a probabilistic approach
US6895405B1 (en) Computer-assisted systems and methods for determining effectiveness of survey question
Kenny et al. A general procedure for the estimation of interdependence.
US7143075B1 (en) Automated web-based targeted advertising with quotas
Hsieh et al. Nonparametric and semiparametric estimation of the receiver operating characteristic curve
US7437308B2 (en) Methods for estimating the seasonality of groups of similar items of commerce data sets based on historical sales date values and associated error information
Reich et al. Evaluating machine learning models for engineering problems
US20030212619A1 (en) Targeting customers
Lam et al. The effects of the dimensions of technology readiness on technology acceptance: An empirical analysis
Chan et al. Bayesian poisson regression for crowd counting
Yamaguchi Tasks and heterogeneous human capital
Chen et al. A fuzzy credit-rating approach for commercial loans: a Taiwan case
US6836773B2 (en) Enterprise web mining system and method
US20050278139A1 (en) Automatic match tuning
Tuerlinckx et al. Statistical inference in generalized linear mixed models: A review
US20070239535A1 (en) Behavioral targeting system that generates user profiles for target objectives
US20080208652A1 (en) Method and system utilizing online analytical processing (olap) for making predictions about business locations
Del Jesus et al. Evolutionary fuzzy rule induction process for subgroup discovery: a case study in marketing
Goodspeed A re-examination of the use of ability to pay taxes by local governments

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: B1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: B1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

B Later publication of amended claims
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE Wipo information: entry into national phase

Ref document number: 2403249

Country of ref document: CA

122 Ep: pct application non-entry in european phase