WO2001046896A1 - Automatic marketing process - Google Patents
- Publication number
- WO2001046896A1 (PCT/US1999/030793)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- consumers
- model
- consumer
- segments
- selecting
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
Definitions
- The present invention relates to computer-based systems for analyzing marketing data. More specifically, the present invention relates to a method and an apparatus for automatically targeting particular groups of consumers for a marketing campaign, based upon response data from consumers and/or historical consumer behavioral data.
- Marketing campaigns can potentially reach millions of consumers offering goods and services as diverse as credit cards and magazine subscriptions through communication channels such as mass mailings and phone bank solicitations.
- Marketing campaigns are typically directed to particular segments of the population defined by demographic, geographic or behavioral characteristics that have exhibited a propensity to respond to particular marketing messages. In this way, marketing resources can be directed toward consumers who are likely to respond favorably to solicitations.
- What is needed is a system that automatically finds predictive relationships between consumer attributes and consumer responses, and that uses these relationships to automatically produce a marketing campaign directed to consumers who are likely to respond.
- One embodiment of the present invention provides a method and apparatus that automatically finds predictive relationships in marketing data between consumer attributes and consumer responses, and that uses these relationships to produce a marketing campaign directed to consumers who are likely to respond.
- The system operates by constructing a model for consumer responses by dividing a database containing consumer records into segments containing one or more consumer records based upon attributes in the records. This segmentation is performed with a goal of optimizing the predictive power of the model by identifying the "best" splits with which to build an induction tree. Splitting can be based on more than one attribute, including a treatment (type of solicitation) attribute.
- The system accesses a second database containing records for prospective consumers.
- The system uses the consumer response model to select a group of target consumers from the second database based upon their propensity to respond, or their propensity to respond to particular treatments.
- The system next assigns particular treatments to particular target consumers so as to substantially maximize the response rate among the group of target consumers.
- A subsequent marketing campaign applies the specified treatments to target consumers.
- The results of this marketing campaign can be used to adjust the response model for subsequent marketing campaigns.
- FIG. 1 illustrates a system for creating a marketing campaign in accordance with an embodiment of the present invention.
- FIG. 2 is a flow chart illustrating the process of generating a segmentation of consumer records in accordance with an embodiment of the present invention.
- FIG. 3 is a flow chart illustrating the process of generating a scored prospect list of prospective consumers for the marketing campaign in accordance with an embodiment of the present invention.
- FIG. 4 is a flow chart illustrating the process of generating a target list in accordance with an embodiment of the present invention.
- FIG. 5 illustrates a tree structure that is used to generate a segmentation in accordance with an embodiment of the present invention.
- FIG. 6 illustrates a cell matrix relating treatments to segments in accordance with an embodiment of the present invention.
- FIG. 7 is a flow chart illustrating the process of using response data from a marketing campaign to update a response model in accordance with an embodiment of the present invention.
- FIG. 8 illustrates how segmentation is performed with multiple treatments in accordance with an embodiment of the present invention.
- The carrier wave may originate from a communications network, such as the internet.
- One embodiment of the present invention uses a process known as "predictive modeling" in the creation of a marketing campaign.
- A model is created from a database containing consumer demographic and behavior data, as well as response information. This model is used to predict which of a group of prospective consumers are likely to respond to a marketing campaign, and therefore which consumers to contact.
- FIG. 1 illustrates an embodiment of the present invention as described above in a system for creating a marketing campaign.
- The functional modules shown in FIG. 1 are model-building unit 102, scoring unit 104, target list generator 106, contact initiator 108, response collector 110 and model-refining unit 112.
- Model-building unit 102 receives inputs from consumer database 120 and schema 114, and uses these inputs to generate model 103.
- Scoring unit 104 receives inputs from model 103, prospect database 122 and schema 115, and uses these inputs to generate scored prospect list 105.
- Target list generator 106 receives inputs from scored prospect list 105 and schema 116, and uses these inputs to generate target list 124.
- Target list 124 is input into contact initiator 108, which is responsible for contacting the selected prospective customers.
- Response collector 110 gathers responses 126 to contacts made by contact initiator 108. Responses 126, along with schema 118, are input into model-refining unit 112, which refines model 103.
- Consumer database 120 includes records containing information pertaining to consumers.
- Record 130 includes information for a particular consumer, including predictive variable 132.
- Predictive variable 132 is a boolean variable indicating whether or not there was a response.
- Predictive variable 132 can be a dependent predictive variable that the system attempts to predict using model 103.
- When the value of predictive variable 132 for all records is known, consumer database 120 is referred to as a "training database".
- Record 130 can also include a treatment field 134 indicating which marketing treatment the particular consumer received, where "marketing treatment" refers to any of a variety of types of solicitation.
- A marketing treatment can be related to a number of factors, such as message, product, price and type of channel.
- Record 130 additionally includes descriptive attributes 136, which contain information about the consumer. This information may include demographic data, such as the age and income of the consumer; locational data; and behavioral data, indicating, for example, the types of magazines the consumer subscribes to or categories of products they buy from catalogs. Note that locational data might refer to geographic locations as well as addresses on computer networks such as the internet (for example, domain names).
- Schema 114 contains metadata that describes the structure of consumer database 120. Schema 114 may additionally define transformations of attributes 136. For example, attributes corresponding to a "checking account balance" and a "savings account balance" can be added to form a new attribute "total account balance," if the "total account balance" has an important business meaning. Schema 114 also includes specific information on how to operate model-building unit 102 so as to produce model 103.
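The "total account balance" transformation described above can be sketched as a derived attribute. This is only a minimal illustration, assuming consumer records are plain dictionaries; the field names and the `add_total_balance` helper are hypothetical and do not come from the source.

```python
def add_total_balance(record):
    """Return a copy of the record with a derived total-balance attribute.

    The field names are hypothetical; a real schema would name them.
    """
    derived = dict(record)
    derived["total_account_balance"] = (
        record["checking_account_balance"] + record["savings_account_balance"]
    )
    return derived


example = add_total_balance(
    {"checking_account_balance": 1200.0, "savings_account_balance": 8300.0}
)
```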
- Model 103 may assume a number of different forms, including baseline, response, and refined.
- A "baseline model" is an initial model that is constructed without using response data from prior campaigns.
- For a baseline model, records 130 are proxy records. For example, people who already have bought the product may be used as a proxy for people who would respond to a marketing campaign.
- A "response model" is a model constructed using responses gathered from consumers during a first marketing campaign.
- A response model can incorporate response data for a number of different treatments.
- A "refined model" is a model created after at least one marketing campaign. In a refined model, parameters for the previous model (which is either a response model or a refined model) are adjusted according to the responses to the prior campaign.
- Model-building unit 102 generates model 103 by classifying records in consumer database 120.
- Various classification schemes can be used, such as, but not limited to, schemes based on trees, tables, or other data structures.
- Model-building unit 102 uses a segmentation process to generate model 103.
- Records are classified according to a binary tree structure so as to separate database 120 into a number of non-overlapping segments, as illustrated in FIG. 5.
- Each node in the tree corresponds to a non-overlapping segment.
- A condition, such as "age > 40," is associated with each non-leaf node (for example, root node 502). If the condition is true for a given record, then the given record is assigned to the respective child node. If it is false, the record is assigned to the other child node.
- In FIG. 5, records for consumers with "age > 40" are in the right subtree (node 506), and other records are in the left subtree (node 504).
- Within node 504, a further split is made on the basis of the attribute "account balance > $16K." Records for consumers with "account balance > $16K" are in the right subtree of node 504 (node 510), and other records are in the left subtree of node 504 (node 508). Within node 506, the next split is made on the basis of the attribute "income > $60K." Records for consumers with "income > $60K" are in the right subtree of node 506 (node 514), and other records are in the left subtree of node 506 (node 512).
- Each segment can be defined by a series of boolean operations based on one or more of the descriptive attributes.
- Leaf nodes 508, 510, 512 and 514 are associated with terminal segments.
- Leaf node 514 is associated with a segment of consumers having "age > 40" and "income > $60K"; this segment has a response rate of 10%.
- The system can keep track of response rates for each of the treatments.
- For example, the response rate of 4% for node 508 might be the result of a response rate of 2% for a first treatment, applied to half of the members of the segment, and a response rate of 6% for a second treatment, applied to the other half of the members of the segment.
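The arithmetic behind the node 508 example can be checked with a short sketch. The function name and the dictionary layout are assumptions made for illustration; only the 2%, 6% and half-and-half figures come from the text.

```python
def blended_response_rate(treatment_rates, treatment_shares):
    """Combine per-treatment response rates into one segment-level rate.

    treatment_shares gives the fraction of the segment that received each
    treatment; the shares are expected to sum to 1.
    """
    return sum(
        treatment_rates[name] * treatment_shares[name] for name in treatment_rates
    )


# Figures from the node 508 example: 2% and 6%, each applied to half the segment.
node_508_rate = blended_response_rate(
    {"first": 0.02, "second": 0.06},
    {"first": 0.5, "second": 0.5},
)
```

Under these assumptions the blended rate works out to 0.02 x 0.5 + 0.06 x 0.5 = 0.04, matching the 4% quoted for node 508.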
- FIG. 2 is a flow chart illustrating a segmentation process as referred to above.
- Model-building unit 102 reads data from schema 114 and uses it to read data from consumer database 120 into model-building unit 102.
- Model-building unit 102 determines the type of induction tree (e.g., classification or regression) to use based on the data type (e.g., boolean, categorical, or continuous) of dependent variable 132. Alternatively, the type of induction tree to use can be set manually.
- Model-building unit 102 recursively segments the records in consumer database 120 to produce model 103.
- An embodiment in which segmentation is performed according to a binary induction tree, as illustrated in FIG. 5, will now be described.
- In step 206, starting with the entire database 120, the best split is found across all attributes 136.
- For a boolean attribute (i.e., an attribute having two possible states), only one split is possible.
- For an ordered (e.g., continuous) attribute with n distinct values, there are n-1 possible splits of the form "greater than." If the number of distinct values is too large, as determined using information from schema 114, then the values can be distributed into a smaller number of bins according to information from schema 114.
- For a categorical attribute (i.e., an attribute having an arbitrary number of states) with n distinct values, there are 2^(n-1) - 1 possible splits, one for each possible non-NULL subset of the distinct values.
- Segmentation can be performed without considering splitting based on this attribute.
- An order can be imposed based upon the values of this categorical attribute in order to limit the number of possible splits to "n-1."
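The split counts quoted above can be verified by enumeration. The following is a sketch under the assumption that a categorical split is an unordered two-way partition of the distinct values with both sides non-empty; the function names are illustrative and not from the source.

```python
from itertools import combinations


def count_ordered_splits(n):
    """Number of "greater than" splits for n distinct ordered values."""
    return max(n - 1, 0)


def count_categorical_splits(n):
    """Number of two-way partitions of n distinct categorical values."""
    return 2 ** (n - 1) - 1 if n >= 1 else 0


def enumerate_categorical_splits(values):
    """List each two-way partition once by pinning the first value to the left."""
    distinct = sorted(set(values))
    if not distinct:
        return []
    first, rest = distinct[0], distinct[1:]
    splits = []
    # The left side takes `first` plus 0..len(rest)-1 other values, so the
    # right side is never empty.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {first, *extra}
            splits.append((left, set(distinct) - left))
    return splits


splits_for_four = enumerate_categorical_splits(["mail", "phone", "email", "web"])
```

Pinning one value to the left side is what halves the count from 2^n - 2 ordered subsets to 2^(n-1) - 1 distinct partitions, which is why the number grows quickly and why imposing an order is attractive for attributes with many states.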
- the metric for determining the best split can be defined from information from schema 114. Any of a variety of metrics can be chosen. Metrics that can be used include Entropy, Gini Index, and Gini-Hat Index. The metric chosen for segmentation has a subtle effect on the types of splits that are chosen by model-building unit 102. The Entropy and Gini Index metrics produce similar trees. The Gini Index is mathematically equivalent to the sum of squared errors metric. The Gini-Hat Index (GHI) is preferred for databases that have many strong relationships in the data. Strong relationships tend to produce trees with many "pure" nodes (100% Yes or 100% No), which are undesirable because they leave very little room for marketing opportunities. A variance metric can be used for regression trees.
- The best split can be defined as the split with the largest gain, where the gain is the difference between the value of the metric for the attribute at the parent node and the sum of the values of the metric for the attribute at the two child nodes.
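The gain definition above can be illustrated for a boolean response with the Gini Index and Entropy metrics. This is a hedged sketch, not the patented computation: it assumes each child's metric is weighted by the child's share of the parent's records (a common convention that the text does not spell out), and it does not attempt the Gini-Hat Index, whose formula is not given here.

```python
import math


def gini(responders, total):
    """Gini impurity for a two-class node: 1 - p^2 - (1-p)^2 = 2p(1-p)."""
    if total == 0:
        return 0.0
    p = responders / total
    return 2.0 * p * (1.0 - p)


def entropy(responders, total):
    """Shannon entropy (in bits) of a two-class node."""
    if total == 0:
        return 0.0
    p = responders / total
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)


def split_gain(metric, parent, left, right):
    """Parent metric minus the size-weighted metrics of the two children.

    Each node is a (responders, total) pair. Weighting children by their
    share of records is an assumption of this sketch.
    """
    parent_total = parent[1]
    children = sum(
        metric(resp, tot) * tot / parent_total for resp, tot in (left, right)
    )
    return metric(*parent) - children


# A split that separates responders well versus one that does not.
good_gain = split_gain(gini, (50, 100), (45, 50), (5, 50))
useless_gain = split_gain(gini, (50, 100), (25, 50), (25, 50))
```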
- Database 120 is segmented based on the best split. This segmentation process continues recursively, by repeating step 206 on child nodes, until terminated according to information from schema 114. Any of a variety of termination criteria can be used separately or in combination.
- A parameter can specify the smallest size node that can be split. When the segmentation algorithm encounters a node that is smaller than this size, it ceases splitting on that branch of the tree.
- Another parameter ("MinSize") can specify the smallest allowable size for a node, to prevent creation of a node smaller than this size.
- The values of the parameters can be set during segmentation in order to prevent overfitting of the tree to the data. Alternatively, these parameters can be set manually.
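The recursive segmentation with the two size-based termination criteria can be sketched as follows. This is a simplified illustration, assuming numeric attributes, "greater than" splits only, a boolean `responded` field, and a size-weighted Gini gain; the names `min_split_size` and `min_size` stand in for the two parameters described above and, like the dictionary-based tree layout, are assumptions of this sketch.

```python
def gini(records):
    """Gini impurity of a list of records with a boolean 'responded' field."""
    if not records:
        return 0.0
    p = sum(r["responded"] for r in records) / len(records)
    return 2.0 * p * (1.0 - p)


def best_split(records, attributes, min_size):
    """Return (gain, attribute, threshold) for the best allowed split, or None."""
    best = None
    for attr in attributes:
        # n distinct values give n-1 candidate "greater than" thresholds.
        for threshold in sorted({r[attr] for r in records})[:-1]:
            left = [r for r in records if r[attr] <= threshold]
            right = [r for r in records if r[attr] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue  # would create a node smaller than the allowed size
            children = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
            gain = gini(records) - children
            if best is None or gain > best[0]:
                best = (gain, attr, threshold)
    return best


def segment(records, attributes, min_split_size, min_size):
    """Recursively split records until a termination criterion applies."""
    leaf = {"leaf": True, "size": len(records)}
    if len(records) < min_split_size:
        return leaf  # node too small to be split
    split = best_split(records, attributes, min_size)
    if split is None or split[0] <= 0:
        return leaf  # no allowed split improves the metric
    _, attr, threshold = split
    left = [r for r in records if r[attr] <= threshold]
    right = [r for r in records if r[attr] > threshold]
    return {
        "leaf": False,
        "attribute": attr,
        "threshold": threshold,
        "left": segment(left, attributes, min_split_size, min_size),
        "right": segment(right, attributes, min_split_size, min_size),
    }


# Toy data: only consumers older than 40 responded.
toy_records = [{"age": a, "responded": a > 40} for a in (25, 30, 35, 40, 45, 50, 55, 60)]
toy_tree = segment(toy_records, ["age"], min_split_size=4, min_size=2)
```

Pruning (step 212) is deliberately left out; the sketch only shows how the two size parameters stop the recursion.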
- Model-building unit 102 can also be used to generate distinguishing and/or descriptive characteristics containing information regarding attributes, including attributes that are not used to define the segment. For instance, the tree structure might not include a split based on "gender". However, 80 percent of the consumers in a particular segment might be "male". This statistic is computed as part of the distinguishing characteristics. Thus, distinguishing characteristics show important differences between a segment and the total population. Model-building unit 102 can also generate a set of descriptive statistics of attributes of interest by segment or across more than one segment irrespective of the distinct attribute values of a particular segment. In step 210, model-building unit 102 can be used to determine the relative importance of attributes used in generating the segmentation.
- In step 212, the segmentation tree can be pruned to prevent overfitting. This pruning process involves examining the statistical significance of segmentation operations, accounting for the bias of looking at multiple attributes if applicable, and then pruning back the tree until all splits in the tree are statistically significant.
- Step 212 uses a technique known in the art as Bias Adjusted Significance Pruning (BASP), which provides a quantitative method for determining when a particular segment should not be segmented further.
- BASP creates conservative and robust models without the computational expense of exhaustive cross validation testing.
- BASP is superior to methods in the art such as cross validation that are not easily automated and require ad hoc heuristics.
- BASP also has the advantage of allowing the model-building process to correctly identify the model in one pass. It therefore does not rely on a test set. This means that the training set does not have to be split into train and test. This is important when the number of records in the training set is small.
- After step 212, the segmentation process is complete, and model 103 is available for use in creating a marketing campaign.
- Scoring unit 104 takes inputs from model 103, prospect database 122, and schema 115, and uses these inputs to generate scored prospect list 105.
- Prospect database 122 includes information on prospective consumers.
- The prospective consumers in prospect database 122 can be the same as the consumers in consumer database 120, as in the case of a cross-sell application, or they can be different, as in the case of a new customer acquisition application.
- At least some of the attributes used in constructing model 103 must be available in prospect database 122, so that a segment and/or a score can be assigned to each prospect according to the process described below.
- Schema 115 contains metadata that describes the structure of prospect database 122, and can also include information on treatments to be applied to consumers in prospect database 122.
- FIG. 3 is a flow chart illustrating the process of generating scored prospect list 105, a scored list of prospective consumers for the marketing campaign, in accordance with an embodiment of the present invention.
- Scoring unit 104 reads records from prospect database 122.
- A score and/or segment is computed for each record by applying the attribute values to the segmentation from model 103. For instance, if predictive variable 132 is a boolean variable indicating a "propensity to buy", then high scoring segments contain records for consumers who are more likely to buy.
- The result is output to scored prospect list 105.
- Scored prospect list 105 contains records from prospect database 122 with a segment and/or score assigned to each record. In one embodiment, scored prospect list 105 is divided into segments based on attribute values in the records.
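Scoring can be pictured as routing each prospect record through the FIG. 5 tree and attaching the segment's response rate. The sketch below hard-codes the three conditions and node numbers from the FIG. 5 description. The 10% and 4% rates for nodes 514 and 508 come from the text, while the rates for nodes 510 and 512, the field names, and the function names are placeholders invented for illustration.

```python
# Response rates per terminal segment. Nodes 514 and 508 use the rates given
# in the description; nodes 510 and 512 are illustrative placeholders only.
SEGMENT_SCORES = {508: 0.04, 510: 0.07, 512: 0.05, 514: 0.10}


def assign_segment(prospect):
    """Route a prospect through the FIG. 5 tree and return its leaf node number."""
    if prospect["age"] > 40:                                   # root node 502
        return 514 if prospect["income"] > 60_000 else 512     # node 506
    return 510 if prospect["account_balance"] > 16_000 else 508  # node 504


def score_prospects(prospects):
    """Return copies of the prospect records with segment and score attached."""
    scored = []
    for prospect in prospects:
        segment = assign_segment(prospect)
        scored.append({**prospect, "segment": segment, "score": SEGMENT_SCORES[segment]})
    return scored


scored_list = score_prospects([
    {"id": 1, "age": 52, "income": 75_000, "account_balance": 3_000},
    {"id": 2, "age": 30, "income": 40_000, "account_balance": 2_000},
])
```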
- The screening process can be performed by scoring unit 104 or by target list generator 106.
- The screening process does not necessarily have to exclude all members of a particular group; it may, for example, exclude only a portion of a group, and allow through the rest of the group.
- Scored prospect list 105, along with schema 116, is used by target list generator 106 to generate target list 124.
- Schema 116 contains campaign parameters, such as the desired size of the campaign and the percentage of the campaign that should be used for test and control, and can also include information on treatments to be applied to prospects.
- FIG. 4 is a flow chart illustrating the process of generating target list 124 in accordance with an embodiment of the present invention. This process is performed by target list generator 106 from FIG. 1.
- Target list generator 106 takes in model 103, scored prospect list 105, and schema 116. From these inputs, it generates target list 124. More specifically, in step 402, target list generator 106 reads in scored prospect list 105. In step 404, target list generator 106 reads in campaign parameters from schema 116, including information on how many consumers to include in the target list, and what treatments to use for the contacts.
- Target list generator 106 selects prospective consumers to include in target list 124.
- Target list generator 106 selects prospective consumers by first selecting records corresponding to the highest scoring segment in model 103. Target list generator 106 then selects records corresponding to the next highest scoring segment, and so on, until the number of selected consumers reaches the size of the campaign.
- Target list generator 106 outputs target list 124.
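The selection rule just described (highest scoring segment first, then the next, until the campaign size is reached) can be sketched in a few lines. The record layout with `segment` and `score` fields and the `select_targets` name are assumptions made for this example.

```python
def select_targets(scored_prospects, campaign_size):
    """Pick prospects segment by segment, highest scoring segment first."""
    by_segment = {}
    for prospect in scored_prospects:
        by_segment.setdefault(prospect["segment"], []).append(prospect)

    # Every record in a segment carries the same score, so the first record
    # is enough to rank the segment.
    ranked = sorted(by_segment, key=lambda seg: by_segment[seg][0]["score"], reverse=True)

    targets = []
    for segment in ranked:
        for prospect in by_segment[segment]:
            if len(targets) == campaign_size:
                return targets
            targets.append(prospect)
    return targets


example_prospects = [
    {"id": 1, "segment": 508, "score": 0.04},
    {"id": 2, "segment": 514, "score": 0.10},
    {"id": 3, "segment": 514, "score": 0.10},
    {"id": 4, "segment": 508, "score": 0.04},
]
example_targets = select_targets(example_prospects, campaign_size=3)
```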
- Target list 124 contains a subset of the records in scored prospect list 105, including at least a unique identifier with which to identify each consumer. Thus, target list 124 specifies which consumers to contact.
- Target list 124 can also contain an identifier for the treatment to use for each record, thereby specifying which treatment to use for each consumer. If information on multiple treatments is available, in one embodiment target list generator 106 assigns the highest scoring treatment for each segment to consumers in that segment. In another embodiment, treatments are assigned across all segments in proportion to their score, such that higher scoring treatments are assigned with greater frequency than lower scoring treatments, but higher scoring treatments are distributed across segments. In another embodiment, or if no treatment information is available, treatments can be assigned to consumers so as to evenly distribute the multiple treatments across segments.
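Two of the assignment strategies above can be sketched as follows: picking the highest scoring treatment per segment, and drawing a treatment with probability proportional to its score. The per-segment score table is an assumed data layout, the rates for segment 514 are invented for illustration (the 2% and 6% for segment 508 echo the earlier example), and random sampling is only one possible way to realize "in proportion to their score".

```python
import random


def best_treatment_per_segment(treatment_scores):
    """Map each segment to its highest scoring treatment."""
    return {
        segment: max(scores, key=scores.get)
        for segment, scores in treatment_scores.items()
    }


def proportional_treatment(scores, rng):
    """Draw one treatment with probability proportional to its score."""
    treatments = list(scores)
    weights = [scores[t] for t in treatments]
    return rng.choices(treatments, weights=weights, k=1)[0]


# Segment 514's rates are hypothetical placeholders.
example_scores = {
    508: {"A": 0.02, "B": 0.06},
    514: {"A": 0.12, "B": 0.08},
}
best = best_treatment_per_segment(example_scores)
drawn = proportional_treatment(example_scores[508], random.Random(0))
```

The proportional variant keeps lower scoring treatments in circulation, which fits the monitoring idea described in the text that follows.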
- Target list generator 106 may select some consumers from segments with low scores for inclusion in target list 124. Target list generator 106 may additionally assign sub-optimal treatments to selected consumers. This allows the system to monitor for possible improving behavior of low scoring segments and treatments, for example, if there are multiple phases in a campaign, by comparing the new score for a segment or treatment to the old score for that segment or treatment or to the score for other segments or treatments.
- Target list 124 is used by contact initiator 108 for initiating contact with the selected prospective customers using the assigned treatments.
- Contact initiator 108 can generate a mailing list for the targeted consumers.
- Contact initiator 108 can generate a phone list for calling the targeted consumers.
- The present invention can be applied to any type of consumer contact, including mail, telephone, email, and even door-to-door solicitations.
- The system is used for targeting on websites.
- A model is created for each of a set of promotions based on a predictive variable that acts as a surrogate for the offer.
- The training database consists of a set of consumers who previously visited a website for whom demographic and other data exists, as well as the surrogate predictive variable.
- Each consumer for whom there exists a database record is then scored for each model.
- The promotion that has the highest score for that consumer is presented on the web page.
- Other promotion selection strategies are used when the consumer visits the site multiple times, such as presenting multiple copies of the promotions in random order, with the number of copies of each promotion being proportional to the score, whereby the highest scoring promotions have the most copies and the lowest scoring promotions have the fewest copies.
- Response collector 110 gathers responses 126 to contacts made by contact initiator 108. This response information typically includes information on whether or not a targeted consumer responded to a treatment, and can also include treatment information.
- Responses 126, along with schema 118, are used by model-refining unit 112 to refine or update model 103 using the information contained in responses 126.
- Schema 118 describes the response data 126, including any transformations to be applied to this data.
- Model-refining unit 112 re-computes the score (response rate) for each of the segments in order to track changing behavior over time. If the prospects in a previously high-scoring segment have a lower response rate, the segment's score is reduced. If prospects in a previously low-scoring segment have a higher response rate, the segment's score is increased. Similarly, responses to different treatments can be incorporated into refined model 103. In one embodiment, model-refining unit 112 does not change the preexisting segmentation in model 103. Hence, a prospective customer who is in a certain segment in model 103 will be in the same segment in refined model 103. Then, if the same prospect database 122 is used to generate a subsequent target list 124, scored prospect list 105 does not have to be regenerated because the segmentation of model 103 does not change during the refining process.
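Re-scoring with a fixed segmentation can be sketched as recomputing each segment's response rate from the new responses. This is a deliberately simple illustration: it replaces the old score outright for every segment that was contacted and keeps the old score otherwise, which is an assumption of the sketch rather than the update rule of the source.

```python
def refine_scores(old_scores, responses):
    """Recompute per-segment response rates without changing the segmentation.

    responses is an iterable of (segment, responded) pairs from the latest
    campaign. Segments with no new contacts keep their previous score.
    """
    counts = {}
    for segment, responded in responses:
        contacted, hits = counts.get(segment, (0, 0))
        counts[segment] = (contacted + 1, hits + int(responded))

    refined = dict(old_scores)
    for segment, (contacted, hits) in counts.items():
        refined[segment] = hits / contacted
    return refined


refined_example = refine_scores(
    {508: 0.04, 514: 0.10},
    [(514, True), (514, False), (514, False), (514, False)],
)
```

Because only the scores change, a record's segment membership is unaffected, which is what lets a previously scored prospect list be reused.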
- FIG. 7 is a flow chart illustrating the process of using responses 126 to refine or update response model 103 in accordance with an embodiment of the present invention. This process is performed by model-refining unit 112 from FIG. 1.
- Model-refining unit 112 reads responses 126.
- Model-refining unit 112 also reads model 103 and schema 118.
- Model-refining unit 112 uses response data 126 to update scores for segments and treatments in model 103.
- The terms "refining" and "updating" refer to any modification, adaptation, or other change in model 103 to accommodate response data 126. Any of a variety of techniques can be used to update the scores for segments and treatments. In one embodiment, scores are updated by minimizing the variance scaled distance between the squared deviation for all of the observations.
- Model-refining unit 112 produces a report specifying how model 103 has been modified.
- The score updating process uses a time decay function so that more recent observations are given more weight than older observations. For instance, each sample can be multiplied by the factor e^(-Bt), where t is time and B is a factor that can be adjusted to speed up or slow down the effect of time decay of the function.
- Other types of functions can also be used, such as cyclical functions that weight more heavily observations from the previous year to capture seasonal response patterns, such as for holidays. In general, any function that reduces the impact of an observation based upon the age of the observation can be used.
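The exponential time decay can be sketched as a weighted response rate in which each observation is weighted by e^(-Bt). Here t is taken to be the age of the observation in arbitrary time units, and normalizing by the sum of the weights is an assumption of this sketch; the function and parameter names are illustrative.

```python
import math


def decayed_response_rate(observations, decay_rate):
    """Response rate with each observation weighted by exp(-decay_rate * age).

    observations is a list of (age, responded) pairs, where age is the time
    elapsed since the observation and responded is a boolean.
    """
    weights = [math.exp(-decay_rate * age) for age, _ in observations]
    total = sum(weights)
    if total == 0:
        return 0.0
    hits = sum(w * int(responded) for w, (_, responded) in zip(weights, observations))
    return hits / total


history = [(0.0, True), (10.0, False)]
no_decay = decayed_response_rate(history, decay_rate=0.0)
with_decay = decayed_response_rate(history, decay_rate=1.0)
```

With B = 0 every observation counts equally; increasing B shifts the estimate toward the most recent observations.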
- The embodiment of the present invention as illustrated in FIG. 1 has now been described. Many modifications to and variations of this embodiment, as well as other embodiments, are possible within the scope of the present invention. For example, several variations of the segmentation process will now be described.
- The segmentation process can vary depending on what type of induction tree (e.g., classification or regression) is used for classification of records.
- Two-way classification can be used for predictive variables that are boolean data types.
- K-way classification can be used for predictive variables that are categorical data types with K states.
- Boolean and categorical classification are related but result in different internal representations.
- Although two-state categorical classification produces the same tree as boolean classification, significant efficiency gains can result from specifying two-state data types as boolean.
- Regression trees can be used for continuous-valued predictive variables. These continuous variables can be represented as floating point numbers. Ordered discrete variables having integer values, such as a quantity of items, are also continuous-valued and therefore use a regression tree.
- The segmentation process can also be varied to account for missing values in database 120.
- The value of predictive variable 132 is typically defined for all records in database 120. This is because a consumer record 130 cannot be used to predict a behavior if the corresponding predictive variable 132 is not defined.
- Missing values in attributes 136 are allowed; however, fewer missing values improve the predictive power of model 103.
- Missing values are handled using an information-theoretic approach. However, in this method, the overall effect of missing values is not adequately penalized for splits high up in the tree because of the "local" nature of the algorithm of this approach. Model-building unit 102 can adjust for this by adding a penalty for splitting on attributes that contain missing values.
- The range of this penalty value can be from 1.0 to 5.0, with 1.0 being used as a strictly local penalty for missing values, and 5.0 being used so that a sparse attribute (i.e., an attribute with many missing values) is least likely to be split first.
- A default value can be set at 3.0. This penalty adjusts information gain by the information-theoretic loss due to the missing values, and can be reapplied for each split.
- Missing values may be handled in other ways. Rather than penalizing for missing values, one embodiment of the present invention fills in missing values.
- A missing value can be filled in a number of ways.
- A record with a missing value can be propagated to both children with fractional weights (x and 1-x) equal to the proportion of non-missing value records that were routed to the children when building the tree.
- This approach assumes that the actual values of an attribute with missing values correlate, on average, to the values of that attribute without missing values.
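The fractional-weight approach can be sketched for a single "greater than" split. In this illustration a missing value is represented by `None`, and `right_fraction` plays the role of x, the proportion of non-missing training records that went to the right child; both conventions and the function name are assumptions of the sketch.

```python
def route_record(record, attribute, threshold, right_fraction, weight=1.0):
    """Route a record at one split, returning a list of (child, weight) pairs.

    A record with a known value goes to exactly one child with its full
    weight. A record with a missing value (None) goes to both children with
    weights x and 1-x, where x = right_fraction.
    """
    value = record.get(attribute)
    if value is None:
        return [
            ("right", weight * right_fraction),
            ("left", weight * (1.0 - right_fraction)),
        ]
    return [("right" if value > threshold else "left", weight)]


known = route_record({"age": 52}, "age", 40, right_fraction=0.25)
missing = route_record({"age": None}, "age", 40, right_fraction=0.25)
```

Passing the returned weight back in as `weight` at the next level lets the fractions multiply down the tree, so the weights of all leaves reached by one record still sum to its original weight.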
- Another approach is to assign missing values in a way that maintains the integrity of a variance-covariance matrix.
- Another approach is to propagate the record with the missing value to the children with the most similar values of other attributes.
- The segmentation process can also be varied to structure the data more closely to the business model.
- Model-building unit 102 can set dependencies between attributes.
- Model-building unit 102 creates a "HAS-A" dependency between a credit card balance attribute and a new synthetic attribute indicating whether or not the consumer has a credit card. This approach might be used when consumer records having different formats are joined. Note that only boolean type fields can be designated as HAS-A attributes. Thus, if A depends on B, B must be of boolean type.
- HAS-A attributes generally do not have enough predictive power to be chosen on their own merits. If sufficient predictive power is found in the dependent attribute, a special composite split, using both the HAS-A attribute and the dependent attribute, can be performed.
- The segmentation process can also be varied if there are a number of different treatments used in a marketing campaign.
- Multiple treatments can be handled by designating treatment attribute 134 as a categorical attribute that adds a second dimension to the splitting process.
- The treatment attribute 134 is preferably free of unknown or missing values.
- The differential in the value of predictive variable 132 across the different values of treatment attribute 134 is taken into account.
- FIG. 8 illustrates how segmentation can be performed in the multiple treatment case in accordance with an embodiment of the present invention. This embodiment uses a segmentation criterion that measures the gain in explanatory power for the response that results from different treatment effects in child nodes above and beyond the power of a basic split that simply differentiates average response irrespective of treatment.
- the tree illustrated in FIG. 8 shows some similarity to the tree without different treatments illustrated in FIG. 5, but there are important differences.
- the tree in FIG. 5 shows the application of simple split criteria at two successive levels while FIG. 8 shows the choice of a split at just one level employing a compound criterion that considers both the predictor attribute and the treatment attribute.
- the responses to different treatments in the base node 802 are shown by the responses in node 803, which contains records of consumers that received treatment A, and node 805, which contains records of consumers that received treatment B.
- node 812 contains records of consumers with "age > 40" that received treatment A.
- node 814 contains records of consumers with "age > 40" that received treatment B.
- the metric for the choice of splits in the multiple treatment case is an elaboration of the single treatment case.
- the single treatment case corresponds to the nodes 802, 804 and 806, without nodes 808, 810, 812 and 814.
- the function "metric" is based on the value of predictive variable 132, and is a function such as Entropy, Gini Index, or Gini-Hat Index.
- the above equation measures the gain in explanatory power of splitting on the attribute "age."
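The gain described here can be sketched in a generic form: the metric of the parent node minus the size-weighted metric of the child nodes. This form, the Gini definition used, the sample records, and all function names below are assumptions for illustration only. The compound case is approximated by further separating each child by treatment and measuring the gain beyond the basic split.

```python
def gini(responses):
    """Gini index of a list of 0/1 responses: 2 * p * (1 - p)."""
    if not responses:
        return 0.0
    p = sum(responses) / len(responses)
    return 2.0 * p * (1.0 - p)


def split_gain(parent, children, metric=gini):
    """Metric of the parent minus the size-weighted metric of the children."""
    n = len(parent)
    weighted = sum(len(c) / n * metric(c) for c in children)
    return metric(parent) - weighted


# Hypothetical records: (age, treatment, responded)
records = [
    (25, "A", 1), (30, "A", 1), (35, "B", 0), (38, "B", 0),
    (45, "A", 0), (50, "A", 0), (55, "B", 1), (60, "B", 1),
]
all_resp = [r for _, _, r in records]

# Basic split on "age > 40", ignoring treatment.
young = [r for a, _, r in records if a <= 40]
old = [r for a, _, r in records if a > 40]
basic_gain = split_gain(all_resp, [young, old])

# Compound split: each age child is further separated by treatment.
cells = {}
for age, trt, resp in records:
    cells.setdefault((age > 40, trt), []).append(resp)
compound_gain = split_gain(all_resp, list(cells.values()))

# Gain attributable to treatment effects beyond the basic split.
treatment_gain = compound_gain - basic_gain
```

In this constructed data the age split alone explains nothing, while the treatment effect reverses between the two age groups, so the entire gain comes from the compound criterion.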
- model-building unit 102 uses a parameter "omega" to determine the amount of correction to be made for unbalanced populations.
- the omega parameter for classification trees is typically set between 0.0 and 1.0. For example, if there are 10% "yes" and 90% "no" responses for the predictive variable 132 in a given population, the "omega" factor can be used to force the algorithm to treat the population as a 50% "yes" and 50% "no" population. This approach has practical benefits. It trades off accuracy for increased numbers of mixed nodes. Such mixed nodes provide more opportunities for finding good targets for some marketing campaigns.
- the "omega" parameter can be set to its maximum value of 1.0 for target marketing applications, and can be set to its minimum value of 0.0 for pure classification tasks.
- the "omega" parameter can also be tuned to an intermediate value between 0.0 and 1.0, according to specific needs.
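One way to picture the effect of "omega" is as a pair of per-class record weights that move the effective "yes" share from its observed value toward 50%. The linear interpolation used below is an assumption made for illustration; the text does not state how intermediate values of omega are applied, and the function names are hypothetical.

```python
def omega_class_weights(p_yes, omega):
    """Per-class weights that shift the effective "yes" share.

    Assumes (for illustration) a linear interpolation: omega = 0.0 keeps
    the observed share p_yes, omega = 1.0 yields a balanced 50/50 population.
    """
    if not 0.0 < p_yes < 1.0:
        raise ValueError("p_yes must be strictly between 0 and 1")
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must be between 0.0 and 1.0")
    target_yes = (1.0 - omega) * p_yes + omega * 0.5
    w_yes = target_yes / p_yes
    w_no = (1.0 - target_yes) / (1.0 - p_yes)
    return w_yes, w_no


def effective_yes_share(p_yes, w_yes, w_no):
    """Share of total weight carried by "yes" records after weighting."""
    yes_mass = p_yes * w_yes
    no_mass = (1.0 - p_yes) * w_no
    return yes_mass / (yes_mass + no_mass)


# 10% "yes" / 90% "no" population, fully corrected (omega = 1.0).
w_yes, w_no = omega_class_weights(0.10, 1.0)
balanced_share = effective_yes_share(0.10, w_yes, w_no)
```

With omega set to 0.0 both weights are 1.0 and the population is left as observed.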
- each consumer record can also include a weight to be used to calculate a weighted response.
- the metrics are based upon the weighted response rates.
- weights can be used to analyze panel data where each record represents a group of people in a larger population. The weight is then proportional to the number of observations in the group.
- weights can be used to value the consumer based on such characteristics as credit risk, profitability, or revenue generation.
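A weighted response rate can be sketched as the weight of responding records divided by the total weight. The record layout `(response, weight)` and the function name are assumptions for illustration.

```python
def weighted_response_rate(records):
    """Weighted response rate for (response, weight) pairs, response in {0, 1}."""
    total_weight = sum(weight for _, weight in records)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    responding_weight = sum(weight * response for response, weight in records)
    return responding_weight / total_weight


# Hypothetical panel data: each record stands for a group of people.
panel = [(1, 100), (0, 300), (1, 100)]
rate = weighted_response_rate(panel)  # 200 of 500 weighted people responded
```

With all weights equal to one this reduces to the ordinary response rate.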
- the segmentation process can also be varied to use a technique called "boosting".
- the boosting method can be used to build more accurate trees by building a series of trees based on the remaining errors from a previous tree.
- the basic idea of boosting is to improve the segmentation process to build a set of related trees. Each tree generates a score for each record. The final score is then the sum of the scores across all the trees built.
- the scores are real numbers (e.g., positive for true and negative for false).
- the scores are not restricted to the range between zero and one, and they do not represent probabilities of responding as in a previously described tree model.
- Each tree is built using a different set of weights.
- the first tree typically has all weights equal to one.
- weights are chosen to minimize a loss function. The resulting weights are such that records that are mis-classified have larger weight. Records that are correctly classified have smaller weight.
- the scores are equal to half the log of the odds ratio within a node.
- a different choice of a loss function and/or minimization approximation can be used, leading to a different expression for the score. For example, using the same loss function but choosing a Newton update instead of a full minimization leads to scores that are equal to the difference between the probability of responding and the probability of not responding.
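- $f_i = \sum_{t} h(i,t)$, where h(i,t) is the score that tree t generates for record i and f_i is the final score for record i.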
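The scoring rules described for boosting can be sketched as follows. The half-log-odds score, the difference-of-probabilities score, and the final sum come from the description; the exponential-loss weight update, the labels in {+1, -1}, the clipping constant, and all function names are assumptions chosen for illustration, since the text only says that "a loss function" is minimized.

```python
import math


def node_score(weighted_p, eps=1e-6):
    """Half the log of the odds of responding within a node."""
    p = min(max(weighted_p, eps), 1.0 - eps)  # avoid log(0) in pure nodes
    return 0.5 * math.log(p / (1.0 - p))


def newton_score(weighted_p):
    """Alternative score: P(respond) minus P(not respond) within a node."""
    return weighted_p - (1.0 - weighted_p)


def reweight(weights, labels, scores):
    """Assumed exponential-loss update: w <- w * exp(-y * h), y in {+1, -1}.

    Records whose score disagrees in sign with the label gain weight;
    records scored correctly lose weight.
    """
    return [w * math.exp(-y * h) for w, y, h in zip(weights, labels, scores)]


def final_score(per_tree_scores):
    """Final score for one record: the sum of its scores across all trees."""
    return sum(per_tree_scores)


# A node in the first tree where 80% of the (weighted) records respond.
h = node_score(0.8)
# Two records fall in that node: one responder (+1) and one non-responder (-1).
new_weights = reweight([1.0, 1.0], [+1, -1], [h, h])
```

Here the non-responder, which the node scores as a likely responder, ends up with four times the weight of the correctly scored record when the next tree is built.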
- Another embodiment of the boosted model employs a residual regression score as a covariate predictor within nodes of the boosted induction tree.
- Transformed residuals from a pre-boosting tree structure, which has typically been pruned more closely than a single tree, are used to perform a step-wise regression against all attributes that are candidate predictors.
- the transformations can be used for non-linearities or for equalizing variances.
- Scores from this regression are then included as covariates in the boosted tree.
- the inclusion as covariates may take the form of binned score values within a multiple treatment implementation of the algorithm.
- the covariate may be included as a scaled value in a simple regression mechanism nested within the induction tree.
- a prospect list can be scored by applying prospect database 122 to model 103.
- the scoring process results in a list of predicted values, one for each record.
- the result is a list of the probabilities of "yes” (generally probability to respond).
- the result is a list of the most likely categories.
- the result is a list of the predicted values.
- the standard deviation of the predicted value can be generated for continuous values. In the multiple treatment case, two values are generated. One is the predicted value, usually the "probability of response", and the other is the best treatment.
- Another embodiment generates the segment number rather than a probability or a prediction. Then, a table can be used to look up a score.
- a list of target prospects must be created.
- a cell matrix is created.
- the cell matrix is created by generating a matrix with one row for each segment and one column for each treatment, plus one column to represent the whole segment. The cells in the latter column have the percentage of the number of targets that should be selected from that segment.
- the treatment columns contain a number that is a percentage of each of the treatments that should be allocated within that segment. Thus, each row of treatment cells sums to 100%.
- the cell matrix is automatically adjusted to contain absolute numbers of targets based on the campaign size provided by the user.
- the cell matrix is automatically constrained based on the number of targets available. This is done by first scoring the prospect database with the segment ID as described above. Then, tallies are made of the number and the percentage of the number of targets for each segment. The cell matrix can then be modified by reducing the number in the cell to ensure that there will be enough targets to fill it. This reduction can be done explicitly or proportionally. A corresponding increase must then be done to the other cells to ensure consistency.
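The two adjustments described above can be sketched as a conversion from percentages to absolute counts, followed by a proportional constraint step. The dictionary layout, the function names, and the choice to spread any shortfall in proportion to each segment's spare capacity are assumptions for illustration; the description allows either an explicit or a proportional adjustment.

```python
def to_absolute(segment_pct, treatment_pct, campaign_size):
    """Convert percentage cells into target counts per (segment, treatment)."""
    counts = {}
    for seg, seg_share in segment_pct.items():
        seg_total = campaign_size * seg_share / 100.0
        for trt, trt_share in treatment_pct[seg].items():
            counts[(seg, trt)] = round(seg_total * trt_share / 100.0)
    return counts


def constrain(segment_targets, available):
    """Cap each segment at its available prospects, then raise the other
    segments in proportion to their spare capacity to keep the total."""
    capped = {s: min(t, available[s]) for s, t in segment_targets.items()}
    shortfall = sum(segment_targets.values()) - sum(capped.values())
    spare = {s: available[s] - capped[s] for s in capped}
    total_spare = sum(spare.values())
    if shortfall > 0 and total_spare > 0:
        # Never allocate more than the spare capacity that actually exists.
        scale = min(1.0, shortfall / total_spare)
        for s in capped:
            capped[s] += spare[s] * scale
    return capped


# Hypothetical cell matrix: three segments, two treatments, 1000 contacts.
segment_pct = {1: 50, 2: 30, 3: 20}
treatment_pct = {
    1: {"A": 60, "B": 40},
    2: {"A": 100, "B": 0},
    3: {"A": 50, "B": 50},
}
cells = to_absolute(segment_pct, treatment_pct, 1000)

# Segment 1 has only 400 prospects available, so 100 targets move elsewhere.
adjusted = constrain({1: 500, 2: 300, 3: 200}, {1: 400, 2: 600, 3: 300})
```

After the adjustment the segment totals still add up to the requested campaign size, provided enough prospects exist overall.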
- test cells represent prospects that are selected randomly across all segments, or from a set of segments defined by the user.
- Another embodiment allows the user to specify a number or percentage of the total number of targets to be used for testing a new treatment.
- One embodiment is a new column that has "N/S" targets, where "N” is the number in the test and "S" is the number of segments. Alternatively, this number may be weighted by the total for the segment.
- These test targets may be in addition to the previously specified campaign size, or may be taken from other cells to preserve the size.
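The test column can be sketched as an equal "N/S" allocation with an optional weighting by segment totals. The function name and the dictionary layout are assumptions for illustration.

```python
def allocate_test_targets(n_test, segment_totals, weighted=False):
    """Spread n_test targets over segments.

    Unweighted: every segment receives N/S targets.
    Weighted: each segment's share is proportional to its total.
    """
    segments = list(segment_totals)
    if not weighted:
        return {s: n_test / len(segments) for s in segments}
    grand_total = sum(segment_totals.values())
    return {s: n_test * segment_totals[s] / grand_total for s in segments}


# Hypothetical segment totals and a test of 300 targets.
totals = {1: 500, 2: 300, 3: 200}
equal_split = allocate_test_targets(300, totals)
weighted_split = allocate_test_targets(300, totals, weighted=True)
```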
- target list generator 106 populates a cell matrix, such as cell matrix 600 that appears in FIG. 6, to optimize the number of contacts to make for each segment and treatment.
- cell matrix 600 has rows 1, 2, 3, ... corresponding to segments, and columns, A, B, C, ... corresponding to treatments.
- each cell within the cell matrix 600 can have a specific response rate that is determined from responses collected during prior campaigns, and/or a number or percentage as described above.
- target list generator 106 allocates prospects across cell matrix 600 in order to optimize total expected response subject to a risk constraint. This risk constraint penalizes the response rate for a cell if the measured response rate for the cell has a high variance.
- instead of optimizing for expected response rate alone, target list generator 106 optimizes the expected response rate minus a function of the variance of the expected response (multiplied by a coefficient). By penalizing variance, this risk constraint reduces the chance that the marketing campaign will perform more poorly than a random mailing.
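A risk-adjusted cell score of this kind can be sketched as the measured response rate minus a coefficient times a function of its variance. The use of the binomial variance of an estimated rate, the square root as the penalty function, and the names below are assumptions for illustration; the description leaves the function and the coefficient unspecified.

```python
import math


def risk_adjusted_rate(responses, contacts, risk_coefficient=1.0):
    """Response rate penalized by the uncertainty of its estimate."""
    p = responses / contacts
    variance = p * (1.0 - p) / contacts  # binomial variance of the rate
    return p - risk_coefficient * math.sqrt(variance)


# Hypothetical cells from a prior campaign.
small_cell = risk_adjusted_rate(5, 50)      # 10% measured on 50 contacts
large_cell = risk_adjusted_rate(80, 1000)   # 8% measured on 1000 contacts
```

With a coefficient of 1.0 the well-measured 8% cell ranks above the noisy 10% cell, which is the behavior the risk constraint is meant to produce.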
- the cell matrix is populated by filling the cells with the highest response rate first. Once a cell is populated, the treatment with the best response rate is assigned to all consumers in that cell. As an alternative, if the difference in response rates between treatments is not too large, then a small number of consumers are randomly assigned to the non-dominant treatment. This treatment testing is only done if enough consumers are in this cell. If the difference in response rates is too small, then the treatments are assigned randomly across all consumers in the cell. In another embodiment of the target list generation process, a different algorithm is used. Using this algorithm, the segments are sorted by score.
- Consumers to contact are selected from the segments in two steps. First, a target set is filled by assigning a 100% sampling rate to segments, starting with the highest scoring segments. Second, a holdout set is selected from a distribution of the remaining segments, for example, by separating the remaining segments into equal quartiles and assigning sampling rates of 60%, 20%, 10%, and 10% to the first through fourth quartiles, respectively.
- Linear interpolation can be used for segments that run across the boundary of a quartile.
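The quartile-based holdout sampling can be sketched as follows. The 60/20/10/10 rates come from the example above; treating "linear interpolation" as an overlap-weighted average of the quartile rates, measuring quartiles by cumulative population, and the function name are assumptions for illustration.

```python
QUARTILE_RATES = [0.60, 0.20, 0.10, 0.10]


def holdout_sampling_rates(segment_sizes, rates=QUARTILE_RATES):
    """Sampling rate per remaining segment.

    segment_sizes must be positive and ordered from the highest scoring
    segment to the lowest. A segment that straddles a quartile boundary
    gets the overlap-weighted average of the rates it spans.
    """
    total = sum(segment_sizes)
    n_bins = len(rates)
    result = []
    start = 0.0
    for size in segment_sizes:
        end = start + size / total
        blended = 0.0
        for q, rate in enumerate(rates):
            lo, hi = q / n_bins, (q + 1) / n_bins
            overlap = max(0.0, min(end, hi) - max(start, lo))
            blended += rate * overlap
        result.append(blended / (end - start))
        start = end
    return result


# Hypothetical remaining segments; the middle one spans quartiles 2 and 3.
rates = holdout_sampling_rates([250, 500, 250])
```

The middle segment covers the second and third quartiles equally, so its rate is the average of 20% and 10%.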
- a sampling rate for each treatment in each segment is determined according to a Z score for each segment, where the Z score is the difference between the response rate to the best treatment and the response rate to the test treatment, normalized by the standard deviation of the difference.
- the sample size receiving the best treatment varies based on the Z score, for example, from 50% of the consumers to be contacted from the segment where the Z score is less than a minimum value such as "0.1", to 100% of the consumers to be contacted from the segment where the Z score is greater than a maximum value such as "3.0".
- the sampling assignments are verified using the expected response rates to ensure that the results will be statistically significant.
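The Z score rule can be sketched as follows. The 50% and 100% endpoints and the example thresholds 0.1 and 3.0 come from the description; the linear ramp between the thresholds, the binomial standard error, and the function names are assumptions for illustration.

```python
import math


def z_score(best_rate, best_n, test_rate, test_n):
    """Difference in response rates divided by its standard deviation."""
    variance = (best_rate * (1.0 - best_rate) / best_n
                + test_rate * (1.0 - test_rate) / test_n)
    return (best_rate - test_rate) / math.sqrt(variance)


def best_treatment_share(z, z_min=0.1, z_max=3.0,
                         share_min=0.5, share_max=1.0):
    """Share of a segment's contacts that receive the best treatment."""
    if z <= z_min:
        return share_min
    if z >= z_max:
        return share_max
    frac = (z - z_min) / (z_max - z_min)  # assumed linear ramp
    return share_min + frac * (share_max - share_min)


# Hypothetical segment: 10% vs 8% response, 1000 contacts per treatment.
z = z_score(0.10, 1000, 0.08, 1000)
share = best_treatment_share(z)
```

A segment with no clear winner splits its contacts evenly, while a segment with a decisive winner sends everyone the best treatment.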
- the segmentation process can also be varied so as to perform multidimensional segmentation.
- Multidimensional segmentation is performed using a tree algorithm to predict a number, K, of different behaviors at the same time.
- the different behaviors may represent revenue, cost, risk, specific product usage, or loyalty.
- This method is performed by replicating the training data set K times into K separate "tiers."
- Predictive variable 132 is constructed as a composite, successively taking on values of one of the K behavioral outcomes in each of the K tiers.
- Treatment attribute 134 is defined with a distinct value in each tier which is identified with the behavioral measure present as predictive variable 132 in that tier.
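The tier construction can be sketched as a simple replication of the training records. The field names `treatment` and `predictive_variable`, the sample records, and the function name are assumptions for illustration.

```python
def build_tiers(records, behaviors):
    """Replicate the records once per behavior.

    In each tier the treatment attribute holds the behavior's name and the
    predictive variable holds that behavior's outcome for the record.
    """
    tiered = []
    for behavior in behaviors:
        for rec in records:
            # Keep the predictor attributes, drop the raw behavior columns.
            row = {k: v for k, v in rec.items() if k not in behaviors}
            row["treatment"] = behavior
            row["predictive_variable"] = rec[behavior]
            tiered.append(row)
    return tiered


# Hypothetical training records with K = 3 behavioral outcomes.
records = [
    {"age": 34, "revenue": 120.0, "cost": 15.0, "risk": 0.02},
    {"age": 58, "revenue": 40.0, "cost": 22.0, "risk": 0.10},
]
tiers = build_tiers(records, ["revenue", "cost", "risk"])
```

The resulting data set has K times as many rows as the original and can be passed to a multiple treatment tree in which the tier label plays the role of the treatment.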
- the criterion for choice of splits is to maximize the difference in the K- dimensional pattern of behavior in the two child nodes relative to the parent node.
- a variation on this embodiment adapts a multidimensional treatment algorithm, for example the K-dimensional Gini information measure with a differentiation parameter set for maximal treatment differentiation as is described below.
- the results are much like a powerful multi-treatment model with some behaviors being very high in some nodes, and other behaviors being very high in other nodes.
- the behaviors in a single analysis, referred to as "behavioral drivers," can be both positive (e.g., profit, revenue, sales of K different products) and negative (e.g., cost, risk, attrition, default). Hence, only minor consideration is placed on the average outcome.
- a variation on the above-described embodiment includes using a regression tree with split criteria based on explained sums of squares. Another variation uses absolute linear differences. These variations allow for flexible weighting of different behavioral drivers to provide for greater or lesser impact on the segmentation as appropriate for a given business problem.
- the previous embodiments handle cases when the number of treatments (or behavior drivers) is small (four to eight). In cases where the number of treatments (or behavior drivers) is large, 20-30 or even 100 (such as one might use in analyzing a large number of products), items that appeal to similar people may cluster together into an overall behavioral pattern. A behavior that follows a cluster pattern in most segments but differs in a few distinct segments may suggest marketing opportunities.
- the above-described differential treatment mechanisms are defined to work on classifications made with boolean data types and in regressions made with continuous or integer data types.
- the underlying behavior measures are scaled data types that are transformed into boolean data types indicating a place in the top or bottom n-tiles of the distribution.
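The n-tile transformation can be sketched as flagging the records whose value falls in the top n-tile of the distribution. The function name, the default of quartiles, and the handling of ties (tied values at the threshold are all flagged) are assumptions for illustration.

```python
def top_ntile_flags(values, n_tiles=4):
    """Boolean flag per value: True if it lies in the top n-tile."""
    ranked = sorted(values, reverse=True)
    cutoff_count = max(1, len(values) // n_tiles)
    threshold = ranked[cutoff_count - 1]  # smallest value still in the top tile
    return [v >= threshold for v in values]


# Hypothetical revenue figures; the top quartile holds the two largest values.
revenue = [10, 50, 20, 80, 30, 70, 40, 60]
in_top_quartile = top_ntile_flags(revenue, n_tiles=4)
```

A bottom n-tile flag can be produced the same way by sorting in ascending order and reversing the comparison.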
- Another embodiment of multidimensional segmentation treats the K target behaviors as a matrix with K columns and one row per observation.
- the class of target predictive variables is expanded to include continuously-valued measures.
- a metric may include different weights for each of the K behavioral measures to align statistical impacts with the business importance of different measures.
- the above-described embodiments of the present invention provide a number of advantages. First, they can allow a marketing manager to create a marketing campaign automatically without the services of trained statisticians. Second, since an aspect of the present invention exhaustively searches for relationships between consumer attributes and response rates, it is not likely to miss any important relationships. Third, the system automatically handles multiple treatments which generally must be handled manually. Fourth, the system can systematically create a test campaign allowing the marketer to measure the effectiveness of the overall campaign. Fifth, the system automatically handles many labor-intensive data preprocessing functions such as handling missing data, data transformations, and data dependencies. Also, because the system is scalable, sampling is not required. Furthermore, the present invention can be used to automatically create a marketing campaign within hours, instead of weeks as is presently required using existing systems.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU23844/00A AU2384400A (en) | 1999-12-20 | 1999-12-20 | Automatic marketing process |
PCT/US1999/030793 WO2001046896A1 (fr) | 1999-12-20 | 1999-12-20 | Processus de commercialisation automatique |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US1999/030793 WO2001046896A1 (fr) | 1999-12-20 | 1999-12-20 | Processus de commercialisation automatique |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001046896A1 true WO2001046896A1 (fr) | 2001-06-28 |
Family
ID=22274386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/030793 WO2001046896A1 (fr) | 1999-12-20 | 1999-12-20 | Processus de commercialisation automatique |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2384400A (fr) |
WO (1) | WO2001046896A1 (fr) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8831974B1 (en) | 2009-04-24 | 2014-09-09 | Jpmorgan Chase Bank, N.A. | Campaign specification system and method |
WO2016010762A1 (fr) * | 2014-07-15 | 2016-01-21 | Datagence Inc. | Procédé et appareil pour cloner une liste cible |
CN111882339A (zh) * | 2019-12-20 | 2020-11-03 | 马上消费金融股份有限公司 | 预测模型训练及响应率预测方法、装置、设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4345315A (en) * | 1979-01-19 | 1982-08-17 | Msi Data Corporation | Customer satisfaction terminal |
US5893075A (en) * | 1994-04-01 | 1999-04-06 | Plainfield Software | Interactive system and method for surveying and targeting customers |
US5893098A (en) * | 1994-09-14 | 1999-04-06 | Dolphin Software Pty Ltd | System and method for obtaining and collating survey information from a plurality of computer users |
US5974396A (en) * | 1993-02-23 | 1999-10-26 | Moore Business Forms, Inc. | Method and system for gathering and analyzing consumer purchasing information based on product and consumer clustering relationships |
1999
- 1999-12-20: AU application AU23844/00A (AU2384400A), not active, abandoned
- 1999-12-20: WO application PCT/US1999/030793 (WO2001046896A1), active, application filing
Also Published As
Publication number | Publication date |
---|---|
AU2384400A (en) | 2001-07-03 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC |
| | 122 | Ep: pct application non-entry in european phase | |