WO2001006405A2 - Cross-selling in database mining - Google Patents

Cross-selling in database mining

Info

Publication number
WO2001006405A2
Authority
WO
WIPO (PCT)
Prior art keywords
product
prospects
products
prospect
purchase
Prior art date
Application number
PCT/US2000/040345
Other languages
English (en)
Inventor
Yuchun Lee
Robert Crites
Original Assignee
Unica Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unica Technologies, Inc.
Priority to AU69539/00A (patent AU6953900A)
Publication of WO2001006405A2

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising

Definitions

  • This invention relates generally to data mining software.
  • Data mining software extracts knowledge that may be suggested by a set of data. For example, data mining software can be used to maximize a return on investment in collecting marketing data, as well as other applications such as credit risk assessment, fraud detection, process control, medical diagnoses and so forth.
  • data mining software uses one or a plurality of different types of modeling algorithms in combination with a set of test data to determine what types of characteristics are most useful in achieving a desired response rate, behavioral response or other output from a targeted group of individuals represented by the data.
  • data mining software executes complex data modeling algorithms such as linear regression, logistic regression, back propagation neural network, Classification and Regression Trees (CART) and Chi squared Automatic Interaction Detection (CHAID) decision trees, as well as other types of algorithms on a set of data.
  • CART: Classification and Regression Trees
  • CHAID: Chi squared Automatic Interaction Detection
  • a method of determining a prospect's likelihood to purchase a product includes scoring a plurality of prospects on a plurality of models. Each of the models tries to predict the likelihood that the prospects will purchase a particular product. The models produce model scores for each product. The method converts the model scores for each product into probabilities.
  • a computer program product for conducting product cross selling in a marketing campaign includes instructions for causing a computer to score a data set of prospects using a plurality of models that model a prospect's likelihood to purchase a corresponding plurality of products producing model scores for each product. The program converts the model scores for each product into probabilities.
  • a method of determining a number of prospects to contact in a marketing campaign includes fixing a unit cost to market each of a plurality of products by considering a mailing campaign for each product independently and adjusting the unit costs that have been assumed for each product independently, to arrive at costs that are appropriate when considering all of the products together.
  • the method also includes assigning prospects to product mailing lists based on the costs determined in the first pass while accumulating actual quantities of prospects to be placed on each of the lists and correcting the costs based on actual quantities by considering a maximum budget for the marketing campaign.
  • the data mining software allows for execution of multiple models that are designed for and trained with a sample of prospects and their purchase information about a set of products.
  • the sample of prospects that are used for training the models for each product can be filtered to exclude prospects who have recently purchased that product, unless it is desirable to have the software specifically look at repeat purchases as a possibility. For example, if the nature of the product is a one-time or infrequent purchase, then the software can remove customers who have recently purchased the product. Filtering may be important because inclusion of repeat customers under some circumstances can skew the training of the model and thus produce inaccurate probability estimates.
  • the data base mining software uses a cross-selling algorithm that uses multiple models, one for each product.
  • the software scores each prospect using all of the models of the products and transforms the scores into probability estimates in order to allow for comparisons between the scores.
  • the inputs to each model can include a prospect's purchase information about a set of products, as well as any other relevant information.
  • the output from each model is a probability that the prospect will purchase the modeled product.
  • the software or an operator can compare the different probabilities from each of the models and select which products to target for each prospect.
  • the software can also take into account costs and revenues associated with each product and target prospects based on expected profits.
  • This invention allows an organization to pinpoint exactly what products a prospect is most likely to purchase.
  • the software may rank the top, e.g., three products for a prospect such that with a limited amount of contact with the prospect, the organization can determine what products to target to the prospect.
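  • A minimal sketch of the per-prospect ranking described above, assuming the probability estimates are already available as a mapping from product name to probability. The function name `top_products`, the dictionary layout, and the expected-profit formula (probability times revenue, minus contact cost) are illustrative assumptions, not taken from the patent.

```python
from typing import Dict, List, Optional, Tuple


def top_products(
    probabilities: Dict[str, float],
    n: int = 3,
    revenue: Optional[Dict[str, float]] = None,
    cost: Optional[Dict[str, float]] = None,
) -> List[Tuple[str, float]]:
    """Rank the products for one prospect and return the best n.

    Without revenue and cost, products are ranked by purchase probability.
    With both, they are ranked by an assumed expected profit:
    probability * revenue - cost.
    """
    def key(product: str) -> float:
        p = probabilities[product]
        if revenue is None or cost is None:
            return p
        return p * revenue[product] - cost[product]

    ranked = sorted(probabilities, key=key, reverse=True)
    return [(product, key(product)) for product in ranked[:n]]


# Hypothetical prospect with probability estimates for four products.
probs = {"A": 0.10, "B": 0.30, "C": 0.05, "D": 0.20}
print(top_products(probs, n=3))
```

  • Ranking by expected profit can reorder the list: a product with a low purchase probability but a high revenue may outrank a more likely but less valuable one.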
  • FIG. 1 is a block diagram of a computer system executing data mining software.
  • FIG. 2 is a block diagram of a data set.
  • FIG. 2A is a diagram of a record.
  • FIG. 3 is a block diagram of a training process for data mining software that includes a cross-selling algorithm.
  • FIG. 4 is a block diagram of data mining software that includes a cross-selling algorithm.
  • FIG. 5 is a block diagram of a budget process useful in the cross selling algorithm of FIG. 4.
  • a computer system 10 includes a CPU 12, main memory 14 and persistent storage device 16 all coupled via a computer bus 18.
  • the system 10 also includes output devices such as a display 20 and a printer 22, as well as user input devices such as a keyboard 24 and a mouse 26.
  • software drivers and hardware interfaces couple all the aforementioned elements to the CPU 12.
  • the computer system 10 also includes data mining software 30 that includes a cross-selling algorithm 32.
  • the cross-selling algorithm 32 is designed to estimate a prospect's likelihood to purchase a plurality of products.
  • the data mining software 30 may reside on the computer system 10 or may reside on a server 28, as shown, which is coupled to the computer system 10 in a conventional manner such as in a client-server arrangement. The details on how this data mining software is coupled to this computer system 10 are not important to understand the present invention.
  • data mining software 30 executes complex data modeling algorithms such as linear regression, logistic regression, back propagation neural network, Classification and Regression Trees (CART) and Chi squared Automatic Interaction Detection (CHAID) decision trees, as well as other types of algorithms that operate on a data set. Also, the data mining software 30 can use any one of these algorithms with different modeling parameters to produce different results.
  • the data mining software 30 can render a visual representation of the results on the display 20 or printer 22 to provide a decision maker with the results. The results that are returned can be based on different algorithm types or different sets of parameters used with the same algorithm.
  • what the cross-selling algorithm 32 returns is a set of probability estimates for each prospect.
  • the set of probability estimates corresponds to estimates of the likelihood that a given prospect will purchase each of the plurality of products.
  • the results can be retrieved in other formats, for example, as a graph or other visual depiction of the results.
  • a data set 50 includes a plurality of records 51.
  • the data set 50 generally includes a very large number of such records 51.
  • the records 51 (FIG. 2A) can include an identifier field 53a, as well as one or a plurality of fields 53b corresponding to input variable values that are used in the modeling process 30.
  • the records 51 also include a plurality of result fields 53c that are used by the modeling process to record scores for the record 51.
  • the scores are a measure of the expected behavior of a prospect represented by the record.
  • the result fields include score fields 57a-57i, one for each of a corresponding plurality of models and corresponding probability fields 59a-59i.
  • the data mining software 30 or user randomly selects records from the data set 50 to produce a test sample 52.
  • the test sample is used in a training process 32a (FIG. 3) to train the cross-selling algorithm 32 of the data mining software 30 to provide a process 32b that can be used to generate lists of customers and products.
  • test sample 52 is generated from the data set 50.
  • a random sample of customers to receive a test solicitation, e.g., test samples 52a-52j, is generated from the test sample.
  • Products to test market are randomly assigned to each customer, with the possible constraint that they should not be marketed a product they have already purchased if that would not be appropriate.
  • the test samples 52a-52j may be filtered 54 to remove from the test sample those records corresponding to prospects who have recently purchased a product modeled by one of the plurality of models.
  • each of the product models (FIG. 3) is trained with different mutually exclusive subsets 52a-52j from the random test sample 52.
  • the models are trained using a supervised learning algorithm.
  • training data is provided in the form of input/output pairs, and one or more passes are made through the training data to adjust the model to better match the input to output mapping.
  • the number of prospects in the test sample can be determined based on standard statistical sampling principles.
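  • The sampling steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_training_subsets`, the identifiers, and the `recent_purchases` mapping are hypothetical, and round-robin assignment over a randomly ordered sample is just one simple way to obtain random, mutually exclusive subsets.

```python
import random
from typing import Dict, List, Set


def build_training_subsets(
    prospect_ids: List[str],
    products: List[str],
    recent_purchases: Dict[str, Set[str]],
    sample_size: int,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Draw a random test sample and split it into one subset per product.

    Each sampled prospect is assigned to exactly one product, so the
    subsets are mutually exclusive. A prospect who recently purchased the
    assigned product is filtered out of that product's subset.
    """
    rng = random.Random(seed)
    sample = rng.sample(prospect_ids, sample_size)  # random order, no repeats
    subsets: Dict[str, List[str]] = {product: [] for product in products}
    for i, pid in enumerate(sample):
        product = products[i % len(products)]
        if product in recent_purchases.get(pid, set()):
            continue  # skip recent purchasers of this product
        subsets[product].append(pid)
    return subsets


ids = [f"p{i}" for i in range(100)]
recent = {"p3": {"A"}, "p7": {"B"}}
subsets = build_training_subsets(ids, ["A", "B", "C"], recent, sample_size=30)
print({product: len(members) for product, members in subsets.items()})
```

  • Filtering after assignment means a subset can end up slightly smaller than the others; a fuller implementation might reassign the filtered prospect to a different product instead.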
  • At least one and preferably multiple models 60a-60j are used to model the likelihood of a prospect purchasing corresponding products modeled by the models 60a-60j.
  • ten models 60a-60j are shown that model ten different products. That is, for each product there is a different model that tries to predict the likelihood of a prospect buying the product.
  • the individual multiple models 60a-60j are designed to measure or estimate the likelihood of a prospect to purchase the respective one of the products.
  • the results 64a-64j of testing the models using the test sample are compared with actual test marketing results in order to adjust the models 60a-60j.
  • in this cross-selling training process 32a, different product offerings are made to similar test groups (FIG. 3).
  • the outcomes of the offers are evaluated.
  • the results can be modeled separately by using a separate model for each product.
  • the cross-selling training process 32a gathers data corresponding to positive and negative examples of whether potential prospects purchased a particular product.
  • the cross-selling training process 32a uses a supervised learning algorithm 66 such as the type described above to train the models 60a-60j to predict whether a prospect would purchase or not purchase the product.
  • Models use this data to differentiate between people who purchased and people who did not purchase a particular product. That set of data is used for each product to build each of the models.
  • operation of a cross-selling process 32b of the data mining software 30 uses the trained models 60a-60j to score the records from the data set 50.
  • the models 60a-60j are designed and trained on data sets 52a-52j explained above. These trained models 60a-60j score records 51 corresponding to prospects.
  • the models 60a-60j model the likelihood of the prospect purchasing particular products.
  • the model scores are converted 70 into probability estimates that are stored in the record 51.
  • An algorithm that can be used to convert 70 model scores into probability estimates is given by Equation 1. Equation 1 also adjusts for cases where the data used to train the model was sampled with unequal weights given to positive and negative examples.
  • PRR is the predicted response rate
  • y is the model score between 0 and 1
  • orr is the original response rate for the data segment (typically 1% to 2%)
  • srr is the sampled response rate for the training data for the model (typically 50%).
  • For y ≥ 1 the process returns 1, and for y ≤ 0 it returns 0.
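  • The body of Equation 1 is not reproduced in this text, so the sketch below does not claim to be the patent's formula. It uses a widely used prior-correction formula for models trained on resampled data, which is an assumption here. The parameter names `orr` (original response rate) and `srr` (sampled response rate) are labels chosen for this example, and the clamping at 0 and 1 follows the boundary behavior described above.

```python
def score_to_probability(y: float, orr: float, srr: float) -> float:
    """Convert a model score into a predicted response rate.

    y   -- model score, nominally between 0 and 1
    orr -- original response rate of the data segment (e.g. 0.01 to 0.02)
    srr -- sampled response rate of the training data (e.g. 0.5)

    Assumed prior-correction formula: reweight the positive and negative
    evidence by the ratio of original to sampled class frequencies.
    """
    if y >= 1.0:
        return 1.0
    if y <= 0.0:
        return 0.0
    positive = y * orr / srr
    negative = (1.0 - y) * (1.0 - orr) / (1.0 - srr)
    return positive / (positive + negative)


# A score equal to the sampled response rate maps back to the original rate.
print(score_to_probability(0.5, orr=0.02, srr=0.5))
```

  • A useful sanity check on any such conversion is that a score equal to the sampled response rate maps to the original response rate, since that score carries no information beyond the base rate.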
  • the cross-selling process 32b enters in the records a probability estimate for each product that was scored.
  • the probability estimates can be used to select products to target to particular prospects.
  • the cross-selling process 32b compares 72 the results and ranks them for each prospect in order of the product most likely to be purchased or, by including other cost and revenue information, ranks them according to expected profit. From the sorted probability estimates and profit estimates, the cross-selling process 32b produces 74 summary results.
  • Other results that are produced by the cross-selling process 32b include a list of prospects 76.
  • the list of prospects can also be generated, however, by taking into consideration budget thresholds tHa-tHj that are produced by a budget process 90, as described in conjunction with FIG. 5.
  • When using Equation 1, it may be necessary to adjust the range of predicted response rates based on a comparison with the data. Adjusting predicted response rates should be done in a manner that preserves the average response rate. One way to accomplish this is to multiply the difference between the predicted response rate and the average response rate by an appropriate constant factor "f". The best value for "f" can be determined by comparing, at one or more places, how far the predicted response rate and the response rate exhibited by the data each deviate from the average response rate.
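  • A minimal sketch of the mean-preserving adjustment described above. The function names and the single-point estimate of "f" are illustrative assumptions; the text only says that "f" is found by comparing deviations from the average at one or more places.

```python
from typing import List


def rescale_around_mean(predicted: List[float], f: float) -> List[float]:
    """Stretch or shrink predicted response rates around their average.

    Each rate p becomes mean + f * (p - mean), which leaves the average
    unchanged for any f. Results are not clamped to [0, 1] here, because
    clamping would no longer preserve the average.
    """
    mean = sum(predicted) / len(predicted)
    return [mean + f * (p - mean) for p in predicted]


def estimate_factor(predicted_at_point: float, observed_at_point: float,
                    average: float) -> float:
    """Estimate f from one comparison point (an assumed, simple approach):
    the ratio of the observed deviation to the predicted deviation."""
    return (observed_at_point - average) / (predicted_at_point - average)


rates = [0.01, 0.02, 0.03]
f = estimate_factor(predicted_at_point=0.03, observed_at_point=0.04, average=0.02)
print(rescale_around_mean(rates, f))
```

  • With several comparison points, the individual ratios could be averaged or fitted by least squares; which is preferable depends on how noisy the observed response rates are.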
  • the data mining software 30 also includes a budget constraint feature 90.
  • the data mining software 30 can make three passes through a list of prospects that have been previously scored by using the cross-selling algorithm 32 (FIG. 4).
  • the software optimizes 92 the unit cost of each of the products by considering a mailing campaign for each product independently and optimizes 92 the quantity of marketing literature that would be mailed out if that product was the only product.
  • statistical measurements such as correlations between the scores of products are determined 94. Those correlations are used to make adjustments 96 from the unit costs that have been assumed for each product independently, to arrive at costs that are appropriate when considering all of the products together.
  • the correlations between the product scores provide information about the amount of overlap in product recommendations that could be expected if products were being assigned independently to each prospect. Since there may be a limit on the number of products to be targeted to a single prospect, any excess overlap may render invalid the unit costs estimated by considering each product independently.
  • the correlations are used to adjust the estimated volumes for each product, and in turn, their unit costs.
  • An example of the budget constraint feature is given in TABLE 2.
  • One simple heuristic method for using correlations is to compute each product score's correlation with the average product score. This produces a vector instead of an entire correlation matrix. Whether a vector or an entire matrix is generated, the overlap of each product is estimated and its unit volume corrected. Products that are highly positively correlated will require the largest corrections. In the simplest case, the amount of correction could be a linear function of the correlation. In more sophisticated embodiments, additional factors may be taken into consideration, such as the magnitude of the difference in unit cost levels of each product. For example, in some cases it may be most profitable to assign all overlapping prospects to a single product if that product is near the threshold for a volume discount.
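  • The vector heuristic above can be sketched as follows. The function names, the `strength` parameter, and the specific linear correction are hypothetical choices for illustration; the text states only that the correction could be a linear function of the correlation.

```python
import math
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_with_average(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Correlate each product's scores with the per-prospect average score.

    scores maps a product name to its scores, one per prospect, in the
    same prospect order for every product. The result is a vector (one
    value per product) rather than a full correlation matrix.
    """
    products = list(scores)
    n = len(scores[products[0]])
    average = [sum(scores[p][i] for p in products) / len(products)
               for i in range(n)]
    return {p: pearson(scores[p], average) for p in products}


def corrected_volume(independent_volume: float, correlation: float,
                     strength: float = 0.5) -> float:
    """Assumed linear correction: reduce the independently estimated
    volume in proportion to the positive correlation."""
    return independent_volume * (1.0 - strength * max(correlation, 0.0))


scores = {"A": [0.1, 0.2, 0.3, 0.4], "B": [0.1, 0.2, 0.3, 0.4],
          "C": [0.4, 0.3, 0.2, 0.1]}
for product, corr in correlation_with_average(scores).items():
    print(product, round(corr, 3), corrected_volume(1000, corr))
```

  • Products whose scores move with the average overlap most with the other products' recommendations, so they receive the largest volume reduction; negatively correlated products are left unchanged in this sketch.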
  • the budget feature 90 makes assignments of prospects to product mailing lists based on the costs determined in the first pass and accumulates actual quantities of prospects to be placed on each of the lists.
  • the budget feature makes corrections to the costs based on actual quantities, often applying 98 a maximum budget constraint to the quantities. That is, a plan that was within budget when the quantity of each product to be mailed was fixed independently of the other products may be over budget once the costs are corrected based on actual quantities.
  • the assignments can be binned by expected profit.
  • the following algorithm can be executed:
  • the budget process can optimize the maximum number of people to mail promotions to.
  • the budget process 90 determines a cutoff point or threshold for each model such that any prospect ranked above that threshold is worthwhile to send promotional information.
  • the actual number of people that are marketed to may be less than the thresholds set for each model because there may be multiple entries.
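  • A minimal sketch of why the number of people contacted can be smaller than the sum of the per-model thresholds, assuming each model supplies a list of prospect identifiers ranked best first and the threshold is expressed as a count. The function names and data layout are hypothetical.

```python
from typing import Dict, List, Set


def prospects_above_cutoff(ranked: Dict[str, List[str]],
                           cutoff: Dict[str, int]) -> Dict[str, List[str]]:
    """Keep, for each product, the prospects ranked above its cutoff.

    ranked maps a product to prospect ids sorted best first;
    cutoff maps a product to the number of prospects worth contacting.
    """
    return {product: ids[:cutoff[product]] for product, ids in ranked.items()}


def distinct_prospects(selected: Dict[str, List[str]]) -> Set[str]:
    """Union of all selected prospects; a prospect appearing on several
    product lists is counted once."""
    return set().union(*selected.values())


ranked = {"A": ["p1", "p2", "p3"], "B": ["p2", "p4", "p1"]}
selected = prospects_above_cutoff(ranked, {"A": 2, "B": 2})
print(sum(len(ids) for ids in selected.values()), len(distinct_prospects(selected)))
```

  • In this example the thresholds add up to four list entries, but only three distinct people are contacted because one prospect appears on both lists.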

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a technique for determining the likelihood that a prospect will purchase a product. The technique scores a plurality of prospects on a plurality of models, each of which models the likelihood that the prospects will purchase a given product. The resulting model scores for each product are used to derive the probability that the prospect will purchase each product. The invention also relates to a budgeting process that determines an optimal number of prospects to contact in a marketing campaign.
PCT/US2000/040345 1999-07-16 2000-07-11 Cross-selling in database mining WO2001006405A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU69539/00A AU6953900A (en) 1999-07-16 2000-07-11 Cross-selling in database mining

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35619199A 1999-07-16 1999-07-16
US09/356,191 1999-07-16

Publications (1)

Publication Number Publication Date
WO2001006405A2 (fr) 2001-01-25

Family

ID=23400507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/040345 WO2001006405A2 (fr) 1999-07-16 2000-07-11 Cross-selling in database mining

Country Status (2)

Country Link
AU (1) AU6953900A (fr)
WO (1) WO2001006405A2 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7035811B2 (en) * 2001-01-23 2006-04-25 Intimate Brands, Inc. System and method for composite customer segmentation
US7433841B2 (en) 2001-03-23 2008-10-07 Hewlett-Packard Development Company, L.P. Method and data structure for participation in multiple negotiations
US20130218805A1 (en) * 2012-02-20 2013-08-22 Ameriprise Financial, Inc. Opportunity list engine
US10552851B2 (en) * 2012-02-20 2020-02-04 Ameriprise Financial, Inc. Opportunity list engine
US11392965B2 (en) 2012-02-20 2022-07-19 Ameriprise Financial, Inc. Opportunity list engine
US20150112802A1 (en) * 2013-10-23 2015-04-23 Mastercard International Incorporated Method and system for delivering targeted messages based on tracked transaction data
CN109492191A (zh) * 2018-09-17 2019-03-19 平安科技(深圳)有限公司 计算投保概率的方法、装置、计算机设备和存储介质
CN109492191B (zh) * 2018-09-17 2024-05-07 平安科技(深圳)有限公司 计算投保概率的方法、装置、计算机设备和存储介质

Also Published As

Publication number Publication date
AU6953900A (en) 2001-02-05

Similar Documents

Publication Publication Date Title
US11055640B2 (en) Generating product decisions
US11769194B2 (en) Method and system for presenting items in online environment based on previous item selections
US6542894B1 (en) Execution of multiple models using data segmentation
CA2938561C (fr) Systeme permettant d'individualiser une interaction client
US6317752B1 (en) Version testing in database mining
US8650079B2 (en) Promotion planning system
US8620746B2 (en) Scoring quality of traffic to network sites
US20020127529A1 (en) Prediction model creation, evaluation, and training
US20100228604A1 (en) System and Method for Generating Demand Groups
JP2000357204A (ja) 消費者の財政的挙動の予測モデル化方法及びシステム
CN116385048B (zh) 一种农产品智慧营销方法和系统
US20100100420A1 (en) Target marketing method and system
CN114493361A (zh) 一种商品推荐算法的有效性评估方法和装置
WO2001006405A2 (fr) Cross-selling in database mining
JP5304429B2 (ja) 顧客状態推定システム、顧客状態推定方法および顧客状態推定プログラム
CN113449818A (zh) 一种基于用户行为特征的优惠券额度动态调节方法
CN117934087B (zh) 基于用户交互数据的广告智能投放方法及系统
CN118037396A (zh) 商品推荐方法、装置、电子设备及存储介质
US20200211058A1 (en) System and Method for Probabilistic Matching of Multiple Event Logs to Single Real-World Ad Serve Event
CN114626888A (zh) 网络购物平台的恶意刷单行为预测方法
CN115796959A (zh) 基于数据采集和分析的广告投放效果检测方法
Memari et al. INORM: A new approach in e-commerce recommendation
Wang Exploring online brand choice at the SKU level: the effects of internet-specific attributes
CN114693128A (zh) 一种终端扫码数据的质量评价方法及系统
Osborne Online Appendix for

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP