Disclosure of Invention
The invention aims to provide a method and a system for intelligent marketing that can replace manual screening, automatically screen out variables with a high degree of monotonicity or U shape, input these variables into an intelligent marketing model to obtain target customers, and push commodities to the target customers, thereby improving marketing efficiency.
In order to achieve the above purpose, the invention provides the following technical scheme:
a method for smart marketing, comprising:
obtaining variables, drawing a variable trend curve, and obtaining the variables with monotonicity characteristics or U-shaped characteristics based on the variable trend curve and a preset trend identification rule;
training an intelligent marketing model by using the variable with monotonicity characteristics or U-shaped characteristics, and screening by using stepwise regression operation to obtain one or more target variables suitable for the intelligent marketing model;
and acquiring a target customer by using the target variable and the intelligent marketing model, and pushing the commodity to the target customer.
Preferably, the variables are subjected to binning to obtain a plurality of division points, and a variable trend curve is drawn based on the division points.
Further, the method for acquiring the variable with the monotonicity characteristic based on the variable trend curve and the preset trend identification rule comprises the following steps:
calculating a total variation TV of a variable based on the variable trend curve, wherein the total variation TV is the sum of the amplitudes between each pair of adjacent division points of the variable trend curve;
calculating the absolute value of the difference between the left end point and the right end point of the variable trend curve and recording it as a first difference AD_1, and obtaining a monotonicity index M_index of the variable from the total variation TV and the first difference AD_1, wherein M_index = TV/AD_1;
And screening out variables with monotonicity characteristics based on a preset monotonicity index threshold value.
Preferably, the method for obtaining the variable with the U-shaped feature based on the variable trend curve and the preset trend identification rule includes:
acquiring the maximum value and the minimum value of a variable trend curve except for a left end point and a right end point;
calculating the sum of the absolute values of the differences between the minimum value and each of the left and right end points of the variable trend curve and recording it as a second difference AD_2, and obtaining a positive U-shaped index U_index_1 of the variable from the total variation TV and the second difference AD_2, wherein U_index_1 = TV/AD_2; and/or
calculating the sum of the absolute values of the differences between the maximum value and each of the left and right end points of the variable trend curve and recording it as a third difference AD_3, and obtaining an inverted U-shaped index U_index_2 of the variable from the total variation TV and the third difference AD_3, wherein U_index_2 = TV/AD_3;
And screening out the variables with the U-shaped characteristics based on the preset positive U-shaped index threshold and the preset reverse U-shaped index threshold.
Preferably, the monotonicity index threshold, the positive U-shaped index threshold and the inverted U-shaped index threshold have a value range of [1,1.5].
Further, when the maximum value of the variable trend curve excluding the left and right end points is smaller than the values of both the left end point and the right end point, the inverted U-shaped index U_index_2 is not calculated;
when the minimum value of the variable trend curve excluding the left and right end points is larger than the values of both the left end point and the right end point, the positive U-shaped index U_index_1 is not calculated.
Preferably, the method for training the intelligent marketing model by using the variables with the monotonicity characteristic or the U-shaped characteristic and obtaining one or more target variables applicable to the intelligent marketing model by using stepwise regression operation screening comprises the following steps:
carrying out data preprocessing on the variable with the monotonicity characteristic or the U-shaped characteristic;
screening out important variables from the preprocessed variables;
and inputting the important variables into an intelligent marketing model, and screening by using stepwise regression operation to obtain one or more target variables suitable for the intelligent marketing model.
Preferably, the data preprocessing comprises missing value filling, outlier processing, and one-hot encoding for the categorical variables.
Preferably, the method for screening out important variables from the preprocessed variables comprises the following steps:
respectively calculating the IV value and the PSI value of the variable;
and screening out important variables with the IV value larger than the IV threshold value and the PSI value smaller than the PSI threshold value.
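The IV/PSI screening step can be illustrated with a minimal sketch. The text does not define IV or PSI, so the code assumes their conventional definitions (information value over per-bin positive/negative shares, and population stability index over per-bin shares in two samples). The function names, the `eps` smoothing constant, and the thresholds passed by the caller are hypothetical.

```python
import numpy as np

def _shares(counts, eps):
    """Convert per-bin counts into shares, clipped away from zero (assumption)."""
    c = np.asarray(counts, dtype=float)
    return np.clip(c / c.sum(), eps, None)

def information_value(pos_counts, neg_counts, eps=1e-6):
    """Conventional IV: sum over bins of (p - q) * ln(p / q)."""
    p, q = _shares(pos_counts, eps), _shares(neg_counts, eps)
    return float(np.sum((p - q) * np.log(p / q)))

def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """Conventional PSI: sum over bins of (a - e) * ln(a / e)."""
    e, a = _shares(expected_counts, eps), _shares(actual_counts, eps)
    return float(np.sum((a - e) * np.log(a / e)))

def select_important(stats, iv_threshold, psi_threshold):
    """Keep variables whose IV exceeds iv_threshold and whose PSI is below psi_threshold.

    stats maps a variable name to an (iv, psi) pair.
    """
    return [name for name, (iv, psi) in stats.items()
            if iv > iv_threshold and psi < psi_threshold]
```

The thresholds are left as required arguments because the text does not state their values.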
A system for smart marketing, comprising:
the system comprises a first variable screening module, a second variable screening module and a marketing pushing module; wherein,
the first variable screening module is used for acquiring variables, drawing a variable trend curve and acquiring the variables with monotonicity characteristics or U-shaped characteristics based on the variable trend curve and a preset trend identification rule;
the second variable screening module is used for training an intelligent marketing model by using the variables with the monotonicity characteristics or the U-shaped characteristics, and screening one or more target variables suitable for the intelligent marketing model by using stepwise regression operation;
and the marketing pushing module is used for acquiring a target customer by using the target variable and the intelligent marketing model and pushing the commodity to the target customer.
Compared with the prior art, the method and the system for intelligent marketing provided by the invention have the following beneficial effects:
the method for intelligent marketing provided by the invention uses machine learning to replace the traditional step of manually checking the trend graph. It automatically screens out, from a plurality of variables, those with a high degree of monotonicity or U shape, inputs the screened variables into the intelligent marketing model to obtain the target customer, and pushes commodities to the target customer. Because its identification of the trend curve type and of the degree of monotonicity and U shape is consistent with manual visual judgment, the method can replace manual work and effectively improves working efficiency while preserving identification quality.
According to the system for intelligent marketing, which is provided by the invention, by adopting the method for intelligent marketing, the variables can be automatically screened, the target customer can be obtained according to the screened variables, the commodities are pushed to the target customer, and the working efficiency and the success rate of customer purchase after marketing are effectively improved.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments are described in detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Example one
Referring to fig. 1, a method for smart marketing, comprising:
obtaining variables, drawing a variable trend curve, and obtaining the variables with monotonicity characteristics or U-shaped characteristics based on the variable trend curve and a preset trend identification rule;
training an intelligent marketing model by using the variable with the monotonicity characteristic or the U-shaped characteristic, and screening by using stepwise regression operation to obtain one or more target variables suitable for the intelligent marketing model;
and acquiring a target customer by using the target variable and the intelligent marketing model, and pushing the commodity to the target customer.
According to the method for intelligent marketing, machine learning is utilized, the traditional step of manually checking the trend graph is replaced, variables with good monotonicity and U-shaped degree in a plurality of variables can be automatically screened out, the screened variables are input into the intelligent marketing model to obtain the target customer, the commodity is pushed to the target customer, the success rate of commodity marketing is improved, meanwhile, automatic model training is achieved, and the working efficiency is effectively improved.
Referring to fig. 2, in the embodiment of the present invention, a method for obtaining a variable with monotonicity and U-shaped characteristics based on a variable trend curve and a preset trend recognition rule includes:
The variables are binned to obtain a plurality of division points, and a variable trend curve of the positive sample rate (Target_Rate) with respect to the variable is drawn based on the division points. The variables here are continuous variables; discrete variables are not considered in the present invention.
Assume that, after a continuous variable is binned, the variable trend curve has N points on the x-axis, denoted in order P_1, P_2, …, P_N, and that the positive sample rates at these N points are Target_Rate_1, Target_Rate_2, …, Target_Rate_N respectively. The minimum positive sample rate (Target_Rate_min) and the maximum positive sample rate (Target_Rate_max) are then obtained over the points other than the left and right end points (Target_Rate_1 and Target_Rate_N).
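The binning step can be sketched as follows. The text does not specify a binning scheme, so equal-frequency (quantile) binning is an assumption here, and the function name `trend_curve` and the parameter `n_bins` are hypothetical.

```python
import numpy as np

def trend_curve(x, y, n_bins=5):
    """Return the positive sample rate in each bin of a continuous variable x.

    y holds 0/1 labels. Bins are equal-frequency; this is an assumption, since
    the text only says that the variable is binned.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Interior division points at equally spaced quantiles of x.
    cuts = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    # Bin index in 0..n_bins-1 for every sample.
    bin_ids = np.searchsorted(cuts, x, side="right")
    # Positive sample rate per non-empty bin, ordered from left to right.
    rates = [y[bin_ids == b].mean() for b in range(n_bins) if np.any(bin_ids == b)]
    return np.array(rates)
```

The returned array plays the role of Target_Rate_1, …, Target_Rate_N in the calculations that follow.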
Calculate the total variation (TV) of the variable trend curve, which reflects the degree of fluctuation of the curve. It is obtained as the sum of the amplitudes between each pair of adjacent points of the curve, i.e., TV = Σ_{i=1}^{N-1} |Target_Rate_{i+1} - Target_Rate_i|.
Calculate the first difference (AD_1) of the variable trend curve, which is the absolute value of the difference between the positive sample rates at the left and right end points of the curve, i.e., AD_1 = |Target_Rate_N - Target_Rate_1|.
Calculate the second difference (AD_2) of the variable trend curve, which is the sum of the absolute values of the differences between the positive sample rates at the left and right end points and the minimum positive sample rate excluding those end points, i.e., AD_2 = |Target_Rate_min - Target_Rate_1| + |Target_Rate_N - Target_Rate_min|.
Calculate the third difference (AD_3) of the variable trend curve, which is the sum of the absolute values of the differences between the positive sample rates at the left and right end points and the maximum positive sample rate excluding those end points, i.e., AD_3 = |Target_Rate_max - Target_Rate_1| + |Target_Rate_N - Target_Rate_max|.
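The four quantities defined above can be computed directly from the list of per-bin positive sample rates. The following is a minimal sketch; the function name `curve_statistics` and the dictionary keys are hypothetical, and the curve is assumed to have at least one interior point.

```python
import numpy as np

def curve_statistics(rates):
    """Compute TV, AD_1, AD_2 and AD_3 for a trend curve given as a sequence of rates."""
    r = np.asarray(rates, dtype=float)
    if r.size < 3:
        raise ValueError("the curve needs at least one point between the end points")
    left, right = r[0], r[-1]
    interior = r[1:-1]                      # points other than the two end points
    r_min, r_max = interior.min(), interior.max()
    tv = np.abs(np.diff(r)).sum()           # sum of |rate[i+1] - rate[i]|
    ad1 = abs(right - left)
    ad2 = abs(r_min - left) + abs(right - r_min)
    ad3 = abs(r_max - left) + abs(right - r_max)
    return {"TV": float(tv), "AD1": float(ad1), "AD2": float(ad2), "AD3": float(ad3)}
```

For the curve 0.5, 0.2, 0.1, 0.3, 0.6 this gives TV = 0.9, AD_1 = 0.1, AD_2 = 0.9 and AD_3 = 0.5, up to floating-point rounding.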
Calculate the monotonicity degree index (M_index) of the variable trend curve as the ratio of the total variation (TV) to the first difference (AD_1), i.e., M_index = TV/AD_1. If the variable trend curve is completely monotonic, TV/AD_1 = 1; if it is not monotonic, TV/AD_1 > 1. The closer the ratio is to 1, the higher the degree of monotonicity of the variable trend curve; the further it is from 1, the lower the degree of monotonicity. The index therefore takes values M_index ∈ [1, +∞): M_index = 1 means that the variable trend curve is completely monotonic, and a larger M_index means a lower degree of monotonicity.
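A short sketch makes the behaviour of the index concrete. The function name `m_index` is hypothetical, and returning infinity when the two end points coincide (AD_1 = 0) is an assumption, since the text does not address that case.

```python
import math

def m_index(rates):
    """Monotonicity index TV / AD_1 of a trend curve given as a list of rates."""
    tv = sum(abs(b - a) for a, b in zip(rates, rates[1:]))
    ad1 = abs(rates[-1] - rates[0])
    # Assumption: a curve whose end points coincide is treated as "not monotonic at all".
    return tv / ad1 if ad1 > 0 else math.inf
```

For the monotonic curve 0.1, 0.2, 0.4, 0.7 the index is 1 up to rounding; for 0.1, 0.3, 0.2, 0.4, which reverses direction once, TV = 0.5 and AD_1 = 0.3, so the index is about 1.67.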
Calculate the U-shape degree index of the variable trend curve, which may comprise a positive U-shape degree index (U_index_1) and an inverted U-shape degree index (U_index_2). A U shape and a V shape are treated as substantially the same here; both are analysis-model indicators used to judge and predict the current state and development trend of an object in a particular development process.
If and only if condition one (Target_Rate_min < Target_Rate_1 and Target_Rate_min < Target_Rate_N) holds, the variable trend curve can be identified as positive U-shaped, and the positive U-shape degree index (U_index_1) is obtained as the ratio of the total variation (TV) to the second difference (AD_2), i.e., U_index_1 = TV/AD_2. If the variable trend curve is strictly positive U-shaped, TV/AD_2 = 1; if it is not, TV/AD_2 > 1. The closer the ratio is to 1, the higher the degree of positive U shape of the curve; the further it is from 1, the lower. The index therefore takes values U_index_1 ∈ [1, +∞): U_index_1 = 1 means that the curve is strictly positive U-shaped, and a larger U_index_1 means a lower degree of positive U shape.
If and only if condition two (Target_Rate_max > Target_Rate_1 and Target_Rate_max > Target_Rate_N) holds, the variable trend curve can be identified as inverted U-shaped, and the inverted U-shape degree index (U_index_2) is obtained as the ratio of the total variation (TV) to the third difference (AD_3), i.e., U_index_2 = TV/AD_3. If the variable trend curve is strictly inverted U-shaped, TV/AD_3 = 1; if it is not, TV/AD_3 > 1. The closer the ratio is to 1, the higher the degree of inverted U shape of the curve; the further it is from 1, the lower. The index therefore takes values U_index_2 ∈ [1, +∞): U_index_2 = 1 means that the curve is strictly inverted U-shaped, and a larger U_index_2 means a lower degree of inverted U shape.
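The two U-shape indices and their preconditions can be sketched as below. The function name `u_indices` is hypothetical, and `None` is used here to signal that an index is not calculated because its condition does not hold.

```python
def u_indices(rates):
    """Return (U_index_1, U_index_2); an entry is None when its condition fails."""
    if len(rates) < 3:
        raise ValueError("the curve needs at least one point between the end points")
    left, right = rates[0], rates[-1]
    interior = rates[1:-1]
    r_min, r_max = min(interior), max(interior)
    tv = sum(abs(b - a) for a, b in zip(rates, rates[1:]))
    u1 = u2 = None
    # Condition one: the interior minimum lies below both end points.
    if r_min < left and r_min < right:
        u1 = tv / (abs(r_min - left) + abs(right - r_min))   # TV / AD_2
    # Condition two: the interior maximum lies above both end points.
    if r_max > left and r_max > right:
        u2 = tv / (abs(r_max - left) + abs(right - r_max))   # TV / AD_3
    return u1, u2
```

Whenever a condition holds, the corresponding denominator is strictly positive, so no division by zero can occur.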
When the U-shape degree index is calculated, a variable trend curve may qualify as both positive U-shaped and inverted U-shaped. In that case both the positive and the inverted U-shape degree indices are calculated; the smaller of the two determines whether the curve is classed as positive U-shaped or inverted U-shaped, and that smaller value is taken as the U-shape degree index of the curve.
Variables with monotonicity characteristics or U-shaped characteristics are screened out based on a preset monotonicity index threshold, positive U-shaped index threshold, or inverted U-shaped index threshold. In this embodiment, each of these thresholds takes a value in [1, 1.5]. When the variable trend curves are largely monotonic and many variables are available, the threshold may be lowered; otherwise, it may be raised.
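Threshold screening can be sketched as follows, assuming that a variable is kept when its index does not exceed the threshold (the text gives the threshold range but not the comparison operator). The function name `screen_variables` and the default thresholds of 1.5, the upper end of the stated range, are illustrative choices.

```python
def screen_variables(indices, m_threshold=1.5, u_threshold=1.5):
    """Keep variables whose monotonicity or U-shape index is within its threshold.

    indices maps a variable name to (m_index, u_index); either entry may be
    None when that index was not calculated.
    """
    kept = []
    for name, (m_index, u_index) in indices.items():
        monotonic = m_index is not None and m_index <= m_threshold
        u_shaped = u_index is not None and u_index <= u_threshold
        if monotonic or u_shaped:
            kept.append(name)
    return kept
```

With the hypothetical input `{"age": (1.0, None), "income": (4.2, 1.1), "visits": (3.0, 2.5)}`, the variables `age` and `income` are kept and `visits` is dropped.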
Further, the monotonicity degree index M_index and the U-shape degree index U_index are used to judge the type of the variable trend curve. The judgment involves the positive sample rates at the left and right end points (Target_Rate_1, Target_Rate_N) and the minimum and maximum positive sample rates excluding the left and right end points (Target_Rate_min, Target_Rate_max).
When the positive sample rate at the left end point is greater than or equal to the positive sample rate at the right end point (Target_Rate_1 ≥ Target_Rate_N), the magnitude relationship between the positive sample rates at the end points and the extreme values excluding the end points falls into the following 6 cases:
Case A1: Target_Rate_max ∈ (-∞, Target_Rate_N);
Case A2: Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1);
Case A3: Target_Rate_max ∈ [Target_Rate_1, +∞);
Case B1: Target_Rate_min ∈ (-∞, Target_Rate_N);
Case B2: Target_Rate_min ∈ [Target_Rate_N, Target_Rate_1);
Case B3: Target_Rate_min ∈ [Target_Rate_1, +∞).
These 6 cases give 9 combinations, namely:
Case A1-B1: Target_Rate_max ∈ (-∞, Target_Rate_N) and Target_Rate_min ∈ (-∞, Target_Rate_N);
Case A1-B2: Target_Rate_max ∈ (-∞, Target_Rate_N) and Target_Rate_min ∈ [Target_Rate_N, Target_Rate_1);
Case A1-B3: Target_Rate_max ∈ (-∞, Target_Rate_N) and Target_Rate_min ∈ [Target_Rate_1, +∞);
Case A2-B1: Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1) and Target_Rate_min ∈ (-∞, Target_Rate_N);
Case A2-B2: Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1) and Target_Rate_min ∈ [Target_Rate_N, Target_Rate_1);
Case A2-B3: Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1) and Target_Rate_min ∈ [Target_Rate_1, +∞);
Case A3-B1: Target_Rate_max ∈ [Target_Rate_1, +∞) and Target_Rate_min ∈ (-∞, Target_Rate_N);
Case A3-B2: Target_Rate_max ∈ [Target_Rate_1, +∞) and Target_Rate_min ∈ [Target_Rate_N, Target_Rate_1);
Case A3-B3: Target_Rate_max ∈ [Target_Rate_1, +∞) and Target_Rate_min ∈ [Target_Rate_1, +∞).
In case A1-B1, since Target_Rate_max ∈ (-∞, Target_Rate_N), condition two for identification as inverted U-shaped is not met, so the variable trend curve cannot be inverted U-shaped and the inverted U-shaped index need not be calculated. The curve may be judged positive U-shaped, so the positive U-shaped index must be calculated; it may also be judged monotonically decreasing, so the monotonicity index must be calculated as well. After the positive U-shaped index U_index_1 and the monotonicity index M_index are calculated, the two are compared: if U_index_1 < M_index, the variable trend curve is judged positive U-shaped; otherwise, it is judged monotonically decreasing.
In case A1-B2, Target_Rate_min > Target_Rate_max would be required, which is impossible, so this case does not exist.
In case A1-B3, Target_Rate_min > Target_Rate_max would be required, which is impossible, so this case does not exist.
In case A2-B1, since Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1), condition two for identification as inverted U-shaped is not met, so the variable trend curve cannot be inverted U-shaped and the inverted U-shaped index need not be calculated. The curve may be judged positive U-shaped, so the positive U-shaped index must be calculated; it may also be judged monotonically decreasing, so the monotonicity index must be calculated as well. After the positive U-shaped index U_index_1 and the monotonicity index M_index are calculated, the two are compared: if U_index_1 < M_index, the variable trend curve is judged positive U-shaped; otherwise, it is judged monotonically decreasing.
In case A2-B2, since Target_Rate_max ∈ [Target_Rate_N, Target_Rate_1), condition two for identification as inverted U-shaped is not met, so the variable trend curve cannot be inverted U-shaped; and since Target_Rate_min ∈ [Target_Rate_N, Target_Rate_1), condition one for identification as positive U-shaped is not met, so the curve cannot be positive U-shaped either. The variable trend curve can therefore only be judged monotonically decreasing, and only the monotonicity index M_index needs to be calculated.
In case A2-B3, Target_Rate_min > Target_Rate_max would be required, which is impossible, so this case does not exist.
In case A3-B1, the variable trend curve satisfies both condition one for identification as positive U-shaped and condition two for identification as inverted U-shaped, so it may be identified as either. Both the positive U-shaped index U_index_1 and the inverted U-shaped index U_index_2 therefore need to be calculated. The curve may also be identified as monotonically decreasing, so the monotonicity index M_index needs to be calculated as well. After the three indices are calculated, they are compared, the smallest is selected, and the variable trend curve is judged to be of the corresponding type.
In case A3-B2, the variable trend curve does not satisfy condition one for identification as positive U-shaped but satisfies condition two for identification as inverted U-shaped. The inverted U-shaped index U_index_2 and the monotonicity index M_index are therefore calculated and compared, the smaller is selected, and the variable trend curve is judged to be of the corresponding type.
In case A3-B3, the variable trend curve does not satisfy condition one for identification as positive U-shaped but satisfies condition two for identification as inverted U-shaped. The inverted U-shaped index U_index_2 and the monotonicity index M_index are therefore calculated and compared, the smaller is selected, and the variable trend curve is judged to be of the corresponding type.
When the positive sample rate at the left end point is smaller than the positive sample rate at the right end point (Target_Rate_1 < Target_Rate_N), the analysis is analogous to the above case, with the roles of the two end points exchanged and "monotonically decreasing" replaced by "monotonically increasing".
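The case analysis above reduces to a compact rule: compute the monotonicity index, compute each U-shape index only when its condition holds, and pick the smallest. The following sketch expresses that reading; the function name `classify_trend` and the label strings are hypothetical. Ties are resolved in favour of the monotonic label, matching the "otherwise" branch of case A1-B1, while a tie between the two U-shape indices is resolved arbitrarily in favour of the positive U shape, which is an assumption.

```python
import math

def classify_trend(rates):
    """Return (label, index) for a trend curve given as a list of rates."""
    if len(rates) < 3:
        raise ValueError("the curve needs at least one point between the end points")
    left, right = rates[0], rates[-1]
    interior = rates[1:-1]
    r_min, r_max = min(interior), max(interior)
    tv = sum(abs(b - a) for a, b in zip(rates, rates[1:]))
    ad1 = abs(right - left)
    # Monotonic candidate; the direction follows from the end points.
    mono_label = "decreasing" if left >= right else "increasing"
    candidates = [(mono_label, tv / ad1 if ad1 > 0 else math.inf)]
    # Condition one: positive U shape is possible.
    if r_min < left and r_min < right:
        candidates.append(("positive_u", tv / (abs(r_min - left) + abs(right - r_min))))
    # Condition two: inverted U shape is possible.
    if r_max > left and r_max > right:
        candidates.append(("inverted_u", tv / (abs(r_max - left) + abs(right - r_max))))
    # min() keeps the first minimal entry, so ties favour the earlier candidate.
    return min(candidates, key=lambda c: c[1])
```

For example, the curve 0.5, 0.2, 0.1, 0.3, 0.6 has M_index = 9 and U_index_1 = 1 (up to rounding), so it is labelled positive U-shaped.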
Furthermore, the monotonicity degree index M_index and the U-shape degree index are used to rank the variables by degree of monotonicity and degree of U shape. Specifically, the monotonic variables are grouped into one class (monotonically increasing and monotonically decreasing), the U-shaped variables are grouped into another class (positive U-shaped and inverted U-shaped), and within each class the variables are arranged in ascending order of the corresponding index, so that they are ranked from the highest to the lowest degree of monotonicity and from the highest to the lowest degree of U shape.
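The ranking step can be sketched as a pair of sorted lists. The function name `rank_variables` and the label strings are hypothetical; the input is assumed to map each variable name to the (label, index) pair produced by a prior classification step.

```python
def rank_variables(results):
    """Split variables into monotonic and U-shaped classes, each sorted by ascending index.

    results maps a variable name to (label, index), where label is one of
    "increasing", "decreasing", "positive_u" or "inverted_u" (hypothetical names).
    """
    mono_labels = {"increasing", "decreasing"}
    monotonic = sorted((idx, name) for name, (label, idx) in results.items()
                       if label in mono_labels)
    u_shaped = sorted((idx, name) for name, (label, idx) in results.items()
                      if label not in mono_labels)
    # A smaller index means a higher degree, so ascending order ranks best first.
    return [name for _, name in monotonic], [name for _, name in u_shaped]
```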
The method comprises the steps of training an intelligent marketing model by using variables with monotonicity characteristics or U-shaped characteristics, and screening one or more target variables suitable for the intelligent marketing model by using stepwise regression operation, wherein the method comprises the following steps:
carrying out data preprocessing on the variables with monotonicity characteristics or U-shaped characteristics, wherein the data preprocessing comprises missing value filling, outlier processing, and one-hot encoding for categorical variables;
screening out important variables from the preprocessed variables, which comprises: respectively calculating the IV value and the PSI value of each variable, and screening out the important variables whose IV value is greater than the IV threshold and whose PSI value is less than the PSI threshold;
and inputting the important variables into the intelligent marketing model, and screening by using stepwise regression operation to obtain one or more target variables suitable for the intelligent marketing model.
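The text does not say which stepwise variant, model family, or selection criterion is used. As a rough illustration only, the sketch below performs forward stepwise selection with an ordinary least-squares fit and the AIC as the stopping criterion; all three choices, and the function names, are assumptions standing in for whatever the intelligent marketing model actually uses.

```python
import numpy as np

def aic_ols(X, y):
    """AIC of a least-squares fit of y on an intercept plus the columns of X."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(max(rss, 1e-12) / n) + 2 * A.shape[1]

def forward_stepwise(X, y, names):
    """Add one variable at a time while the AIC keeps decreasing."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    best = aic_ols(X[:, :0], y)             # intercept-only model
    while True:
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        if not remaining:
            break
        # Score every candidate extension of the current variable set.
        score, j = min((aic_ols(X[:, selected + [j]], y), j) for j in remaining)
        if score >= best:                   # no candidate improves the criterion
            break
        best = score
        selected.append(j)
    return [names[j] for j in selected]
```

A production model for a binary response would more plausibly fit a logistic regression at each step; the selection loop itself would stay the same.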
The method provided by the invention is applicable to response models for intelligent marketing in general. Take a customer-acquisition ("pull-new") marketing model for a credit product as an example; the model comprises variables in dimensions such as basic user information, browsing behavior, and purchasing behavior. Since the final goal of customer acquisition for a cash credit product is credit approval, whether credit is approved is used as the criterion for labeling positive and negative samples (the y label). Empirically, more than 90% of users apply for a credit limit and pass the A-card risk-control review within 7 days after marketing, which is sufficient to judge whether the customer-acquisition marketing was successful, so the performance period is set to 7 days. If a user is granted credit within 7 days after marketing, the user is marked as a positive sample; otherwise, the user is marked as a negative sample. The observation point of the training set is a given marketing day, and the observation point of the test set is a marketing day after the observation point of the training set.
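The labeling rule can be sketched as follows. The function name `label_sample` and its parameters are hypothetical, and counting day 7 itself as inside the window is an assumption, since the text only says "within 7 days".

```python
from datetime import date

def label_sample(marketing_day, approval_day, window_days=7):
    """Return 1 if credit was approved within the performance period, else 0.

    approval_day is None when the user was never approved.
    """
    if approval_day is None:
        return 0
    elapsed = (approval_day - marketing_day).days
    # Assumption: the window is inclusive of its last day.
    return int(0 <= elapsed <= window_days)
```

For a marketing day of 1 March, an approval on 5 March yields a positive sample, while an approval on 10 March or no approval at all yields a negative sample.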
Those skilled in the art performed an A/B test comparing the screening method of the present invention with manual screening. With traditional manual screening, 498 of the 1061 variables retained from the previous step were removed and 563 were kept, which took 55 minutes; after the kept variables were screened again by stepwise regression, 14 variables entered the model, with an AUC (area under the curve) of 0.91 on the training set and 0.9 on the test set. With the automatic screening of the invention, 516 of the 1061 variables retained from the previous step were removed and 545 were kept, which took 3 minutes; after the kept variables were screened again by stepwise regression, 15 variables entered the model, with an AUC of 0.92 on the training set and 0.91 on the test set. The invention thus improves working efficiency, and the model performance is slightly better than with manual screening. Moreover, manual screening depends on individual judgment and may therefore identify trends inaccurately, whereas the automatic identification algorithm judges the degree of monotonicity and the degree of U shape with quantitative indices, so its identification result is more accurate.
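For reference, the AUC figures quoted above are of the kind that can be computed from model scores and 0/1 labels with the standard pairwise (Mann-Whitney) formulation sketched below. The function name `auc` is hypothetical, and this is not necessarily how the reported values were obtained.

```python
def auc(labels, scores):
    """Probability that a random positive sample scores above a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the sample size, which is acceptable for an illustration but not for large marketing data sets.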
Example two
A system for intelligent marketing comprises a first variable screening module, a second variable screening module and a marketing pushing module, wherein the first variable screening module is used for obtaining variables, drawing a variable trend curve and obtaining the variables with monotonicity characteristics or U-shaped characteristics based on the variable trend curve and a preset trend identification rule; the second variable screening module is used for training the intelligent marketing model by using variables with monotonicity characteristics or U-shaped characteristics, and screening one or more target variables suitable for the intelligent marketing model by using stepwise regression operation; and the marketing pushing module is used for acquiring the target customer by using the target variable and the intelligent marketing model and pushing the commodity to the target customer.
By adopting the method for intelligent marketing in the first embodiment, the traditional step of manually checking the trend chart is replaced by machine learning, the system for intelligent marketing provided by the invention can automatically screen out the variables with better monotonicity and U-shaped degree in a plurality of variables, input the screened variables into the intelligent marketing model to obtain the target customer, and push the commodities to the target customer, thereby effectively improving the working efficiency and the success rate of purchasing the customers after marketing. Compared with the prior art, the beneficial effect of the system for intelligent marketing provided by the embodiment of the invention is the same as that of the method for intelligent marketing provided by the first embodiment, and other technical features of the system for intelligent marketing are the same as those disclosed by the method of the first embodiment, which are not described herein again.
In the foregoing description of embodiments, the particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.