CN111105274A

CN111105274A - Data processing method, device, medium and electronic equipment

Info

Publication number: CN111105274A
Application number: CN201911349479.9A
Authority: CN
Inventors: 星亮亮; 余卫勇; 杨玄; 唐亮
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2020-05-05

Abstract

The disclosure provides a data processing method, a data processing device, a data processing medium and electronic equipment, and relates to the technical field of big data processing. The method comprises the following steps: obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes; training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models respectively used for predicting M marketing indexes; acquiring a second sample set corresponding to the ith marketing index, and determining a disturbance data set according to marketing parameters in the second sample set; and determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models. The technical scheme is favorable for improving the accuracy of the determination of the marketing parameters, further improving the positioning accuracy of the target user, and being favorable for the effect of the marketing campaign to reach the marketing index.

Description

Data processing method, device, medium and electronic equipment

Technical Field

The present disclosure relates to the field of big data processing technologies, and in particular, to a data processing method, a data processing apparatus, and a computer readable medium and an electronic device for implementing the data processing method.

Background

Marketing refers to the process of finding or discovering consumer needs by an enterprise to let the consumer know about the product and then purchase the product. In specific practice, the shopping desire of consumers is stimulated by formulating various marketing activities, and the order placing quantity of products is further improved.

In the related art, a marketing campaign is formulated mainly by setting marketing parameters (such as the discount amount of a spend-threshold offer) according to operational experience and intuition, and the campaign is then carried out according to those parameters.

However, in the scheme of the marketing campaign set according to the related art, the accuracy of the marketing parameters is poor, resulting in failure to achieve the marketing goal of the marketing campaign.

It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.

Disclosure of Invention

An object of the embodiments of the present disclosure is to provide a data processing method, a data processing apparatus, a computer readable medium and an electronic device, so as to improve the accuracy of determining marketing parameters at least to a certain extent, thereby improving the accuracy of positioning a target user, and facilitating the effect of marketing activities to reach marketing indexes.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.

According to a first aspect of the embodiments of the present disclosure, there is provided a data processing method, including:

obtaining multiple sets of sample data to obtain a first sample set, wherein each set of sample data includes: marketing parameters and a marketing label determined by at least one of M marketing indexes, where M is an integer greater than 1;

training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models respectively used for predicting the M marketing indexes;

acquiring a second sample set corresponding to the ith marketing index, and determining a disturbance data set according to marketing parameters in the second sample set, wherein i is a positive integer less than or equal to M;

and determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models.

In some embodiments of the present disclosure, based on the foregoing scheme, obtaining a second sample set corresponding to the ith marketing index includes:

acquiring a first expected range of the ith marketing index;

and screening out the samples of the ith marketing label in the first expected range in the first sample set to obtain the second sample set.

In some embodiments of the present disclosure, based on the foregoing, determining a perturbation data set according to the marketing parameters in the second sample set comprises:

calculating the mean value of the marketing parameters in the second sample set, and acquiring a second expected range corresponding to the marketing parameters in the ith marketing index;

calculating a difference value between the marketing parameter in the second sample set and the average value, and screening out sample data of which the difference value is within the second expected range to obtain a screened data set;

and processing the screening data set according to a preset step length to obtain the disturbance data set.

In some embodiments of the present disclosure, based on the foregoing scheme, determining the target marketing parameters that satisfy the M marketing indicators based on the disturbance data set and the M prediction models includes:

inputting the disturbance data set into an ith prediction model for predicting the ith marketing index, and determining the output of the ith prediction model as a first label set;

determining a first target label meeting a first preset requirement in the first label set, and acquiring a marketing parameter corresponding to the first target label as a first marketing parameter set;

inputting the first marketing parameter set into a jth prediction model, and determining that the output of the jth prediction model is a second label set, j is a positive integer less than or equal to M and is not equal to i;

and determining a second target label meeting a second preset requirement in the second label set, and determining the target marketing parameter according to the marketing parameter corresponding to the second target label.

In some embodiments of the present disclosure, based on the foregoing solution, determining the target marketing parameter according to the marketing parameter corresponding to the second target tag includes:

and calculating the mean value of the marketing parameters corresponding to the second target label as the target marketing parameters.
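Illustratively, the staged selection described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the two prediction models are replaced by hypothetical stand-in functions (`predict_profit`, `predict_cost`), and the first and second preset requirements are assumed to be simple thresholds.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, float, float]  # (days, amount limit per order, coupon denomination)


def select_target_params(
    candidates: Sequence[Params],
    model_i: Callable[[Params], float],
    meets_first: Callable[[float], bool],
    model_j: Callable[[Params], float],
    meets_second: Callable[[float], bool],
) -> Tuple[float, ...]:
    """Filter candidates with model i, then model j, and average the survivors."""
    # Stage 1: keep parameters whose predicted i-th index meets the first requirement.
    first_set: List[Params] = [p for p in candidates if meets_first(model_i(p))]
    # Stage 2: of those, keep parameters whose predicted j-th index meets the second.
    second_set: List[Params] = [p for p in first_set if meets_second(model_j(p))]
    if not second_set:
        raise ValueError("no candidate satisfies both requirements")
    # The target parameter is the per-dimension mean of the surviving candidates.
    return tuple(mean(dim) for dim in zip(*second_set))


# Hypothetical stand-in models; a real system would call the trained predictors.
def predict_profit(p: Params) -> float:
    days, limit, coupon = p
    return 10.0 * limit - 50.0 * coupon


def predict_cost(p: Params) -> float:
    days, limit, coupon = p
    return 100.0 * coupon + 20.0 * days


candidates = [(3, 150, 5), (3, 200, 5), (3, 250, 5), (3, 200, 20)]
target = select_target_params(
    candidates,
    predict_profit, lambda y: y >= 1500,  # assumed first preset requirement
    predict_cost, lambda y: y <= 600,     # assumed second preset requirement
)
```

With these assumed models and thresholds, two candidates survive both stages and `target` is their per-dimension mean. When M is greater than 2, the second stage could be repeated for each remaining prediction model.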

In some embodiments of the present disclosure, training M machine learning models based on marketing labels in the first sample set based on the foregoing scheme comprises:

acquiring a kth sample set corresponding to the kth marketing label in the first sample set, wherein k is a positive integer less than or equal to M;

training an extreme gradient boosting (XGBoost) model on the kth sample set to determine the optimal hyper-parameters of the model, and obtaining a kth prediction model for predicting the corresponding kth marketing label.

In some embodiments of the present disclosure, based on the foregoing, the marketing parameters include one or more of the following information: the number of days on which orders are placed, the amount limit per order, and the denomination of the coupon used for placing the order;

the marketing indexes include at least two of the following information: profit value, order volume, and cost value.

According to a second aspect of the embodiments of the present disclosure, there is provided a data processing apparatus including:

a first sample set determination module to: obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes, wherein M is an integer greater than 1;

a model training module to: training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models respectively used for predicting the M marketing indexes;

a perturbation data set determination module to: acquiring a second sample set corresponding to the ith marketing index, and determining a disturbance data set according to marketing parameters in the second sample set, wherein i is a positive integer less than or equal to M;

a targeted marketing parameter determination module to: and determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models.

According to a third aspect of the embodiments of the present disclosure, there is provided a computer-readable medium, on which a computer program is stored, which when executed by a processor, implements the data processing method as described in the first aspect of the embodiments above.

According to a fourth aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the data processing method according to the first aspect of the embodiments.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

in the embodiment provided by the disclosure, firstly, a sample set composed of a plurality of groups of sample data is obtained; then, respectively training M machine learning models based on different marketing labels in the sample set to obtain M prediction models respectively used for predicting each marketing index. Further, a second sample set corresponding to any marketing index (for example, ith, where i is a positive integer less than or equal to M) is obtained, and the second sample set is subjected to screening processing according to marketing parameters in the second sample set to obtain a disturbance data set. And finally, determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models.

On the one hand, the technical scheme determines the target marketing parameters based on the big data, and in the process of determining the target marketing parameters, an effective disturbance data set is constructed by adopting sample data of a certain marketing index, so that the marketing parameters are determined on a theoretical basis, the accuracy of determining the marketing parameters is improved, the positioning accuracy of target users is improved, and the effect of marketing activities is improved to reach the marketing index.

On the other hand, in the technical scheme, in the process of formulating the marketing activities aiming at multiple marketing indexes, an effective disturbance data set is constructed by adopting sample data of a certain marketing index, and then the disturbance data set is applied to a prediction model for predicting other marketing indexes by combining the idea of transfer learning, so that the target marketing parameters are determined based on the multiple marketing indexes. The marketing indexes are favorable for improving the flexibility and the accuracy of the marketing activity setting, so that the target user can be positioned more accurately, the ordering frequency of the target user is improved, the gross profit is improved, and the operation cost is reduced.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:

FIG. 1 shows a system architecture diagram for implementing a data processing method in an exemplary embodiment of the present disclosure;

FIG. 2 shows a flow diagram of a data processing method according to an embodiment of the present disclosure;

FIG. 3 shows a flow diagram of a model training method according to an embodiment of the present disclosure;

FIG. 4 shows a flow diagram of a method of determining a second set of samples according to an embodiment of the present disclosure;

FIG. 5 illustrates a flow diagram of a method of determining a disturbance data set according to an embodiment of the present disclosure;

FIG. 6 shows a flow diagram of a method of targeted marketing parameter determination, according to an embodiment of the present disclosure;

FIG. 7 shows a schematic block diagram of a data processing apparatus according to an embodiment of the present disclosure;

FIG. 8 shows a schematic diagram of a structure of a computer storage medium in an exemplary embodiment of the disclosure; and

FIG. 9 shows a schematic structural diagram of an electronic device in an exemplary embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.

The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.

The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.

The present exemplary embodiment first provides a system architecture for implementing a data processing method, which can be applied to various data processing scenarios. Referring to FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send request instructions or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a photo processing application, a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 105 may be a server that provides various services, such as obtaining a first set of samples from a plurality of sets of sample data, and training M machine learning models based on marketing tags in the first set of samples, where M is an integer greater than 1 (for example only). The server 105 may obtain a second sample set corresponding to the ith marketing index, and determine a perturbation data set according to marketing parameters in the second sample set, where i is a positive integer less than or equal to M (for example only). Finally, the server 105 determines the target marketing parameters that satisfy the M marketing indicators based on the disturbance data set and the M prediction models.

In an exemplary embodiment, the marketing campaign scenarios to which the present technical solution relates may be a consecutive-order marketing campaign and a cumulative-order marketing campaign. A consecutive-order marketing campaign means that, within a certain time period, a user whose ordering behavior on consecutive days (measured by amount or number of orders) meets a certain condition can enjoy a benefit (a coupon or a physical gift). A cumulative-order marketing campaign means that, within a certain time period, a user whose accumulated ordering behavior (measured by amount or number of orders) meets a certain condition can enjoy a benefit (a coupon or a physical gift).

Specifically, when an order is delivered, it is judged whether the user's purchasing behavior satisfies the consecutive-order/cumulative-order policy. If it does, and any after-sale behavior on the order within the after-sale validity period does not affect that judgment, the specified benefit is granted to the user. This encourages the user to place another order or make a repeat purchase, thereby raising or stabilizing the user's transaction frequency over a long period.

For the marketing campaigns described above, exemplary marketing parameters to be set include: the number of days on which orders are placed, information on the associated commodities, the returned coupons (such as the usage threshold and the denomination of the coupons), information on the associated user groups, and the like. Factors such as target user characteristics, target commodity range, discount strength, and the consecutive-order/cumulative-order condition threshold need to be considered when setting the marketing parameters. If the marketing parameters are set based on marketing experience, as in the related art, several problems arise: potential target users may miss the campaign or promotion may be directed at ineffective users, an improperly set target commodity range directly affects gross profit, and the discount strength and campaign effect lack effective data-based evaluation.

In view of the above problems in the related art, the present technical solution provides a data processing method and apparatus, a computer storage medium, and an electronic device. The data processing method is explained first as follows:

FIG. 2 shows a flow diagram of a data processing method according to an embodiment of the present disclosure. The data processing method provided by this embodiment overcomes the above problems in the related art at least to some extent.

The execution subject of the data processing method provided by this embodiment may be a device having a calculation processing function, such as a server. Referring to fig. 2, the data processing method provided in this embodiment includes:

step S210, obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes, wherein M is an integer greater than 1;

step S220, training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models which are respectively used for predicting the M marketing indexes;

step S230, acquiring a second sample set corresponding to the ith marketing index, and determining a disturbance data set according to marketing parameters in the second sample set, wherein i is a positive integer less than or equal to M; and

step S240, determining target marketing parameters satisfying the M marketing indexes based on the disturbance data set and the M prediction models.

In the technical scheme provided by the embodiment shown in fig. 2, on one hand, the technical scheme determines the target marketing parameters based on the big data, and in the process of determining the target marketing parameters, an effective disturbance data set is constructed by using sample data of a certain marketing index, so that a theoretical basis exists in the determination of the marketing parameters, the accuracy of the determination of the marketing parameters is favorably improved, the positioning accuracy of target users is further improved, and the effect of marketing activities is favorably achieved.

The implementation details of the steps of the solution shown in fig. 2 are explained in detail below:

In an exemplary embodiment, multiple sets of sample data are obtained in step S210 to obtain a sample set (denoted the "first sample set"). Each set of sample data in the first sample set comprises marketing parameters and a marketing label determined by at least one of the M marketing indexes, where M is an integer greater than 1. The technical scheme is therefore suitable for formulating marketing campaigns with multiple marketing indexes; using multiple marketing indexes helps improve the flexibility of the marketing campaign and locate target users more accurately.

In an exemplary embodiment, when determining the first sample set, the marketing indexes are determined first, such as the expected profit value, the expected order volume, or the expected cost value of the marketing campaign. The factors that influence the marketing indexes are then used as the marketing parameters.

The marketing campaign scenarios referred to in the present exemplary embodiment are the consecutive-order marketing campaign and the cumulative-order marketing campaign. The marketing indexes for the marketing campaign include at least two of the following information: profit value, order volume, and cost value; the relevant marketing parameters include one or more of the following: the number of days on which orders are placed, the amount limit per order, and the denomination of the coupon used to place the order.

In an exemplary embodiment, for each set of sample data, X represents the marketing parameter feature. The marketing parameter feature in each set of sample data is a three-dimensional array, which may be represented as $X = (x_1, x_2, x_3)$, where $x_1$ denotes the number n of consecutive days on which orders are placed, $x_2$ denotes the amount limit per order, and $x_3$ denotes the denomination of the coupon used to place the order. In addition, Y denotes the label corresponding to X in the set of sample data, where Y represents a marketing index (e.g., profit value, order volume, or cost value).
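Illustratively, one way to hold such (X, Y) sample data and to pick out the samples carrying a particular marketing label may be sketched as follows. The class name `Sample`, the helper `sample_set_for`, the label keys, and the numeric values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Features = Tuple[int, float, float]


@dataclass
class Sample:
    """One group of sample data: marketing parameters X and marketing labels Y."""
    days: int                 # x1: number of consecutive days on which orders are placed
    amount_limit: float       # x2: amount limit per order
    coupon_value: float       # x3: denomination of the coupon used to place the order
    labels: Dict[str, float]  # Y: marketing index name -> observed value

    @property
    def x(self) -> Features:
        """The three-dimensional marketing parameter feature (x1, x2, x3)."""
        return (self.days, self.amount_limit, self.coupon_value)


def sample_set_for(samples: List[Sample], index_name: str) -> List[Tuple[Features, float]]:
    """Return (X, Y) pairs for the samples that carry the given marketing label."""
    return [(s.x, s.labels[index_name]) for s in samples if index_name in s.labels]


# Hypothetical data: each sample carries at least one of the M marketing labels.
first_sample_set = [
    Sample(3, 200.0, 10.0, {"profit": 95000.0, "orders": 1200.0}),
    Sample(5, 150.0, 5.0, {"profit": 60000.0}),
    Sample(7, 250.0, 20.0, {"cost": 30000.0}),
]
profit_set = sample_set_for(first_sample_set, "profit")
```

Here `profit_set` plays the role of the sample set for one marketing label, from which one prediction model would be trained.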

In an exemplary embodiment, with continued reference to fig. 2, after determining the first set of samples, M machine learning models are trained based on the marketing tags in the first set of samples, resulting in M predictive models for predicting the M marketing metrics, respectively, in step S220.

In an exemplary embodiment, fig. 3 shows a flowchart of a model training method according to an embodiment of the present disclosure, which may be specifically used as a specific implementation manner of step S220. Referring to fig. 3, the technical solution shown in the figure includes:

step S310, acquiring a kth sample set corresponding to the kth marketing label in the first sample set, wherein k is a positive integer less than or equal to M; and step S320, training an extreme gradient boosting (XGBoost) model on the kth sample set to determine the optimal hyper-parameters of the model, and obtaining a kth prediction model for predicting the kth marketing label.

Illustratively, one machine learning model is trained on the samples in the first sample set whose marketing label is "profit value" to obtain a prediction model for predicting profit. This model, denoted model1, relates profit to the number n of consecutive days, the amount limit per order, and the coupon denomination.

Another machine learning model is trained on the samples in the first sample set whose marketing label is "order volume" to obtain a prediction model for predicting order volume. This model, denoted model2, relates order volume to the number n of consecutive days, the amount limit per order, and the coupon denomination. A further machine learning model is trained on the samples in the first sample set whose marketing label is "cost value" to obtain a prediction model for predicting cost. This model, denoted model3, relates operating cost to the number n of consecutive days, the amount limit per order, and the coupon denomination.

For example, the machine learning model may be an eXtreme Gradient Boosting (XGBoost) model, a support vector machine model, a decision tree model, or the like. For example, when training the XGBoost model on the sample set, the optimal hyper-parameters of the model, such as the number of boosting iterations n_estimators, the maximum tree depth max_depth, and the sub-sampling ratio subsample, may be searched for using the GridSearchCV (grid search) or RandomizedSearchCV (random search) parameter tuning method in the sklearn library (a machine learning library for Python).
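Illustratively, the idea behind grid search with cross-validation may be sketched without any third-party dependency as follows. This is not XGBoost and not the sklearn API: a k-nearest-neighbour regressor with a single hyper-parameter `k` is swapped in as a stand-in model, and the function names, the single-feature toy data, and the grid values are assumptions made for illustration. In practice GridSearchCV or RandomizedSearchCV would perform the equivalent search over hyper-parameters such as n_estimators, max_depth, and subsample.

```python
from itertools import product
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (feature, label); one feature keeps the sketch short


def knn_predict(train: Sequence[Point], x: float, k: int) -> float:
    """Stand-in model: average label of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def cv_error(data: Sequence[Point], k: int, folds: int) -> float:
    """Mean squared error of the stand-in model under k-fold cross-validation."""
    errors: List[float] = []
    for f in range(folds):
        held_out = [p for i, p in enumerate(data) if i % folds == f]
        train = [p for i, p in enumerate(data) if i % folds != f]
        errors.extend((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
    return mean(errors)


def grid_search(
    data: Sequence[Point], grid: Dict[str, List[int]], folds: int = 3
) -> Optional[Dict[str, int]]:
    """Try every combination in the grid; return the one with the lowest CV error."""
    names = list(grid)
    best_params, best_err = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        err = cv_error(data, folds=folds, **params)
        if err < best_err:
            best_params, best_err = params, err
    return best_params


data = [(float(x), 2.0 * x) for x in range(12)]  # toy data: label = 2 * feature
best = grid_search(data, {"k": [1, 2, 6]})
```

Each candidate hyper-parameter combination is scored only on held-out folds, so the selected combination is the one that generalizes best on the sample set rather than the one that fits the training data most closely.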

M predictive models for predicting the M marketing indexes, respectively, are obtained in the embodiment shown in fig. 3 to determine the target marketing parameters in the embodiment of step S240.

In an exemplary embodiment, in order to improve the flexibility of the marketing campaign and to more accurately locate the target users, the target marketing parameters determined by the technical scheme simultaneously satisfy a plurality of marketing indexes. In order to achieve the technical effect that the target marketing parameters simultaneously meet a plurality of marketing indexes, the technical scheme firstly adopts a sample set (recorded as a second sample set) corresponding to one marketing index to determine a disturbance data set, and then determines samples meeting other marketing indexes in the disturbance data set so as to finally determine the target marketing parameters.

With continued reference to FIG. 2, in the first part of step S230, a second sample set corresponding to the ith marketing index is obtained, where i is a positive integer less than or equal to M. In this embodiment, if the marketing indexes include a profit value, an order volume, and a cost value, then M is 3 and i may take the value 1, 2, or 3.

In an exemplary embodiment, the second sample set is obtained by screening the first sample set, so as to reduce the amount of data to be processed and improve computational efficiency. Exemplarily, FIG. 4 shows a flow diagram of a method for determining a second sample set according to an embodiment of the present disclosure. Referring to FIG. 4, the technical solution shown in the figure includes:

step S410, acquiring a first expected range of the ith marketing index; and step S420, screening out the samples of the ith marketing label in the first expected range in the first sample set, to obtain the second sample set.

In the exemplary embodiment, the case where i is 1, that is, a sample set (second sample set) in which the marketing label is a profit value, is described as an example. Here, $Y_{11}, Y_{12}, \ldots, Y_{1u}, \ldots, Y_{1X}$ (in yuan) denote the profit value labels, X is the number of samples/labels in the second sample set, and the value range of u is $[1, X]$.

Suppose the maximum expected profit preset for a certain marketing campaign is $Y_1'$ yuan, and the actual profit value is required to differ from this maximum by less than 10% of it (i.e., the first expected range mentioned above). The second sample set can then be screened out using the following formula:

$|Y_{1u} - Y_1'| < 10\% \times Y_1'$ (Formula 1)

Therefore, a profit label set yy close to the maximum profit value is obtained through Formula 1, and the corresponding set of three-dimensional arrays xx is then found according to the label set yy. As another example, if the expected operating cost is 100000 and the first expected range is 10000, sample data whose operating cost lies in the range [90000, 110000] is screened out of the sample data.
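Illustratively, the screening by Formula 1 may be sketched as follows. The function name, the sample values, and the `tolerance` parameter are hypothetical; the strict inequality follows Formula 1 as written.

```python
from typing import List, Sequence, Tuple

Params = Tuple[float, float, float]  # (days, amount limit per order, coupon denomination)


def screen_by_expected_range(
    samples: Sequence[Tuple[Params, float]],
    expected_value: float,
    tolerance: float = 0.10,
) -> List[Tuple[Params, float]]:
    """Keep samples whose label Y satisfies |Y - Y'| < tolerance * Y'."""
    return [
        (x, y)
        for x, y in samples
        if abs(y - expected_value) < tolerance * expected_value
    ]


# Hypothetical first sample set: (marketing parameters, profit label in yuan).
first_sample_set = [
    ((3, 200.0, 10.0), 95000.0),
    ((5, 150.0, 5.0), 60000.0),
    ((7, 250.0, 20.0), 109999.0),
    ((2, 180.0, 10.0), 120000.0),
]
second_sample_set = screen_by_expected_range(first_sample_set, expected_value=100000.0)

xx = [x for x, _ in second_sample_set]  # marketing parameters of the retained samples
yy = [y for _, y in second_sample_set]  # labels close to the expected value
```

Only the samples whose label lies within 10% of the expected value remain, and `xx` and `yy` correspond to the parameter set and label set referred to in the text.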

In an exemplary embodiment, with continued reference to FIG. 2, in a second portion of step S230, a perturbation data set is determined from the marketing parameters in the second sample set. In the process of determining the target marketing parameters, an effective disturbance data set is constructed by adopting sample data of a certain marketing index, so that the marketing parameter determination has a theoretical basis.

Specifically, fig. 5 shows a flowchart of a determination method of a disturbance data set according to an embodiment of the present disclosure. Referring to fig. 5, the solution shown in the figure includes steps S510 to S530.

In step S510, a mean value of the marketing parameters in the second sample set is calculated, and a difference value between the marketing parameters in the second sample set and the mean value is calculated.

In the exemplary embodiment, the sample set (second sample set) in which the marketing label is a profit value, i.e., the case where i is 1, is again described as an example. The second sample set in this step may be a data set that has been screened through the embodiment shown in FIG. 4. As in the above embodiments, the marketing parameters may be denoted xx and the corresponding marketing labels may be denoted yy. Where the marketing parameters include the number of days on which orders are placed, the amount limit per order, and the denomination of the coupon used to place the order, the marketing parameters xx consist of a plurality of three-dimensional arrays, written as $[xx_{11}, xx_{12}, xx_{13}], [xx_{21}, xx_{22}, xx_{23}], \ldots, [xx_{v1}, xx_{v2}, xx_{v3}]$, where v represents the number of three-dimensional arrays in the second sample set.

Illustratively, the mean value of each marketing parameter in the three-dimensional array is calculated separately, for example: xx_Ave1Representing a first dimension marketing parameter (xx)₁₁、xx₂₁、……xx_v1) Mean value of (1), xx_Ave2Representing a second dimension marketing parameter (xx)₁₂、xx₂₂、……xx_v2) Is a mean value of, and xx_Ave3Representing a third-dimensional marketing parameter (xx)₁₃、xx₂₃、……xx_v3) Is measured. Take the second dimension marketing parameter (amount limit per order) as an example, the mean xx_Ave2Is 200 yuan.

In step S520, a second expected range corresponding to the marketing parameter in the ith marketing index is obtained, and sample data with the difference value within the second expected range is screened out, so as to obtain a screened data set.

Still taking the second-dimension marketing parameter (the amount limit per order) as an example, suppose the second expected range corresponding to this parameter is a fluctuation of 50 above and below the mean. The disturbance data range of the amount limit per order is then [150, 250]. Similarly, a disturbance data range of the first-dimension marketing parameter (the number of days for placing an order) and a disturbance data range of the third-dimension marketing parameter (the denomination of the ticket used for placing an order) are obtained. Further, the sample data whose marketing parameters fall within the disturbance data range of every dimension are screened out from the second sample set to obtain the screened data set.
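Steps S510 and S520 can be illustrated with a short sketch. The helper names `dimension_means` and `screen_by_deviation`, the sample arrays, and the per-dimension tolerances for the first and third dimensions are hypothetical. Only the second-dimension mean of 200 and the fluctuation of 50 follow the example in the text.

```python
from statistics import mean


def dimension_means(arrays):
    """Step S510: mean of each dimension across all three-dimensional arrays."""
    return [mean(column) for column in zip(*arrays)]


def screen_by_deviation(arrays, second_expected_range):
    """Step S520: keep arrays whose difference from the mean is within the
    allowed fluctuation in every dimension (bounds are treated as inclusive)."""
    means = dimension_means(arrays)
    return [
        array
        for array in arrays
        if all(
            abs(value - m) <= limit
            for value, m, limit in zip(array, means, second_expected_range)
        )
    ]


# Hypothetical second sample set: (days, amount limit per order, ticket denomination)
second_sample_set = [
    (5, 150, 10),
    (5, 200, 10),
    (5, 250, 10),
    (5, 120, 10),
    (5, 280, 10),
]

means = dimension_means(second_sample_set)  # second dimension averages to 200
screened = screen_by_deviation(second_sample_set, second_expected_range=(2, 50, 5))
```

With these values, the arrays whose amount limit per order is 120 or 280 fall outside [150, 250] and are dropped.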

In step S530, the screened data set is processed according to a preset step size to obtain the disturbance data set.

Still taking the second-dimension marketing parameter (the amount limit per order) as an example, suppose that the iteration step size set for the disturbance range of the second-dimension marketing parameter is 1. Then, based on the above embodiment, for the sample data screened with operating costs in the range [90000, 110000], a disturbance data set (150, 151, …, 250) can be constructed for the amount limit per order (that is, the second-dimension marketing parameter mentioned above).

The determination method of the disturbance data set of the first-dimension marketing parameter (the number of days for placing orders) and the disturbance data set of the third-dimension marketing parameter (the denomination of the ticket used for placing orders) is the same as the determination method of the disturbance data set of the second-dimension marketing parameter, and is not repeated herein.
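A minimal sketch of step S530 follows. The text specifies only how each dimension is stepped through its range. Combining the per-dimension values into three-dimensional arrays through a Cartesian product is an assumption made here for illustration, as are the function name `stepped_values` and the ranges chosen for the first and third dimensions.

```python
from itertools import product


def stepped_values(low, high, step):
    """Enumerate low, low + step, ... up to and including high when reachable."""
    if step <= 0:
        raise ValueError("step must be positive")
    count = int((high - low) // step)
    return [low + k * step for k in range(count + 1)]


# Second dimension follows the example in the text: range [150, 250], step 1.
amount_values = stepped_values(150, 250, 1)

# Hypothetical ranges and steps for the other two dimensions.
days_values = stepped_values(3, 7, 1)
denomination_values = stepped_values(5, 15, 5)

# Assumption: candidate arrays are all combinations of the per-dimension values.
disturbance_data_set = list(product(days_values, amount_values, denomination_values))

# A coarser step produces fewer candidates for the same range.
coarse_amount_values = stepped_values(150, 250, 5)
```

Here a step of 1 yields 101 candidate values for the amount limit per order, while a step of 5 yields 21, which shows how the step size controls the size of the disturbance data set.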

It should be noted that the step size is determined according to actual requirements. A smaller step size allows the target marketing parameters to be determined more accurately, but the amount of computation increases correspondingly; conversely, a larger step size reduces the amount of computation, but the target marketing parameters are determined less accurately.

In an exemplary embodiment, with continued reference to fig. 2, in step S240, a targeted marketing parameter that satisfies the M marketing indicators is determined based on the disturbance data set and the M predictive models. The disturbance data set is applied to a prediction model for predicting other marketing indexes by combining with the idea of transfer learning, and the better disturbance data is determined by other prediction models.

Fig. 6 is a flowchart illustrating a method for determining a targeted marketing parameter according to an embodiment of the present disclosure, which may be a specific implementation manner of step S240. Referring to fig. 6, the solution shown in the figure comprises steps S610 to S640.

In step S610, the disturbance data set is input into the ith prediction model, which predicts the ith marketing index, and the output of the ith prediction model is determined as a first tag set. In step S620, a first target tag meeting a first preset requirement is determined in the first tag set, and the marketing parameters corresponding to the first target tag are acquired as a first marketing parameter set.

In the exemplary embodiment, the case where i is 1, that is, where the marketing label is the profit value, is again used as an example. The corresponding prediction model is model1, that is, a model that predicts profit from the number of consecutive days n, the amount limit per order, and the ticket denomination. The disturbance data set determined in the embodiment shown in FIG. 5 is input into model1, and the output of model1, namely the first tag set y_hat, is obtained. Further, the N1 tags closest to Y₁' (that is, the highest expected profit value described above) are selected from the first tag set y_hat as the target tags (denoted as the first target tags). The marketing parameters corresponding to the first target tags, that is, the three-dimensional arrays corresponding to the first target tags, are then obtained to form the first marketing parameter set (denoted "xmf1"). Any three-dimensional array in the first marketing parameter set is denoted [xmf1_1_w, xmf1_2_w, xmf1_3_w], where w takes values in the range [1, N1].
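The selection in steps S610 and S620 can be sketched as below. The callable `toy_model1` is a hypothetical stand-in for the trained model1 and does not reflect any real profit relationship. The function name `select_closest`, the candidate arrays, and the target value are likewise assumptions for illustration.

```python
def select_closest(candidates, predict, target, n):
    """Predict a tag for each candidate and return the n candidates whose
    predicted tag is closest to the target (ties keep input order)."""
    scored = [(abs(predict(params) - target), index)
              for index, params in enumerate(candidates)]
    scored.sort()
    return [candidates[index] for _, index in scored[:n]]


def toy_model1(params):
    """Hypothetical stand-in for model1 (profit prediction)."""
    days, amount_limit, denomination = params
    return days * 1000 + amount_limit * 10 - denomination * 100


disturbance_data_set = [
    (3, 150, 10),  # predicted profit 3500
    (5, 200, 10),  # predicted profit 6000
    (7, 250, 10),  # predicted profit 8500
    (5, 200, 20),  # predicted profit 5000
]

# Hypothetical highest expected profit value Y1' = 6200 and N1 = 2.
xmf1 = select_closest(disturbance_data_set, toy_model1, target=6200, n=2)
```

The same routine could be applied again with a different model and a different expected value, which is the pattern the later steps rely on.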

Steps S610 and S620 filter the disturbance data set, which effectively reduces the volume of disturbance data and leaves data that are closer to the target marketing parameters. Furthermore, when formulating a marketing campaign with multiple marketing indexes, the technical solution constructs an effective disturbance data set from the sample data of one marketing index and then, drawing on the idea of transfer learning, applies that disturbance data set to the prediction models for the other marketing indexes. Target marketing parameters meeting all the marketing indexes are thereby determined, so that marketing campaigns are set more accurately, the ordering frequency of target users and the wallet share of target products are increased, gross profit is raised, and operating costs are reduced. Specifically:

In step S630, the first marketing parameter set is input into a jth prediction model, and the output of the jth prediction model is determined as a second tag set, where j is a positive integer less than or equal to M and not equal to i. In step S640, a second target tag meeting a second preset requirement is determined in the second tag set, and the target marketing parameters are determined according to the marketing parameters corresponding to the second target tag.

In an exemplary embodiment, the jth prediction model may be determined according to actual requirements. For example, the jth prediction model may be model3, which predicts the operating cost from the number of consecutive days n, the amount limit per order, and the ticket denomination; it may be model2, which predicts the order quantity from the same parameters; or it may be both model2 and model3.

In an exemplary embodiment, the first marketing parameter set is input into model2 and model3 to obtain the predicted order quantity and the predicted operating cost, that is, the second tag set. Taking as an example the case where the second tag set is the predicted order quantity output by model2, the N2 tags closest to the expected order quantity are selected from the second tag set as the second target tags. The three-dimensional arrays corresponding to the N2 second target tags are then found in the first marketing parameter set xmf1 and form the second marketing parameter set (denoted "xmf2").

Any three-dimensional array in the second marketing parameter set is denoted [xmf2_1_z, xmf2_2_z, xmf2_3_z], where z takes values in the range [1, N2].

Illustratively, the mean of each dimension of the marketing data in the three-dimensional arrays xmf2 is calculated: xmf2_Ave = [xmf2_1_Ave, xmf2_2_Ave, xmf2_3_Ave]. For example, the mean xmf2_1_Ave of the first-dimension marketing data, that is, the first dimension of the target marketing parameters, is calculated according to the following formula:

xmf2_1_Ave＝(α₁*xmf2_1₁+α₂*xmf2_1₂+……+α_N2*xmf2_1_N2)/(N2)

Here, α₁, α₂, …, α_N2 denote the weights of xmf2_1₁, xmf2_1₂, …, xmf2_1_N2, respectively. The other two dimensions of the target marketing parameters are calculated in a similar manner, which is not repeated here.
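The weighted mean above can be sketched directly from the formula, dividing the weighted sum by N2 as written. The function name `weighted_dimension_means`, the sample arrays, and the weight values are hypothetical; the text does not state how the weights are chosen.

```python
def weighted_dimension_means(arrays, weights):
    """For each dimension d, compute (sum of weight_z * arrays[z][d]) / N2,
    where N2 is the number of arrays, following the formula in the text."""
    n2 = len(arrays)
    if n2 == 0:
        raise ValueError("at least one array is required")
    if len(weights) != n2:
        raise ValueError("one weight is required per array")
    dimensions = len(arrays[0])
    return [
        sum(weight * array[d] for weight, array in zip(weights, arrays)) / n2
        for d in range(dimensions)
    ]


# Hypothetical second marketing parameter set with N2 = 2.
xmf2 = [(5, 200, 10), (7, 220, 20)]

# Equal weights of 1 reduce the formula to a plain arithmetic mean.
plain_mean = weighted_dimension_means(xmf2, [1.0, 1.0])

# Hypothetical unequal weights that favour the first array.
xmf2_ave = weighted_dimension_means(xmf2, [1.5, 0.5])
```

Because the formula divides by N2 rather than by the sum of the weights, the result is a conventional weighted average only when the weights sum to N2, as they do in both calls here.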

In the embodiments provided by this technical solution, multiple models are first trained on the multi-dimensional marketing parameter feature data, one for each of the marketing indexes. The feature sample data are then filtered using the expected range of a certain marketing label value to construct an effective disturbance data set. Next, drawing on the idea of transfer learning, the optimal recommended parameters are calculated on the other models and aggregated into a recommendation. This enables more accurate settings for consecutive-order and cumulative-order marketing activities, increases the ordering frequency of target users and the wallet share of target products, raises gross profit, and reduces operating costs. The technical solution can therefore accurately set the target users and commodities for consecutive-order and cumulative-order activities and improve users' ordering frequency and the wallet share of target commodities; meanwhile, a reasonable setting of the discount strength of consecutive-order and cumulative-order coupons helps raise the gross profit of commodities and reduce operating costs.

Those skilled in the art will appreciate that all or part of the steps for implementing the above embodiments may be implemented as a computer program executed by a processor (including a CPU and a GPU). When executed by the processor, the program performs the functions defined by the above-described methods provided by the present disclosure. The program may be stored in a computer readable storage medium, which may be a read-only memory, a magnetic or optical disk, or the like.

Furthermore, it should be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.

The following describes embodiments of the apparatus of the present disclosure, which may be used to perform the above-mentioned data processing method of the present disclosure.

Fig. 7 shows a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure, and referring to fig. 7, the data processing apparatus 700 provided in this embodiment includes: a first sample set determination module 701, a model training module 702, a perturbation data set determination module 703, and a targeted marketing parameters determination module 704. Wherein:

the first sample set determining module 701 is configured to: obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes, wherein M is an integer greater than 1;

the model training module 702 is configured to: training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models respectively used for predicting the M marketing indexes;

the disturbance data set determining module 703 is configured to: acquiring a second sample set corresponding to the ith marketing index, and determining a disturbance data set according to marketing parameters in the second sample set, wherein i is a positive integer less than or equal to M;

the targeted marketing parameter determination module 704 is configured to: and determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models.
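The division of the apparatus into four modules can be pictured as a simple pipeline. The sketch below is only one possible arrangement: the class name `DataProcessingApparatus`, the field names, and the stub functions are hypothetical, and the stubs stand in for the real logic of modules 701 to 704.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DataProcessingApparatus:
    """Hypothetical wiring of the four modules of apparatus 700."""
    determine_first_sample_set: Callable[[], List[Any]]                  # module 701
    train_models: Callable[[List[Any]], List[Any]]                       # module 702
    determine_disturbance_set: Callable[[List[Any], int], List[Any]]     # module 703
    determine_target_parameters: Callable[[List[Any], List[Any]], Any]   # module 704

    def run(self, i: int) -> Any:
        """Run the modules in the order of steps S210 to S240."""
        first_sample_set = self.determine_first_sample_set()
        models = self.train_models(first_sample_set)
        disturbance_set = self.determine_disturbance_set(first_sample_set, i)
        return self.determine_target_parameters(disturbance_set, models)


# Stub modules used only to show how the pieces connect.
apparatus = DataProcessingApparatus(
    determine_first_sample_set=lambda: [((5, 200, 10), 6000)],
    train_models=lambda samples: [lambda params: 0.0],
    determine_disturbance_set=lambda samples, i: [params for params, _ in samples],
    determine_target_parameters=lambda disturbance, models: disturbance[0],
)

target_parameters = apparatus.run(i=1)
```

Passing the modules in as callables keeps the division flexible, in line with the remark in the text that the split into modules is not mandatory.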

In some embodiments of the present disclosure, based on the foregoing scheme, the disturbance data set determination module 703 includes: a sample screening unit. Wherein:

the sample screening unit is configured to: when the disturbance data set determination module 703 obtains the second sample set corresponding to the ith marketing index, acquire a first expected range of the ith marketing index; and screen out, from the first sample set, the samples whose ith marketing label is within the first expected range, to obtain the second sample set.

In some embodiments of the present disclosure, based on the foregoing scheme, the disturbance data set determining module 703 further includes: a disturbance data set determination unit. Wherein:

the disturbance data set determination unit is configured to: calculating a mean value of the marketing parameters in the second sample set, and calculating a difference between the marketing parameters in the second sample set and the mean value; acquiring a second expected range corresponding to the marketing parameters in the ith marketing index, and screening out sample data of which the difference is within the second expected range to obtain a screened data set; and processing the screened data set according to a preset step size to obtain the disturbance data set.

In some embodiments of the present disclosure, based on the foregoing scheme, the targeted marketing parameter determination module 704 includes: a first tag set determining unit, a first target tag determining unit, a second tag set determining unit, and a second target tag determining unit. Wherein:

the first tag set determining unit is configured to: inputting the disturbance data set into an ith prediction model for predicting the ith marketing index, and determining the output of the ith prediction model as a first tag set;

the first target tag determination unit is configured to: determining a first target label meeting a first preset requirement in the first label set, and acquiring a marketing parameter corresponding to the first target label as a first marketing parameter set;

the second tag set determining unit is configured to: inputting the first marketing parameter set into a jth prediction model, and determining the output of the jth prediction model as a second tag set, where j is a positive integer less than or equal to M and is not equal to i; and

the second target tag determination unit is configured to: determining a second target tag meeting a second preset requirement in the second tag set, and determining the target marketing parameters according to the marketing parameters corresponding to the second target tag.

In some embodiments of the present disclosure, based on the foregoing scheme, the second target tag determining unit is specifically configured to: calculating the mean value of the marketing parameters corresponding to the second target tag as the target marketing parameters.

In some embodiments of the present disclosure, based on the foregoing scheme, the model training module 702 is specifically configured to:

acquiring a kth sample set corresponding to the kth marketing label in the first sample set, wherein k is a positive integer less than or equal to M; and training an extreme gradient boosting (XGBoost) model on the kth sample set to determine the optimal hyper-parameters of the model, so as to obtain a kth prediction model for predicting the kth marketing label.

In some embodiments of the present disclosure, based on the foregoing scheme, the marketing parameters include one or more of the following: the number of days for placing an order, the amount limit per order, and the denomination of the ticket used for placing the order; the marketing indexes include at least two of the following: profit value, order quantity, and cost value.

For details not disclosed in the apparatus embodiments of the present disclosure, please refer to the embodiments of the data processing method described above.

It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by a plurality of modules or units.

Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

In an exemplary embodiment of the present disclosure, there is also provided a computer storage medium on which a program product capable of implementing the above-described method of this specification is stored. In some possible embodiments, various aspects of the present disclosure may also be implemented in the form of a program product including program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the "exemplary methods" section above of this specification.

Referring to fig. 8, a program product 800 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product described above may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.

As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."

An electronic device 900 according to this embodiment of the disclosure is described below with reference to fig. 9. The electronic device 900 shown in fig. 9 is only an example and should not bring any limitations to the functionality or scope of use of the embodiments of the present disclosure.

As shown in fig. 9, the electronic device 900 is embodied in the form of a general purpose computing device. Components of electronic device 900 may include, but are not limited to: the at least one processing unit 910, the at least one memory unit 920, and a bus 930 that couples various system components including the memory unit 920 and the processing unit 910.

Wherein, the storage unit stores program codes, and the program codes can be executed by the processing unit 910, so that the processing unit 910 executes the steps according to various exemplary embodiments of the present disclosure described in the "exemplary method" section above in this specification. For example, the processing unit 910 described above may perform the following as shown in fig. 2: step S210, obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes, wherein M is an integer greater than 1; step S220, training M machine learning models based on the marketing labels in the first sample set to obtain M prediction models which are respectively used for predicting the M marketing indexes; step S230, a second sample set corresponding to the ith marketing index is obtained, a disturbance data set is determined according to marketing parameters in the second sample set, and i is a positive integer less than or equal to M; and step S240, determining target marketing parameters meeting the M marketing indexes based on the disturbance data set and the M prediction models.

For example, the processing unit 910 may further perform a data processing method as shown in any one of fig. 3 to 6.

The storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access memory unit (RAM) 9201 and/or a cache memory unit 9202, and may further include a read only memory unit (ROM) 9203.

Storage unit 920 may also include a program/utility 9204 having a set (at least one) of program modules 9205, such program modules 9205 including but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

Bus 930 can be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 900 may also communicate with one or more external devices 1000 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 900, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 900 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interface 950. Also, the electronic device 900 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network such as the Internet) via the network adapter 960. As shown, the network adapter 960 communicates with the other modules of the electronic device 900 via the bus 930. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims

1. A method of data processing, the method comprising:

obtaining a plurality of groups of sample data to obtain a first sample set, wherein each group of sample data comprises: the marketing parameters and the marketing label determined by at least one marketing index in the M marketing indexes, wherein M is an integer greater than 1;

2. The data processing method of claim 1, wherein obtaining a second sample set corresponding to an ith marketing index comprises:

acquiring a first expected range of the ith marketing index;

3. The data processing method of claim 1, wherein determining a perturbation data set from the marketing parameters in the second sample set comprises:

calculating a mean value of marketing parameters in the second sample set;

acquiring a second expected range corresponding to marketing parameters in the ith marketing index, and screening out sample data of which the difference is within the second expected range to obtain a screened data set;

4. The data processing method of any one of claims 1 to 3, wherein determining the targeted marketing parameters that satisfy the M marketing indicators based on the disturbance data set and the M predictive models comprises:

5. The data processing method of claim 4, wherein determining the targeted marketing parameters according to the marketing parameters corresponding to the second targeted tag comprises:

6. The data processing method of any of claims 1 to 3, wherein training M machine learning models based on marketing labels in the first sample set comprises:

7. The data processing method of any of claims 1 to 3, wherein the marketing parameters include one or more of the following: the number of days for placing an order, the money limit of the order and the denomination of the ticket used for placing the order;

8. A data processing apparatus, characterized in that the apparatus comprises:

9. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 7.

10. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out a data processing method as claimed in any one of claims 1 to 7.