CN111090677A - Method and device for determining data object type - Google Patents

Method and device for determining data object type

Publication number
CN111090677A
Authority
CN
China
Prior art keywords
user
data object
probability
sample set
training sample
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811237122.7A
Other languages
Chinese (zh)
Inventor
李思旭
杨文君
李奘
成石
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201811237122.7A
Publication of CN111090677A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 Discounts or incentives, e.g. coupons or rebates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a device for determining a data object type. The method comprises: inputting feature information of a first user into a sensitive conversion model to obtain the sensitive conversion probability of the first user under different types of data objects; inputting the feature information of the first user into a natural conversion model to obtain the natural conversion probability of the first user; obtaining, from the sensitive conversion probabilities and the natural conversion probability, a return index for each of the different types of data objects; and determining the data object type with the maximum return index as the target data object type of the first user. Because the data object type is determined from the feature information of the first user, the first user can be converted into a second user with the maximum probability while budget and revenue are protected, which improves the operating efficiency of the platform.

Description

Method and device for determining data object type
Technical Field
The invention relates to the technical field of internet, in particular to a method and a device for determining a data object type.
Background
With the development of electronic commerce, network consumption has become a main mode of consumption. Taking network taxi-taking software as an example, a user can conveniently post a ride request from a mobile phone and communicate directly with the driver who accepts the order, which makes travel more convenient. On a network consumption platform, some users are called silent users, for example: users who placed orders on the platform within the last 3 months but not within the last 1 month, or users who have logged in to the platform but have never completed an order. Users other than silent users are generally called active users. For a network consumption platform, silent users are a valuable existing resource, and how to convert them into active users through operational measures, so that they place orders on the platform again, is an urgent problem for platform operations.
The most common way to convert a silent user into an active user is to offer the silent user a discount, usually by issuing a coupon. Coupons include cash coupons, experience coupons, discount coupons, and the like.
However, if the coupon issued to a silent user offers too small a discount, the silent user may not be converted into an active user; if it offers too large a discount, platform revenue may suffer. Therefore, for each silent user, how to determine the discount strength of the coupon to issue, so that the silent user is converted into an active user with the maximum probability while budget and revenue are protected, is a problem that needs to be studied and solved.
Disclosure of Invention
The invention provides a method and a device for determining a data object type, which determine the type of a data object according to feature information of a first user, so that the first user is converted into a second user with the maximum probability while budget and revenue are protected.
In a first aspect, the method for determining a data object type provided by the present invention includes:
inputting feature information of a first user into a sensitive conversion model, and acquiring sensitive conversion probability of the first user under different types of data objects, wherein the sensitive conversion probability refers to the probability that the first user is converted into a second user under the condition of possessing the data objects;
inputting the characteristic information of the first user into a natural conversion model, and acquiring the natural conversion probability of the first user, wherein the natural conversion probability refers to the probability that the first user is converted into a second user under the condition that the first user does not have any data object;
respectively acquiring return indexes corresponding to the different types of data objects according to the sensitive conversion probability and the natural conversion probability;
and determining the data object type corresponding to the maximum return index as the target data object type of the first user according to the return index.
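The four steps above can be sketched as one small function. This is a minimal sketch under stated assumptions, not the patented implementation: the two models are represented as plain callables, and the names `choose_data_object_type`, `sensitive_model`, `natural_model`, and `return_index` are hypothetical. The return-index computation is passed in as a parameter so that the sketch does not commit to a particular formula.

```python
from typing import Callable, Dict, Hashable, Mapping, Sequence

Features = Mapping[str, float]


def choose_data_object_type(
    features: Features,
    object_types: Sequence[Hashable],
    sensitive_model: Callable[[Features, Hashable], float],
    natural_model: Callable[[Features], float],
    return_index: Callable[[Hashable, float, float], float],
) -> Hashable:
    """Return the data object type with the largest return index."""
    # Step 1: sensitive conversion probability for each data object type.
    p_sensitive: Dict[Hashable, float] = {
        t: sensitive_model(features, t) for t in object_types
    }
    # Step 2: natural conversion probability (the user holds no data object).
    p_natural = natural_model(features)
    # Step 3: return index for each data object type.
    indexes = {t: return_index(t, p, p_natural) for t, p in p_sensitive.items()}
    # Step 4: the type with the maximum return index is the target type.
    return max(indexes, key=indexes.get)
```

Any trained classifier that outputs a probability could stand behind the two model callables; the source does not fix the model family.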
Optionally, the obtaining, according to the sensitive conversion probability and the natural conversion probability, return indexes corresponding to the different types of data objects respectively includes:
according to the formula
[Formula image BDA0001838455780000021; not reproduced in this text]
respectively acquiring return indexes corresponding to the different types of data objects;
wherein d_t is the type of the data object, p_t is the sensitive conversion probability corresponding to the data object, p_c is the natural conversion probability, and ROI is the return index corresponding to the data object.
Optionally, before inputting the feature information of the first user into the sensitive conversion model, the method further includes:
sampling historical data to generate a first training sample set, wherein samples in the first training sample set consist of first users with data objects at sampling time;
preprocessing the first training sample set;
and training the first training sample set by taking the characteristic information of the samples in the first training sample set and the type of the data object as input and taking whether the samples in the first training sample set are converted into a second user in an observation time window as output to obtain a sensitive conversion model.
Optionally, the preprocessing the first training sample set includes:
if a sample in the first training sample set receives a new data object again within the observation time window, the sample is removed from the first training sample set.
Optionally, the preprocessing the first training sample set further includes:
and downsampling samples whose data object type accounts for a larger share of the first training sample set, and upsampling samples whose data object type accounts for a smaller share of the first training sample set.
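The two preprocessing steps can be sketched as follows. This is an illustrative sketch only: the `Sample` fields, the function names, and the choice of a common per-type target size are assumptions, since the source does not say how the resampling ratio is chosen.

```python
import random
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    user_id: str
    object_type: str                # type of data object held at sampling time
    got_new_object_in_window: bool  # received another data object while observed
    converted: bool                 # label: became a second user in the window


def drop_contaminated(samples: List[Sample]) -> List[Sample]:
    """Remove samples that received a new data object in the observation window."""
    return [s for s in samples if not s.got_new_object_in_window]


def rebalance_by_type(
    samples: List[Sample], target: Optional[int] = None, seed: int = 0
) -> List[Sample]:
    """Down-sample frequent types and up-sample rare types to `target` each."""
    rng = random.Random(seed)
    by_type: Dict[str, List[Sample]] = defaultdict(list)
    for s in samples:
        by_type[s.object_type].append(s)
    if not by_type:
        return []
    if target is None:
        # Assumption: use the mean group size as the common target.
        target = max(1, round(len(samples) / len(by_type)))
    balanced: List[Sample] = []
    for group in by_type.values():
        if len(group) >= target:
            # Down-sample without replacement.
            balanced.extend(rng.sample(group, target))
        else:
            # Keep every original, then up-sample with replacement.
            balanced.extend(group)
            balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced
```

Removing re-contacted samples first keeps the label attributable to a single data object type; balancing afterwards keeps a frequently issued type from dominating training.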
Optionally, before inputting the feature information of the first user into the natural transformation model, the method further includes:
sampling historical data to generate a second training sample set, wherein samples in the second training sample set consist of first users who do not own any data objects at the sampling moment;
removing a sample from the second set of training samples if the sample received a data object within an observation time window;
and training the second training sample set by taking the characteristic information of the samples in the second training sample set as input and taking whether the samples in the second training sample set are converted into a second user in the observation time window as output to obtain a natural conversion model.
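The construction of the second training sample set can be sketched as a filter that yields model inputs and labels. The record layout and the name `build_natural_training_set` are hypothetical; the source specifies only the selection rules and the input/output roles.

```python
from typing import Dict, List, Tuple, TypedDict


class UserRecord(TypedDict):
    features: Dict[str, float]
    held_object_at_sampling: bool
    received_object_in_window: bool
    converted_in_window: bool


def build_natural_training_set(
    records: List[UserRecord],
) -> Tuple[List[Dict[str, float]], List[int]]:
    """Return (inputs, labels) for training a natural conversion model."""
    inputs: List[Dict[str, float]] = []
    labels: List[int] = []
    for r in records:
        # Only first users who own no data object at the sampling moment qualify.
        if r["held_object_at_sampling"]:
            continue
        # Drop users who received a data object during the observation window.
        if r["received_object_in_window"]:
            continue
        inputs.append(r["features"])
        labels.append(1 if r["converted_in_window"] else 0)
    return inputs, labels
```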
Optionally, the feature information includes: a pre-silent characteristic, a silent period behavioral characteristic, and a silent period status characteristic of the first user;
the pre-silencing features include: demographic information, preference information, transaction information and evaluation information of the first user within a preset time length before silence; the behavioral characteristics of the silent period include: whether the first user attempts a transaction during a silent period; the status characteristics of the silent period include: an order status and an account status of the first user during the silent period.
In a second aspect, the present invention provides an apparatus for determining a type of a data object, including:
the first obtaining module is used for inputting the characteristic information of a first user into a sensitive conversion model and obtaining the sensitive conversion probability of the first user under different types of data objects, wherein the sensitive conversion probability refers to the probability that the first user is converted into a second user under the condition of possessing the data objects;
a second obtaining module, configured to input the feature information of the first user into a natural transformation model, and obtain a natural transformation probability of the first user, where the natural transformation probability is a probability that the first user transforms into a second user without having any data object;
a third obtaining module, configured to obtain, according to the sensitive transformation probability and the natural transformation probability, return indexes corresponding to the different types of data objects respectively;
and the determining module is used for determining the data object type corresponding to the maximum return index as the target data object type of the first user according to the return index.
Optionally, the third obtaining module is specifically configured to, according to the formula
[Formula image BDA0001838455780000031; not reproduced in this text]
respectively acquire return indexes corresponding to the different types of data objects;
wherein d_t is the type of the data object, p_t is the sensitive conversion probability corresponding to the data object, p_c is the natural conversion probability, and ROI is the return index corresponding to the data object.
Optionally, the apparatus further comprises: a first training module to:
sampling historical data to generate a first training sample set, wherein samples in the first training sample set consist of first users with data objects at sampling time;
preprocessing the first training sample set;
and training the first training sample set by taking the characteristic information of the samples in the first training sample set and the type of the data object as input and taking whether the samples in the first training sample set are converted into a second user in an observation time window as output to obtain a sensitive conversion model.
Optionally, the first training module is specifically configured to remove a sample from the first training sample set if the sample in the first training sample set receives a new data object again within the observation time window.
Optionally, the first training module is specifically configured to down-sample samples with a larger data object type percentage in the first training sample set, and up-sample samples with a smaller data object type percentage in the first training sample set.
Optionally, the apparatus further comprises: a second training module to:
sampling historical data to generate a second training sample set, wherein samples in the second training sample set consist of first users who do not own any data objects at the sampling moment;
removing a sample from the second set of training samples if the sample received a data object within an observation time window;
and training the second training sample set by taking the characteristic information of the samples in the second training sample set as input and taking whether the samples in the second training sample set are converted into a second user in the observation time window as output to obtain a natural conversion model.
Optionally, the feature information includes: a pre-silent characteristic, a silent period behavioral characteristic, and a silent period status characteristic of the first user;
the pre-silencing features include: demographic information, preference information, transaction information and evaluation information of the first user within a preset time length before silence; the behavioral characteristics of the silent period include: whether the first user attempts a transaction during a silent period; the status characteristics of the silent period include: an order status and an account status of the first user during the silent period.
In a third aspect, the present invention provides an apparatus for determining a type of a data object, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of the first aspects.
In a fourth aspect, the present invention provides a computer readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement the method according to any of the first aspect.
The method and the device for determining the data object type provided by the invention input the feature information of a first user into a sensitive conversion model to obtain the sensitive conversion probability of the first user under different types of data objects, and input the feature information of the first user into a natural conversion model to obtain the natural conversion probability of the first user. Return indexes corresponding to the different types of data objects are then obtained from the sensitive conversion probabilities and the natural conversion probability, and the data object type with the maximum return index is determined as the target data object type of the first user. Because the sensitive conversion probability, the natural conversion probability, and the return index are all considered when determining the target data object type, the finally determined data object type converts the first user into a second user with the maximum probability while budget and revenue are protected, which improves the operating efficiency of the network platform.
Drawings
FIG. 1 is a flowchart of a first embodiment of a method for determining a type of a data object according to the present invention;
FIG. 2 is a flowchart illustrating a method for obtaining a sensitive transformation model according to an embodiment of a method for determining a data object type provided by the present invention;
FIG. 3 is a flow chart illustrating preprocessing a first training sample set according to an embodiment of the present invention;
FIG. 4 is a flowchart of obtaining a natural transformation model in an embodiment of a method for determining a data object type according to the present invention;
FIG. 5 is a schematic structural diagram of a first apparatus for determining a type of a data object according to the present invention;
FIG. 6 is a schematic structural diagram of a second apparatus for determining a data object type according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As mentioned above, on a network consumption platform the most common way to encourage a user to place an order is to offer the user a discount, usually by issuing a coupon. Coupons include cash coupons, experience coupons, discount coupons, and the like.
However, if the coupon issued to the user offers too small a discount, the user may not be encouraged to place an order; if it offers too large a discount, platform revenue may suffer. Therefore, for each silent user, how to determine the discount strength of the coupon to issue, so that the silent user is converted into an active user with the maximum probability while budget and revenue are protected, is a problem that needs to be studied and solved.
The invention provides a method and a device for determining a data object type, which determine the type of a data object according to the feature information of a silent user, so that the silent user is converted into an active user with the maximum probability while budget and revenue are protected.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
First, the method for determining the type of the data object provided by the present invention can be applied to any network consumption platform that needs network marketing, including but not limited to: the system comprises an online taxi taking platform, an online shopping platform, an online education platform, an online payment platform and the like. The data object may be a coupon, and more specifically, the coupon may be of various types, including but not limited to a cash voucher, an experience voucher, a discount voucher, and the like. For convenience of description, the following embodiments are described by taking a scenario in which the network taxi-taking platform issues the discount coupons as an example.
In addition, the method for determining the data object type provided by the invention can determine the type of coupon to issue to a silent user, where the type of the coupon specifically refers to the discount strength of the coupon. The discount strength can be expressed in various ways; for convenience of description, the following embodiments express it as a discount rate. For example, the different types of coupons may include 7.0, 7.5, 8.0, 8.5, 9.0, and 9.5 coupons, where an N coupon means that the user pays N tenths of the price: the discount rate of the 7.0 coupon is 0.3, that of the 7.5 coupon is 0.25, that of the 8.0 coupon is 0.2, that of the 8.5 coupon is 0.15, that of the 9.0 coupon is 0.1, and that of the 9.5 coupon is 0.05. That is, the method provided by the present invention can determine which discount rate the coupon issued to a silent user should carry, so that the silent user is converted into an active user with the maximum probability while budget and revenue are protected.
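The mapping from coupon label to discount rate listed above follows one rule, which the short sketch below makes explicit. The function name `discount_rate` and the constant `COUPON_FOLDS` are hypothetical, and the rounding is only there to hide floating-point noise.

```python
from typing import List

COUPON_FOLDS: List[float] = [7.0, 7.5, 8.0, 8.5, 9.0, 9.5]


def discount_rate(fold: float) -> float:
    """Convert a coupon label (tenths of the price paid) to a discount rate."""
    if not 0.0 < fold <= 10.0:
        raise ValueError("fold must be in the range (0, 10]")
    # A 7.0 coupon means paying 70% of the price, i.e. a discount rate of 0.3.
    return round(1.0 - fold / 10.0, 4)


rates = {fold: discount_rate(fold) for fold in COUPON_FOLDS}
```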
Fig. 1 is a flowchart of a first embodiment of a method for determining a data object type according to the present invention, and as shown in fig. 1, the method for determining a data object type according to the present embodiment includes:
S11: inputting the characteristic information of a first user into a sensitive conversion model, and acquiring the sensitive conversion probability of the first user under different types of data objects, wherein the sensitive conversion probability refers to the probability that the first user is converted into a second user under the condition of possessing the data objects.
The first user is a silent user and the second user is an active user. On a given network platform, silent users and active users have different behavior characteristics; for example, silent users place orders infrequently and active users place orders frequently. Different network platforms may define silent and active users differently, and the present invention does not limit the definition. For example, on a taxi-taking platform, a silent user is a user who placed an order on the platform within the last 3 months but not within the last 1 month, or a user who has logged in to the platform but has never placed an order; users other than silent users are generally referred to as active users. For uniformity of description, hereinafter the first user refers to a silent user and the second user refers to an active user, which will not be repeated.
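The example definition of a silent user for a taxi-taking platform can be written as a simple predicate. This sketch assumes that "1 month" and "3 months" are approximated as 30 and 90 days; the function name and parameters are hypothetical, and other platforms may define silence differently.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_silent_user(
    last_order_at: Optional[datetime],
    has_logged_in: bool,
    now: datetime,
    recent_days: int = 30,    # assumed length of "the last 1 month"
    lookback_days: int = 90,  # assumed length of "the last 3 months"
) -> bool:
    """Apply the example silent-user definition for a taxi-taking platform."""
    if last_order_at is None:
        # Logged in to the platform but never placed an order.
        return has_logged_in
    elapsed = now - last_order_at
    # Ordered within the last 3 months, but not within the last 1 month.
    return timedelta(days=recent_days) < elapsed <= timedelta(days=lookback_days)
```

Users for whom the predicate is false are simply "not silent" in this sketch; how a platform treats users whose last order is older than the lookback window is left open by the source.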
The first user may be silent for various reasons, for example: a poor trip experience before silence, an insufficient account balance, or a perception that prices are high. Typically, the first user may be prompted to convert to an active user by issuing a coupon to the user. It can be understood that the larger the discount of the coupon, the greater the probability that the first user converts to the second user, and the smaller the discount, the smaller that probability. Therefore, by acquiring the feature information of the first user, the sensitive conversion probabilities of the first user under different types of coupons can be obtained, specifically, the probabilities that the first user converts to the second user when owning a 7.0, 7.5, 8.0, 8.5, 9.0, or 9.5 coupon.
It should be noted that the sensitive conversion probability is the probability that the first user converts to the second user while owning the coupon, and it represents how sensitive the first user is to the coupon. For example, the probability that a first user converts to a second user when owning a 7.0 coupon may be 60%, and the probability when owning an 8.5 coupon may be 90%. That is, the probability that a first user converts to a second user by possessing a coupon is referred to as the sensitive conversion probability.
Specifically, the sensitive conversion probabilities of the first user under different types of coupons may be obtained from the feature information of the first user in various ways, which this embodiment does not specifically limit. One optional way is to analyze and train on historical data to obtain a sensitive conversion model relating users' feature information to sensitive conversion probability, and then to input the feature information of the first user into the sensitive conversion model to predict the sensitive conversion probability of the first user under different types of coupons.
S12: inputting the characteristic information of the first user into a natural conversion model, and acquiring the natural conversion probability of the first user, wherein the natural conversion probability refers to the probability that the first user is converted into a second user under the condition that the first user does not possess any data object.
It will be appreciated that some first users may convert to second users even if they receive no coupon. The probability that a first user converts to a second user without having any coupon is therefore referred to as the natural conversion probability.
Specifically, the natural conversion probability of the first user may be obtained in various ways, which this embodiment does not specifically limit. One optional way is to analyze and train on historical data to obtain a natural conversion model relating users' feature information to natural conversion probability, and then to input the feature information of the first user into the natural conversion model to predict the natural conversion probability of the first user.
S13: and respectively acquiring return indexes corresponding to the different types of data objects according to the sensitive conversion probability and the natural conversion probability.
It can be understood that the sensitive conversion probability obtained in S11 is strongly correlated with the discount rate of the coupon. Generally, the larger the discount rate of the coupon, the larger the sensitive conversion probability and the higher the cost; the smaller the discount rate, the smaller the sensitive conversion probability and the lower the cost. Therefore, both the sensitive conversion probability and the cost need to be considered when determining the type of coupon issued to the first user.
Specifically, the coupon issued to the first user can be regarded as a cost to the network platform, and the amount of the orders that the first user places on the platform after receiving the coupon and converting to a second user can be regarded as the platform's return. Therefore, the return indexes corresponding to different types of coupons can be obtained from the sensitive conversion probabilities of the first user under different types of coupons obtained in S11 and the natural conversion probability of the first user obtained in S12.
Optionally, the return indexes corresponding to different types of coupons are obtained according to the following formula. The different types of coupons are coupons with different discount rates; for example, a 7.0 coupon corresponds to a discount rate of 0.3, and an 8.5 coupon corresponds to a discount rate of 0.15.
[Formula image BDA0001838455780000091; not reproduced in this text]
The above formula describes the case in which the first user receives a coupon with discount rate d_t. Specifically, gmv_t is the return amount of the orders generated when the first user receives the coupon and converts to a second user, and gmv_c is the return amount of the orders generated when the first user converts to a second user without receiving the coupon; cost_t is the cost amount corresponding to the coupon, and cost_c is the cost amount when the first user does not receive the coupon; p_t is the sensitive conversion probability corresponding to the coupon, p_c is the natural conversion probability, price is the order amount, d_t is the discount rate corresponding to the coupon, and d_c is the discount rate corresponding to natural conversion of the first user into a second user, i.e. d_c = 0; ROI is the return index corresponding to the coupon.
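The formula image itself is missing from this text. One reading that is consistent with the symbols defined above is an incremental return divided by an incremental cost. The expansions gmv = p · price and cost = p · price · d used below are assumptions made for illustration and are not taken from the source, so the result should be treated as a plausible reconstruction rather than the formula as filed.

```latex
\mathrm{ROI}
  = \frac{gmv_t - gmv_c}{cost_t - cost_c}
  = \frac{p_t \cdot price - p_c \cdot price}
         {p_t \cdot price \cdot d_t - p_c \cdot price \cdot d_c}
  = \frac{p_t - p_c}{p_t \, d_t}
  \qquad (d_c = 0)
```

Under this reading the order amount cancels, which would match the shorter symbol list (d_t, p_t, p_c, ROI) given where the formula is first introduced.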
S14: and determining the data object type corresponding to the maximum return index as the target data object type of the first user according to the return index.
For example, suppose the return indexes corresponding to issuing 7.0, 7.5, 8.0, 8.5, 9.0, and 9.5 coupons to the first user are obtained in S13, and the return index of the 8.0 coupon is the largest; the 8.0 coupon is then determined as the coupon to issue to the first user.
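The selection in S14 can be illustrated numerically. Because the formula image is not reproduced in this text, the sketch below assumes a return index of the form (p_t - p_c) / (p_t * d_t), that is, incremental order value per unit of coupon cost; both this form and all probabilities used are illustrative assumptions, not values from the source.

```python
from typing import Dict


def return_index(p_t: float, p_c: float, d_t: float) -> float:
    """Assumed return index: incremental return per unit of coupon cost."""
    if p_t <= 0.0 or d_t <= 0.0:
        raise ValueError("p_t and d_t must be positive")
    return (p_t - p_c) / (p_t * d_t)


def best_discount(p_sensitive: Dict[float, float], p_c: float) -> float:
    """Pick the discount rate whose coupon has the largest return index.

    p_sensitive maps discount rate d_t to sensitive conversion probability p_t.
    """
    return max(p_sensitive, key=lambda d: return_index(p_sensitive[d], p_c, d))


# Illustrative numbers only (hypothetical model outputs for one silent user).
p_sensitive = {0.30: 0.60, 0.20: 0.50, 0.10: 0.25, 0.05: 0.21}
p_natural = 0.20
chosen = best_discount(p_sensitive, p_natural)
```

With these made-up inputs the discount rate 0.2 (the 8.0 coupon) scores highest: the deepest discount converts more users but costs disproportionately more, while the shallowest barely improves on natural conversion.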
Optionally, the feature information includes: a pre-silent feature, a behavioral feature of the silent period, and a status feature of the silent period of the first user.
The pre-silence features comprise demographic information, preference information, transaction information and evaluation information of the first user within a preset time period before silence, for example the demographic information, preference information, completed-order volume, payment amount and whether there were bad reviews for the first user in the month before silence. The demographic information may include the gender, age, occupation, address and the like of the first user, and the preference information may include the travel time preference, travel route preference and the like of the first user obtained by data mining.
The behavioral features of the silent period include whether the first user attempted a transaction during the silent period, for example whether the first user opened the network platform or attempted to place an order during the silent period.
The status characteristics of the silent period include: the order status and account status of the first user during the silent period, for example: whether the first user's last order has been paid, whether there is a coupon in the account, etc.
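The three feature groups can be pictured as a simple record that is flattened into a numeric vector before being passed to a model. The following is only a sketch: the field names and the choice of fields are hypothetical examples drawn from the categories listed above, not a feature list given by the source.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PreSilenceFeatures:
    # Hypothetical examples of demographic, transaction and evaluation features.
    age: int
    completed_orders_last_month: int
    paid_amount_last_month: float
    had_bad_review: bool


@dataclass
class SilentPeriodBehavior:
    opened_platform: bool
    attempted_order: bool


@dataclass
class SilentPeriodStatus:
    last_order_paid: bool
    has_coupon_in_account: bool


def to_vector(pre: PreSilenceFeatures,
              behavior: SilentPeriodBehavior,
              status: SilentPeriodStatus) -> List[float]:
    """Concatenate the three feature groups into one flat numeric vector."""
    values = [
        *asdict(pre).values(),
        *asdict(behavior).values(),
        *asdict(status).values(),
    ]
    return [float(v) for v in values]  # booleans become 0.0 / 1.0


vector = to_vector(
    PreSilenceFeatures(30, 4, 120.5, False),
    SilentPeriodBehavior(True, False),
    SilentPeriodStatus(True, False),
)
```

Categorical fields such as occupation or address would need an encoding step that this sketch leaves out.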
In this embodiment, the feature information of a first user is input into a sensitive conversion model to obtain the sensitive conversion probability of the first user under different types of data objects, and into a natural conversion model to obtain the natural conversion probability of the first user. The return indexes corresponding to the different types of data objects are then obtained from the sensitive conversion probability and the natural conversion probability, and the data object type corresponding to the maximum return index is determined as the target data object type of the first user. Because the sensitive conversion probability, the natural conversion probability and the return index are all considered when determining the target data object type, the finally determined data object type converts the first user into a second user with the greatest probability while keeping within budget and preserving profit, which improves the operating efficiency of the network platform.
Fig. 2 is a flowchart of obtaining a sensitive transformation model in an embodiment of a method for determining a data object type provided by the present invention, and on the basis of the embodiment shown in fig. 1, this embodiment describes in detail one optional implementation manner of a training process of the sensitive transformation model, and as shown in fig. 2, may specifically include:
S21: Sample the historical data to generate a first training sample set, where the samples in the first training sample set consist of first users who have a data object at the sampling time.
The first training sample set consists of first users who own coupons at the sampling time. A model can be trained on the feature information and the behavior information of these samples to obtain a relation model between the feature information and the behavior information. The behavior information of a sample indicates whether the sample places an order within an observation time window, which may be a time period of preset duration after the sampling time, for example five days, seven days or ten days.
Specifically, taking an online ride-hailing platform as an example, the historical data in a database can be sampled at a certain historical time to obtain the first training sample set. For example, depending on the actual operation of the platform, the historical data can be sampled at the time coupons are issued in a promotional campaign, and the first users who own a coupon at the sampling time form the first training sample set.
It will be appreciated that in order to achieve a good training result, the number of samples in the first training sample set will generally need to be of a certain size. It should be noted that, for the specific number of samples in the first training sample set, the present invention is not particularly limited, and may be reasonably set according to the actual application scenario.
Optionally, if the number of samples obtained at one historical time is small, a plurality of historical times may be selected to sample the historical data in the database, for example: respectively sampling historical data one week before, two weeks before and three weeks before the current time, and accumulating the obtained 3 parts of sample data to obtain a first training sample set so that the number of samples in the first training sample set meets the training requirement.
Each first user in the first training sample set may own one or more coupons. If a first user who owns several coupons places an order within the observation time window, the order is considered to have been prompted by the coupon with the highest discount rate among them. Therefore, only the coupon with the highest discount rate of each first user needs to be considered during training.
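The labeling rule of S21 can be sketched as below: a sample is labeled 1 if it places an order inside the observation time window, and a user holding several coupons is represented by the coupon with the highest discount rate. The function names, the seven-day default and the use of a fractional discount rate are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def label_sample(sampling_time: datetime,
                 order_times: List[datetime],
                 window_days: int = 7) -> int:
    """Return 1 if any order falls inside the observation time window, else 0."""
    window_end = sampling_time + timedelta(days=window_days)
    return int(any(sampling_time < t <= window_end for t in order_times))


def representative_coupon(discount_rates: List[float]) -> float:
    """Keep only the coupon with the highest discount rate for a user."""
    return max(discount_rates)


sampled_at = datetime(2018, 10, 1)
converted = label_sample(sampled_at, [datetime(2018, 10, 5)])
not_converted = label_sample(sampled_at, [datetime(2018, 10, 20)])
kept_rate = representative_coupon([0.1, 0.3, 0.2])
```

Whether the window boundaries are inclusive is not stated in the source; the sketch excludes the sampling instant itself and includes the end of the window.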
S22: preprocessing the first training sample set.
Specifically, after a first training sample set is obtained by sampling historical data, the first training sample set can be preprocessed, so that samples in the first training sample set are purer and more uniformly distributed, and a training result is more accurate.
Fig. 3 is a flowchart of preprocessing a first training sample set according to a second embodiment of the present invention, and as shown in fig. 3, the preprocessing process may include, but is not limited to, the following two steps.
S221: if a sample in the first training sample set receives a new data object again within the observation time window, the sample is removed from the first training sample set.
It can be understood that if a sample in the first training sample set receives a new coupon within the observation time window and then places an order within that window, i.e. converts into a second user, it cannot be determined which coupon prompted the order. Such a sample is a noise sample that may affect the accuracy of the training result, so it should be removed from the first training sample set to make the samples purer.
Further, since the accuracy of the training result is affected only when the discount rate of the new coupon is greater than that of the previously owned coupon, optionally, a sample is removed from the first training sample set only if it receives a new coupon within the observation time window and the discount rate of the new coupon is greater than that of the previously owned coupon.
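Both variants of the noise-removal step S221 can be sketched with one filter. The Sample record and the strict flag are hypothetical: strict=True drops every sample that received any new coupon in the window, while strict=False applies the optional rule and drops a sample only if a new coupon has a greater discount rate than the one it already owned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    user_id: str
    discount_rate: float  # coupon owned at the sampling time
    received_in_window: List[float] = field(default_factory=list)


def remove_noise(samples: List[Sample], strict: bool = True) -> List[Sample]:
    """Drop samples whose conversion cannot be attributed to a single coupon."""
    kept = []
    for s in samples:
        if strict:
            noisy = bool(s.received_in_window)
        else:
            noisy = any(d > s.discount_rate for d in s.received_in_window)
        if not noisy:
            kept.append(s)
    return kept


samples = [
    Sample("a", 0.2),
    Sample("b", 0.2, [0.1]),  # new coupon is weaker than the owned one
    Sample("c", 0.2, [0.3]),  # new coupon is stronger than the owned one
]
strict_ids = [s.user_id for s in remove_noise(samples, strict=True)]
relaxed_ids = [s.user_id for s in remove_noise(samples, strict=False)]
```

The relaxed rule keeps more training data at the cost of trusting that a weaker new coupon did not influence the order.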
S222: Downsample the samples of data object types that are over-represented in the first training sample set, and upsample the samples of data object types that are under-represented in the first training sample set.
Specifically, because the first training sample set is obtained by sampling the historical data, the sample distribution may be uneven. For example, the first training sample set may contain many samples that own 7.0-discount coupons and few samples that own 8.5-discount coupons, which can affect the accuracy of the training result. Secondary sampling can therefore be performed on the first training sample set to make the sample distribution more uniform.
That is, the samples of over-represented coupon types in the first training sample set are downsampled and the samples of under-represented coupon types are upsampled. For example, if there are 100,000 samples that own a 7.0-discount coupon and 20,000 samples that own an 8.5-discount coupon, the number of samples with the 7.0-discount coupon can be reduced by downsampling and the number of samples with the 8.5-discount coupon can be increased by upsampling. The present invention does not limit the specific downsampling and upsampling methods; any sampling method in the prior art may be used.
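Since the source leaves the sampling method open, the following sketch uses plain random sampling as one possible choice: groups larger than a target size are downsampled without replacement, and smaller groups are upsampled by drawing extra copies with replacement. The rebalance function, its target parameter and the fixed seed are illustrative assumptions.

```python
import random
from collections import defaultdict
from typing import Callable, Hashable, List, TypeVar

T = TypeVar("T")


def rebalance(samples: List[T],
              key: Callable[[T], Hashable],
              target: int,
              seed: int = 0) -> List[T]:
    """Bring every coupon type to `target` samples by random down/upsampling."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for s in samples:
        groups[key(s)].append(s)

    balanced: List[T] = []
    for group in groups.values():
        if len(group) >= target:
            # Downsample: draw `target` distinct samples.
            balanced.extend(rng.sample(group, target))
        else:
            # Upsample: keep all samples and duplicate random ones.
            balanced.extend(group)
            balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Scaled-down version of the example above: 100 versus 20 samples.
data = [("7.0", i) for i in range(100)] + [("8.5", i) for i in range(20)]
balanced = rebalance(data, key=lambda s: s[0], target=50)
count_70 = sum(1 for s in balanced if s[0] == "7.0")
count_85 = sum(1 for s in balanced if s[0] == "8.5")
```

Duplicating minority samples is the simplest form of upsampling; synthetic-sample methods would be an alternative that the source neither requires nor excludes.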
S23: Train a model on the first training sample set, taking the feature information of the samples and the type of the data object as input and whether each sample converts into a second user within the observation time window as output, to obtain the sensitive conversion model.
Various methods may be used for training on the first training sample set, for example neural network learning algorithms, decision tree learning algorithms, Bayesian learning algorithms and the like. Optionally, the XGBoost algorithm is used to train on the first training sample set to obtain the sensitive conversion model for coupons.
After the sensitive conversion model is obtained through the above steps, it can be used to predict the sensitive conversion probability of a first user: inputting the feature information of the first user and the type of a data object into the sensitive conversion model yields the predicted sensitive conversion probability of the first user under that data object.
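The prediction step can be sketched independently of the training algorithm by treating the trained model as any callable that maps an input vector to a probability. In the sketch below the data object type is represented by its discount rate and appended to the user's feature vector; this encoding, the Model alias and the toy stand-in model are assumptions, and a real system would call its trained classifier instead.

```python
from typing import Callable, Dict, Sequence

# A trained sensitive conversion model, abstracted as a function from an
# input vector (user features + data object type) to a probability.
Model = Callable[[Sequence[float]], float]


def sensitive_probabilities(model: Model,
                            user_features: Sequence[float],
                            candidate_rates: Sequence[float]) -> Dict[float, float]:
    """Score one user against every candidate data object type."""
    return {
        rate: model([*user_features, rate])
        for rate in candidate_rates
    }


def toy_model(x: Sequence[float]) -> float:
    """Stand-in for a trained model: probability grows with the discount rate."""
    return 0.1 + 0.5 * x[-1]


probs = sensitive_probabilities(toy_model, [1.0, 0.0], [0.1, 0.2, 0.3])
```

The resulting dictionary, one probability per candidate type, is the input that the return-index calculation needs alongside the natural conversion probability.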
In the embodiment, historical data is sampled to generate a first training sample set, the first training sample set is subjected to denoising purification and secondary sampling pretreatment, and then is trained to obtain a sensitive transformation model, so that the characteristic information of a first user can be input into the sensitive transformation model, and the sensitive transformation probability of the first user under different types of data objects is predicted; because the sensitive transformation model is obtained by training according to a large amount of historical data, the sensitive transformation probability obtained by adopting the sensitive transformation model for prediction is more accurate.
Fig. 4 is a flowchart of acquiring a natural transformation model in an embodiment of a method for determining a data object type provided by the present invention, and on the basis of the foregoing embodiment, this embodiment describes in detail one optional implementation manner of acquiring a natural transformation model, and as shown in fig. 4, the method may specifically include:
S41: Sample the historical data to generate a second training sample set, where the samples in the second training sample set consist of first users who do not have any data object at the sampling time.
S42: If a sample in the second training sample set receives a data object within the observation time window, remove the sample from the second training sample set.
S43: Train a model on the second training sample set, taking the feature information of the samples as input and whether each sample converts into a second user within the observation time window as output, to obtain the natural conversion model.
Specifically, the second training sample set consists of first users who do not have any data object at the sampling time, and a model can be trained on the feature information and behavior information of these samples to obtain the natural conversion model. The model can be used to predict the natural conversion probability of a first user by inputting the feature information of the first user into the natural conversion model.
It will be appreciated that the process of sampling and training the second set of training samples is similar to the training process of the first set of training samples in the above embodiment, except that the process of subsampling according to the data object type ratio is not required as the samples in the second set of training samples do not have data objects. Therefore, the detailed description of the second embodiment can be referred to for the specific implementation process of S41-S43, and will not be repeated herein.
In the embodiment, historical data is sampled to generate a second training sample set, the second training sample set is subjected to denoising and purification preprocessing, and then is trained to obtain a natural transformation model, so that the characteristic information of a first user can be input into the natural transformation model to predict the natural transformation probability of the first user; because the natural transformation model is obtained by training according to a large amount of historical data, the natural transformation probability obtained by adopting the natural transformation model for prediction is more accurate.
Fig. 5 is a schematic structural diagram of a first embodiment of an apparatus for determining a data object type according to the present invention, and as shown in fig. 5, an apparatus 500 for determining a data object type according to the present embodiment includes: a first obtaining module 501, a second obtaining module 502, a third obtaining module 503 and a determining module 504.
The first obtaining module 501 is configured to input feature information of a first user into a sensitive conversion model, and obtain a sensitive conversion probability of the first user under different types of data objects, where the sensitive conversion probability refers to a probability that the first user converts into a second user when the first user owns the data object.
A second obtaining module 502, configured to input the feature information of the first user into a natural transformation model, and obtain a natural transformation probability of the first user, where the natural transformation probability refers to a probability that the first user transforms into a second user without having any data object.
A third obtaining module 503, configured to obtain, according to the sensitive transformation probability and the natural transformation probability, return indexes corresponding to the different types of data objects respectively.
A determining module 504, configured to determine, according to the return index, a data object type corresponding to the maximum return index as a target data object type of the first user.
Optionally, the third obtaining module 503 is specifically configured to obtain, according to the formula
Figure BDA0001838455780000141
the return indexes corresponding to the different types of data objects respectively, where the different types of data objects have different discount rates; d_t is the discount rate corresponding to the type of the data object, p_t is the sensitive conversion probability corresponding to the data object, p_c is the natural conversion probability, and ROI is the return index corresponding to the data object.
Optionally, as shown in fig. 5, the apparatus of this embodiment further includes: a first training module 505 for:
sampling historical data to generate a first training sample set, wherein samples in the first training sample set consist of first users with data objects at sampling time; preprocessing the first training sample set; and training the first training sample set by taking the characteristic information of the samples in the first training sample set and the type of the data object as input and taking whether the samples in the first training sample set are converted into a second user in an observation time window as output to obtain a sensitive conversion model.
Optionally, the first training module 505 is specifically configured to remove a sample from the first training sample set if the sample in the first training sample set receives a new data object again within the observation time window.
Optionally, the first training module 505 is specifically configured to downsample the samples of data object types that are over-represented in the first training sample set, and to upsample the samples of data object types that are under-represented in the first training sample set.
Optionally, as shown in fig. 5, the apparatus of this embodiment further includes: a second training module 506 to:
sampling historical data to generate a second training sample set, wherein samples in the second training sample set consist of first users who do not own any data objects at the sampling moment; removing a sample from the second set of training samples if the sample received a data object within an observation time window; and training the second training sample set by taking the characteristic information of the samples in the second training sample set as input and taking whether the samples in the second training sample set are converted into a second user in the observation time window as output to obtain a natural conversion model.
Optionally, the feature information includes: a pre-silent characteristic, a silent period behavioral characteristic, and a silent period status characteristic of the first user; wherein the pre-silencing features comprise: demographic information, preference information, transaction information and evaluation information of the first user within a preset time length before silence; the behavioral characteristics of the silent period include: whether the first user attempts a transaction during a silent period; the status characteristics of the silent period include: an order status and an account status of the first user during the silent period.
The apparatus for determining a data object type provided in this embodiment may be used in any of the above method embodiments, and its implementation principle and technical effect are similar, which are not described herein again.
Fig. 6 is a schematic structural diagram of a second embodiment of the apparatus for determining a data object type according to the present invention, and as shown in fig. 6, the apparatus 600 for determining a data object type according to this embodiment includes: memory 601, at least one processor 602, and computer programs.
The computer program is stored in the memory 601 and configured to be executed by the processor 602 to implement the method for determining a data object type according to the foregoing embodiments; the implementation principles and technical effects are similar and are not described here again.
The present invention further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method for determining a data object type according to the foregoing embodiments is implemented, and the implementation principle and the technical effect are similar, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In the foregoing embodiments of the network device or the terminal device, it should be understood that the Processor may be a Central Processing Unit (CPU), or may be another general-purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor, or in a combination of the hardware and software modules in the processor.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for determining a type of a data object, comprising:
inputting feature information of a first user into a sensitive conversion model, and acquiring sensitive conversion probability of the first user under different types of data objects, wherein the sensitive conversion probability refers to the probability that the first user is converted into a second user under the condition of possessing the data objects;
inputting the characteristic information of the first user into a natural conversion model, and acquiring the natural conversion probability of the first user, wherein the natural conversion probability refers to the probability that the first user is converted into a second user under the condition that the first user does not have any data object;
respectively acquiring return indexes corresponding to the different types of data objects according to the sensitive conversion probability and the natural conversion probability;
and determining the data object type corresponding to the maximum return index as the target data object type of the first user according to the return index.
2. The method according to claim 1, wherein the obtaining the return indexes corresponding to the different types of data objects according to the sensitive transformation probability and the natural transformation probability respectively comprises:
according to the formula
Figure FDA0001838455770000011
Respectively acquiring return indexes corresponding to the different types of data objects;
wherein d_t is the type of the data object, p_t is the sensitive conversion probability corresponding to the data object, p_c is the natural conversion probability, and ROI is the return index corresponding to the data object.
3. The method of claim 1, wherein before inputting the feature information of the first user into the sensitive conversion model, the method further comprises:
sampling historical data to generate a first training sample set, wherein samples in the first training sample set consist of first users with data objects at sampling time;
preprocessing the first training sample set;
and training the first training sample set by taking the characteristic information of the samples in the first training sample set and the type of the data object as input and taking whether the samples in the first training sample set are converted into a second user in an observation time window as output to obtain a sensitive conversion model.
4. The method of claim 3, wherein the preprocessing the first set of training samples comprises:
if a sample in the first training sample set receives a new data object again within the observation time window, the sample is removed from the first training sample set.
5. The method of claim 4, wherein the preprocessing the first set of training samples further comprises:
downsampling the samples of data object types that are over-represented in the first training sample set, and upsampling the samples of data object types that are under-represented in the first training sample set.
6. The method of claim 1, wherein before inputting the feature information of the first user into a natural transformation model, further comprising:
sampling historical data to generate a second training sample set, wherein samples in the second training sample set consist of first users who do not own any data objects at the sampling moment;
removing a sample from the second set of training samples if the sample received a data object within an observation time window;
and training the second training sample set by taking the characteristic information of the samples in the second training sample set as input and taking whether the samples in the second training sample set are converted into a second user in the observation time window as output to obtain a natural conversion model.
7. The method according to any of claims 1-6, wherein the feature information comprises: a pre-silent characteristic, a silent period behavioral characteristic, and a silent period status characteristic of the first user;
the pre-silencing features include: demographic information, preference information, transaction information and evaluation information of the first user within a preset time length before silence; the behavioral characteristics of the silent period include: whether the first user attempts a transaction during a silent period; the status characteristics of the silent period include: an order status and an account status of the first user during the silent period.
8. An apparatus for determining a type of a data object, comprising:
the first obtaining module is used for inputting the characteristic information of a first user into a sensitive conversion model and obtaining the sensitive conversion probability of the first user under different types of data objects, wherein the sensitive conversion probability refers to the probability that the first user is converted into a second user under the condition of possessing the data objects;
a second obtaining module, configured to input the feature information of the first user into a natural transformation model, and obtain a natural transformation probability of the first user, where the natural transformation probability is a probability that the first user transforms into a second user without having any data object;
a third obtaining module, configured to obtain, according to the sensitive transformation probability and the natural transformation probability, return indexes corresponding to the different types of data objects respectively;
and the determining module is used for determining the data object type corresponding to the maximum return index as the target data object type of the first user according to the return index.
9. An apparatus for determining a type of a data object, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program;
the computer program is executed by a processor to implement the method of any one of claims 1-7.
CN201811237122.7A 2018-10-23 2018-10-23 Method and device for determining data object type Pending CN111090677A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811237122.7A CN111090677A (en) 2018-10-23 2018-10-23 Method and device for determining data object type

Publications (1)

Publication Number Publication Date
CN111090677A true CN111090677A (en) 2020-05-01

Family

ID=70392575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811237122.7A Pending CN111090677A (en) 2018-10-23 2018-10-23 Method and device for determining data object type

Country Status (1)

Country Link
CN (1) CN111090677A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114493707A (en) * 2022-01-28 2022-05-13 北京百度网讯科技有限公司 Object recommendation method and device
CN114549071A (en) * 2022-02-18 2022-05-27 上海钧正网络科技有限公司 Marketing strategy determination method and device, computer equipment and storage medium
WO2023123933A1 (en) * 2021-12-30 2023-07-06 深圳前海微众银行股份有限公司 User type information determination method and device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090024546A1 (en) * 2007-06-23 2009-01-22 Motivepath, Inc. System, method and apparatus for predictive modeling of spatially distributed data for location based commercial services
CN107093084A (en) * 2016-08-01 2017-08-25 北京小度信息科技有限公司 Potential user predicts method for transformation and device
CN107578294A (en) * 2017-09-28 2018-01-12 北京小度信息科技有限公司 User's behavior prediction method, apparatus and electronic equipment
CN107688966A (en) * 2017-08-22 2018-02-13 北京京东尚科信息技术有限公司 Data processing method and its system and non-volatile memory medium
CN108053322A (en) * 2017-12-15 2018-05-18 东峡大通(北京)管理咨询有限公司 The customer investment return evaluation method and system of vehicle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李洋著, 北京:光明日报出版社 *

Similar Documents

Publication Publication Date Title
CN109389431B (en) Method and device for distributing coupons, electronic equipment and readable storage medium
WO2019196579A1 (en) Method and apparatus for issuing smart voucher, and method and apparatus for verification and cancellation by using smart voucher
US8452611B1 (en) Method and apparatus for assessing credit for healthcare patients
CN110827138B (en) Push information determining method and device
US20140337171A1 (en) System and method for consumer-merchant transaction analysis
CN111090677A (en) Method and device for determining data object type
US20210398210A1 (en) Systems and methods of transaction tracking and analysis for near real-time individualized credit scoring
CN110543947B (en) Rewarding resource issuing method and device based on reinforcement learning model
CN111078880A (en) Risk identification method and device for sub-application
CN110992097A (en) Processing method and device for revenue product price, computer equipment and storage medium
US20230206333A1 (en) Systems and methods for measurement of data to provide decision support
CN110232150B (en) User data analysis method and device, readable storage medium and terminal equipment
JP2019114019A (en) Information processing device, determination method, and program
JP2003114977A (en) Method and system for calculating customer's lifelong value
US9286639B1 (en) System and method for providing price information
CN111695988A (en) Information processing method, information processing apparatus, electronic device, and medium
CN110580634A (en) Internet-based service recommendation method, device, and storage medium
CN110796379B (en) Risk assessment method, device and equipment of business channel and storage medium
CN113807943A (en) Multi-factor valuation method, system, medium and equipment for bad assets
US20160148323A1 (en) System and method for crediting users respective of a value-added tax reclaim
CN113420789A (en) Method, device, storage medium and computer equipment for predicting risk account
CN111882339A (en) Prediction model training and response rate prediction method, device, equipment and storage medium
WO2018016317A1 (en) Method for calculating insurance premium in which big data is used
JP7317417B1 (en) Cash voucher trading system
KR102570627B1 (en) System and method to support digital sharing services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 2020-05-01