CN113793160A - Put-in data processing method, device, equipment and storage medium - Google Patents

Put-in data processing method, device, equipment and storage medium

Info

Publication number
CN113793160A
Authority
CN
China
Prior art keywords
data
processing
release
delivery
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010624443.3A
Other languages
Chinese (zh)
Inventor
苏毓敏
张波
秦筱桦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202010624443.3A
Publication of CN113793160A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The disclosure provides a delivery data processing method, apparatus, device and storage medium, and relates to the field of computer technologies. The method comprises: acquiring first delivery data; acquiring second delivery data clicked before the first delivery data is clicked; acquiring third delivery data clicked after the first delivery data is clicked; and processing the first delivery data, the second delivery data and the third delivery data to obtain a conversion probability representation vector of the first delivery data, wherein the conversion probability representation vector is used for predicting the conversion rate of the first delivery data. The method can accurately acquire the dynamic behavior data of the user, so that the predicted conversion rate of the first delivery data is more accurate.

Description

Put-in data processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing delivered data.
Background
Online advertising is an important part of the internet economy, and conversion rate (CVR) estimation plays a very important role in the online advertising industry. Conversion rate estimation refers to estimating, given an advertisement and a user, the probability that the user will place an order (or take another desired action) for the advertisement.
Conversion rate prediction has one distinctive characteristic and challenge: the time delay between the user's click and the conversion (ordering) action may vary from a few seconds to a few weeks. For example, when a user clicks on an advertisement on an e-commerce site, the product may only be added to a shopping cart for further comparison, and the order may be placed a few days later. This delayed conversion feedback may produce a large number of "false negative" samples, i.e., positive samples (samples that eventually convert) may be treated as negative samples (samples that never convert). The presence of "false negative" samples exacerbates the sparsity of positive samples, which can lead to inaccurate predicted conversion rates.
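The "false negative" effect described above can be illustrated with a minimal sketch. The function name `label_clicks`, the one-day observation window, and the example timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

def label_clicks(clicks, observed_at):
    """Label each click with what is visible at observation time.

    clicks: list of (click_time, conversion_time or None).
    Returns a list of (observed_label, true_label) pairs.
    """
    labels = []
    for click_time, conversion_time in clicks:
        true_label = conversion_time is not None
        # A conversion that has not happened yet is invisible at observation time.
        observed_label = true_label and conversion_time <= observed_at
        labels.append((observed_label, true_label))
    return labels

t0 = datetime(2020, 9, 1)  # hypothetical click time
clicks = [
    (t0, t0 + timedelta(seconds=30)),  # converts almost immediately
    (t0, t0 + timedelta(days=5)),      # converts after the observation time
    (t0, None),                        # never converts
]
labels = label_clicks(clicks, observed_at=t0 + timedelta(days=1))
# Clicks that will convert but currently look like negatives.
false_negatives = sum(1 for seen, truth in labels if truth and not seen)
```

Here the second click is a positive sample that is still recorded as negative one day after the click.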
In the related art, the expected delay distribution between ad clicks and conversions is captured by a static delay model. For example, an exponential probability model is introduced to identify unconverted samples, or the time delay is estimated without assuming a parametric distribution. Conversion rates obtained from a static delay model have relatively poor accuracy.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure aims to provide a method, an apparatus, a device, and a storage medium for processing delivery data, which can accurately obtain dynamic behavior data of a user, so that a predicted conversion rate of first delivery data is more accurate.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a delivery data processing method, including: acquiring first release data; acquiring second release data clicked before the first release data is clicked; acquiring third release data clicked after the first release data is clicked; processing the first launch data, the second launch data and the third launch data to obtain a conversion probability expression vector of the first launch data; wherein the conversion probability representation vector is used to predict a conversion rate of the first delivery data.
In an embodiment of the present disclosure, processing the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data includes: processing the first delivery data to obtain a feature representation vector of the first delivery data; processing the second delivery data to obtain a fused feature representation vector of the second delivery data; processing the third delivery data based on a two-layer gated recurrent unit (GRU) structure to obtain an interest feature representation vector of the third delivery data; and processing the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data and the interest feature representation vector of the third delivery data based on a survival analysis method to obtain the conversion probability representation vector of the first delivery data.
In an embodiment of the present disclosure, processing the third delivery data based on a dual-layer GRU structure to obtain an interest feature expression vector of the third delivery data includes: processing the third delivery data based on a first-layer GRU structure to obtain a plurality of interest feature expression vectors at preset intervals; and processing the interest feature representation vectors of all preset intervals based on a second-layer GRU structure to obtain the interest feature representation vectors of the third delivery data.
In an embodiment of the present disclosure, based on a survival analysis method, processing the feature representation vector of the first delivery data, the fusion feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data to obtain a conversion probability representation vector of the first delivery data includes: inputting the feature representation vector of the first release data, the fusion feature representation vector of the second release data and the interest feature representation vector of the third release data into a full connection layer for processing, and determining a survival function and a risk function of the first release data; and determining a conversion probability expression vector of the first putting data based on the survival function and the risk function of the first putting data.
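As a rough illustration of how a survival function and a risk (hazard) function can yield per-day conversion probabilities, the sketch below uses a standard discrete-time survival formulation. The discrete-time form, the function name, and the example hazards are assumptions; the disclosure names only the survival function and the risk function.

```python
def conversion_probabilities(hazards):
    """Turn per-day hazards h_1..h_n into per-day conversion probabilities.

    survival[t] is the product of (1 - h_k) up to day t, i.e. the chance of
    no conversion by day t. The chance of converting exactly on day t is
    h_t times the survival up to the previous day.
    """
    probs, survival = [], []
    alive = 1.0
    for h in hazards:
        probs.append(h * alive)
        alive *= 1.0 - h
        survival.append(alive)
    return probs, survival

# Hypothetical hazards for three days after the click.
probs, survival = conversion_probabilities([0.5, 0.5, 0.5])
```

The per-day probabilities plus the final survival value sum to one, so the remaining mass is the chance that no conversion occurs within the window.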
In an embodiment of the present disclosure, processing the second delivery data to obtain a fusion feature expression vector of the second delivery data includes: processing the second launching data based on a first layer full-connection layer structure to obtain a feature expression vector of the second launching data; and processing the feature representation vector of the second launching data based on a second layer full-connection layer structure to obtain a fusion feature representation vector of the second launching data.
In an embodiment of the present disclosure, the method further includes: preprocessing the first delivery data, the second delivery data and the third delivery data based on one-hot encoding; and preprocessing the first delivery data, the second delivery data and the third delivery data based on a visual attraction model.
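One-hot preprocessing of a categorical field can be sketched as follows. The field and its vocabulary are hypothetical; the disclosure does not specify which fields are encoded.

```python
def one_hot(value, vocabulary):
    """Return a one-hot list marking the position of value in vocabulary."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec

ad_categories = ["video", "voice", "image"]  # hypothetical categorical field
encoded = one_hot("voice", ad_categories)
```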
In an embodiment of the present disclosure, the method further includes: and inputting the first putting data and the conversion probability expression vector of the first putting data into a conversion rate prediction model to obtain the conversion probability of the first putting data.
According to another aspect of the present disclosure, there is provided a delivery data processing apparatus including: the first data acquisition module is used for acquiring first release data; the second data acquisition module is used for acquiring second release data clicked before the first release data is clicked; the third data acquisition module is used for acquiring third release data clicked after the first release data is clicked; the data processing module is used for processing the first release data, the second release data and the third release data to obtain a conversion probability expression vector of the first release data; wherein the conversion probability representation vector is used to predict a conversion rate of the first delivery data.
According to yet another aspect of the present disclosure, there is provided a computer apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the methods described above via execution of the executable instructions.
According to yet another aspect of the disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the methods described above.
According to the launch data processing method provided by the embodiment of the disclosure, the conversion probability expression vector of the first launch data is obtained by processing the first launch data, the second launch data and the third launch data. On one hand, the conversion probability expression vector of the first delivery data can be used for calibrating the conversion rate of the first delivery data, so that the predicted conversion rate of the first delivery data is more accurate; on the other hand, the method can accurately acquire the dynamic behavior data of the user, and can bring better user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
Fig. 1A is a schematic diagram of a system architecture provided by an exemplary embodiment of the present disclosure.
FIG. 1B is a schematic diagram of system interactions provided by an exemplary embodiment of the present disclosure.
Fig. 2 is a flow chart illustrating a method of impression data processing according to an exemplary embodiment.
FIG. 3 is a schematic diagram illustrating a delayed feedback conversion calibration model according to an exemplary embodiment.
FIG. 4 is a schematic diagram illustrating a conversion prediction model and a delayed feedback conversion calibration model according to an exemplary embodiment.
Fig. 5 is a flow chart illustrating another method of impression data processing according to an exemplary embodiment.
Fig. 6 is a schematic diagram illustrating a network structure of a GRU unit according to an exemplary embodiment.
Fig. 7 is a flow chart illustrating another method of impression data processing according to an exemplary embodiment.
Fig. 8 illustrates a delivery data processing apparatus according to an exemplary embodiment.
FIG. 9 illustrates a block diagram of a computer system, according to an example embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1A is a schematic diagram of a system architecture provided by an exemplary embodiment of the present disclosure.
As shown in fig. 1A, the system architecture 100 may include user terminals 101, 102, a network 103, and a server 104. The network 103 serves as a medium for providing communication links between the user terminals 101, 102 and the server 104. Network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the user terminals 101, 102 to interact with the server 104 over the network 103 to receive or send messages or the like. Among other things, the user terminals 101, 102 may be various electronic devices having a display screen and supporting the ability to connect to the network 103, including but not limited to smartphones, tablets, laptop portable computers, desktop computers, wearable devices, virtual reality devices, augmented reality devices, gamepads, smart homes, and so forth.
The server 104 may be a server that provides various services, such as a background management server that provides support for devices operated by the user using the terminal apparatus 101, 102. The background management server can analyze and process the received data such as the request and feed back the processing result to the terminal equipment.
Server 104 may, for example, obtain first delivery data; obtain second delivery data clicked before the first delivery data is clicked; obtain third delivery data clicked after the first delivery data is clicked; and process the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data.
It should be understood that the number of user terminals, networks, and servers in fig. 1A is only illustrative. The server 104 may be a physical server, a server cluster composed of a plurality of servers, or a cloud server, and there may be any number of user terminals, networks, and servers according to actual needs.
FIG. 1B is a schematic diagram of system interactions provided by an exemplary embodiment of the present disclosure.
As shown in fig. 1B, when the user clicks delivery data, click data 105, click data 106, and click data 107 may be generated. The click data 105, 106, and 107 may include an identity (ID) of the user terminal 101, a click time, delivery data information, and the like. The user terminal 101 may send the click data 105, 106, and 107 to the server 104, and the server 104 receives the click data, may store it, and processes it. For example, the first delivery data to be predicted may be determined from the delivery data information recorded in the click data 105, 106, and 107; for example, the first delivery data is the click data 106. The second delivery data and the third delivery data are then determined according to the click time recorded in the click data 106: if the click time recorded in the click data 105 is earlier than the click time recorded in the click data 106, the click data 105 may be determined as second delivery data; if the click time recorded in the click data 107 is later than the click time recorded in the click data 106, the click data 107 may be determined as third delivery data. The server may generate log files of the information exchanged between the user terminals 101, 102 and the server 104 to facilitate subsequent operations.
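The split by click time can be sketched as below. The record layout (keys `ad_id` and `click_time`) and the function name are hypothetical; the numbers 105, 106, and 107 simply reuse the reference numerals of fig. 1B as identifiers.

```python
def split_clicks(clicks, target_ad_id):
    """Split a user's clicks around the click on the ad to be predicted.

    clicks: list of dicts with hypothetical keys "ad_id" and "click_time".
    Returns (first, second, third): the target click, the clicks before it,
    and the clicks after it, each ordered by click time.
    """
    ordered = sorted(clicks, key=lambda c: c["click_time"])
    first = next(c for c in ordered if c["ad_id"] == target_ad_id)
    second = [c for c in ordered if c["click_time"] < first["click_time"]]
    third = [c for c in ordered if c["click_time"] > first["click_time"]]
    return first, second, third

clicks = [
    {"ad_id": 107, "click_time": 30},
    {"ad_id": 105, "click_time": 10},
    {"ad_id": 106, "click_time": 20},
]
first, second, third = split_clicks(clicks, target_ad_id=106)
```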
A DFM (delayed feedback model) in the related art is a static delay model in which the conversion time delay is simply assumed to be exponentially distributed. Optimizations of the static delay model in the related art mainly include: first, adopting other, more complex distributions, such as the Weibull distribution; and second, learning the delay distribution by optimizing a nonparametric model.
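For intuition, the exponential delay assumption of such a static model can be written out directly. This is a minimal sketch; the rate value and function names are hypothetical.

```python
import math

def prob_converted_by(elapsed_days, rate):
    """P(delay <= elapsed_days) for an exponential delay with the given rate."""
    return 1.0 - math.exp(-rate * elapsed_days)

def prob_still_pending(elapsed_days, rate):
    """P(delay > elapsed_days): a converting click that still looks negative."""
    return math.exp(-rate * elapsed_days)

rate = 0.5  # hypothetical rate, per day
pending_after_2_days = prob_still_pending(2.0, rate)
```

In this sketch the rate is fixed when the click occurs, which is exactly the static behavior that the following paragraph criticizes.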
The above methods all assume a static delay distribution, i.e., when a click event occurs, the delay distribution is assumed to be determined and then remains unchanged. However, as more click information is observed and collected after an advertisement click, the delay distribution should change. For example, a user who has not yet purchased an item may browse a series of related items in the days after clicking on the candidate item, which actually reflects the user's recent strong desire to purchase. This simple but realistic example reflects the dynamic conversion probability of delayed purchasing behavior, whereas the static delay model in the related art cannot obtain rich and diverse information from user behavior data.
Hereinafter, the steps of the delivery data processing method in the exemplary embodiment of the disclosure will be described in more detail with reference to the drawings and the embodiment.
Fig. 2 is a flow chart illustrating a method of impression data processing according to an exemplary embodiment. The method provided by the embodiments of the present disclosure may be performed by the server 104 as shown in fig. 1A.
As shown in fig. 2, the placement data processing method 20 includes steps S202-S208.
In step S202, first delivery data is acquired.
The placement data may be, for example, advertisement placement data, which may include advertisement video, advertisement voice, advertisement image, etc., without limitation to this disclosure.
The first placement data may be, for example, advertisement data Xi to be predicted, and for example, a conversion rate of the advertisement data Xi clicked by the user may be predicted.
In step S204, second delivery data clicked before the first delivery data is clicked is acquired.
The second placement data may be, for example, advertisement data Hi clicked before the user clicked on the advertisement data Xi to be predicted.
The advertisement data Hi may be, for example, M advertisements clicked within a preset time before the user clicks the advertisement data Xi to be predicted, and the preset time may be set as needed, and may be, for example, 5 days, 10 days, or 15 days.
In step S206, third impression data clicked after the first impression data is clicked is acquired.
The third placement data may be, for example, advertisement data Si clicked after the user clicks the advertisement data Xi to be predicted.
The advertisement data Si may be, for example, advertisement data clicked by the user within n days after clicking the advertisement data Xi to be predicted, where S1 may represent advertisement data clicked by the user within the first day after clicking the advertisement data Xi to be predicted, S2 may represent advertisement data clicked by the user within the second day after clicking the advertisement data Xi to be predicted, and Sn may represent advertisement data clicked by the user within the nth day after clicking the advertisement data Xi to be predicted. n may be set as needed, for example, 5, 10, or 15 days.
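Grouping later clicks into the daily sets S1 to Sn can be sketched as follows. The use of timestamps in seconds, the function name, and the example values are assumptions for illustration.

```python
def bucket_by_day(target_click_time, later_click_times, n_days, day_length=86400):
    """Group later clicks into S1..Sn by whole days since the target click.

    Times are in seconds. Returns a list of n_days lists, where index 0
    holds S1. Clicks falling outside the n-day window are ignored.
    """
    buckets = [[] for _ in range(n_days)]
    for t in later_click_times:
        day = int((t - target_click_time) // day_length)
        if 0 <= day < n_days:
            buckets[day].append(t)
    return buckets

# Hypothetical click times, in seconds after the target click at time 0.
S = bucket_by_day(0, [3600, 90000, 100000, 500000], n_days=5)
```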
In step S208, the first delivery data, the second delivery data, and the third delivery data are processed to obtain a conversion probability expression vector of the first delivery data.
Wherein the conversion probability representation vector is used for predicting the conversion rate of the first delivery data.
For example, the first launch data, the second launch data and the third launch data can be processed through a delay feedback conversion rate calibration model to obtain a conversion probability expression vector of the first launch data; the first delivery data, the second delivery data, and the third delivery data may also be processed by other neural network models, which is not limited by the present disclosure.
The conversion probability expression vector of the first putting data is obtained by processing the first putting data, the second putting data and the third putting data, and can be used for calibrating the conversion rate of the first putting data, so that the conversion rate of the obtained first putting data is more accurate.
The following description will be made by taking a delayed feedback conversion calibration model as an example.
FIG. 3 is a schematic diagram illustrating a delayed feedback conversion calibration model according to an exemplary embodiment.
As shown in fig. 3, the first delivery data Xi may be processed through the fully connected layer 301 to obtain a feature representation vector hc of the first delivery data Xi; the second delivery data Hi may be processed through the multi-layer fully connected layers 302 and 303 to obtain a fused feature representation vector he of the second delivery data Hi; the third delivery data Si may be processed through a two-layer GRU (Gated Recurrent Unit) to obtain an interest feature representation vector h2(l) of the third delivery data Si; and the feature representation vector hc, the fused feature representation vector he, and the interest feature representation vector h2(l) may be processed by a fully connected layer to obtain a conversion probability representation vector hg of the first delivery data.
The interest feature representation vector h2(l) may characterize the interest change of the user.
The conversion probability representation vector hg of the first delivery data may represent the probability that a conversion behavior occurs for the first delivery data within a preset interval of a preset time after the first delivery data is clicked, e.g., within 15 days after the click.
The conversion behavior may be, for example, placing an order, adding an item to a shopping cart, etc., which the present disclosure does not limit.
The conversion behavior of the first delivery data may be, for example, the user placing an order for the product corresponding to the advertisement data.
The conversion probability expression vector hg of the first delivery data may represent a probability that the user places an order for the product corresponding to the first delivery data on each day within 15 days after clicking the first delivery data.
In some embodiments, the delivery data processing method 20 may further include step S210.
In step S210, the first delivery data and the conversion probability expression vector of the first delivery data are input to the conversion rate prediction model, and the conversion probability of the first delivery data is obtained.
The conversion rate prediction model may be, for example, a trained neural network model, and may be used to predict the conversion rate of the delivery data.
The conversion probability representation vector of the first delivery data obtained by the delayed feedback conversion rate calibration model can be used for calibrating the prediction result of the conversion rate prediction model.
FIG. 4 is a schematic diagram illustrating a conversion prediction model and a delayed feedback conversion calibration model according to an exemplary embodiment.
As shown in fig. 4, for example, the first delivery data 401 may be advertisement data clicked by a user on September 1. By processing the first delivery data 401, the second delivery data, and the third delivery data (not shown in the figure) through the delayed feedback conversion rate calibration model 403, the probability that the first delivery data 401 is converted on each day from September 2 to September 16 can be obtained: the first-day conversion probability 405 of the first delivery data, the second-day conversion probability 406 of the first delivery data, ..., and the nth-day conversion probability 407 of the first delivery data, i.e., the probabilities of a delayed conversion 1 to 15 days after the advertisement data is clicked.
The conversion rate prediction model 402 processes the first delivery data 401, and its result is calibrated using the probabilities that the first delivery data 401 is converted on each day from September 2 to September 16, which yields the conversion probability 404 of the first delivery data.
The conversion rate prediction model 402 is calibrated through the obtained conversion probability expression vector, so that more accurate advertisement conversion rate can be obtained, user experience can be improved, and advertisement income can be improved.
According to the launch data processing method provided by the embodiment of the disclosure, the conversion probability expression vector of the first launch data is obtained by processing the first launch data, the second launch data and the third launch data. On one hand, the conversion probability expression vector of the first delivery data can be used for calibrating the conversion rate of the first delivery data, so that the predicted conversion rate of the first delivery data is more accurate; on the other hand, the method can accurately acquire the dynamic behavior data of the user, and can bring better user experience.
Fig. 5 is a flow chart illustrating another method of impression data processing according to an exemplary embodiment.
Unlike the delivery data processing method 20 shown in fig. 2, the delivery data processing method 50 shown in fig. 5 further describes how to process the first delivery data, the second delivery data, and the third delivery data to obtain the conversion probability representation vector of the first delivery data, i.e., provides an embodiment of step S208.
Referring to FIG. 5, method 50 includes steps S502-S508.
In step S502, the first delivery data is processed to obtain a feature expression vector of the first delivery data.
As shown in fig. 3, the first delivery data Xi may be processed through, for example, a single fully connected layer (Full Connect Net) 301 to obtain a feature representation vector hc of the first delivery data Xi.
Processing the first delivery data Xi through the single fully connected layer 301 extracts the feature representation vector hc of the first delivery data Xi and can strengthen the influence of the features of the first delivery data on the time delay estimation.
The formula used in the single fully connected layer 301 is as follows:
$h_c = f(W_C X_i + b)$
where $W_C$ and $b$ denote parameter matrices, and the activation function $f$ may be ReLU.
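A minimal sketch of this single fully connected layer with a ReLU activation is given below, using plain Python lists. The weight, bias, and input values are hypothetical.

```python
def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]

def fully_connected(W, x, b):
    """Compute f(W x + b) with f = ReLU, using plain lists as vectors."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return relu(pre)

W_C = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical 2x2 weights
b = [0.0, -2.0]                  # hypothetical bias
X_i = [3.0, 1.0]                 # hypothetical features of the first delivery data
h_c = fully_connected(W_C, X_i, b)
```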
In step S504, the second delivery data is processed to obtain a fusion feature expression vector of the second delivery data.
As shown in fig. 3, the second delivery data Hi may be processed through the multi-layer fully-connected layers 302 and 303, for example, to obtain a fused feature representation vector he of the second delivery data Hi.
In some embodiments, the second placement data may be processed based on the first fully-connected layer 302 structure to obtain a feature representation vector of the second placement data.
The second delivery data Hi may be processed, for example, by the first fully connected layer 302 to extract a feature representation vector hei for each historical click behavior.
In some embodiments, the feature representation vector of the second delivery data may be processed based on the second layer fully connected layer structure 303 to obtain a fused feature representation vector of the second delivery data.
For example, the feature representation vector hei of each historical click behavior may be fused by the second fully connected layer 303, so as to obtain a fused feature representation vector he of the second delivery data Hi.
The formula of the multi-layer fully connected layers is as follows:
$h_e = f(W_C H_i + b)$
where $W_C$ and $b$ denote parameter matrices; the parameter matrices in the formulas of the multi-layer fully connected layers and the single fully connected layer may be the same or different, which the present disclosure does not limit.
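The two-stage processing of historical clicks can be sketched as below. How the per-click vectors are combined before the second layer is not stated in the text; concatenation is an assumption made here, and all names and values are hypothetical.

```python
def dense_relu(W, x, b):
    """One fully connected layer with ReLU: f(W x + b)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def fuse_history(history, W1, b1, W2, b2):
    """First layer: one vector per historical click (shared weights).
    Second layer: fuse the concatenated per-click vectors into one vector.
    Concatenation before the second layer is an assumption."""
    per_click = [dense_relu(W1, click, b1) for click in history]
    concatenated = [v for h in per_click for v in h]
    return dense_relu(W2, concatenated, b2)

history = [[1.0, 0.0], [0.0, 2.0]]   # M = 2 hypothetical historical clicks
W1, b1 = [[1.0, 1.0]], [0.0]         # maps each click to one dimension
W2, b2 = [[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]
h_e = fuse_history(history, W1, b1, W2, b2)
```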
In step S506, the third delivery data is processed based on the dual-layer GRU structure, and an interest feature expression vector of the third delivery data is obtained.
As shown in fig. 3, for example, the third delivery data Si may be processed through the two-layer GRU structures 305 and 306 to obtain the interest feature representation vector h2(l) of the third delivery data Si.
Fig. 6 is a schematic diagram illustrating a network structure of a GRU unit according to an exemplary embodiment.
Fig. 6 is a schematic diagram of the internal structure of a GRU unit in the delayed feedback conversion rate calibration model shown in fig. 3. As shown in fig. 6, the GRU unit is a recurrent neural network unit, and the relationships between its parameters are as follows:
$$z_t = \sigma(W_z \cdot [h_{t-1}, S_t])$$
$$r_t = \sigma(W_r \cdot [h_{t-1}, S_t])$$
[Equation image BDA0002564244270000111]
[Equation image BDA0002564244270000112]
where $S_t$ may represent a feature vector of the third delivery data, $h_{t-1}$ represents the hidden vector output by the previous recurrent step, $h_t$ represents the feature vector output by the current recurrent step, $W_z$, $W_r$ and $W$ represent model parameter matrices, and the quantity shown in image BDA0002564244270000113, together with $r_t$ and $z_t$, represents intermediate results.
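For orientation, a single GRU step can be sketched as below. The update gate and reset gate follow the two formulas given in the text. The candidate state and the final interpolation are not recoverable from this text because the corresponding equation images are missing, so the sketch assumes the standard GRU formulation for them. Bias terms are omitted, and the name `gru_step` and all dimensions are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(s_t, h_prev, Wz, Wr, W):
    """One GRU step on input s_t with previous hidden state h_prev."""
    concat = np.concatenate([h_prev, s_t])
    z_t = sigmoid(Wz @ concat)            # update gate, as in the text
    r_t = sigmoid(Wr @ concat)            # reset gate, as in the text
    # Assumption: standard GRU candidate state and interpolation.
    h_cand = np.tanh(W @ np.concatenate([r_t * h_prev, s_t]))
    return (1.0 - z_t) * h_prev + z_t * h_cand

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
Wz, Wr, W = (rng.normal(size=(d_h, d_h + d_in)) for _ in range(3))
h = np.zeros(d_h)
for s in rng.normal(size=(6, d_in)):      # 6 hypothetical time steps
    h = gru_step(s, h, Wz, Wr, W)
```

Because each step is a gated interpolation between the previous state and a `tanh` output, the hidden state stays bounded in magnitude by 1 when it starts at zero.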
In some embodiments, the third delivery data may be processed based on the first-layer GRU structure 305 to obtain interest feature representation vectors for a plurality of preset intervals.
As shown in fig. 3, for example, the third delivery data $S_i$ may be processed by the first-layer GRU structure 305 to extract the interest feature representation vector $h_1(l)$ of the third delivery data $S_i$ for each preset interval.
The formula in the first-layer GRU structure 305 may be:
$$h_1(l) = G(S_i, h_1(l-1))$$
where $S_i$ may represent the third delivery data, or a dense feature vector of the third delivery data after preprocessing; $h_1(l)$ represents the interest feature representation vector of day $l$, i.e. the interest feature representation vector of the user on a single day; and $G$ represents the function in the GRU structure.
In some embodiments, the interest feature representation vector of each preset interval may be processed based on the second-layer GRU structure 306 to obtain the interest feature representation vector of the third delivery data.
As shown in fig. 3, for example, the second-layer GRU structure 306, stacked on the first-layer GRU structure 305, may process the daily interest feature representation vectors $h_1(l)$. This extracts the relationship among the user's behaviors across days and yields the interest feature representation vector $h_2(l)$ of the third delivery data $S_i$, which strengthens the feature representation of the user's purchase intention.
The formula in the second-layer GRU structure 306 may be:
$$h_2(l) = G(h_1(l), h_2(l-1))$$
where $h_1(l)$ represents the dense feature vector output by the first-layer GRU at each recurrent step, i.e. the interest feature representation vector of the user on a single day, and $h_2(l)$ represents the dense feature vector output by the second-layer GRU at each recurrent step, i.e. the joint interest feature representation vector exhibited by the user across the days.
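The stacking of the two recurrences can be sketched as follows. This is an illustrative reading only: it assumes the third delivery data is supplied as one feature vector per day, that `G` is a standard GRU cell without bias terms, and that the last second-layer state is used as the output. The class `GRUCell`, the function `two_layer_gru`, and all sizes are hypothetical names and values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell (standard form, no biases) standing in for G."""
    def __init__(self, d_in, d_h, rng):
        self.d_h = d_h
        self.Wz, self.Wr, self.W = (
            rng.normal(scale=0.5, size=(d_h, d_h + d_in)) for _ in range(3)
        )

    def __call__(self, x, h_prev):
        hx = np.concatenate([h_prev, x])
        z = sigmoid(self.Wz @ hx)
        r = sigmoid(self.Wr @ hx)
        cand = np.tanh(self.W @ np.concatenate([r * h_prev, x]))
        return (1.0 - z) * h_prev + z * cand

def two_layer_gru(S, layer1, layer2):
    """S has shape (n_days, d_in); returns the final second-layer state."""
    h1 = np.zeros(layer1.d_h)
    h2 = np.zeros(layer2.d_h)
    for s_l in S:
        h1 = layer1(s_l, h1)   # h1(l) = G(S_l, h1(l-1)): single-day interest
        h2 = layer2(h1, h2)    # h2(l) = G(h1(l), h2(l-1)): cross-day interest
    return h2

rng = np.random.default_rng(0)
layer1 = GRUCell(d_in=4, d_h=3, rng=rng)
layer2 = GRUCell(d_in=3, d_h=2, rng=rng)
S = rng.normal(size=(7, 4))    # 7 hypothetical days of post-click features
h2_final = two_layer_gru(S, layer1, layer2)
```

Feeding the first layer's output into the second at every step is what lets the second layer summarize how the daily interest vectors evolve.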
In the embodiment of the disclosure, the third delivery data is processed based on a double-layer GRU structure, and the obtained interest feature expression vector of the third delivery data can dynamically represent the interest change of the user.
In step S508, based on a survival analysis method, the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data are processed to obtain a conversion probability representation vector of the first delivery data.
Survival analysis is a discipline that performs statistical inference on one or more non-negative random variables and studies survival phenomena, response-time data, and their statistical regularities.
Based on the survival analysis method, the embodiments of the present disclosure can estimate how long after the user clicks the advertisement data to be predicted the conversion behavior occurs.
In some embodiments, the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data may be input into the fully connected layer 304 for processing, and a survival function and a risk function of the first delivery data may be determined.
In some embodiments, a conversion probability representation vector of the first delivery data may be determined based on the survival function and the risk function of the first delivery data.
As shown in fig. 3, for example, the feature representation vector $h_c$ of the first delivery data, the fused feature representation vector $h_e$ of the second delivery data, and the interest feature representation vector $h_2(l)$ of the third delivery data may be processed by the fully connected layer 304 to obtain a conversion probability representation vector $h_g$ of the first delivery data.
For example, a survival analysis method may be used to dynamically predict the probability that the conversion behavior occurs on day $d$ after the user clicks the first delivery data. The specific formulas are as follows:
[Equation image BDA0002564244270000131]
$$h_g(d_i \mid X_i, H_i, S_i) = S(d_i \mid X_i, H_i, S_i)\, h(d_i \mid X_i, H_i, S_i)$$
where $X_i$ may represent the first delivery data, or a dense feature vector of the preprocessed first delivery data; $H_i$ may represent the second delivery data; $S_i$ may represent the third delivery data; $S(\cdot)$ represents the survival function; $h(\cdot)$ represents the risk function; and $h_g(\cdot)$ represents the conversion probability representation vector of the first delivery data being converted on day $d$.
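The product of a survival function and a risk (hazard) function is the standard survival-analysis identity for the density of the event time. A small numeric sketch is given below. It assumes a constant hazard `lam`, which gives an exponential delay distribution. The text does not state a parametric form, so this choice and the function names are purely illustrative. In the described model, the survival and risk functions would instead come from the fully connected layer 304.

```python
import math

def survival(d, lam):
    # S(d): probability that no conversion has happened by delay d.
    return math.exp(-lam * d)

def hazard(d, lam):
    # h(d): instantaneous conversion rate; constant under this assumption.
    return lam

def conversion_density(d, lam):
    # S(d) * h(d), the same product form as the formula for h_g.
    return survival(d, lam) * hazard(d, lam)

lam = 0.5                                   # hypothetical rate per day
densities = [conversion_density(d, lam) for d in range(4)]
```

Under a constant hazard the density decreases monotonically with the delay `d`, so most conversions are expected soon after the click.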
Fig. 7 is a flow chart illustrating another delivery data processing method according to an exemplary embodiment.
In addition to the steps of the delivery data processing method 20 shown in fig. 2, the delivery data processing method 70 shown in fig. 7 further includes steps S702 to S704.
In step S702, the first delivery data, the second delivery data, and the third delivery data are preprocessed based on one-hot encoding.
For ID-type information in the first delivery data, the second delivery data, and the third delivery data, such as the ad slot ID (Pos_ID) of an advertisement and the third-level category information (Cid3), one-hot encoding can be used to extract features, so that distances between features can be computed more reasonably.
The one-hot encoded result can then be densified to obtain low-dimensional dense vector representations of the first delivery data, the second delivery data, and the third delivery data, which alleviates the problem of sparse high-dimensional data.
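One-hot encoding followed by densification can be sketched as below. The text does not say how the densification is done. This sketch assumes a learned embedding matrix, which is a common choice. The ad slot IDs, the embedding size, and the helper `one_hot` are hypothetical.

```python
import numpy as np

def one_hot(index, vocab_size):
    # Sparse indicator vector with a single 1 at the given index.
    v = np.zeros(vocab_size)
    v[index] = 1.0
    return v

pos_ids = ["pos_a", "pos_b", "pos_c"]          # hypothetical ad slot IDs
vocab = {p: i for i, p in enumerate(pos_ids)}

rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 2))           # assumed embedding matrix (3 -> 2 dims)

x = one_hot(vocab["pos_b"], len(vocab))        # sparse high-dimensional code
dense = x @ E                                  # low-dimensional dense vector
```

Multiplying a one-hot vector by `E` simply selects one row of `E`, so in practice the product is implemented as a table lookup.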
In step S704, the first delivery data, the second delivery data, and the third delivery data are preprocessed based on a visual attraction model.
For the picture information in the first, second, and third delivery data, a pre-trained picture visual attraction model (Telepath) can be used to extract click interest from the picture information and to generate dense vectorized representations of the key image information of the first, second, and third delivery data.
The picture visual attraction model integrates three neural networks: a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), and a DNN (Deep Neural Network). The CNN can be used to simulate user vision and obtain the key visual signals that make an image attractive, and the DNN and RNN can be used to obtain interest information of the image based on the user's browsing records.
For example, the low-dimensional dense vector representation and the vectorized key image information representation may be combined to obtain a combined feature vector representation of the first delivery data, the second delivery data, and the third delivery data, which serves as the concrete vector representation of the inputs $X_i$, $S_i$, and $H_i$ in the delayed feedback conversion rate calibration model shown in fig. 3.
The same steps of the delivery data processing method 70 as the delivery data processing method 20 are not described in detail herein.
It is noted that the above-mentioned figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 8 is a block diagram of a delivery data processing apparatus according to an exemplary embodiment.
As shown in fig. 8, the delivery data processing apparatus 80 includes: a first data acquisition module 801, a second data acquisition module 802, a third data acquisition module 803, and a data processing module 804.
The first data acquisition module 801 is configured to acquire first delivery data; the second data acquisition module 802 is configured to acquire second delivery data clicked before the first delivery data is clicked; the third data acquisition module 803 is configured to acquire third delivery data clicked after the first delivery data is clicked; and the data processing module 804 is configured to process the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data, wherein the conversion probability representation vector is used for predicting the conversion rate of the first delivery data.
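What the three acquisition modules separate can be illustrated with a small sketch that splits one user's click log around the click on the target advertisement. The record type `Click`, its fields, and the function `split_clicks` are hypothetical. The text does not describe how the data is stored or retrieved.

```python
from dataclasses import dataclass

@dataclass
class Click:
    ad_id: str        # hypothetical identifier of the clicked delivery data
    timestamp: int    # hypothetical click time, e.g. seconds since epoch

def split_clicks(clicks, target_ad_id):
    """Return (first, second, third) delivery data for one user."""
    # First delivery data: the click whose conversion is being predicted.
    target = next(c for c in clicks if c.ad_id == target_ad_id)
    # Second delivery data: clicks that happened before the target click.
    before = [c for c in clicks if c.timestamp < target.timestamp]
    # Third delivery data: clicks that happened after the target click.
    after = [c for c in clicks if c.timestamp > target.timestamp]
    return target, before, after

log = [Click("a", 1), Click("b", 2), Click("x", 3), Click("c", 4)]
first, second, third = split_clicks(log, "x")
```

In this toy log, clicks `a` and `b` form the historical behavior and click `c` forms the post-click behavior for target `x`.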
In some embodiments, the data processing module 804 includes a first processing unit, a second processing unit, a third processing unit, and a fourth processing unit. The first processing unit is configured to process the first delivery data to obtain a feature representation vector of the first delivery data; the second processing unit is configured to process the second delivery data to obtain a fused feature representation vector of the second delivery data; the third processing unit is configured to process the third delivery data based on a two-layer gated recurrent unit (GRU) structure to obtain an interest feature representation vector of the third delivery data; and the fourth processing unit is configured to process the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data based on a survival analysis method to obtain a conversion probability representation vector of the first delivery data.
In some embodiments, the third processing unit comprises: a first interest feature representation vector obtaining unit, configured to process the third delivery data based on the first-layer GRU structure to obtain interest feature representation vectors for a plurality of preset intervals; and a second interest feature representation vector obtaining unit, configured to process the interest feature representation vector of each preset interval based on the second-layer GRU structure to obtain the interest feature representation vector of the third delivery data.
In some embodiments, the fourth processing unit comprises: a function determining unit, configured to input the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data into the fully connected layer for processing, and to determine a survival function and a risk function of the first delivery data; and a conversion probability representation vector obtaining unit, configured to determine a conversion probability representation vector of the first delivery data based on the survival function and the risk function of the first delivery data.
In some embodiments, the second processing unit comprises: a feature representation vector obtaining unit, configured to process the second delivery data based on the first fully connected layer structure to obtain a feature representation vector of the second delivery data; and a fused feature representation vector obtaining unit, configured to process the feature representation vector of the second delivery data based on the second fully connected layer structure to obtain the fused feature representation vector of the second delivery data.
In some embodiments, the delivery data processing apparatus 80 further comprises: a first preprocessing module, configured to preprocess the first delivery data, the second delivery data, and the third delivery data based on one-hot encoding; and a second preprocessing module, configured to preprocess the first delivery data, the second delivery data, and the third delivery data based on the visual attraction model.
In some embodiments, the delivery data processing apparatus 80 further comprises: a conversion probability obtaining module, configured to input the first delivery data and the conversion probability representation vector of the first delivery data into a conversion rate prediction model to obtain the conversion probability of the first delivery data.
The delivery data processing apparatus provided by the embodiments of the present disclosure obtains the conversion probability representation vector of the first delivery data by processing the first delivery data, the second delivery data, and the third delivery data. On one hand, the conversion probability representation vector of the first delivery data can be used to calibrate the conversion rate of the first delivery data, so that the predicted conversion rate of the first delivery data is more accurate; on the other hand, the apparatus can accurately acquire the dynamic behavior data of the user, which can bring a better user experience.
It is noted that the block diagrams shown in the above figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
FIG. 9 is a schematic diagram illustrating a configuration of a computer device, according to an example embodiment. It should be noted that the computer device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiments of the present invention.
As shown in fig. 9, the computer apparatus 900 includes a Central Processing Unit (CPU)901 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. A drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read out therefrom is installed into the storage section 908 as necessary.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are executed when the computer program is executed by a Central Processing Unit (CPU) 901.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a transmitting unit, an obtaining unit, a determining unit, and a first processing unit. The names of these units do not in some cases constitute a limitation to the unit itself, and for example, the sending unit may also be described as a "unit sending a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform:
acquiring first delivery data;
acquiring second delivery data clicked before the first delivery data is clicked;
acquiring third delivery data clicked after the first delivery data is clicked;
processing the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data;
wherein the conversion probability representation vector is used for predicting the conversion rate of the first delivery data.
Exemplary embodiments of the present invention are specifically illustrated and described above. It is to be understood that the invention is not limited to the precise construction, arrangements, or instrumentalities described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A method for processing delivery data, comprising:
acquiring first delivery data;
acquiring second delivery data clicked before the first delivery data is clicked;
acquiring third delivery data clicked after the first delivery data is clicked;
processing the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data;
wherein the conversion probability representation vector is used to predict a conversion rate of the first delivery data.
2. The method of claim 1, wherein processing the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data comprises:
processing the first delivery data to obtain a feature representation vector of the first delivery data;
processing the second delivery data to obtain a fused feature representation vector of the second delivery data;
processing the third delivery data based on a two-layer gated recurrent unit (GRU) structure to obtain an interest feature representation vector of the third delivery data;
and processing the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data based on a survival analysis method to obtain a conversion probability representation vector of the first delivery data.
3. The method of claim 2, wherein processing the third delivery data based on a two-layer GRU structure to obtain an interest feature representation vector of the third delivery data comprises:
processing the third delivery data based on a first-layer GRU structure to obtain interest feature representation vectors for a plurality of preset intervals;
and processing the interest feature representation vector of each preset interval based on a second-layer GRU structure to obtain the interest feature representation vector of the third delivery data.
4. The method of claim 2, wherein processing the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data based on a survival analysis method to obtain a conversion probability representation vector of the first delivery data comprises:
inputting the feature representation vector of the first delivery data, the fused feature representation vector of the second delivery data, and the interest feature representation vector of the third delivery data into a fully connected layer for processing, and determining a survival function and a risk function of the first delivery data;
and determining a conversion probability representation vector of the first delivery data based on the survival function and the risk function of the first delivery data.
5. The method of claim 2, wherein processing the second delivery data to obtain a fused feature representation vector of the second delivery data comprises:
processing the second delivery data based on a first fully connected layer structure to obtain a feature representation vector of the second delivery data;
and processing the feature representation vector of the second delivery data based on a second fully connected layer structure to obtain a fused feature representation vector of the second delivery data.
6. The method of claim 1, further comprising:
preprocessing the first delivery data, the second delivery data, and the third delivery data based on one-hot encoding;
and preprocessing the first delivery data, the second delivery data, and the third delivery data based on a visual attraction model.
7. The method of claim 1, further comprising:
inputting the first delivery data and the conversion probability representation vector of the first delivery data into a conversion rate prediction model to obtain the conversion probability of the first delivery data.
8. A delivery data processing apparatus, comprising:
a first data acquisition module, configured to acquire first delivery data;
a second data acquisition module, configured to acquire second delivery data clicked before the first delivery data is clicked;
a third data acquisition module, configured to acquire third delivery data clicked after the first delivery data is clicked;
a data processing module, configured to process the first delivery data, the second delivery data, and the third delivery data to obtain a conversion probability representation vector of the first delivery data;
wherein the conversion probability representation vector is used to predict a conversion rate of the first delivery data.
9. A computer device, comprising: memory, processor and executable instructions stored in the memory and executable in the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the executable instructions.
10. A computer-readable storage medium having stored thereon computer-executable instructions, which when executed by a processor, implement the method of any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010624443.3A CN113793160A (en) 2020-07-01 2020-07-01 Put-in data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113793160A true CN113793160A (en) 2021-12-14

Family

ID=79181207

Country Status (1)

Country Link
CN (1) CN113793160A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination