CN110458220B

CN110458220B - Crowd orientation method, device, server and storage medium

Info

Publication number: CN110458220B
Application number: CN201910714826.7A
Authority: CN
Inventors: 杨春风
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-07-31
Filing date: 2019-07-31
Publication date: 2024-04-12
Anticipated expiration: 2039-07-31
Also published as: CN110458220A

Abstract

An embodiment of the invention discloses a crowd orientation (crowd targeting) method, device, server and medium. The method comprises: acquiring a reference user set and a candidate user set of a target advertisement; training and optimizing an estimation model using the reference user set and the candidate user set to obtain an optimized estimation model; invoking the optimized estimation model to perform advertisement targeting estimation on each candidate user according to the attribute data of each candidate user in the candidate user set, to obtain a targeting probability for each candidate user, where the targeting probability is the probability that the candidate user generates forward feedback on the target advertisement; and screening attribute data of targeted users out of the candidate user set according to the targeting probability of each candidate user, and adding the attribute data of the targeted users to the targeted crowd data of the target advertisement. The embodiment can perform crowd targeting better and improve the accuracy of the targeted crowd data.

Description

Crowd orientation method, device, server and storage medium

Technical Field

The invention relates to the technical field of Internet, in particular to the technical field of advertisement delivery, and particularly relates to a crowd orientation method, a crowd orientation device, a server and a computer storage medium.

Background

Advertising, as the name implies, means informing the public of something. In the narrow sense, advertising refers to the means by which an advertiser conveys information about goods or services to consumers or users through an advertising media platform in return for payment. At present, in the process of delivering a target advertisement, crowd targeting is generally performed for the target advertisement to determine its targeted crowd, where the targeted crowd comprises the potential audience related to the target advertisement; the target advertisement is then delivered to the targeted crowd. Crowd targeting is therefore a very important link in advertisement delivery, and the accuracy of the targeted crowd is closely tied to the advertising effect. How to perform crowd targeting better, so as to improve the accuracy of the targeted crowd, has thus become a research hotspot.

Disclosure of Invention

The embodiment of the invention provides a crowd orientation method, a device, a server and a computer storage medium, which can better orient the crowd and improve the accuracy of oriented crowd data.

In one aspect, an embodiment of the present invention provides a crowd orientation method, including:

acquiring a reference user set and a candidate user set of a target advertisement; the reference user set comprises attribute data of a plurality of reference users, wherein the reference users refer to users capable of generating forward feedback on the target advertisement; the candidate user set comprises attribute data of a plurality of candidate users, wherein the candidate users are users to be oriented;

training and optimizing an estimation model using the reference user set and the candidate user set to obtain an optimized estimation model;

invoking the optimized estimation model to perform advertisement orientation estimation on each candidate user according to attribute data of each candidate user in the candidate user set to obtain orientation probability of each candidate user, wherein the orientation probability refers to probability of generating forward feedback on the target advertisement by the candidate user;

and screening attribute data of targeted users out of the candidate user set according to the targeting probability of each candidate user, and adding the attribute data of the targeted users to the targeted crowd data of the target advertisement.

In another aspect, an embodiment of the present invention provides a crowd orientation device, including:

the acquisition unit is used for acquiring a reference user set and a candidate user set of the target advertisement; the reference user set comprises attribute data of a plurality of reference users, wherein the reference users refer to users capable of generating forward feedback on the target advertisement; the candidate user set comprises attribute data of a plurality of candidate users, wherein the candidate users are users to be oriented;

the optimizing unit is used for training and optimizing an estimation model using the reference user set and the candidate user set to obtain an optimized estimation model;

the processing unit is used for calling the optimized estimation model to perform advertisement orientation estimation on each candidate user according to the attribute data of each candidate user in the candidate user set to obtain the orientation probability of each candidate user, wherein the orientation probability refers to the probability that the candidate user generates forward feedback on the target advertisement;

and the processing unit is further used for screening the attribute data of targeted users out of the candidate user set according to the targeting probability of each candidate user, and adding the attribute data of the targeted users to the targeted crowd data of the target advertisement.

In yet another aspect, an embodiment of the present invention provides a server, where the server includes a communication interface, and the server further includes:

a processor adapted to implement one or more instructions; and

a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the steps of:

In yet another aspect, embodiments of the present invention provide a computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the steps of:

In the crowd targeting process, the embodiment of the invention first acquires the reference user set and the candidate user set of the target advertisement, where the reference user set includes attribute data of a plurality of reference users and the candidate user set includes attribute data of a plurality of candidate users. Because a reference user is a user capable of generating forward feedback on the target advertisement, and the candidate users are the users to be targeted, the estimation model is trained and optimized using the reference user set and the candidate user set, and the optimized estimation model is then invoked to perform advertisement targeting estimation on each candidate user according to that user's attribute data, yielding a targeting probability for each candidate user. In this way the training space and the prediction space of the estimation model are consistent, which improves the accuracy of each candidate user's targeting probability. Because the targeting probability is the probability that a candidate user generates forward feedback on the target advertisement, the attribute data of targeted users can be screened out of the candidate user set according to the targeting probability of each candidate user and added to the targeted crowd data of the target advertisement; improving the accuracy of the targeting probability thus improves the accuracy of the targeted crowd data.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1a is a system architecture diagram of an advertisement delivery provided by an embodiment of the present invention;

FIG. 1b is a schematic diagram of a crowd-targeting scenario provided by an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a crowd orientation method according to an embodiment of the invention;

FIG. 3 is a schematic diagram of a spatial offset phenomenon according to an embodiment of the present invention;

FIG. 4 is a flow chart of a crowd direction method according to another embodiment of the invention;

FIG. 5a is an application scenario diagram of advertisement delivery provided by an embodiment of the present invention;

FIG. 5b is an application scenario diagram of another advertisement delivery provided by an embodiment of the present invention;

FIG. 5c is a flow chart of another crowd direction method provided by an embodiment of the invention;

FIG. 5d is an application scenario diagram of another advertisement delivery provided by an embodiment of the present invention;

FIG. 5e is an application scenario diagram of another advertisement delivery provided by an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a crowd orientation device according to an embodiment of the invention;

FIG. 7 is a schematic structural diagram of a server according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

Crowd targeting refers to the process of screening out the potential audience most relevant to a target advertisement. A member of the potential audience may also be called a targeted user, that is, a potential recipient who has a high probability of receiving the target advertisement. Research shows that, in delivering the target advertisement, the accuracy of crowd targeting is generally proportional to the reach rate of the target advertisement: the higher the accuracy of crowd targeting, the higher the reach rate. The reach rate, also called the exposure rate, is the ratio of the number of actually exposed users in the targeted crowd to the total number of users in the targeted crowd, where the number of actually exposed users is the number of users who have advertisement exposure behavior (the behavior of seeing the target advertisement) for the target advertisement within a preset time period. The reach rate therefore measures the proportion of users in the targeted crowd who actually see the target advertisement. For example, if the targeted crowd contains 500 users and 450 of them have advertisement exposure behavior for the target advertisement within the preset time period, the reach rate (exposure rate) is 450/500 = 90%; if only 200 of them have advertisement exposure behavior within the preset time period, the reach rate is 200/500 = 40%. The preset time period may be set according to actual service requirements.
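
The reach rate arithmetic in the example above can be written out as a small helper. This is only an illustrative sketch; the function name `reach_rate` and its argument names are assumptions introduced here, not terms from the embodiment.

```python
def reach_rate(exposed_users: int, targeted_users: int) -> float:
    """Share of the targeted crowd that actually saw the target advertisement."""
    if targeted_users <= 0:
        raise ValueError("targeted_users must be positive")
    if not 0 <= exposed_users <= targeted_users:
        raise ValueError("exposed_users must lie between 0 and targeted_users")
    return exposed_users / targeted_users


# The two cases from the text: a targeted crowd of 500 users.
print(f"{reach_rate(450, 500):.0%}")  # 90%
print(f"{reach_rate(200, 500):.0%}")  # 40%
```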

Crowd targeting is therefore a very important link in advertisement delivery; neither the advertiser (i.e., the user who needs to deliver advertisements) nor the traffic party (e.g., the advertising system) can do without it. For the advertiser, a low exposure rate means that the advertising budget is not consumed effectively and that a user crowd of sufficient scale is not reached; for the traffic party, a low exposure rate affects its revenue (especially for advertisements billed by exposure). To perform crowd targeting better and improve the subsequent advertisement delivery effect, the embodiment of the invention therefore provides a crowd targeting scheme. The scheme can be applied in an advertising system, that is, a system that provides an advertising media platform and delivers advertisements on that platform for advertisers for a fee, such as the SPA MI system of Tencent.

In one embodiment, the advertising system may be based on human-computer interaction through web pages, and generally comprises a front end and a background. The front end is the foreground part of the advertising system; it runs in a browser on a terminal and presents web pages for the advertiser to browse. The background is a background server that performs a series of data management operations for the front end in order to provide advertising services to advertisers. In another embodiment, the advertising system may be based on human-computer interaction through a client, and generally comprises a client and a server side. The client is an application (APP) installed and run on a terminal that provides advertising services to advertisers; the server side is the background server corresponding to the client, which supports the client in providing advertising services. For ease of description, unless otherwise specified, the advertising system in the following embodiments is described using the web-page-based advertising system as an example. The terminals mentioned above may include, but are not limited to: mobile devices such as smartphones, laptops and tablet computers, and desktop computers. The target advertisement may include, but is not limited to: television advertisements, movie advertisements, web page advertisements, video advertisements, and the like. The advertising media platform may include, but is not limited to: television, movies, web pages, video playback clients (e.g., the Tencent Video client), instant messaging clients (e.g., the Tencent QQ client and the WeChat client), and the like.

When an advertiser wants to perform crowd targeting for a target advertisement, a crowd targeting request may be sent to the background server (hereinafter, the server) through the front end of the advertising system, as shown in FIG. 1a. After receiving the crowd targeting request, the server may respond by executing the crowd targeting scheme to determine the targeted crowd data of the target advertisement, as shown in FIG. 1b; the targeted crowd data comprises attribute data of a plurality of targeted users, and these targeted users form the targeted crowd of the target advertisement. In a specific implementation, the server may first acquire a seed crowd data set and a large-disc user data set of the target advertisement. The seed crowd data set may comprise attribute data of a plurality of seed users, a seed user being a user capable of generating forward feedback on the target advertisement; the large-disc user data set may comprise attribute data of a plurality of large-disc users, a large-disc user being a user to be targeted. Second, a training set for training an estimation model is determined from the seed crowd data set and the large-disc user data set, where the estimation model is a model that can perform advertisement targeting estimation on a user (a seed user, a large-disc user, and so on) to obtain the probability that the user generates forward feedback on the target advertisement. Third, the estimation model is trained and optimized with the training set, and the optimized estimation model is used to perform advertisement targeting estimation on the large-disc users, giving the probability that each large-disc user generates forward feedback on the target advertisement. Finally, the large-disc users may be ranked by this probability to determine the targeted users, and the attribute data of the targeted users is added to the targeted crowd data.

In this crowd targeting scheme, therefore, the optimized estimation model is obtained by training and optimizing the estimation model in real time. A training set is first determined from the seed crowd data set and the large-disc user data set and used to train and optimize the estimation model; the optimized estimation model is then invoked to perform advertisement targeting estimation on each large-disc user. The training space and the prediction space of the estimation model are thus consistent, so the estimated probability that each large-disc user generates forward feedback on the target advertisement is more accurate, which in turn improves the accuracy of the targeted crowd data. Delivering the target advertisement based on more accurate targeted crowd data can raise the reach rate of the target advertisement and thereby improve its delivery effect.
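
The overall flow (build a training set from the seed crowd and the large-disc users, train, score every large-disc user, rank) can be sketched as follows. This is a hedged illustration, not the embodiment's implementation: the text here does not specify the model type or how the training set is labeled, so the tiny hand-written logistic regression, the choice to label seed users 1 and large-disc users 0, the two-dimensional feature vectors, and all function names are assumptions made purely for the example.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, epochs=500, lr=0.5):
    """Fit weights and a bias by batch gradient descent (illustrative only)."""
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def score(model, x) -> float:
    """Estimated probability of forward feedback for one feature vector."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical feature vectors derived from attribute data, keyed by user id.
seed_users = {"s1": [1.0, 0.9], "s2": [0.9, 1.0], "s3": [0.8, 0.8]}
large_disc_users = {
    "u1": [0.9, 0.9], "u2": [0.1, 0.2], "u3": [0.2, 0.1], "u4": [0.5, 0.4],
}

# Assumed labeling: seed users are positives, large-disc users are negatives.
samples = list(seed_users.values()) + list(large_disc_users.values())
labels = [1] * len(seed_users) + [0] * len(large_disc_users)
model = train_logistic(samples, labels)

# Score the same large-disc users the model was trained on, then rank them.
probabilities = {uid: score(model, x) for uid, x in large_disc_users.items()}
ranking = sorted(probabilities, key=probabilities.get, reverse=True)
print(ranking)
```

Because the large-disc users appear both in the training set and in the set being scored, the training space and the prediction space coincide, which is the property the scheme relies on.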

It should be emphasized that the embodiments of the present application involve data related to user information (such as users' attribute data). When any method embodiment proposed in the present application is applied to a specific product or technology, the related data is collected with the user's permission or consent, and its collection, use and processing comply with the relevant laws, regulations and standards of the relevant region.

Based on the above description, the embodiments of the present invention provide a crowd direction method, which may be performed by the above-mentioned server. Referring to fig. 2, the crowd direction method may include the following steps S201-S204:

S201, a reference user set and a candidate user set of the target advertisement are obtained.

When an advertiser wants to perform crowd targeting for a target advertisement to determine its targeted crowd, the advertiser can send a crowd targeting request for the target advertisement to the server through the front end of the advertising system. After receiving the crowd targeting request, the server may respond by acquiring a reference user set and a candidate user set of the target advertisement. The candidate user set may comprise attribute data of a plurality of candidate users, a candidate user being a user to be targeted (i.e., one of the aforementioned large-disc users). The reference user set may comprise attribute data of a plurality of reference users, a reference user being a user capable of generating forward feedback on the target advertisement. Forward feedback means that the user saw or clicked the target advertisement while it was being delivered to that user; correspondingly, not seeing the target advertisement while it was being delivered may be called negative feedback. For example, suppose the target advertisement is delivered to user A on 7.20 (July 20). If user A sees or clicks the target advertisement on that day, user A has given forward feedback on the target advertisement and may serve as a reference user; if user A does not see the target advertisement on that day, user A has given negative feedback and is not a reference user.
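
The forward/negative feedback rule in the user A example can be expressed as a simple predicate over delivery records. The record type `DeliveryRecord` and its fields are hypothetical names chosen for this sketch; the embodiment does not prescribe a record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryRecord:
    """One delivery of the target advertisement to one user (hypothetical layout)."""
    user_id: str
    saw_ad: bool
    clicked_ad: bool


def has_forward_feedback(record: DeliveryRecord) -> bool:
    """Forward feedback: the user saw or clicked the delivered advertisement."""
    return record.saw_ad or record.clicked_ad


records = [
    DeliveryRecord("A", saw_ad=True, clicked_ad=False),   # saw the ad: forward feedback
    DeliveryRecord("B", saw_ad=False, clicked_ad=False),  # did not see it: negative feedback
]
reference_users = [r.user_id for r in records if has_forward_feedback(r)]
print(reference_users)  # ['A']
```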

The attribute data mentioned above may comprise at least a user identifier and a user portrait; that is, the attribute data of a reference user comprises the user identifier and user portrait of that reference user, and the attribute data of a candidate user comprises the user identifier and user portrait of that candidate user. A user identifier is a number that uniquely identifies a user, and may include at least one of: a social account (e.g., a QQ number or WeChat account), a device identifier (e.g., the IMEI (International Mobile Equipment Identity) of an Android device or the IFA (a device identifier) of an iOS device), a phone number, an identification number, and the like. A user portrait consists of tags that characterize the user concretely, and may include at least one of: gender, age, personality, interests (hobbies), and the like.
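
One possible in-memory shape for such attribute data is sketched below. The class names, field names and sample values are assumptions for illustration only; the embodiment requires just that attribute data carry a user identifier and a user portrait.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserPortrait:
    """Tags describing the user; every field is optional in this sketch."""
    gender: Optional[str] = None
    age: Optional[int] = None
    personality: Optional[str] = None
    interests: List[str] = field(default_factory=list)


@dataclass
class AttributeData:
    """Attribute data of one reference or candidate user."""
    user_id: str            # e.g. a social account or a device identifier
    portrait: UserPortrait


# Hypothetical candidate user; the identifier value is made up.
candidate = AttributeData(
    user_id="device:0000-example",
    portrait=UserPortrait(gender="female", age=28, interests=["travel", "music"]),
)
print(candidate.user_id, candidate.portrait.interests)
```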

S202, training and optimizing the estimated model by adopting a reference user set and a candidate user set to obtain an optimized estimated model.

To train and optimize the estimation model well, so that the optimized estimation model performs better, the embodiment of the invention first tried using historical delivery data of the target advertisement to train and optimize the estimation model, where the historical delivery data comprises attribute data of a plurality of historical users, and then tested the performance of the optimized estimation model. Practice showed, however, that an optimized estimation model obtained from historical delivery data performs poorly: when it is used for advertisement delivery, the reach rate (i.e., exposure rate) of the target advertisement is low. The embodiment of the invention analyzed this problem and found that the main cause of the low exposure rate is as follows:

When the estimation model is trained and optimized with historical delivery data, the attribute data of historical users who have both advertisement exposure behavior and advertisement click behavior (the behavior of clicking the advertisement) is selected as positive samples, and the attribute data of historical users who have advertisement exposure behavior but no advertisement click behavior is selected as negative samples. The training set used by this approach therefore consists entirely of attribute data of historical users to whom the target advertisement was actually shown; that is, both positive and negative samples have been exposed. The prediction set, by contrast, is the set of candidate users (i.e., large-disc users), who do not necessarily have advertisement exposure behavior. Training and optimizing the estimation model with such a training set amounts to assuming that every candidate user in the prediction set will be exposed to the target advertisement. This assumption does not hold in crowd targeting, because nothing guarantees that each candidate user will have advertisement exposure behavior. In other words, when the estimation model is trained with historical delivery data, the training space covers only the user group with advertisement exposure behavior (historical users with both exposure and click behavior, and historical users with exposure but no click behavior), whereas the prediction space is all candidate users in the candidate user set. A spatial offset, i.e. sample selection bias, therefore arises between the training space and the prediction space, as shown in FIG. 3.
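
The mismatch between the two spaces can be made concrete with plain set operations. The user ids and counts below are invented for illustration; only the way positives, negatives and the prediction space are defined follows the text.

```python
# Hypothetical ids: ten candidate users, of whom only three were ever shown the ad.
candidate_users = {f"u{i}" for i in range(1, 11)}
exposed_history = {"u1", "u2", "u3"}   # historical users with exposure behavior
clicked_history = {"u1"}               # exposed users who also clicked

# Training on historical delivery data: every sample has been exposed.
positives = exposed_history & clicked_history
negatives = exposed_history - clicked_history
training_space = positives | negatives

# Prediction runs over the whole candidate user set.
prediction_space = candidate_users

never_seen_in_training = prediction_space - training_space
coverage = len(training_space & prediction_space) / len(prediction_space)

print(sorted(positives), sorted(negatives))
print(f"training space covers {coverage:.0%} of the prediction space")
```

In this toy case the model would be asked to score seven users of a kind it never saw during training, which is the spatial offset described above.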

Studies have shown that the spatial offset between the training space and the prediction space described above is a major cause of the low exposure rate. A low exposure rate in turn leads to a low click rate. The click rate (CTR) is the ratio of the number of actual click users to the number of actually exposed users in a specified crowd, where the number of actual click users is the number of users who, after seeing the target advertisement, have advertisement click behavior for it within a preset time period. For example, if the targeted crowd contains 500 users and 400 of them have advertisement exposure behavior for the target advertisement within the preset time period, the number of actually exposed users is 400; if 200 of those 400 users click the target advertisement within the preset time period, the number of actual click users is 200 and the click rate is 200/400 = 50%. The click rate measures the click effect of the target advertisement after exposure in the targeted crowd: the higher the click rate, the better the click effect, and the lower the click rate, the worse the click effect.

Therefore, to avoid an offset between the training space and the prediction space, the embodiment of the invention trains and optimizes the estimation model with the reference user set and the candidate user set, so that the training space and the prediction space of the estimation model are consistent. This ensures that the optimized estimation model performs well, so that the exposure rate or click rate of the target advertisement can be improved when the optimized estimation model is subsequently used for advertisement delivery.
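
The click rate example (500 targeted users, 400 exposed, 200 clicking) is worth spelling out, because the denominator is the number of exposed users rather than the size of the targeted crowd. The function name `click_rate` is an assumption introduced for this sketch.

```python
def click_rate(clicking_users: int, exposed_users: int) -> float:
    """CTR: users who clicked divided by users who were actually exposed."""
    if exposed_users <= 0:
        raise ValueError("exposed_users must be positive")
    if not 0 <= clicking_users <= exposed_users:
        raise ValueError("clicking_users must lie between 0 and exposed_users")
    return clicking_users / exposed_users


targeted, exposed, clicked = 500, 400, 200
print(f"exposure rate: {exposed / targeted:.0%}")            # 80%
print(f"click rate:    {click_rate(clicked, exposed):.0%}")  # 50%
```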

S203, invoking an optimized estimation model to perform advertisement targeting estimation on each candidate user according to the attribute data of each candidate user in the candidate user set, and obtaining the targeting probability of each candidate user.

After the optimized estimation model is obtained, it can be invoked to perform advertisement targeting estimation on each candidate user according to the attribute data of each candidate user in the candidate user set, giving the targeting probability of each candidate user, where the targeting probability is the probability that the candidate user generates forward feedback on the target advertisement. As described above, forward feedback means seeing or clicking the target advertisement while it is being delivered; hence the greater a candidate user's targeting probability, the more likely that candidate user is to see or click the target advertisement, and the more likely that candidate user is to become a targeted user of the target advertisement. After the targeting probability of each candidate user is obtained, the attribute data of targeted users may therefore be screened out of the candidate user set according to those probabilities and added to the targeted crowd data of the target advertisement; that is, step S204 may be performed. Note that when forward feedback means seeing the target advertisement while it is being delivered, the targeting probability is an exposure probability; when forward feedback means clicking the target advertisement while it is being delivered, the targeting probability is a click probability.

S204, screening out attribute data of the targeted users from the candidate user sets according to the targeting probability of each candidate user, and adding the attribute data of the targeted users into the targeting crowd data of the targeted advertisements.

To further increase the exposure rate or click rate of the target advertisement, when executing step S204 the server may select candidate users with larger targeting probabilities as targeted users, obtain their attribute data from the candidate user set, and add it to the targeted crowd data. In a specific implementation, the crowd targeting request may carry a targeted crowd size (the number of users the targeted crowd should contain). The server can sort the attribute data of the candidate users in the candidate user set in descending order of targeting probability to obtain a sorted set, and then select attribute data of candidate users from the sorted set in order, up to the targeted crowd size, as the attribute data of the targeted users.

In practice, the advertiser may also choose whether every reference user in the reference user set should be a targeted user. If the advertiser chooses to include them, the crowd targeting request additionally carries indication information for targeting the target advertisement to each reference user. In this case, when selecting attribute data of candidate users from the sorted set, the server first computes the difference between the targeted crowd size (the number of users the targeted crowd should contain) and the number of reference users to obtain a screening number, and then selects attribute data of that many candidate users from the sorted set in order as attribute data of targeted users. The server also adds the attribute data of each reference user in the reference user set to the targeted crowd data of the target advertisement. If the advertiser chooses not to include the reference users, the crowd targeting request does not carry the indication information; in this case, the server directly selects attribute data of candidate users from the sorted set in order, up to the targeted crowd size, as the attribute data of the targeted users.

For example, suppose the crowd targeting request carries a targeted crowd size of 50, the reference user set includes attribute data of 10 reference users, and the candidate user set includes attribute data of 60 candidate users. Sorting the attribute data of the candidate users in descending order of targeting probability gives the sorted set: attribute data of candidate user 1, attribute data of candidate user 2, attribute data of candidate user 3, ..., attribute data of candidate user 60. If the crowd targeting request carries the indication information for targeting the target advertisement to each reference user, the screening number is 50 - 10 = 40; attribute data of 40 candidate users is then selected from the sorted set in order and added to the targeted crowd data, together with the attribute data of the 10 reference users. The targeted crowd data in this case comprises the attribute data of the 10 reference users and the attribute data of candidate users 1 through 40. If the crowd targeting request does not carry the indication information, attribute data of 50 candidate users is selected from the sorted set in order and added to the targeted crowd data; the targeted crowd data in this case comprises the attribute data of candidate users 1 through 50.
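
The selection rule of step S204, including the optional inclusion of reference users, can be sketched as below and checked against the 50/10/60 example. The function and parameter names are assumptions, the scores are synthetic, and the sketch does not handle cases the text leaves open (for instance a reference user who also appears among the candidates, or more reference users than the requested crowd size).

```python
from typing import List, Sequence, Tuple


def select_targeted_users(
    scored_candidates: Sequence[Tuple[str, float]],
    crowd_size: int,
    reference_users: Sequence[str] = (),
    include_reference: bool = False,
) -> List[str]:
    """Pick targeted users by descending targeting probability."""
    ranked = [
        uid for uid, _ in sorted(scored_candidates, key=lambda p: p[1], reverse=True)
    ]
    if include_reference:
        # Screening number = requested crowd size minus the reference users.
        screening_number = max(crowd_size - len(reference_users), 0)
        return list(reference_users) + ranked[:screening_number]
    return ranked[:crowd_size]


# Synthetic scores: candidate_1 has the highest probability, candidate_60 the lowest.
scored = [(f"candidate_{i}", 1.0 - i / 100) for i in range(1, 61)]
references = [f"reference_{i}" for i in range(1, 11)]

with_refs = select_targeted_users(scored, 50, references, include_reference=True)
without_refs = select_targeted_users(scored, 50)

print(len(with_refs), with_refs[-1])        # 50 candidate_40
print(len(without_refs), without_refs[-1])  # 50 candidate_50
```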

Fig. 4 is a flow chart of another crowd orientation method according to an embodiment of the invention. The crowd orientation method may be performed by the server mentioned above. Referring to Fig. 4, the crowd orientation method may include the following steps S401 to S406:

S401, acquiring a reference user set and a candidate user set of the target advertisement.

The server can acquire a reference user set and a candidate user set of the target advertisement after receiving the crowd-oriented request; the crowd-targeting request may carry at least a crowd-targeting goal of the targeted advertisement and a platform identification, where the platform identification includes: a platform identification of an advertising media platform on which the targeted advertisement is placed or a platform identification of an advertising system. Specifically, if the advertiser does not specify the advertising media platform for delivering the target advertisement, the platform identifier carried by the crowd-oriented request is the platform identifier of the advertising system; if the advertiser designates the advertising media platform for delivering the target advertisement, the platform identifier carried by the crowd-oriented request is the platform identifier of the advertising media platform designated by the advertiser.

Correspondingly, when the server acquires the candidate user set of the target advertisement, if the platform identifier is that of an advertising media platform on which the target advertisement is delivered, the server may first determine the advertising media platform corresponding to the platform identifier, then take all registered users of that platform as candidate users, obtain their attribute data, and add the attribute data to the candidate user set. Alternatively, only the recently active registered users of the advertising media platform are taken as candidate users, and their attribute data are obtained and added to the candidate user set; a recently active registered user is a registered user who has logged in to the advertising media platform more than a preset number of times within a preset time period counted back from the current system time. If the platform identifier is that of the advertisement system, all advertisement users in the advertisement user list stored in the advertisement system (such as historical users to whom any advertisement has been delivered and advertisers who have placed any advertisement) may be taken as candidate users, and their attribute data obtained and added to the candidate user set. Alternatively, only the recently active advertisement users in the advertisement user list are taken as candidate users, and their attribute data are obtained and added to the candidate user set.

The reference user set of the target advertisement may be uploaded by the advertiser, in which case the reference users are users specified by the advertiser; or it may be actively reported by a third party client, in which case the reference users are offline converted users. The third party client is a client capable of performing service processing on the advertisement object corresponding to the target advertisement; the advertisement object is the object promoted through the target advertisement and may be a commodity, an application program (APP), a website, and the like; an offline converted user is a historical user who performed a service operation on the advertisement object through the target advertisement. For example, if the target advertisement is an advertisement for a class A commodity, the advertisement object is the class A commodity, the third party client is a client capable of purchasing the commodity (such as a shopping client), and the offline converted users are historical users who purchased the class A commodity through the target advertisement. For another example, if the target advertisement is an advertisement for a social APP, the advertisement object is the social APP, the third party client is a client capable of downloading the social APP (such as an APP store client), and the offline converted users are historical users who downloaded the social APP through the target advertisement, and so on. Accordingly, a specific implementation of obtaining the reference user set of the target advertisement may be: receiving a crowd targeting request for the target advertisement, the request carrying a seed user list, where the seed user list comprises attribute data of a plurality of seed users and is uploaded by the advertiser or the third party client; and adding the attribute data of each seed user in the seed user list, as attribute data of a reference user, to the reference user set of the target advertisement.

In another embodiment, the reference user set of the target advertisement can be collected automatically by the server according to the historical delivery of the target advertisement. Accordingly, a specific implementation of obtaining the reference user set of the target advertisement may be as follows. First, a crowd targeting request for the target advertisement is received, the request carrying a crowd-oriented target for the target advertisement. Second, a historical delivery flow table of the target advertisement is obtained; the historical delivery flow table includes at least attribute data and behavior data of historical users, where the behavior data of a historical user indicates whether the historical user has advertisement exposure behavior and advertisement click behavior with respect to the target advertisement. Then, attribute data of reference users are obtained from the historical delivery flow table according to the crowd-oriented target and added to the reference user set of the target advertisement.

When the attribute data of the reference users are obtained from the historical delivery flow table according to the crowd-oriented target: if the crowd-oriented target is crowd orientation based on the exposure rate, the attribute data of historical users with advertisement exposure behavior in the historical delivery flow table are used as the attribute data of the reference users; if the crowd-oriented target is crowd orientation based on the click rate, or crowd orientation that combines the exposure rate and the click rate with the weight of the exposure rate equal to or smaller than the weight of the click rate, the attribute data of historical users with advertisement click behavior in the historical delivery flow table are used as the attribute data of the reference users; if the crowd-oriented target is crowd orientation that combines the exposure rate and the click rate with the weight of the exposure rate larger than the weight of the click rate, both the attribute data obtained by sampling the attribute data of historical users with advertisement exposure behavior in the historical delivery flow table and the attribute data of historical users with advertisement click behavior in the historical delivery flow table are used as the attribute data of the reference users.
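The three cases above can be sketched as a single selection function. This is a minimal sketch under stated assumptions: the record fields `user_id`, `exposed`, and `clicked`, the objective names, the `sample_rate` parameter, and the de-duplication of users that fall into both groups are all hypothetical and are not specified by the embodiment.

```python
import random
from typing import Dict, List


def pick_reference_users(
    flow_table: List[Dict],
    objective: str,
    exposure_weight: float = 0.0,
    click_weight: float = 0.0,
    sample_rate: float = 0.5,
    seed: int = 0,
) -> List[Dict]:
    """Choose reference users from the historical delivery flow table."""
    exposed = [row for row in flow_table if row["exposed"]]
    clicked = [row for row in flow_table if row["clicked"]]

    if objective == "exposure":
        return exposed
    if objective == "click":
        return clicked
    if objective == "joint":
        if exposure_weight <= click_weight:
            return clicked
        # Exposure weighs more: sampled exposure data plus click data.
        rng = random.Random(seed)
        sampled = rng.sample(exposed, int(len(exposed) * sample_rate))
        # Assumption: a user present in both groups is kept only once.
        merged = {row["user_id"]: row for row in sampled + clicked}
        return list(merged.values())
    raise ValueError(f"unknown objective: {objective}")


flow_table = [
    {"user_id": 1, "exposed": True, "clicked": True},
    {"user_id": 2, "exposed": True, "clicked": False},
    {"user_id": 3, "exposed": True, "clicked": False},
    {"user_id": 4, "exposed": False, "clicked": False},
]
by_exposure = pick_reference_users(flow_table, "exposure")
by_click = pick_reference_users(flow_table, "click")
joint = pick_reference_users(
    flow_table, "joint", exposure_weight=0.7, click_weight=0.3
)
```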

It should be noted that the historical delivery flow table may also include a time tag for the behavior data of each historical user; the time tag identifies when the behavior data was generated, that is, when the historical user's advertisement exposure behavior or advertisement click behavior occurred. Correspondingly, when the server obtains the attribute data of the reference users from the historical delivery flow table according to the crowd-oriented target, the server may combine the time tag with the crowd-oriented target. For example, if the crowd-oriented target is crowd orientation based on the exposure rate, the attribute data of historical users in the historical delivery flow table who had advertisement exposure behavior recently (within a preset time period counted back from the current system time) may be used as the attribute data of the reference users.

In yet another embodiment, the reference user set of the target advertisement may include attribute data of reference users obtained in at least two of the following ways: uploaded by the advertiser, actively reported by the third party client, and automatically collected by the server according to the historical delivery of the target advertisement. For example, suppose the reference user set needs attribute data of 50 reference users, and the seed user list uploaded by the advertiser includes attribute data of only 30 seed users. In addition to adding the attribute data of the 30 seed users in the advertiser's seed user list to the reference user set as attribute data of reference users, the server can obtain attribute data of 20 seed users from the seed user list actively reported by the third party client and add them to the reference user set as attribute data of reference users; alternatively, the server can obtain attribute data of 20 historical users from the historical delivery flow table according to the crowd-oriented target and add them to the reference user set as attribute data of reference users.
For another example, suppose the reference user set needs attribute data of 100 reference users, the seed user list uploaded by the advertiser includes attribute data of only 30 seed users, and the seed user list uploaded by the third party client includes attribute data of only 20 seed users. In addition to adding the attribute data of the 30 seed users from the advertiser's list and the 20 seed users from the third party client's list to the reference user set as attribute data of reference users, the server may obtain attribute data of 50 reference users from the historical delivery flow table according to the crowd-oriented target and add them to the reference user set.

It should be noted that, in a specific implementation, priorities may be set in advance for the three ways (obtaining from the seed user list uploaded by the advertiser, obtaining from the seed user list uploaded by the third party client, and automatic collection according to the historical delivery of the target advertisement) according to actual service requirements, so that the attribute data of the reference users are obtained by applying the three ways in order of priority.
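A priority-ordered fill of the reference user set might look like the following sketch. The function name `build_reference_set`, the `user_id` field, and the skipping of users already taken from a higher-priority source are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def build_reference_set(
    sources_by_priority: Sequence[List[Dict]], required: int
) -> List[Dict]:
    """Fill the reference user set from sources in priority order."""
    reference_set: List[Dict] = []
    seen = set()
    for source in sources_by_priority:
        for user in source:
            if len(reference_set) >= required:
                return reference_set
            # Assumption: a user already taken from an earlier source is skipped.
            if user["user_id"] in seen:
                continue
            seen.add(user["user_id"])
            reference_set.append(user)
    return reference_set


# The second example above: 100 needed, 30 from the advertiser,
# 20 from the third party client, the rest from the flow table.
advertiser_list = [{"user_id": f"adv_{i}"} for i in range(30)]
third_party_list = [{"user_id": f"tp_{i}"} for i in range(20)]
flow_table_users = [{"user_id": f"hist_{i}"} for i in range(200)]

reference_set = build_reference_set(
    [advertiser_list, third_party_list, flow_table_users], required=100
)
from_flow_table = [u for u in reference_set if u["user_id"].startswith("hist_")]
```

Here 50 users end up being drawn from the historical delivery flow table, matching the count in the example.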

S402, training and optimizing the estimation model by adopting the reference user set and the candidate user set to obtain an optimized estimation model.

In the embodiment of the invention, the estimation model is a model that performs advertisement targeting estimation on a user to obtain the probability that the user generates forward feedback on the target advertisement. To improve the performance of the estimation model, the embodiment of the invention uses Machine Learning (ML) technology from the field of Artificial Intelligence (AI) to train and optimize the estimation model, so that the optimized estimation model performs advertisement targeting estimation more accurately. Artificial intelligence here refers to the theories, methods, techniques, and application systems that use a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. Machine learning is the core of artificial intelligence; it is a multi-disciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. Machine learning studies how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve performance; training and optimizing the estimation model with machine learning improves its performance in this way. In a specific implementation, step S402 may include the following steps S11-S13:

S11, taking attribute data of each reference user in the reference user set as a positive sample.

Since the reference user refers to a user capable of generating forward feedback to the targeted advertisement, attribute data of the reference user can be taken as a positive sample; specifically, feature stitching can be performed on attribute data of the reference user, so that a positive sample is obtained. The number of positive samples is the same as the number of reference users; in other embodiments, if the number of reference users is large, the reference user set may be randomly sampled according to the actual requirement to obtain positive samples, where the number of positive samples is smaller than the number of reference users.

S12, sampling the reference user set and the candidate user set according to the crowd-oriented target of the target advertisement to obtain a plurality of negative samples.

The candidate user set may further include behavior data of each candidate user, which indicates whether the candidate user has advertisement exposure behavior and advertisement click behavior with respect to the target advertisement. In a practical application scenario, the negative samples may include only attribute data of reference users sampled from the reference user set, or only attribute data of candidate users sampled from the candidate user set. Specifically, if the crowd-oriented target concerns only the exposure rate, then to avoid a spatial offset between the training space and the prediction space, the negative samples may be sampled from the attribute data of candidate users without advertisement exposure behavior in the candidate user set; that is, if the crowd-oriented target is crowd orientation based on the exposure rate, the attribute data of candidate users without advertisement exposure behavior in the candidate user set may be randomly sampled to obtain negative samples. If the crowd-oriented target concerns only the click rate, the negative samples can be sampled directly from the attribute data of reference users who have advertisement exposure behavior but no advertisement click behavior; that is, if the crowd-oriented target is crowd orientation based on the click rate, the attribute data of reference users in the reference user set who have advertisement exposure behavior but no advertisement click behavior may be randomly sampled to obtain negative samples.

When the crowd-oriented target concerns only the click rate, sampling the negative samples directly from the reference user set produces a spatial offset between the training space and the prediction space, and the click rate of the target advertisement ends up low. The main reason is that the prediction space consists of candidate users, and candidate users are not guaranteed to be exposed to the advertisement. To solve this problem, in the crowd orientation process the embodiment of the invention considers whether a candidate user has advertisement exposure behavior on the target advertisement before considering whether the candidate user has advertisement click behavior. On this basis, for the case where the crowd-oriented target is crowd orientation based on the click rate, the embodiment of the invention further proposes a crowd-oriented target that performs crowd orientation by combining the click rate and the exposure rate. Under this crowd-oriented target, click estimation for a candidate user first estimates the exposure probability that the candidate user has advertisement exposure behavior on the target advertisement, and then estimates the click probability that the candidate user has advertisement click behavior after exposure; that is, the targeting probability obtained by performing advertisement targeting estimation (here, click estimation) on the candidate user = exposure probability × click probability.

To further show that, when crowd orientation combines the exposure rate and the click rate, the targeting probability of a candidate user equals the product of the exposure probability and the click probability, the embodiment of the invention derives the relation mathematically. Let x denote a candidate user, let y indicate whether advertisement exposure behavior exists (y=1 denotes exposure), and let z indicate whether advertisement click behavior exists (z=1 denotes click). The exposure probability and click probability of candidate user x are then as follows:

the exposure probability of candidate user x is: $P(y=1 \mid x)$

the click probability of candidate user x after exposure is: $P(z=1 \mid y=1, x)$

calculating the product of the exposure probability and the click probability: $P(y=1 \mid x) \cdot P(z=1 \mid y=1, x) = P(y=1, z=1 \mid x)$. It follows that, for candidate user x, the targeting probability that advertisement click behavior eventually occurs is the product of the exposure probability and the click probability.

Therefore, when the crowd-oriented target is crowd orientation based on the click rate, the embodiment of the invention can convert it into crowd orientation that combines the exposure rate and the click rate; based on this crowd-oriented target, a method of training and optimizing the estimation model jointly on the reach rate (exposure rate) and the click rate is also provided. Specifically, attribute data of reference users with both advertisement exposure behavior and advertisement click behavior are used as positive samples, the candidate user set is randomly sampled to obtain negative samples, and the estimation model is then trained and optimized with these positive and negative samples. With the optimized estimation model obtained from this training set, the expected value of the targeting probability obtained by performing advertisement targeting estimation on the candidate users is as follows:

Since the sampled candidate users are randomly sampled from the candidate user set, all candidate users can be represented to some extent. Thus, the expected value may be further expressed as:

Here, the number of exposure click users refers to the number of reference users with both advertisement exposure behavior and advertisement click behavior, and the number of sampled candidate users refers to the number of candidate users obtained by randomly sampling the candidate user set. Therefore, by randomly sampling the candidate user set to obtain the negative samples, the probability that any candidate user finally performs an advertisement click on the target advertisement can be estimated. In this training mode (joint training on the reach rate (exposure rate) and the click rate), the training space and the prediction space of the estimation model are consistent, which effectively addresses the problems of low exposure rate and low click rate. The targeted crowd data determined by the optimized estimation model obtained in this training mode are accurate, so the overall exposure rate and click rate of the target advertisement are effectively improved; in particular, the overall exposure rate is improved by 60% in subsequent delivery of the target advertisement.

Based on the above description, when the crowd-targeting is crowd-targeting by combining the click rate and the exposure rate, the negative sample may include only attribute data of candidate users sampled from the candidate user set, or may be a mixture of attribute data of reference users having advertisement exposure behavior and no advertisement click behavior and attribute data of candidate users sampled from the candidate user set, and the ratio of which part of the mixture is larger may depend on the relative importance of the click rate and the exposure rate. In the specific implementation, if the click rate and the exposure rate are important equally or the crowd oriented target is more biased towards the exposure rate, a negative sample can be directly obtained by sampling from the candidate user set; that is, if the crowd-oriented target is the crowd-oriented with the joint exposure rate and click rate and the weight of the exposure rate is equal to or greater than the weight of the click rate, the candidate user set may be randomly sampled to obtain a negative sample. If the crowd-oriented targets are more biased towards click rate, adding attribute data of some reference users with advertisement exposure behaviors and without advertisement click behaviors on the basis of randomly sampling the candidate user set; that is, if the crowd-oriented target is the crowd-oriented with joint exposure rate and click rate and the weight of the exposure rate is smaller than the weight of the click rate, the attribute data of the candidate users obtained by randomly sampling the candidate user set is taken as a negative sample, and the attribute data of the reference users having advertisement exposure behavior and no advertisement click behavior in the reference user set is taken as a negative sample.
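The negative-sample rules for the different crowd-oriented targets can be summarized in one function. The following is a sketch only: the objective names, the `exposed` and `clicked` flags, and in particular the mixing rule for the click-biased joint case (all exposed-but-unclicked reference users are appended to the sampled candidates) are assumptions, since the embodiment leaves the exact proportion open.

```python
import random
from typing import Dict, List


def sample_negatives(
    objective: str,
    reference_users: List[Dict],
    candidate_users: List[Dict],
    n_samples: int,
    exposure_weight: float = 1.0,
    click_weight: float = 1.0,
    seed: int = 0,
) -> List[Dict]:
    """Pick negative samples according to the crowd-oriented target."""
    rng = random.Random(seed)

    def draw(pool: List[Dict]) -> List[Dict]:
        return rng.sample(pool, min(n_samples, len(pool)))

    exposed_unclicked = [
        u for u in reference_users if u["exposed"] and not u["clicked"]
    ]

    if objective == "exposure":
        # Only candidates that were never exposed to the advertisement.
        return draw([u for u in candidate_users if not u["exposed"]])
    if objective == "click":
        return draw(exposed_unclicked)
    if objective == "joint":
        sampled = draw(candidate_users)
        if exposure_weight >= click_weight:
            return sampled
        # Click-biased: mix in exposed-but-unclicked reference users.
        # Assumption: all of them are added; the proportion is not specified.
        return sampled + exposed_unclicked
    raise ValueError(f"unknown objective: {objective}")


refs = [
    {"user_id": "r1", "exposed": True, "clicked": True},
    {"user_id": "r2", "exposed": True, "clicked": False},
    {"user_id": "r3", "exposed": True, "clicked": False},
]
cands = [
    {"user_id": f"c{i}", "exposed": i % 4 == 0, "clicked": False}
    for i in range(20)
]
exposure_negs = sample_negatives("exposure", refs, cands, 5)
click_negs = sample_negatives("click", refs, cands, 5)
joint_negs = sample_negatives(
    "joint", refs, cands, 5, exposure_weight=0.3, click_weight=0.7
)
```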

It should be noted that, in one embodiment, the number of candidate users may be far greater than the number of reference users (the number of positive samples). To balance the proportion of positive and negative samples so that the subsequent training optimization works better, when the reference user set and the candidate user set are sampled according to the crowd-oriented target of the target advertisement to obtain a plurality of negative samples, the sampling may also take the number of positive samples into account, so that the ratio of the number of negative samples to the number of positive samples meets a preset ratio. The preset ratio may be set according to actual service requirements or empirical values, for example, 1 or 1.5.

In still another embodiment, some candidate users may already have both advertisement exposure behavior and advertisement click behavior on the target advertisement. If the attribute data of such candidate users were sampled as negative samples, the estimation model would wrongly learn their data characteristics as negative during training optimization, which would impair the training optimization effect. To avoid this, when the reference user set and the candidate user set are sampled according to the crowd-oriented target of the target advertisement to obtain a plurality of negative samples, the attribute data of candidate users with advertisement exposure behavior and advertisement click behavior on the target advertisement may first be removed from the candidate user set to obtain a residual user set; the residual user set and the reference user set are then sampled according to the crowd-oriented target of the target advertisement, or according to both the number of positive samples and the crowd-oriented target, to obtain a plurality of negative samples.
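The two refinements above (a preset negative-to-positive ratio and removal of candidates that already clicked) can be combined as in the sketch below. It covers only the case in which negatives are drawn at random from the candidate user set; the function name, the flag names, and the rounding of the target count are assumptions.

```python
import random
from typing import Dict, List


def sample_balanced_negatives(
    n_positives: int,
    candidate_users: List[Dict],
    preset_ratio: float = 1.0,
    seed: int = 0,
) -> List[Dict]:
    """Sample negatives so that negatives : positives is about preset_ratio."""
    # Residual user set: drop candidates that were exposed AND clicked,
    # since they behave like positives.
    residual = [
        u for u in candidate_users if not (u["exposed"] and u["clicked"])
    ]
    target = int(round(n_positives * preset_ratio))
    rng = random.Random(seed)
    return rng.sample(residual, min(target, len(residual)))


candidate_pool = [
    {"user_id": i, "exposed": i < 5, "clicked": i < 5} for i in range(100)
]
balanced_negs = sample_balanced_negatives(10, candidate_pool, preset_ratio=1.5)
```

With 10 positive samples and a preset ratio of 1.5, 15 negatives are drawn, none of which come from the five candidates that already clicked.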

As can be seen from the relevant contents recorded in steps S401-S402, the embodiment of the invention can select different positive and negative samples according to different crowd-oriented targets; under different crowd orientation targets, the selection modes of the positive and negative samples can be shown in the following table 1:

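TABLE 1

Crowd-oriented target | Positive samples | Negative samples
Exposure rate | Exposure data | Unexposed data
Click rate | Click data | Exposed and unclicked data
Joint exposure rate and click rate, exposure rate weight larger than click rate weight | Sampling exposure data and click data | Sampling data
Joint exposure rate and click rate, exposure rate weight equal to click rate weight | Click data | Sampling data
Joint exposure rate and click rate, exposure rate weight smaller than click rate weight | Click data | Sampling data and exposed and unclicked data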

The exposure data refers to attribute data of historical users with advertisement exposure behavior in the historical delivery flow table; the click data refers to attribute data of historical users with advertisement click behavior in the historical delivery flow table; the sampling exposure data refers to attribute data obtained by sampling the attribute data of historical users with advertisement exposure behavior in the historical delivery flow table; the unexposed data refers to attribute data of candidate users without advertisement exposure behavior in the candidate user set; the exposed and unclicked data refers to attribute data of historical users in the historical delivery flow table who have advertisement exposure behavior but no advertisement click behavior; and the sampling data refers to attribute data obtained by sampling the candidate user set.

S13, training and optimizing the estimation model by adopting a plurality of positive samples and a plurality of negative samples to obtain an optimized estimation model.

In a specific implementation, after the plurality of positive samples and the plurality of negative samples are obtained, the estimation model can be trained and optimized directly with them based on a model training algorithm to obtain the optimized estimation model. The model training algorithm may include, but is not limited to: the xgboost algorithm (extreme gradient boosting), the GBDT algorithm (Gradient Boosting Decision Tree), and so on. The xgboost algorithm is an ensemble machine learning algorithm based on decision trees that uses a gradient boosting framework and consists of a plurality of decision trees. Each decision tree is a classification and regression tree (CART); a CART is a binary tree in which each internal node tests a feature with the outcomes yes and no, where the yes branch of a node can be taken as its left branch and the no branch as its right branch. The basic idea of the xgboost algorithm is to build decision trees one by one according to the features of the samples; each newly built tree improves the overall effect of the model (for example, reduces the value of the loss function) by fitting the residual left by the previously built trees. The GBDT algorithm is an iterative decision tree algorithm that also consists of a plurality of decision trees, and its final output is the accumulation of the conclusions of all the decision trees.
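The residual-fitting idea described above can be illustrated with a toy example. The sketch below is not xgboost or a production GBDT: it is a plain squared-loss gradient boosting loop over one-split trees (stumps), written in pure Python, and all function names, the learning rate, and the sample data are illustrative assumptions. Its output is a regression score rather than a calibrated probability.

```python
from typing import Dict, List, Tuple

# (feature index, threshold, left value, right value)
Stump = Tuple[int, float, float, float]


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Fit a one-split regression tree to the residuals by least squares."""
    mean = sum(residuals) / len(residuals)
    best: Stump = (0, float("inf"), mean, mean)  # fallback: no split
    best_err = sum((r - mean) ** 2 for r in residuals)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left)
            err += sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best, best_err = (j, t, lv, rv), err
    return best


def stump_predict(stump: Stump, row: List[float]) -> float:
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosting(X, y, n_trees: int = 10, learning_rate: float = 0.5) -> Dict:
    """Each new stump fits the residual left by the trees built so far."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [
            p + learning_rate * stump_predict(stump, row)
            for p, row in zip(preds, X)
        ]
    return {"base": base, "lr": learning_rate, "stumps": stumps}


def predict(model: Dict, row: List[float]) -> float:
    total = sum(stump_predict(s, row) for s in model["stumps"])
    return model["base"] + model["lr"] * total


# Label 1 = positive sample, 0 = negative sample; one toy feature.
X_train = [[0.1], [0.2], [0.8], [0.9]]
y_train = [0.0, 0.0, 1.0, 1.0]
model = fit_boosting(X_train, y_train)
score_low = predict(model, [0.15])
score_high = predict(model, [0.85])
```

After ten rounds the score for a row resembling the positive samples is close to 1 and the score for a row resembling the negative samples is close to 0, because each round halves the remaining residual on this toy data.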

In one embodiment, some of the negative samples may be similar to the positive samples, and such negative samples are not true negative samples. If the estimation model treated them as true negative samples during training optimization, it would learn incorrectly and fail to distinguish positive samples from negative samples accurately, which would impair the training optimization effect. To avoid this, in the specific implementation of training and optimizing the estimation model with the plurality of positive samples and the plurality of negative samples, the embodiment of the invention may first perform sample cleaning on the plurality of negative samples according to the plurality of positive samples, where sample cleaning means removing, from the plurality of negative samples, the negative samples that are similar to the positive samples; the estimation model is then trained and optimized with the plurality of positive samples and the cleaned negative samples to obtain the optimized estimation model.

One embodiment of performing the sample cleaning on the plurality of negative samples according to the plurality of positive samples may be as follows. First, the plurality of positive samples are arbitrarily divided into a first positive sample set and a second positive sample set, and a first label (with value 1) is added to the first positive sample set; a negative sample set is constructed from the positive samples in the second positive sample set together with the plurality of negative samples, and a second label (with value -1) is added to the negative sample set. Second, a classification model is trained with the first positive sample set and the first label and with the negative sample set and the second label; the trained classification model then performs sample category prediction on each negative sample to obtain the prediction probability that the negative sample is a positive sample. The larger the prediction probability, the greater the similarity between the negative sample and the positive samples; the smaller the prediction probability, the smaller the similarity. Therefore, after the prediction probability of each negative sample is obtained, the negative samples whose prediction probability is smaller than a probability threshold are kept as the cleaned negative samples.

Alternatively, another embodiment of performing the sample cleaning on the plurality of negative samples according to the plurality of positive samples may be as follows. First, a positive sample set is constructed from the plurality of positive samples, and a first label (with value 1) is added to the positive sample set; a negative sample set is constructed from the plurality of negative samples, and a second label (with value -1) is added to the negative sample set. Second, a classification model is trained with the positive sample set and the first label and with the negative sample set and the second label; the trained classification model then performs sample category prediction on each negative sample to obtain the prediction probability that the negative sample is a positive sample. Then, the negative samples whose prediction probability is smaller than the probability threshold are kept as the cleaned negative samples.

Alternatively, yet another embodiment of performing the sample cleaning process on the plurality of negative samples according to the plurality of positive samples may be: using the 1-DNF algorithm (a sample cleaning algorithm) to perform sample cleaning processing on the plurality of negative samples according to the plurality of positive samples.

S403, invoking an optimized estimation model to perform advertisement targeting estimation on each candidate user according to the attribute data of each candidate user in the candidate user set, and obtaining the targeting probability of each candidate user.

After the optimized estimation model is obtained, the optimized estimation model can be directly called to carry out advertisement directional estimation on each candidate user according to the attribute data of each candidate user in the candidate user set, so as to obtain the directional probability of each candidate user.

Note that, for the above steps S402 to S403: in other embodiments, when the crowd orientation target is crowd orientation combining the exposure rate and the click rate, the estimation model may comprise two models, namely an exposure estimation model and a click estimation model. In this case, in step S402, the exposure estimation model and the click estimation model may each be trained and optimized using the reference user set and the candidate user set, to obtain an optimized exposure estimation model and an optimized click estimation model. In a specific implementation, for the exposure estimation model: attribute data of historical users with advertisement exposure behavior may be selected from the historical delivery flow table as exposure positive samples, and attribute data of candidate users without advertisement exposure behavior in the candidate user set may be randomly sampled to obtain exposure negative samples; the exposure estimation model is then trained and optimized using the exposure positive samples and the exposure negative samples to obtain the optimized exposure estimation model. For the click estimation model: attribute data of historical users with advertisement click behavior may be selected from the historical delivery flow table as click positive samples, and attribute data of reference users who have advertisement exposure behavior but no advertisement click behavior may be randomly sampled to obtain click negative samples; the click estimation model is then trained and optimized using the click positive samples and the click negative samples to obtain the optimized click estimation model.

Correspondingly, in step S403, for any candidate user, the optimized exposure estimation model may be invoked to perform exposure estimation on the candidate user according to the attribute data of the candidate user, so as to obtain the exposure probability of the candidate user; the optimized click estimation model may be invoked to perform click estimation on the candidate user according to the attribute data of the candidate user, so as to obtain the click probability of the candidate user; the product of the exposure probability and the click probability of the candidate user is then calculated to obtain the orientation probability of the candidate user. These steps are repeated for every candidate user to obtain the orientation probability of each candidate user.
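The two-model scoring in step S403 can be sketched as below. The two models are represented as plain callables returning a probability; this interface, and the names `targeting_probability` and `score_candidates`, are assumptions made for illustration and are not defined by the specification.

```python
def targeting_probability(exposure_prob, click_prob):
    """Orientation probability as the product of the two estimated probabilities."""
    for name, p in (("exposure_prob", exposure_prob), ("click_prob", click_prob)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    return exposure_prob * click_prob


def score_candidates(candidates, exposure_model, click_model):
    """Score every candidate user.

    candidates: dict mapping a user identifier to that user's attribute data.
    exposure_model, click_model: hypothetical callables that map attribute
    data to a probability in [0, 1].
    """
    return {
        user_id: targeting_probability(exposure_model(attrs), click_model(attrs))
        for user_id, attrs in candidates.items()
    }
```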

S404, screening out attribute data of the targeted users from the candidate user sets according to the targeting probability of each candidate user, and adding the attribute data of the targeted users into the targeting crowd data of the targeted advertisements.

S405, storing the targeted crowd data of the targeted advertisement and outputting attribute information of the targeted crowd data.

After obtaining the targeted crowd data of the targeted advertisement, the server may store the targeted crowd data so that subsequent advertising may be performed based on the targeted crowd data. In one embodiment, the server may store targeted demographic data of the targeted advertisement directly into the database; in still another embodiment, the server may also compress the targeted crowd data of the targeted advertisement, and store the compressed targeted crowd data in the database, so as to save a storage space of the database.
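The compressed-storage variant can be illustrated with a short sketch. The specification does not state a serialization format or compression algorithm; JSON and zlib from the Python standard library are assumed here purely as an example, and the function names are hypothetical.

```python
import json
import zlib


def pack_crowd_data(crowd):
    """Serialize the targeted crowd data to JSON and compress it with zlib."""
    raw = json.dumps(crowd, sort_keys=True).encode("utf-8")
    return zlib.compress(raw)


def unpack_crowd_data(blob):
    """Inverse of pack_crowd_data: decompress and parse the stored bytes."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

The compressed bytes would then be written to the database in place of the raw records; attribute data with many repeated field names usually compresses well under this kind of scheme.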

In addition, after obtaining the targeted crowd data of the targeted advertisement, the server can also obtain the attribute information of the targeted crowd data and output the attribute information. The attribute information at least includes a data name and a data quantity. The data name identifies the directional crowd data obtained this time; it can be set randomly by the server or preset by the advertiser, for example "data 1" or "data a". The data quantity is the number of directional users corresponding to the directional crowd data; for example, if the directional crowd data includes attribute data of 5 directional users, the number of directional users corresponding to the directional crowd data is 5, that is, the data quantity is 5; as another example, if the directional crowd data includes attribute data of 100 directional users, the number of directional users corresponding to the directional crowd data is 100, that is, the data quantity is 100.

S406, if a trigger event for advertisement delivery based on the directional crowd data is detected, delivering a target advertisement in the directional crowd corresponding to the directional crowd data.

In one embodiment, the advertiser may choose when to deliver advertisements according to the advertiser's own needs; in this case, the trigger event for advertisement delivery may include: an event of detecting that the advertiser has selected an advertisement delivery operation based on the targeted crowd data. In yet another embodiment, the server may also deliver advertisements automatically; in this case, the trigger event for advertisement delivery may include: an event of detecting that the advertisement delivery condition of the target advertisement is met. The advertisement delivery condition here may include, but is not limited to: the condition that the directional crowd data has been acquired, or the condition that a preset duration has elapsed since the directional crowd data was acquired.

After the server detects the trigger event, the server can respond to the trigger event to put the target advertisement in the targeted crowd corresponding to the targeted crowd data. Specifically, the user identification of each directional user in the directional crowd data can be obtained, and the target advertisement is issued to the user account associated with the user identification of each directional user, so that the target advertisement is put in the directional crowd.

Based on the above description, the embodiment of the invention also provides an application scenario of advertisement delivery based on the crowd orientation method; in this application scenario, training and optimizing the estimation model by joint training on the exposure rate and the click rate is taken as an example. When an advertiser wants to perform crowd orientation for advertisement A, the advertiser can log into the advertising system, as shown in FIG. 5 a. After successfully logging into the advertising system, the advertiser may click a create button in the user interface to enter the setting interface for orientation conditions, as shown in FIG. 5 b. The setting interface may include a plurality of orientation conditions: seed crowd, number of oriented crowd, whether the seed crowd is included, expansion tendency (the crowd orientation target), advertising media platform, number type, and so on. The seed crowd may correspond to at least three options: uploaded by the advertiser, uploaded by a third-party client, and automatically determined by the server. The expansion tendency is used to determine the crowd orientation target: if the expansion tendency is advertisement exposure, the crowd orientation target is crowd orientation based on the exposure rate; if the expansion tendency is advertisement click, the crowd orientation target is crowd orientation based on the click rate; and so on.

The advertiser may set these multiple targeting conditions in conjunction with its own advertising needs. For example, the targeting conditions set by the advertiser are: the server automatically determines seed crowds, the number of the oriented crowds is 500, the seed crowds are not included, the expansion tendency is set to be advertisement exposure clicking through a user-defined mode, the putting platform is not limited, and the number type is QQ number. After setting these targeting conditions, the advertiser can click the submit button, and the front end of the advertising system can generate a crowd targeting request according to the setting information of the advertiser and send the crowd targeting request to the server. The crowd-oriented request here carries at least the following information: the number of targeted crowd (500 people), the crowd-targeting goal of targeted advertising (crowd-targeting in combination with exposure and click-through rates and weight of exposure equal to weight of click-through rate), platform identification of the advertising system, and so forth. Correspondingly, the server can receive the crowd-oriented request and respond to the crowd-oriented request to perform subsequent crowd-oriented processing to obtain the oriented crowd data of the target advertisement; the specific implementation process of the crowd-oriented treatment can be seen in fig. 5 c:

The server can acquire the crowd orientation target of the target advertisement; specifically, the crowd orientation target can be obtained by parsing the crowd orientation request. After the crowd orientation target is obtained, positive samples can be obtained from the historical delivery flow table according to the crowd orientation target: because the crowd orientation target is crowd orientation combining the exposure rate and the click rate, with the weight of the exposure rate equal to the weight of the click rate, the attribute data of the historical users with advertisement click behavior in the historical delivery flow table can be used as the attribute data of the reference users, and feature splicing can be performed on the attribute data of the reference users to obtain the positive samples. The number of negative samples may then be determined from the number of positive samples, so as to balance the numbers of positive and negative samples. Because the crowd orientation request carries the platform identification of the advertising system, the candidate user set may be constructed using attribute data of recently active advertising users in the advertising system. After the number of negative samples is determined, that number of negative samples may be sampled from the candidate user set according to the crowd orientation request. Specifically, because the crowd orientation request specifies crowd orientation combining the exposure rate and the click rate, with the weight of the exposure rate equal to the weight of the click rate, the candidate user set can be randomly sampled to obtain the negative samples.

Since the negative samples may contain samples that are similar to the positive samples, sample cleaning may also be performed on the negative samples to remove those that are similar to the positive samples. After sample cleaning, the cleaned negative samples and the positive samples can be used for a series of processes: model optimization training, advertisement orientation estimation, and generation of the oriented crowd data. Specifically, the cleaned negative samples and the positive samples can first be used to train and optimize the estimation model, so as to obtain an optimized estimation model; next, the optimized estimation model is invoked to perform advertisement orientation estimation on each candidate user according to the attribute data of each candidate user, so as to obtain the orientation probability of each candidate user. Then, the attribute data of the candidate users in the candidate user set can be sorted in descending order of orientation probability to obtain a sorted set; attribute data of the corresponding candidate users is selected in turn from the sorted set according to the number of the oriented crowd (namely 500 people) as attribute data of the oriented users, and the selected attribute data of each oriented user is added to the oriented crowd data. After obtaining the oriented crowd data, the server may store the oriented crowd data and output attribute information of the oriented crowd data, as shown in fig. 5 d.
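The sort-and-select step described above can be sketched as follows. The mapping from user identifier to orientation probability and the function name `select_targeted_users` are illustrative assumptions; ties are left in the order produced by Python's stable sort.

```python
def select_targeted_users(probabilities, crowd_size):
    """Return the identifiers of the crowd_size highest-scoring candidates.

    probabilities: dict mapping a candidate user identifier to the
    orientation probability estimated for that user.
    """
    if crowd_size < 0:
        raise ValueError("crowd_size must be non-negative")
    # Sort candidates from the largest orientation probability to the smallest.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    # Take the first crowd_size entries (fewer if there are not enough candidates).
    return [user_id for user_id, _ in ranked[:crowd_size]]
```

In the scenario above, `crowd_size` would be 500, the number of oriented crowd carried in the request.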

After acquiring the directional crowd data of this round, if the advertiser wants to deliver the advertisement directly, the advertiser can click the "put" button to trigger the server to deliver the target advertisement directly based on the directional crowd data. If the advertiser does not want to deliver the advertisement at this time, the advertiser can exit the advertising system by clicking the exit button, or return to the setting interface of the orientation conditions by clicking the return button and set different orientation conditions. After resetting the orientation conditions, the advertiser can click the submit button to trigger the front end to send a crowd orientation request to the server again; correspondingly, the server can perform crowd orientation processing again according to the crowd orientation request sent by the front end to obtain another piece of directional crowd data, store it, and output its attribute information (for example, a data name of "oriented crowd 2" and a data quantity of 100 people), and so on.

When the advertiser wants to deliver the target advertisement, at least one piece of targeted crowd data can be selected for advertisement delivery, as shown in fig. 5 e; if the front end detects the operation of clicking the 'put' button by the advertiser, the front end can send a trigger event of advertisement putting to the server. Correspondingly, after the server detects the trigger event, advertisement delivery can be carried out according to the targeted crowd data selected by the advertiser; specifically, the server may obtain user identifiers of each targeted user in the targeted crowd data selected by the advertiser, and send the targeted advertisement to the user account associated with the user identifier of each targeted user, so as to implement the targeted advertisement being put in the targeted crowd. Alternatively, after the server successfully delivers the targeted advertisement, feedback information may be output, as shown in fig. 5 e.

Therefore, in the advertisement delivery process, the embodiment of the invention can use the positive samples and the negative samples to train and optimize the estimation model, and then invoke the optimized estimation model to perform advertisement orientation estimation on each candidate user according to the attribute data of each candidate user, so as to obtain the orientation probability of each candidate user. Because the negative samples are sampled from the candidate users, an offset between the training space and the prediction space can be avoided and the two spaces of the estimation model are kept consistent, which improves the accuracy of the orientation probability of each candidate user. Because the orientation probability refers to the probability that a candidate user generates forward feedback on the target advertisement, the attribute data of the oriented users can be screened from the candidate user set according to the orientation probability of each candidate user and added to the oriented crowd data of the target advertisement; improving the accuracy of the orientation probability improves the accuracy of the oriented crowd data, which in turn can improve the exposure rate and the click rate of the target advertisement.

Based on the above description of the embodiment of the crowd direction method, the embodiment of the invention also discloses a crowd direction device, which may be a computer program (including program code) running in a server. The crowd direction device may perform the method shown in fig. 2 or fig. 4. Referring to fig. 6, the crowd direction device may run the following units:

An acquisition unit 101 for acquiring a reference user set and a candidate user set of a target advertisement; the reference user set comprises attribute data of a plurality of reference users, wherein the reference users refer to users capable of generating forward feedback on the target advertisement; the candidate user set comprises attribute data of a plurality of candidate users, wherein the candidate users are users to be oriented;

the optimizing unit 102 is configured to perform training optimization on the estimated model by using the reference user set and the candidate user set, so as to obtain an optimized estimated model;

the processing unit 103 is configured to invoke the optimized estimation model to perform advertisement targeting estimation on each candidate user according to attribute data of each candidate user in the candidate user set, so as to obtain targeting probability of each candidate user, where the targeting probability refers to probability that the candidate user generates forward feedback on the target advertisement;

the processing unit 103 is further configured to screen attribute data of a targeted user from the candidate user set according to the targeting probability of each candidate user, and add the attribute data of the targeted user to the targeting crowd data of the targeted advertisement.

In one embodiment, the obtaining unit 101 is specifically configured to, when used to obtain the reference user set of the target advertisement:

receiving a crowd-oriented request of a target advertisement, wherein the crowd-oriented request carries a seed user list, the seed user list comprises attribute data of a plurality of seed users, the seed user list is uploaded by an advertiser or by a third-party client, and the third-party client is a client capable of performing service processing on the advertisement object corresponding to the target advertisement;

and adding the attribute data of each seed user in the seed user list to the reference user set of the target advertisement as attribute data of reference users.

In still another embodiment, the obtaining unit 101 is specifically configured to, when used for obtaining the reference user set of the target advertisement:

receiving a crowd-oriented request of a target advertisement, wherein the crowd-oriented request carries a crowd-oriented target of the target advertisement;

acquiring a historical delivery flow table of the target advertisement, wherein the historical delivery flow table at least comprises attribute data and behavior data of historical users, and the behavior data of a historical user is used for indicating whether the historical user has advertisement exposure behavior and advertisement click behavior with respect to the target advertisement;

and acquiring attribute data of a reference user from the historical delivery flow table according to the crowd-oriented target, and adding the attribute data of the reference user into the reference user set of the target advertisement.

In still another embodiment, the obtaining unit 101 is specifically configured to, when configured to obtain attribute data of a reference user from the historical delivery flow table according to the crowd-targeting objective:

if the crowd orientation target is crowd orientation based on the exposure rate, taking the attribute data of the historical users with advertisement exposure behavior in the historical delivery flow table as the attribute data of the reference users;

if the crowd orientation target is crowd orientation based on the click rate, or is crowd orientation combining the exposure rate and the click rate with the weight of the exposure rate equal to or smaller than the weight of the click rate, taking the attribute data of the historical users with advertisement click behavior in the historical delivery flow table as the attribute data of the reference users;

if the crowd orientation target is crowd orientation combining the exposure rate and the click rate with the weight of the exposure rate larger than the weight of the click rate, taking the attribute data obtained by sampling the attribute data of the historical users with advertisement exposure behavior in the historical delivery flow table as attribute data of the reference users, and also taking the attribute data of the historical users with advertisement click behavior in the historical delivery flow table as attribute data of the reference users.
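The three branches above can be sketched as a single selection function. The row layout of the flow table (`attrs`, `exposed`, `clicked`), the mode strings, and the `exposure_sample_rate` parameter are hypothetical; the specification does not say how many exposed users are sampled in the third branch, nor whether overlapping users are de-duplicated.

```python
import random


def select_reference_users(flow_table, mode, exposure_weight=1.0,
                           click_weight=1.0, exposure_sample_rate=0.5, seed=0):
    """Pick reference-user attribute data from a historical delivery flow table.

    flow_table: list of dicts with keys "attrs", "exposed", "clicked"
    (an assumed layout). mode is "exposure", "click" or "joint".
    """
    exposed = [row["attrs"] for row in flow_table if row["exposed"]]
    clicked = [row["attrs"] for row in flow_table if row["clicked"]]

    if mode == "exposure":
        return exposed
    if mode == "click" or (mode == "joint" and exposure_weight <= click_weight):
        return clicked
    if mode == "joint":
        # Exposure weight is larger: sample exposed users and add all clicked users.
        rng = random.Random(seed)
        k = int(len(exposed) * exposure_sample_rate)
        return rng.sample(exposed, k) + clicked
    raise ValueError(f"unknown crowd orientation mode: {mode!r}")
```

In the third branch a clicked user may also appear among the sampled exposed users; whether such duplicates should be removed is left open here because the text does not address it.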

In another embodiment, the optimizing unit 102 is configured to perform training optimization on the prediction model by using the reference user set and the candidate user set to obtain an optimized prediction model, and specifically is configured to:

taking attribute data of each reference user in the reference user set as positive samples, wherein the number of the positive samples is the same as the number of the reference users;

sampling the reference user set and the candidate user set according to the crowd-oriented targets of the target advertisements to obtain a plurality of negative samples;

and training and optimizing the pre-estimated model by adopting a plurality of positive samples and a plurality of negative samples to obtain an optimized pre-estimated model.

In yet another embodiment, the candidate user set further includes behavior data of each candidate user, where the behavior data of the candidate user is used to indicate whether the candidate user has an advertisement exposure behavior and an advertisement click behavior with respect to the target advertisement; correspondingly, when the optimizing unit 102 is configured to sample the reference user set and the candidate user set according to the crowd-oriented target of the target advertisement to obtain a plurality of negative samples, the optimizing unit is specifically configured to:

if the crowd orientation target is crowd orientation based on the exposure rate, randomly sampling attribute data of candidate users without advertisement exposure behavior in the candidate user set to obtain a negative sample;

If the crowd orientation target is crowd orientation based on click rate, randomly sampling attribute data of reference users with advertisement exposure behaviors and without advertisement click behaviors in the reference user set to obtain a negative sample;

if the crowd orientation target is the crowd orientation of combining the exposure rate and the click rate and the weight of the exposure rate is equal to or greater than the weight of the click rate, randomly sampling the candidate user set to obtain a negative sample;

if the crowd-oriented target is the crowd-oriented by combining the exposure rate and the click rate and the weight of the exposure rate is smaller than the weight of the click rate, taking the attribute data of the candidate users obtained by randomly sampling the candidate user set as a negative sample, and taking the attribute data obtained by randomly sampling the attribute data of the reference users with advertisement exposure behaviors and without advertisement click behaviors in the reference user set as a negative sample.
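The four negative-sampling branches above can be sketched in one function. The record layout (`exposed`, `clicked` flags on each user), the mode strings, and the even split between the two pools in the last branch are assumptions for illustration; the specification does not fix the proportions.

```python
import random


def sample_negatives(candidates, references, mode, n,
                     exposure_weight=1.0, click_weight=1.0, seed=0):
    """Draw up to n negative samples according to the crowd orientation target.

    candidates, references: lists of dicts carrying boolean "exposed" and
    "clicked" flags (an assumed layout). mode is "exposure", "click" or "joint".
    """
    rng = random.Random(seed)

    def draw(pool, k):
        # Sample without replacement, capped at the pool size.
        return rng.sample(pool, min(k, len(pool)))

    unexposed = [u for u in candidates if not u["exposed"]]
    exposed_no_click = [u for u in references
                        if u["exposed"] and not u["clicked"]]

    if mode == "exposure":
        return draw(unexposed, n)
    if mode == "click":
        return draw(exposed_no_click, n)
    if mode == "joint":
        if exposure_weight >= click_weight:
            return draw(candidates, n)
        # Click weight dominates: mix both pools (even split is an assumption).
        half = n // 2
        return draw(candidates, n - half) + draw(exposed_no_click, half)
    raise ValueError(f"unknown crowd orientation mode: {mode!r}")
```

Setting `n` to the number of positive samples reproduces the balancing of positive and negative samples described in the application scenario.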

In another embodiment, when the optimizing unit 102 is configured to perform training optimization on the pre-estimated model by using a plurality of positive samples and the plurality of negative samples, the optimizing unit is specifically configured to:

performing sample cleaning processing on the negative samples according to the positive samples, wherein the sample cleaning processing refers to processing for removing negative samples similar to the positive samples in the negative samples;

And training and optimizing the pre-estimated model by adopting the positive samples and the negative samples after sample cleaning treatment to obtain an optimized pre-estimated model.

In yet another embodiment, the crowd-targeting request carries a number of targeted crowd; accordingly, when the processing unit 103 is configured to filter the attribute data of the targeted user from the candidate user set according to the targeted probability of each candidate user, the processing unit is specifically configured to:

sorting the attribute data of each candidate user in the candidate user set according to the order of the directional probability from large to small to obtain a sorting set;

and selecting attribute data of corresponding candidate users from the sorting set in turn according to the number of the oriented crowd as attribute data of the oriented users.

In yet another embodiment, the crowd-targeting request further carries indication information that targets the targeted advertisement to each reference user; correspondingly, when the processing unit 103 is configured to sequentially select attribute data of the corresponding candidate users from the sorted set according to the number of the oriented crowd as attribute data of the oriented user, the processing unit is specifically configured to:

obtaining the difference between the number of the oriented crowd and the number of the reference users to obtain screening number; selecting attribute data of corresponding candidate users from the sorting set in turn according to the screening quantity as attribute data of the directional user;

The processing unit 103 may further be configured to: and adding the attribute data of each reference user in the reference user set to the targeted crowd data of the targeted advertisement.
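When the reference users are themselves to be included in the oriented crowd, the screening number is the requested crowd size minus the number of reference users. A minimal sketch, assuming identifiers stand in for attribute data and that the candidate set does not already contain the reference users (the function name is hypothetical):

```python
def build_crowd_with_seeds(probabilities, reference_ids, crowd_size):
    """Combine all reference users with the top-scoring candidates.

    probabilities: dict mapping candidate identifier to orientation probability.
    reference_ids: identifiers of the reference (seed) users.
    crowd_size: requested number of oriented crowd, seeds included.
    """
    # Screening number = number of oriented crowd - number of reference users.
    screening = max(crowd_size - len(reference_ids), 0)
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    # Reference users are always added; candidates fill the remaining slots.
    return list(reference_ids) + ranked[:screening]
```

If the reference users already outnumber the requested crowd size, this sketch simply returns the reference users; the specification does not describe that case.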

In yet another embodiment, the processing unit 103 may be further configured to:

storing the targeted crowd data of the targeted advertisement, and outputting attribute information of the targeted crowd data, wherein the attribute information at least comprises: data name and data quantity;

and if a trigger event for advertising based on the directional crowd data is detected, the target advertisement is placed in the directional crowd corresponding to the directional crowd data.

According to one embodiment of the invention, the steps involved in the method of fig. 2 or 4 may be performed by the various units in the crowd direction device of fig. 6. For example, steps S201 to S202 shown in fig. 2 may be performed by the acquisition unit 101 and the optimization unit 102 shown in fig. 6, respectively, and steps S203 and S204 may be performed by the processing unit 103 shown in fig. 6; as another example, steps S401 to S402 shown in fig. 4 may be performed by the acquisition unit 101 and the optimization unit 102 shown in fig. 6, respectively, and steps S403 to S406 may be performed by the processing unit 103 shown in fig. 6.

According to another embodiment of the present invention, the units in the crowd direction device shown in fig. 6 may be combined, individually or all together, into one or several other units, or one or more of the units may be further split into multiple functionally smaller units; either arrangement achieves the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the invention, the crowd direction device may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented cooperatively by a plurality of units.

According to another embodiment of the present invention, the crowd direction device shown in fig. 6 may be constructed, and the crowd direction method of the embodiment of the present invention may be implemented, by running a computer program (including program code) capable of executing the steps involved in the methods shown in fig. 2 or fig. 4 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access memory (RAM), and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed by the above-described computing device via the computer-readable recording medium.

Based on the description of the method embodiment and the device embodiment, the embodiment of the invention also provides a server; the server may be a background server of the advertising system mentioned above. Referring to fig. 7, the server includes at least a processor 201, a communication interface 202, and a computer storage medium 203. The communication interface 202 may include a radio frequency transceiver through which a server may communicate data with other devices. The processor 201, communication interface 202, and computer storage medium 203 within the server may be connected by a bus or other means.

The computer storage medium 203 may be located in a memory of the server; the computer storage medium 203 is used for storing a computer program comprising program instructions, and the processor 201 is used for executing the program instructions stored in the computer storage medium 203. The processor 201 (or Central Processing Unit, CPU) is the computing core and control core of the server; it is adapted to implement one or more instructions, and in particular to load and execute one or more instructions to implement a corresponding method flow or a corresponding function. In one embodiment, the processor 201 according to the embodiment of the present invention may be configured to perform a series of crowd-targeting processes on the targeted advertisement, which may specifically include:

The embodiment of the invention also provides a computer storage medium (Memory), which is a memory device in the server and is used for storing programs and data. It is to be understood that the computer storage medium here may include a built-in storage medium in the server, and may also include an extended storage medium supported by the server. The computer storage medium provides storage space that stores the operating system of the server. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 201. The computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may also be at least one computer storage medium located remotely from the processor.

In one embodiment, one or more instructions stored in a computer storage medium may be loaded and executed by the processor 201 to implement the respective steps of the methods described above in connection with the crowd-oriented method embodiments; in particular implementations, one or more instructions in a computer storage medium are loaded by processor 201 and perform the steps of:

In one embodiment, when obtaining the reference user set of the target advertisement, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, when obtaining the reference user set of the target advertisement, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, when obtaining the attribute data of the reference users from the historical delivery flow table according to the crowd-targeting objective, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, when training and optimizing the predictive model using the reference user set and the candidate user set to obtain an optimized predictive model, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, the candidate user set further includes behavior data of each candidate user, where the behavior data of a candidate user indicates whether the candidate user has an advertisement exposure behavior and an advertisement click behavior with respect to the target advertisement; accordingly, when sampling the reference user set and the candidate user set according to the crowd-targeting objective of the target advertisement to obtain a plurality of negative samples, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, when training and optimizing the predictive model using a plurality of positive samples and the plurality of negative samples to obtain an optimized predictive model, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, the crowd-targeting request carries a targeted crowd quantity; accordingly, when screening the attribute data of the targeted users from the candidate user set according to the targeting probability of each candidate user, the one or more instructions may be loaded by the processor 201 to specifically execute:

In yet another embodiment, the crowd-targeting request further carries indication information indicating that the target advertisement is to be targeted to each reference user; accordingly, when sequentially selecting attribute data of the corresponding candidate users from the sorted set according to the targeted crowd quantity as the attribute data of the targeted users, the one or more instructions may be loaded by the processor 201 to specifically execute:

The one or more instructions may further be loaded by the processor 201 to specifically execute: adding the attribute data of each reference user in the reference user set to the targeted crowd data of the target advertisement.

In yet another embodiment, the one or more instructions may further be loaded by the processor 201 to specifically execute:

The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims

1. A method of crowd orientation comprising:

taking the attribute data of each reference user in the reference user set as a positive sample, wherein the number of positive samples is the same as the number of reference users; sampling the reference user set and the candidate user set according to the crowd-targeting objective of the target advertisement to obtain a plurality of negative samples;

performing sample cleaning on the plurality of negative samples according to the positive samples, wherein the sample cleaning comprises: arbitrarily dividing the positive samples into a first positive sample set and a second positive sample set, and adding a first label to the first positive sample set; constructing a negative sample set from the positive samples in the second positive sample set and the plurality of negative samples, and adding a second label to the negative sample set; training a classification model using the first positive sample set, the first label, the negative sample set and the second label; predicting with the trained classification model to obtain, for each negative sample, the prediction probability that the negative sample is a positive sample; and taking the negative samples whose prediction probability is smaller than a probability threshold as the negative samples after sample cleaning;

training and optimizing the estimation model using the positive samples and the negative samples after sample cleaning to obtain an optimized estimation model;
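The sample cleaning step recited above can be illustrated with a minimal Python sketch. The claim does not name a classification model, a split ratio or a threshold value, so everything concrete here is an assumption made for illustration: a small logistic regression trained by gradient descent stands in for the classification model, and the names `predict_proba`, `train_classifier`, `clean_negatives`, `first_fraction` and `seed` are hypothetical.

```python
import math
import random


def predict_proba(weights, bias, x):
    """Probability that feature vector x is a positive sample."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp so math.exp stays in range
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(features, labels, epochs=200, lr=0.5):
    """Fit a logistic regression by batch gradient descent.

    Assumption: the claim only says "classification model"; logistic
    regression is used here purely as a stand-in.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def clean_negatives(positives, negatives, threshold=0.5,
                    first_fraction=0.8, seed=0):
    """Return the negatives whose predicted positive-probability is low."""
    # Arbitrarily divide the positives into a first and a second set.
    rng = random.Random(seed)
    shuffled = list(positives)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * first_fraction)
    first_pos, second_pos = shuffled[:cut], shuffled[cut:]

    # Negative set = second positive set + sampled negatives (label 0);
    # the first positive set carries label 1.
    negative_set = second_pos + list(negatives)
    features = first_pos + negative_set
    labels = [1] * len(first_pos) + [0] * len(negative_set)

    weights, bias = train_classifier(features, labels)

    # Keep only negatives scored below the probability threshold.
    return [x for x in negatives
            if predict_proba(weights, bias, x) < threshold]
```

Calling `clean_negatives(positives, negatives)` with lists of numeric attribute vectors returns the retained negatives in their original order. Because the positives in the second set are deliberately given the second label, the scores of genuinely positive-looking samples are pulled down, so in practice the threshold and the split ratio would need to be chosen together.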

2. The method of claim 1, wherein the obtaining the reference user set of the target advertisement comprises:

3. The method of claim 1, wherein the obtaining the reference user set of the target advertisement comprises:

4. The method of claim 3, wherein the obtaining attribute data of the reference users from the historical delivery flow table according to the crowd-targeting objective comprises:

5. The method of claim 1, wherein the candidate user set further includes behavior data of each candidate user, the behavior data of a candidate user indicating whether the candidate user has an advertisement exposure behavior and an advertisement click behavior with respect to the target advertisement;

the sampling the reference user set and the candidate user set according to the crowd-targeting objective of the target advertisement to obtain a plurality of negative samples comprises:

6. The method of any one of claims 2-5, wherein the crowd-targeting request carries a targeted crowd quantity; the screening the attribute data of the targeted users from the candidate user set according to the targeting probability of each candidate user comprises:

7. The method of claim 6, wherein the crowd-targeting request further carries indication information indicating that the target advertisement is to be targeted to each reference user;

the sequentially selecting attribute data of the corresponding candidate users from the sorted set according to the targeted crowd quantity as the attribute data of the targeted users comprises: calculating the difference between the targeted crowd quantity and the number of reference users to obtain a screening quantity; and sequentially selecting attribute data of the corresponding candidate users from the sorted set according to the screening quantity as the attribute data of the targeted users;

the method further comprises: adding the attribute data of each reference user in the reference user set to the targeted crowd data of the target advertisement.
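The screening-quantity arithmetic of claims 6 and 7 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the function name `select_targeted_users` and its parameters are hypothetical, the sorted set is assumed to be ordered by descending targeting probability, the reference and candidate sets are assumed to be disjoint, and the screening quantity is floored at zero, a case the claims do not address.

```python
def select_targeted_users(reference_users, candidate_probs,
                          targeted_count, include_reference=True):
    """Pick targeted users by targeting probability.

    reference_users: list of reference user ids (assumed disjoint
        from the candidates).
    candidate_probs: dict mapping candidate user id -> targeting
        probability.
    targeted_count: targeted crowd quantity carried by the request.
    include_reference: stands in for the indication information that
        the advertisement is also targeted to each reference user.
    """
    # Sorted set: candidates in descending order of targeting
    # probability (the ordering direction is an assumption).
    ranked = sorted(candidate_probs, key=candidate_probs.get,
                    reverse=True)

    if not include_reference:
        # Claim 6 case: take the top candidates directly.
        return ranked[:targeted_count]

    # Claim 7 case: screening quantity = targeted crowd quantity
    # minus the number of reference users (floored at zero here).
    screening_count = max(targeted_count - len(reference_users), 0)

    # Reference users are added to the targeted crowd data alongside
    # the screened candidates.
    return list(reference_users) + ranked[:screening_count]
```

For instance, with two reference users and a targeted crowd quantity of four, the screening quantity is two, so only the two highest-scoring candidates are selected and the reference users fill the remaining places.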

8. The method of any one of claims 1-5, wherein the method further comprises:

9. A crowd orientation device, comprising:

an optimizing unit, used for taking the attribute data of each reference user in the reference user set as a positive sample, wherein the number of positive samples is the same as the number of reference users; sampling the reference user set and the candidate user set according to the crowd-targeting objective of the target advertisement to obtain a plurality of negative samples; performing sample cleaning on the plurality of negative samples according to the positive samples, wherein the sample cleaning comprises: arbitrarily dividing the positive samples into a first positive sample set and a second positive sample set, and adding a first label to the first positive sample set; constructing a negative sample set from the positive samples in the second positive sample set and the plurality of negative samples, and adding a second label to the negative sample set; training a classification model using the first positive sample set, the first label, the negative sample set and the second label; predicting with the trained classification model to obtain, for each negative sample, the prediction probability that the negative sample is a positive sample; and taking the negative samples whose prediction probability is smaller than a probability threshold as the negative samples after sample cleaning; and training and optimizing the estimation model using the positive samples and the negative samples after sample cleaning to obtain an optimized estimation model;

10. A server comprising a communication interface, further comprising:

a computer storage medium storing one or more instructions adapted to be loaded by the processor to perform the crowd orientation method of any one of claims 1-8.

11. A computer storage medium having stored thereon one or more instructions adapted to be loaded by a processor to perform the crowd orientation method of any one of claims 1-8.