WO2018149337A1 - Information delivery method, apparatus, and server - Google Patents

Information delivery method, apparatus, and server

Info

Publication number
WO2018149337A1
Authority
WO
WIPO (PCT)
Prior art keywords
round
sample subset
training set
individual
population
Prior art date
Application number
PCT/CN2018/075521
Other languages
English (en)
French (fr)
Inventor
肖映鹏
朱张斌
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018149337A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Definitions

  • The present invention relates to the field of computer technologies, and in particular to an information delivery method, apparatus, and server.
  • Existing information delivery methods diffuse a seed population into a larger delivery population and then deliver information to that delivery population.
  • The seed population refers to people who share the same needs and interests regarding a product or service in a specific application scenario.
  • The delivery population should share the needs and interests of the seed population, but is several times, dozens of times, or even hundreds of times larger than the seed population.
  • The accuracy of the delivery population diffused from the seed population determines the accuracy of information delivery. How to diffuse the population accurately is therefore a focus of current research and development.
  • The embodiments of the present invention provide an information delivery method and apparatus that improve the accuracy of the diffused delivery population and thereby the accuracy of information delivery.
  • the embodiment of the present invention provides the following technical solutions:
  • an embodiment of the present invention provides an information delivery method, where the method is applied to an information delivery server, including:
  • each individual of the k-th round diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
  • the k-th round delivery population is screened from the overall population using the k-th round diffusion model; the k-th round delivery population is used for the k-th round of information delivery.
  • an embodiment of the present invention further provides an information delivery apparatus including a memory and a processor, where the memory is used to store an instruction, and the processor is configured to execute the instruction to perform the following steps, including:
  • each individual of the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
  • the k-th round delivery population is screened from the overall population using the k-th round diffusion model; the k-th round delivery population is used for the k-th round of information delivery.
  • an embodiment of the present invention further provides a storage medium, where the storage medium is used to store program code, and the program code is used to execute the information delivery method provided by the first aspect.
  • an embodiment of the present invention further provides a computer program product comprising instructions, when executed on a computer, causing the computer to execute the information delivery method provided by the first aspect.
  • The k-th round diffusion training set is adjusted based on the feedback data of the previous round (k-1). Even if the quality of the initial seed population is poor, the feedback data and iterative training can be used to adjust the samples, so that the match between the delivery population and the information improves round by round, which improves the accuracy of the diffused population.
  • The feature vectors of all individuals in the k-th round diffusion training set are fed into model training, and the k-th round delivery population is selected with the trained model. The trained model can therefore accurately identify people who are similar to the positive sample subset, which also improves the accuracy of the diffused population.
  • FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing an example of a computer architecture of an information delivery platform or server according to an embodiment of the present invention.
  • FIGS. 3-5 are schematic flowcharts of an information delivery method according to an embodiment of the present invention.
  • FIG. 6 is a block diagram showing an exemplary configuration of an information placing apparatus according to an embodiment of the present invention.
  • the present invention provides an information delivery method and device.
  • the information delivery method and device can be applied to various application fields that need to spread population, for example, it can be applied to the crowd diffusion and advertisement placement field of the WeChat circle of friends.
  • FIG. 1 shows an application scenario of the information delivery device, which may include an information delivery platform 101 and a database 102.
  • the function of the information delivery platform 101 can be implemented by one or more information delivery servers.
  • The information delivery platform 101 is mainly responsible for diffusing the initial seed population into a delivery population and for delivering information to the clients of that delivery population.
  • the above information placing device may be applied to the information delivery server in the form of software, or as a component of the information delivery server in the form of hardware (for example, a controller/processor specifically serving as an information delivery server).
  • the information delivery device may be an application, such as a terminal application, or a component or plug-in of an application or an operating system.
  • the database 102 can be used to store user unique identifiers (IDs), basic information, and various attributes and attribute values of each user under the information delivery platform.
  • the basic information may include a mobile phone number, a mailbox, etc.
  • the attributes may include one or more of a region, a gender, an age, a height, and the like.
  • the attributes may further include: an interest tag (the interest tag is information for reflecting the user's interest), a number of purchases, and the like, which are not described herein.
  • the functionality of database 102 can be implemented by one or more database nodes.
  • the functions of the information delivery platform 101 and the database 102 can also be implemented by the same server.
  • the database 102 can also be used to provide an initial seed population.
  • the initial seed population can also be provided by information publishers such as advertisers.
  • database 102 may be further comprised of one or more servers.
  • the database 102 can include a base information server, a user portrait engine (a queryable interest tag), and the like.
  • Fig. 2 shows a general computer system structure of the above information delivery platform, server or device.
  • the above computer system may include a bus, a processor 1, a memory 2, a communication interface 3, an input device 4, and an output device 5.
  • The processor 1, the memory 2, the communication interface 3, the input device 4, and the output device 5 are connected to each other through the bus. Specifically:
  • The bus can include a path for transferring information between the various components of the computer system.
  • The processor 1 may be a general-purpose processor, such as a general-purpose central processing unit (CPU), a network processor (NP), or a microprocessor; an application-specific integrated circuit (ASIC) or one or more integrated circuits for controlling execution of the program of the present invention; or a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • the memory 2 stores a program for executing the technical solution of the present invention, and can also store an operating system and other key services.
  • the program can include program code, the program code including computer operating instructions.
  • The memory 2 may include read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM) or other types of dynamic storage devices that can store information and instructions, disk storage, flash memory, and so on.
  • Input device 4 may include means for receiving data and information input by a user, such as a keyboard, mouse, camera, scanner, light pen, voice input device, touch screen, pedometer or gravity sensor, and the like.
  • Output device 5 may include devices that allow output of information to the user, such as a display screen, printer, speaker, and the like.
  • Communication interface 3 may include devices that use any type of transceiver to communicate with other devices or communication networks, such as Ethernet, Radio Access Network (RAN), Wireless Local Area Network (WLAN), and the like.
  • the processor 1 executes the program stored in the memory 2, and calls other devices, and can be used to implement various steps in the information delivery method provided by the embodiment of the present invention.
  • FIG. 3 shows an exemplary flow of the above information delivery method.
  • the method shown in FIG. 3 is applied to the domain or application scenario mentioned in FIG. 1, and the processor 1 of the information delivery platform (or server) shown in FIG. 2 interacts with other devices.
  • the processor 1 of the information delivery platform (or server) generates a k-th diffusion training set based on the initial seed population and the feedback data of the k-1th round of the population.
  • k is an integer that starts at 0 and increments by 1 with each round.
  • the above-mentioned initial seed population refers to people who have the same needs and interests in products or services under specific application scenarios (such as WeChat friends circle).
  • the initial seed population can be provided by the advertiser.
  • the advertiser can upload an initial seed population package to provide the initial seed population.
  • the content of the initial seed population package may include at least one of a phone number, an account number, a mailbox, a user unique identification (ID), and the like.
  • The initial seed population can also be obtained from a database, or from a trading platform.
  • each individual has a feature vector (the feature vector can be obtained from the database 102), and the feature vector includes a plurality of attributes of the individual and corresponding attribute values.
  • the k-th round diffusion training set includes 100 user IDs, and each user ID corresponds to multiple attributes and attribute values.
  • Illustrative attributes may include one or more of the region, gender, age, height, monthly income, and the like.
  • the attributes may also include: an interest tag (the interest tag is information for reflecting the user's interest), the number of purchases, and the like.
  • the attribute value is the specific value corresponding to an attribute.
  • For example, if the height is 1.8 m, height is the attribute and 1.8 m is the attribute value of the height.
  • the attribute value can also be an interval.
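  • As an illustration of the feature vector described above, the following minimal sketch stores each individual's attributes and attribute values in a dictionary, with an attribute value being either a single value or an interval. The class name `Individual` and the field and attribute names (`user_id`, `height_m`, and so on) are assumptions made for this example and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

# An attribute value is a single value or a (low, high) interval.
AttributeValue = Union[float, str, Tuple[float, float]]


@dataclass
class Individual:
    """One user in a training set or in the overall population (hypothetical layout)."""
    user_id: str
    # Feature vector: attribute name -> attribute value.
    features: Dict[str, AttributeValue] = field(default_factory=dict)


# Height 1.8 m is a single value; age is given as an interval.
user = Individual("u001", {"height_m": 1.8, "gender": "male", "age": (20, 25)})
```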
  • the above feedback data may include feedback statistics of the k-1th round of the population, and behavior data of each individual in the k-1th round of the population.
  • the feedback statistics are calculated based on the behavioral data of all individuals in the k-1 round.
  • the feedback statistics may include a click rate, and accordingly, each user's behavior data may include data characterizing whether to click.
  • the feedback statistics can include conversion rates.
  • The conversion rate can include the like rate (number of people who liked / number of people), the dislike rate (number of people who disliked / number of people), the comment rate (number of people who commented / number of people), and so on.
  • The behavior data of each user may include APP download information, friend circle like information, friend circle comment information, and even "not interested" information, "dislike" information, and the like.
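  • The relationship between per-user behavior data and the feedback statistics can be sketched as follows. This is a minimal example that assumes each user's behavior data is a dictionary of boolean flags; the flag names (`clicked`, `liked`, `disliked`, `commented`) and the function name are hypothetical.

```python
from typing import Dict, List


def feedback_statistics(behaviors: List[Dict[str, bool]]) -> Dict[str, float]:
    """Compute each rate as (number of people with the behavior) / (number of people)."""
    total = len(behaviors)
    if total == 0:
        return {}
    keys = ("clicked", "liked", "disliked", "commented")
    return {
        key + "_rate": sum(1 for b in behaviors if b.get(key, False)) / total
        for key in keys
    }


# Behavior data of a four-person delivery population.
behaviors = [
    {"clicked": True, "liked": True},
    {"clicked": True},
    {"clicked": False, "disliked": True},
    {"clicked": False},
]
stats = feedback_statistics(behaviors)
```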
  • The processor 1 of the information delivery platform performs the k-th iteration of training using the feature vectors of the individuals in the k-th round diffusion training set described above to obtain the k-th round diffusion model.
  • the present embodiment does not endlessly perform iterative training and advertisement placement. Iterative training and subsequent ad serving are stopped when the stop condition is met.
  • the stop condition can include the number of iterations reaching an upper limit. In another example, the stop condition may include the number of people serving the crowd reaching the number of people requested by the advertiser, and the like.
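  • The two example stop conditions can be sketched as a loop guard. In this minimal sketch, `deliver_round` is a hypothetical callback standing in for one full round (training set generation, training, screening, and delivery) that returns the user IDs delivered to in that round; `max_rounds` and `requested_size` are assumed parameter names.

```python
from typing import Callable, List, Set


def run_rounds(
    deliver_round: Callable[[int], Set[str]],
    max_rounds: int,
    requested_size: int,
) -> List[Set[str]]:
    """Run rounds k = 0, 1, ... until either stop condition is met."""
    delivered: Set[str] = set()
    history: List[Set[str]] = []
    k = 0
    # Stop when the iteration limit is reached or enough people have been served.
    while k < max_rounds and len(delivered) < requested_size:
        audience = deliver_round(k)
        delivered |= audience
        history.append(audience)
        k += 1
    return history


def demo_round(k: int) -> Set[str]:
    # Placeholder round that always delivers to two new users.
    return {f"u{k}a", f"u{k}b"}


history = run_rounds(demo_round, max_rounds=5, requested_size=3)
```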
  • the k-th round diffusion training set described above includes a first positive sample subset and a first negative sample subset. It should be noted that the first and second in the present invention are used for distinguishing, and are not used to indicate the preceding and succeeding order.
  • The individuals in the first positive sample subset are positive samples, while the individuals in the first negative sample subset are negative samples.
  • The trained k-th round diffusion model may include a first distinguishing feature vector (or a first feature weight vector) for distinguishing between positive and negative samples.
  • the first distinguishing feature vector may include: an attribute strongly associated with the training target that distinguishes the positive and negative samples and a corresponding attribute value.
  • the aforementioned feature vector of each individual includes a plurality of attributes.
  • the weight of each attribute relative to the training target is calculated, and the greater the weight, the stronger the association with the training target.
  • the training goal can be the goal of distinguishing between positive and negative samples.
  • the weight may also be negative.
  • For example, for attributes 1-4, whose weights relative to the training target are 2, 0.67, 0.625, and -0.125 respectively, if three attributes are taken, attributes 1-3 are the attributes strongly associated with distinguishing positive and negative samples.
  • The attribute value of an attribute in the first distinguishing feature vector may be an interval or an average value.
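  • The attribute selection in this example can be reproduced with a short sketch that ranks attributes by weight and keeps the top three. The attribute names and the function name are placeholders; the weights are the ones given in the text.

```python
from typing import Dict, List

# Weights of attributes 1-4 relative to the training target, as given in the text.
attribute_weights: Dict[str, float] = {
    "attribute_1": 2.0,
    "attribute_2": 0.67,
    "attribute_3": 0.625,
    "attribute_4": -0.125,
}


def strongest_attributes(weights: Dict[str, float], n: int) -> List[str]:
    """Return the n attributes with the largest weights (greater weight = stronger association)."""
    return sorted(weights, key=weights.get, reverse=True)[:n]


top3 = strongest_attributes(attribute_weights, 3)
```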
  • the first distinguishing feature vector includes an attribute of age and an attribute value
  • the first positive sample subset there are four individuals, and the ages are 20, 25, 15, and 20, respectively
  • the first distinguishing feature vector is
  • Section 303: the k-th round delivery population is screened from the overall population using the k-th round diffusion model.
  • the overall population refers to all users of the platform.
  • For example, in the WeChat scenario the overall population refers to all WeChat users.
  • Each individual (i.e., user) in the overall population can be scored using the k-th round diffusion model, the users are ranked from the highest score to the lowest, and the top N users are selected as the delivery population.
  • The score characterizes the similarity of the individual's feature vector to the aforementioned first distinguishing feature vector. The higher the score, the higher the similarity between the corresponding individual's feature vector and the first distinguishing feature vector.
  • Alternatively, users with a score greater than a certain threshold may be used as the delivery population.
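  • Both screening variants (top N and score threshold) can be sketched as follows. The scoring function here is an assumption: it uses a weighted sum of numeric attribute values as a stand-in for the similarity score, since the text does not give the exact scoring formula. All names and sample values are hypothetical.

```python
from typing import Dict, List

FeatureVector = Dict[str, float]


def score(features: FeatureVector, weights: Dict[str, float]) -> float:
    """Assumed score: weighted sum of the individual's attribute values."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def select_top_n(population: Dict[str, FeatureVector],
                 weights: Dict[str, float], n: int) -> List[str]:
    """Rank users from highest to lowest score and keep the first n."""
    ranked = sorted(population, key=lambda uid: score(population[uid], weights),
                    reverse=True)
    return ranked[:n]


def select_above_threshold(population: Dict[str, FeatureVector],
                           weights: Dict[str, float], threshold: float) -> List[str]:
    """Keep every user whose score is greater than the threshold."""
    return [uid for uid, f in population.items() if score(f, weights) > threshold]


model_weights = {"likes_sports": 2.0, "purchases": 0.5}
population = {
    "u1": {"likes_sports": 1.0, "purchases": 4.0},  # score 4.0
    "u2": {"likes_sports": 0.0, "purchases": 2.0},  # score 1.0
    "u3": {"likes_sports": 1.0, "purchases": 0.0},  # score 2.0
}
```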
  • Section 304: the processor 1 of the information delivery platform (or server) performs the k-th round of information delivery to the above k-th round delivery population through the communication interface 3.
  • Alternatively, the processor 1 of the information delivery platform may output the k-th round delivery population through the communication interface 3, and another platform may perform the information delivery to the k-th round delivery population.
  • After the information delivery, it is determined whether a next round of information delivery is to be performed. If so, the next-round diffusion training set is generated and the subsequent operations are performed.
  • the seed population is packaged as a positive sample subset, and a negative sample subset is randomly selected from the overall population to form a training set;
  • a poor-quality seed population diffuses into a poor-quality delivery population, and delivering advertisements to a poor-quality delivery population leads to a poor delivery effect and harms users' interests;
  • the trained model cannot accurately distinguish the positive and negative samples, resulting in poor accuracy of the model.
  • In this embodiment, the k-th round diffusion training set is adjusted based on the feedback data of the previous round (k-1). Even if the quality of the initial seed population is poor, the feedback data and iterative training can be used to adjust the samples, so that the match between the delivery population and the information improves round by round, thereby improving the accuracy of the diffused population.
  • The feature vectors of all individuals in the k-th round diffusion training set are fed into model training, and the k-th round delivery population is selected with the trained model. The trained model can therefore accurately identify people who are similar to the positive sample subset, which also improves the accuracy of the diffused population.
  • FIG. 4 shows another exemplary flow of the above information delivery method.
  • the method shown in FIG. 4 can be applied to the application scenario shown in FIG. 1, and the processor 1 in the information delivery platform/server shown in FIG. 2 interacts with other components.
  • This embodiment describes, as an example, the round-0 iteration of training and advertisement delivery and the m-th (m not equal to 0) iteration of training and advertisement delivery.
  • m stands for any value of k with k ≠ 0.
  • the exemplary process includes:
  • the information delivery server obtains the initial seed population as a positive sample subset of the Round 0 diffusion training set (the first positive sample subset).
  • Section 401: the information delivery server randomly selects, from the overall population, the same number of individuals as the initial seed population to form the negative sample subset of the round-0 diffusion training set (the first negative sample subset).
  • the information delivery server randomly selects 50,000 users from the overall population as the first negative sample subset.
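  • The random selection of the first negative sample subset can be sketched as follows. This minimal example draws as many users as there are seeds; excluding the seed users from the candidates is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
import random
from typing import List, Sequence, Set


def sample_negatives(overall: Sequence[str], seeds: Set[str],
                     rng: random.Random) -> List[str]:
    """Randomly pick len(seeds) users from the overall population as negatives.

    Assumption: seed users are excluded so that no user is both positive and negative.
    Raises ValueError if there are fewer candidates than seeds.
    """
    candidates = [uid for uid in overall if uid not in seeds]
    return rng.sample(candidates, len(seeds))


overall_population = [f"u{i}" for i in range(100)]
seed_population = {"u1", "u2", "u3"}
negatives = sample_negatives(overall_population, seed_population, random.Random(0))
```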
  • the information delivery server obtains feature vectors of the first positive sample subset and the first negative sample subset.
  • each sample in the 0th round of the diffusion training set has a feature vector.
  • the information delivery server introduces the feature vector of each individual in the first positive sample subset and the first negative sample subset into the first preset model for training learning, and obtains the 0th round diffusion model.
  • The first preset model may be a logistic regression (LR) model; more specific LR models include the Spark ADMMLR model. Because the number of samples is large, the Spark ADMMLR model can be selected for training and learning.
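  • To make the training step concrete, the sketch below fits a small logistic regression with plain batch gradient descent on positive (label 1) and negative (label 0) feature vectors. It is not the Spark ADMMLR model named in the text, only a single-machine stand-in for illustration; the learning rate, epoch count, and toy data are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability that x is a positive sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_lr(X: List[List[float]], y: List[int],
             lr: float = 0.5, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit per-attribute weights w and bias b by batch gradient descent on log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data: the first attribute separates positives from negatives, the second does not.
X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
y = [1, 1, 0, 0]
w, b = train_lr(X, y)
```

  • The learned weights play the role of the per-attribute weights discussed earlier: the attribute that separates the two subsets ends up with the largest weight.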
  • the round 0 diffusion model includes: a first distinguishing feature vector.
  • For a description of the first distinguishing feature vector, refer to the foregoing section 301; it is not repeated here.
  • Section 403 is similar to Section 302 above, and details are not described again.
  • Section 404: the information delivery server uses the round-0 diffusion model to screen the round-0 delivery population from the overall population.
  • Section 404 is similar to Section 303 above, and details are not described again.
  • The information delivery server performs the round-0 information delivery to the above round-0 delivery population and obtains the feedback data of the round-0 delivery population.
  • The feedback data of the round-0 delivery population can be collected after a predetermined period of time, for example 10 minutes, an hour, or a day.
  • The information delivery server generates the m-th round delivery training set according to the feedback data of the (m-1)-th round delivery population.
  • The feedback data of the (m-1)-th round delivery population reflects the effectiveness of the advertisement delivery.
  • The m-th round delivery training set includes a second positive sample subset and a second negative sample subset.
  • The individuals in the second positive sample subset are used as positive samples, and the individuals in the second negative sample subset are used as negative samples.
  • Each individual in the m-th round delivery training set has a feature vector.
  • the feedback data of the k-1th round may include feedback statistics (click rate or conversion rate) of the k-1th round of the crowd, and behavior data of each individual in the k-1th round of the crowd.
  • The behavior data of the individuals in the second positive sample subset has a positive association with the feedback statistics, and the behavior data of the individuals in the second negative sample subset has a reverse association with the feedback statistics.
  • A positive association means that, for a fixed total number of individuals, the feedback statistic increases as the number of individuals having the behavior data increases.
  • Taking the conversion rate as an example, for a fixed total number of individuals, the more individuals (referred to as the converted population) who download the APP, like, comment, and so on, the higher the conversion rate.
  • When the feedback statistic is the click rate, the second positive sample subset of the m-th round delivery training set may be obtained by using the population that clicked as the second positive sample subset.
  • When the feedback statistic is the conversion rate, the second positive sample subset of the m-th round delivery training set may be obtained by using the converted population as the second positive sample subset.
  • A reverse association means that, for a fixed total number of individuals, the feedback statistic decreases as the number of individuals having the behavior data increases.
  • The second negative sample subset of the m-th round delivery training set may be obtained as follows:
  • the individuals remaining in the (m-1)-th round delivery population after the second positive sample subset of the m-th round delivery training set is removed are used as the second negative sample subset of the m-th round delivery training set; or, the (m-1)-th round delivery population is used as the second negative sample subset of the m-th round delivery training set.
  • sampling may also be performed to obtain a second negative sample subset of the mth round of the training set.
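  • The construction of the delivery training set from feedback can be sketched for the click-rate case: users of the previous round's delivery population who clicked become positives, and the remaining users become negatives. The function and parameter names are hypothetical, and the optional sampling of negatives mentioned above is left out.

```python
from typing import List, Sequence, Set, Tuple


def build_delivery_training_set(
    previous_audience: Sequence[str], clicked: Set[str]
) -> Tuple[List[str], List[str]]:
    """Split the previous round's delivery population by click behavior.

    Returns (second positive sample subset, second negative sample subset).
    """
    positives = [uid for uid in previous_audience if uid in clicked]
    negatives = [uid for uid in previous_audience if uid not in clicked]
    return positives, negatives


audience = ["u1", "u2", "u3", "u4"]
positives, negatives = build_delivery_training_set(audience, clicked={"u2", "u4"})
```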
  • In this way the m-th round delivery training set is obtained. Its second positive sample subset is strongly associated with increasing the click rate or conversion rate, while its second negative sample subset is strongly associated with decreasing the click rate or conversion rate.
  • The click rate or conversion rate characterizes the advertisement delivery effect; that is, the positive sample subset of the m-th round delivery training set is positively associated with the advertisement delivery effect.
  • A delivery population that is similar to the positive sample subset and contributes to improving the advertisement delivery effect can therefore be diffused accurately. In the present embodiment, even if the quality of the seed population is poor, an accurate delivery population can be reached gradually.
  • The information delivery server feeds the feature vectors of all individuals in the m-th round delivery training set into the second preset model for training and obtains the m-th round delivery model.
  • The second preset model is similar to the first preset model described above and is not described again here.
  • The m-th round delivery model may include a second distinguishing feature vector for distinguishing between positive samples and negative samples; the second distinguishing feature vector includes the attributes strongly associated with the target of distinguishing positive and negative samples and the corresponding attribute values.
  • The second distinguishing feature vector is similar to the first distinguishing feature vector.
  • The specific content of the second distinguishing feature vector and the first distinguishing feature vector may nevertheless differ.
  • The trained delivery model can distinguish positive and negative samples more accurately than the existing method, so that an accurate diffusion training set can be selected through the delivery model. In this way, even if the seed population is of poor quality, an accurate delivery population can be reached gradually.
  • the information delivery server uses the m-th round delivery model described above to select the m-th round diffusion training set from the m-th round delivery training set and the initial seed population.
  • The first positive sample subset of the m-th round diffusion training set can be obtained as follows:
  • $\mathrm{SeedScore} = \{(u, \mathrm{score}(u))\}$
  • a certain number of users may also be randomly selected from the union of the second positive sample subset and the initial seed group of the m-th round training set to obtain the first positive sample subset of the m-th round diffusion training set.
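  • The random-selection alternative can be sketched as follows: take the union of the second positive sample subset and the initial seed population, then draw a fixed number of users from it. The function and parameter names are hypothetical, and capping the draw at the size of the union is an assumption of this sketch.

```python
import random
from typing import Iterable, List


def sample_positive_subset(second_positive: Iterable[str], seeds: Iterable[str],
                           size: int, rng: random.Random) -> List[str]:
    """Randomly draw up to `size` users from the union of the two groups."""
    # Sort the union so the draw is reproducible for a given random seed.
    union = sorted(set(second_positive) | set(seeds))
    return rng.sample(union, min(size, len(union)))


first_positive = sample_positive_subset(
    ["u1", "u2", "u3"], ["u3", "u4", "u5"], size=4, rng=random.Random(0)
)
```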
  • Each round selects positive samples from the seed population, which preserves the similarity between the initial seed population and the selected k-th round delivery population, so that the delivery population is always diffused on the basis of the seed population.
  • The first negative sample subset of the m-th round diffusion training set can be obtained as follows:
  • $\mathrm{NegativeAD}_{score}$ denotes the set of scores corresponding to the second negative sample subset of the m-th round delivery training set (which may be referred to as the third score set), $i$ denotes a user in the second negative sample subset of the m-th round delivery training set, and $\mathrm{score}(i)$ denotes the score of that user.
  • $p(i)$ represents the probability that the $i$-th individual in the second negative sample subset of the m-th round delivery training set is taken as a negative sample.
  • $num_{neg}$ represents the number of samples in the second negative sample subset of the m-th round delivery training set.
  • $num_{p}$ represents the total number of samples in the m-th round delivery training set.
  • $\sum_{i \in \mathrm{NegativeAD}_{score}} \mathrm{score}(i)$ represents the sum of the scores of all individuals in the second negative sample subset of the m-th round delivery training set.
  • steps (1)-(3) are performed on each negative sample in the m-th round of the training set, and finally the first negative sample subset in the m-th round diffusion training set is obtained.
  • The information delivery server feeds the feature vector of each individual in the m-th round diffusion training set into the first preset model for training and obtains the m-th round diffusion model.
  • Section 409 is similar to Section 403 and will not be repeated here.
  • Because both the negative samples and the positive samples in the m-th round diffusion training set carry feature vectors, the trained m-th round diffusion model can distinguish positive and negative samples more accurately than the existing method, so that an accurate delivery population can be screened out subsequently.
  • Section 410: the information delivery server uses the m-th round diffusion model to screen the m-th round delivery population from the overall population.
  • Section 410 is similar to Section 404 and will not be described here.
  • The information delivery server performs information delivery to the above m-th round delivery population and obtains the feedback data of the m-th round delivery population.
  • Section 411 is similar to Section 405 and will not be repeated here.
  • This embodiment describes the round-0 iteration and the subsequent iterations in detail. Even if the seed population is of poor quality, the feedback data and the feature vectors make it possible to diffuse to a delivery population with a better delivery effect.
  • FIG. 6 is a schematic diagram showing a possible structure of the information placing apparatus involved in the foregoing embodiment, including:
  • the diffusion training set generating unit 601 is configured to generate a k-th round diffusion training set according to the initial seed population and the feedback data of the (k-1)-th round delivery population;
  • each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
  • the training unit 602 is configured to perform the k-th iteration of training using the feature vectors of the individuals in the k-th round diffusion training set to obtain the k-th round diffusion model;
  • the screening unit 603 is configured to screen the k-th round delivery population from the overall population by using the k-th round diffusion model; the k-th round delivery population is used for the k-th round of information delivery.
  • the information placing apparatus may further include:
  • the advertisement delivery unit 604 is configured to perform the k-th round of information delivery to the above k-th round delivery population.
  • the diffusion training set generation unit 601 can be used to execute the portion 301 of the embodiment shown in FIG. 3; in addition, portions 400-402 of the embodiment shown in FIG. 4 can also be performed.
  • Training unit 602 can be used to perform portion 302 of the embodiment shown in FIG. 3; in addition, portions 403, 406-409 of the embodiment shown in FIG. 4 can also be implemented.
  • the screening unit 603 can be used to perform the portion 303 of the embodiment shown in Figure 3; in addition, portions 404, 410 of the embodiment shown in Figure 4 can also be performed.
  • Ad placement 604 can be used to perform portion 304 of the embodiment shown in Figure 3; in addition, portions 405, 411 of the embodiment shown in Figure 4 can also be implemented.
  • the embodiment of the present application further provides an information delivery server, which may include any of the information delivery devices described above.
  • the composition of the information delivery server can be as shown in FIG. 1.
  • the processor executes the program code stored in the memory and, according to the instructions in the program code, performs the following steps:
  • each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
  • the k-th round delivery population is screened from the overall population using the k-th round diffusion model; the k-th round delivery population is used for the k-th round of information delivery.
  • the k-th round diffusion training set includes a first positive sample subset and a first negative sample subset; the individuals in the first positive sample subset serve as positive samples, and the individuals in the first negative sample subset serve as negative samples;
  • the processor is configured to execute the instructions to perform the following steps, including: acquiring the initial seed population as the first positive sample subset;
  • randomly selecting, from the overall population, a population equal in size to the initial seed population as the first negative sample subset.
  • the processor performs the following steps according to the instructions in the program code, including:
  • generating the k-th round delivery training set; each individual in the k-th round delivery training set has a feature vector;
  • screening the k-th round diffusion training set from the k-th round delivery training set and the initial seed population.
  • the k-th round delivery training set includes a second positive sample subset and a second negative sample subset; the individuals in the second positive sample subset serve as positive samples, and the individuals in the second negative sample subset serve as negative samples;
  • the processor is configured to execute the instructions to perform the following steps, including:
  • the feedback data includes feedback statistics of the (k-1)-th round delivery population and behavior data of each individual in the (k-1)-th round delivery population;
  • the feedback statistics are calculated from the behavior data of the individuals in the (k-1)-th round delivery population;
  • the second positive sample subset includes individuals whose behavior data has a positive relationship with the feedback statistics, and each individual in the second positive sample subset has a corresponding feature vector;
  • the second negative sample subset is screened from the (k-1)-th round delivery population; each individual in the second negative sample subset has a feature vector.
  • in screening the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps, including:
  • taking the union of the filtered seed population and the filtered second positive sample subset as the positive sample subset of the k-th round diffusion training set.
  • in screening the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps, including:
  • placing the i-th individual into the first negative sample subset of the k-th round diffusion training set.
  • an embodiment of the present invention further provides a storage medium for storing program code, where the program code is used to execute any of the foregoing information delivery methods.
  • embodiments of the present invention also provide a computer program product comprising instructions that, when run on a computer, cause the computer to perform the information delivery method described above.
  • the steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware, or by a processor executing software instructions.
  • the software instructions may be comprised of corresponding software modules that may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable hard disk, CD-ROM, or any other form of storage well known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in the user equipment.
  • the processor and the storage medium may also reside as discrete components in the user equipment.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

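An information delivery method and apparatus: a k-th round diffusion training set is generated according to an initial seed population and feedback data of the (k-1)-th round delivery population (301); a k-th round of iterative training is performed using the feature vectors of the individual samples in the k-th round diffusion training set to obtain a k-th round diffusion model (302); and the k-th round diffusion model is used to screen a k-th round delivery population from the overall population (303) for the k-th round of information delivery (304). The k-th round diffusion training set is adjusted based on the feedback data of the previous round's delivery population, so that the delivery population matches the information increasingly well, which improves the precision of the delivery population.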

Description

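Information delivery method, apparatus, and server
This application claims priority to Chinese Patent Application No. 2017100818432, entitled "Information Delivery Method and Apparatus", filed with the Chinese Patent Office on February 15, 2017, which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer technologies, and in particular to an information delivery method, apparatus, and server.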
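Background
Information delivery is currently needed in many fields, for example, advertisement pushing in WeChat Moments, or paper recommendation on a paper search website.
An existing information delivery approach performs population diffusion based on a seed population to obtain a delivery population, and then delivers information to the delivery population. A seed population is a group of people who have the same needs and interests regarding a product or service in a specific application scenario. In theory, the delivery population should have the same needs and interests as the seed population, but it is several times, tens of times, or even hundreds or thousands of times larger.
The precision of the delivery population diffused from the seed population determines the accuracy of information delivery. How to diffuse a delivery population precisely is therefore an active research topic.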
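Summary
In view of this, embodiments of the present invention provide an information delivery method and apparatus, so as to improve the accuracy of the diffused delivery population and thereby the accuracy of information delivery.
To achieve the above objective, the embodiments of the present invention provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides an information delivery method, applied to an information delivery server, including:
generating a k-th round diffusion training set according to an initial seed population and feedback data of the (k-1)-th round delivery population, where k is an integer; each individual in the k-th round diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
performing a k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain a k-th round diffusion model;
screening a k-th round delivery population from the overall population using the k-th round diffusion model, the k-th round delivery population being used for the k-th round of information delivery.
In a second aspect, an embodiment of the present invention further provides an information delivery apparatus, including a memory and a processor, the memory being configured to store instructions, and the processor being configured to execute the instructions to perform the following steps:
generating a k-th round diffusion training set according to an initial seed population and feedback data of the (k-1)-th round delivery population, where k is an integer; each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
performing a k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain a k-th round diffusion model;
screening a k-th round delivery population from the overall population using the k-th round diffusion model, the k-th round delivery population being used for the k-th round of information delivery.
In a third aspect, an embodiment of the present invention further provides a storage medium configured to store program code, the program code being used to perform the information delivery method provided in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer program product including instructions which, when run on a computer, cause the computer to perform the information delivery method provided in the first aspect.
In the embodiments of the present invention, the k-th round diffusion training set is adjusted based on the feedback data of the delivery population of the previous round (k-1). Even if the initial seed population is of poor quality, the samples can be adjusted through the feedback data and multiple rounds of iterative training, so that the delivery population matches the information increasingly well, which improves the precision of the diffused delivery population.
In addition, during iterative training, the feature vectors of all individuals in the k-th round diffusion training set are used for model training, and the k-th round delivery population is screened according to the trained model. This helps the trained model accurately identify the people who are similar to the positive sample subset, which also improves the precision of the diffused delivery population.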
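Brief Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention;
FIG. 2 is an example diagram of the computer architecture of an information delivery platform or server according to an embodiment of the present invention;
FIGS. 3 to 5 are exemplary flowcharts of an information delivery method according to embodiments of the present invention;
FIG. 6 is an exemplary structural diagram of an information delivery apparatus according to an embodiment of the present invention.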
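Detailed Description
The present invention provides an information delivery method and apparatus, which may be applied in various fields that require population diffusion, for example, population diffusion and advertisement delivery in WeChat Moments.
FIG. 1 shows an application scenario of the information delivery apparatus. The scenario may include an information delivery platform 101 and a database 102.
The functions of the information delivery platform 101 may be implemented by one or more information delivery servers.
In the present invention, the information delivery platform 101 is mainly responsible for diffusing the initial seed population to obtain a delivery population, and for delivering information to the clients of the delivery population.
The information delivery apparatus may be applied in the information delivery server in the form of software, or may be a component of the information delivery server in the form of hardware (for example, a controller/processor of the information delivery server).
When it exists as software, the information delivery apparatus may be an application, such as a terminal application, or a component or plug-in of an application or operating system.
The database 102 may store the unique user identifier (ID) and basic information of every user on the information delivery platform, as well as the attributes and attribute values of each user.
The basic information may include a mobile phone number, an email address, and the like. The attributes may include, by way of example, one or more of region, gender, age, height, and the like. In some application scenarios, the attributes may further include interest tags (information reflecting a user's interests), number of purchases, and so on, which are not enumerated here. The functions of the database 102 may be implemented by one or more database nodes.
In practice, the functions of the information delivery platform 101 and the database 102 may also be implemented by the same server.
In addition, in some application scenarios, the database 102 may also provide the initial seed population. Of course, the initial seed population may also be provided by an information publisher such as an advertiser.
Depending on the information it provides, the database 102 may further consist of one or more servers. For example, the database 102 may include a basic information server, a user profile engine (which can be queried for interest tags), and the like.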
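FIG. 2 shows a general-purpose computer system structure of the information delivery platform, server, or apparatus.
The computer system may include a bus, a processor 1, a memory 2, a communication interface 3, an input device 4, and an output device 5, which are interconnected through the bus. Specifically:
The bus may include a path for transferring information between the components of the computer system.
The processor 1 may be a general-purpose processor, such as a general-purpose central processing unit (CPU), a network processor (NP), or a microprocessor; it may be an application-specific integrated circuit (ASIC) or one or more integrated circuits for controlling the execution of the program of the present solution; it may also be a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The memory 2 stores the program that implements the technical solution of the present invention, and may also store an operating system and other key services. Specifically, the program may include program code, and the program code includes computer operation instructions. More specifically, the memory 2 may include read-only memory (ROM), other types of static storage devices that can store static information and instructions, random access memory (RAM), other types of dynamic storage devices that can store information and instructions, disk storage, flash memory, and so on.
The input device 4 may include an apparatus that receives data and information input by a user, such as a keyboard, a mouse, a camera, a scanner, a light pen, a voice input apparatus, a touch screen, a pedometer, or a gravity sensor.
The output device 5 may include an apparatus that allows information to be output to a user, such as a display screen, a printer, or a speaker.
The communication interface 3 may include any transceiver-type apparatus for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The processor 1 executes the program stored in the memory 2 and invokes other devices, and can thereby implement the steps of the information delivery method provided in the embodiments of the present invention.
Based on the common aspects of the present invention described above, the embodiments of the present invention are described in further detail below.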
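FIG. 3 shows an exemplary flow of the information delivery method. The method shown in FIG. 3 is applied in the field or application scenario mentioned for FIG. 1 and is completed by the processor 1 of the information delivery platform (or server) shown in FIG. 2 interacting with other devices.
The exemplary flow includes:
In part 301: the processor 1 of the information delivery platform (or server) generates a k-th round diffusion training set according to the initial seed population and the feedback data of the (k-1)-th round delivery population.
Here, k is an integer that starts at 0 and increases by one each round.
The initial seed population is a group of people who have the same needs and interests regarding a product or service in a specific application scenario (for example, WeChat Moments).
In an advertisement delivery scenario, the initial seed population may be provided by the advertiser, who may upload an initial seed population package for this purpose. The contents of the package may include at least one of phone numbers, accounts, email addresses, unique user identifiers (IDs), and the like. Phone numbers, accounts, and email addresses can be converted to user IDs through their association with user IDs.
In other application scenarios, the initial seed population may also be obtained from one or more databases, or from a transaction platform.
Each individual in the k-th round diffusion training set and in the (k-1)-th round delivery population corresponds to one user on the platform and is identified by a user ID. Further, each individual has a feature vector (which can be obtained from the database 102), and the feature vector includes multiple attributes of the individual and the corresponding attribute values.
For example, the k-th round diffusion training set includes 100 user IDs, and each user ID corresponds to multiple attributes and attribute values.
The attributes may include, by way of example, one or more of region, gender, age, height, monthly income, and the like. In some application scenarios, the attributes may further include interest tags (information reflecting a user's interests), number of purchases, and so on.
An attribute value is the specific value of an attribute. For example, for a height of 1.8 m, height is the attribute and 1.8 m is its attribute value. An attribute value may also be an interval.
The feedback data may include feedback statistics of the (k-1)-th round delivery population and behavior data of each individual in the (k-1)-th round delivery population.
The feedback statistics are calculated from the behavior data of all individuals in the (k-1)-th round delivery population.
In one example, the feedback statistics may include the click-through rate; correspondingly, the behavior data of each user may include data indicating whether the user clicked.
The click-through rate is calculated as clicks divided by impressions. For example, assume the (k-1)-th round delivery population has 100 people, who open the WeChat client 1000 times in total during period k-1 and click a certain advertisement slot 10 times in total. The click-through rate is then 10/1000 = 1%.
In another example, the feedback statistics may include the conversion rate. The conversion rate may include the like rate (number of people who liked / number of people delivered to), the dislike rate (number of people who disliked / number of people delivered to), the comment rate (number of people who commented / number of people delivered to), and the like.
Correspondingly, the behavior data of each user may include app download information, Moments like information, Moments comment information, and even "not interested" and "dislike" clicks.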
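The feedback statistics described above are simple ratios. The following is a minimal sketch of how they could be computed; the function names `click_through_rate` and `per_person_rate` are illustrative assumptions and do not appear in the original text.

```python
def click_through_rate(clicks, impressions):
    """Click-through rate = clicks / impressions (hypothetical helper)."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def per_person_rate(acting_users, delivered_users):
    """Like, dislike, or comment rate = acting users / delivered users."""
    if delivered_users <= 0:
        raise ValueError("delivered_users must be positive")
    return acting_users / delivered_users


# Worked example from the text: 10 clicks over 1000 impressions.
ctr = click_through_rate(10, 1000)
print(f"{ctr:.2%}")  # 1.00%
```

Note that the click-through rate uses impressions as its denominator, while the conversion-style rates use the number of people delivered to.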
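In part 302: the processor 1 of the information delivery platform (or server) performs the k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain the k-th round diffusion model.
It should be noted that this embodiment does not perform iterative training and advertisement delivery endlessly. Iterative training and subsequent advertisement delivery stop when a stop condition is met.
In one example, the stop condition may include the number of iterations reaching an upper limit. In another example, the stop condition may include the size of the delivery population reaching the number required by the advertiser.
The k-th round diffusion training set includes a first positive sample subset and a first negative sample subset. It should be noted that "first" and "second" in the present invention are used only for distinction and do not indicate an order.
The individuals in the first positive sample subset are positive samples, and the individuals in the first negative sample subset are negative samples.
Correspondingly, the trained k-th round diffusion model may include a first discriminative feature vector (also called the first feature-value weight vector) used to distinguish positive samples from negative samples.
The first discriminative feature vector may include the attributes that are strongly associated with the training objective of distinguishing positive from negative samples, together with the corresponding attribute values.
As mentioned above, the feature vector of each individual includes multiple attributes. During iterative training, the weight of each attribute relative to the training objective is calculated; the larger the weight, the stronger the association with the training objective. Here, the training objective may be to distinguish positive from negative samples.
Because there is a first negative sample subset, a weight may also be negative.
For example, assume there are four attributes, 1 to 4, whose weights relative to the training objective are 2, 0.67, 0.625, and -0.125. If three attributes are taken, attributes 1 to 3 are the attributes strongly associated with distinguishing positive from negative samples.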
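The attribute selection in this example can be sketched as a sort by weight. This is an illustration only: the function name `top_attributes` and the attribute labels are assumptions, and ranking by signed weight (rather than by absolute value) is inferred from the example, where the negatively weighted attribute 4 is the one left out.

```python
def top_attributes(weights, n):
    """Return the n attribute names with the largest weights (hypothetical helper)."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:n]


# Weights from the example in the text.
weights = {
    "attribute_1": 2.0,
    "attribute_2": 0.67,
    "attribute_3": 0.625,
    "attribute_4": -0.125,
}
print(top_attributes(weights, 3))
# ['attribute_1', 'attribute_2', 'attribute_3']
```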
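It should be noted that the attribute value of an attribute in the first discriminative feature vector may be a value interval or an average value.
For example, assume the first discriminative feature vector includes the attribute age and its attribute value, and the first positive sample subset contains four individuals aged 20, 25, 15, and 20. The attribute value of age in the first discriminative feature vector may then be [15, 25], or it may be (20+25+15+20)/4 = 20.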
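Both forms of the attribute value in the age example can be derived directly from the positive samples. A small sketch follows; the helper name `summarize_attribute` is an assumption made for illustration.

```python
def summarize_attribute(values):
    """Return (lower bound, upper bound, mean) of an attribute over the positive samples."""
    if not values:
        raise ValueError("values must not be empty")
    return min(values), max(values), sum(values) / len(values)


ages = [20, 25, 15, 20]
low, high, mean = summarize_attribute(ages)
print([low, high], mean)  # [15, 25] 20.0
```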
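In part 303: the k-th round delivery population is screened from the overall population using the k-th round diffusion model.
Depending on the application scenario, the overall population refers to all users of the platform. For example, on the WeChat platform, the overall population refers to all WeChat users.
In one example, the k-th round diffusion model may be used to score each individual (that is, each user) in the overall population; the users are sorted by score from high to low, and the top N users are selected as the delivery population.
The score represents the similarity between an individual's feature vector and the first discriminative feature vector. The higher the score, the more similar the individual's feature vector is to the first discriminative feature vector.
The value of N is determined by the delivery scale chosen by the advertiser. For example, if the delivery scale is 100,000, then N = 100,000.
Alternatively, the users whose scores exceed a certain threshold may be taken as the delivery population.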
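The two selection rules in part 303 (top N by score, or score above a threshold) can be sketched as follows. The scores are taken as given, since the text does not fix how the diffusion model computes them; the function names and the tie-break by user ID are assumptions made so that the example is deterministic.

```python
def select_top_n(scores, n):
    """Return the n user IDs with the highest scores, highest first."""
    ranked = sorted(scores, key=lambda user: (-scores[user], user))
    return ranked[:n]


def select_above_threshold(scores, threshold):
    """Return the set of user IDs whose score is greater than the threshold."""
    return {user for user, value in scores.items() if value > threshold}


scores = {"u1": 0.9, "u2": 0.2, "u3": 0.7}
print(select_top_n(scores, 2))              # ['u1', 'u3']
print(select_above_threshold(scores, 0.5))  # the set containing u1 and u3
```

Top-N selection fixes the size of the delivery population, which suits an advertiser-specified delivery scale, while thresholding fixes the minimum similarity and lets the size vary.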
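In part 304: the processor 1 of the information delivery platform (or server) performs the k-th round of information delivery to the k-th round delivery population through the communication interface 3.
Alternatively, the processor 1 of the information delivery platform (or server) may output the k-th round delivery population through the communication interface 3, and another platform delivers the information to the k-th round delivery population.
After information delivery, it is determined whether to perform a next round of information delivery. If so, the next round's diffusion training set is generated and the subsequent operations are performed.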
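Parts 301 to 304, together with the stop conditions mentioned in part 302, form a loop. The skeleton below is a hedged sketch of that loop: all four callbacks (`build_training_set`, `train`, `screen`, `deliver`) are hypothetical placeholders supplied by the caller, and treating the required population size as a cumulative count of distinct users is an assumption, since the text does not say how that size is counted.

```python
def run_delivery_rounds(seed, build_training_set, train, screen, deliver,
                        max_rounds, required_population_size=None):
    """Run rounds k = 0, 1, ... until a stop condition is met.

    Returns the list of delivery populations, one per completed round.
    """
    feedback = None          # there is no feedback before round 0
    reached = set()          # distinct users delivered to so far (assumption)
    history = []
    for _ in range(max_rounds):                 # stop: iteration upper limit
        training_set = build_training_set(seed, feedback)   # part 301
        model = train(training_set)                         # part 302
        population = screen(model)                          # part 303
        feedback = deliver(population)                      # part 304
        history.append(population)
        reached.update(population)
        if (required_population_size is not None
                and len(reached) >= required_population_size):
            break                               # stop: required size reached
    return history
```

Passing the previous round's feedback into `build_training_set` is what distinguishes this loop from the one-shot conventional approach.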
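It should be noted that the conventional method of population diffusion is as follows:
1. Use the seed population package as the positive sample subset, randomly select a negative sample subset from the overall population, and combine them into a training set;
2. Train a linear logistic regression (LR) model using the training set;
3. Use the trained LR model to make predictions over the overall population and take the top N users as the delivery population.
Its drawbacks are:
A poor-quality seed population diffuses into a poor-quality delivery population, and delivering advertisements to a poor-quality delivery population yields poor delivery results and harms the interests of users;
Because the randomly drawn negative sample subset carries no feature information, the trained model cannot precisely distinguish positive from negative samples, so the model is imprecise.
In the embodiments of the present invention, by contrast, the k-th round diffusion training set is adjusted based on the feedback data of the delivery population of the previous round (k-1). Even if the initial seed population is of poor quality, the samples can be adjusted through the feedback data and multiple rounds of iterative training, so that the delivery population matches the information increasingly well, which improves the precision of the diffused delivery population.
In addition, during iterative training, the feature vectors of all individuals in the k-th round diffusion training set are used for model training, and the k-th round delivery population is screened according to the trained model. This helps the trained model accurately identify the people who are similar to the positive sample subset, which also improves the precision of the diffused delivery population.
The technical solution of the present invention is further described below using an advertisement delivery scenario as an example.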
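FIG. 4 shows another exemplary flow of the information delivery method. The method shown in FIG. 4 may be applied in the application scenario shown in FIG. 1 and is completed by the processor 1 in the information delivery platform/server shown in FIG. 2 interacting with other components.
Because there are multiple iterations, this embodiment is described using the 0th iteration of training and advertisement delivery and the m-th (m ≠ 0) iteration of training and advertisement delivery as examples. Here m stands for any value of k with k ≠ 0.
The exemplary flow includes:
In part 400: the information delivery server acquires the initial seed population as the positive sample subset (the first positive sample subset) of the 0th round diffusion training set.
For the initial seed population, see part 301 of the foregoing embodiment; details are not repeated here.
In part 401: the information delivery server randomly selects, from the overall population, a population equal in size to the initial seed population as the negative sample subset (the first negative sample subset) of the 0th round diffusion training set.
For example, if the initial seed population contains 50,000 users, the information delivery server randomly selects 50,000 users from the overall population as the first negative sample subset.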
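Parts 400 and 401 can be sketched as below. The function name `build_round0_training_set` and the (user ID, label) representation are assumptions for illustration. The text does not say whether seed users are excluded when drawing negatives, so this sketch samples from the whole population as stated and does not exclude them.

```python
import random


def build_round0_training_set(seed_users, all_users, rng=None):
    """Return a list of (user_id, label) pairs: seeds labeled 1, random users labeled 0."""
    if rng is None:
        rng = random.Random()
    positives = list(dict.fromkeys(seed_users))             # de-duplicate, keep order
    negatives = rng.sample(list(all_users), len(positives))  # equal-sized random draw
    return [(u, 1) for u in positives] + [(u, 0) for u in negatives]


all_users = ["u%d" % i for i in range(100)]
training_set = build_round0_training_set(["u1", "u2", "u3"], all_users,
                                         random.Random(0))
print(len(training_set))  # 6: three positives and three negatives
```

Part 402 would then attach a feature vector to every user ID in this set before training.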
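For the overall population, see part 303 of the foregoing embodiment; details are not repeated here.
In part 402: the information delivery server obtains the feature vectors of the first positive sample subset and the first negative sample subset.
For feature vectors, see part 301 of the foregoing embodiment; details are not repeated here.
In this way, every sample in the 0th round diffusion training set has a feature vector.
In part 403: the information delivery server imports the feature vector of each individual in the first positive sample subset and the first negative sample subset into a first preset model for training, and obtains the 0th round diffusion model.
More specifically, the first preset model may be a logistic regression (LR) model, which may be further refined to models such as the Spark ADMMLR model. Because the number of samples is large, the Spark ADMMLR model may be chosen for training.
Of course, in other embodiments of the present invention, other models, such as decision trees or support vector machines, may be chosen.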
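To make the training step in part 403 concrete, the following is a minimal logistic regression trained by plain stochastic gradient descent in pure Python. It is a stand-in for illustration only: it is not the Spark ADMMLR model named in the text, and the function names, learning rate, epoch count, and toy feature vectors are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.1, epochs=500):
    """Fit per-attribute weights and a bias by stochastic gradient descent on log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            predicted = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = predicted - y                 # gradient of the log loss w.r.t. z
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def score(weights, bias, x):
    """Model score in (0, 1); higher means closer to the positive samples."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Toy feature vectors: positives load on the first attribute, negatives on the second.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]
weights, bias = train_logistic_regression(features, labels)
print(score(weights, bias, [1.0, 0.0]) > 0.5)  # True
```

The learned weights play the role of the per-attribute weights discussed in part 302: the first attribute receives a positive weight and the second a negative one.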
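The 0th round diffusion model includes the first discriminative feature vector. For the first discriminative feature vector, see part 302 above; details are not repeated here.
Part 403 is similar to part 302 above, and the details are not repeated.
In part 404: the information delivery server screens the 0th round delivery population from the overall population using the 0th round diffusion model.
Part 404 is similar to part 303 above, and the details are not repeated.
In part 405: the information delivery server performs the 0th round of information delivery to the 0th round delivery population and obtains feedback data of the 0th round delivery population.
More specifically, after the 0th round of information delivery, the server may wait for a predetermined duration before obtaining the feedback data of the 0th round delivery population, for example, 10 minutes, one hour, or one day.
For feedback data, see part 301 of the foregoing embodiment; details are not repeated here.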
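In part 406: the information delivery server generates the m-th round delivery training set according to the feedback data of the (m-1)-th round delivery population.
In an advertisement delivery scenario, the feedback data of the (m-1)-th round delivery population reflects the advertisement delivery effect.
The m-th round delivery training set includes a second positive sample subset and a second negative sample subset. The individuals in the second positive sample subset serve as positive samples, and the individuals in the second negative sample subset serve as negative samples. As with the 0th round diffusion training set, each individual in the m-th round delivery training set has a feature vector.
As described above, the feedback data of round k-1 may include the feedback statistics (click-through rate or conversion rate) of the (k-1)-th round delivery population and the behavior data of each individual in the (k-1)-th round delivery population.
Correspondingly, the behavior data of the individuals in the second positive sample subset has a positive association with the feedback statistics, and the behavior data of the individuals in the second negative sample subset has an inverse association with the feedback statistics.
It should be noted that a positive association means that, for a fixed total, the feedback statistic increases as the number of individuals having that behavior data increases.
Taking the click-through rate as the feedback statistic, for a fixed total, the more individuals who clicked (the click population), the higher the click-through rate.
As another example, taking the conversion rate as the feedback statistic, for a fixed total, the more individuals who downloaded the app, liked, commented, and so on (the conversion population), the higher the conversion rate.
Further, if the click-through rate is the delivery objective, the second positive sample subset of the m-th round delivery training set may be obtained by taking the click population as the second positive sample subset.
If the conversion rate is the delivery objective, the second positive sample subset of the m-th round delivery training set may be obtained by taking the conversion population as the second positive sample subset.
An inverse association means that, for a fixed total, the feedback statistic decreases as the number of individuals having that behavior data increases.
Whether the click-through rate or the conversion rate is the delivery objective, the second negative sample subset of the m-th round delivery training set may be obtained as follows:
placing the individuals that remain in the (m-1)-th round delivery population after the second positive sample subset of the m-th round delivery training set is removed into the second negative sample subset of the m-th round delivery training set; or taking the (m-1)-th round delivery population as the second negative sample subset of the m-th round delivery training set.
Of course, if the second positive sample subset and the second negative sample subset are desired to have the same number of samples, sampling may additionally be performed to obtain the second negative sample subset of the m-th round delivery training set.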
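Part 406 can be sketched as a split of the previous round's delivery population by observed behavior. The function name `build_delivery_training_set` and the `balance` flag are assumptions. The sketch implements the first option for negatives (everyone who did not act) plus the optional equal-size sampling; the alternative of using the whole previous delivery population as negatives is not shown.

```python
import random


def build_delivery_training_set(previous_population, acted_users,
                                balance=False, rng=None):
    """Split the (m-1)-th round delivery population into positives and negatives.

    acted_users: users who clicked (click-through objective) or converted
    (conversion objective) during round m-1.
    """
    acted = set(acted_users)
    positives = [u for u in previous_population if u in acted]
    negatives = [u for u in previous_population if u not in acted]
    if balance and len(negatives) > len(positives):
        if rng is None:
            rng = random.Random()
        negatives = rng.sample(negatives, len(positives))  # equalize subset sizes
    return positives, negatives


population = ["u1", "u2", "u3", "u4", "u5", "u6"]
positives, negatives = build_delivery_training_set(population, {"u2", "u5"})
print(positives)  # ['u2', 'u5']
print(negatives)  # ['u1', 'u3', 'u4', 'u6']
```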
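In this embodiment, the m-th round delivery training set is obtained based on the feedback statistics (click-through rate or conversion rate) of the delivery population of the previous round (m-1) and the behavior data of its individuals. As a result, the second positive sample subset of the m-th round delivery training set is strongly associated with raising the click-through rate or conversion rate, and the second negative sample subset is strongly associated with lowering it.
Because the click-through rate or conversion rate represents the advertisement delivery effect, the m-th round delivery training set (its positive sample subset) is strongly and positively associated with the advertisement delivery effect. The m-th round delivery model obtained from the m-th round delivery training set can therefore precisely diffuse a delivery population that is similar to the positive sample subset and helps improve the advertisement delivery effect. Consequently, in this embodiment, a precise delivery population can gradually be diffused even if the seed population is of poor quality.
In part 407: the information delivery server imports the feature vectors of all individuals in the m-th round delivery training set into a second preset model for training, and obtains the m-th round delivery model.
The second preset model is similar to the first preset model described above and is not described again here.
The m-th round delivery model may include a second discriminative feature vector used to distinguish positive samples from negative samples; the second discriminative feature vector includes the attributes that are strongly associated with the objective of distinguishing positive from negative samples, together with the corresponding attribute values.
The second discriminative feature vector is similar to the first discriminative feature vector; see part 302 above for details, which are not repeated here.
It should be noted that because the training sets differ, the specific contents of the second discriminative feature vector and the first discriminative feature vector differ accordingly.
In this embodiment, because both the negative samples and the positive samples carry feature vectors, the trained delivery model distinguishes positive from negative samples more precisely than the existing approach, so a precise diffusion training set can subsequently be screened using the delivery model. In this way, a precise delivery population can gradually be diffused even if the seed population is of poor quality.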
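In part 408: the information delivery server uses the m-th round delivery model to screen the m-th round diffusion training set from the m-th round delivery training set and the initial seed population.
More specifically, the first positive sample subset of the m-th round diffusion training set may be obtained as follows:
The m-th round delivery model is used to score each individual in the initial seed population, giving the set SeedScore = {(u, score(u)) | u ∈ seedUser}, where SeedScore denotes the score set of the seed population (which may be called the first score set), u denotes a user in the initial seed population, and score(u) denotes the score of that user.
The individuals in the initial seed population whose scores are less than a first threshold θ1 are filtered out, giving the filtered seed population, which may also be called the first subset P1: P1 = {u | score(u) > θ1, (u, score(u)) ∈ SeedScore}.
The m-th round delivery model is used to score each individual in the second positive sample subset of the m-th round delivery training set, giving the set PositiveADScore = {(u, score(u)) | u ∈ {second positive sample subset of the m-th round delivery training set}}, where PositiveADScore denotes the score set of the second positive sample subset of the m-th round delivery training set (which may be called the second score set).
The individuals in the second positive sample subset of the m-th round delivery training set whose scores are less than a second threshold θ2 are filtered out, giving the filtered second positive sample subset, which may also be called the second subset P2: P2 = {u | score(u) > θ2, (u, score(u)) ∈ PositiveADScore}. θ1 and θ2 may be equal or different.
The union of the filtered seed population and the filtered second positive sample subset is taken as the first positive sample subset P of the m-th round diffusion training set, that is, P = P1 ∪ P2.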
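The construction of P = P1 ∪ P2 can be sketched as follows. The function names and the `score` callback (standing in for the m-th round delivery model) are assumptions. The prose removes individuals scoring below the threshold while the set notation uses a strict inequality; this sketch follows the prose and keeps scores equal to the threshold.

```python
def filter_by_score(users, score, threshold):
    """Keep users whose score is not below the threshold."""
    return {u for u in users if score(u) >= threshold}


def build_positive_subset(seed_users, second_positive_subset, score,
                          theta1, theta2):
    """Return P = P1 | P2 as described in part 408."""
    p1 = filter_by_score(seed_users, score, theta1)              # filtered seeds
    p2 = filter_by_score(second_positive_subset, score, theta2)  # filtered positives
    return p1 | p2


# Toy scores standing in for the m-th round delivery model.
scores = {"s1": 0.9, "s2": 0.3, "c1": 0.6, "c2": 0.4}
positive_subset = build_positive_subset(["s1", "s2"], ["c1", "c2"],
                                        scores.get, 0.5, 0.5)
print(sorted(positive_subset))  # ['c1', 's1']
```

Using a set union means a user who is both a seed and a responder from the previous round appears only once in the positive subset.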
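In other embodiments, a certain number of users may instead be randomly drawn from the union of the second positive sample subset of the m-th round delivery training set and the initial seed population to obtain the first positive sample subset of the m-th round diffusion training set.
In this embodiment, positive samples are selected from the seed population in every round, which preserves the similarity between the initial seed population and the screened k-th round delivery population, so that population diffusion is carried out while the similarity between the delivery population and the seed population is maintained.
The first negative sample subset of the m-th round diffusion training set may be obtained as follows:
(1) The m-th round delivery model is used to score the individuals in the second negative sample subset of the m-th round delivery training set, giving the set NegativeADScore = {(i, score(i)) | i ∈ {second negative sample subset of the m-th round delivery training set}}. NegativeADScore denotes the score set of the second negative sample subset of the m-th round delivery training set (which may be called the third score set), i denotes a user in the second negative sample subset of the m-th round delivery training set, and score(i) denotes the score of that user.
(2) The second negative sample subset of the m-th round delivery training set is sampled using a Bernoulli distribution; the sampling formula is: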
Figure PCTCN2018075521-appb-000001
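where p(i) denotes the probability that the i-th individual in the second negative sample subset of the m-th round delivery training set serves as a negative sample, num_neg denotes the number of samples in the second negative sample subset of the m-th round delivery training set, num_p denotes the total number of samples in the m-th round delivery training set, score(i) ∈ NegativeADScore, and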
Figure PCTCN2018075521-appb-000002
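denotes the sum of the scores of all individuals in the second negative sample subset of the m-th round delivery training set.
(3) A pure decimal (a number between 0 and 1) is generated at random; if p(i) is less than or equal to this random number, the i-th individual is placed into the first negative sample subset of the m-th round diffusion training set.
Steps (1) to (3) are performed for every negative sample in the m-th round delivery training set, finally giving the first negative sample subset of the m-th round diffusion training set.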
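The acceptance rule in step (3) can be sketched as below. The sampling formula for p(i) is given only as an image in the source and is not reproduced here, so the sketch takes p(i) as a caller-supplied function; `sample_negatives` and `probability` are hypothetical names. The comparison direction follows the text as written: an individual is kept when p(i) is less than or equal to the random number.

```python
import random


def sample_negatives(candidates, probability, rng=None):
    """Apply the acceptance rule of step (3) to each candidate negative sample.

    probability: function returning p(i) for a candidate (formula not shown here).
    """
    if rng is None:
        rng = random.Random()
    selected = []
    for individual in candidates:
        r = rng.random()                 # pure decimal in [0, 1)
        if probability(individual) <= r:
            selected.append(individual)
    return selected


candidates = ["n1", "n2", "n3"]
# With p(i) = 0 every candidate is kept; with p(i) = 1 none is kept.
print(sample_negatives(candidates, lambda i: 0.0))  # ['n1', 'n2', 'n3']
print(sample_negatives(candidates, lambda i: 1.0))  # []
```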
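Of course, in other embodiments of the present invention, other probability distributions, such as a Gaussian distribution, may be used for sampling; details are not described here.
In part 409: the information delivery server imports the feature vector of each individual in the m-th round diffusion training set into the first preset model for training, and obtains the m-th round diffusion model.
Part 409 is similar to part 403 and is not described again here.
In this embodiment, because both the negative samples and the positive samples in the m-th round diffusion training set carry feature vectors, the trained m-th round diffusion model distinguishes positive from negative samples more precisely than the existing approach, so a precise delivery population can subsequently be screened.
In part 410: the information delivery server screens the m-th round delivery population from the overall population using the m-th round diffusion model.
Part 410 is similar to part 404 and is not described again here.
In part 411: the information delivery server delivers information to the m-th round delivery population and obtains feedback data of the m-th round delivery population.
Part 411 is similar to part 405 and is not described again here.
See FIG. 5, which shows a schematic diagram of the iterations of the embodiment shown in FIG. 4.
In summary, the embodiments of the present invention describe the 0th iteration of training and the subsequent iterations in detail. Even if the seed population is of poor quality, a delivery population that helps improve the advertisement delivery effect can be diffused based on the feedback data and the feature vectors.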
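FIG. 6 shows a possible structural diagram of the information delivery apparatus involved in the foregoing embodiments, including:
a diffusion training set generating unit 601, configured to generate a k-th round diffusion training set according to the initial seed population and the feedback data of the (k-1)-th round delivery population;
where k is an integer; each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
a training unit 602, configured to perform the k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain the k-th round diffusion model;
a screening unit 603, configured to screen the k-th round delivery population from the overall population using the k-th round diffusion model; the k-th round delivery population is used for the k-th round of information delivery.
For specific details, see the foregoing description; they are not repeated here.
In other embodiments of the present invention, still referring to FIG. 6, the information delivery apparatus may further include:
an advertisement delivery unit 604, configured to perform the k-th round of information delivery to the k-th round delivery population.
The diffusion training set generating unit 601 may be used to perform part 301 of the embodiment shown in FIG. 3; in addition, it may perform parts 400-402 of the embodiment shown in FIG. 4.
The training unit 602 may be used to perform part 302 of the embodiment shown in FIG. 3; in addition, it may perform parts 403 and 406-409 of the embodiment shown in FIG. 4.
The screening unit 603 may be used to perform part 303 of the embodiment shown in FIG. 3; in addition, it may perform parts 404 and 410 of the embodiment shown in FIG. 4.
The advertisement delivery unit 604 may be used to perform part 304 of the embodiment shown in FIG. 3; in addition, it may perform parts 405 and 411 of the embodiment shown in FIG. 4.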
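An embodiment of this application further provides an information delivery server, which may include any of the information delivery apparatuses described above. For the structure of the information delivery server, see FIG. 1. In the information delivery server of this embodiment, the processor executes the program code stored in the memory and, according to the instructions in the program code, performs the following steps:
generating a k-th round diffusion training set according to the initial seed population and the feedback data of the (k-1)-th round delivery population, where k is an integer; each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
performing the k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain the k-th round diffusion model;
screening the k-th round delivery population from the overall population using the k-th round diffusion model, the k-th round delivery population being used for the k-th round of information delivery.
Optionally, the k-th round diffusion training set includes a first positive sample subset and a first negative sample subset; the individuals in the first positive sample subset serve as positive samples, and the individuals in the first negative sample subset serve as negative samples;
when k = 0, in generating the k-th round diffusion training set, the processor is configured to execute the instructions to perform the following steps: acquiring the initial seed population as the first positive sample subset;
randomly selecting, from the overall population, a population equal in size to the initial seed population as the first negative sample subset;
obtaining the feature vectors of the individuals in the first positive sample subset and the first negative sample subset as the feature vector of each individual in the k-th round diffusion training set.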
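Optionally, the processor performs the following steps according to the instructions in the program code:
when k ≠ 0, generating the k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population; each individual in the k-th round delivery training set has a feature vector;
importing the feature vectors of the individuals in the k-th round delivery training set into a second preset model for training to obtain the k-th round delivery model;
using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population.
Optionally, the k-th round delivery training set includes a second positive sample subset and a second negative sample subset; the individuals in the second positive sample subset serve as positive samples, and the individuals in the second negative sample subset serve as negative samples;
in generating the k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population, the processor is configured to execute the instructions to perform the following steps:
obtaining the feedback data of the (k-1)-th round delivery population, the feedback data including feedback statistics of the (k-1)-th round delivery population and behavior data of each individual in the (k-1)-th round delivery population; the feedback statistics are calculated from the behavior data of the individuals in the (k-1)-th round delivery population;
screening the second positive sample subset from the (k-1)-th round delivery population; the second positive sample subset includes individuals whose behavior data has a positive relationship with the feedback statistics, and each individual in the second positive sample subset has a corresponding feature vector;
screening the second negative sample subset from the (k-1)-th round delivery population; each individual in the second negative sample subset has a feature vector.
Optionally, in using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps:
scoring each individual in the initial seed population using the k-th round delivery model, and filtering out the individuals in the initial seed population whose scores are less than a first threshold, to obtain the filtered seed population;
scoring each individual in the second positive sample subset using the k-th round delivery model, and filtering out the individuals in the second positive sample subset whose scores are less than a second threshold, to obtain the filtered second positive sample subset;
taking the union of the filtered seed population and the filtered second positive sample subset as the positive sample subset of the k-th round diffusion training set.
Optionally, in using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps:
scoring the i-th individual in the second negative sample subset using the k-th round delivery model;
calculating, based on the score of the i-th individual, the probability that the i-th individual serves as a negative sample;
randomly generating a pure decimal for the i-th individual;
if the probability is less than or equal to the pure decimal, placing the i-th individual into the first negative sample subset of the k-th round diffusion training set.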
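In addition, an embodiment of the present invention further provides a storage medium configured to store program code, the program code being used to perform any one of the information delivery methods described above.
In another aspect, an embodiment of the present invention further provides a computer program product including instructions which, when run on a computer, cause the computer to perform the information delivery method described above.
The steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in user equipment. Of course, the processor and the storage medium may also reside in the user equipment as discrete components.
Those skilled in the art should appreciate that, in one or more of the above examples, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.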

Claims (16)

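  1. An information delivery method, applied to an information delivery server, comprising:
    generating a k-th round diffusion training set according to an initial seed population and feedback data of a (k-1)-th round delivery population, where k is an integer; each individual in the k-th round diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
    performing a k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain a k-th round diffusion model;
    screening a k-th round delivery population from the overall population using the k-th round diffusion model, the k-th round delivery population being used for the k-th round of information delivery.
  2. The method according to claim 1, wherein
    the k-th round diffusion training set includes a first positive sample subset and a first negative sample subset; the individuals in the first positive sample subset serve as positive samples, and the individuals in the first negative sample subset serve as negative samples;
    the performing a k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain a k-th round diffusion model comprises:
    importing the feature vectors of the individuals in the first positive sample subset and the first negative sample subset into a first preset model for training to obtain the k-th round diffusion model.
  3. The method according to claim 1 or 2, wherein when k = 0, the method further comprises:
    acquiring the initial seed population as the first positive sample subset;
    randomly selecting, from the overall population, a population equal in size to the initial seed population as the first negative sample subset;
    obtaining the feature vectors of the individuals in the first positive sample subset and the first negative sample subset as the feature vectors of the individuals in the k-th round diffusion training set.
  4. The method according to claim 1 or 2, wherein when k ≠ 0, the generating a k-th round diffusion training set according to an initial seed population and feedback data of a (k-1)-th round delivery population comprises:
    generating a k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population; each individual in the k-th round delivery training set has a feature vector;
    importing the feature vectors of the individuals in the k-th round delivery training set into a second preset model for training to obtain a k-th round delivery model;
    using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population.
  5. The method according to claim 4, wherein
    the k-th round delivery training set includes a second positive sample subset and a second negative sample subset; the individuals in the second positive sample subset serve as positive samples, and the individuals in the second negative sample subset serve as negative samples;
    the generating a k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population comprises:
    obtaining the feedback data of the (k-1)-th round delivery population, the feedback data including feedback statistics of the (k-1)-th round delivery population and behavior data of each individual in the (k-1)-th round delivery population; the feedback statistics are calculated from the behavior data of the individuals in the (k-1)-th round delivery population;
    screening the second positive sample subset from the (k-1)-th round delivery population; the behavior data of the individuals in the second positive sample subset has a positive association with the feedback statistics, and each individual in the second positive sample subset has a feature vector;
    screening the second negative sample subset from the (k-1)-th round delivery population; each individual in the second negative sample subset has a feature vector.
  6. The method according to claim 5, wherein the second negative sample subset includes the individuals that remain in the (k-1)-th round delivery population after the second positive sample subset is removed; or
    the second negative sample subset contains the (k-1)-th round delivery population.
  7. The method according to claim 6, wherein the using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population comprises:
    scoring each individual in the initial seed population using the k-th round delivery model, and filtering out the individuals in the initial seed population whose scores are less than a first threshold, to obtain the filtered seed population;
    scoring each individual in the second positive sample subset using the k-th round delivery model, and filtering out the individuals in the second positive sample subset whose scores are less than a second threshold, to obtain the filtered second positive sample subset;
    taking the union of the filtered seed population and the filtered second positive sample subset as the first positive sample subset of the k-th round diffusion training set.
  8. The method according to claim 7, wherein the using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population further comprises:
    scoring the i-th individual in the second negative sample subset using the k-th round delivery model;
    calculating, based on the score of the i-th individual, the probability p(i) that the i-th individual serves as a negative sample;
    randomly generating a pure decimal for the i-th individual;
    if p(i) is less than or equal to the pure decimal, placing the i-th individual into the first negative sample subset of the k-th round diffusion training set.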
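  9. An information delivery apparatus, comprising a memory and a processor, the memory being configured to store instructions, and the processor being configured to execute the instructions to perform the following steps:
    generating a k-th round diffusion training set according to an initial seed population and feedback data of a (k-1)-th round delivery population, where k is an integer; each individual in the diffusion training set has a feature vector, and the feature vector includes multiple attributes of the corresponding individual and the corresponding attribute values;
    performing a k-th round of iterative training using the feature vectors of the individuals in the k-th round diffusion training set to obtain a k-th round diffusion model;
    screening a k-th round delivery population from the overall population using the k-th round diffusion model, the k-th round delivery population being used for the k-th round of information delivery.
  10. The apparatus according to claim 9, wherein the k-th round diffusion training set includes a first positive sample subset and a first negative sample subset; the individuals in the first positive sample subset serve as positive samples, and the individuals in the first negative sample subset serve as negative samples;
    when k = 0, in generating the k-th round diffusion training set, the processor is configured to execute the instructions to perform the following steps: acquiring the initial seed population as the first positive sample subset;
    randomly selecting, from the overall population, a population equal in size to the initial seed population as the first negative sample subset;
    obtaining the feature vectors of the individuals in the first positive sample subset and the first negative sample subset as the feature vector of each individual in the k-th round diffusion training set.
  11. The apparatus according to claim 10, wherein the processor is configured to execute the instructions to perform the following steps:
    when k ≠ 0, generating a k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population; each individual in the k-th round delivery training set has a feature vector;
    importing the feature vectors of the individuals in the k-th round delivery training set into a second preset model for training to obtain a k-th round delivery model;
    using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population.
  12. The apparatus according to claim 11, wherein the k-th round delivery training set includes a second positive sample subset and a second negative sample subset; the individuals in the second positive sample subset serve as positive samples, and the individuals in the second negative sample subset serve as negative samples;
    in generating the k-th round delivery training set according to the feedback data of the (k-1)-th round delivery population, the processor is configured to execute the instructions to perform the following steps:
    obtaining the feedback data of the (k-1)-th round delivery population, the feedback data including feedback statistics of the (k-1)-th round delivery population and behavior data of each individual in the (k-1)-th round delivery population; the feedback statistics are calculated from the behavior data of the individuals in the (k-1)-th round delivery population;
    screening the second positive sample subset from the (k-1)-th round delivery population; the second positive sample subset includes individuals whose behavior data has a positive relationship with the feedback statistics, and each individual in the second positive sample subset has a corresponding feature vector;
    screening the second negative sample subset from the (k-1)-th round delivery population; each individual in the second negative sample subset has a feature vector.
  13. The apparatus according to claim 12, wherein in using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps:
    scoring each individual in the initial seed population using the k-th round delivery model, and filtering out the individuals in the initial seed population whose scores are less than a first threshold, to obtain the filtered seed population;
    scoring each individual in the second positive sample subset using the k-th round delivery model, and filtering out the individuals in the second positive sample subset whose scores are less than a second threshold, to obtain the filtered second positive sample subset;
    taking the union of the filtered seed population and the filtered second positive sample subset as the positive sample subset of the k-th round diffusion training set.
  14. The apparatus according to claim 13, wherein in using the k-th round delivery model to screen the k-th round diffusion training set from the k-th round delivery training set and the initial seed population, the processor is configured to execute the instructions to perform the following steps:
    scoring the i-th individual in the second negative sample subset using the k-th round delivery model;
    calculating, based on the score of the i-th individual, the probability p(i) that the i-th individual serves as a negative sample;
    randomly generating a pure decimal for the i-th individual;
    if p(i) is less than or equal to the pure decimal, placing the i-th individual into the first negative sample subset of the k-th round diffusion training set.
  15. A storage medium, configured to store program code, the program code being used to perform the information delivery method according to any one of claims 1 to 8.
  16. A computer program product including instructions which, when run on a computer, cause the computer to perform the information delivery method according to any one of claims 1 to 8.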
PCT/CN2018/075521 2017-02-15 2018-02-07 Information delivery method, apparatus, and server WO2018149337A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710081843.2A CN108427690B (zh) 2017-02-15 2017-02-15 Information delivery method and apparatus
CN201710081843.2 2017-02-15

Publications (1)

Publication Number Publication Date
WO2018149337A1 true WO2018149337A1 (zh) 2018-08-23

Family

ID=63155504

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/075521 WO2018149337A1 (zh) 2017-02-15 2018-02-07 Information delivery method, apparatus, and server

Country Status (2)

Country Link
CN (1) CN108427690B (zh)
WO (1) WO2018149337A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615408A (zh) * 2018-10-24 2019-04-12 中国平安人寿保险股份有限公司 基于大数据的广告投放方法及装置、存储介质、电子设备
CN111178934A (zh) * 2019-11-29 2020-05-19 北京深演智能科技股份有限公司 获取目标对象的方法及装置
CN111681057A (zh) * 2020-06-11 2020-09-18 北京深演智能科技股份有限公司 信息投放的媒体资源的处理方法及装置
CN111831827A (zh) * 2019-09-05 2020-10-27 北京嘀嘀无限科技发展有限公司 一种数据处理方法、装置、电子设备及存储介质
CN112925973A (zh) * 2019-12-06 2021-06-08 北京沃东天骏信息技术有限公司 数据处理方法和装置
CN113496304A (zh) * 2020-04-03 2021-10-12 北京达佳互联信息技术有限公司 网络媒介信息的投放控制方法、装置、设备及存储介质
CN114792256A (zh) * 2022-06-23 2022-07-26 上海维智卓新信息科技有限公司 基于模型选择的人群扩量方法及装置

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866766A (zh) * 2018-08-27 2020-03-06 阿里巴巴集团控股有限公司 广告投放方法、确定推广人群的方法、服务器和客户端
CN110704706B (zh) * 2019-09-11 2021-09-03 北京海益同展信息科技有限公司 分类模型的训练方法、分类方法及相关设备、分类系统
CN112651790B (zh) * 2021-01-19 2024-04-12 恩亿科(北京)数据科技有限公司 基于快消行业用户触达的ocpx自适应学习方法和系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530304A (zh) * 2013-05-10 2014-01-22 Tcl集团股份有限公司 基于自适应分布式计算的在线推荐方法、系统和移动终端
CN106022865A (zh) * 2016-05-10 2016-10-12 江苏大学 一种基于评分和用户行为的商品推荐方法
WO2016191078A1 (en) * 2015-05-22 2016-12-01 Mastercard International Incorporated Adaptive recommendation system and methods
CN106355449A (zh) * 2016-08-31 2017-01-25 腾讯科技(深圳)有限公司 用户选取方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331459B (zh) * 2014-10-31 2018-07-06 百度在线网络技术(北京)有限公司 一种基于在线学习的网络资源推荐方法及装置
WO2016201631A1 (en) * 2015-06-17 2016-12-22 Yahoo! Inc. Systems and methods for online content recommendation
CN105069470A (zh) * 2015-07-29 2015-11-18 腾讯科技(深圳)有限公司 分类模型训练方法及装置
CN105427129B (zh) * 2015-11-12 2020-09-04 腾讯科技(深圳)有限公司 一种信息的投放方法及系统
CN105447730B (zh) * 2015-12-25 2020-11-06 腾讯科技(深圳)有限公司 目标用户定向方法及装置

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615408A (zh) * 2018-10-24 2019-04-12 中国平安人寿保险股份有限公司 基于大数据的广告投放方法及装置、存储介质、电子设备
CN109615408B (zh) * 2018-10-24 2024-04-05 中国平安人寿保险股份有限公司 基于大数据的广告投放方法及装置、存储介质、电子设备
CN111831827A (zh) * 2019-09-05 2020-10-27 北京嘀嘀无限科技发展有限公司 一种数据处理方法、装置、电子设备及存储介质
CN111831827B (zh) * 2019-09-05 2023-12-08 北京嘀嘀无限科技发展有限公司 一种数据处理方法、装置、电子设备及存储介质
CN111178934A (zh) * 2019-11-29 2020-05-19 北京深演智能科技股份有限公司 获取目标对象的方法及装置
CN111178934B (zh) * 2019-11-29 2024-03-08 北京深演智能科技股份有限公司 获取目标对象的方法及装置
CN112925973A (zh) * 2019-12-06 2021-06-08 北京沃东天骏信息技术有限公司 数据处理方法和装置
CN113496304A (zh) * 2020-04-03 2021-10-12 北京达佳互联信息技术有限公司 网络媒介信息的投放控制方法、装置、设备及存储介质
CN113496304B (zh) * 2020-04-03 2024-03-08 北京达佳互联信息技术有限公司 网络媒介信息的投放控制方法、装置、设备及存储介质
CN111681057A (zh) * 2020-06-11 2020-09-18 北京深演智能科技股份有限公司 信息投放的媒体资源的处理方法及装置
CN114792256A (zh) * 2022-06-23 2022-07-26 上海维智卓新信息科技有限公司 基于模型选择的人群扩量方法及装置
CN114792256B (zh) * 2022-06-23 2023-05-26 上海维智卓新信息科技有限公司 基于模型选择的人群扩量方法及装置

Also Published As

Publication number Publication date
CN108427690B (zh) 2022-09-13
CN108427690A (zh) 2018-08-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18753805

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18753805

Country of ref document: EP

Kind code of ref document: A1