CN108647990A - Method, apparatus and electronic device for determining target users - Google Patents
- Publication number
- CN108647990A CN201810297028.4A
- Authority
- CN
- China
- Prior art keywords
- sample
- feature
- user
- seed user
- seed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0277—Online advertisement
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Embodiments of the present invention provide a method, an apparatus and an electronic device for determining target users. The method includes: for each feature in a first feature set, calculating a first ratio of seed user samples having the feature and a second ratio of non-seed user samples having the feature; generating a negative sample set from the multiple non-seed user samples according to the magnitude relationship between the first ratio and the second ratio of each feature; taking the multiple seed user samples as a positive sample set and training a preset logistic regression model with the positive sample set and the negative sample set; calculating a sample value for each non-seed user sample from its third feature vector and the trained logistic regression model; and finally, in descending order of sample value, selecting the non-seed users corresponding to the first non-seed user samples that meet the target user quantity as target users. Suitable target users can thus be determined even when the advertiser provides only a small number of seed users.
Description
Technical field
The present invention relates to the technical field of advertisement delivery, and in particular to a method, an apparatus and an electronic device for determining target users.
Background art
At present, placing advertisements on websites is a business model used by all major Internet companies. Each major Internet company has its own advertisement delivery platform, through which an advertiser can submit its advertising demand. The platform then finds target users according to the advertiser's advertising demand and delivers advertisements to those target users.
Specifically, when submitting an advertising demand to the advertising platform, the advertiser can provide seed users. The platform uses these seed users to find target users who match the advertising demand, and then delivers the advertisement corresponding to that demand to those target users.
However, in the course of implementing the present invention, the inventor found that the prior art has at least the following problem: when the advertiser provides only a small number of seed users, the prior art cannot determine suitable target users.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a method, an apparatus and an electronic device for determining target users, so that suitable target users can be determined from a small number of seed users provided by an advertiser. The specific technical solutions are as follows:
In one aspect, an embodiment of the present invention provides a method for determining target users, the method including:
Obtaining a first feature set, multiple seed user samples and multiple non-seed user samples;
For each feature in the first feature set, calculating a first ratio of the seed user samples having the feature among the multiple seed user samples, and a second ratio of the non-seed user samples having the feature among the multiple non-seed user samples;
Generating a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature; and selecting first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set, to generate a negative sample set;
Obtaining the multiple seed user samples and taking them as a positive sample set; obtaining a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and training a preset logistic regression model to obtain a trained logistic regression model;
Obtaining a third feature vector of each non-seed user sample among the multiple non-seed user samples, and calculating a sample value of each non-seed user sample according to the third feature vector and the trained logistic regression model;
Obtaining a target user quantity; selecting, from the multiple non-seed user samples in descending order of sample value, the first non-seed user samples that meet the target user quantity; and taking the non-seed users corresponding to the first non-seed user samples as target users.
Optionally, before obtaining the first feature set, the multiple seed user samples and the multiple non-seed user samples, the method for determining target users according to the embodiment of the present invention further includes:
Obtaining first features of the multiple seed user samples and second features of the multiple non-seed user samples, and establishing the first feature set according to the first features and the second features, wherein the features in the first feature set are not repeated.
Optionally, after obtaining the first feature set, the multiple seed user samples and the multiple non-seed user samples, the method for determining target users according to the embodiment of the present invention further includes:
Encoding each feature in the first feature set to obtain an encoded first feature set;
Correspondingly, calculating, for each feature in the first feature set, the first ratio of the seed user samples having the feature among the multiple seed user samples and the second ratio of the non-seed user samples having the feature among the multiple non-seed user samples, includes:
For each feature in the encoded first feature set, calculating the first ratio of the seed user samples having the feature among the multiple seed user samples and the second ratio of the non-seed user samples having the feature among the multiple non-seed user samples.
Optionally, generating a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature, and selecting first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate a negative sample set, includes:
For each feature in the first feature set, when the first ratio of the feature is less than its second ratio, adding the feature to the second feature set, to obtain a second feature set containing multiple features;
Obtaining the multiple features of the multiple non-seed user samples; selecting, among these features, third features that are present in the second feature set; and selecting, from the multiple non-seed user samples, the non-seed user samples corresponding to the third features, to generate the negative sample set.
Optionally, generating a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature, and selecting first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate a negative sample set, includes:
For each feature in the first feature set, when the first ratio of the feature is greater than its second ratio, adding the feature to the third feature set, to obtain a third feature set containing multiple features;
Obtaining the multiple features of the multiple non-seed user samples; selecting, among these features, fourth features that are not present in the third feature set; and selecting, from the multiple non-seed user samples, the non-seed user samples corresponding to the fourth features, to generate the negative sample set.
In another aspect, an embodiment of the present invention further provides an apparatus for determining target users, the apparatus including:
An acquisition module, configured to obtain a first feature set, multiple seed user samples and multiple non-seed user samples;
A ratio calculation module, configured to calculate, for each feature in the first feature set, a first ratio of the seed user samples having the feature among the multiple seed user samples and a second ratio of the non-seed user samples having the feature among the multiple non-seed user samples;
A negative sample set generation module, configured to generate a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature, and to select first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set, to generate a negative sample set;
A training module, configured to obtain the multiple seed user samples and take them as a positive sample set; to obtain a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and to train a preset logistic regression model to obtain a trained logistic regression model;
A sample value calculation module, configured to obtain a third feature vector of each non-seed user sample among the multiple non-seed user samples, and to calculate a sample value of each non-seed user sample according to the third feature vector and the trained logistic regression model;
A target user selection module, configured to obtain a target user quantity; to select, from the multiple non-seed user samples in descending order of sample value, the first non-seed user samples that meet the target user quantity; and to take the non-seed users corresponding to the first non-seed user samples as target users.
Optionally, the apparatus for determining target users according to the embodiment of the present invention further includes:
A first feature set establishing module, configured to obtain first features of the multiple seed user samples and second features of the multiple non-seed user samples, and to establish the first feature set according to the first features and the second features, wherein the features in the first feature set are not repeated.
Optionally, the apparatus for determining target users according to the embodiment of the present invention further includes:
An encoding module, configured to encode each feature in the first feature set to obtain an encoded first feature set;
Correspondingly, the ratio calculation module is specifically configured to:
For each feature in the encoded first feature set, calculate the first ratio of the seed user samples having the feature among the multiple seed user samples and the second ratio of the non-seed user samples having the feature among the multiple non-seed user samples.
Optionally, the negative sample set generation module includes:
A second feature set generation submodule, configured to add, for each feature in the first feature set, the feature to the second feature set when the first ratio of the feature is less than its second ratio, to obtain a second feature set containing multiple features;
A first negative sample set generation submodule, configured to obtain the multiple features of the multiple non-seed user samples; to select, among these features, third features that are present in the second feature set; and to select, from the multiple non-seed user samples, the non-seed user samples corresponding to the third features, to generate the negative sample set.
Optionally, the negative sample set generation module further includes:
A third feature set generation submodule, configured to add, for each feature in the first feature set, the feature to the third feature set when the first ratio of the feature is greater than its second ratio, to obtain a third feature set containing multiple features;
A second negative sample set generation submodule, configured to obtain the multiple features of the multiple non-seed user samples; to select, among these features, fourth features that are not present in the third feature set; and to select, from the multiple non-seed user samples, the non-seed user samples corresponding to the fourth features, to generate the negative sample set.
In another aspect, an embodiment of the present invention further provides an electronic device, the electronic device including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
The memory is configured to store a computer program;
The processor is configured to implement any one of the above methods for determining target users when executing the program stored in the memory.
In another aspect, an embodiment of the present invention further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a computer, the computer executes any one of the above methods for determining target users.
In another aspect, an embodiment of the present invention further provides a computer program product containing instructions; when it is run on a computer, the computer executes any one of the above methods for determining target users.
With the method, apparatus and electronic device for determining target users provided by the embodiments of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, a first ratio of the seed user samples having each feature among the multiple seed user samples and a second ratio of the non-seed user samples having that feature among the multiple non-seed user samples are calculated for each feature in the first feature set. A second feature set or a third feature set is then generated according to the magnitude relationship between the first ratio and the second ratio of each feature, and is used to generate a negative sample set. Because the negative sample set is generated from the magnitude relationship between the first ratio and the second ratio, the negative sample set and the positive sample set can be used to train the preset logistic regression model. Once the trained logistic regression model is obtained, the sample value of each non-seed user sample can be calculated from its third feature vector and the trained model. The larger the sample value, the more likely the user is to be a target user. Therefore, the first non-seed user samples matching the target user quantity can be selected from the multiple non-seed user samples in descending order of sample value, and the corresponding non-seed users taken as target users, so that suitable target users are determined from the small number of seed users provided by the advertiser. Of course, a product or method implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.
Description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below.
Fig. 1 is a flowchart of a first implementation of the method for determining target users according to an embodiment of the present invention;
Fig. 2 is a flowchart of a second implementation of the method for determining target users according to an embodiment of the present invention;
Fig. 3 is a flowchart of a third implementation of the method for determining target users according to an embodiment of the present invention;
Fig. 4 is a flowchart of a fourth implementation of the method for determining target users according to an embodiment of the present invention;
Fig. 5 is a flowchart of a fifth implementation of the method for determining target users according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the apparatus for determining target users according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention.
In order to solve the problem in the prior art, the embodiments of the present invention provide a method, an apparatus and an electronic device for determining target users, so that suitable target users can be determined from a small number of seed users provided by an advertiser. The advertisement corresponding to the advertising demand is then delivered to those target users, which improves the effect of advertisement delivery.
The method for determining target users according to an embodiment of the present invention is described first. Fig. 1 is a flowchart of a first implementation of the method. As shown in Fig. 1, the method may include:
S110: obtain a first feature set, multiple seed user samples and multiple non-seed user samples.
The first feature set may be a pre-established feature set that stores multiple features, which may include user age, user gender, the user's city, user viewing preferences, and so on. The pre-established feature set may be built from features obtained by analysing the features of historical users who have watched films.
In some examples, when sending an advertising demand to the advertisement delivery platform, the advertiser may at the same time send seed user samples to the platform. After receiving the seed user samples sent by the advertiser, the platform may trigger a target user determining apparatus that applies the method for determining target users according to the embodiment of the present invention, and the apparatus can obtain the multiple seed user samples from the platform. Each seed user sample may include the identification information of the seed user, the features of the seed user, the feature vector of the seed user, and so on.
In some examples, the advertisement delivery platform may establish a historical user database in advance, in which the identification information of historical users, the feature information of historical users and so on can be stored. The target user determining apparatus can obtain the multiple non-seed user samples from this historical user database.
In some examples, after receiving the seed user information provided by the advertiser, the advertisement delivery platform may judge whether the number of seed users in the seed user information is less than a preset seed user threshold, and trigger the target user determining apparatus when it is.
In some examples, when sending an advertising demand to the advertisement delivery platform, the advertiser may instead send the identification information of the seed users to the platform. After receiving this identification information, the platform queries the pre-established historical user database for the historical users corresponding to the identification information and takes them as seed user samples. It takes the remaining historical users in the database as non-seed user samples and sends both to the target user determining apparatus. The apparatus can therefore obtain multiple seed user samples and multiple non-seed user samples.
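The two examples above (the threshold check and the split of the historical user database by seed user identifiers) can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `HISTORY` dictionary, the identifiers, the function names and the threshold value are all assumptions made for the example.

```python
# Hypothetical stand-in for the historical user database: user ID -> features.
HISTORY = {
    "u01": {"38", "male", "Beijing"},
    "u02": {"40", "female", "Guangzhou"},
    "u03": {"50", "male", "Shanghai"},
    "u04": {"47", "male", "Guangzhou"},
    "u05": {"38", "female", "Beijing"},
}

# Assumed value; the source does not state the preset seed user threshold.
SEED_THRESHOLD = 100


def split_samples(history, seed_ids):
    """Historical users named by the advertiser become seed samples,
    all remaining historical users become non-seed samples."""
    seed = {uid: f for uid, f in history.items() if uid in seed_ids}
    non_seed = {uid: f for uid, f in history.items() if uid not in seed_ids}
    return seed, non_seed


def should_trigger(seed, threshold=SEED_THRESHOLD):
    """Trigger the target user determination only when seed users are scarce."""
    return len(seed) < threshold


seed, non_seed = split_samples(HISTORY, {"u01", "u02"})
print(sorted(seed), sorted(non_seed), should_trigger(seed))
```

In this sketch, identifiers supplied by the advertiser that do not appear in the database are simply ignored; the source does not say how such identifiers are handled.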
In one possible implementation, the target user determining apparatus may be arranged inside the advertisement delivery platform, or may be deployed independently of it.
S120: for each feature in the first feature set, calculate a first ratio of the seed user samples having the feature among the multiple seed user samples, and a second ratio of the non-seed user samples having the feature among the multiple non-seed user samples.
Specifically, after obtaining the first feature set, the multiple seed user samples and the multiple non-seed user samples, the target user determining apparatus can calculate these two ratios for each feature in the first feature set.
For example, assume that the first feature set obtained by the target user determining apparatus is: {age: 38, 40, 45, 47, 50; gender: male, female; city: Beijing, Guangzhou, Shanghai, Tianjin}. The seed user samples obtained are user samples 1, 2, 3 and 4, and the non-seed user samples obtained are user samples 5, 6, 7, 8, 9 and 10.
The features of user sample 1 are 38, male, Beijing; of user sample 2: 40, female, Guangzhou; of user sample 3: 45, male, Shanghai; of user sample 4: 47, female, Tianjin; of user sample 5: 50, male, Beijing; of user sample 6: 47, male, Guangzhou; of user sample 7: 40, female, Tianjin; of user sample 8: 50, male, Shanghai; of user sample 9: 38, female, Beijing; and of user sample 10: 45, male, Tianjin.
The target user determining apparatus can then determine that the seed user sample having the feature "38" is user sample 1, and that the non-seed user sample having the feature "38" is user sample 9. The first ratio of the feature "38" is therefore 1 out of 4 seed user samples, or 25%, and its second ratio is 1 out of 6 non-seed user samples, or 16.7%. Likewise, the first ratio of the feature "male" among the 4 seed user samples is 50%, and its second ratio among the 6 non-seed user samples is 66.7%.
Through this step, the target user determining apparatus can calculate, for each feature in the first feature set, its first ratio among the multiple seed user samples and its second ratio among the multiple non-seed user samples. The negative sample set can then be selected in the subsequent steps.
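The ratio calculation in step S120 can be checked against the worked example with a short sketch. The data are the ten user samples listed above; representing each sample as a set of feature strings, and the function names, are assumptions made for illustration.

```python
# Seed user samples 1-4 and non-seed user samples 5-10 from the worked example.
SEED = {
    1: {"38", "male", "Beijing"},
    2: {"40", "female", "Guangzhou"},
    3: {"45", "male", "Shanghai"},
    4: {"47", "female", "Tianjin"},
}
NON_SEED = {
    5: {"50", "male", "Beijing"},
    6: {"47", "male", "Guangzhou"},
    7: {"40", "female", "Tianjin"},
    8: {"50", "male", "Shanghai"},
    9: {"38", "female", "Beijing"},
    10: {"45", "male", "Tianjin"},
}
FEATURES = ["38", "40", "45", "47", "50", "male", "female",
            "Beijing", "Guangzhou", "Shanghai", "Tianjin"]


def ratio(feature, samples):
    """Share of the given samples that have the feature."""
    return sum(feature in feats for feats in samples.values()) / len(samples)


def feature_ratios(features, seed, non_seed):
    """Map each feature to its (first ratio, second ratio) pair."""
    return {f: (ratio(f, seed), ratio(f, non_seed)) for f in features}


ratios = feature_ratios(FEATURES, SEED, NON_SEED)
for f in ("38", "male"):
    first, second = ratios[f]
    print(f"{f}: first ratio {first:.1%}, second ratio {second:.1%}")
```

For the features "38" and "male" this reproduces the figures in the example: 25% and 16.7%, and 50% and 66.7%.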
S130: generate a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature; and select first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set, to generate a negative sample set.
In some examples, so that a negative sample set can be used to train the preset logistic regression model, this step generates a second feature set or a third feature set according to the magnitude relationship between the first ratio and the second ratio of each feature in the first feature set, and then selects first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate the negative sample set.
Specifically, when the first ratio of a feature is less than its second ratio, the feature is added to the second feature set; when the first ratio of a feature is greater than its second ratio, the feature is added to the third feature set.
For example, assume that the target user determining apparatus has calculated a first ratio of 25% and a second ratio of 16.7% for the feature "38"; a third feature set containing the feature "38" can then be generated. The calculated first ratio of the feature "male" is 50% and its second ratio is 66.7%, so a second feature set containing the feature "male" can be generated. First non-seed user samples can then be selected from the multiple non-seed user samples according to the third feature set or the second feature set, to generate the negative sample set.
In some examples, the target user determining apparatus may generate only the second feature set or only the third feature set. When only the second feature set is generated, the apparatus selects the first non-seed user samples from the multiple non-seed user samples according to the second feature set to generate the negative sample set.
When only the third feature set is generated, the apparatus selects the first non-seed user samples from the multiple non-seed user samples according to the third feature set to generate the negative sample set.
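A minimal sketch of step S130 follows, using the same ten user samples. Two points are assumptions, because the source does not settle them: a non-seed user sample is treated as "corresponding to" a selected feature when it has at least one such feature, and a feature whose two ratios are equal is placed in neither set. The function names are illustrative.

```python
SEED = [
    {"38", "male", "Beijing"}, {"40", "female", "Guangzhou"},
    {"45", "male", "Shanghai"}, {"47", "female", "Tianjin"},
]
NON_SEED = {
    5: {"50", "male", "Beijing"}, 6: {"47", "male", "Guangzhou"},
    7: {"40", "female", "Tianjin"}, 8: {"50", "male", "Shanghai"},
    9: {"38", "female", "Beijing"}, 10: {"45", "male", "Tianjin"},
}


def ratio(feature, samples):
    """Share of the given samples that have the feature."""
    samples = list(samples)
    return sum(feature in feats for feats in samples) / len(samples)


def split_features(features, seed, non_seed):
    """Second set: first ratio < second ratio. Third set: first ratio > second ratio."""
    second, third = set(), set()
    for f in features:
        first_ratio, second_ratio = ratio(f, seed), ratio(f, non_seed)
        if first_ratio < second_ratio:
            second.add(f)
        elif first_ratio > second_ratio:
            third.add(f)
        # Equal ratios: left out of both sets (assumption, not stated in the source).
    return second, third


def negatives_from_second(non_seed, second_set):
    """Non-seed samples with at least one feature present in the second set."""
    return [uid for uid, feats in non_seed.items() if feats & second_set]


def negatives_from_third(non_seed, third_set):
    """Non-seed samples with at least one feature absent from the third set."""
    return [uid for uid, feats in non_seed.items() if feats - third_set]


features = set().union(*SEED, *NON_SEED.values())
second, third = split_features(features, SEED, NON_SEED.values())
print(sorted(second), sorted(third))
print(negatives_from_second(NON_SEED, {"male"}))
```

With a second feature set that contains only "male", as in the example above, the selected negative samples are user samples 5, 6, 8 and 10. With the full second feature set computed from all eleven features, every non-seed sample in this small example has at least one matching feature, so a stricter matching rule may be preferable in practice.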
Generating the negative sample set in this way allows the preset logistic regression model to be trained with the negative sample set in the subsequent steps, so that the trained logistic regression model can then be used to find target users.
S140: obtain the multiple seed user samples and take them as a positive sample set; obtain a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and train a preset logistic regression model to obtain a trained logistic regression model.
In some examples, the target user determining apparatus may assign a first sample label to the positive sample set and a second sample label to the negative sample set in advance. For example, the first sample label may be 1, and the second sample label may be 0 or -1.
In some examples, the multiple seed user samples may include the feature vector of each seed user sample, and the multiple non-seed user samples may include the feature vector of each non-seed user sample. The target user determining apparatus can therefore obtain the first feature vector of each seed user sample in the positive sample set and the second feature vector of each non-seed user sample in the negative sample set.
In some examples, the preset logistic regression model may take the form shown in formula (1),
where g(x_i) = w_0 + w_1 x_{i1} + … + w_j x_{ij} + … + w_n x_{in}, and x_{ij} denotes the j-th feature in the feature vector of the i-th user sample, with i ≥ 1, n ≥ j ≥ 1 and n ≥ 1. P(y_k = 1 | x_i) denotes the probability that the sample label of the i-th user sample is 1, where k = 1 or 0.
The probability that the sample label of the i-th user sample is 0 is given by formula (2).
Assume that the total number of seed user samples in the positive sample set and non-seed user samples in the negative sample set obtained by the target user determining apparatus is m. Since the m user samples are mutually independent, the joint distribution of all user samples is the product of the marginal distributions of the individual user samples, that is, formula (3).
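The bodies of formulas (1) to (3) are not present in this text. As a hedged reconstruction, assuming the standard logistic regression form, which is consistent with the definition of g(x_i) and with the description of formula (3) as a product over m independent samples, they would read as below. The notation y_i for the label of the i-th sample, and the use of labels 1 and 0, are assumptions of this reconstruction.

```latex
\begin{align}
P(y_i = 1 \mid x_i) &= \frac{1}{1 + e^{-g(x_i)}} \tag{1} \\
P(y_i = 0 \mid x_i) &= 1 - P(y_i = 1 \mid x_i) = \frac{1}{1 + e^{g(x_i)}} \tag{2} \\
L(w) &= \prod_{i=1}^{m} P(y_i = 1 \mid x_i)^{\,y_i}\,
        \bigl(1 - P(y_i = 1 \mid x_i)\bigr)^{\,1 - y_i} \tag{3}
\end{align}
```

Under this form, a sample with label 1 contributes P(y_i = 1 | x_i) to the product and a sample with label 0 contributes 1 - P(y_i = 1 | x_i), so maximising L(w) pushes the model towards high probabilities for positive samples and low probabilities for negative samples.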
The target user determining apparatus may then use an existing maximum likelihood estimation method to calculate the parameters w_0, w_1, …, w_j, …, w_n in g(x_i) such that L(w) attains its maximum value. Methods for computing maximum likelihood estimates are not described further here.
In some examples, the target user determining apparatus may substitute the first feature vectors, the first sample label, the second sample label and the second feature vectors into formula (3) and calculate the parameters w_0, w_1, …, w_j, …, w_n that fit the user samples, thereby obtaining the trained logistic regression model.
For example, it is assumed that user's sample 1, user's sample 2, user's sample 3, user's sample 4 are user's sample that positive sample is concentrated
This, user's sample 5, user's sample 6, user's sample 8 and user's sample 10 are user's sample that negative sample is concentrated, then above-mentioned mesh
Mark user determining device can obtain the feature vector of user's sample 1, the feature vector of user's sample 2, user's sample 3 respectively
The feature vector of feature vector, user's sample 4, obtain respectively the feature vector of user sample 5, the feature vector of user's sample 6,
The feature vector of the feature vector and user's sample 10 of user's sample 8.
Then pass through the feature vector of user's sample 1, the feature vector of user's sample 2, user's sample 3 of positive sample concentration
Feature vector, the feature vector of user's sample 4, the spy of the feature vector of user's sample 5, user's sample 6 that negative sample is concentrated
The feature vector of sign vector, the feature vector of user's sample 8 and user's sample 10, is trained logic of propositions regression model,
The parameter for meeting above-mentioned 8 user's samples is calculated by above-mentioned formula (3):w0,w1,…,wj,…,wn, so as to
Logic Regression Models after to training.
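The training step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the optimiser (plain gradient ascent), the learning rate, the number of epochs, the function names and the toy two-dimensional feature vectors are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # Probability that the label is 1; w[0] is the intercept w0.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def train_logistic_regression(vectors, labels, lr=0.1, epochs=2000):
    """Fit w0..wn by gradient ascent on the log-likelihood (assumed optimiser)."""
    n = len(vectors[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for x, y in zip(vectors, labels):
            err = y - predict_proba(w, x)
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        # Average the gradient so the step size does not depend on m.
        w = [wj + lr * gj / len(vectors) for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy feature vectors: label 1 for the positive sample set,
# label 0 for the negative sample set.
positives = [[1.0, 0.0], [1.0, 1.0], [0.9, 0.2], [0.8, 0.1]]
negatives = [[0.0, 1.0], [0.1, 0.9], [0.0, 0.0], [0.2, 1.0]]
vectors = positives + negatives
labels = [1] * len(positives) + [0] * len(negatives)
weights = train_logistic_regression(vectors, labels)
```

After training, `predict_proba(weights, x)` returns a value above 0.5 for vectors resembling the positive samples and below 0.5 for vectors resembling the negative samples.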
According to the embodiment of the present invention, the seed user samples are used as the positive sample set, and the first non-seed user samples selected from the multiple non-seed user samples according to the second feature set or the third feature set are used as the negative sample set. The preset logistic regression model is then trained with the positive sample set and the negative sample set, so that the trained logistic regression model can distinguish positive samples from negative samples. This improves the accuracy with which target users are selected in the subsequent steps.
S150, obtain the third feature vector of each non-seed user sample among the multiple non-seed user samples, and compute the sample value of each non-seed user sample from its third feature vector and the trained logistic regression model.
In some examples, in the trained logistic regression model, P(yk=1|xi) denotes the probability that the sample label of the i-th user sample is 1. The larger this probability, the more suitable the corresponding user is as a target user.
To find target users among the multiple non-seed users, the target user determining device may, after obtaining the trained logistic regression model, obtain the third feature vector of each non-seed user sample, compute from that vector and the trained logistic regression model the probability that the sample label of the non-seed user sample is 1, and take this probability as the sample value of that non-seed user sample. The sample values of all non-seed user samples are obtained in this way.
For example, after obtaining the trained logistic regression model, the target user determining device may use the feature vectors of user samples 5, 6, 7, 8, 9 and 10 to compute, for each of these user samples, the probability that its sample label is 1, and take each probability as the corresponding sample value. The sample values of user samples 5, 6, 7, 8, 9 and 10 are thus obtained.
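The scoring in S150 amounts to evaluating the logistic function on each third feature vector. The sketch below assumes parameters that have already been trained. The weight values, the two-dimensional vectors and the function name `sample_value` are hypothetical and chosen only for illustration.

```python
import math

def sample_value(weights, feature_vector):
    """Probability that the label is 1 under a logistic regression model.

    weights[0] is the intercept w0; weights[1:] pair with the vector components.
    """
    g = weights[0] + sum(w * x for w, x in zip(weights[1:], feature_vector))
    return 1.0 / (1.0 + math.exp(-g))

# Hypothetical trained parameters and third feature vectors.
weights = [-1.0, 2.0, -0.5]
third_feature_vectors = {
    "user sample 5": [1.0, 0.0],
    "user sample 6": [0.0, 1.0],
    "user sample 7": [1.0, 1.0],
}
sample_values = {
    name: sample_value(weights, vector)
    for name, vector in third_feature_vectors.items()
}
```

Each entry of `sample_values` lies strictly between 0 and 1 and can be ranked directly in the next step.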
S160, obtain the target user quantity; among the multiple non-seed user samples, in descending order of sample value, select the first non-seed user samples that satisfy the target user quantity, and take the non-seed users corresponding to the first non-seed user samples as target users.
In some examples, the advertising request that an advertiser sends to the advertisement delivery platform may include the target user quantity, so the target user determining device can obtain the target user quantity from the advertisement delivery platform.
Specifically, after obtaining the target user quantity, the target user determining device may select, among the multiple non-seed user samples and in descending order of sample value, the first non-seed user samples corresponding to the target user quantity, and then take the non-seed users corresponding to the first non-seed user samples as target users.
For example, assume that the target user determining device has computed the following sample values for the non-seed user samples: 0.65 for user sample 5, 0.3 for user sample 6, 0.55 for user sample 7, 0.4 for user sample 8, 0.75 for user sample 9 and 0.2 for user sample 10. In this step, the sample values of the non-seed user samples, that is, of user samples 5, 6, 7, 8, 9 and 10, are obtained.
Assume further that the target user quantity is 3. The target user determining device may then select, among the 6 non-seed user samples and in descending order of sample value, the sample values 0.75, 0.65 and 0.55, whose corresponding user samples are user sample 9, user sample 5 and user sample 7 respectively.
Finally, the non-seed user corresponding to user sample 9, the non-seed user corresponding to user sample 5 and the non-seed user corresponding to user sample 7 are determined to be the target users.
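The selection in S160 is a descending sort followed by taking the top entries. The sketch below uses the sample values from the example above. The function name `select_target_users` is hypothetical.

```python
def select_target_users(sample_values, target_count):
    """Return the IDs of the target_count samples with the largest sample values."""
    ranked = sorted(sample_values.items(), key=lambda item: item[1], reverse=True)
    return [sample_id for sample_id, _ in ranked[:target_count]]

# Sample values of user samples 5 to 10, as given in the example.
sample_values = {5: 0.65, 6: 0.3, 7: 0.55, 8: 0.4, 9: 0.75, 10: 0.2}
targets = select_target_users(sample_values, 3)
print(targets)  # [9, 5, 7]
```

If the target user quantity exceeds the number of non-seed user samples, the slice simply returns all of them. How ties between equal sample values should be broken is not specified in the text. Python's stable sort keeps their original order.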
In some examples, after determining the target users, the target user determining device may send the identification information of the determined target users to the advertisement delivery platform, which can then deliver advertisements to the terminal devices corresponding to that identification information.
With the method for determining a target user according to the embodiment of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, the following are computed for each feature in the first feature set: the first proportion, i.e. the share of seed user samples having the feature among the multiple seed user samples, and the second proportion, i.e. the share of non-seed user samples having the feature among the multiple non-seed user samples. Then, according to the magnitude relationship between the first proportion and the second proportion of each feature, the second feature set or the third feature set used to generate the negative sample set is generated, and the negative sample set is generated from it. Because the negative sample set is generated according to the magnitude relationship between the first proportion and the second proportion, it can be used together with the positive sample set to train the preset logistic regression model. Once the trained logistic regression model is obtained, the sample value of each non-seed user sample can be computed from its third feature vector and the trained logistic regression model. The larger the sample value, the more likely the corresponding user is to become a target user. Therefore, among the multiple non-seed user samples, the first non-seed user samples corresponding to the target user quantity can be selected in descending order of sample value, and the non-seed users corresponding to the first non-seed user samples can be taken as target users. Suitable target users can thus be determined from the small number of seed users provided by the advertiser.
In an optional embodiment of the present invention, on the basis of the method for determining a target user shown in Fig. 1, the embodiment of the present invention further provides a possible implementation. Fig. 2 is a flowchart of a second implementation of the method for determining a target user according to the embodiment of the present invention. In Fig. 2, before S110 (obtaining the first feature set, the multiple seed user samples and the multiple non-seed user samples), the method may further include:
S170, obtain the first features of the multiple seed user samples and the second features of the multiple non-seed user samples, and establish the first feature set from the first features and the second features.
The features in the first feature set are not repeated.
In some examples, the advertisement delivery platform may perform feature analysis on historical users who watched films, or on historical users who clicked historical advertisements, to obtain the features of the historical users and thereby establish the first feature set.
In some examples, when the seed user samples and the non-seed user samples are all user samples in the historical user database of the advertisement delivery platform, the target user determining device may first obtain the first features of the seed user samples from the historical user database, then obtain the second features of the non-seed user samples from the same database, and then establish the first feature set from the first features and the second features.
For example, assume that the multiple seed user samples and their features are as follows: user sample 1: 38, male, Beijing; user sample 2: 40, female, Guangzhou; user sample 3: 45, male, Shanghai; user sample 4: 47, female, Tianjin. Assume that the multiple non-seed user samples and their features are as follows: user sample 5: 50, male, Beijing; user sample 6: 47, male, Guangzhou; user sample 7: 40, female, Tianjin; user sample 8: 50, male, Shanghai; user sample 9: 38, female, Beijing; user sample 10: 45, male, Tianjin.
The target user determining device then obtains the first features: 38, 40, 45, 47, male, female, Beijing, Guangzhou, Shanghai, Tianjin. The second features are: 38, 40, 45, 47, 50, male, female, Beijing, Guangzhou, Tianjin, Shanghai.
The first features and the second features are then merged, and the merged features are de-duplicated to obtain the first feature set.
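The merge and de-duplication in S170 can be sketched as below, using the sample table from the example. Representing each sample as a list of feature strings, and the function name `build_first_feature_set`, are assumptions made for illustration.

```python
# Sample table from the example: sample ID -> features (age, gender, city).
seed_samples = {
    1: ["38", "male", "Beijing"],
    2: ["40", "female", "Guangzhou"],
    3: ["45", "male", "Shanghai"],
    4: ["47", "female", "Tianjin"],
}
non_seed_samples = {
    5: ["50", "male", "Beijing"],
    6: ["47", "male", "Guangzhou"],
    7: ["40", "female", "Tianjin"],
    8: ["50", "male", "Shanghai"],
    9: ["38", "female", "Beijing"],
    10: ["45", "male", "Tianjin"],
}

def build_first_feature_set(seed, non_seed):
    """Merge the first and second features and drop duplicates."""
    merged = [f for features in seed.values() for f in features]
    merged += [f for features in non_seed.values() for f in features]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(merged))

first_feature_set = build_first_feature_set(seed_samples, non_seed_samples)
```

For this table the result contains 11 distinct features: five ages, two genders and four cities.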
Generating the first feature set in this way reduces the number of features in the first feature set, which reduces the amount of computation needed for the first proportion and the second proportion, and in turn reduces the time overhead of determining target users with the method of the embodiment of the present invention.
In an optional embodiment of the present invention, on the basis of the method for determining a target user shown in Fig. 2, the embodiment of the present invention further provides a possible implementation. Fig. 3 is a flowchart of a third implementation of the method for determining a target user according to the embodiment of the present invention. In Fig. 3, after S110 (obtaining the first feature set, the multiple seed user samples and the multiple non-seed user samples), the method further includes:
S180, encode each feature in the first feature set to obtain the encoded first feature set.
In some examples, to reduce the hardware storage space occupied by the features in the first feature set, and to further reduce the time overhead of determining target users with the method of the embodiment of the present invention, the target user determining device may, after obtaining the first feature set, encode each feature in the first feature set to obtain the encoded first feature set.
For example, assume that the first feature set obtained by the target user determining device is: {age {38, 40, 45, 47, 50}, gender {male, female}, city {Beijing, Guangzhou, Shanghai, Tianjin}}. Arabic numerals may then be used to encode each feature in the first feature set, converting it into a first feature set consisting of Arabic numerals: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}.
In some examples, the target user determining device may instead encode the first feature set with lowercase or uppercase English letters, obtaining the encoded feature set: {a, b, c, d, e, f, g, h, i, j, k}.
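The numeric encoding in S180 can be sketched as a lookup table from feature to code. The ordering of the features (ages, then genders, then cities) follows the example. The names `encoding` and `encode_sample` are hypothetical.

```python
# First feature set from the example, in the order age, gender, city.
first_feature_set = [
    "38", "40", "45", "47", "50",
    "male", "female",
    "Beijing", "Guangzhou", "Shanghai", "Tianjin",
]

# Assign the Arabic numerals 1..11 to the features in order.
encoding = {feature: code for code, feature in enumerate(first_feature_set, start=1)}
encoded_first_feature_set = list(encoding.values())

def encode_sample(features, encoding):
    """Encode one user sample's features with the same table."""
    return [encoding[feature] for feature in features]

print(encoded_first_feature_set)                           # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(encode_sample(["38", "male", "Beijing"], encoding))  # [1, 6, 8]
```

A letter encoding works the same way, with `string.ascii_lowercase` supplying the codes instead of `enumerate`.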
Correspondingly, after the target user determining device has encoded the first feature set, step S120 (for each feature in the first feature set, computing the first proportion of seed user samples having the feature among the multiple seed user samples and the second proportion of non-seed user samples having the feature among the multiple non-seed user samples) may include:
S121, for each feature in the encoded first feature set, compute the first proportion of seed user samples having the feature among the multiple seed user samples and the second proportion of non-seed user samples having the feature among the multiple non-seed user samples.
In some examples, after encoding the first feature set, the target user determining device may also encode each feature of the multiple seed user samples and each feature of the multiple non-seed user samples, using the same encoding scheme as for the first feature set.
By encoding the first feature set, the features of the multiple seed user samples and the features of the multiple non-seed user samples, the target user determining device can compute the first proportion and the second proportion from the encoded features. This reduces the hardware storage space occupied by the features and further reduces the time overhead of determining target users with the method of the embodiment of the present invention.
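The two proportions can be sketched as simple counting over the sample tables. The data below is the sample table from the S170 example, shown unencoded for readability. The function names are hypothetical.

```python
seed_samples = {
    1: ["38", "male", "Beijing"], 2: ["40", "female", "Guangzhou"],
    3: ["45", "male", "Shanghai"], 4: ["47", "female", "Tianjin"],
}
non_seed_samples = {
    5: ["50", "male", "Beijing"], 6: ["47", "male", "Guangzhou"],
    7: ["40", "female", "Tianjin"], 8: ["50", "male", "Shanghai"],
    9: ["38", "female", "Beijing"], 10: ["45", "male", "Tianjin"],
}

def proportion(feature, samples):
    """Share of the given samples that have the feature."""
    holders = sum(1 for features in samples.values() if feature in features)
    return holders / len(samples)

def proportions_for(feature_set, seed, non_seed):
    """Map each feature to (first proportion, second proportion)."""
    return {f: (proportion(f, seed), proportion(f, non_seed)) for f in feature_set}

first_feature_set = sorted(
    {f for table in (seed_samples, non_seed_samples)
     for features in table.values() for f in features}
)
table = proportions_for(first_feature_set, seed_samples, non_seed_samples)
first_male, second_male = table["male"]  # 2 of 4 seed samples, 4 of 6 non-seed samples
```

For the feature "male" this gives a first proportion of 50% and a second proportion of about 66.7%.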
In an optional embodiment of the present invention, on the basis of the method for determining a target user shown in Fig. 1, the embodiment of the present invention further provides a possible implementation. Fig. 4 is a flowchart of a fourth implementation of the method for determining a target user according to the embodiment of the present invention. In Fig. 4, S130 (generating the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and selecting the first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate the negative sample set) may include:
S131, for each feature in the first feature set, when the first proportion of the feature is less than the second proportion, add the feature to the second feature set, obtaining a second feature set to which multiple features have been added.
In some examples, the embodiment of the present invention provides two possible implementations for generating the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature. In one possible implementation, when the first proportion of any feature in the first feature set is less than its second proportion, that feature may be added to the second feature set, so that a second feature set to which multiple features have been added is obtained.
For example, assume that the target user determining device has computed a first proportion of 50% and a second proportion of 66.7% for the feature "male". The feature "male" may then be added to the second feature set. If the computed first proportion of the feature "38" is 40% and its second proportion is 45%, the feature "38" may be added to the second feature set. If the computed first proportion of the feature "Beijing" is 66% and its second proportion is 73%, the feature "Beijing" may be added to the second feature set, and so on. A second feature set containing the features "38", "male" and "Beijing" is thus obtained.
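The rule in S131 can be sketched as a filter over the per-feature proportions. The figures for "male", "38" and "Beijing" are those of the example above. The entry for "female" is an additional assumed entry that shows a feature being left out. The function name is hypothetical.

```python
# feature -> (first proportion, second proportion)
proportions = {
    "male": (0.50, 0.667),
    "38": (0.40, 0.45),
    "Beijing": (0.66, 0.73),
    "female": (0.61, 0.60),  # assumed extra entry: first proportion is larger
}

def build_second_feature_set(proportions):
    """Keep the features whose first proportion is less than the second."""
    return {
        feature
        for feature, (first, second) in proportions.items()
        if first < second
    }

second_feature_set = build_second_feature_set(proportions)
```

Here `second_feature_set` contains "38", "male" and "Beijing". The text does not say how a feature with equal proportions is treated. With the strict comparison used here it is left out.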
S132, obtain the multiple features of the multiple non-seed user samples; among the multiple features, select the third features that are present in the second feature set; and, among the multiple non-seed user samples, select the non-seed user samples corresponding to the third features to generate the negative sample set.
After the second feature set to which multiple features have been added is obtained, in order to train the preset logistic regression model, the target user determining device may use the second feature set to screen the multiple non-seed user samples, obtain a screening result, and then generate the negative sample set from the screening result.
Specifically, the target user determining device may obtain the features of each non-seed user sample and determine whether each feature is present in the second feature set. If it is, the non-seed user sample corresponding to that feature is obtained. In this way the multiple non-seed user samples whose features are present in the second feature set are obtained, and the negative sample set is generated from them.
For example, assume that the second feature set is: "38", "male" and "Beijing", and that the features of user sample 5, user sample 7, user sample 8 and user sample 10 are present in the second feature set. The target user determining device may then obtain user sample 5, user sample 7, user sample 8 and user sample 10, and generate a negative sample set containing user sample 5, user sample 7, user sample 8 and user sample 10.
By comparing the first proportion and the second proportion of a feature, it can be concluded that the feature leans towards negative samples when its first proportion is less than its second proportion. The non-seed user samples corresponding to such a feature can therefore be used as the user samples of the negative sample set, and the generated negative sample set can be used to train the preset logistic regression model in order to find target users.
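The screening in S132 can be sketched as below. The text does not state whether a sample must have one or all of its features in the second feature set. This sketch assumes that a single matching feature is enough. The samples "a", "b" and "c" and the function name are hypothetical and are not the user samples of the example.

```python
second_feature_set = {"38", "male", "Beijing"}

# Hypothetical non-seed user samples: ID -> features.
non_seed_samples = {
    "a": ["50", "male", "Beijing"],
    "b": ["40", "female", "Tianjin"],
    "c": ["38", "female", "Shanghai"],
}

def select_by_second_feature_set(non_seed, second_feature_set):
    """Select samples that have at least one feature in the second feature set.

    The 'at least one feature' rule is an assumption of this sketch.
    """
    return [
        sample_id
        for sample_id, features in non_seed.items()
        if any(feature in second_feature_set for feature in features)
    ]

negative_sample_set = select_by_second_feature_set(non_seed_samples, second_feature_set)
```

With these inputs, samples "a" and "c" are selected and sample "b" is not. Under a stricter rule, for instance requiring every feature to be in the second feature set, the selection would be smaller.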
In some examples, when the target user determining device obtains the second feature set through step S131 above, the second feature set may contain features that belong to seed user samples. If the negative sample set is generated from such a second feature set and the preset logistic regression model is trained on that negative sample set, the accuracy of the trained logistic regression model may decrease, which in turn reduces the accuracy of finding target users.
To improve the accuracy of finding target users with the trained logistic regression model, on the basis of the method for determining a target user shown in Fig. 1, the embodiment of the present invention further provides another possible implementation, so that the generated negative sample set contains only features of non-seed user samples.
Fig. 5 is a flowchart of a fifth implementation of the method for determining a target user according to the embodiment of the present invention. In Fig. 5, S130 (generating the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and selecting the first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate the negative sample set) may include:
S133, for each feature in the first feature set, when the first proportion of the feature is greater than the second proportion, add the feature to the third feature set, obtaining a third feature set to which multiple features have been added.
In this other possible implementation of the embodiment of the present invention, when the first proportion of any feature in the first feature set is greater than its second proportion, the feature leans towards positive samples, and it may be added to the third feature set, so that a third feature set to which multiple features have been added is obtained.
For example, assume that the target user determining device has computed a first proportion of 61% and a second proportion of 60% for the feature "female". The feature "female" may then be added to the third feature set. If the computed first proportion of the feature "45" is 75% and its second proportion is 47%, the feature "45" may be added to the third feature set. If the computed first proportion of the feature "Guangzhou" is 68% and its second proportion is 59%, the feature "Guangzhou" may be added to the third feature set, and so on. A third feature set containing the features "45", "female" and "Guangzhou" is thus obtained.
S134, obtain the multiple features of the multiple non-seed user samples; among the multiple features, select the fourth features that are not present in the third feature set; and, among the multiple non-seed user samples, select the non-seed user samples corresponding to the fourth features to generate the negative sample set.
After the third feature set to which multiple features have been added is obtained, in order to avoid the situation in which the second feature set may contain features belonging to seed user samples, the target user determining device may use the third feature set to screen the multiple non-seed user samples, obtain a screening result, and then generate the negative sample set from the screening result.
Specifically, the target user determining device may obtain the features of each non-seed user sample and determine whether each feature is present in the third feature set. If it is not, the non-seed user sample corresponding to that feature is obtained. In this way the multiple non-seed user samples whose features are not present in the third feature set are obtained, and the negative sample set is generated from them.
For example, assume that the third feature set is: "45", "female" and "Guangzhou", and that the features of user sample 5 and user sample 8 are not present in the third feature set. The target user determining device may then obtain user sample 5 and user sample 8, and generate a negative sample set containing user sample 5 and user sample 8.
By comparing the first proportion and the second proportion of a feature, it can be concluded that the feature leans towards positive samples when its first proportion is greater than its second proportion. Such features can therefore be collected into the third feature set and used to screen the multiple non-seed user samples, so that the features of the non-seed user samples remaining after screening are not present in the third feature set. Training the preset logistic regression model with the negative sample set generated from the screening result improves the accuracy of the trained logistic regression model, and in turn improves the accuracy of finding target users with the method of the embodiment of the present invention.
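The screening in S134 can be sketched as an exclusion filter. The sketch assumes that a non-seed user sample is kept only if none of its features is in the third feature set. The non-seed samples are those of the S170 example, and the function name is hypothetical.

```python
third_feature_set = {"45", "female", "Guangzhou"}

# Non-seed user samples from the S170 example: ID -> features.
non_seed_samples = {
    5: ["50", "male", "Beijing"],
    6: ["47", "male", "Guangzhou"],
    7: ["40", "female", "Tianjin"],
    8: ["50", "male", "Shanghai"],
    9: ["38", "female", "Beijing"],
    10: ["45", "male", "Tianjin"],
}

def select_by_third_feature_set(non_seed, third_feature_set):
    """Keep samples none of whose features is in the third feature set."""
    return [
        sample_id
        for sample_id, features in non_seed.items()
        if not any(feature in third_feature_set for feature in features)
    ]

negative_sample_set = select_by_third_feature_set(non_seed_samples, third_feature_set)
print(negative_sample_set)  # [5, 8]
```

User samples 6, 7, 9 and 10 are dropped because each has a feature in the third feature set ("Guangzhou", "female", "female" and "45" respectively), which leaves user samples 5 and 8 as in the example.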
Corresponding to the above method embodiments, the embodiment of the present invention further provides a device for determining a target user. Fig. 6 is a schematic structural diagram of the device for determining a target user according to the embodiment of the present invention. As shown in Fig. 6, the device may include:
an acquisition module 610, configured to obtain the first feature set, the multiple seed user samples and the multiple non-seed user samples;
a proportion computing module 620, configured to compute, for each feature in the first feature set, the first proportion of seed user samples having the feature among the multiple seed user samples and the second proportion of non-seed user samples having the feature among the multiple non-seed user samples;
a negative sample set generation module 630, configured to generate the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and to select the first non-seed user samples from the multiple non-seed user samples according to the second feature set or the third feature set to generate the negative sample set;
a training module 640, configured to obtain the multiple seed user samples and use them as the positive sample set, to obtain the first sample label of the positive sample set, the first feature vector of each seed user sample in the positive sample set, the second sample label of the negative sample set and the second feature vector of each non-seed user sample in the negative sample set, and to train the preset logistic regression model to obtain the trained logistic regression model;
a sample value computing module 650, configured to obtain the third feature vector of each non-seed user sample among the multiple non-seed user samples, and to compute the sample value of each non-seed user sample from the third feature vector and the trained logistic regression model;
a target user selection module 660, configured to obtain the target user quantity, to select, among the multiple non-seed user samples and in descending order of sample value, the first non-seed user samples that satisfy the target user quantity, and to take the non-seed users corresponding to the first non-seed user samples as target users.
With the device for determining a target user according to the embodiment of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, the following are computed for each feature in the first feature set: the first proportion of seed user samples having the feature among the multiple seed user samples, and the second proportion of non-seed user samples having the feature among the multiple non-seed user samples. Then, according to the magnitude relationship between the first proportion and the second proportion of each feature, the second feature set or the third feature set used to generate the negative sample set is generated, and the negative sample set is generated from it. Because the negative sample set is generated according to the magnitude relationship between the first proportion and the second proportion, it can be used together with the positive sample set to train the preset logistic regression model. Once the trained logistic regression model is obtained, the sample value of each non-seed user sample can be computed from its third feature vector and the trained logistic regression model. The larger the sample value, the more likely the corresponding user is to become a target user. Therefore, among the multiple non-seed user samples, the first non-seed user samples corresponding to the target user quantity can be selected in descending order of sample value, and the non-seed users corresponding to the first non-seed user samples can be taken as target users. Suitable target users can thus be determined from the small number of seed users provided by the advertiser.
Specifically, the device for determining a target user according to the embodiment of the present invention may further include:
a first feature set establishing module, configured to obtain the first features of the multiple seed user samples and the second features of the multiple non-seed user samples, and to establish the first feature set from the first features and the second features, where the features in the first feature set are not repeated.
Specifically, the device for determining a target user according to the embodiment of the present invention may further include:
an encoding module, configured to encode each feature in the first feature set to obtain the encoded first feature set;
correspondingly, the proportion computing module is specifically configured to:
compute, for each feature in the encoded first feature set, the first proportion of seed user samples having the feature among the multiple seed user samples and the second proportion of non-seed user samples having the feature among the multiple non-seed user samples.
Specifically, the negative sample set generation module 630 includes:
a second feature set generation submodule, configured to add, for each feature in the first feature set, the feature to the second feature set when the first proportion of the feature is less than the second proportion, obtaining a second feature set to which multiple features have been added;
a first negative sample set generation submodule, configured to obtain the multiple features of the multiple non-seed user samples, to select among the multiple features the third features that are present in the second feature set, and to select among the multiple non-seed user samples the non-seed user samples corresponding to the third features to generate the negative sample set.
Specifically, the negative sample set generation module 630 may further include:
a third feature set generation submodule, configured to add, for each feature in the first feature set, the feature to the third feature set when the first proportion of the feature is greater than the second proportion, obtaining a third feature set to which multiple features have been added;
a second negative sample set generation submodule, configured to obtain the multiple features of the multiple non-seed user samples, to select among the multiple features the fourth features that are not present in the third feature set, and to select among the multiple non-seed user samples the non-seed user samples corresponding to the fourth features to generate the negative sample set.
The embodiment of the present invention additionally provides a kind of electronic equipment, as shown in fig. 7, a kind of electronics for the embodiment of the present invention is set
Standby structural schematic diagram, including processor 710, communication interface 720, memory 730 and communication bus 740, wherein processor
710, communication interface 720, memory 730 completes mutual communication by communication bus 740,
Memory 730, for storing computer program;
Processor 710 when for executing the program stored on memory 730, realizes following steps:
Obtain a first feature set, multiple seed user samples and multiple non-seed user samples;
For each feature in the first feature set, calculate a first proportion of the seed user samples having the feature among the multiple seed user samples, and a second proportion of the non-seed user samples having the feature among the multiple non-seed user samples;
According to the magnitude relationship between the first proportion and the second proportion of each feature, generate a second feature set or a third feature set; and according to the second feature set or the third feature set, select first non-seed user samples from the multiple non-seed user samples to generate a negative sample set;
Obtain the multiple seed user samples and take them as a positive sample set; obtain a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and train a preset logistic regression model to obtain a trained logistic regression model;
Obtain a third feature vector of each non-seed user sample in the multiple non-seed user samples, and calculate, according to the third feature vector and the trained logistic regression model, a sample value of each non-seed user sample in the multiple non-seed user samples;
Obtain a target user quantity, select from the multiple non-seed user samples, in descending order of sample value, the first non-seed user samples that satisfy the target user quantity, and take the non-seed users corresponding to the first non-seed user samples as target users.
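The training and scoring steps above can be sketched as follows. This is a minimal illustration using only the Python standard library. The patent does not specify the optimizer, learning rate, number of iterations, or feature encoding, so batch gradient descent, the hyperparameters, the function names and the toy vectors below are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(
    vectors: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,       # assumed learning rate
    epochs: int = 500,     # assumed number of passes over the data
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, dim = len(vectors), len(vectors[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(vectors, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def sample_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Score a feature vector with the trained model (value in (0, 1))."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: label 1 for the positive sample set (seed users),
# label 0 for the negative sample set (selected non-seed users).
positive_vectors = [[1.0, 0.0], [1.0, 1.0]]
negative_vectors = [[0.0, 1.0], [0.0, 0.0]]
X = positive_vectors + negative_vectors
y = [1] * len(positive_vectors) + [0] * len(negative_vectors)

w, b = train_logistic_regression(X, y)
likely = sample_value(w, b, [1.0, 0.0])    # resembles the seed users
unlikely = sample_value(w, b, [0.0, 1.0])  # resembles the negative samples
```

In practice a library implementation of logistic regression would normally replace the hand-written loop; the loop is used here only to keep the sketch self-contained.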
With the electronic device of this embodiment of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, for each feature in the first feature set, the first proportion of the seed user samples having the feature among the multiple seed user samples and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples are calculated. Then, according to the magnitude relationship between the first proportion and the second proportion of each feature, the second feature set or the third feature set is generated and used to generate the negative sample set. Because the negative sample set is generated according to this magnitude relationship, it can be used together with the positive sample set to train the preset logistic regression model. After the trained logistic regression model is obtained, the sample value of each non-seed user sample in the multiple non-seed user samples can be calculated from the third feature vector of that sample and the trained logistic regression model. The larger the sample value, the more likely the user is to become a target user. Therefore, the first non-seed user samples corresponding to the target user quantity can be selected from the multiple non-seed user samples in descending order of sample value, and the non-seed users corresponding to the first non-seed user samples are taken as target users, so that suitable target users can be determined from the small number of seed users provided by an advertiser.
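The final selection step, taking the non-seed users with the largest sample values up to the target user quantity, can be sketched with the standard library's `heapq.nlargest`. The user identifiers, the dictionary layout and the function name are hypothetical and serve only as an illustration.

```python
import heapq
from typing import Dict, List


def select_target_users(sample_values: Dict[str, float], target_count: int) -> List[str]:
    """Return the user ids with the largest sample values, in descending order."""
    return heapq.nlargest(target_count, sample_values, key=sample_values.get)


# Hypothetical sample values for four non-seed users.
scores = {"u1": 0.2, "u2": 0.9, "u3": 0.5, "u4": 0.7}
targets = select_target_users(scores, 2)
```

A full sort followed by slicing would give the same result. `heapq.nlargest` is chosen here only because the target user quantity is typically much smaller than the number of non-seed users.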
The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus and so on. For ease of illustration, it is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include random access memory (RAM), and may also include non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided. Instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, the computer is caused to execute the method for determining a target user described in any of the above embodiments.
With the computer-readable storage medium of this embodiment of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, for each feature in the first feature set, the first proportion of the seed user samples having the feature among the multiple seed user samples and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples are calculated. Then, according to the magnitude relationship between the first proportion and the second proportion of each feature, the second feature set or the third feature set is generated and used to generate the negative sample set. Because the negative sample set is generated according to this magnitude relationship, it can be used together with the positive sample set to train the preset logistic regression model. After the trained logistic regression model is obtained, the sample value of each non-seed user sample in the multiple non-seed user samples can be calculated from the third feature vector of that sample and the trained logistic regression model. The larger the sample value, the more likely the user is to become a target user. Therefore, the first non-seed user samples corresponding to the target user quantity can be selected from the multiple non-seed user samples in descending order of sample value, and the non-seed users corresponding to the first non-seed user samples are taken as target users, so that suitable target users can be determined from the small number of seed users provided by an advertiser.
In another embodiment provided by the present invention, a computer program product containing instructions is further provided. When the computer program product is run on a computer, the computer is caused to execute the method for determining a target user described in any of the above embodiments.
With the computer program product containing instructions of this embodiment of the present invention, after the first feature set, the multiple seed user samples and the multiple non-seed user samples are obtained, for each feature in the first feature set, the first proportion of the seed user samples having the feature among the multiple seed user samples and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples are calculated. Then, according to the magnitude relationship between the first proportion and the second proportion of each feature, the second feature set or the third feature set is generated and used to generate the negative sample set. Because the negative sample set is generated according to this magnitude relationship, it can be used together with the positive sample set to train the preset logistic regression model. After the trained logistic regression model is obtained, the sample value of each non-seed user sample in the multiple non-seed user samples can be calculated from the third feature vector of that sample and the trained logistic regression model. The larger the sample value, the more likely the user is to become a target user. Therefore, the first non-seed user samples corresponding to the target user quantity can be selected from the multiple non-seed user samples in descending order of sample value, and the non-seed users corresponding to the first non-seed user samples are taken as target users, so that suitable target users can be determined from the small number of seed users provided by an advertiser.
The above embodiments may be implemented wholly or partly by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiments of the present invention are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (11)
1. A method for determining a target user, characterized in that the method comprises:
obtaining a first feature set, multiple seed user samples and multiple non-seed user samples;
for each feature in the first feature set, calculating a first proportion of the seed user samples having the feature among the multiple seed user samples, and a second proportion of the non-seed user samples having the feature among the multiple non-seed user samples;
according to the magnitude relationship between the first proportion and the second proportion of each feature, generating a second feature set or a third feature set; and according to the second feature set or the third feature set, selecting first non-seed user samples from the multiple non-seed user samples to generate a negative sample set;
obtaining the multiple seed user samples and taking the multiple seed user samples as a positive sample set; obtaining a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and training a preset logistic regression model to obtain a trained logistic regression model;
obtaining a third feature vector of each non-seed user sample in the multiple non-seed user samples, and calculating, according to the third feature vector and the trained logistic regression model, a sample value of each non-seed user sample in the multiple non-seed user samples;
obtaining a target user quantity, selecting, from the multiple non-seed user samples and in descending order of the sample value, first non-seed user samples that satisfy the target user quantity, and taking the non-seed users corresponding to the first non-seed user samples as target users.
2. The method according to claim 1, characterized in that, before the obtaining of the first feature set, the multiple seed user samples and the multiple non-seed user samples, the method further comprises:
obtaining first features of the multiple seed user samples and second features of the multiple non-seed user samples, and establishing the first feature set according to the first features and the second features, wherein no feature in the first feature set is repeated.
3. The method according to claim 1, characterized in that, after the obtaining of the first feature set, the multiple seed user samples and the multiple non-seed user samples, the method further comprises:
encoding each feature in the first feature set to obtain an encoded first feature set;
correspondingly, the calculating, for each feature in the first feature set, of the first proportion of the seed user samples having the feature among the multiple seed user samples and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples comprises:
for each feature in the encoded first feature set, calculating the first proportion of the seed user samples having the feature among the multiple seed user samples, and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples.
4. The method according to any one of claims 1 to 3, characterized in that the generating of the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and the selecting, according to the second feature set or the third feature set, of the first non-seed user samples from the multiple non-seed user samples to generate the negative sample set, comprise:
for each feature in the first feature set, adding the feature to the second feature set when the first proportion of the feature is less than the second proportion, to obtain a second feature set to which multiple features have been added;
obtaining multiple features of the multiple non-seed user samples, selecting from the multiple features a third feature that is present in the second feature set, selecting from the multiple non-seed user samples the non-seed user samples corresponding to the third feature, and generating the negative sample set.
5. The method according to any one of claims 1 to 3, characterized in that the generating of the second feature set or the third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and the selecting, according to the second feature set or the third feature set, of the first non-seed user samples from the multiple non-seed user samples to generate the negative sample set, comprise:
for each feature in the first feature set, adding the feature to the third feature set when the first proportion is greater than the second proportion, to obtain a third feature set to which multiple features have been added;
obtaining multiple features of the multiple non-seed user samples, selecting from the multiple features a fourth feature that is not present in the third feature set, selecting from the multiple non-seed user samples the non-seed user samples corresponding to the fourth feature, and generating the negative sample set.
6. A device for determining a target user, characterized in that the device comprises:
an acquisition module, configured to obtain a first feature set, multiple seed user samples and multiple non-seed user samples;
a proportion calculation module, configured to, for each feature in the first feature set, calculate a first proportion of the seed user samples having the feature among the multiple seed user samples, and a second proportion of the non-seed user samples having the feature among the multiple non-seed user samples;
a negative sample set generation module, configured to generate a second feature set or a third feature set according to the magnitude relationship between the first proportion and the second proportion of each feature, and to select, according to the second feature set or the third feature set, first non-seed user samples from the multiple non-seed user samples to generate a negative sample set;
a training module, configured to obtain the multiple seed user samples and take the multiple seed user samples as a positive sample set; obtain a first sample label of the positive sample set, a first feature vector of each seed user sample in the positive sample set, a second sample label of the negative sample set and a second feature vector of each non-seed user sample in the negative sample set; and train a preset logistic regression model to obtain a trained logistic regression model;
a sample value calculation module, configured to obtain a third feature vector of each non-seed user sample in the multiple non-seed user samples, and calculate, according to the third feature vector and the trained logistic regression model, a sample value of each non-seed user sample in the multiple non-seed user samples;
a target user selection module, configured to obtain a target user quantity, select, from the multiple non-seed user samples and in descending order of the sample value, first non-seed user samples that satisfy the target user quantity, and take the non-seed users corresponding to the first non-seed user samples as target users.
7. The device according to claim 6, characterized in that the device further comprises:
a first feature set establishing module, configured to obtain first features of the multiple seed user samples and second features of the multiple non-seed user samples, and establish the first feature set according to the first features and the second features, wherein no feature in the first feature set is repeated.
8. The device according to claim 6, characterized in that the device further comprises:
an encoding module, configured to encode each feature in the first feature set to obtain an encoded first feature set;
correspondingly, the proportion calculation module is specifically configured to:
for each feature in the encoded first feature set, calculate the first proportion of the seed user samples having the feature among the multiple seed user samples, and the second proportion of the non-seed user samples having the feature among the multiple non-seed user samples.
9. The device according to any one of claims 6 to 8, characterized in that the negative sample set generation module comprises:
a second feature set generation submodule, configured to, for each feature in the first feature set, add the feature to the second feature set when the first proportion of the feature is less than the second proportion, to obtain a second feature set to which multiple features have been added;
a first negative sample set generation submodule, configured to obtain multiple features of the multiple non-seed user samples, select from the multiple features a third feature that is present in the second feature set, select from the multiple non-seed user samples the non-seed user samples corresponding to the third feature, and generate the negative sample set.
10. The device according to any one of claims 6 to 8, characterized in that the negative sample set generation module comprises:
a third feature set generation submodule, configured to, for each feature in the first feature set, add the feature to the third feature set when the first proportion is greater than the second proportion, to obtain a third feature set to which multiple features have been added;
a second negative sample set generation submodule, configured to obtain multiple features of the multiple non-seed user samples, select from the multiple features a fourth feature that is not present in the third feature set, select from the multiple non-seed user samples the non-seed user samples corresponding to the fourth feature, and generate the negative sample set.
11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another via the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 5 when executing the program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810297028.4A CN108647990B (en) | 2018-04-04 | 2018-04-04 | Method and device for determining target user and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810297028.4A CN108647990B (en) | 2018-04-04 | 2018-04-04 | Method and device for determining target user and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108647990A (en) | 2018-10-12 |
CN108647990B (en) | 2022-06-03 |
Family
ID=63745395
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810297028.4A Active CN108647990B (en) | 2018-04-04 | 2018-04-04 | Method and device for determining target user and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108647990B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105447730A (en) * | 2015-12-25 | 2016-03-30 | 腾讯科技(深圳)有限公司 | Target user orientation method and device |
US20170017998A1 (en) * | 2015-07-17 | 2017-01-19 | Adobe Systems Incorporated | Determining context and mindset of users |
CN106920248A (en) * | 2017-01-19 | 2017-07-04 | 博康智能信息技术有限公司上海分公司 | A kind of method for tracking target and device |
CN107093084A (en) * | 2016-08-01 | 2017-08-25 | 北京小度信息科技有限公司 | Potential user predicts method for transformation and device |
CN107369052A (en) * | 2017-08-29 | 2017-11-21 | 北京小度信息科技有限公司 | User's registration behavior prediction method, apparatus and electronic equipment |
CN107679920A (en) * | 2017-10-20 | 2018-02-09 | 北京奇艺世纪科技有限公司 | The put-on method and device of a kind of advertisement |
CN107844584A (en) * | 2017-11-14 | 2018-03-27 | 北京小度信息科技有限公司 | Usage mining method, apparatus, electronic equipment and computer-readable recording medium |
Non-Patent Citations (2)
Title |
---|
HTTP://DOC.MADAOMALL.COM/241.HTML: "聊聊lookalike模型的使用技巧" (Tips for using lookalike models), Baidu Online * |
QIANG MA et al.: "A Sub-linear, Massive-scale Look-alike Audience Extension System", IEEE * |
Also Published As
Publication number | Publication date |
---|---|
CN108647990B (en) | 2022-06-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108280477B (en) | Method and apparatus for clustering images | |
US8706729B2 (en) | Systems and methods for distributed data annotation | |
CN107835113A (en) | Abnormal user detection method in a kind of social networks based on network mapping | |
CN109903086A (en) | A kind of similar crowd's extended method, device and electronic equipment | |
CN104199818B (en) | Method is recommended in a kind of socialization based on classification | |
CN110909222B (en) | User portrait establishing method and device based on clustering, medium and electronic equipment | |
CN108132963A (en) | Resource recommendation method and device, computing device and storage medium | |
CN108734587A (en) | The recommendation method and terminal device of financial product | |
CN108268575A (en) | Processing method, the device and system of markup information | |
WO2022193753A1 (en) | Continuous learning method and apparatus, and terminal and storage medium | |
CN109543940B (en) | Activity evaluation method, activity evaluation device, electronic equipment and storage medium | |
CN106156857B (en) | The method and apparatus of the data initialization of variation reasoning | |
CN110276243A (en) | Score mapping method, face comparison method, device, equipment and storage medium | |
CN109885745A (en) | A kind of user draws a portrait method, apparatus, readable storage medium storing program for executing and terminal device | |
CN113821827A (en) | Joint modeling method and device for protecting multi-party data privacy | |
CN109376307B (en) | Article recommendation method and device and terminal | |
CN108647986A (en) | A kind of target user determines method, apparatus and electronic equipment | |
CN110020910A (en) | Object recommendation method and apparatus | |
CN112541010B (en) | User gender prediction method based on logistic regression | |
CN111833080B (en) | Information pushing method, device, electronic equipment and computer readable storage medium | |
CN104867032A (en) | Electronic commerce client evaluation identification system | |
CN110278524B (en) | User position determining method, graph model generating method, device and server | |
CN109460778B (en) | Activity evaluation method, activity evaluation device, electronic equipment and storage medium | |
CN109033078B (en) | The recognition methods of sentence classification and device, storage medium, processor | |
CN108647990A (en) | A kind of method, apparatus and electronic equipment of determining target user |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||