CN112907295A - Similar population expansion method and device based on computing advertisement background - Google Patents

Publication number
CN112907295A
Authority
CN
China
Prior art keywords
model
processing
end monitoring
monitoring data
advertisement
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110295616.6A
Other languages
Chinese (zh)
Inventor
吴园园
段少毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enyike Beijing Data Technology Co ltd
Original Assignee
Enyike Beijing Data Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0263 Targeted advertisements based upon Internet or website rating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a similar population expansion method and device based on a computing advertisement background. The method comprises the following steps: acquiring a positive sample and a negative sample; performing stratified sampling on the negative sample according to a preset negative sample sampling condition to obtain a sampled negative sample; performing feature processing on the advertisement front-end monitoring data and the third-party tag data respectively and constructing a model on each, to obtain a first model and a second model; adjusting the weight parameters of the positive samples in the first model and the second model respectively, to obtain a third model and a fourth model; performing fusion scoring on the third model and the fourth model to obtain crowd scoring information; selecting a fuzzy scoring region in the crowd scoring information; constructing a further model on the fuzzy scoring region and filtering and screening to obtain a first similar population; and screening and filtering the first similar population according to a preset rule tag and the advertisement front-end monitoring data to obtain the similar population.

Description

Similar population expansion method and device based on computing advertisement background
Technical Field
The invention relates to the technical field of information processing, in particular to a similar population expansion method and device based on a computing advertisement background.
Background
In internet commercial applications, many advertisers searching for potential customers find it difficult to identify high-potential users and to balance cost against scale. Lookalike audience targeting (similar population expansion) emerged against this background.
The mainstream Lookalike methods are as follows. The first is explicit targeting, in which the advertiser selects the audience by tag. This method is simple and intuitive: the advertiser directly screens the target population by filtering user-profile tags such as gender, age, and preferences. However, it requires extensive manual trial and error by the advertiser, and it is limited in that manually assigned tags cannot fully capture all attributes of the target population, for example the assumption that men do not care about skin care products. Explicit targeting therefore has a long cycle and high cost, and is difficult to apply generally.
The second is implicit targeting, in which the seed users are modeled with a machine learning method. Implicit targeting requires almost no advertiser involvement: the advertiser only needs to provide the characteristics of the target population (namely, the seed users), and similar populations are discovered automatically from the seed data by machine learning, which effectively avoids the problems of user-defined tags.
The technical difficulties of Lookalike are as follows. Difficulty one: finding high-potential users and balancing precision against scale are the two main problems advertisers face, and the core issue is reaching potential users effectively at large scale. Achieving Pareto optimality (the most ideal state) between effect and scale is difficult. Specifically, if an advertiser wants to reach as many potential target customers as possible, it must reach users on a large scale; the audience then inevitably becomes less focused, the proportion of non-target users grows with the traffic, and advertising cost rises. If the advertiser instead reduces the reach, part of the target population is missed and the advertising effect suffers.
Difficulty two: reducing sensitivity to the seed users. Seed users are the premise and basis of the expansion, and their quality is often the key to how well Lookalike performs. However, it is difficult for advertisers to provide seed packages that are large enough and broad enough. It is therefore necessary to consider how to perform effective data preprocessing and model learning when the seed package is small and the seeds do not necessarily cover the global features.
Disclosure of Invention
The invention aims to solve the technical problem of the prior art and provides a similar population expansion method and device based on a computing advertisement background.
The technical scheme for solving the technical problems is as follows: a similar population expansion method based on a computing advertisement background comprises the following steps:
acquiring a positive sample consisting of the seed population, a negative sample consisting of the non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule tag;
performing stratified sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
performing feature processing on the advertisement front-end monitoring data and the third-party tag data respectively and constructing a model on each, to obtain a first model and a second model;
adjusting the weight parameters of the positive samples in the first model and the second model respectively, to obtain a third model and a fourth model;
performing fusion scoring on the third model and the fourth model to obtain crowd scoring information;
selecting a fuzzy scoring region in the crowd scoring information;
constructing a further model on the fuzzy scoring region, and filtering and screening to obtain a first similar population;
and screening and filtering the first similar population according to the preset rule tag and the advertisement front-end monitoring data to obtain the similar population.
Further, the step of respectively performing feature processing on the advertisement front-end monitoring data and the third-party tag data, respectively constructing models, and correspondingly obtaining a first model and a second model includes:
performing numerical encoding on the advertisement front-end monitoring data to obtain numerically encoded advertisement front-end monitoring data;
performing numerical binning on the numerically encoded advertisement front-end monitoring data to obtain customer relationship management class features;
and constructing a model on the customer relationship management class features to obtain a first model.
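The encoding and binning steps above can be illustrated with a minimal Python sketch. It assumes that the customer relationship management class features take a recency/frequency/monetary form (the detailed description names RFM-class features); the bin edges, the function names, and the event layout are hypothetical and are not specified by the patent.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical bin edges; the patent does not specify them.
RECENCY_EDGES = [7, 30, 90]             # days since the last event
FREQUENCY_EDGES = [1, 5, 20]            # number of events
MONETARY_EDGES = [0.0, 100.0, 1000.0]   # total amount


def bin_index(value, edges):
    """Return the index of the bin that `value` falls into.

    With edges [a, b, c] the bins are (-inf, a), [a, b), [b, c), [c, +inf).
    """
    return bisect_right(edges, value)


def encode_categories(values):
    """Map each distinct category to an integer code, in first-seen order."""
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes))
    return codes


def rfm_features(events, today):
    """Build binned recency/frequency/monetary features for one user.

    `events` is a non-empty list of (event_date, amount) tuples.
    """
    last_event = max(d for d, _ in events)
    recency_days = (today - last_event).days
    frequency = len(events)
    monetary = sum(amount for _, amount in events)
    return (
        bin_index(recency_days, RECENCY_EDGES),
        bin_index(frequency, FREQUENCY_EDGES),
        bin_index(monetary, MONETARY_EDGES),
    )
```

The binned indices are small integers, which keeps the feature space compact before a linear model such as logistic regression is fitted.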
Further, the step of respectively performing feature processing on the advertisement front-end monitoring data and the third-party tag data, respectively constructing models, and correspondingly obtaining a first model and a second model includes:
performing word embedding processing on the third-party tag data to obtain embedded third-party tag data;
and constructing a model for the embedded third-party tag data to obtain a second model.
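As a rough illustration of turning embedded tags into model input, the sketch below averages per-tag vectors into one feature vector per user. It assumes the tag vectors have already been trained elsewhere (for example by a word2vec-style model); the `TAG_VECTORS` table, its values, and the averaging choice are hypothetical and not taken from the patent.

```python
# Hypothetical pre-trained tag vectors; real ones would come from an
# embedding model trained on the third-party tag data.
TAG_VECTORS = {
    "sports": [0.9, 0.1, 0.0],
    "travel": [0.2, 0.8, 0.1],
    "finance": [0.0, 0.3, 0.9],
}


def embed_user(tags, tag_vectors, dim=3):
    """Average the vectors of a user's known tags into one feature vector.

    Unknown tags are skipped; a user with no known tags gets a zero vector.
    """
    known = [tag_vectors[t] for t in tags if t in tag_vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]
```

The resulting fixed-length vectors can then be fed to a tree-based learner to obtain the second model.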
Further, the step of selecting the fuzzy scoring area in the crowd scoring information includes:
and setting a region with the score value of 0.5-0.7 in the crowd scoring information as a fuzzy scoring region.
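A minimal sketch of selecting the fuzzy scoring region might look as follows. The 0.5 to 0.7 range comes from the text; treating both ends as inclusive, and the handling of scores outside the range, are assumptions.

```python
FUZZY_LOW, FUZZY_HIGH = 0.5, 0.7  # range stated in the text


def split_by_score(scores, low=FUZZY_LOW, high=FUZZY_HIGH):
    """Partition users into below-range, fuzzy, and above-range groups.

    `scores` maps a user id to a fused score in [0, 1]. Both ends of the
    fuzzy range are treated as inclusive, which is an assumption.
    """
    below, fuzzy, above = [], [], []
    for user, score in scores.items():
        if score < low:
            below.append(user)
        elif score <= high:
            fuzzy.append(user)
        else:
            above.append(user)
    return below, fuzzy, above
```

Only the `fuzzy` group would be passed to the further model for additional filtering and screening.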
Further, the weight parameters of the positive samples are assigned and adjusted according to a time decay coefficient.
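The patent does not state the form of the time decay coefficient. As one possible sketch, the code below assumes exponential decay with a hypothetical 30-day half-life, so that a positive sample observed today has weight 1.0 and one observed 30 days ago has weight 0.5.

```python
import math


def time_decay_weight(age_days, half_life_days=30.0):
    """Exponential decay: weight 1.0 at age 0, halving every half-life."""
    return math.exp(-math.log(2.0) * age_days / half_life_days)


def positive_sample_weights(ages_in_days, half_life_days=30.0):
    """Return one weight per positive sample, given each sample's age in days."""
    return [time_decay_weight(age, half_life_days) for age in ages_in_days]
```

Weights of this kind are typically passed to the learner as per-sample weights, so that recent positive samples contribute more to the loss than older ones.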
The invention has the following beneficial effects: by combining machine learning with rule-package filtering, the size of the reached audience is reduced while still reaching as much of the target population as possible, which improves the accuracy of target population screening.
In addition, the invention also provides a similar population expanding device based on the computing advertisement background, which comprises:
an acquisition device, configured to acquire a positive sample consisting of the seed population, a negative sample consisting of the non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule tag;
a processing device, configured to perform stratified sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
the processing equipment is further used for respectively performing feature processing on the advertisement front-end monitoring data and the third-party tag data, respectively building models, and correspondingly obtaining a first model and a second model;
the processing device is further configured to adjust the weight parameters of the positive samples in the first model and the second model respectively, and correspondingly obtain a third model and a fourth model;
the processing equipment is further used for fusing and scoring the third model and the fourth model to obtain crowd scoring information;
the processing device is further used for selecting a fuzzy scoring area in the crowd scoring information;
the processing equipment is further used for further constructing a model for the fuzzy scoring area to carry out filtering and screening to obtain a first similar population;
the processing equipment is further used for screening and filtering the first similar population according to preset rule labels and the advertisement front-end monitoring data to obtain similar populations.
Further, the processing device is further configured to perform numerical encoding on the advertisement front-end monitoring data to obtain numerically encoded advertisement front-end monitoring data;
the processing device is further configured to perform numerical binning on the numerically encoded advertisement front-end monitoring data to obtain customer relationship management class features;
the processing device is further configured to construct a model on the customer relationship management class features to obtain a first model.
Further, the processing device is further configured to perform word embedding processing on the third-party tag data to obtain embedded third-party tag data;
the processing equipment is further used for building a model for the embedded third-party tag data to obtain a second model.
Further, the processing device is further configured to set a region with a score value of 0.5 to 0.7 in the crowd rating information as a fuzzy rating region.
Further, the weight parameters of the positive samples are assigned and adjusted according to a time decay coefficient.
The invention has the following beneficial effects: by combining machine learning with rule-package filtering, the size of the reached audience is reduced while still reaching as much of the target population as possible, which improves the accuracy of target population screening.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a schematic flowchart of a similar population expansion method in a computing-based advertisement context according to an embodiment of the present invention.
Fig. 2 is a second schematic flowchart of a similar population expanding method in the context of computing-based advertising according to an embodiment of the present invention.
Fig. 3 is a schematic structural block diagram of a similar population expansion apparatus in the context of computing-based advertising according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1 and fig. 2, an embodiment of the present invention provides a similar population expanding method based on a computing advertisement context, which includes:
S1, acquiring a positive sample consisting of the seed population, a negative sample consisting of the non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule tag;
S2, performing stratified sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
S3, performing feature processing on the advertisement front-end monitoring data and the third-party tag data respectively and constructing a model on each, to obtain a first model and a second model;
S4, adjusting the weight parameters of the positive samples in the first model and the second model respectively, to obtain a third model and a fourth model;
S5, performing fusion scoring on the third model and the fourth model to obtain crowd scoring information;
S6, selecting a fuzzy scoring region in the crowd scoring information;
S7, constructing a further model on the fuzzy scoring region, and filtering and screening to obtain a first similar population;
S8, screening and filtering the first similar population according to the preset rule tag and the advertisement front-end monitoring data to obtain the similar population.
Similar population expansion (Lookalike) is a technique that, starting from seed users, finds additional similar users with potential relevance by means of an algorithmic evaluation model.
Seed population: the target population.
Logistic regression: a supervised statistical learning method used mainly to classify samples.
GBDT (Gradient Boosting Decision Tree): an iterative decision tree algorithm composed of multiple decision trees, whose outputs are summed to give the final answer.
The embodiment of the invention specifically comprises the following steps: step 11, selecting positive and negative samples, and sampling the negative samples; step 12, performing feature processing on the advertisement front-end monitoring data and the third-party tag data respectively, and constructing models; step 13, adjusting the weight parameters of the positive samples, with weights assigned according to a time decay coefficient; step 14, performing model fusion scoring; step 15, constructing a further model on the fuzzy scoring region for filtering and screening; and step 16, applying unified filtering with the rule tags, screening for users active within the last few months, and finally selecting the high-potential population.
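The final filtering in step 16 can be sketched as below. This is only an illustration under assumptions: rule tags are modeled as a set of excluded tags, "the last few months" is taken as a hypothetical 90-day window, and all names and data layouts are invented for the example.

```python
from datetime import date, timedelta


def final_filter(candidates, user_tags, last_active, excluded_tags, today,
                 window_days=90):
    """Keep candidates that pass the rule tags and were active recently.

    A candidate is dropped if it carries any excluded rule tag, or if its
    last recorded activity is missing or older than `window_days`.
    `user_tags` maps user id -> set of tags; `last_active` maps user id -> date.
    """
    cutoff = today - timedelta(days=window_days)
    kept = []
    for user in candidates:
        if user_tags.get(user, set()) & excluded_tags:
            continue
        seen = last_active.get(user)
        if seen is None or seen < cutoff:
            continue
        kept.append(user)
    return kept
```

Applying the rule tags and the activity window together in one pass mirrors the unified filtering described in the step.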
1. The seed population is taken as the positive samples, and the negative samples are obtained by stratified sampling. For example, in a scenario of expanding lead-submitting users (users who leave their contact information), the negative samples can be drawn by stratified sampling from users who visited the site but did not submit their information. A PU learning approach (positive-unlabeled learning, a semi-supervised binary classifier with only positive samples labeled) is then used to train the model multiple times and remove high-frequency samples from the negative samples, yielding the final negative samples. Because most of the target population falls within the recently active population, both the positive and the negative samples are drawn from recently active users.
2. Because the front-end advertisement monitoring data is high-dimensional and tree models are difficult to apply to it, the front-end monitoring data is numerically encoded and binned, RFM (Recency-Frequency-Monetary) class features are constructed, and, after the features are flattened, a model is built with logistic regression. The third-party tag data is embedded with the word2vec word embedding method (described in the original text as one-hot encoding), and a model is then built with GBDT (Gradient Boosting Decision Tree).
3. Since recent user behavior is more valuable for mining high potential populations (similar populations), weights of positive samples are assigned in the model by time decay coefficients, with recent positive samples being weighted more heavily than earlier positive samples.
4. The LR (Logistic Regression) model and the GBDT (Gradient Boosting Decision Tree) model are fused using the stacking method, and the final crowd score is output.
5. The crowd scores are grouped into bands. The model is found to classify a large population as high-potential, with a large number of scores falling between 0.5 and 0.7; this band is defined as the fuzzy scoring region, and a further model is constructed to filter and screen these samples.
6. Finally, common rule tags are combined with the active population observed in the advertisement front-end monitoring data over the last few months to screen and filter the whole high-potential population uniformly, and the final high-potential population is selected.
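The stratified sampling of negative samples described in item 1 can be sketched as follows. The stratum key, the sampling fraction, and the rule of keeping at least one sample per stratum are assumptions for illustration; the patent leaves the preset sampling condition unspecified, and the PU learning filter is not shown here.

```python
import random
from collections import defaultdict


def stratified_sample(negatives, stratum_of, fraction, seed=0):
    """Sample the same fraction of negatives from every stratum.

    `stratum_of` maps a sample to its stratum key, and `fraction` must lie
    in (0, 1]. At least one sample is kept per non-empty stratum so that
    small strata are not lost.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in negatives:
        strata[stratum_of(sample)].append(sample)
    sampled = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        sampled.extend(rng.sample(members, k))
    return sampled
```

For instance, with 10 negatives in one stratum and 5 in another, a fraction of 0.2 keeps 2 and 1 respectively, preserving the original proportions.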
The main improvements of the invention are as follows: 1. The negative samples are obtained by stratified sampling, a PU learning method (semi-supervised binary classification with only positive samples labeled) is used to filter out negative samples that are likely to be positive, and recently active users are used. 2. Feature processing is performed separately on the advertisement front-end monitoring data and the third-party tag data, models are built with LR (Logistic Regression) and GBDT (Gradient Boosting Decision Tree) respectively, and model fusion scoring is finally performed with the stacking method. 3. The weights of the positive samples are assigned in the model by the time decay coefficient. 4. A further model is constructed for the low-scoring part of the high-potential population selected by the first-layer model, for filtering and screening. 5. To reduce the size of the population reached by the advertisement, the high-potential users selected by the model are finally filtered by combining the rule tags with the population active over the last few months.
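The stacking fusion mentioned in improvement 2 can be illustrated with a simplified second-level learner. The sketch assumes the two base models have already produced scores for each user, and fits a small logistic-regression meta-learner on those score pairs by gradient descent. The function names, learning rate, and epoch count are hypothetical, and a full stacking setup would normally train the meta-learner on out-of-fold base predictions, which is omitted here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_model(base_scores, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-learner on base-model scores.

    `base_scores` is a list of (lr_score, gbdt_score) pairs. Returns the
    weights [w_lr, w_gbdt] and the bias, trained by batch gradient descent
    on the log loss.
    """
    w = [0.0, 0.0]
    b = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (s1, s2), y in zip(base_scores, labels):
            err = sigmoid(w[0] * s1 + w[1] * s2 + b) - y
            grad_w[0] += err * s1
            grad_w[1] += err * s2
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def fused_score(pair, w, b):
    """Combine one (lr_score, gbdt_score) pair into a fused score in (0, 1)."""
    return sigmoid(w[0] * pair[0] + w[1] * pair[1] + b)
```

The fused score plays the role of the crowd score, from which the fuzzy scoring region is later selected.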
As shown in fig. 3, in addition, the present invention also provides a similar population expanding device based on the computing advertisement background, which includes:
an acquisition device, configured to acquire a positive sample consisting of the seed population, a negative sample consisting of the non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule tag;
a processing device, configured to perform stratified sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
the processing equipment is further used for respectively performing feature processing on the advertisement front-end monitoring data and the third-party tag data, respectively building models, and correspondingly obtaining a first model and a second model;
the processing device is further configured to adjust the weight parameters of the positive samples in the first model and the second model respectively, and correspondingly obtain a third model and a fourth model;
the processing equipment is further used for fusing and scoring the third model and the fourth model to obtain crowd scoring information;
the processing device is further used for selecting a fuzzy scoring area in the crowd scoring information;
the processing equipment is further used for further constructing a model for the fuzzy scoring area to carry out filtering and screening to obtain a first similar population;
the processing equipment is further used for screening and filtering the first similar population according to preset rule labels and the advertisement front-end monitoring data to obtain similar populations.
Further, the processing device is further configured to perform numerical encoding on the advertisement front-end monitoring data to obtain numerically encoded advertisement front-end monitoring data;
the processing device is further configured to perform numerical binning on the numerically encoded advertisement front-end monitoring data to obtain customer relationship management class features;
the processing device is further configured to construct a model on the customer relationship management class features to obtain a first model.
Further, the processing device is further configured to perform word embedding processing on the third-party tag data to obtain embedded third-party tag data;
the processing equipment is further used for building a model for the embedded third-party tag data to obtain a second model.
Further, the processing device is further configured to set a region with a score value of 0.5 to 0.7 in the crowd rating information as a fuzzy rating region.
Further, the weight parameters of the positive samples are assigned and adjusted according to a time decay coefficient.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A similar population expansion method based on a computing advertisement background, characterized by comprising the following steps:
acquiring a positive sample consisting of the seed population, a negative sample consisting of the non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule tag;
performing stratified sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
performing feature processing on the advertisement front-end monitoring data and the third-party tag data respectively and constructing a model on each, to obtain a first model and a second model;
adjusting the weight parameters of the positive samples in the first model and the second model respectively, to obtain a third model and a fourth model;
performing fusion scoring on the third model and the fourth model to obtain crowd scoring information;
selecting a fuzzy scoring region in the crowd scoring information;
constructing a further model on the fuzzy scoring region, and filtering and screening to obtain a first similar population;
and screening and filtering the first similar population according to the preset rule tag and the advertisement front-end monitoring data to obtain the similar population.
2. The similar population expansion method based on a computing advertisement background according to claim 1, wherein the step of performing feature processing on the advertisement front-end monitoring data and on the third-party tag data respectively, and constructing a model for each, to obtain a first model and a second model correspondingly, comprises:
performing numerical encoding on the advertisement front-end monitoring data to obtain numerically encoded advertisement front-end monitoring data;
performing numerical binning on the numerically encoded advertisement front-end monitoring data to obtain customer relationship management class features; and
constructing a model on the customer relationship management class features to obtain the first model.
3. The similar population expansion method based on a computing advertisement background according to claim 1, wherein the step of performing feature processing on the advertisement front-end monitoring data and on the third-party tag data respectively, and constructing a model for each, to obtain a first model and a second model correspondingly, comprises:
performing word embedding on the third-party tag data to obtain embedded third-party tag data; and
constructing a model on the embedded third-party tag data to obtain the second model.
4. The method according to claim 1, wherein the step of selecting the fuzzy scoring area in the crowd scoring information comprises:
setting the part of the crowd scoring information whose score values lie between 0.5 and 0.7 as the fuzzy scoring area.
5. The method according to claim 1, wherein the weight parameters of the positive samples are respectively assigned and adjusted according to time attenuation coefficients.
6. A similar population expansion device based on a computing advertisement background, characterized by comprising:
an acquisition device, configured to acquire a positive sample consisting of a seed population, a negative sample consisting of a non-seed population, a preset negative sample sampling condition, advertisement front-end monitoring data, third-party tag data, and a preset rule label; and
a processing device, configured to perform stratified (layered) sampling on the negative sample according to the preset negative sample sampling condition to obtain a sampled negative sample;
the processing device being further configured to perform feature processing on the advertisement front-end monitoring data and on the third-party tag data respectively, and to construct a model for each, to obtain a first model and a second model correspondingly;
the processing device being further configured to adjust the weight parameters of the positive samples in the first model and in the second model respectively, to obtain a third model and a fourth model correspondingly;
the processing device being further configured to fuse the third model and the fourth model and to score with the fused result to obtain crowd scoring information;
the processing device being further configured to select a fuzzy scoring area in the crowd scoring information;
the processing device being further configured to construct a further model for the fuzzy scoring area and to perform filtering and screening to obtain a first similar population; and
the processing device being further configured to screen and filter the first similar population according to the preset rule label and the advertisement front-end monitoring data to obtain the similar population.
7. The similar population expansion device based on a computing advertisement background according to claim 6, wherein:
the processing device is further configured to perform numerical encoding on the advertisement front-end monitoring data to obtain numerically encoded advertisement front-end monitoring data;
the processing device is further configured to perform numerical binning on the numerically encoded advertisement front-end monitoring data to obtain customer relationship management class features; and
the processing device is further configured to construct a model on the customer relationship management class features to obtain the first model.
8. The similar population expansion device based on a computing advertisement background according to claim 6, wherein:
the processing device is further configured to perform word embedding on the third-party tag data to obtain embedded third-party tag data; and
the processing device is further configured to construct a model on the embedded third-party tag data to obtain the second model.
9. The device according to claim 6, wherein the processing device is further configured to set the part of the crowd scoring information whose score values lie between 0.5 and 0.7 as the fuzzy scoring area.
10. The device according to claim 6, wherein the weight parameters of the positive samples are respectively assigned and adjusted according to time attenuation coefficients.
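
For orientation only, several of the steps recited in claims 1, 4 and 5 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the claimed implementation: the claims do not specify the form of the time attenuation, the fusion rule, or how the boundaries of the 0.5 to 0.7 range are treated. The exponential decay in `time_decay_weight`, the weighted average in `fuse_scores`, the inclusive boundaries in `split_by_score`, and all function names, parameters, and sample values are hypothetical.

```python
import math
import random
from collections import defaultdict


def stratified_sample(negatives, stratum_of, fraction, seed=0):
    """Draw the same fraction of negative samples from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for user in negatives:
        strata[stratum_of(user)].append(user)
    sampled = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # keep at least one per stratum
        sampled.extend(rng.sample(members, k))
    return sampled


def time_decay_weight(age_days, decay=0.05):
    """Assumed exponential attenuation: more recent positive samples weigh more."""
    return math.exp(-decay * age_days)


def fuse_scores(score_a, score_b, alpha=0.5):
    """Assumed fusion rule: weighted average of the two model scores."""
    return alpha * score_a + (1 - alpha) * score_b


def split_by_score(scores, low=0.5, high=0.7):
    """Split users into above, fuzzy (low <= score <= high, assumed inclusive), and below."""
    above, fuzzy, below = [], [], []
    for user, score in scores.items():
        if score > high:
            above.append(user)
        elif score >= low:
            fuzzy.append(user)
        else:
            below.append(user)
    return above, fuzzy, below


# Hypothetical scores from the two weighted models for three users.
model_a = {"u1": 0.9, "u2": 0.6, "u3": 0.2}
model_b = {"u1": 0.8, "u2": 0.6, "u3": 0.4}
fused = {user: fuse_scores(model_a[user], model_b[user]) for user in model_a}
above, fuzzy, below = split_by_score(fused)

# Hypothetical negatives split into two strata by parity, sampled at 10%.
negatives = list(range(100))
sampled_negatives = stratified_sample(negatives, lambda u: u % 2, 0.1)
```

In this sketch only the users in `fuzzy` would be passed to the further model of claim 1; how users outside that range are handled is not stated in the claims and is left open here.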
CN202110295616.6A 2021-03-19 2021-03-19 Similar population expansion method and device based on computing advertisement background Pending CN112907295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295616.6A CN112907295A (en) 2021-03-19 2021-03-19 Similar population expansion method and device based on computing advertisement background

Publications (1)

Publication Number Publication Date
CN112907295A true CN112907295A (en) 2021-06-04

Family

ID=76105530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295616.6A Pending CN112907295A (en) 2021-03-19 2021-03-19 Similar population expansion method and device based on computing advertisement background

Country Status (1)

Country Link
CN (1) CN112907295A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114444576A (en) * 2021-12-30 2022-05-06 北京达佳互联信息技术有限公司 Data sampling method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170186030A1 (en) * 2015-04-21 2017-06-29 Tencent Technology (Shenzhen) Company Limited Advertisement click-through rate correction method and advertisement push server
CN108205766A (en) * 2016-12-19 2018-06-26 阿里巴巴集团控股有限公司 Information-pushing method, apparatus and system
CN108875761A (en) * 2017-05-11 2018-11-23 华为技术有限公司 A kind of method and device for expanding potential user
CN109034896A (en) * 2018-07-23 2018-12-18 北京奇艺世纪科技有限公司 Crowd's prediction technique and device are launched in a kind of advertisement
CN109903127A (en) * 2019-02-14 2019-06-18 广州视源电子科技股份有限公司 Group recommendation method and device, storage medium and server
CN109934369A (en) * 2017-12-15 2019-06-25 北京京东尚科信息技术有限公司 Method and device for information push
CN111882360A (en) * 2020-07-30 2020-11-03 北京达佳互联信息技术有限公司 User group expansion method and device
CN112508609A (en) * 2020-12-07 2021-03-16 深圳市欢太科技有限公司 Crowd expansion prediction method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
WO2020238631A1 (en) Population type recognition method based on mobile phone signaling data
CN106650273B (en) A kind of behavior prediction method and apparatus
CN109325542A (en) A kind of electricity exception intelligent identification Method and system based on multistage machine learning
CN110163647B (en) Data processing method and device
CN107809331A (en) The method and apparatus for identifying abnormal flow
CN111159763B (en) System and method for analyzing portrait of law-related personnel group
CN109345263A (en) Predict the method and system of customer satisfaction
CN110458376A (en) A kind of suspicious risk trade screening method and corresponding system
CN107330464A (en) Data processing method and device
CN110113634A (en) A kind of information interaction method, device, equipment and storage medium
CN113255617B (en) Image recognition method and device, electronic equipment and computer-readable storage medium
CN108470022A (en) A kind of intelligent work order quality detecting method based on operation management
CN113159881B (en) Data clustering and B2B platform customer preference obtaining method and system
CN112700324A (en) User loan default prediction method based on combination of Catboost and restricted Boltzmann machine
CN104657457B (en) A kind of user evaluates data processing method, video recommendation method and the device of video
CN110532331A (en) A kind of method and relevant apparatus that object type is determining
CN108062366A (en) Public culture information recommendation system
CN109559152A (en) A kind of network marketing method, system and computer storage medium
CN111986027A (en) Abnormal transaction processing method and device based on artificial intelligence
CN115545103A (en) Abnormal data identification method, label identification method and abnormal data identification device
CN115544348A (en) Intelligent mass information searching system based on Internet big data
CN113704389A (en) Data evaluation method and device, computer equipment and storage medium
CN112202849A (en) Content distribution method, content distribution device, electronic equipment and computer-readable storage medium
CN112907295A (en) Similar population expansion method and device based on computing advertisement background
CN109918544B (en) Rough set-based social relationship network intelligent analysis method and system for job crime

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination