CN115375361A - Method and device for selecting target population for online advertisement delivery and electronic equipment


Info

Publication number
CN115375361A
Authority
CN
China
Prior art keywords
advertisement
information
text
label
crowd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211013416.8A
Other languages
Chinese (zh)
Inventor
张聪
沈菁
康单
陈文海
张天生
陆璐
熊家治
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Feishu Shennuo Digital Technology Shanghai Co ltd
Original Assignee
Feishu Shennuo Digital Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Feishu Shennuo Digital Technology Shanghai Co ltd
Priority to CN202211013416.8A
Publication of CN115375361A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a device for selecting a target population for online advertisement delivery, and electronic equipment, wherein the method comprises the following steps: calling an advertisement text classifier to classify a new advertisement text to obtain new advertisement text classification information; screening advertisements with similar classifications from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set; calculating the cosine similarity between the new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out a most similar advertisement set; and acquiring the crowd labels of each advertisement in the most similar advertisement set, and obtaining the final recommended crowd labels for selection according to a label sorting rule. With the method, the device, the electronic equipment and the computer-readable storage medium provided by the embodiments of the invention, recommended crowd labels are generated automatically from newly created user advertisement information, which reduces the research and analysis workload of advertisement optimization personnel.

Description

Method and device for selecting target population for online advertisement delivery and electronic equipment
Technical Field
The invention relates to the technical field of electronic commerce, in particular to a method and a device for selecting target population for online advertisement delivery, electronic equipment and a computer-readable storage medium.
Background
In online advertisement delivery, selecting a suitable target crowd is a necessary step. Traditionally, the target crowd is selected by advertisement optimization personnel according to the advertiser's needs and their own experience. Because skill levels are uneven and the overall industry and crowd data cannot be analyzed in detail by hand, decisions about the target crowd for advertisement delivery are prone to large errors. Inaccurate selection of the target crowd leads to the following consequences:
1. the advertisement is delivered to the wrong target audience, which hurts conversion and fails to meet the advertiser's expectations;
2. trial-and-error costs increase and the advertiser's budget is wasted.
Disclosure of Invention
In order to solve the existing technical problems, embodiments of the present invention provide a method and an apparatus for selecting a target group of online advertisement delivery, an electronic device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present invention provides a method for selecting a target group for online advertisement delivery, where the method includes:
receiving newly created user advertisement information, and acquiring a new advertisement text from the user advertisement information;
calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
screening advertisements with similar classifications from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set;
calculating the cosine similarity between the new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out the most similar advertisement set;
and acquiring the crowd labels of each advertisement in the most similar advertisement set, and obtaining the final recommended crowd labels for selection according to a label sorting rule.
In a second aspect, an embodiment of the present invention provides a device for selecting a target group for online advertisement delivery, where the device includes:
the advertisement receiving module is used for receiving newly created user advertisement information and acquiring a new advertisement text from the user advertisement information;
the text classification module is used for calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
the advertisement screening module is used for screening advertisements with similar classifications from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set;
the cosine calculation module is used for calculating the cosine similarity between the new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set and screening out the most similar advertisement set;
and the label selection module is used for acquiring the crowd labels of each advertisement in the most similar advertisement set and obtaining the final recommended crowd labels for selection according to a label sorting rule.
In a third aspect, an embodiment of the present invention provides an electronic device, including a bus, a transceiver, a memory, a processor, and a computer program stored on the memory and executable on the processor, where the transceiver, the memory, and the processor are connected via the bus, and when executed by the processor, the computer program implements the steps in the method for selecting the target population for online advertisement delivery as described above.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the method for selecting the target population for online advertisement delivery as described above.
According to the method, the device, the electronic equipment and the computer-readable storage medium provided by the embodiments of the invention, big data analysis is performed on the rich data obtained from historical online advertising campaigns, a targeted algorithm is trained, and recommended crowd labels are generated automatically from newly created user advertisement information, which reduces the research and analysis workload of advertisement optimization personnel.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present invention, the drawings required to be used in the embodiments or the background art of the present invention will be described below.
Fig. 1 is a flowchart of a method for selecting a target population for online advertisement delivery according to an embodiment of the present invention;
Fig. 2 is a word vector diagram for the method for selecting a target population for online advertisement delivery according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the classification model of step S110 according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for selecting a target population for online advertisement delivery according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device for selecting a target population for online advertisement delivery according to an embodiment of the present invention.
Detailed Description
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client" and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmit capability and devices that have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device, such as a personal computer or tablet, with a single-line or multi-line display, or a cellular or other communication device without a multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client" or "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location on earth and/or in space.
As used herein, a "client" or "terminal device" may also be a communication terminal, an internet access terminal, or a music/video playing terminal, for example a PDA, an MID (Mobile Internet Device), and/or a mobile phone with a music/video playing function, or may be a smart TV, a set-top box, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and the like.
The hardware referred to by the names "server", "client", "service node", etc. in the embodiments of the present invention is essentially an electronic device with the capabilities of a personal computer. It is a hardware device with the components required by the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, and an output device. A computer program is stored in the memory, and the central processing unit loads the program from external memory into internal memory, executes its instructions, and interacts with the input and output devices to complete a specific function.
Those skilled in the art will appreciate that the concept of "server" in the embodiments of the present invention may also be extended to be applicable to a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through interfaces, or may be integrated into a physical computer or a set of computer clusters.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as methods, apparatus, electronic devices, and computer-readable storage media. Thus, embodiments of the invention may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), a combination of hardware and software. Furthermore, in some embodiments, embodiments of the invention may also be implemented in the form of a computer program product in one or more computer-readable storage media having computer program code embodied in the storage medium.
The computer-readable storage media described above may be any combination of one or more computer-readable storage media. The computer-readable storage medium includes: an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any combination thereof. In embodiments of the invention, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer program code embodied on the computer-readable storage medium may be transmitted using any appropriate medium, including: wireless, wire, fiber optic cable, radio frequency (RF), or any suitable combination thereof.
Computer program code for carrying out operations for embodiments of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as C or a similar programming language. The computer program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer.
Embodiments of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices, and computer-readable storage media according to embodiments of the invention.
It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions. These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner. Thus, the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The embodiments of the present invention will be described below with reference to the drawings.
Fig. 1 shows a flowchart of a method for selecting a target group for online advertising according to an embodiment of the present invention. As shown in fig. 1, the method includes:
step S102: collecting crowd labels from historical advertisements, acquiring advertisement information data from the historical advertisements, and preprocessing the advertisement information data; this specifically comprises the following steps:
step S1021: collecting a large volume of advertisement delivery data from the advertisement platform; the data can be acquired by advertisement optimization personnel through a specific advertisement platform API (application programming interface) or by data acquisition tools such as a crawler;
step S1022: the advertisement delivery data includes one or more crowd labels for each advertisement, and these labels were chosen by advertisement optimization personnel based on their own experience during the delivery activity;
step S1023: because advertisements target audiences in different countries, labels appear in different languages; non-English labels are therefore uniformly translated into English, cleaned by removing duplicates, abnormal characters and the like, and stored in a database; the translation can be done with common third-party translation tools, such as Google Translate, track translation and the like;
step S1024: reading the advertisement title and the advertisement body text from the obtained historical advertisement data, and splicing the title and the body text into a single advertisement text;
step S1025: removing URLs, e-mail addresses, emoji, abnormal symbols and the like from the advertisement text;
step S1026: translating non-English advertisement text into English; the translation can be done with common third-party translation tools, such as Google Translate, track translation and the like;
step S1027: using common natural language processing tools such as NLTK and spaCy to tokenize the advertisement text, and using word lists to delete common English stop words and words that are common in advertisements, such as promotion, discount, sale, etc.;
step S1028: through the above steps, each advertisement text is converted into a new text consisting of a series of meaningful words, and the new text is stored in a database;
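The cleaning in steps S1024, S1025 and S1027 can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation: the regular expressions, the tiny `STOP_WORDS` and `AD_COMMON_WORDS` lists, and the function name `preprocess_ad` are all assumptions made for the example, and translation (steps S1023 and S1026) is left out.

```python
import re

# Hypothetical, deliberately tiny word lists; a real system would load fuller ones
# (for example the stop-word lists shipped with NLTK or spaCy).
STOP_WORDS = {"a", "an", "the", "and", "or", "for", "on", "in", "of", "to", "with", "at", "is", "our"}
AD_COMMON_WORDS = {"promotion", "discount", "sale"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def preprocess_ad(title: str, body: str) -> list[str]:
    """Splice title and body (S1024), strip noise (S1025), tokenize and filter (S1027)."""
    text = f"{title} {body}".lower()
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    # Only ASCII letter runs are kept, so emoji, digits and stray symbols drop out here.
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOP_WORDS and t not in AD_COMMON_WORDS]


tokens = preprocess_ad(
    "Summer SALE on running shoes!",
    "Get a 20% discount at https://shop.example.com or mail help@example.com 🎉",
)
print(tokens)
```

Keeping only ASCII letter runs is a blunt way to discard emoji and abnormal symbols; it is acceptable here only because the text is assumed to have been translated into English first.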
step S104: obtaining third-party corpus information related to the online advertising field, and calculating the TF-IDF word weight of each word in the corpus (TF: term frequency within a single text; IDF: inverse document frequency) to use as the weight of each word in an advertisement text;
step S1041: obtaining a third-party corpus related to the advertising field from the internet and processing it according to steps S1025 to S1027;
step S1042: calculating the TF-IDF word weight of each word in the third-party corpus using common tools such as NLTK and spaCy;
the reason for using a third-party corpus in this step is that data collected from the internet is broader than the text the user obtains from an advertising platform, so the computed TF-IDF values are more representative and avoid the data bias of a single advertising platform; the vocabulary of the third-party corpus also covers almost all words found in advertisement text;
step S1043: the TF-IDF algorithm first calculates the term frequency, where term frequency (TF) = number of times a word appears in an article;
because articles differ in length, the term frequency is normalized so that different articles can be compared: term frequency (TF) = number of times a word appears in an article / total number of words in the article;
step S1044: calculating the inverse document frequency, which requires a corpus to model the environment in which the language is used;
inverse document frequency (IDF) = log(total number of documents in the corpus / (number of documents containing the word + 1))
The more common a word is, the larger the denominator and the smaller the inverse document frequency, which approaches 0. Adding 1 to the denominator prevents it from being 0 when no document contains the word;
step S1045: calculating TF-IDF, where TF-IDF = term frequency (TF) × inverse document frequency (IDF);
it can be seen that TF-IDF is proportional to the number of occurrences of a word in a document and inversely related to how common the word is across the whole corpus. The keyword extraction algorithm is therefore straightforward: calculate the TF-IDF value of each word in the document, sort the words in descending order, and take the top few.
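The formulas in steps S1043 to S1045 can be written out directly. The sketch below is one possible reading, assuming each document is already a list of tokens; the function names and the four-document toy corpus are hypothetical. Note that with the "+ 1" in the denominator, a word that occurs in every document gets a slightly negative IDF, which is a property of the formula as stated.

```python
import math
from collections import Counter


def term_frequency(doc: list[str]) -> dict[str, float]:
    """Normalized TF: occurrences of a word / total number of words in the document."""
    counts = Counter(doc)
    return {word: count / len(doc) for word, count in counts.items()}


def inverse_document_frequency(corpus: list[list[str]]) -> dict[str, float]:
    """IDF = log(total documents / (documents containing the word + 1))."""
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    return {word: math.log(len(corpus) / (df + 1)) for word, df in doc_freq.items()}


def tf_idf(doc: list[str], idf: dict[str, float]) -> dict[str, float]:
    """TF-IDF = TF x IDF; words unseen in the corpus get weight 0 (an assumption)."""
    return {word: tf * idf.get(word, 0.0) for word, tf in term_frequency(doc).items()}


# Toy corpus standing in for the third-party corpus.
corpus = [
    ["running", "shoes", "men"],
    ["leather", "shoes", "women"],
    ["garden", "tools"],
    ["wireless", "headphones"],
]
idf = inverse_document_frequency(corpus)
scores = tf_idf(corpus[0], idf)

# Keywords in descending order of TF-IDF, as described in the text.
keywords = sorted(scores, key=scores.get, reverse=True)
print(keywords)
```

In this toy corpus "shoes" appears in two documents and therefore ranks below "running" and "men", which each appear in only one.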
Step S1046: normalizing the word weights obtained in step S1045 to an interval [a, b] using min-max normalization, where the values of a and b are determined through repeated experiments; the main purpose of this step is to correct word weights that are too high or too low so that they fit the advertising business scenario;
the data normalization method is illustrated as follows:
$y_{\max}$: maximum value of the target interval to be mapped
$y_{\min}$: minimum value of the target interval to be mapped
$x_{\max}$: maximum value of the current data
$x_{\min}$: minimum value of the current data
$x$: any value in the current data
$y$: the normalized value after mapping
$$y = y_{\min} + \frac{y_{\max} - y_{\min}}{x_{\max} - x_{\min}} \, (x - x_{\min})$$
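As a small illustration of the mapping above, the following sketch rescales a dictionary of word weights into a target interval [a, b]. The function name `rescale`, the sample weights, and the interval [0.5, 1.5] are assumptions for the example; the source only says that a and b are found experimentally. Returning the interval midpoint when all weights are equal is also an assumption, since the formula is undefined in that case.

```python
def rescale(values: dict[str, float], a: float, b: float) -> dict[str, float]:
    """Min-max map values into [a, b]: y = a + (b - a) / (x_max - x_min) * (x - x_min)."""
    x_min, x_max = min(values.values()), max(values.values())
    if x_max == x_min:
        # Degenerate case not covered by the formula; the midpoint is an arbitrary choice.
        return {word: (a + b) / 2 for word in values}
    scale = (b - a) / (x_max - x_min)
    return {word: a + scale * (x - x_min) for word, x in values.items()}


# Hypothetical TF-IDF weights and a hypothetical target interval [0.5, 1.5].
weights = rescale({"running": 0.9, "shoes": 0.1, "men": 0.5}, a=0.5, b=1.5)
print(weights)
```

The smallest weight lands on a, the largest on b, and everything else falls linearly in between, so no word is ever weighted at zero unless a itself is zero.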
Step S1047: taking the obtained word weight as the word weight of the advertisement text processed in step S1028;
step S1048: constructing a feature vector of an advertisement text using a weighted word2vec word vector method, which specifically comprises the following steps:
step S10481: obtaining an open-source word vector table from the internet; GloVe and Google word2vec are common choices;
step S10482: looking up each word of the advertisement text processed in step S1028 in the word vector table to obtain its vector, which typically has between 50 and 1000 dimensions;
step S10483: multiplying each word vector by its weight coefficient and summing the results to obtain the vector representation of the advertisement text;
take the text "queen man" as an example
The vector for the word queen: [0.3, -0.1, 0.4];
the vector for the word man: [0.2, -0.2, 0.8];
vector representation of the text: [0.3, -0.1, 0.4] × weight(queen) + [0.2, -0.2, 0.8] × weight(man);
step S10484: vectorizing all advertisement texts, and storing the vectorized advertisement texts into a database;
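The weighted sum in step S10483 can be sketched with the "queen man" example. The three-dimensional vectors come from the text above; the weights 2.0 and 0.5, the function name `text_vector`, and the choice to skip words missing from the vector table are assumptions made for illustration.

```python
def text_vector(
    tokens: list[str],
    vectors: dict[str, list[float]],
    weights: dict[str, float],
    default_weight: float = 1.0,
) -> list[float]:
    """Sum of word vectors, each multiplied by its word weight."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:
            continue  # assumption: words absent from the vector table are ignored
        w = weights.get(token, default_weight)
        for i, component in enumerate(vec):
            total[i] += w * component
    return total


vectors = {"queen": [0.3, -0.1, 0.4], "man": [0.2, -0.2, 0.8]}
weights = {"queen": 2.0, "man": 0.5}  # hypothetical normalized TF-IDF weights

vec = text_vector(["queen", "man"], vectors, weights)
print(vec)
```

With these weights the result is 2.0 × [0.3, -0.1, 0.4] + 0.5 × [0.2, -0.2, 0.8], which is approximately [0.7, -0.3, 1.2].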
as shown in fig. 2, the word vector process schematically includes:
in the natural language processing task, word vector (Word Embedding) is a method of representing natural language words, i.e. each Word is represented as a point in an N-dimensional space, i.e. a vector in a high-dimensional space. By this method, the conversion of natural language computation into vector computation is achieved.
In the task of calculating word vectors as shown in fig. 2, each word (e.g. queen, king, etc.) is first converted into a vector of a high-dimensional space, and these vectors can represent semantic information of the word. And then the association relation among the words can be calculated by calculating the distance among the vectors, so that the purpose of enabling a computer to calculate the natural language like a numerical value is achieved.
Step S106: calling an advertisement text classifier, inputting historical advertisements, and obtaining classification information of the historical advertisements;
this step classifies e-commerce advertisement text by product category using deep learning NLP (natural language processing) text classification techniques. The industry classification of an advertisement is predicted from text information such as the advertisement copy, advertisement title, advertisement name, and advertisement landing page. With these classifications, advertisements can be grouped, industry benchmarks can be calculated, and the data trends of a given category can be monitored regularly, which helps guide advertisement delivery and market insight.
Wherein, step S106 specifically includes:
step S1061: acquiring, through the API, information such as the advertisement title, advertisement name, advertisement copy description, and advertisement landing page link (link_url);
step S1062: using a web crawler to acquire information such as the website title tag (title) and the advertisement copy description of the advertisement landing page;
step S1063: splicing the advertisement name, the advertisement title and the advertisement copy description to form a first group of copy information, named ad_desc;
step S1064: splicing the website title tag and the advertisement copy description obtained by crawling the advertisement landing page to form a second group of copy information, named ad_title;
step S1065: cleaning the website link of the advertisement landing page and extracting the relevant landing page information as a third group of copy information, named url_keywords;
step S1066: translating the three groups of copy information from steps S1063 to S1065 into English;
step S1067: performing binary prediction on the translated English copy with a fast-bert BertForSequenceClassification binary classification model (BERT: Bidirectional Encoder Representations from Transformers) to identify valid advertisement copy that contains commodity information;
step S1068: compiling the e-commerce industry category information according to the delivery history of e-commerce advertisements and the state of the e-commerce industry;
wherein the first-level classification includes: 21 major categories such as clothes, shoes, bags, jewelry, consumer electronics, computer office, home gardening and the like; the second-level classification and the third-level classification are subdivided on the basis of the respective previous-level classification, such as: apparel-men-tops, etc.;
step S1069: randomly extracting some data from historical advertisement data of an e-commerce advertisement delivery system, initially classifying by using a zero-shot unsupervised classifier, and then manually checking a classification result to obtain a training sample A;
step S10610: processing the advertisement copy on the existing E-commerce website by using a web crawler to obtain the category information of the E-commerce website, and performing category mapping to obtain a training sample B;
step S10611: performing combined optimization on the training samples A and B in the steps S1069 and S10610, performing prediction and manual verification processing on the advertising copy, and performing optimization iteration to obtain a first batch of training samples;
step S10612: training on the first batch of training samples obtained in step S10611 by using the fast-bert BertForSequenceClassification multi-class model, and calling the pre-trained model uncased_L-12_H-768_A-12 for fine-tuning to obtain a multi-label classification model, wherein the multi-label classification model comprises a first-level classification model, a second-level classification model and a third-level classification model;
step S10613: for the copy in ad_desc, ad_title and url_keywords that the binary classification of step S1067 judged valid (label "true"), calling the first-level classification model of fast-bert BertForSequenceClassification to obtain the probability value of each class, sorting the classes in descending order of probability, and storing the three classes with the highest probability values in a data table as top_1_category, top_2_category and top_3_category, respectively;
step S10614: performing multi-channel voting on the document 1, the picture 1 and the video 1 in the first group of document information ad _ desc, the document 2, the picture 2 and the video 2 in the second group of document information ad _ title, and the document 3, the picture 3 and the video 3 in the third group of document information url _ keywords to obtain the first-level classification of the advertisement;
step S10615: on the basis of the primary classification obtained in the step S10614, referring to the contents of the steps S10610-S10614, obtaining a secondary classification of the advertisement;
step S10616: obtaining a tertiary classification of the advertisement with reference to the contents of the steps S10610-S10614 on the basis of the secondary classification obtained in the step S10615;
step S10617: storing the advertisement classification data obtained in the above steps to a database;
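The voting of steps S10613 to S10614 can be pictured with a short sketch. It assumes each of the three channels (ad_desc, ad_title, url_keywords) has already produced class probabilities, and it uses a rank-weighted vote over each channel's top three classes. The point scheme, the tie-break rule and the function names are illustrative assumptions and are not taken from the embodiment.

```python
from collections import defaultdict


def top_k_categories(probabilities, k=3):
    """Return the k categories with the highest probability, in descending order."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [category for category, _ in ranked[:k]]


def vote_first_level(channel_probabilities, k=3):
    """Combine per-channel top-k categories into one first-level class.

    Assumption: a category ranked first in a channel earns k points, second
    earns k - 1, and so on; ties are broken by the summed probability.
    """
    points = defaultdict(int)
    prob_sum = defaultdict(float)
    for probabilities in channel_probabilities.values():
        for rank, category in enumerate(top_k_categories(probabilities, k)):
            points[category] += k - rank
            prob_sum[category] += probabilities[category]
    return max(points, key=lambda c: (points[c], prob_sum[c]))


# Hypothetical classifier outputs for the three groups of copy information.
channels = {
    "ad_desc": {"clothes": 0.70, "shoes": 0.20, "bags": 0.10},
    "ad_title": {"clothes": 0.50, "bags": 0.30, "shoes": 0.20},
    "url_keywords": {"shoes": 0.45, "clothes": 0.40, "bags": 0.15},
}
first_level = vote_first_level(channels)
```

The same function could be reused for the second-level and third-level classification by restricting the candidate classes to the children of the class chosen at the previous level.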
step S108: calculating text similarity among advertisement texts, and selecting the most similar advertisement texts and corresponding crowd labels;
in this step, the similarity between advertisement texts is calculated with a text similarity search technique based on the word2vec vectors of step S104, so as to obtain the top N most similar advertisement texts and their corresponding crowd labels; this information is then passed to the subsequent steps for further refinement.
From the historical advertisements processed in step S102, a certain number of advertisements are screened and processed according to the following rules:
1) Advertisements delivered within the last year are selected;
2) The delivery amount exceeds a certain number, or the advertisement click rate and conversion rate exceed certain values;
3) The specific thresholds are determined by advertisement optimization personnel according to experience, the purpose being to screen qualified advertisement data and to avoid the influence of invalid advertisements on the final effect;
4) The advertisement text (advertisement title + advertisement body content) is extracted;
5) Advertisements are grouped by identical advertisement text, the frequency of the crowd labels in each group is counted, and the crowd labels with higher frequency are kept as the crowd labels of that advertisement text;
6) The crowd labels of each advertisement text are truncated by frequency, with the upper and lower limits determined by advertisement optimization personnel according to experience;
7) The characterization vector of the advertisement text is calculated according to the vectorization method of step S104 and stored in a database.
In the step, a batch of similar advertisements are found by comparing the similarity of advertisement texts, and the crowd labels of the similar advertisements are counted to obtain the high-frequency crowd labels, and the high-frequency crowd labels are sent to the next stage of calculation. Therefore, during off-line training, the purpose is to keep the most representative crowd labels among the audiences corresponding to the advertisement text as much as possible.
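The grouping and frequency truncation of rules 5) and 6) can be sketched as follows. The record layout (`title`, `body`, `labels`) and the parameters `min_count` and `max_labels` are assumptions introduced for illustration; they stand in for the limits that advertisement optimization personnel would set from experience.

```python
from collections import Counter, defaultdict


def labels_per_text(ads, min_count=2, max_labels=3):
    """Group ads by identical text and keep the most frequent crowd labels.

    min_count is a hypothetical lower frequency limit and max_labels a
    hypothetical cap on how many labels are kept per advertisement text.
    """
    counts = defaultdict(Counter)
    for ad in ads:
        text = ad["title"] + " " + ad["body"]  # title + body content
        counts[text].update(ad["labels"])
    result = {}
    for text, counter in counts.items():
        frequent = [label for label, n in counter.most_common() if n >= min_count]
        result[text] = frequent[:max_labels]
    return result


# Hypothetical historical ads that share the same advertisement text.
history = [
    {"title": "Summer sale", "body": "Light jackets", "labels": ["t1", "t2"]},
    {"title": "Summer sale", "body": "Light jackets", "labels": ["t1", "t3"]},
    {"title": "Summer sale", "body": "Light jackets", "labels": ["t1", "t2"]},
]
kept = labels_per_text(history)
```

Here t1 appears three times and t2 twice, so both are kept, while t3 appears once and falls below the lower limit.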
Step S110: training the crowd labels according to a deep learning model to obtain an advertisement label candidate set;
step S108 can quickly select a batch of similar advertisements from a large number of advertisements, but there are still many irrelevant situations in the selected advertisements. The crowd labels obtained based on these advertisements are not necessarily accurate.
The embodiment of the invention further selects the primarily screened advertising crowd labels through a deep learning technology. The selection stage of the crowd labels is divided into three different algorithm models, each algorithm model has different characteristics, and the algorithm models specifically comprise the following steps:
step S1101: acquiring the crowd labels corresponding to an advertisement text and their CPM (Cost Per Mille, i.e. cost per thousand impressions) values, and selecting the crowd labels with the smallest CPM values as the advertisement label candidate set;
a statistical method is used to count the crowd labels and CPM values corresponding to each advertisement text, and a time decay strategy is introduced. When the model is used, a number of crowd labels with the smallest CPM values (a smaller CPM value represents lower cost and better effect) are selected from the set of advertisement texts initially selected in step S108 as the selection result. The details are as follows:
1) Each advertisement text corresponds to a plurality of advertisements, and each advertisement has crowd labels and a CPM value; the CPM value of an advertisement is used as the CPM value of each of its crowd labels, and the crowd labels and CPM values of the advertisement text are obtained through time decay and merging calculation.
2) Time decay of the CPM value:
the further the release date of an advertisement is from the current time, the lower the confidence in its CPM value, and the confidence in the CPM values of its crowd labels is reduced accordingly. Time-decay weighting is applied to the advertisement CPM value to emphasize the influence of newer advertisements on the final evaluation.
The decay is linear and reaches 0 at 90 days, specifically:
$$\mathrm{CPM}_{\text{decayed}} = \left(1 - \frac{\min(d, 90)}{90}\right) \times \mathrm{CPM}$$
where d is the number of days since the advertisement was released (values above 90 are counted as 90).
3) Merging calculation of the CPM value:
the CPM value of each advertisement is taken as the CPM value of all crowd labels of that advertisement;
for each crowd label, the CPM values from all advertisements carrying that label are summed and divided by the number of those advertisements to obtain the final CPM value;
examples are as follows:
advertisement A: CPM is 5.0 (after attenuation), t1, t2 and t3 are crowd labels, and the corresponding CPMs are all 5.0;
advertisement B: CPM is 10.0 (after attenuation), t1, t2, t3 and t4 are crowd labels, and CPMs corresponding to the crowd labels are all 10.0;
then, the crowd label and its corresponding CPM value for ad text 1 are:
t1:(5.0+10.0)/2=7.5
t2:(5.0+10.0)/2=7.5
t3:(5.0+10.0)/2=7.5
t4:10.0/1=10.0
The above statistical operation is performed on all advertisement texts, and the resulting crowd labels and CPM values are stored in a database.
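The decay and merging calculation of step S1101 can be reproduced with a minimal sketch. The function and field names are assumptions; the 90-day window and the averaging rule follow the description above, and the sample data is the advertisement A / advertisement B example.

```python
from collections import defaultdict


def decayed_cpm(cpm, days_since_release, window=90):
    """Linear time decay: the weight falls from 1 to 0 over `window` days."""
    days = min(max(days_since_release, 0), window)
    return (1 - days / window) * cpm


def merge_label_cpm(ads):
    """Average the (already decayed) CPM of every ad that carries each label."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ad in ads:
        for label in ad["labels"]:
            totals[label] += ad["cpm"]
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Advertisement A and advertisement B of advertisement text 1 (CPM after decay).
ads_for_text_1 = [
    {"cpm": 5.0, "labels": ["t1", "t2", "t3"]},
    {"cpm": 10.0, "labels": ["t1", "t2", "t3", "t4"]},
]
label_cpm = merge_label_cpm(ads_for_text_1)
# Crowd labels ordered from the smallest CPM value to the largest.
lowest_first = sorted(label_cpm, key=label_cpm.get)
```

With this data `label_cpm` gives 7.5 for t1, t2 and t3 and 10.0 for t4, matching the worked example.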
Step S1102: manually screening the crowd labels, and taking the most relevant crowd labels as an advertisement label candidate set;
given the massive amount of advertisement data, deep learning can be used to judge intelligently whether a given crowd label suits a given advertisement. The model can keep learning and improving as new advertisements are added, and an artificial neural network can model complex problems and find optimal solutions that human experience alone cannot find.
The input features of the deep learning model in this step include:
attribute information of the advertisement, such as advertisement publishing time, advertisement account number type, advertisement budget, bidding strategy, advertisement layout, advertisement material quantity and size;
demographic information such as gender, age, country, etc. of the advertisement target population;
advertisement text vectors, advertisement text types, text vectors of crowd labels, etc.;
the prediction target of step S1102 is: whether a certain crowd tag is suitable for a certain advertisement;
the construction steps of the training sample specifically comprise:
1) Constructing positive samples based on real advertisements and their corresponding crowd labels in the database;
the crowd labels selected for the existing advertisements in the database are considered reasonable. The audiences of existing advertisements are mostly selected manually by advertisement optimization personnel and should therefore match the advertisements semantically.
2) Constructing negative samples based on the top N recommended words ranked by the recommendation model;
it cannot be known directly which crowd labels are negative examples that an advertisement should not select. However, for a given advertisement, the crowd labels of the top N similar advertisement texts can be found based on step S108. Because crowd labels unrelated to those of the top N similar advertisement texts match the advertisement poorly, such unrelated crowd labels are randomly sampled and paired with the advertisement as negative samples.
3) Constructing positive and negative samples based on manual labeling;
the samples constructed in steps 1) and 2) are noisy because of data quality, but they have the advantage that massive numbers of samples can be built quickly, which improves the overall generalization capability of the model. A manual labeling method is therefore additionally introduced to construct low-noise, high-quality samples, which can further improve the accuracy of the model.
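Steps 1) and 2) of the sample construction can be sketched as below. The function name, the argument layout and the number of negatives per advertisement are assumptions made for illustration; manually labeled samples from step 3) would simply be appended to the same list.

```python
import random


def build_samples(ad_features, real_labels, similar_ad_labels, all_labels,
                  negatives_per_ad=2, seed=0):
    """Return (features, label, target) tuples for one advertisement.

    Positive samples pair the ad with its real crowd labels (target 1).
    Negative samples pair it with labels drawn at random from those that
    appear neither on the ad itself nor on its top-N similar ads (target 0).
    """
    rng = random.Random(seed)
    positives = [(ad_features, label, 1) for label in real_labels]
    excluded = set(real_labels) | set(similar_ad_labels)
    unrelated = sorted(set(all_labels) - excluded)
    chosen = rng.sample(unrelated, min(negatives_per_ad, len(unrelated)))
    negatives = [(ad_features, label, 0) for label in chosen]
    return positives + negatives


# Hypothetical advertisement with one real label and one label seen on similar ads.
samples = build_samples(
    ad_features={"budget": 100, "layout": "feed"},
    real_labels=["t1"],
    similar_ad_labels=["t2"],
    all_labels=["t1", "t2", "t3", "t4", "t5"],
)
```

Excluding the labels of similar advertisements keeps plausible labels out of the negative set, which is the reason the description samples only from unrelated labels.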
The manual labeling of the positive and negative samples can adopt the following two ways:
1) Manual correction, in which advertisement optimization personnel further screen the crowd labels of real existing advertisements, taking irrelevant crowd labels as negative samples and relevant crowd labels as positive samples;
2) In actual use, the advertisement optimization personnel input specific advertisement information, call the trained model and obtain the crowd label recommendation result; the advertisement optimization personnel select the relevant labels from the result as positive samples, and the rest serve as negative samples;
an example format for the training samples is as follows:
positive training sample: real advertisement 1 budget, bidding strategy, layout, age group, ……, real crowd label T1, predicted value 1;
negative training sample: real advertisement 2 budget, bidding strategy, layout, age group, ……, irrelevant crowd label T2, predicted value 0;
in step S1102, the model structure is a general binary classification model based on a fully connected neural network, as shown in FIG. 3; with this model, the question of whether a certain crowd label is suitable for a certain advertisement is converted into a binary classification problem.
In step S1102, the model can be built and trained with a common deep learning framework, such as TensorFlow, PyTorch or Caffe, and the trained model is saved to the hard disk as a binary file. In subsequent use, the model is called, the features of the advertisement and the candidate crowd labels are input one by one to obtain the model outputs (decimals between 0.0 and 1.0), and the crowd labels with the highest outputs are selected as the final crowd labels.
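The inference stage of step S1102 can be illustrated with a small fully connected binary classifier written in plain Python. This is only a sketch: the layer sizes, the hand-set weights and the two-dimensional label vectors are hypothetical, and a real system would load a model trained in one of the frameworks named above.

```python
import heapq
import math


def dense(inputs, weights, biases):
    """One fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def predict(features, params):
    """Forward pass of a two-layer binary classifier; returns a value in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(features, params["w1"], params["b1"])]  # ReLU
    (logit,) = dense(hidden, params["w2"], params["b2"])
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def select_labels(ad_features, label_vectors, params, k=2):
    """Score each candidate label together with the ad features; keep the k highest."""
    scores = {
        label: predict(ad_features + vector, params)
        for label, vector in label_vectors.items()
    }
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical parameters: 4 inputs (2 ad features + 2 label dimensions), 2 hidden units.
params = {
    "w1": [[0.5, -0.2], [0.1, 0.3], [1.0, 0.0], [0.0, 1.0]],
    "b1": [0.0, 0.0],
    "w2": [[1.0], [-1.0]],
    "b2": [0.0],
}
label_vectors = {"t1": [2.0, 0.0], "t2": [0.0, 2.0], "t3": [1.0, 1.0]}
final_labels = select_labels([1.0, 0.5], label_vectors, params)
```

Each candidate crowd label is scored independently, which mirrors the description of inputting the candidates one by one and keeping those with the largest outputs.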
Step S1103: following the same principle as step S1102, using a DeepFM neural network model that takes the actual delivery effect as the standard to obtain the advertisement label candidate set;
in step S1103, the prediction target is: whether a certain advertisement combined with a certain crowd label can achieve a certain level of delivery effect, where the delivery effect comprises the CPM value, the click rate and the conversion rate.
In step S1103, training samples are constructed from the real advertisements in the database, their corresponding crowd labels and their final real delivery effects, and the delivery effect to be predicted is segmented; for example, the CPM value, the click rate and the conversion rate are each divided into 5 segments: extremely high, high, medium, low and extremely low, with the specific segment boundaries determined empirically by the personnel concerned.
An example training sample format is as follows:
real advertisement 1 budget, bidding strategy, layout, age group, … …, real crowd label T1, predicted value (advertisement CPM, segment value), predicted value (advertisement click rate, segment value), predicted value (advertisement conversion rate, segment value);
real advertisement 2 budget, bidding strategy, layout, age group, … …, irrelevant people label T2, predicted value (advertisement CPM, segment value), predicted value (advertisement click rate, segment value), predicted value (advertisement conversion rate, segment value);
…………
in step S1103, the effect of a certain crowd label on a certain advertisement is converted into a multi-label classification prediction problem with 3 prediction targets, namely CPM, click rate and conversion rate; each target has multiple classes (for example high, medium and low), and the corresponding model is trained so that it can classify the delivery results.
In step S1103, the model can be built and trained with a common deep learning framework, such as TensorFlow, PyTorch or Caffe, and the trained model is saved as a binary file;
in step S1103, in subsequent use, the model is called and the features of the advertisement and the candidate crowd labels are input one by one to obtain the model outputs, for example a low CPM value, a high click rate and a high conversion rate, and the crowd label that gives the best model output is selected as the final crowd label.
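The segmentation of delivery effects and the choice of the best crowd label in step S1103 can be sketched as follows. The boundary values, the equal weighting of the three targets and all names are assumptions; the embodiment leaves the segment boundaries to be set empirically and does not specify how the three outputs are combined.

```python
from bisect import bisect_right

SEGMENTS = ["extremely low", "low", "medium", "high", "extremely high"]


def to_segment(value, boundaries):
    """Map a raw metric to a segment index 0..4 using four ascending boundaries."""
    return bisect_right(boundaries, value)


def effect_score(predicted):
    """Assumed ranking rule: low CPM is good, high click and conversion rates are good."""
    return (4 - predicted["cpm"]) + predicted["ctr"] + predicted["cvr"]


def best_label(predictions):
    """Pick the crowd label whose predicted segment indices give the highest score."""
    return max(predictions, key=lambda label: effect_score(predictions[label]))


# Hypothetical CPM boundaries used when building training targets.
cpm_boundaries = [2.0, 5.0, 10.0, 20.0]
cpm_segment = SEGMENTS[to_segment(7.5, cpm_boundaries)]

# Hypothetical model outputs (segment indices) for two candidate crowd labels.
predictions = {
    "t1": {"cpm": 1, "ctr": 3, "cvr": 2},
    "t2": {"cpm": 3, "ctr": 3, "cvr": 2},
}
chosen = best_label(predictions)
```

Both labels have the same predicted click and conversion segments, so the label with the lower predicted CPM segment is chosen.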
In step S110 of the embodiment of the present invention, step S1101 is a statistical ranking method, which trains quickly and is easy to interpret, but gives a slightly poorer final effect; step S1102 is a simple neural network model that takes the experience of advertisement optimization personnel as the standard: its speed is in the middle of the three methods of step S110, its interpretability is worse than that of step S1101, but its training effect is better than that of step S1101; step S1103 is a DeepFM neural network model, which is the slowest of the three methods and slightly less interpretable than step S1101, but its training effect is better than that of the other two methods.
Step S112: the user inputs a newly built advertisement title, advertisement text and advertisement page; the text of the advertisement page is crawled by a crawler program and cleaned, where the crawler program can use common frameworks such as Scrapy, Beautiful Soup and Grab; the cleaning in step S112 specifically includes:
1) Removing HTML symbols and abnormal characters;
2) Removing general information such as copyright information, contact information and the like on the webpage;
3) Removing some common words such as motion, salt, recognition and the like according to a common word list provided by service personnel;
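The three cleaning operations of step S112 can be sketched with the standard library alone. The word list below is hypothetical (in practice it would be supplied by service personnel), and the character filter assumes the page text is English; a production crawler would more likely rely on one of the frameworks named above to extract the text.

```python
import html
import re

# Hypothetical common-word list; it also covers generic footer words here.
COMMON_WORDS = {"copyright", "contact", "subscribe"}


def clean_page_text(raw_html, common_words=COMMON_WORDS):
    """Strip markup and abnormal characters, then drop listed common words."""
    # Remove script and style blocks together with their contents.
    text = re.sub(r"(?is)<(script|style).*?>.*?</\1>", " ", raw_html)
    # Remove the remaining HTML tags.
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    # Decode entities such as &amp; before filtering characters.
    text = html.unescape(text)
    # Replace anything that is not a letter, digit or whitespace.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    words = [w for w in text.split() if w.lower() not in common_words]
    return " ".join(words)


page = (
    "<html><style>p{color:red}</style>"
    "<p>Buy &amp; save: <b>Red</b> shoes!</p>"
    "<footer>Copyright 2022</footer></html>"
)
cleaned = clean_page_text(page)
```

The cleaned string can then be spliced with the advertisement title and advertisement text as described in step S114.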
step S114: splicing the newly-built advertisement title, the advertisement text and the characters of the advertisement page to form a section of complete text;
step S116: taking the complete text as a parameter, and calling the advertisement text classifier in the step S106 to obtain new advertisement text classification information;
step S118: invoking the vectorization method of the step S104, and obtaining a new advertisement text vector according to the complete text;
step S120: the advertisements, the advertisement crowd labels, the advertisement text vectors and the advertisement texts obtained in the steps S104 and S106 are loaded into a memory in a classified mode to form an original advertisement label candidate set;
step S122: according to the new advertisement text classification information, screening advertisements consistent with the new advertisement text classification from the original advertisement label candidate set to form a new advertisement label candidate set;
step S124: according to the new advertisement text vector obtained in step S118, calculating cosine similarity with all advertisement text vectors in the new advertisement tag candidate set in pairs, screening out the top N most similar advertisements (N is determined by multiple experiments), forming a most similar advertisement set, and storing to a memory;
cosine similarity algorithm: the cosine of the angle between two vectors in a vector space is used as the measure of the difference between two individuals. The closer the cosine value is to 1, the closer the angle is to 0 and the more similar the two vectors are; the closer the cosine value is to 0, the closer the angle is to 90 degrees and the less similar the two vectors are.
Vector a = (x1, y1), vector b = (x2, y2)
$$a \cdot b = x_1 x_2 + y_1 y_2 = |a|\,|b| \cos\theta$$ (θ is the included angle of a, b)
The formula for calculating the cosine function in the triangle is as follows:
$$\cos\theta = \frac{|a|^2 + |b|^2 - |c|^2}{2\,|a|\,|b|}$$
in the rectangular coordinate system, the vector a is represented by coordinates (x1, y1), and the vector b is represented by coordinates (x2, y2). The lengths of vector a and vector b in rectangular coordinates are
$$|a| = \sqrt{x_1^2 + y_1^2}, \qquad |b| = \sqrt{x_2^2 + y_2^2}$$
The distance between vector a and vector b is denoted by vector c, which has a length in rectangular coordinates of
$$|c| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
Thus:
$$\cos\theta = \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}\,\sqrt{x_2^2 + y_2^2}}$$
as can be seen from the above, cosine similarity measures the similarity between individuals: the smaller the similarity, the larger the distance; the larger the similarity, the smaller the distance.
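The top N screening of step S124 can be sketched as follows, generalizing the two-dimensional formula to vectors of any length. The advertisement identifiers and vectors are made up for illustration, and returning 0.0 for an all-zero vector is an assumption made to avoid division by zero.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = a.b / (|a| |b|); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_similar(new_vector, candidates, n=2):
    """Rank candidate ads by cosine similarity to the new ad and keep the top n."""
    ranked = sorted(
        candidates,
        key=lambda ad_id: cosine_similarity(new_vector, candidates[ad_id]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical text vectors of ads in the new advertisement label candidate set.
candidates = {"ad_1": [1.0, 0.0], "ad_2": [1.0, 1.0], "ad_3": [0.0, 1.0]}
most_similar = top_n_similar([2.0, 0.1], candidates)
```

For a large candidate set, the pairwise loop would normally be replaced by a vectorized or approximate nearest-neighbour search, but the ranking criterion stays the same.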
Step S126: obtaining the top M crowd labels (M is determined by multiple experiments) of each advertisement from the most similar advertisement set, as described in step S1101;
in step S126, a common Web framework, such as Flask or Tornado, may be used to develop the corresponding interface programs;
step S128: acquiring the advertisements, crowd labels, basic advertisement information (budget, bidding strategy, layout and age range) and advertisement effect information (CPM value, click rate and conversion rate) in the most similar advertisement set, selecting one of the ranking algorithms of step S110 according to actual needs, and outputting the crowd labels for advertisement personnel to choose from; the specific inputs and outputs are as follows:
1) Inputting the advertisements and crowd labels in the most similar advertisement set, and selecting, through the calculation of step S1101, a number of crowd labels with the smallest CPM values as the selected crowd labels;
2) Inputting the advertisements, crowd labels and basic advertisement information in the most similar advertisement set, calling the model of step S1102 to obtain the model outputs (decimals between 0.0 and 1.0), and taking the crowd labels whose inputs produced the highest outputs as the final crowd labels;
3) Inputting the advertisements, crowd labels and basic advertisement information in the most similar advertisement set, calling the model of step S1103 to obtain the model outputs (segment values of CPM, click rate and conversion rate), and taking the crowd labels whose inputs produced the best outputs as the final crowd labels.
According to the selection method of the target crowd for online advertisement delivery, disclosed by the embodiment of the invention, the big data analysis is carried out by utilizing the rich data obtained in the historical online advertisement activities, the targeted algorithm is trained, the recommended crowd label result is automatically generated according to the newly-built user advertisement information, and the workload of research and analysis of advertisement optimization personnel is reduced.
The selection method of the target population for online advertisement delivery provided by the embodiment of the invention provides high-quality target population recommendation and reduces the trial and error cost of advertisement optimization personnel.
The method for selecting the target population for online advertisement delivery, provided by the embodiment of the invention, realizes the intelligent combination of the automatic selection of the target population and advertisement optimization personnel, and helps an advertiser to obtain a better advertisement delivery effect.
The method for selecting the target population for online advertisement delivery according to the embodiment of the present invention is described in detail above with reference to fig. 1 to 3, and the device for selecting the target population for online advertisement delivery according to the embodiment of the present invention is described in detail below with reference to fig. 4.
Fig. 4 is a schematic structural diagram illustrating a device for selecting a target group of online advertising according to an embodiment of the present invention. As shown in fig. 4, the device for selecting the target crowd for online advertisement delivery comprises:
the advertisement receiving module 1 is used for receiving newly-built user advertisement information and obtaining a new advertisement text according to the user advertisement information;
the text classification module 2 is used for calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
the advertisement screening module 3 is used for screening advertisements similar in classification from the original advertisement label candidate set to form a new advertisement label candidate set according to the new advertisement text classification information;
the cosine calculation module 4 is used for calculating cosine similarity of the obtained new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out a most similar advertisement set;
and the label selection module 5 is used for acquiring the crowd label of each advertisement from the most similar advertisement set and obtaining a final recommended crowd label for selection according to the label sorting rule.
In the embodiment of the present invention, optionally, as shown in fig. 4, the apparatus for selecting a target group for online advertisement delivery further includes:
the historical data module 6 is used for collecting crowd labels in historical advertisements, acquiring advertisement information data in the historical advertisements and preprocessing the advertisement information data;
the weight calculation module 7 is used for obtaining third-party corpus information related to the online advertisement field, and calculating a word weight TF-IDF value of the third-party corpus information as a word weight of each word of the advertisement text;
a text vector module 8, configured to obtain vector information of the advertisement text according to the corresponding vector of each word of the advertisement text and the word weight;
the text similarity module 9 is configured to calculate text similarity between the advertisement texts, and select a most similar advertisement text and a corresponding crowd label;
and the label training module 10 is used for training the crowd labels according to the deep learning model to obtain an advertisement label candidate set.
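How the weight calculation module 7 and the text vector module 8 might cooperate can be sketched as below. The description only states that the text vector is obtained from the word vectors and the word weights; the TF-IDF weighted average used here, and the rule of skipping unknown words, are assumptions for illustration.

```python
def text_vector(words, word_vectors, tfidf, dim):
    """TF-IDF weighted average of word vectors; unknown words are skipped."""
    total = [0.0] * dim
    weight_sum = 0.0
    for word in words:
        if word in word_vectors and word in tfidf:
            weight = tfidf[word]
            total = [t + weight * v for t, v in zip(total, word_vectors[word])]
            weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known word: return the zero vector
    return [t / weight_sum for t in total]


# Hypothetical word2vec vectors and TF-IDF weights.
word_vectors = {"red": [1.0, 0.0], "shoes": [0.0, 1.0]}
tfidf = {"red": 1.0, "shoes": 3.0}
vector = text_vector(["red", "shoes", "new"], word_vectors, tfidf, dim=2)
```

In this example the more distinctive word "shoes" carries three times the weight of "red", so the text vector is pulled toward the vector of "shoes".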
In this embodiment of the present invention, optionally, the tag training module 10 is further configured to obtain the crowd labels corresponding to the advertisement text and their CPM values, and select the crowd labels with the smallest CPM values as the advertisement label candidate set; or,
to manually screen the crowd labels and take the most relevant crowd labels as the advertisement label candidate set.
Optionally, in the embodiment of the present invention, the text classification module 2 is configured to acquire content information of the advertisement copy, and splice the content information to obtain first copy information;
acquire content information of the advertisement landing page, and splice the content information to obtain second copy information;
clean the website link information of the advertisement landing page, and take the acquired related information as third copy information;
translate the first, second and third copy information into English, and perform binary classification on the translated advertisement copy;
and call the BERT multi-class model to classify the valid advertisement copy at every level to obtain the required advertisement classification result.
The selection device of the target crowd for online advertisement delivery in the embodiment of the invention utilizes rich data obtained in historical online advertisement activities to analyze big data, trains a targeted algorithm, and automatically generates a recommended crowd label result according to newly built user advertisement information, thereby reducing the workload of research and analysis of advertisement optimization personnel.
The selection device for the target population for online advertisement delivery provided by the embodiment of the invention provides high-quality target population recommendation and reduces the trial and error cost of advertisement optimization personnel.
The selection device for the target population for online advertisement delivery, provided by the embodiment of the invention, realizes the intelligent combination of the automatic selection of the target population and advertisement optimization personnel, and helps an advertiser to obtain a better advertisement delivery effect.
In addition, an embodiment of the present invention further provides an electronic device, including a bus, a transceiver, a memory, a processor, and a computer program stored in the memory and operable on the processor, where the transceiver, the memory, and the processor are connected via the bus, and when the computer program is executed by the processor, the processes of the embodiment of the method for selecting a target group for delivering an online advertisement are implemented, and the same technical effects can be achieved, and are not described herein again to avoid repetition.
Specifically, referring to fig. 5, an embodiment of the present invention further provides an electronic device, which includes a bus 51, a processor 52, a transceiver 53, a bus interface 54, a memory 55, and a user interface 56.
In an embodiment of the present invention, the electronic device further includes: a computer program stored on the memory 55 and executable on the processor 52, the computer program when executed by the processor 52 performing the steps of:
receiving newly-built user advertisement information, and acquiring a new advertisement text according to the user advertisement information;
calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
screening advertisements similar to classification from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set;
calculating cosine similarity of the obtained new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out the most similar advertisement set;
and acquiring the crowd label of each advertisement from the most similar advertisement set, and obtaining a final recommended crowd label for selection according to a label sorting rule.
Optionally, the computer program when executed by the processor 52 may further implement the steps of:
collecting crowd labels in historical advertisements, acquiring advertisement information data in the historical advertisements, and preprocessing the advertisement information data;
obtaining third-party corpus information related to the field of online advertisements, and calculating a word weight TF-IDF value of the third-party corpus information as a word weight of each word of an advertisement text;
obtaining vector information of the advertisement text according to the corresponding vector of each word of the advertisement text and the word weight;
calling the advertisement text classifier, inputting the historical advertisement and obtaining classification information of the historical advertisement;
calculating the text similarity between the advertisement texts, and selecting the most similar advertisement texts and the corresponding crowd labels;
and training the crowd labels according to a deep learning model to obtain an advertisement label candidate set.
Optionally, the computer program when executed by the processor 52 may further implement the steps of:
acquiring the crowd labels corresponding to the advertisement text and their CPM values, and selecting the crowd labels with the smallest CPM values as the advertisement label candidate set; or,
manually screening the crowd labels, and taking the most relevant crowd labels as the advertisement label candidate set.
Optionally, the computer program when executed by the processor 52 may further implement the steps of:
acquiring content information of the advertisement copy, and splicing the content information to obtain first copy information;
acquiring content information of the advertisement landing page, and splicing the content information to obtain second copy information;
cleaning the website link information of the advertisement landing page, and taking the acquired related information as third copy information;
translating the first, second and third copy information into English, and performing binary classification on the translated advertisement copy;
and calling the BERT multi-class model to classify the valid advertisement copy at every level so as to obtain the required advertisement classification result.
A transceiver 53 for receiving and transmitting data under the control of the processor 52.
In FIG. 5, the bus architecture is represented by bus 51. Bus 51 may include any number of interconnected buses and bridges, and connects various circuits including one or more processors, represented by processor 52, and memory, represented by memory 55.
Bus 51 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an Accelerated Graphics Port (AGP), and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include: an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Processor 52 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be performed by integrated logic circuits in hardware or instructions in software in a processor. The processor described above includes: general purpose processors, Central Processing Units (CPUs), Network Processors (NPs), Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), Programmable Logic Arrays (PLAs), Micro Control Units (MCUs) or other programmable logic devices, discrete gates, transistor logic devices, and discrete hardware components. The various methods, steps and logic blocks disclosed in embodiments of the present invention may be implemented or performed. For example, the processor may be a single core processor or a multi-core processor, which may be integrated on a single chip or located on multiple different chips.
The processor 52 may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor. The software modules may be located in a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), a register, and other readable storage media known in the art. The readable storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
The bus 51 may also connect various other circuits such as peripherals, voltage regulators, or power management circuits to one another, and a bus interface 54 provides an interface between the bus 51 and the transceiver 53. These are well known in the art and are therefore not described further in the embodiments of the present invention.
The transceiver 53 may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other devices over a transmission medium. For example: the transceiver 53 receives external data from other devices, and the transceiver 53 is used to transmit data processed by the processor 52 to other devices. Depending on the nature of the computer system, a user interface 56 may also be provided, such as: touch screen, physical keyboard, display, mouse, speaker, microphone, trackball, joystick, stylus.
It should be appreciated that in embodiments of the present invention, memory 55 may further include memory located remotely from processor 52, and such remote memory may be connected to a server via a network. One or more portions of the above-described network may be an ad hoc network, an intranet, an extranet, a Virtual Private Network (VPN), a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Wide Area Network (WAN), a Wireless Wide Area Network (WWAN), a Metropolitan Area Network (MAN), the Internet, a Public Switched Telephone Network (PSTN), a Plain Old Telephone Service (POTS) network, a cellular telephone network, a Wireless Fidelity (Wi-Fi) network, or a combination of two or more of the above. For example, the cellular telephone network and the wireless network may be a Global System for Mobile Communications (GSM) system, a Code Division Multiple Access (CDMA) system, a Worldwide Interoperability for Microwave Access (WiMAX) system, a General Packet Radio Service (GPRS) system, a Wideband Code Division Multiple Access (WCDMA) system, a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, an LTE Time Division Duplex (TDD) system, a Long Term Evolution-Advanced (LTE-A) system, a Universal Mobile Telecommunications System (UMTS), an enhanced Mobile Broadband (eMBB) system, a massive Machine Type Communication (mMTC) system, an Ultra-Reliable Low-Latency Communication (URLLC) system, or the like.
It will be appreciated that the memory 55 in embodiments of the invention may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory includes: Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), or Flash Memory.
The volatile memory includes: Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as: Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced Synchronous DRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 55 of the electronic device described in the embodiments of the present invention includes, but is not limited to, the above and any other suitable types of memory.
In an embodiment of the present invention, memory 55 stores the following elements of operating system 551 and application programs 552: an executable module, a data structure, or a subset thereof, or an expanded set thereof.
Specifically, the operating system 551 includes various system programs, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and processing hardware-based tasks. The application programs 552 include various applications, such as a media player and a browser, for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in the application programs 552. The application programs 552 include: applets, objects, components, logic, data structures, and other computer-system-executable instructions that perform particular tasks or implement particular abstract data types.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements each process of the above-mentioned selection method for an online advertisement delivery target group, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here.
In particular, the computer program may, when executed by a processor, implement the steps of:
receiving newly-built user advertisement information, and acquiring a new advertisement text according to the user advertisement information;
calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
screening advertisements similar in classification from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set;
calculating cosine similarity of the obtained new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out the most similar advertisement set;
and acquiring the crowd label of each advertisement from the most similar advertisement set, and obtaining a final recommended crowd label for selection according to a label sorting rule.
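The screening and ranking steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes text vectors are plain lists of floats, that "similar in classification" means an exact category match, and that the label sorting rule is a frequency count over the most similar advertisements (the source does not specify the rule). The names `cosine_similarity`, `recommend_labels`, `top_k`, `max_labels` and the record fields `category`, `vector`, `labels` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_labels(new_vector, new_category, candidates, top_k=2, max_labels=3):
    """Recommend crowd labels for a new ad from a labelled candidate set."""
    # Step 1: keep only candidates whose classification matches the new ad
    # (assumption: an exact category match stands in for "similar in classification").
    same_class = [c for c in candidates if c["category"] == new_category]
    # Step 2: rank the remaining ads by cosine similarity and keep the top_k.
    ranked = sorted(
        same_class,
        key=lambda c: cosine_similarity(new_vector, c["vector"]),
        reverse=True,
    )
    most_similar = ranked[:top_k]
    # Step 3: assumed sorting rule: order labels by how often they occur
    # among the most similar ads.
    counts = Counter(label for ad in most_similar for label in ad["labels"])
    return [label for label, _ in counts.most_common(max_labels)]
```

Filtering by classification before computing similarities keeps the cosine comparison restricted to the smaller candidate set, which is the order the steps are listed in.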
Optionally, the computer program when executed by the processor may further implement the steps of:
collecting crowd labels in historical advertisements, acquiring advertisement information data in the historical advertisements, and preprocessing the advertisement information data;
obtaining third-party corpus information related to the field of online advertisements, and calculating a word weight TF-IDF value of the third-party corpus information as a word weight of each word of an advertisement text;
obtaining vector information of the advertisement text according to the corresponding vector of each word of the advertisement text and the word weight;
calling the advertisement text classifier, inputting the historical advertisement and obtaining classification information of the historical advertisement;
calculating the text similarity between the advertisement texts, and selecting the most similar advertisement texts and the corresponding crowd labels;
and training the crowd labels according to a deep learning model to obtain an advertisement label candidate set.
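One way to read the word-weight and text-vector steps above is as a TF-IDF weighted average of word vectors. The sketch below makes several assumptions not stated in the source: the inverse document frequency is computed from the tokenized third-party corpus with a smoothed formula, the term frequency comes from the advertisement text itself, and words without a vector are skipped. The function names `idf_weights` and `text_vector` and the parameter `default_idf` are hypothetical.

```python
import math
from collections import Counter


def idf_weights(corpus):
    """Smoothed inverse document frequency per word over a tokenized corpus."""
    n_docs = len(corpus)
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def text_vector(tokens, word_vectors, idf, default_idf=1.0):
    """TF-IDF weighted average of word vectors; None if no token has a vector."""
    tf = Counter(t for t in tokens if t in word_vectors)
    if not tf:
        return None
    total = sum(tf.values())
    dim = len(next(iter(word_vectors.values())))
    acc = [0.0] * dim
    weight_sum = 0.0
    for word, count in tf.items():
        # Weight = term frequency in this text * IDF from the reference corpus.
        weight = (count / total) * idf.get(word, default_idf)
        weight_sum += weight
        for i, value in enumerate(word_vectors[word]):
            acc[i] += weight * value
    # Normalize so the result is a weighted average rather than a weighted sum.
    return [x / weight_sum for x in acc]
```

Words that are rare in the reference corpus receive a larger weight, so they pull the text vector further toward their own word vector than common words do.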
Optionally, the computer program when executed by the processor may further implement the steps of:
acquiring a crowd label corresponding to the advertisement text and a CPM value of the crowd label, and selecting the crowd label with the minimum CPM value as the advertisement label candidate set; or
manually screening the crowd labels, and taking the most relevant crowd labels as the advertisement label candidate set.
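The CPM-based option can be illustrated with a short sketch. CPM is the cost per thousand impressions, so a lower value means cheaper delivery for that crowd label. The function name `select_labels_by_cpm`, the `keep` parameter, and the dictionary layout are hypothetical; the source only states that the label with the minimum CPM is selected.

```python
def select_labels_by_cpm(label_cpm, keep=1):
    """Return the `keep` crowd labels with the lowest CPM, cheapest first."""
    # label_cpm maps each crowd label to its observed CPM value.
    ranked = sorted(label_cpm.items(), key=lambda item: item[1])
    return [label for label, _ in ranked[:keep]]
```

With `keep=1` this reduces to picking the single minimum-CPM label; a larger `keep` is an assumed generalization for building a candidate set with more than one label.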
Optionally, the computer program when executed by the processor may further implement the steps of:
acquiring content information of the advertisement copy, and splicing the content information to serve as first copy information;
acquiring content information of the advertisement landing page, and splicing the content information to serve as second copy information;
cleaning the website link information of the advertisement landing page, and taking the related information thus obtained as third copy information;
translating the first, second and third copy information into English, and performing binary classification prediction on the translated advertisement copy;
and calling the BERT multi-class classification model to classify the valid advertisement copy at every level to obtain the required advertisement classification result.
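The text-preparation part of the classification steps above can be sketched as follows. This is an illustration under stated assumptions: the three pieces of text (ad content, landing-page content, cleaned link information) are joined into one string before translation, link cleaning means keeping word-like tokens from the host and path, and the translator, the binary validity filter and the multi-class classifier are passed in as callables because the source does not describe their interfaces. All function and parameter names here (`splice`, `clean_url`, `classify_ad`, `translate`, `is_valid`, `classify`) are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed list of uninformative URL tokens; not taken from the source.
URL_STOP_TOKENS = {"www", "com", "net", "org", "html", "htm", "php"}


def splice(fields):
    """Join the non-empty text fields of an ad or landing page into one string."""
    return " ".join(f.strip() for f in fields if f and f.strip())


def clean_url(url):
    """Keep word-like tokens from the host and path of a landing-page URL."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc} {parsed.path}"
    tokens = re.findall(r"[A-Za-z]{2,}", raw)
    return " ".join(t.lower() for t in tokens if t.lower() not in URL_STOP_TOKENS)


def classify_ad(ad_fields, page_fields, url, translate, is_valid, classify):
    """Build the combined text, translate it, filter it, then classify it."""
    combined = splice([splice(ad_fields), splice(page_fields), clean_url(url)])
    english = translate(combined)   # hypothetical translation callable
    if not is_valid(english):       # hypothetical binary (valid/invalid) filter
        return None
    return classify(english)        # hypothetical multi-class classifier
```

Passing the models in as callables keeps the sketch runnable without any particular translation service or BERT library; in practice each callable would wrap whatever model the system actually uses.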
The computer-readable storage medium includes permanent and non-permanent, removable and non-removable media, and may be a tangible device that retains or stores instructions for use by an instruction execution device. The computer-readable storage medium includes: electronic memory devices, magnetic memory devices, optical memory devices, electromagnetic memory devices, semiconductor memory devices, and any suitable combination of the foregoing. The computer-readable storage medium includes: Phase-change Random Access Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Non-Volatile Random Access Memory (NVRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, memory sticks, mechanically encoded devices (e.g., punched cards or raised structures in a groove having instructions recorded thereon), or any other non-transmission medium that can store information accessible by a computing device. As defined in embodiments of the present invention, the computer-readable storage medium does not include transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses traveling through a fiber optic cable), or electrical signals transmitted through a wire.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed in this specification can be implemented as electronic hardware, computer software, or combinations of both, and that the elements and steps of the examples have been described above generally in terms of their functionality in order to illustrate the interchangeability of hardware and software. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer program instructions. The computer program instructions comprise: assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, integrated circuit configuration data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and procedural programming languages such as C or similar programming languages.
When the computer program instructions are loaded and executed on a computer, which may be a general-purpose computer, a special-purpose computer, a network of computers, or other programmable apparatus, all or a portion of the procedures or functions described in the embodiments of the invention are performed. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via a wired (e.g., coaxial cable, twisted pair, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave) link. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, magnetic tape), an optical medium (e.g., optical disk), or a semiconductor medium (e.g., Solid State Drive (SSD)), among others. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing embodiments of the method of the present invention, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, electronic device and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is merely a logical division, and other divisions are possible in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to solve the problem to be solved by the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (including a personal computer, a server, a data center or other network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The storage medium includes the various media listed above that can store program code.
The above description is only a specific implementation of the embodiments of the present invention, but the scope of the embodiments of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present invention, and all such changes or substitutions should be covered by the scope of the embodiments of the present invention. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for selecting target population for online advertisement delivery, comprising:
receiving newly-built user advertisement information, and acquiring a new advertisement text according to the user advertisement information;
calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
screening advertisements similar in classification from the original advertisement label candidate set according to the new advertisement text classification information to form a new advertisement label candidate set;
calculating cosine similarity of the obtained new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set, and screening out the most similar advertisement set;
and acquiring the crowd label of each advertisement from the most similar advertisement set, and obtaining a final recommended crowd label for selection according to a label sorting rule.
2. The method of claim 1, further comprising:
collecting crowd labels in historical advertisements, acquiring advertisement information data in the historical advertisements, and preprocessing the advertisement information data;
obtaining third-party corpus information related to the field of online advertisements, and calculating a word weight TF-IDF value of the third-party corpus information as a word weight of each word of an advertisement text;
obtaining vector information of the advertisement text according to the corresponding vector of each word of the advertisement text and the word weight;
calling the advertisement text classifier, inputting the historical advertisement and obtaining classification information of the historical advertisement;
calculating the text similarity between the advertisement texts, and selecting the most similar advertisement texts and the corresponding crowd labels;
and training the crowd labels according to a deep learning model to obtain an advertisement label candidate set.
3. The method of claim 2, wherein the step of training the population labels according to a deep learning model to obtain a candidate set of advertisement labels comprises:
acquiring a crowd label corresponding to the advertisement text and a CPM value of the crowd label, and selecting the crowd label with the minimum CPM value as the advertisement label candidate set; or
manually screening the crowd labels, and taking the most relevant crowd labels as the advertisement label candidate set.
4. The method of claim 1, wherein the step of invoking an advertisement text classifier to classify the new advertisement text and obtaining new advertisement text classification information comprises:
acquiring content information of the advertisement copy, and splicing the content information to serve as first copy information;
acquiring content information of the advertisement landing page, and splicing the content information to serve as second copy information;
cleaning the website link information of the advertisement landing page, and taking the related information thus obtained as third copy information;
translating the first, second and third copy information into English, and performing binary classification prediction on the translated advertisement copy;
and calling the BERT multi-class classification model to classify the valid advertisement copy at every level, so as to obtain the required advertisement classification result.
5. A selection device for target population for online advertisement delivery, comprising:
the advertisement receiving module is used for receiving the newly-built user advertisement information and obtaining a new advertisement text according to the user advertisement information;
the text classification module is used for calling an advertisement text classifier to classify the new advertisement text to obtain new advertisement text classification information;
the advertisement screening module is used for screening advertisements similar in classification from the original advertisement label candidate set to form a new advertisement label candidate set according to the new advertisement text classification information;
the cosine calculation module is used for calculating cosine similarity of the obtained new advertisement text vector and the advertisement text vectors in the new advertisement label candidate set and screening out the most similar advertisement set;
and the label selection module is used for acquiring the crowd label of each advertisement from the most similar advertisement set and obtaining a final recommended crowd label for selection according to a label sorting rule.
6. The apparatus of claim 5, further comprising:
the historical data module is used for collecting crowd labels in historical advertisements, acquiring advertisement information data in the historical advertisements and preprocessing the advertisement information data;
the weight calculation module is used for acquiring third-party corpus information related to the field of online advertisements, and calculating a word weight TF-IDF value of the third-party corpus information as the word weight of each word of the advertisement text;
the text vector module is used for obtaining vector information of the advertisement text according to the corresponding vector of each word of the advertisement text and the word weight;
the text similarity module is used for calculating the text similarity between the advertisement texts and selecting the most similar advertisement texts and the corresponding crowd labels;
and the label training module is used for training the crowd labels according to the deep learning model to obtain an advertisement label candidate set.
7. The apparatus of claim 6, wherein:
the label training module is further used for acquiring the crowd labels corresponding to the advertisement texts and the CPM values thereof, and selecting the advertisement text with the minimum CPM value as the advertisement label candidate set; or
manually screening the crowd labels, and taking the most relevant crowd labels as the advertisement label candidate set.
8. The apparatus of claim 5, wherein the text classification module is configured for: acquiring content information of the advertisement copy, and splicing the content information to serve as first copy information;
acquiring content information of the advertisement landing page, and splicing the content information to serve as second copy information;
cleaning the website link information of the advertisement landing page, and taking the related information thus obtained as third copy information;
translating the first, second and third copy information into English, and performing binary classification prediction on the translated advertisement copy;
and calling the BERT multi-class classification model to classify the valid advertisement copy at every level to obtain the required advertisement classification result.
9. An electronic device comprising a bus, a transceiver, a memory, a processor and a computer program stored on the memory and executable on the processor, the transceiver, the memory and the processor being connected via the bus, characterized in that the computer program, when executed by the processor, implements the steps in the method for selecting a target demographic for online advertising according to any of claims 1 to 4.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for selecting a target population for online advertising according to any one of claims 1 to 4.
CN202211013416.8A 2022-08-23 2022-08-23 Method and device for selecting target population for online advertisement delivery and electronic equipment Pending CN115375361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211013416.8A CN115375361A (en) 2022-08-23 2022-08-23 Method and device for selecting target population for online advertisement delivery and electronic equipment

Publications (1)

Publication Number Publication Date
CN115375361A true CN115375361A (en) 2022-11-22

Family

ID=84068300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211013416.8A Pending CN115375361A (en) 2022-08-23 2022-08-23 Method and device for selecting target population for online advertisement delivery and electronic equipment

Country Status (1)

Country Link
CN (1) CN115375361A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422508A (en) * 2023-10-24 2024-01-19 上海网萌网络科技有限公司 Intelligent delivery analysis system and method based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination