CN110706021A - Advertisement putting method and system - Google Patents


Info

Publication number
CN110706021A
Authority
CN
China
Prior art keywords
word
advertisement
text
concept
search text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910864507.4A
Other languages
Chinese (zh)
Inventor
孙兴帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimeng Chuangke Network Technology China Co Ltd
Original Assignee
Weimeng Chuangke Network Technology China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimeng Chuangke Network Technology China Co Ltd filed Critical Weimeng Chuangke Network Technology China Co Ltd
Priority to CN201910864507.4A priority Critical patent/CN110706021A/en
Publication of CN110706021A publication Critical patent/CN110706021A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0255Targeted advertisements based on user history
    • G06Q30/0256User search
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0277Online advertisement

Abstract

An embodiment of the invention provides an advertisement delivery method and system. The method comprises: determining a weight value for each word in a user's search text; determining the relevance between the search text and advertisement commercial concepts according to the weight value of each word in the search text and the predetermined index words, and index-word weight values, of the advertisement commercial concepts corresponding to each advertisement industry; and determining the advertisement industry corresponding to the search text according to that relevance and delivering the advertisement. The technical scheme delivers targeted advertisements based on the search text and a comprehensive, accurate concept model, thereby achieving accurate advertisement delivery.

Description

Advertisement putting method and system
Technical Field
The invention belongs to the field of internet advertisements, and particularly relates to an advertisement putting method and an advertisement putting system.
Background
With the development of the mobile internet, user growth has slowed and the traffic dividend is gradually reaching a bottleneck. How to refine operations and further improve traffic utilization is a problem faced by every internet company. In a typical microblog search scenario, search results are strongly correlated with the search term. If the correlation between the search term and advertisements is also taken into account when placing advertisements on the search results page, both the effectiveness of search-stream advertisements and the experience of the search scenario can be preserved. Associating a search term with advertisements requires understanding the content of the search text, which is a research topic in natural language understanding.
Classifying search terms into the corresponding advertisement categories is generally studied as a multi-class text classification problem. This is a supervised learning task: a large amount of labeled training data is needed first, and a text classification model is then trained on that data. The training data consists of users' search texts and the corresponding advertising industry category labels; for example, the search text "heavy fire on tv drama month" corresponds to the advertising industry category "culture and entertainment". The traditional text classification method extracts N-gram features from the search terms and feeds them into a classifier such as a support vector machine (SVM). With the development of deep learning, text classification methods have continued to evolve: neural network methods represented by convolutional neural networks (CNN), recurrent neural networks (RNN), and the attention mechanism can extract features automatically, enabling end-to-end learning. The model that currently performs best on public text classification datasets is BERT, proposed by Google in 2018, which uses the Transformer as its underlying architecture. BERT is trained in two stages: initial model parameters are obtained by unsupervised pre-training on large amounts of public data such as Wikipedia, and the model is then fine-tuned on a specific downstream task such as text classification. It achieves 94.9% accuracy on text classification tasks such as SST-2 (Stanford Sentiment Treebank) and is currently the best (state-of-the-art) scheme.
However, prior-art approaches to matching the characteristics of search text for advertisement delivery still suffer from problems such as limited model accuracy and incomplete coverage.
Disclosure of Invention
The embodiments of the invention provide an advertisement delivery method and system that deliver targeted advertisements according to the search text and a comprehensive and accurate concept model.
In order to achieve the above object, in one aspect, an embodiment of the present invention provides an advertisement delivery method, where the method includes:
determining the weight value of each word in the user search text;
determining the correlation degree of the search text and the advertisement commercial concepts according to the weight value of each word in the search text, the predetermined index words of the advertisement commercial concepts corresponding to the advertisement industry, and the weight values of those index words;
and determining the advertisement industry corresponding to the search text according to the correlation degree of the search text and the advertisement commercial concept and putting the advertisement.
In another aspect, an embodiment of the present invention provides an advertisement delivery system, where the system includes:
the search text information determining unit is used for determining the weight value of each word in the search text of the user;
the relevancy determining unit is used for determining the relevancy between the search text and the advertisement commercial concept according to the weight value of each word in the search text, the predetermined index word of the advertisement commercial concept corresponding to the advertisement industry and the predetermined weight value of the index word;
and the advertisement putting unit is used for determining the advertisement industry corresponding to the search text according to the correlation degree of the search text and the advertisement business concept and putting the advertisement.
The technical scheme has the following beneficial effects:
The technical scheme of the invention delivers targeted advertisements according to the search text and a comprehensive and accurate concept model. A complete knowledge graph of the advertisement industry is constructed, the text is associated with advertisement industries by calculating the relevance between the text and the concepts under each advertisement industry, and advertisements of the corresponding industry are delivered in a targeted manner, which makes advertisement delivery more accurate and interpretable. The semantic classification model requires no data labeling, does not need to account for the imbalanced distribution of data across industry categories, achieves high semantic calculation accuracy, and greatly reduces manual workload.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of a method for advertisement delivery according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the effectiveness of advertisement delivery according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an advertisement delivery system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Data analysis of microblog search terms shows that they have the following distinctive characteristics:
1. the search text is short, and its distribution across industry categories is uneven;
2. the search terms cover a very wide range, so data annotation is costly;
3. strong timeliness, burstiness, poor predictability, and poor reusability. The top microblog search terms differ from day to day: a search term or group of search terms appears suddenly, its traffic surges within a short time, and it then falls back or disappears within 0.5 to 1.5 days. The effective lifetime of a single hot event is about 1.3 days, so search terms are poorly reusable. Unlike search-engine queries, microblog searches show no regular patterns such as seasonality (that is, a strategy based on search terms from the same period in previous years cannot be used), and microblog search terms and hot events are hot-spot driven, bursty, one-off, and hard to predict.
These three characteristics make it difficult to construct training samples, and therefore difficult to train a high-accuracy text classification model with BERT or other supervised learning methods. Moreover, besides classifying search terms into advertising industry categories, we also want fine-grained hierarchical relations among advertising concepts. For example, the search term "Estée Lauder small brown bottle" can be classified into the "beauty and skin care" industry, but we also want to know that it belongs to the subcategories "cosmetics" and "facial cosmetics", and that the brand "Estée Lauder" is a child node of "cosmetic brands". In addition, the spokesperson for the "Estée Lauder small brown bottle" is Chen Kun. Such semantic relations among complex concepts help improve advertising effectiveness and make advertisement delivery interpretable, and they cannot be obtained by text classification alone.
Therefore, the project needs to solve the following technical problems:
1. improving model accuracy while eliminating or reducing the amount of data annotation required;
2. obtaining the semantic relations among concepts and constructing a semantic relation network;
3. covering a wide range of knowledge, since search terms span a wide range and the model should cover all of them.
The knowledge graph is a technology proposed by Google in 2012. It was originally used to optimize the search engine and, as the technology developed, has been widely applied in search, recommendation, question answering, and other fields. To improve the delivery effect of search advertisements, a knowledge graph covering the various commercial advertising industries needs to be constructed, and public data sources are the preferred basis for building such a domain knowledge graph. As the largest encyclopedic knowledge base in the world, Wikipedia offers wide knowledge coverage, a highly structured concept hierarchy, and fast knowledge updates. From Wikipedia's category system, a hierarchical tree structure among concepts can be obtained, from which the semantic relations among concepts can be analyzed. For example, "cosmetics" has child nodes such as "eye cosmetics", "lip cosmetics", "cosmetic companies", and "cosmetic brands", and "cosmetic companies" contains leaf nodes such as Shiseido and Unilever. Knowledge is extracted from Wikipedia and the associations among semantic knowledge are integrated to form a complete semantic knowledge graph.
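The hierarchical tree described above can be pictured with a small sketch. It is illustrative only: the `CHILDREN` mapping and the `path_to` helper are hypothetical names, the tree holds just a few of the nodes mentioned in the text, and placing "cosmetics" under a "beauty and skin care" root is an assumption made for the example.

```python
# Hypothetical fragment of a category tree: parent -> list of children.
CHILDREN = {
    "beauty and skin care": ["cosmetics"],
    "cosmetics": [
        "eye cosmetics",
        "lip cosmetics",
        "cosmetic companies",
        "cosmetic brands",
    ],
}


def path_to(node, target, children=CHILDREN):
    """Depth-first search; return the path from node to target, or None."""
    if node == target:
        return [node]
    for child in children.get(node, []):
        sub = path_to(child, target, children)
        if sub is not None:
            return [node] + sub
    return None


# The path from an industry root down to a concept is what makes a
# classification explainable: each hop is a parent/child relation.
print(path_to("beauty and skin care", "cosmetic brands"))
```

Walking from an industry root to a concept in this way yields the chain of parent and child relations that a flat classifier would not expose.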
Fig. 1 is a flowchart of an advertisement delivery method according to an embodiment of the present invention, and Fig. 2 shows its delivery effect. The method includes:
s101, determining the weight value of each word in a user search text;
preferably, the weight value of each word in the search text is the word frequency-inverse text frequency TF-IDF value of each word in the search text.
A search text is obtained and segmented into words. For each word in the search text, its word frequency-inverse text frequency (TF-IDF) value is calculated and used as the weight value of the word. A set number of words with the highest weight values are then taken to form the word vector $T = \{w_i\}$ of the search text.
For example, given the search text "a visitor mistook an Apple phone for an apple and fed it to a brown bear", word segmentation and keyword extraction are performed on the text, and its relevance to Wikipedia concepts is then calculated. The five most relevant Wikipedia concept entries obtained are:
apple / Steve Jobs / Apple Music / brown bear.
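Step S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `query_weights`, the English stand-in tokens, and the IDF table are all hypothetical, and a real system would first segment the Chinese search text with a word segmenter and draw IDF values from a background corpus.

```python
from collections import Counter


def query_weights(tokens, idf, top_n=5, default_idf=1.0):
    """Weight each word of a segmented search text by TF-IDF, keep the top_n."""
    counts = Counter(tokens)
    total = sum(counts.values())
    weights = {
        word: (count / total) * idf.get(word, default_idf)
        for word, count in counts.items()
    }
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


# Hypothetical segmented search text and hypothetical IDF values.
tokens = ["visitor", "apple", "phone", "apple", "brown bear"]
idf = {"visitor": 0.5, "apple": 2.0, "phone": 1.0, "brown bear": 3.0}

T = query_weights(tokens, idf, top_n=3)
print(T)
```

The returned dictionary plays the role of the word vector T: its keys are the retained words and its values are their weights in the search text.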
S102, determining the correlation degree of the search text and the advertisement commercial concept according to the weight value of each word in the search text, the index word of the advertisement commercial concept corresponding to the predetermined advertisement industry and the weight value of the index word;
On February 20, 2019, we downloaded the XML data of the Chinese Wikipedia and extracted the concepts of 35 commercial advertising industries together with the semantic relations among those concepts. The sub-tree structure of the "cosmetics" concept node under the "beauty and skin care" industry is shown below, from which the semantic relations among concepts can be seen:
[Figure: sub-tree of the "cosmetics" concept node under the "beauty and skin care" industry]
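One way to recover parent and child category relations from Wikipedia XML data is sketched below. It rests on several assumptions that the source does not state: that the data follows the MediaWiki export schema with the `export-0.10` namespace, that category membership is written as `[[Category:...]]` links (or the Chinese-language prefixes shown in the pattern) in the page text, and that the tiny `SAMPLE` document is representative. The names `parent_categories`, `CATEGORY_LINK`, and `SAMPLE` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace of the MediaWiki XML export format.
NS = {"mw": "http://www.mediawiki.org/xml/export-0.10/"}
# Assumed category-link syntax; captures the name up to "]" or "|".
CATEGORY_LINK = re.compile(r"\[\[(?:Category|分类|分類):([^\]|]+)")

# Hypothetical, heavily simplified sample of an export file.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Category:Lip cosmetics</title>
    <revision><text>Lipsticks and glosses. [[Category:Cosmetics]]</text></revision>
  </page>
  <page>
    <title>Category:Cosmetics</title>
    <revision><text>[[Category:Beauty and skin care|*]]</text></revision>
  </page>
</mediawiki>"""


def parent_categories(xml_text):
    """Map each page title to the categories its text links to."""
    root = ET.fromstring(xml_text)
    parents = {}
    for page in root.findall("mw:page", NS):
        title = page.findtext("mw:title", default="", namespaces=NS)
        text = page.findtext("mw:revision/mw:text", default="", namespaces=NS)
        parents[title] = [name.strip() for name in CATEGORY_LINK.findall(text)]
    return parents


parents = parent_categories(SAMPLE)
print(parents)
```

A full export is far too large to load with `ET.fromstring`; a streaming parser such as `ET.iterparse` would be the more practical choice at that scale.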
preferably, the index word of the advertisement business concept corresponding to the advertisement industry and the weight value of the index word are determined by the following method:
acquiring extensible markup language (XML) data of Wikipedia, extracting advertisement commercial concepts corresponding to the advertisement industry and semantic relations among the concepts, and extracting text contents of the concepts; performing word segmentation and stop word removal on the text content of the concept; calculating TF-IDF values of the processed words relative to the concept, and sorting the processed words from high to low according to the TF-IDF values; selecting a set number of words with the highest TF-IDF value as index words of corresponding concepts, and using the TF-IDF value corresponding to the index words as the weight value of the index words.
For example, the top 10 index words corresponding to the two texts "eau de toilette" and "floral water" are as follows:
eau de toilette: perfume/cologne/alcohol/perfume oil/elizabeth/geranium/fragrance/content/ethanol;
floral water: floral water/anophelifuge/floral water/alcohol/daya/stomach lavage/perfume/rose/liniment/antipruritic.
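The index-building procedure above can be sketched as follows, treating each concept's text as one document. This is an assumption-laden illustration: `build_index`, `STOP_WORDS`, and the token lists are hypothetical, the texts are assumed to be segmented already, and the IDF here is computed across the concept texts supplied to the function.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "of", "is", "and"}  # hypothetical stop-word list


def build_index(concept_tokens, top_n=10):
    """Return {concept: {index_word: tf-idf weight}} with top_n words each."""
    # Remove stop words from every concept text.
    docs = {
        concept: [t for t in tokens if t not in STOP_WORDS]
        for concept, tokens in concept_tokens.items()
    }
    n_docs = len(docs)
    # Document frequency: in how many concept texts each word occurs.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    index = {}
    for concept, tokens in docs.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        scores = {
            word: (n / total) * math.log(n_docs / df[word])
            for word, n in counts.items()
        }
        # Sort from high to low TF-IDF and keep the top_n index words.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        index[concept] = dict(ranked[:top_n])
    return index


# Hypothetical, already-segmented concept texts.
concept_tokens = {
    "eau de toilette": ["perfume", "perfume", "cologne", "alcohol"],
    "floral water": ["floral water", "mosquito", "alcohol", "floral water"],
}
index = build_index(concept_tokens, top_n=2)
print(index)
```

In this toy input "alcohol" occurs in both texts, so its IDF is zero and it drops out of both index lists, which is the intended effect of weighting by inverse text frequency.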
Preferably, the TF-IDF value of the word relative to the concept is calculated by:
$$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i,j}$$
where the word frequency is
$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$
Here $n_{i,j}$ represents the number of occurrences of word $i$ in text $j$; $\sum_k n_{k,j}$ represents the sum of the occurrence counts of all words in text $j$, and $k$ ranges over the words in text $j$.
The inverse text frequency is
$$\mathrm{idf}_{i,j} = \log\frac{|D|}{|\{j : t_i \in d_j\}|}$$
where $|D|$ represents the total number of texts corresponding to the concept, $|\{j : t_i \in d_j\}|$ represents the number of texts $d_j$ containing word $i$ among all texts corresponding to the concept, and $t_i$ represents the word identical to word $i$ in those texts. If the word does not appear in the data, the denominator is zero, so $1 + |\{j : t_i \in d_j\}|$ is typically used.
All texts corresponding to a concept means all Wikipedia texts under that concept, including text $j$. For example, under the concept "cosmetics" there is a text describing Estée Lauder as well as texts describing Shiseido and Lancôme, all of which fall under the general category of cosmetics, and their total number is $|D|$. Assuming there are only these three texts, labeled $d_1$, $d_2$, and $d_3$, then $|D| = 3$. If "small black bottle" appears in the Lancôme text but not in the Estée Lauder or Shiseido texts, then the denominator in the formula is 1 when calculating the idf value of "small black bottle", because only $d_3$ contains it. Here the word $t_i$ is "small black bottle", the condition $t_i \in d_j$ expresses whether the word $t_i$ appears in text $d_j$, and only one text satisfies it. $i$ denotes the word "small black bottle" in text $j$, $t_i$ denotes the same word as word $i$ in all texts corresponding to the concept, and $d_j$ denotes the $j$-th text.
The tf value of "small black bottle" is the number of occurrences of that word in the Lancôme article divided by the total number of occurrences of all words in the Lancôme article.
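The three-text example can be checked numerically with a short sketch. The token lists `d1`, `d2`, `d3` are hypothetical stand-ins for the three cosmetics texts, the natural logarithm is an assumption (the source does not state the base), and `tf` and `idf` are illustrative helper names.

```python
import math


def tf(word, doc):
    """Occurrences of word in doc divided by the total number of words in doc."""
    return doc.count(word) / len(doc)


def idf(word, corpus, smooth=False):
    """log(|D| / number of texts containing word).

    With smooth=True, 1 is added to the denominator so that a word found
    in no text does not cause a division by zero.
    """
    containing = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / (containing + (1 if smooth else 0)))


# Hypothetical segmented texts; only d3 mentions "small black bottle".
d1 = ["night", "serum", "repair"]
d2 = ["lotion", "sunscreen", "serum"]
d3 = ["small black bottle", "serum", "small black bottle", "essence"]
corpus = [d1, d2, d3]

word = "small black bottle"
score = tf(word, d3) * idf(word, corpus)  # (2 / 4) * log(3 / 1)
print(score)

# A word absent from every text needs the smoothed denominator.
print(idf("toner", corpus, smooth=True))  # log(3 / (0 + 1))
```

Note that the smoothed form changes every IDF value, not only the zero-count case, so a system should use one form consistently for both index words and search text.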
Preferably, determining the correlation between the search text and the advertisement commercial concept according to the weight value of each word in the search text, the words included in the advertisement commercial concept corresponding to a predetermined advertisement industry, and the weight values of those words includes:
determining a relevance R of the search text to an advertising business concept by:
$$R = \sum_{w_i \in T} v_i \cdot k_j$$
where $T$ represents the set of words in the search text;
$w_i$ represents the $i$-th word in the search text;
$v_i$ represents the weight value of word $w_i$ in the search text;
$k_j$ represents the weight value, in the advertisement commercial concept, of the word identical to $w_i$.
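The relevance computation can be sketched in a few lines, assuming R is the sum, over the words shared by the search text and the concept, of the product of the two weights (an assumption consistent with the symbol definitions above). The function name `relevance` and all weight values are hypothetical.

```python
def relevance(query_weights, concept_index):
    """Sum of v_i * k_j over words present in both the query and the concept."""
    return sum(
        v * concept_index[word]
        for word, v in query_weights.items()
        if word in concept_index
    )


# Hypothetical weights: v_i for the search text, k_j for the concept index.
query_weights = {"floral water": 0.6, "mosquito": 0.3, "useful": 0.1}
concept_index = {"floral water": 0.35, "mosquito": 0.17, "alcohol": 0.05}

R = relevance(query_weights, concept_index)
print(R)  # 0.6 * 0.35 + 0.3 * 0.17
```

Words that appear on only one side contribute nothing, so the score depends solely on the overlap between the search text and the concept's index words.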
S103, determining the advertisement industry corresponding to the search text according to the correlation degree of the search text and the advertisement commercial concept and putting the advertisement.
A microblog user enters the content to be searched in the search box of the search page, and the semantic analysis service calculates the relevance between the text and the advertisement commercial concepts through the steps above, in order to determine whether the search term belongs to a given advertisement industry. The following are examples of search terms and the advertisement industries to which they are assigned:
"heavy fire on tv drama month": culture and entertainment;
"leaving to study whether the people are high school or high school": education and training;
"Nanyue weaving cloth": clothing, bags and suitcases;
"is useful for the mosquito to bite the floral water": beauty treatment;
"wide credit card": finance;
"Beijing Opendum pushing garbage forced classification": government enterprise.
If the search term belongs to a given advertisement industry, the system delivers advertisements of that industry on the user's search results page. Fig. 2 shows the delivery effect of an education and training advertisement for the study-abroad search term listed above.
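Step S103 can be sketched as picking the industry of the best-scoring concept. Several points here are assumptions rather than statements from the source: that the highest-relevance concept decides the industry, that a threshold separates "belongs to an industry" from "no advertisement", and that relevance is the sum of weight products. The names `pick_industry`, `concept_index`, and `concept_industry`, and all numbers, are hypothetical.

```python
def pick_industry(query, concept_index, concept_industry, threshold=0.0):
    """Return (industry, score) of the best-matching concept.

    Returns (None, threshold) when no concept scores above the threshold,
    in which case no industry advertisement would be delivered.
    """
    best_industry, best_score = None, threshold
    for concept, index_words in concept_index.items():
        score = sum(v * index_words.get(w, 0.0) for w, v in query.items())
        if score > best_score:
            best_industry, best_score = concept_industry[concept], score
    return best_industry, best_score


# Hypothetical concept indexes and concept-to-industry mapping.
concept_index = {
    "floral water": {"floral water": 0.35, "mosquito": 0.17},
    "credit card": {"credit card": 0.40, "bank": 0.20},
}
concept_industry = {"floral water": "beauty treatment", "credit card": "finance"}

query = {"floral water": 0.6, "mosquito": 0.3, "useful": 0.1}
industry, score = pick_industry(query, concept_index, concept_industry)
print(industry, score)
```

Because the winning concept is known, the system can also report which concept and which shared index words produced the match, which is the interpretability the description refers to.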
Corresponding to the above method, Fig. 3 is a schematic diagram of an advertisement delivery system according to an embodiment of the present invention. The system includes:
a search text information determining unit 21 for determining a weight value of each word in the user search text;
a relevancy determining unit 22, configured to determine the relevance between the search text and an advertisement commercial concept according to the weight value of each word in the search text, the predetermined index words of the advertisement commercial concept corresponding to the advertisement industry, and the weight values of those index words;
and the advertisement delivery unit 23 is configured to determine an advertisement industry corresponding to the search text according to the correlation between the search text and the advertisement business concept, and deliver the advertisement.
Preferably, in the search text information determination unit, the weight value of each word in the search text is a word frequency-inverse text frequency TF-IDF value of each word in the search text.
Preferably, the system further comprises an index word weight value determination unit, configured to:
acquiring extensible markup language (XML) data of Wikipedia, extracting advertisement commercial concepts corresponding to the advertisement industry and semantic relations among the concepts, and extracting text contents of the concepts;
performing word segmentation and stop word removal on the text content of the concept;
calculating TF-IDF values of the processed words relative to the concept, and sorting the processed words from high to low according to the TF-IDF values;
selecting a set number of words with the highest TF-IDF value as index words of corresponding concepts, and using the TF-IDF value corresponding to the index words as the weight value of the index words.
Preferably, the index word weight value determination unit is specifically configured to calculate a TF-IDF value of the word with respect to the concept by:
$$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i,j}$$
where the word frequency is
$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$
Here $n_{i,j}$ represents the number of occurrences of word $i$ in text $j$; $\sum_k n_{k,j}$ represents the sum of the occurrence counts of all words in text $j$, and $k$ ranges over the words in text $j$.
The inverse text frequency is
$$\mathrm{idf}_{i,j} = \log\frac{|D|}{|\{j : t_i \in d_j\}|}$$
where $|D|$ represents the total number of texts corresponding to the concept, $|\{j : t_i \in d_j\}|$ represents the number of texts $d_j$ containing word $i$ among all texts corresponding to the concept, and $t_i$ represents the word identical to word $i$ in those texts.
Preferably, the correlation determination unit 22 is specifically configured to:
determining a relevance R of the search text to an advertising business concept by:
$$R = \sum_{w_i \in T} v_i \cdot k_j$$
where $T$ represents the set of words in the search text;
$w_i$ represents the $i$-th word in the search text;
$v_i$ represents the weight value of word $w_i$ in the search text;
$k_j$ represents the weight value, in the advertisement commercial concept, of the word identical to $w_i$.
Compared with the prior art, the technical scheme has the following advantages:
1. the method constructs a complete knowledge graph of the advertisement industry, associates the text with advertisement industries by calculating the relevance between the text and the concepts of each advertisement industry, and delivers advertisements of the corresponding industry in a targeted manner, so that advertisement delivery is more accurate and interpretable;
2. the data of the knowledge graph comes from Wikipedia, the largest knowledge base in the world, so it offers wide knowledge coverage and a clear, complete semantic hierarchy, which addresses the breadth of knowledge involved in microblog search terms and can cover knowledge across the whole network;
3. compared with the prior art, the semantic classification model does not need to do data labeling work, does not need to consider the problem of unbalanced distribution of industry category data, has high semantic calculation accuracy, and greatly reduces the manual workload.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification or the claims is intended to mean a "non-exclusive or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside in different components in a user terminal.
In one or more exemplary designs, the functions described above in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Additionally, any connection is properly termed a computer-readable medium and is thus included if the software is transmitted from a website, server, or other remote source via a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wirelessly, e.g., by infrared, radio, or microwave. Disk and disc, as used herein, include compact discs, laser discs, optical discs, DVDs, floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above may also be included in the computer-readable medium.
The above-mentioned embodiments illustrate the objects, technical solutions and advantages of the present invention in further detail. It should be understood that they are merely exemplary embodiments of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. An advertisement delivery method, the method comprising:
determining the weight value of each word in the user search text;
determining the correlation degree of the search text and the advertisement commercial concepts according to the weight value of each word in the search text, the predetermined index words of the advertisement commercial concepts corresponding to the advertisement industry, and the predetermined weight values of those index words;
and determining the advertisement industry corresponding to the search text according to the correlation degree of the search text and the advertisement commercial concept, and delivering the advertisement.
2. The method of claim 1, wherein the weight value of each word in the search text is the word frequency-inverse text frequency (TF-IDF) value of that word in the search text.
3. The advertisement delivery method according to claim 2, wherein the index word of the advertisement business concept corresponding to the advertisement industry and the weight value of the index word are determined by the following method:
acquiring extensible markup language (XML) data of Wikipedia, extracting advertisement commercial concepts corresponding to the advertisement industry and semantic relations among the concepts, and extracting text contents of the concepts;
performing word segmentation and stop word removal on the text content of the concept;
calculating TF-IDF values of the processed words relative to the concept, and sorting the processed words from high to low according to the TF-IDF values;
selecting a set number of words with the highest TF-IDF value as index words of corresponding concepts, and using the TF-IDF value corresponding to the index words as the weight value of the index words.
4. An advertising method according to claim 3, wherein the TF-IDF value of the word relative to the concept is calculated by the following formula:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i,j}$
wherein the word frequency
$\mathrm{tf}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$
$n_{i,j}$ represents the number of occurrences of word $i$ in text $j$; $\sum_k n_{k,j}$ represents the sum of the occurrence counts of all words in text $j$, and $k$ represents a word in text $j$;
inverse text frequency
$\mathrm{idf}_{i,j} = \log \dfrac{|D|}{|\{j : t_i \in d_j\}|}$
$|D|$ represents the total number of all texts corresponding to the concept; $|\{j : t_i \in d_j\}|$ denotes the number of texts $d_j$, among all texts corresponding to the concept, that contain word $i$; and $t_i$ represents the word that is the same as word $i$ in all texts corresponding to the concept.
5. The method of claim 4, wherein the determining the relevance of the search text to the advertisement business concept according to the weight value of each word in the search text and the weight values of words and phrases contained in the advertisement business concept corresponding to a predetermined advertisement industry comprises:
determining a relevance R of the search text to an advertising business concept by:
Figure FDA0002200862770000021
wherein $T$ represents the set of words in the search text;
$w_i$ represents the $i$-th word in the search text;
$v_i$ represents the weight value of word $w_i$ in the search text;
$k_j$ represents the weight value, in the advertising business concept, of the word in that concept that is the same as word $w_i$.
6. An advertisement delivery system, the system comprising:
the search text information determining unit is used for determining the weight value of each word in the search text of the user;
the relevancy determining unit is used for determining the relevancy between the search text and the advertisement commercial concept according to the weight value of each word in the search text, the predetermined index word of the advertisement commercial concept corresponding to the advertisement industry and the predetermined weight value of the index word;
and the advertisement putting unit is used for determining the advertisement industry corresponding to the search text according to the correlation degree of the search text and the advertisement business concept, and delivering the advertisement.
7. The advertisement delivery system according to claim 6, wherein in the search text information determination unit, the weight value of each word in the search text is the word frequency-inverse text frequency (TF-IDF) value of that word in the search text.
8. The advertisement delivery system according to claim 7, further comprising an index word weight value determination unit configured to:
acquiring extensible markup language (XML) data of Wikipedia, extracting advertisement commercial concepts corresponding to the advertisement industry and semantic relations among the concepts, and extracting text contents of the concepts;
performing word segmentation and stop word removal on the text content of the concept;
calculating TF-IDF values of the processed words relative to the concept, and sorting the processed words from high to low according to the TF-IDF values;
selecting a set number of words with the highest TF-IDF value as index words of corresponding concepts, and using the TF-IDF value corresponding to the index words as the weight value of the index words.
9. An advertisement delivery system according to claim 8, wherein the index word weight value determination unit is specifically configured to calculate a TF-IDF value of a word with respect to a concept by:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_{i,j}$
wherein the word frequency
$\mathrm{tf}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$
$n_{i,j}$ represents the number of occurrences of word $i$ in text $j$; $\sum_k n_{k,j}$ represents the sum of the occurrence counts of all words in text $j$, and $k$ represents a word in text $j$;
inverse text frequency
$\mathrm{idf}_{i,j} = \log \dfrac{|D|}{|\{j : t_i \in d_j\}|}$
$|D|$ represents the total number of all texts corresponding to the concept; $|\{j : t_i \in d_j\}|$ denotes the number of texts $d_j$, among all texts corresponding to the concept, that contain word $i$; and $t_i$ represents the word that is the same as word $i$ in all texts corresponding to the concept.
10. The advertisement delivery system according to claim 9, wherein the relevance determining unit is specifically configured to:
determining a relevance R of the search text to an advertising business concept by:
Figure FDA0002200862770000032
wherein $T$ represents the set of words in the search text;
$w_i$ represents the $i$-th word in the search text;
$v_i$ represents the weight value of word $w_i$ in the search text;
$k_j$ represents the weight value, in the advertising business concept, of the word in that concept that is the same as word $w_i$.
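
By way of illustration only, the flow recited in the claims (TF-IDF weighting, selection of index words, and relevance scoring) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: each concept is assumed to have a single, already segmented and stop-word-filtered text; the inverse text frequency is assumed to use the common logarithmic form; and, because the formula image for the relevance R is not reproduced in this text, R is assumed here to be the sum of the products of the weights of words shared by the search text and the concept. The function names, the sample concepts and the sample query weights are all hypothetical.

```python
import math
from collections import Counter


def tf_idf(texts):
    """Return one {word: tf-idf} dict per tokenized text in `texts`."""
    # Number of texts that contain each word (document frequency).
    doc_freq = Counter()
    for tokens in texts:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in texts:
        counts = Counter(tokens)
        total = sum(counts.values())
        # tf = n_ij / sum_k n_kj ; idf = log(|D| / number of texts with the word)
        scores.append({
            word: (n / total) * math.log(len(texts) / doc_freq[word])
            for word, n in counts.items()
        })
    return scores


def select_index_words(weights, top_n):
    """Keep the top_n words with the highest TF-IDF value as index words."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:top_n])


def relevance(query_weights, concept_index):
    """ASSUMED form of R: sum of v_i * k_j over words shared with the concept."""
    return sum(
        v * concept_index[word]
        for word, v in query_weights.items()
        if word in concept_index
    )


# Hypothetical concept texts, already segmented with stop words removed.
concept_texts = {
    "automobile": ["car", "engine", "car", "dealer"],
    "education": ["school", "course", "teacher", "course"],
}
names = list(concept_texts)
per_concept = tf_idf([concept_texts[name] for name in names])
index = {
    name: select_index_words(weights, top_n=2)
    for name, weights in zip(names, per_concept)
}

# Hypothetical search-text weights (in the claims these are TF-IDF values).
query_weights = {"car": 0.7, "loan": 0.3}
scores = {name: relevance(query_weights, idx) for name, idx in index.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this sketch the advertisement industry chosen for the search text is simply the concept with the highest score; a deployed system might instead apply a threshold or map several concepts to one industry, which the claims leave open.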
CN201910864507.4A 2019-09-12 2019-09-12 Advertisement putting method and system Pending CN110706021A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910864507.4A CN110706021A (en) 2019-09-12 2019-09-12 Advertisement putting method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910864507.4A CN110706021A (en) 2019-09-12 2019-09-12 Advertisement putting method and system

Publications (1)

Publication Number Publication Date
CN110706021A true CN110706021A (en) 2020-01-17

Family

ID=69195173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910864507.4A Pending CN110706021A (en) 2019-09-12 2019-09-12 Advertisement putting method and system

Country Status (1)

Country Link
CN (1) CN110706021A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114398469A (en) * 2021-12-10 2022-04-26 北京百度网讯科技有限公司 Method and device for determining search term weight and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477566A (en) * 2009-01-19 2009-07-08 腾讯科技(深圳)有限公司 Method and apparatus used for putting candidate key words advertisement
CN103295144A (en) * 2012-02-23 2013-09-11 北京星源无限传媒科技有限公司 Mobile internet keyword advertisement putting method
CN103559284A (en) * 2013-11-07 2014-02-05 北京国双科技有限公司 Word expansion method and device for webpage keywords
CN103593792A (en) * 2013-11-13 2014-02-19 复旦大学 Individual recommendation method and system based on Chinese knowledge mapping
CN104951460A (en) * 2014-03-27 2015-09-30 阿里巴巴集团控股有限公司 Ranking parameter value determination method and device based on keyword clustering
CN105809464A (en) * 2014-12-31 2016-07-27 中国电信股份有限公司 Method and device for information delivery
CN106056406A (en) * 2016-05-31 2016-10-26 无锡天脉聚源传媒科技有限公司 Method and device for generating program key word map
CN106682926A (en) * 2015-11-06 2017-05-17 北京奇虎科技有限公司 Method and apparatus for pushing search advertisements
CN108153909A (en) * 2018-01-18 2018-06-12 百度在线网络技术(北京)有限公司 Word method, apparatus and electronic equipment, storage medium are opened up in keyword dispensing
CN108268619A (en) * 2018-01-08 2018-07-10 阿里巴巴集团控股有限公司 Content recommendation method and device
CN108280689A (en) * 2018-01-30 2018-07-13 浙江省公众信息产业有限公司 Advertisement placement method, device based on search engine and search engine system
CN108776901A (en) * 2018-04-27 2018-11-09 微梦创科网络科技(中国)有限公司 Method and system for advertisement recommendation based on search term
CN109857854A (en) * 2019-01-02 2019-06-07 新浪网技术(中国)有限公司 A kind of user's commercial labels method for digging and device, server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200117
