CN114445117A - Method and device for sorting search advertisements

Info

Publication number: CN114445117A
Application number: CN202111573591.8A
Authority: CN
Inventors: 肖路
Original assignee: Weimeng Chuangke Network Technology China Co Ltd
Current assignee: Weimeng Chuangke Network Technology China Co Ltd
Priority date: 2021-12-21
Filing date: 2021-12-21
Publication date: 2022-05-06

Abstract

The application discloses a method and a device for ranking search advertisements. The method comprises the following steps: obtaining a first target vector for a target advertisement, wherein the first target vector is obtained based on a plurality of vectors representing semantics of the target advertisement in different dimensions; obtaining a second target vector for target search content, wherein the second target vector is obtained based on a plurality of vectors representing semantics of the target search content in different dimensions, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; determining a similarity between the second target vector and the first target vector; and determining the arrangement order of the target advertisement based on the similarity.

Description

Method and device for sorting search advertisements

Technical Field

The present application relates to the field of internet technologies, and in particular, to a method and an apparatus for ranking search advertisements.

Background

With the development of internet technology, search advertisements have always played a significant role in internet commercial advertisement recommendation. The search advertisement is mainly used for audience targeting by analyzing the search content of the user, namely, the user inputs corresponding search content, and the server provides corresponding advertisement by analyzing information in the current search content.

In recommending search advertisements, it is generally necessary to determine the rank order of the search advertisements first, and then the search advertisements may be recommended based on the rank order of the search advertisements.

However, the way in which the related art determines the arrangement order of search advertisements is often not accurate enough, which degrades the recommendation effect.

Disclosure of Invention

The embodiments of the application provide a method and a device for ranking search advertisements, which aim to solve the problem that conventional ways of ranking advertisements are not accurate enough.

In a first aspect, the present application provides a method for ranking search advertisements, the method comprising:

obtaining a first target vector for a target advertisement, wherein the first target vector is obtained based on a plurality of vectors representing different dimensionality semantics of the target advertisement;

obtaining a second target vector aiming at the target search content, wherein the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and each dimension representing the target advertisement is consistent with each dimension representing the target search content;

determining a similarity between the second target vector and the first target vector;

and determining the arrangement sequence of the target advertisements based on the similarity.

In a second aspect, the present application provides an apparatus for ranking search advertisements, the apparatus comprising:

an obtaining module, configured to obtain a first target vector for a target advertisement, where the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement; obtaining a second target vector aiming at the target search content, wherein the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and each dimension representing the target advertisement is consistent with each dimension representing the target search content;

a determining module for determining a similarity between the second target vector and the first target vector; and determining the arrangement sequence of the target advertisements based on the similarity.

In a third aspect, the present application provides a server, comprising: a processor, a memory and a program or instructions stored on the memory and executed on the processor, which when executed by the processor, implement the steps of the method of the first aspect.

In a fourth aspect, the present application provides a readable storage medium on which is stored a program or instructions which, when executed by a processor, performs the steps of the method of the first aspect.

The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:

In the embodiments of the application, a first target vector for a target advertisement is obtained, wherein the first target vector is obtained based on a plurality of vectors representing semantics of the target advertisement in different dimensions; a second target vector for the target search content is obtained, wherein the second target vector is obtained based on a plurality of vectors representing semantics of the target search content in different dimensions, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; a similarity between the second target vector and the first target vector is determined; and the arrangement order of the target advertisement is determined based on the similarity. Because the similarity is computed between a first target vector built from vectors representing several semantic dimensions of the target advertisement and a second target vector built from vectors representing the same semantic dimensions of the target search content, more factors are taken into account in the computation, so the computed similarity can be more accurate. The arrangement order determined from this similarity can accordingly be more accurate, which addresses the problem that existing ways of ranking advertisements are not accurate enough.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a flowchart of a method for ranking search advertisements according to an embodiment of the present disclosure;

FIG. 2 is a flowchart of a method for ranking search advertisements according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a method for ranking search advertisements according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram illustrating a process of obtaining keyword vectors in a method for ranking search advertisements according to an embodiment of the present application;

FIG. 5 is a schematic diagram illustrating a process of obtaining a characterization vector in a method for ranking search advertisements according to an embodiment of the present application;

FIG. 6 is a diagram illustrating a process of pre-training a word vector model in a method for ranking search advertisements according to an embodiment of the present application;

FIG. 7 is a diagram illustrating a method for ranking search advertisements according to an embodiment of the present disclosure;

FIG. 8 is a block diagram illustrating a structure of an apparatus for ranking search advertisements according to an embodiment of the present disclosure;

FIG. 9 is a block diagram of a server according to an embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first", "second" and the like in the description and in the claims of the present application are used to distinguish between similar objects and not necessarily to describe a particular sequential or chronological order. It will be appreciated that terms so used may be interchanged under appropriate circumstances, such that embodiments of the application may be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are generally of one kind, and the number of objects is not limited; for example, the first object may be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally means that the preceding and succeeding objects are in an "or" relationship.

The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart of a method for ranking search advertisements provided in an embodiment of the present application, and referring to fig. 1, the method for ranking search advertisements provided in an embodiment of the present application may include:

step 110, obtaining a first target vector for a target advertisement, wherein the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement;

the target advertisement can be various advertisements put on a website searched by a user; the first target vector may be a vector for characterizing multidimensional semantics of the target advertisement, e.g., the first target vector may be derived based on a vector characterizing keywords of the target advertisement and a vector characterizing overall semantics of the target advertisement.

Step 120, obtaining a second target vector for the target search content, where the second target vector is obtained based on multiple vectors representing different dimensional semantics of the target search content, and each dimension representing the target advertisement is consistent with each dimension representing the target search content;

the target search content can be content searched in a website search bar by a user; the second target vector may be a vector for characterizing multidimensional semantics of the target search content, for example, the second target vector may be derived based on a vector characterizing keywords of the target search content and a vector characterizing overall semantics of the target search content.

Step 130, determining the similarity between the second target vector and the first target vector;

In an embodiment of the present application, the similarity may be a cosine similarity, which measures the difference between two individuals by the cosine of the angle between two vectors in a vector space. The closer the cosine is to 1, the closer the angle is to 0 degrees, i.e. the more similar the two vectors are.

For example, assume A and B are two n-dimensional vectors, where $A = [A_1, A_2, \dots, A_n]$ and $B = [B_1, B_2, \dots, B_n]$. If the angle between vector A and vector B is θ, the cosine similarity between vector A and vector B may be:

$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

where $A_i$ and $B_i$ respectively denote the components of vector A and vector B, $1 \le i \le n$, and i is a positive integer.

Step 140, determining the arrangement order of the target advertisements based on the similarity.

In the embodiment of the application, the order in which the target advertisement is pushed on the website searched by the user can be determined according to the similarity. Specifically, the similarity may be the cosine of the angle between the second target vector and the first target vector, which lies in the range [-1, 1]: the closer the cosine is to 1, the earlier the target advertisement is placed in the arrangement order; the closer the cosine is to -1, the later it is placed.
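As an illustration of step 130, the cosine similarity between two target vectors can be sketched in plain Python. The function name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions made for this sketch; the application does not specify them.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector has no direction, so report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction give a value of 1, orthogonal vectors give 0, and opposite vectors give -1, which matches the [-1, 1] range described above.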

In the method for ranking search advertisements provided by the embodiment of the application, a first target vector for a target advertisement is obtained, wherein the first target vector is obtained based on a plurality of vectors representing semantics of the target advertisement in different dimensions; a second target vector for the target search content is obtained, wherein the second target vector is obtained based on a plurality of vectors representing semantics of the target search content in different dimensions, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; a similarity between the second target vector and the first target vector is determined; and the arrangement order of the target advertisement is determined based on the similarity. Because the similarity is computed between a first target vector built from vectors representing several semantic dimensions of the target advertisement and a second target vector built from vectors representing the same semantic dimensions of the target search content, more factors are taken into account in the computation, so the computed similarity can be more accurate. The arrangement order determined from this similarity can accordingly be more accurate, which addresses the problem that existing ways of ranking advertisements are not accurate enough.

Optionally, in an embodiment of the present application, the target advertisement may include a plurality of advertisements, and the first target vector may include a plurality of sub-vectors corresponding to the plurality of advertisements; step 130 may specifically include: determining a similarity of the second target vector to each of the plurality of sub-vectors; step 140 may specifically include: determining an order of arrangement of the plurality of advertisements based on a similarity of the second target vector to each of the plurality of sub-vectors.

Wherein each of the plurality of subvectors may also be derived based on a plurality of vectors characterizing different dimensions of each of the plurality of advertisements. In addition, it should be understood that, in an actual application situation, the multiple advertisements are stored in advance on a website for a user to search, and then multiple sub-vectors corresponding to the multiple advertisements may be obtained in advance, so that when the user searches for target search content, a similarity between a second target vector corresponding to the target search content and each of the multiple sub-vectors may be directly determined.

Therefore, when a user searches related content in a website search bar in real time, the similarity between the second target vector and each sub-vector in the plurality of sub-vectors can be determined according to the plurality of sub-vectors corresponding to the plurality of advertisements stored in the website, so that the sequence of pushing the plurality of advertisements to the user can be determined, the advertisements can be pushed to the user more accurately, and the quality of the searched advertisements and the use experience of the user are improved.
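The multi-advertisement case can be sketched as follows: the sub-vectors are assumed to have been computed in advance and stored under advertisement identifiers, and the query's second target vector is compared with each of them. The names `rank_ads` and `ad_vectors`, and the use of a dictionary as the store, are hypothetical choices for this sketch only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_ads(
    query_vector: Sequence[float],
    ad_vectors: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (ad_id, similarity) pairs, most similar advertisement first."""
    scored = [(ad_id, _cosine(query_vector, vec)) for ad_id, vec in ad_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical precomputed sub-vectors for three advertisements.
ad_vectors = {"ad_a": [1.0, 0.0], "ad_b": [0.6, 0.8], "ad_c": [-1.0, 0.0]}
ranking = rank_ads([1.0, 0.1], ad_vectors)
```

Because the advertisement vectors are computed ahead of time, only the query vector and the similarity scores need to be computed when the user searches.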

In the embodiment of the present application, the specific process of obtaining the first target vector for the target advertisement in step 110 may be implemented in different manners.

A specific implementation example is given below. It is to be understood that the following are merely examples, and are not intended to be limiting.

Referring to fig. 2, the specific process of obtaining the first target vector for the target advertisement in step 110 may include: step 210, step 220 and step 230. These three steps are explained below.

Step 210, obtaining a first keyword vector of a target advertisement, wherein the first keyword vector is used for indicating keywords in the target advertisement;

the first keyword vector may be a vector that characterizes a keyword dimension in the target advertisement, that is, the first keyword vector may characterize core semantics in the target advertisement.

In this embodiment of the present application, the obtaining the first keyword vector of the target advertisement in step 210 may include: extracting at least one keyword for the targeted advertisement; determining at least one keyword vector corresponding to the at least one keyword by using a pre-trained word vector model; based on the at least one keyword vector, a first keyword vector for the targeted advertisement is determined. In this way, a first keyword vector for the target advertisement can be obtained by extracting a plurality of keywords, so that the core semantics in the target advertisement can be better characterized by the first keyword vector.

The number of keywords extracted may be chosen according to the content of the target advertisement in the actual application, and does not exceed 10.

It is understood that, in one embodiment of the present application, keywords may be extracted using the text summarization (TextRank) algorithm. TextRank is a graph-based ranking algorithm for text whose basic idea is derived from the webpage ranking algorithm (PageRank): the text is divided into constituent units (words or sentences), a graph model is built over them, and a voting mechanism is used to rank the important components of the text. The algorithm does not depend on a background corpus; keywords or a summary can be extracted by analyzing a single text alone.
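A minimal sketch of TextRank-style keyword extraction is given below. It assumes the text has already been segmented into tokens and filtered; the co-occurrence window of 2, the damping factor of 0.85, the fixed 50 iterations and the function name `textrank_keywords` are all assumptions for illustration, not values taken from the application.

```python
from collections import defaultdict
from typing import Dict, List, Set


def textrank_keywords(
    tokens: List[str],
    top_k: int = 10,
    window: int = 2,
    damping: float = 0.85,
    iterations: int = 50,
) -> List[str]:
    """Rank words by a PageRank-style vote over a co-occurrence graph."""
    # Link every pair of distinct words that occur within `window` positions.
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbors[word].add(tokens[j])
                neighbors[tokens[j]].add(word)
    if not neighbors:
        # No co-occurrence edges: fall back to the distinct tokens in order.
        return list(dict.fromkeys(tokens))[:top_k]

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            # Each neighbor splits its score evenly among its own neighbors.
            vote = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            updated[word] = (1.0 - damping) + damping * vote
        scores = updated

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

The default `top_k=10` mirrors the upper limit of 10 keywords mentioned above. For Chinese text, a word segmentation step would have to precede this function.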

In addition, in this embodiment of the present application, the determining, based on the at least one keyword vector, a first keyword vector for the target advertisement may specifically include: processing the at least one keyword vector through a full connection layer with an activation function to obtain a first specified keyword vector; and performing pooling processing on the first specified keyword vector to obtain the first keyword vector for the target advertisement.

As shown in fig. 4, in an embodiment of the present application, after at least one keyword vector corresponding to the at least one keyword is obtained, that is, keyword vector 1, keyword vector 2, ..., keyword vector n, the n keyword vectors may be processed in sequence through a full connection layer, an activation function, a further full connection layer, and a pooling layer, so as to obtain a final keyword vector K. For example, if 5 keyword vectors are obtained for the target advertisement, keyword vector 1, keyword vector 2, ..., keyword vector 5 may first pass through a full connection layer (dense) with an activation function (Rectified Linear Unit, ReLU), and then, for better effect in practical application, pass through a further full connection layer; the output of that layer may then undergo a max-pooling operation in a pooling layer to weaken the information of unimportant keyword vectors, finally yielding the first keyword vector K1 for the target advertisement.

The activation function, also called an excitation function, applies nonlinear processing to its input, which is what makes a deep multilayer neural network more expressive than a single linear layer. Each neuron node in the neural network receives the output values of the neurons in the previous layer as its input and passes its own output to the next layer, while a neuron node in the input layer directly passes the input attribute value to the next layer (a hidden layer or the output layer). In a multilayer neural network, there is a functional relationship between the output of a node in one layer and the input of a node in the next layer, and this function is called an activation function. Commonly used activation functions may include sigmoid functions, tanh functions, and ReLU functions (e.g., Leaky-ReLU, P-ReLU, R-ReLU), among others. Max pooling takes the maximum value within a local receptive field, which gives the features a degree of invariance to position and rotation, reduces the number of model parameters, and helps reduce overfitting.

Thus, by passing the plurality of keyword vectors through the full connection layers (linear processing), the activation function (nonlinear processing) and the pooling layer, a single keyword vector for the target advertisement, namely the first keyword vector, can be obtained, so that the core semantics of the target advertisement can be better represented.
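The dense, ReLU, dense, max-pooling pipeline described for fig. 4 can be sketched in plain Python as follows. The weight matrices `w1`, `w2` and biases `b1`, `b2` are hypothetical placeholders; in a real system they would be learned during training, and the layer sizes are not given in the application.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Full connection layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def relu(x: Vector) -> Vector:
    """Rectified Linear Unit: negative components become zero."""
    return [max(0.0, v) for v in x]


def max_pool(vectors: List[Vector]) -> Vector:
    """Element-wise maximum across all keyword vectors."""
    return [max(column) for column in zip(*vectors)]


def keyword_vector(
    keyword_vectors: List[Vector], w1: Matrix, b1: Vector, w2: Matrix, b2: Vector
) -> Vector:
    """Dense + ReLU, then a second dense layer, then max pooling."""
    transformed = [dense(relu(dense(v, w1, b1)), w2, b2) for v in keyword_vectors]
    return max_pool(transformed)


# Hypothetical identity weights so the effect of each stage is easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
k1 = keyword_vector([[1.0, -2.0], [0.5, 3.0]], identity, zero_bias, identity, zero_bias)
```

Max pooling keeps, for each component, the strongest response among the keywords, which is one way to read the statement that information from unimportant keyword vectors is weakened.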

Step 220, obtaining a first characterization vector of the target advertisement, where the first characterization vector is used to indicate an overall semantic meaning of the target advertisement;

wherein the first characterization vector may be a vector that characterizes an overall semantic dimension in the targeted advertisement, that is, the first characterization vector may characterize an overall semantic of the advertisement text for the targeted advertisement.

In this embodiment of the present application, the obtaining the first characterization vector of the target advertisement in step 220 may include: extracting an advertisement text aiming at the target advertisement, and taking the advertisement text as the input of a pre-trained semantic model; acquiring N classification vectors based on the semantic model; determining a first characterization vector for the targeted advertisement based on the N classification vectors. Wherein N may be a positive integer. Therefore, the first characterization vector aiming at the target advertisement can be obtained by extracting the advertisement text and inputting the advertisement text into the pre-trained semantic model, so that the overall semantics in the target advertisement can be better characterized by utilizing the first characterization vector.

The semantic model may be a BERT (Bidirectional Encoder Representations from Transformers) model, which is a pre-trained language representation model. As the name suggests, the goal of the BERT model is to train on a large-scale unlabeled corpus to obtain a representation of a text that contains rich semantic information, that is, a semantic representation of the text; this representation is then fine-tuned for a specific natural language task and finally applied to that task. The main input of the BERT model is the original word vector of each character or word in the text, which may be initialized randomly or taken from a pre-trained word vector model; the output of the BERT model is a vector representation of each character or word in the text after full-text semantic information has been fused in. The classification vector (CLS) may be understood as the vector the BERT model provides for downstream classification tasks, which may include single-text classification tasks and sentence-pair classification tasks. For a single-text classification task, the BERT model inserts a [CLS] symbol in front of the text and takes the output vector corresponding to that symbol as the semantic representation of the whole text for classification. For a sentence-pair classification task, the BERT model likewise adds a [CLS] symbol and takes the corresponding output as the semantic representation of the text; it also separates the two input sentences with a [SEP] symbol and adds a different text vector to each of the two sentences to distinguish them.

In addition, in this embodiment of the present application, the determining a first characterization vector of the target advertisement based on the N classification vectors may specifically include: performing a full connection computation on the N classification vectors to convert them into N scalars; passing the N scalars through a normalized exponential function to obtain N importance factors; and determining the first characterization vector for the target advertisement based on the N classification vectors and the N importance factors.

In an embodiment of the present application, the extracted advertisement text is input into the pre-trained semantic model to obtain P classification vectors. For example, as shown in fig. 5, taking P = 12 and N = 3, the advertisement text of the target advertisement may be input into the BERT model to obtain 12 classification vectors, and the classification vectors of the last three layers, namely $[v_1, v_2, v_3]$, undergo a full connection computation that converts them into 3 scalars $[e_1, e_2, e_3]$, as shown in Formula 1; the 3 scalars are then passed through a normalized exponential function (softmax) to obtain 3 importance factors $[s_1, s_2, s_3]$ corresponding to the 3 classification vectors, as shown in Formula 2; finally, a weighted sum (multiply and add) of the 3 classification vectors and their corresponding importance factors is computed to obtain the final first characterization vector V1, as shown in Formula 3.

$$[e_1, e_2, e_3] = [v_1, v_2, v_3]\,W \qquad \text{(Formula 1)}$$

$$[s_1, s_2, s_3] = \mathrm{Softmax}([e_1, e_2, e_3]) \qquad \text{(Formula 2)}$$

$$V1 = v_1 s_1 + v_2 s_2 + v_3 s_3 \qquad \text{(Formula 3)}$$

Here W may be a weight matrix; in the full connection computation, multiplying each classification vector by W converts it into a scalar. The normalized exponential function (softmax) generalizes the binary classification function (sigmoid) to multi-class classification, and its purpose is to express multi-class results in the form of probabilities. Its first step applies the exponential function to the model's prediction results, which ensures that the probabilities are non-negative; then, to ensure that the probabilities sum to 1, each converted result is divided by the sum of all converted results, i.e. its share of the total, which yields an approximate probability.
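The three formulas can be sketched in plain Python as below. The sketch assumes W has a single output column, represented here as a list `w` with one weight per vector component, so that each classification vector maps to one scalar; the function names and the subtraction of the maximum inside softmax (a common numerical-stability step) are also assumptions of this sketch.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Normalized exponential function; subtracting the max avoids overflow."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def characterization_vector(cls_vectors: List[Vector], w: Vector) -> Vector:
    """Weighted sum of classification vectors using softmax importance factors."""
    # Formula 1: one scalar per classification vector, e_i = v_i . w
    e = [sum(x * y for x, y in zip(v, w)) for v in cls_vectors]
    # Formula 2: importance factors that are non-negative and sum to 1
    s = softmax(e)
    # Formula 3: V1 = v_1 * s_1 + v_2 * s_2 + ... computed per component
    dim = len(cls_vectors[0])
    return [sum(s_i * v[d] for s_i, v in zip(s, cls_vectors)) for d in range(dim)]
```

With all-zero weights every scalar is equal, so the importance factors are uniform and the result is the plain average of the classification vectors; learned weights would shift the emphasis toward particular layers.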

Therefore, N classification vectors of the advertisement text can be obtained by utilizing the semantic model, and various operation processing is carried out on the basis of the N classification vectors to obtain a first characterization vector aiming at the target advertisement, so that the overall semantics in the target advertisement can be better characterized.

Step 230, determining a first target vector for the target advertisement based on the first keyword vector and the first characterization vector.

In this embodiment of the present application, the first keyword vector and the first characterization vector may be fused to obtain the first target vector for the target advertisement. For example, the first keyword vector K1 and the first characterization vector V1 of the target advertisement may be concatenated, with a normalization operation performed so that the scales of the two vectors are consistent, to obtain a vector C; the vector C is then passed through a full connection computation to obtain the first target vector E for the target advertisement.
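One possible reading of this fusion step is sketched below. The application does not say which normalization is used, so L2 normalization of each vector before concatenation is an assumption made here, as are the function names and the hypothetical `weights` and `bias` of the final full connection layer.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]


def fuse(keyword_vec: Vector, char_vec: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Normalize K1 and V1, concatenate into C, then apply a full connection layer."""
    # Assumption: L2 normalization brings both vectors to the same scale.
    c = l2_normalize(keyword_vec) + l2_normalize(char_vec)
    # Full connection: E[j] = sum_i C[i] * weights[i][j] + bias[j]
    return [
        sum(ci * row[j] for ci, row in zip(c, weights)) + bias[j]
        for j in range(len(bias))
    ]


# Hypothetical 4x2 weights that add matching components of the two halves of C.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
e_vec = fuse([3.0, 4.0], [0.0, 2.0], weights, [0.0, 0.0])
```

The same function would be applied to the search content's keyword and characterization vectors so that both target vectors live in the same space before the similarity is computed.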

According to the method for ranking search advertisements provided by this embodiment, the first keyword vector for the target advertisement can be obtained by processing the plurality of keyword vectors through the full connection layers, the activation function and the pooling layer; N classification vectors of the advertisement text are obtained using the semantic model, and the first characterization vector for the target advertisement is obtained through the operations performed on these N classification vectors; the first target vector for the target advertisement is then obtained based on the first keyword vector and the first characterization vector. The first target vector thus represents both the core semantics and the overall semantics of the target advertisement, strengthening the importance of keywords in the representation of the advertisement text and weakening the influence of unimportant words on that representation.

Optionally, in an embodiment of the present application, as shown in fig. 6, the pre-training process of the word vector model may be as follows: a word segmentation tool is used to segment all advertisements and search contents, and the segmented corpus (namely words or phrases) is then input into the word vector model, so that a set of word vectors corresponding to the words or phrases is obtained. The pre-trained word vector model thus yields a correspondence between a large number of words and their word vectors, and when keyword vectors are needed later they can be looked up directly in the word vector set, which improves the efficiency of ranking search advertisements.

The word segmentation tool may be the LAC (Lexical Analysis of Chinese) tool, which provides Chinese word segmentation, part-of-speech tagging, proper name recognition and similar functions; the word segmentation tool does not directly provide search support for services, but serves as a basic tool of a search engine. The word vector model may be a Word2Vec (word to vector, i.e. word embedding) model. The Word2Vec model is a simplified neural network whose purpose is to convert words in natural language into dense vectors that a computer can process. Specifically, the Skip-Gram architecture of the Word2Vec model is adopted, in which a word is input and its context is predicted; as shown in fig. 6, the input word may be denoted W(t), and the output context may be denoted W(t-2), W(t-1), W(t+2), and the like.
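To illustrate what "input a word, predict its context" means for Skip-Gram, the sketch below generates the (input word, context word) training pairs from a segmented sentence. It shows only the pair construction, not the training of the network; the window size of 2 is an assumption chosen to match the W(t-2) to W(t+2) notation, and `skipgram_pairs` is a hypothetical helper name.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (center word W(t), context word W(t+k)) pairs for 0 < |k| <= window."""
    pairs: List[Tuple[str, str]] = []
    for t, center in enumerate(tokens):
        start = max(0, t - window)
        stop = min(len(tokens), t + window + 1)
        for c in range(start, stop):
            if c != t:  # a word is not its own context
                pairs.append((center, tokens[c]))
    return pairs


pairs = skipgram_pairs(["a", "b", "c"], window=1)
```

In practice an existing Word2Vec implementation would be trained on the segmented advertisements and search contents; this helper only makes the input and output roles of W(t) and its neighbors concrete.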

It should be understood that, if the required keyword vector cannot be obtained in the word vector set subsequently, the keyword may be input into the word vector model to obtain the keyword vector corresponding to the keyword.

In this embodiment, the specific process of obtaining the second target vector for the target search content in step 120 may be implemented in different manners.

Referring to fig. 3, the specific process of obtaining the second target vector for the target search content in step 120 may include: step 310, step 320 and step 330. These three steps are explained below.

Step 310: obtaining a second keyword vector of the target search content, wherein the second keyword vector is used for indicating keywords in the target search content.

The second keyword vector may be a vector that characterizes the keyword dimension of the target search content; that is, the second keyword vector may characterize the core semantics of the target search content.

In this embodiment of the present application, obtaining the second keyword vector of the target search content in step 310 may include: extracting at least one keyword for the target search content; determining at least one keyword vector corresponding to the at least one keyword by using a pre-trained word vector model; and determining the second keyword vector for the target search content based on the at least one keyword vector. In this way, the second keyword vector for the target search content is obtained from the extracted keywords, so that it can better characterize the core semantics of the target search content.

As described above for obtaining the first keyword vector of the target advertisement in step 210, at most 10 keywords may likewise be extracted for one piece of target search content; the specific number may be chosen according to the target search content in the actual application, provided that it does not exceed 10. Similarly, the keywords of the target search content may be extracted by a method based on the TextRank (text summarization) algorithm.
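
A simplified TextRank-style keyword extractor with the cap of 10 keywords might look like the sketch below. It assumes the text has already been segmented into tokens; the function name, co-occurrence window, damping factor and iteration count are illustrative assumptions, and the extractor actually used in the application may differ.

```python
from collections import defaultdict


def textrank_keywords(tokens, top_k=10, window=2, damping=0.85, iters=30):
    """Rank words with a simplified TextRank and return at most top_k."""
    # Build an undirected co-occurrence graph over a sliding window.
    neighbours = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    # PageRank-style score propagation on the graph.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iters):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbours[n]) for n in neighbours[word])
            for word in neighbours
        }

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


tokens = "cheap flight tickets cheap flight deals paris flight".split()
print(textrank_keywords(tokens, top_k=3))
```

The `top_k=10` default mirrors the limit of at most 10 keywords per piece of search content.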

In addition, in this embodiment of the application, determining the second keyword vector for the target search content based on the at least one keyword vector may specifically include: processing the at least one keyword vector through a fully connected layer with an activation function to obtain a second specified keyword vector; and pooling the second specified keyword vector to obtain the second keyword vector for the target search content.

Similarly, as shown in fig. 4, in an embodiment of the present application, after the at least one keyword vector corresponding to the at least one keyword is obtained, namely keyword vector 1, keyword vector 2, ..., keyword vector n, the n keyword vectors may be processed in sequence by a fully connected layer, an activation function, a fully connected layer and a pooling layer to obtain a final keyword vector K. For example, if 3 keyword vectors are obtained for the target search content, keyword vector 1, keyword vector 2 and keyword vector 3 may first pass through a fully connected (dense) layer with a rectified linear unit (ReLU) activation function; in practical applications, the result may pass through a further fully connected layer for better effect. A max-pooling operation is then applied to the output of the fully connected layer through the pooling layer to weaken the information of unimportant keyword vectors, finally yielding the second keyword vector K2 for the target search content.

In this way, a single keyword vector for the target search content, namely the second keyword vector, can be obtained by applying the linear processing of the fully connected layer, the nonlinear processing of the activation function, and the pooling layer to the plurality of keyword vectors, so that the core semantics of the target search content can be better represented.
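
The sequence "fully connected layer, ReLU, fully connected layer, max pooling" can be sketched in plain Python as below. The weights are placeholders (identity matrices and zero biases) chosen only so that the result is easy to follow; in a real system they would be learned, and all function names here are hypothetical.

```python
def dense(vectors, weights, bias, relu=False):
    """Apply y = xW + b to every row vector; optionally follow with ReLU.

    weights has one row per input dimension and one column per output."""
    out = []
    for x in vectors:
        y = [
            sum(xi * w_row[j] for xi, w_row in zip(x, weights)) + bias[j]
            for j in range(len(bias))
        ]
        out.append([max(0.0, v) for v in y] if relu else y)
    return out


def max_pool(vectors):
    """Element-wise maximum across all keyword vectors."""
    return [max(column) for column in zip(*vectors)]


def keyword_vector(keyword_vectors, w1, b1, w2, b2):
    hidden = dense(keyword_vectors, w1, b1, relu=True)  # dense layer + ReLU
    projected = dense(hidden, w2, b2)                    # second dense layer
    return max_pool(projected)                           # pooled vector K2


# Three toy 2-dimensional keyword vectors; identity weights for readability.
kv = [[1.0, -2.0], [0.5, 3.0], [-1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [0.0, 0.0]
print(keyword_vector(kv, identity, zero, identity, zero))  # [1.0, 3.0]
```

Max pooling keeps, for every dimension, the strongest activation among the keywords, which is how weaker keyword vectors lose influence.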

Step 320: obtaining a second characterization vector of the target search content, wherein the second characterization vector is used for indicating the overall semantics of the target search content.

The second characterization vector may be a vector that characterizes the overall semantic dimension of the target search content; that is, the second characterization vector may characterize the overall semantics of the target search content.

In this embodiment of the present application, the obtaining the second characterization vector of the target search content in step 320 may include: extracting a search text, and taking the search text as an input of a pre-trained semantic model; obtaining M classification vectors based on the semantic model; determining a second characterization vector for the target search content based on the M classification vectors. Wherein M may be a positive integer. Therefore, the second characterization vector aiming at the target search content can be obtained by extracting the search text and inputting the search text into the pre-trained semantic model, so that the overall semantics in the target search content can be better characterized by using the second characterization vector.

As described above for obtaining the first characterization vector of the target advertisement in step 220, the pre-trained semantic model may likewise be a BERT model.

In addition, in this embodiment of the application, determining the second characterization vector of the target search content based on the M classification vectors may specifically include: performing a fully connected computation on the M classification vectors to convert them into M scalars; passing the M scalars through a normalized exponential function (softmax) to obtain M importance factors; and determining the second characterization vector for the target search content based on the M classification vectors and the M importance factors.

Similarly, in an embodiment of the present application, the extracted search text is input into the pre-trained semantic model to obtain Q classification vectors, of which M are used. For example, as shown in fig. 5, if Q is 12 and M is 3, the search text of the target search content may be input into the BERT model to obtain 12 classification vectors. A fully connected computation is performed on the classification vectors of the last three layers, $[v_1, v_2, v_3]$, to convert them into 3 scalars $[e_1, e_2, e_3]$, as shown in Equation 4. The 3 scalars are then passed through a normalized exponential function (softmax) to obtain 3 importance factors $[s_1, s_2, s_3]$ corresponding to the 3 classification vectors, as shown in Equation 5. Finally, a weighted sum (multiply and add) of the 3 classification vectors and their corresponding importance factors gives the final second characterization vector V2, as shown in Equation 6.

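$$[e_1, e_2, e_3] = [v_1, v_2, v_3]\,W \qquad \text{(Equation 4)}$$

$$[s_1, s_2, s_3] = \mathrm{Softmax}([e_1, e_2, e_3]) \qquad \text{(Equation 5)}$$

$$V2 = v_1 s_1 + v_2 s_2 + v_3 s_3 \qquad \text{(Equation 6)}$$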

Therefore, M classification vectors of the search text can be obtained by utilizing the semantic model, and various operation processing is carried out based on the M classification vectors to obtain a second characterization vector aiming at the target search content, so that the overall semantic meaning in the target search content can be better characterized.
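
Equations 4 to 6 can be traced step by step with the following minimal sketch. The projection `w` stands in for the learned matrix W and is a placeholder here; the function name and the toy vectors are assumptions made for illustration.

```python
import math


def fuse_classification_vectors(vectors, w):
    """Weighted sum of classification vectors, following Equations 4 to 6."""
    # Equation 4: project each vector v_i to a scalar e_i = v_i . w
    e = [sum(vi * wi for vi, wi in zip(v, w)) for v in vectors]
    # Equation 5: softmax over the scalars gives importance factors s_i
    peak = max(e)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in e]
    total = sum(exps)
    s = [x / total for x in exps]
    # Equation 6: V2 = sum_i v_i * s_i, computed dimension by dimension
    dim = len(vectors[0])
    fused = [sum(v[d] * si for v, si in zip(vectors, s)) for d in range(dim)]
    return fused, s


# Toy stand-ins for the classification vectors of the last three layers.
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, factors = fuse_classification_vectors(v, w=[0.5, -0.5])
print("importance factors:", factors)
print("second characterization vector:", fused)
```

Because the importance factors sum to 1, the result is a convex combination of the layer vectors, so layers with larger scalars contribute more to V2.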

Step 330, determining a second target vector for the target search content based on the second keyword vector and the second characterization vector.

Similarly, in this embodiment of the present application, the second keyword vector and the second characterization vector may be fused to obtain the second target vector for the target search content. For example, the second keyword vector K2 and the second characterization vector V2 of the target search content may be spliced (concatenated), and a normalization operation performed so that the representation scales of the two vectors are consistent, giving a vector D; the vector D is then passed through a fully connected computation to obtain the second target vector F for the target search content.
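
The fusion step might be sketched as follows. The application does not say which normalization is used, so this sketch assumes L2 normalization applied to each part so that both have unit length; the weights of the fully connected layer are placeholders, and the function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def target_vector(k2, v2, weights, bias):
    """Splice K2 and V2 into D, then apply one fully connected layer."""
    # Assumption: each part is L2-normalised so both share the same scale.
    d = l2_normalize(k2) + l2_normalize(v2)
    # Fully connected computation F = D W + b (weights: one row per input).
    return [
        sum(di * row[j] for di, row in zip(d, weights)) + bias[j]
        for j in range(len(bias))
    ]


k2 = [3.0, 4.0]   # toy second keyword vector
v2 = [0.0, 2.0]   # toy second characterization vector
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # placeholder 4x2
print(target_vector(k2, v2, weights, bias=[0.0, 0.0]))
```

Without the normalization, whichever of the two vectors has the larger magnitude would dominate the concatenated vector D.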

According to the method for ranking search advertisements provided in this embodiment, the second keyword vector for the target search content can be obtained by applying the linear processing of the fully connected layer, the nonlinear processing of the activation function, and the pooling layer to the plurality of keyword vectors. M classification vectors of the search text are obtained using the semantic model, and the second characterization vector for the target search content is computed from these M classification vectors. The second target vector for the target search content is then obtained based on the second keyword vector and the second characterization vector. The second target vector therefore represents both the core semantics and the overall semantics of the target search content, strengthening the importance of keywords in the representation of the search text and weakening the influence of unimportant words on that representation.

Similarly, in an embodiment of the present application, as shown in fig. 6, the word vector model may also be pre-trained as follows: a word segmentation tool segments all advertisements and search contents, and the segmented corpus (that is, words or phrases) is input into the word vector model, yielding a word vector set that contains a vector for each word or phrase. The pre-trained word vector model thus provides the correspondence between a large number of words and their word vectors, and when keyword vectors are needed later they can be read directly from the word vector set, which improves the efficiency of ranking search advertisements.

The following describes in further detail a search advertisement ranking method provided in the embodiment of the present application with reference to an actual application scenario and fig. 7.

For example, as shown in fig. 7, at least one keyword may first be extracted for the target advertisement, with the number of keywords per advertisement limited to at most 10. At least one keyword vector corresponding to the at least one keyword is then obtained from the word vector model. The at least one keyword vector is passed through a fully connected layer with a ReLU activation function, so that information is exchanged among the keywords of the current sentence. A max-pooling operation is then applied to the output of the fully connected layer to weaken the information of unimportant keywords, finally giving the first keyword vector K1, which reflects the interaction of the keyword information in the target advertisement. The same operations may be performed on the target search content: at least one keyword vector corresponding to at least one keyword in each piece of search content is obtained, and from it the second keyword vector K2, which reflects the interaction of the keyword information in the target search content.

Next, the advertisement text of the target advertisement may be extracted and input into a pre-trained BERT model, and the classification vectors of the last three layers of the BERT model taken out and fused by means of an attention computation. Specifically: a fully connected computation is performed on the 3 classification vectors to convert them into 3 scalars; the 3 scalars are passed through a normalized exponential function (softmax) to obtain 3 importance factors corresponding to the 3 classification vectors; and a weighted sum (multiply and add) of the 3 classification vectors and their 3 importance factors gives the final first characterization vector V1 of the target advertisement. The same operations may be performed on the target search content to obtain the final second characterization vector V2 of the target search content.

After the first keyword vector K1 and the first characterization vector V1 of the target advertisement are obtained, K1 and V1 may be spliced (concatenated), and a normalization operation performed so that the representation scales of the two vectors are consistent, giving a vector C; the vector C is then passed through a fully connected layer to obtain the first target vector of the target advertisement. Similarly, after the second keyword vector K2 and the second characterization vector V2 of the target search content are obtained, K2 and V2 may be spliced, and the second target vector of the target search content obtained through normalization and a fully connected computation.

Finally, the cosine similarity between the first target vector of the target advertisement and the second target vector of the target search content may be calculated, and the search advertisements ranked according to that similarity.
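
Cosine similarity between two target vectors can be computed as in the sketch below. Returning 0.0 when either vector is all zeros is a convention chosen for this illustration, not something the application specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention of this sketch for zero vectors
    return dot / (norm_a * norm_b)


first_target = [0.6, 0.8]    # toy first target vector (advertisement)
second_target = [0.8, 0.6]   # toy second target vector (search content)
print(cosine_similarity(first_target, second_target))
```

The measure depends only on the direction of the two vectors, so differences in vector length do not affect the ranking.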

In addition, in an actual application scenario, the first target vectors of all advertisements may be calculated and stored in advance. When a user searches online, only the second target vector of the target search content needs to be calculated; the similarity between that second target vector and each of the stored first target vectors is then calculated in turn, and all advertisements are ranked by similarity. Finally, advertisements are pushed to the user after the search according to the ranking order.
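
The offline/online split can be illustrated with a minimal sketch in which the first target vectors are held in a dictionary and only the query vector is supplied at search time. The advertisement identifiers, the toy vectors and the function names are hypothetical, and ties are broken by identifier purely to keep the output deterministic.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 for zero vectors (convention of this sketch)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def rank_ads(query_vector, ad_vectors):
    """Return advertisement ids sorted by descending similarity."""
    scored = [(cosine(query_vector, vec), ad_id) for ad_id, vec in ad_vectors.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [ad_id for _, ad_id in scored]


# Offline: first target vectors computed and stored for every advertisement.
ad_vectors = {"ad_a": [1.0, 0.0], "ad_b": [0.6, 0.8], "ad_c": [0.0, 1.0]}
# Online: only the second target vector of the search content is computed.
query_vector = [0.8, 0.6]
print(rank_ads(query_vector, ad_vectors))  # ['ad_b', 'ad_a', 'ad_c']
```

Precomputing the advertisement vectors means the per-query cost is one encoding of the search content plus one similarity computation per stored advertisement.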

According to the method for ranking search advertisements provided in this embodiment, the similarity is determined between the first target vector, obtained from a plurality of vectors representing different dimensional semantics of the target advertisement, and the second target vector, obtained from a plurality of vectors representing different dimensional semantics of the target search content. Because more factors are taken into account when the similarity is calculated, the calculated similarity is more accurate, and the arrangement order of the target advertisements determined from that similarity is correspondingly more accurate. This addresses the problem that existing ways of ranking advertisements are not accurate enough.

Fig. 8 is a block diagram illustrating the structure of an apparatus for ranking search advertisements according to an embodiment of the present application. Referring to fig. 8, the ranking apparatus 800 provided in this embodiment may include an obtaining module 810 and a determining module 820.

The obtaining module 810 is configured to obtain a first target vector for a target advertisement, where the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement; and to obtain a second target vector for the target search content, where the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content.

The determining module 820 is configured to determine a similarity between the second target vector and the first target vector, and to determine the arrangement order of the target advertisements based on the similarity.
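
The division of work between the two modules might be sketched as two small classes. This is only one possible software arrangement, assumed for illustration: the class names, the injected encoder callables and the dot-product similarity used in the example are all hypothetical, and the encoders here are trivial stand-ins for the vector construction described above.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


class ObtainingModule:
    """Counterpart of module 810: produces both target vectors."""

    def __init__(self, encode_ad: Callable[[str], Vector],
                 encode_query: Callable[[str], Vector]):
        self.encode_ad = encode_ad
        self.encode_query = encode_query

    def obtain(self, ad_text: str, query_text: str) -> Tuple[Vector, Vector]:
        return self.encode_ad(ad_text), self.encode_query(query_text)


class DeterminingModule:
    """Counterpart of module 820: scores similarity and orders ads."""

    def __init__(self, similarity: Callable[[Vector, Vector], float]):
        self.similarity = similarity

    def order(self, query_vec: Vector, ads: Sequence[Tuple[str, Vector]]):
        # ads is a sequence of (ad_id, first_target_vector) pairs.
        return sorted(ads, key=lambda ad: self.similarity(query_vec, ad[1]),
                      reverse=True)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


determiner = DeterminingModule(dot)
ads = [("ad_x", [0.1, 0.9]), ("ad_y", [0.9, 0.1])]
print([ad_id for ad_id, _ in determiner.order([1.0, 0.0], ads)])
```

Injecting the encoders and the similarity function keeps each module testable on its own, which matches the way the description treats them as separately configurable units.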

The apparatus for ranking search advertisements provided in this embodiment obtains a first target vector for a target advertisement, where the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement; obtains a second target vector for the target search content, where the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; determines a similarity between the second target vector and the first target vector; and determines the arrangement order of the target advertisements based on the similarity. Because more factors are taken into account when the similarity is calculated, the calculated similarity is more accurate, and the arrangement order of the target advertisements determined from that similarity is correspondingly more accurate. This addresses the problem that existing ways of ranking advertisements are not accurate enough.

Optionally, in an embodiment, the obtaining module 810 may specifically be configured to: acquiring a first keyword vector of a target advertisement, wherein the first keyword vector is used for indicating keywords in the target advertisement; obtaining a first characterization vector of the target advertisement, wherein the first characterization vector is used for indicating the overall semantics of the target advertisement; determining a first target vector for the targeted advertisement based on the first keyword vector and the first characterization vector.

Optionally, in an embodiment, the obtaining module 810 may be further specifically configured to: acquiring a second keyword vector of target search content, wherein the second keyword vector is used for indicating keywords in the target search content; acquiring a second characterization vector of the target search content, wherein the second characterization vector is used for indicating the overall semantics of the target search content; determining a second target vector for the target search content based on the second keyword vector and the second characterization vector.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: extracting at least one keyword for the targeted advertisement; determining at least one keyword vector corresponding to the at least one keyword by using a pre-trained word vector model; based on the at least one keyword vector, a first keyword vector for the targeted advertisement is determined.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: processing the at least one keyword vector through a fully connected layer with an activation function to obtain a first specified keyword vector; and performing pooling processing on the first specified keyword vector to obtain a first keyword vector for the target advertisement.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: extracting an advertisement text aiming at the target advertisement, and taking the advertisement text as the input of a pre-trained semantic model; acquiring N classification vectors based on the semantic model; determining a first characterization vector for the targeted advertisement based on the N classification vectors.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: performing a fully connected computation on the N classification vectors to convert them into N scalars; passing the N scalars through a normalized exponential function (softmax) to obtain N importance factors; determining a first characterization vector for the target advertisement based on the N classification vectors and the N importance factors.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: extracting at least one keyword for the target search content; determining at least one keyword vector corresponding to the at least one keyword by using a pre-trained word vector model; determining a second keyword vector for the targeted search content based on the at least one keyword vector.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: processing the at least one keyword vector through a full connection layer with an activation function to obtain a second specified keyword vector; and pooling the second specified keyword vector to obtain a second keyword vector aiming at the target search content.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: extracting a search text, and taking the search text as an input of a pre-trained semantic model; obtaining M classification vectors based on the semantic model; determining a second characterization vector for the target search content based on the M classification vectors.

Optionally, in an embodiment, the obtaining module 810 may be further configured to: performing a fully connected computation on the M classification vectors to convert them into M scalars; passing the M scalars through a normalized exponential function (softmax) to obtain M importance factors; determining a second characterization vector for the target search content based on the M classification vectors and the M importance factors.

Optionally, in one embodiment, the target advertisement may include a plurality of advertisements, and the first target vector may include a plurality of sub-vectors corresponding to the plurality of advertisements; the determining module 820 may be specifically configured to: determining a similarity of the second target vector to each of the plurality of sub-vectors; the determining module 820 may be further configured to: determining an order of arrangement of the plurality of advertisements based on a similarity of the second target vector to each of the plurality of sub-vectors.

It should be noted that the ranking device for search advertisements provided in the embodiments of the present application corresponds to the above-mentioned ranking method for search advertisements. The related content can refer to the above description of the method for sorting search advertisements, and is not described herein again.

In addition, as shown in fig. 9, an embodiment of the present application further provides a server 900, where the server 900 includes: a processor 910, a memory 920, and programs or instructions stored on the memory 920 and executable on the processor 910, which, when executed by the processor 910, implement the steps of any of the methods described above. For example, the program, when executed by the processor 910, implements the following processes: obtaining a first target vector for a target advertisement, wherein the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement; obtaining a second target vector for the target search content, wherein the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; determining a similarity between the second target vector and the first target vector; and determining the arrangement order of the target advertisements based on the similarity.
Because more factors are taken into account when the similarity between the first target vector and the second target vector is calculated, the calculated similarity is more accurate, and the arrangement order of the target advertisements determined from that similarity is correspondingly more accurate. This addresses the problem that existing ways of ranking advertisements are not accurate enough.

Embodiments of the present application also provide a readable storage medium on which a program or instructions are stored, which, when executed by the processor 910, implement the steps of any of the methods described above. For example, the program, when executed by the processor 910, implements the following processes: obtaining a first target vector for a target advertisement, wherein the first target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target advertisement; obtaining a second target vector for the target search content, wherein the second target vector is obtained based on a plurality of vectors representing different dimensional semantics of the target search content, and the dimensions representing the target advertisement are consistent with the dimensions representing the target search content; determining a similarity between the second target vector and the first target vector; and determining the arrangement order of the target advertisements based on the similarity.
Because more factors are taken into account when the similarity between the first target vector and the second target vector is calculated, the calculated similarity is more accurate, and the arrangement order of the target advertisements determined from that similarity is correspondingly more accurate. This addresses the problem that existing ways of ranking advertisements are not accurate enough.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer-readable medium does not include transitory computer-readable media such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A method of ranking search ads, the method comprising:

2. The ranking method of claim 1, wherein the obtaining a first target vector for a target advertisement comprises:

acquiring a first keyword vector of a target advertisement, wherein the first keyword vector is used for indicating keywords in the target advertisement;

obtaining a first characterization vector of the target advertisement, wherein the first characterization vector is used for indicating the overall semantics of the target advertisement;

determining a first target vector for the target advertisement based on the first keyword vector and the first characterization vector;

the obtaining a second target vector for the target search content comprises:

acquiring a second keyword vector of target search content, wherein the second keyword vector is used for indicating keywords in the target search content;

acquiring a second characterization vector of the target search content, wherein the second characterization vector is used for indicating the overall semantics of the target search content;

determining a second target vector for the target search content based on the second keyword vector and the second characterization vector.

3. The ranking method of claim 2, wherein the obtaining a first keyword vector of a target advertisement comprises:

extracting at least one keyword for the target advertisement;

determining at least one keyword vector corresponding to the at least one keyword by using a pre-trained word vector model;

determining a first keyword vector for the target advertisement based on the at least one keyword vector;

the obtaining a second keyword vector of target search content comprises:

extracting at least one keyword for the target search content;

determining a second keyword vector for the target search content based on the at least one keyword vector.

4. The ranking method of claim 3, wherein the determining a first keyword vector for the target advertisement based on the at least one keyword vector comprises:

processing the at least one keyword vector through a fully connected layer with an activation function to obtain a first specified keyword vector;

pooling the first specified keyword vector to obtain a first keyword vector for the target advertisement;

the determining, based on the at least one keyword vector, a second keyword vector for the target search content comprises:

processing the at least one keyword vector through a fully connected layer with an activation function to obtain a second specified keyword vector;

and pooling the second specified keyword vector to obtain a second keyword vector for the target search content.
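As a non-limiting illustration of the processing recited in claim 4, the following minimal sketch passes each keyword vector through a fully connected layer with an activation function and then pools the results into a single keyword vector. The claim does not fix the activation function or the pooling type; ReLU and mean pooling are assumptions made here, and the names `dense_relu`, `mean_pool`, and `keyword_vector`, as well as the weights and bias values, are hypothetical.

```python
from typing import List

Vector = List[float]


def dense_relu(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise average over all vectors (the pooling type is an assumption)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def keyword_vector(word_vectors: List[Vector],
                   weights: List[Vector],
                   bias: Vector) -> Vector:
    """Project every keyword vector, then pool the projections into one vector."""
    projected = [dense_relu(v, weights, bias) for v in word_vectors]
    return mean_pool(projected)


# Two hypothetical 2-dimensional keyword vectors and an identity layer.
example = keyword_vector(
    word_vectors=[[1.0, 2.0], [3.0, -4.0]],
    weights=[[1.0, 0.0], [0.0, 1.0]],
    bias=[0.0, 0.0],
)
print(example)  # [2.0, 1.0]
```

The same function could be applied to the keywords of the target advertisement and to those of the target search content, so that both keyword vectors lie in the same space.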

5. The ranking method of claim 2, wherein the obtaining a first characterization vector of the target advertisement comprises:

extracting an advertisement text of the target advertisement, and using the advertisement text as the input of a pre-trained semantic model;

acquiring N classification vectors based on the semantic model;

determining a first characterization vector for the target advertisement based on the N classification vectors;

the obtaining a second characterization vector of the target search content includes:

extracting a search text, and taking the search text as an input of a pre-trained semantic model;

obtaining M classification vectors based on the semantic model;

determining a second characterization vector for the target search content based on the M classification vectors.

6. The ranking method of claim 5, wherein the determining a first characterization vector for the target advertisement based on the N classification vectors comprises:

performing a fully connected computation on the N classification vectors to convert them into N scalars;

applying a normalized exponential function (softmax) to the N scalars to obtain N importance factors;

determining a first characterization vector for the target advertisement based on the N classification vectors and the N importance factors;

the determining, based on the M classification vectors, a second characterization vector for the target search content comprises:

performing a fully connected computation on the M classification vectors to convert them into M scalars;

applying a normalized exponential function (softmax) to the M scalars to obtain M importance factors;

determining a second characterization vector for the target search content based on the M classification vectors and the M importance factors.
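As a non-limiting illustration of the computation recited in claim 6, the sketch below maps each classification vector to a scalar with a single fully connected unit, normalizes the scalars with a normalized exponential function (softmax) to obtain importance factors, and combines the classification vectors using those factors. The claim states only that the characterization vector is determined "based on" the vectors and the factors; the weighted sum used here is an assumption, and the names `softmax`, `characterization_vector`, `w`, and `b` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Normalized exponential function; subtracting the max keeps exp() stable."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def characterization_vector(class_vectors: List[Vector],
                            w: Vector,
                            b: float) -> Tuple[Vector, List[float]]:
    """Return the pooled vector and the importance factors that produced it."""
    # Fully connected unit: one scalar score per classification vector.
    scores = [sum(wi * vi for wi, vi in zip(w, v)) + b for v in class_vectors]
    # Importance factors are positive and sum to 1.
    factors = softmax(scores)
    # Weighted sum of the classification vectors (an assumed combination rule).
    dim = len(class_vectors[0])
    pooled = [
        sum(f * v[i] for f, v in zip(factors, class_vectors))
        for i in range(dim)
    ]
    return pooled, factors


# With zero weights every score is equal, so both vectors count equally.
pooled, factors = characterization_vector(
    class_vectors=[[1.0, 0.0], [0.0, 1.0]], w=[0.0, 0.0], b=0.0
)
print(factors)  # [0.5, 0.5]
print(pooled)   # [0.5, 0.5]
```

In a trained model, `w` and `b` would be learned parameters, so classification vectors that matter more for matching would receive larger importance factors.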

7. The ranking method of claim 1, wherein the target advertisement comprises a plurality of advertisements, and wherein the first target vector comprises a plurality of sub-vectors corresponding to the plurality of advertisements;

the determining the similarity between the second target vector and the first target vector comprises: determining a similarity of the second target vector to each of the plurality of sub-vectors;

the determining the ranking order of the target advertisements based on the similarity comprises: determining an order of arrangement of the plurality of advertisements based on a similarity of the second target vector to each of the plurality of sub-vectors.
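As a non-limiting illustration of the ordering recited in claim 7, the sketch below scores each advertisement sub-vector against the second target vector and sorts the advertisements by descending similarity. The claim does not name a similarity measure; cosine similarity is an assumption made here, and the names `cosine`, `rank_ads`, and the advertisement identifiers are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_ads(query_vector: Vector,
             ad_vectors: Dict[str, Vector]) -> List[Tuple[str, float]]:
    """Score every advertisement sub-vector and sort by descending similarity."""
    scored = [(ad_id, cosine(query_vector, vec))
              for ad_id, vec in ad_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-dimensional target vectors for one query and three ads.
ranking = rank_ads(
    query_vector=[1.0, 0.0],
    ad_vectors={"ad_a": [0.0, 1.0], "ad_b": [1.0, 0.0], "ad_c": [1.0, 1.0]},
)
print([ad_id for ad_id, _ in ranking])  # ['ad_b', 'ad_c', 'ad_a']
```

Because the advertisement sub-vectors do not depend on the query, they could be computed in advance, leaving only the second target vector and the similarity scores to be computed at search time.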

8. An apparatus for ranking search advertisements, the apparatus comprising:

9. A server, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method of any one of claims 1-7.

10. A readable storage medium, characterized in that it stores thereon a program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1-7.