CN110222271B - Method and device for generating webpage - Google Patents

Info

Publication number
CN110222271B
Authority
CN
China
Prior art keywords
search
behavior
statement
sample
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910533278.8A
Other languages
Chinese (zh)
Other versions
CN110222271A (en
Inventor
刘昊骋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910533278.8A priority Critical patent/CN110222271B/en
Publication of CN110222271A publication Critical patent/CN110222271A/en
Application granted granted Critical
Publication of CN110222271B publication Critical patent/CN110222271B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a method and a device for generating a webpage. One embodiment of the method comprises: searching based on a search statement to obtain search results and related search statements of the search statement; acquiring behavior vectors of the search statement, the search results and the related search statements, respectively; calculating the similarity between the behavior vector of the search statement and the behavior vector of each search result, and the similarity between the behavior vector of the search statement and the behavior vector of each related search statement; and sorting the search results and the related search statements, respectively, based on the similarities to generate a search webpage. The implementation relates to the field of cloud computing. Because the search results and the related search statements are sorted by similarity when the search webpage is generated, the search webpage conforms to the search behavior of users, which improves the click-through rate of the search results and the related search statements on the search webpage.

Description

Method and device for generating webpage
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for generating a webpage.
Background
With the rapid development of the internet, information resources on the network keep growing, and the amount of information data is also increasing rapidly. In modern society, searching for required information through a search engine has become a main way of acquiring information. Therefore, the main development direction of search engines is to improve the relevance of search and to provide more convenient and effective query services for users.
Typically, a large number of search results are returned based on a search statement entered by a user. However, existing search results are not matched to the search behavior of users, so a user has to page through the results many times, or enter search statements many times, to obtain the desired search result.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a webpage.
In a first aspect, an embodiment of the present application provides a method for generating a web page, including: searching based on a search statement to obtain search results and related search statements of the search statement; acquiring behavior vectors of the search statement, the search results and the related search statements, respectively; calculating the similarity between the behavior vector of the search statement and the behavior vector of each search result, and the similarity between the behavior vector of the search statement and the behavior vector of each related search statement; and sorting the search results and the related search statements, respectively, based on the similarities to generate a search webpage.
In some embodiments, obtaining the behavior vectors of the search statement, the search result, and the related search statement separately comprises: respectively acquiring word vectors of a search statement, a search result and a related search statement; and multiplying the word vectors of the search statement, the search result and the related search statement by a pre-trained first weight matrix respectively to obtain the behavior vectors of the search statement, the search result and the related search statement.
In some embodiments, obtaining the word vectors of the search statement, the search result, and the related search statement, respectively, comprises: matching the search statement, the search result and the related search statement, respectively, in a sample search click behavior set, and taking the word vectors of the successfully matched sample search click behaviors as the word vectors of the search statement, the search result and the related search statement, wherein the sample search click behaviors in the sample search click behavior set are pre-encoded into corresponding word vectors.
In some embodiments, the first weight matrix is obtained by training: initializing a first weight matrix; multiplying the first weight matrix with word vectors of sample search click behaviors in the sample search click behavior set respectively to obtain initial behavior vectors of the sample search click behaviors in the sample search click behavior set; taking an initial behavior vector subset corresponding to a sample search click behavior subset belonging to one search session in a sample search click behavior set as a training sample; training is performed based on the training samples to update the first weight matrix.
In some embodiments, initializing the first weight matrix further comprises: initializing a second weight matrix; and training based on the training samples to update the first weight matrix comprises: taking the initial behavior vectors of the context sample search click behaviors in a training sample as input, and outputting a mapping vector based on a preset mapping method; multiplying the mapping vector by the second weight matrix to obtain a product vector; inputting the product vector into a preset activation function to obtain a predicted word vector of the center sample search click behavior in the training sample; and updating the first weight matrix and the second weight matrix through a cross-entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross-entropy loss function converges, at which point the training of the first weight matrix is determined to be complete.
In a second aspect, an embodiment of the present application provides an apparatus for generating a web page, including: a search unit configured to search based on a search statement to obtain search results and related search statements of the search statement; an acquisition unit configured to acquire behavior vectors of the search statement, the search results, and the related search statements, respectively; a calculation unit configured to calculate the similarities of the behavior vector of the search statement with the behavior vector of each search result and with the behavior vector of each related search statement, respectively; and a sorting unit configured to sort the search results and the related search statements, respectively, based on the similarities to generate a search webpage.
In some embodiments, the acquisition unit comprises: an acquisition subunit configured to acquire word vectors of the search statement, the search result, and the related search statement, respectively; and a multiplication subunit configured to multiply the word vectors of the search statement, the search result and the related search statement by a pre-trained first weight matrix, respectively, to obtain the behavior vectors of the search statement, the search result and the related search statement.
In some embodiments, the acquisition subunit is further configured to: match the search statement, the search result and the related search statement, respectively, in a sample search click behavior set, and take the word vectors of the successfully matched sample search click behaviors as the word vectors of the search statement, the search result and the related search statement, wherein the sample search click behaviors in the sample search click behavior set are pre-encoded into corresponding word vectors.
In some embodiments, the first weight matrix is obtained by training: initializing a first weight matrix; multiplying the first weight matrix with word vectors of sample search click behaviors in the sample search click behavior set respectively to obtain initial behavior vectors of the sample search click behaviors in the sample search click behavior set; taking an initial behavior vector subset corresponding to a sample search click behavior subset belonging to one search session in a sample search click behavior set as a training sample; training is performed based on the training samples to update the first weight matrix.
In some embodiments, initializing the first weight matrix further comprises: initializing a second weight matrix; and training based on the training samples to update the first weight matrix comprises: taking the initial behavior vectors of the context sample search click behaviors in a training sample as input, and outputting a mapping vector based on a preset mapping method; multiplying the mapping vector by the second weight matrix to obtain a product vector; inputting the product vector into a preset activation function to obtain a predicted word vector of the center sample search click behavior in the training sample; and updating the first weight matrix and the second weight matrix through a cross-entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross-entropy loss function converges, at which point the training of the first weight matrix is determined to be complete.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method as described in any implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
According to the method and the device for generating a webpage provided by the embodiments of the application, a search is first performed based on the search statement to obtain search results and related search statements of the search statement; behavior vectors of the search statement, the search results and the related search statements are then acquired, respectively; next, the similarity between the behavior vector of the search statement and the behavior vector of each search result, and the similarity between the behavior vector of the search statement and the behavior vector of each related search statement, are calculated; finally, the search results and the related search statements are sorted, respectively, based on the similarities to generate the search webpage. Because the search results and the related search statements are sorted by similarity when the search webpage is generated, the search webpage conforms to the search behavior of users, which improves the click-through rate of the search results and the related search statements on the search webpage.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture to which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a web page in accordance with the present application;
FIG. 3 is a flow diagram of one embodiment of a method for training a first weight matrix according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for training a first weight matrix according to the present application;
FIG. 5 is a schematic diagram of a mapping process;
FIG. 6 is a schematic diagram illustrating an embodiment of an apparatus for generating a web page according to the present application;
FIG. 7 is a block diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating a web page or the apparatus for generating a web page of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include a terminal device 101, a network 102, and a server 103. Network 102 is the medium used to provide communication links between terminal devices 101 and server 103. Network 102 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal device 101 to interact with server 103 over network 102 to receive or send messages and the like. Various client software, such as a search-type application, may be installed on the terminal device 101.
The terminal device 101 may be hardware or software. When the terminal device 101 is hardware, it may be any of various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like. When the terminal device 101 is software, it can be installed in the above-described electronic devices. It may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not particularly limited herein.
The server 103 may be a server that provides various services, such as a web page generation server. The web page generation server may analyze data such as a search statement, generate a processing result (for example, a search webpage), and push the processing result to the terminal device 101.
The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 103 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module, which is not particularly limited herein.
It should be noted that the method for generating the web page provided by the embodiment of the present application is generally executed by the server 103, and accordingly, the apparatus for generating the web page is generally disposed in the server 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating a web page in accordance with the present application is shown. The method for generating the webpage comprises the following steps:
Step 201, searching based on the search statement to obtain search results and related search statements of the search statement.
In this embodiment, an execution body (e.g., the server 103 shown in fig. 1) of the method for generating a web page may first acquire a search statement input by a user from a terminal device (e.g., the terminal device 101 shown in fig. 1), and then perform a search in a search engine based on the search statement to obtain search results and related search statements of the search statement. Typically, there is more than one search result and more than one related search statement. For example, the user opens a search application installed on the terminal device, enters the search statement "A bank credit card" in a search box, and clicks a search button. The terminal device may then transmit the search statement "A bank credit card" to the execution body. The execution body may perform a search based on the search statement "A bank credit card" to obtain the search results "A bank credit card official website", "A bank young card detail page", "A bank young card transaction flow", and "B bank credit card". The same search also yields the related search statements "A bank young card" and "B bank credit card".
Step 202, behavior vectors of the search statement, the search result and the related search statement are respectively obtained.
In this embodiment, the execution body may obtain behavior vectors of the search statement, the search result, and the related search statement, respectively. The behavior vector of the search statement may be used to represent the historical search behavior for the search statement. The behavior vector of a search result may be used to represent the historical click behavior for the search result. The behavior vector of a related search statement may be used to represent the historical search behavior for the related search statement. Generally, if a search result was clicked with high probability when the search statement was searched historically, the similarity between the behavior vector of the search statement and the behavior vector of the search result is high; conversely, if the search result was clicked with low probability, the similarity between the two behavior vectors is low.
In some optional implementations of this embodiment, the execution body may first obtain word vectors of the search statement, the search result, and the related search statement, respectively, and then multiply these word vectors by a pre-trained first weight matrix, respectively, to obtain the behavior vectors of the search statement, the search result and the related search statement. Word vector technology may convert words into dense vectors, such that similar words have similar word vectors. Generally, the word vectors of the search statement, the search result, and the related search statement may be obtained by one-hot encoding them. The first weight matrix is an N × M matrix and represents the correspondence between word vectors and behavior vectors, where N is the dimension of the behavior vector, typically in the range of 300 to 500, and M is the number of sample search click behaviors.
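The multiplication described above can be illustrated with a minimal sketch. It assumes the first weight matrix is stored as a list of N rows of length M; the helper names `one_hot` and `mat_vec` and the 3 × 4 matrix values are hypothetical and chosen only for illustration. With a one-hot word vector, the product simply selects one column of the matrix.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # stored as a list of rows


def one_hot(index: int, size: int) -> Vector:
    """Return a one-hot word vector of length `size` with a 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def mat_vec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply an N x M matrix by an M-dimensional vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical 3 x 4 first weight matrix (N = 3, M = 4); values are made up.
W1: Matrix = [
    [0.2, 1.0, 0.8, -1.0],
    [-2.0, 1.2, 0.4, -0.9],
    [0.1, -0.4, 0.7, 2.0],
]

word_vector = one_hot(2, 4)                 # third sample search click behavior
behavior_vector = mat_vec(W1, word_vector)  # selects the third column of W1
```

Because the input is one-hot, an implementation could skip the multiplication and read the column directly; the explicit product is kept here to mirror the description.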
In some optional implementations of this embodiment, the execution body may match the search statement, the search result, and the related search statement in the sample search click behavior set, and take the word vectors of the successfully matched sample search click behaviors as the word vectors of the search statement, the search result, and the related search statement. The sample search click behaviors in the sample search click behavior set are pre-encoded into corresponding word vectors, and the encoding may be one-hot encoding. In general, a sample search click behavior in the set may be a historical search click behavior. It should be noted that, when the search statement, search result, or related search statement has no exact match in the sample search click behavior set, the execution body may look up the word vector of a sample search click behavior that is synonymous with or close to it. For example, suppose the search statement is a shortened variant of "A bank credit card". That exact search click behavior does not exist in the sample search click behavior set, but the search click behavior "A bank credit card" does; in this case, the word vector of "A bank credit card" may be used as the word vector of the search statement.
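The lookup with a fallback can be sketched as follows. The source does not say how a "synonymous or close" behavior is found; this sketch assumes plain string similarity from Python's standard `difflib` module as a stand-in, and the function name `lookup_word_vector`, the cutoff value, and the sample set are hypothetical.

```python
import difflib
from typing import Dict, List, Optional


def lookup_word_vector(
    text: str,
    encoded: Dict[str, List[int]],
    cutoff: float = 0.6,
) -> Optional[List[int]]:
    """Return the word vector for `text`, falling back to the closest key.

    `encoded` maps each sample search click behavior to its pre-encoded
    word vector. String similarity is an assumption; a real system might
    use synonym dictionaries or semantic matching instead.
    """
    if text in encoded:
        return encoded[text]
    close = difflib.get_close_matches(text, list(encoded), n=1, cutoff=cutoff)
    return encoded[close[0]] if close else None


# Hypothetical pre-encoded sample set with one-hot word vectors.
samples = {
    "A bank credit card": [1, 0, 0],
    "A bank young card": [0, 1, 0],
    "B bank credit card": [0, 0, 1],
}
```

Returning `None` when nothing is close enough leaves the caller free to decide how to handle an unmatched item.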
Step 203, respectively calculating the similarity between the behavior vector of the search statement and the behavior vector of the search result and the behavior vector of the related search statement.
In this embodiment, the execution body may calculate a similarity between the behavior vector of the search statement and the behavior vector of the search result. Meanwhile, the similarity between the behavior vector of the search statement and the behavior vector of the related search statement is calculated. The similarity calculation method between vectors may include, but is not limited to, a cosine similarity method, a euclidean distance method, and the like.
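The two similarity measures named above can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names are hypothetical, and returning 0.0 for a zero-length vector is an assumption made to avoid division by zero.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; larger means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that the two measures point in opposite directions: a ranking by cosine similarity sorts in descending order, while a ranking by Euclidean distance would sort in ascending order.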
Step 204, sorting the search results and the related search statements, respectively, based on the similarities to generate a search webpage.
In this embodiment, the execution body may sort the search results based on the similarity between the behavior vector of the search statement and the behavior vector of the search results, and at the same time, sort the related search statements based on the similarity between the behavior vector of the search statement and the behavior vector of the related search statement to generate the search webpage. Generally, a search result area and a related search sentence area are included in a search web page. For example, the execution body may set the search results in the search result area from top to bottom in the order of the similarity from large to small, and set the related search sentences in the related search sentence area from top to bottom in the order of the similarity from large to small, so as to generate the search web page. Subsequently, the execution body may send the search webpage to the terminal device for the user to browse.
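The ordering step can be sketched as a sort by descending similarity. The sketch assumes cosine similarity as the measure; the function name `rank_by_similarity` and all behavior vector values are hypothetical, and only the titles come from the example in step 201.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(
    query_vector: Sequence[float],
    candidates: Dict[str, Sequence[float]],
) -> List[str]:
    """Return candidate titles ordered from most to least similar."""
    return sorted(
        candidates,
        key=lambda title: cosine_similarity(query_vector, candidates[title]),
        reverse=True,
    )


# Hypothetical behavior vectors; the numbers are made up for illustration.
query = [1.0, 0.0]
results = {
    "B bank credit card": [0.0, 1.0],
    "A bank credit card official website": [1.0, 0.1],
    "A bank young card detail page": [1.0, 1.0],
}
ordered_results = rank_by_similarity(query, results)
```

The same function can be applied a second time to the related search statements, so that each area of the search webpage is filled from top to bottom in its own similarity order.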
According to the method for generating a webpage provided by this embodiment, a search is first performed based on the search statement to obtain search results and related search statements of the search statement; behavior vectors of the search statement, the search results and the related search statements are then acquired, respectively; next, the similarity between the behavior vector of the search statement and the behavior vector of each search result, and the similarity between the behavior vector of the search statement and the behavior vector of each related search statement, are calculated; finally, the search results and the related search statements are sorted, respectively, based on the similarities to generate the search webpage. Because the search results and the related search statements are sorted by similarity when the search webpage is generated, the search webpage conforms to the search behavior of users, which improves the click-through rate of the search results and the related search statements on the search webpage.
With continued reference to FIG. 3, a flow 300 of one embodiment of a method for training a first weight matrix according to the present application is shown. The method for training the first weight matrix comprises the following steps:
Step 301, initializing a first weight matrix.
In this embodiment, an executing agent (e.g., the server 103 shown in fig. 1) of the method for training the first weight matrix may initialize the first weight matrix. In general, the execution body may randomly initialize the first weight matrix. For example, if the first weight matrix is a 3 row by 7 column matrix, then the first weight matrix may be initialized as:
[Figure: example of a randomly initialized 3 × 7 first weight matrix]
Step 302, multiplying the first weight matrix by the word vector of each sample search click behavior in the sample search click behavior set, respectively, to obtain the initial behavior vector of each sample search click behavior in the set.
In this embodiment, the execution body may multiply the first weight matrix by the word vector of each sample search click behavior in the sample search click behavior set, respectively, to obtain the initial behavior vector of each sample search click behavior in the set. The initial behavior vector is an N-dimensional vector. For example, the sample search click behavior set includes 7 sample search click behaviors, and the sample search click behaviors and their corresponding word vectors are shown in the following table:
Serial number | Search click behavior                                            | Word vector
1             | Search for "A bank credit card"                                  | 1 0 0 0 0 0 0
2             | Click search result "A bank credit card official website"        | 0 1 0 0 0 0 0
3             | Search for "A bank young card"                                   | 0 0 1 0 0 0 0
4             | Click search result "A bank young card detail page"              | 0 0 0 1 0 0 0
5             | Click search result "A bank young card handling process"         | 0 0 0 0 1 0 0
6             | Search for "B bank credit card"                                  | 0 0 0 0 0 1 0
7             | Click search result "B bank credit card homepage"                | 0 0 0 0 0 0 1
When the first weight matrix is multiplied by the word vector of the behavior of searching for "A bank young card", the vector (0.8 0.4 0.7) is obtained, which is the initial behavior vector of that behavior.
Step 303, taking an initial behavior vector subset corresponding to the sample search click behavior subset belonging to one search session in the sample search click behavior set as a training sample.
In this embodiment, the execution body may use, as a training sample, the initial behavior vector subset corresponding to a sample search click behavior subset that belongs to one search session in the sample search click behavior set. Generally, the search click behaviors generated between a user opening a search application, performing a series of search and click operations, and closing the search application belong to one search session.
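Grouping behaviors into per-session training samples can be sketched as below. The sketch assumes each logged behavior carries a session identifier; that identifier, the log layout, and the function name `group_by_session` are hypothetical, since the source only says that behaviors between opening and closing the search application form one session.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_session(log: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (session_id, behavior) records into ordered per-session lists."""
    sessions: Dict[str, List[str]] = defaultdict(list)
    for session_id, behavior in log:
        sessions[session_id].append(behavior)  # keeps the original order
    return dict(sessions)


# Hypothetical click log; session ids "s1" and "s2" are made up.
click_log = [
    ("s1", 'Search for "A bank credit card"'),
    ("s1", 'Click search result "A bank credit card official website"'),
    ("s2", 'Search for "B bank credit card"'),
    ("s1", 'Search for "A bank young card"'),
    ("s2", 'Click search result "B bank credit card homepage"'),
]
training_sessions = group_by_session(click_log)
```

Each list in `training_sessions` would then be mapped to its initial behavior vectors to form one training sample.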
Step 304, training is performed based on the training samples to update the first weight matrix.
In this embodiment, the executing entity may perform training based on the training samples, and continuously update the first weight matrix during the training process. Generally, each time the first weight matrix is updated, the process returns to continue to execute step 302 until the similarity between the initial behavior vectors of the similar sample search click behavior meets the preset constraint condition.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for training a first weight matrix according to the present application is shown. The method for training the first weight matrix comprises the following steps:
Step 401, initializing a first weight matrix and a second weight matrix.
In this embodiment, an execution body (e.g., the server 103 shown in fig. 1) of the method for training the first weight matrix may initialize the first weight matrix and the second weight matrix. In general, the execution body may randomly initialize the first weight matrix and the second weight matrix. The second weight matrix is an M × N matrix.
Step 402, multiplying the first weight matrix with the word vector of the sample search click behavior in the sample search click behavior set respectively to obtain an initial behavior vector of the sample search click behavior in the sample search click behavior set.
Step 403, taking an initial behavior vector subset corresponding to the sample search click behavior subset belonging to one search session in the sample search click behavior set as a training sample.
In the present embodiment, the specific operations of steps 402-403 have been described in detail in steps 302-303 of the embodiment shown in fig. 3, and are not described herein again.
Step 404, taking the initial behavior vectors of the context sample search click behaviors in the training sample as input, and outputting a mapping vector based on a preset mapping method.
In this embodiment, the executing entity may take an initial behavior vector of a context sample search click behavior in the training sample as an input, and output a mapping vector based on a preset mapping method. The preset mapping method may include, but is not limited to, vector summation, vector averaging, and vector maximization. The mapping vector is an N-dimensional vector.
For example, the training samples are shown in the following table:
Serial number | Search click behavior                                            | Initial behavior vector
S(t-2)        | Search for "A bank credit card"                                  | (0.2 -2 0.1)
S(t-1)        | Click search result "A bank credit card official website"        | (1 1.2 -0.4)
S(t)          | Search for "A bank young card"                                   | (0.8 0.4 0.7)
S(t+1)        | Click search result "A bank young card detail page"              | (-1 -0.9 2)
S(t+2)        | Click search result "A bank young card handling process"         | (3 1 -1)
The mapping process is shown in FIG. 5, where the inputs S(t-2), S(t-1), S(t+1), and S(t+2) are the initial behavior vectors of the context sample search click behaviors in the training sample. The output S(t) is the initial behavior vector of the center sample search click behavior in the training sample. The preset mapping method is a vector summation method.
Mapping with the vector summation method yields the vector (3.2, -0.7, 0.7), which is the mapping vector.
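The three mapping methods named in step 404 can be sketched as below, using the context vectors from the table above. Reading "vector maximization" as an element-wise maximum is an assumption, and the function names are hypothetical.

```python
def map_sum(vectors):
    """Element-wise sum of the context behavior vectors."""
    return [sum(column) for column in zip(*vectors)]

def map_average(vectors):
    """Element-wise mean of the context behavior vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def map_max(vectors):
    """Element-wise maximum (assumed meaning of 'vector maximization')."""
    return [max(column) for column in zip(*vectors)]

# Context vectors S(t-2), S(t-1), S(t+1), S(t+2) from the table.
context_vectors = [
    [0.2, -2.0, 0.1],
    [1.0, 1.2, -0.4],
    [-1.0, -0.9, 2.0],
    [3.0, 1.0, -1.0],
]

# Up to floating-point rounding this is (3.2, -0.7, 0.7), matching the text.
mapping_vector = map_sum(context_vectors)
```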
Step 405, multiplying the mapping vector by the second weight matrix to obtain a product vector.
In this embodiment, the execution subject may multiply the mapping vector by the second weight matrix to obtain a product vector. The product vector is an M-dimensional vector.
Step 406, inputting the product vector into a preset activation function to obtain a predicted word vector of the center sample search click behavior in the training sample.
In this embodiment, the execution subject may input the product vector into a preset activation function to obtain a predicted word vector of the center sample search click behavior in the training sample. The preset activation function may be a softmax function. The predicted word vector is an M-dimensional vector, and the value in each dimension is the probability assigned to that dimension. For example, the predicted word vector may be (0.02, 0.08, 0.7, 0.04, 0.06, 0.05, 0.05), while the word vector of the center sample search click behavior in the training sample is (0, 0, 1, 0, 0, 0, 0).
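Steps 405-406 can be sketched as below: the M × N second weight matrix is multiplied by the N-dimensional mapping vector, and the resulting M-dimensional product vector is passed through softmax. The matrix values (with M = 4, N = 3) are hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability measure rather than something the description requires.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(values):
    """Softmax with the maximum subtracted first to avoid overflow."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4 x 3 second weight matrix (M = 4, N = 3).
second_weight = [
    [0.1, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, -0.2],
]
mapping_vector = [3.2, -0.7, 0.7]

product_vector = matvec(second_weight, mapping_vector)  # step 405, M-dimensional
predicted_word_vector = softmax(product_vector)         # step 406, entries sum to 1
```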
Step 407, updating the first weight matrix and the second weight matrix through a cross entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross entropy loss function converges, and determining that the training of the first weight matrix is completed.
In this embodiment, the execution subject may update the first weight matrix and the second weight matrix through a cross entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross entropy loss function converges, and then determine that training of the first weight matrix is completed. Specifically, the execution subject may calculate a loss value of the cross entropy loss function based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, and determine whether the loss value has converged. If it has converged, training of the first weight matrix is determined to be complete. If not, the first weight matrix and the second weight matrix are updated, the process returns to step 402, and the loop continues until convergence.
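A minimal sketch of the loop in steps 402-407 follows, assuming one-hot word vectors, vector summation as the mapping method, and plain gradient descent. The learning rate, the convergence threshold, the matrix sizes, and the function names are assumptions; the description does not specify an optimizer.

```python
import math
import random

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_index, predicted):
    """Cross entropy between a one-hot word vector and the predicted word vector."""
    return -math.log(predicted[target_index])

def train_step(first_weight, second_weight, context_ids, center_id, lr=0.1):
    n, m = len(first_weight), len(second_weight)
    # Steps 402 and 404: a one-hot product selects a column; sum the context columns.
    hidden = [sum(first_weight[r][c] for c in context_ids) for r in range(n)]
    # Steps 405-406: product vector, then softmax.
    predicted = softmax([sum(second_weight[k][r] * hidden[r] for r in range(n))
                         for k in range(m)])
    loss = cross_entropy(center_id, predicted)
    # Gradient of the loss w.r.t. the product vector is (predicted - one-hot target).
    delta = [p - (1.0 if k == center_id else 0.0) for k, p in enumerate(predicted)]
    grad_hidden = [sum(second_weight[k][r] * delta[k] for k in range(m)) for r in range(n)]
    for k in range(m):                      # update the second weight matrix
        for r in range(n):
            second_weight[k][r] -= lr * delta[k] * hidden[r]
    for r in range(n):                      # update the context columns of the first matrix
        for c in context_ids:
            first_weight[r][c] -= lr * grad_hidden[r]
    return loss

rng = random.Random(0)
M, N = 5, 3                                 # assumed sizes
first_weight = [[rng.uniform(-0.5, 0.5) for _ in range(M)] for _ in range(N)]
second_weight = [[rng.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(M)]
context_ids, center_id = [0, 1, 3, 4], 2    # hypothetical training sample

losses = []
for _ in range(500):
    losses.append(train_step(first_weight, second_weight, context_ids, center_id))
    # Assumed convergence test: the loss has stopped changing.
    if len(losses) > 1 and abs(losses[-2] - losses[-1]) < 1e-6:
        break
```

For the example vectors in step 406, the loss reduces to the negative logarithm of the probability at the center behavior's index, that is, -ln(0.7), which `cross_entropy(2, ...)` computes.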
With further reference to fig. 6, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for generating a web page, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 6, the apparatus 600 for generating a web page of the present embodiment may include: a search unit 601, an obtaining unit 602, a calculation unit 603, and a sorting unit 604. The search unit 601 is configured to perform a search based on a search statement to obtain a search result of the search statement and a related search statement; the obtaining unit 602 is configured to obtain behavior vectors of the search statement, the search result, and the related search statement, respectively; the calculation unit 603 is configured to calculate the similarity between the behavior vector of the search statement and the behavior vector of the search result, and the similarity between the behavior vector of the search statement and the behavior vector of the related search statement; the sorting unit 604 is configured to sort the search results and the related search statements, respectively, based on the similarities, and generate a search web page.
In the present embodiment, in the apparatus 600 for generating a web page: the specific processing of the searching unit 601, the obtaining unit 602, the calculating unit 603, and the sorting unit 604 and the technical effects thereof can refer to the related descriptions of step 201, step 202, step 203, and step 204 in the corresponding embodiment of fig. 2, which are not described herein again.
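The work of the calculation unit 603 and the sorting unit 604 can be sketched as below. This passage does not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the example vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(query_vector, candidates):
    """Return candidate names sorted by descending similarity to the query vector."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query_vector, candidates[name]),
        reverse=True,
    )

# Hypothetical behavior vectors for a search statement and its search results.
query_vector = [0.8, 0.4, 0.7]
search_results = {
    "official site": [1.0, 1.2, -0.4],
    "detail page": [1.0, 0.5, 0.9],
    "unrelated page": [-1.0, -0.9, 0.1],
}

ranked_results = rank_by_similarity(query_vector, search_results)
```

The same `rank_by_similarity` call would be applied a second time to the behavior vectors of the related search statements, since the two lists are sorted separately.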
In some optional implementations of this embodiment, the obtaining unit 602 includes: an obtaining subunit (not shown in the figure) configured to obtain word vectors of the search statement, the search result, and the related search statement, respectively; and a multiplying subunit (not shown in the figure) configured to multiply the word vectors of the search statement, the search result, and the related search statement by the pre-trained first weight matrix, respectively, to obtain the behavior vectors of the search statement, the search result, and the related search statement.
In some optional implementations of this embodiment, the obtaining subunit is further configured to: match the search statement, the search result, and the related search statement in the sample search click behavior set, respectively, and take the word vectors of the successfully matched sample search click behaviors as the word vectors of the search statement, the search result, and the related search statement, wherein each sample search click behavior in the sample search click behavior set is pre-coded into a corresponding word vector.
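The obtaining and multiplying subunits can be sketched as a lookup followed by a matrix product. The dictionary `vocabulary`, its keys, the matrix values, and the choice to return `None` when no sample behavior matches are all assumptions; the description only covers the successfully matched case.

```python
def behavior_vector(item, vocabulary, first_weight):
    """Look up an item's one-hot index and return its behavior vector, or None if unmatched."""
    index = vocabulary.get(item)
    if index is None:
        return None  # assumed handling of an unmatched item
    # A one-hot word vector times the first weight matrix equals column `index`.
    return [row[index] for row in first_weight]

# Hypothetical sample search click behavior set, pre-coded as one-hot indices.
vocabulary = {
    "search: A Bank credit card": 0,
    "click: A Bank credit card official website": 1,
}

# Hypothetical trained 3 x 2 first weight matrix (N = 3, M = 2).
first_weight = [
    [0.2, 1.0],
    [-2.0, 1.2],
    [0.1, -0.4],
]

vector = behavior_vector("search: A Bank credit card", vocabulary, first_weight)
```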
In some optional implementations of this embodiment, the first weight matrix is obtained by training as follows: initializing a first weight matrix; multiplying the first weight matrix with word vectors of sample search click behaviors in the sample search click behavior set respectively to obtain initial behavior vectors of the sample search click behaviors in the sample search click behavior set; taking an initial behavior vector subset corresponding to a sample search click behavior subset belonging to one search session in a sample search click behavior set as a training sample; training is performed based on the training samples to update the first weight matrix.
In some optional implementations of this embodiment, initializing the first weight matrix further includes: initializing a second weight matrix; and training based on the training samples to update the first weight matrix includes: taking the initial behavior vectors of the context sample search click behaviors in the training sample as input, and outputting a mapping vector based on a preset mapping method; multiplying the mapping vector by the second weight matrix to obtain a product vector; inputting the product vector into a preset activation function to obtain a predicted word vector of the center sample search click behavior in the training sample; and updating the first weight matrix and the second weight matrix through a cross entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross entropy loss function converges, and determining that the training of the first weight matrix is completed.
Referring now to FIG. 7, a block diagram of a computer system 700 suitable for use in implementing an electronic device (e.g., server 103 shown in FIG. 1) of an embodiment of the present application is shown. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by a Central Processing Unit (CPU)701, performs the above-described functions defined in the method of the present application.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or electronic device. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including a search unit, an obtaining unit, a calculation unit, and a sorting unit. The names of these units do not in themselves limit the units; for example, the search unit may also be described as a "unit that performs a search based on a search statement to obtain a search result of the search statement and a related search statement".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: search based on a search statement to obtain a search result of the search statement and a related search statement; obtain behavior vectors of the search statement, the search result, and the related search statement, respectively; calculate the similarity between the behavior vector of the search statement and the behavior vector of the search result, and the similarity between the behavior vector of the search statement and the behavior vector of the related search statement; and sort the search results and the related search statements, respectively, based on the similarities to generate a search web page.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A method for generating a web page, comprising:
searching based on the search statement to obtain a search result of the search statement and a related search statement;
respectively obtaining behavior vectors of the search statement, the search result and the related search statement, wherein the behavior vector of the search statement is used for representing the historical search condition of the search statement, the behavior vector of the search result is used for representing the historical click condition of the search result, and the behavior vector of the related search statement is used for representing the historical search condition of the related search statement;
respectively calculating the similarity between the behavior vector of the search statement and the behavior vector of the search result, and the similarity between the behavior vector of the search statement and the behavior vector of the related search statement;
and sorting the search result and the related search statement, respectively, based on the similarities to generate a search webpage.
2. The method of claim 1, wherein the obtaining behavior vectors of the search statement, the search result, and the related search statement, respectively, comprises:
respectively obtaining word vectors of the search statement, the search result and the related search statement;
and multiplying the word vectors of the search statement, the search result and the related search statement with a pre-trained first weight matrix respectively to obtain the behavior vectors of the search statement, the search result and the related search statement.
3. The method of claim 2, wherein said obtaining word vectors for the search statement, the search result, and the related search statement, respectively, comprises:
and matching the search statement, the search result and the related search statement in a sample search click behavior set respectively to obtain word vectors of sample search click behaviors which are successfully matched as the word vectors of the search statement, the search result and the related search statement, wherein the sample search click behaviors in the sample search click behavior set are pre-coded into corresponding word vectors.
4. The method of claim 3, wherein the first weight matrix is trained by:
initializing the first weight matrix;
multiplying the first weight matrix with word vectors of sample search click behaviors in the sample search click behavior set respectively to obtain initial behavior vectors of the sample search click behaviors in the sample search click behavior set;
taking an initial behavior vector subset corresponding to a sample search click behavior subset belonging to one search session in the sample search click behavior set as a training sample;
and training based on the training samples to update the first weight matrix.
5. The method of claim 4, wherein the initializing the first weight matrix further comprises:
initializing a second weight matrix; and
the training based on the training samples to update the first weight matrix includes:
taking the initial behavior vectors of the context sample search click behaviors in the training sample as input, and outputting a mapping vector based on a preset mapping method;
multiplying the mapping vector by the second weight matrix to obtain a product vector;
inputting the product vector into a preset activation function to obtain a predicted word vector of a central sample search click behavior in the training sample;
updating the first weight matrix and the second weight matrix through a cross entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross entropy loss function converges, and determining that the training of the first weight matrix is completed.
6. An apparatus for generating a web page, comprising:
a search unit configured to search based on a search statement to obtain a search result of the search statement and a related search statement;
an obtaining unit configured to obtain behavior vectors of the search statement, the search result, and the related search statement, respectively, where the behavior vector of the search statement is used to represent a historical search situation of the search statement, the behavior vector of the search result is used to represent a historical click situation of the search result, and the behavior vector of the related search statement is used to represent a historical search situation of the related search statement;
a calculation unit configured to calculate the similarity between the behavior vector of the search statement and the behavior vector of the search result, and the similarity between the behavior vector of the search statement and the behavior vector of the related search statement;
and a sorting unit configured to sort the search result and the related search statement, respectively, based on the similarities to generate a search webpage.
7. The apparatus of claim 6, wherein the obtaining unit comprises:
an obtaining subunit configured to obtain word vectors of the search statement, the search result, and the related search statement, respectively;
and a multiplying subunit configured to multiply the word vectors of the search statement, the search result and the related search statement with a pre-trained first weight matrix respectively to obtain the behavior vectors of the search statement, the search result and the related search statement.
8. The apparatus of claim 7, wherein the obtaining subunit is further configured to:
and matching the search statement, the search result and the related search statement in a sample search click behavior set respectively to obtain word vectors of sample search click behaviors which are successfully matched as the word vectors of the search statement, the search result and the related search statement, wherein the sample search click behaviors in the sample search click behavior set are pre-coded into corresponding word vectors.
9. The apparatus of claim 8, wherein the first weight matrix is trained by:
initializing the first weight matrix;
multiplying the first weight matrix with word vectors of sample search click behaviors in the sample search click behavior set respectively to obtain initial behavior vectors of the sample search click behaviors in the sample search click behavior set;
taking an initial behavior vector subset corresponding to a sample search click behavior subset belonging to one search session in the sample search click behavior set as a training sample;
and training based on the training samples to update the first weight matrix.
10. The apparatus of claim 9, wherein the initializing the first weight matrix further comprises:
initializing a second weight matrix; and
the training based on the training samples to update the first weight matrix includes:
taking the initial behavior vectors of the context sample search click behaviors in the training sample as input, and outputting a mapping vector based on a preset mapping method;
multiplying the mapping vector by the second weight matrix to obtain a product vector;
inputting the product vector into a preset activation function to obtain a predicted word vector of a central sample search click behavior in the training sample;
updating the first weight matrix and the second weight matrix through a cross entropy loss function, based on the word vector and the predicted word vector of the center sample search click behavior in the training sample, until the cross entropy loss function converges, and determining that the training of the first weight matrix is completed.
11. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201910533278.8A 2019-06-19 2019-06-19 Method and device for generating webpage Active CN110222271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910533278.8A CN110222271B (en) 2019-06-19 2019-06-19 Method and device for generating webpage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910533278.8A CN110222271B (en) 2019-06-19 2019-06-19 Method and device for generating webpage

Publications (2)

Publication Number Publication Date
CN110222271A CN110222271A (en) 2019-09-10
CN110222271B true CN110222271B (en) 2022-03-15

Family

ID=67814022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910533278.8A Active CN110222271B (en) 2019-06-19 2019-06-19 Method and device for generating webpage

Country Status (1)

Country Link
CN (1) CN110222271B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6862586B1 (en) * 2000-02-11 2005-03-01 International Business Machines Corporation Searching databases that identifying group documents forming high-dimensional torus geometric k-means clustering, ranking, summarizing based on vector triplets
CN100550018C (en) * 2007-02-16 2009-10-14 中国电信股份有限公司 Number know-all search system and method based on structured small text
CN103714084B (en) * 2012-10-08 2018-04-03 腾讯科技(深圳)有限公司 The method and apparatus of recommendation information
CN103294814A (en) * 2013-06-07 2013-09-11 百度在线网络技术(北京)有限公司 Search result recommendation method, system and search engine
CN104598583B (en) * 2015-01-14 2018-01-09 百度在线网络技术(北京)有限公司 The generation method and device of query statement recommendation list
CN107368525B (en) * 2017-06-07 2020-03-03 广州视源电子科技股份有限公司 Method and device for searching related words, storage medium and terminal equipment
CN107491547B (en) * 2017-08-28 2020-11-10 北京百度网讯科技有限公司 Search method and device based on artificial intelligence

Also Published As

Publication number Publication date
CN110222271A (en) 2019-09-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant