CN115687597A - Search box pull-down phrase recommendation method and device, electronic equipment and storage medium

Info

Publication number: CN115687597A
Application number: CN202210954616.7A
Authority: CN
Inventors: 田歌; 任瑜平; 石忠德; 叶齐娇
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2022-08-10
Filing date: 2022-08-10
Publication date: 2023-02-03

Abstract

The disclosure provides a search box pull-down phrase recommendation method and device, electronic equipment and a storage medium, relates to the technical field of artificial intelligence, and can also be used in the technical field of finance. The search box pull-down phrase recommendation method comprises the following steps: acquiring a query term text, wherein the query term text is associated with query words input by a target user in a search box; inputting the query term text into a left tower side network of a double-tower recommendation model, and respectively outputting a semantic vector and a query vector of the query term text through different layers of the left tower side network; determining a first class of recommendation phrases matched with the semantics of the query words by utilizing the semantic vector and a pre-constructed semantic index list; determining a second class of recommendation phrases matched with the prefix of the query words by using the query vector and a pre-constructed matching index list; and recommending the first class of recommendation phrases and the second class of recommendation phrases to the target user, wherein the first class of recommendation phrases and the second class of recommendation phrases are used for generating a business requirement book.

Description

Search box pull-down phrase recommendation method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a search box pull-down phrase recommendation method, apparatus, device, medium, and program product.

Background

When writing a business requirement book, a user generally relies on a search function for assistance. Associating and recommending search terms while the user is typing helps the user quickly locate the keywords actually being searched for, and is a key step by which the system assists the user's search. In the process of implementing the present disclosure, it is found that existing recommendation methods for pull-down candidate words cannot recommend sufficiently accurate candidate phrases to the user, resulting in a poor user experience.

Disclosure of Invention

In view of the foregoing, the present disclosure provides a search box drop-down phrase recommendation method, apparatus, device, medium, and program product.

One aspect of the present disclosure provides a search box drop-down phrase recommendation method, including:

acquiring a query term text, wherein the query term text is associated with query words input by a target user in a search box;

inputting the query term text into a left tower side network of a double-tower recommendation model, and respectively outputting a semantic vector and a query vector of the query term text through different layers of the left tower side network;

determining a first class of recommendation phrases matched with the semantics of the query words by utilizing the semantic vector and a pre-constructed semantic index list;

determining a second class of recommendation phrases matched with the prefix of the query words by using the query vector and a pre-constructed matching index list, wherein the semantic index list and the matching index list are respectively constructed by using different layers of a right tower side network of the double-tower recommendation model according to a plurality of historical search records of historical users in a preset historical time period;

recommending the first class of recommendation phrases and the second class of recommendation phrases to the target user, wherein the first class of recommendation phrases and the second class of recommendation phrases are used for generating the business requirement book.

According to an embodiment of the present disclosure, the left tower side network includes a first left side network layer and a second left side network layer connected in sequence, and outputting the semantic vector and the query vector of the query term text through different layers of the left tower side network respectively includes:

inputting the query term text into the first left network layer to output a semantic vector of the query term text through the first left network layer;

and inputting the semantic vector of the query term text into the second left network layer so as to output the query vector of the query term text through the second left network layer.
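
The two-layer arrangement described above can be sketched as follows. This is only an illustrative sketch: the disclosure does not specify the layer types, so the character-bucket encoder, the single linear projection, the fixed `WEIGHTS` matrix and all function names here are hypothetical stand-ins, not the trained network.

```python
import math
from typing import List, Tuple

Vector = List[float]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def first_left_layer(text: str, dim: int) -> Vector:
    """Toy stand-in for the first left network layer: character-bucket counts."""
    v = [0.0] * dim
    for ch in text:
        v[ord(ch) % dim] += 1.0
    return l2_normalize(v)


def second_left_layer(semantic_vector: Vector, weights: List[Vector]) -> Vector:
    """Toy stand-in for the second left network layer: one linear projection."""
    projected = [sum(w * x for w, x in zip(row, semantic_vector)) for row in weights]
    return l2_normalize(projected)


def left_tower(text: str, weights: List[Vector]) -> Tuple[Vector, Vector]:
    """Return (semantic vector, query vector) taken from two successive layers."""
    semantic_vector = first_left_layer(text, dim=len(weights[0]))
    query_vector = second_left_layer(semantic_vector, weights)
    return semantic_vector, query_vector


# Hypothetical fixed weights: 4 output dimensions, 8 input dimensions.
WEIGHTS = [[1.0 if (i + j) % 2 == 0 else 0.5 for j in range(8)] for i in range(4)]
semantic_vector, query_vector = left_tower("system failure", WEIGHTS)
```

The point of the sketch is the data flow: the semantic vector is read from the first layer, and the query vector is computed from that semantic vector by the second layer, so one forward pass yields both representations.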

According to the embodiment of the disclosure, the right tower side network comprises a first right side network layer and a second right side network layer which are connected in sequence, and the semantic index list is constructed by the following method:

acquiring a plurality of historical search records of historical users in a preset historical time period, wherein the plurality of historical search records comprise a plurality of historical candidate phrases used for generating a historical business requirement book;

inputting a plurality of historical candidate phrases into a first right network layer to output a plurality of first candidate vectors associated with the plurality of historical candidate phrases through the first right network layer;

and combining the plurality of first candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of semantic index pairs, wherein the plurality of semantic index pairs form a semantic index list.

According to the embodiment of the disclosure, the matching index list is constructed by the following method:

inputting the plurality of first candidate vectors into a second right network layer to output a plurality of second candidate vectors associated with the plurality of historical candidate phrases through the second right network layer;

and combining the plurality of second candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of matching index pairs, wherein the plurality of matching index pairs form a matching index list.
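
The construction of the two index lists can be sketched as below, assuming each index pair is simply a `(vector, phrase)` tuple. The `toy_first_layer` and `toy_second_layer` functions are hypothetical placeholders for the first and second right network layers; only the pairing logic follows the text above.

```python
from typing import Callable, List, Tuple

Vector = List[float]
IndexPair = Tuple[Vector, str]


def build_index_lists(
    phrases: List[str],
    first_right_layer: Callable[[str], Vector],
    second_right_layer: Callable[[Vector], Vector],
) -> Tuple[List[IndexPair], List[IndexPair]]:
    """Pair each historical candidate phrase with its first and second candidate vectors."""
    semantic_index: List[IndexPair] = []
    matching_index: List[IndexPair] = []
    for phrase in phrases:
        first_vec = first_right_layer(phrase)       # first candidate vector
        second_vec = second_right_layer(first_vec)  # second candidate vector
        semantic_index.append((first_vec, phrase))
        matching_index.append((second_vec, phrase))
    return semantic_index, matching_index


# Toy layers, for illustration only (not the trained right tower).
def toy_first_layer(phrase: str) -> Vector:
    return [float(len(phrase)), float(phrase.count(" "))]


def toy_second_layer(vec: Vector) -> Vector:
    return [vec[0] + vec[1]]


history = ["system failure recovery time requirement", "system login requirement"]
semantic_index, matching_index = build_index_lists(
    history, toy_first_layer, toy_second_layer
)
```

Because both lists are built offline from the same historical phrases, they always have the same length and the same phrase order; only the vectors differ.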

According to an embodiment of the present disclosure, determining the second class of recommendation phrases matched with the prefix of the query words by using the query vector and the pre-constructed matching index list comprises:

determining a target matching index pair from a plurality of matching index pairs in the matching index list, wherein a target second candidate vector in the target matching index pair and the query vector meet a preset similarity condition;

and extracting the target historical candidate phrase in the target matching index pair as a second class of recommendation phrase.

According to an embodiment of the present disclosure, determining a target matching index pair from a plurality of matching index pairs in the matching index list comprises:

a target matching index pair is determined from a plurality of matching index pairs in the matching index list using a predetermined retrieval tool.
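
A minimal sketch of this lookup is given below. The disclosure names neither the retrieval tool nor the similarity condition, so this sketch assumes a brute-force scan with cosine similarity and a hypothetical threshold `min_similarity`; a production system would more likely use a dedicated vector retrieval library.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
IndexPair = Tuple[Vector, str]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def retrieve(
    query_vector: Vector,
    index_pairs: List[IndexPair],
    min_similarity: float = 0.8,  # assumed form of the preset similarity condition
    top_k: int = 5,
) -> List[str]:
    """Return phrases whose vectors meet the threshold, most similar first."""
    scored = [(cosine(query_vector, vec), phrase) for vec, phrase in index_pairs]
    hits = [(score, phrase) for score, phrase in scored if score >= min_similarity]
    hits.sort(key=lambda item: item[0], reverse=True)
    return [phrase for _, phrase in hits[:top_k]]


matching_index = [([1.0, 0.0], "a"), ([0.0, 1.0], "b"), ([0.9, 0.1], "c")]
recommended = retrieve([1.0, 0.0], matching_index)
```

The same function could be applied to the semantic index list with the semantic vector, since both lists share the `(vector, phrase)` layout assumed here.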

According to the embodiment of the disclosure, the double-tower recommendation model is trained by the following method:

acquiring a query term training text, a positive sample of the query term training text and a negative sample of the query term training text, wherein the query term training text is associated with historical query words input by a historical user in a search box, and the positive sample and the negative sample are associated with a plurality of historical search records of the historical user in a preset historical time period;

inputting the query term training text into a left tower side network of a double-tower recommendation model to be trained, and outputting a first training vector of the query term training text through the left tower side network to be trained;

inputting the positive sample of the query term training text and the negative sample of the query term training text into a right tower side network of the double-tower recommendation model to be trained, and outputting a second training vector of the positive sample and a second training vector of the negative sample through the right tower side network to be trained;

calculating the similarity of the first training vector and the second training vector;

and under the condition that the similarity of the first training vector and the second training vector meets a preset termination condition, obtaining the trained double-tower recommendation model.
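
One way to turn the similarity comparison into a training signal is sketched below. The disclosure does not state the loss function or the termination condition, so the margin-based hinge loss, the `margin` value and the mean-loss stopping rule are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def hinge_loss(query_vec: Vector, pos_vec: Vector, neg_vec: Vector,
               margin: float = 0.2) -> float:
    """Zero once the positive sample beats the negative sample by at least `margin`."""
    return max(0.0, margin - cosine(query_vec, pos_vec) + cosine(query_vec, neg_vec))


def should_stop(batch_losses: List[float], tolerance: float = 1e-3) -> bool:
    """Assumed termination condition: mean loss over the batch is small enough."""
    return sum(batch_losses) / len(batch_losses) <= tolerance


# First training vector (left tower) and second training vectors (right tower).
query_vec, pos_vec, neg_vec = [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]
good = hinge_loss(query_vec, pos_vec, neg_vec)  # positive is closer: no penalty
bad = hinge_loss(query_vec, neg_vec, pos_vec)   # roles swapped: penalised
```

In an actual training loop the loss would be back-propagated through both towers; that step depends on the network framework and is omitted here.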

Another aspect of the present disclosure provides a search box pull-down phrase recommendation apparatus, which includes an obtaining module, an input/output module, a first determining module, a second determining module, and a recommending module.

The obtaining module is used for acquiring a query term text, wherein the query term text is associated with query words input by a target user in a search box;

the input and output module is used for inputting the query term text into a left tower side network of the double-tower recommendation model so as to respectively output the semantic vector and the query vector of the query term text through different layers of the left tower side network;

the first determining module is used for determining a first class of recommendation phrases matched with the semantics of the query words by utilizing the semantic vector and a pre-constructed semantic index list;

the second determining module is used for determining a second class of recommendation phrases matched with the prefix of the query words by using the query vector and a pre-constructed matching index list, wherein the semantic index list and the matching index list are respectively constructed by using different layers of a right tower side network of the double-tower recommendation model according to a plurality of historical search records of historical users in a preset historical time period;

the recommendation module is used for recommending a first class of recommendation phrases and a second class of recommendation phrases to the target user, wherein the first class of recommendation phrases and the second class of recommendation phrases are used for generating the business requirement book.

According to the embodiment of the disclosure, the left tower side network comprises a first left side network layer and a second left side network layer which are connected in sequence, and the input and output module comprises a first input and output unit and a second input and output unit.

The first input and output unit is used for inputting the query term text into the first left network layer so as to output the semantic vector of the query term text through the first left network layer;

and the second input and output unit is used for inputting the semantic vector of the query term text into the second left network layer so as to output the query vector of the query term text through the second left network layer.

According to an embodiment of the present disclosure, the right tower side network comprises a first right side network layer and a second right side network layer connected in sequence.

The device also comprises a first construction module used for constructing the semantic index list, wherein the first construction module comprises a first acquisition unit, a third input/output unit and a first combination unit.

The first acquisition unit is used for acquiring a plurality of historical search records of historical users in a preset historical time period, wherein the plurality of historical search records comprise a plurality of historical candidate phrases used for generating a historical business requirement book;

a third input-output unit for inputting the plurality of historical candidate phrases into the first right network layer to output a plurality of first candidate vectors associated with the plurality of historical candidate phrases through the first right network layer;

and the first combination unit is used for combining the plurality of first candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of semantic index pairs, wherein the plurality of semantic index pairs form a semantic index list.

According to an embodiment of the present disclosure, the apparatus further includes a second building module, configured to build the matching index list, where the second building module includes a fourth input/output unit and a second combining unit.

The fourth input-output unit is used for inputting the plurality of first candidate vectors into the second right network layer so as to output a plurality of second candidate vectors associated with the plurality of historical candidate phrases through the second right network layer;

and the second combination unit is used for combining the plurality of second candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of matching index pairs, wherein the plurality of matching index pairs form a matching index list.

According to the embodiment of the disclosure, the second determination module comprises a determination unit and an extraction unit.

The determining unit is used for determining a target matching index pair from a plurality of matching index pairs in the matching index list, wherein a target second candidate vector in the target matching index pair and the query vector meet a preset similarity condition;

and the extracting unit is used for extracting the target history candidate phrase in the target matching index pair as the second recommendation phrase.

According to an embodiment of the present disclosure, the determining unit includes a determining subunit configured to determine, by using a predetermined retrieval tool, a target matching index pair from a plurality of matching index pairs in the matching index list.

According to the embodiment of the disclosure, the device further comprises a training module for training the double-tower recommendation model, wherein the training module comprises a second obtaining unit, a fifth input/output unit, a sixth input/output unit, a calculating unit and an iteration unit.

The second obtaining unit is used for obtaining a query term training text, a positive sample of the query term training text and a negative sample of the query term training text, wherein the query term training text is associated with historical query words input by a historical user in a search box, and the positive sample and the negative sample are associated with a plurality of historical search records of the historical user in a preset historical time period;

the fifth input and output unit is used for inputting the query term training text into a left tower side network of the double-tower recommendation model to be trained so as to output a first training vector of the query term training text through the left tower side network to be trained;

the sixth input and output unit is used for inputting the positive sample of the query term training text and the negative sample of the query term training text into a right tower side network of the double-tower recommendation model to be trained so as to output a second training vector of the positive sample and a second training vector of the negative sample through the right tower side network to be trained;

the calculating unit is used for calculating the similarity of the first training vector and the second training vector;

and the iteration unit is used for obtaining the trained double-tower recommendation model under the condition that the similarity of the first training vector and the second training vector meets the preset termination condition.

Another aspect of the present disclosure provides an electronic device including: one or more processors; a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the search box drop-down phrase recommendation method described above.

Another aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above search box drop-down phrase recommendation method.

Another aspect of the present disclosure also provides a computer program product comprising a computer program that when executed by a processor implements the search box drop-down phrase recommendation method described above.

Drawings

The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, taken in conjunction with the accompanying drawings of which:

FIG. 1 schematically illustrates an application scenario diagram of a search box drop-down phrase recommendation method, apparatus, device, medium, and program product according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a flow diagram of a search box drop-down phrase recommendation method according to an embodiment of the disclosure;

FIG. 3 schematically illustrates a flow diagram of a search box drop-down phrase recommendation method according to another embodiment of the present disclosure;

FIG. 4 schematically illustrates a flow chart of a method of training a two-tower recommendation model in accordance with an embodiment of the present disclosure;

FIG. 5 is a block diagram schematically illustrating the structure of a search box drop-down phrase recommendation device according to an embodiment of the present disclosure; and

FIG. 6 schematically illustrates a block diagram of an electronic device suitable for implementing a search box drop-down phrase recommendation method in accordance with an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that these descriptions are illustrative only and are not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

In those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).

With the rapid development of internet technology, people increasingly edit documents online rather than in the traditional way, and the quantity of unstructured text data has grown explosively. Business requirement books are the best way to explain a business, and many requirement book writing systems have come into use. The business requirement book is an important tool for communication between requirement analysts and developers, and is a basis for later delivery descriptions and testing. Providing similar requirement books to requirement personnel as a reference relieves their workload.

In the process of writing the business requirement book, a user generally relies on a search function for assistance. Selecting the most relevant requirement books from a large amount of data therefore provides significant guidance to users. Associating and recommending search terms while the user is typing helps the user quickly locate the keywords actually being searched for; this is the most critical step by which the system assists the user's search, and the most important one for user experience.

In the process of implementing the present disclosure, it is found that the existing recommendation method for pull-down candidate words cannot recommend more accurate candidate phrases to the user, and the user experience is poor.

For example, most pull-down candidate word recommendations in the related art are based on prefix matching: when "apple" is searched, every candidate word in the pull-down list starts with "apple". However, when searching for a requirement book, requirement personnel mainly focus on its content. Because of the limits of the requirement book's domain, the keywords entered during a search are limited, and requirement books with similar content but different keywords cannot be found at all, which runs against the user's intent. Therefore, to meet user needs, both semantic matching and prefix matching need to be considered when recommending candidate words to requirement personnel.

In view of this, embodiments of the present disclosure provide a search box drop-down phrase recommendation method, apparatus, device, medium, and program product.

FIG. 1 schematically illustrates an application scenario diagram of a search box drop-down phrase recommendation method, apparatus, device, medium, and program product according to an embodiment of the disclosure.

As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.

In the application scenario of the embodiment of the disclosure, the user relies on the search function to assist writing, and search terms are associated and recommended while the user is typing, helping the user quickly locate the keywords actually being searched for. A user can input query words by using the terminal devices 101, 102, and 103 to initiate a request to the server 105 for recommended pull-down phrases. The server 105 can be configured to execute the search box pull-down phrase recommendation method according to the embodiment of the present disclosure: input a query term text associated with the query words input by the user in the search box into a left tower side network of a double-tower recommendation model, and output a semantic vector and a query vector of the query term text through different layers of the left tower side network; determine a first class of recommendation phrases matched with the semantics of the query words by utilizing the semantic vector and a pre-constructed semantic index list; and determine a second class of recommendation phrases matched with the prefix of the query words by using the query vector and a pre-constructed matching index list. The two classes of recommendation phrases are then displayed to the user as items of the pull-down list of the search box through the terminal devices 101, 102 and 103.

It should be noted that the search box drop-down phrase recommendation method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the search box drop-down phrase recommendation apparatus provided by the embodiments of the present disclosure may be generally disposed in the server 105. The search box drop-down phrase recommendation method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the search box pull-down phrase recommendation apparatus provided in the embodiments of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.

It should be noted that the search box pull-down phrase recommendation method and apparatus in the embodiments of the present disclosure may be used in the technical field of artificial intelligence, the technical field of finance, or any field other than the technical field of artificial intelligence and the technical field of finance, and the embodiments of the present disclosure do not limit the application fields of the search box pull-down phrase recommendation method and apparatus.

In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure, application and other handling of the personal information of the users involved all comply with the relevant laws and regulations, necessary confidentiality measures are taken, and public order and good morals are not violated.

In the technical scheme of the disclosure, before the personal information of the user is acquired or collected, the authorization or the consent of the user is acquired.

The search box drop-down phrase recommendation method of the disclosed embodiment will be described in detail below with reference to figs. 2 to 6, based on the scenario described in fig. 1.

FIG. 2 schematically shows a flow chart of a search box drop-down phrase recommendation method according to an embodiment of the present disclosure.

As shown in fig. 2, the search box drop-down phrase recommendation method of this embodiment includes operations S201 to S205.

In operation S201, a query term text is obtained, where the query term text is associated with query words input by a target user in a search box.

In operation S202, the query term text is input into a left tower side network of the double-tower recommendation model to output a semantic vector and a query vector of the query term text through different layers of the left tower side network, respectively.

In operation S203, a first class of recommendation phrases matched with the semantics of the query words is determined by using the semantic vector and a pre-constructed semantic index list.

In operation S204, a second class of recommendation phrases matched with the prefix of the query words is determined by using the query vector and a pre-constructed matching index list, where the semantic index list and the matching index list are respectively constructed by using different layers of a right tower side network of the double-tower recommendation model according to a plurality of historical search records of historical users in a preset historical time period.

In operation S205, the first class of recommendation phrases and the second class of recommendation phrases are recommended to the target user, where the first class of recommendation phrases and the second class of recommendation phrases are used to generate the business requirement book.

According to the embodiment of the present disclosure, in operation S201, the query term text is associated with the query word input by the target user in the search box, and may be, for example, a text obtained by preprocessing the query word input by the user (e.g., extracting keywords, removing stop words, etc.).
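
The preprocessing mentioned above might look like the following sketch. It assumes whitespace-separated input and a small hypothetical stop-word set; text in a language without word spacing (such as Chinese) would need a word segmenter instead, and the disclosure does not say which keyword-extraction method is used.

```python
from typing import List

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "in"}


def preprocess_query(raw_query: str) -> str:
    """Lower-case the query words, drop stop words, and rejoin the rest."""
    tokens: List[str] = raw_query.lower().split()
    kept = [token for token in tokens if token not in STOP_WORDS]
    return " ".join(kept)


query_term_text = preprocess_query("  The recovery time of the System ")
```

The resulting `query_term_text` is what would be passed to the left tower side network in operation S202.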

According to an embodiment of the present disclosure, the left tower side network of the double-tower recommendation model takes a query term text as input, and the right tower side network takes a search record associated with the query term text as input. For example, a complete search record may be "system failure recovery time requirement", and a query term text associated with it may be some of the characters and words in the search record, such as any one of the following: "system", "system failure recovery time", etc. After the query term text is encoded and reduced in dimensionality by the left tower side network, a vector associated with the query term text is output at the top of the left tower. After the search record is encoded and reduced in dimensionality by the right tower side network, a vector associated with the search record is output at the top of the right tower.
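
As an illustration of how query term texts relate to a complete search record, the sketch below derives word-level prefixes from a record. Taking every proper word prefix is an assumption made here; the disclosure only says the query term text may be some of the characters and words in the record.

```python
from typing import List


def word_prefixes(search_record: str) -> List[str]:
    """Return every proper word-level prefix of a search record, shortest first."""
    words = search_record.split()
    return [" ".join(words[:n]) for n in range(1, len(words))]


record = "system failure recovery time requirement"
query_term_texts = word_prefixes(record)
# Each (query term text, record) pair could serve as a left-tower / right-tower input pair.
pairs = [(query, record) for query in query_term_texts]
```

Both examples from the text, "system" and "system failure recovery time", appear among the prefixes generated for this record.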

According to the embodiment of the disclosure, input phrases, words, sentences and the like are encoded into vectors by using a trained double-tower recommendation model. The double-tower recommendation model includes a left tower side network and a right tower side network, each of which includes at least two network layers. For example, the left tower side network may include a first left side network layer and a second left side network layer, and the right tower side network may include a first right side network layer and a second right side network layer. Different layers of the left tower side network and the right tower side network yield different vector representations.

For example, in operation S202, the query term text is input into the left tower side network of the double-tower recommendation model, and the semantic vector and the query vector of the query term text are output through different layers of the left tower side network. In operations S203 and S204, the semantic vector is used to retrieve, from the semantic index list, words or sentences that are semantically similar to the query term text, and the query vector is used to retrieve, from the matching index list, words or sentences that match the prefix of the query term text. The matching rule is determined by the network structure, the loss function, and the training sample data.

According to the embodiment of the disclosure, a semantic index list and a matching index list can be constructed according to a plurality of historical search records of historical users in a preset historical time period by respectively utilizing different layers of a right tower side network of a double-tower recommendation model. The semantic index list and the matching index list can be respectively constructed by using vectors output by different layers of a right tower side network of the double-tower recommendation model.

According to the embodiment of the disclosure, the semantic vector and the query vector output by the left tower side network of the double-tower recommendation model are used to search the semantic index list and the matching index list, which are built from the output vectors of different layers of the right tower side network. The semantic vector retrieves semantically matching phrases from the semantic index list, and the query vector retrieves phrases with better-matching prefixes from the matching index list. With this method, when requirements personnel search for a requirement book, the document system continuously captures the current search input and sends requests to the recommendation system, so that pull-down phrases are recommended dynamically and in real time during the search. Because the recommended phrases in the pull-down list take both prefix matching and semantic matching into account, recommendation accuracy is higher and requirement books with similar content but different keywords are easier to find. This addresses the technical problem in the related art that pull-down list search recommendations are unsatisfactory, meets the user's search needs, improves the user experience, and helps business personnel write business requirement books more quickly and efficiently.

According to an embodiment of the present disclosure, the left tower side network of the dual tower recommendation model may specifically include a first left side network layer and a second left side network layer connected in sequence. Further, the outputting the semantic vector and the query vector of the query term text through different layers of the left tower side network respectively comprises:

First, the query term text is input into the first left network layer, which outputs the semantic vector of the query term text. The first left network layer may be, for example, a BERT model, which encodes the query term text to extract the semantic features of its words. BERT is widely used in natural language processing, and the parameters of the BERT model may be initialized from a pre-trained open-source BERT model.

Then, the semantic vector of the query term text is input into the second left network layer, which outputs the query vector of the query term text. The second left network layer reduces the dimensionality of the semantic vector output by the BERT model.
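The two-layer structure of the left tower can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the character-bucket encoder merely stands in for the BERT model, the random linear projection stands in for a trained dimension-reduction layer, and the vector sizes 8 and 4 are arbitrary.

```python
import random

SEMANTIC_DIM = 8  # assumed size of the first-layer (semantic) vector
QUERY_DIM = 4     # assumed size of the reduced (query) vector


def first_layer(text):
    """Toy stand-in for the BERT encoder: bucket character codes into a fixed-size vector."""
    vec = [0.0] * SEMANTIC_DIM
    for ch in text:
        vec[ord(ch) % SEMANTIC_DIM] += 1.0
    return vec


def make_projection(seed=0):
    """Random QUERY_DIM x SEMANTIC_DIM matrix standing in for trained weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(SEMANTIC_DIM)] for _ in range(QUERY_DIM)]


def second_layer(semantic_vec, weights):
    """Dimension reduction: a linear map from SEMANTIC_DIM to QUERY_DIM."""
    return [sum(w * x for w, x in zip(row, semantic_vec)) for row in weights]


def left_tower(text, weights):
    """Return both intermediate outputs: the semantic vector and the query vector."""
    semantic_vec = first_layer(text)
    query_vec = second_layer(semantic_vec, weights)
    return semantic_vec, query_vec


weights = make_projection()
semantic_vec, query_vec = left_tower("system failure", weights)
print(len(semantic_vec), len(query_vec))  # 8 4
```

The point of the sketch is only that one forward pass yields two usable vectors, one taken after the first layer and one after the second.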

According to an embodiment of the present disclosure, training the BERT model allows the semantic vector to be used to retrieve, from the semantic index list, words or sentences that are semantically similar to the query term text. Training the second left network layer allows the query vector to be used to retrieve, from the matching index list, words or sentences that match the prefix of the query term text. The matching rule is determined by the network structure, the loss function, and the training sample data.

According to the embodiment of the disclosure, the semantic index list and the matching index list are constructed from the output vectors of different layers of the right tower side network. The semantic index list comprises a plurality of semantic index pairs, each obtained by pairing an output vector of the first layer of the right tower side network with the corresponding historical candidate phrase. The matching index list comprises a plurality of matching index pairs, each obtained by pairing an output vector of the second layer of the right tower side network with the corresponding historical candidate phrase.

Determining the first type of recommended phrase matching the semantics of the query words by using the semantic vector and the pre-constructed semantic index list may include: determining a target semantic index pair from the plurality of semantic index pairs in the semantic index list, wherein the target first candidate vector in the target semantic index pair and the semantic vector meet a preset similarity condition (for example, taking the top n index pairs ranked by similarity); and extracting the target historical candidate phrase in the target semantic index pair as the first type of recommended phrase.

Determining the second type of recommended phrase matching the prefix of the query words by using the query vector and the pre-constructed matching index list may include: determining a target matching index pair from the plurality of matching index pairs in the matching index list, wherein the target second candidate vector in the target matching index pair and the query vector meet a preset similarity condition (for example, taking the top n index pairs ranked by similarity); and extracting the target historical candidate phrase in the target matching index pair as the second type of recommended phrase.

The target matching index pair may be determined from the plurality of matching index pairs in the matching index list by using a predetermined retrieval tool (e.g., the FAISS tool). The target semantic index pair may likewise be determined from the plurality of semantic index pairs in the semantic index list by using a predetermined retrieval tool.
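The "top n index pairs ranked by similarity" condition can be sketched with a brute-force search. This is not FAISS: it is a plain-Python stand-in that scans every pair, and the use of cosine similarity is an assumption, since the disclosure does not name the similarity measure. The vectors and phrases are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(index_pairs, probe, n):
    """Return the phrases of the n index pairs most similar to the probe vector."""
    ranked = sorted(index_pairs, key=lambda pair: cosine(pair[0], probe), reverse=True)
    return [phrase for _, phrase in ranked[:n]]


# Each index pair is (candidate vector, historical candidate phrase).
index_pairs = [
    ([1.0, 0.0], "system failure recovery time requirement"),
    ([0.0, 1.0], "data security level and access control"),
    ([0.9, 0.1], "system batch processing time requirement"),
]
print(top_n(index_pairs, [1.0, 0.0], 2))
```

The same function serves both lists: called with the semantic vector on the semantic index list, or with the query vector on the matching index list.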

According to the embodiment of the disclosure, phrases similar to the semantic vector and to the query vector are retrieved from the semantic index list and the matching index list, respectively, and used as the recommended phrases. In this way, the phrases that best match the semantics and the prefix are selected from the many candidate phrases, which further improves recommendation efficiency and accuracy. Using a retrieval tool to search the index lists also improves retrieval efficiency and retrieval accuracy.

According to the embodiment of the disclosure, the semantic index list and the matching index list can be constructed by using output vectors of different layers of a right tower side network, the right tower side network comprises a first right side network layer and a second right side network layer which are sequentially connected, and the semantic index list and the matching index list are respectively constructed by the following methods.

The semantic index list construction method comprises the following steps:

First, a plurality of historical search records of historical users in a preset historical time period are acquired, the plurality of historical search records comprising a plurality of historical candidate phrases used for generating historical business requirement books. The plurality of historical candidate phrases are then input into the first right network layer, which outputs a plurality of first candidate vectors associated with them. The first right network layer may be, for example, a BERT model, which is used to encode the historical search records. Finally, each first candidate vector is paired with its historical candidate phrase to form a semantic index pair, and the plurality of semantic index pairs form the semantic index list.

The matching index list construction method comprises the following steps:

The plurality of first candidate vectors are input into the second right network layer, which outputs a plurality of second candidate vectors associated with the plurality of historical candidate phrases. The second right network layer reduces the dimensionality of the first candidate vectors output by the BERT model to obtain the second candidate vectors. Each second candidate vector is then paired with its historical candidate phrase to form a matching index pair, and the plurality of matching index pairs form the matching index list.
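The two construction methods can be sketched together, since the matching index reuses the first candidate vectors. The function name `build_index_lists` and both toy layers are hypothetical; in the disclosure the layers would be the trained right tower layers (for example a BERT model followed by a dimension-reduction layer).

```python
def build_index_lists(phrases, first_layer, second_layer):
    """Pair each phrase with its first-layer and second-layer vectors."""
    semantic_index = []  # list of (first candidate vector, phrase)
    matching_index = []  # list of (second candidate vector, phrase)
    for phrase in phrases:
        first_vec = first_layer(phrase)
        second_vec = second_layer(first_vec)
        semantic_index.append((first_vec, phrase))
        matching_index.append((second_vec, phrase))
    return semantic_index, matching_index


# Toy stand-ins for the trained right-tower layers (assumptions for illustration).
def toy_first_layer(phrase):
    return [float(len(phrase)), float(len(phrase.split())), float(ord(phrase[0]))]


def toy_second_layer(vec):
    return vec[:2]  # crude "dimension reduction": keep the first two components


history = ["system failure recovery time requirement", "system security requirements"]
semantic_index, matching_index = build_index_lists(history, toy_first_layer, toy_second_layer)
print(len(semantic_index[0][0]), len(matching_index[0][0]))  # 3 2
```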

FIG. 3 schematically shows a flow diagram of a search box drop-down phrase recommendation method according to another embodiment of the present disclosure. The method of the embodiments of the present disclosure is described below with reference to fig. 3.

As shown in fig. 3, the search box pull-down phrase recommendation method of the embodiment of the present disclosure includes an offline process and an online process.

The offline process consists of two operations. First, the double-tower recommendation model is trained. Second, once the model is trained, indexes are built so that pull-down candidate phrases can be returned to the user more efficiently online. The index lists are constructed with the right tower side network of the trained double-tower recommendation model; specifically, the semantic index list and the matching index list are constructed from the output vectors of different layers of the right tower side network. Each sentence or phrase in an index list corresponds to one vector, and during an online query the output vector of the query term is used to retrieve the several candidate phrases closest to it in the index. For the specific index construction method, reference may be made to the description in the foregoing embodiments, and details are not repeated here. To make online retrieval efficient, the indexes may be built with the FAISS tool. With an index built by FAISS, query results are returned in order of similarity to the query vector, with the most similar entries first.

The online recommendation process may include the following three operations.

First, the requirement document system acquires the search words entered by the user in the search bar for requirement books and preprocesses them to obtain the query term text.

Second, the query term text is input into the left tower side network of the double-tower recommendation model; the semantic vector of the query term text is output through the first left side network layer, and the query vector of the query term text is output through the second left side network layer.

Third, the FAISS tool is used to search the pre-constructed semantic index list with the semantic vector to determine the first type of recommended phrases, which match the semantics of the query words, and to search the pre-constructed matching index list with the query vector to determine the second type of recommended phrases, which match the prefix of the query words. Together these form the recommendation candidate set; the number of results returned by each search can be set according to actual requirements. Because the returned results are ordered with higher similarity first, several phrases related to the words entered by the user are retrieved, which completes the recommendation of pull-down candidate phrases.
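How the two result lists are combined into one recommendation candidate set is not detailed in the disclosure. One simple possibility, shown here purely as an assumption, is to concatenate them in rank order, drop duplicates, and truncate to the pull-down size. The function name `merge_recommendations` and the `limit` parameter are hypothetical.

```python
def merge_recommendations(semantic_hits, prefix_hits, limit):
    """Combine both result lists, preserving rank order and dropping duplicates."""
    merged = []
    seen = set()
    for phrase in semantic_hits + prefix_hits:
        if phrase not in seen:
            seen.add(phrase)
            merged.append(phrase)
    return merged[:limit]


semantic_hits = ["system failure recovery time requirement", "system security requirements"]
prefix_hits = ["system security requirements", "system batch processing time requirements"]
print(merge_recommendations(semantic_hits, prefix_hits, 3))
```

Other policies, such as interleaving the two lists or weighting them, would fit the description equally well.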

FIG. 4 schematically illustrates a flow chart of a method of training a two-tower recommendation model according to an embodiment of the disclosure.

As shown in fig. 4, the method for training the double-tower recommendation model of this embodiment includes operations S401 to S405.

In operation S401, a query term training text, a positive sample of the query term training text, and a negative sample of the query term training text are obtained, where the query term training text is associated with historical query words input by a historical user in a search box, and the positive sample and the negative sample are associated with a plurality of historical search records of the historical user in a preset historical time period.

The training text can be obtained by analyzing a system search log, history search records of all requirement books in the search log are used as a candidate set, and each history search record is used as a phrase.

The query term training texts are constructed as follows. For each historical search record, stop words are removed and the record is segmented into words. If segmentation yields k words, then k-1 query term training texts are obtained by taking the first 1 word, the first 2 words, and so on up to the first k-1 words. For example, if one of the historical search records is "online system response time requirements", its query term training texts are its successive prefixes, such as "online", "online system", and "online system response time". The positive sample for each of these k-1 query term training texts is the search record itself: "online system response time requirements".
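The prefix construction can be sketched in a few lines. The input is assumed to be already stripped of stop words and segmented; the five-word English segmentation below is illustrative only and is not taken from the disclosure.

```python
def build_query_training_texts(words):
    """Given the k words of a segmented record, return its k-1 proper prefixes."""
    return [" ".join(words[:i]) for i in range(1, len(words))]


words = ["online", "system", "response", "time", "requirements"]
for text in build_query_training_texts(words):
    print(text)
```

With k = 5 words this yields 4 query term training texts, each of which shares the full record as its positive sample.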

According to the embodiment of the disclosure, a negative sample also needs to be constructed for each query term training text. For each query term training text, a phrase whose prefix does not coincide with the first word of the query term training text is randomly selected from the candidate set and used as the negative sample. For example, for the query term "online", the negative samples randomly selected from the candidate set may be "traffic volume increase", "data security level and access control", "system batch processing time requirements", "system security requirements", and so on.

Thus, k words are obtained after word segmentation is carried out on one search record, and k-1 positive samples and k-1 negative samples can be derived for the search record, so that 2k-2 training samples are obtained. And selecting a proper number of historical search records from the candidate set, thereby constructing a training sample set.
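The sample count can be checked with a short sketch that builds both kinds of samples for one record. The negative-sampling rule is implemented under one reading of the text, namely that a candidate is eligible if it does not start with the first word of the query term; the function names and the candidate set are hypothetical.

```python
import random


def sample_negative(query_text, candidates, rng):
    """Pick a random candidate that does not start with the query's first word."""
    first_word = query_text.split()[0]
    pool = [c for c in candidates if not c.startswith(first_word)]
    return rng.choice(pool)  # assumes at least one eligible candidate exists


def build_samples(words, candidates, rng):
    """Return (query, phrase, label) triples: k-1 positives and k-1 negatives."""
    record = " ".join(words)
    samples = []
    for i in range(1, len(words)):
        query = " ".join(words[:i])
        samples.append((query, record, 1))
        samples.append((query, sample_negative(query, candidates, rng), 0))
    return samples


candidates = [
    "online system response time requirements",
    "data security level and access control",
    "system batch processing time requirements",
    "system security requirements",
]
words = ["online", "system", "response", "time", "requirements"]
samples = build_samples(words, candidates, random.Random(0))
print(len(samples))  # 2k-2 = 8 for k = 5
```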

And after the training sample construction is completed, performing model training. The double-tower recommendation model is a double-tower structure, the bottom input of the left tower is a preprocessed query item, and the bottom text input of the right tower is a complete search record which is a positive sample or a negative sample of the query item. Specifically, the method comprises the following steps:

in operation S402, the query term training text is input into a left tower side network of the dual-tower recommendation model to be trained, so as to output a first training vector of the query term text through the left tower side network to be trained.

In operation S403, a positive sample of the query term training text and a negative sample of the query term training text are input into a right tower side network of the two-tower recommendation model to be trained, so as to output a second training vector of the positive sample and the negative sample through the right tower side network to be trained.

In operation S404, calculating a similarity of the first training vector and the second training vector; positive samples represent a match with a similarity of 1, negative samples represent a mismatch with a similarity of 0.

In operation S405, under the condition that the similarity between the first training vector and the second training vector meets a preset termination condition, a trained dual-tower recommendation model is obtained.
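Operations S404 and S405 can be illustrated with a toy loss computation. The disclosure states only the similarity targets (1 for positive samples, 0 for negative samples) and a preset termination condition; the cosine similarity, the squared-error loss, and the `TOLERANCE` threshold below are assumptions chosen for illustration, and the vectors are made up.

```python
import math

TOLERANCE = 1e-3  # hypothetical termination threshold


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def batch_loss(samples):
    """Mean squared gap between the computed similarity and the 1/0 target."""
    errors = [(cosine(first, second) - label) ** 2 for first, second, label in samples]
    return sum(errors) / len(errors)


# Each sample: (first training vector, second training vector, target similarity).
samples = [
    ([1.0, 0.0], [1.0, 0.0], 1),  # positive sample: target similarity 1
    ([1.0, 0.0], [0.0, 1.0], 0),  # negative sample: target similarity 0
]
loss = batch_loss(samples)
training_done = loss < TOLERANCE
print(loss, training_done)  # 0.0 True
```

In actual training the loss would be back-propagated through both towers and the check repeated until the termination condition is met.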

According to the embodiment of the disclosure, the double-tower recommendation model is trained with positive samples and negative samples, the index lists are constructed with the right tower of the trained model, and retrieval in the indexes is performed with the vectors output by the left tower. The semantic vector can be used to retrieve, from the semantic index list, words or sentences that semantically match the query term text, and the query vector can be used to retrieve, from the matching index list, words or sentences that match the prefix of the query term text. When the model is used to recommend pull-down phrases, the recommended phrases in the pull-down list take both prefix matching and semantic matching into account, so recommendation accuracy is higher and requirement books with similar content but different keywords are easier to find. This addresses the technical problem in the related art that pull-down list search recommendations are unsatisfactory, meets the user's search needs, and improves the user experience.

Based on the search box pull-down phrase recommendation method, the disclosure also provides a search box pull-down phrase recommendation device. Fig. 5 schematically shows a block diagram of a structure of a search box drop-down phrase recommendation apparatus according to an embodiment of the present disclosure.

As shown in fig. 5, the search box pull-down phrase recommendation apparatus 500 of this embodiment includes an obtaining module 501, an input/output module 502, a first determining module 503, a second determining module 504, and a recommending module 505.

The obtaining module 501 is configured to obtain a query term text, where the query term text is associated with a query word input by a target user in a search box;

an input/output module 502, configured to input the query term text into a left tower side network of the double-tower recommendation model, so as to output a semantic vector and a query vector of the query term text through different layers of the left tower side network, respectively;

a first determining module 503, configured to determine, by using the semantic vector and a pre-constructed semantic index list, a first type of recommended phrase that matches the semantics of the query text;

a second determining module 504, configured to determine, by using the query vector and a pre-constructed matching index list, a second recommendation phrase that matches the prefix of the query word, where the semantic index list and the matching index list are constructed by using different layers of a right-tower side network of the two-tower recommendation model, according to multiple historical search records of the historical user in a preset historical time period;

a recommending module 505, configured to recommend a first type of recommendation phrase and a second type of recommendation phrase to the target user, where the first type of recommendation phrase and the second type of recommendation phrase are used to generate the business requirement book.

According to the embodiment of the present disclosure, the input/output module 502 obtains the semantic vector and the query vector output by the left tower side network of the double-tower recommendation model, and the first determining module 503 and the second determining module 504 use them to search the semantic index list and the matching index list, which are built from the output vectors of different layers of the right tower side network. The semantic vector retrieves semantically matching phrases from the semantic index list, and the query vector retrieves phrases with better-matching prefixes from the matching index list. With this device, when requirements personnel search for a requirement book, the document system continuously captures the current search input and sends requests to the recommendation system, so that pull-down phrases are recommended dynamically and in real time during the search. Because the recommended phrases in the pull-down list take both prefix matching and semantic matching into account, recommendation accuracy is higher and requirement books with similar content but different keywords are easier to find. This addresses the technical problem in the related art that pull-down list search recommendations are unsatisfactory, meets the user's search needs, improves the user experience, and helps business personnel write business requirement books more quickly and efficiently.

and the extraction unit is used for extracting the target history candidate phrase in the target matching index pair as a second recommendation phrase.

According to an embodiment of the present disclosure, the apparatus further includes a training module, configured to train the double-tower recommendation model, where the training module includes a second obtaining unit, a fifth input/output unit, a sixth input/output unit, a calculating unit, and an iteration unit.

The second acquisition unit is used for acquiring a query term training text, a positive sample of the query term training text and a negative sample of the query term training text, wherein the query term training text is associated with historical query words input by a historical user in a search box, and the positive sample and the negative sample are associated with a plurality of historical search records of the historical user in a preset historical time period;

the sixth input and output unit is used for inputting the positive samples of the query term training texts and the negative samples of the query term training texts into a right tower side network of the double-tower recommendation model to be trained so as to output second training vectors of the positive samples and the negative samples through the right tower side network to be trained;

According to the embodiment of the present disclosure, any plurality of the obtaining module 501, the input/output module 502, the first determining module 503, the second determining module 504, and the recommending module 505 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the obtaining module 501, the input/output module 502, the first determining module 503, the second determining module 504, and the recommending module 505 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented in any one of three manners of software, hardware, and firmware, or in a suitable combination of any several of the three manners. Alternatively, at least one of the obtaining module 501, the input-output module 502, the first determining module 503, the second determining module 504, and the recommending module 505 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.

As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. Processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 601 may also include on-board memory for caching purposes. The processor 601 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.

In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. The processor 601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or RAM 603. It is to be noted that the programs may also be stored in one or more memories other than the ROM 602 and RAM 603. The processor 601 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.

According to an embodiment of the present disclosure, the electronic device 600 may also include an input/output (I/O) interface 605, which is also connected to the bus 604. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement a method according to an embodiment of the disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 602 and/or RAM 603 described above and/or one or more memories other than the ROM 602 and RAM 603.

Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the search box drop-down phrase recommendation method provided by the embodiment of the disclosure.

The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 601. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 609, and/or installed from the removable medium 611. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.

In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the disclosure, and these alternatives and modifications are intended to fall within the scope of the disclosure.

Claims

1. A search box pull-down phrase recommendation method, comprising:

inputting the query term text into a left tower side network of a double-tower recommendation model so as to respectively output a semantic vector and a query vector of the query term text through different layers of the left tower side network;

determining a first class of recommendation phrases matching the semantics of the query words by using the semantic vector and a pre-constructed semantic index list;

determining a second class of recommendation phrases matching a prefix of the query words by using the query vector and a pre-constructed matching index list, wherein the semantic index list and the matching index list are respectively constructed by using different layers of a right tower side network of the double-tower recommendation model according to a plurality of historical search records of historical users in a preset historical time period;

recommending the first class of recommendation phrases and the second class of recommendation phrases to the target user, wherein the first class of recommendation phrases and the second class of recommendation phrases are used to generate a business requirement book.

2. The method of claim 1, wherein the left tower side network comprises a first left network layer and a second left network layer connected in series, and the outputting of the semantic vector and the query vector of the query term text through different layers of the left tower side network respectively comprises:

inputting the semantic vector of the query term text into the second left network layer to output the query vector of the query term text through the second left network layer.

3. The method of claim 1, wherein the right tower side network comprises a first right network layer and a second right network layer connected in series, and the semantic index list is constructed by:

acquiring a plurality of historical search records of historical users in the preset historical time period, wherein the plurality of historical search records comprise a plurality of historical candidate phrases used for generating a historical business requirement book;

inputting the plurality of historical candidate phrases into the first right network layer to output, by the first right network layer, a plurality of first candidate vectors associated with the plurality of historical candidate phrases;

combining the plurality of first candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of semantic index pairs, wherein the plurality of semantic index pairs form the semantic index list.

4. The method of claim 3, wherein the matching index list is constructed by:

inputting the plurality of first candidate vectors into the second right network layer to output, by the second right network layer, a plurality of second candidate vectors associated with the plurality of historical candidate phrases;

combining the plurality of second candidate vectors and the plurality of historical candidate phrases in pairs to form a plurality of matching index pairs, wherein the plurality of matching index pairs form the matching index list.

5. The method of claim 4, wherein the determining of a second class of recommendation phrases matching a prefix of the query words by using the query vector and a pre-constructed matching index list comprises:

determining a target matching index pair from the plurality of matching index pairs in the matching index list, wherein a target second candidate vector in the target matching index pair and the query vector satisfy a preset similarity condition;

extracting a target historical candidate phrase from the target matching index pair as a recommendation phrase of the second class.

6. The method of claim 5, wherein the determining a target matching index pair from the plurality of matching index pairs in the matching index list comprises:

determining a target matching index pair from the plurality of matching index pairs in the matching index list using a predetermined retrieval tool.

7. The method of claim 1, wherein the double-tower recommendation model is trained by:

inputting the positive sample of the query term training text and the negative sample of the query term training text into a right tower side network of a double-tower recommendation model to be trained, and outputting a second training vector of the positive sample and the negative sample through the right tower side network to be trained;

and obtaining the trained double-tower recommendation model when the similarity between the first training vector and the second training vector satisfies a preset termination condition.

8. A search box pull-down phrase recommendation apparatus comprising:

recommending the first class of recommendation phrases and the second class of recommendation phrases to the target user, wherein the first class of recommendation phrases and the second class of recommendation phrases are used to generate a business requirement book.

9. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method recited in any of claims 1-7.

10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 7.

11. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
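
For illustration only, the flow recited in claims 1 to 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the character-hash featurizer `embed_text`, the dense tanh layers in `Tower`, the untrained random weights, cosine similarity, and the top-k cut-off used as the "preset similarity condition" are all hypothetical stand-ins, and a plain sort takes the place of the predetermined retrieval tool.

```python
import math
import random


def embed_text(text, dim=16):
    """Toy featurizer (assumption): length-normalised hashed character counts."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    n = max(len(text), 1)
    return [v / n for v in vec]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class Tower:
    """Two network layers connected in series (hypothetical dense + tanh layers)."""

    def __init__(self, rng, in_dim=16, hidden_dim=8, out_dim=4):
        # Random weights stand in for trained parameters.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    @staticmethod
    def _dense(weights, vec):
        return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]

    def forward(self, text):
        first = self._dense(self.w1, embed_text(text))  # first-layer output
        second = self._dense(self.w2, first)            # second layer consumes the first
        return first, second


def build_indexes(right_tower, historical_phrases):
    """Pair each historical candidate phrase with its first- and second-layer vectors."""
    semantic_index, matching_index = [], []
    for phrase in historical_phrases:
        first_vec, second_vec = right_tower.forward(phrase)
        semantic_index.append((first_vec, phrase))
        matching_index.append((second_vec, phrase))
    return semantic_index, matching_index


def top_k(vector, index, k):
    """Return the k phrases whose index vectors are most similar to `vector`."""
    ranked = sorted(index, key=lambda pair: cosine(vector, pair[0]), reverse=True)
    return [phrase for _, phrase in ranked[:k]]


def recommend(left_tower, semantic_index, matching_index, query_text, k=2):
    """Return (first-class, second-class) recommendation phrases for a query."""
    semantic_vec, query_vec = left_tower.forward(query_text)
    return top_k(semantic_vec, semantic_index, k), top_k(query_vec, matching_index, k)


if __name__ == "__main__":
    rng = random.Random(0)
    # Demo-only assumption: one tower object is reused as both left and right tower.
    tower = Tower(rng)
    phrases = ["transfer limit", "transfer fee", "loan approval", "account opening"]
    semantic_index, matching_index = build_indexes(tower, phrases)
    print(recommend(tower, semantic_index, matching_index, "transfer"))
```

Because the weights here are random rather than trained, the returned phrases carry no semantic or prefix meaning; the sketch only shows how the two layers of each tower feed two separate index lists and two separate lookups.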