CN112507188A - Method, device, equipment and medium for generating candidate search words - Google Patents


Info

Publication number
CN112507188A
CN112507188A (application CN202011383662.3A)
Authority
CN
China
Prior art keywords
event
vector
generating
argument
candidate search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011383662.3A
Other languages
Chinese (zh)
Other versions
CN112507188B (en)
Inventor
潘禄
陈玉光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011383662.3A priority Critical patent/CN112507188B/en
Publication of CN112507188A publication Critical patent/CN112507188A/en
Application granted granted Critical
Publication of CN112507188B publication Critical patent/CN112507188B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/951 Indexing; Web crawling techniques
    • G06F 16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F 16/3346 Query execution using probabilistic model
    • G06F 16/338 Presentation of query results
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/045 Neural networks; Combinations of networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Neural networks; Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method, device, equipment and medium for generating candidate search words, relating to the technical fields of natural language processing and knowledge graphs. The specific implementation scheme is as follows: after the event-related text of a target event is obtained, a plurality of focus probability vectors are generated from the event-related text, an event semantic representation vector of the target event is generated from the structured information of the target event, and a plurality of candidate search words are then generated from the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors. By introducing a plurality of focus probability vectors to guide the generation of the candidate search words, the efficiency and accuracy of candidate search word generation are effectively improved, as is the diversity of the generated candidates.

Description

Method, device, equipment and medium for generating candidate search words
Technical Field
The application discloses a method, device, equipment and medium for generating candidate search words, and relates to the technical field of deep learning, in particular to natural language processing and knowledge graphs.
Background
With the development of the internet, the amount of available information has grown enormously. To improve the efficiency of information acquisition and save time, a user can enter search words in the search box of a website to obtain corresponding resources. For example, one or more search words may be entered in the search box of an encyclopedia website to obtain the corresponding encyclopedia content.
However, most events currently have no or only a few associated search words, so users fail to discover these events when searching. It is therefore important to provide a method for generating search words for events.
Disclosure of Invention
The application provides a method, a device, equipment and a medium for generating candidate search terms.
In one aspect of the present application, a method for generating a candidate search term is provided, including:
acquiring an event related text of a target event, and generating a plurality of focus probability vectors according to the event related text;
generating an event semantic representation vector of the target event according to the structural information of the target event;
and generating a plurality of candidate search words according to the event semantic representation vector, the text semantic representation vector of the event-related text and the plurality of focus probability vectors.
As a possible implementation manner of an aspect of the present application, the generating a plurality of focus probability vectors according to the event-related text includes:
inputting the event-related text to a multi-expert model to generate a plurality of expert vectors, wherein the multi-expert model has different points of interest for the event-related text;
inputting the plurality of expert vectors to a connection layer to generate the plurality of focus probability vectors.
As another possible implementation manner of an aspect of the present application, the generating an event semantic representation vector of the target event according to the structural information of the target event includes:
extracting argument information from the structural information of the target event, and generating argument semantic expression vectors according to the argument information;
and generating an event semantic representation vector of the target event according to the argument semantic representation vector.
As another possible implementation manner of an aspect of the present application, the extracting argument information from the structured information of the target event and generating an argument semantic representation vector according to the argument information includes:
extracting at least one set of argument information from the structured information; wherein each group of argument information comprises an argument role and an argument value;
inputting the argument roles and the argument values belonging to the same set of argument information into a first bidirectional LSTM model to generate argument role vectors and argument value vectors;
and splicing the argument role vector and the argument value vector to generate the argument semantic representation vector.
As another possible implementation manner of one aspect of the present application, the argument information is a plurality of sets, each set of argument information has the corresponding argument semantic representation vector, and the generating an event semantic representation vector of the target event according to the argument semantic representation vector includes:
and inputting the argument semantic representation vector corresponding to each group of argument information into a second bidirectional LSTM model to generate the event semantic representation vector.
As another possible implementation manner of an aspect of the present application, before generating a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors, the method further includes:
segmenting the event related text to generate a plurality of words, and acquiring a plurality of word encoding vectors of the words;
inputting the plurality of word encoding vectors to a third bi-directional LSTM model to generate the text semantic representation vector.
As another possible implementation manner of an aspect of the present application, the generating a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors includes:
splicing one of the focus probability vectors with the event semantic representation vector and the text semantic representation vector to obtain a spliced vector;
and inputting the splicing vector into a decoder to obtain one candidate search word output by the decoder.
As another possible implementation manner of an aspect of the present application, the decoder is configured to perform a plurality of decoding processes in a loop, where each decoding process is used to decode one character in the candidate search term;
wherein the decoder comprises a hidden layer and an output layer;
the hidden layer is used for generating a hidden state of the decoding process according to the splicing vector, the hidden state indication vector and the output of the output layer in the last decoding process; the hidden state indication vector is generated according to the hidden state generated by the hidden layer in the last decoding process;
and the output layer is used for outputting the characters decoded in the decoding process according to the hidden state of the decoding process.
As another possible implementation manner of an aspect of the present application, the hidden state indication vector includes the hidden state generated by the hidden layer in the last decoding process, and an indication vector used for indicating whether a copy mechanism or a generation mechanism is adopted;
wherein, if the character output in the last decoding process appears in the event-related text and/or the structured information, the copy mechanism is adopted, and the value of the indication vector represents a combination of one or more of: the word vector of the character output in the last decoding process, and its position and context in the event-related text and/or the structured information;
and if the character output in the last decoding process does not appear in the event-related text and/or the structured information, the generation mechanism is adopted, and the value of the indication vector is zero.
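The copy/generation switch described above can be illustrated with a minimal pure-Python sketch. The function name, the flag-plus-position layout, and the vector dimension are illustrative assumptions, not the patent's implementation; a real indication vector would carry the learned word vector and context features:

```python
def indicator_vector(prev_char, source_text, dim=4):
    """If the character output in the last decoding step appears in the
    event-related text (or structured information), the copy mechanism
    applies and the indicator carries information about that character
    (here, a flag plus its first position, as a toy stand-in for word
    vector, position, and context); otherwise the generation mechanism
    applies and the indicator is all zeros."""
    if prev_char in source_text:
        pos = source_text.index(prev_char)
        return [1.0, float(pos)] + [0.0] * (dim - 2)
    return [0.0] * dim
```

For example, a character that was copied from the source text yields a non-zero indicator, while a freshly generated character yields the zero vector.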
According to another aspect of the present application, there is provided a candidate search term generation apparatus, including:
the system comprises a first generation module, a second generation module and a third generation module, wherein the first generation module is used for acquiring an event related text of a target event and generating a plurality of focus probability vectors according to the event related text;
the second generation module is used for generating an event semantic expression vector of the target event according to the structural information of the target event;
and the third generation module is used for generating a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text and the plurality of focus probability vectors.
According to another aspect of the present application, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for generating candidate search terms set forth in the above embodiments.
According to another aspect of the present application, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of generating a candidate search word described in the above embodiments.
According to another aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method for generating a candidate search term described in the above embodiments.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a schematic flowchart of a method for generating a candidate search term according to an embodiment of the present application;
fig. 2 is a schematic flowchart of another method for generating candidate search terms according to an embodiment of the present application;
FIG. 3 is a sub-flow diagram for generating an event semantic representation vector according to an embodiment of the present disclosure;
FIG. 4 is a sub-flow diagram for generating text semantic representation vectors according to an embodiment of the present disclosure;
FIG. 5 is a schematic sub-flow chart for generating candidate search terms according to an embodiment of the present application;
FIG. 6 is a diagram illustrating an example of a model for generating search terms according to an embodiment of the present application;
FIG. 7 is a diagram illustrating an example of generating candidate search terms according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an apparatus for generating a candidate search term according to an embodiment of the present application;
fig. 9 is a block diagram of an electronic device for implementing a method for generating a candidate search term according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A method, an apparatus, a device, and a storage medium for generating candidate search terms according to embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a method for generating a candidate search term according to an embodiment of the present application.
The embodiment of the present application is exemplified by the method for generating a candidate search term being configured in a device for generating a candidate search term, and the device for generating a candidate search term may be applied to any electronic device, so that the electronic device may perform a function of generating a candidate search term.
The electronic device may be a personal computer (PC), a cloud device, a mobile device, and the like; the mobile device may be a hardware device running any of various operating systems, such as a mobile phone, tablet computer, personal digital assistant, wearable device, or in-vehicle device.
As shown in fig. 1, the method for generating a candidate search term may include the following steps:
step 101, obtaining an event-related text of a target event, and generating a plurality of focus probability vectors according to the event-related text.
In the embodiment of the application, the event-related text of the target event may be obtained by crawling websites, by querying an event database, from an event graph, or in other manners according to the actual application scenario, which is not limited herein. An event, as a form of presenting information, is defined as an objective fact in which a specific person or object interacts at a specific time and place, and is generally expressed at sentence level.
The event graph is a heterogeneous graph formed by a plurality of events, and comprises the events and attribute information of the entities.
In order to generate a plurality of candidate search terms, a plurality of focus probability vectors are generated according to an event-related text.
As a possible implementation, a hard mixture of experts (hard-MoE) may be adopted, introducing a multinomial latent variable z ∈ {1, …, K} with K experts, where each expert focuses on a different part of the event-related text, so that K focuses are learned jointly. The experts are used to generate the different focus probability vectors and are randomly initialized in the model; the same context combined with different expert vectors can thus generate different focus probability vectors.
Optionally, the event-related text of the target event may be input into a Bi-GRU (Bidirectional Gated Recurrent Unit) for encoding; the hidden state of each word at the current time step, the hidden states at the initial and final times, and the expert vector are then normalized through two fully connected layers and a sigmoid activation function to obtain the plurality of focus probability vectors.
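As a rough illustration of this step, the following pure-Python sketch scores encoded token states against K randomly initialized expert vectors and normalizes the scores into focus probability vectors over the tokens. It replaces the Bi-GRU, the two fully connected layers, and all learned parameters with toy stand-ins (a dot product and a softmax); every name and dimension here is an illustrative assumption:

```python
import math
import random

random.seed(0)

K = 3    # number of experts (hypothetical value)
DIM = 4  # hidden size (hypothetical value)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Randomly initialized expert vectors, as the text describes.
experts = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(K)]

def focus_probabilities(token_states, expert):
    """Score each encoded token against one expert vector (a dot product
    standing in for the fully connected layers), then normalize the
    scores into a focus probability vector over the tokens."""
    scores = [sum(h * e for h, e in zip(state, expert)) for state in token_states]
    return softmax(scores)

# token_states would come from the Bi-GRU over the event-related text;
# here they are random stand-ins for a 5-token text.
token_states = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
focus_vectors = [focus_probabilities(token_states, e) for e in experts]
```

Each expert yields a different distribution over the same tokens, which is what lets the K experts attend to different parts of the text.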
And 102, generating an event semantic expression vector of the target event according to the structural information of the target event.
Structured information is information that, after analysis, can be decomposed into multiple interrelated components, where each component has a clear hierarchical structure, is managed by a database during use and maintenance, and follows certain operational specifications. Information that cannot be fully digitized is referred to as unstructured information, such as document files, pictures, drawings, and microfilm. The massive information appearing on the internet falls roughly into three types: structured, semi-structured, and unstructured.
In the embodiment of the application, after the target event is acquired, event extraction may be performed on the target event to extract basic information of the event, for example, trigger words of the event, the type of the event, participants of the event, occurrence time and place, and the like, and the basic information is presented in a structured form, so as to obtain structured information of the target event.
As a possible implementation manner, after the target event is obtained, a parser generation tool may be used to generate a parser for parsing the target event based on a grammar rule, and then the target event is parsed by the parser to determine a parse tree corresponding to the target event, so as to determine the structured information of the target event based on the parse tree.
In the embodiment of the present application, the structured information of the event may include a trigger word of the event, an event type, an argument and a corresponding argument role, and the like.
The trigger word of the event refers to a core word of the event, and is mostly a verb or an action noun. Argument, referring to the participant of the event, is mainly composed of entity, value, time. The argument role refers to the role that the event argument plays in the event, such as an attacker, a victim, an acquirer, and the like.
As an example, suppose event 1 is "Company A plans to acquire Company B". Performing event extraction on event 1 yields its structured information: the trigger word of the event is "plans to acquire", the arguments are "Company A" and "Company B", and the argument roles are "acquirer" and "acquiree".
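The structured information of this example can be written out concretely. The dictionary below is only an illustration; the field names and the role label "acquiree" are assumptions, not the patent's schema:

```python
# Structured information of event 1 ("Company A plans to acquire
# Company B") as a plain dictionary: trigger word, event type, and
# argument role/value pairs.
event_1 = {
    "trigger": "plans to acquire",
    "event_type": "acquisition",
    "arguments": [
        {"role": "acquirer", "value": "Company A"},
        {"role": "acquiree", "value": "Company B"},
    ],
}

roles = [a["role"] for a in event_1["arguments"]]
```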
As a possible implementation manner of the embodiment of the present application, the structured information of the target event may include argument information, where the argument information may include an argument role and an argument value. After argument information is extracted from the structured information of the target event, the extracted argument information can be encoded by using an encoder to obtain a corresponding argument semantic representation vector. Further, the argument semantic representation vector may be encoded with an encoder to generate an event semantic representation vector for the target event.
For example, after the structured information of event 1 is acquired, the argument information "acquirer: Company A; acquiree: Company B" can be extracted from it.
The argument roles and the argument values reflect attribute information of arguments, and are favorable for generating subject and object parts in the candidate search words.
As a possible implementation manner, a Bi-directional Long Short-Term Memory (Bi-LSTM) model can be adopted to encode the argument semantic expression vector so as to obtain an event semantic expression vector of the target event.
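The role-and-value encoding followed by splicing can be sketched in a few lines of pure Python. The toy character-folding embedding below is an illustrative stand-in for the learned Bi-LSTM encoding; only the shape of the computation (encode role, encode value, concatenate) follows the text:

```python
def embed(text, dim=4):
    """Toy deterministic embedding standing in for the learned Bi-LSTM
    encoding: fold character codes into a fixed-size vector."""
    vec = [0.0] * dim
    for pos, ch in enumerate(text):
        vec[pos % dim] += ord(ch) / 1000.0
    return vec

def encode_argument(role, value):
    """Encode an argument role and an argument value separately, then
    splice (concatenate) the two vectors into one argument semantic
    representation vector, as described above."""
    return embed(role) + embed(value)
```

A second encoder would then run over the per-argument vectors to produce the event semantic representation vector.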
It should be noted that other deep learning models may also be used to encode the argument semantic expression vector, which is not limited herein.
Step 103, generating a plurality of candidate search words according to the event semantic expression vector, the text semantic expression vector of the event-related text and the plurality of focus probability vectors.
The text semantic expression vector of the event-related text refers to a text semantic expression vector obtained by semantically encoding the event-related text of the target event.
In the embodiment of the application, a plurality of focus probability vectors are generated according to an event-related text, an event semantic expression vector of a target event is generated according to structural information of the target event, and after a text semantic expression vector is obtained by semantically encoding the event-related text of the target event, the event semantic expression vector, the text semantic expression vector of the event-related text and the plurality of focus probability vectors can be jointly input into a decoder so as to be decoded by the decoder to obtain a plurality of candidate search words.
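The per-candidate decoder input can be sketched as a simple concatenation, one spliced vector per focus probability vector. The 2-dimensional toy vectors below are illustrative; real representation vectors would be far larger:

```python
def build_decoder_input(event_vec, text_vec, focus_vec):
    """Splice (concatenate) the event semantic representation vector, the
    text semantic representation vector, and one focus probability vector
    into a single decoder input; one candidate search word is decoded per
    focus probability vector."""
    return event_vec + text_vec + focus_vec

event_vec = [0.1, 0.2]
text_vec = [0.3, 0.4]
focus_vectors = [[0.7, 0.3], [0.2, 0.8]]
decoder_inputs = [build_decoder_input(event_vec, text_vec, f) for f in focus_vectors]
```

Two focus probability vectors therefore produce two distinct decoder inputs, and hence two candidate search words.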
Therefore, the plurality of candidate search terms of the target event are generated, and the generated plurality of candidate search terms are recommended to different users, so that the attention degree of the users is increased, the attention speed of the event is favorably improved, and the stickiness of the users can be further improved.
As a possible case of the embodiment of the present application, after a plurality of candidate search terms are generated according to an event semantic representation vector, a text semantic representation vector of an event-related text, and a plurality of focus probability vectors, the generated candidate search terms may be input to an evaluator based on a syntactic dependency tree, where the evaluator uses a new reward mechanism, evaluates the grammatical validity of a current search term based on the similarity of the syntactic dependency tree of the generated search term and a standard search term, and feeds back a reward to the generator, thereby optimizing the generation process of the search terms, and facilitating to improve the grammatical correctness and fluency of the candidate search terms.
According to the method for generating the candidate search words, after the event-related text of the target event is acquired, a plurality of focus probability vectors are generated according to the event-related text, the event semantic expression vector of the target event is generated according to the structural information of the target event, and further, a plurality of candidate search words are generated according to the event semantic expression vector, the text semantic expression vector of the event-related text and the plurality of focus probability vectors. Therefore, by introducing the plurality of focus probability vectors which are used for guiding the generation of the plurality of candidate search terms, the generation efficiency and accuracy of the candidate search terms are effectively improved, and the diversity of the candidate search term generation is also improved.
On the basis of the above embodiment, as a possible implementation manner, when a plurality of focus probability vectors are generated according to an event-related text, a plurality of expert models can be used to focus on different contents in the event-related text to guide generation of the plurality of focus probability vectors, and then the plurality of focus probability vectors are used to guide generation of a plurality of candidate search terms. The above process is described in detail with reference to fig. 2, and fig. 2 is a flowchart illustrating another method for generating a candidate search term according to an embodiment of the present application.
As shown in fig. 2, the method for generating a candidate search term may include the following steps:
step 201, obtaining an event related text of a target event.
In the embodiment of the present application, the implementation process of step 201 may refer to the implementation process of step 101 in the foregoing embodiment, and is not described herein again.
Step 202, inputting the event-related text into a plurality of expert models to generate a plurality of expert vectors, wherein the attention points of the plurality of expert models to the event-related text are different.
The principle of the multi-expert model is to train multiple neural networks (i.e., experts), each of which is designated to handle a different portion of the event-related text. That is, the event-related text may correspond to a plurality of different candidate search words whose characteristics differ considerably, so the content of each part is processed by its designated neural network; the model additionally has a managing neural net (MNN) for determining which expert network a given input should be handed to for processing.
In the embodiment of the application, after the event-related text of the target event is acquired, the event-related text is input to the multi-expert model, and due to the fact that a plurality of experts are introduced into the multi-expert model, the attention points of the experts to the event-related text are different, and therefore a plurality of expert vectors can be generated.
The multi-expert model is obtained by training with training samples, and the mapping relation between the event text and the expert vectors is obtained by learning, so that the multi-expert model can generate a plurality of expert vectors after the event related text is input into the multi-expert model.
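The managing network's role can be shown with a toy hard-routing sketch. The function name and weights are illustrative assumptions; a trained MNN would learn these gate weights rather than take them as arguments:

```python
def route(input_vec, gate_weights):
    """Toy managing ('gating') network: score the input against each
    expert's gate weights and hand it to the highest-scoring expert,
    as in a hard mixture of experts."""
    scores = [sum(w * x for w, x in zip(ws, input_vec)) for ws in gate_weights]
    return scores.index(max(scores))
```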
Step 203, inputting a plurality of expert vectors into the connectivity layer to generate a plurality of focus probability vectors.
In the embodiment of the application, after the event-related text is input to the multi-expert model to generate the plurality of expert vectors, the expert vectors can be input to the connection layer; that is, a neural network is used to further extract features from the expert vectors to obtain feature vectors, which are then passed through an activation function to generate the plurality of focus probability vectors.
The activation function is a function that runs on a neuron of the artificial neural network and is responsible for mapping an input of the neuron to an output.
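For instance, the sigmoid activation mentioned earlier maps each neuron input to the interval (0, 1):

```python
import math

def sigmoid(x):
    """Activation function mapping a neuron's input to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))
```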
And step 204, generating an event semantic expression vector of the target event according to the structural information of the target event.
Step 205, a plurality of candidate search terms are generated according to the event semantic representation vector, the text semantic representation vector of the event-related text and the plurality of focus probability vectors.
In the embodiment of the present application, the implementation processes of step 204 and step 205 may refer to the implementation processes of steps 102 and 103 in the foregoing embodiment, and are not described herein again.
According to the method for generating the candidate search words, after the event-related text of the target event is obtained, the event-related text is input to the multi-expert model to generate a plurality of expert vectors, the expert vectors are input to the connecting layer to generate a plurality of focus probability vectors, and after the event semantic expression vector of the target event is generated according to the structural information of the target event, a plurality of candidate search words are generated according to the event semantic expression vector, the text semantic expression vector of the event-related text and the focus probability vectors. Therefore, a plurality of focus probability vectors are generated by paying attention to different contents in the event related text through the multi-expert model, and then a plurality of candidate search terms are generated based on guidance of the plurality of focus probability vectors, so that the generation efficiency and accuracy of the candidate search terms are effectively improved, and the diversity of the candidate search terms is promoted.
On the basis of the above embodiment, since the structured information includes argument information, the argument information may be encoded to generate an argument semantic representation vector, and an event semantic representation vector of the target event may be generated according to the argument semantic representation vector. The above process is described in detail with reference to fig. 3, and fig. 3 is a sub-flow diagram for generating an event semantic representation vector according to an embodiment of the present application.
As shown in fig. 3, the step of generating the event semantic representation vector is as follows:
step 301, extracting argument information from the structured information of the target event, and generating argument semantic representation vectors according to the argument information.
The argument information may include argument roles and argument values, among others. It should be noted that the argument information extracted from the structured information of the target event is not limited to one group, for example, two groups of argument information may be extracted from the structured information of the target event, and each group of argument information includes an argument role and a corresponding argument value.
In the embodiment of the application, after the structured information of the target event is acquired, argument information can be extracted from the structured information of the target event, and further, the argument information is encoded to generate an argument semantic expression vector. The argument information obtained by extraction can be encoded by an encoder to obtain a corresponding argument semantic representation vector.
The encoder is a special neural network used for feature extraction and data dimensionality reduction. The simplest encoder consists of one input layer, one hidden layer, and one output layer. The encoder may map the input vector to obtain an encoded vector.
The encoder may be a CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bidirectional Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory network), or the like.
As a possible implementation, at least one set of argument information may be extracted from the structured information, where each set of argument information includes an argument role and an argument value. Further, the argument role and argument value belonging to the same set of argument information are input into the first bidirectional LSTM model to generate an argument role vector and an argument value vector. Further, the argument role vector and the argument value vector are spliced to generate the argument semantic representation vector. Generating the argument role vector and the argument value vector with an LSTM model in this way improves the accuracy of vector generation.
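The role/value encoding and splicing can be illustrated with a deliberately tiny sketch. The single-unit, unidirectional cell with one shared weight and the numeric stand-ins for characters are all illustrative assumptions; the patent's first model is a learned bidirectional LSTM.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    # One step of a single-unit LSTM cell: input, forget, and output
    # gates plus a candidate state, all scalar for brevity.
    i = sigmoid(w * x + w * h)    # input gate
    f = sigmoid(w * x + w * h)    # forget gate
    o = sigmoid(w * x + w * h)    # output gate
    g = math.tanh(w * x + w * h)  # candidate cell state
    c = f * c + i * g
    return o * math.tanh(c), c

def encode_sequence(xs, w=0.5):
    # Run the cell over the sequence; return the final hidden state.
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h

# Toy numeric codes stand in for the characters of an argument role
# and an argument value belonging to the same set.
role_vec = encode_sequence([0.1, 0.2])
value_vec = encode_sequence([0.3, 0.4, 0.5])
# Splicing (concatenation) yields the argument semantic representation.
argument_repr = [role_vec, value_vec]
```

In the real model each encoding would be a vector rather than a scalar, and the concatenation would join the two vectors end to end.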
For the convenience of distinction, in the present application, an LSTM model that encodes argument roles and argument values of argument information is referred to as a first bidirectional LSTM model, an LSTM model that encodes argument semantic representation vectors is referred to as a second bidirectional LSTM model, and an LSTM model that encodes a plurality of word encoding vectors after event-related text word segmentation is referred to as a third bidirectional LSTM model.
The LSTM model is a variant of the RNN. Owing to its design characteristics, LSTM is well suited to modeling sequential data such as text. A bidirectional LSTM model is formed by combining a forward LSTM and a backward LSTM.
It should be noted that other deep learning models may be used to encode argument information, which is not limited herein.
Step 302, generating an event semantic representation vector of the target event according to the argument semantic representation vector.
In the embodiment of the application, after the argument information is encoded to generate the argument semantic representation vector, the argument semantic representation vector can be encoded by using an encoder to generate the event semantic representation vector of the target event.
As a possible case of the embodiment of the present application, when multiple sets of argument information are extracted from the structured information of the target event and each set has a corresponding argument semantic representation vector, the argument semantic representation vector corresponding to each set may be input to the second bidirectional LSTM model for encoding, so as to generate the event semantic representation vector. When the bidirectional LSTM model encodes the argument semantic representation vectors of all sets in this way, its output is jointly determined by multiple inputs, which helps improve the accuracy of the generated event semantic representation vector.
It should be noted that other deep learning models may also be used to encode the argument semantic expression vector, which is not limited herein.
In the embodiment of the application, argument information is extracted from the structural information of the target event, an argument semantic representation vector is generated according to the argument information, and further an event semantic representation vector of the target event is generated according to the argument semantic representation vector. Therefore, the argument information extracted from the structured information comprises argument roles and argument values, which can represent attribute information of arguments, and is beneficial to generating the subject and object parts in the candidate search words, so that the accuracy of generating the event semantic expression vector is improved.
As a possible implementation manner, after the event-related text is acquired, the event-related text needs to be encoded to generate a text semantic expression vector, which is described in detail below with reference to fig. 4, where fig. 4 is a schematic sub-flow diagram for generating the text semantic expression vector according to an embodiment of the present application.
As shown in fig. 4, the steps of generating the text semantic representation vector are as follows:
step 401, performing word segmentation on the event-related text to generate a plurality of words, and obtaining a plurality of word encoding vectors of the plurality of words.
Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a certain standard. For example, if the text is "Zhang San came to Region A", the plurality of words generated by segmenting the text is "Zhang San / came to / Region A".
In the embodiment of the application, after the event-related text of the target event is acquired, the event-related text can be preprocessed, that is, the event-related text is segmented to generate a plurality of words.
Alternatively, the event-related text may be segmented using a dictionary-based word segmentation method to generate the plurality of words. First, a unified dictionary table is established. When the event-related text of the target event needs to be segmented, the text is first cut into a plurality of parts, and each part is matched against the dictionary. If a part is found in the dictionary, it is segmented successfully as a word; otherwise, the part is further split and matched until segmentation succeeds.
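A common concrete form of this dictionary method is forward maximum matching, sketched below; the toy dictionary and the 4-character window are illustrative assumptions, not values from the patent.

```python
def max_match(text, dictionary, max_len=4):
    """Dictionary-based forward maximum matching: repeatedly take the
    longest dictionary entry that prefixes the remaining text, falling
    back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

vocab = {"张三", "来到", "A地区"}
print(max_match("张三来到A地区", vocab))  # ['张三', '来到', 'A地区']
```

Characters not covered by the dictionary fall through to single-character tokens, which mirrors the "continue splitting and matching" fallback described above.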
Optionally, a statistics-based Chinese word segmentation method may be used to segment the event-related text to generate the plurality of words. Statistical methods treat word segmentation as a probability-maximization problem: a sentence is split, and, based on a corpus, the probability that adjacent characters form a word is counted. The more frequently adjacent characters co-occur, the higher the probability that they form a word, and segmentation is performed according to these probability values. A complete corpus is therefore important.
Further, a plurality of words generated by segmenting the event-related text may be input to the encoder to semantically encode each word to obtain a plurality of word encoding vectors corresponding to each word. Wherein the word encoding vector is capable of indicating the semantics of the corresponding word element and its context.
It should be noted that there are many methods for obtaining the word encoding vectors of the plurality of words, but all of them build on the idea that the meaning of any word can be represented by its neighboring words. Currently, the ways of generating word encoding vectors can be divided into statistics-based methods and language-model-based methods. Language-model-based methods take a trained NNLM (neural network language model) and obtain the word encoding vectors as an additional output of the language model. For example, each word may be character-encoded by the bag-of-words model, so as to obtain the word encoding vector corresponding to each word.
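As one simple stand-in for such encodings, tokens can be mapped to one-hot vectors over a small vocabulary. Real word encoding vectors would be dense and context-sensitive, so this is only an illustration of the bookkeeping involved.

```python
def one_hot_vectors(words):
    # Build a vocabulary over the tokens and encode each token as a
    # one-hot vector over that vocabulary.
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    return [[1 if i == index[w] else 0 for i in range(len(vocab))]
            for w in words]

vecs = one_hot_vectors(["Zhang San", "came to", "Region A"])
```

Each row has exactly one nonzero entry; a learned encoder would replace these sparse rows with dense vectors that also reflect context.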
As a possible implementation manner, semantic coding is performed on a plurality of words generated after the event-related text is segmented through one or more layers of RNNs, so as to obtain a word coding vector corresponding to each word. When each word is encoded using the RNN network, at each time, the output word-encoding vector depends not only on the input at the current time, but also takes into account the model "state" at the previous time. Through the dependence on the historical state, the RNN model can effectively represent the context dependence information of the text data.
As another possible implementation manner, a CNN model may also be used to encode a plurality of words generated after the event-related text is participled, so as to obtain a word encoding vector of each word.
It should be noted that there are many methods for obtaining the word encoding vectors of multiple words; for example, BiLSTM, Self-Attention, CNN, and the like may be used.
In the embodiment of the present application, there is no limitation on the encoding technique adopted by the encoder.
Step 402, inputting a plurality of word encoding vectors into a third bi-directional LSTM model to generate a text semantic representation vector.
In the embodiment of the application, after a plurality of word encoding vectors of a plurality of words are obtained, the plurality of word encoding vectors can be input into the third bidirectional LSTM model for semantic encoding to generate text semantic expression vectors.
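The effect of reading the word encoding vectors in both directions can be caricatured with a toy recurrence; the exponential smoothing in place of a learned LSTM and the scalar "word vectors" are both assumptions made only for brevity.

```python
def directional_state(seq):
    # A toy recurrence (exponential smoothing) standing in for one
    # LSTM direction: later inputs influence the final state more.
    h = 0.0
    for x in seq:
        h = 0.5 * h + 0.5 * x
    return h

def bidirectional_encode(word_vectors):
    # Concatenate the final states of a forward and a backward pass,
    # mirroring how a bidirectional LSTM summarizes a sequence from
    # both ends.
    return [directional_state(word_vectors),
            directional_state(word_vectors[::-1])]

state = bidirectional_encode([1.0, 0.0, 0.0])
# The forward half emphasizes the end of the text, the backward half
# its beginning, so the two halves generally differ.
```

This is why a bidirectional encoder captures context on both sides of every token, whereas a unidirectional one only sees the left context.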
Semantic coding processes information in terms of words: it classifies information according to meaning and system, or organizes and summarizes the verbal material according to its linguistic form, identifies the material's basic theses, supporting arguments, and logical structure, and encodes the information according to its semantic features.
In the application, a plurality of words is generated by segmenting the event-related text, a plurality of word encoding vectors of the words is obtained, and the word encoding vectors are input into the third bidirectional LSTM model to generate the text semantic expression vector. In this way, the text related to the target event is encoded into a text semantic expression vector that a computer can process, and generating it through the bidirectional LSTM model improves the accuracy of the generated text semantic expression vector.
Based on the above embodiment, when a plurality of candidate search words are generated based on the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors, a decoder may be used to decode the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors to obtain candidate search words output by the decoder. The above process is described in detail with reference to fig. 5, and fig. 5 is a sub-flow diagram for generating a candidate search term according to an embodiment of the present application.
As shown in fig. 5, the method for generating a candidate search term may include the steps of:
step 501, splicing one focus probability vector of a plurality of focus probability vectors with an event semantic vector and a text semantic expression vector to obtain a spliced vector.
It can be understood that each focus probability vector is used to guide generation of a candidate search term, and one of the plurality of focus probability vectors may be spliced with the event semantic vector and the text semantic representation vector to obtain a spliced vector.
As a possible situation, each focus probability vector of the plurality of focus probability vectors may be spliced with the event semantic vector and the text semantic expression vector to obtain corresponding spliced vectors after splicing.
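The splicing described in step 501 is plain vector concatenation, which a short sketch makes concrete (all dimensions and values here are made up):

```python
def build_decoder_inputs(focus_vectors, event_vec, text_vec):
    # Each focus probability vector is concatenated with the shared
    # event semantic vector and text semantic representation vector,
    # yielding one spliced decoder input per candidate search term.
    return [f + event_vec + text_vec for f in focus_vectors]

inputs = build_decoder_inputs([[0.7, 0.3], [0.2, 0.8]],  # two focus vectors
                              [0.1, 0.9],                # event semantic vector
                              [0.5, 0.5, 0.4])           # text semantic vector
```

Because only the focus part varies between spliced vectors, each decoder run is steered toward different content of the same event.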
Step 502, the concatenated vector is input into a decoder to obtain a candidate search term output by the decoder.
In the embodiment of the application, after the spliced vector is input into the decoder, a candidate search term output by the decoder can be obtained.
As a possible situation, each focus probability vector of the plurality of focus probability vectors is spliced with the event semantic vector and the text semantic expression vector, and the obtained spliced vectors are respectively input to the decoder, so that a plurality of candidate search terms output by the decoder can be obtained.
For example, the decoder may be an RNN or other neural network, and is not limited herein.
In the embodiment of the application, the decoder is used for circularly executing a plurality of decoding processes, and each decoding process is used for decoding to obtain one character in one candidate search word. It is to be understood that, when the concatenated vector is input into the decoder, the decoder obtains one character in one candidate search term every time the decoder performs a decoding process, and the decoder performs the decoding process multiple times in a loop to obtain one candidate search term.
Wherein the decoder may include a concealment layer and an output layer.
And the hidden layer is used for generating the hidden state of the decoding process according to the splicing vector, the hidden state indication vector and the output of the output layer in the last decoding process. The hidden state indication vector is generated according to the hidden state generated by the hidden layer in the last decoding process. The hidden state is called a hidden state because the word cannot be directly recognized from the decoded result after the concatenated vector is decoded.
And the output layer is used for outputting the characters decoded in the decoding process according to the hidden state of the decoding process. And the hidden state indication vector comprises a hidden state generated by the hidden layer in the last decoding process and an indication vector for indicating that a copying mechanism or a generating mechanism is adopted.
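The cyclic decoding process, hidden layer, and output layer described above can be caricatured as follows. The update rule, the weights, and the vocabulary are all hypothetical, since the patent leaves the decoder's internals open.

```python
def decode(concat_vec, vocab, steps=3):
    # Each iteration is one decoding process: the hidden layer updates
    # a state from the spliced vector, the previous hidden state, and
    # the previously output character; the output layer then maps that
    # state to one character of the candidate search term.
    hidden, prev_idx, out = 0.0, 0, []
    for _ in range(steps):
        hidden = 0.5 * hidden + 0.3 * sum(concat_vec) + 0.2 * prev_idx
        idx = int(hidden * 10) % len(vocab)  # toy output layer
        out.append(vocab[idx])
        prev_idx = idx
    return "".join(out)

term = decode([0.1, 0.2], list("abcde"))
```

A real decoder would stop at an end-of-sequence symbol rather than after a fixed number of steps, and its output layer would be a learned projection over the vocabulary.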
In one possible case, if the character output in the last decoding process is in the event-related text and/or the structured information, a copying mechanism is employed. In this case, the indication vector value of the copying mechanism may be one or a combination of: the word vector of the character output in the last decoding process, and its position and context in the event-related text and/or the structured information.
It is understood that in natural language processing or text processing, a vocabulary is typically available. This vocabulary may be preloaded, self-defined, or extracted from the current data set. If another data set contains words that are not in the existing vocabulary, those words are said to be out-of-vocabulary, abbreviated OOV. The OOV problem is a common problem in the generation phase of text processing. In the present application, the candidate search words are generated using a copying mechanism, which not only avoids the OOV problem but also improves the fluency and accuracy of the candidate search words.
In one possible scenario, if the character output in the last decoding process was not in the event-related text and/or structured information, a generation mechanism is employed. Wherein, the indication vector value of the generating mechanism is zero. When the character is generated by using the generation mechanism, the character can be generated from a preset vocabulary.
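The copy-versus-generate decision rule above can be sketched directly. The indicator encoding used here (1 plus the character's first position in the event-related text) is a hypothetical choice; the patent only fixes that the generation mechanism's indicator value is zero.

```python
def choose_mechanism(prev_char, event_text, structured_info):
    # Decide between the copying mechanism and the generation mechanism
    # for the next decoding step: copy if the previously output
    # character appears in the event-related text and/or the
    # structured information, otherwise generate (indicator value 0).
    if prev_char in event_text or prev_char in structured_info:
        # Hypothetical indicator: 1 plus the first position in the
        # event-related text (the patent leaves the exact combination
        # of word vector, position, and context open).
        pos = event_text.find(prev_char)
        return ("copy", 1 + max(pos, 0))
    return ("generate", 0)
```

Under the generation mechanism the next character would then be drawn from the preset vocabulary, while under the copying mechanism it is taken from the source text or structured information.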
According to the method for generating candidate search words, a spliced vector is obtained by splicing one of the plurality of focus probability vectors with the event semantic vector and the text semantic expression vector, and the spliced vector is input into the decoder to obtain a candidate search word output by the decoder. Moreover, a multi-source copying mechanism is used when the decoder decodes the spliced vector, which improves the accuracy of the generated candidate search words.
As an example, referring to fig. 6, after the event-related text of the target event is obtained, a multi-expert model may be used to select different contents in the event-related text, and a sequence corresponding to the different contents in the event-related text is input to the encoder-decoder model to obtain a plurality of candidate search terms.
As an example, as shown in fig. 7, fig. 7 is a diagram illustrating generation of a candidate search term according to an embodiment of the present application. As shown in fig. 7, after the event-related text of the target event is acquired, the event-related text may be extracted to obtain the structured information of the target event, and the argument encoder is used to encode the structured information of the target event, so as to generate an event semantic expression vector of the target event. And coding the event related text of the target event by adopting a text coder to obtain a text semantic expression vector of the event related text. After a plurality of focus probability vectors are generated according to the event-related text, a plurality of candidate search words are generated based on the event semantic representation vector, the text semantic representation vector of the event-related text and the plurality of focus probability vectors.
The generated candidate search words are input to a syntactic-dependency-tree-based evaluator. The evaluator uses a new reward mechanism: it evaluates the grammatical validity of the current search word based on the similarity between the syntactic dependency tree of the generated search word and that of the standard search word, and the reward is fed back to the generator, thereby optimizing the generation process of the search words.
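One simple way such a tree similarity could be computed is the Jaccard overlap of dependency edges; this is a hedged stand-in, since the patent does not disclose its exact similarity measure, and the example edges below are invented.

```python
def dependency_tree_reward(gen_edges, ref_edges):
    # Jaccard overlap of (head, relation, dependent) dependency edges
    # between the generated search word's tree and the standard search
    # word's tree; identical trees score 1.0, disjoint trees 0.0.
    a, b = set(gen_edges), set(ref_edges)
    return len(a & b) / len(a | b) if a | b else 1.0

gen = [("came", "nsubj", "Zhang San"), ("came", "obj", "Region A")]
ref = [("came", "nsubj", "Zhang San"), ("came", "obl", "Region A")]
reward = dependency_tree_reward(gen, ref)  # 1 shared edge of 3 distinct
```

A reward of this form could then be fed back to the generator as the scalar signal the evaluator produces.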
In order to implement the above embodiments, the present application provides a candidate search term generation apparatus.
Fig. 8 is a schematic structural diagram of an apparatus for generating a candidate search term according to an embodiment of the present application.
As shown in fig. 8, the apparatus 800 for generating a candidate search term may include: a first generation module 810, a second generation module 820, and a third generation module 830.
The first generating module 810 is configured to obtain an event-related text of a target event, and generate a plurality of focus probability vectors according to the event-related text.
And a second generating module 820, configured to generate an event semantic representation vector of the target event according to the structural information of the target event.
A third generating module 830, configured to generate a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors.
As a possible scenario, the first generating module 810 may further be configured to:
inputting the event-related text to a multi-expert model to generate a plurality of expert vectors, wherein the multi-expert model has different points of interest for the event-related text; a plurality of expert vectors are input to the connectivity layer to generate a plurality of focus probability vectors.
As another possible case, the second generating module 820 may include:
the extraction unit is used for extracting argument information from the structured information of the target event and generating an argument semantic expression vector according to the argument information;
and the generating unit is used for generating an event semantic representation vector of the target event according to the argument semantic representation vector.
As another possible case, the extracting unit may be further configured to:
extracting at least one group of argument information from the structured information, wherein each group of argument information includes an argument role and an argument value; inputting the argument role and argument value belonging to the same group of argument information into the first bidirectional LSTM model to generate an argument role vector and an argument value vector; and splicing the argument role vector and the argument value vector to generate the argument semantic representation vector.
As another possible case, the argument information is a plurality of sets, each set of argument information has a corresponding argument semantic representation vector, and the extracting unit may be further configured to:
and inputting the argument semantic representation vector corresponding to each set of argument information into the second bidirectional LSTM model to generate an event semantic representation vector.
As another possible case, the apparatus 800 for generating a candidate search term may further include:
the acquisition module is used for segmenting the event related text to generate a plurality of words and acquiring a plurality of word coding vectors of the words;
and the input module is used for inputting the plurality of word coding vectors into the third bidirectional LSTM model to generate a text semantic representation vector.
As another possible scenario, the third generating module 830 may further be configured to:
splicing one focus probability vector of the focus probability vectors with an event semantic vector and a text semantic expression vector to obtain a spliced vector;
and inputting the spliced vector into a decoder to obtain a candidate search word output by the decoder.
As another possible case, the decoder is configured to perform a plurality of decoding processes in a loop, where each decoding process is used to decode a character in a candidate search term;
wherein the decoder comprises a hidden layer and an output layer;
the hidden layer is used for generating a hidden state of the decoding process according to the splicing vector, the hidden state indication vector and the output of the output layer in the last decoding process; the hidden state indication vector is generated according to the hidden state generated by the hidden layer in the last decoding process;
and the output layer is used for outputting the characters decoded in the decoding process according to the hidden state of the decoding process.
As another possible case, the hidden state indication vector includes a hidden state generated by a hidden layer in the last decoding process and an indication vector for indicating that a copying mechanism or a generating mechanism is adopted;
wherein, if the character output in the last decoding process is in the event-related text and/or the structured information, the copying mechanism is adopted, and the indication vector value is one or a combination of: the word vector of the character output in the last decoding process, and its position and context in the event-related text and/or the structured information;
and if the characters output in the last decoding process are not in the event-related text and/or the structured information, adopting a generation mechanism and indicating that the value of the vector is zero.
It should be noted that the foregoing explanation of the embodiment of the candidate search term generation method is also applicable to the candidate search term generation apparatus of this embodiment, and details are not repeated here.
According to the device for generating the candidate search words, after the event-related text of the target event is obtained, a plurality of focus probability vectors are generated according to the event-related text, the event semantic expression vector of the target event is generated according to the structural information of the target event, and a plurality of candidate search words are generated according to the event semantic expression vector, the text semantic expression vector of the event-related text and the plurality of focus probability vectors. Therefore, by introducing the plurality of focus probability vectors which are used for guiding the generation of the plurality of candidate search terms, the generation efficiency and accuracy of the candidate search terms are effectively improved, and the diversity of the candidate search term generation is also improved.
In order to implement the above embodiments, the present application further provides an electronic device.
The electronic device provided by the application can comprise:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating candidate search terms of the above embodiments.
To implement the above embodiments, the present application also proposes a non-transitory computer-readable storage medium storing computer instructions.
The non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the method for generating a candidate search term described in the above embodiments is provided by an embodiment of the present application.
In order to implement the foregoing embodiments, the present application further proposes a computer program product, which includes a computer program, and when being executed by a processor, the computer program implements the method for generating a candidate search term described in the foregoing embodiments.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 9 is a block diagram of an electronic device that implements the method for generating a candidate search term according to the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 9, the electronic apparatus includes: one or more processors 901, memory 902, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 9 illustrates an example of a processor 901.
Memory 902 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the method for generating candidate search terms provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the method of generating a candidate search term provided by the present application.
The memory 902, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the candidate search word generation method in the embodiments of the present application (for example, the first generation module 810, the second generation module 820, and the third generation module 830 shown in fig. 8). The processor 901 executes various functional applications of the server and data processing, i.e., a method for generating a candidate search word in the above-described method embodiments, by executing a non-transitory software program, instructions, and modules stored in the memory 902.
The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 902 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 902 may optionally include memory located remotely from the processor 901, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method for generating a candidate search term may further include: an input device 903 and an output device 904. The processor 901, the memory 902, the input device 903 and the output device 904 may be connected by a bus or other means, and fig. 9 illustrates the connection by a bus as an example.
The input device 903 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as an input device like a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, etc. The output devices 904 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host; it is a host product in a cloud computing service system that addresses the drawbacks of high management difficulty and weak service extensibility found in conventional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
According to the technical solution of the embodiments of the present application, after the event-related text of a target event is acquired, a plurality of focus probability vectors are generated according to the event-related text, an event semantic representation vector of the target event is generated according to the structural information of the target event, and a plurality of candidate search terms are generated according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors. By introducing the plurality of focus probability vectors to guide the generation of the plurality of candidate search terms, the generation efficiency and accuracy of the candidate search terms are effectively improved, and the diversity of the generated candidate search terms is improved as well.
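As a concrete illustration of the pipeline summarized above, the sketch below concatenates ("splices") one focus probability vector with the event semantic representation vector and the text semantic representation vector to form the seed for one decoder run, so that each focus vector yields one candidate search term. The encoders, pooling, and dimensions here are placeholder assumptions for illustration only, not the embodiment's actual models:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_text(num_tokens: int, dim: int) -> np.ndarray:
    """Stand-in for the text encoder (e.g., a bidirectional LSTM over the
    event-related text); returns a single text semantic representation vector."""
    token_states = rng.normal(size=(num_tokens, dim))
    return token_states.mean(axis=0)  # simple mean pooling as a placeholder

def focus_probability_vectors(num_experts: int, num_tokens: int) -> np.ndarray:
    """Stand-in for the multi-expert model plus connection layer: each expert
    yields one probability distribution over the text tokens (its 'focus')."""
    logits = rng.normal(size=(num_experts, num_tokens))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)   # row-wise softmax

dim, num_tokens, num_experts = 8, 5, 3
text_vec = encode_text(num_tokens, dim)
event_vec = rng.normal(size=dim)   # stand-in event semantic representation vector
focus = focus_probability_vectors(num_experts, num_tokens)

# One spliced vector per focus distribution; each would seed one decoder run,
# and each decoder run would output one candidate search term.
spliced = [np.concatenate([f, event_vec, text_vec]) for f in focus]
print(len(spliced), spliced[0].shape)  # 3 seeds, each of size 5 + 8 + 8 = 21
```

Because each expert attends to a different part of the event-related text, the three spliced seeds differ, which is what drives the diversity of the generated candidates.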
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present application is not limited thereto as long as the desired results of the technical solutions disclosed herein can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (21)

1. A method for generating candidate search terms comprises the following steps:
acquiring an event-related text of a target event, and generating a plurality of focus probability vectors according to the event-related text;
generating an event semantic representation vector of the target event according to the structural information of the target event;
and generating a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors.
2. The method for generating a candidate search term according to claim 1, wherein the generating a plurality of focus probability vectors from the event-related text comprises:
inputting the event-related text to a multi-expert model to generate a plurality of expert vectors, wherein the multi-expert model has different points of interest for the event-related text;
inputting the plurality of expert vectors to a connectivity layer to generate the plurality of focus probability vectors.
3. The method for generating the candidate search term according to claim 1, wherein the generating an event semantic representation vector of the target event according to the structural information of the target event comprises:
extracting argument information from the structural information of the target event, and generating argument semantic expression vectors according to the argument information;
and generating an event semantic representation vector of the target event according to the argument semantic representation vector.
4. The method for generating the candidate search term according to claim 3, wherein the extracting argument information from the structured information of the target event and generating an argument semantic representation vector according to the argument information includes:
extracting at least one set of argument information from the structured information; wherein each group of argument information comprises an argument role and an argument value;
inputting the argument roles and the argument values belonging to the same set of argument information into a first bidirectional long-short term memory network (LSTM) model to generate argument role vectors and argument value vectors;
and splicing the argument role vector and the argument value vector to generate the argument semantic representation vector.
5. The method for generating the candidate search term according to claim 4, wherein the argument information is a plurality of sets, each set of argument information has a corresponding argument semantic representation vector, and the generating an event semantic representation vector of the target event according to the argument semantic representation vector comprises:
and inputting the argument semantic representation vector corresponding to each group of argument information into a second bidirectional LSTM model to generate the event semantic representation vector.
6. The method for generating candidate search terms according to claim 1, wherein before generating the plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors, the method further comprises:
segmenting the event-related text to generate a plurality of words, and acquiring word encoding vectors of the plurality of words;
inputting the plurality of word encoding vectors to a third bi-directional LSTM model to generate the text semantic representation vector.
7. The method for generating candidate search terms according to any one of claims 1-6, wherein the generating a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors comprises:
splicing one focus probability vector of the plurality of focus probability vectors with the event semantic representation vector and the text semantic representation vector to obtain a spliced vector;
and inputting the spliced vector into a decoder to obtain one candidate search term output by the decoder.
8. The method for generating candidate search terms according to claim 7, wherein the decoder is configured to perform a plurality of decoding processes in a loop, each decoding process being configured to decode one character of the one candidate search term;
wherein the decoder comprises a hidden layer and an output layer;
the hidden layer is used for generating a hidden state of the current decoding process according to the spliced vector, the hidden state indication vector, and the output of the output layer in the previous decoding process; the hidden state indication vector is generated according to the hidden state generated by the hidden layer in the previous decoding process;
and the output layer is used for outputting the characters decoded in the decoding process according to the hidden state of the decoding process.
9. The method of generating candidate search terms of claim 8,
the hidden state indication vector comprises the hidden state generated by the hidden layer in the previous decoding process and an indication vector used for indicating whether a copying mechanism or a generating mechanism is adopted;
wherein, if the character output in the previous decoding process appears in the event-related text and/or the structured information, the copying mechanism is adopted, and the value of the indication vector represents one or a combination of the word vector of that character and its position and context in the event-related text and/or the structured information;
and if the character output in the previous decoding process does not appear in the event-related text and/or the structured information, the generating mechanism is adopted, and the value of the indication vector is zero.
10. An apparatus for generating a candidate search term, comprising:
a first generation module, configured to acquire an event-related text of a target event and generate a plurality of focus probability vectors according to the event-related text;
a second generation module, configured to generate an event semantic representation vector of the target event according to the structural information of the target event;
and a third generation module, configured to generate a plurality of candidate search terms according to the event semantic representation vector, the text semantic representation vector of the event-related text, and the plurality of focus probability vectors.
11. The apparatus for generating a candidate search term according to claim 10, wherein the first generating module is further configured to:
inputting the event-related text to a multi-expert model to generate a plurality of expert vectors, wherein the multi-expert model has different points of interest for the event-related text;
inputting the plurality of expert vectors to a connectivity layer to generate the plurality of focus probability vectors.
12. The apparatus for generating a candidate search term according to claim 10, wherein the second generating module comprises:
an extracting unit, configured to extract argument information from the structural information of the target event and generate an argument semantic representation vector according to the argument information;
and a generating unit, configured to generate an event semantic representation vector of the target event according to the argument semantic representation vector.
13. The apparatus for generating a candidate search term according to claim 12, wherein the extracting unit is further configured to:
extracting at least one set of argument information from the structured information; wherein each group of argument information comprises an argument role and an argument value;
inputting the argument roles and the argument values belonging to the same set of argument information into a first bi-directional LSTM model to generate the argument role vectors and argument value vectors;
and splicing the argument role vector and the argument value vector to generate the argument semantic representation vector.
14. The apparatus for generating a candidate search term according to claim 13, wherein the argument information is a plurality of sets, each set of argument information having a corresponding argument semantic representation vector, the extracting unit is further configured to:
and inputting the argument semantic representation vector corresponding to each group of argument information into a second bidirectional LSTM model to generate the event semantic representation vector.
15. The apparatus for generating a candidate search term according to claim 10, wherein the apparatus further comprises:
an acquisition module, configured to segment the event-related text to generate a plurality of words and acquire word encoding vectors of the plurality of words;
an input module to input the plurality of word encoding vectors to a third bi-directional LSTM model to generate the text semantic representation vector.
16. The apparatus for generating a candidate search term according to any one of claims 10-15, wherein the third generating module is further configured to:
splicing one focus probability vector of the plurality of focus probability vectors with the event semantic representation vector and the text semantic representation vector to obtain a spliced vector;
and inputting the spliced vector into a decoder to obtain one candidate search term output by the decoder.
17. The apparatus for generating candidate search terms according to claim 16, wherein the decoder is configured to perform a plurality of decoding processes in a loop, each decoding process being configured to decode one character of the one candidate search term;
wherein the decoder comprises a hidden layer and an output layer;
the hidden layer is used for generating a hidden state of the current decoding process according to the spliced vector, the hidden state indication vector, and the output of the output layer in the previous decoding process; the hidden state indication vector is generated according to the hidden state generated by the hidden layer in the previous decoding process;
and the output layer is used for outputting the characters decoded in the decoding process according to the hidden state of the decoding process.
18. The apparatus for generating a candidate search term according to claim 17,
the hidden state indication vector comprises the hidden state generated by the hidden layer in the previous decoding process and an indication vector used for indicating whether a copying mechanism or a generating mechanism is adopted;
wherein, if the character output in the previous decoding process appears in the event-related text and/or the structured information, the copying mechanism is adopted, and the value of the indication vector represents one or a combination of the word vector of that character and its position and context in the event-related text and/or the structured information;
and if the character output in the previous decoding process does not appear in the event-related text and/or the structured information, the generating mechanism is adopted, and the value of the indication vector is zero.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of generating a candidate search term of any of claims 1-9.
20. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of generating candidate search terms of any one of claims 1-9.
21. A computer program product comprising a computer program which, when executed by a processor, implements a method of generating a candidate search term as claimed in any one of claims 1 to 9.
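The copy-versus-generate indicator recited in claims 9 and 18 can be sketched as follows. This is a hypothetical, simplified illustration assuming a character-level lookup that encodes only the word vector and first-occurrence position (context handling omitted); it is not the patented implementation:

```python
def indication_vector(prev_char, source_text, char_vectors, dim):
    """Sketch of the indicator in claims 9/18.

    Copying mechanism: the previously decoded character appears in the
    event-related text / structured information, so the indicator carries
    that character's word vector plus its position in the source.
    Generating mechanism: the character is absent, so the indicator is zero.
    """
    if prev_char in source_text:
        pos = source_text.index(prev_char)            # first occurrence
        vec = char_vectors.get(prev_char, [0.0] * dim)
        return vec + [float(pos)]                     # word vector + position
    return [0.0] * (dim + 1)                          # zero vector -> generate

# Hypothetical usage: 'e' occurs in the source text, so it is copied;
# 'z' does not, so the generating mechanism applies.
source = "earthquake in city A"
vectors = {"e": [0.1, 0.2], "z": [0.9, 0.9]}
print(indication_vector("e", source, vectors, 2))  # copy: [0.1, 0.2, 0.0]
print(indication_vector("z", source, vectors, 2))  # generate: [0.0, 0.0, 0.0]
```

In the full decoder of claim 8, this indicator would be concatenated with the previous hidden state to form the hidden state indication vector consumed at the next decoding step.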
CN202011383662.3A 2020-11-30 2020-11-30 Candidate search term generation method, device, equipment and medium Active CN112507188B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011383662.3A CN112507188B (en) 2020-11-30 2020-11-30 Candidate search term generation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011383662.3A CN112507188B (en) 2020-11-30 2020-11-30 Candidate search term generation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112507188A true CN112507188A (en) 2021-03-16
CN112507188B CN112507188B (en) 2024-02-23

Family

ID=74968957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011383662.3A Active CN112507188B (en) 2020-11-30 2020-11-30 Candidate search term generation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112507188B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114996294A (en) * 2022-05-26 2022-09-02 阿里巴巴(中国)有限公司 Reply generation method, electronic device and computer storage medium
CN116881541A (en) * 2023-05-05 2023-10-13 厦门亚瑟网络科技有限公司 AI processing method for online searching activity and online service big data system

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2019169719A1 (en) * 2018-03-08 2019-09-12 平安科技(深圳)有限公司 Automatic abstract extraction method and apparatus, and computer device and storage medium
CN110597956A (en) * 2019-09-09 2019-12-20 腾讯科技(深圳)有限公司 Searching method, searching device and storage medium

Non-Patent Citations (2)

Title
KYUNGROUL LEE: "A Novel Search System for Protecting Search Word", IEEE *
QIN Yanxia; WANG Zhongqing; ZHENG Dequan; ZHANG Min: "Research on Chinese Event Detection Methods Based on Hybrid Representation", Journal of Chinese Information Processing, no. 04 *

Also Published As

Publication number Publication date
CN112507188B (en) 2024-02-23

Similar Documents

Publication Publication Date Title
CN111428507B (en) Entity chain finger method, device, equipment and storage medium
KR102532396B1 (en) Data set processing method, device, electronic equipment and storage medium
CN111274764B (en) Language generation method and device, computer equipment and storage medium
CN111143561B (en) Intention recognition model training method and device and electronic equipment
CN110717327A (en) Title generation method and device, electronic equipment and storage medium
JP7264866B2 (en) EVENT RELATION GENERATION METHOD, APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
CN111144507B (en) Emotion analysis model pre-training method and device and electronic equipment
JP7301922B2 (en) Semantic retrieval method, device, electronic device, storage medium and computer program
US11615242B2 (en) Method and apparatus for structuring data, related computer device and medium
CN112148871B (en) Digest generation method, digest generation device, electronic equipment and storage medium
CN112507697B (en) Event name generation method, device, equipment and medium
CN111506725B (en) Method and device for generating abstract
CN111078865A (en) Text title generation method and device
CN112000792A (en) Extraction method, device, equipment and storage medium of natural disaster event
CN111950291A (en) Semantic representation model generation method and device, electronic equipment and storage medium
CN111079945B (en) End-to-end model training method and device
CN112506949B (en) Method, device and storage medium for generating structured query language query statement
CN112541362B (en) Generalization processing method, device, equipment and computer storage medium
CN112507188B (en) Candidate search term generation method, device, equipment and medium
CN114912450B (en) Information generation method and device, training method, electronic device and storage medium
CN112528001A (en) Information query method and device and electronic equipment
CN115640520A (en) Method, device and storage medium for pre-training cross-language cross-modal model
CN113360751A (en) Intention recognition method, apparatus, device and medium
CN111241242A (en) Method, device and equipment for determining target content and computer readable storage medium
CN112528605B (en) Text style processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant