CN112182348A - Semantic matching judgment method and device, electronic equipment and computer readable medium - Google Patents

Publication number
CN112182348A
Authority
CN
China
Prior art keywords
text
search result
title
semantic matching
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011240599.8A
Other languages
Chinese (zh)
Other versions
CN112182348B (en)
Inventor
连义江
李爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu International Technology Shenzhen Co ltd
Original Assignee
Baidu International Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu International Technology Shenzhen Co ltd
Priority to CN202011240599.8A
Publication of CN112182348A
Application granted
Publication of CN112182348B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The application provides a semantic matching judgment method, and relates to the technical field of computer technology, intelligent search and deep learning. The specific implementation scheme is as follows: extracting a search result title of the first text and a search result title of the second text from the obtained natural search result of the first text and the natural search result of the second text; and processing the first text, the second text, the search result title of the first text and the search result title of the second text based on the semantic matching judgment model component to obtain semantic matching scores of the first text and the second text and semantic matching judgment results of the first text and the second text. The application also provides a semantic matching judgment device, electronic equipment and a computer readable medium. According to the semantic matching judgment method and device, the electronic equipment and the computer readable medium, the accuracy of the semantic matching judgment result can be improved.

Description

Semantic matching judgment method and device, electronic equipment and computer readable medium
Technical Field
The present application relates to the field of computer technology, intelligent search and deep learning technology, and in particular, to a semantic matching determination method, apparatus, electronic device, and computer-readable medium.
Background
In sponsored search, a search engine may provide a keyword matching service for advertisers or merchants: when a search term (query) submitted by a user matches a keyword (bidword) submitted by a merchant, the merchant's promotion information may be presented on the user's search result page.
At present, the semantics of a query and a bidword are generally understood, and their semantic consistency checked, from their literal meaning, based only on the original text of the query and the original text of the bidword. If the semantics of the query and the bidword cannot be understood correctly, mismatches occur.
Disclosure of Invention
A semantic matching judgment method, a semantic matching judgment device, an electronic device and a computer-readable medium are provided.
According to a first aspect, there is provided a semantic matching determination method, comprising: extracting a search result title of the first text and a search result title of the second text from the obtained natural search result of the first text and the natural search result of the second text; and processing the first text, the second text, the search result title of the first text and the search result title of the second text based on the semantic matching judgment model component to obtain semantic matching scores of the first text and the second text and semantic matching judgment results of the first text and the second text.
According to a second aspect, there is provided a semantic matching determination apparatus including: the title extraction module is used for extracting a search result title of the first text and a search result title of the second text from the obtained natural search result of the first text and the natural search result of the second text; and the result determining module is used for processing the first text, the second text, the search result title of the first text and the search result title of the second text based on the semantic matching judgment model component to obtain a semantic matching score of the first text and the second text and obtain a semantic matching judgment result of the first text and the second text.
According to a third aspect, there is provided an electronic device comprising: one or more processors; a memory having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to perform any of the above semantic matching determination methods; and one or more I/O interfaces connected between the processor and the memory and configured to realize information interaction between the processor and the memory.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute any one of the semantic matching determination methods described above.
According to the semantic matching judgment method of the present application, in the process of performing semantic matching judgment on the first text and the second text, the natural search result of the first text and the natural search result of the second text are obtained, and the search result title of the first text and the search result title of the second text are extracted from them. The first text, the second text and the two search result titles are then processed by the semantic matching judgment model component. This can effectively improve the coverage of low-frequency search terms and keywords, greatly reduce bad mismatches between texts that are literally similar but semantically far apart, and improve the accuracy of semantic matching judgment results between low-frequency texts.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. In the drawings:
FIG. 1 is a schematic view of an application scenario of a semantic matching determination method provided in an embodiment of the present application;
FIG. 2 is a flow chart of a semantic matching determination method provided by an embodiment of the present application;
FIG. 3 is a network architecture diagram of a semantic matching decision model component provided by an embodiment of the present application;
FIG. 4 is a block diagram of a semantic matching determination apparatus provided by an embodiment of the present application;
FIG. 5 is a block diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The embodiments and features of the embodiments of the present application may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the process of matching search terms and keywords described in the embodiments herein, low-frequency search terms and keywords usually require additional knowledge for their semantics to be understood fully and correctly. Based only on the original text of the search term and the original text of the keyword, existing semantic recognition technology, and even manual inspection, cannot directly make a correct judgment on whether the search term and the keyword match.
For example, if "yang morning" is not recognized correctly, it is easily misunderstood as an abbreviation of the comparatively high-frequency phrase "impotence and premature ejaculation". If, however, "yang morning" is known to be the name of an international friend that can be found in an encyclopedia entry, the two will not be mismatched.
Aiming at the problem of semantic matching between search words and keywords, the application provides a semantic matching judgment method for improving the accuracy of semantic matching check between texts.
Fig. 1 is a schematic view of an application scenario of a semantic matching determination method according to an embodiment of the present application. In the scenario shown in fig. 1, at least one user terminal 11, a search engine server 12 and a merchant promotional information base 13 are schematically shown.
The user terminal 11 and the search engine server 12 communicate with each other through a network. The user terminal 11 may include, but is not limited to, a personal computer, a smart phone, a tablet computer, a personal digital assistant, and the like. A user can provide search terms to the search engine server 12 through the user terminal 11. The merchant promotion information base 13 can contain keywords provided by merchants and the advertisement creatives corresponding to the keywords, where a keyword can be a short text, that is, a text or character string whose length is smaller than a preset threshold. The search engine server 12 may be configured to execute the semantic matching determination method according to the embodiment of the present application, obtain a matching determination result for the search term and the keyword, and, when the result indicates a match, display promotion information (e.g., advertisement creatives) of the matched keyword in the user's search result page.
In practical applications, the search engine server 12 may be a local server or a cloud server, and the merchant promotion information base 13 may be stored locally in the search engine server 12 or in a cloud database. The number of devices and the implementation form are not limited in the embodiment of the application, and the device can be flexibly adjusted according to the actual application requirements, which is not described herein again.
Fig. 2 is a flowchart of a semantic matching determination method according to an embodiment of the present application.
In a first aspect, referring to fig. 2, an embodiment of the present application provides a semantic matching determination method, which may include the following steps.
S110, extracting the search result title of the first text and the search result title of the second text from the acquired natural search result of the first text and the acquired natural search result of the second text.
S120, processing the first text, the second text, the search result title of the first text and the search result title of the second text based on the semantic matching judgment model component to obtain a semantic matching score of the first text and the second text and a semantic matching judgment result of the first text and the second text.
According to the semantic matching judgment method, the natural search result of the first text and the natural search result of the second text are introduced into the process of semantic matching judgment. The search result title of the first text and the search result title of the second text are extracted from these natural search results, and the first text, the second text and the two search result titles are processed by the semantic matching judgment model component. Natural search engine results are thereby fused into the semantic matching judgment of the first text and the second text, which improves the accuracy of the method. In particular, when the first text and the second text are low-frequency texts, the search result titles extracted from the natural search results can assist in checking the semantic consistency of the two texts, so that the accuracy of the semantic judgment result between low-frequency texts is improved.
In some embodiments, the first text and the second text comprise any of the following text pairs: a query term and a keyword, a query term and a query term, or a keyword and a keyword.
That is to say, the semantic matching determination method in the embodiment of the application may be used for semantic matching determination between query terms and keywords, and may also be used for matching determination between query terms and query terms, between keywords and keywords, between any pair of short texts, and for combinations of multiple types of text pairs. This improves the diversity of data that the method can process and the flexibility of the method.
For simplicity of description, the embodiments described herein illustrate specific processes of the semantic matching determination method by taking user-provided search terms and merchant-provided keywords as examples in a business search scenario. However, this description cannot be interpreted as limiting the scope or implementation possibility of the present solution, and the method of determining semantic matching directly for text pairs other than the text pair composed of the search word and the keyword is consistent with the method of processing the text pair composed of the search word and the keyword.
In step S110, the natural search result of the first text and the natural search result of the second text may be obtained in either of two ways: by searching online in real time using an existing search engine, or by querying existing natural search result logs. The search engine may be selected according to the actual application environment and the user requirement, and the embodiment of the present application is not particularly limited in this respect.
In some embodiments, step S110 may specifically include the following steps.
S11, performing relevance ranking on the natural search results of the first text to obtain a first ranking result, and, on the display pages of the first ranking result, taking a search result title whose page number and click rate satisfy a first predetermined requirement as the extracted search result title of the first text.
S12, performing relevance ranking on the natural search results of the second text to obtain a second ranking result, and, on the display pages of the second ranking result, taking a search result title whose page number and click rate satisfy a second predetermined requirement as the extracted search result title of the second text.
In this embodiment, through S11-S12, titles can be extracted from the natural search results of the first text and of the second text, yielding search result titles that satisfy requirements on relevance, page number, click rate and the like. Taking the first text as an example, the search results of the first text are ranked by relevance in a search engine to obtain a plurality of ranked search results. Illustratively, the title that is shown on the first page and has the highest click rate may be extracted as the search result title of the first text. The selection of the page number and the setting of the click rate value can be made according to actual needs, and the embodiment of the application is not particularly limited in this respect. The title extraction for the second text is consistent with that for the first text and is not described herein again.
In some embodiments, in step S11, the step of taking the search result title whose page number and click rate satisfy the first predetermined requirement as the extracted search result title of the first text may specifically include: denoising the search result title satisfying the first predetermined requirement to obtain a first search result title, which is used as the extracted search result title of the first text.
In step S12, the step of taking the search result title whose page number and click rate satisfy the second predetermined requirement as the extracted search result title of the second text may specifically include: denoising the search result title satisfying the second predetermined requirement to obtain a second search result title, which is used as the extracted search result title of the second text.
In this embodiment, each search result of the first text and the second text may include title text, pictures, detailed summaries, and the like. After the search result title of the first text and the search result title of the second text are extracted, denoising processing can be performed on the extracted search result titles, for example, preprocessing work such as removing website domain names in the titles and removing redundant words is performed, so that the signal-to-noise ratio of model processing data is improved, and high-quality model processing data is obtained.
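As an illustration only, the title selection of S11/S12 and the subsequent denoising might be sketched as follows. The `SearchResult` record, the `max_page` and `min_click_rate` parameters and the noise patterns are assumptions introduced for this sketch; the application leaves the concrete page number, click rate requirement and denoising rules open.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchResult:
    """Hypothetical record for one relevance-ranked natural search result."""
    title: str         # raw title text
    page: int          # 1-based display page of the ranked result
    click_rate: float  # observed click rate of the result


# Assumed noise pattern: domain names such as "www.example.com".
_DOMAIN = re.compile(
    r"\b(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|cn|net|org)\b",
    re.IGNORECASE,
)


def denoise_title(title: str) -> str:
    """Remove domain names, collapse whitespace, drop dangling separators."""
    cleaned = _DOMAIN.sub("", title)
    cleaned = re.sub(r"\s+", " ", cleaned)
    return cleaned.strip(" -_|")


def extract_title(results: List[SearchResult],
                  max_page: int = 1,
                  min_click_rate: float = 0.0) -> Optional[str]:
    """Pick the most-clicked title among results meeting the page and
    click rate requirement, then denoise it. Returns None if no result
    qualifies."""
    candidates = [r for r in results
                  if r.page <= max_page and r.click_rate >= min_click_rate]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: r.click_rate)
    return denoise_title(best.title)
```

The same function would be applied once to the results of the first text and once to the results of the second text, possibly with different requirements.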
In some embodiments, the semantic matching decision model component includes a text matching decision model and a title matching decision model; step S120 may specifically include the following steps.
S21, calculating a first semantic matching score between the first text and the second text using the text matching judgment model.
S22, a second semantic matching score between the search result title of the first text and the search result title of the second text is calculated using the title matching judgment model.
S23, combining the first semantic matching score and the second semantic matching score by using a text matching judgment model weight coefficient and a title matching judgment model weight coefficient to obtain the semantic matching score of the first text and the second text.
In this embodiment, the first text and the second text are subjected to matching judgment by the text matching judgment model, and the search result title of the first text and the search result title of the second text are subjected to matching judgment by the title matching judgment model. The judgment results of the two parts are then fused to obtain the fused semantic matching score of the first text and the second text. The search result titles of the two texts are thus introduced into the semantic matching judgment to assist in verifying the semantic consistency of the first text and the second text, which improves the accuracy of the semantic judgment result between low-frequency texts.
In some embodiments, the first semantic matching score and the second semantic matching score may be weighted and fused (for example, weighted and summed) by using preset text matching determination model weight coefficients and title matching determination model weight coefficients, so as to obtain the semantic matching scores of the fused first text and second text.
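A minimal sketch of the weighted summation in S23 is given below. The default weights and the decision threshold are assumptions chosen for illustration; the application only states that preset weight coefficients are used and does not specify how the judgment result is derived from the fused score.

```python
from typing import Tuple


def fuse_scores(text_score: float,
                title_score: float,
                text_weight: float = 0.5,
                title_weight: float = 0.5) -> float:
    """Weighted sum of the first (text) and second (title) matching scores."""
    return text_weight * text_score + title_weight * title_score


def judge_match(text_score: float,
                title_score: float,
                threshold: float = 0.5,  # assumed decision rule, not from the source
                text_weight: float = 0.5,
                title_weight: float = 0.5) -> Tuple[float, bool]:
    """Return the fused score and a match / no-match judgment."""
    score = fuse_scores(text_score, title_score, text_weight, title_weight)
    return score, score >= threshold
```

With a larger title weight, a pair whose texts look alike but whose search result titles disagree receives a lower fused score, which is the behavior the embodiment relies on for literally similar but semantically distant texts.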
In some embodiments, the semantic matching decision model component is a model obtained by performing two stages of model training according to first-stage sample data and second-stage sample data. The two stages of model training include: performing first-stage training on a preset basic model according to the first-stage sample data, and performing second-stage training on the model obtained from the first-stage training according to the second-stage sample data.
In the embodiment, the basic model is trained by stages by using sample data at different stages, so that the training effect of the required semantic matching judgment model component can be improved; moreover, the semantic matching judgment model component is obtained by training based on the existing basic model, so that the model training cost can be reduced, the training efficiency can be improved, and the like.
As an example, the semantic matching decision model component may be derived by fine-tuning (Finetune) a pre-set base model. The fine tuning may include two stages. For example, first-stage fine tuning sample data is acquired, first-stage fine tuning is performed on the basic model by using the first-stage fine tuning sample data, second-stage fine tuning sample data is acquired, and second-stage fine tuning is performed on the model subjected to the first-stage fine tuning by using the second-stage fine tuning sample data, so that the semantic matching judgment model component required by the embodiment of the application is obtained.
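The two-stage procedure can be pictured with the toy sketch below, in which a small logistic regression stands in for the preset basic model. The feature layout (a bias term, a text similarity and a title similarity), the sample values, the learning rate and the number of epochs are all assumptions made for illustration; the point is only that the second stage starts from the weights produced by the first stage rather than from the base model.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (features, label), label 1 = positive, 0 = negative


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Logistic score in (0, 1) for one feature vector."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_stage(weights: Sequence[float],
                samples: Sequence[Sample],
                epochs: int = 200,
                lr: float = 0.5) -> List[float]:
    """One training stage: stochastic gradient descent on log loss,
    starting from the given weights."""
    weights = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            error = predict(weights, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights


# Toy stand-ins: features are [bias, text similarity, title similarity].
base_weights = [0.0, 0.0, 0.0]                      # the "preset basic model"
stage1_samples = [([1.0, 0.9, 0.8], 1), ([1.0, 0.1, 0.2], 0)]
stage2_samples = [([1.0, 0.7, 0.9], 1), ([1.0, 0.8, 0.1], 0)]

stage1_weights = train_stage(base_weights, stage1_samples)   # first-stage training
final_weights = train_stage(stage1_weights, stage2_samples)  # second-stage training
```

In this toy data the second-stage negative sample has similar texts but dissimilar titles, so the second stage shifts weight toward the title feature.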
For convenience of understanding, the process of acquiring the first stage sample data and the second stage sample data is described by taking the first text as a search word and the second text as a keyword as an example.
In some embodiments, the first stage sample data includes a first positive sample having a positive case flag and a first negative sample having a negative case flag, and the second stage sample data includes a second positive sample having a positive case flag and a second negative sample having a negative case flag.
In some embodiments, the first positive sample comprises: a first text pair satisfying a first predetermined requirement obtained from a search click log, a second text pair satisfying a second predetermined requirement obtained from a merchant purchase log, the search result titles corresponding to the texts contained in the first text pair, and the search result titles corresponding to the texts contained in the second text pair. The first negative sample is sample data constructed according to merchant negative feedback data acquired in advance.
In some embodiments, the second positive sample comprises: a text pair composed of a first text collected in advance and a corresponding synonymous second text. The second negative sample comprises: a text pair composed of a first text collected in advance and a second text contained as a phrase in that first text, and a text pair composed of a first text collected in advance and a second text having semantic relevance to that first text.
In the embodiment, the first-stage sample data and the second-stage sample data are training data obtained in different modes, wherein the first-stage sample data is based on a search click log, a merchant purchase log and merchant negative feedback data to construct positive sample data and negative sample data of first-stage training; the second-stage sample data is based on the mixed text pairs in the keyword matching service with multiple matching degrees provided for the merchant, and positive sample data and negative sample data of the second-stage training are constructed, so that the diversity of training data is increased, and the training effect of the required semantic matching judgment model component can be improved.
In some embodiments, a search engine typically provides merchants with keyword matching services at three degrees of matching to meet different promotion needs: exact match, phrase match, and broad match. Exact match means that the literal content of the query term is consistent with that of the keyword or of a synonymous variant of the keyword; phrase match means that the keyword or a synonymous variant of the keyword is contained as a phrase in the query term; broad match means that the query term and the keyword are semantically related.
Thus, for example, the mixed-type text pairs herein may include: a text pair composed of a query and a corresponding synonymous keyword, a text pair composed of a query and a keyword contained in the query as a phrase, and a text pair composed of a query and a keyword having semantic relevance to the query. In this embodiment, mixed-type text pairs manually labeled as positive sample data and mixed-type text pairs manually labeled as negative sample data may be used as the second-stage sample data to increase the diversity of training data and improve the model training effect.
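The three degrees of matching can be illustrated with the sketch below. The synonym table, the whitespace tokenization and the `is_related` callback (a stand-in for whatever semantic relevance model is available) are assumptions for this example and are not part of the application.

```python
from typing import Callable, Dict, Set


def classify_match(query: str,
                   keyword: str,
                   synonyms: Dict[str, Set[str]],
                   is_related: Callable[[str, str], bool]) -> str:
    """Return "exact", "phrase", "broad" or "none" for a query/keyword pair."""
    # The keyword itself plus its synonymous variants.
    variants = {keyword} | synonyms.get(keyword, set())

    # Exact match: the query is literally the keyword or one of its variants.
    if query in variants:
        return "exact"

    # Phrase match: a variant occurs as a contiguous token sequence in the query.
    query_tokens = query.split()
    for variant in variants:
        tokens = variant.split()
        n = len(tokens)
        if n and any(query_tokens[i:i + n] == tokens
                     for i in range(len(query_tokens) - n + 1)):
            return "phrase"

    # Broad match: the pair is judged semantically related.
    if is_related(query, keyword):
        return "broad"
    return "none"
```

For text without whitespace word boundaries, such as Chinese, the tokenization step would need a word segmenter or a plain substring test instead of `str.split`.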
In the above embodiments, the specific amount of training data may be determined according to actual needs. Illustratively, the numbers of first positive samples, first negative samples, second positive samples and second negative samples in the training data may each be not less than 100,000; the embodiment of the present application is not particularly limited in this respect.
In some embodiments, the first text pair satisfying the first predetermined requirement may be a first text pair meeting a predetermined search click relationship obtained from the search click log, and the similarity of the texts in the first text pair is greater than a first similarity threshold, where the search click relationship is used to indicate that the texts in the first text pair correspond to the same search result.
In some embodiments, the second text pair satisfying the second predetermined requirement is a second text pair that meets a specified purchase relationship obtained from a merchant purchase log, and the similarity of the texts in the second text pair is greater than a second similarity threshold, where the purchase relationship is used to indicate that the texts in the second text pair correspond to the same merchant or to different merchants of the same type.
In the embodiment, a first text pair meeting a first preset requirement is obtained from a search click log according to the search click relationship and the text similarity, and a second text pair meeting the second preset requirement is obtained from a merchant purchase log according to the specified purchase relationship and the text similarity for constructing first-stage sample data of first-stage training, so that a large amount of high-quality training data can be mined off line, a good data base is provided for subsequent model training, the accuracy of a trained model can be regulated, and the flexibility and the accuracy of a training process can be improved.
Further, the first similarity threshold and the second similarity threshold in the embodiment of the present application may be set according to an accuracy requirement of actual training, and the higher the value of the set first similarity threshold and the set second similarity threshold is, the stricter the requirement on training data is, the more accurate the model obtained by training is, so that the accuracy degree of the model obtained by training may be controlled by setting the value of the first similarity threshold and the second similarity threshold.
In some embodiments, two search terms meeting a predetermined requirement may be combined into a search term pair according to the search click log of users, and if the similarity of the two search terms in the pair is greater than the first similarity threshold, the search term pair is used as training data.
In some embodiments, a search is performed according to a search term and the corresponding search results are displayed; when one or more search results are selected, a search click relationship is formed between the search term and the selected search results. Illustratively, the search result corresponding to the search term is a Uniform Resource Locator (URL), and the search click relationship may be expressed as a query-URL click relationship. According to the query-URL click relationship, when any two search terms correspond to the same URL, the two search terms form a search term pair and the similarity of the pair is calculated; if the similarity of the two search terms in the pair is greater than the first similarity threshold, the two search terms are used as sample data.
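The pair-mining step described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names `mine_query_pairs` and `jaccard`, the log format of (query, URL) tuples, the example URLs, and the token-overlap similarity are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def mine_query_pairs(click_log, similarity, threshold):
    """Return query pairs that clicked the same URL and exceed the threshold.

    click_log: iterable of (query, url) click records (hypothetical format).
    similarity: any function mapping two strings to a similarity score.
    """
    queries_by_url = defaultdict(set)
    for query, url in click_log:
        queries_by_url[url].add(query)

    pairs = set()
    for queries in queries_by_url.values():
        # Every two distinct queries sharing a URL form a candidate pair.
        for q1, q2 in combinations(sorted(queries), 2):
            if similarity(q1, q2) > threshold:
                pairs.add((q1, q2))
    return sorted(pairs)


def jaccard(a, b):
    """Toy token-overlap similarity, used here only as a stand-in."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


log = [
    ("cheap flights beijing", "https://example.com/flights"),
    ("beijing cheap flights", "https://example.com/flights"),
    ("hotel deals", "https://example.com/flights"),
    ("hotel deals", "https://example.com/hotels"),
]
pairs = mine_query_pairs(log, jaccard, threshold=0.5)
```

The same grouping idea would apply to keyword pairs from a merchant purchase log by grouping on a merchant or merchant category instead of a URL.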
In one embodiment, the similarity between two search terms may be calculated in various ways, for example based on cosine similarity or on Word Mover's Distance (WMD). With cosine similarity, the larger the calculated value, the more similar the two search terms are. The method of calculating the similarity between search terms is not specifically limited in the embodiments of the present application.
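As an illustration of the cosine-similarity option, the sketch below compares two search terms using whitespace-token count vectors. The vectorization is an assumption chosen to keep the example self-contained; the source does not specify how the terms are turned into vectors, and an embedding-based representation could be substituted without changing the formula.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts over whitespace-token count vectors."""
    va, vb = Counter(text_a.split()), Counter(text_b.split())
    # Dot product over the tokens the two texts share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

A pair would then be kept as sample data when `cosine_similarity(q1, q2)` exceeds the configured similarity threshold.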
In some embodiments, the keyword and the purchase relationship of the merchant may be obtained according to the keyword purchased by the merchant, for example, the purchase relationship in the embodiment of the present application may be represented as a bid-advertisement purchase relationship. When two keywords are purchased by the same merchant or merchants of the same category, the two keywords can form a keyword pair, the similarity of the keyword pair is calculated, and if the similarity is higher than a second similarity threshold, the keyword pair can be used as sample data.
In some embodiments, merchant negative feedback data submitted by advertisers and merchants may be gathered and negative sample data constructed from the merchant negative feedback data. The negative word data is included in the negative feedback data of the merchant. For example, in a commercial Search application scenario such as Sponsored Search, advertisers and merchants typically mask irrelevant Search terms (negative term data) to obtain masked Search terms, which may provide a large amount of high quality negative sample data (negative example data) for embodiments of the present application.
In some embodiments, noise exists in the merchant negative feedback data submitted by the advertiser and the merchant, and therefore, the collected merchant negative feedback data may be subjected to denoising processing first, and negative sample data in the first-stage sample data may be constructed according to the merchant negative feedback data after denoising processing.
Illustratively, negative sample data in the first-stage sample data constructed according to the pre-acquired negative feedback data of the merchant comprises the following data items: the masked search terms and the keywords corresponding to the masked search terms obtained from the negative feedback data of the merchant, the search result titles of the masked search terms extracted according to the method of the step S110, the search result titles of the keywords corresponding to the extracted masked search terms, and the negative case flag of each negative sample data item.
Fig. 3 is a schematic network structure diagram of a semantic matching determination model component according to an embodiment of the present disclosure. As shown in fig. 3, the semantic matching decision model component may have a dual-model structure comprising a text matching decision model 31 and a title matching decision model 32. Each of the two models may be a knowledge-enhanced semantic representation model (ERNIE) or a Bidirectional Encoder Representations from Transformers (BERT) model.
As an example, in fig. 3, the left model structure may be an ERNIE-based text matching decision model for making a semantic matching decision on a search word and a keyword, and the right model structure may be an ERNIE-based title matching decision model for making a semantic matching decision on a search result title of a search word and a search result title of a keyword.
Through the obtaining process of the first-stage sample data described in the above embodiment, a first positive sample (query1, bidword1, query_title1, bidword_title1, label1) and a first negative sample (query2, bidword2, query_title2, bidword_title2, label2) in the first-stage sample data can be obtained.
In the first positive sample, query1 may be a text in a first text pair that satisfies the first predetermined requirement and is obtained from a search click log as described in the above embodiment, bidword1 may be a text in a second text pair that satisfies the second predetermined requirement and is obtained from a merchant purchase log, query_title1 is the search result title corresponding to query1, bidword_title1 is the search result title corresponding to bidword1, and label1 is a positive example label.
In the first negative sample, query2 and bidword2 are a masked search word and its corresponding keyword obtained from the negative feedback data of a merchant, query_title2 is the search result title corresponding to query2, bidword_title2 is the search result title corresponding to bidword2, and label2 is a negative example label.
Through the obtaining process of the second-stage sample data described in the above embodiment, a second positive sample (query3, bidword3, query_title3, bidword_title3, label3) and a second negative sample (query4, bidword4, query_title4, bidword_title4, label4) in the second-stage sample data can be obtained.
In the second positive sample, query3 and bidword3 may be a search word and a keyword with consistent literal content, obtained from the exact-match service data provided for merchants; query_title3 is the search result title corresponding to query3, bidword_title3 is the search result title corresponding to bidword3, and label3 is a positive example label.
In the second negative sample, query4 and bidword4 may be a text pair obtained from the phrase-match service data provided for merchants, that is, a text pair consisting of a query and a bidword contained as a phrase in that query, and/or a text pair obtained from the broad-match service data provided for merchants, that is, a text pair consisting of a query and a bidword that is semantically related to the query; query_title4 is the search result title corresponding to query4, bidword_title4 is the search result title corresponding to bidword4, and label4 is a negative example label.
With continued reference to fig. 3, in the model training process, the model structure shown in fig. 3 is trained in a first stage using the first positive sample (query1, bidword1, query_title1, bidword_title1, label1) and the first negative sample (query2, bidword2, query_title2, bidword_title2, label2), so as to obtain the model after the first-stage training.
The model after the first-stage training is then trained in a second stage using the second positive sample (query3, bidword3, query_title3, bidword_title3, label3) and the second negative sample (query4, bidword4, query_title4, bidword_title4, label4), so as to obtain the model after the second-stage training, namely the semantic matching judgment model component.
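The two-stage schedule can be illustrated with a deliberately small stand-in model. The sketch below is not the ERNIE/BERT training of the embodiment: it uses a logistic regression over two hypothetical numeric features per sample, and all names (`train_stage`, `log_loss`, the sample values) are assumptions. The only point it shows is that the second stage starts from the parameters produced by the first stage rather than from the base model.

```python
import math


def predict(weights, features):
    """Sigmoid of a linear score; weights[0] is the bias term."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, samples):
    """Mean binary cross-entropy over (features, label) samples."""
    eps = 1e-12
    total = 0.0
    for features, label in samples:
        p = predict(weights, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(samples)


def train_stage(weights, samples, lr=0.1, epochs=200):
    """Run full-batch gradient descent starting from the given weights."""
    weights = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in samples:
            err = predict(weights, features) - label
            grad[0] += err
            for i, x in enumerate(features):
                grad[i + 1] += err * x
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data: (features, label), label 1 = positive example.
stage1_samples = [([0.9, 0.8], 1), ([0.7, 0.9], 1), ([0.2, 0.1], 0), ([0.3, 0.2], 0)]
stage2_samples = [([0.95, 0.9], 1), ([0.6, 0.5], 0), ([0.7, 0.4], 0)]

base = [0.0, 0.0, 0.0]                             # preset base model
stage1 = train_stage(base, stage1_samples)         # first-stage training
final = train_stage(stage1, stage2_samples)        # second stage continues from stage 1
```

In this toy setup the second-stage data contains harder negatives (feature values closer to the positives), loosely mirroring how phrase-match and broad-match pairs serve as negatives in the second stage.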
In the two stages of training processes, search words and keywords in the first positive sample, the first negative sample, the second positive sample and the second negative sample can be input into a left basic model to perform first-stage training and second-stage training to obtain a text matching judgment model in the semantic matching judgment model component; the search result titles of the search terms and the search result titles of the keywords in the first positive sample, the first negative sample, the second positive sample and the second negative sample can be input into the right basic model to perform the first-stage training and the second-stage training, so as to obtain a title matching judgment model in the semantic matching judgment model component.
As shown in fig. 3, in the process of training the model, the data input to the model further includes some predetermined special symbols, such as a symbol CLS and a symbol SEP, where CLS is a special symbol for classifying output, and represents that the model is trained as a classification task; SEP is a special symbol used to separate two texts. For example, when a BERT model is used for training, a CLS symbol is inserted in front of a text, and an output vector corresponding to the symbol is used as semantic representation of the whole text for text classification; for example, a symbol SEP is added between the input search words and keywords for separating and distinguishing the search words and keywords.
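The role of the special symbols can be sketched as below. The whitespace tokenization, the bracketed symbol spellings, the trailing separator, and the segment ids follow the common BERT input convention and are assumptions here; the source only states that CLS precedes the text and that SEP separates the search word from the keyword.

```python
def build_input(text_a, text_b, cls="[CLS]", sep="[SEP]"):
    """Build a token sequence and segment ids for a text pair."""
    tokens_a, tokens_b = text_a.split(), text_b.split()
    # CLS goes first; SEP separates the two texts and closes the sequence.
    tokens = [cls] + tokens_a + [sep] + tokens_b + [sep]
    # Segment 0 covers CLS, the first text and its SEP; segment 1 the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_input("red shoes", "shoes")
```

The same helper would be applied once to the (search word, keyword) pair for the text matching model and once to the pair of search result titles for the title matching model.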
In some embodiments, the model structure shown in fig. 3 uses an attention-based Transformer. During model training, the ERNIE Large model parameters, corresponding to 24 encoder layers, 1024 hidden units, and 16 attention heads, may be used as initial values; the other parts of the model structure may use random initial values, and the model is trained with a batch stochastic gradient descent method.
With continued reference to fig. 3, the model structure shown in fig. 3 further includes a Fully Connected Layer (FC Layer) 33. After the semantic matching determination model component of the dual-model structure is trained, the model training result is output through the fully connected layer, and the model training result may include a Class Label.
As an example, in the model training process, according to the model training result and the labeled training data, the matching determination result error of the trained semantic matching determination model component may be determined, and the model parameter corresponding to the training data of each stage is adjusted according to the error to update the model parameter of the semantic matching determination model component.
In the embodiment of the application, in the training process of the semantic matching judgment model component, the gradient (direction) of updating the model parameters is determined by using a gradient descent algorithm, so that the result of training the semantic matching judgment model component each time is ensured to be closer to the target of model training.
In some embodiments, after the step of obtaining the semantic matching score of the first text and the second text in step S120, the method further comprises: S130, when the semantic matching score is greater than or equal to a predetermined score, taking the second text as a synonymous text of the first text.
As an example, for a text pair formed by a search word and a keyword, after acquiring a natural search result title of the search word and a natural search result title of the keyword, inputting quadruple data formed by the search word, the keyword, the natural search result title of the search word and the natural search result title of the keyword into a semantic matching judgment model component to obtain a semantic matching score of the text pair formed by the search word to be judged and the keyword, and if the semantic matching score is greater than or equal to a predetermined score, determining that the text pair is a synonymous text.
As another example, for a plurality of text pairs formed by search words and keywords, the semantic matching determination method of the embodiment of the present application may be performed on each text pair to obtain a semantic matching score of each text pair, and the text pairs whose semantic matching scores are smaller than a predetermined score may be filtered, and the text in the remaining text pairs after filtering is a synonym text for semantic matching.
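The filtering step for multiple text pairs amounts to a threshold test on the score. A minimal sketch, assuming the scores have already been produced by the model component and are given as hypothetical (search word, keyword, score) tuples:

```python
def filter_synonym_pairs(scored_pairs, min_score):
    """Keep pairs whose semantic matching score is at least min_score."""
    return [(a, b) for a, b, score in scored_pairs if score >= min_score]


scored = [
    ("cheap flights", "low cost flights", 0.92),
    ("apple phone", "apple fruit", 0.31),
    ("running shoes", "jogging shoes", 0.80),
]
synonyms = filter_synonym_pairs(scored, min_score=0.80)
```

The comparison uses `>=` so that a pair scoring exactly the predetermined score is retained, consistent with step S130.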
In the embodiment, the semantic matching condition of the text pair formed by the first text and the second text can be judged according to the semantic matching score condition of the first text and the second text, and according to the semantic matching judgment method of the embodiment of the application, the search result title of the first text and the search result title of the second text are introduced based on natural search in the semantic matching judgment process to assist in semantic matching judgment, so that the accuracy of the semantic matching judgment method is improved.
Fig. 4 is a block diagram illustrating a semantic matching determination apparatus according to an embodiment of the present disclosure.
In a second aspect, referring to fig. 4, an embodiment of the present application provides a semantic matching determining apparatus 400, which may include the following modules.
A title extraction module 410, configured to extract the search result title of the first text and the search result title of the second text from the obtained natural search result of the first text and the natural search result of the second text.
A result determining module 420, configured to process the first text, the second text, the search result title of the first text, and the search result title of the second text based on the semantic matching determination model component to obtain a semantic matching score of the first text and the second text, and thereby obtain a semantic matching determination result of the first text and the second text.
According to the semantic matching judgment device, in the process of semantic matching judgment, a natural search result corresponding to a first text and a natural search result of a second text are introduced, a search result title of the first text and a search result title of the second text are extracted from the natural search result corresponding to the first text and the natural search result of the second text, and the first text, the second text, the search result title of the first text and the search result title of the second text are processed through a semantic matching judgment model component, so that the natural search engine result can be fused in the process of semantic matching judgment of the first text and the second text, and the accuracy of the semantic matching judgment method is improved.
In some embodiments, the first text and the second text comprise any of the following pairs of texts: query terms and keywords, query terms and query terms, and keywords and keywords.
In some embodiments, the title extraction module 410 may include: the first extraction unit is used for carrying out relevance sorting on the natural search results of the first text to obtain a first sorting result, and taking a search result title of which the page number and the click rate meet first preset requirements as the search result title of the extracted first text on a display page of the first sorting result; and the second extraction unit is used for performing relevance ranking on the natural search results of the second text to obtain a second ranking result, and taking the title of the search result of which the page number and the click rate meet second preset requirements as the title of the extracted search result of the second text on the display page of the second ranking result.
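The behavior of an extraction unit can be sketched as follows. The source does not state the concrete page-number or click-rate requirements, so the parameters `page_size`, `max_page`, and `min_click_rate`, the `SearchResult` fields, and the default values are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    relevance: float   # score used for relevance ranking
    click_rate: float  # observed click rate of the result


def extract_titles(results, page_size=10, max_page=1, min_click_rate=0.05):
    """Rank by relevance, then keep titles on early pages with enough clicks."""
    ranked = sorted(results, key=lambda r: r.relevance, reverse=True)
    titles = []
    for position, result in enumerate(ranked):
        page = position // page_size + 1  # 1-based display page number
        if page <= max_page and result.click_rate >= min_click_rate:
            titles.append(result.title)
    return titles


results = [
    SearchResult("A", 0.9, 0.20),
    SearchResult("B", 0.8, 0.01),
    SearchResult("C", 0.7, 0.10),
    SearchResult("D", 0.1, 0.50),
]
titles = extract_titles(results, page_size=3, max_page=1, min_click_rate=0.05)
```

Here "B" is dropped for its low click rate and "D" because it falls on the second display page; a denoising step could then be applied to the surviving titles.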
In some embodiments, the first extracting unit, when configured to use a search result title whose page number and click through rate meet a first predetermined requirement as the search result title of the extracted first text, is specifically configured to: and denoising the search result title meeting the first preset requirement to obtain a first search result title which is used as the search result title of the extracted first text.
In some embodiments, the second extracting unit, when configured to use a search result title whose page number and click rate satisfy the second predetermined requirement as the search result title of the extracted second text, is specifically configured to: perform denoising processing on the search result title that satisfies the second predetermined requirement to obtain a second search result title, which is used as the search result title of the extracted second text.
In some embodiments, the semantic matching decision model component includes a text matching decision model and a title matching decision model; the result determination module 420 includes: the first calculation unit is used for calculating a first semantic matching score of the first text and the second text by using the text matching judgment model; a second calculation unit configured to calculate a second semantic matching score of the search result title of the first text and the search result title of the second text using the title matching judgment model; and the result combining unit is used for combining the first semantic matching score and the second semantic matching score by using the text matching judgment model weight coefficient and the title matching judgment model weight coefficient to obtain the semantic matching scores of the first text and the second text.
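One plausible reading of the result combining unit is a weighted sum of the two scores. The source names the two weight coefficients but not the combining formula, so the linear form and the default weights below are assumptions.

```python
def combine_scores(text_score, title_score, text_weight=0.5, title_weight=0.5):
    """Weighted sum of the text-matching and title-matching scores."""
    return text_weight * text_score + title_weight * title_score


# Example: the title model disagrees with the text model and pulls the score down.
final_score = combine_scores(0.9, 0.3, text_weight=0.6, title_weight=0.4)
```

Choosing weights that sum to 1 keeps the combined score in the same range as the two inputs, which makes a single predetermined score threshold easier to interpret.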
In some embodiments, the semantic matching decision model component is a model obtained by performing two stages of model training according to first stage sample data and second stage sample data; two stages of model training include: and performing first-stage training on the preset basic model according to the first-stage sample data, and performing second-stage training on the model subjected to the first-stage training according to the second-stage sample data.
In some embodiments, the first-stage sample data comprises a first positive sample having a positive example flag and a first negative sample having a negative example flag, and the second-stage sample data comprises a second positive sample having a positive example flag and a second negative sample having a negative example flag. The first positive sample comprises: a first text pair meeting a first predetermined requirement obtained from a search click log, a second text pair meeting a second predetermined requirement obtained from a merchant purchase log, a search result title corresponding to each text contained in the first text pair, and a search result title corresponding to each text contained in the second text pair. The first negative sample is sample data constructed according to merchant negative feedback data acquired in advance.
In some embodiments, the second positive sample comprises a text pair composed of a pre-collected first text and a corresponding synonymous second text. The second negative sample comprises a text pair composed of a pre-collected first text and a second text contained as a phrase in that first text, and a text pair composed of a pre-collected first text and a second text having semantic relevance to that first text.
In some embodiments, the first text pair satisfying the first predetermined requirement is a first text pair meeting a predetermined search click relationship obtained from the search click log, and the similarity of the texts in the first text pair is greater than a first similarity threshold, where the search click relationship is used to indicate that the texts in the first text pair correspond to the same search result.
In some embodiments, the second text pair satisfying the second predetermined requirement is a second text pair that meets a specified purchase relationship obtained from a merchant purchase log, and the similarity of the texts in the second text pair is greater than a second similarity threshold, where the purchase relationship is used to indicate that the texts in the second text pair correspond to the same merchant or to different merchants of the same type.
In some embodiments, the semantic matching determining apparatus 400 further includes a synonymous text acquiring module, configured to, after the semantic matching score of the first text and the second text is obtained, take the second text as the synonymous text of the first text if the semantic matching score is greater than or equal to the predetermined score.
According to the semantic matching judgment device, when matching is judged for a low-frequency first text and second text, the search result title of the first text and the search result title of the second text extracted from the natural search results help verify that the two texts express the same requirement. This effectively improves the coverage of low-frequency search words and keywords, greatly reduces bad matches that are literally similar but semantically far apart, and improves the accuracy of semantic judgment results between low-frequency texts.
It should be apparent that the present application is not limited to the particular configurations and processes described in the above embodiments and shown in the figures. For convenience and brevity of description, detailed description of a known method is omitted here, and for the specific working processes of the system, the module and the unit described above, reference may be made to corresponding processes in the foregoing method embodiments, which are not described herein again.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for the semantic matching determination method of the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic apparatus includes: one or more processors 501, memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories and multiple types of memory, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the semantic matching determination method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the semantic matching determination method provided by the present application.
The memory 502, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the semantic matching determination method in the embodiments of the present application. The processor 501 executes various functional applications of the server and data processing by running non-transitory software programs, instructions and modules stored in the memory 502, that is, implements the semantic matching determination method in the above method embodiment.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the electronic device of the semantic matching determination method, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, which may be connected via a network to an electronic device that performs the semantic matching determination method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device performing the semantic matching determination method may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the semantic matching determination method, such as an input device of a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or the like. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises from computer programs running on the respective computers and having a client-server relationship to each other. A server may also be a server of a distributed system, or a server combined with a blockchain.
In the embodiment of the present application, artificial intelligence is the discipline that studies how to make a computer simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, and planning), and it involves both hardware and software technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.
According to the technical scheme of the embodiment of the application, in the process of performing semantic matching judgment on the first text and the second text, the natural search result corresponding to the first text and the natural search result of the second text are obtained, and the search result title of the first text and the search result title of the second text are extracted from the natural search results, so that the first text, the second text, the search result title of the first text and the search result title of the second text are processed through the semantic matching judgment model component, the coverage capability of low-frequency search words and keywords can be effectively improved, the occurrence of bad error matching results with similar word faces but far semantic differences can be greatly reduced, and the accuracy of the semantic matching judgment results between the low-frequency texts can be improved.
It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A semantic matching judgment method, comprising:
extracting a search result title of the first text and a search result title of the second text from the obtained natural search result of the first text and the obtained natural search result of the second text;
and processing the first text, the second text, the search result title of the first text and the search result title of the second text based on a semantic matching judgment model component to obtain a semantic matching score of the first text and the second text, and thereby a semantic matching judgment result of the first text and the second text.
2. The method of claim 1, wherein extracting the search result title of the first text and the search result title of the second text from the obtained natural search result of the first text and the natural search result of the second text comprises:
performing relevance sorting on the natural search results of the first text to obtain a first sorting result, and, on a display page of the first sorting result, taking a search result title whose page number and click rate meet a first preset requirement as the extracted search result title of the first text;
and performing relevance sorting on the natural search results of the second text to obtain a second sorting result, and, on a display page of the second sorting result, taking a search result title whose page number and click rate meet a second preset requirement as the extracted search result title of the second text.
3. The method of claim 2, wherein
taking the search result title whose page number and click rate meet the first preset requirement as the extracted search result title of the first text comprises: denoising the search result title meeting the first preset requirement to obtain a first search result title, which serves as the extracted search result title of the first text;
and taking the search result title whose page number and click rate meet the second preset requirement as the extracted search result title of the second text comprises: denoising the search result title meeting the second preset requirement to obtain a second search result title, which serves as the extracted search result title of the second text.
4. The method of claim 1, wherein the semantic matching decision model component comprises a text matching decision model and a title matching decision model;
the processing of the first text, the second text, the search result title of the first text and the search result title of the second text based on the semantic matching judgment model component to obtain the semantic matching score of the first text and the second text comprises:
calculating a first semantic matching score of the first text and the second text by using the text matching judgment model;
calculating a second semantic matching score of the search result title of the first text and the search result title of the second text by using the title matching judgment model;
and combining the first semantic matching score and the second semantic matching score by using a weight coefficient of the text matching judgment model and a weight coefficient of the title matching judgment model to obtain the semantic matching score of the first text and the second text.
5. The method of claim 1, wherein
the semantic matching judgment model component is a model obtained by performing two stages of model training according to first-stage sample data and second-stage sample data;
the two stages of model training include: performing first-stage training on a preset base model according to the first-stage sample data, and performing second-stage training on the model that has undergone the first-stage training according to the second-stage sample data.
6. The method of claim 5, wherein
the first-stage sample data includes a first positive sample having a positive-example label and a first negative sample having a negative-example label, and the second-stage sample data includes a second positive sample having a positive-example label and a second negative sample having a negative-example label; wherein
the first positive sample comprises: a first text pair that is obtained from a search click log and meets a first preset requirement, a second text pair that is obtained from a merchant purchase log and meets a second preset requirement, the search result titles corresponding to the texts contained in the first text pair, and the search result titles corresponding to the texts contained in the second text pair;
the first negative sample is sample data constructed from merchant negative feedback data acquired in advance;
the second positive sample comprises: a text pair consisting of a pre-collected first text and a corresponding synonymous second text;
the second negative sample comprises: a text pair consisting of a pre-collected first text and a second text that is contained as a phrase in that first text, and a text pair consisting of a pre-collected first text and a second text that is semantically related to that first text.
7. The method of claim 6, wherein
the first text pair meeting the first preset requirement is a first text pair that is obtained from the search click log and conforms to a preset search click relationship, the similarity of the texts in the first text pair being greater than a first similarity threshold, wherein the search click relationship indicates that the texts in the first text pair correspond to the same search result;
the second text pair meeting the second preset requirement is a second text pair that is obtained from the merchant purchase log and conforms to a specified purchase relationship, the similarity of the texts in the second text pair being greater than a second similarity threshold, wherein the purchase relationship indicates that the texts in the second text pair correspond to the same merchant or to different merchants of the same type.
8. The method of claim 1, wherein after obtaining the semantic matching scores for the first text and the second text, the method further comprises:
if the semantic matching score is greater than or equal to a preset score, taking the second text as a synonymous text of the first text.
9. The method of claim 1, wherein
the first text and the second text comprise any one of the following text pairs: a query term and a keyword, a query term and a query term, or a keyword and a keyword.
10. A semantic matching judgment device, comprising:
a title extraction module, configured to extract a search result title of the first text and a search result title of the second text from the obtained natural search result of the first text and the obtained natural search result of the second text;
and a result determining module, configured to process the first text, the second text, the search result title of the first text and the search result title of the second text based on a semantic matching judgment model component to obtain a semantic matching score of the first text and the second text, and thereby a semantic matching judgment result of the first text and the second text.
11. An electronic device, comprising:
one or more processors;
a memory having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9;
one or more I/O interfaces connected between the processor and the memory and configured to enable information interaction between the processor and the memory.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-9.
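As a non-limiting illustration of claims 4 and 8, the two model scores might be combined as a weighted average and compared against a preset score, as in the Python sketch below. The function names, the default weights, the normalization by the sum of the weights, and the default threshold of 0.8 are assumptions; the claims only require that weight coefficients be used to combine the scores and that the result be compared with a preset score.

```python
def combine_scores(text_score: float,
                   title_score: float,
                   text_weight: float = 0.5,
                   title_weight: float = 0.5) -> float:
    """Weighted average of the text-level and title-level matching scores.

    Dividing by the weight sum is an assumption that keeps the result in
    the same range as the two input scores.
    """
    if text_weight < 0 or title_weight < 0:
        raise ValueError("weights must be non-negative")
    total = text_weight + title_weight
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return (text_weight * text_score + title_weight * title_score) / total


def is_synonymous(score: float, preset_score: float = 0.8) -> bool:
    """Treat the second text as synonymous when the score reaches the preset."""
    return score >= preset_score


if __name__ == "__main__":
    # Literally similar texts (high text score) whose titles disagree.
    score = combine_scores(0.9, 0.3, text_weight=0.4, title_weight=0.6)
    print(round(score, 2), is_synonymous(score))
```

In this example the title-level score pulls the combined score down, so a pair that looks similar at the word level is not accepted as synonymous, which is the kind of mismatch the title signal is meant to catch.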
CN202011240599.8A 2020-11-09 2020-11-09 Semantic matching judging method, device, electronic equipment and computer readable medium Active CN112182348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011240599.8A CN112182348B (en) 2020-11-09 2020-11-09 Semantic matching judging method, device, electronic equipment and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011240599.8A CN112182348B (en) 2020-11-09 2020-11-09 Semantic matching judging method, device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN112182348A true CN112182348A (en) 2021-01-05
CN112182348B CN112182348B (en) 2024-03-29

Family

ID=73917188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011240599.8A Active CN112182348B (en) 2020-11-09 2020-11-09 Semantic matching judging method, device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN112182348B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925900A (en) * 2021-02-26 2021-06-08 北京百度网讯科技有限公司 Search information processing method, device, equipment and storage medium
CN112988976A (en) * 2021-04-21 2021-06-18 百度在线网络技术(北京)有限公司 Search method, search apparatus, electronic device, storage medium, and program product
CN114281935A (en) * 2021-09-16 2022-04-05 腾讯科技(深圳)有限公司 Training method, device, medium and equipment for search result classification model

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110282856A1 (en) * 2010-05-14 2011-11-17 Microsoft Corporation Identifying entity synonyms
CN104462060A (en) * 2014-12-03 2015-03-25 百度在线网络技术(北京)有限公司 Method and device for calculating text similarity and realizing search processing through computer
US20160012130A1 (en) * 2014-07-14 2016-01-14 Yahoo! Inc. Aiding composition of themed articles about popular and novel topics and offering users a navigable experience of associated content
US20180121428A1 (en) * 2016-10-27 2018-05-03 International Business Machines Corporation Returning search results utilizing topical user click data when search queries are dissimilar
CN110704578A (en) * 2019-10-09 2020-01-17 精硕科技(北京)股份有限公司 Incidence relation determining method and device, electronic equipment and readable storage medium
CN111597800A (en) * 2019-02-19 2020-08-28 百度在线网络技术(北京)有限公司 Method, device, equipment and storage medium for obtaining synonyms
CN111611452A (en) * 2020-05-22 2020-09-01 上海携程商务有限公司 Method, system, device and storage medium for ambiguity recognition of search text
CN111666417A (en) * 2020-04-13 2020-09-15 百度在线网络技术(北京)有限公司 Method and device for generating synonyms, electronic equipment and readable storage medium
CN111797204A (en) * 2020-07-01 2020-10-20 北京三快在线科技有限公司 Text matching method and device, computer equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925900A (en) * 2021-02-26 2021-06-08 北京百度网讯科技有限公司 Search information processing method, device, equipment and storage medium
CN112925900B (en) * 2021-02-26 2023-10-03 北京百度网讯科技有限公司 Search information processing method, device, equipment and storage medium
CN112988976A (en) * 2021-04-21 2021-06-18 百度在线网络技术(北京)有限公司 Search method, search apparatus, electronic device, storage medium, and program product
CN114281935A (en) * 2021-09-16 2022-04-05 腾讯科技(深圳)有限公司 Training method, device, medium and equipment for search result classification model

Also Published As

Publication number Publication date
CN112182348B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN111984689B (en) Information retrieval method, device, equipment and storage medium
CN111967262A (en) Method and device for determining entity tag
US9934293B2 (en) Generating search results
CN112182348B (en) Semantic matching judging method, device, electronic equipment and computer readable medium
CN110705460A (en) Image category identification method and device
US10740406B2 (en) Matching of an input document to documents in a document collection
CN112507702B (en) Text information extraction method and device, electronic equipment and storage medium
CN112989208B (en) Information recommendation method and device, electronic equipment and storage medium
CN112632403A (en) Recommendation model training method, recommendation device, recommendation equipment and recommendation medium
CN111563198B (en) Material recall method, device, equipment and storage medium
CN111078878A (en) Text processing method, device and equipment and computer readable storage medium
CN113495942B (en) Method and device for pushing information
CN114417194A (en) Recommendation system sorting method, parameter prediction model training method and device
CN111984774A (en) Search method, device, equipment and storage medium
CN111966782A (en) Retrieval method and device for multi-turn conversations, storage medium and electronic equipment
CN113342946A (en) Model training method and device for customer service robot, electronic equipment and medium
CN112328896A (en) Method, apparatus, electronic device, and medium for outputting information
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN113792230B (en) Service linking method, device, electronic equipment and storage medium
CN114329206A (en) Title generation method and device, electronic equipment and computer readable medium
CN111881255B (en) Synonymous text acquisition method and device, electronic equipment and storage medium
CN111144122A (en) Evaluation processing method, evaluation processing device, computer system, and medium
CN112148979B (en) Event-associated user identification method, device, electronic equipment and storage medium
CN112052402B (en) Information recommendation method and device, electronic equipment and storage medium
CN111832313B (en) Method, device, equipment and medium for generating emotion matching set in text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant