CN113535829B - Training method and device of ranking model, electronic equipment and storage medium - Google Patents
Training method and device of ranking model, electronic equipment and storage medium
- Publication number: CN113535829B
- Application number: CN202010307729.9A
- Authority: CN (China)
- Legal status: Active (status as listed by Google Patents; an assumption, not a legal conclusion)
Classifications
- G06F16/248 — Information retrieval; Querying; Presentation of query results
- G06F16/2455 — Information retrieval; Querying; Query processing; Query execution
- G06F18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2415 — Pattern recognition; Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
Abstract
The embodiment of the disclosure discloses a training method and a training device for a ranking model, an electronic device and a storage medium, wherein the method comprises the following steps: acquiring a sample data pair; performing multi-classification prediction on the correlation between the first query content and the first query result by using the ranking model to obtain a first correlation prediction result, and performing multi-classification prediction on the correlation between the first query content and the second query result to obtain a second correlation prediction result; fitting a first loss between the first correlation prediction result and the first correlation level and a second loss between the second correlation prediction result and the second correlation level with a first loss function; determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result; fitting a third loss between the correlation prediction comparison result and a correlation level comparison result using a second loss function; and adjusting the model parameters of the ranking model according to the first loss, the second loss and the third loss.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a training method and apparatus for a ranking model, an electronic device, and a storage medium.
Background
Query result ranking systems typically rank query results based on the relevance relationship between the user's query content (which may be referred to as a query) and the query results (e.g., documents, which may be referred to as docs) recalled by the search engine for that query content. A search engine can recall thousands of query results simply by judging whether they contain the keywords in the query content, and these query results need to be ranked to improve the accuracy and readability of what is returned. Measuring the relevance between the query content and the query results is the core of the ranking system. Over time, the methods used to measure this relevance have gradually evolved from simple word-matching rules into machine learning algorithms, making relevance ranking more accurate and controllable.
Disclosure of Invention
The embodiment of the disclosure provides a training method and device for a ranking model, electronic equipment and a computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a training method for a ranking model, including:
acquiring a sample data pair; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; the first level of correlation is different from the second level of correlation;
performing multi-classification prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and performing multi-classification prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
fitting a first loss between the first correlation prediction and the first correlation level and a second loss between the second correlation prediction and the second correlation level with a first loss function;
determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result; the correlation prediction comparison result is used for representing which of the first query result and the second query result has the higher correlation;
fitting a third loss between the correlation prediction comparison and the correlation level comparison using a second loss function; the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
and adjusting the model parameters of the ranking model according to the first loss, the second loss and the third loss.
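The training objective described by the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the cross-entropy form of the first loss function, the sigmoid-of-expectation-difference form of the second loss function, and the weight `alpha` are all choices made here for concreteness.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, label):
    # assumed form of the first loss function: multi-class cross entropy
    # between the predicted distribution and the true relevance grade
    return -np.log(p[label] + 1e-12)

def expectation(p, grades):
    # expected relevance grade under the predicted distribution
    return float(np.dot(p, grades))

def combined_loss(logits_a, logits_b, grade_a, grade_b, grades, alpha=1.0):
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    l1 = cross_entropy(p_a, grade_a)   # first loss (pointwise, first sample)
    l2 = cross_entropy(p_b, grade_b)   # second loss (pointwise, second sample)
    # correlation prediction comparison: sigmoid of the expectation difference
    s = 1.0 / (1.0 + np.exp(-(expectation(p_a, grades) - expectation(p_b, grades))))
    t = 1.0 if grade_a > grade_b else 0.0   # correlation level comparison
    # assumed form of the second loss function: binary cross entropy (pairwise)
    l3 = -(t * np.log(s + 1e-12) + (1.0 - t) * np.log(1.0 - s + 1e-12))
    return l1 + l2 + alpha * l3
```

All three losses are then back-propagated together to adjust the model parameters, which is the step the last claim element describes.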
Further, acquiring a sample data pair includes:
obtaining the first query content, a plurality of candidate query results related to the first query content and relevance labels of the candidate query results; the relevance label is used for representing the relevance grade of the candidate query result and the first query content;
and combining the candidate query results with different relevance grades pairwise according to the relevance labels to obtain the first query result and the second query result.
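The pairwise combination of candidate results with different relevance grades can be sketched as below; the tuple layout of a sample is a hypothetical choice for illustration.

```python
from itertools import combinations

def build_sample_pairs(query, labeled_results):
    """Combine candidate query results with different relevance grades pairwise.

    labeled_results: list of (result, relevance_grade) tuples for one query.
    Returns a list of (first_sample, second_sample) pairs, where each sample
    is (query, result, grade) and the two grades in a pair always differ.
    """
    pairs = []
    for (doc_a, grade_a), (doc_b, grade_b) in combinations(labeled_results, 2):
        if grade_a == grade_b:
            continue  # only results with different relevance grades form a pair
        pairs.append(((query, doc_a, grade_a), (query, doc_b, grade_b)))
    return pairs
```

For example, three candidates with grades 2, 1, 2 yield two usable pairs; the pair of equally graded candidates is skipped.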
Further, performing multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and performing multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result, including:
performing word segmentation on a first text consisting of the first query content and the first query result to obtain a first word segmentation set, and performing word segmentation on a second text consisting of the first query content and the second query result to obtain a second word segmentation set;
obtaining a first initial vector matrix corresponding to the first word segmentation set according to the initial word vector of each word segment in the first word segmentation set, and obtaining a second initial vector matrix corresponding to the second word segmentation set according to the initial word vector of each word segment in the second word segmentation set;
extracting, with the ranking model, first vector features in the first initial vector matrix that represent a relationship between the first query content and the first query result, and extracting, with the ranking model, second vector features in the second initial vector matrix that represent a relationship between the first query content and the second query result;
and performing multi-class prediction on the correlation between the first query content and the first query result and on the correlation between the first query content and the second query result by using the ranking model according to the first vector feature and the second vector feature respectively, to obtain the first correlation prediction result and the second correlation prediction result.
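The segmentation, vector-matrix construction, feature extraction and multi-class prediction pipeline above can be sketched with toy components. Everything here is a placeholder under stated assumptions: the vocabulary, the random embedding table, the mean-pooling feature extractor and the single linear classification layer stand in for whatever segmenter and neural ranking model the patent leaves unspecified.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {"[CLS]": 0, "[SEP]": 1, "what": 2, "is": 3, "ranking": 4, "doc": 5}
EMB = rng.normal(size=(len(VOCAB), 8))   # hypothetical initial word vectors
W_CLS = rng.normal(size=(8, 4))          # hypothetical head over 4 relevance levels

def tokenize(query, result):
    # word segmentation of the text formed by the query content and query result
    text = ["[CLS]"] + query.split() + ["[SEP]"] + result.split()
    return [VOCAB[w] for w in text if w in VOCAB]

def predict_distribution(query, result):
    ids = tokenize(query, result)
    mat = EMB[ids]             # initial vector matrix for the segmentation set
    feat = mat.mean(axis=0)    # stand-in for the model's feature extraction
    logits = feat @ W_CLS      # multi-class prediction head
    e = np.exp(logits - logits.max())
    return e / e.sum()         # probability distribution over relevance levels
```

Running `predict_distribution` once per query result of a sample pair yields the first and second correlation prediction results used in the losses.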
Further, determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result, including:
determining a first expected value for the first correlation predictor and determining a second expected value for the second correlation predictor;
and determining the correlation prediction comparison result according to the first expected value and the second expected value.
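A compact sketch of these two steps follows; representing the comparison as a sigmoid of the expectation difference is an assumption, since the patent only requires that the result encode which query result is more relevant.

```python
import numpy as np

def expected_grade(dist, grades=(0, 1, 2, 3)):
    # expected value of a probability distribution over relevance levels
    return float(np.dot(dist, grades))

def prediction_comparison(dist_a, dist_b):
    e1, e2 = expected_grade(dist_a), expected_grade(dist_b)
    # > 0.5 means the first query result is predicted to be more relevant
    return 1.0 / (1.0 + np.exp(-(e1 - e2)))
```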
Further, the first relevance prediction result comprises a first probability distribution vector of relevance between the first query content and the first query result over a plurality of relevance levels; the second relevance prediction result comprises a second probability distribution vector of relevance between the first query content and the second query result over the plurality of relevance levels.
In a second aspect, an embodiment of the present disclosure provides a method for ranking query results, including:
acquiring query content and a query result to be ranked;
identifying a relevance between the query content and the query result using a ranking model; the ranking model is obtained by training by using the method of the first aspect;
and sorting the query results according to the relevance.
Further, identifying a relevance between the query content and the query result using a ranking model, comprising:
performing word segmentation on a third text formed by the query content and the query result to obtain a third word segmentation set;
obtaining a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
extracting, by using the ranking model, third vector features in the third initial vector matrix that represent the relation between the query content and the query result;
identifying a correlation between the query content and the query result for the third vector features using the ranking model.
Further, identifying a correlation between the query content and the query result for the third vector feature using the ranking model, comprising:
performing multi-class prediction on the correlation between the query content and the query result for the third vector features by using the ranking model, to obtain a third probability distribution vector of the correlation between the query content and the query result over a plurality of correlation levels;
determining a third expected value of the third probability distribution vector, and determining the third expected value as a correlation between the query content and the query result.
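At inference time, the expectation of the predicted distribution serves directly as the relevance score used for sorting. A minimal sketch, assuming four relevance levels and that each query result already has its predicted distribution attached:

```python
import numpy as np

def relevance_score(dist, grades=(0, 1, 2, 3)):
    # the "third expected value": expectation of the predicted distribution
    return float(np.dot(dist, grades))

def rank_results(results_with_dists):
    """Sort (result, distribution) pairs by expected relevance, highest first."""
    return sorted(results_with_dists,
                  key=lambda rd: relevance_score(rd[1]), reverse=True)
```

Because the score is an expectation over the same relevance scale for every query, scores remain comparable across query contents, which is the property the disclosure highlights.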
In a third aspect, an embodiment of the present disclosure provides a method for ranking query results, including:
acquiring query content and a query result to be ranked;
performing multi-classification prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result over a plurality of correlation levels;
determining a third expected value of the third probability distribution vector, and determining the third expected value as a correlation between the query content and the query result;
and sorting the query results according to the relevance.
Further, performing multi-category prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels, including:
performing word segmentation on a third text formed by the query content and the query result to obtain a third word segmentation set;
obtaining a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
extracting, by using the ranking model, third vector features in the third initial vector matrix that represent the relation between the query content and the query result;
and performing multi-class prediction on the correlation between the query content and the query result for the third vector features by using the ranking model, to obtain the third probability distribution vector.
In a fourth aspect, an embodiment of the present disclosure provides a training apparatus for a ranking model, including:
a first obtaining module configured to obtain a sample data pair; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; the first level of correlation is different from the second level of correlation;
the first prediction module is configured to perform multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and perform multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
a first fitting module configured to fit a first loss between the first correlation prediction result and the first correlation level with a first loss function, and to fit a second loss between the second correlation prediction result and the second correlation level with the first loss function;
a first comparison module configured to determine a correlation prediction comparison result from the first correlation prediction result and the second correlation prediction result; the correlation prediction comparison result is used for representing which of the first query result and the second query result has the higher correlation;
a second fitting module configured to fit a third loss between the correlation prediction comparison result and the correlation level comparison result with a second loss function; the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
a parameter adjustment module configured to adjust the model parameters of the ranking model according to the first loss, the second loss and the third loss.
Further, the first obtaining module includes:
a first obtaining sub-module configured to obtain the first query content, a plurality of candidate query results related to the first query content, and relevance labels of the candidate query results; the relevance label is used for representing the relevance grade of the candidate query result and the first query content;
and the combination sub-module is configured to combine the candidate query results with different relevance grades pairwise according to the relevance labels to obtain the first query result and the second query result.
Further, the first prediction module includes:
the first word segmentation sub-module is configured to segment a first text formed by the first query content and the first query result to obtain a first word segmentation set, and to segment a second text formed by the first query content and the second query result to obtain a second word segmentation set;
the first vector obtaining sub-module is configured to obtain a first initial vector matrix corresponding to the first word segmentation set according to the initial word vector of each word segment in the first word segmentation set, and to obtain a second initial vector matrix corresponding to the second word segmentation set according to the initial word vector of each word segment in the second word segmentation set;
a first feature extraction sub-module configured to extract, using the ranking model, first vector features in the first initial vector matrix representing a relationship between the first query content and the first query result, and to extract, using the ranking model, second vector features in the second initial vector matrix representing a relationship between the first query content and the second query result;
a first prediction sub-module configured to perform multi-category prediction on the correlations between the first query content and the first query result, and between the first query content and the second query result respectively for the first vector feature and the second vector feature by using the ranking model, so as to obtain the first correlation prediction result and the second correlation prediction result.
Further, the first comparing module includes:
a first determination submodule configured to determine a first expected value of the first correlation predictor and to determine a second expected value of the second correlation predictor;
a second determination submodule configured to determine the correlation prediction comparison result according to the first expectation value and the second expectation value.
Further, the first relevance prediction result comprises a first probability distribution vector of relevance between the first query content and the first query result over a plurality of relevance levels; the second relevance prediction result comprises a second probability distribution vector of relevance between the first query content and the second query result over the plurality of relevance levels.
In a fifth aspect, an embodiment of the present disclosure provides an apparatus for ranking query results, including:
the second acquisition module is configured to acquire the query content and the query result to be ranked;
an identification module configured to identify a correlation between the query content and the query results using a ranking model; the ranking model is obtained by training with the apparatus of the fourth aspect;
a first ranking module configured to rank the query results according to the relevance.
Further, the identification module includes:
the second word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
the second vector acquisition sub-module is configured to obtain a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
a second feature extraction sub-module configured to extract, by using the ranking model, third vector features in the third initial vector matrix representing a relationship between the query content and the query result;
an identification sub-module configured to identify a correlation between the query content and the query result for the third vector features using the ranking model.
Further, the identification sub-module includes:
a second prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a third determination sub-module configured to determine a third expected value of the third probability distribution vector and determine the third expected value as a correlation between the query content and the query result.
In a sixth aspect, an embodiment of the present disclosure provides an apparatus for ranking query results, including:
the third acquisition module is configured to acquire the query content and the query result to be ranked;
a second prediction module configured to perform multi-category prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a first determination module configured to determine a third expected value of the third probability distribution vector and determine the third expected value as a correlation between the query content and the query result;
a second ranking module configured to rank the query results according to the relevance.
Further, the second prediction module comprises:
the third word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
a third vector obtaining sub-module configured to obtain a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
a third feature extraction sub-module configured to extract, by using the ranking model, third vector features representing a relationship between the query content and the query result in the third initial vector matrix;
a third prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain the third probability distribution vector.
These functions may be implemented in hardware, or in hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.
In one possible design, the apparatus of any of the above aspects may be configured to include a memory for storing one or more computer instructions that enable the apparatus of any of the above aspects to perform the method of any of the above aspects, and a processor configured to execute the computer instructions stored in the memory. The apparatus of any of the above aspects may further comprise a communication interface for communicating with other devices or a communication network.
In a seventh aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of any of the above aspects.
In an eighth aspect, the present disclosure provides a computer-readable storage medium for storing computer instructions for use by any one of the above apparatuses, including computer instructions for performing the method of any one of the above aspects.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the training process of the ranking model, for the same query content in the sample data, two query results with different relevance levels each form a sample with the query content, yielding a sample data pair consisting of a first sample and a second sample. The ranking model to be trained then performs multi-class prediction on the first sample and the second sample to obtain relevance prediction results between the query content and each of the two query results, and a first loss function is used to fit the losses between these relevance prediction results and the real relevance levels. Next, a relevance prediction comparison result between the two query results is determined from their relevance prediction results, that is, a comparison of how relevant each query result is to the query content according to the model's predictions, and a second loss function is used to fit the loss between this relevance prediction comparison result and the relevance level comparison result obtained from the real relevance levels. Finally, the model parameters of the ranking model are adjusted using the three losses obtained above; after training on a large number of sample data pairs, the ranking model converges, yielding a trained ranking model. The ranking model can score a single query result for a given query content, the score can be directly compared with the scores of other query results for relevance ranking, and a threshold can be selected to screen relevant query results when needed.
According to the method provided by the embodiment of the disclosure, the training process of the ranking model considers not only the relevance prediction error between the query content and a single query result, but also the relevance comparison error between two query results for the same query content, combining the advantages of the pointwise and pairwise training modes. The ranking model obtained with this training method therefore has strong relevance ranking capability, its ranking scores are meaningfully comparable, the ranking results obtained for different query contents are also comparable, and under a specified threshold more relevant query results can be retained and better query results can be displayed preferentially.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:
FIG. 1(a) shows a ranking flow diagram of a retrieval system according to an embodiment of the present disclosure;
FIG. 1(b) shows a schematic flow diagram for off-line training a ranking model according to an embodiment of the present disclosure;
FIG. 2 illustrates a flow diagram of a training method of a ranking model according to an embodiment of the present disclosure;
fig. 3 shows a flowchart of step S201 according to the embodiment shown in fig. 2;
FIG. 4 shows a flowchart of step S202 according to the embodiment shown in FIG. 2;
FIG. 5 shows a flowchart of step S204 according to the embodiment shown in FIG. 2;
FIG. 6 illustrates a flow diagram of a method of ranking query results according to an embodiment of the present disclosure;
FIG. 7 shows a flowchart of step S602 according to the embodiment shown in FIG. 6;
FIG. 8 shows a flowchart of step S704 according to the embodiment shown in FIG. 7;
FIG. 9 shows a flow diagram of a method of ranking query results according to another embodiment of the present disclosure;
FIG. 10 shows a flowchart of step S902 according to the embodiment shown in FIG. 9;
FIG. 11 illustrates a flow diagram of a query method according to an embodiment of the present disclosure;
FIG. 12 shows a flow diagram of a query method according to another embodiment of the present disclosure;
FIG. 13 is a schematic flow chart illustrating a user querying information via a client according to an embodiment of the present disclosure;
FIG. 14 is a schematic diagram of an electronic device suitable for use in implementing a training method for a ranking model, a ranking method for query results, and/or a query method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for the sake of clarity, parts not relevant to the description of the exemplary embodiments are omitted in the drawings.
In the present disclosure, it is to be understood that terms such as "including" or "having," etc., are intended to indicate the presence of the disclosed features, numbers, steps, behaviors, components, parts, or combinations thereof, and are not intended to preclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof may be present or added.
It should be further noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
As described above, the ranking system in the related art uses a machine learning ranking algorithm to rank the plurality of query results recalled by a search engine for given query content, and the performance that the machine learning ranking algorithm can ultimately achieve is mainly affected by the following two factors:
1) Training data, i.e., sample data for training the machine learning model. The ranking problem admits several ways of forming training data, mainly pointwise and pairwise. In the pointwise mode, the input of the machine learning model is the feature data of one query content and one query result, the output is the relevance level between the query content and the query result, and the loss function used in training evaluates the difference between the relevance level predicted by the model and the real relevance level. In the pairwise training mode, the input of the machine learning model is the feature data of one query content and two query results, and the output is a comparison result between the two query results, which indicates which of the two query results is more relevant to the query content; the loss function used in training measures whether this comparison result is consistent with the real relevance comparison result. Each of the two training modes has its own advantages: the pointwise mode can regress a fine-grained relevance level between the query content and the query result, i.e., query results can be labeled with multiple relevance levels, while the pairwise mode can directly compare the relevance relation between every two query results under the same query content. Both training modes depend closely on the accuracy of the real labels of the sample data, and each has strengths and weaknesses in use; but from the perspective of the ranking objective, the pairwise mode is more direct, its ranking capability is generally better than that of pointwise, and it is more widely used in industry.
2) The algorithm itself. Since the advent of search engines, machine learning ranking algorithms have been continuously improved: from simple statistical models based on manually defined features at first, to complex neural networks based on the bag-of-words (BOW) representation, such as convolutional neural networks (CNN) and recurrent neural networks (RNN), and most recently to BERT (Bidirectional Encoder Representations from Transformers) models based on the attention mechanism and pre-training. Each breakthrough in these natural language processing fields has brought a dramatic improvement to the effect of relevance ranking.
The precise question-and-answer scenario is a special case of search: instead of presenting multiple results, only the most relevant document in the final ranking result is shown, which requires the ranking algorithm to have very high accuracy. However, in practical application scenarios, it may happen that all query results recalled by the search engine are irrelevant to the query content, and then the final presentation quality is not determined by the quality of relevance ranking: even if the ranking algorithm orders the entire recalled sequence of query results correctly, the top-ranked query result may still have poor relevance to the query content. Therefore, a truncation mechanism may be introduced after the final ranked results, so that only relevant query results are retained.
One truncation mechanism in the related art is based on manual rules: the words in the query content and the query result are checked for correlation on each dimension, and the more dimensions on which the check succeeds, the more important the word is; the more important the words matched between the query content and the query result, the higher the correlation between the query content and the query result is considered to be. Another truncation mechanism corrects based on user clicks, to cull query results that do not meet the user's search requirements. However, both of these methods have certain limitations:
1) Manual rules are prone to wrongly discarding relevant results ("false kills"). Because manual rules are mostly based on synonym matching, and enumerating synonyms and applying them in real scenes is difficult, the number of query results retained by manual-rule screening is relatively small, and some truly valuable query results may be discarded in the process.
2) Although user click behavior in a search scene is a natural and low-cost form of user feedback, the Matthew effect is pronounced: some highly relevant but rarely clicked query results may never be shown under such a filtering mechanism. In addition, in the precise question-and-answer scenario, since answers are presented directly on the search result page, the user's need may already be satisfied there, so the correlation between the query result and the query content is not fully reflected in the user's click behavior.
One idea to solve the above problem is to use the scoring mechanism of the ranking algorithm to perform truncation: select an ideal threshold, compare the score of each query result in the ranked results with the threshold, and eliminate all query results whose scores fall below it. However, existing machine learning ranking algorithms are usually trained in the pairwise mode, and the scores output by the final ranking model are not comparable across queries. For example, suppose a series of documents (docs) recalled by query keyword query A are scored by a ranking model obtained with pairwise training, and it is observed that docs with scores larger than 0.3 are relevant to query A while the rest are not; when the docs recalled by another query B are scored, the cut point between relevant and irrelevant becomes 0.9, and many docs scoring between 0.3 and 0.9 are in fact highly relevant to query B. That is to say, the output scores of a ranking model obtained by pairwise training are comparable only under the same query, and scores obtained for different queries cannot be directly compared. Therefore, no single ideal threshold can be selected for a ranking model trained in the pairwise mode.
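As a small illustration of the truncation idea above, the following sketch (function name, document identifiers, and scores are hypothetical) keeps only the query results whose scores exceed a chosen threshold. This is only meaningful when scores are comparable across queries, which is what the proposed training method aims to achieve:

```python
def truncate_by_threshold(scored_docs, threshold):
    """scored_docs: list of (doc_id, score) pairs.
    Returns the relevant docs, best first, with low-scoring docs truncated."""
    ranked = sorted(scored_docs, key=lambda d: d[1], reverse=True)
    return [doc for doc in ranked if doc[1] > threshold]

results = [("doc1", 0.95), ("doc2", 0.42), ("doc3", 0.17)]
# With cross-query-comparable scores, one fixed threshold works for every query.
print(truncate_by_threshold(results, 0.3))
```

If the threshold is chosen once (e.g., on a validation set), the same cut-off can then be reused for every incoming query.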
Therefore, an embodiment of the present disclosure provides a training method for a ranking model. In the training process, for the same query content in the sample data, two query results with different relevance levels are each combined with the query content to form a sample data pair consisting of a first sample and a second sample. The ranking model to be trained then performs multi-class prediction on the first sample and the second sample to obtain relevance prediction results between the query content and each of the two query results, and a loss function is used to fit the losses between each relevance prediction result and the corresponding real relevance level. Next, a relevance prediction comparison result between the two query results is determined from their relevance prediction results, i.e., the relevance of the two query results to the query content is compared according to the predictions of the ranking model, and a second loss function is used to fit the loss between this prediction comparison result and the true comparison result obtained from the real relevance levels. Finally, the model parameters of the ranking model are adjusted using the three losses obtained as above; after training on a large number of sample data pairs, the ranking model converges, yielding the trained ranking model. The ranking model can score a query result for a query content; the score can be directly used for relevance comparison with other query results, and thus for relevance ranking, and, where needed, a threshold can be selected to screen relevant query results.
According to the method provided by the embodiment of the present disclosure, the training process of the ranking model considers not only the relevance prediction error between the query content and a single query result, but also the relevance comparison error between two query results for the same query content, combining the advantages of the pointwise and pairwise training modes. The ranking model obtained with this training method therefore has strong relevance ranking capability, its scores are meaningfully comparable, and the ranking results obtained for different query contents are also comparable, so that under a single specified threshold more relevant query results can be retained and better query results are shown.
Fig. 1(a) shows a ranking flow diagram of a retrieval system according to an embodiment of the present disclosure. As shown in FIG. 1(a), the retrieval system 100 includes a client 101, a server 102, and a search engine 103; the user may input query content through a query entry of the client 101 and obtain ranked query results from a query result presentation interface of the client 101.
The server 102 collects a large amount of sample data, trains the ranking model by using the collected sample data, and then ranks the query results recalled by the search engine 103 by using the trained ranking model. After receiving query content input by a user, a client 101 sends the query content to a server 102, and the server 102 invokes a search engine 103 to recall a plurality of query results related to the query content for the query content; the server 102 ranks the plurality of query results by using the trained ranking model, and returns one or more most relevant query results to the client 101 according to the ranking results, so that the client 101 displays the one or more most relevant query results to the user.
In the embodiment of the present disclosure, the processing flow of the server 102 is divided into three parts: training data construction, offline model training, and online identification.
(I) Constructing training data:
the server 102 collects a large amount of sample data, which may include query contents and a plurality of query results related to each query content, where the relevance levels of the query results are known, e.g., low, medium, and high; the higher the level, the more relevant the query result can be considered to be to the query content. The relevance levels of the query results can be labeled manually.
In the embodiment of the present disclosure, the server 102 constructs a plurality of sample data pairs for the collected sample data, where each sample data pair includes a first sample and a second sample, the first sample includes first query content and a first query result, and the second sample includes first query content and a second query result; the first relevance grade of the first query result is different from the second relevance grade of the second query result, and may be that the first relevance grade corresponding to the first query result is higher than the second relevance grade corresponding to the second query result, or the first relevance grade corresponding to the first query result is lower than the second relevance grade corresponding to the second query result. The query content in different sample data pairs may be the same or different.
The following illustrates the construction process of the sample data pair.
For a query content query_A, there are three corresponding query results: doc_A1, doc_A2, and doc_A3. Suppose the manual labeling results are as shown in Table 1 below:

TABLE 1

  query      query result    relevance level
  query_A    doc_A1          L1
  query_A    doc_A2          L2
  query_A    doc_A3          L3

where the relevance levels of the three query results are ranked L1 < L2 < L3. From the rules above, the sample data pairs shown in Table 2 below can be composed:

TABLE 2

  query      doc+       doc-
  query_A    doc_A2     doc_A1
  query_A    doc_A3     doc_A1
  query_A    doc_A3     doc_A2

Each row constitutes one sample data pair: the doc+ column can be taken as the first query result and the doc- column as the second query result, with the relevance level of doc+ being higher than that of doc-. When forming sample data pairs, only combinations of query results with different relevance levels are selected; combinations with the same relevance level are discarded.
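The pair-construction rule above can be sketched as follows; the function name and data layout are illustrative assumptions, not taken from the patent:

```python
from itertools import combinations

def build_sample_pairs(query, labeled_docs):
    """labeled_docs: dict mapping doc id -> relevance level (higher = more relevant).
    Returns (query, doc+, doc-) triples; equal-level combinations are discarded."""
    pairs = []
    for d1, d2 in combinations(labeled_docs, 2):
        if labeled_docs[d1] == labeled_docs[d2]:
            continue  # same relevance level: no sample data pair is formed
        hi, lo = (d1, d2) if labeled_docs[d1] > labeled_docs[d2] else (d2, d1)
        pairs.append((query, hi, lo))  # doc+ first, doc- second
    return pairs

# Reproduces Table 2 from the Table 1 labels (L1 < L2 < L3 encoded as 1 < 2 < 3).
print(build_sample_pairs("query_A", {"doc_A1": 1, "doc_A2": 2, "doc_A3": 3}))
```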
(II) offline model training
After the server 102 completes construction of the sample data pair according to the collected sample data, in the training process, one round of training is performed on the ranking model by using one sample data pair. The following describes the training process of the ranking model in the embodiment of the present disclosure by using one sample data pair to perform one round of training. It can be understood that the training flow of each sample data pair is the same, and after multiple rounds of training of multiple sample data pairs, the training is stopped after the model parameters of the ranking model converge or the training times reach a certain value, so as to obtain the trained ranking model.
Firstly, vector representations of the first text corresponding to the first sample and the second text corresponding to the second sample in a sample data pair are obtained, i.e., the first text and the second text are represented in vector form. The vector representation may be an initial vector matrix, where each vector corresponds to the initialization vector of a word segment in the first or second text; the initialization vector of a word segment may be a randomly initialized vector, or a word vector obtained from an existing pre-trained dictionary, where the dictionary may include a word vector for any known vocabulary item. The word segments of the first and second texts can be obtained by any known word segmentation method; word segmentation of text is a known technique and is not described here again.
The following example illustrates this. Suppose the first word set obtained by segmenting the first text, formed by the first query content and the first query result, is [w1, w2, w3, …, wn], and the group of vectors corresponding to these word segments is [v1, v2, v3, …, vn], where vi (i an integer between 1 and n) is the vector corresponding to word segment wi and may be a randomly initialized vector or a word vector obtained from a pre-trained dictionary. Combining this group of vectors yields the first initial vector matrix representation of the first text, matrix = [v1^T, v2^T, v3^T, …, vn^T]^T, i.e., the matrix whose i-th row is vi^T. In the same way, a second initial vector matrix of the second text can be obtained. The first and second initial vector matrices can be understood as the vector representations of the first and second texts; that is, to allow the ranking model to identify and process the input samples, the first text corresponding to the first sample and the second text corresponding to the second sample are first represented as vectors, and the resulting first and second initial vector matrices are then input to the ranking model for processing.
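A minimal sketch of building such an initial vector matrix, assuming a toy embedding dimension and random initialization for tokens absent from the dictionary (all names are illustrative):

```python
import random

def text_to_matrix(tokens, dim=4, dictionary=None, seed=0):
    """Map each word segment w_i to a vector v_i (taken from a pretrained
    dictionary if available, otherwise randomly initialized) and stack the
    vectors row by row into the initial vector matrix."""
    rng = random.Random(seed)
    dictionary = dictionary or {}
    matrix = []
    for w in tokens:
        if w not in dictionary:  # unknown token: random initialization
            dictionary[w] = [rng.uniform(-1, 1) for _ in range(dim)]
        matrix.append(dictionary[w])
    return matrix  # n rows (one per token), dim columns

m = text_to_matrix(["w1", "w2", "w3"])
print(len(m), len(m[0]))
```

The same token always maps to the same vector within a text, matching the dictionary-lookup behavior described above.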
And respectively inputting the first initial vector matrix and the second initial vector matrix to a sequencing model. The ranking model may be a deep neural network model (DNN), it is understood that the ranking model is not limited to the DNN model, but may also be other machine learning models that can be trained using the training method proposed in the embodiments of the present disclosure, such as one or more combinations of logistic regression model, convolutional neural network, feedback neural network, support vector machine, K-means, K-neighbors, decision trees, random forests, bayesian networks, and the like.
The ranking model processes the input first initial vector matrix, and the output is a multi-class prediction result containing prediction probabilities distributed over multiple relevance levels. For example, if the relevance levels are [1, 2, 3, 4] and the corresponding prediction probabilities are [0.1, 0.2, 0.3, 0.4], the ranking model assigns the highest probability to level 4, i.e., the most likely relevance level between the first query content and the first query result is level 4. It is understood that the outputs of the ranking model correspond to the relevance levels of the sample data: if the sample data is labeled with N relevance levels, the ranking model has N outputs, each corresponding to one of the N relevance levels. The ranking model thus predicts a probability distribution over the N relevance levels for an input sample; that is, after feature extraction and other processing on the first initial vector matrix representing the first text, the ranking model predicts the probability distribution, over the N relevance levels, of the relevance between the first query content and the first query result. It can be understood that at the beginning of training, because the model parameters of the ranking model are initialized randomly, the multi-class prediction result is inaccurate; as the number of training iterations increases and the model parameters are optimized, the multi-class prediction result output by the ranking model comes closer to the real result.
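Reading off the most likely relevance level from the multi-class prediction can be sketched as follows (a hypothetical helper; the levels and probabilities follow the example above):

```python
def most_likely_level(probs, levels=(1, 2, 3, 4)):
    """Return the relevance level with the highest predicted probability."""
    # max over (probability, level) pairs picks the largest probability;
    # [1] extracts the level associated with it.
    return max(zip(probs, levels))[1]

print(most_likely_level([0.1, 0.2, 0.3, 0.4]))
```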
Similarly, the ranking model processes the input second initial vector matrix, and the output is also a multi-class prediction result. It can be understood that the order in which the ranking model processes the first and second initial vector matrices is not limited: either the first or the second initial vector matrix may be input to the ranking model first.
For the first text and the second text, after the multi-class prediction results, i.e., the first correlation prediction result and the second correlation prediction result, are respectively obtained using the ranking model, the losses between the first correlation prediction result and the real result and between the second correlation prediction result and the real result can be respectively fitted using a first loss function. The real results are the first and second relevance levels, which were labeled manually or otherwise when the sample data was collected. The first relevance level identifies the relevance between the first query content and the first query result, the second relevance level identifies the relevance between the first query content and the second query result, and the first relevance level is different from the second relevance level.
In some embodiments, the first loss function may employ cross entropy (CE), which is used to measure the difference between the first correlation prediction result and the first correlation level. In model training, the purpose of comparing the difference between the predicted result and the real result with a loss function is to optimize the model parameters, so that the next prediction of the ranking model can be closer to the real result. Therefore, in the embodiments of the present disclosure, a first loss between the first correlation prediction result and the first correlation level is fitted with the first loss function, a second loss between the second correlation prediction result and the second correlation level is fitted with the same first loss function, and the first loss and the second loss can be used to optimize the model parameters of the ranking model. The first loss function can be expressed using the prior art, and is not described in detail herein.
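A minimal sketch of the cross-entropy fit between a predicted distribution and the true relevance level (indexing levels from 0 here is an assumption for illustration):

```python
import math

def cross_entropy(pred_probs, true_level):
    """CE between a predicted distribution over N relevance levels and the
    one-hot true level: only the probability assigned to the true level
    contributes, as -log p(true_level)."""
    return -math.log(pred_probs[true_level])

# Using the distribution from the example above; the true level is the last one.
probs = [0.1, 0.2, 0.3, 0.4]
print(round(cross_entropy(probs, 3), 4))
```

The loss shrinks toward 0 as the probability mass on the true level approaches 1, which is what drives the parameter updates described above.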
In order to solve the above-mentioned problem of ranking models trained with the related techniques, after the first loss and the second loss are fitted, the model parameters of the ranking model are not directly adjusted using only those two losses. Instead, the first sample and the second sample are treated as one sample data pair, and a second loss function is further used to fit a third loss. The second loss function compares the correlation prediction comparison result of the first and second query results against the true comparison result. The correlation prediction comparison result is determined from the first correlation prediction result and the second correlation prediction result predicted by the ranking model, and indicates which of the first query result and the second query result is predicted to be more relevant to the first query content; the true comparison result is determined from the first relevance level and the second relevance level, and indicates which of the two query results is actually more relevant to the first query content.
In some embodiments, the first loss function and the second loss function may be different. The second loss function can be expressed as follows:

    L_pair = f(s+ − s−, l_sort)

where L_pair is the third loss; f may be any binary classification loss function, such as sigmoid cross entropy or hinge loss; s+ − s− represents the correlation prediction comparison result, where s+ may be the first correlation prediction result or a value derived from it that indicates the correlation between the first query content and the first query result, and s− may be the second correlation prediction result or a value derived from it that indicates the correlation between the first query content and the second query result; and l_sort is the true correlation-level comparison result, where l_sort = 1 may indicate that, in the real labeling, the first query result is more relevant than the second query result.
After the third loss is fitted, the model parameters of the ranking model may be adjusted by combining the first loss, the second loss, and the third loss. For example, the final loss function used to adjust the model parameters of the ranking model may take the form:

    L = L1 + L2 + α · L_pair

where L is the final loss function, L1 the first loss, L2 the second loss, L_pair the third loss, and α a weight parameter for the third loss.
In the final loss function, the first and second losses can be understood as multi-class losses, while the third loss is a pairwise loss. Jointly considering all three losses in the final loss function amounts to simultaneously optimizing the model's global discriminative capability and local ranking capability: the global discriminative capability is the ranking model's ability to distinguish the relevance levels of query results, and the local ranking capability is its ability to order the relevance between every two query results.
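The combined final loss can be sketched as below; sigmoid cross entropy is chosen here as one admissible binary loss for the pairwise term, and all numeric values are illustrative:

```python
import math

def sigmoid_ce(diff, l_sort=1):
    """Pairwise sigmoid cross entropy on the score difference s+ - s-.
    l_sort = 1 means doc+ is truly more relevant than doc-."""
    p = 1.0 / (1.0 + math.exp(-diff))  # predicted P(doc+ more relevant)
    return -math.log(p) if l_sort == 1 else -math.log(1.0 - p)

def final_loss(ce_pos, ce_neg, s_pos, s_neg, alpha=1.0):
    """L = L1 + L2 + alpha * L_pair; alpha is a tunable weight parameter."""
    return ce_pos + ce_neg + alpha * sigmoid_ce(s_pos - s_neg)

# Illustrative values: two multi-class losses plus the pairwise term.
print(final_loss(0.9, 1.2, 3.1, 1.4, alpha=0.5))
```

When s+ far exceeds s−, the pairwise term vanishes; when the predicted ordering contradicts the labels, it grows, pushing the model toward both correct levels and a correct ordering.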
FIG. 1(b) shows a schematic flow diagram of off-line training of a ranking model according to an embodiment of the present disclosure. As shown in FIG. 1(b), (query, doc+) is input to the ranking model as the first sample, and (query, doc−) is input to the ranking model as the second sample, where the relevance between doc+ and query is greater than the relevance between doc− and query. The DNN model respectively performs multi-class prediction on the first sample and the second sample to obtain a first probability distribution vector (i.e., the first correlation prediction result) and a second probability distribution vector (i.e., the second correlation prediction result) over a plurality of relevance levels, and a first loss function is used to fit a first loss and a second loss for the first and second correlation prediction results respectively. The first probability distribution vector may then be mapped onto a scalar, e.g., a first expectation s+ of the first probability distribution vector may be calculated; likewise, the second probability distribution vector may be mapped onto a scalar, e.g., a second expectation s− of the second probability distribution vector may be calculated. The second loss function may then be used to fit s+ and s− to obtain a pairwise loss, i.e., the third loss. Finally, the model parameters are optimized by combining the first loss, the second loss, and the third loss.
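Mapping a probability distribution vector to a scalar via its expectation, as in computing s+ and s− above, can be sketched as (level values 1..N are an illustrative assumption):

```python
def distribution_to_score(probs, levels=None):
    """Map a probability distribution over relevance levels to a scalar score
    by taking its expectation, E[level] = sum_i p_i * level_i."""
    levels = levels or list(range(1, len(probs) + 1))
    return sum(p * l for p, l in zip(probs, levels))

s_pos = distribution_to_score([0.1, 0.2, 0.3, 0.4])  # mass skewed to high levels
s_neg = distribution_to_score([0.4, 0.3, 0.2, 0.1])  # mass skewed to low levels
print(s_pos - s_neg)  # the pairwise comparison signal s+ - s-
```

The expectation preserves the full shape of the distribution in a single comparable number, unlike an argmax, which discards how confident the model is.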
(III) Online identification
After the server 102 performs the offline training process on the ranking model by using a large amount of sample data, the ranking model with optimized model parameters is finally obtained, and the ranking model has better identification capability, can identify a plurality of query results recalled by the search engine 103 according to the query content input by the user, and further can obtain the relevance ranking results of the query results according to the identification results.
As shown in FIG. 1(a), the client 101 receives query content input by a user and sends it to the server 102; the server 102 invokes the search engine 103, which recalls a plurality of query results for the query content. The server 102 then scores the recalled query results, using the ranking model obtained by the offline training above. For each query result in turn, the server 102 segments the third text formed by the query content and that query result, obtains a third initial vector matrix, and inputs it to the ranking model, which outputs a correlation identification result representing the correlation between that query result and the query content. The server 102 may send one or more query results to the client according to the requirements of the client 101 and the ranking result: for example, in an ordinary data retrieval scenario, the server may sort the query results by the ranking result and send them to the client 101, which presents the candidate query results in ranked order; in the precise question-and-answer scenario, the server 102 may send the single most relevant query result to the client 101, which presents it to the user as the answer.
The following describes the contents of the examples of the present disclosure in detail through different embodiments.
FIG. 2 shows a flow diagram of a training method of a ranking model according to an embodiment of the present disclosure. As shown in fig. 2, the training method of the ranking model includes the following steps:
in step S201, a sample data pair is acquired; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; the first level of correlation is different from the second level of correlation;
in step S202, performing multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and performing multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
in step S203, fitting a first loss between the first correlation prediction result and the first correlation level and fitting a second loss between the second correlation prediction result and the second correlation level with a first loss function;
in step S204, determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result; the correlation prediction comparison result is used for representing which of the first query result and the second query result is predicted to be more relevant;
in step S205, fitting a third loss between the correlation prediction comparison result and the correlation level comparison result with a second loss function; the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
in step S206, the model parameters of the ranking model are adjusted according to the first loss, the second loss, and the third loss.
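Steps S201 to S206 can be sketched as one training round; here `model` is a placeholder returning a probability distribution over relevance levels, and the 0-based level indices and the expectation-based scores are illustrative assumptions:

```python
import math

def train_round(model, pair, alpha=1.0):
    query, doc_pos, lvl_pos, doc_neg, lvl_neg = pair     # S201: sample data pair
    p_pos = model(query, doc_pos)                        # S202: multi-class
    p_neg = model(query, doc_neg)                        #       prediction
    loss1 = -math.log(p_pos[lvl_pos])                    # S203: first loss (CE)
    loss2 = -math.log(p_neg[lvl_neg])                    #       second loss (CE)
    s_pos = sum(p * i for i, p in enumerate(p_pos, 1))   # S204: expectation scores
    s_neg = sum(p * i for i, p in enumerate(p_neg, 1))   #       for the comparison
    loss3 = math.log(1 + math.exp(-(s_pos - s_neg)))     # S205: third loss,
    #        -log(sigmoid(s+ - s-)), assuming doc+ is truly more relevant
    return loss1 + loss2 + alpha * loss3                 # S206: combined loss
```

In a real implementation the returned loss would be backpropagated to adjust the model parameters; here the function only computes the scalar objective.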
In this embodiment, the training method of the ranking model is executed at the server side. Firstly, the server may collect a large amount of sample data, where the sample data may include query content and query results obtained for the query content, and the query results have a relevance grade, where the relevance grade is used to identify the relevance size between the query content and the query results, and the relevance grade may be obtained by a manual annotation manner, or may be a verified annotation obtained from another platform. In some embodiments, the query content may be a query keyword, a combination of query keywords, a sentence, a text, a document, or the like, and the query result may be a document related to the query content or an answer to a related question expressed by a word, a sentence, or a text.
The server constructs sample data pairs from the collected sample data. Each sample data pair comprises a first sample and a second sample: the first sample comprises first query content and a first query result, and the second sample comprises the same first query content and a second query result. The first relevance grade of the first query result differs from the second relevance grade of the second query result; the first relevance grade may be either higher or lower than the second relevance grade. The query content in different sample data pairs may be the same or different.
After the server constructs a plurality of sample data pairs, it performs one round of training on the ranking model with each sample data pair. A complete training run of the ranking model comprises multiple such rounds executed in a loop; the input sample data of each round differs, but the training procedure is the same.
In some embodiments, the ranking model may be a deep neural network model. During training, the feature data corresponding to the first sample and to the second sample may each be input to the ranking model for processing. Because the input of a neural network model is usually data in vector form, the feature data for the first sample may be obtained by vector-representing a first text composed of the first query content and the first query result, and the feature data for the second sample may be obtained by vector-representing a second text composed of the first query content and the second query result. It is understood that machine learning models of other configurations may also be used as the ranking model, such as one or a combination of logistic regression models, convolutional neural networks, feedback (recurrent) neural networks, support vector machines, K-means, K-nearest neighbors, decision trees, random forests, Bayesian networks, and the like. Different model structures impose different requirements on the input feature data, and the first sample and the second sample may be preprocessed according to the actually selected model structure to obtain the corresponding feature data, which is not limited herein.
After the feature data corresponding to the first sample and to the second sample are input to the ranking model, the ranking model may output a corresponding first correlation prediction result and second correlation prediction result. Taking the deep neural network model as an example, the ranking model extracts the corresponding features by processing the input feature data layer by layer, and performs multi-class prediction on the extracted features with the softmax activation function of the last layer to obtain the first correlation prediction result and the second correlation prediction result respectively. The multi-class prediction result can be expressed as follows:
p_l = σ(y_l)

where p_l is the predicted probability at relevance level l, σ(·) is the softmax activation function, and y_l is the input to the softmax, namely the intermediate feature corresponding to relevance level l extracted by the deep neural network after layer-by-layer processing.
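As an illustrative sketch (not the patent's implementation; the layer output values are invented for the example), the softmax mapping from final-layer features y_l to per-level probabilities p_l can be written in NumPy as:

```python
import numpy as np

def softmax(y):
    # Subtract the maximum for numerical stability before exponentiating.
    z = np.exp(y - np.max(y))
    return z / z.sum()

# Hypothetical final-layer features, one entry per relevance level.
y = np.array([0.5, 1.0, 1.5, 2.0])
p = softmax(y)  # probability distribution over the relevance levels
# p sums to 1, and the largest feature value receives the largest probability.
```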
The first relevance prediction result may comprise the probability distribution, i.e., a first probability distribution vector, of the relevance between the first query content and the first query result over a plurality of relevance levels; and the second relevance prediction result may comprise the probability distribution, i.e., a second probability distribution vector, of the relevance between the first query content and the second query result over a plurality of relevance levels. For example, if the relevance levels are [1, 2, 3, 4] and the first relevance prediction result includes the probability distribution [0.1, 0.2, 0.3, 0.4], then the ranking model predicts that the relevance between the first query content and the first query result most probably lies at the 4th level. It can be understood that the output of the ranking model corresponds to the way the sample data's relevance levels are classified: if the sample data are labeled with N relevance levels, the ranking model also has N outputs, each corresponding to one of the N levels. The ranking model predicts the probability distribution of an input sample over the N relevance levels; that is, after feature extraction and other processing of a first initial vector matrix representing the first text, the ranking model can predict the probability distribution over the N relevance levels of the relevance between the first query content and the first query result corresponding to that text.
It can be understood that, at the beginning of training, because the model parameters of the ranking model are only randomly initialized, the multi-class prediction result obtained is inaccurate; as the number of training rounds increases and the model parameters are optimized, the multi-class prediction result output by the ranking model approaches the true result.
For the first sample and the second sample, after the multi-class prediction results, i.e., the first correlation prediction result and the second correlation prediction result, are obtained with the ranking model, a first loss function can be used to fit the loss between each prediction result and its true result. The true results are the first relevance level and the second relevance level, annotated manually or otherwise when the sample data was collected. The first relevance level identifies the degree of relevance between the first query content and the first query result, the second relevance level identifies the degree of relevance between the first query content and the second query result, and the first relevance level is different from the second relevance level.
In some embodiments, the first loss function may be a cross-entropy (CE) loss, which compares the difference between the first correlation prediction result and the first correlation level. In model training, the purpose of comparing the difference between the predicted result and the true result with a loss function is to optimize the model parameters so that the model's next prediction is closer to the true result. Therefore, in the embodiments of the present disclosure, a first loss between the first correlation prediction result and the first correlation level and a second loss between the second correlation prediction result and the second correlation level are each fitted with the first loss function, and the first loss and the second loss can be used to optimize the model parameters of the ranking model. The concrete expression of the first loss function can be determined using the prior art and is not limited herein.
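The cross-entropy fit described above can be sketched as follows (a minimal NumPy illustration; the probability values and the annotated level index are hypothetical):

```python
import numpy as np

def cross_entropy(p, true_level):
    # Negative log-probability assigned to the annotated relevance level.
    return -np.log(p[true_level])

# Hypothetical first correlation prediction result over 4 relevance levels,
# with the annotated first relevance level at index 3.
p_first = np.array([0.1, 0.2, 0.3, 0.4])
loss_first = cross_entropy(p_first, 3)
# The more probability mass the model puts on the true level, the lower the loss.
```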
After the first loss and the second loss are fitted, the model parameters of the ranking model are not directly optimized with them. Instead, treating the first sample and the second sample as a sample data pair, a third loss is further fitted with a second loss function. The second loss function compares the correlation prediction comparison result with the true comparison result. The correlation prediction comparison result is determined from the first correlation prediction result and the second correlation prediction result predicted by the ranking model, and represents which of the first query result and the second query result is predicted to have the higher (or lower) relevance level with respect to the first query content. The true comparison result is determined from the first correlation level and the second correlation level, and represents which of the first query result and the second query result truly has the higher (or lower) relevance level with respect to the first query content. In some embodiments, the first loss function and the second loss function may be different.
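The patent does not fix the concrete form of the second loss function; one common choice for such a pairwise comparison loss, assumed here purely for illustration, is a RankNet-style logistic loss on the score difference:

```python
import numpy as np

def pairwise_logistic_loss(s1, s2, first_is_more_relevant):
    # Penalizes the pair when the predicted score ordering disagrees with
    # the true relevance ordering; a common choice, not the patent's formula.
    diff = s1 - s2 if first_is_more_relevant else s2 - s1
    return float(np.log1p(np.exp(-diff)))

# Hypothetical scalar scores for the first and second query results.
loss3 = pairwise_logistic_loss(2.0, 1.4, first_is_more_relevant=True)
```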
After the third loss is fitted, the model parameters of the ranking model may be adjusted by combining the first loss, the second loss, and the third loss. For example, the final loss function that adjusts the model parameters of the ranking model may take the form:
L = L1 + L2 + α·L3

where L represents the final loss function, L1 the first loss, L2 the second loss, L3 the third loss, and α a weight parameter for the third loss.
In the final loss function, the first loss and the second loss can be understood as multi-class (pointwise) losses, and the third loss as a pairwise loss. Optimizing the model parameters of the ranking model with the first, second, and third losses considered together in the final loss function amounts to simultaneously optimizing the model's global distinguishing capability and local ranking capability. Global distinguishing capability can be understood as the ranking model's ability to distinguish the relevance level of a query result; local ranking capability can be understood as its ability to rank the relevance of any two query results against each other.
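Under the reconstructed form L = L1 + L2 + α·L3, combining the three losses is a one-liner; the loss values and weight below are hypothetical:

```python
def final_loss(loss1, loss2, loss3, alpha=1.0):
    # Two multi-class (pointwise) losses plus the weighted pairwise loss.
    return loss1 + loss2 + alpha * loss3

total = final_loss(0.9, 1.1, 0.4, alpha=0.5)
```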
In the training process of the ranking model, for the same query content in the sample data, two query results with different relevance grades each form a sample with that query content, yielding a first sample and a second sample. The ranking model to be trained then performs multi-class prediction on the first sample and the second sample to obtain the relevance prediction results between the query content and the two query results, and the loss between each relevance prediction result and its true relevance grade is fitted. Next, the relevance prediction results of the two query results are used to determine the relevance prediction comparison result between them, i.e., how the two query results compare in relevance to the query content according to the ranking model's predictions, and the loss between this comparison result and the true comparison result derived from the true relevance grades is fitted. Finally, the model parameters of the ranking model are adjusted with the three fitted losses. After training on a large number of sample data pairs, the ranking model can reach convergence, yielding the trained ranking model. The ranking model can score a query result corresponding to a query content, and the score can be directly compared with those of other query results to rank them by relevance.
In the training process provided by the embodiment of the disclosure, not only the relevance prediction error between the query content and a single query result is considered, but also the relevance comparison error between two query results for the same query content; the advantages of the two training modes, pointwise and pairwise, are thus combined. The ranking model obtained by this training method therefore has strong relevance ranking capability: the ranking scores it produces are mutually comparable, the ranking results obtained for different query contents are also comparable, and under a specified threshold more relevant query results can be retained, so that more of the better query results are shown.
In an optional implementation manner of this embodiment, as shown in fig. 3, the step S201, that is, the step of acquiring the sample data pair, further includes the following steps:
in step S301, obtaining the first query content, a plurality of candidate query results related to the first query content, and relevance levels of the candidate query results; the relevance grade is used for representing the relevance grade of the candidate query result and the first query content;
in step S302, the candidate query results with different relevance ranks are pairwise combined according to the relevance ranks to obtain the first query result and the second query result.
In this optional implementation manner, the server constructs sample data pairs from the collected first query content and the plurality of candidate query results corresponding to it. The candidate query results are labeled query results with known relevance levels. For the same query content, such as the first query content, the corresponding candidate query results can be combined pairwise according to the relevance levels given by their annotations; the first query content is combined with each candidate query result in the pair to obtain a first sample and a second sample, which form a sample data pair. The sample data pair may be constructed as described above in the embodiment shown in fig. 1, and is not described herein again.
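The pairwise combination of candidates with different relevance levels can be sketched as follows (hypothetical document names and levels):

```python
from itertools import combinations

def build_sample_pairs(query, candidates):
    # candidates: list of (query_result, relevance_level) tuples.
    # Only results with *different* relevance levels may form a sample pair.
    pairs = []
    for (r1, g1), (r2, g2) in combinations(candidates, 2):
        if g1 != g2:
            pairs.append(((query, r1, g1), (query, r2, g2)))
    return pairs

# doc_a and doc_c share a relevance level, so only the combinations
# involving doc_b survive.
pairs = build_sample_pairs("q", [("doc_a", 3), ("doc_b", 1), ("doc_c", 3)])
```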
In an optional implementation manner of this embodiment, as shown in fig. 4, the step S202 of performing multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and performing multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result further includes the following steps:
in step S401, performing word segmentation on a first text composed of the first query content and the first query result to obtain a first word set, and performing word segmentation on a second text composed of the first query content and the second query result to obtain a second word set;
in step S402, a first initial vector matrix corresponding to the first word set is obtained according to the initial word vector of each word segment in the first word set, and a second initial vector matrix corresponding to the second word set is obtained according to the initial word vector of each word segment in the second word set;
in step S403, extracting a first vector feature representing a relationship between the first query content and the first query result in the first initial vector matrix by using the ranking model, and extracting a second vector feature representing a relationship between the first query content and the second query result in the second initial vector matrix by using the ranking model;
in step S404, performing multi-category prediction on the correlations between the first query content and the first query result, and between the first query content and the second query result by using the ranking model for the first vector feature and the second vector feature, respectively, to obtain the first correlation prediction result and the second correlation prediction result.
In this alternative implementation, when the input required by the model structure selected by the ranking model, for example, the deep neural network model, is in the form of a vector, the first sample and the second sample may be first vector-represented.
In the disclosed embodiment, first, the first query content and the first query result in the first sample of a sample data pair are combined into a first text, and the first query content and the second query result in the second sample are combined into a second text; the first text may be a simple concatenation of the first query content and the first query result, and the second text a simple concatenation of the first query content and the second query result. The first text and the second text are then vector-represented, that is, expressed in vector form. The vector representation may be an initial vector matrix, in which each vector corresponds to the initialization vector of one word segment in the first text or the second text. The initialization vector of a word segment may be a randomly initialized vector, or a word vector obtained from an existing pre-trained dictionary, which may include word vectors for any known vocabulary. The word segments in the first text and the second text can be obtained by a known word segmentation method, which is a known technology and is not described herein again.
The following example illustrates: suppose the first word set obtained by segmenting the first text composed of the first query content and the first query result is [w_1, w_2, w_3, …, w_n], and the group of vectors corresponding to these word segments is [v_1, v_2, v_3, …, v_n], where v_i (i an integer between 1 and n) is the initialization vector corresponding to word segment w_i, which may be a randomly initialized vector or a word vector obtained from a pre-trained dictionary. Combining this group of vectors yields the first initial vector matrix of the first text, matrix = [v_1^T, v_2^T, v_3^T, …, v_n^T]^T. A second initial vector matrix of the second text can be obtained in the same way. The first initial vector matrix and the second initial vector matrix can be understood as the vector representations of the first text and the second text: to let the ranking model identify and process the input samples, i.e., the first sample and the second sample, the first text and the second text are vector-represented, and the resulting first and second initial vector matrices are input to the ranking model for processing.
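The construction of an initial vector matrix from a word set can be sketched as follows (NumPy; the dictionary and dimensionality are hypothetical, with random fallback vectors for out-of-dictionary word segments):

```python
import numpy as np

def build_initial_matrix(tokens, dictionary, dim=8, seed=0):
    # Look each word segment up in a (hypothetical) pre-trained dictionary
    # of word vectors; fall back to random initialization for unknown ones.
    rng = np.random.default_rng(seed)
    rows = [dictionary.get(t) for t in tokens]
    rows = [r if r is not None else rng.standard_normal(dim) for r in rows]
    return np.stack(rows)  # one row per word segment: shape (n, dim)

dictionary = {"query": np.ones(8)}  # toy one-entry dictionary
matrix = build_initial_matrix(["query", "result", "text"], dictionary)
```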
The ranking model may process the first initial vector matrix and the second initial vector matrix, respectively, to extract a first vector feature that can represent the relationship between the first query content and the first query result, and a second vector feature that can represent the relationship between the first query content and the second query result. The specific extraction method of the first and second vector features is related to the model structure of the ranking model. Taking the deep neural network model as an example, it may comprise multiple processing layers, and after layer-by-layer processing of the first and second initial vector matrices, the first and second vector features are finally obtained. The last layer of the deep neural network model may be an activation function layer, which performs multi-class prediction on the first vector feature and the second vector feature respectively and outputs their probability distributions over the plurality of correlation levels. In some embodiments, the activation function may be a softmax function, which maps the features to values between 0 and 1 that serve as the predicted probabilities for the corresponding correlation levels.
In an optional implementation manner of this embodiment, as shown in fig. 5, the step S204 of determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result further includes the following steps:
in step S501, a first expected value of the first correlation prediction result and a second expected value of the second correlation prediction result are determined;
in step S502, the correlation prediction comparison result is determined according to the first expected value and the second expected value.
In this optional implementation manner, the first relevance prediction result obtained by the ranking model's multi-class prediction on the first sample is the probability distribution of the relevance between the first query content and the first query result over multiple relevance levels, and the second relevance prediction result obtained by multi-class prediction on the second sample is the probability distribution of the relevance between the first query content and the second query result over multiple relevance levels. The probability distribution can be represented by a vector; for example, with four correlation levels [l_1, l_2, l_3, l_4], the probability distribution obtained by the ranking model can be represented by the vector [p_1, p_2, p_3, p_4]. In the embodiment of the present disclosure, in order to compare the relevance of the first query result and the second query result to the first query content according to the first and second correlation prediction results obtained by the ranking model, the probability distribution vectors corresponding to the two prediction results may each be mapped onto a scalar score, and the scalar scores compared to determine the correlation comparison result between the first query result and the second query result.
In one embodiment, by calculating the first expected value of the first correlation prediction result and the second expected value of the second correlation prediction result, the probability distribution vector corresponding to the first correlation prediction result may be mapped to the scalar first expected value, and the probability distribution vector corresponding to the second correlation prediction result to the scalar second expected value. The first expected value and the second expected value are calculated as follows:
s = Σ_l l · p_l

where s is the first expected value or the second expected value, l is the correlation level, and p_l is the prediction probability at correlation level l in the probability distribution vector obtained by the ranking model's multi-class prediction, that is, the predicted probability that the relevance between the query content and the query result in the sample is at level l, with

p_l = σ(y_l)

where σ(·) is the softmax activation function of the last layer in the ranking model and y_l is the input to the softmax, namely the intermediate feature corresponding to correlation level l extracted by the deep neural network after layer-by-layer processing.
For example, if the ranking model's multi-class prediction on the first sample yields a first correlation prediction result whose probability distribution vector is [0.1, 0.2, 0.3, 0.4] over the correlation levels [0, 1, 2, 3], then the first expected value is s = 0.1×0 + 0.2×1 + 0.3×2 + 0.4×3 = 2.0.
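The expected-value mapping from a probability distribution vector to a scalar score is a single dot product, reproducing the worked example above:

```python
import numpy as np

levels = np.array([0, 1, 2, 3])     # the relevance levels from the example
p = np.array([0.1, 0.2, 0.3, 0.4])  # predicted probability distribution
s = float(np.dot(levels, p))        # expected value, 2.0 as computed above
```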
It can be seen that the gradient of the expectation s of the correlation prediction result with respect to the input y_l at correlation level l is:

∂s/∂y_l = p_l · (l − s)

That is, each gradient is weighted by the corresponding correlation level l, so that when the third loss is fitted using the expectation, the model parameters corresponding to every correlation level l participate in the calculation. By contrast, when the first loss (or second loss) is fitted directly from the first (or second) correlation prediction result of the softmax activation function, only the gradient of the model parameters corresponding to the first (or second) correlation level of the first (or second) sample is computed. The first expectation (or second expectation) therefore accelerates convergence.
FIG. 6 shows a flow diagram of a method of ranking query results according to an embodiment of the present disclosure. As shown in fig. 6, the method for ranking query results includes the following steps:
in step S601, query content and query results to be ranked are obtained;
in step S602, identifying a correlation between the query content and the query result using a ranking model; the ranking model is obtained by training by using the training method of the ranking model;
in step S603, the query results are ranked according to the relevance.
In this embodiment, the method for sorting the query results is executed by the server and corresponds to the online identification part of the server. The server may receive query content from the client and invoke the search engine to recall a plurality of query results corresponding to the query content. The plurality of query results obtained by the server from the search engine are results that are relevant to the query content, but the search engine does not rank the query results according to relevance. Therefore, the server can rank the plurality of query results by using the ranking model obtained by offline training.
The training process of the ranking model is described in the embodiments shown in fig. 1 to 4 above and is not repeated herein. After the query content and a query result to be ranked are input into the ranking model, the relevance between them is obtained, and the server can derive the ranking order of the query results from the relevance of each. For example, for query result 1 and query result 2, with relevances r1 and r2 obtained from the ranking model, the order of their relevance to the query content can be determined by comparing r1 and r2.
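Ranking the recalled query results by their model scores is then a simple sort (hypothetical result names and scores):

```python
def rank_results(results, scores):
    # Sort query results by descending relevance score.
    return [r for r, _ in sorted(zip(results, scores), key=lambda pair: -pair[1])]

ranked = rank_results(["result_1", "result_2", "result_3"], [0.4, 0.9, 0.1])
```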
In an optional implementation manner of this embodiment, as shown in fig. 7, the step S602, namely the step of identifying the relevance between the query content and the query result by using a ranking model, further includes the following steps:
in step S701, performing word segmentation on a third text composed of the query content and the query result to obtain a third word segmentation set;
in step S702, a third initial vector matrix corresponding to the third word segmentation set is obtained according to the initialization vector of each word segment in the third word segmentation set;
in step S703, extracting a third vector feature representing the relationship between the query content and the query result in the third initial vector matrix by using the ranking model;
in step S704, a correlation between the query content and the query result is identified for the third vector feature using the ranking model.
In this optional implementation, when the input required by the model structure selected by the ranking model, for example, the deep neural network model, is in a vector form, the query content and the query result may be vector-represented.
In the embodiment of the present disclosure, the query content and a query result are first combined into a third text, which may be a simple concatenation of the two. The third text is then vector-represented, that is, expressed in vector form. The vector representation may be an initial vector matrix, in which each vector corresponds to the initialization vector of one word segment in the third text; the initialization vector of a word segment may be a randomly initialized vector, or a word vector obtained from an existing pre-trained dictionary, which may include word vectors for any known vocabulary. The word segments in the third text can be obtained by a known word segmentation method, which is a known technology and is not described herein again.
The ranking model can process the third initial vector matrix and extract from it a third vector feature that can represent the relationship between the query content and the query result. The specific extraction method of the third vector feature is related to the model structure of the ranking model. Taking the deep neural network model as an example, it may comprise multiple processing layers, and after layer-by-layer processing of the third initial vector matrix the third vector feature is finally obtained; this feature is related to the specific neural network structure and is an intermediate internal result, which is not specifically limited herein. The ranking model may then identify the relevance of the query content to the query result from the third vector feature.
In an optional implementation manner of this embodiment, as shown in fig. 8, the step S704, namely, the step of identifying the relevance between the query content and the query result for the third vector feature by using the ranking model, further includes the following steps:
in step S801, performing multi-class prediction on the correlation between the query content and the query result according to the third vector feature by using the ranking model, so as to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
in step S802, a third expected value of the third probability distribution vector is determined, and the third expected value is determined as a correlation between the query content and the query result.
In this embodiment, taking the deep neural network model as an example, the last layer of the ranking model may be an activation function layer, and the activation function layer may perform multi-class prediction on the third vector feature, and further output probability distribution of the third vector feature on multiple correlation levels, that is, a third probability distribution vector. In some embodiments, the activation function may be a softmax function. The Softmax function may map the third vector feature to a value between [0,1], which may be the prediction probability for the corresponding correlation level, the prediction probabilities at the plurality of correlation levels constituting the third probability distribution vector.
For details of calculating the third expected value, reference may be made to the description of the first and second expected values in the training method of the ranking model, and details are not repeated here. It can be understood that the third expected value represents the correlation between the query content and the query result, so the correlations of multiple query results to the same query content can be quickly compared. Because the training method of the ranking model in the present disclosure combines pointwise and pairwise training, the third expected value obtained in the above manner has both global distinguishing capability and local ranking capability, and can be directly used to rank the plurality of query results by relevance.
FIG. 9 shows a flow diagram of a method of ranking query results according to another embodiment of the present disclosure. As shown in fig. 9, the method for ranking query results includes the following steps:
in step S901, query content and query results to be ranked are obtained;
in step S902, performing multi-category prediction on the correlation between the query content and the query result by using a ranking model, to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
in step S903, determining a third expected value of the third probability distribution vector, and determining the third expected value as a correlation between the query content and the query result;
in step S904, the query results are ranked according to the relevance.
In this embodiment, the method for sorting the query results is executed by the server and corresponds to the online identification part of the server. The server may receive query content from the client and invoke the search engine to recall a plurality of query results corresponding to the query content. The plurality of query results obtained by the server from the search engine are results that are relevant to the query content, but the search engine does not rank the query results according to relevance. Therefore, the server can rank the multiple query results by using the ranking model obtained by offline training.
The ranking model can perform multi-classification prediction on the correlation between the query content and the query result to obtain a third probability distribution vector, where the third probability distribution vector may be a probability distribution over multiple correlation levels, that is, the probability that the correlation between the query content and the query result falls at each correlation level. The ranking model may be a deep neural network model. It is understood that machine learning models of other configurations may also be used as the ranking model, such as one or a combination of logistic regression models, convolutional neural networks, feedback neural networks, support vector machines, K-means, K-nearest neighbors, decision trees, random forests, Bayesian networks, and the like.
The ranking model can be trained in advance. In the training process, the training data of the ranking model may include sample query content and sample query results. The feature data of the sample query content and the sample query results are input into the ranking model, processed by the ranking model, and the multi-classification prediction results are output. A loss function can then be used to fit the loss between the multi-classification prediction results and the real labels of the sample query results; at the same time, a pairwise training mode can be combined to fit the pairwise loss between two sample query results corresponding to the same sample query content. Training stops after the model parameters of the ranking model converge, and the trained ranking model is used for online identification.
The ranking model may perform multi-class prediction on the correlation between the query content and the query result, where the multi-class prediction result is a probability distribution of the correlation between the query content and the query result over multiple correlation levels, represented as a third probability distribution vector. Taking the deep neural network model as an example, the last layer of the ranking model may be an activation function layer. The activation function layer may perform multi-class prediction on the third vector feature and output the probability distribution of the third vector feature over multiple correlation levels, that is, the third probability distribution vector. In some embodiments, the activation function may be a Softmax function. The Softmax function maps the third vector feature to values in the interval [0, 1]; each value may be the prediction probability of the corresponding correlation level, and the prediction probabilities over the multiple correlation levels constitute the third probability distribution vector.
For details of calculating the third expected value, reference may be made to the related description of the first expected value and the second expected value in the training method of the ranking model, and details are not repeated here. It can be understood that, by representing the correlation between the query content and the query result by the third expected value, the magnitude of the correlation between the plurality of query results corresponding to the query content can be quickly compared.
The server may obtain a ranking order of the plurality of query results according to the third expected value corresponding to each query result. For example, for query result 1 and query result 2, the third expected values obtained by the ranking model are s1 and s2, respectively, and then the order of relevance between query result 1 and query result 2 and the query content can be determined by comparing the sizes between s1 and s 2.
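The comparison of expected values described above amounts to a simple sort — the document names only query result 1 and query result 2; `doc3` and all the scores below are hypothetical:

```python
# Hypothetical third expected values produced by the ranking model for
# each recalled query result (doc3 is added for illustration).
scores = {"doc1": 2.1, "doc2": 3.4, "doc3": 1.7}

# Rank the query results by relevance score, highest first.
ranked = sorted(scores, key=scores.get, reverse=True)
# ranked == ["doc2", "doc1", "doc3"]
```

Comparing `s1` and `s2` for two results, as in the text, is the two-element special case of this sort.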
In an optional implementation manner of this embodiment, as shown in fig. 10, the step S902, that is, the step of performing multi-category prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result at multiple correlation levels, further includes the following steps:
in step S1001, a third text composed of the query content and the query result is subjected to word segmentation to obtain a third word segmentation set;
in step S1002, a third initial vector matrix corresponding to the third participle set is obtained according to the initialization vector of each participle in the third participle set;
in step S1003, extracting, by using the ranking model, a third vector feature representing a relationship between the query content and the query result in the third initial vector matrix;
in step S1004, performing multi-class prediction on the correlation between the query content and the query result according to the third vector feature by using the ranking model, so as to obtain the third probability distribution vector.
In this optional implementation, when the input required by the model structure selected by the ranking model, for example, the deep neural network model, is in a vector form, the query content and the query result may be vector-represented.
In the embodiment of the present disclosure, the query content and a query result are first combined into a third text, where the third text may be a simple concatenation of the query content and the query result. The third text is then represented in vector form as an initial vector matrix, where each vector in the initial vector matrix corresponds to the initialization vector of one participle in the third text. The initialization vector of a participle may be a randomly initialized vector, or a word vector obtained from an existing pre-trained dictionary, where the dictionary may include word vectors for any known vocabulary. The participles in the third text can be obtained by a known word segmentation method, which is a known technology and is not described herein again.
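A minimal sketch of building an initial vector matrix from a word-segmented third text, assuming randomly initialized word vectors — the tokens and the embedding dimension are hypothetical, and a pre-trained dictionary could replace the random initialization:

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_dim = 8           # hypothetical dimension
vocab = {}                  # word -> initialization vector, built lazily

def word_vector(word):
    # Randomly initialize a vector for an unseen participle; a word
    # vector from a pre-trained dictionary could be substituted here.
    if word not in vocab:
        vocab[word] = rng.normal(size=embedding_dim)
    return vocab[word]

def initial_vector_matrix(participles):
    # One row per participle in the word segmentation set.
    return np.stack([word_vector(p) for p in participles])

# Illustrative segmentation of a third text (query content + query result).
participles = ["what", "is", "ranking", "model"]
matrix = initial_vector_matrix(participles)   # shape: (4, 8)
```

Each row of `matrix` is the initialization vector of one participle, matching the structure of the third initial vector matrix described above.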
The ranking model can process the third initial vector matrix and extract from it third vector features that can be used to represent the relationship between the query content and the query result. The specific extraction mode of the third vector features is related to the model structure of the ranking model. Taking the deep neural network model as an example, the deep neural network model may include multiple processing layers, and after processing the third initial vector matrix layer by layer, the deep neural network obtains the third vector features. The third vector features are related to the specific neural network model structure and belong to an intermediate processing result inside that structure, so no specific limitation is made here. The ranking model may identify the relevance of the query content to the query result from the third vector features.
Taking the deep neural network model as an example, the last layer of the ranking model may be an activation function layer. The activation function layer may perform multi-class prediction on the third vector feature and output the probability distribution of the third vector feature over multiple correlation levels, that is, the third probability distribution vector. In some embodiments, the activation function may be a Softmax function. The Softmax function maps the third vector feature to values in the interval [0, 1]; each value may be the prediction probability of the corresponding correlation level, and the prediction probabilities over the multiple correlation levels constitute the third probability distribution vector.
FIG. 11 shows a flow diagram of a query method according to an embodiment of the present disclosure. As shown in fig. 11, the query method includes the following steps:
in step S1101, query contents input by a user are received from a client;
in step S1102, a plurality of query results are obtained according to the query content;
in step S1103, sorting the plurality of query results to obtain sorted results; the sorting result is obtained by using the sorting method of the query result;
in step S1104, at least one of the query results is returned to the client according to the sorting result.
In this embodiment, the query method is executed by a server. The user may enter query content, such as query keywords, through the client. The client may provide a front-end presentation page of the retrieval system and provide a search entry on the front-end presentation page through which the user enters query content.
After receiving the query content input by the user, the client sends the query content to the server. After receiving the query content, the server calls a search engine to obtain a plurality of query results corresponding to the query content, wherein the query results are not subjected to relevance ranking. Therefore, the server can rank the query results according to the ranking method of the query results provided by the embodiment of the disclosure, and then obtain the ranking results. The server can return one or more query results to the client according to the sorting result. For example, if the application scenario of the client is a trivia query, the server may return the most relevant query results to the client according to the above ranking results, since the trivia query only requires one most relevant answer. And if the application scene of the client is a common search scene like an online platform, the server can return a plurality of query results ranked in the front to the client according to the ranking result.
For details of the ranking of the multiple query results, reference may be made to the description of the query result ranking method in the embodiments shown in fig. 6 to fig. 10 and the related embodiments, which is not described herein again.
FIG. 12 shows a flow diagram of a query method according to another embodiment of the present disclosure. As shown in fig. 12, the query method includes the following steps:
in step S1201, receiving query content input by a user;
in step S1202, the query content is sent to a query server, so that the query server obtains a query result according to the query method;
in step S1203, the query result is received from the query server and presented.
In this embodiment, the query method is executed by the client. The user may enter query content, such as query keywords, through the client. The client may provide a front-end presentation page of the retrieval system and provide a search entry on the front-end presentation page through which the user enters query content.
After receiving the query content input by the user, the client sends the query content to the server. And after obtaining the query result according to the query content, the server returns the query result to the client. And when the number of the query results returned to the client by the server is multiple, the server also returns a sorting result of sorting the query results according to the relevance. The client may present the query result according to the ranking result, for example, the query result may be presented to the user in an order of decreasing relevance.
For other details related to the embodiments of the present disclosure, reference may be made to the above description of the embodiment shown in fig. 11 and the related embodiments, which are not described herein again.
FIG. 13 is a schematic flow chart illustrating a user querying information through a client according to an embodiment of the disclosure. As shown in fig. 13, a user inputs a query keyword "query a" at a search entrance provided by a client 1301, and the client 1301 sends the query keyword "query a" to a backend server 1302. The backend server 1302 invokes a search engine 1303 to recall a plurality of query results "doc1, doc2, … …, docn", and the server 1302 performs relevance ranking on the plurality of query results "doc1, doc2, … …, docn" by using a ranking method proposed in the embodiment of the present disclosure to obtain "docm1, docm2, … …, docmn", where docm1, docm2, … …, and docmn each correspond to one of the foregoing doc1, doc2, … …, and docn. The server returns "docm1, docm2, … …, docmn" to the client 1301 in that order, and the client 1301 presents one or more of the query results on the front-end search page in that order, starting from docm1.
Table 3 compares, under various evaluation indexes, the effect of the ranking model trained according to the embodiment of the present disclosure with that of a ranking model obtained by the conventional pairwise training method, after correlation prediction is performed on the same test set:
TABLE 3
The positive sequence proportion corresponds to the prediction accuracy, and the positive-to-negative ratio is obtained by calculating positive sequence proportion / (1 - positive sequence proportion). The effect of the ranking model trained according to the embodiment of the present disclosure is significantly better than that of the ranking model obtained by the conventional pairwise training method. In addition, the recall rates under different accuracies are also shown in table 3; these indexes likewise show that the ranking model trained according to the embodiment of the present disclosure significantly outperforms the ranking model obtained by the conventional pairwise training method.
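The positive-to-negative ratio described above reduces to a one-line formula — a sketch with a hypothetical positive sequence proportion, since the actual values in Table 3 are not reproduced here:

```python
def positive_negative_ratio(positive_order_proportion):
    # ratio = p / (1 - p), per the description accompanying Table 3.
    return positive_order_proportion / (1 - positive_order_proportion)

# A hypothetical positive sequence proportion of 0.8 yields a ratio of 4.0.
ratio = positive_negative_ratio(0.8)
```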
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods.
According to the training device of the ranking model provided by an embodiment of the disclosure, the training device can be implemented as part or all of an electronic device through software, hardware or a combination of the two. The training device of the ranking model comprises:
a first obtaining module configured to obtain a sample data pair; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; the first level of correlation is different from the second level of correlation;
the first prediction module is configured to perform multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and perform multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
a first fitting module configured to fit a first loss between the first correlation prediction and the first correlation level with a first loss function and fit a second loss between the second correlation prediction and the second correlation level;
a first comparison module configured to determine a correlation prediction comparison result from the first correlation prediction result and the second correlation prediction result; the correlation prediction comparison result is used for representing the correlation high-low comparison result of the first query result and the second query result;
a second fitting module configured to fit a third loss between the correlation prediction comparison result and the correlation level comparison result with a second loss function; the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
a parameter adjustment module configured to adjust a model parameter of the ranking model according to the first loss, the second loss, and the third loss.
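The three losses combined by the parameter adjustment module can be sketched as follows — a minimal illustration assuming a cross-entropy pointwise loss and a margin-based hinge for the pairwise term; the disclosure does not fix these concrete forms, and all probabilities, labels, and the margin below are hypothetical:

```python
import numpy as np

def cross_entropy(probs, label):
    # Pointwise loss: negative log-likelihood of the true correlation level.
    return -np.log(probs[label] + 1e-12)

def pairwise_hinge(s_high, s_low, margin=1.0):
    # Pairwise loss: penalize the pair when the higher-graded result does
    # not outscore the lower-graded one by at least `margin`.
    return max(0.0, margin - (s_high - s_low))

levels = np.arange(5)
p1 = np.array([0.05, 0.05, 0.10, 0.30, 0.50])   # prediction, first sample
p2 = np.array([0.40, 0.30, 0.20, 0.05, 0.05])   # prediction, second sample

first_loss = cross_entropy(p1, label=4)          # true level of sample 1
second_loss = cross_entropy(p2, label=1)         # true level of sample 2
# Expected values serve as the comparable scores of the two samples.
third_loss = pairwise_hinge(np.dot(levels, p1), np.dot(levels, p2))
total_loss = first_loss + second_loss + third_loss
# Here the pair is already ordered by more than the margin, so the
# pairwise term is zero and only the pointwise terms contribute.
```

In training, a gradient step on `total_loss` would play the role of the parameter adjustment module.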
In an optional implementation manner of this embodiment, the first obtaining module includes:
a first obtaining sub-module configured to obtain the first query content, a plurality of candidate query results related to the first query content, and relevance labels of the candidate query results; the relevance label is used for representing the relevance grade of the candidate query result and the first query content;
and the combination sub-module is configured to combine the candidate query results with different relevance grades pairwise according to the relevance labels to obtain the first query result and the second query result.
In an optional implementation manner of this embodiment, the first prediction module includes:
the first word segmentation sub-module is configured to segment a first text formed by the first query content and the first query result to obtain a first word segmentation set, and to segment a second text formed by the first query content and the second query result to obtain a second word segmentation set;
the first vector obtaining sub-module is configured to obtain a first initial vector matrix corresponding to the first participle set according to the initial word vector of each participle in the first participle set, and obtain a second initial vector matrix corresponding to the second participle set according to the initial word vector of each participle in the second participle set;
a first feature extraction sub-module configured to extract, using the ranking model, first vector features in the first initial vector matrix representing a relationship between the first query content and the first query result, and to extract, using the ranking model, second vector features in the second initial vector matrix representing a relationship between the first query content and the second query result;
a first prediction sub-module configured to perform multi-category prediction on the correlations between the first query content and the first query result, and between the first query content and the second query result respectively for the first vector feature and the second vector feature by using the ranking model, so as to obtain the first correlation prediction result and the second correlation prediction result.
In an optional implementation manner of this embodiment, the first comparing module includes:
a first determination submodule configured to determine a first expected value of the first correlation predictor and to determine a second expected value of the second correlation predictor;
a second determination submodule configured to determine the correlation prediction comparison result according to the first expectation value and the second expectation value.
In an optional implementation of this embodiment, the first relevance prediction result includes a first probability distribution vector of the relevance between the first query content and the first query result over multiple relevance levels; the second relevance prediction result includes a second probability distribution vector of the relevance between the first query content and the second query result over multiple relevance levels.
The training device of the ranking model in this embodiment corresponds to the training method of the ranking model described in the embodiment shown in fig. 2 and the related embodiments, and specific details can be referred to the above description of the training method of the ranking model, which is not described herein again.
According to the query result ranking device provided by an embodiment of the present disclosure, the device may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. The device for sorting the query results comprises:
the second acquisition module is configured to acquire the query content and the query result to be ranked;
an identification module configured to identify a correlation between the query content and the query results using a ranking model; wherein the ranking model is obtained by training using the training device of the ranking model;
a first ranking module configured to rank the query results according to the relevance.
In an optional implementation manner of this embodiment, the identifying module includes:
the second word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
the second vector acquisition submodule is configured to obtain a third initial vector matrix corresponding to the third participle set according to the initialization vector of each participle in the third participle set;
a second feature extraction sub-module configured to extract, by using the ranking model, third vector features in the third initial vector matrix representing a relationship between the query content and the query result;
an identification sub-module configured to identify a correlation between the query content and the query result for the third vector features using the ranking model.
In an optional implementation manner of this embodiment, the identification sub-module includes:
a second prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a third determination sub-module configured to determine a third expected value of the third probability distribution vector and determine the third expected value as a correlation between the query content and the query result.
The query result sorting device in this embodiment corresponds to the query result sorting method described in the embodiment shown in fig. 6 and the related embodiments; for specific details, reference may be made to the above description of the sorting of query results, which is not described herein again.
According to another embodiment of the present disclosure, the query result ranking device may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. The device for sorting the query results comprises:
the third acquisition module is configured to acquire the query content and the query result to be ranked;
a second prediction module configured to perform multi-category prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a first determination module configured to determine a third expected value of the third probability distribution vector and determine the third expected value as a correlation between the query content and the query result;
a second ranking module configured to rank the query results according to the relevance.
In an optional implementation manner of this embodiment, the second prediction module includes:
the third word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
a third vector obtaining sub-module configured to obtain a third initial vector matrix corresponding to the third participle set according to the initialization vector of each participle in the third participle set;
a third feature extraction sub-module configured to extract, by using the ranking model, third vector features representing a relationship between the query content and the query result in the third initial vector matrix;
a third prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain the third probability distribution vector.
The query result sorting device in this embodiment corresponds to the query result sorting method described in the embodiment shown in fig. 9 and the related embodiments; for specific details, reference may be made to the above description of the query result sorting method, which is not described herein again.
According to the query device provided by an embodiment of the present disclosure, the query device may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. The inquiry apparatus includes:
a first receiving module configured to receive query content input by a user from a client;
the query module is configured to obtain a plurality of query results according to the query content;
a third ranking module configured to rank the plurality of query results to obtain ranked results; wherein, the sorting result is obtained by using the sorting device of the query result;
a returning module configured to return at least one of the query results to the client according to the ranking result.
The query device in this embodiment corresponds to the query method described in the embodiment shown in fig. 11 and the related embodiments; for specific details, reference may be made to the above description of the query method, which is not described herein again.
According to another embodiment of the present disclosure, the query apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. The inquiry apparatus includes:
the second receiving module is configured to receive query content input by a user;
the sending module is configured to send the query content to a query server so that the query server can obtain a query result according to the query device;
a presentation module configured to receive and present the query result from the query server.
The query device in this embodiment corresponds to the query method described in the embodiment and related embodiments shown in fig. 12, and specific details can be referred to the above description of the query method, which is not described herein again.
FIG. 14 is a schematic diagram of an electronic device suitable for use in implementing a training method for a ranking model, a ranking method for query results, and/or a query method according to embodiments of the present disclosure.
As shown in fig. 14, electronic device 1400 includes a processing unit 1401, which may be implemented as a CPU, GPU, FPGA, NPU, or similar processing unit. The processing unit 1401 can execute various processes in the above-described method embodiments of the present disclosure according to a program stored in a Read Only Memory (ROM) 1402 or a program loaded from a storage portion 1408 into a Random Access Memory (RAM) 1403. In the RAM 1403, various programs and data necessary for the operation of the electronic device 1400 are also stored. The processing unit 1401, the ROM 1402, and the RAM 1403 are connected to each other by a bus 1404. An input/output (I/O) interface 1405 is also connected to bus 1404.
The following components are connected to the I/O interface 1405: an input portion 1406 including a keyboard, a mouse, and the like; an output portion 1407 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker and the like; a storage portion 1408 including a hard disk and the like; and a communication portion 1409 including a network interface card such as a LAN card, a modem, or the like. The communication section 1409 performs communication processing via a network such as the internet. The driver 1410 is also connected to the I/O interface 1405 as necessary. A removable medium 1411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1410 as necessary, so that a computer program read out therefrom is installed into the storage section 1408 as necessary.
In particular, according to embodiments of the present disclosure, the methods in the embodiments above with reference to the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a medium readable thereby, the computer program comprising program code for performing the methods of embodiments of the present disclosure. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 1409 and/or installed from the removable media 1411.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented by software or hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.
As another aspect, the present disclosure also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiment; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.
The foregoing description is only an illustration of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, a technical solution formed by replacing the above features with (but not limited to) features with similar functions disclosed in the present disclosure.
Claims (22)
1. A training method of a ranking model, comprising:
acquiring a sample data pair; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; and the first relevance grade is different from the second relevance grade;
performing multi-classification prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and performing multi-classification prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
fitting, with a first loss function, a first loss between the first correlation prediction result and the first correlation level and a second loss between the second correlation prediction result and the second correlation level;
determining a correlation prediction comparison result according to the first correlation prediction result and the second correlation prediction result; wherein the correlation prediction comparison result represents which of the first query result and the second query result has the higher correlation;
fitting a third loss between the correlation prediction comparison result and the correlation level comparison result using a second loss function; wherein the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
and adjusting the model parameters of the ranking model according to the first loss, the second loss, and the third loss so as to optimize the model parameters.
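Claim 1 combines two pointwise losses with one pairwise loss but does not name the concrete loss functions. The following is only a minimal numpy sketch under stated assumptions: cross-entropy for the first and second losses, the expected relevance grade (as in claims 4 and 5) for the comparison, and a logistic loss on the signed difference of expectations for the third loss; all function and variable names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_ranking_loss(logits_a, logits_b, grade_a, grade_b):
    """Sum of the three losses of claim 1 for one sample data pair.

    logits_a, logits_b: model scores over K relevance grades for the
    first and second query results; grade_a, grade_b: their integer
    relevance-grade labels (grade_a != grade_b).
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    # First and second losses: pointwise multi-class cross-entropy
    # between each prediction and its relevance grade (assumed form).
    loss1 = -np.log(p_a[grade_a])
    loss2 = -np.log(p_b[grade_b])
    # Expected relevance grade of each prediction (cf. claims 4-5).
    grades = np.arange(p_a.shape[-1])
    exp_a, exp_b = p_a @ grades, p_b @ grades
    # Third loss: pairwise logistic loss on the comparison of the two
    # expectations; the sign encodes which label grade is higher.
    sign = 1.0 if grade_a > grade_b else -1.0
    loss3 = np.log1p(np.exp(-sign * (exp_a - exp_b)))
    return float(loss1 + loss2 + loss3)
```

A pair whose predictions agree with the labels yields a smaller total loss than the reversed pair, which is the signal the parameter-adjustment step of claim 1 would exploit.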
2. The method of claim 1, wherein obtaining sample data pairs comprises:
obtaining the first query content, a plurality of candidate query results related to the first query content and relevance labels of the candidate query results; the relevance label is used for representing the relevance grade of the candidate query result and the first query content;
and combining the candidate query results with different relevance grades pairwise according to the relevance labels to obtain the first query result and the second query result.
3. The method of claim 1 or 2, wherein performing multi-class prediction on the correlation between the first query content and the first query result by using the ranking model to obtain the first correlation prediction result, and performing multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain the second correlation prediction result comprises:
performing word segmentation on a first text composed of the first query content and the first query result to obtain a first word segmentation set, and performing word segmentation on a second text composed of the first query content and the second query result to obtain a second word segmentation set;
obtaining a first initial vector matrix corresponding to the first word segmentation set according to the initial word vector of each word segment in the first word segmentation set, and obtaining a second initial vector matrix corresponding to the second word segmentation set according to the initial word vector of each word segment in the second word segmentation set;
extracting, with the ranking model, first vector features in the first initial vector matrix that represent a relationship between the first query content and the first query result, and extracting, with the ranking model, second vector features in the second initial vector matrix that represent a relationship between the first query content and the second query result;
and performing multi-class prediction on the correlation between the first query content and the first query result and the correlation between the first query content and the second query result by using the ranking model according to the first vector feature and the second vector feature, respectively, to obtain the first correlation prediction result and the second correlation prediction result.
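The preprocessing of claim 3 (join the query content and query result into one text, segment it into words, and stack each word's initial vector into an initial vector matrix) can be sketched as below; the whitespace tokenizer and the `VOCAB`/`EMBED` tables are invented stand-ins for the model's real word segmentation and pretrained word vectors.

```python
import numpy as np

# Hypothetical vocabulary and initial word-vector table; a real system
# would load these from the ranking model's embedding layer.
VOCAB = {"what": 0, "is": 1, "a": 2, "ranking": 3, "model": 4, "[UNK]": 5}
EMBED = np.random.RandomState(0).randn(len(VOCAB), 8)  # 8-dim vectors

def initial_vector_matrix(query_content, query_result):
    """Build the initial vector matrix of claim 3 for one (query, result) pair."""
    text = query_content + " " + query_result   # the first/second text
    words = text.lower().split()                # naive word segmentation
    ids = [VOCAB.get(w, VOCAB["[UNK]"]) for w in words]
    return EMBED[ids]                           # shape: (num_words, dim)
```

The resulting matrix is what the ranking model's feature-extraction step would consume to produce the vector features of the claim.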
4. The method of claim 1 or 2, wherein determining a correlation prediction comparison result from the first and second correlation prediction results comprises:
determining a first expected value of the first correlation prediction result, and determining a second expected value of the second correlation prediction result;
and determining the correlation prediction comparison result according to the first expected value and the second expected value.
5. The method of claim 4, wherein the first correlation prediction result comprises a first probability distribution vector of the correlation between the first query content and the first query result over a plurality of correlation levels; and the second correlation prediction result comprises a second probability distribution vector of the correlation between the first query content and the second query result over the plurality of correlation levels.
6. A method for ranking query results, comprising:
acquiring query content and query results to be ranked;
identifying a relevance between the query content and the query result using a ranking model; the ranking model is trained by the method of any one of claims 1 to 5;
and sorting the query results according to the relevance.
7. The method of claim 6, wherein identifying the relevance between the query content and the query results using a ranking model comprises:
performing word segmentation on a third text formed by the query content and the query result to obtain a third word segmentation set;
obtaining a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
extracting, by using the ranking model, third vector features representing the relationship between the query content and the query result in the third initial vector matrix;
identifying a correlation between the query content and the query result for the third vector features using the ranking model.
8. The method of claim 7, wherein identifying, with the ranking model, a correlation between the query content and the query result for the third vector features comprises:
performing multi-class prediction on the correlation between the query content and the query result for the third vector features by using the ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result over a plurality of correlation levels;
determining a third expected value of the third probability distribution vector, and determining the third expected value as a correlation between the query content and the query result.
9. A method for ranking query results, comprising:
acquiring query content and query results to be ranked;
performing multi-class prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result over a plurality of correlation levels;
determining a third expected value of the third probability distribution vector, and determining the third expected value as the correlation between the query content and the query result, wherein the third expected value has both global discrimination capability and local ranking capability;
and sorting the query results according to the relevance.
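Claims 8 and 9 score each result by the expected value of its predicted grade distribution and sort on that score. A small sketch, assuming the relevance levels are indexed 0..K-1 (function names are hypothetical):

```python
import numpy as np

def expected_relevance(prob_dist):
    """Expected value of a probability distribution over relevance
    levels 0..K-1 (the 'third expected value' of claims 8-9)."""
    grades = np.arange(len(prob_dist))
    return float(np.dot(prob_dist, grades))

def rank_results(results):
    """Sort (result, probability-distribution) pairs by descending
    expected relevance, as in the final step of claim 9."""
    return sorted(results, key=lambda r: expected_relevance(r[1]), reverse=True)
```

Because the expectation is a single scalar on a shared grade scale, it can both separate clearly different results globally and order close results locally, which matches the "global distinguishing capability and local sorting capability" the claim names.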
10. The method of claim 9, wherein performing multi-class prediction on the correlation between the query content and the query result using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result at multiple correlation levels comprises:
performing word segmentation on a third text formed by the query content and the query result to obtain a third word segmentation set;
obtaining a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
extracting, by using the ranking model, third vector features representing the relationship between the query content and the query result in the third initial vector matrix;
and performing multi-class prediction on the correlation between the query content and the query result for the third vector features by using the ranking model to obtain the third probability distribution vector.
11. A training apparatus for ranking models, comprising:
a first obtaining module configured to obtain a sample data pair; wherein the sample data pair comprises a first sample and a second sample, the first sample comprises first query content and a first query result having a first relevance grade with the first query content, and the second sample comprises the first query content and a second query result having a second relevance grade with the first query content; and the first relevance grade is different from the second relevance grade;
the first prediction module is configured to perform multi-class prediction on the correlation between the first query content and the first query result by using a ranking model to obtain a first correlation prediction result, and perform multi-class prediction on the correlation between the first query content and the second query result by using the ranking model to obtain a second correlation prediction result;
a first fitting module configured to fit, with a first loss function, a first loss between the first correlation prediction result and the first correlation level and a second loss between the second correlation prediction result and the second correlation level;
a first comparison module configured to determine a correlation prediction comparison result from the first correlation prediction result and the second correlation prediction result; the correlation prediction comparison result is used for representing the correlation high-low comparison result of the first query result and the second query result;
a second fitting module configured to fit a third loss between the correlation prediction comparison result and the correlation level comparison result with a second loss function; the correlation level comparison result is a comparison result between the first correlation level and the second correlation level;
a parameter adjustment module configured to adjust the model parameters of the ranking model according to the first loss, the second loss, and the third loss so as to optimize the model parameters.
12. The apparatus of claim 11, wherein the first obtaining module comprises:
a first obtaining sub-module configured to obtain the first query content, a plurality of candidate query results related to the first query content, and relevance labels of the candidate query results; the relevance label is used for representing the relevance grade of the candidate query result and the first query content;
and the combination sub-module is configured to combine the candidate query results with different relevance grades pairwise according to the relevance labels to obtain the first query result and the second query result.
13. The apparatus of claim 11 or 12, wherein the first prediction module comprises:
a first word segmentation sub-module configured to segment a first text composed of the first query content and the first query result to obtain a first word segmentation set, and to segment a second text composed of the first query content and the second query result to obtain a second word segmentation set;
a first vector obtaining sub-module configured to obtain a first initial vector matrix corresponding to the first word segmentation set according to the initial word vector of each word segment in the first word segmentation set, and to obtain a second initial vector matrix corresponding to the second word segmentation set according to the initial word vector of each word segment in the second word segmentation set;
a first feature extraction sub-module configured to extract, using the ranking model, first vector features in the first initial vector matrix representing a relationship between the first query content and the first query result, and to extract, using the ranking model, second vector features in the second initial vector matrix representing a relationship between the first query content and the second query result;
a first prediction sub-module configured to perform multi-category prediction on the correlations between the first query content and the first query result, and between the first query content and the second query result respectively for the first vector feature and the second vector feature by using the ranking model, so as to obtain the first correlation prediction result and the second correlation prediction result.
14. The apparatus of claim 11 or 12, wherein the first comparing module comprises:
a first determination sub-module configured to determine a first expected value of the first correlation prediction result and a second expected value of the second correlation prediction result;
a second determination submodule configured to determine the correlation prediction comparison result according to the first expectation value and the second expectation value.
15. The apparatus of claim 14, wherein the first correlation prediction result comprises a first probability distribution vector of the correlation between the first query content and the first query result over a plurality of correlation levels; and the second correlation prediction result comprises a second probability distribution vector of the correlation between the first query content and the second query result over the plurality of correlation levels.
16. An apparatus for ranking query results, comprising:
the second acquisition module is configured to acquire the query content and the query result to be ranked;
an identification module configured to identify a correlation between the query content and the query results using a ranking model; the ranking model is trained using the apparatus of any one of claims 11-15;
a first ranking module configured to rank the query results according to the relevance.
17. The apparatus of claim 16, wherein the identification module comprises:
the second word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
a second vector obtaining sub-module configured to obtain a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
a second feature extraction sub-module configured to extract, by using the ranking model, third vector features in the third initial vector matrix representing a relationship between the query content and the query result;
an identification sub-module configured to identify a correlation between the query content and the query result for the third vector features using the ranking model.
18. The apparatus of claim 17, wherein the identification sub-module comprises:
a second prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a third determination sub-module configured to determine a third expected value of the third probability distribution vector and determine the third expected value as a correlation between the query content and the query result.
19. An apparatus for ranking query results, comprising:
the third acquisition module is configured to acquire the query content and the query result to be ranked;
a second prediction module configured to perform multi-category prediction on the correlation between the query content and the query result by using a ranking model to obtain a third probability distribution vector of the correlation between the query content and the query result on multiple correlation levels;
a first determining module configured to determine a third expected value of the third probability distribution vector, and to determine the third expected value as the correlation between the query content and the query result, wherein the third expected value has both global discrimination capability and local ranking capability;
a second ranking module configured to rank the query results according to the relevance.
20. The apparatus of claim 19, wherein the second prediction module comprises:
the third word segmentation sub-module is configured to segment words of a third text formed by the query content and the query result to obtain a third word segmentation set;
a third vector obtaining sub-module configured to obtain a third initial vector matrix corresponding to the third word segmentation set according to the initialization vector of each word segment in the third word segmentation set;
a third feature extraction sub-module configured to extract, by using the ranking model, third vector features representing a relationship between the query content and the query result in the third initial vector matrix;
a third prediction sub-module configured to perform multi-class prediction on the correlation between the query content and the query result for the third vector feature by using the ranking model, so as to obtain the third probability distribution vector.
21. An electronic device, comprising a memory and a processor; wherein,
the memory is to store one or more computer instructions, wherein the one or more computer instructions are to be executed by the processor to implement the method of any one of claims 1-10.
22. A computer readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, implement the method of any of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010307729.9A CN113535829B (en) | 2020-04-17 | 2020-04-17 | Training method and device of ranking model, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113535829A CN113535829A (en) | 2021-10-22 |
CN113535829B true CN113535829B (en) | 2022-04-29 |
Family
ID=78123408
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010307729.9A Active CN113535829B (en) | 2020-04-17 | 2020-04-17 | Training method and device of ranking model, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113535829B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106021364A (en) * | 2016-05-10 | 2016-10-12 | 百度在线网络技术(北京)有限公司 | Method and device for establishing picture search correlation prediction model, and picture search method and device |
CN107506402A (en) * | 2017-08-03 | 2017-12-22 | 北京百度网讯科技有限公司 | Sort method, device, equipment and the computer-readable recording medium of search result |
WO2019052403A1 (en) * | 2017-09-12 | 2019-03-21 | 腾讯科技(深圳)有限公司 | Training method for image-text matching model, bidirectional search method, and related apparatus |
CN110969006A (en) * | 2019-12-02 | 2020-04-07 | 支付宝(杭州)信息技术有限公司 | Training method and system of text sequencing model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018075995A1 (en) * | 2016-10-21 | 2018-04-26 | DataRobot, Inc. | Systems for predictive data analytics, and related methods and apparatus |
2020-04-17: application CN202010307729.9A filed (CN); granted as CN113535829B, status Active
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108804641B (en) | Text similarity calculation method, device, equipment and storage medium | |
CN112100529B (en) | Search content ordering method and device, storage medium and electronic equipment | |
US8788503B1 (en) | Content identification | |
CN112667794A (en) | Intelligent question-answer matching method and system based on twin network BERT model | |
US20220277038A1 (en) | Image search based on combined local and global information | |
CN110704601A (en) | Method for solving video question-answering task requiring common knowledge by using problem-knowledge guided progressive space-time attention network | |
CN113297369B (en) | Intelligent question-answering system based on knowledge graph subgraph retrieval | |
CN108228541B (en) | Method and device for generating document abstract | |
CN110895559A (en) | Model training method, text processing method, device and equipment | |
CN112148831B (en) | Image-text mixed retrieval method and device, storage medium and computer equipment | |
CN112256845A (en) | Intention recognition method, device, electronic equipment and computer readable storage medium | |
CN115309872B (en) | Multi-model entropy weighted retrieval method and system based on Kmeans recall | |
CN111274822A (en) | Semantic matching method, device, equipment and storage medium | |
CN113011172A (en) | Text processing method and device, computer equipment and storage medium | |
CN117807232A (en) | Commodity classification method, commodity classification model construction method and device | |
CN115168590A (en) | Text feature extraction method, model training method, device, equipment and medium | |
CN116452688A (en) | Image description generation method based on common attention mechanism | |
CN114493783A (en) | Commodity matching method based on double retrieval mechanism | |
CN111079011A (en) | Deep learning-based information recommendation method | |
CN112711944B (en) | Word segmentation method and system, and word segmentation device generation method and system | |
CN118035408A (en) | Intelligent session method and system applicable to vertical field based on large model | |
CN116935411A (en) | Radical-level ancient character recognition method based on character decomposition and reconstruction | |
CN113535829B (en) | Training method and device of ranking model, electronic equipment and storage medium | |
CN116662566A (en) | Heterogeneous information network link prediction method based on contrast learning mechanism | |
CN115827990A (en) | Searching method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||