CN111382367B - Search result ordering method and device - Google Patents
Search result ordering method and device
- Publication number
- CN111382367B (application CN201811614255.1A)
- Authority
- CN
- China
- Prior art keywords
- data set
- items
- model
- ranking
- sorted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The embodiment of the application discloses a search result ranking method and device in which two ranking models are used to score and rank the plurality of items in the search result corresponding to a target keyword. After a first data set to be ranked is determined from the feature vectors corresponding to the items in the search result, the items are scored by a first ranking model according to the first data set to be ranked. A second data set to be ranked is then obtained from the first data set to be ranked and the scores so determined, the items are scored by a second ranking model according to the second data set to be ranked, and these scores serve as the basis for ranking the items. Because the second data set to be ranked includes the first ranking model's scores for the items, the second ranking model's scores reflect the strengths of both ranking models and can better meet the search requirements of different scenarios.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a method and apparatus for sorting search results.
Background
When a user searches for keywords through a search engine, the search engine can score and rank each item in the search results through a ranking model. Generally, the higher an item's score, the more relevant the item is to the keyword and the earlier it is presented, making it easier for the user to see.
Currently, a single ranking model is often used to rank the items in search results.
Different ranking models have different strengths, so the range of application of a single ranking model is limited and it is difficult to meet the search requirements of various scenarios.
Disclosure of Invention
In order to solve this technical problem, the application provides a search result ranking method and device in which the scores of the items reflect the strengths of two ranking models, so the search requirements of different scenarios can be better met.
The embodiment of the application discloses the following technical scheme:
in a first aspect, an embodiment of the present application provides a search result sorting method, where the method includes:
Determining a first data set to be ordered according to a search result corresponding to a target keyword, wherein the search result comprises a plurality of items, and the first data set to be ordered comprises feature vectors respectively corresponding to the items;
scoring the plurality of items respectively through a first ranking model according to the first data set to be ranked;
Determining a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
Scoring the plurality of items respectively through a second ranking model according to the second data set to be ranked;
and ranking the plurality of items according to the score obtained by the second ranking model.
Optionally, the first ordering model is trained according to a first labeled dataset and an unlabeled dataset; the second ordering model is trained according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained by scoring the second labeled data set according to the first ordering model.
Optionally, the first labeled data set and the second labeled data set are different labeled data sets.
Optionally, the unlabeled dataset is obtained by:
acquiring a plurality of unlabeled data obtained by searching according to the keywords;
and randomly extracting unlabeled data from the head and the tail of the sequence in which the plurality of unlabeled data are arranged, according to a preset proportion, to form the unlabeled data set.
Optionally, determining a second to-be-sorted data set according to the first to-be-sorted data set and the score obtained according to the first to-be-sorted data set includes:
adding the score obtained according to the first data set to be sorted into the feature vectors of the first data set to be sorted as a one-dimensional feature to obtain the second data set to be sorted; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item's score to the feature vector corresponding to the target item.
Optionally, the first ranking model and the second ranking model are compressed ranking models.
In a second aspect, an embodiment of the present application provides a search result ranking apparatus, where the apparatus includes a first determining unit, a first scoring unit, a second determining unit, a second scoring unit, and a ranking unit:
the first determining unit is configured to determine a first to-be-sorted data set according to a search result corresponding to a target keyword, where the search result includes a plurality of items, and the first to-be-sorted data set includes feature vectors corresponding to the plurality of items respectively;
The first scoring unit is used for scoring the plurality of items respectively through a first sorting model according to the first data set to be sorted;
the second determining unit is configured to determine a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
the second scoring unit is used for scoring the plurality of items respectively through a second sorting model according to the second data set to be sorted;
the ranking unit is used for ranking the plurality of items according to the scores obtained by the second ranking model.
Optionally, the first ordering model is trained according to a first labeled dataset and an unlabeled dataset; the second ordering model is trained according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained by scoring the second labeled data set according to the first ordering model.
Optionally, the first labeled data set and the second labeled data set are different labeled data sets.
Optionally, the apparatus further includes an acquisition unit:
The acquisition unit is used for acquiring a plurality of unlabeled data obtained by searching according to the keywords;
The acquisition unit is further used for randomly extracting unlabeled data from the head and the tail of the sequence in which the plurality of unlabeled data are arranged, according to a preset proportion, to form the unlabeled data set.
Optionally, the second determining unit is further configured to add the score obtained according to the first to-be-sorted data set as a one-dimensional feature to a feature vector of the first to-be-sorted data set to obtain the second to-be-sorted data set; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item's score to the feature vector corresponding to the target item.
Optionally, the first ranking model and the second ranking model are compressed ranking models.
In a third aspect, embodiments of the present application provide a search result ranking apparatus comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
Determining a first data set to be ordered according to a search result corresponding to a target keyword, wherein the search result comprises a plurality of items, and the first data set to be ordered comprises feature vectors respectively corresponding to the items;
scoring the plurality of items respectively through a first ranking model according to the first data set to be ranked;
Determining a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
Scoring the plurality of items respectively through a second ranking model according to the second data set to be ranked;
and ranking the plurality of items according to the score obtained by the second ranking model.
In a fourth aspect, embodiments of the present application provide a machine-readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform a search result ordering method as described in one or more of the first aspects.
According to the technical scheme, two ranking models, a first ranking model and a second ranking model, are used in scoring and ranking the plurality of items in the search result corresponding to the target keyword. After a first data set to be ranked is determined from the feature vectors corresponding to the items in the search result, the items are scored by the first ranking model according to the first data set to be ranked; a second data set to be ranked is then obtained from the first data set to be ranked and the scores so determined, the items are scored by the second ranking model according to the second data set to be ranked, and these scores serve as the basis for ranking the items. Because the second data set to be ranked includes the first ranking model's scores for the items, those scores carry the strengths the first ranking model exhibits in ranking. Therefore, when the second ranking model scores the items according to the second data set to be ranked, the first ranking model's scores serve as part of the basis for the second ranking model's scores, so the first model's ranking strengths are reflected in the second model's scores. The second ranking model's scores thus reflect the strengths of both ranking models, and the search requirements of different scenarios can be better met.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a search result ranking system according to an embodiment of the present application;
FIG. 2 is a method flow chart of a search result ordering method according to an embodiment of the present application;
FIG. 3 is a device structure diagram of a search result sorting device according to an embodiment of the present application;
FIG. 4 is a block diagram of a search result sorting device according to an embodiment of the present application;
Fig. 5 is a block diagram of a server according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
When a user searches for keywords through a search engine, the search engine can score and rank each item in the search results through a ranking model. The items in the search results may be pages, images, media files, etc. related to the keywords. At present, a single sorting model is adopted to score and sort items in search results. For example, in image searching, a search engine scores recalled images for relevance by a ranking model, presenting the user with the image most relevant to the keyword.
The learning of a ranking model depends mainly on feature and data construction. In practice, however, the commonly used models are mainly machine-learning ranking models such as SVM, GBDT, LR and LambdaMART, which are relatively simple, and the range of application of a single model is limited, making it difficult to meet the search requirements of various scenarios.
Therefore, the embodiment of the application provides a method and a device for ranking search results, applied to a processing device such as a terminal, a computer or a server. In scoring and ranking the plurality of items in the search result corresponding to the target keyword, two ranking models are used, namely a first ranking model and a second ranking model. The two ranking models may be deployed in different processing devices or in the same processing device.
For example, as shown in fig. 1, after determining the first to-be-ranked data set 101 according to feature vectors corresponding to a plurality of items in the search result 100, the plurality of items are scored by the first ranking model 200 according to the first to-be-ranked data set 101.
And then a second data set to be ranked 102 is obtained according to the first data set to be ranked 101 and the determined scores, and the multiple items are respectively scored according to the second data set to be ranked by the second ranking model 300, and the scores are used as ranking basis of the multiple items.
Since the second data set to be ranked 102 includes the first ranking model 200's score for each item in the search results, those scores carry the strengths that the first ranking model 200 exhibits in ranking. Therefore, when the second ranking model 300 scores the items according to the second data set to be ranked 102, the first ranking model 200's scores serve as part of the basis for the second ranking model 300's scores, so the ranking strengths of the first ranking model 200 are reflected in the second ranking model 300's scores. The second ranking model 300's scores thus reflect the ranking strengths of both models, and the search requirements of different scenarios can be better met.
In one possible implementation, based on the image search scenario and the actual performance of various ranking models, the embodiment of the application uses AI technology to fuse the recent IRGAN model (i.e., the first ranking model) with the DART model (i.e., the second ranking model), thereby improving the relevance of the images in the search results.
The DART model is an enhanced version of the LambdaMART model and overcomes the shortcomings of traditional MART-series models. The IRGAN model is a recent deep-learning ranking model that introduces GAN technology (previously used mainly for image generation) into the ranking problem.
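The patent does not name an implementation library; purely as an assumption, a DART-enhanced LambdaMART ranker of the kind described here could be configured with LightGBM's DART boosting mode, as in the following minimal sketch (the toy data and parameter values are illustrative only):

```python
import numpy as np
from lightgbm import LGBMRanker  # assumption: LightGBM as one possible DART/LambdaMART implementation

# Toy data: 8 recalled items split across 2 queries, 5-dimensional feature vectors.
X = np.random.rand(8, 5)
y = np.random.randint(0, 3, size=8)   # graded relevance labels (0 = irrelevant, 2 = highly relevant)
group = [4, 4]                        # number of items belonging to each query

ranker = LGBMRanker(
    boosting_type="dart",             # DART: dropout-regularized MART, an enhanced LambdaMART
    objective="lambdarank",           # listwise LambdaMART-style objective
    n_estimators=100,
    min_child_samples=1,              # small value only because the toy data set is tiny
)
ranker.fit(X, y, group=group)
scores = ranker.predict(X)            # per-item relevance scores used for ranking
```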
The method for sorting search results provided by the embodiment of the application is described below with reference to the accompanying drawings, as shown in fig. 2, and comprises the following steps:
S201: and determining a first data set to be ranked according to the search result corresponding to the target keyword.
The target keyword may be a keyword input by a user through a search engine, and the search engine searches with the target keyword to obtain a search result related to it. The search result includes a plurality of items, which may take the form of pages, images, media files, etc.; the search result corresponding to one target keyword may include one or more of these possible forms. For example, when the target keyword is "dog search", the items in the search results may include images, pages, media files, etc. related to "dog search".
In order to sort the plurality of items in the search results so as to determine the display sequence displayed to the user, a sorting model needs to be introduced to score the plurality of items, and the degree of relevance between the items and the target keywords can be reflected by the scoring. In order to score the items in the search results through the ranking model, the items in the search results need to be converted into corresponding feature vectors, wherein one item corresponds to one feature vector, and the feature vector of one item is used for reflecting the content carried by the item.
After each item in the search result is converted into a feature vector, the feature vector corresponding to each item is determined to be a first data set to be sorted, namely the first data set to be sorted comprises the feature vectors corresponding to a plurality of items in the search result.
S202: and respectively scoring the plurality of items through a first sorting model according to the first data set to be sorted.
By inputting the first data set to be ranked into the first ranking model, the first ranking model can determine the scores of the items corresponding to the feature vectors according to the feature vectors in the first data set to be ranked. For example, the feature vector corresponding to the item a is input into a first ranking model, and the first ranking model can determine a score of the item a, where the score can reflect the degree of correlation between the item a and the target keyword.
S203: and determining a second data set to be ordered according to the first data set to be ordered and the score obtained according to the first data set to be ordered.
After the first ranking model determines the scores of the items in the search result according to the first data set to be ranked, the items are not ranked directly according to these scores. Instead, the second data set to be ranked is determined from the first data set to be ranked and the scores obtained from it, so that the second data set to be ranked includes both the feature vectors corresponding to the items in the search result and the scores determined by the first ranking model, which carry the strengths the first ranking model exhibits in ranking.
In order to use the items' scores determined by the first ranking model as part of the scoring basis of the second ranking model, these scores need to be converted into feature-vector form, so that in the second data set to be ranked the feature vector corresponding to an item includes not only features reflecting the content originally carried by the item but also a feature reflecting the item's score determined by the first ranking model.
Embodiments of the present application do not limit how the above-described scores are converted into feature vectors.
In order to facilitate the calculation of the score, in a possible implementation manner, the score obtained according to the first to-be-sorted data set may be added as a one-dimensional feature to a feature vector of the first to-be-sorted data set to obtain the second to-be-sorted data set.
For any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item's score to the feature vector corresponding to the target item.
At least one feature may be included in the feature vector corresponding to a target item, where a feature may represent information of content carried by the target item in one dimension. In the method, the score of a target item is taken as the content carried by the target item, and the score is embodied in the feature vector corresponding to the target item through one-dimensional features. The feature vector formed by the method not only comprises the features for representing the content originally carried by the target item, but also comprises the features for representing the score of the target item determined by the first ranking model.
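As an illustration only (the patent does not specify a concrete data representation), the following is a minimal sketch of this step, assuming the first data set to be ranked is held as a NumPy matrix with one feature vector per row and the first ranking model's scores as a vector with one entry per item; the helper name `build_second_dataset` is hypothetical.

```python
import numpy as np

def build_second_dataset(first_dataset: np.ndarray, first_scores: np.ndarray) -> np.ndarray:
    """Append each item's first-model score to its feature vector as one extra dimension.

    first_dataset: shape (n_items, n_features), one feature vector per item.
    first_scores:  shape (n_items,), the first ranking model's score for each item.
    Returns an array of shape (n_items, n_features + 1), the second data set to be ranked.
    """
    return np.hstack([first_dataset, first_scores.reshape(-1, 1)])

# Example: 3 items with 4-dimensional feature vectors.
X1 = np.random.rand(3, 4)
s1 = np.array([0.9, 0.2, 0.5])
X2 = build_second_dataset(X1, s1)
assert X2.shape == (3, 5)
```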
S204: and respectively scoring the plurality of items through a second sorting model according to the second data set to be sorted.
By inputting the second data set to be ranked into the second ranking model, the second ranking model can determine the scores of the items corresponding to the feature vectors according to the feature vectors in the second data set to be ranked. For example, the feature vector corresponding to the item a is input into a second ranking model, and the second ranking model can determine a score of the item a, where the score can reflect the degree of correlation between the item a and the target keyword.
S205: and ranking the plurality of items according to the score obtained by the second ranking model.
When the second ranking model scores the items according to the second data set to be ranked, the first ranking model's scores serve as part of the basis for the second ranking model's scores, so the ranking strengths of the first ranking model are reflected in the second ranking model's scores, which therefore reflect the strengths of both the second ranking model and the first ranking model. Ranking the items according to the scores obtained by the second ranking model thus avoids the problems caused by a single ranking model and better meets the search requirements of different scenarios.
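For illustration, here is a minimal end-to-end sketch of steps S201 to S205, under the assumptions that feature extraction is available as a callable and that both ranking models expose a predict method returning one score per item; the function name `rank_search_results` and these interfaces are hypothetical, not prescribed by the patent.

```python
import numpy as np

def rank_search_results(items, extract_features, first_model, second_model):
    """Two-stage scoring and ranking of the items in a search result (S201-S205).

    items:            the items recalled for the target keyword.
    extract_features: callable mapping items -> (n_items, n_features) feature matrix.
    first_model, second_model: objects exposing predict(X) -> (n_items,) scores.
    """
    X1 = extract_features(items)                 # first data set to be ranked (S201)
    s1 = first_model.predict(X1)                 # scores from the first ranking model (S202)
    X2 = np.hstack([X1, s1.reshape(-1, 1)])      # second data set to be ranked (S203)
    s2 = second_model.predict(X2)                # scores from the second ranking model (S204)
    order = np.argsort(-s2)                      # rank items by descending second-model score (S205)
    return [items[i] for i in order], s2[order]
```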
It should be noted that, in order to shorten the time spent scoring and ranking the items in the search result, in one possible implementation the two ranking models may be compressed before ranking to reduce their processing time; in this implementation, the first ranking model and the second ranking model are compressed ranking models.
In order to better complement the advantages of the first sorting model and the second sorting model in sorting, the types of the first sorting model and the second sorting model can be selected in a targeted mode.
In one possible implementation, the first ranking model is trained from a first labeled dataset and an unlabeled dataset. The second ordering model is trained according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained by scoring the second labeled data set according to the first ordering model.
The labeled data set and the unlabeled data set are data sets comprising training data used for training a ranking model. The training data can be feature vectors obtained by converting items in possible forms such as pages, images and media files, and these pages, images, media files, etc. may be obtained by keyword search, by web crawling, from a third party, and so on.
The training data included in the labeled data set carry labels, which may be preset or annotated, e.g., manually. A label of a piece of training data may represent the degree of correlation between that training data and one or more keywords, where the one or more keywords may belong to the set of keywords involved in training the first ranking model.
It should be noted that, because the labels need to be preset or annotated in advance, e.g., manually, the amount of labeled training data that can be obtained is limited. Not only is the amount of unlabeled training data on the network far greater than that of labeled training data, but the unlabeled training data also carries useful information for training a ranking model.
Therefore, in order to increase the amount of training data and to introduce the useful information carried by unlabeled training data into the training of the ranking model, in the embodiment of the application the first ranking model is a semi-supervised model, for example an IRGAN model. The first ranking model can thus be trained on the first labeled data set and the unlabeled data set, so that the trained first ranking model benefits from the useful information carried by the unlabeled training data, its scoring accuracy is improved, and the cost of labeling training data in advance is reduced.
The training data in the unlabeled data set can be obtained by keyword search. In one possible implementation, the amount of unlabeled training data obtained by searching can reach the order of tens of thousands, so the unlabeled training data obtained by searching can be screened in a targeted manner, selecting a part of it to form the unlabeled data set.
Optionally, after acquiring the plurality of unlabeled data obtained by searching according to the keyword, unlabeled data may be randomly extracted, according to a preset proportion, from the head and the tail of the sequence in which the plurality of unlabeled data are arranged to form the unlabeled data set. The embodiment of the application does not limit the preset proportion; it can be, for example, 1:2, i.e., the ratio of the number of unlabeled training data extracted from the head to the number extracted from the tail is 1:2. Randomly extracting unlabeled data from the head and the tail according to a preset proportion to form the unlabeled data set can improve the training quality of the first ranking model.
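A minimal sketch of the head/tail extraction described above, assuming the unlabeled data are already arranged in their search order; the fractions treated as "head" and "tail", the total sample count, and the helper name `sample_unlabeled` are assumptions, and the 1:2 ratio is only the example mentioned in the text.

```python
import random

def sample_unlabeled(ranked_unlabeled, head_frac=0.3, tail_frac=0.3, ratio=(1, 2), total=3000, seed=0):
    """Randomly draw unlabeled data from the head and the tail of the ranked list.

    ranked_unlabeled: unlabeled items in the order returned by the search.
    head_frac, tail_frac: fractions of the list treated as "head" and "tail" (assumed values).
    ratio: preset head:tail proportion of drawn samples, e.g. (1, 2) as in the example above.
    total: total number of unlabeled samples to draw (assumed value).
    """
    rng = random.Random(seed)
    n = len(ranked_unlabeled)
    head = ranked_unlabeled[: int(n * head_frac)]
    tail = ranked_unlabeled[-int(n * tail_frac):]
    n_head = total * ratio[0] // (ratio[0] + ratio[1])
    n_tail = total - n_head
    return rng.sample(head, min(n_head, len(head))) + rng.sample(tail, min(n_tail, len(tail)))
```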
However, the training data of the first ranking model includes unlabeled training data which, although large in quantity and carrying some useful information, may also introduce a certain amount of noise into the scoring of items. To eliminate this noise, in one possible implementation the second ranking model is not a semi-supervised model; it may, for example, be a DART model, and the training data used to train it should be labeled training data. The first labeled data set and the second labeled data set used to train the first ranking model and the second ranking model may be the same labeled data set or different labeled data sets, which is not limited in the embodiment of the application.
In one possible implementation, the first labeled data set and the second labeled data set are different labeled data sets, so that the trained first and second ranking models generalize better, and in ranking the items in the search result the semi-supervised first ranking model and the second ranking model are better cascaded and fused, highlighting the strengths of both models.
In ranking the items in the search result, the semi-supervised first ranking model and the second ranking model are fused in a cascade, so the complementary strengths of the two models are exploited and the overfitting problem caused by a single model is reduced. In addition, the first ranking model is trained with unlabeled training data, introducing the useful information that such data carries, so the ranking effect is greatly improved.
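To make the cascade concrete, the following is a minimal training sketch under the assumptions that the first (semi-supervised) ranking model has already been trained on the first labeled data set plus the unlabeled data set and exposes a predict method, and that the second model is a learning-to-rank estimator whose fit method optionally accepts per-query group sizes; the helper name `train_cascade` and these interfaces are illustrative, not taken from the patent.

```python
import numpy as np

def train_cascade(first_model, second_model, X_labeled2, y_labeled2, group2=None):
    """Train the second ranking model on the second labeled data set plus the
    first ranking model's scores for that data set.

    first_model:  already trained on the first labeled data set and the unlabeled data set.
    second_model: an untrained learning-to-rank estimator with fit(...) and predict(...).
    X_labeled2:   (n_items, n_features) feature vectors of the second labeled data set.
    y_labeled2:   relevance labels of the second labeled data set.
    group2:       optional per-query group sizes, if the second model requires them.
    """
    s1 = first_model.predict(X_labeled2)              # score the second labeled set with the first model
    X2 = np.hstack([X_labeled2, s1.reshape(-1, 1)])   # append the score as one extra feature dimension
    if group2 is not None:
        second_model.fit(X2, y_labeled2, group=group2)
    else:
        second_model.fit(X2, y_labeled2)
    return second_model
```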
Based on the search result sorting method provided in the corresponding embodiment of fig. 2, the embodiment of the present application provides a search result sorting device, referring to fig. 3, where the device includes a first determining unit 301, a first scoring unit 302, a second determining unit 303, a second scoring unit 304, and a sorting unit 305:
The first determining unit 301 is configured to determine a first to-be-sorted data set according to a search result corresponding to a target keyword, where the search result includes a plurality of items, and the first to-be-sorted data set includes feature vectors corresponding to the plurality of items respectively;
the first scoring unit 302 is configured to score the plurality of items respectively through a first ranking model according to the first to-be-ranked data set;
the second determining unit 303 is configured to determine a second data set to be sorted according to the first data set to be sorted and a score obtained according to the first data set to be sorted;
The second scoring unit 304 is configured to score the plurality of items respectively through a second ranking model according to the second to-be-ranked data set;
the ranking unit 305 is configured to rank the plurality of items according to the scores obtained by the second ranking model.
Optionally, the first ordering model is trained according to a first labeled dataset and an unlabeled dataset; the second ordering model is trained according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained by scoring the second labeled data set according to the first ordering model.
Optionally, the first labeled data set and the second labeled data set are different labeled data sets.
Optionally, the apparatus further includes an acquisition unit:
The acquisition unit is used for acquiring a plurality of unlabeled data obtained by searching according to the keywords;
The acquisition unit is further used for randomly extracting unlabeled data from the head and the tail of the sequence in which the plurality of unlabeled data are arranged, according to a preset proportion, to form the unlabeled data set.
Optionally, the second determining unit is further configured to add the score obtained according to the first to-be-sorted data set as a one-dimensional feature to a feature vector of the first to-be-sorted data set to obtain the second to-be-sorted data set; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item's score to the feature vector corresponding to the target item.
Optionally, the first ranking model and the second ranking model are compressed ranking models.
It can be seen that, in scoring and ranking the plurality of items in the search result corresponding to the target keyword, two ranking models are used, namely a first ranking model and a second ranking model. After a first data set to be ranked is determined from the feature vectors corresponding to the items in the search result, the items are scored by the first ranking model according to the first data set to be ranked; a second data set to be ranked is then obtained from the first data set to be ranked and the scores so determined, the items are scored by the second ranking model according to the second data set to be ranked, and these scores serve as the basis for ranking the items. Because the second data set to be ranked includes the first ranking model's scores for the items, those scores carry the strengths the first ranking model exhibits in ranking. Therefore, when the second ranking model scores the items according to the second data set to be ranked, the first ranking model's scores serve as part of the basis for its scores, so the first model's ranking strengths are reflected in the second model's scores, which thus reflect the strengths of both ranking models and better meet the search requirements of different scenarios.
FIG. 4 is a block diagram illustrating a search result ordering apparatus 400 according to an example embodiment. For example, apparatus 400 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 4, apparatus 400 may include one or more of the following components: a processing component 402, a memory 404, a power supply component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
The processing component 402 generally controls the overall operation of the apparatus 400, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing element 402 may include one or more processors 420 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 402 can include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 may include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
Memory 404 is configured to store various types of data to support operations at device 400. Examples of such data include instructions for any application or method operating on the apparatus 400, contact data, phonebook data, messages, pictures, videos, and the like. The memory 404 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 406 provides power to the various components of the apparatus 400. The power supply components 406 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 400.
The multimedia component 408 includes a screen that provides an output interface between the device 400 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 400 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 400 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 404 or transmitted via the communication component 416. In some embodiments, audio component 410 further includes a speaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 414 includes one or more sensors for providing status assessments of various aspects of the apparatus 400. For example, the sensor assembly 414 may detect the on/off state of the device 400 and the relative positioning of components such as the display and keypad of the apparatus 400; the sensor assembly 414 may also detect a change in position of the apparatus 400 or of one of its components, the presence or absence of user contact with the apparatus 400, the orientation or acceleration/deceleration of the apparatus 400, and a change in the temperature of the apparatus 400. The sensor assembly 414 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 414 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 416 is configured to facilitate communication between the apparatus 400 and other devices in a wired or wireless manner. The apparatus 400 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 400 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
Fig. 5 is a schematic structural diagram of a server according to an embodiment of the present invention. The server 500 may vary considerably in configuration or performance and may include one or more central processing units (CPUs) 522 (e.g., one or more processors), memory 532, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 542 or data 544. The memory 532 and the storage medium 530 may be transitory or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processing unit 522 may be configured to communicate with the storage medium 530 and execute the series of instruction operations in the storage medium 530 on the server 500.
The server 500 may also include one or more power supplies 526, one or more wired or wireless network interfaces 550, one or more input/output interfaces 558, one or more keyboards 556, and/or one or more operating systems 541, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as memory 404, including instructions executable by processor 420 of apparatus 400 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer-readable storage medium storing instructions that, when executed by a processor of a mobile terminal, cause the terminal to perform a search result ordering method, the method comprising:
Determining a first data set to be ordered according to a search result corresponding to a target keyword, wherein the search result comprises a plurality of items, and the first data set to be ordered comprises feature vectors respectively corresponding to the items;
scoring the plurality of items respectively through a first ranking model according to the first data set to be ranked;
Determining a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
Scoring the plurality of items respectively through a second ranking model according to the second data set to be ranked;
and ranking the plurality of items according to the score obtained by the second ranking model.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by program instructions together with related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments; the storage medium may be at least one of the following media capable of storing program code: read-only memory (ROM), RAM, magnetic disk, optical disk, etc.
It should be noted that, in the present specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments refer to each other, and each embodiment focuses on its differences from the others. In particular, the apparatus and system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the technical scope of the present application should be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.
Claims (10)
1. A method of ranking search results, the method comprising:
Determining a first data set to be ordered according to a search result corresponding to a target keyword, wherein the search result comprises a plurality of items, and the first data set to be ordered comprises feature vectors respectively corresponding to the items;
scoring the plurality of items respectively through a first ranking model according to the first data set to be ranked; the first ordering model is trained according to a first labeled data set and a first unlabeled data set;
Determining a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
Scoring the plurality of items respectively through a second ranking model according to the second data set to be ranked; the second ordering model is obtained through training according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained through scoring the second labeled data set according to the first ordering model;
ranking the plurality of items according to the score derived by the second ranking model;
Determining a second data set to be sorted according to the first data set to be sorted and the score obtained according to the first data set to be sorted, comprising:
Adding the score obtained according to the first data set to be sorted into the feature vector of the first data set to be sorted as one-dimensional feature to obtain the second data set to be sorted; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item score to the feature vector corresponding to the target item.
2. The method of claim 1, wherein the first labeled data set and the second labeled data set are different labeled data sets.
3. The method of claim 1, wherein the unlabeled dataset is obtained by:
acquiring a plurality of unlabeled data obtained by searching according to the keywords;
and randomly extracting unlabeled data from the head and the tail of the sequence in which the plurality of unlabeled data are arranged, according to a preset proportion, to form the unlabeled data set.
4. The method of claim 1, wherein the first ranking model and the second ranking model are compressed ranking models.
5. A search result ranking apparatus, characterized in that the apparatus comprises a first determining unit, a first scoring unit, a second determining unit, a second scoring unit and a ranking unit:
the first determining unit is configured to determine a first to-be-sorted data set according to a search result corresponding to a target keyword, where the search result includes a plurality of items, and the first to-be-sorted data set includes feature vectors corresponding to the plurality of items respectively;
The first scoring unit is used for scoring the plurality of items respectively through a first sorting model according to the first data set to be sorted; the first ordering model is trained according to a first labeled data set and a first unlabeled data set;
the second determining unit is configured to determine a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
The second scoring unit is used for scoring the plurality of items respectively through a second sorting model according to the second data set to be sorted; the second ordering model is obtained through training according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained through scoring the second labeled data set according to the first ordering model;
The ranking unit is used for ranking the plurality of items according to the scores obtained by the second ranking model;
the second determining unit is further configured to add the score obtained according to the first to-be-sorted data set as a one-dimensional feature to a feature vector of the first to-be-sorted data set to obtain the second to-be-sorted data set; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item score to the feature vector corresponding to the target item.
6. The apparatus of claim 5, wherein the first labeled data set and the second labeled data set are different labeled data sets.
7. The apparatus according to claim 5, further comprising an acquisition unit:
The acquisition unit is used for acquiring a plurality of unlabeled data obtained by searching according to the keywords;
The acquisition unit is further used for randomly extracting unlabeled data from the head and the tail of the sequence in which the plurality of unlabeled data are arranged, according to a preset proportion, to form the unlabeled data set.
8. The apparatus of claim 5, wherein the first ranking model and the second ranking model are compressed ranking models.
9. A search result ordering apparatus comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
Determining a first data set to be ordered according to a search result corresponding to a target keyword, wherein the search result comprises a plurality of items, and the first data set to be ordered comprises feature vectors respectively corresponding to the items;
scoring the plurality of items respectively through a first ranking model according to the first data set to be ranked; the first ordering model is trained according to a first labeled data set and a first unlabeled data set;
Determining a second data set to be ordered according to the first data set to be ordered and a score obtained according to the first data set to be ordered;
Scoring the plurality of items respectively through a second ranking model according to the second data set to be ranked; the second ordering model is obtained through training according to a second labeled data set and the scores of the second labeled data set, and the scores of the second labeled data set are obtained through scoring the second labeled data set according to the first ordering model;
ranking the plurality of items according to the score derived by the second ranking model;
Determining a second data set to be sorted according to the first data set to be sorted and the score obtained according to the first data set to be sorted, comprising:
Adding the score obtained according to the first data set to be sorted into the feature vector of the first data set to be sorted as one-dimensional feature to obtain the second data set to be sorted; for any one target item in the plurality of items, the feature vector of the target item in the second data set to be sorted is a feature vector formed by adding the one-dimensional feature corresponding to the target item score to the feature vector corresponding to the target item.
10. A machine readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the search result ordering method of one or more of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811614255.1A CN111382367B (en) | 2018-12-27 | 2018-12-27 | Search result ordering method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811614255.1A CN111382367B (en) | 2018-12-27 | 2018-12-27 | Search result ordering method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111382367A CN111382367A (en) | 2020-07-07 |
CN111382367B true CN111382367B (en) | 2024-04-30 |
Family
ID=71220826
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811614255.1A Active CN111382367B (en) | 2018-12-27 | 2018-12-27 | Search result ordering method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111382367B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111563207B (en) * | 2020-07-14 | 2020-11-10 | 口碑(上海)信息技术有限公司 | Search result sorting method and device, storage medium and computer equipment |
- 2018-12-27: CN application CN201811614255.1A — patent CN111382367B (en), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017157040A1 (en) * | 2016-03-18 | 2017-09-21 | 北京搜狗科技发展有限公司 | Search method and device, and device used for searching |
CN106339756A (en) * | 2016-08-25 | 2017-01-18 | 北京百度网讯科技有限公司 | Training data generation method and device and searching method and device |
CN106570197A (en) * | 2016-11-15 | 2017-04-19 | 北京百度网讯科技有限公司 | Searching and ordering method and device based on transfer learning |
CN107977405A (en) * | 2017-11-16 | 2018-05-01 | 北京三快在线科技有限公司 | Data reordering method, data sorting device, electronic equipment and readable storage medium storing program for executing |
Non-Patent Citations (1)
Title |
---|
Research on the Influence of Metadata Description on Search Engine Ranking Results; Xing Bo; Modern Information (Issue 05); full text *
Also Published As
Publication number | Publication date |
---|---|
CN111382367A (en) | 2020-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11120078B2 (en) | Method and device for video processing, electronic device, and storage medium | |
US20210117726A1 (en) | Method for training image classifying model, server and storage medium | |
CN108227950B (en) | Input method and device | |
CN109918565B (en) | Processing method and device for search data and electronic equipment | |
CN111553372B (en) | Training image recognition network, image recognition searching method and related device | |
CN110764627B (en) | Input method and device and electronic equipment | |
CN107967271A (en) | A kind of information search method and device | |
CN110020106B (en) | Recommendation method, recommendation device and device for recommendation | |
CN106815291B (en) | Search result item display method and device and search result item display device | |
CN110110207B (en) | Information recommendation method and device and electronic equipment | |
CN110309324B (en) | Searching method and related device | |
CN111382339A (en) | Search processing method and device and search processing device | |
CN112784142A (en) | Information recommendation method and device | |
CN112307281A (en) | Entity recommendation method and device | |
CN111368161B (en) | Search intention recognition method, intention recognition model training method and device | |
CN111241844B (en) | Information recommendation method and device | |
CN109901726B (en) | Candidate word generation method and device and candidate word generation device | |
CN113918661A (en) | Knowledge graph generation method and device and electronic equipment | |
CN111382367B (en) | Search result ordering method and device | |
CN112825076B (en) | Information recommendation method and device and electronic equipment | |
CN110110046B (en) | Method and device for recommending entities with same name | |
CN117370586A (en) | Information display method and device, electronic equipment and storage medium | |
CN110147426B (en) | Method for determining classification label of query text and related device | |
CN112462992B (en) | Information processing method and device, electronic equipment and medium | |
CN110851624B (en) | Information query method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||