CN113360796A - Data sorting method and device, and data sorting model training method and device - Google Patents


Publication number
CN113360796A
Authority
CN
China
Prior art keywords
poi
sample
display
query word
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110552833.9A
Other languages
Chinese (zh)
Inventor
梁金升
何金薇
肖垚
蒋前程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202110552833.9A
Publication of CN113360796A
Legal status: Withdrawn (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data sorting method and device and a data sorting model training method and device. The data sorting method comprises the following steps: acquiring an initial POI list to be ranked and a query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs; ranking each POI in the initial POI list through a first ranking model to obtain a first POI list; re-ranking the top M POIs in the first POI list through a second ranking model, based on second features of those M POIs, to obtain a second POI list, wherein M is a positive integer less than or equal to L and the second features comprise at least display relevance features; and obtaining, based on the first POI list and the second POI list, a target POI list in which every POI of the initial POI list is ranked, wherein the top N POIs of the target POI list are identical to the top N POIs of the second POI list, the remaining POIs occupy positions N + 1 through L in the order in which they appear in the first POI list, and N is a positive integer less than or equal to M.

Description

Data sorting method and device, and data sorting model training method and device
Technical Field
The invention relates to the technical field of internet, in particular to a data sorting method and device, a data sorting model training method and device, electronic equipment and a storage medium.
Background
In a search scenario, after a user submits a search, the search result list is sorted in a specific way: according to the relevance between each search result item (for example, a POI, Point of Information) and the query (query term) entered by the user, POIs with high relevance are placed first, which improves the user experience.
At present, a ranking pipeline in a search scenario generally comprises four stages: recall, coarse ranking, fine ranking, and rule-based ranking. The fine-ranking model is the most complex and uses the richest features, including click-through-rate features in the POI, query, and cross dimensions, static features of the POI, cross features between the user and the POI, and so on. However, related technical schemes train the model mainly with clicks as the target, so after fine ranking the POIs with high click-through rates are placed at the top. A high click-through rate does not guarantee that a POI meets the user's needs, and the user experience suffers.
Disclosure of Invention
The embodiments of the invention provide a data sorting method and device, a data sorting model training method and device, an electronic device, and a storage medium, to solve the problem in the related art that ranking results for returned data in scenarios such as search are poor because the model is trained mainly with clicks as the target.
In order to solve the technical problem, the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a data sorting method applied to an electronic device provided with a data sorting model, where the data sorting model includes at least a first ranking model and a second ranking model, and the method includes:
acquiring an initial POI list to be ranked and a query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs, L is a positive integer greater than 1, and the query word combination comprises at least one query word;
ranking each POI in the initial POI list through the first ranking model to obtain a first POI list;
re-ranking the top M POIs in the first POI list through the second ranking model, based on second features of those M POIs, to obtain a second POI list, wherein M is a positive integer greater than 1 and less than or equal to L, the second features comprise at least display relevance features, and the display relevance features are obtained based on the matching relationship between the query words and the display information shown when a POI is displayed in a page;
and obtaining, based on the first POI list and the second POI list, a target POI list in which every POI of the initial POI list is ranked, wherein the top N POIs of the target POI list are identical to the top N POIs of the second POI list, the remaining POIs occupy positions N + 1 through L in the order in which they appear in the first POI list, and N is a positive integer greater than 1 and less than or equal to M.
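The way the target POI list is assembled from the two lists can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `two_stage_rank`, the use of plain score callables in place of trained models, and the assumption that POI identifiers are unique strings are all hypothetical.

```python
from typing import Callable, List, Sequence


def two_stage_rank(
    initial: Sequence[str],
    first_score: Callable[[str], float],
    second_score: Callable[[str], float],
    m: int,
    n: int,
) -> List[str]:
    """Build the target POI list from an initial list of L unique POI ids.

    first_score and second_score are hypothetical stand-ins for the first
    and second ranking models; higher scores rank earlier.
    """
    total = len(initial)
    if not (1 < n <= m <= total):
        raise ValueError("expected 1 < N <= M <= L")

    # Stage 1: rank all L POIs with the first ranking model.
    first_list = sorted(initial, key=first_score, reverse=True)

    # Stage 2: re-rank only the top M POIs with the second ranking model.
    second_list = sorted(first_list[:m], key=second_score, reverse=True)

    # The top N positions of the target list come from the second list.
    head = second_list[:n]
    head_ids = set(head)

    # Every other POI keeps its relative order from the first list.
    tail = [poi for poi in first_list if poi not in head_ids]
    return head + tail
```

With N smaller than M, a POI that the second model demotes out of the top N falls back to its first-list position among the remaining POIs, so the second model only ever decides the first N slots.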
Optionally, the data ranking model further includes a pure display relevance model, and the step of ranking each POI in the initial POI list through the first ranking model to obtain the first POI list includes:
ranking each POI in the initial POI list through a first ranking model based on first characteristics of each POI to obtain a first POI list, wherein the first characteristics at least comprise the display relevance characteristics and score characteristics, and the first ranking model is obtained through training of a plurality of manually labeled first POI samples with known first characteristics;
before the step of ranking each POI in the initial POI list through the first ranking model to obtain the first POI list, the method further includes:
acquiring display relevance characteristics of each POI, and acquiring scores of the display relevance characteristics of each POI as score characteristics of each POI through a pure display relevance model based on the display relevance characteristics;
the pure display relevance model is obtained by training a plurality of manually marked first POI samples with known display relevance characteristics.
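A minimal sketch of how the score feature could be derived and appended to the first features is shown below. The linear scorer, its weights, and the feature names are assumptions made purely for illustration; the source does not specify the form of the pure display relevance model.

```python
from typing import Dict

# Hypothetical weights standing in for a trained pure display relevance model.
PURE_MODEL_WEIGHTS: Dict[str, float] = {
    "title_match_count": 0.75,
    "title_query_coverage": 0.25,
}


def pure_display_relevance_score(display_features: Dict[str, float]) -> float:
    """Score a POI using its display relevance features only."""
    return sum(
        PURE_MODEL_WEIGHTS.get(name, 0.0) * value
        for name, value in display_features.items()
    )


def build_first_features(display_features: Dict[str, float]) -> Dict[str, float]:
    """First features = display relevance features plus the score feature."""
    features = dict(display_features)  # copy so the input is not mutated
    features["display_relevance_score"] = pure_display_relevance_score(display_features)
    return features
```

The point of the design is that the first ranking model sees both the raw display relevance features and a single summary score produced by a model trained only on manually labeled relevance.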
Optionally, before the step of obtaining the display relevance feature of each POI, and obtaining a score of the display relevance feature of each POI through a pure display relevance model based on the display relevance feature as a score feature of each POI, the method further includes:
acquiring the display information shown when each first POI sample corresponding to a query word combination sample is displayed in a page, together with a manually labeled score for each of the display relevance features, wherein the query word combination sample comprises at least one query word sample;
obtaining the display relevance features of each POI sample based on the matching relationship between each POI sample and the query word sample, to obtain a plurality of first POI samples with known display relevance features and manually labeled scores;
training the pure display relevance model based on the plurality of first POI samples with known display relevance features and manually labeled scores.
Optionally, before the step of reordering the top M POIs in the first POI list through a second ordering model based on the second features of the top M POIs in the first POI list to obtain a second POI list, the method further includes:
acquiring a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behaviors, wherein the user behaviors comprise at least one of clicking behaviors and ordering behaviors;
training the second ranking model by the first and second POI samples of known second features.
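One way to picture the training data for the second ranking model is as a single set that pools both kinds of samples. The sketch below only assembles that set; the `PoiSample` type, the `source` tag, and the feature names are hypothetical, and the choice of learning algorithm is left open because the source does not name one.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class PoiSample:
    features: Dict[str, float]  # second features, keyed by a hypothetical name
    label: int                  # 1 = positive sample, 0 = negative sample
    source: str                 # "manual" or "behavior"


def build_training_set(
    manual: Sequence[PoiSample],
    behavior: Sequence[PoiSample],
    feature_names: Sequence[str],
) -> Tuple[List[List[float]], List[int]]:
    """Pool manually labeled and behavior-labeled samples into (X, y)."""
    rows: List[List[float]] = []
    labels: List[int] = []
    for sample in list(manual) + list(behavior):
        # Missing features default to 0.0 (an assumption of this sketch).
        rows.append([sample.features.get(name, 0.0) for name in feature_names])
        labels.append(sample.label)
    return rows, labels
```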
Optionally, the step of obtaining a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behavior includes:
obtaining display information of each first POI sample corresponding to the query word combination sample when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
obtaining the display relevance features and the sample attribute of each first POI sample based, respectively, on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, to obtain a plurality of manually labeled first POI samples with known second features;
obtaining a POI sample list exposed to users, the user behavior data received by each second POI sample in the POI sample list, the query word combination sample corresponding to each second POI sample, and the display information shown when the second POI sample is exposed in a page;
respectively acquiring display correlation characteristics of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample, obtaining the sample attributes of each second POI sample, and obtaining a plurality of second POI samples with known second characteristics based on user behavior labeling.
Optionally, the step of obtaining, based on the user behavior data received from each second POI sample in the second POI sample list and the display relevance feature of each second POI sample, the sample attribute of each second POI sample, and obtaining a plurality of second POI samples based on the user behavior labels with known second features includes:
obtaining a score of a display relevance feature of each second POI sample through the pure display relevance model;
and for any second POI sample, correcting the sample attribute of the second POI sample based on the score of its display relevance features and the user behavior data it received.
Optionally, the step of correcting the sample attribute of any second POI sample based on the score of its display relevance features and the user behavior data it received comprises:
obtaining, from the second POI samples, user-order samples, user-click samples, and exposed-but-unclicked samples based on the user behavior data received by each second POI sample;
for any user-order sample, in response to the score of its display relevance features being greater than a first threshold, correcting the sample attribute of the user-order sample to a positive sample;
for any user-click sample, in response to the score of its display relevance features being greater than a second threshold, correcting the sample attribute of the user-click sample to a positive sample;
for any exposed-but-unclicked sample, in response to the score of its display relevance features being less than a third threshold, correcting the sample attribute of the exposed-but-unclicked sample to a negative sample.
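The three threshold rules above can be sketched as a single function. Several details here are assumptions rather than statements from the source: the threshold values, treating an order as taking priority over a click when both occurred, and leaving the attribute unchanged (returned as `None`) when no rule fires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BehaviorSample:
    ordered: bool         # the user placed an order on this exposed POI
    clicked: bool         # the user clicked this exposed POI
    display_score: float  # score from the pure display relevance model


def correct_sample_attribute(
    sample: BehaviorSample,
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
) -> Optional[int]:
    """Return 1 (positive), 0 (negative), or None when no rule applies."""
    if sample.ordered:
        # User-order sample: positive only if display relevance is high enough.
        return 1 if sample.display_score > first_threshold else None
    if sample.clicked:
        # User-click sample: positive only above the second threshold.
        return 1 if sample.display_score > second_threshold else None
    # Exposed-but-unclicked sample: negative only if display relevance is low.
    return 0 if sample.display_score < third_threshold else None
```

The effect is that behavior alone never decides a label: a clicked POI whose displayed text barely matches the query is not promoted to a positive sample, and an unclicked POI that matches the query well is not pushed to a negative one.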
Optionally, the second features further include some of the first features, and these partial first features include at least one of: the distance between the current querying user and the POI, the score of the POI, whether the avatar of the POI is a default avatar, the click-through rate in the query word combination dimension within the most recent specified time range, and the click-through rate in the cross dimension of query word combination and POI within the most recent specified time range.
Optionally, the display relevance features comprise: the number of times the query word combination is matched in each item of display information; the proportion of each item of display information taken up by the query word combination; the proportion of the query word combination made up of characters that match each item of display information, and the proportion of that display information made up of the same matching characters; for each query word in the query word combination, the proportion of the query word made up of characters that match each item of display information, and the proportion of that display information made up of the same matching characters; the number of items of display information in which the query word combination can be matched; the total number of items of display information in which each query word combination can be matched; and the proportion of characters in the query word combination that can be matched in the display information. The display information comprises at least one of a POI title, the title of the combination to which the POI belongs, a POI belonging to, the area to which the POI belongs, the POI category, the POI tag, and the POI recommendation reason.
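A few of these features can be sketched with plain string operations. The sketch below covers only a subset and makes several assumptions that the source does not confirm: the query word combination is formed by concatenating the query words, character matching is set-based, and the field and feature names are hypothetical.

```python
from typing import Dict, Sequence


def display_relevance_features(
    query_words: Sequence[str],
    display_info: Dict[str, str],
) -> Dict[str, float]:
    """Compute a subset of display relevance features per display field."""
    query = "".join(query_words)  # assumption: the combination is a concatenation
    features: Dict[str, float] = {}
    matched_fields = 0

    for field, text in display_info.items():
        # Number of times the whole query combination occurs in this field.
        count = text.count(query) if query else 0
        features[f"{field}_match_count"] = float(count)

        # Characters that appear in both the query and this field.
        shared = set(query) & set(text)
        query_hits = sum(1 for ch in query if ch in shared)
        text_hits = sum(1 for ch in text if ch in shared)

        # Share of the query covered by matching characters.
        features[f"{field}_query_coverage"] = query_hits / len(query) if query else 0.0
        # Share of the display text covered by matching characters.
        features[f"{field}_text_coverage"] = text_hits / len(text) if text else 0.0

        if count > 0:
            matched_fields += 1

    # Number of display fields in which the query combination is matched.
    features["matched_field_count"] = float(matched_fields)
    return features
```

For example, calling `display_relevance_features(["hot", "pot"], {"title": "hotpot house", "category": "bbq"})` yields one set of features for the title and one for the category, plus the count of fields that contain the full query.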
In a second aspect, an embodiment of the present invention provides a method for training a data ranking model, where the data ranking model includes at least a first ranking model and a second ranking model, and the method includes:
acquiring a plurality of first POI samples together with the query word combination, second features, and manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples together with the second features of each second POI sample and its sample attribute labeled based on user behavior; wherein the query word combination comprises at least one query word;
training the first ranking model through the plurality of first POI samples and the query word combination, second features, and manually labeled sample attribute corresponding to each first POI sample;
training the second ranking model by a plurality of the first POI samples and a plurality of the second POI samples for which second features and sample attributes are known;
the second feature at least comprises a display correlation feature, the display correlation feature is obtained based on a matching relationship between display information of the POI sample when the POI sample is displayed in a page and the query word, the user behavior comprises at least one of ordering behavior and clicking behavior of a user aiming at the second POI sample when the second POI sample is exposed, and the sample attribute comprises a positive sample and a negative sample.
Optionally, the data ranking model further includes a pure presentation relevance model, and before the step of training the first ranking model by using the plurality of first POI samples, the query word combinations corresponding to each of the first POI samples, the second features, and the manually labeled sample attributes, further includes:
training the pure display relevance model through a plurality of first POI samples with known display relevance features;
the step of training the first ranking model through a plurality of first POI samples, a query word combination corresponding to each of the first POI samples, a second feature, and a manually labeled sample attribute includes:
acquiring a first feature of each first POI sample, and training the first sequencing model based on the first feature and a query word combination of each first POI sample, wherein the first feature at least comprises the display relevance feature and a score feature;
and the score characteristic is the score of the display relevance characteristic of the first POI sample obtained by a pure display relevance model.
Optionally, the step of obtaining a plurality of first POI samples, a query word combination, a second feature, and a manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples, a second feature of each second POI sample, and a sample attribute labeled based on a user behavior includes:
obtaining display information of each first POI sample corresponding to the query word combination sample when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
obtaining the display relevance features and the sample attribute of each first POI sample based, respectively, on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, to obtain a plurality of manually labeled first POI samples with known second features;
obtaining a POI sample list exposed to users, the user behavior data received by each second POI sample in the POI sample list, the query word combination sample corresponding to each second POI sample, and the display information shown when the second POI sample is exposed in a page;
respectively acquiring display correlation characteristics of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample, obtaining the sample attributes of each second POI sample, and obtaining a plurality of second POI samples with known second characteristics based on user behavior labeling.
Optionally, the step of obtaining, based on the user behavior data received from each second POI sample in the second POI sample list and the display relevance feature of each second POI sample, the sample attribute of each second POI sample, and obtaining a plurality of second POI samples based on the user behavior labels with known second features includes:
obtaining a score of a display relevance feature of each second POI sample through the pure display relevance model;
and for any second POI sample, correcting the sample attribute of the second POI sample based on the score of its display relevance features and the user behavior data it received.
Optionally, the step of correcting the sample attribute of any second POI sample based on the score of its display relevance features and the user behavior data it received comprises:
obtaining, from the second POI samples, user-order samples, user-click samples, and exposed-but-unclicked samples based on the user behavior data received by each second POI sample;
for any user-order sample, in response to the score of its display relevance features being greater than a first threshold, correcting the sample attribute of the user-order sample to a positive sample;
for any user-click sample, in response to the score of its display relevance features being greater than a second threshold, correcting the sample attribute of the user-click sample to a positive sample;
for any exposed-but-unclicked sample, in response to the score of its display relevance features being less than a third threshold, correcting the sample attribute of the exposed-but-unclicked sample to a negative sample.
In a third aspect, an embodiment of the present invention provides a data sorting device, where a data sorting model is provided in the data sorting device, the data sorting model at least includes a first sorting model and a second sorting model, and the device includes:
a ranking data acquisition module, configured to acquire an initial POI list to be ranked and a query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs, L is a positive integer greater than 1, and the query word combination comprises at least one query word;
the first data sorting module is used for sorting each POI in the initial POI list through a first sorting model to obtain a first POI list;
the second data sorting module is used for re-sorting the first M POIs through a second sorting model based on second characteristics of the first M POIs in the first POI list to obtain a second POI list, wherein M is a positive integer which is greater than 1 and less than or equal to L, the second characteristics at least comprise display correlation characteristics, and the display correlation characteristics are obtained based on the matching relationship between display information of the POIs when the POIs are displayed in a page and the query words;
and the ranking result acquisition module is used for acquiring a target POI list obtained after each POI in the initial POI list is ranked based on the first POI list and the second POI list, wherein the first N POIs in the target POI list are consistent with the first N POIs in the second POI list, the rest POIs are sequentially arranged at the (N + 1) th to the L-th positions according to the sequence of the rest POIs in the first POI list, and N is a positive integer which is more than 1 and less than or equal to M.
Optionally, the first data sorting module includes:
the first data sorting sub-module is used for sorting each POI in the initial POI list through a first sorting model based on a first characteristic of each POI to obtain a first POI list, wherein the first characteristic at least comprises the display relevance characteristic and a score value characteristic, and the first sorting model is obtained through training of a plurality of manually labeled first POI samples with known first characteristics;
the device further includes:
the score feature acquisition module is used for acquiring the display relevance feature of each POI, and acquiring the score of the display relevance feature of each POI as the score feature of each POI through a pure display relevance model based on the display relevance feature;
the pure display relevance model is obtained by training a plurality of manually marked first POI samples with known display relevance characteristics.
Optionally, the apparatus further comprises:
a sample data acquisition module, configured to acquire the display information shown when each first POI sample corresponding to a query word combination sample is displayed in a page, together with a manually labeled score for each of the display relevance features, wherein the query word combination sample comprises at least one query word sample;
a manually labeled sample acquisition module, configured to obtain the display relevance features of each POI sample based on the matching relationship between each POI sample and the query word sample, to obtain a plurality of first POI samples with known display relevance features and manually labeled scores;
a pure display relevance model training module, configured to train the pure display relevance model based on the plurality of first POI samples with known display relevance features and manually labeled scores.
Optionally, the apparatus further comprises:
the second training sample acquisition module is used for acquiring a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behaviors, wherein the user behaviors comprise at least one of clicking behaviors and ordering behaviors;
and the second ranking model training module is used for training the second ranking model through the first POI sample and the second POI sample with known second characteristics.
Optionally, the second training sample obtaining module includes:
a sample display information acquisition sub-module, configured to acquire the display information shown when each first POI sample corresponding to the query word combination sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
a manually labeled sample acquisition sub-module, configured to obtain the display relevance features and the sample attribute of each first POI sample based, respectively, on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, to obtain a plurality of manually labeled first POI samples with known second features;
the exposure sample information acquisition sub-module is used for acquiring a POI sample list which is exposed to a user, user behavior data received by each second POI sample in the POI sample list, a query word combination sample corresponding to each second POI sample and display information when the second POI sample is exposed in a page;
the display correlation characteristic obtaining sub-module is used for respectively obtaining the display correlation characteristics of each second POI sample based on the matching relation between the display information of each second POI sample and the corresponding query word sample;
and the behavior labeling sample acquisition sub-module is used for acquiring the sample attribute of each second POI sample based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample to obtain a plurality of second POI samples with known second characteristics based on user behavior labeling.
Optionally, the behavior labeling sample obtaining sub-module includes:
a display relevance feature scoring unit, configured to obtain, through the pure display relevance model, a score of a display relevance feature of each second POI sample;
and a behavior labeling sample correction unit, configured to correct the sample attribute of any second POI sample based on the score of its display relevance features and the user behavior data it received.
Optionally, the behavior labeling sample correction unit is specifically configured to:
obtain, from the second POI samples, user-order samples, user-click samples, and exposed-but-unclicked samples based on the user behavior data received by each second POI sample;
for any user-order sample, in response to the score of its display relevance features being greater than a first threshold, correct the sample attribute of the user-order sample to a positive sample;
for any user-click sample, in response to the score of its display relevance features being greater than a second threshold, correct the sample attribute of the user-click sample to a positive sample;
for any exposed-but-unclicked sample, in response to the score of its display relevance features being less than a third threshold, correct the sample attribute of the exposed-but-unclicked sample to a negative sample.
Optionally, the second features further include some of the first features, and these partial first features include at least one of: the distance between the current querying user and the POI, the score of the POI, whether the avatar of the POI is a default avatar, the click-through rate in the query word combination dimension within the most recent specified time range, and the click-through rate in the cross dimension of query word combination and POI within the most recent specified time range.
Optionally, the display relevance features comprise: the number of times the query word combination is matched in each item of display information; the proportion of each item of display information taken up by the query word combination; the proportion of the query word combination made up of characters that match each item of display information, and the proportion of that display information made up of the same matching characters; for each query word in the query word combination, the proportion of the query word made up of characters that match each item of display information, and the proportion of that display information made up of the same matching characters; the number of items of display information in which the query word combination can be matched; the total number of items of display information in which each query word combination can be matched; and the proportion of characters in the query word combination that can be matched in the display information. The display information comprises at least one of a POI title, the title of the combination to which the POI belongs, a POI belonging to, the area to which the POI belongs, the POI category, the POI tag, and the POI recommendation reason.
In a fourth aspect, an embodiment of the present invention further provides an apparatus for training a data ranking model, where the data ranking model includes at least a first ranking model and a second ranking model, and the apparatus includes:
a training sample acquisition module, configured to obtain a plurality of first POI samples together with the query word combination, second features, and manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples together with the second features of each second POI sample and its sample attribute labeled based on user behavior; wherein the query word combination comprises at least one query word;
a first ranking model training module, configured to train the first ranking model with the plurality of first POI samples and the query word combination, second features, and manually labeled sample attribute corresponding to each first POI sample;
a second ranking model training module, configured to train the second ranking model with a plurality of the first POI samples and a plurality of the second POI samples whose second features and sample attributes are known;
wherein the second features comprise at least display relevance features, the display relevance features are obtained based on a matching relationship between the query words and the display information of a POI sample when it is displayed in a page, the user behavior comprises at least one of an order-placing behavior and a click behavior of a user on the second POI sample when the second POI sample is exposed, and the sample attribute is either a positive sample or a negative sample.
Optionally, the data sorting model further includes a pure display relevance model, and the apparatus further includes:
a pure display relevance model training module, configured to train the pure display relevance model with a plurality of first POI samples with known display relevance features;
the first ranking model training module comprises:
a first ranking model training sub-module, configured to obtain first features of each first POI sample, and train the first ranking model based on the first features of each first POI sample and the query word combination, wherein the first features comprise at least the display relevance features and a score feature;
and the score feature is the score of the display relevance features of the first POI sample obtained through the pure display relevance model.
Optionally, the training sample obtaining module includes:
a sample display information acquisition sub-module, configured to obtain display information of each first POI sample, corresponding to a query word combination sample, when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
a manually labeled sample acquisition sub-module, configured to obtain the display relevance features and the sample attribute of each first POI sample, based respectively on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, so as to obtain a plurality of manually labeled first POI samples with known second features;
an exposure sample information acquisition sub-module, configured to obtain a POI sample list that has been exposed to users, the user behavior data received by each second POI sample in the POI sample list, and the query word combination sample corresponding to each second POI sample and its display information when exposed in a page;
a display relevance feature acquisition sub-module, configured to obtain the display relevance features of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
and a behavior labeled sample acquisition sub-module, configured to obtain the sample attribute of each second POI sample based on the user behavior data received by each second POI sample in the POI sample list and the display relevance features of each second POI sample, so as to obtain a plurality of second POI samples, labeled based on user behavior, with known second features.
Optionally, the behavior labeling sample obtaining sub-module includes:
a display relevance feature scoring unit, configured to obtain, through the pure display relevance model, a score of the display relevance features of each second POI sample;
and a behavior labeled sample correction unit, configured to correct the sample attribute of any second POI sample based on the score of the display relevance features of that second POI sample and the user behavior data it received.
Optionally, the behavior labeled sample correction unit is specifically configured for:
obtaining user order samples, user click samples, and exposed but not clicked samples among the second POI samples, based on the user behavior data received by each second POI sample;
for any user order sample, in response to the score of the display relevance features of the user order sample being greater than a first threshold, correcting the sample attribute of the user order sample to a positive sample;
for any user click sample, in response to the score of the display relevance features of the user click sample being greater than a second threshold, correcting the sample attribute of the user click sample to a positive sample;
for any exposed but not clicked sample, in response to the score of the display relevance features of that sample being less than a third threshold, correcting the sample attribute of the exposed but not clicked sample to a negative sample.
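The threshold-based correction described above can be sketched as follows. This is only an illustrative sketch: the function name, the behavior strings, and the threshold parameter names are hypothetical, and returning `None` for a sample that does not satisfy its threshold (so that it can be filtered out) is an assumption, since the text here only states what happens when the condition holds.

```python
from typing import Optional

POSITIVE = "positive"
NEGATIVE = "negative"


def correct_sample_attribute(
    behavior: str,
    relevance_score: float,
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
) -> Optional[str]:
    """Return the corrected sample attribute for one second POI sample.

    `relevance_score` is the score of the sample's display relevance
    features, as output by the pure display relevance model.
    Returning None when the threshold condition fails is an assumption.
    """
    if behavior == "order":
        # User order sample: positive only if relevance exceeds the first threshold.
        return POSITIVE if relevance_score > first_threshold else None
    if behavior == "click":
        # User click sample: positive only if relevance exceeds the second threshold.
        return POSITIVE if relevance_score > second_threshold else None
    if behavior == "exposed_not_clicked":
        # Exposed but not clicked: negative only if relevance is below the third threshold.
        return NEGATIVE if relevance_score < third_threshold else None
    raise ValueError(f"unknown behavior: {behavior!r}")
```

For instance, with hypothetical thresholds 0.3, 0.5, and 0.2, a clicked sample scored 0.7 would be kept as positive, while a clicked sample scored 0.4 would be left uncorrected by this sketch.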
In a fifth aspect, an embodiment of the present invention further provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data sorting method according to the first aspect and/or the steps of the data sorting model training method according to the second aspect.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the data sorting method according to the first aspect and/or the steps of the data sorting model training method according to the second aspect.
In the embodiment of the invention, an initial POI list to be ranked and a query word combination corresponding to the initial POI list are obtained, wherein the initial POI list comprises L POIs; each POI in the initial POI list is ranked by a first ranking model to obtain a first POI list; the top M POIs in the first POI list are re-ranked by a second ranking model based on their second features to obtain a second POI list, wherein M is a positive integer less than or equal to L, and the second features comprise at least display relevance features; and a target POI list, in which every POI of the initial POI list is ranked, is obtained based on the first POI list and the second POI list, wherein the top N POIs in the target POI list are consistent with the top N POIs in the second POI list, the remaining POIs are arranged at the (N + 1)-th to L-th positions in the order they have in the first POI list, and N is a positive integer less than or equal to M. In the embodiment of the invention, the second ranking model is trained with manual labels as the target and with display relevance features as input, which strengthens the influence of user-perceived elements on the ranking. In addition, corrected samples labeled based on user behavior are mixed into the training data, and only the top N positions are affected online, so that the online click-through rate remains essentially stable while user-perceived relevance is enhanced.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of the steps of a data sorting method in an embodiment of the present invention;
FIG. 2 is a flow chart of steps of another data sorting method in an embodiment of the present invention;
FIG. 3 is a flow chart of the steps of a method for training a data ordering model in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data sorting apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another data sorting apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data sorting model training apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a hardware structure of an electronic device in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a flowchart illustrating steps of a data sorting method according to an embodiment of the present invention is shown.
Step 110, obtaining an initial POI list to be ranked and a query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs, L is a positive integer greater than 1, and the query word combination comprises at least one query word;
Step 120, ranking each POI in the initial POI list by a first ranking model to obtain a first POI list;
Step 130, re-ranking the top M POIs in the first POI list by a second ranking model, based on second features of the top M POIs, to obtain a second POI list, wherein M is a positive integer greater than 1 and less than or equal to L, the second features comprise at least display relevance features, and the display relevance features are obtained based on a matching relationship between the query words and the display information of the POIs when displayed in a page;
Step 140, obtaining, based on the first POI list and the second POI list, a target POI list in which every POI of the initial POI list is ranked, wherein the top N POIs in the target POI list are consistent with the top N POIs in the second POI list, the remaining POIs are arranged at the (N + 1)-th to L-th positions in the order they have in the first POI list, and N is a positive integer greater than 1 and less than or equal to M.
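Steps 120 to 140 can be sketched as below. This is a minimal illustration rather than the implementation of the embodiment: the two ranking models are stood in for by plain scoring callables, and the names `rank_pois`, `first_score`, and `second_score` are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")


def rank_pois(
    pois: Sequence[P],
    first_score: Callable[[P], float],
    second_score: Callable[[P], float],
    m: int,
    n: int,
) -> List[P]:
    """Fine-rank all POIs, re-rank the top M, and let only the new top N change."""
    if not (1 <= n <= m <= len(pois)):
        raise ValueError("expected 1 <= N <= M <= L")
    # Step 120: first POI list, as indices ordered by the first model's score.
    first_order = sorted(range(len(pois)), key=lambda i: first_score(pois[i]), reverse=True)
    # Step 130: second POI list, the top M re-ranked by the second model's score.
    second_order = sorted(first_order[:m], key=lambda i: second_score(pois[i]), reverse=True)
    # Step 140: the top N come from the second list; every other POI keeps
    # its relative order from the first list and fills positions N+1..L.
    head = second_order[:n]
    head_set = set(head)
    tail = [i for i in first_order if i not in head_set]
    return [pois[i] for i in head + tail]
```

Because only the N positions at the head are taken from the second list, a POI that the second model demotes inside the top M still falls back to its first-list order rather than to its re-ranked position.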
As described above, an existing ranking solution generally includes four parts: recall, coarse ranking, fine ranking, and rule-based ranking. The fine-ranking model is the most complex and uses the richest features, including click-through rate features in the POI, query, and cross dimensions, static features of the POI, cross features of the user and the POI, and so on. However, the existing ranking model lacks features relating the query to what is displayed on the page, and it is trained mainly on click-through rate. The POIs ranked at the top are therefore those with a high click-through rate, which may not fully meet the user's needs and can harm the user experience; moreover, manually labeled targets and user-perceived display relevance features cannot be applied directly and effectively on top of the existing scheme. In addition, the first ranking model used for fine ranking is currently trained mainly with clicks as the target, with distance, merchant score, and the like optionally added as auxiliary losses. But many factors influence whether a user clicks a POI, so a POI with a high click-through rate is not necessarily strongly related to the user's query, which introduces a certain bias. The fine-ranking model also lacks the POI-related page display features that the user can perceive, and because fine ranking already uses nearly 700 features, adding these features directly to the fine-ranking model has little effect on the final target.
Therefore, an embodiment of the present invention provides a data sorting method. Specifically, after each POI in a POI list is finely ranked by a first ranking model, a pre-trained second ranking model re-ranks the top M fine-ranking results (for example, the top M POIs) using their second features. Because the top results are the most closely tied to the target, the re-ranking affects only the ranking of the top N POIs, where N is a positive integer greater than 1 and less than or equal to M. In different application scenarios, the POI may be a different type of data: in a take-away scenario, the POI may be a take-away merchant queried by the user; in an article retrieval scenario, a document queried by the user; in an address book retrieval scenario, contact information queried by the user; in a database scenario, document data in the database queried by the user; and so on.
The second features used by the second ranking model may include display relevance features perceived by the user, which may be obtained based on the matching relationship between the display information of a POI when it is displayed in the page and the corresponding query words. Taking these user-perceived features into account when re-ranking the top M POIs strengthens the influence of user-perceived elements on the ranking and improves how well the ranking result matches the user's needs.
The display information of a POI may include any relevant information exposed when the POI is displayed in a page, such as the title of the POI, the category to which the POI belongs, the area to which the POI belongs (e.g., the business district to which the POI belongs), a tag of the POI, a reason for recommending the POI, and so on. The matching relationship between the display information and the corresponding query words may include the matching relationship between the display information and the whole query word combination, or the matching relationship between the display information and each query word in that combination. The manner of deriving the display relevance features from the matching relationship may also be customized as required, which is not limited in the embodiments of the present invention.
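As one illustration of how such matching relationships might be turned into numeric features, the sketch below computes a few character-level statistics per item of display information. It is an assumption-laden example: the function name, the feature names, the choice to form the query word combination by simple concatenation, and the character-overlap definition of "matching" are all hypothetical, since the embodiment leaves the derivation open.

```python
from typing import Dict, List


def display_relevance_features(
    query_words: List[str], display_info: Dict[str, str]
) -> Dict[str, float]:
    """Compute simple match statistics between query words and display fields."""
    combo = "".join(query_words)  # assumed: the combination is the concatenation
    features: Dict[str, float] = {}
    matched_fields = 0
    for field, text in display_info.items():
        # Number of times the whole query word combination occurs in this field.
        features[f"{field}_combo_match_count"] = float(text.count(combo)) if combo else 0.0
        # Proportion of the combination's characters that also occur in this field.
        hits_in_query = sum(1 for ch in combo if ch in text)
        features[f"{field}_query_char_ratio"] = hits_in_query / len(combo) if combo else 0.0
        # Proportion of this field's characters that also occur in the combination.
        hits_in_field = sum(1 for ch in text if ch in combo)
        features[f"{field}_field_char_ratio"] = hits_in_field / len(text) if text else 0.0
        # A field counts as matched if any single query word occurs in it.
        if any(word and word in text for word in query_words):
            matched_fields += 1
    # Number of display fields matched by at least one query word.
    features["matched_field_count"] = float(matched_fields)
    return features
```

A caller would pass, for example, `{"title": ..., "category": ..., "tag": ...}` as `display_info`; any field exposed on the page could be added in the same way.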
In addition, the first ranking model and the second ranking model may each be any machine learning model, and they may be of different types or of the same type, which is not limited in this embodiment of the present invention. For example, to improve the accuracy of the ranking result, the first ranking model may be a machine learning model with higher accuracy, and to reduce the size of the data sorting model, the second ranking model may be a lightweight machine learning model, such as an XGB (eXtreme Gradient Boosting) model.
The values of M and N may also be customized as required, which is not limited in the embodiment of the present invention. For example, M may be set to 10 and N may be set to 1.
Moreover, the initial POI list to be ranked may be an unranked POI list or a coarsely ranked POI list, which is not limited in this embodiment of the present invention. When the POIs in the initial POI list are ranked by the first ranking model, the POI features that are taken into account may be customized as required; for example, any existing ranking-related features may be used for fine ranking.
For example, assume that the existing ranking solution includes four parts, i.e., recall, coarse ranking (L1), fine ranking (L2), and rule-based ranking (L3), where L1, L2, and L3 belong to the ranking process. The technical solution in the embodiment of the present invention then improves the existing solution at the L2 layer (i.e., the first ranking model) and the L3 layer (i.e., the second ranking model). During online prediction, the POI list enters the trained first ranking model after L1 coarse ranking. The top M positions of the first POI list produced by the first ranking model are then re-ranked by the second ranking model: their second features are obtained and input into the second ranking model for scoring, the N highest-scoring POIs among the M POIs are placed at the top, and the remaining POIs are placed at the (N + 1)-th to L-th positions in their original order in the first POI list, which completes the L3 ranking. Of course, after the second ranking model finishes ranking, a further ranking rule may be applied to re-rank its results, which is not limited in this embodiment of the present invention.
Besides manually labeled samples, the training samples of the second ranking model may include samples labeled based on user behavior that have been corrected through score filtering by a separate pure display relevance model. These corrected samples let the model learn the preferences expressed by users' search behavior without conflicting with the preferences expressed by manual labels. The display relevance features and a related auxiliary loss may also be added to the first ranking model, which improves the manual labeling metrics to some extent.
Referring to FIG. 2, in another embodiment, the data sorting model may further include a pure display relevance model, and step 120 may further include:
Step 121, ranking each POI in the initial POI list by the first ranking model based on first features of each POI to obtain a first POI list, wherein the first features comprise at least the display relevance features and the score feature, and the first ranking model is trained on a plurality of manually labeled first POI samples with known first features.
Accordingly, the method may further comprise:
Step 10, obtaining the display relevance features of each POI and, based on them, obtaining through a pure display relevance model a score of the display relevance features of each POI as the score feature of that POI; the pure display relevance model is trained on a plurality of manually labeled first POI samples with known display relevance features.
In the embodiment of the invention, to improve the accuracy of the first ranking model in fine ranking and to strengthen the influence of the display relevance features on the fine-ranking result, the pure display relevance model computes a score from the display relevance features. This score serves as a super feature and is passed to the first ranking model for scoring, together with the display relevance features and the original features required by the first ranking model.
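The way the score is stacked onto the first ranking model's input can be sketched as follows. The sketch is illustrative only: `build_first_features`, the key `display_relevance_score`, and the toy scorer are hypothetical stand-ins, and the toy scorer (a clipped mean) is not the trained pure display relevance model of the embodiment.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def build_first_features(
    original: Features,
    display_relevance: Features,
    pure_display_model: Callable[[Features], float],
) -> Features:
    """Assemble the first ranking model's input for one POI."""
    # The pure display relevance model sees only the display relevance features.
    score = pure_display_model(display_relevance)
    first_features = dict(original)           # original fine-ranking features
    first_features.update(display_relevance)  # plus the display relevance features
    first_features["display_relevance_score"] = score  # plus the score as a super feature
    return first_features


def toy_pure_display_model(display_relevance: Features) -> float:
    """Stand-in scorer (assumption): mean feature value clipped to [0, 1]."""
    if not display_relevance:
        return 0.0
    mean = sum(display_relevance.values()) / len(display_relevance)
    return max(0.0, min(1.0, mean))
```

Keeping the scorer behind a callable means the same assembly code could be used with whatever trained model is chosen, for example a gradient-boosted tree model.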
During offline training, the pure display relevance model may be trained using manually labeled POI samples with display relevance features. For the training samples of the first ranking model, the score predicted by the pure display relevance model may be used as a super feature. In addition, when the loss function of the first ranking model is set, various features of the manually labeled POI samples may be taken into account, such as the text similarity between the query (the query word or query word combination) and the POI name (the title of the POI recalled by the query word, i.e., the POI title), the proportion of characters in the query that match the display information of the POI sample, and the score of the display relevance features output by the pure display relevance model. That is, weights for these features are set in the loss function of the first ranking model, so that the first ranking model learns the importance of the relevant features. Accordingly, these features may also be included in the first features.
When the pure display relevance model and the first ranking model are trained, information such as the sample attribute of each POI sample and the score of its display relevance features may be obtained through manual labeling by the relevant personnel. For example, an operator may simulate an online query request, take the top 5 POIs as POI samples, and label each POI sample, in decreasing order of relevance between its display information in the page and the query, as "like", "general", or "dislike"; "like" may then be taken as a positive sample, and "general" and "dislike" as negative samples, and so on. In the embodiment of the present invention, to obtain the features of each POI sample, the relevant data of the POI sample may be retrieved based on its query, city ID, POI ID, geohash6 (a hash of the location, e.g., of its latitude and longitude), and other features calculated offline, for use in training the model.
Optionally, in another embodiment, before the step 10, the method may further include:
S1, obtaining display information of each first POI sample, corresponding to a query word combination sample, when the first POI sample is displayed in a page, and a manually labeled score of the display relevance features of each first POI sample, wherein the query word combination sample comprises at least one query word sample;
S2, obtaining the display relevance features of each first POI sample based on the matching relationship between each first POI sample and the query word sample, to obtain a plurality of first POI samples with known display relevance features and manually labeled scores;
S3, training the pure display relevance model based on the plurality of first POI samples with known display relevance features and manually labeled scores.
In the embodiment of the present invention, to train the pure display relevance model, the display information of each first POI sample, corresponding to a query word combination sample, when displayed in a page may be obtained in advance. The query word samples contained in the query word combination sample may be customized as required, which is not limited in the embodiment of the present invention. The display relevance features of each first POI sample can then be obtained based on the matching relationship between each first POI sample and the query word sample. Combined with the manually labeled score of the display relevance features of each first POI sample, this yields a plurality of first POI samples with known display relevance features and manually labeled scores, which are used to train the pure display relevance model in the ranking model.
For example, the user may manually label each first POI sample as preference levels such as "like", "general", and "dislike", and further label the score of the display relevance feature of each first POI sample according to each preference level, for example, the score of the "like" first POI sample is labeled as 1, the score of the "general" first POI sample is labeled as 0.5, the score of the "dislike" first POI sample is labeled as 0, and so on.
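The example mapping from preference levels to scores, together with the positive/negative split mentioned earlier for "like" versus "general" and "dislike", can be written down directly. The dictionary and function names below are hypothetical; only the three labels and their example values come from the text.

```python
# Example scores from the text: "like" -> 1, "general" -> 0.5, "dislike" -> 0.
LABEL_SCORE = {"like": 1.0, "general": 0.5, "dislike": 0.0}


def label_to_score(label: str) -> float:
    """Manually labeled score of a first POI sample's display relevance features."""
    if label not in LABEL_SCORE:
        raise ValueError(f"unknown label: {label!r}")
    return LABEL_SCORE[label]


def label_to_attribute(label: str) -> str:
    """Sample attribute: "like" is positive; "general" and "dislike" are negative."""
    if label not in LABEL_SCORE:
        raise ValueError(f"unknown label: {label!r}")
    return "positive" if label == "like" else "negative"
```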
Moreover, in the embodiment of the present invention, to improve the accuracy of the pure display relevance model, the model may be updated periodically; for example, it may be retrained once a week, each time using the first POI samples manually labeled in the most recent week. The pure display relevance model may also be any machine learning model, and the embodiment of the present invention is not limited thereto. For example, the pure display relevance model may be the XGB model described above, and so on.
In addition, the first ranking model and the second ranking model, like the pure display relevance model, may be updated periodically. For example, the ranking model may be updated once a week, each time being trained with the POI samples of the most recent week, and so on.
Referring to fig. 2, in another embodiment, before the step 130, the method may further include:
T1, obtaining a plurality of manually labeled first POI samples with known second features and a plurality of second POI samples labeled based on user behavior, wherein the user behavior comprises at least one of a click behavior and an order-placing behavior;
T2, training the second ranking model with the first POI samples and the second POI samples whose second features are known.
In practical applications, manually labeled POI samples describe the relevance between the query and the POI from the perspective of the user's subjective perception, but manual labeling is costly, so the volume of manually labeled POI samples is small. Other simple, direct ways of using the manual labels, for example as an auxiliary loss term, do not achieve a good training effect.
To solve the above problem, in the embodiment of the present invention, a second ranking model is trained to re-rank the top M results ranked by the first ranking model. Because the top results are the most closely tied to the target, the re-ranked results affect only the top N POIs. The second features used by the second ranking model include display relevance features perceived by the user, and besides the manually labeled POI samples, POI samples labeled based on user behavior may also be added to the training samples of the second ranking model, so that the model can learn the preferences expressed by users' actual search behavior.
Specifically, a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behavior may be obtained, and then the second ranking model may be trained by the first POI samples with known second characteristics and the second POI samples.
The user behavior may be an operation that a user performs on the second POI sample when the second POI sample is actually exposed, and may include, but is not limited to, a click behavior, an order-placing behavior, a like behavior, a favorite behavior, a save behavior, a download behavior, a forward behavior, and the like.
For example, a POI sample that has not received any further user behavior after exposure may be set as a negative sample, and a POI sample that has received user behavior after exposure may be set as a positive sample; or, the POI sample after exposure and receiving the user behavior of the specified type may also be set as a positive sample, otherwise, as a negative sample; and so on.
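The two labeling rules in this example can be sketched in a few lines. The function name, the behavior strings, and the `required_types` parameter are hypothetical; they only illustrate the choice between "any behavior counts" and "only specified behavior types count".

```python
from typing import Iterable, Optional, Set


def attribute_from_behavior(
    behaviors: Iterable[str], required_types: Optional[Set[str]] = None
) -> str:
    """Label an exposed POI sample from the user behaviors it received.

    With required_types=None, any received behavior makes the sample positive.
    Otherwise only behaviors of the specified types count.
    """
    received = set(behaviors)
    if required_types is None:
        return "positive" if received else "negative"
    return "positive" if received & required_types else "negative"
```

For example, `attribute_from_behavior(["click"], {"order"})` treats a sample that was clicked but not ordered as negative, whereas the default rule would treat it as positive.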
In addition, in the embodiment of the present invention, in order to distinguish between the manually labeled POI sample and the POI sample labeled based on the user behavior, the manually labeled POI sample may be defined as a first POI sample, and the POI sample labeled based on the user behavior may be defined as a second POI sample.
Moreover, adding the display relevance features and a related auxiliary loss at the first ranking model layer improves the manual labeling metrics to some extent.
Optionally, in another embodiment, the step T1 may further include:
T11, obtaining display information of each first POI sample, corresponding to a query word combination sample, when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
T12, obtaining the display relevance features and the sample attribute of each first POI sample, based respectively on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, to obtain a plurality of manually labeled first POI samples with known second features;
T13, obtaining a POI sample list that has been exposed to users, the user behavior data received by each second POI sample in the POI sample list, and the query word combination sample corresponding to each second POI sample and its display information when exposed in a page;
T14, obtaining the display relevance features of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
and T15, obtaining the sample attribute of each second POI sample based on the user behavior data received by each second POI sample in the POI sample list and the display relevance features of each second POI sample, to obtain a plurality of second POI samples, labeled based on user behavior, with known second features.
In an embodiment of the invention, to train the first ranking model and/or the second ranking model, the display information of each first POI sample, corresponding to a query word combination sample, when displayed in the page may be obtained in advance. The query word samples contained in the query word combination sample may be customized as required, which is not limited in the embodiment of the present invention. The display relevance features of each first POI sample can then be obtained based on the matching relationship between each first POI sample and the query word sample. Meanwhile, the sample attribute of each first POI sample can be obtained from its manual label (e.g., "like", "general", or "dislike"): for example, POI samples labeled "general" or "dislike" are negative samples, and POI samples labeled "like" are positive samples. This yields a plurality of manually labeled first POI samples with known display relevance features.
Further, to obtain second POI samples labeled based on user behavior, the following may be obtained: a list of POI samples that have been exposed to users, the user behavior data received by each second POI sample in the list, and, for each second POI sample, the corresponding query word combination sample and its display information when exposed in a page. The display relevance features of each second POI sample are then obtained based on the matching relationship between its display information and the corresponding query word sample. Finally, the sample attribute of each second POI sample is obtained based on the user behavior data it received and its display relevance features, which yields a plurality of second POI samples, labeled based on user behavior, with known second features.
In addition, in order to extract other features included in the first feature and the second feature, feature extraction may be performed in any other available manner, and the embodiment of the present invention is not limited thereto.
When the first ranking model, the second ranking model and the pure display relevance model are trained, they may share the same first POI samples and corresponding query word combination samples, or each may use different first POI samples and corresponding query word combination samples, which is not limited in the embodiment of the present invention.
In addition, to train the first ranking model, the first feature of each first POI sample may be obtained; the score feature in the first feature may be obtained by using the trained pure display relevance model to score the display relevance feature of the manually labeled first POI sample. Similarly, the display relevance feature of a second POI sample labeled based on user behavior may also be scored by the pure display relevance model, so that the samples and scores can be used in the first ranking model and the second ranking model respectively, which is not limited in this embodiment of the present invention.
Optionally, in another embodiment, the step T15 may further include:
step T151, obtaining a score of a display relevance feature of each second POI sample through the pure display relevance model;
step T152, for any second POI sample, correcting the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
In practical applications, directly mixing in POI samples labeled based on user behavior may cause conflicts with manually labeled POI samples: two POI samples with the same features may be a positive sample among the user-behavior-labeled samples and a negative sample among the manually labeled samples, or vice versa, which easily affects the accuracy of the training result of the second ranking model.
Therefore, in the embodiment of the present invention, to avoid this problem, the POI samples labeled based on user behavior may be filtered and corrected using the scores of the pure display relevance model, so that the corrected samples both enable the model to learn the preferences expressed by users' search behavior and do not conflict with the preferences expressed by manual annotation.
Specifically, the score of the display relevance feature of each second POI sample can be obtained through the pure display relevance model, and then for any one of the second POI samples, the sample attribute of the second POI sample is corrected based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
In the correction process, the correction modes corresponding to the second POI samples in different score ranges and different user behaviors can be set by user according to requirements, and the embodiment of the invention is not limited.
Optionally, in another embodiment, the step T152 may further include:
step T1521, obtaining, among the second POI samples, user order samples, user click samples, and samples that were exposed but not clicked by the user, based on the user behavior data received by each second POI sample;
step T1522, for any user order sample, in response to the score of the display relevance feature of the user order sample being greater than a first threshold, correcting the sample attribute of the user order sample to a positive sample;
step T1523, for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, correcting the sample attribute of the user click sample to a positive sample;
step T1524, for any sample that was exposed but not clicked by the user, in response to the score of the display relevance feature of that sample being less than a third threshold, correcting the sample attribute of the exposed-but-not-clicked sample to a negative sample.
The specific values of the first threshold, the second threshold, and the third threshold may be set as required for the specific application scenario, and the embodiment of the present invention is not limited thereto.
For example, it may preferably be set that: among the samples ordered by the user, those whose display relevance feature score from the pure display relevance model is greater than 0.3 are taken as positive samples; among the samples clicked by the user, those whose score is greater than 0.8 are taken as positive samples; and among the samples exposed but not clicked by the user, those whose score is less than 0.7 are taken as negative samples; and so on.
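The correction rule of steps T1521 to T1524 can be illustrated with a minimal Python sketch. The function name, the behavior labels ("order", "click", "exposed_no_click") and the choice to return None for a sample that does not pass its threshold are assumptions made for illustration only; the text does not specify how samples that fail the threshold are handled. The threshold values are the example values given above.

```python
from typing import Optional

# Example thresholds from the text; real values are application-specific.
ORDER_THRESHOLD = 0.3      # first threshold (user order samples)
CLICK_THRESHOLD = 0.8      # second threshold (user click samples)
NO_CLICK_THRESHOLD = 0.7   # third threshold (exposed but not clicked)


def corrected_attribute(behavior: str, score: float) -> Optional[str]:
    """Return "positive", "negative", or None if no correction applies.

    behavior: hypothetical label, one of "order", "click", "exposed_no_click".
    score: score of the display relevance feature from the pure display
           relevance model.
    Returning None for samples that miss the threshold is an assumption.
    """
    if behavior == "order":
        return "positive" if score > ORDER_THRESHOLD else None
    if behavior == "click":
        return "positive" if score > CLICK_THRESHOLD else None
    if behavior == "exposed_no_click":
        return "negative" if score < NO_CLICK_THRESHOLD else None
    raise ValueError(f"unknown behavior: {behavior!r}")
```

With these example thresholds, an ordered sample with score 0.5 becomes a positive sample, while a clicked sample with the same score receives no corrected attribute.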
Optionally, in an embodiment of the present invention, the second feature further includes part of the first feature, and the part of the first feature includes at least one of: the distance between the current query user and the POI, the score of the POI, whether the avatar of the POI is the default avatar, the click rate of the query word combination dimension within the most recent specified time range, and the click rate of the query word combination crossed with the POI dimension within the most recent specified time range.
In practical applications, when the first ranking model ranks each POI in the initial POI list, a fine ranking, that is, a precise ranking, is generally performed. In addition to the display relevance feature and the score feature described above, the first feature used by the first ranking model may further include any other obtainable ranking-related feature, such as the actual distance between the current query user and the POI, the score of the POI, whether the avatar of the POI is the default avatar, the click rate of the POI under the corresponding query word combination within a recent period, the exposure rate of the POI, basic information of the POI, the category to which the POI belongs, and the like.
For the second ranking model, so that the ranking result meets the needs of the query user and the model learns the preferences expressed by users' search behavior, the training samples include a plurality of second POI samples labeled based on user behavior. In addition, during training and online use, the second ranking model may reuse some of the features of the first ranking model, namely those that mainly affect display relevance or manual annotation and important features related to the click rate; that is, the second feature used by the second ranking model may include part of the first feature. Which part of the first feature is included may be set as required, which is not limited in the embodiment of the present invention.
Preferably, the part of the first feature included in the second feature may specifically include, but is not limited to, at least one of: the distance between the current query user and the POI, the score of the POI, whether the avatar of the POI is the default avatar, the click rate of the query word combination dimension within the most recent specified time range, and the click rate of the query word combination crossed with the POI dimension within the most recent specified time range.
The specified time range can be set by self-definition according to requirements, and the embodiment of the invention is not limited. For example, the specified time range may be set to 30 days, a week, and so on.
The click rate of the query word combination dimension can be understood as the click rate of the query word combination corresponding to the POI, and/or the sum of the click rates of the individual query words in the query word combination, and the like; the click rate of the query word combination crossed with the POI dimension can be understood as the click rate of the POI as a query result under that query word combination, and the like.
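As a rough illustration of the two click-rate features, the following Python sketch computes a click rate over a recent time window from exposure logs. The log row layout (timestamp, query, POI identifier, clicked flag), the function name, and the definition of click rate as clicks divided by exposures are all assumptions for illustration; the text does not define the log format or the exact formula.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical log row: (exposure time, query word combination, POI id, clicked?)
LogRow = Tuple[datetime, str, str, bool]


def click_rate(rows: Iterable[LogRow], now: datetime, days: int,
               query: str, poi_id: Optional[str] = None) -> float:
    """Clicks / exposures for `query` within the last `days` days.

    With poi_id=None this is the query-word-combination dimension; with a
    poi_id it is the query word combination crossed with the POI dimension.
    Returns 0.0 when there are no exposures in the window.
    """
    start = now - timedelta(days=days)
    exposures = 0
    clicks = 0
    for ts, q, poi, clicked in rows:
        if ts < start or ts > now or q != query:
            continue
        if poi_id is not None and poi != poi_id:
            continue
        exposures += 1
        clicks += int(clicked)
    return clicks / exposures if exposures else 0.0
```

Passing days=30 or days=7 corresponds to the example time ranges mentioned above.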
Optionally, in an embodiment of the present invention, the display relevance feature includes: the number of times the query word combination matches each display information; the proportion of each display information occupied by the query word combination; the proportion of the query word combination accounted for by its characters that match each display information; the proportion of each display information accounted for by the characters of the query word combination that match it; for each query word in the query word combination, the proportion of the query word accounted for by its characters that match each display information; for each query word, the proportion of each display information accounted for by the characters of the query word that match it; the number of display information that the query word combination matches; the total number of times each query word matches the display information; and the proportion of characters in the query word combination that match the display information; wherein the display information includes at least one of a POI title, a title of a combination to which the POI belongs, an area to which the POI belongs, a POI category, a POI tag, and a POI recommendation reason.
The display correlation feature may specifically include the following three aspects:
(1) Whole-string matching dimension of the query word combination:
a) the number of times the Query (query word combination) matches each kind of display information, namely the title (POI title) / dealtitle (title of the combination to which the POI belongs, such as a group-purchase title) / area (e.g., city, business district) / category / tag / recommendation reason. Specifically, this includes the number of times the query word combination matches each kind of display information and the total number of matches over all kinds of display information;
b) the proportion of the title/dealtitle/area/category/tag/recommendation reason occupied by the Query as a whole;
for example, if the Query is "Kentucky fried chicken", then to obtain the proportion of the title occupied by the whole Query, it is detected whether the title contains "Kentucky fried chicken"; if so, the proportion is the ratio of the number of characters in the Query to the number of characters in the title; if not, the proportion is 0.
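The whole-string proportion in item b) can be sketched in Python as follows. The function name is hypothetical, and the use of plain substring containment as the matching test is an assumption; the text does not fix the matching procedure. In the usage below, each letter is a placeholder for one character.

```python
def whole_query_ratio(query: str, display_text: str) -> float:
    """Proportion of display_text occupied by the whole query.

    Returns len(query) / len(display_text) when the display text contains
    the whole query string, and 0.0 otherwise (including empty inputs).
    """
    if not query or not display_text or query not in display_text:
        return 0.0
    return len(query) / len(display_text)
```

For a five-character query "ABCDE" and a ten-character title "ABCDExxxxx", the sketch gives 5/10; for a title that contains only parts of the query, it gives 0.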
(2) Matching dimension of whole string + substring (single query term):
a) the proportion of the total length of the query accounted for by the characters in the query that match the title/dealtitle/area/category/tag/recommendation reason, and, for each query word in the query, the proportion of the length of that query word accounted for by its characters that match the title/dealtitle/area/category/tag/recommendation reason;
for example, for the Query "Kentucky fried chicken" above, assuming it contains two query words, "Kentucky" and "fried chicken", one can obtain the proportion of the total length of the Query accounted for by the characters of "Kentucky fried chicken" that match the title/dealtitle/area/category/tag/recommendation reason, and, for each of "Kentucky" and "fried chicken", the proportion of that query word's length accounted for by its matching characters.
If the title of a certain POI contains the "fried chicken" of "Kentucky fried chicken", the proportion of the Query accounted for by characters matching the title is 2/5 (the character counts refer to the original-language query, in which "Kentucky" has three characters and "fried chicken" has two). Correspondingly, the proportions of the query words "Kentucky" and "fried chicken" accounted for by characters matching the title are 0/3 and 2/2, namely 0 and 1, respectively.
b) the proportion of the length of each display information accounted for by the characters in the query that match the title/dealtitle/area/category/tag/recommendation reason, that is, their share of the total number of characters in that display information.
Assuming the title of a POI contains the "fried chicken" of "Kentucky fried chicken" and the title has 10 characters, the proportion of the title accounted for by characters in the Query that match the title is 2/10. Correspondingly, the proportions of the title accounted for by the matching characters of the query words "Kentucky" and "fried chicken" are 0/10 and 2/10, namely 0 and 0.2, respectively.
Note that a substring is a query word including two or more consecutive characters, and the entire string may be used as a substring to calculate features.
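The ratios of dimension (2) can be reproduced with a short Python sketch. The function and key names are hypothetical, and two simplifications are assumed: a query word counts as matched when it occurs as a substring of the display information, and the query length is the sum of the query word lengths. Each letter in the usage below stands for one character, with "ABC" and "DE" playing the roles of the three-character and two-character query words of the worked example.

```python
from typing import Dict, List


def substring_match_features(query_words: List[str],
                             display_text: str) -> Dict[str, object]:
    """Character-proportion features for one kind of display information."""
    query_len = sum(len(w) for w in query_words)
    text_len = len(display_text)
    # Matched characters per query word: the full word length if the word
    # occurs in the display text, otherwise 0 (assumed matching rule).
    matched = [len(w) if w and w in display_text else 0 for w in query_words]
    total = sum(matched)
    return {
        # a) matched characters over the total query length
        "query_ratio": total / query_len if query_len else 0.0,
        # a) matched characters of each word over that word's length
        "per_word_ratio": [m / len(w) if w else 0.0
                           for m, w in zip(matched, query_words)],
        # b) matched characters over the display information length
        "display_ratio": total / text_len if text_len else 0.0,
        # b) matched characters of each word over the display length
        "per_word_display_ratio": [m / text_len if text_len else 0.0
                                   for m in matched],
    }
```

Calling `substring_match_features(["ABC", "DE"], "xxxxDExxxx")` yields the ratios of the example: 2/5 for the query, 0/3 and 2/2 per query word, 2/10 for the title, and 0/10 and 2/10 per query word relative to the title.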
(3) Cross-element dimension:
a) the number of kinds of display information that the whole Query string matches, and the total number of such matches;
for example, if the whole Query string matches two of the six kinds of display information above, that is, two kinds of display information of the POI contain the whole Query string, then the number of kinds of display information matched by the whole Query string for that POI is 2; if one of them matches twice, that is, contains the whole Query string twice, and the other matches once, then the total number of matches of the whole Query string against the display information of that POI is 3.
b) for each Query substring, that is, each query word, the total number of matches over all display information;
c) the proportion of characters in the whole Query string that match the display information;
for example, for the Query "Kentucky fried chicken", assuming that "Kentucky" matches the display information, the proportion of characters in the Query that match the display information is 3/5.
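The cross-element features of dimension (3) can be sketched in Python as below. The function name, the dictionary keys and the field names are hypothetical. The sketch assumes that a match is a non-overlapping substring occurrence (as counted by `str.count`) and that the query is the concatenation of its query words; letters again stand for single characters.

```python
from typing import Dict, List


def cross_element_features(query: str, query_words: List[str],
                           fields: Dict[str, str]) -> Dict[str, object]:
    """Features computed across all kinds of display information of a POI.

    fields maps a display-information kind (e.g. "title", "tag") to its text.
    """
    # Occurrences of the whole query string in each kind of display info.
    whole_counts = [t.count(query) for t in fields.values()] if query else []
    # Total occurrences of each query word over all display information.
    word_totals = {w: sum(t.count(w) for t in fields.values())
                   for w in query_words if w}
    # Characters of the query covered by words that matched somewhere.
    matched_chars = sum(len(w) for w, n in word_totals.items() if n > 0)
    return {
        "fields_matching_whole_query": sum(1 for c in whole_counts if c > 0),
        "whole_query_match_total": sum(whole_counts),
        "word_match_totals": word_totals,
        "matched_char_ratio": matched_chars / len(query) if query else 0.0,
    }
```

With one field containing the whole query twice and another containing it once, the sketch reports 2 matching fields and 3 total matches; when only the three-character word of a five-character query matches, the matched character ratio is 3/5.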
It is worth mentioning that, in the embodiment of the present invention, the POI samples labeled based on user behavior are filtered using the scores of the pure display relevance model, then mixed with the manually labeled samples, and the second ranking model is trained on the mixture. The benefits are as follows:
on the one hand, if the model is trained using only user-behavior-labeled samples or only manually labeled samples, it learns only that one type of preference. The current online evaluation targets include both user search behavior and manual evaluation labels, so training the model on samples that fuse the two labeling rules can yield gains on both online metrics;
on the other hand, if the user-behavior-labeled samples and the manually labeled samples are mixed directly in a certain proportion before training, the effect is poor, because user behavior labeling and manual labeling are labeling methods along two different dimensions; in some cases a sample is positive under one and negative under the other, and such conflicts cause model training to fluctuate and in turn degrade the online effect.
The embodiment of the present invention provides a data sorting method that enhances user-perceived relevance in application scenarios such as search, and optimizes both the click rate and the manual annotation metric by effectively using manually labeled data to train the models. Moreover, the user-behavior-labeled samples are filtered and corrected using the scores of the display relevance model, which resolves the conflict between user-behavior-labeled samples and manually labeled samples. Samples are constructed to train the first ranking model and the second ranking model separately, with both targets guided at the same time, and the second ranking model affects only the top N positions in the POI list.
Compared with prior-art sorting methods that mainly target the click rate, training the second ranking model with manual annotation as the target and with display relevance features strengthens the influence of user-perceived elements in the ranking. Meanwhile, the corrected user-behavior-labeled samples are mixed in and only the top N positions are affected online, which keeps the online click-rate metric basically stable while enhancing user-perceived relevance.
Referring to fig. 3, a flowchart illustrating steps of a data ordering model training method according to an embodiment of the present invention is shown. Wherein the data ordering model at least comprises a first ordering model and a second ordering model, and the first ordering model and the second ordering model are independent of each other, the method comprises:
step 210, obtaining a plurality of first POI samples, a query word combination, a second feature and a manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples, a second feature of each second POI sample and a sample attribute labeled based on user behavior; wherein, the query word combination comprises at least one query word;
step 220, training the first ranking model through the plurality of first POI samples, the query word combinations corresponding to the first POI samples, the second features and the manually labeled sample attributes;
step 230, training the second ranking model by a plurality of the first POI samples and a plurality of the second POI samples with known second characteristics and sample attributes;
the second feature at least comprises a display correlation feature, the display correlation feature is obtained based on a matching relationship between display information of the POI sample when the POI sample is displayed in a page and the query word, the user behavior comprises at least one of ordering behavior and clicking behavior of a user aiming at the second POI sample when the second POI sample is exposed, and the sample attribute comprises a positive sample and a negative sample.
Optionally, in an embodiment of the present invention, the data sorting model further includes a pure presentation relevance model, and before step 220, the method further includes:
training the pure display relevance model through a plurality of first POI samples with known display relevance features;
accordingly, step 220 may further comprise: acquiring a first feature of each first POI sample, and training the first ranking model based on the first feature and the query word combination of each first POI sample, wherein the first feature at least comprises the display relevance feature and a score feature;
and the score characteristic is the score of the display relevance characteristic of the first POI sample obtained by a pure display relevance model.
Optionally, in an embodiment of the present invention, step 210 may further include:
step 211, obtaining display information of each first POI sample corresponding to the query word combination sample when the POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
step 212, based on the matching relationship between the display information of each first POI sample and the query term sample and the artificial labeling label of each first POI sample, respectively obtaining the display relevance characteristics of each first POI sample and the sample attributes of each first POI sample, and obtaining a plurality of artificially labeled first POI samples with known second characteristics;
step 213, acquiring a POI sample list exposed to a user, user behavior data received by each second POI sample in the POI sample list, and a query word combination sample corresponding to each second POI sample and display information when exposed in a page;
step 214, respectively obtaining display correlation characteristics of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query term sample;
step 215, based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample, obtaining a sample attribute of each second POI sample, and obtaining a plurality of second POI samples with known second characteristics based on user behavior labeling.
Optionally, in an embodiment of the present invention, step 215 may further include:
step 2151, obtaining scores of display relevance features of each second POI sample through the pure display relevance model;
step 2152, for any second POI sample, correcting the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
Optionally, in this embodiment of the present invention, step 2152 further includes:
step 21521, obtaining, among the second POI samples, user order samples, user click samples, and samples that were exposed but not clicked by the user, based on the user behavior data received by each second POI sample;
step 21522, for any user order sample, in response to the score of the display relevance feature of the user order sample being greater than a first threshold, correcting the sample attribute of the user order sample to a positive sample;
step 21523, for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, correcting the sample attribute of the user click sample to a positive sample;
step 21524, for any sample that was exposed but not clicked by the user, in response to the score of the display relevance feature of that sample being less than a third threshold, correcting the sample attribute of the exposed-but-not-clicked sample to a negative sample.
Referring to fig. 4, a schematic structural diagram of a data sorting apparatus in an embodiment of the present invention is shown. The data sorting device is provided with a data sorting model, and the data sorting model at least comprises a first sorting model and a second sorting model.
The data sorting device of the embodiment of the invention comprises: a sorting data obtaining module 310, a first data sorting module 320, a second data sorting module 330 and a sorting result obtaining module 340.
The functions of the modules and the interaction relationship between the modules are described in detail below.
The ranking data acquiring module 310 is configured to acquire an initial POI list to be ranked and a query word combination corresponding to the initial POI list, where the initial POI list includes L POIs, L is a positive integer greater than 1, and the query word combination includes at least one query word;
a first data ranking module 320, configured to rank, through a first ranking model, each POI in the initial POI list to obtain a first POI list;
a second data sorting module 330, configured to reorder, based on second features of top M POIs in the first POI list, the top M POIs through a second sorting model to obtain a second POI list, where M is a positive integer greater than 1 and less than or equal to L, the second features at least include display relevance features, and the display relevance features are obtained based on a matching relationship between display information of the POIs when displayed in a page and the query word;
the ranking result obtaining module 340 is configured to obtain, based on the first POI list and the second POI list, a target POI list in which each POI in the initial POI list is ranked, where the top N POIs in the target POI list are consistent with the top N POIs in the second POI list, and the remaining POIs are arranged at positions N+1 to L in the order in which they appear in the first POI list, where N is a positive integer greater than 1 and less than or equal to M.
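The interaction of modules 320, 330 and 340 can be sketched in Python as follows. The function name and the representation of the two ranking models as scoring callables over POI identifiers are assumptions for illustration; the actual models and their features are not modeled here.

```python
from typing import Callable, List, Sequence


def two_stage_rank(pois: Sequence[str],
                   first_score: Callable[[str], float],
                   second_score: Callable[[str], float],
                   m: int, n: int) -> List[str]:
    """Rank L POIs, rerank the top M, and let the rerank fix the top N.

    pois: POI identifiers of the initial list (assumed unique).
    first_score / second_score: hypothetical stand-ins for the first and
    second ranking models; a higher score ranks earlier.
    """
    if not (1 < n <= m <= len(pois)):
        raise ValueError("require 1 < N <= M <= L")
    # First POI list: all L POIs ordered by the first ranking model.
    first_list = sorted(pois, key=first_score, reverse=True)
    # Second POI list: the top M POIs reordered by the second ranking model.
    second_list = sorted(first_list[:m], key=second_score, reverse=True)
    # Target list: top N from the second list, the rest in first-list order.
    top_n = second_list[:n]
    chosen = set(top_n)
    rest = [p for p in first_list if p not in chosen]
    return top_n + rest
```

Because only the top N positions come from the second list, POIs that the second model moves below position N fall back to their relative order in the first list.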
Referring to fig. 5, in the embodiment of the present invention, the first data sorting module 320 may further include:
the first data sorting sub-module 321 is configured to sort, based on a first feature of each POI, each POI in the initial POI list through a first sorting model to obtain a first POI list, where the first feature at least includes the display relevance feature and the score feature, and the first sorting model is obtained by training a plurality of manually labeled first POI samples of known first features;
the apparatus may further include:
a score feature obtaining module 350, configured to obtain a display relevance feature of each POI, and obtain a score of the display relevance feature of each POI as a score feature of each POI through a pure display relevance model based on the display relevance feature;
the pure display relevance model is obtained by training a plurality of manually marked first POI samples with known display relevance characteristics.
Referring to fig. 5, in an embodiment of the present invention, the apparatus may further include:
the sample data obtaining module 360 is configured to obtain the display information shown in the page for each first POI sample under its corresponding query word combination sample, and a manual annotation score for the display relevance feature of each first POI sample, where the query word combination sample includes at least one query word sample;
the artificial labeling sample obtaining module 370 is configured to obtain the display relevance feature of each first POI sample based on the matching relationship between the display information of each first POI sample and the query word sample, so as to obtain a plurality of first POI samples with known display relevance features and manual annotation scores;
a pure display relevance model training module 380, configured to train the pure display relevance model based on the plurality of first POI samples with known display relevance features and manual annotation scores.
Referring to fig. 5, in an embodiment of the present invention, the apparatus may further include:
a second training sample obtaining module 390, configured to obtain a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behaviors, where the user behaviors include at least one of a click behavior and an order placing behavior;
a second ranking model training module 3110 for training the second ranking model by the first and second POI samples of known second characteristics.
Optionally, the second training sample obtaining module 390 further includes:
the sample display information acquisition sub-module 391 is configured to acquire display information of each first POI sample corresponding to a query word combination sample when the first POI sample is displayed in a page, where the query word combination sample includes at least one query word sample;
the artificial labeling sample obtaining sub-module 392 is configured to obtain a plurality of artificially labeled first POI samples with known second characteristics based on the matching relationship between the display information of each first POI sample and the query term sample and the artificial labeling label of each first POI sample, and respectively obtain the display relevance characteristics of each first POI sample and the sample attributes of each first POI sample;
the exposure sample information obtaining submodule 393 is used for obtaining a POI sample list which is exposed to a user, user behavior data received by each second POI sample in the POI sample list, a query word combination sample corresponding to each second POI sample, and display information when the second POI sample is exposed in a page;
the display relevance feature obtaining sub-module 394 is configured to obtain the display relevance features of each second POI sample respectively based on the matching relationship between the display information of each second POI sample and the corresponding query term sample;
the behavior labeling sample obtaining sub-module 395 is configured to obtain, based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance feature of each second POI sample, a sample attribute of each second POI sample, and obtain a plurality of second POI samples based on user behavior labeling with known second features.
Optionally, in this embodiment of the present invention, the behavior labeling sample obtaining sub-module 395 may include:
a display relevance feature scoring unit, configured to obtain, through the pure display relevance model, a score of a display relevance feature of each second POI sample;
and the behavior labeling sample correction unit is used for correcting the sample attribute of any second POI sample based on the score of the displayed correlation characteristic of the second POI sample and the user behavior data received by the second POI sample.
Optionally, the behavior labeling sample modification unit may be specifically configured to:
obtaining, among the second POI samples, user order samples, user click samples, and samples that were exposed but not clicked by the user, based on the user behavior data received by each second POI sample;
for any user order sample, in response to the score of the display relevance feature of the user order sample being greater than a first threshold, correcting the sample attribute of the user order sample to a positive sample;
for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, correcting the sample attribute of the user click sample to a positive sample;
for any sample that was exposed but not clicked by the user, in response to the score of the display relevance feature of that sample being less than a third threshold, correcting the sample attribute of the exposed-but-not-clicked sample to a negative sample.
Optionally, the second feature further includes a partial first feature, and the partial first feature includes: at least one of a distance between a current query user and the POI, a score of the POI, whether an avatar of the POI is a default avatar, a click rate of the query word combination dimension within a nearest specified time range, and a click rate of the query word combination crossing the POI dimension within a nearest specified time range.
Optionally, the display relevance feature comprises: the number of times the query word combination is matched in each piece of display information; the proportion of each piece of display information accounted for by the query word combination; the proportion of the query word combination accounted for by the characters of the query word combination that match each piece of display information; the proportion of each piece of display information accounted for by the characters of the query word combination that match it; for each query word in the query word combination, the proportion of the query word accounted for by the characters of the query word that match each piece of display information; the proportion of each piece of display information accounted for by the characters of each query word that match it; the number of pieces of display information that the query word combination can match; the total number of pieces of display information that each query word combination can match; and the proportion of characters in the query word combination that can be matched in the display information, wherein the display information comprises at least one of a POI title, a title of a combination to which the POI belongs, an area to which the POI belongs, a POI category, a POI tag, and a POI recommendation reason.
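A few of the display relevance features listed above can be sketched in Python as follows. This is only an illustrative reading: the function name, the feature keys, and in particular the interpretation of "matching characters" as characters that occur in both strings are assumptions, and only a subset of the listed features is computed.

```python
def display_relevance_features(query_words, display_info):
    """Compute a subset of match-based features for one POI.

    query_words: list of query words forming the query word combination.
    display_info: dict mapping a display field name (e.g. "title") to its text.
    """
    combo = "".join(query_words)
    combo_chars = set(combo)
    features = {}
    matched_fields = 0
    for field, text in display_info.items():
        # Number of times any query word occurs in this display field.
        hits = sum(text.count(w) for w in query_words if w)
        # Characters of the query that also occur in the field, and vice versa.
        shared_in_query = sum(1 for ch in combo if ch in text)
        shared_in_text = sum(1 for ch in text if ch in combo_chars)
        features[f"{field}_match_count"] = hits
        features[f"{field}_query_char_ratio"] = (
            shared_in_query / len(combo) if combo else 0.0)
        features[f"{field}_text_char_ratio"] = (
            shared_in_text / len(text) if text else 0.0)
        if hits > 0:
            matched_fields += 1
    # Number of display fields matched by at least one query word.
    features["matched_field_count"] = matched_fields
    return features
```

For example, calling the function with the query words `["hot", "pot"]` and the fields `{"title": "hot pot house", "category": "bbq"}` yields a match count of 2 for the title and 0 for the category.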
The data sorting device provided in the embodiment of the present invention can implement each process implemented in the method embodiments of fig. 1 to fig. 2, and is not described here again to avoid repetition.
Referring to fig. 6, a schematic structural diagram of a data ranking model training apparatus in an embodiment of the present invention is shown, where the data ranking model includes at least a first ranking model and a second ranking model.
The data ranking model training apparatus of the embodiment of the present invention comprises: a training sample obtaining module 410, a first ordering model training module 420, and a second ranking model training module 430.
The functions of the modules and the interaction relationship between the modules are described in detail below.
A training sample obtaining module 410, configured to obtain a plurality of first POI samples together with the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples together with the second feature and the user-behavior-labeled sample attribute of each second POI sample; wherein the query word combination comprises at least one query word;
a first ordering model training module 420, configured to train the first ranking model using the plurality of first POI samples and the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample;
a second ranking model training module 430, configured to train the second ranking model using the plurality of first POI samples and the plurality of second POI samples whose second features and sample attributes are known;
wherein the second feature comprises at least a display relevance feature, the display relevance feature is obtained based on a matching relationship between the query word and the display information of the POI sample when the POI sample is displayed in a page, the user behavior comprises at least one of an ordering behavior and a clicking behavior of a user with respect to the second POI sample when the second POI sample is exposed, and the sample attribute comprises a positive sample and a negative sample.
Optionally, the data sorting model further includes a pure display relevance model, and the apparatus may further include:
a pure display relevance model training module, configured to train the pure display relevance model using a plurality of first POI samples with known display relevance features;
the first ordering model training module comprises:
a first ordering model training sub-module, configured to obtain a first feature of each first POI sample, and train the first ranking model based on the first feature and the query word combination of each first POI sample, where the first feature at least includes the display relevance feature and a score feature;
wherein the score feature is the score of the display relevance feature of the first POI sample, obtained through the pure display relevance model.
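How the score feature joins the display relevance features to form the first feature can be sketched as below. The function names are hypothetical, and `toy_pure_model` is only a stand-in (a plain average) for the trained pure display relevance model, whose actual form is not specified in this passage.

```python
from typing import Callable, List, Sequence


def toy_pure_model(display_features: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained pure display relevance model."""
    if not display_features:
        return 0.0
    return sum(display_features) / len(display_features)


def build_first_features(
    display_features: Sequence[float],
    pure_model: Callable[[Sequence[float]], float],
) -> List[float]:
    """Append the pure-model score to the display relevance features."""
    score = pure_model(display_features)
    return list(display_features) + [score]
```

Passing the model in as a callable keeps the sketch independent of any particular learning library; any trained scorer with the same signature could be substituted.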
Optionally, the training sample obtaining module includes:
a sample display information acquisition sub-module, configured to acquire display information of each first POI sample, corresponding to a query word combination sample, when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
an artificial labeling sample obtaining sub-module, configured to respectively obtain the display relevance feature and the sample attribute of each first POI sample based on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, so as to obtain a plurality of manually labeled first POI samples with known second features;
an exposure sample information acquisition sub-module, configured to acquire a POI sample list exposed to a user, the user behavior data received by each second POI sample in the POI sample list, the query word combination sample corresponding to each second POI sample, and the display information of each second POI sample when it is exposed in a page;
a display relevance feature obtaining sub-module, configured to respectively obtain the display relevance feature of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
and a behavior labeling sample obtaining sub-module, configured to obtain the sample attribute of each second POI sample based on the user behavior data received by each second POI sample in the POI sample list and the display relevance feature of each second POI sample, so as to obtain a plurality of user-behavior-labeled second POI samples with known second features.
Optionally, the behavior labeling sample obtaining sub-module includes:
a display relevance feature scoring unit, configured to obtain, through the pure display relevance model, a score of a display relevance feature of each second POI sample;
and a behavior labeling sample correction unit, configured to correct, for any second POI sample, the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
Optionally, the behavior labeling sample correction unit is specifically configured to:
obtaining, from the second POI samples, user order samples, user click samples, and exposed-but-unclicked samples based on the user behavior data received by each second POI sample;
for any user order sample, in response to the score of the display relevance feature of the user order sample being greater than a first threshold, modifying the sample attribute of the user order sample to a positive sample;
for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, modifying the sample attribute of the user click sample to a positive sample;
for any exposed-but-unclicked sample, in response to the score of the display relevance feature of the exposed-but-unclicked sample being less than a third threshold, modifying the sample attribute of the exposed-but-unclicked sample to a negative sample.
The data sorting model training device provided by the embodiment of the present invention can implement each process implemented in the method embodiments of fig. 3 to fig. 4; to avoid repetition, details are not described here again.
Preferably, an embodiment of the present invention further provides an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements each process of the above data sorting method and/or data sorting model training method embodiments and can achieve the same technical effect; to avoid repetition, details are not described here again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the above data sorting method and/or each process of the data sorting model training method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
Fig. 7 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
The electronic device 500 includes, but is not limited to: a radio frequency unit 501, a network module 502, an audio output unit 503, an input unit 504, a sensor 505, a display unit 506, a user input unit 507, an interface unit 508, a memory 509, a processor 510, and a power supply 511. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 7 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 501 may be used for receiving and sending signals during message transmission and reception or during a call; specifically, it receives downlink data from a base station and forwards the data to the processor 510 for processing, and it transmits uplink data to the base station. In general, the radio frequency unit 501 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 501 can also communicate with a network and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 502, such as assisting the user in sending and receiving e-mails, browsing web pages, and accessing streaming media.
The audio output unit 503 may convert audio data received by the radio frequency unit 501 or the network module 502 or stored in the memory 509 into an audio signal and output as sound. Also, the audio output unit 503 may also provide audio output related to a specific function performed by the electronic apparatus 500 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 503 includes a speaker, a buzzer, a receiver, and the like.
The input unit 504 is used to receive an audio or video signal. The input unit 504 may include a Graphics Processing Unit (GPU) 5041 and a microphone 5042. The graphics processor 5041 processes image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 506. The image frames processed by the graphics processor 5041 may be stored in the memory 509 (or other storage medium) or transmitted via the radio frequency unit 501 or the network module 502. The microphone 5042 may receive sound and process it into audio data. In a phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 501, and then output.
The electronic device 500 also includes at least one sensor 505, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 5061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 5061 and/or a backlight when the electronic device 500 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 505 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.
The display unit 506 is used to display information input by the user or information provided to the user. The display unit 506 may include a display panel 5061, and the display panel 5061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The user input unit 507 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 507 includes a touch panel 5071 and other input devices 5072. The touch panel 5071, also referred to as a touch screen, may collect touch operations performed by a user on or near it (e.g., operations performed on or near the touch panel 5071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 5071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 510, and receives and executes commands sent by the processor 510. In addition, the touch panel 5071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 5071, the user input unit 507 may include other input devices 5072. Specifically, the other input devices 5072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail here.
Further, the touch panel 5071 may be overlaid on the display panel 5061, and when the touch panel 5071 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 510 to determine the type of the touch event, and then the processor 510 provides a corresponding visual output on the display panel 5061 according to the type of the touch event. Although in fig. 7, the touch panel 5071 and the display panel 5061 are two independent components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 5071 and the display panel 5061 may be integrated to implement the input and output functions of the electronic device, and is not limited herein.
The interface unit 508 is an interface for connecting an external device to the electronic apparatus 500. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 508 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the electronic apparatus 500 or may be used to transmit data between the electronic apparatus 500 and external devices.
The memory 509 may be used to store software programs as well as various data. The memory 509 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, and the like. Further, the memory 509 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 510 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509, thereby performing overall monitoring of the electronic device. Processor 510 may include one or more processing units; preferably, the processor 510 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 510.
The electronic device 500 may further include a power supply 511 (e.g., a battery) for supplying power to various components, and preferably, the power supply 511 may be logically connected to the processor 510 via a power management system, so as to implement functions of managing charging, discharging, and power consumption via the power management system.
In addition, the electronic device 500 includes some functional modules that are not shown, and are not described in detail herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (18)

1. A data sorting method, applied to an electronic device provided with a data sorting model, wherein the data sorting model comprises at least a first sorting model and a second sorting model, the method comprising:
acquiring an initial POI list to be ranked and a query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs, L is a positive integer greater than 1, and the query word combination comprises at least one query word;
sorting each POI in the initial POI list through the first sorting model to obtain a first POI list;
reordering the first M POIs in the first POI list through the second sorting model, based on second features of the first M POIs, to obtain a second POI list, wherein M is a positive integer greater than 1 and less than or equal to L, the second features comprise at least a display relevance feature, and the display relevance feature is obtained based on the matching relationship between the query word and the display information of the POI when the POI is displayed in a page;
and obtaining, based on the first POI list and the second POI list, a target POI list in which the POIs of the initial POI list are sorted, wherein the first N POIs in the target POI list are consistent with the first N POIs in the second POI list, the remaining POIs occupy the (N+1)-th to the L-th positions in the order in which they appear in the first POI list, and N is a positive integer greater than 1 and less than or equal to M.
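The two-stage ordering and merging recited in claim 1 can be illustrated with a short Python sketch. It is a non-authoritative reading under stated assumptions: POIs are represented by hashable identifiers, and `first_score` and `second_score` are hypothetical callables standing in for the first and second sorting models, with higher scores ranked earlier.

```python
def rank_two_stage(pois, first_score, second_score, m, n):
    """Return the target POI list built from the first and second lists.

    pois: initial POI list of length L (identifiers, assumed unique).
    first_score / second_score: stand-ins for the two sorting models.
    m, n: integers with 1 < n <= m <= L, as required by the claim.
    """
    if not (1 < n <= m <= len(pois)):
        raise ValueError("require 1 < n <= m <= len(pois)")
    # Stage 1: order all L POIs with the first model.
    first_list = sorted(pois, key=first_score, reverse=True)
    # Stage 2: reorder only the first M POIs with the second model.
    second_list = sorted(first_list[:m], key=second_score, reverse=True)
    # The first N positions come from the second list.
    head = second_list[:n]
    head_set = set(head)
    # The remaining POIs keep their relative order from the first list.
    tail = [p for p in first_list if p not in head_set]
    return head + tail
```

For instance, with five POIs ordered `a, b, c, d, e` by the first model, `m = 3`, `n = 2`, and a second model that ranks `b` above `c` above `a`, the resulting target list is `b, c, a, d, e`.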
2. The method of claim 1, wherein the data sorting model further comprises a pure display relevance model, and the step of ranking each POI in the initial POI list by the first ranking model to obtain a first POI list comprises:
ranking each POI in the initial POI list through a first ranking model based on first characteristics of each POI to obtain a first POI list, wherein the first characteristics at least comprise the display relevance characteristics and score characteristics, and the first ranking model is obtained through training of a plurality of manually labeled first POI samples with known first characteristics;
before the step of ranking each POI in the initial POI list through the first ranking model to obtain the first POI list, the method further includes:
acquiring display relevance characteristics of each POI, and acquiring scores of the display relevance characteristics of each POI as score characteristics of each POI through a pure display relevance model based on the display relevance characteristics;
the pure display relevance model is obtained by training a plurality of manually marked first POI samples with known display relevance characteristics.
3. The method according to claim 2, wherein before the step of obtaining the display relevance feature of each POI and obtaining the score of the display relevance feature of each POI through a pure display relevance model based on the display relevance feature as the score feature of each POI, the method further comprises:
acquiring display information of each first POI sample, corresponding to a query word combination sample, when the first POI sample is displayed in a page, and a manually labeled score for each of the display relevance features, wherein the query word combination sample comprises at least one query word sample;
obtaining the display relevance features of each first POI sample based on the matching relationship between the display information of each first POI sample and the query word sample, so as to obtain a plurality of first POI samples with known display relevance features and manually labeled scores;
training the pure display relevance model based on the plurality of first POI samples with known display relevance features and manually labeled scores.
4. The method according to claim 1, wherein before the step of reordering the top M POIs in the first POI list by a second ordering model based on the second characteristics of the top M POIs to obtain a second POI list, the method further comprises:
acquiring a plurality of manually labeled first POI samples with known second characteristics and a plurality of second POI samples labeled based on user behaviors, wherein the user behaviors comprise at least one of clicking behaviors and ordering behaviors;
training the second ranking model by the first and second POI samples of known second features.
5. The method of claim 4, wherein the step of obtaining a plurality of manually labeled first POI samples with known second characteristics and a plurality of labeled second POI samples based on user behavior comprises:
obtaining display information of each first POI sample corresponding to the query word combination sample when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
respectively acquiring display relevance characteristics of each first POI sample and sample attributes of each first POI sample based on the matching relationship between the display information of each first POI sample and the query word sample and the artificial labeling label of each first POI sample, and acquiring a plurality of artificially labeled first POI samples with known second characteristics;
obtaining a POI sample list exposed to a user, the user behavior data received by each second POI sample in the POI sample list, the query word combination sample corresponding to each second POI sample, and the display information of each second POI sample when it is exposed in a page;
respectively acquiring display correlation characteristics of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample, obtaining the sample attributes of each second POI sample, and obtaining a plurality of second POI samples with known second characteristics based on user behavior labeling.
6. The method according to claim 5, wherein the step of obtaining the sample attribute of each second POI sample based on the user behavior data received from each second POI sample in the second POI sample list and the display relevance feature of each second POI sample to obtain a plurality of second POI samples with known second features based on the user behavior labels comprises:
obtaining a score of a display relevance feature of each second POI sample through the pure display relevance model;
and, for any second POI sample, modifying the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
7. The method of claim 6, wherein the step of modifying, for any second POI sample, the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample comprises:
obtaining user ordering samples, user click samples, and exposed but non-clicked samples among the second POI samples based on the user behavior data received by each second POI sample;
for any user ordering sample, in response to the score of the display relevance feature of the user ordering sample being greater than a first threshold, modifying the sample attribute of the user ordering sample to a positive sample;
for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, modifying the sample attribute of the user click sample to a positive sample;
for any exposed but non-clicked sample, in response to the score of the display relevance feature of the exposed but non-clicked sample being less than a third threshold, modifying the sample attribute of the exposed but non-clicked sample to a negative sample.
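The threshold-based relabeling of claim 7 can be illustrated with a minimal Python sketch. The `Sample` type, the behavior strings, and the threshold values are hypothetical and chosen only for illustration; the claim does not say what happens to samples that fail their threshold, so this sketch simply leaves their label unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """Hypothetical container for one second POI sample."""
    poi_id: str
    behavior: str            # "order", "click", or "exposed_no_click"
    relevance_score: float   # score from the pure display relevance model
    label: Optional[int] = None  # 1 = positive, 0 = negative, None = unchanged


def relabel(samples: List[Sample],
            t_order: float = 0.3,
            t_click: float = 0.5,
            t_noclick: float = 0.2) -> List[Sample]:
    """Apply the three threshold rules; threshold values are assumptions."""
    for s in samples:
        if s.behavior == "order" and s.relevance_score > t_order:
            s.label = 1   # ordered and sufficiently relevant -> positive
        elif s.behavior == "click" and s.relevance_score > t_click:
            s.label = 1   # clicked and sufficiently relevant -> positive
        elif s.behavior == "exposed_no_click" and s.relevance_score < t_noclick:
            s.label = 0   # exposed, not clicked, low relevance -> negative
    return samples
```

The three thresholds are kept as separate parameters because the claim names them independently; nothing in the claim requires them to differ or to be ordered in any particular way.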
8. The method of any of claims 2-7, wherein the second feature further comprises part of the first feature, the part of the first feature comprising at least one of: a distance between the current querying user and the POI, a score of the POI, whether an avatar of the POI is a default avatar, a click rate in the query word combination dimension within a most recent specified time range, and a click rate in the dimension crossing the query word combination with the POI within a most recent specified time range.
9. The method of any one of claims 1-7, wherein the display relevance feature comprises: the number of times the query word combination is matched in each piece of display information; the proportion of each piece of display information occupied by the query word combination; the proportion, within the query word combination, of the characters of the query word combination that match each piece of display information; the proportion, within the display information, of the characters of the query word combination that match each piece of display information; the proportion, within the query word, of the characters of each query word in the query word combination that match each piece of display information; the proportion, within the display information, of the characters of each query word in the query word combination that match each piece of display information; the number of pieces of display information that the query word combination can match; the total number of pieces of display information that each query word combination can match; and the proportion of characters in the query word combination that can be matched in the display information; wherein the display information comprises at least one of a POI title, a title of the combination to which the POI belongs, a POI belonging to [...], the area to which the POI belongs, the POI category, the POI tag, and the POI recommendation reason.
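A few of the matching features listed in claim 9 can be sketched in Python as follows. This is an illustration under assumptions, not the patent's definition: the claim does not define how characters are matched, so the sketch treats a character as matched when it occurs anywhere in the other string, joins the query words without a separator, and uses hypothetical feature names.

```python
from typing import Dict, List


def display_relevance_features(query_words: List[str],
                               display_info: Dict[str, str]) -> Dict[str, float]:
    """Compute a subset of display relevance features per display field."""
    combo = "".join(query_words)      # assumed form of the query word combination
    combo_chars = set(combo)
    features: Dict[str, float] = {}
    matched_fields = 0
    for name, text in display_info.items():
        # Number of times the whole combination occurs in this field.
        hits = text.count(combo) if combo else 0
        text_chars = set(text)
        # Query characters that also appear in the field.
        q_matched = sum(1 for ch in combo if ch in text_chars)
        # Field characters that also appear in the query combination.
        t_matched = sum(1 for ch in text if ch in combo_chars)
        features[f"{name}_match_count"] = hits
        features[f"{name}_query_char_ratio"] = q_matched / len(combo) if combo else 0.0
        features[f"{name}_field_char_ratio"] = t_matched / len(text) if text else 0.0
        if hits > 0:
            matched_fields += 1
    # Number of display fields in which the combination was matched.
    features["matched_field_count"] = matched_fields
    return features
```

For example, calling it with `["hot", "pot"]` and `{"title": "hotpot house", "category": "bbq"}` yields per-field counts and ratios for `title` and `category` plus one `matched_field_count` entry.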
10. A method for training a data sorting model, wherein the data sorting model at least comprises a first sorting model and a second sorting model, the method comprising:
acquiring a plurality of first POI samples together with the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples together with the second feature and the user-behavior-labeled sample attribute of each second POI sample, wherein the query word combination comprises at least one query word;
training the first ranking model through the plurality of first POI samples and the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample;
training the second ranking model by a plurality of the first POI samples and a plurality of the second POI samples for which second features and sample attributes are known;
wherein the second feature comprises at least a display relevance feature, the display relevance feature is obtained based on the matching relationship between the query words and the display information of the POI sample when the POI sample is displayed in a page, the user behavior comprises at least one of an ordering behavior and a clicking behavior of a user with respect to the second POI sample when the second POI sample is exposed, and the sample attribute comprises a positive sample and a negative sample.
11. The method of claim 10, wherein the data ranking model further comprises a pure display relevance model, and the method further comprises, before the step of training the first ranking model through the plurality of first POI samples, the query word combination corresponding to each of the first POI samples, the second feature, and the manually labeled sample attribute:
training the pure display relevance model through a plurality of first POI samples with known display relevance features;
the step of training the first ranking model through a plurality of first POI samples, a query word combination corresponding to each of the first POI samples, a second feature, and a manually labeled sample attribute includes:
acquiring a first feature of each first POI sample, and training the first ranking model based on the first feature and the query word combination of each first POI sample, wherein the first feature comprises at least the display relevance feature and a score feature;
wherein the score feature is the score of the display relevance feature of the first POI sample obtained through the pure display relevance model.
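The relationship in claim 11 between the pure display relevance model and the first feature can be sketched in Python as follows. The function name and the callable standing in for the trained pure display relevance model are hypothetical; the sketch only shows the score being appended to the display relevance features to form the input of the first ranking model.

```python
from typing import Callable, List, Sequence


def build_first_feature(display_features: Sequence[float],
                        relevance_model: Callable[[Sequence[float]], float]) -> List[float]:
    """Return the display relevance features followed by the score feature.

    `relevance_model` is a stand-in for an already trained pure display
    relevance model; any callable mapping features to one score works here.
    """
    score = relevance_model(display_features)
    return list(display_features) + [score]
```

In this arrangement the pure display relevance model has to be trained first, which matches the order of steps stated in the claim.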
12. The method according to claim 10 or 11, wherein the step of obtaining a plurality of first POI samples, each corresponding to the query word combination, the second feature and the manually labeled sample attribute, and a plurality of second POI samples, each corresponding to the second feature and the user behavior label-based sample attribute comprises:
obtaining display information of each first POI sample corresponding to the query word combination sample when the first POI sample is displayed in a page, wherein the query word combination sample comprises at least one query word sample;
acquiring, respectively, the display relevance feature and the sample attribute of each first POI sample based on the matching relationship between the display information of each first POI sample and the query word sample and on the manual label of each first POI sample, to obtain a plurality of manually labeled first POI samples with known second features;
obtaining a POI sample list exposed to a user, the user behavior data received by each second POI sample in the POI sample list, the query word combination sample corresponding to each second POI sample, and the display information of each second POI sample when it is exposed in a page;
acquiring, respectively, the display relevance feature of each second POI sample based on the matching relationship between the display information of each second POI sample and the corresponding query word sample;
based on the user behavior data received by each second POI sample in the second POI sample list and the display relevance characteristics of each second POI sample, obtaining the sample attributes of each second POI sample, and obtaining a plurality of second POI samples with known second characteristics based on user behavior labeling.
13. The method according to claim 12, wherein the step of obtaining the sample attribute of each second POI sample based on the user behavior data received from each second POI sample in the second POI sample list and the display relevance feature of each second POI sample to obtain a plurality of second POI samples with known second features based on the user behavior labels comprises:
obtaining a score of a display relevance feature of each second POI sample through the pure display relevance model;
and, for any second POI sample, modifying the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample.
14. The method of claim 13, wherein the step of modifying the sample attribute of the second POI sample based on the score of the display relevance feature of the second POI sample and the user behavior data received by the second POI sample comprises:
obtaining user ordering samples, user click samples, and exposed but non-clicked samples among the second POI samples based on the user behavior data received by each second POI sample;
for any user ordering sample, in response to the score of the display relevance feature of the user ordering sample being greater than a first threshold, modifying the sample attribute of the user ordering sample to a positive sample;
for any user click sample, in response to the score of the display relevance feature of the user click sample being greater than a second threshold, modifying the sample attribute of the user click sample to a positive sample;
for any exposed but non-clicked sample, in response to the score of the display relevance feature of the exposed but non-clicked sample being less than a third threshold, modifying the sample attribute of the exposed but non-clicked sample to a negative sample.
15. A data sorting device, wherein a data sorting model is provided in the data sorting device, the data sorting model comprises at least a first sorting model and a second sorting model, and the device further comprises:
a ranking data acquisition module, configured to acquire an initial POI list to be ranked and the query word combination corresponding to the initial POI list, wherein the initial POI list comprises L POIs, L is a positive integer greater than 1, and the query word combination comprises at least one query word;
a first data sorting module, configured to sort each POI in the initial POI list through the first sorting model to obtain a first POI list;
a second data sorting module, configured to re-sort the first M POIs in the first POI list through the second sorting model based on second features of the first M POIs to obtain a second POI list, wherein M is a positive integer greater than 1 and less than or equal to L, the second features comprise at least a display relevance feature, and the display relevance feature is obtained based on the matching relationship between the query words and the display information of the POI when the POI is displayed in a page;
and a ranking result acquisition module, configured to acquire, based on the first POI list and the second POI list, a target POI list obtained after each POI in the initial POI list is ranked, wherein the first N POIs in the target POI list are consistent with the first N POIs in the second POI list, the remaining POIs are arranged in the (N+1)-th to L-th positions in the order in which they appear in the first POI list, and N is a positive integer greater than 1 and less than or equal to M.
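The way the target POI list is assembled from the first and second POI lists can be sketched in Python as follows. The scoring callables stand in for the first and second sorting models and are hypothetical, as is the assumption that each POI is represented by a unique hashable identifier.

```python
from typing import Callable, Hashable, List


def two_stage_rank(pois: List[Hashable],
                   first_score: Callable[[Hashable], float],
                   second_score: Callable[[Hashable], float],
                   m: int, n: int) -> List[Hashable]:
    """Rank all POIs, re-rank the top M, and keep the re-ranked top N in front."""
    if not (1 < n <= m <= len(pois)):
        raise ValueError("expected 1 < N <= M <= L")
    # First POI list: every POI ordered by the first model's score.
    first_list = sorted(pois, key=first_score, reverse=True)
    # Second POI list: the first M POIs re-ordered by the second model's score.
    second_list = sorted(first_list[:m], key=second_score, reverse=True)
    head = second_list[:n]
    head_set = set(head)
    # Remaining POIs fill positions N+1..L in their first-list order.
    tail = [p for p in first_list if p not in head_set]
    return head + tail
```

Because the tail is taken from the first POI list, POIs that were re-ranked but fell outside the top N return to the relative order given by the first model, as the claim describes.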
16. An apparatus for training a data sorting model, wherein the data sorting model comprises at least a first sorting model and a second sorting model, the apparatus comprising:
a training sample acquisition module, configured to acquire a plurality of first POI samples together with the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample, and a plurality of second POI samples together with the second feature and the user-behavior-labeled sample attribute of each second POI sample, wherein the query word combination comprises at least one query word;
a first ranking model training module, configured to train the first ranking model through the plurality of first POI samples and the query word combination, the second feature, and the manually labeled sample attribute corresponding to each first POI sample;
a second ranking model training module, configured to train the second ranking model through a plurality of the first POI samples and a plurality of the second POI samples for which second features and sample attributes are known;
wherein the second feature comprises at least a display relevance feature, the display relevance feature is obtained based on the matching relationship between the query words and the display information of the POI sample when the POI sample is displayed in a page, the user behavior comprises at least one of an ordering behavior and a clicking behavior of a user with respect to the second POI sample when the second POI sample is exposed, and the sample attribute comprises a positive sample and a negative sample.
17. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data sorting method of any one of claims 1 to 9 and/or the steps of the data sorting model training method of any one of claims 10 to 14.
18. A computer-readable storage medium, wherein a computer program is stored thereon, and the computer program, when executed by a processor, implements the steps of the data sorting method of any one of claims 1 to 9 and/or the steps of the data sorting model training method of any one of claims 10 to 14.
CN202110552833.9A 2021-05-20 2021-05-20 Data sorting method and device, and data sorting model training method and device Withdrawn CN113360796A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110552833.9A CN113360796A (en) 2021-05-20 2021-05-20 Data sorting method and device, and data sorting model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110552833.9A CN113360796A (en) 2021-05-20 2021-05-20 Data sorting method and device, and data sorting model training method and device

Publications (1)

Publication Number Publication Date
CN113360796A true CN113360796A (en) 2021-09-07

Family

ID=77527047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110552833.9A Withdrawn CN113360796A (en) 2021-05-20 2021-05-20 Data sorting method and device, and data sorting model training method and device

Country Status (1)

Country Link
CN (1) CN113360796A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186163A (en) * 2022-06-27 2022-10-14 北京百度网讯科技有限公司 Training method and device of search result ranking model and search result ranking method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102810117A (en) * 2012-06-29 2012-12-05 北京百度网讯科技有限公司 Method and equipment for supplying search result
US20180293298A1 (en) * 2017-04-07 2018-10-11 Sap Se Reordering of enriched inverted indices
CN110046298A (en) * 2019-04-24 2019-07-23 中国人民解放军国防科技大学 Query word recommendation method and device, terminal device and computer readable medium
CN110737816A (en) * 2018-07-02 2020-01-31 北京三快在线科技有限公司 Sorting method and device, electronic equipment and readable storage medium
CN111177585A (en) * 2018-11-13 2020-05-19 北京四维图新科技股份有限公司 Map POI feedback method and device

Similar Documents

Publication Publication Date Title
US20170091335A1 (en) Search method, server and client
US10162865B2 (en) Generating image tags
CN108121803B (en) Method and server for determining page layout
CN110019840B (en) Method, device and server for updating entities in knowledge graph
CN110209810B (en) Similar text recognition method and device
CN109561211B (en) Information display method and mobile terminal
CN111177180A (en) Data query method and device and electronic equipment
CN110989847B (en) Information recommendation method, device, terminal equipment and storage medium
CN111368171B (en) Keyword recommendation method, related device and storage medium
WO2021147421A1 (en) Automatic question answering method and apparatus for man-machine interaction, and intelligent device
CN110276010B (en) Weight model training method and related device
CN110162653B (en) Image-text sequencing recommendation method and terminal equipment
CN112685578B (en) Method and device for providing multimedia information content
CN111078986A (en) Data retrieval method, device and computer readable storage medium
CN108595107B (en) Interface content processing method and mobile terminal
CN108307039B (en) Application information display method and mobile terminal
CN111629247A (en) Information display method and device and electronic equipment
CN110196833B (en) Application searching method, device, terminal and storage medium
CN111586329A (en) Information display method and device and electronic equipment
CN111553163A (en) Text relevance determining method and device, storage medium and electronic equipment
CN113360796A (en) Data sorting method and device, and data sorting model training method and device
CN112925878B (en) Data processing method and device
CN115080840A (en) Content pushing method and device and storage medium
CN114817742B (en) Knowledge distillation-based recommendation model configuration method, device, equipment and medium
CN110929882A (en) Feature vector calculation method based on artificial intelligence and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210907
