CN114595370A - Model training and sorting method and device, electronic equipment and storage medium - Google Patents

Model training and sorting method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN114595370A
Authority
CN
China
Prior art keywords
sample
keyword
semantic representation
vector
keywords
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210143365.4A
Other languages
Chinese (zh)
Inventor
薛涛锋
李悦
陶然
郭圣昱
张凯
杨一帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202210143365.4A priority Critical patent/CN114595370A/en
Publication of CN114595370A publication Critical patent/CN114595370A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9532Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Creation or modification of classes or clusters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure provide a model training method, a ranking method, an apparatus, an electronic device, and a storage medium. The model training method includes: acquiring sample data, where the sample data includes sample search information and sample keywords corresponding to a sample object; in a preset to-be-trained model, performing feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and obtaining a sample ranking parameter of the sample object based on the comprehensive semantic representation vector; and taking the model determined, based on the sample ranking parameter, to have completed training as the ranking model. In the embodiments of the present disclosure, keyword information of an object is integrated into the ranking model. These keywords are mined from the object's unstructured features and can better cover and characterize user intent, so a ranking model that incorporates keyword information can capture both the semantics of the keywords themselves and the semantic correlation between the keywords and the user's search information, thereby improving the accuracy of the ranking model.

Description

Model training method, ranking method, apparatus, electronic device and storage medium

TECHNICAL FIELD

The present disclosure relates to the technical field of data processing, and in particular to a model training method, a ranking method, an apparatus, an electronic device, and a storage medium.

BACKGROUND

With the rapid development of Internet technology, finding the information one needs in the sea of online information is like looking for a needle in a haystack, a problem that search engine technology addresses. A search engine is a retrieval technology that, according to user needs and certain algorithms, uses specific strategies to retrieve specified information and return it to the user.

During a search, multiple objects are first recalled based on the search information entered by the user, and the recalled objects are then ranked. In the ranking stage, a ranking model is usually used to analyze the recalled objects to obtain ranking parameters, and the recalled objects are ranked according to those parameters.

In the prior art, the ranking model is usually built with a specific network structure and is mainly used to analyze the correlation between the search information entered by the user and the features of the object itself. However, because the object's own features cannot accurately reflect its important characteristics, the analysis accuracy of such a ranking model is poor and the model performs poorly.

SUMMARY OF THE INVENTION

In view of the above problems, embodiments of the present disclosure propose a model training method, a ranking method, an apparatus, an electronic device, and a storage medium, so as to improve the accuracy of the ranking model.

According to a first aspect of the embodiments of the present disclosure, a model training method is provided, including: acquiring sample data, where the sample data includes sample search information and sample keywords corresponding to a sample object; in a preset to-be-trained model, performing feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and obtaining a sample ranking parameter of the sample object based on the comprehensive semantic representation vector; and taking the model determined, based on the sample ranking parameter, to have completed training as the ranking model.

Optionally, performing feature fusion on the sample keywords and the sample search information includes: for each sample keyword, performing feature fusion on the current sample keyword and the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; for each sample keyword, performing feature fusion on the current sample keyword and the sample search information to obtain the relevance between the current sample keyword and the sample search information; and calculating the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance values of the sample keywords.

Optionally, performing feature fusion on the sample keywords and the sample search information includes: querying, from a preset correspondence between words and feature vectors, the feature vector of each sample keyword and the feature vector of the sample search information; for each sample keyword, performing feature fusion on the feature vector of the current sample keyword and the feature vectors of the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; for each sample keyword, performing feature fusion on the fused semantic representation vector of the current sample keyword and the feature vector of the sample search information to obtain the relevance between the current sample keyword and the sample search information; and calculating the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance values of the sample keywords and the feature vector of the sample search information.
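As a concrete illustration of the lookup step, the sketch below uses a plain Python dictionary as the preset word-to-feature-vector correspondence. The table contents, dimensions, and the zero-vector fallback for unknown words are all assumptions for illustration, not the patent's actual lookup mechanism.

```python
import numpy as np

# Hypothetical embedding table standing in for the preset correspondence
# between words and feature vectors; values are random placeholders.
EMBED_DIM = 4
rng = np.random.default_rng(0)
embedding_table = {
    "hotpot": rng.normal(size=EMBED_DIM),
    "spicy": rng.normal(size=EMBED_DIM),
    "beef": rng.normal(size=EMBED_DIM),
}

def lookup(words, table, dim=EMBED_DIM):
    """Return a (len(words), dim) matrix of feature vectors; unknown
    words fall back to a zero vector (an assumption for this sketch)."""
    return np.stack([table.get(w, np.zeros(dim)) for w in words])

keyword_vecs = lookup(["hotpot", "spicy"], embedding_table)  # sample keywords
query_vec = lookup(["beef"], embedding_table)[0]             # sample search info
print(keyword_vecs.shape, query_vec.shape)  # (2, 4) (4,)
```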

Optionally, performing feature fusion on the feature vector of the current sample keyword and the feature vectors of the other sample keywords includes: fusing the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain the fused semantic representation vector of the current sample keyword.
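The self-attention fusion among keyword feature vectors can be sketched with scaled dot-product attention. The projection matrices and dimensions below are illustrative assumptions; the patent does not fix a specific attention variant for this step.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fuse(K, Wq, Wk, Wv):
    """Fuse each keyword vector with the others via scaled dot-product
    self-attention, yielding one fused semantic vector per keyword."""
    Q, Kp, V = K @ Wq, K @ Wk, K @ Wv
    scores = Q @ Kp.T / np.sqrt(K.shape[1])   # pairwise attention logits
    return softmax(scores, axis=-1) @ V       # attention-weighted mixture

rng = np.random.default_rng(1)
K = rng.normal(size=(3, 4))                   # 3 sample keywords, dim 4
Wq, Wk, Wv = [rng.normal(size=(4, 4)) for _ in range(3)]
fused = self_attention_fuse(K, Wq, Wk, Wv)    # shape (3, 4)
```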

Optionally, the sample data further includes confidence levels of the sample keywords, and performing feature fusion on the feature vector of the current sample keyword and the feature vectors of the other sample keywords includes: fusing the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain a preliminary fused semantic representation vector of the current sample keyword; sorting the sample keywords in descending order of confidence and obtaining a position embedding vector for each sample keyword based on the sorting result; and adding the preliminary fused semantic representation vector and the position embedding vector of the current sample keyword to obtain the fused semantic representation vector of the current sample keyword.
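A minimal sketch of the confidence-based position embedding step, assuming a table with one embedding row per rank (the table values here are random placeholders rather than learned parameters):

```python
import numpy as np

def add_rank_position_embedding(fused, confidences, pos_table):
    """Sort keywords by confidence (descending) and add the position
    embedding of each keyword's rank to its preliminary fused vector."""
    order = np.argsort(-np.asarray(confidences))  # indices in descending confidence
    rank_of = np.empty_like(order)
    rank_of[order] = np.arange(len(order))        # keyword index -> its rank
    return fused + pos_table[rank_of]

rng = np.random.default_rng(2)
fused = rng.normal(size=(3, 4))       # preliminary fused semantic vectors
pos_table = rng.normal(size=(3, 4))   # one position embedding per rank
out = add_rank_position_embedding(fused, [0.2, 0.9, 0.5], pos_table)
```

Keyword 1 has the highest confidence (0.9), so it receives the rank-0 position embedding.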

Optionally, the sample data further includes confidence levels of the sample keywords, and performing feature fusion on the fused semantic representation vector of the current sample keyword and the feature vector of the sample search information includes: calculating a relevance weight from the confidence of the current sample keyword and the search information; calculating an intermediate parameter of the current sample keyword based on its fused semantic representation vector, the feature vector of the sample search information, and its relevance weight; and calculating the relevance between the current sample keyword and the sample search information based on the intermediate parameter and a preset temperature parameter.
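One plausible reading of this step is a confidence-weighted dot product as the intermediate parameter, followed by a temperature-scaled softmax over the keywords. The exact functional forms in the patent may differ, so treat the following as a hedged sketch:

```python
import numpy as np

def keyword_relevance(fused, q, confidences, tau=0.5):
    """Temperature-scaled relevance of each keyword to the search query.
    Assumed form: intermediate parameter = confidence-weighted dot product
    between the fused keyword vector and the query feature vector."""
    conf = np.asarray(confidences)
    logits = conf * (fused @ q)       # intermediate parameters, one per keyword
    z = logits / tau                  # preset temperature parameter tau
    e = np.exp(z - z.max())
    return e / e.sum()                # relevance values sum to 1

rng = np.random.default_rng(3)
fused = rng.normal(size=(3, 4))       # fused semantic representation vectors
q = rng.normal(size=4)                # feature vector of the search information
rel = keyword_relevance(fused, q, [0.9, 0.6, 0.3])
```

A smaller `tau` sharpens the distribution toward the single most relevant keyword; a larger `tau` flattens it.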

Optionally, calculating the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance values of the sample keywords and the feature vector of the sample search information includes: summing the products of each sample keyword's fused semantic representation vector and relevance to obtain a preliminary semantic representation vector of the sample keywords; adding the preliminary semantic representation vector and the feature vector of the sample search information to obtain a preliminary comprehensive semantic representation vector of the sample keywords; and normalizing the preliminary comprehensive semantic representation vector to obtain the comprehensive semantic representation vector of the sample keywords.

Optionally, after summing the products of each sample keyword's fused semantic representation vector and relevance to obtain the preliminary semantic representation vector of the sample keywords, the method further includes: applying random dropout to the preliminary semantic representation vector; and adding the preliminary semantic representation vector to the feature vector of the sample search information includes: adding the preliminary semantic representation vector after dropout to the feature vector of the sample search information.
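Putting these steps together, the comprehensive vector can be sketched as a relevance-weighted sum with inverted dropout, a residual addition of the search-information feature vector, and layer normalization. Layer normalization is an assumption here; the text only specifies a standardization step.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Assumed standardization: zero-mean, unit-variance normalization."""
    return (x - x.mean()) / (x.std() + eps)

def comprehensive_vector(fused, rel, q, drop_p=0.1, rng=None, train=True):
    """Relevance-weighted sum of fused keyword vectors, with inverted
    dropout on the preliminary vector, a residual addition of the query
    feature vector, and normalization."""
    prelim = rel @ fused                           # weighted sum over keywords
    if train:
        if rng is None:
            rng = np.random.default_rng()
        mask = rng.random(prelim.shape) >= drop_p
        prelim = prelim * mask / (1.0 - drop_p)    # inverted dropout
    return layer_norm(prelim + q)                  # residual + standardization

rng = np.random.default_rng(4)
fused = rng.normal(size=(3, 4))
rel = np.array([0.5, 0.3, 0.2])                    # relevance values, sum to 1
q = rng.normal(size=4)
vec = comprehensive_vector(fused, rel, q, train=False)
```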

Optionally, the sample data further includes sample description information corresponding to the sample object, and obtaining the sample ranking parameter of the sample object based on the comprehensive semantic representation vector includes: performing feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain the sample ranking parameter of the sample object.

Optionally, performing feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information includes: performing deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a first sample ranking parameter of the sample object; performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a second sample ranking parameter of the sample object; and calculating the sample ranking parameter of the sample object based on the first sample ranking parameter and the second sample ranking parameter.

Optionally, performing deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information includes: obtaining the feature vector of the sample description information and the feature vector of the sample search information; generating a spliced feature vector based on the comprehensive semantic representation vector, the feature vector of the sample description information, and the feature vector of the sample search information; and performing feature fusion on the spliced feature vector with a preset deep fusion network to obtain the first sample ranking parameter.

Optionally, performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information includes: obtaining the feature vector of the sample description information and the feature vector of the sample search information; and performing feature fusion on the feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector with a preset shallow fusion network to obtain the second sample ranking parameter.
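The deep and shallow fusion branches resemble a wide-and-deep arrangement: an MLP over the spliced feature vector plus a linear interaction over the same features, with the two ranking parameters combined. The network shapes and the simple additive combination below are assumptions for illustration, not the patent's actual networks.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def deep_score(x, W1, b1, w2, b2):
    """Deep branch: a small MLP over the spliced feature vector."""
    return float(relu(x @ W1 + b1) @ w2 + b2)

def shallow_score(x, w):
    """Shallow branch: a single linear interaction over the same features."""
    return float(x @ w)

def ranking_parameter(comp_vec, desc_vec, query_vec, params):
    """Splice the three inputs, score with both branches, and combine
    (additive combination is an assumption)."""
    x = np.concatenate([comp_vec, desc_vec, query_vec])
    return deep_score(x, *params["deep"]) + shallow_score(x, params["shallow"])

rng = np.random.default_rng(5)
dim = 4
params = {
    "deep": (rng.normal(size=(3 * dim, 8)), np.zeros(8),
             rng.normal(size=8), 0.0),
    "shallow": rng.normal(size=3 * dim),
}
score = ranking_parameter(rng.normal(size=dim), rng.normal(size=dim),
                          rng.normal(size=dim), params)
```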

According to a second aspect of the embodiments of the present disclosure, a ranking method is provided, including: acquiring search information and keywords corresponding to objects to be ranked; inputting the search information and the keywords into a pre-trained ranking model to obtain ranking parameters of the objects to be ranked output by the ranking model, where the ranking model is obtained by the model training method described in any one of the above; and ranking the objects to be ranked based on the ranking parameters.
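At inference time, this reduces to scoring each recalled object with the trained model and sorting by the predicted parameter in descending order. The scoring function below is a toy stand-in for the model, purely to show the control flow:

```python
def rank_objects(objects, score_fn):
    """Score each recalled object and sort in descending order of the
    predicted ranking parameter."""
    scored = [(obj, score_fn(obj)) for obj in objects]
    scored.sort(key=lambda t: t[1], reverse=True)
    return [obj for obj, _ in scored]

# Toy stand-in for the trained model: score = number of query keywords
# an object's keyword set covers (illustrative only).
query_kws = {"spicy", "hotpot"}
objects = [
    {"name": "shop_a", "keywords": {"noodle"}},
    {"name": "shop_b", "keywords": {"spicy", "hotpot"}},
    {"name": "shop_c", "keywords": {"hotpot"}},
]
ranked = rank_objects(objects, lambda o: len(o["keywords"] & query_kws))
print([o["name"] for o in ranked])  # ['shop_b', 'shop_c', 'shop_a']
```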

According to a third aspect of the embodiments of the present disclosure, a model training apparatus is provided, including: a first acquisition module, configured to acquire sample data, where the sample data includes sample search information and sample keywords corresponding to a sample object; a training module, configured to perform, in a preset to-be-trained model, feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and obtain a sample ranking parameter of the sample object based on the comprehensive semantic representation vector; and a determining module, configured to take the model determined, based on the sample ranking parameter, to have completed training as the ranking model.

Optionally, the training module includes: a first fusion unit, configured to perform, for each sample keyword, feature fusion on the current sample keyword and the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; a second fusion unit, configured to perform, for each sample keyword, feature fusion on the current sample keyword and the sample search information to obtain the relevance between the current sample keyword and the sample search information; and a first calculation unit, configured to calculate the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance values of the sample keywords.

Optionally, the training module includes: a query unit, configured to query, from a preset correspondence between words and feature vectors, the feature vector of each sample keyword and the feature vector of the sample search information; a third fusion unit, configured to perform, for each sample keyword, feature fusion on the feature vector of the current sample keyword and the feature vectors of the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; a fourth fusion unit, configured to perform, for each sample keyword, feature fusion on the fused semantic representation vector of the current sample keyword and the feature vector of the sample search information to obtain the relevance between the current sample keyword and the sample search information; and a second calculation unit, configured to calculate the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance values of the sample keywords and the feature vector of the sample search information.

Optionally, the third fusion unit is specifically configured to fuse the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain the fused semantic representation vector of the current sample keyword.

Optionally, the sample data further includes confidence levels of the sample keywords, and the third fusion unit is specifically configured to: fuse the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain a preliminary fused semantic representation vector of the current sample keyword; sort the sample keywords in descending order of confidence and obtain a position embedding vector for each sample keyword based on the sorting result; and add the preliminary fused semantic representation vector and the position embedding vector of the current sample keyword to obtain the fused semantic representation vector of the current sample keyword.

Optionally, the sample data further includes confidence levels of the sample keywords, and the fourth fusion unit is specifically configured to: calculate a relevance weight from the confidence of the current sample keyword and the search information; calculate an intermediate parameter of the current sample keyword based on its fused semantic representation vector, the feature vector of the sample search information, and its relevance weight; and calculate the relevance between the current sample keyword and the sample search information based on the intermediate parameter and a preset temperature parameter.

Optionally, the second calculation unit is specifically configured to: sum the products of each sample keyword's fused semantic representation vector and relevance to obtain a preliminary semantic representation vector of the sample keywords; add the preliminary semantic representation vector and the feature vector of the sample search information to obtain a preliminary comprehensive semantic representation vector of the sample keywords; and normalize the preliminary comprehensive semantic representation vector to obtain the comprehensive semantic representation vector of the sample keywords.

Optionally, the second calculation unit is further configured to apply random dropout to the preliminary semantic representation vector after summing the products of each sample keyword's fused semantic representation vector and relevance to obtain the preliminary semantic representation vector of the sample keywords; and the second calculation unit is specifically configured to add the preliminary semantic representation vector after dropout to the feature vector of the sample search information.

Optionally, the sample data further includes sample description information corresponding to the sample object, and the training module is specifically configured to perform feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain the sample ranking parameter of the sample object.

Optionally, the training module includes: a fifth fusion unit, configured to perform deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a first sample ranking parameter of the sample object; a sixth fusion unit, configured to perform shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a second sample ranking parameter of the sample object; and a third calculation unit, configured to calculate the sample ranking parameter of the sample object based on the first sample ranking parameter and the second sample ranking parameter.

Optionally, the fifth fusion unit is specifically configured to: obtain the feature vector of the sample description information and the feature vector of the sample search information; generate a spliced feature vector based on the comprehensive semantic representation vector, the feature vector of the sample description information, and the feature vector of the sample search information; and perform feature fusion on the spliced feature vector with a preset deep fusion network to obtain the first sample ranking parameter.

Optionally, the sixth fusion unit is specifically configured to: obtain the feature vector of the sample description information and the feature vector of the sample search information; and perform feature fusion on the feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector with a preset shallow fusion network to obtain the second sample ranking parameter.

According to a fourth aspect of the embodiments of the present disclosure, a ranking apparatus is provided, including: a second acquisition module, configured to acquire search information and keywords corresponding to objects to be ranked; a prediction module, configured to input the search information and the keywords into a pre-trained ranking model to obtain ranking parameters of the objects to be ranked output by the ranking model, where the ranking model is obtained by the model training method described in any one of the above; and a ranking module, configured to rank the objects to be ranked based on the ranking parameters.

According to a fifth aspect of the embodiments of the present disclosure, an electronic device is provided, including: one or more processors; and one or more computer-readable storage media storing instructions thereon, where the instructions, when executed by the one or more processors, cause the processors to perform the model training method described in any one of the above, or perform the ranking method described in any one of the above.

According to a sixth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, causes the processor to perform the model training method described in any one of the above, or perform the ranking method described in any one of the above.

Embodiments of the present disclosure provide a model training method, a ranking method, an apparatus, an electronic device, and a storage medium. During model training, sample data is acquired, where the sample data includes sample search information and sample keywords corresponding to a sample object; in a preset to-be-trained model, feature fusion is performed on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and a sample ranking parameter of the sample object is obtained based on the comprehensive semantic representation vector; and the model determined, based on the sample ranking parameter, to have completed training is taken as the ranking model. Thus, in the embodiments of the present disclosure, keyword information of an object is integrated into the ranking model. These keywords are mined from the object's unstructured features; compared with those unstructured features, the keywords better distill and summarize the object's topics and can cover a wide range of user intents. A ranking model that incorporates keyword information can therefore capture both the semantics of the keywords themselves and the semantic correlation between the keywords and the user's search information, thereby improving the accuracy of the ranking model.

附图说明Description of drawings

为了更清楚地说明本公开的实施例的技术方案，下面将对本公开的实施例的描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本公开的实施例的一些附图，对于本领域普通技术人员来讲，在不付出创造性劳动性的前提下，还可以根据这些附图获得其他的附图。In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some of the drawings of the embodiments of the present disclosure; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.

图1是本公开实施例的一种模型训练方法的步骤流程图。FIG. 1 is a flowchart of steps of a model training method according to an embodiment of the present disclosure.

图2是本公开实施例的一种关键词的示意图。FIG. 2 is a schematic diagram of a keyword according to an embodiment of the present disclosure.

图3是本公开实施例的一种排序方法的步骤流程图。FIG. 3 is a flowchart of steps of a sorting method according to an embodiment of the present disclosure.

图4是本公开实施例的另一种模型训练方法的步骤流程图。FIG. 4 is a flowchart of steps of another model training method according to an embodiment of the present disclosure.

图5是本公开实施例的另一种排序方法的步骤流程图。FIG. 5 is a flowchart of steps of another sorting method according to an embodiment of the present disclosure.

图6是本公开实施例的一种整体处理过程的示意图。FIG. 6 is a schematic diagram of an overall processing process according to an embodiment of the present disclosure.

图7是本公开实施例的一种语义编码网络的示意图。FIG. 7 is a schematic diagram of a semantic encoding network according to an embodiment of the present disclosure.

图8是本公开实施例的一种特征交互网络的示意图。FIG. 8 is a schematic diagram of a feature interaction network according to an embodiment of the present disclosure.

图9是本公开实施例的一种模型训练装置的结构框图。FIG. 9 is a structural block diagram of a model training apparatus according to an embodiment of the present disclosure.

图10是本公开实施例的一种排序装置的结构框图。FIG. 10 is a structural block diagram of a sorting apparatus according to an embodiment of the present disclosure.

具体实施方式Detailed ways

下面将结合本公开的实施例中的附图，对本公开的实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例只是本公开的一部分实施例，而不是本公开的全部实施例。基于本公开中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本公开保护的范围。The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some of the embodiments of the present disclosure, rather than all of them. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present disclosure.

本公开实施例的排序模型可以应用于搜索场景中的排序环节，推荐场景中的排序环节，等等。通过将对象的关键词信息融入到排序模型中，相比于仅对对象本身的非结构化特征进行分析的排序模型，融入关键词信息的排序模型能够捕捉关键词本身的语义以及关键词和用户搜索信息之间的语义相关性，从而提高排序模型的准确性。The ranking model of the embodiments of the present disclosure can be applied to the ranking stage in search scenarios, the ranking stage in recommendation scenarios, and so on. By integrating the keyword information of objects into the ranking model, compared with a ranking model that only analyzes the unstructured features of the objects themselves, a ranking model incorporating keyword information can capture the semantics of the keywords themselves as well as the semantic correlation between keywords and user search information, thereby improving the accuracy of the ranking model.

参照图1,示出了本公开实施例的一种模型训练方法的步骤流程图。Referring to FIG. 1 , a flowchart of steps of a model training method according to an embodiment of the present disclosure is shown.

如图1所示,模型训练方法可以包括以下步骤:As shown in Figure 1, the model training method can include the following steps:

步骤101,获取样本数据。Step 101, obtain sample data.

本实施例中,样本数据可以包括样本对象对应的样本搜索信息和样本关键词等,样本数据还可以包括样本对象的实际排序参数。In this embodiment, the sample data may include sample search information and sample keywords corresponding to the sample objects, and the sample data may also include actual sorting parameters of the sample objects.

样本对象可以包括但不限于内容对象等。内容对象可以包括但不限于:文档,网页,图片,视频,音频,等等。Sample objects may include, but are not limited to, content objects and the like. Content objects may include, but are not limited to: documents, web pages, pictures, videos, audios, and so on.

样本搜索信息可以包括但不限于用户输入的搜索信息等。搜索信息的形式可以包括但不限于:文本形式,语音形式,图片形式,等等。The sample search information may include, but is not limited to, search information input by a user, and the like. The form of search information may include, but is not limited to: text form, voice form, picture form, and so on.

样本关键词是指从样本对象的相关信息中挖掘出来的，能够更好地提炼和概括样本对象的主题的词语或短语，每个样本对象可以具有至少一个关键词。参照图2，示出了本公开实施例的一种关键词的示意图。图2中示出了4个对象，其中，第一个对象的关键词包括“老上海风情街”和“逛街好去处”，第二个对象的关键词包括“上海书店”和“小众旅游攻略”，第三个对象的关键词包括“上海名人街”和“老洋房”，第四个对象的关键词包括“上海旅行攻略”和“必打卡的景点”。Sample keywords refer to words or phrases mined from the relevant information of the sample objects that can better refine and summarize the subject matter of the sample objects, and each sample object may have at least one keyword. Referring to FIG. 2 , a schematic diagram of keywords according to an embodiment of the present disclosure is shown. FIG. 2 shows four objects, where the keywords of the first object include "Old Shanghai Style Street" and "A Good Place for Shopping", the keywords of the second object include "Shanghai Bookstore" and "Niche Travel Guide", the keywords of the third object include "Shanghai Celebrity Street" and "Old Western-style Houses", and the keywords of the fourth object include "Shanghai Travel Guide" and "Must-Visit Attractions".

实际排序参数是指与样本对象的排序相关的实际参数。实际参数可以包括但不限于:点击率,转化率,曝光量,等等。The actual sorting parameter refers to the actual parameter related to the sorting of the sample objects. Actual parameters can include, but are not limited to: click-through rate, conversion rate, exposure, etc.

步骤102，在预设的待训练模型中，对所述样本关键词和所述样本搜索信息进行特征融合，得到所述样本关键词的综合语义表征向量，基于所述综合语义表征向量获取所述样本对象的样本排序参数。Step 102: In the preset model to be trained, perform feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and obtain sample sorting parameters of the sample object based on the comprehensive semantic representation vector.

将样本对象对应的样本搜索信息和样本关键词输入预设的待训练模型中。在待训练模型中，通过对样本关键词和样本搜索信息进行特征融合得到样本关键词的综合语义表征向量。样本关键词的综合语义表征向量能够捕捉用户的搜索意图，提高搜索的相关性，因此，基于样本关键词的综合语义表征向量获取到的样本排序参数，能够更加准确地体现出样本对象的排序特征。The sample search information and sample keywords corresponding to the sample object are input into the preset model to be trained. In the model to be trained, the comprehensive semantic representation vector of the sample keywords is obtained by feature fusion of the sample keywords and the sample search information. This vector can capture the user's search intent and improve search relevance; therefore, the sample sorting parameters obtained based on it can more accurately reflect the sorting characteristics of the sample object.

步骤103,将基于所述样本排序参数确定训练完成的模型作为排序模型。Step 103 , determining the trained model based on the sample sorting parameter as the sorting model.

待训练模型基于综合语义表征向量获取样本对象的样本排序参数,并输出样本对象的样本排序参数。基于样本对象的样本排序参数和实际排序参数可以确定是否训练完成。The model to be trained obtains the sample sorting parameters of the sample objects based on the comprehensive semantic representation vector, and outputs the sample sorting parameters of the sample objects. Whether the training is complete can be determined based on the sample sorting parameters and the actual sorting parameters of the sample object.

可选地，基于样本对象的样本排序参数和实际排序参数可以计算损失函数。损失函数是用来估量模型的预测值与真实值的不一致程度。若损失函数很小，表明模型与数据真实分布很接近，则模型性能良好；若损失函数很大，表明模型与数据真实分布差别较大，则模型性能不佳。训练模型的任务是使用优化方法来寻找损失函数最小化对应的模型参数。因此，在损失函数满足预设条件（比如损失函数小于一定阈值）的情况下，可以确定训练完成。可选地，损失函数可以包括但不限于：交叉熵损失函数、指数损失函数、Dice损失函数、交并比损失函数，等等。Optionally, a loss function can be calculated based on the sample sorting parameters and the actual sorting parameters of the sample object. The loss function measures the degree of inconsistency between the model's predicted values and the true values. If the loss is small, the model is close to the true data distribution and performs well; if the loss is large, the model deviates considerably from the true data distribution and performs poorly. The task of training the model is to use an optimization method to find the model parameters that minimize the loss function. Therefore, when the loss function satisfies a preset condition (for example, the loss is smaller than a certain threshold), it can be determined that training is complete. Optionally, the loss function may include, but is not limited to: a cross-entropy loss function, an exponential loss function, a Dice loss function, an intersection-over-union (IoU) loss function, and so on.
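As an illustrative sketch only (not the disclosure's actual implementation), the loss computation and threshold-based stopping check described above might look as follows; the labels, predictions, and the 0.5 threshold are all hypothetical values:

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between actual and predicted sorting parameters
    (e.g. click-through rates treated as probabilities)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical labels and predictions; training stops once the loss is below a threshold.
actual = [1.0, 0.0, 1.0]      # actual sorting parameters (labels)
predicted = [0.9, 0.1, 0.8]   # sample sorting parameters predicted by the model
loss = binary_cross_entropy(actual, predicted)
converged = loss < 0.5        # 0.5 is an illustrative threshold, not from the disclosure
```

If `converged` is false, the model parameters would be updated and training would continue, matching the loop described in the following paragraph.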

响应于确定未训练完成,可以更新待训练模型的参数,继续进行训练,直至训练完成。响应于确定训练完成,将训练完成的模型作为排序模型。In response to determining that the training is not completed, the parameters of the model to be trained can be updated, and the training can be continued until the training is completed. In response to determining that the training is complete, the trained model is used as the ranking model.

基于通过上述图1所示的模型训练方法训练得到的排序模型,可以对待排序对象进行排序。Based on the sorting model trained by the model training method shown in FIG. 1 above, the objects to be sorted can be sorted.

参照图3,示出了本公开实施例的一种排序方法的步骤流程图。Referring to FIG. 3 , a flowchart of steps of a sorting method according to an embodiment of the present disclosure is shown.

如图3所示,排序方法可以包括以下步骤:As shown in Figure 3, the sorting method may include the following steps:

步骤301,获取待排序对象对应的搜索信息和关键词。Step 301: Obtain search information and keywords corresponding to the objects to be sorted.

待排序对象是指需要进行排序的对象,比如在召回环节召回的多个对象可以作为待排序对象。Objects to be sorted refer to objects that need to be sorted. For example, multiple objects recalled in the recall link can be used as objects to be sorted.

待排序对象对应的搜索信息可以包括但不限于用户输入的搜索信息等。搜索信息的形式可以包括但不限于:文本形式,语音形式,图片形式,等等。多个待排序对象对应的搜索信息可以相同,比如基于同一搜索信息召回的多个待排序对象对应的搜索信息相同。The search information corresponding to the objects to be sorted may include, but is not limited to, search information input by the user, and the like. The form of search information may include, but is not limited to: text form, voice form, picture form, and so on. The search information corresponding to the multiple objects to be sorted may be the same, for example, the search information corresponding to the multiple objects to be sorted recalled based on the same search information is the same.

待排序对象对应的关键词是指从待排序对象的相关信息中挖掘出来的,能够更好地提炼和概括待排序对象的主题的词语或短语,每个待排序对象可以具有至少一个关键词。对于挖掘关键词的具体过程,可以根据实际经验进行相关处理,本实施例在此不再详细论述。The keywords corresponding to the objects to be sorted refer to words or phrases mined from the relevant information of the objects to be sorted, which can better refine and summarize the subject of the objects to be sorted, and each object to be sorted may have at least one keyword. For the specific process of mining keywords, relevant processing may be performed according to actual experience, and this embodiment will not discuss in detail here.

步骤302,将所述搜索信息和所述关键词输入预先训练的排序模型,得到所述排序模型输出的所述待排序对象的排序参数。Step 302: Input the search information and the keywords into a pre-trained sorting model, and obtain the sorting parameters of the objects to be sorted output by the sorting model.

将待排序对象对应的搜索信息和关键词输入预先训练的排序模型，在该排序模型中，对待排序对象对应的搜索信息和关键词进行特征融合得到关键词的综合语义表征向量，基于关键词的综合语义表征向量获取待排序对象的排序参数，排序模型输出待排序对象的排序参数。待排序对象的排序参数可以包括但不限于：点击率，转化率，曝光量，等等。The search information and keywords corresponding to the objects to be sorted are input into the pre-trained sorting model. In the sorting model, feature fusion is performed on the search information and keywords to obtain the comprehensive semantic representation vector of the keywords, the sorting parameters of the objects to be sorted are obtained based on this vector, and the sorting model outputs them. The sorting parameters may include, but are not limited to: click-through rate, conversion rate, exposure, and so on.

步骤303,基于所述排序参数对所述待排序对象进行排序。Step 303: Sort the objects to be sorted based on the sorting parameters.

基于待排序对象的排序参数,可以根据预设的规则对待排序对象进行排序。比如,基于待排序对象的排序参数进行数学运算(比如加权计算,平均计算等)得到待排序对象的综合排序参数,基于该综合排序参数对待排序对象进行排序。比如,综合排序参数越大,排序顺序越靠前,等等。Based on the sorting parameters of the objects to be sorted, the objects to be sorted can be sorted according to preset rules. For example, mathematical operations (such as weighted calculation, average calculation, etc.) are performed based on the sorting parameters of the objects to be sorted to obtain comprehensive sorting parameters of the objects to be sorted, and the objects to be sorted are sorted based on the comprehensive sorting parameters. For example, the larger the comprehensive sorting parameter, the higher the sorting order, and so on.
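The weighted combination and descending sort described above can be sketched as follows; the `ctr`/`cvr` parameter names, the candidate values, and the weights are illustrative assumptions rather than values specified by the disclosure:

```python
def rank_objects(objects, weights):
    """Combine each object's sorting parameters into one comprehensive score by
    weighted sum, then sort in descending order (larger score ranks first)."""
    def score(params):
        return sum(w * params[name] for name, w in weights.items())
    return sorted(objects, key=lambda o: score(o["params"]), reverse=True)

# Hypothetical candidates with click-through rate (ctr) and conversion rate (cvr).
candidates = [
    {"id": "doc_a", "params": {"ctr": 0.12, "cvr": 0.02}},
    {"id": "doc_b", "params": {"ctr": 0.08, "cvr": 0.09}},
]
weights = {"ctr": 0.6, "cvr": 0.4}  # illustrative weights
ranked = rank_objects(candidates, weights)  # doc_b scores 0.084, doc_a scores 0.080
```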

本公开实施例中，将样本对象的关键词信息融入到排序模型中，这些关键词是针对对象的非结构化特征挖掘出来的，相比于对象的非结构化特征，关键词能够更好地提炼和概括对象的主题，能够覆盖和刻画用户广泛的意图，因此融入关键词信息的排序模型能够捕捉关键词本身的语义以及关键词和用户搜索信息之间的语义相关性，从而提高排序模型的准确性。In the embodiments of the present disclosure, the keyword information of sample objects is integrated into the sorting model. These keywords are mined from the unstructured features of the objects; compared with the unstructured features themselves, keywords can better refine and summarize the subject of an object and can cover and characterize a wide range of user intents. Therefore, a sorting model incorporating keyword information can capture the semantics of the keywords themselves as well as the semantic correlation between keywords and user search information, thereby improving the accuracy of the sorting model.

参照图4,示出了本公开实施例的另一种模型训练方法的步骤流程图。Referring to FIG. 4 , a flowchart of steps of another model training method according to an embodiment of the present disclosure is shown.

如图4所示,模型训练方法可以包括以下步骤:As shown in Figure 4, the model training method can include the following steps:

步骤401,获取样本数据。Step 401, obtain sample data.

本实施例中,样本数据可以包括样本对象对应的样本搜索信息,样本关键词和样本描述信息等,样本数据还可以包括样本对象的实际排序参数。In this embodiment, the sample data may include sample search information, sample keywords, and sample description information corresponding to the sample objects, and the sample data may also include actual sorting parameters of the sample objects.

对于样本对象,样本搜索信息,样本关键词和实际排序参数,可以参照上述步骤101的相关描述。For sample objects, sample search information, sample keywords and actual sorting parameters, reference may be made to the relevant description of the above step 101 .

样本描述信息可以包括但不限于:样本对象的自身特征(比如文档的文档特征,图片的图片特征等),上下文特征,用户特征,等等。其中,自身特征可以包括但不限于:标题,标识,作者,等等。上下文特征可以包括但不限于:时间,地点,等等。用户特征可以包括但不限于:用户的年龄,性别,偏好,等等。The sample description information may include, but is not limited to: the own characteristics of the sample object (such as document characteristics of documents, picture characteristics of pictures, etc.), context characteristics, user characteristics, and the like. Among them, self-features may include but are not limited to: title, logo, author, and so on. Contextual features may include, but are not limited to: time, location, and the like. User characteristics may include, but are not limited to, the user's age, gender, preferences, and the like.

步骤402，在预设的待训练模型中，对所述样本关键词和所述样本搜索信息进行特征融合，得到所述样本关键词的综合语义表征向量，对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行特征融合，得到所述样本对象的样本排序参数。Step 402: In the preset model to be trained, perform feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and perform feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain sample sorting parameters of the sample object.

将样本对象对应的样本搜索信息，样本关键词和样本描述信息输入预设的待训练模型中。在待训练模型中，通过对样本关键词和样本搜索信息进行特征融合得到样本关键词的综合语义表征向量。样本关键词的综合语义表征向量能够捕捉用户的搜索意图，提高搜索相关性。通过对样本关键词的综合语义表征向量、样本描述信息和样本搜索信息进行特征融合得到样本对象的样本排序参数，能够进行充分的特征交互，提升搜索相关性和结果质量。The sample search information, sample keywords, and sample description information corresponding to the sample object are input into the preset model to be trained. In the model to be trained, the comprehensive semantic representation vector of the sample keywords is obtained by feature fusion of the sample keywords and the sample search information; this vector can capture the user's search intent and improve search relevance. The sample sorting parameters of the sample object are then obtained by feature fusion of the comprehensive semantic representation vector, the sample description information, and the sample search information, which enables sufficient feature interaction and improves search relevance and result quality.

步骤403,将基于所述样本排序参数确定训练完成的模型作为排序模型。Step 403: Determine the trained model based on the sample sorting parameter as the sorting model.

对于步骤403的具体过程,可以参照上述步骤103的相关描述。For the specific process of step 403, reference may be made to the relevant description of the above-mentioned step 103.

基于通过上述图4所示的模型训练方法训练得到的排序模型,可以对待排序对象进行排序。Based on the sorting model trained by the model training method shown in FIG. 4, the objects to be sorted can be sorted.

参照图5,示出了本公开实施例的另一种排序方法的步骤流程图。Referring to FIG. 5 , a flowchart of steps of another sorting method according to an embodiment of the present disclosure is shown.

如图5所示,排序方法可以包括以下步骤:As shown in Figure 5, the sorting method may include the following steps:

步骤501,获取待排序对象对应的搜索信息、关键词和描述信息。Step 501: Obtain search information, keywords and description information corresponding to the objects to be sorted.

对于待排序对象,搜索信息,关键词,参照上述步骤301的相关描述。For the objects to be sorted, search information and keywords, refer to the relevant description of the above step 301 .

待排序对象的描述信息可以包括但不限于:待排序对象的自身特征(比如文档的文档特征,图片的图片特征等),上下文特征,用户特征,等等。The description information of the objects to be sorted may include, but is not limited to: the own characteristics of the objects to be sorted (such as document characteristics of documents, picture characteristics of pictures, etc.), context characteristics, user characteristics, and so on.

步骤502,将所述搜索信息、所述关键词和所述描述信息输入预先训练的排序模型,得到所述排序模型输出的所述待排序对象的排序参数。Step 502: Input the search information, the keyword and the description information into a pre-trained sorting model, and obtain sorting parameters of the objects to be sorted output by the sorting model.

将待排序对象对应的搜索信息、关键词和描述信息输入预先训练的排序模型。在该排序模型中，对待排序对象对应的关键词和样本搜索信息进行特征融合得到关键词的综合语义表征向量，对关键词的综合语义表征向量、描述信息和搜索信息进行特征融合得到待排序对象的排序参数，排序模型输出待排序对象的排序参数。The search information, keywords, and description information corresponding to the objects to be sorted are input into the pre-trained sorting model. In the sorting model, feature fusion is performed on the keywords and the search information corresponding to the objects to be sorted to obtain the comprehensive semantic representation vector of the keywords; feature fusion is then performed on this vector, the description information, and the search information to obtain the sorting parameters of the objects to be sorted, which the sorting model outputs.

步骤503,基于所述排序参数对所述待排序对象进行排序。Step 503: Sort the objects to be sorted based on the sorting parameters.

对于步骤503的具体过程,可以参照上述步骤303的相关描述。For the specific process of step 503, reference may be made to the relevant description of the above-mentioned step 303.

本公开实施例中，引入了对象的关键词，提出了一套通用的，融合关键词语义信息、搜索信息和对象描述信息之间的交互关系的排序框架。该框架通过能够提升关键词语义向量的表达能力，进一步和场景中的重要特征进行充分的特征交互，提升搜索的相关性和结果质量。In the embodiments of the present disclosure, object keywords are introduced, and a general sorting framework is proposed that fuses the interaction among keyword semantic information, search information, and object description information. By improving the expressive ability of the keyword semantic vectors and further performing sufficient feature interaction with the important features of the scenario, the framework improves search relevance and result quality.

下面,以搜索场景中的内容搜索(具体可以为文档搜索)为例进行说明。但是,在实际应用中并不限于此,可以针对任意场景下任意对象的排序过程进行处理。In the following, content search (specifically, document search) in a search scenario is used as an example for description. However, in practical applications, it is not limited to this, and the sorting process of any object in any scene can be processed.

在搜索场景下，融合关键词的排序方法是指将对象的关键词信息融入到排序模型中，使得排序模型能够捕捉关键词本身的语义以及关键词和用户搜索信息之间的语义相关性，从而提高搜索相关性和结果质量。In the search scenario, the keyword-fusion sorting method refers to integrating the keyword information of objects into the sorting model, so that the sorting model can capture the semantics of the keywords themselves and the semantic correlation between keywords and user search information, thereby improving search relevance and result quality.

内容搜索（具体可以为文档搜索）是点评搜索的重要模块，承担着点评内容化生态建设的重要使命。目前，内容理解环节产出了高覆盖的关键词（词语或短语），这些关键词（也可称为关键词标签）是从文档相关的文本中挖掘出来的，相比于文档非结构化的正文信息，关键词能够更好地提炼和概括文档的主题，每个文档都会打上若干个关键词标签。如何将这些文档的关键词落地到内容搜索的排序场景中，挖掘关键词和用户搜索信息（也可称为搜索词）之间的相关性，捕捉用户的搜索意图和影响搜索数据分发，是本公开实施例要解决的问题。Content search (specifically, document search) is an important module of Dianping search and undertakes the important mission of building Dianping's content ecosystem. At present, the content understanding stage produces high-coverage keywords (words or phrases). These keywords (also called keyword tags) are mined from document-related text; compared with a document's unstructured body text, keywords can better refine and summarize the topic of the document, and each document is tagged with several keyword tags. How to apply these document keywords in the ranking scenario of content search, mine the correlation between keywords and user search information (also called search terms), capture the user's search intent, and influence the distribution of search data is the problem to be solved by the embodiments of the present disclosure.

下面,依据关键词在排序中的使用方式以及排序模型中的相关性建模方法来进行介绍。In the following, the introduction will be based on the use of keywords in ranking and the correlation modeling method in the ranking model.

搜索场景下,按照关键词标签在排序中的使用方式可以划分为如下两类:In the search scenario, according to the way the keyword tags are used in sorting, they can be divided into the following two categories:

①显式特征建模①Explicit feature modeling

a.关键词单维度的特征：即：使用关键词本身的特征来作为模型额外的输入特征，如TF-IDF(Term Frequency-Inverse Document Frequency,词频-逆文本频率)统计特征或者预训练向量等。a. Single-dimensional keyword features: that is, features of the keyword itself are used as additional input features of the model, such as TF-IDF (Term Frequency-Inverse Document Frequency) statistical features or pretrained vectors.

②隐式语义建模②Implicit semantic modeling

a.关键词单维度隐式语义表征：将关键词作为整体并映射到唯一的标识，随机初始化词向量；或者把关键词看成字的序列，随机初始化字向量，并使用词袋模型或者序列模型来聚合得到关键词的语义表征向量，最终通过模型来端到端地学习到关键词的语义表征向量。a. Single-dimensional implicit semantic representation of keywords: treat the keyword as a whole, map it to a unique identifier, and randomly initialize its word vector; or treat the keyword as a sequence of characters, randomly initialize the character vectors, and use a bag-of-words model or a sequence model to aggregate them into the keyword's semantic representation vector, which is finally learned end-to-end by the model.

b.基于预训练模型：预训练模型作为关键词标签的特征提取器，然后和排序模型结合在一起，在排序任务中进行微调。b. Based on a pretrained model: the pretrained model serves as a feature extractor for keyword tags and is then combined with the ranking model and fine-tuned on the ranking task.
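The bag-of-words aggregation in item a above can be sketched minimally as follows; the 4-dimensional randomly initialized character vectors are an illustrative assumption (in the real model they would be learned end-to-end rather than fixed):

```python
import random

def init_char_vectors(chars, dim=4, seed=0):
    """Randomly initialize a character-embedding table (learned end-to-end in the model)."""
    rng = random.Random(seed)
    return {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in chars}

def bow_keyword_vector(keyword, char_vectors):
    """Bag-of-words aggregation: average the character vectors of the keyword."""
    vecs = [char_vectors[c] for c in keyword]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

table = init_char_vectors(set("老上海风情街"))
kw_vec = bow_keyword_vector("上海", table)  # 4-dimensional semantic representation
```

A sequence model (e.g. an RNN) could replace the averaging step; the bag-of-words form is shown because it is the simplest aggregator named in the text.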

搜索场景下,在排序阶段建模搜索相关性的方法可以划分为如下两类:In search scenarios, methods for modeling search relevance in the ranking phase can be divided into the following two categories:

①挖掘相关性特征① Mining correlation features

挖掘文档的文本和用户搜索信息之间的显式交叉特征，例如：文档的标题或正文和搜索信息之间的编辑距离等特征。Mine explicit cross features between the document text and the user's search information, for example, the edit distance between the document's title or body and the search information.

②模型结构上建模相关性②Modeling correlation in model structure

在搜索场景中，相关性排序方法可以通过特定的网络结构来建模文档文本和用户搜索信息之间的相关性。例如：In search scenarios, relevance ranking methods can model the correlation between document text and user search information through specific network structures. For example:

a.引入双塔结构到排序模型中，即：在排序模型中增加两个网络子塔结构，文档的文本和搜索信息分别经过双塔得到表征向量，再通过点积进行特征交互，最后文本和搜索信息的表征或者点积结果可以和排序其它特征的表征向量一起作为排序模型的输入，进行深度融合。a. Introduce a two-tower structure into the ranking model, i.e., add two sub-tower networks to the ranking model. The document text and the search information each pass through a tower to obtain representation vectors, which then interact via a dot product. Finally, the representations of the text and the search information, or the dot-product result, can be fed into the ranking model together with the representation vectors of the other ranking features for deep fusion.

b.使用序列表征模型，如LSTM(Long Short-Term Memory,长短期记忆网络)或Transformers(转换器)，将文档的文本和搜索信息拼接在一起，来捕捉文本和搜索信息之间相关性。b. Use a sequence representation model, such as an LSTM (Long Short-Term Memory) network or Transformers, and concatenate the document text and the search information to capture the correlation between them.

c.引入预训练模型，将BERT(Bidirectional Encoder Representations from Transformers,基于转换器的双向编码器表征)的编码器引入到排序模型中，对文档的文本或者搜索信息进行编码，并在排序下游任务中微调。c. Introduce a pretrained model: the BERT (Bidirectional Encoder Representations from Transformers) encoder is introduced into the ranking model to encode the document text or search information and is fine-tuned on the downstream ranking task.
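The two-tower interaction of item a above can be sketched as follows; the single randomly initialized linear layer per tower and the 4-dimensional inputs are stand-ins for the real sub-tower networks, used only to show the dot-product interaction:

```python
import random

def linear(x, W, b):
    """Dense layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def make_tower(in_dim, out_dim, seed):
    """A stand-in sub-tower: one randomly initialized linear layer."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim
    return lambda x: linear(x, W, b)

doc_tower = make_tower(4, 3, seed=1)    # encodes document-text features
query_tower = make_tower(4, 3, seed=2)  # encodes search-information features

doc_vec = doc_tower([0.2, 0.1, 0.0, 0.5])
query_vec = query_tower([0.3, 0.4, 0.1, 0.0])
relevance = sum(d * q for d, q in zip(doc_vec, query_vec))  # dot-product interaction
```

The scalar `relevance` (or the two tower outputs themselves) would then be concatenated with the other ranking features, as the text describes.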

但是,如果简单地采用上述方式将会存在如下问题:However, if the above method is simply adopted, there will be the following problems:

1.上述融合关键词的排序方法无法有效地捕捉关键词的深层语义信息。例如：关键词本身作为输入特征时，如果使用的方式是挖掘关键词本身和搜索信息之间的编辑距离等字面上显式的特征，则当关键词和搜索信息在字面上不一致，但在语义上存在近义、相似、上下位关系等，这种显式关系特征无法捕捉关键词和搜索信息之间的语义相关性。当对关键词进行隐语义建模时，如果采用随机初始化和端到端的学习方式，则由于训练样本中的用户行为非常稀疏，排序模型可能很难学习好关键词本身的隐语义表征，即：学习到的表征向量的表达能力差。1. The above keyword-fusion ranking methods cannot effectively capture the deep semantic information of keywords. For example, when the keyword itself is used as an input feature, if the method mines literal, explicit features such as the edit distance between the keyword and the search information, then in cases where the keyword and the search information differ literally but are semantically synonymous, similar, or in a hypernym-hyponym relationship, such explicit relation features cannot capture the semantic correlation between them. When modeling the latent semantics of keywords, if random initialization and end-to-end learning are used, the ranking model may find it hard to learn good latent semantic representations of the keywords because user behavior in the training samples is very sparse; that is, the learned representation vectors have poor expressive ability.

2.关键词本身是存在噪声的，如果单独融合关键词，则会忽视关键词本身的噪声，导致模型的鲁棒性较差。通常，对于某个文档，每个关键词都有一个置信度分数，来衡量其代表该文档的置信程度。有的关键词的置信度较低，直接引入会给模型带来噪声。有的关键词即使置信度很高，在某个搜索信息下，关键词本身和搜索信息可能是不相关性的，直接引入该关键词，对于模型捕捉该搜索信息和该文档的语义相关性也是噪声。2. Keywords themselves are noisy; fusing keywords alone ignores this noise and makes the model less robust. Usually, for a given document, each keyword has a confidence score measuring how well it represents the document. Some keywords have low confidence, and introducing them directly brings noise into the model. Even for keywords with high confidence, under a certain search query the keyword itself may be irrelevant to the search information; directly introducing such a keyword is also noise for the model when capturing the semantic correlation between the search information and the document.

3.上述排序相关性建模的方法主要基于文档的非结构化文本和搜索信息之间的相关性，缺乏能够建模文档的结构化关键词，发现用户的搜索意图和文档主题之间的关系的搜索排序框架。文档的关键词是对文档主题的提炼和概括，能够覆盖和刻画搜索场景下用户广泛的搜索意图。若能在排序模型中充分利用，对于提升搜索结果的相关性至关重要。3. The above ranking-correlation modeling methods are mainly based on the correlation between a document's unstructured text and the search information; they lack a search-ranking framework that can model the document's structured keywords and discover the relationship between the user's search intent and the document's topic. A document's keywords distill and summarize its topic and can cover and characterize a wide range of user search intents in search scenarios; making full use of them in the ranking model is crucial for improving the relevance of search results.
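As a simple illustration of the confidence scores discussed in point 2 above, low-confidence keyword tags could be dropped by a plain threshold; note the disclosed semantic encoding network filters noise automatically inside the model, so this threshold of 0.6 and the example scores are purely hypothetical:

```python
def filter_keywords(keyword_scores, min_conf=0.6):
    """Keep only keyword tags whose confidence score reaches the threshold,
    so low-confidence tags do not inject noise into downstream models."""
    return [kw for kw, conf in keyword_scores if conf >= min_conf]

tagged = [("上海旅行攻略", 0.92), ("老洋房", 0.35)]  # hypothetical confidence scores
kept = filter_keywords(tagged)  # only the high-confidence keyword survives
```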

为了更好地融合关键词到排序模型中,从而提高搜索的相关性和质量,本实施例针对上述方式进行了以下两点优化:In order to better integrate keywords into the ranking model, thereby improving the relevance and quality of search, the following two optimizations are carried out in this embodiment for the above method:

1.本实施例提出了一种融合关键词和用户搜索信息的深度语义编码网络,能够有效挖掘关键词的语义信息,自动过滤噪声信息,提升语义表征的准确性和鲁棒性。1. This embodiment proposes a deep semantic coding network that integrates keywords and user search information, which can effectively mine the semantic information of keywords, automatically filter noise information, and improve the accuracy and robustness of semantic representation.

2.本实施例引入了关键词，提出了一套通用的、基于预训练语义向量、融合关键词语义信息、建模标签和搜索场景重要特征之间的交互关系的排序流程和框架。该框架通过预训练向量来提升关键词语义向量的表达能力，进一步和搜索场景中重要的上下文、文档、用户等特征进行充分的特征交互，提升搜索的相关性和结果质量。2. This embodiment introduces keywords and proposes a general ranking process and framework based on pretrained semantic vectors that fuses keyword semantic information and models the interaction between tags and the important features of the search scenario. The framework uses pretrained vectors to improve the expressive ability of keyword semantic vectors and further performs sufficient feature interaction with important context, document, and user features in the search scenario, improving search relevance and result quality.

参照图6，示出了本公开实施例的一种整体处理过程的示意图。图6提出了一套通用的、基于关键词和预训练向量的排序框架，能够有效融合关键词到排序模型中，提取语义向量和建模特征间的交互关系。如图6所示，整体处理过程可以包括关键词和预训练向量的预处理流程和排序模型两部分。以下分别介绍。Referring to FIG. 6 , a schematic diagram of an overall processing procedure of an embodiment of the present disclosure is shown. FIG. 6 proposes a general ranking framework based on keywords and pretrained vectors, which can effectively fuse keywords into the ranking model, extract semantic vectors, and model the interactions between features. As shown in FIG. 6 , the overall process can include two parts: the preprocessing flow for keywords and pretrained vectors, and the ranking model. They are introduced separately below.

一、关键词和预训练向量的预处理流程。1. The preprocessing process of keywords and pre-training vectors.

首先，从海量的语料中，挖掘关键词和训练预训练模型；然后，基于线上曝光日志数据和挖掘到的关键词，构建关键词词典；接着，使用预训练模型来推理和提取关键词的预训练向量。具体的流程如下：First, keywords are mined from a massive corpus and a pretrained model is trained; then, a keyword dictionary is built based on online exposure log data and the mined keywords; finally, the pretrained model is used to infer and extract the pretrained vectors of the keywords. The specific process is as follows:

(1)预训练模型的训练。基于海量语料进行训练得到预训练模型。预训练模型可以包括BERT模型等，BERT模型是基于Transformers结构的垂直领域预训练模型，在单句分配、序列标注、句间关系等任务中表现优异。(1) Training of the pretrained model. The pretrained model is obtained by training on a massive corpus. It may include a BERT model, etc.; BERT is a vertical-domain pretrained model based on the Transformers architecture and performs well on tasks such as single-sentence classification, sequence labeling, and inter-sentence relation tasks.

(2)关键词的挖掘。利用无监督方法，如依存句法分析、TF-IDF、文本聚类等方法，对海量语料进行关键词的挖掘，得到大量的关键词，所挖掘到的关键词能够很好地衡量内容的主题。在挖掘得到关键词后，还可以得到该关键词的置信度。(2) Keyword mining. Unsupervised methods such as dependency parsing, TF-IDF, and text clustering are used to mine keywords from the massive corpus, yielding a large number of keywords that measure the topic of the content well. After a keyword is mined, its confidence score can also be obtained.
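A minimal TF-IDF keyword-mining sketch over a toy, pre-segmented corpus, following the unsupervised approach named in this step (the real pipeline operates on a massive corpus and additionally produces confidence scores):

```python
import math
from collections import Counter

def tf_idf_keywords(docs, top_k=2):
    """Unsupervised keyword mining: score each term by TF-IDF within its document
    and keep the top-k terms as candidate keywords."""
    n_docs = len(docs)
    df = Counter()                       # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (tf[t] / len(doc)) * math.log(n_docs / df[t]) for t in tf}
        results.append(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return results

# Toy corpus of segmented documents; values chosen only for illustration.
corpus = [
    ["上海", "旅行", "攻略", "攻略"],
    ["上海", "书店", "小众"],
]
keywords = tf_idf_keywords(corpus)  # "上海" occurs in every document, so its IDF is 0
```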

(3) Construction of the keyword dictionary. Statistics and low-frequency filtering are performed over online content-search exposure log data: low-frequency words are filtered out of the keywords associated with exposed content and the user search information, and a keyword dictionary (containing both keywords and search information) is built from the remainder.

(4) Generation of keyword pre-trained vectors. The pre-trained model above is used to extract a pre-trained vector (i.e., a feature vector) for each word in the keyword dictionary, yielding a keyword pre-trained vector table (i.e., a mapping from words to feature vectors). The pre-trained vectors carry rich semantic information and capture the semantic relatedness between different keywords or pieces of search information well. Each vector is the average-pooled result of the token-level representation vectors within the word; compared with the [CLS] token vector, the average-pooled vector carries more complete semantic features.
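The average-pooling step can be sketched minimally as follows, assuming the encoder has already produced one vector per token (the 3-dimensional vectors are illustrative; real BERT hidden states are, e.g., 768-dimensional):

```python
import numpy as np

def mean_pool(token_vectors, attention_mask):
    """Average the token representations, ignoring padding positions."""
    mask = attention_mask[:, None].astype(float)   # (seq_len, 1)
    summed = (token_vectors * mask).sum(axis=0)    # (dim,)
    count = mask.sum()
    return summed / count

# 4 tokens (the last is padding), 3-dim vectors for readability
tokens = np.array([[1.0, 0.0, 2.0],
                   [3.0, 2.0, 0.0],
                   [2.0, 4.0, 1.0],
                   [9.0, 9.0, 9.0]])   # padding, must be ignored
mask = np.array([1, 1, 1, 0])
vec = mean_pool(tokens, mask)          # -> [2.0, 2.0, 1.0]
```

Unlike taking only the [CLS] position, every real token contributes to the pooled vector.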

2. The ranking model.

The ranking model may include a semantic encoding network and a feature interaction network. On one hand, the semantic encoding network extracts a comprehensive semantic representation vector of the keywords; on the other hand, the feature interaction network performs feature fusion and finally outputs the model's predicted values, which are used for offline training and online inference. The specific details are as follows:

(1) Perform keyword semantic encoding based on the keyword pre-trained vector table to obtain the comprehensive semantic representation vector of the keywords.

The semantic encoding network fuses the features of the keywords and the search information to produce the comprehensive semantic representation vector of the keywords.

Referring to FIG. 7, a schematic diagram of a semantic encoding network according to an embodiment of the present disclosure is shown. As shown in FIG. 7, the semantic encoding network may include an Input Layer, an Embedding Layer, a Self-Attention Layer, a Confidence-Aware Aggregator, and an Output Layer.

Input layer: the inputs are the keywords and the search information. As shown in FIG. 7, the keywords are keyword_1, keyword_2, …, keyword_n, and the search information is the query.

Embedding layer: the embedding layer looks up the feature vector of each keyword and the feature vector of the search information based on the word-to-feature-vector mapping. As shown in FIG. 7, the feature vectors of keyword_1, keyword_2, …, keyword_n are $e_{k_1}, e_{k_2}, \ldots, e_{k_n}$, and the feature vector of the search query is $e_q$.

The embedding-layer parameters of the semantic encoding network are initialized from the keyword pre-trained vector table (i.e., the word-to-feature-vector mapping), and all related features share this embedding layer. That is, the parameter space is shared between keyword features and user-search-information features, which significantly reduces complexity and prevents overfitting. Compared with random initialization, initializing with pre-trained feature vectors ensures that the model effectively captures and attends to text-level relevance features from the start, thereby shaping the learning-to-rank process. Specifically, the pre-trained feature vectors $\{e_{k_1}, e_{k_2}, \ldots, e_{k_n}\}$ are used to initialize the representation-vector parameter matrix $E$; this matrix is learnable and is fine-tuned during ranking-model training.
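A minimal sketch of this shared, pre-trained initialization, under illustrative names and toy 2-dimensional vectors (in practice this would be, e.g., an `nn.Embedding` initialized from the table and fine-tuned):

```python
import numpy as np

pretrained_table = {             # word -> pre-trained feature vector
    "basketball": np.array([0.1, 0.2]),
    "training":   np.array([0.3, 0.4]),
    "piano":      np.array([0.5, 0.6]),
}

# One vocabulary and one parameter matrix E, shared by keyword features
# and query (user search information) features alike.
vocab = {w: i for i, w in enumerate(pretrained_table)}
E = np.stack([pretrained_table[w] for w in vocab])   # learnable in the model

def embed(word):
    """Lookup against the shared matrix; used for both keywords and query terms."""
    return E[vocab[word]]

e_training = embed("training")
```

Because both feature families index the same matrix, fine-tuning updates one shared parameter space rather than two.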

Self-attention layer: used to discover relationships between keywords.

A document may carry multiple keywords, and different keywords cooperate with one another. For example, given "basketball" and "training", neither keyword alone fully captures the document's topic; only jointly do they characterize it, and only jointly can one infer under what search information it is most appropriate to show this document to a user. Clearly, when a user searches for "basketball training", this document is highly relevant and satisfies the user's need. Mining the relationships between keywords therefore helps the model better capture the relationship between each keyword and the user's search information.

Taking offline training as an example, for each sample keyword the self-attention layer fuses the features of the current sample keyword with those of the other sample keywords to obtain the fused semantic representation vector of the current sample keyword.

Optionally, for each sample keyword, the self-attention layer fuses the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain the fused semantic representation vector of the current sample keyword.

The self-attention mechanism mines the relationships among keywords and yields the fused keyword semantic representation vectors, so that each keyword absorbs the semantic information of the other keywords.

This is expressed as Formula 1:

$$\{\hat{e}_{k_1}, \hat{e}_{k_2}, \ldots, \hat{e}_{k_n}\} = \mathrm{Self\text{-}Attention}(\{e_{k_1}, e_{k_2}, \ldots, e_{k_n}\}) \qquad \text{(Formula 1)}$$

In Formula 1, Self-Attention() denotes the self-attention computation, $e_{k_i}$ denotes the feature vector of a sample keyword, and $\hat{e}_{k_i}$ denotes the fused semantic representation vector of the sample keyword. This variant corresponds to the situation shown in FIG. 7.
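As a concrete illustration of the self-attention computation in Formula 1, a minimal single-head scaled dot-product self-attention over the keyword vectors (no learned Q/K/V projections, which a production implementation would add; all values are illustrative):

```python
import numpy as np

def self_attention(X):
    """X: (n, d) keyword feature vectors -> (n, d) fused vectors."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                     # (n, n) pairwise affinity
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ X                                # each row mixes all keywords

E_k = np.array([[1.0, 0.0],     # e_{k_1}
                [0.0, 1.0],     # e_{k_2}
                [1.0, 1.0]])    # e_{k_3}
fused = self_attention(E_k)     # \hat{e}_{k_1..k_3}
```

Each output row is a convex combination of all keyword vectors, which is exactly the "every keyword absorbs the others' semantics" property described above.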

Optionally, the self-attention layer fuses the feature vector of the current sample keyword with the feature vectors of the other sample keywords through the self-attention mechanism to obtain a preliminary fused semantic representation vector of the current sample keyword; the sample keywords are sorted in descending order of confidence, and a position embedding vector is obtained for each sample keyword based on the sorting result; the preliminary fused semantic representation vector and the position embedding vector of the current sample keyword are added to obtain the fused semantic representation vector of the current sample keyword.

For a given document $i$, each keyword $k$ has a different confidence. The keywords are first sorted in descending order of confidence to form the keyword sequence $\{k_1, k_2, \ldots, k_n\}$, and the initial keyword representation sequence $\{e_{k_1}, e_{k_2}, \ldots, e_{k_n}\}$ is then formed by table lookup. Next, the self-attention mechanism mines the relationships among the keywords to obtain the fused keyword semantic representation sequence, so that each keyword absorbs the semantic information of the others.

This is expressed as Formulas 2 and 3:

$$\{\hat{e}_{k_1}, \hat{e}_{k_2}, \ldots, \hat{e}_{k_n}\} = \mathrm{Self\text{-}Attention}(\{e_{k_1}, e_{k_2}, \ldots, e_{k_n}\}) \qquad \text{(Formula 2)}$$

$$\tilde{e}_{k_i} = \hat{e}_{k_i} + p_i \qquad \text{(Formula 3)}$$

In Formula 2, Self-Attention() denotes the self-attention computation, $e_{k_i}$ denotes the feature vector of a sample keyword, and $\hat{e}_{k_i}$ denotes the preliminary fused semantic representation vector of the sample keyword. In Formula 3, $p_i$ denotes the position embedding vector of the sample keyword $k_i$ ranked $i$-th by confidence; $p_i$ captures, at a global level, the confidence bias of sample keywords at different ranks. $\hat{e}_{k_i}$ denotes the preliminary fused semantic representation vector of $k_i$, and $\tilde{e}_{k_i}$ denotes its fused semantic representation vector.
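The confidence-descending sort and the rank-indexed position embedding of Formulas 2-3 can be sketched as follows (the self-attention step is stubbed out as the identity for brevity, and all keywords, confidences, and embedding values are illustrative):

```python
import numpy as np

keywords   = ["course", "basketball", "training"]
confidence = np.array([0.9, 0.5, 0.7])            # c_{k} per keyword
vectors    = {"course":     np.array([1.0, 0.0]),
              "basketball": np.array([0.0, 1.0]),
              "training":   np.array([1.0, 1.0])}

order = np.argsort(-confidence)                   # descending confidence
ranked = [keywords[i] for i in order]             # k_1, k_2, k_3 by rank

# Learned position embeddings p_1..p_3, one per confidence rank
P = np.array([[0.1, 0.1],
              [0.2, 0.2],
              [0.3, 0.3]])

# Formula 3: \tilde{e}_{k_i} = \hat{e}_{k_i} + p_i
# (self-attention of Formula 2 omitted, so \hat{e}_{k_i} = e_{k_i} here)
fused = np.stack([vectors[w] for w in ranked]) + P
```

Because $p_i$ is indexed by rank rather than by keyword identity, it encodes the confidence bias globally, as the text states.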

Confidence-aware aggregation layer: used to automatically select effective keywords that are relevant to the user's search information and to automatically filter out noise.

The user's search information may be related to some of the keywords and unrelated to others, so the keywords most relevant to the search information must be selected.

Taking offline training as an example, for each sample keyword the confidence-aware aggregation layer fuses the features of the current sample keyword with the sample search information to obtain the relevance between the current sample keyword and the sample search information, and then computes a preliminary comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vectors and relevance scores of all sample keywords.

Optionally, for each sample keyword, the confidence-aware aggregation layer fuses the fused semantic representation vector of the current sample keyword with the feature vector of the sample search information to obtain the relevance between the current sample keyword and the sample search information; then, based on the fused semantic representation vectors and relevance scores of all sample keywords, together with the feature vector of the sample search information, it computes the preliminary comprehensive semantic representation vector of the sample keywords.

Optionally, the process of fusing the fused semantic representation vector of the current sample keyword with the feature vector of the sample search information to obtain their relevance may include: computing a relevance weight between the confidence of the current sample keyword and the search information; computing an intermediate parameter of the current sample keyword based on its fused semantic representation vector, the feature vector of the sample search information, and its relevance weight; and computing the relevance between the current sample keyword and the sample search information based on the intermediate parameter and a preset temperature parameter.

As shown in FIG. 7, in the confidence-aware aggregation layer an Attention Network uses an attention mechanism to compute confidence-aware relevance weights between the user's search information and the fused semantic representation vectors of the keywords, and adaptively selects useful information from those fused vectors.

The relevance weight between the confidence of the $i$-th sample keyword $k_i$ and the search information is expressed as Formula 4:

$$w_{k_i} = \mathrm{softmax}(c_{k_i}) \qquad \text{(Formula 4)}$$

In Formula 4, softmax() denotes the softmax computation over all sample keywords, $c_{k_i}$ denotes the confidence of the $i$-th sample keyword $k_i$, and $w_{k_i}$ denotes its relevance weight.

If, corresponding to Formula 1 above, the fused semantic representation vector of the $i$-th sample keyword $k_i$ is $\hat{e}_{k_i}$, then the intermediate parameter of $k_i$ is expressed as Formula 5:

$$s(q, k_i) = w_{k_i} \cdot \mathrm{MLP}(e_q \,\|\, \hat{e}_{k_i}) \qquad \text{(Formula 5)}$$

In Formula 5, MLP denotes a multi-layer feed-forward neural network, $\|$ denotes the concatenation operation, $\hat{e}_{k_i}$ denotes the fused semantic representation vector of the $i$-th sample keyword $k_i$, $e_q$ denotes the feature vector of the sample search information, and $s(q, k_i)$ denotes the intermediate parameter of $k_i$.

If, corresponding to Formulas 2 and 3 above, the fused semantic representation vector of the $i$-th sample keyword $k_i$ is $\tilde{e}_{k_i}$, then the intermediate parameter of $k_i$ is expressed as Formula 6:

$$s(q, k_i) = w_{k_i} \cdot \mathrm{MLP}(e_q \,\|\, \tilde{e}_{k_i}) \qquad \text{(Formula 6)}$$

In Formula 6, $\tilde{e}_{k_i}$ denotes the fused semantic representation vector of the $i$-th sample keyword $k_i$; the other symbols are as described for Formula 5.

The relevance between the $i$-th sample keyword $k_i$ and the sample search information is expressed as Formula 7:

$$a^A(q, k_i) = \mathrm{softmax}(s(q, k_i)/\tau) \qquad \text{(Formula 7)}$$

In Formula 7, softmax() denotes the softmax computation, $a^A(q, k_i)$ denotes the relevance between the $i$-th sample keyword $k_i$ and the sample search information, and $\tau$ denotes the temperature parameter. The temperature parameter widens the gap in importance between keywords: the better a keyword reflects the content's topic and the more relevant it is to the user's search information, the larger the weight it tends to receive and the larger its contribution to the comprehensive semantic representation vector, whereas noise keywords unrelated to the search information contribute very little, significantly reducing their interference. In short, the confidence-aware attention mechanism combined with the temperature parameter effectively distinguishes the importance of keywords at different confidence levels and improves the model's robustness.
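The sharpening effect of the temperature in Formula 7 can be demonstrated numerically (the scores below are illustrative stand-ins for $s(q, k_i)$):

```python
import numpy as np

def softmax_with_temperature(s, tau):
    """Numerically stable softmax over s / tau."""
    z = s / tau
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

s = np.array([2.0, 1.0, 0.1])               # intermediate scores s(q, k_i)
plain = softmax_with_temperature(s, 1.0)     # tau = 1: ordinary softmax
sharp = softmax_with_temperature(s, 0.5)     # tau < 1: sharpened distribution

# With tau < 1, the top keyword's weight grows and noise weights shrink,
# which is exactly the importance-gap-widening behavior described above.
```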

Optionally, the process of computing the preliminary comprehensive semantic representation vector of the sample keywords from the fused semantic representation vectors and relevance scores of the sample keywords and the feature vector of the sample search information may include: summing, over all sample keywords, the product of each keyword's fused semantic representation vector and its relevance, to obtain the preliminary semantic representation vector of the sample keywords; and adding the preliminary semantic representation vector to the feature vector of the sample search information to obtain the preliminary comprehensive semantic representation vector of the sample keywords.

If, corresponding to Formula 1 above, the fused semantic representation vector of the $i$-th sample keyword $k_i$ is $\hat{e}_{k_i}$, the preliminary semantic representation vector of the sample keywords is expressed as Formula 8:

$$h_K = \sum_{i=1}^{n} a^A(q, k_i)\,\hat{e}_{k_i} \qquad \text{(Formula 8)}$$

If, corresponding to Formulas 2 and 3 above, the fused semantic representation vector of $k_i$ is $\tilde{e}_{k_i}$, the preliminary semantic representation vector of the sample keywords is expressed as Formula 9:

$$h_K = \sum_{i=1}^{n} a^A(q, k_i)\,\tilde{e}_{k_i} \qquad \text{(Formula 9)}$$

In Formulas 8 and 9, $h_K$ denotes the preliminary semantic representation vector of the sample keywords and $\sum$ denotes summation.

The preliminary comprehensive semantic representation vector of the sample keywords is expressed as Formula 10:

$$h_{QK} = h_K + e_q \qquad \text{(Formula 10)}$$

In Formula 10, $e_q$ denotes the feature vector of the sample search information and $h_{QK}$ denotes the preliminary comprehensive semantic representation vector of the sample keywords.
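The aggregation of Formulas 8 and 10 amounts to a relevance-weighted sum of the fused keyword vectors plus the query vector. A sketch with illustrative values (the weights are given directly here; in the model they come from Formula 7):

```python
import numpy as np

a = np.array([0.7, 0.2, 0.1])        # a^A(q, k_i), already softmax-normalized
fused = np.array([[1.0, 0.0],        # \hat{e}_{k_1}
                  [0.0, 1.0],        # \hat{e}_{k_2}
                  [1.0, 1.0]])       # \hat{e}_{k_3}
e_q = np.array([0.5, 0.5])           # query feature vector

h_K  = (a[:, None] * fused).sum(axis=0)   # Formula 8: weighted sum
h_QK = h_K + e_q                          # Formula 10: add the query vector
```

A keyword with near-zero relevance weight contributes almost nothing to $h_K$, which is how noise keywords are suppressed.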

Optionally, because a keyword may be inaccurate, forcibly introducing it into the ranking model may inject noise. To ensure robustness, a dropout mechanism can be used to randomly drop some of the neurons in the preliminary semantic representation vector.

Therefore, after summing the products of each sample keyword's fused semantic representation vector and its relevance to obtain the preliminary semantic representation vector of the sample keywords, the method further includes: applying random dropout to the preliminary semantic representation vector.

The preliminary semantic representation vector after random dropout is expressed as Formula 11:

$$h'_K = \mathrm{dropout}(h_K) \qquad \text{(Formula 11)}$$

In Formula 11, dropout() denotes the dropout operation, $h'_K$ denotes the preliminary semantic representation vector after random dropout, and $h_K$ denotes the preliminary semantic representation vector of the sample keywords.

Correspondingly, the preliminary semantic representation vector after random dropout is added to the feature vector of the sample search information to obtain the preliminary comprehensive semantic representation vector of the sample keywords.

The preliminary comprehensive semantic representation vector of the sample keywords is then expressed as Formula 12:

$$h_{QK} = h'_K + e_q \qquad \text{(Formula 12)}$$

In Formula 12, $e_q$ denotes the feature vector of the sample search information and $h_{QK}$ denotes the preliminary comprehensive semantic representation vector of the sample keywords.

Output layer: outputs the comprehensive semantic representation vector of the keywords based on their preliminary comprehensive semantic representation vector.

Taking offline training as an example, the output layer normalizes the preliminary comprehensive semantic representation vector of the sample keywords to obtain their comprehensive semantic representation vector.

The comprehensive semantic representation vector of the sample keywords is expressed as Formula 13:

$$\tilde{h}_{QK} = \mathrm{LayerNorm}(h_{QK}) \qquad \text{(Formula 13)}$$

In Formula 13, LayerNorm() denotes the Layer Normalization operation and $\tilde{h}_{QK}$ denotes the comprehensive semantic representation vector of the sample keywords.

(2) Perform deep feature fusion and shallow feature fusion based on the comprehensive semantic representation vector of the keywords to obtain the predicted values.

The comprehensive semantic representation vector of the keywords is fed into the feature interaction network for feature fusion; through this network, the ranking parameters of the object are obtained from the comprehensive semantic representation vector. Specifically, the comprehensive semantic representation vector of the keywords, the description information, and the search information are fused to obtain the ranking parameters of the object.

Referring to FIG. 8, a schematic diagram of a feature interaction network according to an embodiment of the present disclosure is shown. FIG. 8 takes a multi-objective model as an example, consisting mainly of a deep fusion network (an MMoE (Multi-gate Mixture-of-Experts) Network) and a shallow fusion network (an FM (Factorization Machine) Network). The bottom-layer input features include the object's description information, the search information (query), and the comprehensive semantic representation vector of the keywords produced by the aforementioned semantic encoding network (Keyword Encoder). As shown in FIG. 8, the object's description information may include, but is not limited to, document features (Item Feature), context features (Context Feature), and so on.

Taking offline training as an example, optionally, the process of fusing the comprehensive semantic representation vector, the sample description information, and the sample search information may include: performing deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a first sample ranking parameter of the sample object; performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a second sample ranking parameter of the sample object; and computing the sample ranking parameter of the sample object based on the first sample ranking parameter and the second sample ranking parameter.

Optionally, performing deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information includes: obtaining the feature vector of the sample description information and the feature vector of the sample search information; generating a concatenated feature vector from the comprehensive semantic representation vector, the feature vector of the sample description information, and the feature vector of the sample search information; and performing feature fusion on the concatenated feature vector with a preset deep fusion network to obtain the first sample ranking parameter.

As shown in FIG. 8, a feature vector is extracted for each feature of the sample description information via an embedding matrix or sequence modeling, and the feature vector of the sample search information is obtained. Feature vectors of the same type are concatenated (Concat): as in FIG. 8, the feature vectors of the document features are concatenated together, and the feature vectors of the context features are concatenated with the feature vector of the sample search information. These concatenated vectors are then concatenated with the comprehensive semantic representation vector of the sample keywords to generate the concatenated feature vector. After a preliminary fusion through a Fusion Network layer, the concatenated feature vector serves as the input to the deep fusion network (MMoE Network), which extracts high-order latent vectors, fully fuses the semantic information carried by the keywords with the other information, and outputs the task-specific predicted value, i.e., the first sample ranking parameter.

Optionally, performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information includes: obtaining the feature vector of the sample description information and the feature vector of the sample search information; and performing feature fusion on the feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector with a preset shallow fusion network to obtain the second sample ranking parameter.

As shown in FIG. 8, a feature vector is extracted for each feature of the sample description information via an embedding matrix or sequence modeling, and the feature vector of the sample search information is obtained. The feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector of the sample keywords serve as the input to the shallow fusion network (FM Network). The shallow fusion network performs thorough feature interactions between the comprehensive semantic representation vector of the keywords and the feature vectors of memorization-oriented sparse features, and outputs the task-shared predicted value, i.e., the second sample ranking parameter.
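The patent does not specify the FM Network beyond its name; the classic second-order factorization-machine interaction it presumably relies on can be sketched via the standard $\frac{1}{2}\big((\sum_i v_i x_i)^2 - \sum_i (v_i x_i)^2\big)$ identity (latent vectors and feature values below are illustrative):

```python
import numpy as np

def fm_second_order(V, x):
    """Second-order FM term.

    V: (n_features, k) latent vectors; x: (n_features,) feature values.
    Computes sum_{i<j} <v_i, v_j> x_i x_j in O(n*k) via the square-of-sum trick.
    """
    vx = V * x[:, None]                  # (n, k): v_i * x_i
    sum_sq = vx.sum(axis=0) ** 2         # (sum_i v_i x_i)^2, per factor dim
    sq_sum = (vx ** 2).sum(axis=0)       # sum_i (v_i x_i)^2
    return 0.5 * (sum_sq - sq_sum).sum()

V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, 1.0, 1.0])
score = fm_second_order(V, x)            # pairwise dot products: 0 + 1 + 1
```

This shallow term captures explicit pairwise ("memorization") interactions cheaply, complementing the deep MMoE branch.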

Optionally, the first sample ranking parameter and the second sample ranking parameter are added together to obtain the sample ranking parameters of each task corresponding to the sample object. The sample ranking parameters may include, but are not limited to, click-through rate, conversion rate, exposure, user interaction behavior parameters, and so on.

The embodiments of the present disclosure propose a confidence-aware keyword semantic encoding network that effectively extracts the semantic information in keywords, automatically filters out noise, and improves the robustness of the semantic representation. They also propose a framework that integrates the comprehensive semantic representation vector of keywords into the ranking model, leveraging the expressive power of pre-trained vectors and keywords to interact thoroughly with the important context, document, user, and other features in the search scenario, thereby improving the relevance and quality of search results.

By introducing keyword information into the ranking model, initializing the parameters from the pre-trained vectors, and then fine-tuning within the ranking model, the ranking model effectively captures semantic relevance information and improves ranking relevance. The confidence-aware attention mechanism better distinguishes the importance of different keywords and reduces noise, and the temperature parameter and the dropout mechanism further ensure the robustness of the semantic representation.

Keywords are highly important structured information for content communities and platforms: they characterize the topics of content and users' explicit interest preferences, and are widely used in content search and recommendation scenarios. In search scenarios, they capture the relevance between a user's search terms and the keywords. In recommendation scenarios, a keyword sequence can be built from the content a user has clicked on to characterize the user's long- and short-term interests and to construct user-profile tags; user interests can then be generalized and precisely captured based on the user-profile tags and the content keywords, improving recommendation performance. This method may be adopted on any content platform.

Referring to FIG. 9, a structural block diagram of a model training apparatus according to an embodiment of the present disclosure is shown.

As shown in FIG. 9, the model training apparatus may include the following modules:

a first obtaining module 901, configured to obtain sample data, the sample data including sample search information and sample keywords corresponding to a sample object;

a training module 902, configured to perform, in a preset model to be trained, feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and to obtain a sample ranking parameter of the sample object based on the comprehensive semantic representation vector; and

a determining module 903, configured to take the model whose training is determined to be complete based on the sample ranking parameter as the ranking model.

Optionally, the training module 902 includes: a first fusion unit, configured to perform, for each sample keyword, feature fusion on the current sample keyword and the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; a second fusion unit, configured to perform, for each sample keyword, feature fusion on the current sample keyword and the sample search information to obtain the relevance between the current sample keyword and the sample search information; and a first computing unit, configured to compute the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vector and the relevance of each sample keyword.

Optionally, the training module 902 includes: a query unit, configured to query, from a preset correspondence between words and feature vectors, the feature vector of each sample keyword and the feature vector of the sample search information; a third fusion unit, configured to perform, for each sample keyword, feature fusion on the feature vector of the current sample keyword and the feature vectors of the other sample keywords to obtain a fused semantic representation vector of the current sample keyword; a fourth fusion unit, configured to perform, for each sample keyword, feature fusion on the fused semantic representation vector of the current sample keyword and the feature vector of the sample search information to obtain the relevance between the current sample keyword and the sample search information; and a second computing unit, configured to compute the comprehensive semantic representation vector of the sample keywords based on the fused semantic representation vector and the relevance of each sample keyword and the feature vector of the sample search information.

Optionally, the third fusion unit is specifically configured to fuse the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain the fused semantic representation vector of the current sample keyword.
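The self-attention fusion in the third fusion unit can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses plain scaled dot-product attention without the learned query/key/value projections that a real implementation would include.

```python
import numpy as np

def self_attention(X):
    """X: (n, d) keyword feature vectors. Each output row is a
    fused semantic representation of one keyword, computed as an
    attention-weighted combination of all keyword vectors."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                 # pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # row-wise softmax
    return A @ X                                  # fuse all keywords per row

# toy example: 3 keywords, dimension 2
X = np.array([[1., 0.], [0., 1.], [1., 1.]])
H = self_attention(X)
```

Because each output row is a convex combination of the input rows, every coordinate of `H` stays within the range spanned by the corresponding coordinates of `X`.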

Optionally, the sample data further includes confidences of the sample keywords; the third fusion unit is specifically configured to: fuse the feature vector of the current sample keyword with the feature vectors of the other sample keywords through a self-attention mechanism to obtain a preliminary fused semantic representation vector of the current sample keyword; sort the sample keywords in descending order of confidence and obtain a position embedding vector for each sample keyword based on the sorting result; and add the preliminary fused semantic representation vector and the position embedding vector of the current sample keyword to obtain the fused semantic representation vector of the current sample keyword.
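The confidence-ordered position embedding described above can be sketched as follows. The embedding table and its values are assumptions for illustration; in practice the table would be a learned parameter.

```python
import numpy as np

def add_confidence_position_embeddings(fused, confidences, pos_table):
    """fused: (n, d) preliminary fused keyword vectors;
    confidences: (n,) keyword confidences;
    pos_table: (n_max, d) position-embedding lookup table.
    Each keyword receives the embedding of its rank when keywords
    are sorted by descending confidence, added to its vector."""
    order = np.argsort(-confidences)        # indices in descending confidence
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(order))    # rank of each keyword
    return fused + pos_table[ranks]

# toy example: 3 keywords, dimension 2
fused = np.zeros((3, 2))
conf = np.array([0.2, 0.9, 0.5])
pos_table = np.array([[1., 1.], [2., 2.], [3., 3.]])  # embeddings for ranks 0..2
out = add_confidence_position_embeddings(fused, conf, pos_table)
```

Keyword 1 has the highest confidence, so it receives the rank-0 embedding, while keyword 0 (lowest confidence) receives the rank-2 embedding.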

Optionally, the sample data further includes confidences of the sample keywords; the fourth fusion unit is specifically configured to: compute a relevance weight of the current sample keyword with respect to the search information from the confidence of the current sample keyword; compute an intermediate parameter of the current sample keyword based on the fused semantic representation vector of the current sample keyword, the feature vector of the sample search information, and the relevance weight of the current sample keyword; and compute the relevance between the current sample keyword and the sample search information based on the intermediate parameter and a preset temperature parameter.
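One way to read the fourth fusion unit is as a confidence-weighted attention score followed by a temperature-scaled softmax. The sketch below makes that concrete under assumed forms: the relevance weight is taken to be the confidence itself, and the intermediate parameter a scaled dot product. The patent text does not fix these formulas, so treat them as illustrative.

```python
import numpy as np

def keyword_relevance(fused, query_vec, confidences, temperature=1.0):
    """fused: (n, d) fused keyword vectors; query_vec: (d,) feature
    vector of the sample search information; confidences: (n,)
    keyword confidences used as relevance weights (assumed form).
    Returns (n,) relevances that sum to 1."""
    # intermediate parameter: confidence-weighted dot product with the query
    scores = confidences * (fused @ query_vec) / np.sqrt(fused.shape[1])
    # temperature-scaled softmax; a larger temperature flattens the
    # distribution, making the representation more robust to noisy keywords
    scaled = scores / temperature
    scaled -= scaled.max()                  # numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()

fused = np.array([[1., 0.], [0., 1.], [1., 1.]])
query = np.array([1., 0.])
conf = np.array([0.9, 0.1, 0.5])
rel = keyword_relevance(fused, query, conf, temperature=2.0)
```

The keyword that is both aligned with the query and high-confidence receives the largest relevance, while a low-confidence keyword is suppressed even when its vector is non-trivial.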

Optionally, the second computing unit is specifically configured to: compute the sum of the products of the fused semantic representation vector and the relevance of each sample keyword to obtain a preliminary semantic representation vector of the sample keywords; add the preliminary semantic representation vector and the feature vector of the sample search information to obtain a preliminary comprehensive semantic representation vector of the sample keywords; and normalize the preliminary comprehensive semantic representation vector to obtain the comprehensive semantic representation vector of the sample keywords.

Optionally, the second computing unit is further configured to apply random dropout to the preliminary semantic representation vector after the preliminary semantic representation vector of the sample keywords has been obtained as the sum of the products of the fused semantic representation vectors and relevances; the second computing unit is then specifically configured to add the preliminary semantic representation vector after dropout to the feature vector of the sample search information.
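Taken together, the second computing unit with the optional dropout step computes a relevance-weighted sum, applies a random-discard regularizer, adds the search-information vector as a residual, and normalizes. The dropout rate and the use of layer normalization as the normalization step are assumptions for this sketch; the patent only specifies "normalization".

```python
import numpy as np

def comprehensive_vector(fused, relevance, query_vec, drop_rate=0.1,
                         training=False, rng=None):
    """fused: (n, d) fused keyword vectors; relevance: (n,) weights
    from the fourth fusion unit; query_vec: (d,) feature vector of
    the sample search information."""
    # preliminary semantic representation: sum of vector-relevance products
    prelim = (relevance[:, None] * fused).sum(axis=0)
    if training:                             # random discard (dropout)
        rng = rng or np.random.default_rng(0)
        mask = rng.random(prelim.shape) >= drop_rate
        prelim = prelim * mask / (1.0 - drop_rate)   # inverted scaling
    # residual addition of the search-information vector, then layer norm
    combined = prelim + query_vec
    mu, sigma = combined.mean(), combined.std()
    return (combined - mu) / (sigma + 1e-6)

fused = np.array([[1., 0., 0.], [0., 1., 0.]])
rel = np.array([0.75, 0.25])
query = np.array([0.5, 0.5, 0.5])
vec = comprehensive_vector(fused, rel, query)   # inference: no dropout
```

At inference time dropout is disabled (as is conventional), so the output is the deterministic normalized sum.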

Optionally, the sample data further includes sample description information corresponding to the sample object; the training module 902 is specifically configured to perform feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain the sample ranking parameter of the sample object.

Optionally, the training module 902 includes: a fifth fusion unit, configured to perform deep feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a first sample ranking parameter of the sample object; a sixth fusion unit, configured to perform shallow feature fusion on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain a second sample ranking parameter of the sample object; and a third computing unit, configured to compute the sample ranking parameter of the sample object based on the first sample ranking parameter and the second sample ranking parameter.

Optionally, the fifth fusion unit is specifically configured to: obtain the feature vector of the sample description information and the feature vector of the sample search information; generate a concatenated feature vector based on the comprehensive semantic representation vector, the feature vector of the sample description information, and the feature vector of the sample search information; and perform feature fusion on the concatenated feature vector using a preset deep fusion network to obtain the first sample ranking parameter.

Optionally, the sixth fusion unit is specifically configured to: obtain the feature vector of the sample description information and the feature vector of the sample search information; and perform feature fusion on the feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector using a preset shallow fusion network to obtain the second sample ranking parameter.
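The fifth and sixth fusion units together resemble a deep-and-shallow (wide-and-deep style) scoring head: the deep branch runs an MLP over the concatenated vectors, the shallow branch applies a linear layer directly, and the third computing unit combines the two scores. The layer sizes, activations, and the sigmoid combination below are assumptions for illustration, not details from the patent.

```python
import numpy as np

rng = np.random.default_rng(42)
d = 4                                   # per-feature vector dimension

# assumed parameters of the deep fusion network (one hidden layer)
W1, b1 = rng.normal(size=(3 * d, 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
# assumed parameters of the shallow fusion network (a single linear layer)
w_shallow, b_shallow = rng.normal(size=3 * d), 0.0

def ranking_parameter(semantic_vec, desc_vec, query_vec):
    """Combine a deep score (first sample ranking parameter) and a
    shallow score (second sample ranking parameter) into one sample
    ranking parameter, as in the fifth/sixth fusion units and the
    third computing unit."""
    x = np.concatenate([semantic_vec, desc_vec, query_vec])  # concatenated vector
    hidden = np.maximum(0.0, x @ W1 + b1)                    # ReLU MLP layer
    deep_score = hidden @ w2 + b2                            # first parameter
    shallow_score = x @ w_shallow + b_shallow                # second parameter
    return 1.0 / (1.0 + np.exp(-(deep_score + shallow_score)))

score = ranking_parameter(np.ones(d) * 0.1, np.ones(d) * 0.2, np.ones(d) * 0.3)
```

The shallow branch memorizes direct feature interactions while the deep branch generalizes over them; summing the two logits before the sigmoid is the standard way to combine such branches.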

Referring to FIG. 10, a structural block diagram of a ranking apparatus according to an embodiment of the present disclosure is shown.

As shown in FIG. 10, the ranking apparatus may include the following modules:

a second obtaining module 1001, configured to obtain search information and keywords corresponding to objects to be ranked;

a prediction module 1002, configured to input the search information and the keywords into a pre-trained ranking model to obtain the ranking parameters of the objects to be ranked output by the ranking model, the ranking model being obtained by the model training method of any of the above embodiments; and

a ranking module 1003, configured to rank the objects to be ranked based on the ranking parameters.

In the embodiments of the present disclosure, the keyword information of sample objects is fused into the ranking model. These keywords are mined from the unstructured features of the objects; compared with the unstructured features themselves, keywords better distill and summarize the topic of an object and can cover and characterize a wide range of user intents. A ranking model that incorporates keyword information can therefore capture both the semantics of the keywords themselves and the semantic relevance between keywords and the user's search information, improving the accuracy of the ranking model.

Since the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for related details, refer to the corresponding parts of the method embodiments.

In an embodiment of the present disclosure, an electronic device is also provided. The electronic device may include one or more processors and one or more computer-readable storage media storing instructions, such as application programs. When the instructions are executed by the one or more processors, the processors are caused to perform the model training method of any of the above embodiments, or the ranking method of any of the above embodiments.

In an embodiment of the present disclosure, a non-transitory computer-readable storage medium is also provided, on which a computer program is stored. The computer program can be executed by a processor of an electronic device; when executed, it causes the processor to perform the model training method of any of the above embodiments, or the ranking method of any of the above embodiments.

The above-mentioned processor may be a general-purpose processor, including but not limited to a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.

The above-mentioned computer-readable storage medium may include, but is not limited to, read-only memory (ROM), random access memory (RAM), compact disc read-only memory (CD-ROM), electrically erasable programmable read-only memory (EEPROM), a hard disk, a floppy disk, flash memory, and so on.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein, and the structure required to construct such a system is apparent from the above description. Furthermore, the embodiments of the present disclosure are not directed to any particular programming language. It should be understood that various programming languages may be used to implement the content of the embodiments described herein, and the above descriptions of specific languages are provided to disclose the best mode of the embodiments of the present disclosure.

Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the present disclosure may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it is to be understood that, to streamline the disclosure and aid understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments various features are sometimes grouped together into a single embodiment, figure, or description thereof. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present disclosure.

Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components in the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.

The various component embodiments of the present disclosure may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may in practice be used to implement some or all of the functions of some or all of the components of the apparatus according to the embodiments of the present disclosure. The embodiments of the present disclosure may also be implemented as a device or apparatus program for performing part or all of the methods described herein. Such a program implementing the embodiments of the present disclosure may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the embodiments of the present disclosure, and that those skilled in the art may devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The embodiments of the present disclosure may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not denote any order; these words may be interpreted as names.

Those skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the systems, apparatuses, and units described above, which are not repeated here.

The above are only specific implementations of the embodiments of the present disclosure, but the protection scope of the embodiments of the present disclosure is not limited thereto. Any change or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the embodiments of the present disclosure shall fall within the protection scope of the embodiments of the present disclosure.

Claims (17)

1.一种模型训练方法,其特征在于,包括:1. a model training method, is characterized in that, comprises: 获取样本数据;所述样本数据包括样本对象对应的样本搜索信息和样本关键词;Obtain sample data; the sample data includes sample search information and sample keywords corresponding to the sample object; 在预设的待训练模型中,对所述样本关键词和所述样本搜索信息进行特征融合,得到所述样本关键词的综合语义表征向量,基于所述综合语义表征向量获取所述样本对象的样本排序参数;In the preset to-be-trained model, feature fusion is performed on the sample keyword and the sample search information to obtain a comprehensive semantic representation vector of the sample keyword, and the sample object is obtained based on the comprehensive semantic representation vector. sample sorting parameters; 将基于所述样本排序参数确定训练完成的模型作为排序模型。The trained model is determined based on the sample sorting parameters as the sorting model. 2.根据权利要求1所述的方法,其特征在于,对所述样本关键词和所述样本搜索信息进行特征融合,包括:2. The method according to claim 1, wherein the feature fusion is performed on the sample keywords and the sample search information, comprising: 针对每个样本关键词,对当前样本关键词与其他样本关键词进行特征融合,得到当前样本关键词的融合语义表征向量;For each sample keyword, feature fusion is performed on the current sample keyword and other sample keywords to obtain the fusion semantic representation vector of the current sample keyword; 针对每个样本关键词,对当前样本关键词与所述样本搜索信息进行特征融合,得到当前样本关键词与样本搜索信息之间的相关度;For each sample keyword, feature fusion is performed on the current sample keyword and the sample search information to obtain the correlation between the current sample keyword and the sample search information; 基于各样本关键词的融合语义表征向量和相关度,计算所述样本关键词的综合语义表征向量。Based on the fusion semantic representation vector and relevance of each sample keyword, the comprehensive semantic representation vector of the sample keyword is calculated. 3.根据权利要求1所述的方法,其特征在于,对所述样本关键词和所述样本搜索信息进行特征融合,包括:3. 
The method according to claim 1, wherein the feature fusion is performed on the sample keywords and the sample search information, comprising: 从预设的词与特征向量的对应关系中,查询各样本关键词的特征向量和所述样本搜索信息的特征向量;From the preset correspondence between words and feature vectors, query the feature vector of each sample keyword and the feature vector of the sample search information; 针对每个样本关键词,对当前样本关键词的特征向量与其他样本关键词的特征向量进行特征融合,得到当前样本关键词的融合语义表征向量;For each sample keyword, feature fusion is performed between the feature vector of the current sample keyword and the feature vectors of other sample keywords to obtain the fusion semantic representation vector of the current sample keyword; 针对每个样本关键词,对当前样本关键词的融合语义表征向量与所述样本搜索信息的特征向量进行特征融合,得到当前样本关键词与所述样本搜索信息之间的相关度;For each sample keyword, feature fusion is performed on the fusion semantic representation vector of the current sample keyword and the feature vector of the sample search information to obtain the correlation between the current sample keyword and the sample search information; 基于各样本关键词的融合语义表征向量和相关度,以及所述样本搜索信息的特征向量,计算所述样本关键词的综合语义表征向量。Based on the fusion semantic representation vector and relevance of each sample keyword, and the feature vector of the sample search information, the comprehensive semantic representation vector of the sample keyword is calculated. 4.根据权利要求3所述的方法,其特征在于,对当前样本关键词的特征向量与其他样本关键词的特征向量进行特征融合,包括:4. The method according to claim 3, wherein the feature fusion of the feature vector of the current sample keyword and the feature vectors of other sample keywords, comprising: 通过自注意力机制对当前样本关键词的特征向量与其他样本关键词的特征向量进行特征融合,得到当前样本关键词的融合语义表征向量。The feature vector of the current sample keyword and the feature vector of other sample keywords are fused through the self-attention mechanism, and the fused semantic representation vector of the current sample keyword is obtained. 5.根据权利要求3所述的方法,其特征在于,所述样本数据还包括所述样本关键词的置信度;对当前样本关键词的特征向量与其他样本关键词的特征向量进行特征融合,包括:5. 
The method according to claim 3, wherein the sample data further comprises the confidence of the sample keywords; feature fusion is performed on the feature vector of the current sample keyword and the feature vectors of other sample keywords, include: 通过自注意力机制对当前样本关键词的特征向量与其他样本关键词的特征向量进行特征融合,得到当前样本关键词的初步融合语义表征向量;Through the self-attention mechanism, the feature vector of the current sample keyword and the feature vector of other sample keywords are feature fusion, and the preliminary fusion semantic representation vector of the current sample keyword is obtained; 按照所述置信度对所述样本关键词进行降序排序,基于排序结果获取各样本关键词的位置嵌入向量;Sort the sample keywords in descending order according to the confidence level, and obtain the position embedding vector of each sample keyword based on the sorting result; 将当前样本关键词的初步融合语义表征向量和位置嵌入向量相加,得到当前样本关键词的融合语义表征向量。The initial fusion semantic representation vector of the current sample keyword and the position embedding vector are added to obtain the fusion semantic representation vector of the current sample keyword. 6.根据权利要求3所述的方法,其特征在于,所述样本数据还包括所述样本关键词的置信度;对当前样本关键词的融合语义表征向量与所述样本搜索信息的特征向量进行特征融合,包括:6. The method according to claim 3, wherein the sample data further includes the confidence of the sample keywords; the fusion semantic representation vector of the current sample keywords and the feature vector of the sample search information are performed. 
Feature fusion, including: 计算当前样本关键词的置信度与所述搜索信息的相关度权重;Calculate the confidence of the current sample keyword and the relevance weight of the search information; 基于当前样本关键词的融合语义表征向量,所述样本搜索信息的特征向量,以及当前样本关键词的相关度权重,计算当前样本关键词的中间参数;Calculate the intermediate parameter of the current sample keyword based on the fusion semantic representation vector of the current sample keyword, the feature vector of the sample search information, and the relevance weight of the current sample keyword; 基于所述中间参数和预设的温度参数,计算当前样本关键词与所述样本搜索信息之间的相关度。Based on the intermediate parameter and the preset temperature parameter, the correlation between the current sample keyword and the sample search information is calculated. 7.根据权利要求3所述的方法,其特征在于,基于各样本关键词的融合语义表征向量和相关度,以及所述样本搜索信息的特征向量,计算所述样本关键词的综合语义表征向量,包括:7. The method according to claim 3, wherein the comprehensive semantic representation vector of the sample keywords is calculated based on the fusion semantic representation vector and correlation of each sample keyword and the feature vector of the sample search information ,include: 计算各样本关键词的融合语义表征向量和相关度的乘积的总和,得到所述样本关键词的初步语义表征向量;Calculate the sum of the product of the fusion semantic representation vector of each sample keyword and the correlation degree to obtain the preliminary semantic representation vector of the sample keyword; 将所述初步语义表征向量与所述样本搜索信息的特征向量相加,得到所述样本关键词的初步综合语义表征向量;adding the preliminary semantic representation vector and the feature vector of the sample search information to obtain a preliminary comprehensive semantic representation vector of the sample keyword; 对所述初步综合语义表征向量进行标准化处理,得到所述样本关键词的综合语义表征向量。Standardize the preliminary comprehensive semantic representation vector to obtain the comprehensive semantic representation vector of the sample keyword. 8.根据权利要求7所述的方法,其特征在于,8. 
The method of claim 7, wherein: 在计算各样本关键词的融合语义表征向量和相关度的乘积的总和,得到所述样本关键词的初步语义表征向量之后,还包括:对所述初步语义表征向量进行随机丢弃处理;After calculating the sum of the product of the fusion semantic representation vector of each sample keyword and the correlation degree to obtain the preliminary semantic representation vector of the sample keyword, the method further includes: randomly discarding the preliminary semantic representation vector; 将所述初步语义表征向量与所述样本搜索信息的特征向量相加,包括:将随机丢弃处理后的初步语义表征向量与所述样本搜索信息的特征向量相加。Adding the preliminary semantic representation vector and the feature vector of the sample search information includes: adding the preliminary semantic representation vector after random discarding processing to the feature vector of the sample search information. 9.根据权利要求1所述的方法,其特征在于,所述样本数据还包括所述样本对象对应的样本描述信息;基于所述综合语义表征向量获取所述样本对象的样本排序参数,包括:9 . The method according to claim 1 , wherein the sample data further comprises sample description information corresponding to the sample object; obtaining the sample sorting parameters of the sample object based on the comprehensive semantic representation vector, comprising: 10 . 对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行特征融合,得到所述样本对象的样本排序参数。Feature fusion is performed on the comprehensive semantic representation vector, the sample description information, and the sample search information to obtain sample sorting parameters of the sample object. 10.根据权利要求9所述的方法,其特征在于,对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行特征融合,包括:10. 
The method according to claim 9, wherein feature fusion is performed on the comprehensive semantic representation vector, the sample description information and the sample search information, comprising: 对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行深层特征融合,得到所述样本对象的第一样本排序参数;performing deep feature fusion on the comprehensive semantic representation vector, the sample description information and the sample search information to obtain the first sample sorting parameter of the sample object; 对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行浅层特征融合,得到所述样本对象的第二样本排序参数;performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information and the sample search information to obtain a second sample sorting parameter of the sample object; 基于所述第一样本排序参数和所述第一样本排序参数,计算所述样本对象的样本排序参数。A sample ranking parameter of the sample object is calculated based on the first sample ranking parameter and the first sample ranking parameter. 11.根据权利要求10所述的方法,其特征在于,对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行深层特征融合,包括:11. The method according to claim 10, wherein performing deep feature fusion on the comprehensive semantic representation vector, the sample description information and the sample search information, comprising: 获取所述样本描述信息的特征向量和所述样本搜索信息的特征向量;Obtain the feature vector of the sample description information and the feature vector of the sample search information; 基于所述综合语义表征向量、所述样本描述信息的特征向量和所述样本搜索信息的特征向量生成拼接特征向量;Generate a splicing feature vector based on the comprehensive semantic representation vector, the feature vector of the sample description information and the feature vector of the sample search information; 利用预设的深层融合网络对所述拼接特征向量进行特征融合处理,得到所述第一样本排序参数。A preset deep fusion network is used to perform feature fusion processing on the spliced feature vector to obtain the first sample sorting parameter. 12.根据权利要求10所述的方法,其特征在于,对所述综合语义表征向量、所述样本描述信息和所述样本搜索信息进行浅层特征融合,包括:12. 
The method according to claim 10, wherein performing shallow feature fusion on the comprehensive semantic representation vector, the sample description information and the sample search information, comprising: 获取所述样本描述信息的特征向量和所述样本搜索信息的特征向量;Obtain the feature vector of the sample description information and the feature vector of the sample search information; 利用预设的浅层融合网络对所述样本描述信息的特征向量、所述样本搜索信息的特征向量与所述综合语义表征向量进行特征融合处理,得到所述第二样本排序参数。A preset shallow fusion network is used to perform feature fusion processing on the feature vector of the sample description information, the feature vector of the sample search information, and the comprehensive semantic representation vector to obtain the second sample ranking parameter. 13.一种排序方法,其特征在于,包括:13. A sorting method, comprising: 获取待排序对象对应的搜索信息和关键词;Obtain the search information and keywords corresponding to the objects to be sorted; 将所述搜索信息和所述关键词输入预先训练的排序模型,得到所述排序模型输出的所述待排序对象的排序参数;所述排序模型通过如权利要求1-12中任一项所述的模型训练方法得到;Inputting the search information and the keywords into a pre-trained sorting model to obtain sorting parameters of the objects to be sorted output by the sorting model; The model training method is obtained; 基于所述排序参数对所述待排序对象进行排序。The objects to be sorted are sorted based on the sorting parameters. 14.一种模型训练装置,其特征在于,包括:14. 
14. A model training device, comprising:
a first acquisition module, configured to acquire sample data, the sample data comprising sample search information and sample keywords corresponding to a sample object;
a training module, configured to perform, in a preset model to be trained, feature fusion on the sample keywords and the sample search information to obtain a comprehensive semantic representation vector of the sample keywords, and to obtain a sample sorting parameter of the sample object based on the comprehensive semantic representation vector; and
a determination module, configured to take the model determined to have completed training based on the sample sorting parameter as the sorting model.
15. A sorting device, comprising:
a second acquisition module, configured to acquire search information and keywords corresponding to an object to be sorted;
a prediction module, configured to input the search information and the keywords into a pre-trained sorting model to obtain a sorting parameter of the object to be sorted output by the sorting model, the sorting model being obtained through the model training method according to any one of claims 1-12; and
a sorting module, configured to sort the object to be sorted based on the sorting parameter.
16. An electronic device, comprising:
one or more processors; and
one or more computer-readable storage media having instructions stored thereon;
wherein the instructions, when executed by the one or more processors, cause the processors to perform the model training method according to any one of claims 1 to 12, or to perform the sorting method according to claim 13.
17. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, causes the processor to perform the model training method according to any one of claims 1 to 12, or to perform the sorting method according to claim 13.
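The device claims (14 and 15) mirror the method steps as modules. A minimal structural sketch follows; the class names, toy "model", and data shapes are all invented for illustration and do not reflect the patent's actual modules:

```python
class ModelTrainingDevice:
    """Sketch of claim 14: acquisition, training and determination modules."""

    def acquire(self):
        # First acquisition module: sample search information + sample keywords.
        return [{"search": "q1", "keywords": ["k1", "k2"], "label": 1.0}]

    def train(self, samples):
        # Training module + determination module, collapsed for brevity:
        # fuse keywords with search info, derive sample sorting parameters,
        # and return the model once training is deemed complete.
        model = lambda search, keywords: float(len(keywords))  # toy "model"
        return model

class SortingDevice:
    """Sketch of claim 15: acquisition, prediction and sorting modules."""

    def __init__(self, model):
        self.model = model

    def sort(self, objects):
        # Prediction module scores each object; sorting module orders them.
        scored = [(self.model(o["search"], o["keywords"]), o) for o in objects]
        return [o for _, o in sorted(scored, key=lambda t: t[0], reverse=True)]

trainer = ModelTrainingDevice()
model = trainer.train(trainer.acquire())
ranker = SortingDevice(model)
out = ranker.sort([
    {"search": "q", "keywords": ["a"]},
    {"search": "q", "keywords": ["a", "b", "c"]},
])
print([len(o["keywords"]) for o in out])  # → [3, 1]
```

The separation matters because the training device and the sorting device can run on different machines: only the trained model crosses the boundary.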
CN202210143365.4A 2022-02-16 2022-02-16 Model training and sorting method and device, electronic equipment and storage medium Pending CN114595370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210143365.4A CN114595370A (en) 2022-02-16 2022-02-16 Model training and sorting method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN114595370A true CN114595370A (en) 2022-06-07

Family

ID=81806966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210143365.4A Pending CN114595370A (en) 2022-02-16 2022-02-16 Model training and sorting method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114595370A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929578A * 2019-10-25 2020-03-27 Nanjing University of Aeronautics and Astronautics An attention-based anti-occlusion pedestrian detection method
CN112364624A * 2020-11-04 2021-02-12 Chongqing University of Posts and Telecommunications Keyword extraction method based on deep learning language model fusion semantic features
CN113569002A * 2021-02-01 2021-10-29 Tencent Technology (Shenzhen) Co Ltd Text search method, device, equipment and storage medium
CN113590796A * 2021-08-04 2021-11-02 Baidu Online Network Technology (Beijing) Co Ltd Training method and device of ranking model and electronic equipment


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116186211A * 2022-12-19 2023-05-30 Beihang University Text aggressiveness detection and conversion method
CN116186211B * 2022-12-19 2023-07-25 Beihang University A Method of Text Attack Detection and Transformation

Similar Documents

Publication Publication Date Title
CN117453921B (en) Data information label processing method of large language model
CN107066464B (en) Semantic natural language vector space
EP3896581A1 (en) Learning to rank with cross-modal graph convolutions
CN106503192A (en) Name entity recognition method and device based on artificial intelligence
US20150095300A1 (en) System and method for mark-up language document rank analysis
US10572806B2 (en) Question answering with time-based weighting
CN111539197A (en) Text matching method and device, computer system and readable storage medium
JP2015518210A (en) Method, apparatus and computer-readable medium for organizing data related to products
KR102695381B1 (en) Identifying entity-attribute relationships
US20240281472A1 (en) Interactive interface with generative artificial intelligence
KR20170004154A (en) Method and system for automatically summarizing documents to images and providing the image-based contents
Liu et al. Open intent discovery through unsupervised semantic clustering and dependency parsing
CN113282711A (en) Internet of vehicles text matching method and device, electronic equipment and storage medium
US12153611B1 (en) Method and system for multi-level artificial intelligence supercomputer design
CN116089567A (en) Recommendation method, device, equipment and storage medium for search keywords
US10573190B2 (en) Iterative deepening knowledge discovery using closure-based question answering
Yuan et al. A new model of information content for measuring the semantic similarity between concepts
Arbaaeen et al. Natural language processing based question answering techniques: A survey
Zhou et al. Leverage knowledge graph and GCN for fine-grained-level clickbait detection
CN112732917B (en) Method and device for determining entity chain finger result
Wu et al. Typical opinions mining based on Douban film comments in animated movies
CN114595370A (en) Model training and sorting method and device, electronic equipment and storage medium
CN116450781A (en) Question and answer processing method and device
Dziczkowski et al. An opinion mining approach for web user identification and clients' behaviour analysis
CN117009170A (en) Training sample generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination