WO2022041982A1 - Data recommendation method, apparatus, computer device, and storage medium - Google Patents

Data recommendation method, apparatus, computer device, and storage medium

Info

Publication number
WO2022041982A1
Authority
WO
WIPO (PCT)
Prior art keywords
domain
target
feature
sample
features
Prior art date
Application number
PCT/CN2021/101674
Other languages
English (en)
French (fr)
Inventor
郝晓波
葛凯凯
刘雨丹
唐琳瑶
张旭
谢若冰
林乐宇
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2022041982A1
Priority to US17/948,082 (US20230017667A1)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/535: Tracking the activity of the user
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/9035: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/2866: Architectures; Arrangements
    • H04L67/30: Profiles
    • H04L67/306: User profiles

Definitions

  • the present application relates to the field of computer technology, and in particular, to a data recommendation method, apparatus, computer device, and storage medium.
  • the recommendation system screens out the business content to be recommended according to recent hot events, and then pushes the filtered business content to be recommended to all users.
  • the embodiments of the present application provide a data recommendation method, including:
  • in response to a data recommendation request for a target user in a target domain, obtaining a set of business objects that are associated with the target user in multiple domains;
  • the business object set includes, for each domain, the business objects that have an associated relationship with the target user, and the multiple domains include the target domain;
  • a target recommended business object feature matching the interest feature of the target domain is obtained from the features of a plurality of business objects to be recommended, and a target recommended business object corresponding to the target recommended business object feature is output.
  • a data recommendation device including:
  • the obtaining module is used for obtaining the set of business objects that are associated with the target user in multiple domains in response to a data recommendation request for the target user in the target domain;
  • the set of business objects includes business objects that have an associated relationship with the target user, and the multiple domains include the target domain;
  • an encoding module configured to perform encoding processing on the multiple domains and the business object set to obtain the target domain interest characteristics of the target user in the target domain;
  • the obtaining module is also used to obtain the features of the business objects to be recommended of a plurality of business objects to be recommended under the target domain;
  • a determination module configured to obtain the target recommended business object feature matching the target domain interest feature from a plurality of to-be-recommended business object features
  • the output module is used for outputting the target recommended business object corresponding to the feature of the target recommended business object.
  • An aspect of the embodiments of the present application provides a computer device, including a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the methods in the foregoing embodiments.
  • An aspect of the embodiments of the present application provides a non-volatile computer-readable storage medium, where the computer-readable storage medium stores a computer program, the computer program includes program instructions, and when the program instructions are executed by a processor, the methods in the foregoing embodiments are performed.
  • the embodiments of the present application provide a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium.
  • when the computer instructions are executed by a processor of a computer device, the methods in the above embodiments are performed.
  • FIG. 1 is a system architecture diagram of a data recommendation provided by an embodiment of the present application.
  • FIGS. 2a-2d are schematic diagrams of a data recommendation scenario provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a data recommendation provided by an embodiment of the present application.
  • FIG. 4a is a schematic diagram of a cross-domain recommendation model provided by an embodiment of the present application.
  • FIG. 4b is an overall framework diagram of a data recommendation provided by an embodiment of the present application.
  • FIGS. 5a-5b are schematic diagrams of a target recommendation business object provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a data recommendation provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a sample cross-domain recommendation model provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a data recommendation apparatus provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
  • the data recommendation method of the present application involves cloud computing under cloud technology; from the perspective of application, the data recommendation method of the present application involves artificial intelligence cloud services under cloud technology.
  • the terminal device can obtain sufficient computing power and storage space through cloud computing technology, and then perform the extraction of the target user's target domain interest features in the target domain involved in this application, and the extraction of the to-be-recommended business object features of multiple to-be-recommended business objects.
  • the data recommendation method involved in this application can be encapsulated as an artificial intelligence service, and only one interface is exposed externally.
  • when the business object recommendation function involved in this application needs to be used in a certain business scenario, the business object recommendation can be completed by calling this interface.
  • the solutions provided in the embodiments of this application belong to machine learning/deep learning under the field of artificial intelligence.
  • this application mainly involves determining the user's interest feature in a specific target field by calling a recommendation model and based on the user's behavior data in multiple fields, and the interest feature can be used to determine the business object to be recommended by the user in the target field.
  • the application can be applied to the following scenario: when products in the target field need to be recommended to the target user, the behavior data of the target user in multiple fields is obtained, the interest features of the target user in the target field are determined according to the behavior data in the multiple fields, the interest features are matched with the product features of multiple products to be recommended in the target field, and the products to be recommended that correspond to the matched product features are output.
  • matching products to be recommended can be pushed to users to achieve precise marketing and improve the accuracy of recommendation.
  • This application mines the user's personalized preferences according to the user's behavior, and then determines the recommended business objects for the user. Different recommended business objects can be determined for different user behaviors, realizing personalized content recommendation. Compared with determining recommended business objects according to hot events, recommended content determined from the nature of user needs is the content that the user really wants, which can improve the accuracy of content recommendation. In addition, user behaviors in different fields complement each other when determining the user's interest features in the target field; recommending business objects in the target field based on these interest features is more targeted, which can further improve the accuracy of data recommendation.
  • FIG. 1 is a system architecture diagram of a data recommendation provided by an embodiment of the present application.
  • This application relates to a server 10d and a terminal device cluster, and the terminal device cluster may include: terminal device 10a, terminal device 10b, . . . , terminal device 10c, and the like.
  • when the terminal device 10a receives a data recommendation request about the user in the target field, the terminal device 10a sends the data recommendation request to the server 10d.
  • the server 10d acquires a set of business objects that are associated with the target user in multiple domains, and performs cross-coding processing on the multiple domains and the set of business objects to obtain the target domain interest feature of the target user in the target domain. The server 10d then obtains the to-be-recommended business object features of multiple to-be-recommended business objects in the target domain, determines from them the target recommended business object features that match the target domain interest feature, and outputs the target recommended business objects corresponding to the target recommended business object features.
  • the server 10d may send the target recommended business object to the terminal device 10a, and the terminal device 10a displays the target recommended business object on the page, so as to accurately recommend the business object to the target user.
  • the determination of the target user's interest characteristics in the target domain and the determination of the target recommended business object may also be performed by the terminal device 10a.
  • the server 10d shown in FIG. 1 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • the terminal device 10a, terminal device 10b, terminal device 10c, etc. shown in FIG. 1 can be smart devices such as mobile phones, tablet computers, notebook computers, handheld computers, mobile internet devices (MID, mobile internet devices), wearable devices, etc.
  • the terminal device cluster and the server 10d may be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
  • FIG. 2a to FIG. 2d are schematic diagrams of a data recommendation scenario provided by an embodiment of the present application.
  • when the server 10d receives a data recommendation request for the target user in the text field, the server 10d obtains the business objects that are associated with the target user in multiple fields, where the multiple fields include the text field.
  • in this example, the multiple domains are the text domain, the video domain and the shopping domain.
  • the business objects associated with the target user are the business objects that have been exposed/clicked by the target user.
  • the server 10d acquires a text set 20a in the text field that has an associated relationship with the target user. As shown in FIG. 2a, the text set 20a is empty, indicating that the target user has not read any text.
  • the server 10d obtains a video set 20b in the video domain that is associated with the target user.
  • the video set 20b includes Video 1, Video 2, and Video 3, indicating that the target user has viewed/clicked on Video 1, Video 2, and Video 3.
  • the server 10d obtains the commodity set 20c that is associated with the target user in the shopping field.
  • the commodity set 20c includes commodity 1, commodity 2 and commodity 3, indicating that the target user has viewed/clicked/purchased commodity 1, commodity 2 and commodity 3.
  • the server 10d obtains the text domain features corresponding to the text domain, the video domain features corresponding to the video domain, and the shopping domain features corresponding to the shopping domain, and inputs these three domain features into the first encoder (Encoder).
  • the first encoder encodes the three domain features into domain feature vectors, the domain feature vectors are then input into the second encoder (Transformer), and the second encoder encodes the domain feature vectors into feature vector 20d.
  • the server 10d obtains the text feature corresponding to the empty text, and the text feature corresponding to the empty text may be a vector of all 0s or a vector of all 1s.
  • the three text features are input into the first encoder Encoder, and the first encoder encodes the three text features into text feature vectors.
  • the server 10d obtains the video features corresponding to video 1, the video features corresponding to video 2, and the video features corresponding to video 3.
  • the three video features are input into the first encoder Encoder, and the first encoder encodes the three video features into a video feature vector.
  • the server 10d obtains the product features corresponding to the product 1, the product features corresponding to the product 2, and the product features corresponding to the product 3.
  • the three commodity features are input into the first encoder Encoder, and the first encoder encodes the three commodity features into commodity feature vectors.
  • the server 10d inputs the aforementioned text feature vector, video feature vector and product feature vector into the second encoder Transformer, and the second encoder encodes the text feature vector, video feature vector and product feature vector into feature vector 20e.
  • the server 10d then cross-convolves the feature vector 20d and the feature vector 20e into a feature vector 20f, wherein the feature vector 20f also represents the general interest feature of the target user.
  • since the target domain is the text domain, the specific process is as follows: the server 10d splices the text feature vector output by the first encoder (Encoder) and the text domain features corresponding to the text domain into a feature vector 20g, and cross-convolves the feature vector 20f and the feature vector 20g into a feature vector 20h, where the feature vector 20h represents the text domain interest feature of the target user in the text domain.
  • the server 10d has determined the text domain interest feature 20h of the target user in the text domain.
  • the text domain interest feature 20h can be used to predict the preference of the target user in the text domain.
  • After determining the text domain interest feature of the target user in the text domain, prediction can be made. As shown in FIG. 2c, the server 10d acquires the text features of five texts to be recommended, where the five text features are determined during training of the first encoder (Encoder) and the second encoder (Transformer). The server 10d calculates the feature distance between the text domain interest feature 20h and each text feature, and uses the texts corresponding to the two text features with the smallest feature distance as the recommended texts. It is assumed that, among the five text features, the text feature of text 1 and the text feature of text 3 are the ones closest to the text domain interest feature 20h.
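The distance-based selection described above can be sketched as follows. This is a minimal illustration only: the text does not say which feature distance is used, so Euclidean distance is an assumption, and the function names and the three-dimensional feature values are hypothetical.

```python
import math


def euclidean(a, b):
    # Assumed distance metric; the source only says "feature distance".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_n_closest(interest, candidates, n=2):
    # Rank candidate names by distance to the interest feature, smallest first.
    ranked = sorted(candidates, key=lambda name: euclidean(interest, candidates[name]))
    return ranked[:n]


# Hypothetical values standing in for interest feature 20h and five text features.
interest_20h = [0.9, 0.1, 0.4]
text_features = {
    "text 1": [0.8, 0.2, 0.5],
    "text 2": [0.1, 0.9, 0.9],
    "text 3": [1.0, 0.0, 0.4],
    "text 4": [0.0, 0.0, 0.0],
    "text 5": [0.5, 0.8, 0.1],
}

recommended = top_n_closest(interest_20h, text_features, n=2)
print(sorted(recommended))
```

With these illustrative values the two closest candidates are text 1 and text 3, mirroring the example in the text.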
  • the server sends text 1 and text 3 to the terminal device, and the terminal device combines the abstract, cover image and title of text 1, the abstract, cover image and title of text 3, and a video cover (the video cover is issued by other servers in response to other recommendation requests) into a recommendation page 20j.
  • the user can read part of the information of text 1 and text 3 on the recommendation page 20j, and can also click on a text or video to read the text content in detail or watch the video content.
  • For the specific processes of obtaining the business object sets (such as the text set 20a, the video set 20b and the commodity set 20c in the above embodiment) that are associated with the target user in multiple fields (such as the text field, video field and shopping field in the above embodiment), obtaining the target domain interest feature (such as the text domain interest feature 20h in the above embodiment) and a plurality of to-be-recommended business object features (such as the text features of the 5 texts in the above embodiment), and determining the target recommended business objects (such as text 1 and text 3 in the above embodiment), reference may be made to the following embodiments corresponding to FIGS. 3 to 7.
  • FIG. 3 is a schematic flowchart of a data recommendation provided by an embodiment of the present application.
  • the following embodiment uses a server as an execution subject to describe how to determine a target user's target recommendation business object.
  • the data recommendation method may include the following steps:
  • Step S101 in response to a data recommendation request for a target user in a target domain, obtain a set of business objects that are associated with the target user in multiple domains.
  • in response to the data recommendation request for the target user in the target field, the server obtains a set of business objects that are associated with the target user in multiple fields (such as the text set 20a, the video set 20b, and the commodity set 20c in the above-mentioned embodiments corresponding to FIGS. 2a-2d).
  • a data recommendation request is a request to recommend data in the target field to the target user. For example, if the target field is the text field, then the data recommendation request is a request to recommend text to the target user; the target field is the shopping field, then the data recommendation A request is a request to recommend an item to the target user.
  • the target field is any one of the multiple fields. A business object that has an associated relationship with the target user is either a business object that has been exposed to the target user (that is, the target user has read it) or a non-entity business object (a non-entity business object is empty and can be represented by the special character UNK).
  • the number of business objects included in the set of business objects associated with the target user in each domain is the same and equal to the number threshold, and the business objects may be empty (or the business objects may be non-entity business objects).
  • the multiple domains include at least two of the following: text domain, video domain, game domain, embedded application domain (i.e. applet domain), shopping domain, etc. The text domain can be further subdivided into the short text domain (for example, news and official account articles) and the long text domain (for example, novels), and the video domain can be subdivided into the short video domain (for example, short videos within 1 minute) and the long video domain (for example, movies and TV series).
  • for example, the target user has read text 1 and text 2, and has watched video 1. If the multiple fields include the text field and the video field, and the number of business objects associated with the target user in each field is 3, then the business object set can be: text 1, text 2, empty (UNK); video 1, empty (UNK), empty (UNK). Here the business objects "text 1", "text 2" and "video 1" are entity business objects, and the business object "UNK" is a non-entity business object.
  • if the target user does not have any entity business objects in the target domain, the recommendation request can be regarded as a cold-start recommendation.
  • the set of business objects belongs to the set of business objects to be recommended, the set of business objects to be recommended includes multiple business objects to be recommended in multiple fields, and the set of business objects to be recommended is determined when training a cross-domain recommendation model.
  • the set of business objects to be recommended includes text 1, text 2, and text 3 in the text field, and video 1, video 2, and video 3 in the video field. If the target user has read text 1, text 4, and video 4 , then the business objects associated with the target user include text 1, empty, and empty in the text field, and empty, empty, and empty in the video field.
  • the specific process of obtaining the set of business objects that are associated with the target user in multiple fields is as follows: obtain the business objects that have been exposed to the target user in each field (called exposed business objects; exposed business objects also belong to the set of business objects to be recommended). For any field, if the number of exposed business objects in this field is equal to the quantity threshold (the quantity threshold is the input quantity threshold of the cross-domain recommendation model), the exposed business objects are combined into the unit business object set of this field.
  • the business object set includes multiple unit business object sets, and each unit business object set corresponds to one field. If the number of exposed business objects in a field is less than the quantity threshold, the exposed business objects and non-entity business objects are combined into the unit business object set of this field.
  • the number of business objects contained in the unit business object set of each domain is equal to the quantity threshold, and the unit business object sets of all domains are combined into the business object set associated with the target user.
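The padding procedure above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and truncating when a field has more exposed objects than the threshold is an assumption, since the text only covers the "equal" and "less than" cases.

```python
UNK = "UNK"  # special character used in the text for a non-entity (empty) business object


def build_unit_set(exposed, candidate_pool, threshold):
    # Keep only exposed objects that belong to the set of business objects to be recommended.
    kept = [obj for obj in exposed if obj in candidate_pool]
    # Assumption: truncate if there are more exposed objects than the threshold.
    kept = kept[:threshold]
    # Pad with non-entity business objects up to the quantity threshold.
    return kept + [UNK] * (threshold - len(kept))


def build_business_object_set(exposed_by_domain, pool_by_domain, threshold):
    # One unit business object set per domain, each of length `threshold`.
    return {
        domain: build_unit_set(exposed_by_domain.get(domain, []), pool, threshold)
        for domain, pool in pool_by_domain.items()
    }


pool = {
    "text": {"text 1", "text 2", "text 3"},
    "video": {"video 1", "video 2", "video 3"},
}
exposed = {"text": ["text 1", "text 4"], "video": ["video 4"]}

business_object_set = build_business_object_set(exposed, pool, threshold=3)
print(business_object_set)
```

The inputs reproduce the earlier example in which the target user has read text 1, text 4 and video 4: only text 1 is in the candidate pool, so every other slot is filled with UNK.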
  • the exposed business objects may be determined from the knowledge graph triples of the target user. A knowledge graph triple of the target user includes the user ID of the target user, the domain ID of a domain involved by the target user (such a domain belongs to the multiple domains and is the domain to which a business object that has been exposed to the target user belongs), and a business object that has been exposed to the target user in that domain. A knowledge graph triple can therefore be represented as (user, domain, business object).
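As a rough sketch of how (user, domain, business object) triples could yield the exposed business objects per domain, assuming the triples are available as plain Python tuples (the storage format, the function name and the sample IDs are all hypothetical):

```python
from collections import defaultdict


def exposed_objects_by_domain(triples, user_id):
    # Group the business objects of one user's triples by domain ID.
    grouped = defaultdict(list)
    for user, domain, business_object in triples:
        if user == user_id:
            grouped[domain].append(business_object)
    return dict(grouped)


triples = [
    ("user_a", "text", "text 1"),
    ("user_a", "text", "text 2"),
    ("user_a", "video", "video 1"),
    ("user_b", "shopping", "commodity 1"),
]

exposed = exposed_objects_by_domain(triples, "user_a")
print(exposed)
```

Domains in which the user has no triples are simply absent from the result, which corresponds to a unit business object set made up entirely of non-entity business objects.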
  • Step S102 Encoding the multiple domains and the business object set to obtain the target domain interest feature of the target user in the target domain.
  • the trained cross-domain recommendation model is invoked to perform cross-coding processing on the multiple domains and the business object set of the target user, so as to obtain the general interest feature of the target user (such as the feature vector 20f in the embodiment corresponding to FIGS. 2a-2d).
  • the general interest feature is a one-dimensional vector.
  • the knowledge information of the target domain needs to be introduced as a domain bias to convert the user interest space, so as to learn the target domain interest feature of the target user in the target domain (such as the text domain interest feature 20h in the embodiment corresponding to FIGS. 2a-2d above); the business objects that the target user may like in the target domain can subsequently be predicted based on the target domain interest feature.
  • the process of determining the features of interest in the target domain is as follows:
  • the server obtains from the business object set the business objects that are associated with the target user in the target domain (referred to as the target business objects, such as the three empty texts included in the text set 20a in the embodiment corresponding to FIGS. 2a-2d).
  • the server invokes the cross-domain recommendation model to cross-encode the general interest feature, the target domain and the target business objects, and obtains the target domain interest feature of the target user in the target domain, where the target domain interest feature is also a one-dimensional vector.
  • the cross-domain recommendation model includes a domain encoder, an object encoder, and a domain-object cross-encoder.
  • the domain encoder is used to encode the multiple domains, the object encoder is used to encode the business object set, and the domain-object cross-encoder is used to re-encode the vectors encoded from the multiple domains and from the business object set into the general interest feature of the target user.
  • the domain encoder consists of an Encoder and a Transformer: the Encoder is an RNN structure with an attention mechanism, and the Transformer includes n encoders (and no decoders).
  • the Encoder in the domain encoder is called the intra-domain domain encoder.
  • the Transformer in the domain encoder is called the domain encoder.
  • the server obtains the original domain features of each domain (the original domain features of each domain are determined when training the cross-domain recommendation model). Assuming that the number of domains is k and the dimension of each original domain feature is n1, the k original domain features can be expressed as a k×n1 matrix. The k original domain features are input as the original domain feature sequence to the Encoder in the domain encoder; after the Encoder performs self-attention recurrent encoding on them, k first encoding vectors are output, and the dimension of each first encoding vector may be n2.
  • the k first encoding vectors may be combined into a k×n2 matrix (this matrix may also be referred to as the first encoding vector sequence), and average pooling is performed on the matrix to obtain a second encoding vector with a dimension of k×1. The second encoding vector is then input into the Transformer in the domain encoder; after the Transformer performs multi-attention encoding on it, domain features with a dimension of k×1 are output, where n1 can be equal to 512, n2 can be equal to 8, and the output data dimension of the Transformer is the same as the input data dimension. Through cross-learning of multiple domains, dense domain features can be expressed and the problem of sparse domain features can be solved.
  • average pooling refers to expressing the average value of a row (or a column) as the feature of this row (or this column).
  • the average pooling can be expressed by the following formula (1):
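A minimal sketch of row-wise average pooling as described above, reducing a k×n2 matrix to a vector of length k. The function name and the small sample matrix are illustrative assumptions, and this is not presented as the patent's formula (1).

```python
def average_pool_rows(matrix):
    # Represent each row by the mean of its entries: (k x n2) -> length-k vector.
    return [sum(row) / len(row) for row in matrix]


# Hypothetical first encoding vector sequence with k = 3 domains and n2 = 4.
first_encoding_vectors = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]

second_encoding_vector = average_pool_rows(first_encoding_vectors)
print(second_encoding_vector)
```

Replacing the mean with `max(row)` or `min(row)` would give maximum or minimum pooling instead.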
  • the object encoder includes multiple intra-domain object encoders (Encoders) and an inter-domain object transformer (Transformer).
  • the number of intra-domain object encoders is the same as the number of domains. Each intra-domain object encoder is used to encode the business objects in the same domain (that is, one unit business object set), and the inter-domain object converter is used to re-encode the multiple encoding vectors output by the multiple intra-domain object encoders.
  • the business object set includes multiple unit business object sets, each unit business object set corresponds to a domain, each intra-domain object encoder (Encoder) is the RNN structure of the attention mechanism, and the inter-domain object converter (Transformer) includes n encoders (decoder not included).
  • the server obtains the original object feature of each business object in the unit business object set (the original object feature of each business object is determined when training the cross-domain recommendation model; if the business object is empty (UNK), its original object feature can be a vector of all 0s, all 1s, or other preset values). Assuming that the unit business object set includes p business objects and the dimension of each original object feature is r1, the p original object features can be expressed as a p×r1 matrix, which is input as the original object feature sequence to the intra-domain object encoder (Encoder).
  • After encoding the original object feature sequence, the Encoder outputs p third encoding vectors, and the dimension of each third encoding vector may be r2.
  • the p third encoding vectors can be combined into a matrix p ⁇ r2, and average pooling is performed on the matrix to obtain a fourth encoding vector with a dimension of p ⁇ 1.
  • Each unit business object set is encoded in the above manner to obtain a fourth encoding vector of each unit business object set.
  • k fourth encoding vectors with dimension p ⁇ 1 can be obtained, and the k fourth encoding vectors k ⁇ p are input into the inter-domain object transformer (Transformer), and the Transformer is for k
  • a feature matrix with a dimension of k ⁇ p is output.
  • Average pooling is performed on the feature matrix k ⁇ p to obtain object features with dimension k ⁇ 1.
  • r1 can be equal to 512
  • r2 can be equal to 8.
  • average pooling refers to expressing the average value of a row (or a column) as the feature of this row (or this column).
  • the above pooling process can also use maximum pooling, or minimum pooling, etc.
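The dimension bookkeeping above (p × r1, then p × r2, then p × 1 per domain, then k × p, then k × 1) can be traced with a short NumPy sketch. The two encoders below are placeholders only: a `tanh` projection stands in for the attention-based RNN and a mixing matrix stands in for the Transformer. The names `encode_unit_set`, `encode_across_domains`, `w_enc`, and `w_mix` are assumptions for illustration and do not come from the application.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_unit_set(original_features, w_enc):
    """Placeholder for one intra-domain object encoder.

    original_features: (p, r1) matrix, one row per business object.
    w_enc: (r1, r2) projection standing in for the attention-based RNN.
    Returns a length-p vector (the fourth encoding vector).
    """
    third = np.tanh(original_features @ w_enc)  # (p, r2) third encoding vectors
    return third.mean(axis=1)                   # average-pool each row -> (p,)

def encode_across_domains(fourth_vectors, w_mix):
    """Placeholder for the inter-domain object transformer.

    fourth_vectors: (k, p) matrix, one row per domain.
    w_mix: (k, k) mixing matrix standing in for attention across domains.
    Returns a length-k object feature.
    """
    re_encoded = w_mix @ fourth_vectors         # (k, p) feature matrix
    return re_encoded.mean(axis=1)              # average-pool each row -> (k,)

k, p, r1, r2 = 3, 5, 512, 8
encoders = [rng.normal(size=(r1, r2)) * 0.01 for _ in range(k)]
unit_sets = [rng.normal(size=(p, r1)) for _ in range(k)]

fourth = np.stack([encode_unit_set(x, w) for x, w in zip(unit_sets, encoders)])
object_feature = encode_across_domains(fourth, rng.normal(size=(k, k)))
print(fourth.shape, object_feature.shape)
```

Replacing `mean` with `max` or `min` in either function gives the maximum-pooling and minimum-pooling variants mentioned above.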
  • At this point the domain features of the multiple domains and the object features of the business object set have been determined, and the domain features and the object features have the same dimension. The domain-object cross-encoder is called to cross-encode the domain features and the object features into the general interest feature of the target user, where the general interest feature is also a vector.
  • In formula (4), $e_1$ and $e_2$ denote the two one-dimensional vectors input to the domain-object cross-encoder; $W_c$ denotes the weight matrix of the domain-object cross-encoder; the function $f(\cdot)$ denotes the folding operation on the two vectors, that is, the two one-dimensional vectors are folded into a two-dimensional matrix; and the function $\mathrm{flatten}(\cdot)$ denotes one-dimensional vectorization of a multi-dimensional input.
  • the calculation method shown in formula (4) can effectively improve the utilization of parameters when the features are sparse. In addition, this calculation method is also very effective for cross processing between different features.
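Formula (4) itself is not reproduced in this text, so the following is only a minimal sketch of the operations named in its symbol definitions: fold two one-dimensional vectors into a two-dimensional matrix, flatten the result, and apply the weight matrix. The composition order, the shape of `w_c`, and the omission of the convolution that a ConvE-style layer would apply to the folded matrix are all assumptions made for illustration.

```python
import numpy as np

def fold(e1, e2):
    """f(.): stack two equal-length 1-D vectors into a 2-D matrix of shape (2, d)."""
    return np.stack([e1, e2], axis=0)

def cross_encode(e1, e2, w_c):
    """Fold, flatten, then project with the weight matrix w_c.

    w_c has shape (2 * d, d_out). The convolution step of a ConvE-style
    layer is intentionally left out of this sketch.
    """
    folded = fold(e1, e2)        # (2, d)
    flat = folded.reshape(-1)    # flatten(.): (2 * d,)
    return flat @ w_c            # (d_out,)

d, d_out = 3, 3
domain_feature = np.array([1.0, 2.0, 3.0])
object_feature = np.array([0.5, 0.0, -1.0])
w_c = np.ones((2 * d, d_out))    # illustrative weights, not trained values
general_interest = cross_encode(domain_feature, object_feature, w_c)
print(general_interest)
```

With all-ones weights each output entry is simply the sum of the six folded values, which makes the data flow easy to check by hand.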
  • The cross-domain recommendation model cross-encodes the general interest feature, the target domain, and the target business objects into the target domain interest feature as follows:
  • The object encoder includes multiple intra-domain object encoders (Encoders) and an inter-domain object transformer (Transformer); the number of intra-domain object encoders is the same as the number of domains, and each intra-domain object encoder (Encoder) is an RNN structure with an attention mechanism.
  • The server obtains the original domain feature of the target domain (to distinguish it from the original domain features mentioned above, it is referred to here as the target domain feature) and obtains the original object features of the target business objects. Through the intra-domain object encoder of the target domain, the server encodes the original object features of the target business objects into the target object feature, which may correspond to the aforementioned fourth encoding vector with dimension p × 1, where p represents the number of target business objects.
  • $d_t$ represents the target domain feature
  • $f_t$ represents the target object feature
  • $v_t$ represents the target domain object feature
  • $W_d$ represents the weight matrix of the intra-domain object encoder in the object encoder of the target domain.
  • the domain object cross-encoder is called to cross-encode the general interest feature and the target domain object feature into the target domain interest feature.
  • the calculation formula of the target domain interest feature can be expressed by the following formula (6):
  • $u_g$ represents the general interest feature
  • $v_t$ represents the target domain object feature
  • $u_t$ represents the target domain interest feature of the target user in the target domain.
  • the server obtains the interest features of the target domain by calling the cross-domain recommendation model.
  • FIG. 4a is a schematic diagram of a cross-domain recommendation model provided by an embodiment of the present application.
  • the original domain feature sequences of multiple domains are input into the domain encoder (the domain encoder includes the intra-domain domain encoder and the inter-domain domain converter) to obtain domain features.
  • the domain features and object features are input into the domain-object cross-encoder to obtain the general interest features of the target users.
  • The original domain feature of the target domain (called the target domain feature) is obtained, the original object feature sequence in the target domain is encoded into the target object feature by the intra-domain object encoder of the target domain, and the target domain feature and the target object feature are spliced into the target domain object feature.
  • The general interest feature and the target domain object feature are input into the domain-object cross-encoder to obtain the target domain interest feature of the target user in the target domain.
  • The Transformer is used to extract intra-domain features, which addresses the problem of sparse features.
  • ConvE is used for cross-domain feature crossing, which improves the effect of the crossing, makes the extracted cross-domain features more distinguishable, and strengthens the ability to infer user behavior.
  • The knowledge information of the target domain is used as a domain bias to transform the user's interest space, which enhances the expressiveness of the user's target domain interest feature and improves the accuracy of subsequent business object recommendation.
  • Step S103: Acquire the to-be-recommended business object features of a plurality of to-be-recommended business objects in the target domain.
  • the server obtains the original object features of a plurality of business objects to be recommended in the target domain (in order to distinguish from the original object features in the foregoing, the original object features of the business objects to be recommended are referred to as features of the business objects to be recommended here) .
  • the business objects to be recommended in the target domain are determined after training the cross-domain recommendation model, and the features of the business objects to be recommended are also determined during the training of the cross-domain recommendation model.
  • the business objects to be recommended are all entity business objects.
  • There may be multiple target business objects associated with the target user in the target domain. The multiple target business objects may all be empty, that is, all non-entity business objects (called non-entity target business objects); or the multiple target business objects may include both entity target business objects and non-entity target business objects, where the entity target business objects belong to the plurality of to-be-recommended business objects in the target domain.
  • Step S104 Obtain the target recommended business object feature matching the target domain interest feature from the plurality of to-be-recommended business object features, and output the target recommended business object corresponding to the target recommended business object feature.
  • The server calculates the feature distance between the target domain interest feature and each to-be-recommended business object feature in the target domain; the Euclidean distance can be used as the feature distance.
  • The server sorts the plurality of feature distances and takes the n to-be-recommended business object features with the smallest feature distances as the target recommended business object features.
  • the to-be-recommended business object corresponding to the feature of the target recommended business object is taken as the target recommended business object (such as text 1 and text 3 in the corresponding embodiments of the above-mentioned Figures 2a-2d), and the target recommended business object is output.
  • the target recommended business object is the predicted business object that the target user may like in the target field.
  • For example, suppose there are currently three business objects to be recommended: business object A, business object B, and business object C. The feature distance between the target domain interest feature and the to-be-recommended business object feature is 0.3 for A, 0.5 for B, and 0.1 for C.
  • The to-be-recommended business objects corresponding to the two smallest feature distances can be used as the target recommended business objects; that is, business object A and business object C are both target recommended business objects.
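The matching step of S104 can be sketched as a Euclidean nearest-neighbor lookup. The function name `recommend_top_n` and the two-dimensional feature vectors are hypothetical; the vectors are chosen only so that the distances are 0.3, 0.5, and 0.1, as in the three-object example.

```python
import numpy as np

def recommend_top_n(interest_feature, candidate_features, n):
    """Return the n candidates with the smallest Euclidean distance."""
    names = list(candidate_features)
    matrix = np.array([candidate_features[name] for name in names])
    distances = np.linalg.norm(matrix - interest_feature, axis=1)
    order = np.argsort(distances, kind="stable")[:n]
    return [(names[i], float(distances[i])) for i in order]

interest = np.array([0.0, 0.0])
candidates = {
    "A": np.array([0.3, 0.0]),  # distance 0.3
    "B": np.array([0.0, 0.5]),  # distance 0.5
    "C": np.array([0.1, 0.0]),  # distance 0.1
}
top2 = recommend_top_n(interest, candidates, n=2)
print([name for name, _ in top2])  # ['C', 'A']
```

A production system would more likely query an approximate nearest-neighbor index than scan every candidate, but the selection rule is the same.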
  • the target recommended business object is the target recommended electronic book
  • the target recommended electronic book may be novels, news, articles on official accounts, and the like.
  • the server may deliver the target recommended business object to the terminal device where the target user is located, so as to expose the target recommended business object to the target user.
  • When the target user browses the target recommended business object, a user who is interested in it can click the target recommended business object to view its content in detail, and a user who is not interested can skip it.
  • the server can combine the target user, the target domain and the exposed target recommended business objects into a new knowledge graph triplet of the target user, and the knowledge graph triplet can be represented as (user, domain, business object). Based on the new knowledge graph triplet, the cross-domain recommendation model can be updated.
  • Whether the target user clicks a target recommended business object can be used as the label of that object. For example, if the target user clicks target recommended business object A, the label of A is 1; if the target user does not click A, the label of A is 0.
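The feedback record described above, a (user, domain, business object) triplet paired with a 1/0 click label, could be represented as follows. The class name `Triplet`, the helper `build_feedback`, and the identifier strings are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    """One knowledge graph triplet: (user, domain, business object)."""
    user: str
    domain: str
    business_object: str

def build_feedback(user, domain, exposed, clicked):
    """Pair each exposed object's triplet with a click label (1 or 0)."""
    return [
        (Triplet(user, domain, obj), 1 if obj in clicked else 0)
        for obj in exposed
    ]

# Objects A and C were exposed; only A was clicked.
feedback = build_feedback("user_1", "text", ["A", "C"], {"A"})
for triplet, label in feedback:
    print(triplet, label)
```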
  • This application proposes a definition of a knowledge graph triplet that can be used to model data structures across multiple product domains, unifies the data form, and provides a data basis for model training. A Transformer is used to extract intra-domain features, which addresses the problem of sparse features, and ConvE is used for cross-domain feature crossing, which improves the effect of the crossing and strengthens the ability to infer user behavior.
  • In addition, the knowledge information of the target domain is used as a domain bias to transform the user's interest space, which enhances the expressiveness of the user's target domain interest feature. Finally, the solution of this application can be used in user cold-start scenarios for data recommendation, improving the accuracy of cold-start recommendation.
  • FIG. 4b is an overall framework diagram of a data recommendation provided by an embodiment of the present application.
  • The data recommendation involves two modules: online service and offline training.
  • Offline training trains the cross-domain recommendation model and determines the original object features of the business objects to be recommended in multiple domains.
  • In the online service, the trained cross-domain recommendation model predicts the user's target domain interest feature in the target domain, and the nearest neighbor service is called to determine the feature distances between the target domain interest feature and the original object features of the to-be-recommended business objects in the target domain (called the to-be-recommended business object features).
  • The to-be-recommended business objects corresponding to the k smallest feature distances are taken as the target recommended business objects.
  • the sorting service is called to sort the k target recommended business objects, and the sorted k target recommended business objects are delivered to the terminal device where the user is located, so as to be accurately recommended to the user. Subsequently, the user click log of the user for the target recommended business object can be obtained, the multi-domain user behavior processing module generates a knowledge graph triplet based on the user click log, and updates the cross-domain recommendation model according to the knowledge graph triplet.
  • Fig. 5a-Fig. 5b are schematic diagrams of a target recommendation business object provided by an embodiment of the present application.
  • In one case the target domain is the text domain: the server determines the text to be recommended in the above manner and sends the text to the terminal device where the target user is located, and the terminal device displays the text sent by the server.
  • the text cover of the text can be displayed first (the text cover includes the text title, text illustrations, etc.), and then the detailed text content is displayed after the user clicks on the text cover.
  • In another case the target domains are the text domain and the video domain, that is, the current data recommendation request is a request to recommend text and video to the target user.
  • The server sends the determined text and video to the terminal device where the target user is located, and the terminal device displays the text and video delivered by the server.
  • FIG. 6 is a schematic flowchart of a data recommendation method provided by an embodiment of the present application.
  • the following embodiment mainly describes the training process of a cross-domain recommendation model.
  • the data recommendation method may include the following steps:
  • Step S201 acquiring a sample unit business object set that has an associated relationship with a sample user in each field.
  • this embodiment takes one sample user to perform one iteration as an example for description:
  • The server obtains the sample knowledge graph triplet set of the sample user. The set includes multiple sample knowledge graph triplets, and each sample knowledge graph triplet includes the user identifier of the sample user, the domain identifier of a domain, and the object identifier of a sample business object exposed to the sample user in that domain. In other words, each sample knowledge graph triplet can be represented as (user, domain, business object).
  • sample business objects that are associated with the sample users refer specifically to the sample business objects that have been exposed to the users, and each unit business object set includes the exposed unclicked sample business objects and the exposed clicked sample business objects. , and the proportion of exposed unclicked sample business objects and exposed clicked sample business objects in each unit business object set is the same.
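Deriving the per-domain sample unit business object sets from a triplet set amounts to grouping one user's triplets by domain. The sketch below assumes plain tuples and hypothetical identifiers such as `"u1"` and `"article_1"`; it does not enforce the equal clicked/unclicked proportion mentioned above.

```python
from collections import defaultdict

def unit_sets_by_domain(triplets, user):
    """Group one user's (user, domain, business_object) triplets by domain."""
    grouped = defaultdict(list)
    for triplet_user, domain, business_object in triplets:
        if triplet_user == user:
            grouped[domain].append(business_object)
    return dict(grouped)

sample_triplets = [
    ("u1", "text", "article_1"),
    ("u1", "video", "clip_7"),
    ("u1", "text", "article_2"),
    ("u2", "text", "article_1"),
]
unit_sets = unit_sets_by_domain(sample_triplets, "u1")
print(unit_sets)  # {'text': ['article_1', 'article_2'], 'video': ['clip_7']}
```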
  • Step S202 generating random field features of each field, and generating random object features of each sample business object in a plurality of sample unit business object sets.
  • The server generates a random domain feature for each domain and a random object feature for each sample business object in the multiple sample unit business object sets. Both the random domain features and the random object features participate in model training: after training, the random domain features become the original domain features mentioned above, and the random object features become the original object features mentioned above.
  • Step S203 calling the sample cross-domain recommendation model to perform cross-domain cross-coding processing on all random domain features and all random object features, to obtain the predicted interest features of the sample user in the sample domain.
  • The server invokes the sample cross-domain recommendation model to perform cross-coding processing on all random domain features and all random object features, and obtains the sample general interest feature of the sample user. The process of determining the sample general interest feature is basically the same as the process of determining the target user's general interest feature described above, and will not be repeated here.
  • the server determines a sample domain from a plurality of domains, and the sample domain is any one of the plurality of domains. Obtain the random domain feature of the sample domain, and obtain the random object feature of each sample business object in the sample unit business object set under the sample domain.
  • The server invokes the sample cross-domain recommendation model to perform cross-coding processing on the sample general interest feature, the random domain feature of the sample domain, and the random object feature of each sample business object in the sample unit business object set under the sample domain, and obtains the predicted interest feature of the sample user in the sample domain.
  • the process of determining the predicted interest features of the sample users in the sample domain is basically the same as the above-mentioned process of determining the target domain interest features of the target users in the target domain, and will not be repeated here.
  • Step S204 Determine the feature similarity between the predicted interest feature and the random object feature in the sample field, and obtain the behavior label of the random object feature in the sample field.
  • the server calculates the feature similarity between the predicted interest feature and the random object feature of each sample business object in the sample unit business object set in the sample domain, and the feature similarity is between 0 and 1.
  • the server obtains the behavior label of the random object feature of each sample business object in the sample unit business object set in the sample domain.
  • The value of the behavior label can be 1 or 0.
  • A behavior label with a value of 1 indicates that the corresponding sample business object has not only been exposed to the sample user but has also been clicked by the sample user to view its detailed content; a behavior label with a value of 0 indicates that the corresponding sample business object has only been exposed to the sample user and has not been clicked.
  • Step S205 determine a prediction error according to the behavior label and the feature similarity, train the sample cross-domain recommendation model based on the prediction error, and obtain the cross-domain recommendation model.
  • According to the feature similarity and the behavior label of the random object feature of each sample business object in the sample unit business object set under the sample domain, the server determines the error of each of those random object features and adds all the errors together to obtain the prediction error.
  • The prediction error is back-propagated to the sample cross-domain recommendation model, the random domain features of each domain, and the random object features of each sample business object in the multiple sample unit business object sets, so as to adjust the model parameters of the sample cross-domain recommendation model, the random domain features of each domain, and the random object features of each sample business object in the multiple sample unit business object sets.
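A minimal sketch of the error computation in steps S204 and S205, assuming the feature similarity is a softmax over dot-product scores and each per-object error is a cross-entropy term. The text only states that the similarity lies between 0 and 1 and that the per-object errors are added, so the scoring function and the exact loss form here are assumptions; back-propagation itself is not shown.

```python
import numpy as np

def softmax(scores):
    shifted = scores - scores.max()   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def prediction_error(predicted_interest, object_features, labels):
    """Sum per-object errors into a single prediction error.

    predicted_interest: (d,) predicted interest feature.
    object_features: (m, d) random object features of the sample domain.
    labels: (m,) behavior labels, 1 = exposed and clicked, 0 = exposed only.
    """
    scores = object_features @ predicted_interest
    similarity = softmax(scores)                     # each value lies in (0, 1)
    per_object_error = -labels * np.log(similarity)  # zero for unclicked objects
    return similarity, float(per_object_error.sum())

interest = np.array([1.0, 0.0])
objects = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
labels = np.array([1.0, 0.0, 0.0])
similarity, error = prediction_error(interest, objects, labels)
print(similarity.round(3), round(error, 3))
```

In a real training loop the features and model parameters would be held in an automatic-differentiation framework so that this scalar error can be back-propagated to all three groups of trainable values.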
  • the server can use the same method to take other fields other than the sample field as the sample field in turn, and then update the sample cross-domain recommendation model.
  • the multiple sample unit business object sets of the remaining sample users are used to participate in the training of the sample cross-domain recommendation model, and are constantly iteratively adjusted. It should be noted that multiple sample users can participate in the model training, and the sample business objects included in the sample unit business object set of each sample user are the same.
  • When the adjusted sample cross-domain recommendation model satisfies the model convergence condition, the adjusted sample cross-domain recommendation model is used as the cross-domain recommendation model, the adjusted random domain feature of each domain is used as the original domain feature, and the adjusted random object feature of each sample business object in the multiple sample unit business object sets is used as the original object feature.
  • The adjusted sample cross-domain recommendation model may be considered to satisfy the model convergence condition if, for example, the difference between the model parameters before adjustment and the model parameters after adjustment is less than a preset difference threshold, or if the prediction accuracy of the adjusted sample cross-domain recommendation model reaches a preset accuracy threshold.
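The parameter-difference and accuracy conditions can be checked with a small helper. Measuring the parameter difference as the largest absolute change, as well as the threshold values and the name `has_converged`, are assumptions for illustration.

```python
import numpy as np

def has_converged(params_before, params_after, accuracy,
                  diff_threshold=1e-4, accuracy_threshold=0.9):
    """Return True if either convergence condition is met.

    params_before / params_after: lists of arrays with matching shapes.
    accuracy: prediction accuracy of the adjusted model, in [0, 1].
    """
    max_change = max(
        float(np.max(np.abs(after - before)))
        for before, after in zip(params_before, params_after)
    )
    return max_change < diff_threshold or accuracy >= accuracy_threshold

before = [np.zeros(2)]
after = [np.array([1e-5, 0.0])]
print(has_converged(before, after, accuracy=0.5))  # True: parameters barely moved
```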
  • Step S206: in response to a data recommendation request for the target user in the target domain, obtain a set of business objects that are associated with the target user in multiple domains; the business object set includes, for each domain, the business objects associated with the target user, and the multiple domains include the target domain.
  • Step S207 invoking a cross-domain recommendation model to encode the multiple domains and the business object set to obtain the target domain interest characteristics of the target user in the target domain.
  • Step S208 acquiring the features of the business objects to be recommended of multiple business objects to be recommended under the target domain, and obtaining the features of the target recommended business objects matching the interest features of the target domain from the features of the plurality of business objects to be recommended, The target recommended business object corresponding to the feature of the target recommended business object is output.
  • step S206-step S208 For the specific process of step S206-step S208, reference may be made to step S101-step S104 in the embodiment corresponding to FIG. 3 above.
  • FIG. 7 is a schematic diagram of a sample cross-domain recommendation model provided by an embodiment of the present application.
  • Random domain features of multiple domains are acquired, and the multiple random domain features are combined into a random domain feature sequence.
  • The random domain feature sequence is input into the Encoder (also called the sample intra-domain domain encoder) and the Transformer (also called the sample inter-domain domain transformer), which encode the random domain feature sequence into the sample domain feature.
  • Likewise, the Encoder (also called the sample intra-domain object encoder) and the Transformer (also called the sample inter-domain object transformer) encode the multiple random object feature sequences into the sample object feature.
  • the sample domain features and sample object features are input into the ConvE layer (also known as the sample domain object cross-encoder), and the ConvE layer cross-encodes the sample domain features and sample object features into the sample general interest features.
  • the calculation formula of ConvE layer is the aforementioned formula (4).
  • A domain is randomly selected from the multiple domains as the sample domain. The random domain feature of the sample domain is obtained, the sample unit business object set in the sample domain is obtained, and the Encoder encodes the random object feature sequence of the sample unit business object set in the sample domain.
  • The sample domain object feature and the sample general interest feature are input into the ConvE layer, which cross-encodes them into the predicted interest feature. The feature similarity between the predicted interest feature and each random object feature in the sample domain is then calculated.
  • The Softmax loss function is used to measure the prediction error between the feature similarities and the behavior labels of the multiple random object features of the sample unit business object set in the sample domain, and the sample cross-domain recommendation model is adjusted according to the prediction error.
  • This application proposes a definition of a knowledge graph triplet that can be used to model data structures across multiple product domains, unifies the data form, and provides a data basis for model training. A Transformer is used to extract intra-domain features, which addresses the problem of sparse features, and ConvE is used for cross-domain feature crossing, which improves the effect of the crossing and strengthens the ability to infer user behavior.
  • The knowledge information of the target domain is used as a domain bias to transform the user's interest space, which enhances the expressiveness of the user's target domain interest feature.
  • In addition, user behaviors in different domains complement one another when determining the user's interest feature in the target domain. Recommending business objects in the target domain based on that interest feature is more targeted, which can further improve the accuracy of data recommendation.
  • FIG. 8 is a schematic structural diagram of a data recommendation apparatus provided by an embodiment of the present application.
  • the data recommendation apparatus 1 may be applied to the servers in the embodiments corresponding to the above-mentioned FIGS. 3 to 7 .
  • the data recommendation apparatus may be a computer program (including program code) running in a computer device, for example, the data recommendation apparatus is an application software; the apparatus may be used to execute corresponding steps in the methods provided by the embodiments of the present application.
  • the data recommendation apparatus 1 may include: an acquisition module 11 , an encoding module 12 , a determination module 13 and an output module 14 .
  • the obtaining module 11 is configured to, in response to a data recommendation request for a target user in the target domain, obtain a set of business objects that are associated with the target user in multiple domains; the set of business objects is included in each domain a business object associated with the target user, the multiple domains include the target domain;
  • an encoding module 12 configured to perform encoding processing on the multiple domains and the business object set to obtain the target domain interest characteristics of the target user in the target domain;
  • the obtaining module 11 is further configured to obtain the features of the to-be-recommended business objects of a plurality of to-be-recommended business objects under the target domain;
  • a determination module 13 configured to obtain the target recommended business object feature matching the target domain interest feature from a plurality of to-be-recommended business object features
  • the output module 14 is configured to output the target recommended business object corresponding to the feature of the target recommended business object.
  • the determining module 13 is specifically configured to:
  • the feature of the target recommended business object is determined from the features of the plurality of business objects to be recommended.
  • the plurality of domains include at least two of the following: a text domain, a video domain, a game domain, and an embedded application domain;
  • the target recommendation business object includes a target recommendation electronic book.
  • the obtaining module 11 when used to obtain a set of business objects that are associated with the target user in multiple fields, it is specifically used to:
  • the specific function realization mode of the acquisition module 11, the encoding module 12, the determination module 13 and the output module 14 can refer to the steps S101-S104 in the corresponding embodiment of Fig. 3 above, and will not be repeated here.
  • the encoding module 12 may include: a first encoding unit 121 , an obtaining unit 122 and a second encoding unit 123 .
  • a first encoding unit 121 configured to invoke a cross-domain recommendation model to perform cross-encoding processing on the multiple domains and the business object set to obtain the general interest feature of the target user;
  • an obtaining unit 122 configured to obtain a target business object that has an associated relationship with the target user under the target domain
  • the second encoding unit 123 is configured to invoke a cross-domain recommendation model to perform cross-encoding processing on the general interest feature, the target domain, and the target business object, to obtain the target domain interest feature.
  • the cross-domain recommendation model includes a domain encoder, an object encoder, and a domain-object cross encoder;
  • the first encoding unit 121 is specifically configured to:
  • the domain-object cross-encoder is invoked to cross-encode the domain feature and the object feature into a general interest feature of the target user.
  • the domain encoder includes an intra-domain domain encoder and an inter-domain domain converter
  • when used to call the domain encoder to encode all original domain features into domain features, the first encoding unit 121 is specifically configured to:
  • the inter-domain domain converter is called to perform multi-attention encoding on the second encoding vector to obtain the domain feature.
  • the object encoder comprises an intra-domain object encoder
  • the second encoding unit 123 is specifically used for:
  • the number of the target business objects is multiple, the multiple target business objects are all non-entity target business objects, or the multiple target business objects include entity target business objects and non-entity target business objects, the entity The target business object belongs to the plurality of business objects to be recommended.
  • the specific function implementation manner of the first encoding unit 121 , the obtaining unit 122 and the second encoding unit 123 may refer to step S102 in the above-mentioned embodiment corresponding to FIG. 3 , which will not be repeated here.
  • the data recommendation apparatus 1 may include an acquisition module 11 , an encoding module 12 , a determination module 13 and an output module 14 ; and may also include a combination module 15 , a generation module 16 , a calling module 17 and a training module 18 .
  • the combining module 15 is configured to combine the target user, the target domain and the target recommended business object into a knowledge graph triplet, and update the cross-domain recommendation model based on the knowledge graph triplet.
  • the generating module 16 is used to obtain a sample unit business object set that has an associated relationship with a sample user under each domain, generate random domain characteristics of each domain, and generate a plurality of sample unit business object sets for each sample business object. random object features;
  • the calling module 17 is used to call the sample cross-domain recommendation model to perform cross-domain cross-coding processing on all random domain features and all random object features, so as to obtain the predicted interest features of the sample user in the sample domain;
  • the sample domain is any one of the multiple domains;
  • the generating module 16 is further configured to determine the feature similarity between the predicted interest feature and the random object features of the sample domain, obtain the behavior labels of the random object features of the sample domain, and determine a prediction error according to the behavior labels and the feature similarity;
  • the training module 18 is configured to train the sample cross-domain recommendation model based on the prediction error to obtain the cross-domain recommendation model.
  • when the generating module 16 is used to obtain a sample unit business object set that has an associated relationship with a sample user in each domain, it is specifically used to:
  • obtain the sample knowledge graph triplet set of the sample user, which includes a plurality of sample knowledge graph triplets
  • any sample knowledge graph triplet includes the user ID of the sample user, the domain identifier of any one of the multiple domains, and the object identifier of a sample business object that has been exposed to the sample user in that domain;
  • a set of sample unit business objects that have an associated relationship with the sample user under each domain is determined from the sample knowledge graph triplet set.
  • the training module 18 is specifically used to:
  • when the adjusted sample cross-domain recommendation model satisfies the model convergence condition, the adjusted sample cross-domain recommendation model is used as the cross-domain recommendation model, the adjusted random domain feature of each domain is used as the original domain feature, and the adjusted random object feature of each sample business object in the multiple sample unit business object sets is used as the original object feature.
  • calling module 17 is specifically used to:
  • the specific function implementation of the combination module 15 can refer to step S104 in the above-mentioned embodiment corresponding to FIG. 3
  • the specific functional implementation of the generation module 16, the calling module 17 and the training module 18 can refer to steps S201 to S205 in the above-mentioned embodiment corresponding to FIG. 6.
  • FIG. 9 is a schematic structural diagram of a computer device provided by an embodiment of the present invention.
  • the terminal device in the embodiment corresponding to FIGS. 3-7 may be a computer device 1000 .
  • the computer device 1000 may include a user interface 1002 , a processor 1004 , an encoder 1006 and a memory 1008 .
  • Signal receiver 1016 is used to receive or transmit data via cellular interface 1010 , WIFI interface 1012 , . . . , or NFC interface 1014 .
  • the encoder 1006 encodes the received data into a computer-processable data format.
  • a computer program is stored in the memory 1008, and the processor 1004 is configured to perform the steps in any one of the above method embodiments through the computer program.
  • the memory 1008 may include volatile memory (eg, dynamic random access memory DRAM), and may also include non-volatile memory (eg, one-time programmable read-only memory OTPROM). In some examples, memory 1008 may further include memory located remotely from processor 1004, which may be connected to computer device 1000 through a network.
  • User interface 1002 may include: keyboard 1018 and display 1020 .
  • the processor 1004 can be used to call the computer program stored in the memory 1008 to realize:
  • in response to a data recommendation request for a target user in a target domain, obtain a set of business objects that have an associated relationship with the target user in multiple domains;
  • the business object set includes the business objects that have an associated relationship with the target user in each domain, and the multiple domains include the target domain;
  • a target recommended business object feature matching the interest feature of the target domain is obtained from the features of a plurality of business objects to be recommended, and a target recommended business object corresponding to the target recommended business object feature is output.
  • the computer device 1000 described in the embodiment of the present invention can execute the description of the data recommendation method in the foregoing embodiments corresponding to FIG. 3 to FIG. 7, and can also execute the description of the data recommendation apparatus 1 in the foregoing embodiment corresponding to FIG. 8, which will not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
  • an embodiment of the present invention further provides a computer storage medium that stores a computer program executed by the aforementioned data recommendation apparatus 1, and the computer program includes program instructions;
  • when the processor executes the above program instructions, it can execute the methods in the foregoing embodiments corresponding to FIG. 3 to FIG. 7, which will not be repeated here.
  • program instructions may be deployed on one computer device, executed on multiple computer devices located at one site, or executed on multiple computer devices distributed across multiple sites and interconnected by a communication network; multiple computer devices distributed across multiple sites and interconnected by a communication network can form a blockchain system.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device can execute the methods in the embodiments corresponding to FIG. 3 to FIG. 7, which will not be repeated here.
  • the above-mentioned storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

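Embodiments of this application disclose a data recommendation method and apparatus, a computer device, and a storage medium. The data recommendation method can be applied to precise recommendation and includes: in response to a data recommendation request for a target user in a target domain, obtaining a business object set that has an association relationship with the target user in multiple domains; performing cross-domain cross-encoding on the multiple domains and the business object set to obtain a target-domain interest feature of the target user in the target domain; obtaining to-be-recommended business object features of multiple business objects to be recommended in the target domain; and obtaining, from the multiple to-be-recommended business object features, a target recommended business object feature that matches the target-domain interest feature, and outputting the target recommended business object corresponding to the target recommended business object feature.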

Description

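Data recommendation method and apparatus, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 2020108851407, entitled "Data recommendation method and apparatus, computer device, and storage medium" and filed with the China Patent Office on August 28, 2020, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of computer technologies, and in particular to a data recommendation method and apparatus, a computer device, and a storage medium.
Background
In recent years, the development and spread of Internet technology have brought users large amounts of information and met their demand for it. However, as information grows exponentially, users find it difficult to pick out what they really want from massive data. Recommendation systems emerged in response; they are used for precise recommendation, that is, providing users with precisely targeted content and services.
At present, a recommendation system selects business content to be recommended according to recent hot events, and then pushes the selected content to all users.
Summary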
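In one aspect, an embodiment of this application provides a data recommendation method, including:
in response to a data recommendation request for a target user in a target domain, obtaining a business object set that has an association relationship with the target user in multiple domains, where the business object set includes the business objects that have an association relationship with the target user in each domain, and the multiple domains include the target domain;
encoding the multiple domains and the business object set to obtain a target-domain interest feature of the target user in the target domain;
obtaining to-be-recommended business object features of multiple business objects to be recommended in the target domain;
obtaining, from the multiple to-be-recommended business object features, a target recommended business object feature that matches the target-domain interest feature, and outputting the target recommended business object corresponding to the target recommended business object feature.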
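In one aspect, an embodiment of this application provides a data recommendation apparatus, including:
an acquisition module, configured to, in response to a data recommendation request for a target user in a target domain, obtain a business object set that has an association relationship with the target user in multiple domains, where the business object set includes the business objects that have an association relationship with the target user in each domain, and the multiple domains include the target domain;
an encoding module, configured to encode the multiple domains and the business object set to obtain a target-domain interest feature of the target user in the target domain;
the acquisition module being further configured to obtain to-be-recommended business object features of multiple business objects to be recommended in the target domain;
a determination module, configured to obtain, from the multiple to-be-recommended business object features, a target recommended business object feature that matches the target-domain interest feature;
an output module, configured to output the target recommended business object corresponding to the target recommended business object feature.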
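In one aspect, an embodiment of this application provides a computer device including a memory and a processor, where the memory stores a computer program that, when executed by the processor, causes the processor to perform the methods in the foregoing embodiments.
In one aspect, an embodiment of this application provides a non-volatile computer-readable storage medium that stores a computer program, where the computer program includes program instructions that, when executed by a processor, perform the methods in the foregoing embodiments.
In one aspect, an embodiment of this application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium, where the computer instructions, when executed by a processor of a computer device, perform the methods in the foregoing embodiments.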
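Brief Description of the Drawings
FIG. 1 is a system architecture diagram of data recommendation according to an embodiment of this application;
FIG. 2a to FIG. 2d are schematic diagrams of a data recommendation scenario according to an embodiment of this application;
FIG. 3 is a schematic flowchart of data recommendation according to an embodiment of this application;
FIG. 4a is a schematic diagram of a cross-domain recommendation model according to an embodiment of this application;
FIG. 4b is an overall framework diagram of data recommendation according to an embodiment of this application;
FIG. 5a and FIG. 5b are schematic diagrams of target recommended business objects according to an embodiment of this application;
FIG. 6 is a schematic flowchart of data recommendation according to an embodiment of this application;
FIG. 7 is a schematic diagram of a sample cross-domain recommendation model according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a data recommendation apparatus according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.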
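Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
From the perspective of underlying technology, the data recommendation method of this application involves cloud computing, a branch of cloud technology; from the perspective of application, it involves artificial intelligence cloud services, also a branch of cloud technology.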
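In this application, performing cross-domain cross-encoding on multiple domains and a business object set involves large-scale computation and requires substantial computing power and storage space. A terminal device can therefore obtain sufficient computing power and storage space through cloud computing, and then extract the target-domain interest feature of the target user in the target domain as well as the to-be-recommended business object features of the multiple business objects to be recommended.
The data recommendation method of this application can be packaged as an artificial intelligence service that exposes a single interface. When a business scenario needs the business object recommendation function described here, calling that interface completes the recommendation.
The solution provided by the embodiments of this application belongs to machine learning/deep learning within the field of artificial intelligence.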
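This application mainly involves calling a recommendation model and using a user's behavior data in multiple domains to determine the user's interest feature in a specific target domain. That interest feature can then be used to determine the business objects to recommend to the user in the target domain.
This application can be applied to the following scenario: when products in a target domain need to be recommended to a target user, the target user's behavior data in multiple domains is obtained, the target user's interest feature in the target domain is determined from that data, the interest feature is matched against the product features of multiple products to be recommended in the target domain, and the products whose features match are output. The matched products can subsequently be pushed to the user to achieve precision marketing and improve recommendation accuracy.
This application mines a user's personalized preferences from user behavior and determines the recommended business objects accordingly. Different user behavior leads to different recommended business objects, which makes the recommendation personalized. Compared with choosing recommended business objects according to hot events, content chosen from the user's actual behavior is closer to what the user really wants, which can improve the accuracy of content recommendation. In addition, user behavior in different domains complements each other when determining the user's interest feature in the target domain; recommending business objects in the target domain based on that interest feature is more targeted and can further improve the accuracy of data recommendation.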
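Refer to FIG. 1, which is a system architecture diagram of data recommendation according to an embodiment of this application. This application involves a server 10d and a terminal device cluster, and the terminal device cluster may include a terminal device 10a, a terminal device 10b, ..., and a terminal device 10c.
Taking the terminal device 10a as an example: when the terminal device 10a receives a data recommendation request for a user in a target domain, it sends the request to the server 10d. The server 10d obtains the business object set that has an association relationship with the target user in multiple domains and performs cross-domain cross-encoding on the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain. It then obtains the to-be-recommended business object features of multiple business objects to be recommended in the target domain, determines from them the target recommended business object feature that matches the target-domain interest feature, and outputs the corresponding target recommended business object. The server 10d may send the target recommended business object to the terminal device 10a, which displays it on a page so as to recommend business objects precisely to the target user.
Determining the target-domain interest feature of the target user and determining the target recommended business object may also be performed by the terminal device 10a.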
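The server 10d shown in FIG. 1 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
The terminal device 10a, terminal device 10b, terminal device 10c, and so on shown in FIG. 1 may be smart devices such as mobile phones, tablet computers, notebook computers, palmtop computers, mobile internet devices (MID), and wearable devices. The terminal device cluster and the server 10d may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.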
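The following describes in detail, as an example, how the server 10d determines the target recommended texts for a target user:
Refer to FIG. 2a to FIG. 2d, which are schematic diagrams of a data recommendation scenario according to an embodiment of this application. As shown in FIG. 2a, when the server 10d receives a data recommendation request for the target user in the text domain, the server 10d obtains the business objects that have an association relationship with the target user in multiple domains, where the multiple domains include the text domain. Assume the multiple domains are the text domain, the video domain, and the shopping domain, and that a business object having an association relationship with the target user is one that has been exposed to, or clicked by, the target user. The server 10d obtains the text set 20a associated with the target user in the text domain; as shown in FIG. 2a, the text set 20a is empty, which means the target user has not yet viewed any text. The server 10d obtains the video set 20b associated with the target user in the video domain; it includes video 1, video 2, and video 3, which means the target user has viewed or clicked these videos. The server 10d obtains the product set 20c associated with the target user in the shopping domain; it includes product 1, product 2, and product 3, which means the target user has viewed, clicked, or purchased these products.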
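As shown in FIG. 2b, the server 10d obtains the text domain feature corresponding to the text domain, the video domain feature corresponding to the video domain, and the shopping domain feature corresponding to the shopping domain, and inputs these 3 domain features into the first encoder (Encoder). The first encoder encodes the 3 domain features into a domain feature vector, which is then input into the second encoder (Transformer); the second encoder encodes the domain feature vector into the feature vector 20d.
The server 10d obtains the text feature corresponding to each empty text in the text set; the text feature of an empty text may be an all-0 vector or an all-1 vector. The 3 text features are input into the first encoder, which encodes them into a text feature vector.
The server 10d obtains the video features corresponding to video 1, video 2, and video 3. The 3 video features are input into the first encoder, which encodes them into a video feature vector.
The server 10d obtains the product features corresponding to product 1, product 2, and product 3. The 3 product features are input into the first encoder, which encodes them into a product feature vector.
The server 10d inputs the text feature vector, the video feature vector, and the product feature vector into the second encoder (Transformer), which encodes them into the feature vector 20e.
The server 10d then cross-convolves the feature vector 20d and the feature vector 20e into the feature vector 20f, which represents the general interest feature of the target user.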
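Because the target domain is the text domain, knowledge of the target domain also has to be introduced as a domain bias that transforms the interest space, so that the text-domain interest vector of the target user can be learned. Specifically, the server 10d concatenates the text feature vector output by the first encoder with the text domain feature corresponding to the text domain to form the feature vector 20g, and cross-convolves the feature vector 20f and the feature vector 20g into the feature vector 20h, which represents the text-domain interest feature of the target user.
At this point, the server 10d has determined the text-domain interest feature 20h of the target user, which can be used to predict the target user's preferences in the text domain.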
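Once the text-domain interest feature of the target user has been determined, prediction can be performed. As shown in FIG. 2c, the server 10d obtains the text features of 5 texts to be recommended; these 5 text features were determined when the first encoder (Encoder) and the second encoder (Transformer) were trained. The server 10d computes the feature distance between the text-domain interest feature 20h and each text feature, and takes the texts corresponding to the 2 text features with the smallest feature distance as the recommended texts. Assume that, among the 5 text features, those of text 1 and text 3 are closest to the text-domain interest feature 20h.
The server delivers text 1 and text 3 to the terminal device. The terminal device combines the abstracts, cover images, and titles of text 1 and text 3 with a video cover (delivered by other servers in response to other recommendation requests) into the recommendation page 20j. On the recommendation page 20j the user can browse part of the information of text 1 and text 3, and can click a text or video to read the text content in detail or watch the video content.
For the specific process of obtaining the business object set associated with the target user in multiple domains (such as the text, video, and shopping domains above, with the text set 20a, the video set 20b, and the product set 20c), obtaining the target-domain interest feature (such as the text-domain interest feature 20h) and the multiple to-be-recommended business object features (such as the text features of the 5 texts), and determining the target recommended business objects (such as text 1 and text 3), refer to the embodiments corresponding to FIG. 3 to FIG. 7 below.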
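Refer to FIG. 3, which is a schematic flowchart of data recommendation according to an embodiment of this application. The following embodiment takes the server as the executing entity and describes how to determine the target recommended business object of a target user. The data recommendation method may include the following steps:
Step S101: in response to a data recommendation request for a target user in a target domain, obtain a business object set that has an association relationship with the target user in multiple domains.
Specifically, the server (such as the server 10d in the embodiment corresponding to FIG. 2a to FIG. 2d) responds to a data recommendation request for the target user in the target domain by obtaining the business object set that has an association relationship with the target user in multiple domains (such as the text set 20a, the video set 20b, and the product set 20c in the embodiment corresponding to FIG. 2a to FIG. 2d).
Put simply, a data recommendation request asks for data in the target domain to be recommended to the target user. For example, if the target domain is the text domain, the request asks for texts to be recommended to the target user; if the target domain is the shopping domain, the request asks for products to be recommended.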
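The target domain is any one of the multiple domains. A business object having an association relationship with the target user is either a business object that has been exposed to the target user (in other words, one the target user has viewed) or a non-entity business object (a non-entity business object is empty and can be represented by the special token UNK). The business object sets associated with the target user in the different domains all contain the same number of business objects, equal to a quantity threshold; a business object may be empty (that is, it may be a non-entity business object).
The multiple domains include at least two of the following: the text domain, the video domain, the game domain, the embedded application domain (that is, the mini-program domain), the shopping domain, and so on. The text domain can be subdivided into a short text domain (for example, news and official-account articles) and a long text domain (for example, novels); the video domain can be subdivided into a short video domain (for example, videos of up to 1 minute) and a long video domain (for example, movies and TV series).
For example, suppose the target user has viewed text 1, text 2, and video 1. If the multiple domains include the text domain and the video domain, and the number of business objects associated with the target user in each domain is 3, then the business object set may be: text 1, text 2, empty (UNK); video 1, empty (UNK), empty (UNK). Here "text 1", "text 2", and "video 1" are entity business objects, and "empty (UNK)" is a non-entity business object.
If all business objects associated with the target user in the target domain are empty (that is, all are non-entity business objects), the recommendation request can be regarded as a cold-start recommendation: the target user has no behavior data in the target domain, and business objects in the target domain have to be recommended based on the target user's behavior data in other domains.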
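The business object set belongs to the to-be-recommended business object set, which includes multiple business objects to be recommended in multiple domains and is determined when the cross-domain recommendation model is trained.
For example, suppose the to-be-recommended business object set includes text 1, text 2, and text 3 in the text domain and video 1, video 2, and video 3 in the video domain. If the target user has viewed text 1, text 4, and video 4, then the business objects associated with the target user are text 1, empty, empty in the text domain, and empty, empty, empty in the video domain.
The business object set associated with the target user in multiple domains is obtained as follows. The business objects that have been exposed to the target user in each domain are obtained (called exposed business objects; they also belong to the to-be-recommended business object set). For any domain, if the number of exposed business objects in that domain equals the quantity threshold (which is the input quantity threshold of the cross-domain recommendation model), the exposed business objects are combined into the unit business object set of that domain. If the number of exposed business objects in that domain is less than the quantity threshold, the exposed business objects and non-entity business objects are combined into the unit business object set of that domain. The business object set includes multiple unit business object sets, each corresponding to one domain and each containing exactly as many business objects as the quantity threshold; the unit business object sets of all domains are combined into the business object set associated with the target user.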
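The padding rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `UNK`, `build_unit_set`, and `build_business_object_set` are hypothetical, and truncating to the most recent objects when a domain has more exposures than the threshold is an assumption, since the text only covers the "equal to" and "less than" cases.

```python
UNK = "UNK"  # placeholder for a non-entity (empty) business object


def build_unit_set(exposed, threshold):
    """Return exactly `threshold` objects for one domain, padding with UNK."""
    # Assumption: if more objects were exposed than the threshold allows,
    # keep the most recent ones (the source text does not cover this case).
    kept = list(exposed)[-threshold:] if threshold > 0 else []
    return kept + [UNK] * (threshold - len(kept))


def build_business_object_set(exposed_by_domain, domains, threshold):
    """Combine the per-domain unit sets into one business object set."""
    return {
        domain: build_unit_set(exposed_by_domain.get(domain, []), threshold)
        for domain in domains
    }


# Example from the text: the user viewed text 1, text 2 and video 1.
result = build_business_object_set(
    {"text": ["text 1", "text 2"], "video": ["video 1"]},
    domains=["text", "video"],
    threshold=3,
)
```

A domain with no exposures at all yields a unit set made only of `UNK` entries, which corresponds to the cold-start case described earlier.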
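The exposed business objects may be determined from the knowledge graph triplets of the target user. A knowledge graph triplet of the target user includes the user identifier of the target user, the domain identifier of a domain the target user is involved in (such a domain is one of the multiple domains, namely the domain to which a business object exposed to the target user belongs), and a business object that has been exposed to the target user in that domain. A knowledge graph triplet can therefore be written as (user, domain, business object).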
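As a rough illustration of the (user, domain, business object) triplet, the sketch below groups a user's exposed objects by domain. The tuple layout follows the text; the function name, the sample identifiers, and the choice to drop repeated exposures are assumptions made for this example.

```python
from collections import defaultdict


def exposed_objects_by_domain(triplets, user_id):
    """Group the objects exposed to one user by domain, keeping first-seen order."""
    grouped = defaultdict(list)
    for user, domain, obj in triplets:
        # Assumption: a repeated exposure of the same object is kept only once.
        if user == user_id and obj not in grouped[domain]:
            grouped[domain].append(obj)
    return dict(grouped)


triplets = [
    ("u1", "text", "text 1"),
    ("u1", "video", "video 1"),
    ("u2", "text", "text 9"),
    ("u1", "text", "text 2"),
]
exposed = exposed_objects_by_domain(triplets, "u1")
```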
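Step S102: encode the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain.
Specifically, the trained cross-domain recommendation model is called to cross-encode the multiple domains and the business object set of the target user, yielding the general interest feature of the target user (such as the feature vector 20f in the embodiment corresponding to FIG. 2a to FIG. 2d). The general interest feature is a one-dimensional vector.
After the general interest feature is determined, knowledge of the target domain has to be introduced as a domain bias that transforms the user interest space, so as to learn the target-domain interest feature of the target user in the target domain (such as the text-domain interest feature 20h in the embodiment corresponding to FIG. 2a to FIG. 2d). The target-domain interest feature can later be used to predict which business objects in the target domain the target user may like. The target-domain interest feature is determined as follows:
The server obtains, from the business object set, the business objects associated with the target user in the target domain (called target business objects, such as the 3 empty texts in the text set 20a in the embodiment corresponding to FIG. 2a to FIG. 2d).
The cross-domain recommendation model is called to cross-encode the general interest feature, the target domain, and the target business objects, yielding the target-domain interest feature of the target user in the target domain; the target-domain interest feature is also a one-dimensional vector.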
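The following describes in detail how the cross-domain recommendation model cross-encodes the multiple domains and the business object set into the general interest feature:
The cross-domain recommendation model includes a domain encoder, an object encoder, and a domain-object cross encoder. The domain encoder encodes the multiple domains, the object encoder encodes the business object set, and the domain-object cross encoder re-encodes the encoded domain vector and the encoded object vector into the general interest feature of the target user. The domain encoder is an Encoder followed by a Transformer: the Encoder is an RNN structure with an attention mechanism, and the Transformer includes n encoders (and no decoder). The Encoder in the domain encoder is called the intra-domain domain encoder, and the Transformer in the domain encoder is called the inter-domain domain converter.
The server obtains the original domain feature of each domain (determined when the cross-domain recommendation model is trained). Assume there are k domains and each original domain feature has dimension n1, so the k original domain features can be written as k×n1. The k×n1 features are input as the original domain feature sequence into the Encoder of the domain encoder, which performs self-attention recurrent encoding and outputs k first encoding vectors, each of which may have dimension n2. The k first encoding vectors can be combined into a k×n2 matrix (also called the first encoding vector sequence), and average pooling over this matrix yields a second encoding vector of dimension k×1. The k×1 second encoding vector is then input into the Transformer of the domain encoder, which performs multi-attention encoding and outputs a domain feature of dimension k×1. Here n1 may be 512 and n2 may be 8, and the output dimension of the Transformer equals its input dimension.
Learning across multiple domains in this way produces a dense domain feature and addresses the sparsity of domain features.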
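Average pooling takes the mean of the values in a row (or a column) as the feature representation of that row (or column).
Average pooling can be expressed by the following formula (1):
$d_i = \mathrm{Average\_pooling}(X_i)$   (1)
where $X_i$ denotes the feature input.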
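The encoding process of the Transformer in the domain encoder can be expressed by the following formula (2):
[Formula (2): image PCTCN2021101674-appb-000001, not reproduced in the text]
where the symbol shown in image PCTCN2021101674-appb-000002 denotes the domain feature.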
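The object encoder includes multiple intra-domain object encoders (Encoder) and one inter-domain object converter (Transformer). The number of intra-domain object encoders equals the number of domains; each intra-domain object encoder encodes the business objects of one domain (that is, one unit business object set), and the inter-domain object converter re-encodes the encoding vectors output by the intra-domain object encoders. The business object set includes multiple unit business object sets, each corresponding to one domain. Each intra-domain object encoder is an RNN structure with an attention mechanism, and the inter-domain object converter includes n encoders (and no decoder).
For one unit business object set, the server obtains the original object feature of each business object in the set (determined when the cross-domain recommendation model is trained; if a business object is empty, UNK, its original object feature may be a vector of all 0s, all 1s, or another preset value). Assume the unit business object set contains p business objects and each original object feature has dimension r1, so the p original object features can be written as p×r1. They are input as the original object feature sequence into the intra-domain object encoder, which outputs p third encoding vectors, each of which may have dimension r2. The p third encoding vectors can be combined into a p×r2 matrix, and average pooling over this matrix yields a fourth encoding vector of dimension p×1. Every unit business object set is encoded in the same way to obtain its fourth encoding vector.
Assume there are k unit business object sets, giving k fourth encoding vectors of dimension p×1. The k fourth encoding vectors (k×p) are input into the inter-domain object converter, which outputs a feature matrix of dimension k×p. Average pooling over the k×p feature matrix yields an object feature of dimension k×1. Here r1 may be 512 and r2 may be 8. Learning across the business objects of multiple domains produces a dense object feature and addresses the sparsity of object features.
Besides average pooling, the pooling steps above may also use max pooling, min pooling, and so on.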
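The row-wise pooling used by both encoders can be illustrated as follows. This is a minimal sketch with plain Python lists; `pool_rows` and its `mode` argument are hypothetical names chosen for the example, and the toy 3×2 matrix stands in for the much larger k×p feature matrix.

```python
def pool_rows(matrix, mode="mean"):
    """Collapse each row of a 2-D list to one value, giving a column vector."""
    reducers = {
        "mean": lambda row: sum(row) / len(row),  # average pooling, formula (1)
        "max": max,                               # max pooling
        "min": min,                               # min pooling
    }
    reduce_row = reducers[mode]
    return [reduce_row(row) for row in matrix]


# A toy k x p feature matrix with k = 3 domains and p = 2 objects per domain.
features = [[1.0, 3.0], [2.0, 6.0], [0.0, 4.0]]
pooled = pool_rows(features)  # one value per domain, i.e. a k x 1 result
```

Switching `mode` to `"max"` or `"min"` gives the alternative pooling choices mentioned above without changing the output shape.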
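The encoding process of the Transformer in the object encoder can be expressed by the following formula (3):
[Formula (3): image PCTCN2021101674-appb-000003, not reproduced in the text]
where the symbol shown in image PCTCN2021101674-appb-000004 denotes the feature matrix before pooling.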
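At this point, the domain encoder has determined the domain feature of the multiple domains and the object encoder has determined the object feature of the business object set, and the two features have the same dimension. The domain-object cross encoder is called to cross-encode the domain feature and the object feature into the general interest feature of the target user; the general interest feature is also a vector.
The cross-encoding performed by the domain-object cross encoder is shown in formula (4):
[Formula (4): image PCTCN2021101674-appb-000005, not reproduced in the text]
$\mathrm{ConvE}(e_1, e_2) = \mathrm{flatten}(\mathrm{CNN}(f(e_1, e_2))\,W_c)$
where the symbol shown in image PCTCN2021101674-appb-000006 denotes the pooled object feature, the symbol shown in image PCTCN2021101674-appb-000007 denotes the domain feature, and $u_g$ denotes the general interest feature. $e_1$ and $e_2$ denote the two one-dimensional vectors input into the domain-object cross encoder; $W_c$ denotes the weight matrix of the domain-object cross encoder; the function $f(\cdot)$ folds the two vectors, that is, folds two one-dimensional vectors into one two-dimensional matrix; and the function $\mathrm{flatten}(\cdot)$ flattens a multi-dimensional input into a one-dimensional vector.
When features are sparse, the computation in formula (4) makes effective use of the parameters; it is also very effective for crossing different features.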
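To make the fold, convolve, flatten, and project sequence concrete, here is a small pure-Python sketch. It is not the patented model: it assumes that folding stacks the two vectors as the two rows of a matrix, that the CNN is a single 2×2 kernel applied as a valid cross-correlation without bias or activation, that flattening is applied before the projection by `w_c`, and that the object feature is passed first. All names and numbers are illustrative.

```python
def fold(e1, e2):
    """f(e1, e2): stack two 1-D vectors as the rows of a 2-D matrix (assumed layout)."""
    return [list(e1), list(e2)]


def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation, the usual CNN convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flatten(matrix):
    """Turn a 2-D list into a 1-D list, row by row."""
    return [value for row in matrix for value in row]


def conv_e(e1, e2, kernel, w_c):
    """Fold, convolve, flatten, then project with the weight matrix w_c."""
    hidden = flatten(conv2d_valid(fold(e1, e2), kernel))
    return [
        sum(h * w_c[i][j] for i, h in enumerate(hidden))
        for j in range(len(w_c[0]))
    ]


# Toy values: k = 3 domains, so both inputs are length-3 vectors.
object_feature = [0.0, 1.0, 0.0]
domain_feature = [1.0, 2.0, 3.0]
kernel = [[1.0, 0.0], [0.0, 1.0]]          # hypothetical 2x2 kernel
w_c = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # maps 2 hidden values back to 3
general_interest = conv_e(object_feature, domain_feature, kernel, w_c)
```

In a trained model the kernel and `w_c` would be learned parameters; here they are fixed so that the arithmetic can be followed by hand.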
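At this point, the general interest feature of the target user has been obtained by calling the cross-domain recommendation model.
The following describes in detail how the cross-domain recommendation model cross-encodes the general interest feature, the target domain, and the target business objects into the target-domain interest feature:
As described above, the object encoder includes multiple intra-domain object encoders (Encoder) and one inter-domain object converter (Transformer); the number of intra-domain object encoders equals the number of domains, each intra-domain object encoder is an RNN structure with an attention mechanism, and the inter-domain object converter includes n encoders (and no decoder). The server obtains the original domain feature of the target domain (called the target domain feature here, to distinguish it from the original domain features above) and the original object features of the target business objects. The intra-domain object encoder of the target domain encodes the original object features of the target business objects into the target object feature, which corresponds to the fourth encoding vector of dimension p×1 described above, where p is the number of target business objects.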
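The target domain feature and the target object feature are concatenated into the target domain object feature using the following formula (5):
$v_t = \mathrm{Concat}(d_t, f_t)\,W_d$   (5)
where $d_t$ denotes the target domain feature, $f_t$ denotes the target object feature, $v_t$ denotes the target domain object feature, and $W_d$ denotes the weight matrix of the intra-domain object encoder in the object encoder of the target domain.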
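Formula (5) amounts to a concatenation followed by a linear projection. The sketch below shows that arithmetic with plain lists; the function name, the vector sizes, and the weight values are assumptions picked so the result is easy to verify by hand.

```python
def concat_project(d_t, f_t, w_d):
    """v_t = Concat(d_t, f_t) W_d, with w_d given as a list of rows."""
    joined = list(d_t) + list(f_t)
    if len(joined) != len(w_d):
        raise ValueError("w_d needs one row per concatenated feature value")
    return [
        sum(x * w_d[i][j] for i, x in enumerate(joined))
        for j in range(len(w_d[0]))
    ]


d_t = [1.0]                                   # toy target domain feature
f_t = [2.0, 3.0]                              # toy target object feature (p = 2)
w_d = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # hypothetical 3 x 2 weight matrix
v_t = concat_project(d_t, f_t, w_d)
```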
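The domain-object cross encoder is called to cross-encode the general interest feature and the target domain object feature into the target-domain interest feature, which can be computed by the following formula (6):
$u_t = \mathrm{ConvE}(u_g, v_t)$   (6)
where $u_g$ denotes the general interest feature, $v_t$ denotes the target domain object feature, and $u_t$ denotes the target-domain interest feature of the target user in the target domain.
At this point, the server has obtained the target-domain interest feature by calling the cross-domain recommendation model.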
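Refer to FIG. 4a, which is a schematic diagram of a cross-domain recommendation model according to an embodiment of this application. The original domain feature sequence of the multiple domains is input into the domain encoder (which includes the intra-domain domain encoder and the inter-domain domain converter) to obtain the domain feature. The original object feature sequence corresponding to the unit business object set of each domain is input into the object encoder (which includes multiple intra-domain object encoders and one inter-domain object converter) to obtain the object feature. The domain feature and the object feature are input into the domain-object cross encoder to obtain the general interest feature of the target user. The original domain feature of the target domain (the target domain feature) is obtained, the intra-domain object encoder of the target domain encodes the original object feature sequence of the target domain into the target object feature, and the target domain feature and the target object feature are concatenated into the target domain object feature. The general interest feature and the target domain object feature are input into the domain-object cross encoder to obtain the target-domain interest feature of the target user.
In this application, a Transformer performs intra-domain feature extraction to address feature sparsity, and ConvE performs inter-domain feature crossing, which improves the effect of cross-domain feature crossing, makes the extracted cross-domain features more discriminative, and strengthens the ability to infer user behavior. Introducing target-domain knowledge as a domain bias transforms the user interest space, strengthens the expressiveness of the user's target-domain interest feature, and improves the precision of subsequent business object recommendation.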
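Step S103: obtain the to-be-recommended business object features of multiple business objects to be recommended in the target domain.
Specifically, the server obtains the original object features of the multiple business objects to be recommended in the target domain (called to-be-recommended business object features here, to distinguish them from the original object features above).
The business objects to be recommended in the target domain are determined when the cross-domain recommendation model is trained, as are their to-be-recommended business object features; all business objects to be recommended are entity business objects. There may be multiple target business objects associated with the target user in the target domain. They may all be empty, that is, all non-entity business objects (called non-entity target business objects); or they may include both entity target business objects and non-entity target business objects, where the entity target business objects belong to the multiple business objects to be recommended in the target domain.
Step S104: obtain, from the multiple to-be-recommended business object features, the target recommended business object feature that matches the target-domain interest feature, and output the target recommended business object corresponding to the target recommended business object feature.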
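Specifically, the server computes the feature distance between the target-domain interest feature and the to-be-recommended business object feature of each business object to be recommended in the target domain; the Euclidean distance can be used as the measure. The larger the feature distance, the lower the similarity between the target-domain interest feature and that to-be-recommended business object feature, and the less likely the target user is to like the corresponding business object. Conversely, the smaller the feature distance, the higher the similarity and the more likely the target user is to like the corresponding business object.
The server sorts the feature distances and takes the n to-be-recommended business object features with the smallest feature distance as the target recommended business object features. The business objects corresponding to these features are the target recommended business objects (such as text 1 and text 3 in the embodiment corresponding to FIG. 2a to FIG. 2d), which are then output. The target recommended business objects are the business objects in the target domain that the target user is predicted to like.
For example, suppose there are 3 business objects to be recommended, A, B, and C. The feature distance between the target-domain interest feature and the to-be-recommended business object feature is 0.3 for A, 0.5 for B, and 0.1 for C. If the business requirement is 2 target recommended business objects, the business objects with the 2 smallest feature distances are selected, so A and C are both target recommended business objects.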
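The selection rule in this example can be sketched with the standard library alone. The feature vectors below are invented so that the Euclidean distances come out as 0.3, 0.5, and 0.1 as in the text; `top_n_nearest` is a hypothetical helper name, and `math.dist` requires Python 3.8 or later.

```python
import math


def top_n_nearest(interest, candidates, n):
    """Return the names of the n candidates closest to the interest feature."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: math.dist(interest, item[1]),  # Euclidean distance
    )
    return [name for name, _ in ranked[:n]]


interest = [0.0, 0.0]  # toy target-domain interest feature
candidates = {
    "A": [0.3, 0.0],   # distance 0.3
    "B": [0.0, 0.5],   # distance 0.5
    "C": [0.1, 0.0],   # distance 0.1
}
recommended = top_n_nearest(interest, candidates, 2)
```

A full sort is fine for a handful of candidates; a production system would more likely use an approximate nearest-neighbor index, which is consistent with the nearest-neighbor service mentioned in the framework description.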
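In some embodiments, if the target domain is the text domain, the target recommended business object is a target recommended electronic reading item, which may be a novel, a news article, an official-account article, and so on.
In some embodiments, the server may deliver the target recommended business object to the terminal device of the target user so that it is exposed to the target user. After seeing it, the target user may click the target recommended business object to read its content in detail if interested, or skip it if not. The server may combine the target user, the target domain, and the exposed target recommended business object into a new knowledge graph triplet for the target user, written as (user, domain, business object). The cross-domain recommendation model can then be updated based on the new knowledge graph triplet, where whether the target user clicked a target recommended business object serves as the label of that object. For example, if the target user clicked target recommended business object A, its label is 1; if the target user did not click it, its label is 0.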
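The feedback loop described above can be illustrated by pairing each new triplet with its click label. This is a sketch under assumptions: the function name and the representation of a labelled sample as a `(triplet, label)` pair are choices made for the example, not something the text prescribes.

```python
def triplets_with_labels(user_id, domain, exposed_objects, clicked_objects):
    """Pair each (user, domain, object) triplet with a click label of 1 or 0."""
    clicked = set(clicked_objects)
    return [
        ((user_id, domain, obj), 1 if obj in clicked else 0)
        for obj in exposed_objects
    ]


# The user was shown objects A and C and clicked only A.
samples = triplets_with_labels("u1", "text", ["A", "C"], ["A"])
```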
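As can be seen from the above, this application proposes a definition of the knowledge graph triplet that can be used to model data structures across multiple product domains; it unifies the data format and provides the data basis for model training. A Transformer performs intra-domain feature extraction to address feature sparsity, and ConvE performs inter-domain feature crossing, which improves the effect of cross-domain feature crossing and strengthens the ability to infer user behavior. Introducing target-domain knowledge as a domain bias transforms the user interest space and strengthens the expressiveness of the user's target-domain interest feature. Finally, the solution can be used in user cold-start scenarios of data recommendation and improves cold-start recommendation accuracy.
Refer to FIG. 4b, which is an overall framework diagram of data recommendation according to an embodiment of this application. As shown in FIG. 4b, data recommendation involves two parts, online serving and offline training. Offline training trains the cross-domain recommendation model and determines the original object features of the business objects to be recommended in multiple domains. The trained cross-domain recommendation model predicts the user's target-domain interest feature in the target domain. A nearest-neighbor service is called to determine the feature distance between the target-domain interest feature and the original object features of the business objects to be recommended in the target domain (the to-be-recommended business object features), and the business objects corresponding to the k features with the smallest feature distance are taken as the target recommended business objects. A ranking service is called to rank these k target recommended business objects, and the ranked objects are delivered to the user's terminal device for precise recommendation. Afterwards, the user's click log for the target recommended business objects can be collected; a multi-domain user behavior processing module generates knowledge graph triplets from the click log, and the cross-domain recommendation model is updated according to those triplets.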
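Refer to FIG. 5a and FIG. 5b, which are schematic diagrams of target recommended business objects according to an embodiment of this application. As shown in FIG. 5a, when the target domain is the text domain, the current data recommendation request asks for texts to be recommended to the target user. After determining the texts to recommend in the manner described above, the server delivers them to the terminal device of the target user, which displays them. The text cover (including the text title, text illustration, and so on) may be displayed first, and the detailed text content is displayed after the user clicks the cover.
As shown in FIG. 5b, when the target domains are the text domain and the video domain, the current data recommendation request asks for texts and videos to be recommended to the target user. After determining the texts and videos to recommend in the manner described above, the server delivers them to the terminal device of the target user, which displays them. The text cover (including the text title, text illustration, and so on) and the video cover (any video frame of the video) may be displayed first, and the detailed text content is displayed or the video is played after the user clicks the text cover or the video cover.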
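Refer to FIG. 6, which is a schematic flowchart of a data recommendation method according to an embodiment of this application. The following embodiment mainly describes the training process of the cross-domain recommendation model. The data recommendation method may include the following steps:
Step S201: obtain the sample unit business object set that has an association relationship with a sample user in each domain.
Specifically, because model training involves many iterations, this embodiment describes one iteration with one sample user as an example:
The server obtains the sample knowledge graph triplet set of the sample user, which includes multiple sample knowledge graph triplets. Any sample knowledge graph triplet includes the user identifier of the sample user, the domain identifier of one of the multiple domains, and the object identifier of a sample business object that has been exposed to the sample user in that domain; each sample knowledge graph triplet can thus be written as (user, domain, business object). From the sample knowledge graph triplet set, the server determines the sample unit business object set associated with the sample user in each domain, and the number of sample business objects in each sample unit business object set equals the quantity threshold described above. A sample business object associated with the sample user is one that has been exposed to the user. Each unit business object set includes sample business objects that were exposed but not clicked as well as sample business objects that were exposed and clicked, and the ratio between the two is the same in every unit business object set.
It should be noted that the sample unit business object sets of the multiple domains here are the to-be-recommended business object set described above.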
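Step S202: generate a random domain feature for each domain, and generate a random object feature for each sample business object in the multiple sample unit business object sets.
Specifically, the server generates a random domain feature for each domain and a random object feature for each sample business object in the multiple sample unit business object sets. Both the random domain features and the random object features take part in model training; after training, the random domain features are the original domain features described above, and the random object features are the original object features described above.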
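Step S203: call the sample cross-domain recommendation model to perform cross-domain cross-encoding on all random domain features and all random object features to obtain the predicted interest feature of the sample user in the sample domain.
Specifically, the server calls the sample cross-domain recommendation model to cross-encode all random domain features and all random object features, yielding the sample general interest feature of the sample user. This process is essentially the same as determining the general interest feature of the target user above and is not repeated here.
The server determines the sample domain from the multiple domains; the sample domain is any one of the multiple domains. The server obtains the random domain feature of the sample domain and the random object feature of each sample business object in the sample unit business object set of the sample domain. It then calls the sample cross-domain recommendation model to cross-encode the sample general interest feature, the random domain feature of the sample domain, and those random object features, yielding the predicted interest feature of the sample user in the sample domain.
Determining the predicted interest feature of the sample user in the sample domain is essentially the same as determining the target-domain interest feature of the target user in the target domain above and is not repeated here.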
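Step S204: determine the feature similarity between the predicted interest feature and the random object features of the sample domain, and obtain the behavior labels of the random object features of the sample domain.
Specifically, the server computes the feature similarity between the predicted interest feature and the random object feature of each sample business object in the sample unit business object set of the sample domain; the feature similarity takes a value between 0 and 1. The server obtains the behavior label of the random object feature of each such sample business object. A behavior label takes the value 1 or 0: a label of 1 means the corresponding sample business object was exposed to the sample user and the sample user clicked it to view its details; a label of 0 means the corresponding sample business object was only exposed to the sample user and was not clicked.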
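Step S205: determine the prediction error according to the behavior labels and the feature similarity, and train the sample cross-domain recommendation model based on the prediction error to obtain the cross-domain recommendation model.
Specifically, from the feature similarity and behavior label of the random object feature of each sample business object in the sample unit business object set of the sample domain, the server determines the error of each such random object feature and sums all the errors into the prediction error. Following gradient descent, the prediction error is back-propagated to the sample cross-domain recommendation model, the random domain feature of each domain, and the random object feature of each sample business object in the multiple sample unit business object sets, so as to adjust the model parameters, the random domain features, and the random object features.
This completes one iterative update of the sample cross-domain recommendation model. In the same way, the server can take each of the other domains in turn as the sample domain and update the sample cross-domain recommendation model again, and can use the sample unit business object sets of the remaining sample users for training, adjusting iteratively. It should be noted that multiple sample users may take part in model training, and the sample unit business object sets of every sample user contain the same sample business objects.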
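When the adjusted sample cross-domain recommendation model satisfies the model convergence condition, the adjusted sample cross-domain recommendation model is used as the cross-domain recommendation model, the adjusted random domain feature of each domain is used as the original domain feature, and the adjusted random object feature of each sample business object in the multiple sample unit business object sets is used as the original object feature.
The adjusted sample cross-domain recommendation model is considered to satisfy the model convergence condition if the number of adjustments reaches a count threshold; or if the difference between the model parameters before and after adjustment is smaller than a preset difference threshold; or if the prediction accuracy of the adjusted model reaches a preset accuracy threshold.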
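The three alternative convergence conditions can be combined into a single check, as in the sketch below. The parameter names are hypothetical, and measuring the parameter difference as the largest absolute change is an assumption; the text only says that the difference must fall below a preset threshold.

```python
def has_converged(step, max_steps, old_params, new_params, delta_threshold,
                  accuracy=None, accuracy_threshold=None):
    """Return True if any one of the three convergence conditions holds."""
    # Condition 1: the number of adjustments reached the count threshold.
    if step >= max_steps:
        return True
    # Condition 2: parameters barely changed (assumed: max absolute change).
    delta = max(abs(a - b) for a, b in zip(old_params, new_params))
    if delta < delta_threshold:
        return True
    # Condition 3: prediction accuracy reached the accuracy threshold.
    return (accuracy is not None
            and accuracy_threshold is not None
            and accuracy >= accuracy_threshold)
```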
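Once model training is complete, prediction can begin.
Step S206: in response to a data recommendation request for a target user in a target domain, obtain a business object set that has an association relationship with the target user in multiple domains, where the business object set includes the business objects that have an association relationship with the target user in each domain, and the multiple domains include the target domain.
Step S207: call the cross-domain recommendation model to encode the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain.
Step S208: obtain the to-be-recommended business object features of multiple business objects to be recommended in the target domain, obtain from them the target recommended business object feature that matches the target-domain interest feature, and output the target recommended business object corresponding to the target recommended business object feature.
For the specific process of steps S206 to S208, refer to steps S101 to S104 in the embodiment corresponding to FIG. 3.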
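Refer to FIG. 7, which is a schematic diagram of a sample cross-domain recommendation model according to an embodiment of this application. As shown in FIG. 7, the random domain features of the multiple domains are obtained and combined into a random domain feature sequence, which is input into an Encoder (also called the sample intra-domain domain encoder) and a Transformer (also called the sample inter-domain domain converter); together they encode the random domain feature sequence into the sample domain feature. The random object features of the sample unit business object set associated with the sample user in each domain are obtained, and the random object features of each domain are combined into a random object feature sequence. The multiple random object feature sequences are input into an Encoder (also called the sample intra-domain object encoder) and a Transformer (also called the sample inter-domain object converter), which encode them into the sample object feature. The sample domain feature and the sample object feature are input into the ConvE layer (also called the sample domain-object cross encoder), which cross-encodes them into the sample general interest feature; the ConvE layer is computed as in formula (4) above.
One domain is randomly selected from the multiple domains as the sample domain. The random domain feature of the sample domain and the sample unit business object set of the sample domain are obtained, along with the encoding vector that the Encoder produces from the random object feature sequence of that set; the encoding vector and the random domain feature of the sample domain are concatenated into the sample domain object feature. The sample domain object feature and the sample general interest feature are input into the ConvE layer, which cross-encodes them into the predicted interest feature. The feature similarity between the predicted interest feature and the random object features of the sample unit business object set of the sample domain is computed, and a Softmax loss function measures the prediction error between the feature similarity and the behavior labels of those random object features. According to the prediction error, the sample intra-domain domain encoder, the sample inter-domain domain converter, the sample intra-domain object encoder, the sample inter-domain object converter, and the sample domain-object cross encoder of the sample cross-domain recommendation model are adjusted, as are the random domain feature sequence and the multiple random object feature sequences. The adjustment is repeated until the adjusted sample cross-domain recommendation model satisfies the model convergence condition.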
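One plausible reading of the Softmax loss step is sketched below; the text does not spell out the exact formula, so this is an assumption. The similarities are treated as scores, normalized with a softmax over the objects of the sample domain, and the loss is the average negative log-probability of the clicked objects. Returning 0.0 when nothing was clicked is also an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(similarities, labels):
    """Average negative log-probability of the clicked (label 1) objects."""
    probs = softmax(similarities)
    clicked = [p for p, y in zip(probs, labels) if y == 1]
    if not clicked:
        return 0.0  # assumption: no clicked object contributes no error
    return -sum(math.log(p) for p in clicked) / len(clicked)


labels = [1, 0, 0]                                  # only the first object was clicked
loss_good = softmax_loss([0.9, 0.1, 0.2], labels)   # clicked object scores highest
loss_bad = softmax_loss([0.1, 0.9, 0.2], labels)    # clicked object scores lowest
```

The loss is smaller when the clicked object has the highest similarity, which is the behavior gradient descent would push the model toward.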
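As can be seen from the above, this application proposes a definition of the knowledge graph triplet that can be used to model data structures across multiple product domains; it unifies the data format and provides the data basis for model training. A Transformer performs intra-domain feature extraction to address feature sparsity, and ConvE performs inter-domain feature crossing, which improves the effect of cross-domain feature crossing and strengthens the ability to infer user behavior. Introducing target-domain knowledge as a domain bias transforms the user interest space and strengthens the expressiveness of the user's target-domain interest feature. In addition, user behavior in different domains complements each other when determining the user's interest feature in the target domain; recommending business objects in the target domain based on that interest feature is more targeted and can further improve the accuracy of data recommendation.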
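Further, refer to FIG. 8, which is a schematic structural diagram of a data recommendation apparatus according to an embodiment of this application. As shown in FIG. 8, the data recommendation apparatus 1 can be applied to the server in the embodiments corresponding to FIG. 3 to FIG. 7. The data recommendation apparatus may be a computer program (including program code) running in a computer device; for example, the data recommendation apparatus is application software. The apparatus can be used to perform the corresponding steps of the method provided in the embodiments of this application.
The data recommendation apparatus 1 may include an acquisition module 11, an encoding module 12, a determination module 13, and an output module 14.
The acquisition module 11 is configured to, in response to a data recommendation request for a target user in a target domain, obtain a business object set that has an association relationship with the target user in multiple domains, where the business object set includes the business objects that have an association relationship with the target user in each domain, and the multiple domains include the target domain;
the encoding module 12 is configured to encode the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain;
the acquisition module 11 is further configured to obtain the to-be-recommended business object features of multiple business objects to be recommended in the target domain;
the determination module 13 is configured to obtain, from the multiple to-be-recommended business object features, the target recommended business object feature that matches the target-domain interest feature;
the output module 14 is configured to output the target recommended business object corresponding to the target recommended business object feature.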
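In some embodiments, the determination module 13 is specifically configured to:
determine the feature distance between the target-domain interest feature and each to-be-recommended business object feature;
determine, according to the feature distance, the target recommended business object feature from the multiple to-be-recommended business object features.
In some embodiments, the multiple domains include at least two of the following: the text domain, the video domain, the game domain, and the embedded application domain;
if the target domain is the text domain, the target recommended business object includes a target recommended electronic reading item.
In some embodiments, when obtaining the business object set that has an association relationship with the target user in multiple domains, the acquisition module 11 is specifically configured to:
obtain the exposed business objects that have been exposed to the target user in each domain;
if the number of exposed business objects in any domain equals the quantity threshold, combine the exposed business objects of that domain into the unit business object set of that domain;
if the number of exposed business objects in any domain is less than the quantity threshold, obtain non-entity business objects and combine the exposed business objects of that domain and the non-entity business objects into the unit business object set of that domain;
combine the unit business object sets of all domains into the business object set that has an association relationship with the target user.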
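For the specific functional implementation of the acquisition module 11, the encoding module 12, the determination module 13, and the output module 14, refer to steps S101 to S104 in the embodiment corresponding to FIG. 3; details are not repeated here.
Referring to FIG. 8, the encoding module 12 may include a first encoding unit 121, an obtaining unit 122, and a second encoding unit 123.
The first encoding unit 121 is configured to call the cross-domain recommendation model to cross-encode the multiple domains and the business object set to obtain the general interest feature of the target user;
the obtaining unit 122 is configured to obtain the target business objects that have an association relationship with the target user in the target domain;
the second encoding unit 123 is configured to call the cross-domain recommendation model to cross-encode the general interest feature, the target domain, and the target business objects to obtain the target-domain interest feature.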
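In some embodiments, the cross-domain recommendation model includes a domain encoder, an object encoder, and a domain-object cross encoder;
the first encoding unit 121 is specifically configured to:
obtain the original domain feature of each domain, and call the domain encoder to encode all original domain features into the domain feature;
obtain the original object feature of each business object in the business object set, and call the object encoder to encode all original object features into the object feature;
call the domain-object cross encoder to cross-encode the domain feature and the object feature into the general interest feature of the target user.
In some embodiments, the domain encoder includes an intra-domain domain encoder and an inter-domain domain converter;
when calling the domain encoder to encode all original domain features into the domain feature, the first encoding unit 121 is specifically configured to:
combine all original domain features into an original domain feature sequence;
call the intra-domain domain encoder to perform self-attention recurrent encoding on the original domain feature sequence to obtain a first encoding vector sequence;
pool the first encoding vector sequence to obtain a second encoding vector;
call the inter-domain domain converter to perform multi-attention encoding on the second encoding vector to obtain the domain feature.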
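In some embodiments, the object encoder includes an intra-domain object encoder;
the second encoding unit 123 is specifically configured to:
obtain the target domain feature of the target domain;
call the intra-domain object encoder to encode the target business objects into the target object feature;
concatenate the target domain feature and the target object feature into the target domain object feature;
call the domain-object cross encoder to cross-encode the general interest feature and the target domain object feature into the target-domain interest feature.
In some embodiments, there are multiple target business objects; the multiple target business objects are all non-entity target business objects, or they include entity target business objects and non-entity target business objects, and the entity target business objects belong to the multiple business objects to be recommended.
For the specific functional implementation of the first encoding unit 121, the obtaining unit 122, and the second encoding unit 123, refer to step S102 in the embodiment corresponding to FIG. 3; details are not repeated here.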
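Referring again to FIG. 8, the data recommendation apparatus 1 may include the acquisition module 11, the encoding module 12, the determination module 13, and the output module 14, and may further include a combination module 15, a generation module 16, a calling module 17, and a training module 18.
The combination module 15 is configured to combine the target user, the target domain, and the target recommended business object into a knowledge graph triplet, and update the cross-domain recommendation model based on the knowledge graph triplet.
The generation module 16 is configured to obtain the sample unit business object set that has an association relationship with a sample user in each domain, generate a random domain feature for each domain, and generate a random object feature for each sample business object in the multiple sample unit business object sets;
the calling module 17 is configured to call the sample cross-domain recommendation model to perform cross-domain cross-encoding on all random domain features and all random object features to obtain the predicted interest feature of the sample user in the sample domain, where the sample domain is any one of the multiple domains;
the generation module 16 is further configured to determine the feature similarity between the predicted interest feature and the random object features of the sample domain, obtain the behavior labels of the random object features of the sample domain, and determine the prediction error according to the behavior labels and the feature similarity;
the training module 18 is configured to train the sample cross-domain recommendation model based on the prediction error to obtain the cross-domain recommendation model.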
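In some embodiments, when obtaining the sample unit business object set that has an association relationship with the sample user in each domain, the generation module 16 is specifically configured to:
obtain the sample knowledge graph triplet set of the sample user, where the sample knowledge graph triplet set includes multiple sample knowledge graph triplets, and any sample knowledge graph triplet includes the user identifier of the sample user, the domain identifier of any one of the multiple domains, and the object identifier of a sample business object that has been exposed to the sample user in that domain;
determine, from the sample knowledge graph triplet set, the sample unit business object set that has an association relationship with the sample user in each domain.
In some embodiments, the training module 18 is specifically configured to:
adjust, based on the prediction error, the sample cross-domain recommendation model, the random domain feature of each domain, and the random object feature of each sample business object in the multiple sample unit business object sets;
when the adjusted sample cross-domain recommendation model satisfies the model convergence condition, use the adjusted sample cross-domain recommendation model as the cross-domain recommendation model, use the adjusted random domain feature of each domain as the original domain feature, and use the adjusted random object feature of each sample business object in the multiple sample unit business object sets as the original object feature.
In some embodiments, the calling module 17 is specifically configured to:
call the sample cross-domain recommendation model to cross-encode all random domain features and all random object features to obtain the sample general interest feature of the sample user;
determine the sample domain from the multiple domains, and call the sample cross-domain recommendation model to cross-encode the sample general interest feature, the random domain feature of the sample domain, and the random object features of the sample domain to obtain the predicted interest feature.
For the specific functional implementation of the combination module 15, refer to step S104 in the embodiment corresponding to FIG. 3; for the specific functional implementation of the generation module 16, the calling module 17, and the training module 18, refer to steps S201 to S205 in the embodiment corresponding to FIG. 6.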
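Further, refer to FIG. 9, which is a schematic structural diagram of a computer device according to an embodiment of the present invention. The terminal device in the embodiments corresponding to FIG. 3 to FIG. 7 may be a computer device 1000. As shown in FIG. 9, the computer device 1000 may include a user interface 1002, a processor 1004, an encoder 1006, and a memory 1008. A signal receiver 1016 is used to receive or send data via a cellular interface 1010, a WIFI interface 1012, ..., or an NFC interface 1014. The encoder 1006 encodes received data into a data format that the computer can process. The memory 1008 stores a computer program, and the processor 1004 is configured to perform the steps of any of the foregoing method embodiments through the computer program. The memory 1008 may include volatile memory (for example, dynamic random access memory, DRAM) and may also include non-volatile memory (for example, one-time programmable read-only memory, OTPROM). In some examples, the memory 1008 may further include memory located remotely from the processor 1004, and such remote memory may be connected to the computer device 1000 through a network. The user interface 1002 may include a keyboard 1018 and a display 1020.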
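In the computer device 1000 shown in FIG. 9, the processor 1004 may be used to call the computer program stored in the memory 1008 to implement:
in response to a data recommendation request for a target user in a target domain, obtaining a business object set that has an association relationship with the target user in multiple domains, where the business object set includes the business objects that have an association relationship with the target user in each domain, and the multiple domains include the target domain;
encoding the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain;
obtaining the to-be-recommended business object features of multiple business objects to be recommended in the target domain;
obtaining, from the multiple to-be-recommended business object features, the target recommended business object feature that matches the target-domain interest feature, and outputting the target recommended business object corresponding to the target recommended business object feature.
It should be understood that the computer device 1000 described in the embodiment of the present invention can carry out the data recommendation method described in the embodiments corresponding to FIG. 3 to FIG. 7 and the data recommendation apparatus 1 described in the embodiment corresponding to FIG. 8; details are not repeated here. The beneficial effects of the same method are likewise not described again.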
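In addition, it should be noted that an embodiment of the present invention further provides a computer storage medium that stores the computer program executed by the aforementioned data recommendation apparatus 1. The computer program includes program instructions; when a processor executes the program instructions, it can perform the methods in the embodiments corresponding to FIG. 3 to FIG. 7, so details are not repeated here. The beneficial effects of the same method are likewise not described again. For technical details not disclosed in the computer storage medium embodiment of the present invention, refer to the description of the method embodiments. As an example, the program instructions may be deployed on one computer device, executed on multiple computer devices located at one site, or executed on multiple computer devices distributed across multiple sites and interconnected by a communication network; multiple computer devices distributed across multiple sites and interconnected by a communication network can form a blockchain system.
According to one aspect of this application, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device can perform the methods in the embodiments corresponding to FIG. 3 to FIG. 7; details are not repeated here.
A person of ordinary skill in the art will understand that all or part of the processes of the foregoing method embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely preferred embodiments of the present invention and certainly cannot be used to limit the scope of its rights; equivalent changes made according to the claims of the present invention therefore still fall within the scope covered by the present invention.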

Claims (15)

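  1. A data recommendation method, comprising:
    in response to a data recommendation request for a target user in a target domain, obtaining a business object set that has an association relationship with the target user in multiple domains, wherein the business object set comprises the business objects that have an association relationship with the target user in each domain, and the multiple domains comprise the target domain;
    encoding the multiple domains and the business object set to obtain a target-domain interest feature of the target user in the target domain;
    obtaining to-be-recommended business object features of multiple business objects to be recommended in the target domain;
    obtaining, from the multiple to-be-recommended business object features, a target recommended business object feature that matches the target-domain interest feature, and outputting the target recommended business object corresponding to the target recommended business object feature.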
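  2. The method according to claim 1, wherein encoding the multiple domains and the business object set to obtain the target-domain interest feature of the target user in the target domain comprises:
    calling a cross-domain recommendation model to cross-encode the multiple domains and the business object set to obtain a general interest feature of the target user;
    obtaining target business objects that have an association relationship with the target user in the target domain;
    calling the cross-domain recommendation model to cross-encode the general interest feature, the target domain, and the target business objects to obtain the target-domain interest feature.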
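  3. The method according to claim 2, wherein the cross-domain recommendation model comprises a domain encoder, an object encoder, and a domain-object cross encoder;
    calling the cross-domain recommendation model to cross-encode the multiple domains and the business object set to obtain the general interest feature of the target user comprises:
    obtaining an original domain feature of each domain, and calling the domain encoder to encode all original domain features into a domain feature;
    obtaining an original object feature of each business object in the business object set, and calling the object encoder to encode all original object features into an object feature;
    calling the domain-object cross encoder to cross-encode the domain feature and the object feature into the general interest feature of the target user.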
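  4. The method according to claim 3, wherein the domain encoder comprises an intra-domain domain encoder and an inter-domain domain converter;
    calling the domain encoder to encode all original domain features into the domain feature comprises:
    combining all original domain features into an original domain feature sequence;
    calling the intra-domain domain encoder to perform self-attention recurrent encoding on the original domain feature sequence to obtain a first encoding vector sequence;
    pooling the first encoding vector sequence to obtain a second encoding vector;
    calling the inter-domain domain converter to perform multi-attention encoding on the second encoding vector to obtain the domain feature.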
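  5. The method according to claim 3, wherein the object encoder comprises an intra-domain object encoder;
    calling the cross-domain recommendation model to cross-encode the general interest feature, the target domain, and the target business objects to obtain the target-domain interest feature comprises:
    obtaining a target domain feature of the target domain;
    calling the intra-domain object encoder to encode the target business objects into a target object feature;
    concatenating the target domain feature and the target object feature into a target domain object feature;
    calling the domain-object cross encoder to cross-encode the general interest feature and the target domain object feature into the target-domain interest feature.
  6. The method according to claim 2, wherein there are multiple target business objects; the multiple target business objects are all non-entity target business objects, or the multiple target business objects comprise entity target business objects and non-entity target business objects, and the entity target business objects belong to the multiple business objects to be recommended.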
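  7. The method according to claim 2, further comprising:
    obtaining a sample unit business object set that has an association relationship with a sample user in each domain;
    generating a random domain feature of each domain, and generating a random object feature of each sample business object in the multiple sample unit business object sets;
    calling a sample cross-domain recommendation model to perform cross-domain cross-encoding on all random domain features and all random object features to obtain a predicted interest feature of the sample user in a sample domain, wherein the sample domain is any one of the multiple domains;
    determining a feature similarity between the predicted interest feature and the random object features of the sample domain, and obtaining behavior labels of the random object features of the sample domain;
    determining a prediction error according to the behavior labels and the feature similarity, and training the sample cross-domain recommendation model based on the prediction error to obtain the cross-domain recommendation model.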
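  8. The method according to claim 7, wherein obtaining the sample unit business object set that has an association relationship with the sample user in each domain comprises:
    obtaining a sample knowledge graph triplet set of the sample user, wherein the sample knowledge graph triplet set comprises multiple sample knowledge graph triplets, and any sample knowledge graph triplet comprises the user identifier of the sample user, the domain identifier of any one of the multiple domains, and the object identifier of a sample business object that has been exposed to the sample user in that domain;
    determining, from the sample knowledge graph triplet set, the sample unit business object set that has an association relationship with the sample user in each domain.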
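  9. The method according to claim 7, wherein training the sample cross-domain recommendation model based on the prediction error to obtain the cross-domain recommendation model comprises:
    adjusting, based on the prediction error, the sample cross-domain recommendation model, the random domain feature of each domain, and the random object feature of each sample business object in the multiple sample unit business object sets;
    when the adjusted sample cross-domain recommendation model satisfies a model convergence condition, using the adjusted sample cross-domain recommendation model as the cross-domain recommendation model, using the adjusted random domain feature of each domain as the original domain feature, and using the adjusted random object feature of each sample business object in the multiple sample unit business object sets as the original object feature.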
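  10. The method according to claim 7, wherein calling the sample cross-domain recommendation model to perform cross-domain cross-encoding on all random domain features and all random object features to obtain the predicted interest feature of the sample user in the sample domain comprises:
    calling the sample cross-domain recommendation model to cross-encode all random domain features and all random object features to obtain a sample general interest feature of the sample user;
    determining the sample domain from the multiple domains, and calling the sample cross-domain recommendation model to cross-encode the sample general interest feature, the random domain feature of the sample domain, and the random object features of the sample domain to obtain the predicted interest feature.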
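  11. The method according to claim 1, wherein obtaining the business object set that has an association relationship with the target user in multiple domains comprises:
    obtaining exposed business objects that have been exposed to the target user in each domain;
    if the number of exposed business objects in any domain equals a quantity threshold, combining the exposed business objects of that domain into the unit business object set of that domain;
    if the number of exposed business objects in any domain is less than the quantity threshold, obtaining non-entity business objects, and combining the exposed business objects of that domain and the non-entity business objects into the unit business object set of that domain;
    combining the unit business object sets of all domains into the business object set that has an association relationship with the target user.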
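  12. The method according to claim 1, wherein obtaining, from the multiple to-be-recommended business object features, the target recommended business object feature that matches the target-domain interest feature comprises:
    determining a feature distance between the target-domain interest feature and each to-be-recommended business object feature;
    determining, according to the feature distance, the target recommended business object feature from the multiple to-be-recommended business object features.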
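  13. A data recommendation apparatus, comprising:
    an acquisition module, configured to, in response to a data recommendation request for a target user in a target domain, obtain a business object set that has an association relationship with the target user in multiple domains, wherein the business object set comprises the business objects that have an association relationship with the target user in each domain, and the multiple domains comprise the target domain;
    an encoding module, configured to encode the multiple domains and the business object set to obtain a target-domain interest feature of the target user in the target domain;
    the acquisition module being further configured to obtain to-be-recommended business object features of multiple business objects to be recommended in the target domain;
    a determination module, configured to obtain, from the multiple to-be-recommended business object features, a target recommended business object feature that matches the target-domain interest feature;
    an output module, configured to output the target recommended business object corresponding to the target recommended business object feature.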
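  14. A computer device, comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 12.
  15. A non-volatile computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, perform the method according to any one of claims 1 to 12.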
PCT/CN2021/101674 2020-08-28 2021-06-23 Data recommendation method and apparatus, computer device, and storage medium WO2022041982A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/948,082 US20230017667A1 (en) 2020-08-28 2022-09-19 Data recommendation method and apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010885140.7 2020-08-28
CN202010885140.7A CN112035743B (zh) 2020-08-28 2020-08-28 Data recommendation method and apparatus, computer device, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/948,082 Continuation US20230017667A1 (en) 2020-08-28 2022-09-19 Data recommendation method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022041982A1 (zh) 2022-03-03

Family

ID=73586141

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/101674 WO2022041982A1 (zh) 2020-08-28 2021-06-23 Data recommendation method and apparatus, computer device, and storage medium

Country Status (3)

Country Link
US (1) US20230017667A1 (zh)
CN (1) CN112035743B (zh)
WO (1) WO2022041982A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661994A (zh) * 2022-03-28 2022-06-24 徐勇 Artificial-intelligence-based user interest data processing method, system, and cloud platform

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
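CN112035743B (zh) * 2020-08-28 2021-10-15 腾讯科技(深圳)有限公司 Data recommendation method and apparatus, computer device, and storage medium
CN112528147B (zh) * 2020-12-10 2024-04-30 北京百度网讯科技有限公司 Content recommendation method and apparatus, training method, computing device, and storage medium
CN112989169B (zh) * 2021-02-23 2023-07-25 腾讯科技(深圳)有限公司 Target object identification method, information recommendation method, apparatus, device, and medium
CN112801761A (zh) * 2021-04-13 2021-05-14 中智关爱通(南京)信息科技有限公司 Commodity recommendation method, computing device, and computer-readable storage medium
CN113157972B (zh) * 2021-04-14 2023-09-19 北京达佳互联信息技术有限公司 Method and apparatus for recommending video cover copy, electronic device, and storage medium
CN114757832B (zh) * 2022-06-14 2022-09-30 之江实验室 Face super-resolution method and apparatus based on cross convolutional attention adversarial learning
CN117132367B (zh) * 2023-10-20 2024-02-06 腾讯科技(深圳)有限公司 Service processing method and apparatus, computer device, storage medium, and program product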

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
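CN106897464A (zh) * 2017-03-29 2017-06-27 广东工业大学 Cross-domain recommendation method and system
EP3648041A1 (en) * 2018-10-31 2020-05-06 Amadeus S.A.S. Recommender systems and methods using cascaded machine learning models
CN111191121A (zh) * 2019-12-19 2020-05-22 安徽逻根农业科技有限公司 Big-data-based smart tourism target matching method
CN111368210A (zh) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Artificial-intelligence-based information recommendation method and apparatus, and electronic device
CN111368219A (zh) * 2020-02-27 2020-07-03 广州腾讯科技有限公司 Information recommendation method and apparatus, computer device, and storage medium
CN112035743A (zh) * 2020-08-28 2020-12-04 腾讯科技(深圳)有限公司 Data recommendation method and apparatus, computer device, and storage medium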

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
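CN102208086B (zh) * 2010-03-31 2014-05-14 北京邮电大学 Domain-oriented personalized intelligent recommendation system and implementation method
WO2014153133A1 (en) * 2013-03-18 2014-09-25 The Echo Nest Corporation Cross media recommendation
CN104484431B (zh) * 2014-12-19 2017-07-21 合肥工业大学 Multi-source personalized news web page recommendation method based on domain ontology
CN106951547A (zh) * 2017-03-27 2017-07-14 西安电子科技大学 Cross-domain recommendation method based on cross users
US11604844B2 (en) * 2018-11-05 2023-03-14 Samsung Electronics Co., Ltd. System and method for cross-domain recommendations
CN109544306B (zh) * 2018-11-30 2021-09-21 苏州大学 Cross-domain recommendation method and apparatus based on user behavior sequence features
CN110032684B (zh) * 2019-04-22 2021-08-06 山东大学 Shared-account-based cross-domain parallel sequential information recommendation method, medium, and device
CN110232153A (zh) * 2019-05-29 2019-09-13 华南理工大学 Content-based cross-domain recommendation method
CN110472145B (zh) * 2019-07-25 2022-11-29 维沃移动通信有限公司 Content recommendation method and electronic device
CN111046280B (zh) * 2019-12-02 2023-12-12 哈尔滨工程大学 Cross-domain recommendation method applying FM
CN111159542B (zh) * 2019-12-12 2023-05-05 中国科学院深圳先进技术研究院 Cross-domain sequential recommendation method based on an adaptive fine-tuning strategy
CN111291261B (zh) * 2020-01-21 2023-05-26 江西财经大学 Cross-domain recommendation method fusing tags and an attention mechanism, and implementation system thereof
CN111563205A (zh) * 2020-04-26 2020-08-21 山东师范大学 Self-attention-based cross-domain information recommendation method and system for shared accounts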

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
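CN106897464A (zh) * 2017-03-29 2017-06-27 广东工业大学 Cross-domain recommendation method and system
EP3648041A1 (en) * 2018-10-31 2020-05-06 Amadeus S.A.S. Recommender systems and methods using cascaded machine learning models
CN111191121A (zh) * 2019-12-19 2020-05-22 安徽逻根农业科技有限公司 Big-data-based smart tourism target matching method
CN111368219A (zh) * 2020-02-27 2020-07-03 广州腾讯科技有限公司 Information recommendation method and apparatus, computer device, and storage medium
CN111368210A (zh) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Artificial-intelligence-based information recommendation method and apparatus, and electronic device
CN112035743A (zh) * 2020-08-28 2020-12-04 腾讯科技(深圳)有限公司 Data recommendation method and apparatus, computer device, and storage medium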

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
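CN114661994A (zh) * 2022-03-28 2022-06-24 徐勇 Artificial-intelligence-based user interest data processing method, system, and cloud platform
CN114661994B (zh) * 2022-03-28 2022-10-14 中软数智信息技术(武汉)有限公司 Artificial-intelligence-based user interest data processing method, system, and cloud platform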

Also Published As

Publication number Publication date
US20230017667A1 (en) 2023-01-19
CN112035743B (zh) 2021-10-15
CN112035743A (zh) 2020-12-04

Similar Documents

Publication Publication Date Title
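WO2022041982A1 (zh) Data recommendation method and apparatus, computer device, and storage medium
US20230281448A1 (en) Method and apparatus for information recommendation, electronic device, computer readable storage medium and computer program product
CN109819284B (zh) Short video recommendation method and apparatus, computer device, and storage medium
CN111737582B (zh) Content recommendation method and apparatus
CN107526718B (zh) Method and apparatus for generating text
CN114896454B (zh) Short video data recommendation method and system based on tag analysis
CN113569129A (zh) Click-through rate prediction model processing method, content recommendation method, apparatus, and device
CN112650841A (zh) Information processing method and apparatus, and electronic device
CN113688310A (zh) Content recommendation method, apparatus, device, and storage medium
US11763204B2 (en) Method and apparatus for training item coding model
CN110866040A (zh) User profile generation method, apparatus, and system
CN112650942A (zh) Product recommendation method and apparatus, computer system, and computer-readable storage medium
CN114119123A (zh) Information push method and apparatus
CN113495991A (zh) Recommendation method and apparatus
CN114926234A (zh) Item information push method and apparatus, electronic device, and computer-readable medium
CN115329183A (zh) Data processing method and apparatus, storage medium, and device
CN111626044A (zh) Text generation method and apparatus, electronic device, and computer-readable storage medium
CN111784377A (zh) Method and apparatus for generating information
CN113283115B (zh) Image model generation method and apparatus, and electronic device
CN116501993B (zh) Housing listing data recommendation method and apparatus
CN115455306B (zh) Push model training method, information push method, apparatus, and storage medium
CN114417944B (zh) Recognition model training method and apparatus, and abnormal user behavior identification method and apparatus
CN113496304B (zh) Delivery control method, apparatus, device, and storage medium for network media information
CN116911954A (zh) Method and apparatus for recommending items based on interest and popularity
CN115481347A (zh) Landing page generation method and apparatus, electronic device, medium, and program product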

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859824

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 030723)

122 Ep: pct application non-entry in european phase

Ref document number: 21859824

Country of ref document: EP

Kind code of ref document: A1