WO2022033199A1 - Method for obtaining user portrait and related apparatus - Google Patents

Method for obtaining user portrait and related apparatus

Info

Publication number
WO2022033199A1
WO2022033199A1 (PCT/CN2021/102604)
Authority
WO
WIPO (PCT)
Prior art keywords
user
feature vector
content
feature
tag
Prior art date
Application number
PCT/CN2021/102604
Other languages
English (en)
French (fr)
Inventor
王伟佳
陈鑫
闫肃
张旭
林乐宇
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2022033199A1 publication Critical patent/WO2022033199A1/zh
Priority to US17/898,270 priority Critical patent/US20220405607A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning

Definitions

  • The embodiments of the present application relate to the field of computers, and in particular, to user portrait technology.
  • A personalized recommendation system is one of the core technologies of the Internet; it recommends content of interest to users based on their behavior and interests.
  • A user portrait is the structuring and labeling of user information. By describing data across dimensions such as the user's demographic attributes, social attributes, and interests and preferences, it accurately describes and analyzes all aspects of user information and taps potential value, so as to better improve the effect of personalized recommendation.
  • In the related art, portrait labels are first extracted from the user behavior data and simply counted; the portrait labels of each user are scored according to frequency, that is, the higher the frequency, the higher the score, and the user portrait is then obtained according to the scores of the portrait labels. For cold-start users, due to the lack of user behavior data, the accuracy of user portraits obtained from such portrait label statistics is low, which in turn affects the accuracy of related services based on user portraits.
  • Embodiments of the present application provide a method and apparatus for obtaining a user portrait, which are used to improve the accuracy of the obtained user portrait and further improve the accuracy of content recommendation.
  • an embodiment of the present application provides a method for obtaining a user portrait, the method comprising:
  • the user profile of the target user is determined based on the candidate tags of the target user.
  • an embodiment of the present application provides a training method for a user portrait model, the method comprising:
  • the training samples include the sample multimedia content and the user characteristics of the sample users.
  • Each iterative training includes:
  • the parameters of the user portrait model to be trained are adjusted.
  • an embodiment of the present application provides a device for obtaining a user portrait, the device comprising:
  • a first feature extraction module, configured to determine the user feature vector of the target user according to the attribute information and historical behavior data of the target user;
  • a second feature extraction module, configured to obtain the tag feature vector of the content tag of the multimedia content in the target application;
  • a matching module configured to determine the candidate tag of the target user from the content tag of the multimedia content according to the similarity between the user feature vector and the tag feature vector;
  • a processing module configured to determine the user portrait of the target user based on the candidate tag of the target user.
  • an embodiment of the present application provides a training device for a user portrait model, and the device includes:
  • the model training module is used to perform multiple iterative training using the user portrait model to be trained and the training samples to obtain the user portrait model.
  • the training samples include the sample multimedia content and the user characteristics of the sample users, and each iterative training includes:
  • the parameters of the user portrait model to be trained are adjusted.
  • an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method of obtaining a user portrait when executing the program.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program executable by a computer device, and when the program runs on the computer device, causes the computer device to execute the above-mentioned method of obtaining a user portrait.
  • In the embodiments of the present application, the user feature vector of the target user is determined according to the attribute information and historical behavior data of the target user.
  • The user feature vector represents not only the user's historical behavior and attributes but also the user's preferences inferred from that behavior and those attributes. Therefore, compared with a user portrait obtained from tag statistics, the candidate tags determined according to the similarity between the target user's user feature vector and the tag feature vectors better represent the user's preferences, thereby improving the accuracy of the obtained user portrait.
  • Moreover, the candidate tags are not only the tags in the historical behavior data of the target user but also tags outside the historical behavior data, which improves generalization, expands the interests of the target user, makes the obtained user portrait more comprehensive and accurate, and improves the accuracy of content recommendation.
  • FIG. 1 is a system architecture diagram provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a method for obtaining a user portrait provided by an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a method for obtaining a user feature vector according to an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a method for obtaining a user feature vector according to an embodiment of the present application
  • FIG. 5 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a method for obtaining a content feature vector according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for obtaining a content feature vector according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a content recommendation page provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a content recommendation page provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a content recommendation page provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of an apparatus for obtaining a user portrait provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a content recommendation apparatus provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of a training device for a user portrait model provided by an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • In the embodiments of the present application, the user feature vector of the target user and the tag feature vectors of the content tags of the multimedia content in the target application are determined by artificial intelligence technology; based on the user feature vector and the tag feature vectors, the candidate tags of the target user are determined, and the user portrait of the target user is then determined according to the candidate tags.
  • the user feature vector of the target user and the tag feature vector of the content tag of the multimedia content in the target application are determined by a specific machine learning model or algorithm in the artificial intelligence technology.
  • The attention mechanism imitates the internal process of biological observation behavior, that is, a mechanism that aligns internal experience with external sensation to increase the fineness of observation in certain regions. Simply put, it quickly filters out high-value information from a large amount of information. The mechanism has two main aspects: deciding which part of the input needs attention, and allocating limited information-processing resources to the important parts.
  • the attention-based mechanism enables the neural network to focus on a subset of its inputs (or features) and select specific inputs.
  • the user features of the target user in multiple feature domains are fused based on the attention mechanism to determine the user feature vector of the target user.
  • User portrait is a labelled user model abstracted from information such as user social attributes, living habits and consumption behavior.
  • the core work of constructing user portraits is to “label” users, and labels are highly refined feature identifiers obtained by analyzing user information.
  • In the related art, portrait labels are first extracted from the user behavior data and simply counted; the portrait labels of each user are scored according to frequency, that is, the higher the frequency, the higher the score, and the user portrait is then obtained according to the scores of the portrait labels.
  • For a user with little behavior data, a portrait label may appear only once or twice, so when portrait labels are scored by frequency, the scores are not representative.
  • As a result, the user portrait obtained by scoring portrait labels is less accurate, and user labels outside the user behavior data cannot be obtained.
  • content that the user does not like may be recommended, thereby affecting user experience.
  • To this end, an embodiment of the present application provides a method for obtaining a user portrait. The method comprises: determining the user feature vector of the target user according to the attribute information and historical behavior data of the target user, and obtaining the tag feature vectors of the content tags of the multimedia content in the target application; then, according to the similarity between the user feature vector and the tag feature vectors, determining the candidate tags of the target user from the content tags of the multimedia content; and determining the user portrait of the target user based on the candidate tags of the target user.
  • this method can more comprehensively characterize user preferences, thereby improving the accuracy of the obtained user labels, which in turn improves the accuracy of the obtained user portraits.
  • The obtained candidate tags are not only the tags in the historical behavior data of the target user but also tags outside the historical behavior data, which improves the generalization ability, expands the interests of the target user, and makes the obtained user portrait more comprehensive and accurate, thereby improving the accuracy of content recommendation.
  • Scenario 1: document recommendation.
  • When recommending a document to a target user, the content recommendation device first obtains the attribute information and historical behavior data of the target user. The attribute information of the target user includes gender, age, location, etc. The historical behavior data includes the historical behavior data of the target user in the target application and/or in applications other than the target application, such as the subject, the category, and the content tags of the documents clicked by the target user in the target application and/or in those other applications.
  • The tag feature vectors of the content tags of multiple documents in the target application are obtained; the candidate tags of the target user are then determined from the content tags of the multiple documents according to the similarity between the user feature vector of the target user and the tag feature vectors, and the user portrait of the target user is determined based on the candidate tags. Then, according to the user portrait, documents in the target application are recommended to the target user.
  • Scenario 2: advertisement recommendation.
  • When recommending an advertisement to a target user, the content recommendation device first obtains the attribute information and historical behavior data of the target user. The attribute information of the target user includes gender, age, location, etc. The historical behavior data includes the historical behavior data of the target user in the target application and/or in applications other than the target application, such as the subject, the category, and the content tags of the advertisements clicked by the target user in the target application and/or in those other applications.
  • The tag feature vectors of the content tags of multiple advertisements in the target application are obtained; the candidate tags of the target user are then determined from the content tags of the multiple advertisements according to the similarity between the user feature vector of the target user and the tag feature vectors, and the user portrait of the target user is determined based on the candidate tags. Then, according to the user portrait, advertisements in the target application are recommended to the target user.
  • The method for obtaining user portraits in the embodiments of the present application is not limited to the above two implementation scenarios; it may also be applied to scenarios such as audio recommendation, video recommendation, product recommendation, takeaway information recommendation, reading recommendation, news recommendation, and content recommendation in mini programs, which is not specifically limited in this application.
  • FIG. 1 is a system architecture diagram of a method for obtaining a user portrait provided by an embodiment of the present application.
  • the architecture includes at least a terminal device 101 and a server 102 .
  • a target application may be installed in the terminal device 101, where the target application may be a client application, a web version application, a small program application, or the like.
  • the attribute information of the target user can be obtained from the registration information of the target user in the target application, and the historical behavior data of the target user can be obtained from the historical records of the target application and/or other applications other than the target application.
  • the terminal device 101 may include one or more processors 1011, a memory 1012, an I/O interface 1013 for interacting with the event-tracking server 103, a display panel 1014, and the like.
  • the terminal device 101 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the server 102 may be a background server of the target application, providing corresponding services for the target application.
  • the server 102 may include one or more processors 1021, a memory 1022, an I/O interface 1023 for interacting with the terminal device 101, and the like. Additionally, the server 102 may also be configured with a database 1024.
  • the server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
  • the terminal device 101 and the server 102 may be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
  • the device for obtaining the user portrait may be the terminal device 101 or the server 102 .
  • the device for obtaining the user portrait is the terminal device 101 .
  • the terminal device 101 obtains the attribute information and historical behavior data of the target user from the server 102, and then determines the user feature vector of the target user according to the attribute information and historical behavior data of the target user.
  • the terminal device 101 obtains the tag feature vector of the content tag of the multimedia content in the target application, and then determines the candidate tag of the target user from the content tag of the multimedia content according to the similarity between the user feature vector of the target user and the tag feature vector. Then, the user portrait of the target user is determined based on the candidate tags of the target user.
  • the target application acquires the multimedia content recommended to the target user from the server 102 according to the user portrait of the target user and displays it.
  • the device that obtains the user portrait is the server 102 .
  • the server 102 determines the user feature vector of the target user according to the attribute information of the target user and the historical behavior data, and obtains the tag feature vector of the content tag of the multimedia content in the target application. Then, according to the similarity between the user feature vector of the target user and the tag feature vector, the candidate tag of the target user is determined from the content tags of the multimedia content. Then, the user portrait of the target user is determined based on the candidate tags of the target user.
  • the target application sends a content recommendation request to the server 102 through the terminal device 101 .
  • the server 102 obtains the multimedia content recommended to the target user from the database according to the user portrait of the target user and sends it to the terminal device 101; the terminal device 101 then displays the recommended multimedia content in the target application.
  • an embodiment of the present application provides a flow of a method for obtaining a user portrait.
  • the flow of the method may be executed by a computer device, and the computer device may be the terminal device 101 or the server 102 shown in FIG. 1.
  • Step S201: Determine the user feature vector of the target user according to the attribute information and historical behavior data of the target user.
  • the attribute information of the target user can be obtained from the registration information of the target user in the target application, and the attribute information of the target user includes at least two types of information:
  • the first category is the numerical category, that is, information described by numbers, such as age, date of birth, account registration time, etc.
  • the second category is the text category, that is, the information described by text.
  • the gender can be male or female, and the location can be Beijing, Shanghai and other places.
  • the historical behavior data of the target user includes the historical behavior data of the target user in the target application, and/or the historical behavior data of the target user in other applications than the target application.
  • Behavior data includes operation events and the attribute information of operation objects. Operation events can be clicks, browsing, favorites, comments, etc.; the attribute information of an operation object can be its topic, category, tags, and so on.
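As an illustration only, a single entry of such behavior data could be modeled as follows; the field names (`event`, `topic`, `category`, `tags`) are hypothetical and not specified by the application:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BehaviorRecord:
    """One entry of historical behavior data: an operation event plus
    attribute information of the operated object (illustrative schema)."""
    event: str                  # e.g. "click", "browse", "favorite", "comment"
    topic: str                  # subject of the operated content
    category: str               # category of the operated content
    tags: List[str] = field(default_factory=list)  # content tags of the object

# Example: the target user clicked a piece of football news.
record = BehaviorRecord(event="click", topic="football match", category="sports",
                        tags=["sports", "football", "XX football team"])
```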
  • Step S202: Acquire the tag feature vectors of the content tags of the multimedia content in the target application.
  • the multimedia content may be text information, audio, video, etc.
  • One multimedia content may correspond to one or more content tags.
  • the content tags corresponding to a piece of news about a football match include: sports, football, XX football team, etc.
  • the number of multimedia contents may be one or multiple, which is not limited in this embodiment.
  • Step S203: Determine the candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors.
  • In a possible implementation, a similarity threshold may be preset; the similarity between the user feature vector of the target user and each tag feature vector is calculated, tag feature vectors whose similarity is greater than the similarity threshold are determined as matching tag vectors, and the content tags corresponding to the matching tag vectors are determined as the candidate tags of the target user.
  • In another possible implementation, the tag feature vectors are sorted in descending order of similarity, and the content tags corresponding to the top P tag feature vectors are determined as the candidate tags of the target user, where P is the threshold of the number of candidate tags.
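Both selection strategies (similarity threshold and top-P truncation) can be sketched in a few lines. This is an illustrative helper, not the application's implementation, and cosine similarity is only one possible similarity measure:

```python
import numpy as np

def candidate_tags(user_vec, tag_vecs, tag_names, threshold=None, top_p=None):
    """Select candidate tags by cosine similarity between the user feature
    vector and each tag feature vector (illustrative sketch)."""
    user_vec = np.asarray(user_vec, dtype=float)
    tag_vecs = np.asarray(tag_vecs, dtype=float)
    sims = tag_vecs @ user_vec / (
        np.linalg.norm(tag_vecs, axis=1) * np.linalg.norm(user_vec) + 1e-12)
    order = np.argsort(-sims)          # sort by descending similarity
    if top_p is not None:
        order = order[:top_p]          # keep at most P tags
    return [(tag_names[i], float(sims[i])) for i in order
            if threshold is None or sims[i] > threshold]

# A user vector close to the "football" tag vector picks football-related tags.
picks = candidate_tags([1.0, 0.0],
                       [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]],
                       ["football", "cooking", "sports"],
                       threshold=0.5)
```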
  • the multimedia content used in this embodiment may be all the multimedia content in the target application, or may be part of the multimedia content in the target application.
  • Step S204: Determine the user portrait of the target user based on the candidate tags of the target user.
  • the duplicate labels can be removed.
  • the upper limit of the number of tags can be set in advance.
  • In a possible implementation, the similarity between the user feature vector of the target user and each tag feature vector can be calculated; the candidate tags and the existing tags of the target user are then sorted in descending order of similarity, and the top N tags are kept, where N is the upper limit of the number of tags.
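A minimal sketch of this merge, deduplicate, and truncate step, assuming similarity scores have already been computed for every tag (the function and argument names are illustrative):

```python
def merge_tags(existing, candidates, scores, n_max):
    """Merge a user's existing tags with new candidate tags, remove
    duplicates, and keep at most n_max tags ranked by similarity score
    to the user feature vector (illustrative sketch)."""
    merged = []
    # Union of both tag sets, sorted by descending similarity score.
    for tag in sorted(set(existing) | set(candidates),
                      key=lambda t: -scores.get(t, 0.0)):
        if tag not in merged:
            merged.append(tag)
    return merged[:n_max]        # N = upper limit of the number of tags
```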
  • In the above solution, the user feature vector of the target user and the tag feature vectors of the content tags of the multimedia content in the target application are obtained; the candidate tags of the target user are then determined according to the similarity between the user feature vector and the tag feature vectors, and the user portrait of the target user is determined based on the candidate tags.
  • user feature vectors can more comprehensively characterize user preferences, thereby improving the accuracy of the obtained user labels and thus improving the accuracy of the obtained user portraits.
  • The obtained candidate tags are not only the tags in the historical behavior data of the target user but also tags outside the historical behavior data, which improves the generalization ability, expands the interests of the target user, and makes the obtained user portrait more comprehensive and accurate, thereby improving the accuracy of content recommendation.
  • In step S201, when obtaining the user feature vector of the target user, the user features of the target user in multiple feature domains are first determined according to the attribute information and historical behavior data of the target user; then, through the user portrait model, the feature vector of the user features in each feature domain is extracted, and hierarchical embedding processing is performed on these feature vectors to determine the user feature vector of the target user.
  • the feature domain is a feature dimension that characterizes user features, and the user features in each feature domain may be completely different or partially the same.
  • User characteristics may specifically be attribute information such as gender, age, address, and position, or may be information such as tags, categories, and topics obtained from historical behavior data.
  • the user portrait model is trained based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content.
  • The content feature vector of the sample multimedia content is obtained by performing hierarchical embedding processing on the tag feature vectors of the content tags of the sample multimedia content; the user feature vector of the sample user is obtained by performing hierarchical embedding processing on the feature vectors of the user features of the sample user.
  • the user portrait model can be a deep neural network model (Deep Neural Network, DNN for short), a Transformer model, or other models.
  • the attribute information of the target user includes gender, age, address, and position.
  • the historical behavior data of the target user is the historical behavior data of the target user in applications other than the target application, specifically the historical behavior data of the target user in video application A, in audio application B, and in shopping application C.
  • In one example, seven feature domains are preset, which are respectively the first feature domain to the seventh feature domain, wherein gender is the user feature in the first feature domain, age is the user feature in the second feature domain, the address is the user feature in the third feature domain, and the position is the user feature in the fourth feature domain;
  • information such as tags, categories, and topics obtained from the historical behavior data of video application A serves as the user features in the fifth feature domain;
  • information such as tags, categories, and topics obtained from the historical behavior data of audio application B serves as the user features in the sixth feature domain;
  • information such as tags, categories, and topics obtained from the historical behavior data of shopping application C serves as the user features in the seventh feature domain.
  • In another example, 5 feature domains are preset, which are the first feature domain to the fifth feature domain, wherein gender is the user feature in the first feature domain, age is the user feature in the second feature domain, and the address is the user feature in the third feature domain.
  • In another example, 4 feature domains are preset, which are the first feature domain to the fourth feature domain, wherein gender, age, address, and position are the user features in the first feature domain; information such as tags, categories, and topics obtained from the historical behavior data of video application A serves as the user features in the second feature domain; information such as tags, categories, and topics obtained from the historical behavior data of audio application B serves as the user features in the third feature domain; and information such as tags, categories, and topics obtained from the historical behavior data of shopping application C serves as the user features in the fourth feature domain.
  • In yet another example, two feature domains are preset, namely the first feature domain and the second feature domain, wherein gender, age, address, and position are the user features in the first feature domain, and information such as tags, categories, and topics obtained from the historical behavior data of video application A, audio application B, and shopping application C serves as the user features in the second feature domain.
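The seven-domain example above can be written down as a simple mapping; all concrete values below are hypothetical, for illustration only:

```python
# Hypothetical raw inputs for one target user.
attribute_info = {"gender": "female", "age": 28,
                  "address": "Beijing", "position": "engineer"}
history = {
    "video_app_A": {"tags": ["football"], "categories": ["sports"], "topics": ["World Cup"]},
    "audio_app_B": {"tags": ["podcast"], "categories": ["talk"], "topics": ["tech"]},
    "shopping_app_C": {"tags": ["sneakers"], "categories": ["shoes"], "topics": ["sale"]},
}

# Partition into the seven preset feature domains described in the text.
feature_domains = {
    1: {"gender": attribute_info["gender"]},
    2: {"age": attribute_info["age"]},
    3: {"address": attribute_info["address"]},
    4: {"position": attribute_info["position"]},
    5: history["video_app_A"],     # tags/categories/topics from video app A
    6: history["audio_app_B"],     # tags/categories/topics from audio app B
    7: history["shopping_app_C"],  # tags/categories/topics from shopping app C
}
```

Coarser partitions (five, four, or two domains) simply regroup the same raw inputs into fewer keys.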
  • feature extraction may also be performed directly on the attribute information and historical behavior data of the target user to determine the user feature vector of the target user, which is not specifically limited in this application.
  • In the above manner, the user features of the target user in multiple feature domains are determined according to the attribute information and historical behavior data of the target user, and the user is characterized from multiple dimensions, thereby improving the accuracy of the user feature vector determined based on these user features.
  • when performing hierarchical embedding processing on the feature vectors of the user features in each feature domain to determine the user feature vector of the target user, the embodiments of this application include at least the following implementations:
  • Embodiment 1: the feature vectors of the user features in each feature domain are fused to obtain the intra-domain feature vector of each feature domain, and then the intra-domain feature vectors of the multiple feature domains are fused to obtain the user feature vector of the target user.
  • the feature vectors of the user features in each feature domain can be weighted and summed to obtain the intra-domain feature vector of each feature domain, which conforms to the following formula (1):

  v_t = Σ_{x=1}^{H} α_x · e_x   (1)

  • where e_x is the x-th feature vector of a user feature in feature domain t, α_x is the weight of that feature vector during intra-domain fusion, and H is the upper limit of the number of feature vectors in the feature domain; the upper limit of the number of feature vectors in different feature domains can be different.
  • the intra-domain feature vectors of the N feature domains are then weighted and summed during inter-domain fusion to obtain the user feature vector of the target user:

  u = Σ_{t=1}^{N} α_t · v_t,   with   α_t = exp(z^T · tanh(W_t · v_t + b_t)) / Σ_{t'=1}^{N} exp(z^T · tanh(W_{t'} · v_{t'} + b_{t'}))

  • where α_t is the weight of the intra-domain feature vector of feature domain t during inter-domain fusion, z is the semantic vector during inter-domain fusion, W_t is the spatial transformation matrix of feature domain t, b_t is the bias vector, and N is the number of feature domains.
  • the weights in the inter-domain fusion are learned by using the attention mechanism when training the user portrait model.
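  • The attention-weighted intra-domain fusion described above can be sketched as follows; this is a minimal sketch which assumes the attention weights take a softmax-of-scores form built from the spatial transformation matrix W, the bias vector b and the semantic vector z named in the formulas (the exact score function is an assumption, as is every variable name here).

```python
import numpy as np

def attention_weights(vecs, W, b, z):
    # Score each vector with the spatial transform W, bias b and semantic
    # vector z, then normalize with a softmax -- the assumed form of the
    # attention-learned fusion weights.
    scores = np.tanh(vecs @ W.T + b) @ z
    e = np.exp(scores - scores.max())          # numerically stable softmax
    return e / e.sum()

def fuse(vecs, W, b, z):
    # Weighted summation of the feature vectors in one feature domain,
    # in the spirit of formula (1).
    return attention_weights(vecs, W, b, z) @ vecs

rng = np.random.default_rng(0)
d, H = 8, 3                                    # embedding size, vectors in the domain
vecs = rng.normal(size=(H, d))                 # feature vectors of one feature domain
W = rng.normal(size=(d, d))                    # spatial transformation matrix
b = rng.normal(size=d)                         # bias vector
z = rng.normal(size=d)                         # semantic vector
domain_vec = fuse(vecs, W, b, z)               # intra-domain feature vector
```

  • The same fusion step can then be reused across domains for the inter-domain stage, with one set of W, b learned per feature domain.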
  • the method of performing intra-domain fusion and inter-domain fusion on the feature vectors of the user features is not limited to the above weighted summation; direct addition may also be used, or one of intra-domain fusion and inter-domain fusion may adopt weighted summation while the other adopts direct addition, which is not specifically limited in this application.
  • five feature domains are preset, which are the first feature domain to the fifth feature domain, wherein gender is the user feature in the first feature domain, age is the user feature in the second feature domain, and the position is the user feature in the third feature domain; tags, categories and topics obtained from the historical behavior data of video application A serve as the user features in the fourth feature domain, and tags, categories and topics obtained from the historical behavior data of audio application B serve as the user features in the fifth feature domain.
  • the feature vector of the user features in each feature domain is extracted, wherein the feature vector in the first feature domain is the gender feature vector, the feature vector in the second feature domain is the age feature vector, the feature vector in the third feature domain is the position feature vector, the feature vectors in the fourth feature domain include the label feature vector, category feature vector and topic feature vector, and the feature vectors in the fifth feature domain likewise include the label feature vector, category feature vector and topic feature vector.
  • since the first feature domain to the third feature domain each contain only one feature vector, intra-domain fusion may not be performed for them.
  • the label feature vector, category feature vector and topic feature vector in the fourth feature domain are weighted and summed using the above formula (1) to obtain the intra-domain feature vector of the fourth feature domain, and the label feature vector, category feature vector and topic feature vector in the fifth feature domain are weighted and summed to obtain the intra-domain feature vector of the fifth feature domain.
  • the gender feature vector of the first feature domain, the age feature vector of the second feature domain, the position feature vector of the third feature domain, the intra-domain feature vector of the fourth feature domain, and the intra-domain feature vector of the fifth feature domain are then fused between domains to obtain the user feature vector of the target user.
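  • The hierarchical (intra-domain then inter-domain) fusion in this five-domain example can be sketched as below; uniform weights stand in for the attention-learned ones, and the dimensions and domain contents are illustrative assumptions.

```python
import numpy as np

def weighted_sum(vecs, weights=None):
    # Intra-/inter-domain fusion by weighted summation; uniform weights are
    # a placeholder for the weights learned via the attention mechanism.
    vecs = np.asarray(vecs, dtype=float)
    if weights is None:
        weights = np.full(len(vecs), 1.0 / len(vecs))
    return np.asarray(weights) @ vecs

d = 16
rng = np.random.default_rng(1)
gender_vec, age_vec, position_vec = rng.normal(size=(3, d))      # domains 1-3
domain4 = weighted_sum(rng.normal(size=(3, d)))  # tag/category/topic of video app A
domain5 = weighted_sum(rng.normal(size=(3, d)))  # tag/category/topic of audio app B
# Inter-domain fusion of the five domains yields the user feature vector.
user_vec = weighted_sum([gender_vec, age_vec, position_vec, domain4, domain5])
```

  • Replacing `weighted_sum` with plain summation gives the direct-addition variant mentioned above.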
  • Embodiment 2: the feature vectors of the user features of the target user in multiple feature domains are fused to obtain the user feature vector of the target user.
  • a weighted summation method can be used to fuse the feature vectors of the user features of the target user in multiple feature domains to obtain the user feature vector of the target user, where the weight of each feature vector can be learned by using the attention mechanism when training the user portrait model; alternatively, the feature vectors of the user features of the target user in multiple feature domains can be fused by direct addition to obtain the user feature vector of the target user.
  • five feature domains are preset, which are the first feature domain to the fifth feature domain, wherein gender is the user feature in the first feature domain, age is the user feature in the second feature domain, and the position is the user feature in the third feature domain; tags, categories and topics obtained from the historical behavior data of video application A serve as the user features in the fourth feature domain, and tags, categories and topics obtained from the historical behavior data of audio application B serve as the user features in the fifth feature domain.
  • the feature vector of the user features in each feature domain is extracted, wherein the feature vector in the first feature domain is the gender feature vector, the feature vector in the second feature domain is the age feature vector, the feature vector in the third feature domain is the position feature vector, the feature vectors in the fourth feature domain include the label feature vector, category feature vector and topic feature vector, and the feature vectors in the fifth feature domain likewise include the label feature vector, category feature vector and topic feature vector.
  • the gender feature vector in the first feature domain, the age feature vector in the second feature domain, the position feature vector in the third feature domain, the label feature vector, category feature vector and topic feature vector in the fourth feature domain, and the label feature vector, category feature vector and topic feature vector in the fifth feature domain are weighted and summed to obtain the user feature vector of the target user.
  • the weight corresponding to each feature vector is learned by using the attention mechanism when training the user portrait model.
  • in this way, the obtained user feature vector can more comprehensively characterize the user features, thereby effectively improving the accuracy of matching user tags based on the user feature vector.
  • in step S202, when obtaining the tag feature vectors of the content tags of the multimedia content, the content tags of each multimedia content in the target application in multiple tag domains are first determined, and then the tag feature vector of the content tags in each tag domain is extracted through the user portrait model.
  • a tag domain is a tag dimension representing the multimedia content; different tag domains represent different tag dimensions, and the content tags in different tag domains may be completely different or partially the same.
  • the tag field may be a content tag field, a category tag field, a topic tag field, an official account tag field, and the like.
  • the content tags in each tag domain are embedded through the user portrait model, and the tag feature vector of the content tags in each tag domain is obtained.
  • five tag fields are preset, which are a content tag field, a first-level category tag field, a second-level category tag field, a topic tag field, and an official account tag field.
  • the content tags acquired from this piece of news include: sports, football, M team, and N team, and the acquired content tags are used as tags in the content tag field.
  • the first-level category corresponding to this piece of news is sports, and the content tag "sports" is used as a tag in the first-level category tag field.
  • the secondary category corresponding to this piece of news is football, and the content tag "football" is used as the tag in the secondary category tag field; since the content of the news is a football match, the content tag "football" is also used as the tag in the topic tag field.
  • the news comes from the Q sports official account, and the Q sports official account is used as the label in the official account label field.
  • the content tags in each tag field can also be determined in the same manner for other multimedia contents, which will not be repeated here.
  • multiple tag fields are preset to represent tags in the multimedia content, which facilitates subsequent matching of content tags of multiple dimensions for target users, thereby improving the accuracy of user portraits.
  • the implementation of the tag domain division is not limited to the above example, and may also be a combination of some of the tag domains among the content tag domain, the category tag domain, the topic tag domain, and the official account tag domain, which is not specifically limited in this application.
  • the candidate label of the target user is determined in the following manner:
  • the similarity between the user feature vector of the target user and the tag feature vector of a content tag may be the dot product value, the Euclidean distance, the cosine similarity between the user feature vector and the tag feature vector, and the like.
  • the similarity threshold is preset, and the same similarity threshold may be set for different tag domains, or different similarity thresholds may be set, which is not specifically limited in this application.
  • for each tag domain, the similarity between the user feature vector of the target user and the tag feature vector of each content tag in the tag domain is determined, and the content tags in the tag domain whose similarity is greater than the similarity threshold are used as candidate tags of the target user.
  • the label quantity threshold is preset, and different label domains can be set with the same label quantity threshold, or different label quantity thresholds, which are not specifically limited in this application.
  • for each tag domain, the similarity between the user feature vector of the target user and the tag feature vector of each content tag in the tag domain is determined, the content tags in the tag domain are sorted in descending order of similarity, and the top W content tags are used as candidate tags of the target user, where W is the tag quantity threshold corresponding to the tag domain.
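  • The top-W selection for one tag domain can be sketched as follows, using cosine similarity as the example metric; the tag names and dimensions are illustrative assumptions.

```python
import numpy as np

def candidate_tags(user_vec, tag_vecs, tag_names, top_w):
    # Cosine similarity between the user feature vector and each content
    # tag's feature vector in one tag domain; keep the top-W most similar
    # tags as candidate tags.
    u = user_vec / np.linalg.norm(user_vec)
    t = tag_vecs / np.linalg.norm(tag_vecs, axis=1, keepdims=True)
    sims = t @ u
    order = np.argsort(-sims)[:top_w]          # descending similarity
    return [tag_names[i] for i in order]

rng = np.random.default_rng(2)
tag_names = ["sports", "football", "M team", "N team"]  # content tag domain
tag_vecs = rng.normal(size=(4, 6))
user_vec = rng.normal(size=6)
picked = candidate_tags(user_vec, tag_vecs, tag_names, top_w=2)
```

  • The dot product or negative Euclidean distance could be substituted for the cosine similarity, and a similarity threshold applied instead of (or in addition to) the top-W cutoff.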
  • multiple tag domains are preset to represent the content tags in the multimedia content, and the candidate tags of the target user are obtained from the content tags of each tag domain based on the similarity between the user feature vector of the target user and the tag feature vectors of the content tags in that tag domain; the obtained candidate tags are therefore also multi-dimensional, so that the obtained user portrait is more comprehensive, and more accurate content can subsequently be recommended to users based on the multi-dimensional user portrait.
  • the training process can be performed by a computer device.
  • the computer device can be the terminal device 101 or the server 102 shown in FIG. 1 , and specifically includes the following steps:
  • the training samples include the sample multimedia content and the user characteristics of the sample users.
  • Each iterative training includes:
  • a feature vector of the user features of the sample users and a tag feature vector of the content tags of the sample multimedia content are extracted; hierarchical embedding processing is then performed on the feature vectors of the user features of the sample users to obtain the user feature vectors of the sample users, and on the tag feature vectors to obtain the content feature vectors of the sample multimedia content; the parameters of the user portrait model to be trained are then adjusted based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content.
  • the structure and training methods of the user portrait model include at least the following:
  • the user portrait model includes a first sub-model, a second sub-model, and a prediction layer, wherein the first sub-model includes a first input layer, a first intra-domain fusion layer, and a first inter-domain fusion layer, and the second sub-model includes a second input layer, a second intra-domain fusion layer, and a second inter-domain fusion layer.
  • when training the user portrait model, for the first sub-model, the user features of the sample users in multiple feature domains are first determined according to the attribute information and historical behavior data of the sample users, and then the user features of the sample users in the multiple feature domains are input into the first sub-model to be trained through the first input layer.
  • the first input layer performs feature extraction on the user features of the sample users in each feature domain, obtains feature vectors of the user features in each feature domain, and inputs the feature vectors of the user features into the first domain fusion layer.
  • the first intra-domain fusion layer fuses the feature vectors of user features in each feature domain to obtain the intra-domain feature vectors of each feature domain, and inputs the intra-domain feature vectors of each feature domain into the first inter-domain fusion layer.
  • the first inter-domain fusion layer fuses the intra-domain feature vectors of multiple feature domains to obtain the user feature vector of the sample user, and then inputs the user feature vector of the sample user into the prediction layer.
  • for both intra-domain fusion and inter-domain fusion, methods such as weighted summation or direct addition may be used.
  • for the second sub-model, the content tags of the sample multimedia content in the target application in multiple tag domains are first determined, and then the content tags of the sample multimedia content in the multiple tag domains are input into the second sub-model to be trained through the second input layer.
  • the second input layer extracts the label feature vectors of the content labels in each label domain, and then inputs the label feature vectors of the content labels in each label domain into the second intra-domain fusion layer.
  • the second intra-domain fusion layer fuses the label feature vectors of the content labels in each label domain to obtain the intra-domain label vector of each label domain, and then inputs the intra-domain label vector of each label domain into the second inter-domain fusion layer.
  • the second inter-domain fusion layer fuses the intra-domain label vectors of multiple label domains to obtain the content feature vector of the sample multimedia content, and then inputs the content feature vector of the sample multimedia content into the prediction layer.
  • for both intra-domain fusion and inter-domain fusion, methods such as weighted summation or direct addition may be used.
  • the prediction layer is used to predict the degree of association between the sample user and the sample multimedia content in the target application; for example, the prediction layer can calculate the dot product value, the Euclidean distance, or the cosine similarity between the user feature vector and the content feature vector to determine the degree of association between the sample user and the sample multimedia content in the target application.
  • cross entropy is used to define the loss function, and Adaptive Moment Estimation (Adam) is used to optimize the loss function; when the loss function meets the preset conditions, the training ends.
  • the specific loss function is shown in formula (5):

  L = − Σ_{k=1}^{K} [ ŷ_k · log y_k + (1 − ŷ_k) · log(1 − y_k) ]   (5)

  • where y_k is the degree of association between the k-th sample multimedia content and the sample user as predicted by the user portrait model (0 < y_k < 1), ŷ_k is the actual degree of association between the k-th sample multimedia content and the sample user (ŷ_k is 0 or 1), and K is the number of sample multimedia contents.
  • the following formula (6) can be used to determine the predicted degree of association y_k between the k-th sample multimedia content and the sample user:

  y_k = σ(u · v_k) = 1 / (1 + exp(−u · v_k))   (6)

  • where u is the user feature vector of the sample user and v_k is the content feature vector of the k-th sample multimedia content.
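  • The prediction and loss computations of formulas (5) and (6) can be sketched as follows; the sigmoid-of-dot-product form of the prediction is an assumption consistent with 0 < y_k < 1, and all concrete vectors are illustrative.

```python
import numpy as np

def predict_association(u, V):
    # Assumed form of formula (6): squash the user/content dot product into
    # (0, 1) to obtain the predicted degree of association y_k.
    return 1.0 / (1.0 + np.exp(-(V @ u)))

def cross_entropy_loss(y_pred, y_true, eps=1e-12):
    # Formula (5): binary cross entropy over the K sample multimedia contents.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.sum(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

u = np.array([0.5, -0.2, 1.0])                 # user feature vector of the sample user
V = np.array([[1.0, 0.0, 1.0],                 # content feature vectors v_k
              [-1.0, 0.5, -1.0]])
y_pred = predict_association(u, V)             # one y_k per sample content
loss = cross_entropy_loss(y_pred, np.array([1.0, 0.0]))  # actual associations
```

  • In training, `loss` would be minimized with an Adam-style optimizer, back-propagating through both sub-models so that the fusion weights are learned jointly.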
  • the user portrait model includes a first sub-model, a second sub-model, and a prediction layer, wherein the first sub-model includes a first input layer and a first fusion layer, and the second sub-model includes a second input layer and a second fusion layer.
  • when training the user portrait model, for the first sub-model, the user features of the sample users in multiple feature domains are first determined according to the attribute information and historical behavior data of the sample users, and then the user features of the sample users in the multiple feature domains are input into the first sub-model to be trained through the first input layer.
  • the first input layer performs feature extraction on the user features of the sample users in each feature domain, obtains feature vectors of the user features in each feature domain, and inputs the feature vectors of the user features into the first fusion layer.
  • the first fusion layer fuses the feature vectors of user features in multiple feature domains to obtain the user feature vectors of the sample users, and then inputs the user feature vectors of the sample users into the prediction layer. Fusion can use either weighted summation or direct addition.
  • for the second sub-model, the content tags of the sample multimedia content in the target application in multiple tag domains are first determined, and then the content tags of the sample multimedia content in the multiple tag domains are input into the second sub-model to be trained through the second input layer.
  • the second input layer extracts the label feature vectors of the content labels in each label domain, and then inputs the label feature vectors of the content labels in each label domain into the second fusion layer.
  • the second fusion layer fuses the label feature vectors of the content labels of multiple label domains to obtain the content feature vector of the sample multimedia content, and then inputs the content feature vector of the sample multimedia content into the prediction layer. Fusion can use either weighted summation or direct addition.
  • the prediction layer is used to predict the degree of association between the sample user and the sample multimedia content in the target application; for example, the prediction layer can calculate the dot product value, the Euclidean distance, or the cosine similarity between the user feature vector and the content feature vector to determine the degree of association between the sample user and the sample multimedia content in the target application.
  • cross entropy is used to define the loss function, and Adam is used to optimize the loss function; the specific loss function is shown in formula (5), and when the loss function satisfies the preset conditions, the training ends.
  • the user portrait model includes a first sub-model, a second sub-model, and a prediction layer, wherein the first sub-model includes a first input layer and a first fusion layer, and the second sub-model includes a second input layer and a second fusion layer.
  • when training the user portrait model, for the first sub-model, multiple user features of the sample user are first determined according to the attribute information and historical behavior data of the sample user, and then the multiple user features of the sample user are input into the first sub-model to be trained through the first input layer.
  • the first input layer performs feature extraction on multiple user features of the sample user, obtains multiple feature vectors, and inputs the multiple feature vectors into the first fusion layer.
  • the first fusion layer fuses multiple feature vectors to obtain the user feature vector of the sample user, and then inputs the user feature vector of the sample user into the prediction layer. Fusion can use either weighted summation or direct addition.
  • for the second sub-model, multiple content tags of the sample multimedia content in the target application are first determined, and then the multiple content tags of the sample multimedia content are input into the second sub-model to be trained through the second input layer.
  • the second input layer extracts the label feature vectors of multiple content labels, and then inputs the multiple label feature vectors into the second fusion layer.
  • the second fusion layer fuses multiple label feature vectors to obtain the content feature vector of the sample multimedia content, and then inputs the content feature vector of the sample multimedia content into the prediction layer. Fusion can use either weighted summation or direct addition.
  • the prediction layer is used to predict the degree of association between the sample user and the sample multimedia content in the target application; for example, the prediction layer can calculate the dot product value, the Euclidean distance, or the cosine similarity between the user feature vector and the content feature vector to determine the degree of association between the sample user and the sample multimedia content in the target application.
  • cross entropy is used to define the loss function, and Adam is used to optimize the loss function; the specific loss function is shown in formula (5), and when the loss function satisfies the preset conditions, the training ends.
  • the user portrait model includes a first sub-model, a second sub-model, and a prediction layer, wherein the first sub-model includes a first input layer and a first fusion layer, and the second sub-model includes a second input layer, a second intra-domain fusion layer, and a second inter-domain fusion layer.
  • when training the user portrait model, for the first sub-model, multiple user features of the sample user are first determined according to the attribute information and historical behavior data of the sample user, and then the multiple user features of the sample user are input into the first sub-model to be trained through the first input layer.
  • the first input layer performs feature extraction on multiple user features of the sample user, obtains multiple feature vectors, and inputs the multiple feature vectors into the first fusion layer.
  • the first fusion layer fuses multiple feature vectors to obtain the user feature vector of the sample user, and then inputs the user feature vector of the sample user into the prediction layer. Fusion can use either weighted summation or direct addition.
  • for the second sub-model, the content tags of the sample multimedia content in the target application in multiple tag domains are first determined, and then the content tags of the sample multimedia content in the multiple tag domains are input into the second sub-model to be trained through the second input layer.
  • the second input layer extracts the label feature vectors of the content labels in each label domain, and then inputs the label feature vectors of the content labels in each label domain into the second intra-domain fusion layer.
  • the second intra-domain fusion layer fuses the label feature vectors of the content labels in each label domain to obtain the intra-domain label vector of each label domain, and then inputs the intra-domain label vector of each label domain into the second inter-domain fusion layer.
  • the second inter-domain fusion layer fuses the intra-domain label vectors of multiple label domains to obtain the content feature vector of the sample multimedia content, and then inputs the content feature vector of the sample multimedia content into the prediction layer.
  • for each of intra-domain fusion and inter-domain fusion, either weighted summation or direct addition can be used.
  • the prediction layer is used to predict the degree of association between the sample user and the sample multimedia content in the target application; for example, the prediction layer can calculate the dot product value, the Euclidean distance, or the cosine similarity between the user feature vector and the content feature vector to determine the degree of association between the sample user and the sample multimedia content in the target application.
  • cross entropy is used to define the loss function, and Adaptive Moment Estimation is used to optimize the loss function; the loss function is specifically shown in formula (5), and when the loss function meets the preset conditions, the training ends.
  • the structure of the user portrait model in this application is not limited to the above four types, and may also be other structures obtained by combining the first sub-model and the second sub-model, which are not specifically limited in this application.
  • the content feature vector of the sample multimedia content is obtained, and the user portrait model is then obtained by training based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content in the target application.
  • the model obtained by this hierarchical embedding method considers the constraint relationship between the content tags in the sample multimedia content, rather than considering the relationship between the content tag and the user alone. Therefore, when the model obtained by training is used to determine the content tags that match users, more accurate content tags can be matched, so as to build more accurate user portraits.
  • the model is trained based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content in the target application, instead of extracting the tags from the sample multimedia content and training based on the correlation between the user feature vector of the sample user and the tag vectors; the original distribution of the sample data is thereby maintained, making the portrait prediction results more accurate.
  • the embodiments of the present application provide at least the following two content recommendation methods:
  • Embodiment 1:
  • the target multimedia content recommended to the target user is determined from the multimedia content of the target application.
  • the multimedia content matched by the candidate tag may be acquired from the multimedia content of the target application according to the candidate tag in the user portrait, and then the matched multimedia content is recommended to the target user.
  • the candidate tags in the user portrait may come from different tag fields
  • the candidate tags in different tag fields can be used to obtain multimedia content matching the candidate tags from the multimedia content of the target application according to actual requirements.
  • the tag fields include the content tag field, the first-level category tag field, the second-level category tag field, the topic tag field, and the official account tag field; one or more target tag fields can be selected from these five tag fields, and the candidate tags corresponding to the target tag fields in the user portrait are then used to obtain the multimedia content matching those candidate tags from the multimedia content of the target application.
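  • The tag-based recommendation of Embodiment 1 can be sketched as below; the dict-of-tag-fields content structure, field names, and example data are all illustrative assumptions rather than the patent's data model.

```python
def recommend_by_tags(candidate_tags, contents, target_fields):
    # Recommend contents whose tags, within the selected target tag fields,
    # intersect the user's candidate tags. Each content is a hypothetical
    # dict mapping a tag field name to a set of tags.
    wanted = set(candidate_tags)
    picks = []
    for content in contents:
        tags = set()
        for field in target_fields:
            tags |= set(content.get(field, ()))
        if tags & wanted:
            picks.append(content["id"])
    return picks

contents = [
    {"id": "news-1", "content_tags": {"sports", "football"}, "topic_tags": {"football"}},
    {"id": "news-2", "content_tags": {"finance"}, "topic_tags": {"stocks"}},
]
picked = recommend_by_tags(["football"], contents, ["content_tags"])
```

  • Restricting `target_fields` to, say, the topic tag field alone would restrict matching to that single dimension of the user portrait.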
  • Embodiment 2:
  • according to the degree of correlation between the user feature vector of the target user and the content feature vector of each multimedia content, the target multimedia content recommended to the target user is determined from the multimedia content.
  • a hierarchical embedding process is performed on the tag feature vectors of the content tags of each multimedia content in multiple tag domains, and when determining the content feature vectors of the multimedia content, at least the following implementations are included:
  • the tag feature vectors of the content tags of the multimedia content in each tag domain are fused to obtain the intra-domain tag vector of each tag domain.
  • the in-domain label vectors of multiple label domains are fused to obtain the content feature vector of the multimedia content.
  • for example, the tag feature vectors of the content tags of the multimedia content in each tag domain are directly added to obtain the intra-domain tag vector of each tag domain, and then the intra-domain tag vectors of the multiple tag domains are directly added to obtain the content feature vector of the multimedia content.
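  • The direct-addition variant described here can be sketched as follows; the number of tag domains and the number of tag feature vectors per domain are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8
# Tag feature vectors per tag domain, e.g. a content tag domain with 4 tags
# and category/topic tag domains with 1 tag each (illustrative counts).
tag_domains = [rng.normal(size=(n, d)) for n in (4, 1, 1)]

# Intra-domain fusion by direct addition of the tag feature vectors in each
# tag domain, then inter-domain fusion by direct addition of the intra-domain
# tag vectors, yielding the content feature vector of the multimedia content.
intra_domain = [vecs.sum(axis=0) for vecs in tag_domains]
content_vec = np.sum(intra_domain, axis=0)
```

  • Because addition is associative, the two-stage direct-addition fusion equals a single sum over all tag feature vectors; the hierarchy only matters once per-domain weights are introduced.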
  • the method of intra-domain fusion and inter-domain fusion of the tag feature vectors of the content tags is not limited to the above direct addition; weighted summation can also be used, where the weights can be learned by using the attention mechanism when training the user portrait model; alternatively, one of intra-domain fusion and inter-domain fusion may adopt weighted summation and the other direct addition, which is not specifically limited in this application.
  • five tag fields are preset, which are a content tag field, a first-level category tag field, a second-level category tag field, a topic tag field, and an official account tag field.
  • the content tags acquired from this piece of news include: sports, football, M team, and N team, and the acquired content tags are used as tags in the content tag field.
  • the first-level category corresponding to this piece of news is sports, and the content tag "sports” is used as a tag in the first-level category tag field.
  • the secondary category corresponding to this piece of news is football, and the content tag "football" is used as the tag in the secondary category tag field; since the content of the news is a football match, the content tag "football" is also used as the tag in the topic tag field.
  • the news comes from the Q sports official account, and the Q sports official account is used as the label in the official account label field.
  • The tag feature vector of the content tag in each tag domain is then extracted, where the tag feature vectors of the content tags in the content tag domain include the sports tag feature vector, the football tag feature vector, the M team tag feature vector, and the N team tag feature vector; the tag feature vector in the first-level category tag domain includes the sports tag feature vector; the tag feature vector in the second-level category tag domain includes the football tag feature vector; the tag feature vector in the topic tag domain includes the football tag feature vector; and the tag feature vector in the official account tag domain includes the official account tag feature vector.
  • For a tag domain that contains only a single content tag, intra-domain fusion can be omitted.
  • The tag feature vectors of the four content tags in the content tag domain are fused by direct addition to obtain the intra-domain tag vector of the content tag domain.
  • The intra-domain tag vectors of the content tag field, the first-level category tag field, the second-level category tag field, the topic tag field, and the official account tag field are then fused by direct addition to obtain the content feature vector of this piece of sports news.
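The intra-domain and inter-domain direct-addition fusion described above can be sketched as follows (a minimal illustration; the function name and the toy two-dimensional embeddings are hypothetical, not from the patent):

```python
import numpy as np

def fuse_content_tags(tag_vectors_by_domain):
    """Hierarchical embedding by direct addition.

    tag_vectors_by_domain maps each tag domain to the embedding vectors of
    its content tags. Intra-domain fusion adds the tag vectors within a
    domain; inter-domain fusion adds the resulting per-domain vectors to
    produce the content feature vector of the multimedia content.
    """
    # Intra-domain fusion: sum the tag vectors inside each domain.
    intra = {domain: np.sum(np.stack(vecs), axis=0)
             for domain, vecs in tag_vectors_by_domain.items()}
    # Inter-domain fusion: sum the per-domain vectors.
    content_vector = np.sum(np.stack(list(intra.values())), axis=0)
    return intra, content_vector
```

For the sports-news example, the content tag domain would hold four vectors (sports, football, M team, N team) while each of the other four domains holds one; a single-tag domain passes its vector through intra-domain fusion unchanged.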
  • Another possible implementation is to fuse the tag feature vectors of the content tags of the multimedia content in multiple tag domains to obtain the content feature vector of the multimedia content.
  • the tag feature vectors of the content tags of the multimedia content in multiple tag domains can be fused by direct addition to obtain the content feature vector of the multimedia content.
  • A weighted-summation method can also be used to fuse the tag feature vectors of the content tags of the multimedia content in the multiple tag domains to obtain the content feature vector of the multimedia content, where the weights of the weighted summation can be learned through an attention mechanism when training the user portrait model.
  • five tag fields are preset, which are a content tag field, a first-level category tag field, a second-level category tag field, a topic tag field, and an official account tag field.
  • the content tags acquired from this piece of news include: sports, football, M team, and N team, and the acquired content tags are used as tags in the content tag field.
  • the first-level category corresponding to this piece of news is sports, and the content tag "sports” is used as a tag in the first-level category tag field.
  • the secondary category corresponding to this piece of news is football, and the content tag "football" is used as the tag in the secondary category tag field. Since the topic of the news is a football match, the content tag "football" is used as the tag in the topic tag domain.
  • the news comes from the Q sports official account, and the Q sports official account is used as the label in the official account label field.
  • The tag feature vector of the content tag in each tag domain is then extracted, where the tag feature vectors of the content tags in the content tag domain include the sports tag feature vector, the football tag feature vector, the M team tag feature vector, and the N team tag feature vector; the tag feature vector in the first-level category tag domain includes the sports tag feature vector; the tag feature vector in the second-level category tag domain includes the football tag feature vector; the tag feature vector in the topic tag domain includes the football tag feature vector; and the tag feature vector in the official account tag domain includes the official account tag feature vector.
  • The tag feature vectors of the content tags in the content tag field, the first-level category tag field, the second-level category tag field, the topic tag field, and the official account tag field are fused by direct addition to obtain the content feature vector of this piece of sports news.
  • the target multimedia content recommended to the target user is determined from the multimedia content according to the correlation between the user feature vector of the target user and the content feature vector of each multimedia content.
  • A correlation threshold may be preset; when the correlation between the content feature vector of a piece of multimedia content and the user feature vector of the target user is greater than the threshold, that multimedia content is recommended to the target user and the recommended content is displayed in the target application. It is also possible to preset a threshold R on the number of recommended contents, sort the multimedia contents in descending order of correlation, recommend the top R multimedia contents to the target user, and display the recommended content in the target application, where R is the content recommendation quantity threshold.
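The two recommendation strategies just described (correlation threshold, or top-R by correlation) can be sketched as follows; the dot product stands in for the correlation measure, and all names are illustrative, not from the patent:

```python
import numpy as np

def recommend(user_vec, content_vecs, threshold=None, top_r=None):
    """Rank contents by correlation (dot product) with the user feature
    vector; keep either the items above a preset correlation threshold or
    the top-R items, mirroring the two strategies in the text."""
    scores = {cid: float(np.dot(user_vec, vec))
              for cid, vec in content_vecs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        return [cid for cid in ranked if scores[cid] > threshold]
    return ranked[:top_r]
```

Either strategy yields the same top item; the threshold variant returns a variable-sized set, while the top-R variant returns a fixed-size one.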
  • By performing hierarchical embedding on the tag feature vectors, the content feature vector of the multimedia content is obtained, so that the content feature vector characterizes the features of the multimedia content more comprehensively, which in turn improves the accuracy of the multimedia content recommended to users based on matching the user feature vector against the content feature vectors.
  • After the multimedia content recommended to the target user is determined by either of the above two embodiments, the target application displays the recommended content.
  • For example, the target application is set as an instant messaging application, and the documents recommended to the target user are football match commentary article A, football match news B, and football star interview report C. In the article reading module of the instant messaging application, the links of article A, news B, and report C are displayed, and the target user can click a link to view the corresponding article.
  • In the advertisement recommendation scenario shown in FIG. 11, when the target application is set as an instant messaging application and the advertisement recommended to the target user is a car advertisement, the car advertisement is displayed in the circle of friends of the instant messaging application, and the target user can click the ad image to view the advertisement or go to the purchase page.
  • The target application is set as a shopping application, and when it is determined that the commodities recommended to the target user are "pineapple" and "grape", the purchase links of "pineapple" and "grape" are displayed first on the recommendation page of the fruit category of the shopping application, for example at the top of the recommendation page, while the purchase links of "banana" and "strawberry" are displayed at the lower end of the page.
  • a method for obtaining a user portrait provided by the embodiments of the present application is described below by taking the target application as a document recommendation application as an example, and the method is executed by a server.
  • the user portrait model includes a first sub-model, a second sub-model, and an estimation layer.
  • The first sub-model includes a first input layer, a first intra-domain fusion layer, and a first inter-domain fusion layer; the second sub-model includes a second input layer, a second intra-domain fusion layer, and a second inter-domain fusion layer.
  • P feature domains are preset, namely, feature domain 1, feature domain 2, ..., and feature domain P.
  • The user features of the sample user in the P feature domains are determined, and then input into the first sub-model to be trained through the first input layer.
  • The first input layer performs embedding processing on the user features of the sample user in each feature domain, obtains the feature vectors of the user features in each feature domain, and inputs them into the first intra-domain fusion layer.
  • the first intra-domain fusion layer fuses the feature vectors of user features in each feature domain by means of weighted summation, obtains the intra-domain feature vectors of each feature domain, and inputs the intra-domain feature vectors of each feature domain into the first inter-domain fusion layer.
  • the first inter-domain fusion layer uses a weighted summation method to fuse the intra-domain feature vectors of multiple feature domains to obtain the user feature vector of the sample user, and then input the user feature vector of the sample user into the prediction layer.
  • Q label domains are preset, namely label domain 1, label domain 2, ..., and label domain Q.
  • the content labels of the sample multimedia content in the Q label domains are input into the second sub-model to be trained through the second input layer.
  • The second input layer embeds the content labels in each label domain to obtain the label feature vectors of the content labels, and then inputs the label feature vectors of the content labels in each label domain into the second intra-domain fusion layer.
  • The second intra-domain fusion layer uses direct addition to fuse the label feature vectors of the content labels in each label domain to obtain the intra-domain label vector of each label domain, and then inputs the intra-domain label vectors into the second inter-domain fusion layer, which fuses the intra-domain label vectors of the multiple label domains by direct addition to obtain the content feature vector of the sample multimedia content; the content feature vector of the sample multimedia content is then input into the prediction layer.
  • The prediction layer first calculates the dot-product value between the user feature vector of the sample user and the content feature vector of the sample multimedia content, and then uses the sigmoid function to normalize the dot-product value to obtain the correlation between the sample user and the sample multimedia content.
  • cross-entropy is used to define the loss function
  • Adam is used to optimize the loss function.
  • the specific loss function is shown in formula (5). When the loss function satisfies the preset conditions, the training ends.
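A minimal sketch of the estimation layer and training loss just described: dot product of the two tower outputs, sigmoid normalization, and a cross-entropy loss over click labels. Formula (5) itself did not survive extraction, so this is the standard binary cross-entropy form, not a verbatim implementation; an optimizer such as Adam would then minimize this loss:

```python
import numpy as np

def predict_correlation(user_vec, content_vec):
    """Estimation layer: dot product of the user and content feature
    vectors, normalized to (0, 1) with a sigmoid."""
    dot = float(np.dot(user_vec, content_vec))
    return 1.0 / (1.0 + np.exp(-dot))

def cross_entropy_loss(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy over clicked (1) / not-clicked (0) labels."""
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    y_true = np.asarray(y_true, dtype=float)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))
```

Orthogonal user and content vectors give a dot product of 0 and hence a correlation of 0.5; strongly aligned vectors push the correlation toward 1.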
  • the attribute information of the target user includes gender, age, location, etc.
  • The historical behavior data includes the target user's historical behavior data in applications other than the document recommendation application, such as the target user's video viewing records in video applications and document click records in instant messaging applications.
  • the user characteristics of the target user in the P characteristic domains are determined.
  • The feature vectors of the user features in each feature domain are fused by weighted summation to obtain the intra-domain feature vector of each feature domain; the intra-domain feature vectors of the P feature domains are then fused by weighted summation to determine the user feature vector of the target user.
  • For each piece of multimedia content in the document recommendation application, first determine the content tags of the multimedia content in the Q tag domains, and then use the trained user portrait model to embed the content tags in each tag domain to obtain the tag feature vector of each content tag.
  • According to the similarity between the user feature vector of the target user and the tag feature vectors of the content tags in each tag domain, the content tags corresponding to tag feature vectors whose similarity is greater than the similarity threshold are determined as candidate tags of the target user.
  • the user portrait of the target user is determined based on the candidate tag of the target user, and then the document in the document recommendation application recommended to the target user is determined based on the user portrait of the target user.
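The similarity-matching step that produces candidate tags can be sketched as follows. Cosine similarity is one plausible choice of similarity measure; the text does not fix a specific one, and the names here are illustrative:

```python
import numpy as np

def candidate_tags(user_vec, tag_vecs, sim_threshold):
    """Return the content tags whose tag feature vector has a cosine
    similarity with the user feature vector above the preset threshold."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return [tag for tag, vec in tag_vecs.items()
            if cosine(user_vec, vec) > sim_threshold]
```

Because matching is done in the shared embedding space, a tag can be selected even if it never appeared in the user's own behavior log, which is the generalization property the text emphasizes.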
  • the present application evaluates the effect of the user portrait with the actual click log of the user, and the obtained evaluation results are shown in Table 1:
  • Prec@N is the portrait prediction accuracy index, which represents the proportion of the user's actual click in the content recommended to the user based on the user portrait, which specifically satisfies the following formula (7):
  • N is the number of content recommended to users based on user portraits.
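Formula (7) itself did not survive extraction, but the metric can be sketched from its description: the proportion of the N items recommended from the user portrait that the user actually clicked (names illustrative):

```python
def prec_at_n(recommended, clicked):
    """Prec@N: proportion of the recommended items (length N) that
    appear in the user's actual click log."""
    if not recommended:
        return 0.0
    clicked_set = set(clicked)
    hits = sum(1 for item in recommended if item in clicked_set)
    return hits / len(recommended)
```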
  • By performing hierarchical embedding on the tag feature vectors, the content feature vector of the sample multimedia content is obtained, and the user portrait model is then trained based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content in the target application.
  • the model obtained by this hierarchical embedding method considers the constraint relationship between the content tags in the sample multimedia content, rather than considering the relationship between the content tag and the user alone. Therefore, when the model obtained by training is used to determine the content tags that match users, more accurate content tags can be matched, so as to build more accurate user portraits.
  • In addition, the model is trained based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content in the target application, instead of extracting labels from the sample multimedia content and training on the correlation between the user feature vector of the sample user and the label vectors; the original distribution of the sample data is thereby maintained, making the portrait prediction results more accurate.
  • an embodiment of the present application provides a schematic structural diagram of an apparatus for obtaining user portraits.
  • the apparatus 1400 includes:
  • the first feature extraction module 1401 is used to determine the user feature vector of the target user according to the attribute information of the target user and historical behavior data;
  • the second feature extraction module 1402 is used to obtain the tag feature vector of the content tag of the multimedia content in the target application
  • the matching module 1403 is used to determine the candidate label of the target user from the content label of the multimedia content according to the similarity between the user feature vector and the label feature vector;
  • the processing module 1404 is configured to determine the user portrait of the target user based on the candidate tags of the target user.
  • the second feature extraction module 1402 is specifically configured to:
  • the user portrait model is obtained by training based on the correlation between the user feature vector of the sample user and the content feature vector of the sample multimedia content.
  • the content feature vector is obtained by performing hierarchical embedding processing on the label feature vector of the content label of the sample multimedia content, and the user feature vector of the sample user is obtained by performing hierarchical embedding processing on the feature vector of the user feature of the sample user.
  • the second feature extraction module 1402 is specifically configured to:
  • the in-domain label vectors of multiple label domains are fused to obtain the content feature vector of the sample multimedia content.
  • the first feature extraction module 1401 is specifically used for:
  • the intra-domain feature vectors of multiple feature domains are fused to obtain the user feature vectors of the sample users.
  • the first feature extraction module 1401 is specifically used for:
  • the feature vector of user features in each feature domain is extracted, and the feature vector of user features in each feature domain is subjected to hierarchical embedding processing to determine the user feature vector of the target user.
  • the first feature extraction module 1401 is specifically used for:
  • the intra-domain feature vectors of multiple feature domains are fused to obtain the user feature vector of the target user.
  • the matching module 1403 is specifically used for:
  • the content tags whose similarity satisfies the preset condition are determined as the candidate tags of the target user.
  • processing module 1404 is further configured to:
  • the target multimedia content recommended to the target user is determined from the multimedia content.
  • processing module 1404 is specifically used for:
  • the in-domain label vectors of multiple label domains are fused to obtain the content feature vector of the multimedia content.
  • an embodiment of the present application provides a schematic structural diagram of a content recommendation apparatus.
  • the apparatus 1500 includes:
  • the recommendation module 150 is used to determine the target multimedia content recommended to the target user from the multimedia content of the target application.
  • an embodiment of the present application provides a schematic structural diagram of a training device for a user portrait model.
  • the device 1600 includes:
  • the model training module 1601 is used to perform multiple iterative training using the user portrait model to be trained and the training samples to obtain the user portrait model.
  • the training samples include sample multimedia content and user characteristics of the sample users.
  • Each iterative training includes:
  • the parameters of the user portrait model to be trained are adjusted.
  • model training module 1601 is specifically used for:
  • model training module 1601 is specifically used for:
  • the in-domain label vectors of multiple label domains are fused to obtain the content feature vector of the sample multimedia content.
  • model training module 1601 is specifically used for:
  • the intra-domain feature vectors of multiple feature domains are fused to obtain the user feature vectors of the sample users.
  • an embodiment of the present application provides a computer device, as shown in FIG. 17 , which includes at least one processor 1701 and a memory 1702 connected to the at least one processor.
  • The embodiment of the present application does not limit the specific connection medium between the processor 1701 and the memory 1702; in FIG. 17, the processor 1701 and the memory 1702 are connected through a bus as an example.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • The memory 1702 stores instructions that can be executed by the at least one processor 1701, and by executing the instructions stored in the memory 1702, the at least one processor 1701 can execute the aforementioned method for obtaining a user portrait, content recommendation method, or method for training a user portrait model.
  • The processor 1701 is the control center of the computer device; it connects various parts of the computer device through various interfaces and lines, and obtains the user portrait, recommends content, or trains the user portrait model by running or executing the instructions stored in the memory 1702 and calling the data stored in the memory 1702.
  • The processor 1701 may include one or more processing units, and may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, etc., and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1701.
  • the processor 1701 and the memory 1702 may be implemented on the same chip, and in some embodiments, they may be implemented separately on separate chips.
  • The processor 1701 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application.
  • a general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the memory 1702 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules.
  • The memory 1702 may include at least one type of storage medium, for example, flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disc, etc.
  • The memory 1702 may be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • the memory 1702 in this embodiment of the present application may also be a circuit or any other device capable of implementing a storage function, for storing program instructions and/or data.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer program executable by a computer device, and when the program runs on the computer device, causes the computer device to execute the above method for obtaining a user portrait or The steps of the content recommendation method or the training method of the user portrait model.
  • embodiments of the present invention may be provided as a method, or as a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Abstract

The embodiments of the present application provide a method for obtaining a user portrait and a related apparatus, relating to the field of artificial intelligence. The method includes: obtaining the user feature vector of a target user and the tag feature vectors of the content tags of multimedia content in a target application; determining candidate tags of the target user according to the similarity between the user feature vector and the tag feature vectors; and determining the user portrait of the target user based on the candidate tags. Compared with obtaining a user portrait from tag statistics, the user feature vector characterizes user preferences more comprehensively, which improves the accuracy of the obtained user tags and hence of the obtained user portrait. In addition, the obtained candidate tags are not only tags in the target user's historical behavior data but may also be tags beyond it, which improves generalization, extends the target user's interests, and makes the obtained user portrait more comprehensive and accurate.

Description

A method for obtaining a user portrait and a related apparatus
This application claims priority to the Chinese patent application No. 202010820059.0, entitled "A method and apparatus for obtaining a user portrait", filed with the Chinese Patent Office on August 14, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present invention relate to the field of computers, and in particular to user portrait technology.
Background
Personalized recommendation systems are one of the core technologies of the Internet; they recommend content of interest to users based on user behavior and interests. A user portrait structures and tags user information: by characterizing data in dimensions such as demographic attributes, social attributes, and interest preferences, it precisely describes and analyzes all aspects of a user's information and mines its potential value, thereby better improving the effect of personalized recommendation.
At present, when building a user portrait, portrait tags are first extracted from user behavior data, simple statistics are taken over the portrait tags involved, and each user's portrait tags are scored by frequency (the higher the frequency, the higher the score); the user portrait is then obtained from the tag scores. For cold-start users, however, behavior data is scarce, so the user portrait obtained from tag statistics has low accuracy, which in turn affects the precision of the related services based on it.
Summary
The embodiments of the present application provide a method and apparatus for obtaining a user portrait, which are used to improve the accuracy of the obtained user portrait and further improve the precision of content recommendation.
In one aspect, an embodiment of the present application provides a method for obtaining a user portrait, the method including:
determining the user feature vector of a target user according to attribute information and historical behavior data of the target user;
obtaining tag feature vectors of content tags of multimedia content in a target application;
determining candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors;
determining the user portrait of the target user based on the candidate tags of the target user.
In one aspect, an embodiment of the present application provides a method for training a user portrait model, the method including:
performing multiple rounds of iterative training using the user portrait model to be trained and training samples to obtain a user portrait model, the training samples including sample multimedia content and user features of sample users, and each round of iterative training including:
extracting the feature vectors of the user features of a sample user and the tag feature vectors of the content tags of sample multimedia content;
performing hierarchical embedding processing on the feature vectors of the user features of the sample user to obtain the user feature vector of the sample user;
performing hierarchical embedding processing on the tag feature vectors to obtain the content feature vector of the sample multimedia content;
adjusting the parameters of the user portrait model to be trained based on the correlation between the user feature vector and the content feature vector.
In one aspect, an embodiment of the present application provides an apparatus for obtaining a user portrait, the apparatus including:
a first feature extraction module, configured to determine the user feature vector of a target user according to attribute information and historical behavior data of the target user;
a second feature extraction module, configured to obtain tag feature vectors of content tags of multimedia content in a target application;
a matching module, configured to determine candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors;
a processing module, configured to determine the user portrait of the target user based on the candidate tags of the target user.
In one aspect, an embodiment of the present application provides an apparatus for training a user portrait model, the apparatus including:
a model training module, configured to perform multiple rounds of iterative training using the user portrait model to be trained and training samples to obtain a user portrait model, the training samples including sample multimedia content and user features of sample users, and each round of iterative training including:
extracting the feature vectors of the user features of a sample user and the tag feature vectors of the content tags of sample multimedia content;
performing hierarchical embedding processing on the feature vectors of the user features of the sample user to obtain the user feature vector of the sample user;
performing hierarchical embedding processing on the tag feature vectors to obtain the content feature vector of the sample multimedia content;
adjusting the parameters of the user portrait model to be trained based on the correlation between the user feature vector and the content feature vector.
In one aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps of the above method for obtaining a user portrait or the steps of the above method for training a user portrait model.
In one aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program executable by a computer device; when the program runs on the computer device, it causes the computer device to execute the steps of the above method for obtaining a user portrait or the steps of the above method for training a user portrait model.
In the embodiments of the present application, the user feature vector of the target user is determined according to the target user's attribute information and historical behavior data. The user feature vector characterizes not only the user's historical behavior and attributes, but also the user preferences determined from that behavior and those attributes. Hence, compared with a user portrait obtained from tag statistics, the candidate tags of the target user determined according to the similarity between the user feature vector and the tag feature vectors better characterize user preferences, improving the accuracy of the obtained user portrait. In addition, when candidate tags are obtained by matching the user feature vector against the tag feature vectors, the candidate tags are not only tags in the target user's historical behavior data but may also be tags beyond it, which improves generalization, extends the target user's interests, makes the obtained user portrait more comprehensive and accurate, and further improves the accuracy of content recommendation.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a system architecture diagram provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for obtaining a user portrait provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method for obtaining a user feature vector provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a method for obtaining a user feature vector provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of a method for obtaining a content feature vector provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of a method for obtaining a content feature vector provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a content recommendation page provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of a content recommendation page provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a content recommendation page provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a user portrait model provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of an apparatus for obtaining a user portrait provided by an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a content recommendation apparatus provided by an embodiment of the present application;
FIG. 16 is a schematic structural diagram of an apparatus for training a user portrait model provided by an embodiment of the present application;
FIG. 17 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and beneficial effects of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.
For ease of understanding, the terms involved in the embodiments of the present invention are explained below.
In the embodiments of the present application, artificial intelligence technology is used to determine the user feature vector of a target user and the tag feature vectors of the content tags of multimedia content in a target application; candidate tags of the target user are determined based on the user feature vector and the tag feature vectors of the content tags of the multimedia content, and the user portrait of the target user is then determined from the candidate tags.
In the embodiments of the present application, a specific machine learning model or algorithm of artificial intelligence is used to determine the user feature vector of the target user and the tag feature vectors of the content tags of the multimedia content in the target application. Attention mechanism: this imitates the internal process of biological observation, i.e., a mechanism that aligns internal experience with external sensation to increase the fineness of observation of certain regions; simply put, it quickly filters high-value information out of a large amount of information. The mechanism has two main aspects: deciding which part of the input needs attention, and allocating limited information-processing resources to the important parts. In a neural network, an attention mechanism gives the network the ability to focus on a subset of its inputs (or features) and select specific inputs. In the embodiments of the present application, the user features of the target user in multiple feature domains are fused based on an attention mechanism to determine the user feature vector of the target user.
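The attention-based fusion described above can be sketched numerically: learned scores are turned into weights by a softmax, and fusion is the weighted sum. This is a minimal sketch assuming the per-feature scores have already been produced by a trained scoring layer; the function name, score inputs, and toy vectors are illustrative, not from the patent:

```python
import numpy as np

def attention_fuse(feature_vecs, scores):
    """Fuse the feature vectors of one feature domain with attention:
    softmax the learned scores into weights, then take the weighted sum."""
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights = weights / weights.sum()
    fused = np.sum(weights[:, None] * np.stack(feature_vecs), axis=0)
    return weights, fused
```

With equal scores the fusion degenerates to a plain average; higher-scored features dominate the fused vector, which is how the model "focuses" on the informative features of its input.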
User portrait: a user portrait is a tagged user model abstracted from information such as a user's social attributes, living habits, and consumption behavior. The core work of building a user portrait is attaching "tags" to the user, where a tag is a highly refined feature identifier derived from analysis of the user's information.
The design idea of the embodiments of the present application is introduced below.
At present, when building a user portrait, portrait tags are first extracted from user behavior data, simple statistics are taken over the portrait tags involved, and each user's portrait tags are scored by frequency (the higher the frequency, the higher the score); the user portrait is then obtained from the tag scores. For a cold-start user, behavior data is scarce and each portrait tag may have appeared only once or twice, so scores based on the frequency of portrait tags are not representative. As a result, the user portrait obtained from tag scores has low accuracy, and user tags beyond the user's behavior data cannot be obtained. Consequently, when content is recommended to the user based on this user portrait, content the user does not like may be recommended, affecting the user experience.
Considering that a user's preferences may be reflected in both the user's attribute information and behavior data, user features that combine the two can characterize user preferences well; when user tags are matched based on such user features, the obtained tags are related to the user's preferences and not limited to the tags in the user's existing behavior data. In view of this, an embodiment of the present application provides a method for obtaining a user portrait, including: determining the user feature vector of a target user according to the target user's attribute information and historical behavior data, and obtaining the tag feature vectors of the content tags of multimedia content in a target application; then determining candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors; and determining the user portrait of the target user based on the candidate tags of the target user.
Compared with obtaining a user portrait from tag statistics, the user feature vector in this method characterizes user preferences more comprehensively, improving the accuracy of the obtained user tags and hence of the obtained user portrait. In addition, the obtained candidate tags are not only tags in the target user's historical behavior data but may also be tags beyond it, which improves generalization, extends the target user's interests, makes the obtained user portrait more comprehensive and accurate, and further improves the accuracy of content recommendation.
After introducing the design idea of the embodiments of the present application, application scenarios to which the technical solutions of the embodiments are applicable are briefly introduced below. It should be noted that the application scenarios described below are only used to illustrate, not limit, the embodiments of the present application. In specific implementation, the technical solutions provided by the embodiments of the present application can be flexibly applied according to actual needs.
Scenario 1: document recommendation.
Taking multimedia content that is a document as an example, when recommending documents to a target user, the content recommendation device first obtains the target user's attribute information and historical behavior data, where the attribute information includes gender, age, location, etc., and the historical behavior data includes the target user's historical behavior data in the target application and/or in applications other than the target application, such as the topics, categories, and content tags of the documents the target user has clicked in the target application and/or other applications. The device obtains the tag feature vectors of the content tags of multiple documents in the target application, then determines candidate tags of the target user from the content tags of the documents according to the similarity between the target user's user feature vector and the tag feature vectors, and determines the user portrait of the target user based on the candidate tags. Documents in the target application are then recommended to the target user according to the user portrait.
Scenario 2: advertisement recommendation.
Taking multimedia content that is an advertisement as an example, when recommending advertisements to a target user, the content recommendation device first obtains the target user's attribute information and historical behavior data, where the attribute information includes gender, age, location, etc., and the historical behavior data includes the target user's historical behavior data in the target application and/or in applications other than the target application, such as the topics, categories, and content tags of the advertisements the target user has clicked in the target application and/or other applications. The device obtains the tag feature vectors of the content tags of multiple advertisements in the target application, then determines candidate tags of the target user from the content tags of the advertisements according to the similarity between the target user's user feature vector and the tag feature vectors, and determines the user portrait of the target user based on the candidate tags. Advertisements in the target application are then recommended to the target user according to the user portrait.
It should be noted that the method for obtaining a user portrait in the embodiments of the present application is not limited to the above two scenarios; it can also be applied to scenarios such as audio recommendation, video recommendation, commodity recommendation, takeout information recommendation, book recommendation, news recommendation, and content recommendation in mini programs, which is not specifically limited in this application.
Referring to FIG. 1, a system architecture diagram of the method for obtaining a user portrait provided by an embodiment of the present application, the architecture includes at least a terminal device 101 and a server 102.
A target application can be installed in the terminal device 101, where the target application may be a client application, a web application, a mini-program application, etc. The target user's attribute information can be obtained from the target user's registration information in the target application, and the target user's historical behavior data can be obtained from the history records of the target application and/or of applications other than the target application. The terminal device 101 may include one or more processors 1011, a memory 1012, an I/O interface 1013 for interacting with the event-tracking server 103, a display panel 1014, etc. The terminal device 101 may be, but is not limited to, a smartphone, tablet computer, laptop computer, desktop computer, smart speaker, smart watch, etc.
The server 102 may be the backend server of the target application, providing corresponding services for the target application, and may include one or more processors 1021, a memory 1022, an I/O interface 1023 for interacting with the terminal device 101, etc. In addition, the server 102 may be provided with a database 1024. The server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. The terminal device 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
The apparatus for obtaining a user portrait may be the terminal device 101 or the server 102.
Case 1: the apparatus for obtaining a user portrait is the terminal device 101.
The terminal device 101 obtains the target user's attribute information and historical behavior data from the server 102, and determines the target user's user feature vector from them. The terminal device 101 obtains the tag feature vectors of the content tags of the multimedia content in the target application, determines candidate tags of the target user from the content tags of the multimedia content according to the similarity between the target user's user feature vector and the tag feature vectors, and then determines the user portrait of the target user based on the candidate tags. When the target user triggers content recommendation in the target application installed on the terminal device 101, the target application obtains the multimedia content recommended to the target user from the server 102 according to the target user's user portrait and displays it.
Case 2: the apparatus for obtaining a user portrait is the server 102.
The server 102 determines the target user's user feature vector according to the target user's attribute information and historical behavior data, and obtains the tag feature vectors of the content tags of the multimedia content in the target application. It then determines candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors, and determines the user portrait of the target user based on the candidate tags. When the target user triggers content recommendation in the target application installed on the terminal device 101, the target application sends a content recommendation request to the server 102 through the terminal device 101. The server 102 obtains the multimedia content recommended to the target user from the database according to the target user's user portrait and sends it to the terminal device 101, which displays the recommended multimedia content in the target application.
Based on the system architecture shown in FIG. 1, an embodiment of the present application provides a flow of a method for obtaining a user portrait, as shown in FIG. 2. The flow can be executed by a computer device, which may be the terminal device 101 or the server 102 shown in FIG. 1, and includes the following steps:
Step S201: determine the user feature vector of the target user according to the target user's attribute information and historical behavior data.
The target user's attribute information can be obtained from the target user's registration information in the target application, and includes at least two types of information:
The first type is numerical, i.e., information described by numbers, such as age, date of birth, and account registration time.
The second type is textual, i.e., information described by text; for example, gender can be male or female, and location can be Beijing, Shanghai, or another place.
The target user's historical behavior data includes the target user's historical behavior data in the target application and/or in applications other than the target application. Behavior data includes operation events and the attribute information of the operated objects; an operation event may be a click, browse, favorite, comment, etc., and the attribute information of an operated object may be a topic, category, tag, etc.
Step S202: obtain the tag feature vectors of the content tags of the multimedia content in the target application.
The multimedia content may be text, audio, video, etc., and one piece of multimedia content may correspond to one or more content tags. For example, the content tags of a piece of news about a football match include: sports, football, XX football team, etc. In this embodiment, the number of pieces of multimedia content may be one or more, which is not limited.
Step S203: determine candidate tags of the target user from the content tags of the multimedia content according to the similarity between the user feature vector and the tag feature vectors.
In one possible implementation, a similarity threshold can be preset. The similarity between the target user's user feature vector and each tag feature vector is determined, the tag feature vectors whose similarity is greater than the threshold are determined as matching tag vectors, and the content tags corresponding to the matching tag vectors are determined as candidate tags of the target user.
Alternatively, a threshold P on the number of candidate tags can be preset. The similarity between the target user's user feature vector and each tag feature vector is determined, the tag feature vectors are sorted in descending order of similarity, and the content tags corresponding to the top P tag feature vectors are determined as candidate tags of the target user, where P is the threshold on the number of candidate tags.
The multimedia content used in this embodiment may be all of the multimedia content in the target application, or part of it.
Step S204: determine the user portrait of the target user based on the candidate tags of the target user.
When there are duplicate tags among the target user's candidate tags, or between the candidate tags and the target user's existing tags, the duplicate tags can be removed.
In addition, an upper limit N on the number of tags can be preset. When the total number of the candidate tags and the target user's existing tags exceeds the upper limit, the candidate tags and the existing tags can be sorted in descending order of the similarity between the target user's user feature vector and the tag feature vectors, and the top N tags retained, where N is the upper limit on the number of tags.
Alternatively, the frequency of each tag before de-duplication can be counted, the candidate tags and the target user's existing tags sorted in descending order of frequency, and the top N tags retained, where N is the upper limit on the number of tags.
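The de-duplication and frequency-based top-N retention just described can be sketched as follows (the function name and toy tags are illustrative, not from the patent):

```python
from collections import Counter

def build_profile_tags(candidate_tags, existing_tags, max_tags):
    """Merge candidate tags with the user's existing tags, count how often
    each tag occurred before de-duplication, and keep the top-N tags by
    frequency (N = max_tags); duplicates collapse into a single entry."""
    counts = Counter(candidate_tags) + Counter(existing_tags)
    ranked = sorted(counts, key=lambda t: counts[t], reverse=True)
    return ranked[:max_tags]
```

The similarity-based variant described in the text would rank by stored similarity scores instead of frequencies, with the same de-duplicate-then-truncate structure.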
In the embodiments of the present application, the user feature vector of the target user and the tag feature vectors of the content tags of the multimedia content in the target application are obtained; candidate tags of the target user are then determined according to the similarity between the user feature vector and the tag feature vectors, and the user portrait of the target user is determined based on the candidate tags. Compared with obtaining a user portrait from tag statistics, the user feature vector characterizes user preferences more comprehensively, which improves the accuracy of the obtained user tags and hence of the obtained user portrait. In addition, the obtained candidate tags are not only tags in the target user's historical behavior data but may also be tags beyond it, which improves generalization, extends the target user's interests, makes the obtained user portrait more comprehensive and accurate, and further improves the accuracy of content recommendation.
In one possible implementation, in step S201, when obtaining the user feature vector of the target user, the user features of the target user in multiple feature domains are first determined according to the target user's attribute information and historical behavior data; then, through a user portrait model, the feature vector of each user feature in each feature domain is extracted, and hierarchical embedding processing is performed on these feature vectors to determine the user feature vector of the target user.
A feature domain is a feature dimension that characterizes user features; the user features in different feature domains may be completely different or partly the same. A user feature may be attribute information such as gender, age, address, or position, or information such as tags, categories, or topics obtained from historical behavior data. The user portrait model is trained based on the correlation between the user feature vectors of sample users and the content feature vectors of sample multimedia content; the content feature vector of the sample multimedia content is obtained by hierarchical embedding processing of the tag feature vectors of its content tags, and the user feature vector of the sample user is obtained by hierarchical embedding processing of the feature vectors of the sample user's user features. Through the user portrait model, embedding processing is performed on the user features in each feature domain to obtain the feature vectors of the user features. The user portrait model may be a deep neural network (DNN), a Transformer model, or another model.
示例性地,目标用户的属性信息包括性别、年龄、地址、职位。目标用户的历史行为数据为目标用户在目标应用之外的其他应用中的历史行为数据,具体为目标用户在视频应用A中的历史行为数据、目标用户在音频应用B中的历史行为数据以及目标用户在购物应用C中的历史行为数据。
实施方式一,预先设置7个特征域,分别为第一特征域至第七特征域,其中,性别为第一特征域中的用户特征,年龄为第二特征域中的用户特征,地址为第三特征域中的用户特征,职位为第四特征域中的用户特征,从视频应用A的历史行为数据中获取标签、类目、主题等信息作为第五特征域中的用户特征,从音频应用B的历史行为数据中获取标签、类目、主题等信息作为第六特征域中的用户特征,从购物应用C的历史行为数据中获取标签、类目、主题等信息作为第七特征域中的用户特征。
实施方式二,预先设置5个特征域,分别为第一特征域至第五特征域,其中,性别为第一特征域中的用户特征,年龄为第二特征域中的用户特征,地址为第三特征域中的用户特征,职位为第四特征域中的用户特征,从视频应用A的历史行为数据、音频应用B的历史行为数据以及购物应用C的历史行为数据中获取标签、类目、主题等信息作为第五特征域中的用户特征。
实施方式三,预先设置4个特征域,分别为第一特征域至第四特征域,其中,性别、年龄、地址以及职位为第一特征域中的用户特征,从视频应用A的历史行为数据中获取标签作为第二特征域中的用户特征,从音频应用B的历史行为数据中获取标签、类目、主题等信息作为第三特征域中的用户特征,从购物应用C的历史行为数据中获取标签、类目、主题等信息作为第四特征域中的用户特征。
实施方式四,预先设置2个特征域,分别为第一特征域和第二特征域,其中,性别、年龄、地址以及职位为第一特征域中的用户特征,从视频应用A的历史行为数据、音频应用B的历史行为数据以及购物应用C的历史行为数据中获取标签、类目、主题等信息作为第二特征域中的用户特征。
需要说明的是,特征域划分的实施方式并不仅限于上述四种,还可以是其他实施方式,对此,本申请不做具体限定。另外,在获取目标用户的用户特征向量时,也可以直接对目标用户的属性信息以及历史行为数据进行特征提取,确定目标用户的用户特征向量,对此,本申请不做具体限定。
本申请实施例中,根据目标用户的属性信息以及历史行为数据,确定目标用户在多个特征域中的用户特征,从多个维度表征用户特征,从而提高基于用户特征确定的目标用户的用户特征向量的准确度。
需要说明的是,本申请实施例在提取每个特征域内的用户特征的特征向量之后,对每个特征域内的用户特征的特征向量进行层级嵌入处理以确定目标用户的用户特征向量时,至少包括以下几种实施方式:
实施方式一、将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量,然后将多个特征域的域内特征向量进行融合,获得目标用户的用户特征向量。
例如,可以将每个特征域内的用户特征的特征向量加权求和,获得每个特征域的域内特征向量,具体符合下述公式(1):
$$\boldsymbol{u}_t=\sum_{x=1}^{H}\alpha_x\,\boldsymbol{e}_x^t \tag{1}$$

其中,$\boldsymbol{u}_t$为特征域$t$的域内特征向量,$\alpha_x$为域内融合时特征向量$\boldsymbol{e}_x^t$的权重,$H$为特征域内的特征向量数量的上限值,不同特征域内的特征向量数量的上限值可以是不相同的。
域内融合时特征向量$\boldsymbol{e}_x^t$的权重$\alpha_x$可以采用公式(2)获得,公式(2)具体如下所示:

$$\alpha_x=\frac{\exp\!\left(\boldsymbol{q}_t^\top\tanh\!\left(W_t\boldsymbol{e}_x^t+\boldsymbol{b}_t\right)\right)}{\sum_{x'=1}^{H}\exp\!\left(\boldsymbol{q}_t^\top\tanh\!\left(W_t\boldsymbol{e}_{x'}^t+\boldsymbol{b}_t\right)\right)} \tag{2}$$

其中,$\alpha_x$为域内融合时特征向量$\boldsymbol{e}_x^t$的权重,$\boldsymbol{q}_t$为域内融合时的语义向量,$W_t$为特征域$t$的空间变换矩阵,$\boldsymbol{b}_t$为偏置向量。需要说明的是,在具体实施中,每个特征域中的语义向量$\boldsymbol{q}_t$可以是相同的,也可以是不同的。每个特征域中的空间变换矩阵和偏置向量是不相同的,域内融合时的权重是在训练用户画像模型时采用注意力机制学习获得的。
然后,将多个特征域的域内特征向量加权求和,获得目标用户的用户特征向量,具体符合下述公式(3):
$$\boldsymbol{u}=\sum_{t=1}^{N}\beta_t\,\boldsymbol{u}_t \tag{3}$$

其中,$\boldsymbol{u}$为目标用户的用户特征向量,$\beta_t$为域间融合时域内特征向量$\boldsymbol{u}_t$的权重,$N$为特征域的数量。
域间融合时域内特征向量$\boldsymbol{u}_t$的权重$\beta_t$可以采用公式(4)获得,公式(4)具体如下所示:

$$\beta_t=\frac{\exp\!\left(\boldsymbol{p}^\top\tanh\!\left(W_t\boldsymbol{u}_t+\boldsymbol{c}_t\right)\right)}{\sum_{t'=1}^{N}\exp\!\left(\boldsymbol{p}^\top\tanh\!\left(W_{t'}\boldsymbol{u}_{t'}+\boldsymbol{c}_{t'}\right)\right)} \tag{4}$$

其中,$\beta_t$为域间融合时域内特征向量$\boldsymbol{u}_t$的权重,$\boldsymbol{p}$为域间融合时的语义向量,$W_t$为特征域$t$的空间变换矩阵,$\boldsymbol{c}_t$为偏置向量。域间融合时的权重是在训练用户画像模型时采用注意力机制学习获得的。
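上述"打分—归一化—加权求和"的注意力融合过程,可用如下示意代码表达(打分函数取 q·tanh(W v + b) 并做 softmax 归一化,这一具体形式以及函数名均为本文的示例性假设):

```python
import numpy as np

def attention_weights(vectors, query, W, b):
    """对一组特征向量计算注意力权重:先打分 q·tanh(W v + b),再 softmax 归一化。"""
    scores = np.array([query @ np.tanh(W @ v + b) for v in vectors])
    e = np.exp(scores - scores.max())    # 减去最大值保证数值稳定
    return e / e.sum()

def fuse(vectors, query, W, b):
    """加权求和融合,对应公式(1)(域内)与公式(3)(域间)的形式。"""
    vectors = [np.asarray(v, dtype=float) for v in vectors]
    weights = attention_weights(vectors, query, W, b)
    return sum(w * v for w, v in zip(weights, vectors))
```

域内融合时 vectors 为同一特征域内各用户特征的特征向量,域间融合时 vectors 为各特征域的域内特征向量,两处可使用各自独立学习的语义向量、变换矩阵与偏置。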
需要说明的是,对用户特征的特征向量进行域内融合和域间融合的方法并不仅限于上述加权求和的方法,也可以是直接相加的方法,还可以是域内融合和域间融合中一个采用加权求和的方法,另一个采用直接相加的方法,对此,本申请不做具体限定。
示例性地,如图3所示,预先设置5个特征域,分别为第一特征域至第五特征域,其中,性别为第一特征域中的用户特征,年龄为第二特征域中的用户特征,职位为第三特征域中的用户特征,从视频应用A的历史行为数据中获取标签、类目、主题作为第四特征域中的用户特征,从音频应用B的历史行为数据中获取标签、类目、主题作为第五特征域中的用户特征。
提取每个特征域内的用户特征的特征向量,其中,第一特征域中的特征向量为性别特征向量,第二特征域中的特征向量为年龄特征向量,第三特征域中的特征向量为职位特征向量,第四特征域中的特征向量包括标签特征向量、类目特征向量、主题特征向量。第五特征域中的特征向量包括标签特征向量、类目特征向量、主题特征向量。
由于第一特征域、第二特征域和第三特征域中都只有一个特征向量,故可以不进行域内融合。采用上述公式(1)将第四特征域中的标签特征向量、类目特征向量、主题特征向量进行加权求和,获得第四特征域的域内特征向量,采用上述公式(1)将第五特征域中的标签特征向量、类目特征向量、主题特征向量进行加权求和,获得第五特征域的域内特征向量。
采用上述公式(3)将第一特征域的性别特征向量、第二特征域的年龄特征向量、第三特征域的职位特征向量、第四特征域的域内特征向量以及第五特征域的域内特征向量进行域间融合,获得目标用户的用户特征向量。
实施方式二、将目标用户在多个特征域中的用户特征的特征向量进行融合,获得目标用户的用户特征向量。
在一种可能的实现方式中,可以采用加权求和的方式将目标用户在多个特征域中的用户特征的特征向量进行融合,获得目标用户的用户特征向量,加权求和时每个特征向量的权重可以在训练用户画像模型时采用注意力机制学习获得。也可以采用直接相加的方式将目标用户在多个特征域中的用户特征的特征向量进行融合,获得目标用户的用户特征向量。
示例性地,如图4所示,预先设置5个特征域,分别为第一特征域至第五特征域,其中,性别为第一特征域中的用户特征,年龄为第二特征域中的用户特征,职位为第三特征域中的用户特征,从视频应用A的历史行为数据中获取标签、类目、主题作为第四特征域中的用户特征,从音频应用B的历史行为数据中获取标签、类目、主题作为第五特征域中的用户特征。
提取每个特征域内的用户特征的特征向量,其中,第一特征域中的特征向量为性别特征向量,第二特征域中的特征向量为年龄特征向量,第三特征域中的特征向量为职位特征向量,第四特征域中的特征向量包括标签特征向量、类目特征向量、主题特征向量。第五特征域中的特征向量包括标签特征向量、类目特征向量、主题特征向量。
将第一特征域中的性别特征向量、第二特征域中的年龄特征向量、第三特征域中的职位特征向量、第四特征域中的标签特征向量、类目特征向量、主题特征向量以及第五特征域中的标签特征向量、类目特征向量、主题特征向量进行加权求和,获得目标用户的用户特征向量,每个特征向量对应的权重是在训练用户画像模型时采用注意力机制学习获得的。
通过融合目标用户在多个特征域中特征向量,获得用户特征向量,使用户特征向量能更加全面的表征用户特征,进而有效提高基于用户特征向量匹配用户标签的准确度。
在步骤S202中,在获得多媒体内容的内容标签的标签特征向量时,先确定目标应用中多媒体内容中的每个多媒体内容在多个标签域中的内容标签,然后通过用户画像模型,提取每个标签域中的内容标签的标签特征向量。
其中,标签域为表征多媒体内容的标签维度,不同的标签域所表征的标签维度不同,各个标签域中的内容标签可能完全不同,也可能部分相同。标签域可以是内容标签域、类目标签域、主题标签域、公众号标签域等。通过用户画像模型对每个标签域中的内容标签进行embedding处理,获得每个标签域中的内容标签的标签特征向量。
示例性地,预先设置5个标签域,分别为内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域。
以目标应用中的一条体育新闻举例来说,设定该体育新闻描述了一场足球比赛,参赛队伍为M队和N队。从该条新闻中获取的内容标签包括:体育、足球、M队、N队,将获取的内容标签作为内容标签域中的标签。该条新闻对应的一级类目为体育,则将内容标签“体育”作为一级类目标签域中的标签。该条新闻对应的二级类目为足球,则将内容标签“足球”作为二级类目标签域中的标签。该条新闻的内容精要为足球赛事,则将内容标签“足球”作为主题标签域中的标签。该条新闻来源于Q体育公众号,则将Q体育公众号作为公众号标签域中的标签。其他多媒体内容也可以采用相同的方式确定在各个标签域中的内容标签,此处不再赘述。
本申请实施例中,预先设置多个标签域表征多媒体内容中的标签,便于后续为目标用户匹配多个维度的内容标签,从而提高了用户画像的准确性。
需要说明的是,标签域划分的实施方式并不仅限于上述举例的一种,还可以是内容标签域、类目标签域、主题标签域、公众号标签域中部分标签域的组合,对此,本申请不做具体限定。另外,本申请中也可以不设置标签域,直接从多媒体内容中获取内容标签,然后对多媒体内容的内容标签进行特征提取,确定内容标签的标签特征向量,对此,本申请不做具体限定。
在本申请实施例中,在获得目标用户的用户特征向量以及目标应用中多媒体内容的内容标签的标签特征向量之后,采用以下方式确定目标用户的备选标签:
确定目标用户的用户特征向量与每个标签域中的内容标签的标签特征向量之间的相似度,然后将多媒体内容在多个标签域中的内容标签中,相似度满足预设条件的内容标签确定为目标用户的备选标签。
在一种可能的实现方式中,目标用户的用户特征向量与内容标签的标签特征向量之间的相似度可以是用户特征向量与标签特征向量之间的点积值、欧氏距离、余弦相似度等。
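三种相似度度量可示意如下(欧氏距离取负号,使数值越大表示越相似,便于统一比较;similarity 函数名与 metric 参数名为示例性假设):

```python
import numpy as np

def similarity(user_vec, tag_vec, metric="cosine"):
    """计算用户特征向量与标签特征向量之间的相似度:点积、欧氏距离或余弦相似度。"""
    u = np.asarray(user_vec, dtype=float)
    t = np.asarray(tag_vec, dtype=float)
    if metric == "dot":
        return float(u @ t)
    if metric == "euclidean":
        return -float(np.linalg.norm(u - t))   # 距离越小越相似,取负统一为"越大越相似"
    if metric == "cosine":
        return float(u @ t / (np.linalg.norm(u) * np.linalg.norm(t)))
    raise ValueError(f"unknown metric: {metric}")
```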
一种可能的实施方式,预先设置相似度阈值,不同的标签域可以设置相同的相似度阈值,也可以设置不同的相似度阈值,对此,本申请不做具体限定。针对每个标签域,确定目标用户的用户特征向量与该标签域中的内容标签的标签特征向量之间的相似度,将该标签域中相似度大于相似度阈值的内容标签作为目标用户的备选标签。
一种可能的实施方式,预先设置标签数量阈值,不同的标签域可以设置相同的标签数量阈值,也可以设置不同的标签数量阈值,对此,本申请不做具体限定。针对每个标签域,确定目标用户的用户特征向量与该标签域中每个内容标签的标签特征向量之间的相似度,然后按照相似度从大到小的顺序对该标签域中的内容标签进行排序,将排在前W位的内容标签作为目标用户的备选标签,其中,W为该标签域对应的标签数量阈值。
本申请实施例中,预先设置多个标签域表征多媒体内容中的内容标签,然后基于目标用户的用户特征向量与每个标签域中的内容标签的标签特征向量之间的相似度,从每个标签域的内容标签中获得目标用户的备选标签,故获得的备选标签也是多维度的,从而使获得的用户画像更加全面,后续也可以基于多维度的用户画像向用户推荐更加准确的内容。
下面具体介绍训练用户画像模型的过程,训练过程可以由计算机设备执行,该计算机设备可以是图1所示的终端设备101或服务器102,具体包括以下步骤:
采用待训练的用户画像模型和训练样本进行多次迭代训练,获得用户画像模型,训练样本包括样本多媒体内容和样本用户的用户特征,每次迭代训练包括:
提取样本用户的用户特征的特征向量和样本多媒体内容的内容标签的标签特征向量。然后对样本用户的用户特征的特征向量进行层级嵌入处理,获得样本用户的用户特征向量。对标签特征向量进行层级嵌入处理,获得样本多媒体内容的内容特征向量,之后再基于样本用户的用户特征向量与样本多媒体内容的内容特征向量之间的关联度,调整待训练的用户画像模型的参数。
需要说明的是,用户画像模型的结构和训练方式至少包括以下几种:
实施方式一、如图5所示,用户画像模型包括第一子模型、第二子模型、预估层,其中,第一子模型中包括第一输入层、第一域内融合层、第一域间融合层,第二子模型中包括第二输入层、第二域内融合层、第二域间融合层。
在训练用户画像模型时,针对第一子模型,先根据样本用户的属性信息以及历史行为数据,确定样本用户在多个特征域中的用户特征,然后通过第一输入层将样本用户在多个特征域中的用户特征输入待训练的第一子模型。第一输入层对样本用户在每个特征域中的用户特征进行特征提取,获得每个特征域内的用户特征的特征向量,并将用户特征的特征向量输入第一域内融合层。第一域内融合层将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量,并将每个特征域的域内特征向量输入第一域间融合层。第一域间融合层将多个特征域的域内特征向量进行融合,获得样本用户的用户特征向量,然后将样本用户的用户特征向量输入预估层。在一些情况下,将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量时,可以采用加权求和或直接相加等方式。将多个特征域的域内特征向量进行融合,获得样本用户的用户特征向量时,可以采用加权求和或直接相加等方式。
针对第二子模型,先确定目标应用中的样本多媒体内容在多个标签域中的内容标签,然后通过第二输入层将样本多媒体内容在多个标签域中的内容标签输入待训练的第二子模型。第二输入层提取每个标签域中的内容标签的标签特征向量,然后将每个标签域中的内容标签的标签特征向量输入第二域内融合层。第二域内融合层将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量,然后将每个标签域的域内标签向量输入第二域间融合层。第二域间融合层将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量,然后将样本多媒体内容的内容特征向量输入预估层。在一些情况下,将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量时,可以采用加权求和或直接相加等方式。将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量时,可以采用加权求和或直接相加等方式。
预估层用于预测样本用户与目标应用中的样本多媒体内容之间的关联度,例如,预估层可以通过计算用户特征向量与内容特征向量之间的点积值或欧氏距离或余弦相似度等,确定样本用户与目标应用中的样本多媒体内容之间的关联度。在训练过程中使用交叉熵定义损失函数,使用自适应矩估计(Adaptive Moment Estimation,Adam)对损失函数进行优化,当损失函数满足预设条件时,训练结束。损失函数具体如公式(5)所示:
$$L=-\sum_{k=1}^{K}\left[\hat{y}_k\log y_k+\left(1-\hat{y}_k\right)\log\left(1-y_k\right)\right] \tag{5}$$

其中,$y_k$为用户画像模型预测获得的第$k$个样本多媒体内容与样本用户之间的关联度($0\le y_k\le 1$),$\hat{y}_k$为实际的第$k$个样本多媒体内容与样本用户之间的关联度($\hat{y}_k$为0或1),$K$为样本多媒体内容的数量。
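公式(5)的二分类交叉熵损失可示意如下(此处对 K 个样本取平均,这一归一化方式为实现上的常见假设;函数名亦为示例):

```python
import numpy as np

def cross_entropy_loss(y_pred, y_true):
    """二分类交叉熵:-mean[ ŷ·log(y) + (1-ŷ)·log(1-y) ],对应公式(5)的形式。"""
    # 将预测值截断到 (0, 1) 开区间,防止 log(0)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 1e-12, 1 - 1e-12)
    y_true = np.asarray(y_true, dtype=float)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))
```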
一般可以采用下述公式(6)确定预测获得的第$k$个样本多媒体内容与样本用户之间的关联度$y_k$:

$$y_k=\sigma\!\left(\boldsymbol{v}_k^\top\boldsymbol{u}\right)=\frac{1}{1+e^{-\boldsymbol{v}_k^\top\boldsymbol{u}}} \tag{6}$$

其中,$\boldsymbol{v}_k$为第$k$个样本多媒体内容的内容特征向量,$\boldsymbol{u}$为样本用户的用户特征向量。
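公式(6)的"先点积再 sigmoid 归一化"可示意如下(relevance 为示例性函数名):

```python
import numpy as np

def relevance(content_vec, user_vec):
    """y_k = sigmoid(内容特征向量与用户特征向量的点积),对应公式(6)。"""
    dot = float(np.dot(np.asarray(content_vec, dtype=float),
                       np.asarray(user_vec, dtype=float)))
    return 1.0 / (1.0 + np.exp(-dot))
```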
实施方式二、如图6所示,用户画像模型包括第一子模型、第二子模型、预估层,其中,第一子模型中包括第一输入层、第一融合层,第二子模型中包括第二输入层、第二融合层。
在训练用户画像模型时,针对第一子模型,先根据样本用户的属性信息以及历史行为数据,确定样本用户在多个特征域中的用户特征,然后通过第一输入层将样本用户在多个特征域中的用户特征输入待训练的第一子模型。第一输入层对样本用户在每个特征域中的用户特征进行特征提取,获得每个特征域内的用户特征的特征向量,并将用户特征的特征向量输入第一融合层。第一融合层将多个特征域内的用户特征的特征向量进行融合,获得样本用户的用户特征向量,然后将样本用户的用户特征向量输入预估层。融合可以采用加权求和与直接相加中的任意一种方法。
针对第二子模型,先确定目标应用中的样本多媒体内容在多个标签域中的内容标签,然后通过第二输入层将样本多媒体内容在多个标签域中的内容标签输入待训练的第二子模型。第二输入层提取每个标签域中的内容标签的标签特征向量,然后将每个标签域中的内容标签的标签特征向量输入第二融合层。第二融合层将多个标签域的内容标签的标签特征向量融合,获得样本多媒体内容的内容特征向量,然后将样本多媒体内容的内容特征向量输入预估层。融合可以采用加权求和与直接相加中的任意一种方法。
预估层用于预测样本用户与目标应用中的样本多媒体内容之间的关联度,例如,预估层可以通过计算用户特征向量与内容特征向量之间的点积值或欧氏距离或余弦相似度等,确定样本用户与目标应用中的样本多媒体内容之间的关联度。在训练过程中使用交叉熵定义损失函数,使用Adam对损失函数进行优化,损失函数具体如公式(5)所示,当损失函数满足预设条件时,训练结束。
实施方式三、如图6所示,用户画像模型包括第一子模型、第二子模型、预估层,其中,第一子模型中包括第一输入层、第一融合层,第二子模型中包括第二输入层、第二融合层。
在训练用户画像模型时,针对第一子模型,先根据样本用户的属性信息以及历史行为数据,确定样本用户的多个用户特征,然后通过第一输入层将样本用户的多个用户特征输入待训练的第一子模型。第一输入层对样本用户的多个用户特征进行特征提取,获得多个特征向量,并将多个特征向量输入第一融合层。第一融合层将多个特征向量进行融合,获得样本用户的用户特征向量,然后将样本用户的用户特征向量输入预估层。融合可以采用加权求和与直接相加中的任意一种方法。
针对第二子模型,先确定目标应用中的样本多媒体内容的多个内容标签,然后通过第二输入层将样本多媒体内容的多个内容标签输入待训练的第二子模型。第二输入层提取多个内容标签的标签特征向量,然后将多个标签特征向量输入第二融合层。第二融合层将多个标签特征向量融合,获得样本多媒体内容的内容特征向量,然后将样本多媒体内容的内容特征向量输入预估层。融合可以采用加权求和与直接相加中的任意一种方法。
预估层用于预测样本用户与目标应用中的样本多媒体内容之间的关联度,例如,预估层可以通过计算用户特征向量与内容特征向量之间的点积值或欧氏距离或余弦相似度等,确定样本用户与目标应用中的样本多媒体内容之间的关联度。在训练过程中使用交叉熵定义损失函数,使用Adam对损失函数进行优化,损失函数具体如公式(5)所示,当损失函数满足预设条件时,训练结束。
实施方式四、如图7所示,用户画像模型包括第一子模型、第二子模型、预估层,其中,第一子模型中包括第一输入层、第一融合层,第二子模型中包括第二输入层、第二域内融合层、第二域间融合层。
在训练用户画像模型时,针对第一子模型,先根据样本用户的属性信息以及历史行为数据,确定样本用户的多个用户特征,然后通过第一输入层将样本用户的多个用户特征输入待训练的第一子模型。第一输入层对样本用户的多个用户特征进行特征提取,获得多个特征向量,并将多个特征向量输入第一融合层。第一融合层将多个特征向量进行融合,获得样本用户的用户特征向量,然后将样本用户的用户特征向量输入预估层。融合可以采用加权求和与直接相加中的任意一种方法。
针对第二子模型,先确定目标应用中的样本多媒体内容在多个标签域中的内容标签,然后通过第二输入层将样本多媒体内容在多个标签域中的内容标签输入待训练的第二子模型。第二输入层提取每个标签域中的内容标签的标签特征向量,然后将每个标签域中的内容标签的标签特征向量输入第二域内融合层。第二域内融合层将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量,然后将每个标签域的域内标签向量输入第二域间融合层。第二域间融合层将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量,然后将样本多媒体内容的内容特征向量输入预估层。域内融合可以采用加权求和与直接相加中的任意一种方法,域间融合可以采用加权求和与直接相加中的任意一种方法。
预估层用于预测样本用户与目标应用中的样本多媒体内容之间的关联度,例如,预估层可以通过计算用户特征向量与内容特征向量之间的点积值或欧氏距离或余弦相似度等,确定样本用户与目标应用中的样本多媒体内容之间的关联度。在训练过程中使用交叉熵定义损失函数,使用自适应矩估计(Adaptive Moment Estimation,Adam)对损失函数进行优化,损失函数具体如公式(5)所示,当损失函数满足预设条件时,训练结束。
需要说明的是,本申请中用户画像模型的结构并不仅限于上述四种,还可以是第一子模型和第二子模型组合获得的其他结构,对此,本申请不做具体限定。
本申请实施例中,在训练时将样本多媒体内容的内容标签的标签特征向量进行多层级融合后,获得样本多媒体内容的内容特征向量,然后基于样本用户的用户特征向量与目标应用中的样本多媒体内容的内容特征向量之间的关联度训练得到用户画像模型,采用这种层级嵌入的方式训练获得的模型考虑了样本多媒体内容中内容标签之间的约束关系,而不是单独考虑内容标签与用户之间的关系,故采用训练获得的模型确定与用户匹配的内容标签时,可以匹配更准确的内容标签,从而构建更精准的用户画像。其次,基于样本用户的用户特征向量与目标应用中的样本多媒体内容的内容特征向量之间的关联度对模型进行训练,而不是从样本多媒体内容中抽取标签,基于样本用户的用户特征向量与标签向量之间的关联度进行训练,从而保持了样本数据的原始分布,使得画像预估结果更加准确。
在上述任意一个实施例的基础上,本申请实施例至少提供以下两种内容推荐方法:
实施方式一:
在获得目标用户的用户画像之后,基于目标用户的用户画像,从目标应用的多媒体内容中确定推荐给目标用户的目标多媒体内容。
在一种可能的实现方式中,可以根据用户画像中的备选标签从目标应用的多媒体内容中获取备选标签匹配的多媒体内容,然后将匹配的多媒体内容推荐给目标用户。由于用户画像中的备选标签可能来自不同的标签域,故可以根据实际需求采用不同标签域中的备选标签从目标应用的多媒体内容中获取备选标签匹配的多媒体内容。比如,标签域包括内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域,若用户画像的备选标签中包括从上述5个标签域中获得的内容标签,则可以从上述5个标签域中选取一个或多个目标标签域,然后采用用户画像中目标标签域对应的备选标签,从目标应用的多媒体内容中获取备选标签匹配的多媒体内容。
实施方式二:
通过用户画像模型,对每个多媒体内容在多个标签域中的内容标签的标签特征向量进行层级嵌入处理,确定每个多媒体内容的内容特征向量,并根据目标用户的用户特征向量与每个多媒体内容的内容特征向量的关联度,从多媒体内容中确定推荐给目标用户的目标多媒体内容。
在一种可能的实现方式中,对每个多媒体内容在多个标签域中的内容标签的标签特征向量进行层级嵌入处理,确定多媒体内容的内容特征向量时,至少包括以下几种实施方式:
一种可能的实施方式,将多媒体内容在每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量。将多个标签域的域内标签向量融合,获得多媒体内容的内容特征向量。
例如,将多媒体内容在每个标签域中的内容标签的标签特征向量直接相加,获得每个标签域的域内标签向量,然后将多个标签域的域内标签向量直接相加,获得多媒体内容的内容特征向量。
需要说明的是,对内容标签的标签特征向量进行域内融合和域间融合的方法并不仅限于上述直接相加的方法,也可以是加权求和的方法,加权求和的权重可以在训练用户画像模型时采用注意力机制学习获得,还可以是域内融合和域间融合中一个采用加权求和的方法,另一个采用直接相加的方法,对此,本申请不做具体限定。
示例性地,如图8所示,预先设置5个标签域,分别为内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域。
以目标应用中的一条体育新闻举例来说,设定该体育新闻描述了一场足球比赛,参赛队伍为M队和N队。从该条新闻中获取的内容标签包括:体育、足球、M队、N队,将获取的内容标签作为内容标签域中的标签。该条新闻对应的一级类目为体育,则将内容标签“体育”作为一级类目标签域中的标签。该条新闻对应的二级类目为足球,则将内容标签“足球”作为二级类目标签域中的标签。该条新闻的内容精要为足球赛事,则将内容标签“足球”作为主题标签域中的标签。该条新闻来源于Q体育公众号,则将Q体育公众号作为公众号标签域中的标签。
通过用户画像模型,提取每个标签域中的内容标签的标签特征向量,其中,内容标签域中的内容标签的标签特征向量包括体育标签特征向量、足球标签特征向量、M队标签特征向量、N队标签特征向量,一级类目标签域中的内容标签的标签特征向量包括体育标签特征向量,二级类目标签域中的内容标签的标签特征向量包括足球标签特征向量,主题标签域中的内容标签的标签特征向量包括足球标签特征向量,公众号标签域中的内容标签的标签特征向量包括公众号标签特征向量。
由于一级类目标签域、二级类目标签域、主题标签域和公众号标签域都只有一个标签特征向量,故可以不进行域内融合。采用直接相加的方式将内容标签域中4个内容标签的标签特征向量进行融合,获得内容标签域的域内标签向量。然后采用直接相加的方式对内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域中的域内标签向量进行融合,获得该条体育新闻的内容特征向量。
另一种可能的实施方式,将多媒体内容在多个标签域中的内容标签的标签特征向量融合,获得多媒体内容的内容特征向量。
例如,可以采用直接相加的方式将多媒体内容在多个标签域中的内容标签的标签特征向量融合,获得多媒体内容的内容特征向量。也可以采用加权求和的方式将多媒体内容在多个标签域中的内容标签的标签特征向量融合,获得多媒体内容的内容特征向量,加权求和的权重可以在训练用户画像模型时采用注意力机制学习获得。
示例性地,如图9所示,预先设置5个标签域,分别为内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域。
以目标应用中的一条体育新闻举例来说,设定该体育新闻描述了一场足球比赛,参赛队伍为M队和N队。从该条新闻中获取的内容标签包括:体育、足球、M队、N队,将获取的内容标签作为内容标签域中的标签。该条新闻对应的一级类目为体育,则将内容标签“体育”作为一级类目标签域中的标签。该条新闻对应的二级类目为足球,则将内容标签“足球”作为二级类目标签域中的标签。该条新闻的内容精要为足球赛事,则将内容标签“足球”作为主题标签域中的标签。该条新闻来源于Q体育公众号,则将Q体育公众号作为公众号标签域中的标签。
通过用户画像模型,提取每个标签域中的内容标签的标签特征向量,其中,内容标签域中的内容标签的标签特征向量包括体育标签特征向量、足球标签特征向量、M队标签特征向量、N队标签特征向量,一级类目标签域中的内容标签的标签特征向量包括体育标签特征向量,二级类目标签域中的内容标签的标签特征向量包括足球标签特征向量,主题标签域中的内容标签的标签特征向量包括足球标签特征向量,公众号标签域中的内容标签的标签特征向量包括公众号标签特征向量。
采用直接相加的方式对内容标签域、一级类目标签域、二级类目标签域、主题标签域和公众号标签域中的内容标签的标签特征向量进行融合,获得该条体育新闻的内容特征向量。
在确定每个多媒体内容的内容特征向量之后,根据目标用户的用户特征向量与每个多媒体内容的内容特征向量的关联度,从多媒体内容中确定推荐给目标用户的目标多媒体内容。
在一种可能的实现方式中,可以预先设置关联度阈值,当多媒体内容的内容特征向量与目标用户的用户特征向量之间的关联度大于关联度阈值时,将该多媒体内容推荐给目标用户,并在目标应用显示推荐的内容。也可以预先设置内容推荐数量阈值,按照关联度从大到小的顺序,对各个多媒体内容进行排序,将排在前R位的多媒体内容推荐给目标用户并在目标应用显示推荐的内容,R为内容推荐数量阈值。
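上述按关联度排序取前 R 位的推荐方式可示意如下(以 sigmoid 点积作为关联度;recommend 函数及参数名均为示例性假设):

```python
import numpy as np

def recommend(user_vec, contents, top_r=2):
    """contents 为 (内容ID, 内容特征向量) 列表;
    按用户特征向量与各内容特征向量的关联度从大到小排序,取前 R 位推荐。"""
    user_vec = np.asarray(user_vec, dtype=float)

    def rel(vec):
        # 关联度:sigmoid(用户特征向量与内容特征向量的点积)
        return 1.0 / (1.0 + np.exp(-float(np.dot(user_vec, np.asarray(vec, dtype=float)))))

    ranked = sorted(contents, key=lambda item: rel(item[1]), reverse=True)
    return [content_id for content_id, _ in ranked[:top_r]]
```

阈值筛选方式可在此基础上改为保留 rel(vec) 大于关联度阈值的内容。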
通过融合多媒体内容在多个标签域中的标签向量,获得多媒体内容的内容特征向量,使内容特征向量能更加全面的表征多媒体内容的特征,进而提高基于用户特征向量与内容特征向量匹配,获得推荐给用户的多媒体内容的精准度。
采用上述两种实施方式中任意一种方式确定推荐给目标用户的多媒体内容后,在目标应用中显示推荐的内容。
示例性地,在文档推荐场景中,如图10所示,设定目标应用为即时通信应用,推荐给目标用户的文档为足球比赛评论文章A、足球比赛新闻B以及足球球星采访报道C,则在即时通信应用的文章阅读模块中显示足球比赛评论文章A的链接、足球比赛新闻B的链接以及足球球星采访报道C的链接,目标用户可以点击链接查看相关文章。
示例性地,在广告推荐场景中,如图11所示,设定目标应用为即时通信应用,推荐给目标用户的广告为汽车广告时,则在即时通信应用的朋友圈中,展示汽车广告,目标用户可以点击广告图片查看广告或进入购买页面。
示例性地,在商品推荐场景中,如图12所示,设定目标应用为购物应用,确定出推荐给目标用户的商品为“菠萝”和“葡萄”时,在购物应用水果类别的推荐页面中,优先展示“菠萝”和“葡萄”的购买链接,比如将“菠萝”和“葡萄”的购买链接展示在推荐页面的最上端,将“香蕉”和“草莓”的购买链接展示在推荐页面的下端。
为了更好地解释本申请实施例,下面以目标应用为文档推荐应用为例,介绍本申请实施例提供的一种获得用户画像的方法,该方法由服务器执行。
首先介绍用户画像模型的结构,如图13所示,用户画像模型包括第一子模型、第二子模型、预估层,其中,第一子模型中包括第一输入层、第一域内融合层、第一域间融合层,第二子模型中包括第二输入层、第二域内融合层、第二域间融合层。
在训练用户画像模型时,针对第一子模型,预先设置P个特征域,分别为特征域1、特征域2、…、特征域P,先根据样本用户的属性信息以及历史行为数据,确定样本用户在P个特征域中的用户特征,然后通过第一输入层将样本用户在P个特征域中的用户特征输入待训练的第一子模型。第一输入层对样本用户在每个特征域中的用户特征进行embedding处理,获得每个特征域内的用户特征的特征向量,并将用户特征的特征向量输入第一域内融合层。第一域内融合层采用加权求和的方式将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量,并将每个特征域的域内特征向量输入第一域间融合层。第一域间融合层采用加权求和的方式将多个特征域的域内特征向量进行融合,获得样本用户的用户特征向量,然后将样本用户的用户特征向量输入预估层。
针对第二子模型,预先设置Q个标签域,分别为标签域1、标签域2、…、标签域Q,先确定文档推荐应用中的样本多媒体内容在Q个标签域中的内容标签,然后通过第二输入层将样本多媒体内容在Q个标签域中的内容标签输入待训练的第二子模型。第二输入层对每个标签域中的内容标签进行embedding处理,获得内容标签的标签特征向量,然后将每个标签域中的内容标签的标签特征向量输入第二域内融合层。第二域内融合层采用直接相加的方式将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量,然后将每个标签域的域内标签向量输入第二域间融合层。第二域间融合层采用直接相加的方式将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量,然后将样本多媒体内容的内容特征向量输入预估层。
预估层先计算样本用户的用户特征向量与样本多媒体内容的内容特征向量之间的点积值,然后采用sigmoid函数对点积值进行归一化处理,获得样本用户与样本多媒体内容之间的关联度。在训练过程中使用交叉熵定义损失函数,使用Adam对损失函数进行优化,损失函数具体如公式(5)所示,当损失函数满足预设条件时,训练结束。
在构建目标用户的用户画像时,首先获取目标用户的属性信息以及历史行为数据,目标用户的属性信息包括性别、年龄、地点等,历史行为数据包括目标用户在文档推荐应用之外的其他应用中的历史行为数据,比如目标用户在视频应用中的视频观看记录、在即时通信应用中的文档点击记录等。然后根据目标用户的属性信息以及历史行为数据,确定目标用户在P个特征域中的用户特征。通过上述训练好的第一子模型,对每个特征域内的用户特征进行embedding处理,获得每个特征域内的用户特征的特征向量。采用加权求和的方式将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量。然后采用加权求和的方式将P个特征域的域内特征向量进行融合,确定目标用户的用户特征向量。
针对文档推荐应用的每个多媒体内容,先确定该多媒体内容在Q个标签域中的内容标签,然后通过上述训练好的用户画像模型,对每个标签域中的内容标签进行embedding处理,获得每个内容标签的标签特征向量。
再确定目标用户的用户特征向量与每个标签域中的内容标签的标签特征向量之间的相似度,将相似度大于相似度阈值的标签特征向量对应的内容标签确定为目标用户的备选标签。然后基于目标用户的备选标签确定目标用户的用户画像,之后再基于目标用户的用户画像确定推荐给目标用户的文档推荐应用中的文档。
为了验证本申请实施例中获得用户画像的方法的效果,本申请以用户实际的点击日志对用户画像的效果进行了评估,获得的评估结果如表1所示:
表1.

方法      Prec@1   Prec@5   Prec@10
现有技术  0.4818   0.3546   0.2985
本申请    0.4957   0.3552   0.3018
其中,Prec@N为画像预估准确率指标,表示基于用户画像推荐给用户的内容中用户实际点击的比例,具体满足下述公式(7):
Prec@N = 用户实际点击的内容的数量 / N    (7)
其中,N为基于用户画像推荐给用户的内容的数量。
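公式(7)的 Prec@N 指标可示意如下(prec_at_n 为示例性函数名):

```python
def prec_at_n(recommended, clicked, n):
    """Prec@N = 推荐列表前 N 条中用户实际点击的数量 / N,对应公式(7)。"""
    clicked_set = set(clicked)
    hits = sum(1 for item in recommended[:n] if item in clicked_set)
    return hits / n
```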
本申请实施例中,在训练时将样本多媒体内容的内容标签的标签特征向量进行多层级融合后,获得样本多媒体内容的内容特征向量,然后基于样本用户的用户特征向量与目标应用中的样本多媒体内容的内容特征向量之间的关联度训练得到用户画像模型,采用这种层级嵌入的方式训练获得的模型考虑了样本多媒体内容中内容标签之间的约束关系,而不是单独考虑内容标签与用户之间的关系,故采用训练获得的模型确定与用户匹配的内容标签时,可以匹配更准确的内容标签,从而构建更精准的用户画像。其次,基于样本用户的用户特征向量与目标应用中的样本多媒体内容的内容特征向量之间的关联度对模型进行训练,而不是从样本多媒体内容中抽取标签,基于样本用户的用户特征向量与标签向量之间的关联度进行训练,从而保持了样本数据的原始分布,使得画像预估结果更加准确。
基于相同的技术构思,本申请实施例提供了一种获得用户画像的装置的结构示意图,如图14所示,该装置1400包括:
第一特征提取模块1401,用于根据目标用户的属性信息以及历史行为数据确定目标用户的用户特征向量;
第二特征提取模块1402,用于获取目标应用中多媒体内容的内容标签的标签特征向量;
匹配模块1403,用于根据用户特征向量与标签特征向量之间的相似度,从多媒体内容的内容标签中确定目标用户的备选标签;
处理模块1404,用于基于目标用户的备选标签确定目标用户的用户画像。
在一种可能的实现方式中,第二特征提取模块1402具体用于:
确定多媒体内容中的每个多媒体内容在多个标签域中的内容标签;
通过用户画像模型,提取每个标签域中内容标签的标签特征向量,用户画像模型是基于样本用户的用户特征向量与样本多媒体内容的内容特征向量之间的关联度训练得到的,样本多媒体内容的内容特征向量是对样本多媒体内容的内容标签的标签特征向量进行层级嵌入处理后获得的,样本用户的用户特征向量是对样本用户的用户特征的特征向量进行层级嵌入处理后获得的。
在一种可能的实现方式中,第二特征提取模块1402具体用于:
确定样本多媒体内容在多个标签域中的内容标签,并提取每个标签域中的内容标签的标签特征向量;
将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量。
在一种可能的实现方式中,第一特征提取模块1401具体用于:
确定样本用户在多个特征域中的用户特征,并提取每个特征域内的用户特征的特征向量;
将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
将多个特征域的域内特征向量进行融合,获得样本用户的用户特征向量。
在一种可能的实现方式中,第一特征提取模块1401具体用于:
根据目标用户的属性信息以及历史行为数据,确定目标用户在多个特征域中的用户特征;
通过用户画像模型,提取每个特征域内的用户特征的特征向量,并对每个特征域内的用户特征的特征向量进行层级嵌入处理,确定目标用户的用户特征向量。
在一种可能的实现方式中,第一特征提取模块1401具体用于:
将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
将多个特征域的域内特征向量进行融合,获得目标用户的用户特征向量。
在一种可能的实现方式中,匹配模块1403具体用于:
确定用户特征向量与每个标签域中的内容标签的标签特征向量之间的相似度;
将多媒体内容在多个标签域中的内容标签中,相似度满足预设条件的内容标签确定为目标用户的备选标签。
在一种可能的实现方式中,处理模块1404还用于:
通过用户画像模型,对每个多媒体内容在多个标签域中的内容标签的标签特征向量进行层级嵌入处理,确定每个多媒体内容的内容特征向量;
根据目标用户的用户特征向量与每个多媒体内容的内容特征向量的关联度,从多媒体内容中确定推荐给目标用户的目标多媒体内容。
在一种可能的实现方式中,处理模块1404具体用于:
将多媒体内容在每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
将多个标签域的域内标签向量融合,获得多媒体内容的内容特征向量。
基于相同的技术构思,本申请实施例提供了一种内容推荐装置的结构示意图,如图15所示,该装置1500包括:
获得用户画像的装置1400,用于获得目标用户的用户画像;
推荐模块1501,基于目标用户的用户画像,从目标应用的多媒体内容中确定推荐给目标用户的目标多媒体内容。
基于相同的技术构思,本申请实施例提供了一种用户画像模型的训练装置的结构示意图,如图16所示,该装置1600包括:
模型训练模块1601,用于采用待训练的用户画像模型和训练样本进行多次迭代训练,获得用户画像模型,训练样本包括样本多媒体内容和样本用户的用户特征,每次迭代训练包括:
提取样本用户的用户特征的特征向量和样本多媒体内容的内容标签的标签特征向量;
对样本用户的用户特征的特征向量进行层级嵌入处理,获得样本用户的用户特征向量;
对标签特征向量进行层级嵌入处理,获得样本多媒体内容的内容特征向量;
基于用户特征向量与内容特征向量之间的关联度,调整待训练的用户画像模型的参数。
在一种可能的实现方式中,模型训练模块1601具体用于:
确定样本用户在多个特征域中的用户特征,并提取每个特征域内的用户特征的特征向量;
确定样本多媒体内容在多个标签域中的内容标签,并提取每个标签域中的内容标签的标签特征向量。
在一种可能的实现方式中,模型训练模块1601具体用于:
将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
将多个标签域的域内标签向量融合,获得样本多媒体内容的内容特征向量。
在一种可能的实现方式中,模型训练模块1601具体用于:
将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
将多个特征域的域内特征向量进行融合,获得样本用户的用户特征向量。
基于相同的技术构思,本申请实施例提供了一种计算机设备,如图17所示,包括至少一个处理器1701,以及与至少一个处理器连接的存储器1702,本申请实施例中不限定处理器1701与存储器1702之间的具体连接介质,图17中以处理器1701和存储器1702之间通过总线连接为例。总线可以分为地址总线、数据总线、控制总线等。
在本申请实施例中,存储器1702存储有可被至少一个处理器1701执行的指令,至少一个处理器1701通过执行存储器1702存储的指令,可以执行前述获得用户画像的方法或内容推荐方法或用户画像模型的训练方法中所包括的步骤。
其中,处理器1701是计算机设备的控制中心,可以利用各种接口和线路连接计算机设备的各个部分,通过运行或执行存储在存储器1702内的指令以及调用存储在存储器1702内的数据,从而获得用户画像或进行内容推荐或训练用户画像模型。可选的,处理器1701可包括一个或多个处理单元,处理器1701可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。 可以理解的是,上述调制解调处理器也可以不集成到处理器1701中。在一些实施例中,处理器1701和存储器1702可以在同一芯片上实现,在一些实施例中,它们也可以在独立的芯片上分别实现。
处理器1701可以是通用处理器,例如中央处理器(CPU)、数字信号处理器、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本申请实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。
存储器1702作为一种非易失性计算机可读存储介质,可用于存储非易失性软件程序、非易失性计算机可执行程序以及模块。存储器1702可以包括至少一种类型的存储介质,例如可以包括闪存、硬盘、多媒体卡、卡型存储器、随机访问存储器(Random Access Memory,RAM)、静态随机访问存储器(Static Random Access Memory,SRAM)、可编程只读存储器(Programmable Read Only Memory,PROM)、只读存储器(Read Only Memory,ROM)、带电可擦除可编程只读存储器(Electrically Erasable Programmable Read-Only Memory,EEPROM)、磁性存储器、磁盘、光盘等等。存储器1702是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。本申请实施例中的存储器1702还可以是电路或者其它任意能够实现存储功能的装置,用于存储程序指令和/或数据。
基于同一发明构思,本申请实施例提供了一种计算机可读存储介质,其存储有可由计算机设备执行的计算机程序,当程序在计算机设备上运行时,使得计算机设备执行上述获得用户画像的方法或内容推荐方法或用户画像模型的训练方法的步骤。
本领域内的技术人员应明白,本发明的实施例可提供为方法、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
尽管已描述了本发明的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本发明范围的所有变更和修改。
显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。

Claims (17)

  1. 一种用户画像模型的训练方法,所述方法由计算机设备执行,所述方法包括:
    采用待训练的用户画像模型和训练样本进行多次迭代训练,获得用户画像模型,所述训练样本包括样本多媒体内容和样本用户的用户特征,每次迭代训练包括:
    提取样本用户的用户特征的特征向量和样本多媒体内容的内容标签的标签特征向量;
    对所述样本用户的用户特征的特征向量进行层级嵌入处理,获得所述样本用户的用户特征向量;
    对所述标签特征向量进行层级嵌入处理,获得所述样本多媒体内容的内容特征向量;
    基于所述用户特征向量与所述内容特征向量之间的关联度,调整所述待训练的用户画像模型的参数。
  2. 如权利要求1所述的方法,所述提取样本用户的用户特征的特征向量和样本多媒体内容的内容标签的标签特征向量,包括:
    确定所述样本用户在多个特征域中的用户特征,并提取每个特征域内的用户特征的特征向量;
    确定所述样本多媒体内容在多个标签域中的内容标签,并提取每个标签域中的内容标签的标签特征向量。
  3. 如权利要求2所述的方法,所述对所述样本多媒体内容的内容标签的标签特征向量进行层级嵌入处理,获得所述样本多媒体内容的内容特征向量,包括:
    将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
    将多个标签域的域内标签向量融合,获得所述样本多媒体内容的内容特征向量。
  4. 如权利要求2所述的方法,所述对所述样本用户的用户特征的特征向量进行层级嵌入处理,获得所述样本用户的用户特征向量,包括:
    将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
    将多个特征域的域内特征向量进行融合,获得所述样本用户的用户特征向量。
  5. 一种获得用户画像的方法,所述方法由计算机设备执行,所述方法包括:
    根据目标用户的属性信息以及历史行为数据确定所述目标用户的用户特征向量;
    获取目标应用中多媒体内容的内容标签的标签特征向量;
    根据所述用户特征向量与所述标签特征向量之间的相似度,从所述多媒体内容的内容标签中确定所述目标用户的备选标签;
    基于所述目标用户的备选标签确定所述目标用户的用户画像。
  6. 如权利要求5所述的方法,所述获取目标应用中多媒体内容的内容标签的标签特征向量,包括:
    确定所述多媒体内容中的每个多媒体内容在多个标签域中的内容标签;
    通过用户画像模型,提取每个标签域中内容标签的标签特征向量,所述用户画像模型是基于样本用户的用户特征向量与样本多媒体内容的内容特征向量之间的关联度训练得到的,所述样本多媒体内容的内容特征向量是对所述样本多媒体内容的内容标签的标签特征向量进行层级嵌入处理后获得的,所述样本用户的用户特征向量是对所述样本用户的用户特征的特征向量进行层级嵌入处理后获得的。
  7. 如权利要求6所述的方法,所述样本多媒体内容的内容特征向量是对所述样本多媒体内容的内容标签的标签特征向量进行层级嵌入处理后获得的,包括:
    确定所述样本多媒体内容在多个标签域中的内容标签,并提取每个标签域中的内容标签的标签特征向量;
    将每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
    将多个标签域的域内标签向量融合,获得所述样本多媒体内容的内容特征向量。
  8. 如权利要求6所述的方法,所述样本用户的用户特征向量是对所述样本用户的用户特征的特征向量进行层级嵌入处理后获得的,包括:
    确定所述样本用户在多个特征域中的用户特征,并提取每个特征域内的用户特征的特征向量;
    将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
    将多个特征域的域内特征向量进行融合,获得所述样本用户的用户特征向量。
  9. 如权利要求6所述的方法,所述根据目标用户的属性信息以及历史行为数据确定所述目标用户的用户特征向量,包括:
    根据所述目标用户的属性信息以及历史行为数据,确定所述目标用户在多个特征域中的用户特征;
    通过所述用户画像模型,提取每个特征域内的用户特征的特征向量,并对每个特征域内的用户特征的特征向量进行层级嵌入处理,确定所述目标用户的用户特征向量。
  10. 如权利要求9所述的方法,所述对每个特征域内的用户特征的特征向量进行层级嵌入处理,确定所述目标用户的用户特征向量,包括:
    将每个特征域内的用户特征的特征向量进行融合,获得每个特征域的域内特征向量;
    将多个特征域的域内特征向量进行融合,获得所述目标用户的用户特征向量。
  11. 如权利要求6所述的方法,所述根据所述用户特征向量与所述标签特征向量之间的相似度,从所述多媒体内容的内容标签中确定所述目标用户的备选标签,包括:
    确定所述用户特征向量与每个标签域中的内容标签的标签特征向量之间的相似度;
    将所述多媒体内容在多个标签域中的内容标签中,相似度满足预设条件的内容标签确定为所述目标用户的备选标签。
  12. 如权利要求6至11任一项所述的方法,所述方法还包括:
    通过所述用户画像模型,对每个多媒体内容在多个标签域中的内容标签的标签特征向量进行层级嵌入处理,确定每个多媒体内容的内容特征向量;
    根据所述目标用户的用户特征向量与每个多媒体内容的内容特征向量的关联度,从所述多媒体内容中确定推荐给所述目标用户的目标多媒体内容。
  13. 如权利要求12所述的方法,所述对每个多媒体内容在多个标签域中的内容标签的标签特征向量进行层级嵌入处理,确定每个多媒体内容的内容特征向量,包括:
    将多媒体内容在每个标签域中的内容标签的标签特征向量融合,获得每个标签域的域内标签向量;
    将多个标签域的域内标签向量融合,获得多媒体内容的内容特征向量。
  14. 一种用户画像模型的训练装置,包括:
    模型训练模块,用于采用待训练的用户画像模型和训练样本进行多次迭代训练,获得用户画像模型,所述训练样本包括样本多媒体内容和样本用户的用户特征,每次迭代训练包括:
    提取样本用户的用户特征的特征向量和样本多媒体内容的内容标签的标签特征向量;
    对所述样本用户的用户特征的特征向量进行层级嵌入处理,获得所述样本用户的用户特征向量;
    对所述标签特征向量进行层级嵌入处理,获得所述样本多媒体内容的内容特征向量;
    基于所述用户特征向量与所述内容特征向量之间的关联度,调整所述待训练的用户画像模型的参数。
  15. 一种获得用户画像的装置,包括:
    第一特征提取模块,用于根据目标用户的属性信息以及历史行为数据确定所述目标用户的用户特征向量;
    第二特征提取模块,用于获取目标应用中多媒体内容的内容标签的标签特征向量;
    匹配模块,用于根据所述用户特征向量与所述标签特征向量之间的相似度,从所述多媒体内容的内容标签中确定所述目标用户的备选标签;
    处理模块,用于基于所述目标用户的备选标签确定所述目标用户的用户画像。
  16. 一种计算机可读存储介质,其存储有可由计算机设备执行的计算机程序,当所述程序在计算机设备上运行时,使得所述计算机设备执行权利要求1~4任一项所述方法的步骤,或者权利要求5~13任一项所述方法的步骤。
  17. 一种计算机程序产品,当所述计算机程序产品被执行时,用于实现如上述权利要求1~4任一项所述方法的步骤,或者权利要求5~13任一项所述方法的步骤。
PCT/CN2021/102604 2020-08-14 2021-06-28 一种获得用户画像的方法及相关装置 WO2022033199A1 (zh)
