WO2022100518A1 - Method and device for recommending items based on user portraits

Method and device for recommending items based on user portraits

Info

Publication number
WO2022100518A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
processed
behavior data
attribute information
model
Prior art date
Application number
PCT/CN2021/128877
Inventor
陈伯梁
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司
Priority to EP21891044.6A (EP4242955A1)
Publication of WO2022100518A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0242 Determining effectiveness of advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0254 Targeted advertisements based on statistics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present disclosure relates to the field of computer technologies, and in particular to a method and device for recommending items based on user portraits.
  • User portraits are the key to e-commerce marketing activities, personalized recommendations, and basic data services. Only by obtaining accurate user portrait tags in real time can we compete for the best quality and most accurate user groups in the shortest time and at the least cost, and then conduct various marketing promotions and other activities to promote customer acquisition and retention.
  • A user portrait is the collection of data (such as shopping information, personal information, etc.) associated with a successfully registered user ID; it is a virtual data aggregate.
  • Existing technical solutions usually use a big data platform to store data such as users' shopping records, and then classify user group data through manual analysis and modeling.
  • Traditional user portraits require labeled data, but in practice the labeled data for most portrait labels is hard to obtain, costly to acquire, or so inaccurate that it is effectively noise. In other words, traditional user portraits can only process user data for a single business scenario (one with labeled data). Therefore, when user portraits are produced in the traditional way, the efficiency and accuracy of recommendations to users are low, and the user experience is poor.
  • the embodiments of the present disclosure provide a method and device for recommending items based on user portraits, which can solve the problem of low efficiency of marketing activities caused by existing user portraits with low accuracy.
  • A method for recommending items based on user portraits is provided, which includes: receiving user behavior data and user attribute information, and converting them through feature engineering into user behavior data to be processed and user attribute information to be processed; obtaining the current label calculation task and determining whether the label calculation task is a prediction task, and if so, calling the preset prediction model, and if not, calling the preset statistical rule model; and, according to the prediction model or the statistical rule model, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushing item information to the user according to the user portrait.
  • receive user behavior data including
  • the preprocessing model is called to preprocess user behavior data and user attribute information.
  • the user portrait is obtained based on the user behavior data to be processed and the user attribute information to be processed, including:
  • the user portrait is calculated through the DDPG algorithm model of Actor network gradient fusion.
  • the DDPG algorithm model of the Actor network gradient fusion including:
  • the cross-entropy loss value calculated by the preset supervised learning model is added to the Actor of the DDPG algorithm to evaluate the output value of the Actor.
  • obtain user portraits based on the user behavior data to be processed and the user attribute information to be processed including:
  • the weights of the first user portrait and the second user portrait are determined, so as to obtain the final user portrait by integrating the first user portrait and the second user portrait.
  • the deep reinforcement learning model uses the Actor-Critic algorithm.
  • The present disclosure also provides an item recommendation device based on user portraits, including: an acquisition module for receiving user behavior data and user attribute information, and converting them through feature engineering into user behavior data to be processed and user attribute information to be processed; and a processing module for obtaining the current label calculation task, determining whether the label calculation task is a prediction task, and if so, calling the preset prediction model, and otherwise calling the preset statistical rule model, and then, according to the prediction model or the statistical rule model, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and pushing item information to the user according to the user portrait.
  • The present disclosure adopts different task processing methods for user behavior data and user attribute information with different characteristics; that is, it supports user portrait processing in different scenarios. The supported scenarios include labeling tasks for labeled data, labeling tasks for unlabeled data, and labeling tasks based on statistical calculation of business rules.
  • The present disclosure can use deep reinforcement learning to train the various user portrait processing models in real time on real data from online operations. Further, the present disclosure integrates supervised learning and reinforcement learning, so that over the user's life cycle, each round of user portrait processing, acting through the activity operation, affects the user's subsequent behavior; that is, the results of successive rounds of processing are correlated.
  • FIG. 1 is a schematic diagram of the main flow of a method for recommending items based on user portraits according to a first embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a main flow of a method for recommending items based on user portraits according to another embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of main modules of an apparatus for recommending items based on user portraits according to an embodiment of the present disclosure
  • FIG. 4 is an exemplary system architecture diagram to which embodiments of the present disclosure may be applied;
  • FIG. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
  • FIG. 1 is a schematic diagram of the main process of a user portrait-based item recommendation method according to the first embodiment of the present disclosure. As shown in FIG. 1 , the user portrait-based item recommendation method includes:
  • Step S101 Receive user behavior data and user attribute information, and convert them into user behavior data to be processed and user attribute information to be processed through feature engineering.
  • receiving user behavior data includes:
  • A preprocessing model may be invoked to preprocess the user behavior data and user attribute information. For example, the data is first analyzed for missing values, noise, outliers, data types, etc.; based on the analysis results, preprocessing operations are then applied, such as data cleaning (handling missing values, noise, and outliers), data encoding, and data transformation (normalization, regularization, scaling, etc.).
  • Step S101 is a process of transforming, crossing, mapping, and extracting the records of user behavior data and user attribute information in different categories into the data required by the model.
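  • The preprocessing and feature engineering described in step S101 can be illustrated with a minimal sketch. The source does not specify concrete operations, so the choices here (mean imputation, clipping at three standard deviations, min-max scaling, one-hot encoding) and all function names are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def impute_missing(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if observed else 0.0
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=3.0):
    """Clamp each value to within k population standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    lo, hi = mu - k * sigma, mu + k * sigma
    return [min(max(v, lo), hi) for v in values]


def min_max_scale(values):
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(value, vocabulary):
    """Encode a categorical attribute as a one-hot vector over a fixed vocabulary."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def preprocess_column(values):
    """Cleaning followed by scaling for one numeric behavior column."""
    return min_max_scale(clip_outliers(impute_missing(values)))
```

  • For instance, `preprocess_column` could be applied to a per-category click count, and `one_hot` to a categorical user attribute, before the results are concatenated into the feature vector passed to the models.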
  • Step S102 Obtain the current label calculation task, determine whether the label calculation task belongs to the prediction task, if so, call a preset prediction model, otherwise call a preset statistical rule model.
  • The user portrait is calculated by a DDPG algorithm model with Actor network gradient fusion. Further, the cross-entropy loss value calculated by the preset supervised learning model is added to the Actor of the DDPG algorithm to evaluate the Actor's output value.
  • DDPG stands for Deep Deterministic Policy Gradient, a policy learning method that integrates deep neural networks into DPG (Deterministic Policy Gradient) and incorporates the Actor-Critic framework.
  • Step S103 according to the prediction model or the statistical rule model, obtain a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then push item information to the user according to the user portrait.
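  • Steps S102 and S103 amount to a dispatch on the type of label calculation task. The sketch below shows one way this could look; `LabelTask`, `build_portrait`, `recommend`, and the catalog format are hypothetical names introduced here, and the two models are passed in as plain callables rather than the disclosure's actual models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Features = List[float]
Portrait = Dict[str, str]


@dataclass
class LabelTask:
    name: str            # label to compute, e.g. "gender"
    is_prediction: bool  # True: prediction model; False: statistical rule model


def build_portrait(
    task: LabelTask,
    features: Features,
    prediction_model: Callable[[Features], str],
    rule_model: Callable[[Features], str],
) -> Portrait:
    """Step S102: choose the model by task type, then compute the label."""
    model = prediction_model if task.is_prediction else rule_model
    return {task.name: model(features)}


def recommend(portrait: Portrait, catalog: List[Tuple[str, Portrait]]) -> List[str]:
    """Step S103: push items whose target labels all match the user portrait."""
    return [item for item, target in catalog if target.items() <= portrait.items()]
```

  • With a gender task routed to the prediction model, a catalog entry such as `("razor", {"gender": "male"})` would be pushed only to users whose portrait carries that label.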
  • If the preset statistical rule model is invoked, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed includes:
  • the deep reinforcement learning model adopts the Actor-Critic algorithm.
  • In the Actor-Critic algorithm, deep reinforcement learning has three basic elements: state, action, and reward. In this disclosure they correspond to:
  • State: the user behavior data to be processed and user attribute information to be processed after feature engineering, such as sales figures, user behavior, product attributes, and user-product cross features.
  • Action: a set of labels in a machine learning task, a set of cluster categories in a clustering task, a calculation result of business rules, and so on.
  • Reward: the goal to be achieved (for example, increasing the click rate or GMV) is converted into a specific reward function R that guides the Agent toward the goal during learning, for example by using sales, etc. as rewards.
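  • As a rough illustration of turning a business goal into a reward function R, the sketch below combines a click signal and order amount (a stand-in for GMV) linearly. The source does not give the form of R; the `Feedback` fields, the linear form, and the weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    clicked: bool        # whether the user clicked the pushed item
    order_amount: float  # amount of any resulting order; 0.0 if none


def reward(feedback: Feedback, click_weight: float = 1.0, gmv_weight: float = 0.01) -> float:
    """Hypothetical reward R: weighted sum of the click signal and order amount."""
    return click_weight * float(feedback.clicked) + gmv_weight * feedback.order_amount
```

  • The weights express how the operator trades off click rate against GMV; in practice they would need tuning against the online indicators the activity is meant to improve.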
  • The portrait model can be regarded as an agent and the user as the environment, so marketing driven by the portrait model can be regarded as a sequential decision-making problem.
  • Each time a marketing activity is performed, the Agent (portrait model) makes a prediction and sends the prediction result to the user.
  • The user gives feedback signals such as clicks and browsing.
  • Agent portrait model
  • the present disclosure can use deep reinforcement learning to train a label calculation algorithm model for user portraits in real time in combination with real data of online operations.
  • Unlike traditional training methods for user portrait label models, the present disclosure integrates supervised learning and reinforcement learning, and uses different fusion methods to optimize the traditional label calculation for different computing scenarios.
  • Supported scenarios include labeling tasks with labeled data, labeling tasks with unlabeled data (clustering), and labeling tasks based on statistical calculation of business rules. That is, over the user's life cycle, each activity run on the basis of the portrait model affects the user's subsequent behavior, so the results predicted by the portrait model at successive times are correlated.
  • For example, suppose an activity screens for male users in order to sell razors. If the user is in fact male, the user may respond to the marketing activity; this behavior is the user's feedback data and should be added to model training promptly, so that the model is more likely to predict the user as male the next time an activity is pushed. Conversely, if the user is female, the pushed razor is more likely to be ignored, and this behavior should likewise be learned promptly and used to update the model.
  • FIG. 2 is a schematic diagram of the main process of a user portrait-based item recommendation method according to another embodiment of the present disclosure (in FIG. 2, DRL is deep reinforcement learning and ML is machine learning). The user portrait-based item recommendation method may include:
  • the received user behavior data and user attribute information are sequentially processed through data analysis, data preprocessing, and feature engineering to obtain user behavior data to be processed and user attribute information to be processed.
  • the data analysis is to obtain user behavior data under different categories according to the preset data quantity and data format.
  • Data preprocessing is to call the preprocessing model to preprocess user behavior data and user attribute information.
  • Feature engineering is the process of transforming, intersecting, mapping, and extracting the records of user behavior data and user attribute information in different categories into the data required by the model.
  • Depending on the current label calculation task, either the prediction model (the task flow on the left in FIG. 2) or the statistical rule model (the task flow on the right in FIG. 2) is called. For example, if the current label can be obtained simply by statistical methods such as SQL calculations, the statistical rule model is called. If the current label calculation task is a prediction task (for example, most gender labels must be calculated by machine learning algorithms and cannot be derived from rules), the prediction model is used.
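  • A label that "can be simply obtained by statistical methods such as SQL" might look like the following sketch, which uses Python's built-in `sqlite3` module. The `orders` table, the "frequent buyer" label, and the threshold are hypothetical; the source names no specific rule of this kind.

```python
import sqlite3


def label_frequent_buyers(orders, min_orders=3):
    """Label each user as a frequent buyer if they have at least min_orders orders.

    `orders` is a list of (user_id, amount) tuples; schema and label are assumed.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (user_id TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        rows = conn.execute(
            "SELECT user_id, COUNT(*) >= ? FROM orders "
            "GROUP BY user_id ORDER BY user_id",
            (min_orders,),
        ).fetchall()
    finally:
        conn.close()
    # SQLite returns the comparison as 0 or 1; convert to bool for the label.
    return {user_id: bool(flag) for user_id, flag in rows}
```

  • A rule of this kind needs no training data, which is why such tasks are routed to the statistical rule model rather than the prediction model.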
  • the task process on the left is based on the DDPG algorithm and adds a machine learning model as the gradient fusion of the Actor network, so that the reinforcement learning model (such as the DDPG algorithm) can learn the gradient from the machine learning model.
  • The machine learning model (Model) in the left task flow in FIG. 2 may be a supervised learning model, such as SVM (support vector machine) or XGBoost (xgb, an industrial implementation of GBDT). Specifically:
  • The Actor networks are Actor_eval_net and Actor_target_net.
  • The Critic networks are Critic_eval_net and Critic_target_net.
  • Actor_eval_net takes state as input (in this disclosure, the feature vector: the user behavior data to be processed and the user attribute information to be processed) and outputs the behavior action.
  • Actor_target_net takes next_state as input (applying the action to the environment yields next_state and the corresponding reward) and outputs the behavior next_action.
  • Critic_eval_net takes action as input and outputs the corresponding Q value. The Q value is the value of the Q(state, action) (Quality) function: the discounted future reward obtained when the agent takes a given action in a given state and acts optimally thereafter.
  • Critic_target_net takes next_action and next_state as input and outputs the corresponding Q value of the behavior.
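  • How the target networks are typically used can be sketched as follows. The source does not give the update equations, so this assumes the standard DDPG formulation: the Critic is trained toward a target built from the target networks, and the target networks slowly track the eval networks. The discount factor `gamma`, the rate `tau`, and the function names are assumptions, and the networks are represented as plain callables and parameter lists.

```python
from typing import Callable, List, Sequence

State = Sequence[float]


def td_target(
    reward: float,
    next_state: State,
    actor_target: Callable[[State], float],
    critic_target: Callable[[State, float], float],
    gamma: float = 0.9,
) -> float:
    """Target Q value: reward plus the discounted Q of the target networks.

    Actor_target_net maps next_state to next_action; Critic_target_net scores
    the (next_state, next_action) pair.
    """
    next_action = actor_target(next_state)
    return reward + gamma * critic_target(next_state, next_action)


def soft_update(target_params: List[float], eval_params: List[float], tau: float = 0.01) -> List[float]:
    """Move target-network parameters a small step toward the eval network."""
    return [(1.0 - tau) * t + tau * e for t, e in zip(target_params, eval_params)]
```

  • In standard DDPG, Critic_eval_net would be fitted to `td_target` by regression, and `soft_update` would be applied to both Actor_target_net and Critic_target_net after each training step.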
  • For example, when predicting user gender, the state input to the Actor_eval_net prediction network is the set of features relevant to the user's gender, that is, the features calculated by feature engineering, including the user's behavior data in different categories (browsing, placing orders, adding to cart, clicks, reviews, etc.), sales figures, product attributes, user-product cross features, and so on.
  • The output action is the predicted gender label: male (0), female (1), or unknown (-1).
  • The salesperson selects the desired gender according to the conditions of the marketing scenario, and then carries out delivery, message pushes, etc.
  • A male user may then click, place an order, add to cart, comment, and so on.
  • These behaviors under different categories are input to Actor_target_net as the next state (next_state), which outputs the next action (next_action). Likewise, the input of Critic_eval_net is the action obtained from the current state (selecting male or female), and its output is the Q value of that behavior; the input of Critic_target_net is the next action and the next state, and its output is the corresponding Q value.
  • The present disclosure uses a supervised learning model (Model), for example SVM (support vector machine) or XGBoost (Extreme Gradient Boosting, an efficient implementation of GBDT): the corresponding features (user behavior data and user attribute information) and labeled data are input to train the model, which then predicts labels for the unlabeled data to supplement it.
  • A network layer is constructed to calculate the cross-entropy loss of the Actor's output with respect to the supervised learning model, where:
  • a: the action output by the network
  • x: the user behavior data and user attribute information
  • y: the labeled data
  • the network parameters to be trained
  • The symbols used when this loss is combined with the Actor are:
  • s: the state of the agent at a certain moment
  • a: the action performed at a certain moment
  • Q(s, a): the discounted future reward when the agent takes a given action in a given state and acts optimally thereafter
  • the weight of the supervised learning model, a hyperparameter to be adjusted
  • the network parameters of the Actor
  • the action value of the Actor for the state at a certain moment
  • N: the number of model training iterations
  • The present disclosure adds the cross-entropy loss calculated by the supervised learning model to the Actor and uses it to evaluate the Actor's output value. This enhances the stability of the reinforcement learning model, makes full use of the labeled data, and improves the performance of the entire prediction model.
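  • One plausible reading of this fusion is an Actor objective that adds a weighted cross-entropy term to the usual policy term. The sketch below is not the disclosure's formula, which is not recoverable from this text; it assumes the Actor outputs a probability for each label, that the policy term is the negative mean Q value, and that `lam` plays the role of the supervised-model weight.

```python
import math
from typing import List, Sequence


def cross_entropy(probs: Sequence[float], label_index: int) -> float:
    """Cross-entropy of one predicted distribution against a known label."""
    return -math.log(max(probs[label_index], 1e-12))


def fused_actor_loss(
    q_values: List[float],
    action_probs: List[Sequence[float]],
    labels: List[int],
    lam: float = 0.5,
) -> float:
    """Assumed fused loss: -mean(Q) + lam * mean(cross-entropy).

    q_values[i] is the Critic's Q value for sample i, action_probs[i] is the
    Actor's label distribution for sample i, and labels[i] is the labeled data.
    """
    n = len(q_values)
    policy_term = -sum(q_values) / n
    supervised_term = sum(cross_entropy(p, y) for p, y in zip(action_probs, labels)) / n
    return policy_term + lam * supervised_term
```

  • Under this reading, an Actor that disagrees with the labeled data pays a penalty even when the Critic's Q estimate is high, which is one way the labeled data could stabilize training as the text describes.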
  • The task flow on the right uses user feedback data for the deep reinforcement learning model calculation (the training process is the same as in the left task flow, but without the Model part) and also performs a business rule model calculation. For example, to calculate whether a user is married when no labeled data is available, the business rule model checks whether the names of historically purchased items include feature words associated with being married, such as "pregnancy" or "children". Finally, the weights of the two calculation results are determined from online operational performance indicators (such as conversion rate, length of stay, etc.), and the results are fused.
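  • The right-hand task flow can be sketched as a keyword rule plus a weighted fusion of two portraits. The keyword list follows the example in the text; the fusion rule (weights proportional to each model's online metric, applied to per-label scores) and all names are assumptions, since the source does not state how the weights are derived.

```python
from typing import Dict, Iterable

# Feature words taken from the example in the text.
MARRIED_KEYWORDS = ("pregnancy", "children")


def rule_is_married(purchased_item_names: Iterable[str]) -> bool:
    """Business rule: any historically purchased item name contains a feature word."""
    return any(
        keyword in name.lower()
        for name in purchased_item_names
        for keyword in MARRIED_KEYWORDS
    )


def fuse_portraits(
    first: Dict[str, float],
    second: Dict[str, float],
    first_metric: float,
    second_metric: float,
) -> Dict[str, float]:
    """Fuse two portraits (label -> score) with weights proportional to an
    online performance metric such as conversion rate (assumed weighting)."""
    total = first_metric + second_metric
    w1 = first_metric / total if total > 0 else 0.5
    w2 = 1.0 - w1
    labels = set(first) | set(second)
    return {
        label: w1 * first.get(label, 0.0) + w2 * second.get(label, 0.0)
        for label in labels
    }
```

  • Here the first portrait could come from the deep reinforcement learning model and the second from the business rule model, so the result with the better online performance contributes more to the final portrait.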
  • FIG. 3 is a schematic diagram of main modules of a user portrait-based item recommendation apparatus according to an embodiment of the present disclosure.
  • the user portrait-based item recommendation apparatus 300 includes an acquisition module 301 and a processing module 302 .
  • the acquisition module 301 receives user behavior data and user attribute information, and converts them into user behavior data to be processed and user attribute information to be processed through feature engineering.
  • The processing module 302 obtains the current label calculation task and determines whether the label calculation task is a prediction task; if so, it calls the preset prediction model, and otherwise it calls the preset statistical rule model. According to the prediction model or the statistical rule model, it obtains a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushes item information to the user according to the user portrait.
  • acquisition module 301 receives user behavior data, including
  • After the acquisition module 301 receives the user behavior data and the user attribute information, the following is included:
  • the preprocessing model is called to preprocess user behavior data and user attribute information.
  • the processing module 302 obtains the user portrait based on the user behavior data to be processed and the user attribute information to be processed according to the prediction model, including:
  • the user portrait is calculated through the DDPG algorithm model of Actor network gradient fusion.
  • the DDPG algorithm model of the Actor network gradient fusion includes:
  • the cross-entropy loss value calculated by the preset supervised learning model is added to the Actor of the DDPG algorithm to evaluate the output value of the Actor.
  • the processing module 302 obtains the user portrait based on the user behavior data to be processed and the user attribute information to be processed according to the statistical rule model, including:
  • the weights of the first user portrait and the second user portrait are determined, so as to obtain the final user portrait by integrating the first user portrait and the second user portrait.
  • the deep reinforcement learning model adopts the Actor-Critic algorithm.
  • the method for recommending items based on user portraits and the device for recommending items based on user portraits in the present disclosure have a corresponding relationship in terms of specific implementation contents, so repeated content will not be described again.
  • FIG. 4 shows an exemplary system architecture 400 to which the user profile-based item recommendation method or the user profile-based item recommendation apparatus may be applied.
  • the system architecture 400 may include terminal devices 401 , 402 , and 403 , a network 404 and a server 405 .
  • the network 404 is a medium used to provide a communication link between the terminal devices 401 , 402 , 403 and the server 405 .
  • the network 404 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 401, 402, 403 to interact with the server 405 through the network 404 to receive or send messages and the like.
  • Various communication client applications may be installed on the terminal devices 401 , 402 and 403 , such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, social platform software, etc. (only examples).
  • the terminal devices 401 , 402 and 403 may be various electronic devices having an item recommendation screen based on user portraits and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
  • the server 405 may be a server that provides various services, for example, a background management server that provides support for shopping websites browsed by the terminal devices 401 , 402 , and 403 (just an example).
  • The background management server can analyze and process received data such as product information query requests, and feed back the processing results (such as target push information or product information; just an example) to the terminal device.
  • the method for recommending items based on user portraits is generally performed by the server 405 , and accordingly, the computing device is generally set in the server 405 .
  • terminal devices, networks and servers in FIG. 4 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 5 shows a schematic structural diagram of a computer system 500 suitable for implementing the terminal device of the embodiment of the present disclosure.
  • the terminal device shown in FIG. 5 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • A computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503.
  • In the RAM 503, various programs and data necessary for the operation of the computer system 500 are also stored.
  • the CPU 501 , the ROM 502 , and the RAM 503 are connected to each other through a bus 504 .
  • An input/output (I/O) interface 505 is also connected to bus 504 .
  • The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, etc.; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 508 including a hard disk, etc.; and a communication section 509 including a network interface card such as a LAN card, a modem, and the like. The communication section 509 performs communication processing via a network such as the Internet.
  • a drive 510 is also connected to the I/O interface 505 as needed.
  • a removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 510 as needed so that a computer program read therefrom is installed into the storage section 508 as needed.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods illustrated in the flowcharts.
  • the computer program may be downloaded and installed from the network via the communication portion 509 and/or installed from the removable medium 511 .
  • When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present disclosure are executed.
  • the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fibers, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the modules involved in the embodiments of the present disclosure may be implemented in software or hardware.
  • the described modules can also be provided in the processor, for example, it can be described as: a processor includes an acquisition module and a processing module. Among them, the names of these modules do not constitute a limitation on the module itself under certain circumstances.
  • the present disclosure also provides a computer-readable medium.
  • the computer-readable medium may be included in the device described in the above-mentioned embodiments, or it may exist alone without being assembled into the device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by a device, the device is caused to receive user behavior data and user attribute information and convert them, through feature engineering, into user behavior data to be processed and user attribute information to be processed.
  • according to the prediction model or the statistical rule model, a user portrait is obtained based on the user behavior data to be processed and the user attribute information to be processed, and item information is then pushed to the user according to the user portrait.
  • the problem of low efficiency of marketing activities caused by the existing user portraits with low accuracy can be solved.

Abstract

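The present disclosure provides an item recommendation method and apparatus based on user portraits, and relates to the field of computer technology. One specific implementation of the method includes: receiving user behavior data and user attribute information, and converting them through feature engineering into user behavior data to be processed and user attribute information to be processed; acquiring the current label calculation task and determining whether the label calculation task is a prediction-type task, and if so, invoking a preset prediction model, otherwise invoking a preset statistical rule model; and, according to the prediction model or the statistical rule model, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushing item information to the user according to the user portrait. Embodiments of the present disclosure can thereby solve the problem of inefficient marketing campaigns caused by existing low-accuracy user portraits.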

Description

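Item recommendation method and apparatus based on user portraits
Cross-Reference to Related Applications
This application claims priority to Chinese invention patent application No. 202011264500.8, filed on November 12, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to an item recommendation method and apparatus based on user portraits.
Background
User portraits are key to e-commerce marketing campaigns, personalized recommendation, basic data services, and the like. Only by obtaining accurate user portrait labels in real time can a business win the highest-quality, most precisely targeted user groups in the shortest time and at the lowest cost, and then run marketing and promotion activities that help acquire and retain customers. Here, a user portrait is the series of data (for example, shopping information and personal information) associated with a successfully registered user ID; it is a virtual data aggregate.
In the course of implementing the present disclosure, the inventor found that the prior art has at least the following problems:
For user portrait modeling, existing solutions typically use a big data platform to store data such as user purchases, and then classify user group data through manual analysis and modeling. Traditional user portraits require labeled data, but in practice the labeled data for most portrait labels is hard to obtain, very costly to obtain, or so inaccurate that it is regarded as noise. In other words, traditional user portraits can only handle user data from a single business scenario (one that has labeled data). As a result, user recommendation based on user portraits produced in the traditional way is inefficient and inaccurate, and the user experience is poor.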
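Summary
In view of this, embodiments of the present disclosure provide an item recommendation method and apparatus based on user portraits, which can solve the problem of inefficient marketing campaigns caused by existing low-accuracy user portraits.
To achieve the above objective, according to one aspect of the embodiments of the present disclosure, an item recommendation method based on user portraits is provided, including: receiving user behavior data and user attribute information, and converting them through feature engineering into user behavior data to be processed and user attribute information to be processed; acquiring the current label calculation task and determining whether the label calculation task is a prediction-type task, and if so, invoking a preset prediction model, otherwise invoking a preset statistical rule model; and, according to the prediction model or the statistical rule model, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushing item information to the user according to the user portrait.
Optionally, receiving user behavior data includes:
acquiring user behavior data under different categories according to a preset data quantity and data format.
Optionally, after receiving the user behavior data and the user attribute information, the method includes:
invoking a preprocessing model to preprocess the user behavior data and the user attribute information.
Optionally, obtaining a user portrait according to the prediction model, based on the user behavior data to be processed and the user attribute information to be processed, includes:
computing the user portrait, based on the user behavior data to be processed and the user attribute information to be processed, with a DDPG algorithm model with Actor-network gradient fusion.
Optionally, the DDPG algorithm model with Actor-network gradient fusion includes:
adding the cross-entropy loss value computed by a preset supervised learning model to the Actor of the DDPG algorithm, to evaluate the output of the Actor.
Optionally, obtaining a user portrait according to the statistical rule model, based on the user behavior data to be processed and the user attribute information to be processed, includes:
acquiring a preset deep reinforcement learning model and a business rule model, and obtaining a corresponding first user portrait and second user portrait, respectively, from the user behavior data to be processed and the user attribute information to be processed;
determining weights of the first user portrait and the second user portrait according to a target operational-effect indicator, so as to fuse the first user portrait and the second user portrait into a final user portrait.
Optionally:
the deep reinforcement learning model uses the Actor-Critic algorithm.
In addition, the present disclosure further provides an item recommendation apparatus based on user portraits, including an acquisition module configured to receive user behavior data and user attribute information and convert them through feature engineering into user behavior data to be processed and user attribute information to be processed;
and a processing module configured to acquire the current label calculation task and determine whether the label calculation task is a prediction-type task, and if so, invoke a preset prediction model, otherwise invoke a preset statistical rule model; and, according to the prediction model or the statistical rule model, obtain a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then push item information to the user according to the user portrait.
One embodiment of the above invention has the following advantages or beneficial effects. The present disclosure applies different task processing approaches to user behavior data and user attribute information with different characteristics; that is, it supports user portrait processing in different scenarios, including labels with labeled data, unlabeled data, and label tasks computed statistically from business rules. Moreover, the present disclosure can use deep reinforcement learning, combined with real data from online operations, to train the various user portrait processing models in real time. Furthermore, the present disclosure fuses supervised learning with reinforcement learning, and thereby accounts for the fact that, over a user's life cycle, each campaign run on the basis of user portrait processing influences the user's behavior in the next campaign and the one after it; that is, the results of successive processing runs are correlated.
Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.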
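Brief Description of the Drawings
The drawings are provided for a better understanding of the present disclosure and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of the main flow of an item recommendation method based on user portraits according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the main flow of an item recommendation method based on user portraits according to another embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the main modules of an item recommendation apparatus based on user portraits according to an embodiment of the present disclosure;
FIG. 4 is an exemplary system architecture diagram to which embodiments of the present disclosure may be applied;
FIG. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of embodiments of the present disclosure.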
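Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the drawings. They include various details of the embodiments to aid understanding, and these should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted below.
FIG. 1 is a schematic diagram of the main flow of an item recommendation method based on user portraits according to a first embodiment of the present disclosure. As shown in FIG. 1, the method includes:
Step S101: receive user behavior data and user attribute information, and convert them through feature engineering into user behavior data to be processed and user attribute information to be processed.
In some embodiments, receiving user behavior data includes:
acquiring user behavior data under different categories according to a preset data quantity and data format. For example, to analyze user gender data, it is first necessary to analyze the feature data involved in gender prediction, such as the completeness (i.e., format), size, and quantity of the user's behavior data under different categories, as groundwork for subsequent analysis and mining.
In other embodiments, after the user behavior data and user attribute information are received, a preprocessing model may be invoked to preprocess them. For example, the user behavior data and user attribute information are first examined for missing values, noise, outliers, data types, and so on; then, guided by the analysis results, preprocessing operations are applied, such as data cleaning (including handling missing values, noise, and outliers), data encoding, and data transformation (including standardization, normalization, scaling, and so on).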
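The preprocessing operations named above can be sketched for a single numeric column as follows. This is a minimal illustration under stated assumptions, not the preprocessing model of the disclosure: the function names, the median fill, the clipping threshold `k`, and the z-score transform are all choices made here for the example.

```python
from statistics import mean, median, pstdev

def fill_missing(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=3.0):
    """Clip values lying more than k population standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    lo, hi = mu - k * sigma, mu + k * sigma
    return [min(max(v, lo), hi) for v in values]

def standardize(values):
    """Z-score standardization; a constant column maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def preprocess(values, k=3.0):
    """Missing-value handling, then outlier clipping, then standardization."""
    return standardize(clip_outliers(fill_missing(values), k))
```

In practice the order of these steps and the thresholds would be tuned per feature; the sketch only shows how the three operations compose.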
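It is worth noting that the feature engineering in step S101 is the process of turning records of user behavior data and user attribute information under different categories into the data required by the model, through operations such as transformation, crossing, mapping, and extraction.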
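As a rough illustration of the mapping, transformation, and crossing operations mentioned above, the following sketch builds three features from one behavior record. The record fields (`category`, `clicks`, `views`, `age_band`) and the feature names are hypothetical and do not come from the disclosure.

```python
def map_category(value, vocabulary):
    """Mapping: turn a categorical value into an integer index; unseen values map to -1."""
    return vocabulary.get(value, -1)

def cross_feature(user_value, item_value):
    """Crossing: combine a user-side value and an item-side value into one token."""
    return f"{user_value}_x_{item_value}"

def build_features(record, category_vocab):
    """Build model-ready features from one raw record (hypothetical field names)."""
    views = record["views"]
    return {
        "category_id": map_category(record["category"], category_vocab),
        # Transformation: raw counts become a ratio; zero views yields 0.0.
        "click_rate": record["clicks"] / views if views else 0.0,
        "age_x_category": cross_feature(record["age_band"], record["category"]),
    }
```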
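Step S102: acquire the current label calculation task and determine whether it is a prediction-type task; if so, invoke a preset prediction model; otherwise, invoke a preset statistical rule model.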
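The branching in step S102 amounts to a simple dispatch. The sketch below assumes the task type is already known and represented as an enum; `TaskKind`, `select_model`, and the two toy models are hypothetical names introduced only for illustration.

```python
from enum import Enum
from typing import Callable, Dict

class TaskKind(Enum):
    PREDICTION = "prediction"      # label must be inferred by a learned model
    STATISTICAL = "statistical"    # label can be computed by rules or statistics

Features = Dict[str, float]
Model = Callable[[Features], str]

def select_model(kind: TaskKind, prediction_model: Model, rule_model: Model) -> Model:
    """Return the prediction model for prediction-type tasks, else the rule model."""
    return prediction_model if kind is TaskKind.PREDICTION else rule_model

# Toy stand-ins for the two preset models (assumptions, not the disclosed models).
def toy_prediction_model(features: Features) -> str:
    return "frequent_buyer" if features.get("orders_30d", 0) >= 3 else "occasional_buyer"

def toy_rule_model(features: Features) -> str:
    return "high_spender" if features.get("total_spend", 0) >= 1000 else "regular"
```

For example, `select_model(TaskKind.STATISTICAL, toy_prediction_model, toy_rule_model)` returns the rule model, which can then be applied to the processed features.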
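In some embodiments, the user portrait is computed, based on the user behavior data to be processed and the user attribute information to be processed, with a DDPG algorithm model with Actor-network gradient fusion. Further, the cross-entropy loss value computed by a preset supervised learning model is added to the Actor of the DDPG algorithm to evaluate the Actor's output. DDPG stands for Deep Deterministic Policy Gradient; it is a policy learning method that incorporates networks into DPG and adopts the Actor-Critic framework.
Step S103: according to the prediction model or the statistical rule model, obtain a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then push item information to the user according to the user portrait.
In some embodiments, if the preset statistical rule model is invoked, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed includes:
acquiring a preset deep reinforcement learning model and a business rule model, and obtaining a corresponding first user portrait and second user portrait, respectively, from the user behavior data to be processed and the user attribute information to be processed. Then, weights of the first user portrait and the second user portrait are determined according to a target operational-effect indicator, so as to fuse the first user portrait and the second user portrait into a final user portrait.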
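The weighted fusion of the two portraits can be sketched as below. The disclosure does not say how the weights are derived from the operational-effect indicator, so the proportional normalization in `metric_weights` is an assumption, as is the representation of a portrait as a mapping from label to score.

```python
def metric_weights(metric_first, metric_second):
    """Assumed rule: weight each portrait in proportion to its observed metric."""
    total = metric_first + metric_second
    if total <= 0:
        return 0.5, 0.5  # no signal yet: fall back to equal weights
    return metric_first / total, metric_second / total

def fuse_portraits(first, second, w_first, w_second):
    """Weighted sum of label scores; a label missing from one portrait counts as 0.0."""
    labels = set(first) | set(second)
    return {
        label: w_first * first.get(label, 0.0) + w_second * second.get(label, 0.0)
        for label in labels
    }
```

Here `first` would come from the deep reinforcement learning model and `second` from the business rule model, with the metrics being, for instance, the conversion rates each achieved online.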
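In a further embodiment, the deep reinforcement learning model uses the Actor-Critic algorithm. Deep reinforcement learning has three basic elements: state, action, and reward. In the present disclosure they correspond to the following:
state is the user behavior data to be processed and the user attribute information to be processed produced by feature engineering, for example sales volume, sales revenue, user behavior, item attributes, user-item cross features, and so on.
action may be the label set in machine learning, the set of cluster categories in a clustering task, the computation result of a business rule, and so on.
reward is obtained by converting the goal to be achieved (for example, raising the click-through rate or raising GMV) into a concrete reward function R, which guides the Agent toward the goal during learning. In the present disclosure, the users' pv (page views), sales volume, sales revenue, and so on during operational campaigns serve as the reward.
It is worth noting that the present disclosure can treat the portrait model as the agent (Agent) and the user as the environment (Environment), so that problems such as marketing with the portrait model can be viewed as sequential decision problems. In each marketing campaign, the Agent (portrait model) makes a prediction and delivers the result to the user. The user responds to the Agent's prediction with feedback signals such as clicks and browsing. The Agent receives the feedback signals and, in the next marketing campaign, adopts a new prediction policy and recommends items to the user. Through this repeated trial and error the policy is continuously optimized, and the Agent gradually learns the optimal policy.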
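The predict, feedback, update cycle described above can be illustrated with a deliberately simplified loop. This is not the Actor-Critic model of the disclosure: the agent below is a tabular stand-in that tries each action once and then picks the one with the highest average reward, and the simulated user returns a fixed reward of 1.0 only when the prediction matches a hidden true label. All names are hypothetical.

```python
class PortraitAgent:
    """Tabular stand-in for the portrait model (assumption, not Actor-Critic)."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}

    def predict(self):
        # Try every action once, then exploit the best running average.
        for action in self.actions:
            if self.counts[action] == 0:
                return action
        return max(self.actions, key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental update of the running mean reward for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def simulated_user_feedback(action, true_label):
    """Toy environment: reward 1.0 if the prediction matches the user, else 0.0."""
    return 1.0 if action == true_label else 0.0

def run_campaigns(agent, true_label, rounds):
    """Repeat predict -> feedback -> learn and record each (action, reward) pair."""
    history = []
    for _ in range(rounds):
        action = agent.predict()
        reward = simulated_user_feedback(action, true_label)
        agent.learn(action, reward)
        history.append((action, reward))
    return history
```

Real feedback such as clicks or orders is noisy and delayed, which is one reason a learned policy with exploration is used instead of this deterministic toy.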
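In summary, the present disclosure can use deep reinforcement learning, combined with real data from online operations, to train the label calculation models of the user portrait in real time. Unlike traditional training of user portrait label models, it fuses supervised learning with reinforcement learning and uses a different fusion approach for each computation scenario to improve on the traditional way labels are computed. Supported scenarios include labels with labeled data, unlabeled data (clustering label tasks), and label tasks computed statistically from business rules. That is, the present disclosure accounts for the fact that, over a user's life cycle, each campaign run on the basis of the portrait model influences the user's behavior in the next campaign and the one after it, so the portrait model's successive predictions are correlated. (For example, suppose the current campaign selects male users to promote razors. If the model's result is accurate, the user is likely to respond positively to the campaign; that response is the user's feedback data and should be added to the model for training promptly, so that in the next push campaign the model is more likely to predict the user as male. Conversely, if the user is female, the pushed razor is more likely to be ignored, and this behavior should likewise be learned promptly to update the model.)
FIG. 2 is a schematic diagram of the main flow of an item recommendation method based on user portraits according to another embodiment of the present disclosure (in FIG. 2, DRL denotes deep reinforcement learning and ML denotes machine learning). The method may include:
The received user behavior data and user attribute information pass in turn through data analysis, data preprocessing, and feature engineering to yield the user behavior data to be processed and the user attribute information to be processed. Here, data analysis means acquiring user behavior data under different categories according to a preset data quantity and data format. Data preprocessing means invoking a preprocessing model to preprocess the user behavior data and user attribute information. Feature engineering is the process of turning records of user behavior data and user attribute information under different categories into the data required by the model, through operations such as transformation, crossing, mapping, and extraction.
After feature engineering is complete, it is determined whether the current label calculation task is a prediction-type task; if so, the prediction model is invoked (the left task flow in FIG. 2); otherwise, the statistical rule model is invoked (the right task flow in FIG. 2). For example, if the current label calculation task can be solved with simple statistics, such as an SQL computation, the statistical rule model is invoked. If the current label calculation task is a prediction-type task, such as the many gender labels that can only be computed with a machine learning algorithm and cannot be derived by rule computation, the prediction model is invoked.
The left task flow builds on the DDPG algorithm by adding a machine learning model for gradient fusion in the Actor network, so that the reinforcement learning model (for example, the DDPG algorithm) can learn gradients from the machine learning model. Preferably, the machine learning model Model in the left task flow of FIG. 2 may be a supervised learning model, for example svm (support vector machine) or xgb (an industrial implementation of GBDT). Specifically:
First, two networks are constructed for each of the Actor and the Critic in the DDPG algorithm. The Actor networks are Actor_eval_net and Actor_target_net. The Critic networks are Critic_eval_net and Critic_target_net. Actor_eval_net takes state as input (in the present disclosure, the feature vector: the user behavior data to be processed and the user attribute information to be processed) and outputs the action. Actor_target_net takes next_state as input (applying the action to the environment yields next_state and the corresponding reward) and outputs next_action. Critic_eval_net takes action as input and outputs the Q value of that action (the Q value is the value of the function Q(state, action) (Quality), which represents the discounted future reward when the agent takes a given action in a given state and acts optimally thereafter). Critic_target_net takes next_action and next_state as input and outputs the corresponding Q value.
For example, taking computation of the user gender label: the state input to the Actor_eval_net prediction network is the features corresponding to user gender, i.e. the features computed by feature engineering, including the user's behavior data in different categories (such as browsing, ordering, adding to cart, clicking, and commenting), sales volume, sales revenue, item attributes, user-item cross features, and so on. The output action is the predicted user gender label: male (0), female (1), or unknown (-1). After the male or female action is selected, feedback from the targeted users is obtained in the environment (in a marketing campaign, this means business staff select the required gender according to the marketing scenario and run placements, push messages, and so on). For example, if a razor is pushed to a man, that male user may click, order, add to cart, comment, and so on. These behaviors under different categories serve as the next input state (next_state), which is fed to Actor_target_net to output the next action (next_action). Likewise, Critic_eval_net takes as input the action obtained from the current state (selecting male or female) and outputs the Q value of that action, while Critic_target_net takes the next action and the next state as input and outputs the corresponding Q value.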
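How the target networks typically feed the Critic's training signal can be sketched as follows. The text above does not spell out this step, so the temporal-difference target `reward + gamma * Q_target(next_state, next_action)` is taken from the standard DDPG algorithm and is an assumption here, as are the discount `gamma`, the squared-error loss, and the toy stand-in networks. Note also that the sketch follows standard DDPG in passing both state and action to the Critic.

```python
def td_target(reward, next_state, actor_target, critic_target, gamma=0.9):
    """Standard DDPG target: r + gamma * Q_target(s', mu_target(s'))."""
    next_action = actor_target(next_state)
    return reward + gamma * critic_target(next_state, next_action)

def critic_loss(state, action, target, critic_eval):
    """Squared error between the Critic's current estimate and the TD target."""
    return (critic_eval(state, action) - target) ** 2

# Toy stand-ins for Actor_target_net, Critic_target_net, and Critic_eval_net.
def toy_actor_target(state):
    return sum(state)

def toy_critic_target(state, action):
    return 2.0 * action

def toy_critic_eval(state, action):
    return 1.5 * action
```

With real networks, `critic_loss` would be averaged over a sampled minibatch and minimized by gradient descent, while the target networks are updated slowly toward the eval networks.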
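Preferably, the present disclosure uses a supervised learning model (for example, the supervised learning model Model may be svm (support vector machine) or XGBoost (short for Extreme Gradient Boosting, an efficient implementation of GBDT)), feeds it the corresponding features (user behavior data and user attribute information) and labeled data, trains the model, and uses it to predict the unlabeled data, thereby filling in the missing labels. At this point all data to be input is labeled. A further network layer is then constructed to compute the cross-entropy loss of the Actor's output against the supervised learning model:
Figure PCTCN2021128877-appb-000001
where a is the action output by the network, x is the user behavior data and user attribute information, y is the labeled data, and σ is the network parameters to be trained.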
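The formula itself is an image that is not reproduced in this text. For orientation only, the usual definition of cross-entropy for a single labeled example is the negative log of the probability the network assigns to the labeled class; the sketch below implements that standard definition and is not a transcription of the filing's formula. The clamp `eps` is an implementation detail added here to avoid `log(0)`.

```python
import math

def cross_entropy(probs, label_index, eps=1e-12):
    """Negative log-probability assigned to the labeled class."""
    return -math.log(max(probs[label_index], eps))

def mean_cross_entropy(batch_probs, labels):
    """Average cross-entropy over a batch of (probability vector, label) pairs."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)
```

Here `probs` plays the role of the Actor's output a for input x, and `label_index` the role of the label y.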
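The Actor's final gradient loss function thus becomes:
Figure PCTCN2021128877-appb-000002
where s denotes the agent's state at a given time; a denotes the action executed at a given time; Q(s, a) denotes the discounted future reward when the agent takes a given action in a given state and acts optimally thereafter; λ is a hyperparameter to be tuned, the weight of the supervised learning model; θ is the Actor's network parameters; μ is the Actor's action value for the state at a given time; and N is the number of model training iterations.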
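Because both formula images are missing from this text, the following is only a plausible reconstruction from the symbol definitions above and the standard DDPG deterministic policy gradient. It is an assumption, not the formula in the filing. In particular, the sign convention (ascending J while descending the cross-entropy), the index i, and the treatment of the supervised term as a function of the Actor parameters θ (the text introduces σ for the added layer without relating it to θ) are choices made here.

```latex
% Assumed supervised term: cross-entropy of the Actor output a(x; \sigma)
% against the label y, summed over classes c.
L_{\mathrm{CE}} = -\sum_{c} y_{c}\,\log a_{c}(x;\sigma)

% Assumed combined Actor update: the standard DDPG policy gradient,
% averaged over N terms, with the supervised term weighted by \lambda.
\nabla_{\theta} J \approx \frac{1}{N}\sum_{i=1}^{N}
  \Big[\, \nabla_{a} Q(s,a)\big|_{s=s_i,\;a=\mu(s_i)}\;
          \nabla_{\theta}\,\mu(s\mid\theta)\big|_{s=s_i}
          \;-\;\lambda\,\nabla_{\theta} L_{\mathrm{CE}} \,\Big]
```

Setting λ = 0 recovers the plain DDPG Actor update, which is consistent with the description of λ as the weight of the supervised learning model.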
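Thus, the present disclosure adds the cross-entropy loss computed by the supervised learning model to the Actor to evaluate the Actor's output, which improves the stability of the reinforcement learning model, makes full use of the labeled data, and improves the performance of the overall prediction model.
The right task flow uses user feedback data for deep reinforcement learning model computation (the training flow is the same as the left task flow, minus the Model part) together with business rule model computation (for example, to compute whether a user is married when no labeled data is available as a training set, the business rule model is used, e.g. checking whether the names of historically purchased items include married-indicative terms such as "pregnancy" or "child"). Finally, the weights of the two results are determined according to online operational-effect indicators (such as conversion rate and dwell time), and the results are fused.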
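The keyword-based business rule in the example above can be sketched as follows. The keyword tuple holds English renderings of the two terms the text mentions; the function names, the case-insensitive substring match, and the score definition are assumptions made for this illustration.

```python
MARRIED_KEYWORDS = ("pregnancy", "child")  # illustrative terms from the example above

def married_rule_score(purchased_item_names, keywords=MARRIED_KEYWORDS):
    """Fraction of purchased items whose name contains at least one keyword."""
    if not purchased_item_names:
        return 0.0
    hits = sum(
        any(keyword in name.lower() for keyword in keywords)
        for name in purchased_item_names
    )
    return hits / len(purchased_item_names)

def married_rule_label(purchased_item_names, keywords=MARRIED_KEYWORDS):
    """Rule: label the user as married if any purchased item name matches."""
    return married_rule_score(purchased_item_names, keywords) > 0.0
```

A score like this could serve as the second user portrait that is later fused with the reinforcement learning result.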
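FIG. 3 is a schematic diagram of the main modules of an item recommendation apparatus based on user portraits according to an embodiment of the present disclosure. As shown in FIG. 3, the item recommendation apparatus 300 based on user portraits includes an acquisition module 301 and a processing module 302. The acquisition module 301 receives user behavior data and user attribute information and converts them through feature engineering into user behavior data to be processed and user attribute information to be processed. The processing module 302 acquires the current label calculation task and determines whether it is a prediction-type task; if so, it invokes a preset prediction model; otherwise, it invokes a preset statistical rule model. According to the prediction model or the statistical rule model, it obtains a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushes item information to the user according to the user portrait.
In some embodiments, the acquisition module 301 receiving user behavior data includes:
acquiring user behavior data under different categories according to a preset data quantity and data format.
In some embodiments, after the acquisition module 301 receives the user behavior data and the user attribute information, the following is included:
invoking a preprocessing model to preprocess the user behavior data and the user attribute information.
In some embodiments, the processing module 302 obtaining a user portrait according to the prediction model, based on the user behavior data to be processed and the user attribute information to be processed, includes:
computing the user portrait, based on the user behavior data to be processed and the user attribute information to be processed, with a DDPG algorithm model with Actor-network gradient fusion.
In some embodiments, the DDPG algorithm model with Actor-network gradient fusion includes:
adding the cross-entropy loss value computed by a preset supervised learning model to the Actor of the DDPG algorithm, to evaluate the output of the Actor.
In some embodiments, the processing module 302 obtaining a user portrait according to the statistical rule model, based on the user behavior data to be processed and the user attribute information to be processed, includes:
acquiring a preset deep reinforcement learning model and a business rule model, and obtaining a corresponding first user portrait and second user portrait, respectively, from the user behavior data to be processed and the user attribute information to be processed;
determining weights of the first user portrait and the second user portrait according to a target operational-effect indicator, so as to fuse the first user portrait and the second user portrait into a final user portrait.
In some embodiments, the deep reinforcement learning model uses the Actor-Critic algorithm.
It should be noted that the item recommendation method based on user portraits and the item recommendation apparatus based on user portraits of the present disclosure correspond in their specific implementation, so repeated content is not described again.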
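FIG. 4 shows an exemplary system architecture 400 to which the item recommendation method based on user portraits or the item recommendation apparatus based on user portraits of embodiments of the present disclosure may be applied.
As shown in FIG. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 is the medium that provides communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
Users may use the terminal devices 401, 402, 403 to interact with the server 405 via the network 404 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 401, 402, 403, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social platform software (examples only).
The terminal devices 401, 402, 403 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 405 may be a server that provides various services, for example a back-end management server that supports shopping websites browsed by users with the terminal devices 401, 402, 403 (example only). The back-end management server may analyze and otherwise process received data such as product information query requests, and feed the processing results (for example, target push information or product information; examples only) back to the terminal devices.
It should be noted that the item recommendation method based on user portraits provided by embodiments of the present disclosure is generally executed by the server 405; accordingly, the computing apparatus is generally provided in the server 405.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 4 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Referring now to FIG. 5, it shows a schematic structural diagram of a computer system 500 suitable for implementing a terminal device of embodiments of the present disclosure. The terminal device shown in FIG. 5 is merely an example and should not impose any limitation on the functions and scope of use of embodiments of the present disclosure.
As shown in FIG. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage portion 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the computer system 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), or the like, and a speaker or the like; a storage portion 508 including a hard disk or the like; and a communication portion 509 including a network interface card such as a LAN card or a modem. The communication portion 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage portion 508 as needed.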
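In particular, according to the disclosed embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods illustrated in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via the communication portion 509 and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present disclosure are executed.
It should be noted that the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, and the like, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules involved in the embodiments of the present disclosure may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, it may be described as: a processor includes an acquisition module and a processing module. The names of these modules do not, in certain circumstances, constitute a limitation on the modules themselves.
As another aspect, the present disclosure further provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist alone without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: receive user behavior data and user attribute information, and convert them through feature engineering into user behavior data to be processed and user attribute information to be processed; acquire the current label calculation task and determine whether the label calculation task is a prediction-type task, and if so, invoke a preset prediction model, otherwise invoke a preset statistical rule model; and, according to the prediction model or the statistical rule model, obtain a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then push item information to the user according to the user portrait.
The technical solutions of embodiments of the present disclosure can solve the problem of inefficient marketing campaigns caused by existing low-accuracy user portraits.
The above specific embodiments do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions may occur. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.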

Claims (10)

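  1. An item recommendation method based on user portraits, comprising:
    receiving user behavior data and user attribute information, and converting them through feature engineering into user behavior data to be processed and user attribute information to be processed;
    acquiring the current label calculation task and determining whether the label calculation task is a prediction-type task, and if so, invoking a preset prediction model, otherwise invoking a preset statistical rule model;
    according to the prediction model or the statistical rule model, obtaining a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then pushing item information to the user according to the user portrait.
  2. The method according to claim 1, wherein receiving user behavior data comprises:
    acquiring user behavior data under different categories according to a preset data quantity and data format.
  3. The method according to claim 1, wherein, after receiving the user behavior data and the user attribute information, the method comprises:
    invoking a preprocessing model to preprocess the user behavior data and the user attribute information.
  4. The method according to claim 1, wherein obtaining a user portrait according to the prediction model, based on the user behavior data to be processed and the user attribute information to be processed, comprises:
    computing the user portrait, based on the user behavior data to be processed and the user attribute information to be processed, with a DDPG algorithm model with Actor-network gradient fusion.
  5. The method according to claim 4, wherein the DDPG algorithm model with Actor-network gradient fusion comprises:
    adding the cross-entropy loss value computed by a preset supervised learning model to the Actor of the DDPG algorithm, to evaluate the output of the Actor.
  6. The method according to claim 1, wherein obtaining a user portrait according to the statistical rule model, based on the user behavior data to be processed and the user attribute information to be processed, comprises:
    acquiring a preset deep reinforcement learning model and a business rule model, and obtaining a corresponding first user portrait and second user portrait, respectively, from the user behavior data to be processed and the user attribute information to be processed;
    determining weights of the first user portrait and the second user portrait according to a target operational-effect indicator, so as to fuse the first user portrait and the second user portrait into a final user portrait.
  7. The method according to claim 6, wherein:
    the deep reinforcement learning model uses the Actor-Critic algorithm.
  8. An item recommendation apparatus based on user portraits, comprising:
    an acquisition module configured to receive user behavior data and user attribute information and convert them through feature engineering into user behavior data to be processed and user attribute information to be processed;
    a processing module configured to acquire the current label calculation task and determine whether the label calculation task is a prediction-type task, and if so, invoke a preset prediction model, otherwise invoke a preset statistical rule model; and, according to the prediction model or the statistical rule model, obtain a user portrait based on the user behavior data to be processed and the user attribute information to be processed, and then push item information to the user according to the user portrait.
  9. An electronic device, comprising:
    one or more processors;
    a storage apparatus configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
  10. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.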
PCT/CN2021/128877 2020-11-12 2021-11-05 Item recommendation method and apparatus based on user portraits WO2022100518A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21891044.6A EP4242955A1 (en) 2020-11-12 2021-11-05 User profile-based object recommendation method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011264500.8A CN113763093A (zh) 2020-11-12 2020-11-12 Item recommendation method and apparatus based on user portraits
CN202011264500.8 2020-11-12

Publications (1)

Publication Number Publication Date
WO2022100518A1 true WO2022100518A1 (zh) 2022-05-19

Family

ID=78785997

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/128877 WO2022100518A1 (zh) 2020-11-12 2021-11-05 Item recommendation method and apparatus based on user portraits

Country Status (3)

Country Link
EP (1) EP4242955A1 (zh)
CN (1) CN113763093A (zh)
WO (1) WO2022100518A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
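CN112465565A (zh) * 2020-12-11 2021-03-09 加和(北京)信息科技有限公司 Method and apparatus for user portrait prediction based on machine learning
CN114780860A (zh) * 2022-05-30 2022-07-22 国网浙江省电力有限公司杭州供电公司 Autonomous decision-making method based on multi-dimensional big data fusion and aggregation
CN115484266A (zh) * 2022-11-14 2022-12-16 深圳市乙辰科技股份有限公司 Load-balancing-based data distribution processing method, system, and cloud platform
CN115983902A (zh) * 2023-01-10 2023-04-18 苏州盈天地资讯科技有限公司 Information push method and system based on real-time user events
CN116484109A (zh) * 2023-06-21 2023-07-25 九一金融信息服务(北京)有限公司 Artificial-intelligence-based customer portrait analysis system and method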

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005014B (zh) * 2021-12-23 2022-06-17 杭州华鲤智能科技有限公司 Model training and social interaction strategy optimization method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090319456A1 (en) * 2008-06-19 2009-12-24 Microsoft Corporation Machine-based learning for automatically categorizing data on per-user basis
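CN107423442A (zh) * 2017-08-07 2017-12-01 火烈鸟网络(广州)股份有限公司 Application recommendation method and system based on user portrait behavior analysis, storage medium, and computer device
CN108960975A (zh) * 2018-06-15 2018-12-07 广州麦优网络科技有限公司 Personalized precision marketing method based on user portraits, server, and storage medium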

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082210A1 (en) * 2016-09-18 2018-03-22 Newvoicemedia, Ltd. System and method for optimizing communications using reinforcement learning
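CN109872173A (zh) * 2017-12-04 2019-06-11 北京京东尚科信息技术有限公司 Method, system, and terminal device for constructing user portrait labels
CN110929136A (zh) * 2018-08-30 2020-03-27 北京京东尚科信息技术有限公司 Personalized recommendation method and apparatus
CN109783730A (zh) * 2019-01-03 2019-05-21 深圳壹账通智能科技有限公司 Product recommendation method and apparatus, computer device, and storage medium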
US20200279280A1 (en) * 2019-03-01 2020-09-03 Liquid Vine Inc. Algorithmic generation, qualification, and ranking of potential sales leads for human consumable nondurable goods

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090319456A1 (en) * 2008-06-19 2009-12-24 Microsoft Corporation Machine-based learning for automatically categorizing data on per-user basis
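CN107423442A (zh) * 2017-08-07 2017-12-01 火烈鸟网络(广州)股份有限公司 Application recommendation method and system based on user portrait behavior analysis, storage medium, and computer device
CN108960975A (zh) * 2018-06-15 2018-12-07 广州麦优网络科技有限公司 Personalized precision marketing method based on user portraits, server, and storage medium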

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
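CN112465565A (zh) * 2020-12-11 2021-03-09 加和(北京)信息科技有限公司 Method and apparatus for user portrait prediction based on machine learning
CN112465565B (zh) * 2020-12-11 2023-09-26 加和(北京)信息科技有限公司 Method and apparatus for user portrait prediction based on machine learning
CN114780860A (zh) * 2022-05-30 2022-07-22 国网浙江省电力有限公司杭州供电公司 Autonomous decision-making method based on multi-dimensional big data fusion and aggregation
CN114780860B (zh) * 2022-05-30 2023-09-05 国网浙江省电力有限公司杭州供电公司 Autonomous decision-making method based on multi-dimensional big data fusion and aggregation
CN115484266A (zh) * 2022-11-14 2022-12-16 深圳市乙辰科技股份有限公司 Load-balancing-based data distribution processing method, system, and cloud platform
CN115983902A (zh) * 2023-01-10 2023-04-18 苏州盈天地资讯科技有限公司 Information push method and system based on real-time user events
CN115983902B (zh) * 2023-01-10 2023-10-20 苏州盈天地资讯科技有限公司 Information push method and system based on real-time user events
CN116484109A (zh) * 2023-06-21 2023-07-25 九一金融信息服务(北京)有限公司 Artificial-intelligence-based customer portrait analysis system and method
CN116484109B (zh) * 2023-06-21 2023-09-01 九一金融信息服务(北京)有限公司 Artificial-intelligence-based customer portrait analysis system and method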

Also Published As

Publication number Publication date
CN113763093A (zh) 2021-12-07
EP4242955A1 (en) 2023-09-13

Similar Documents

Publication Publication Date Title
WO2022100518A1 (zh) Item recommendation method and apparatus based on user portraits
CN109840730B (zh) Method and apparatus for data prediction
US20190080352A1 (en) Segment Extension Based on Lookalike Selection
US10552863B1 (en) Machine learning approach for causal effect estimation
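CN109034853B (zh) Method, apparatus, medium, and electronic device for finding similar users based on seed users
CN108932625B (zh) Method, apparatus, medium, and electronic device for analyzing user behavior data
CN111160847B (zh) Method and apparatus for processing workflow information
CN112528110A (zh) Method and apparatus for determining business attributes of an entity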
US20220277741A1 (en) Methods and apparatus for intent recognition
US11741111B2 (en) Machine learning systems architectures for ranking
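CN111966886A (zh) Object recommendation method, object recommendation apparatus, electronic device, and storage medium
CN110866625A (zh) Method and apparatus for generating promotion indicator information
WO2022156589A1 (zh) Method and apparatus for determining live-stream click-through rate
CN112749323A (zh) Method and apparatus for constructing user portraits
CN113610610A (zh) Session recommendation method and system based on graph neural network and review similarity
CN113592593A (zh) Training and application method, apparatus, device, and storage medium for a sequence recommendation model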
US20230245210A1 (en) Knowledge graph-based information recommendation
CN113822734A (zh) Method and apparatus for generating information
Tayyab et al. A machine learning based model for software cost estimation
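CN113360816A (zh) Method and apparatus for click-through rate prediction
CN115510318A (zh) Training method for a user representation model, user representation method, and apparatus
CN113159877A (zh) Data processing method, apparatus, system, and computer-readable storage medium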
Zhang System of Cross-Border E-commerce Network Pattern Evolution on Account of Bayes-BP Algorithm
US20230127453A1 (en) Causal multi-touch attribution
US11756065B2 (en) Methods and apparatus for predicting a user churn event

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21891044

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021891044

Country of ref document: EP

Effective date: 20230607

NENP Non-entry into the national phase

Ref country code: DE