CN116738034B - Information pushing method and system - Google Patents


Info

Publication number
CN116738034B
Authority
CN
China
Prior art keywords
user
vector
vectors
self
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211236646.0A
Other languages
Chinese (zh)
Other versions
CN116738034A (en
Inventor
李虎
冯晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202211236646.0A priority Critical patent/CN116738034B/en
Publication of CN116738034A publication Critical patent/CN116738034A/en
Application granted granted Critical
Publication of CN116738034B publication Critical patent/CN116738034B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The application provides an information pushing method and system, relating to the technical field of machine learning, which can push information of interest to a user even when the user is new. After a push request initiated by a first account in a first platform is received, multiple groups of user features of the first account in a plurality of preset platforms are obtained, where the plurality of preset platforms include the first platform. A plurality of initialization user vectors corresponding to the multiple groups of user features are generated. The plurality of initialization user vectors are fused to obtain a fused user vector of the first account. Personalized material is determined according to the fused user vector and pushed to the first account.

Description

Information pushing method and system
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to an information pushing method and system.
Background
In the present information age, in order to facilitate the user to efficiently acquire the required information, an Application (APP) may push information of interest to the user based on attribute (such as gender, age, etc.) information and behavior (such as clicking, purchasing, sharing, etc.) data of the user. For example, in shopping APP, items of interest to the user may be pushed. As another example, in a video APP, a video of interest to a user may be pushed.
However, for a new user of an APP, existing pushing schemes cannot accurately push information of interest, because the attribute features and behavior features of the user are lacking.
Disclosure of Invention
In view of this, the present application provides an information pushing method and system that can push information of interest even to new users.
In order to achieve the above object, the embodiment of the present application provides the following technical solutions:
In a first aspect, an embodiment of the present application provides an information pushing method, which may be used in scenarios where personalized material is pushed to a user. The personalized material may be audio, video, pictures, articles, posts, or the like. The method comprises the following steps: after a push request initiated by a first account in a first platform is received, multiple groups of user features of the first account in a plurality of preset platforms are obtained, where the plurality of preset platforms include the first platform. A plurality of initialization user vectors corresponding to the multiple groups of user features are generated. The plurality of initialization user vectors are fused to obtain a fused user vector of the first account. A target material is determined according to the fused user vector and pushed to the first account.
In summary, with the embodiment of the application, the user features of the first account in the plurality of preset platforms can be fused and used together for personalized material pushing. In this way, even for users (e.g., new users) who do not have sufficient user features in the current first platform, sufficient user features are available for matching personalized materials. In addition, because the fused user vector is obtained by real-time fusion after the push request is received, changes in user preference can be detected in time, which makes it more likely that the pushed personalized material matches the user's current preference.
In one possible design manner, generating the plurality of initialization user vectors corresponding to the multiple groups of user features includes: if the number of user features of the first account in a second platform is less than a first threshold, determining that the initialization user vector corresponding to the user features in the second platform is a random vector. Alternatively, if the number of user features of the first account in the second platform is less than the first threshold, determining that the initialization user vector corresponding to the user features in the second platform is the mean vector of a plurality of second initialization user vectors corresponding to a plurality of second accounts in the second platform. The second platform is any one of the plurality of preset platforms, and the plurality of second accounts are accounts whose number of user features in the second platform is greater than a second threshold.
That is, when the first account does not have enough user features in one of the plurality of preset platforms, a mean vector or a random vector may be used as the initialization user vector of the first account in that platform.
In one possible design manner, fusing the plurality of initialization user vectors to obtain the fused user vector of the first account includes: inputting the plurality of initialization user vectors into a first fusion model, running the first fusion model, and outputting the fused user vector of the first account. The first fusion model has the function of fusing a plurality of user vectors into one user vector.
That is, a machine learning model may be employed to achieve fusion, improving the intelligence of the fusion process.
In one possible design manner, the first fusion model includes a plurality of first self-attention sub-modules and a first attention sub-module, where the plurality of first self-attention sub-modules are in one-to-one correspondence with the plurality of preset platforms. Inputting the plurality of initialization user vectors into the first fusion model, running the first fusion model, and outputting the fused user vector of the first account includes: inputting the plurality of initialization user vectors into the corresponding first self-attention sub-modules, running the plurality of first self-attention sub-modules and the first attention sub-module, and outputting the fused user vector of the first account. Each first self-attention sub-module is used for correcting the value of each user feature item in the corresponding initialization user vector according to a first weight corresponding to that user feature item, and the first attention sub-module is used for fusing the plurality of vectors corrected by the plurality of first self-attention sub-modules according to a second weight corresponding to each preset platform.
That is, the user vector can be corrected according to the importance degree of each user feature item, so that the feature value of each user feature item in the corrected user vector matches the importance degree of that item, which improves the rationality of the user vector. The corrected user vectors are then fused according to the importance degrees of the preset platforms, so that the fused user vector covers the user features of the plurality of preset platforms, which enhances the user vector. The fused user vector also reflects the importance degrees of the user features of the plurality of preset platforms, which further improves the rationality of the user vector.
In one possible design, the user characteristic items include gender, age, membership grade, and/or click sequence.
In one possible design manner, determining the target material according to the fused user vector includes: calculating the similarity between the fused user vector and each of a plurality of fused material vectors corresponding to a plurality of candidate materials, and determining, as the target materials, a preset number of candidate materials whose fused material vectors have the highest similarity.
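The selection step can be sketched as follows. This is only a minimal illustration: the text does not fix a similarity measure, so cosine similarity is assumed here, and the function names and sample vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_materials(fused_user_vector, fused_material_vectors, k):
    """Return the IDs of the k candidates most similar to the user vector.

    fused_material_vectors maps a candidate material ID to its stored
    fused material vector.
    """
    ranked = sorted(
        fused_material_vectors.items(),
        key=lambda item: cosine_similarity(fused_user_vector, item[1]),
        reverse=True,
    )
    return [material_id for material_id, _ in ranked[:k]]

# Hypothetical data for illustration only.
user = [1.0, 0.0, 1.0]
materials = {
    "video_1": [1.0, 0.0, 1.0],  # same direction as the user vector
    "video_2": [0.0, 1.0, 0.0],  # orthogonal to the user vector
    "video_3": [1.0, 0.0, 0.0],  # partially aligned
}
targets = top_k_materials(user, materials, k=2)
```

Here `k` plays the role of the preset number of target materials.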
In one possible design, before calculating the similarity between the fused user vector and the plurality of fused material vectors corresponding to the plurality of candidate materials, the method further includes: acquiring material features of the plurality of candidate materials in the plurality of preset platforms; for a first candidate material, generating a plurality of initialization material vectors corresponding to the first candidate material according to multiple groups of material features of the first candidate material in the plurality of preset platforms, and fusing the plurality of initialization material vectors to obtain a fused material vector of the first candidate material, where the first candidate material is any one of the plurality of candidate materials; and storing the plurality of fused material vectors corresponding to the plurality of candidate materials.
In one possible design manner, fusing the plurality of initialization material vectors to obtain the fused material vector of the first candidate material includes: inputting the plurality of initialization material vectors into a second fusion model, running the second fusion model, and outputting the fused material vector of the first candidate material. The second fusion model has the function of fusing a plurality of material vectors into one material vector.
In one possible design manner, the second fusion model includes a plurality of second self-attention sub-modules and a second attention sub-module, where the plurality of second self-attention sub-modules are in one-to-one correspondence with the plurality of preset platforms. Inputting the plurality of initialization material vectors into the second fusion model, running the second fusion model, and outputting the fused material vector of the first candidate material includes: inputting the plurality of initialization material vectors into the corresponding second self-attention sub-modules, running the plurality of second self-attention sub-modules and the second attention sub-module, and outputting the fused material vector of the first candidate material. Each second self-attention sub-module is used for correcting the value of each material feature item in the corresponding initialization material vector according to a third weight corresponding to that material feature item, and the second attention sub-module is used for fusing the plurality of vectors corrected by the plurality of second self-attention sub-modules according to a fourth weight corresponding to each preset platform.
In one possible design, the material feature items include category, price, number of views, number of shares, and/or number of clicks.
In a second aspect, the embodiment of the application also provides a communication system, which comprises a first device and a second device, wherein the first device is used for receiving a pushing request and pushing personalized materials, and the second device is used for determining the personalized materials.
In a third aspect, embodiments of the present application further provide a computer readable storage medium, including first computer instructions, which when executed on a first device, cause the first device to perform the steps of receiving a push request and pushing personalized material as in the method of the first aspect and any one of its possible designs.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium comprising second computer instructions which, when run on a second device, cause the second device to perform the step of determining personalized materials in the method of the first aspect and any of its possible designs.
It will be appreciated that the advantages achieved by the communication system according to the second aspect, the computer readable storage medium according to the third aspect and the fourth aspect provided above may refer to the advantages in the first aspect and any possible design manner thereof, and are not described herein.
Drawings
FIG. 1 is a schematic structural diagram of a communication system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a mobile phone interface according to an embodiment of the present application;
FIG. 3 is an interaction schematic diagram of a communication system according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another mobile phone interface according to an embodiment of the present application;
FIG. 5 is a flow chart of an information pushing method according to an embodiment of the present application;
FIG. 6 is a block diagram of a machine learning model according to an embodiment of the present application;
FIG. 7 is a training schematic diagram of a self-attention module according to an embodiment of the present application;
FIG. 8 is a block diagram of a self-attention sub-module according to an embodiment of the present application;
FIG. 9 is a training schematic diagram of an attention module according to an embodiment of the present application;
FIG. 10 is a block diagram of an attention sub-module according to an embodiment of the present application;
FIG. 11 is a training schematic diagram of a preference calculation module according to an embodiment of the present application;
FIG. 12 is a block diagram of a trained user vector fusion model according to an embodiment of the present application;
FIG. 13 is a block flow diagram of obtaining an online recall database according to an embodiment of the present application;
FIG. 14 is a block flow diagram of obtaining an offline recall database according to an embodiment of the present application;
FIG. 15 is a block diagram of a method for implementing information push according to an embodiment of the present application;
FIG. 16 is a block diagram of a chip system according to an embodiment of the present application.
Detailed Description
Referring to FIG. 1, an embodiment of the present application provides a communication system including a first device (e.g., the mobile phone 110 shown in FIG. 1) and a second device (e.g., the server 120 shown in FIG. 1). The communication system can be used for information pushing in personalized pushing scenarios, that is, scenarios in which materials of interest to the user, such as videos, pictures, articles, commodities, and posts, are pushed. The first device may be configured to receive the push request and provide the user with the personalized material fed back by the second device, where the personalized material is material of interest to the user. The second device is used to determine the personalized material and return it to the first device.
By way of example, the first device may be an electronic device on which an APP with personalized push needs (e.g., a shopping APP or a video APP) can be installed and/or on which a website with personalized push needs can run, such as a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular telephone, a personal digital assistant (PDA), or an augmented reality (AR) or virtual reality (VR) device. The second device may be a cloud, a server, or another electronic device with strong computing capability, such as a mobile phone or a tablet computer. The embodiment of the application does not particularly limit the specific forms of the first device and the second device. Hereinafter, the description takes as an example the case where the first device is the mobile phone 110 shown in FIG. 1 and the second device is the server 120 shown in FIG. 1. The server 120 may be the server of an APP with personalized pushing requirements and/or of a website with personalized pushing requirements.
Referring to FIG. 2, when the mobile phone 110 detects that the user clicks the icon 220 of the video APP on the desktop 210, this is equivalent to receiving a push request. In response to the user clicking the icon 220, the mobile phone 110 may send a push request to the server 120. After receiving the push request, the server 120 may determine personalized videos and feed them back to the mobile phone 110. Finally, the mobile phone 110 may display an interface 230 that includes a plurality of personalized videos.
Typically, the server 120 needs to determine personalized materials according to attribute features and behavior features of the user, so as to implement personalized pushing. The attribute features may include information such as gender and age, and the behavior features may include behavior features of browsing, clicking, purchasing, sharing, and the like. For example, the latest mobile phones are pushed to users who are young and frequently browse electronic products. However, if the APP or the website has no attribute features of the user or the behavioral features of the user in the APP or the website are sparse, the server 120 cannot accurately determine the personalized material. For example, the server 120 cannot accurately determine personalized materials for a new user of the APP (i.e., the user corresponding to the new account).
Based on the above, the embodiment of the application also provides an information pushing method, which can be used in the communication system comprising the first device and the second device. Referring to fig. 3, after detecting a push request from a user to the first platform, the handset 110 sends the push request to the server 120. Wherein, the first platform can be APP or website. The pushing request carries the user account number of the first platform. After receiving the push request, the server 120 may fuse the user characteristics (including attribute characteristics and/or behavior characteristics) of the user account in the multiple preset platforms to obtain a fused user vector. The first platform may be any one of a plurality of preset platforms, and in the fusion process, the importance degree of each feature in the user features and the importance degree of the user features of the plurality of platforms may be referred to.
In this way, in a scenario where the same account can be logged in to multiple preset platforms (for example, the same Honor account can be logged in to platforms such as the Honor Mall APP, the My Honor APP, the Honor Video APP, and the Honor official website), even if the user account logged in to the first platform that currently initiates the push request has insufficient user features there, the user features of the user account in the other preset platforms can be fused, so that sufficient user features are available for matching personalized materials. Thus, for users (e.g., new users) who do not have sufficient user features in the current first platform, sufficient user features can still be obtained for matching personalized materials. Moreover, fusing based on importance degree, rather than by direct splicing, improves the rationality of the fusion.
The server 120 then determines the personalized material based on the fused user vectors and sends it to the mobile phone 110. The handset 110 provides the personalized material to the user. In this way, personalized materials can be accurately pushed for users (such as new users) with insufficient user characteristics on the current first platform.
It should be noted that, the different platforms may also be referred to as different domains, and the data fusion of multiple platforms may also be referred to as cross-domain fusion.
Embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In the present application, the mobile phone 110 may detect a push request from a user to any one of a plurality of preset platforms (i.e., the first platform), where the push request may trigger the server 120 to determine the personalized material.
Taking the example that the first platform is the video APP, the push request may be a click operation of the user on the icon 220 of the video APP in the desktop 210 shown in fig. 2, or the push request may be a slide refresh operation of the user from top to bottom (as shown by an arrow) in the application interface 410 of the video APP shown in fig. 4, or the push request may be an operation of the user sliding from bottom to top to "guess you still like" in the play interface 420 of the video APP shown in fig. 4.
When the mobile phone 110 detects a push request of the user to the first platform, the push request is sent to the server 120, where the push request carries a user account number (may also be referred to as a first account number) logged in by the first platform in the mobile phone 110. The user account may be logged in a plurality of preset platforms. The user account is carried in the push request, so that the user characteristics in a plurality of preset platforms can be conveniently acquired based on the user account to finish fusion.
The user may update his user characteristics during use of the first platform. For example, browsing, clicking, purchasing, sharing, etc. behavior features are increasing. As another example, the level information in the attribute features changes. For another example, after the user adds information such as gender and age, the attribute characteristics of the user are gradually improved. Based on this, in some embodiments, the push request may further carry a user characteristic (including a user attribute) of the user account in the first platform, so that the latest user characteristic may be sent to the server 120, so that the server 120 can determine the personalized material more accurately.
In other embodiments, the push request may not carry the user feature, but may be queried by the server 120 from the database of the first platform prior to fusion. Hereinafter, this will be mainly described.
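The two variants of the push request described above can be sketched as follows. This is a hypothetical data layout for illustration only: the class `PushRequest`, the function `collect_user_features`, and the dictionary-based "databases" are assumptions, not part of the described system.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class PushRequest:
    """Hypothetical push request sent by the first device."""
    account: str                          # user account logged in on the first platform
    platform: str                         # the first platform that initiated the request
    user_features: Optional[dict] = None  # latest features, if the request carries them

def collect_user_features(request: PushRequest,
                          platform_databases: Dict[str, Dict[str, dict]]) -> Dict[str, dict]:
    """Gather the account's user features on every preset platform.

    platform_databases maps platform name -> {account: features}. Features
    carried in the request replace the stored ones for the first platform;
    otherwise the first platform's database is queried like the others.
    """
    features = {
        platform: dict(database.get(request.account, {}))
        for platform, database in platform_databases.items()
    }
    if request.user_features is not None:
        features[request.platform] = dict(request.user_features)
    return features

# Hypothetical stored data for two preset platforms.
databases = {
    "shopping_app": {"account_a": {"gender": "male", "age": 25}},
    "video_app": {"account_a": {"gender": "male"}},
}
queried = collect_user_features(PushRequest("account_a", "video_app"), databases)
carried = collect_user_features(
    PushRequest("account_a", "video_app", {"gender": "male", "age": 25}), databases)
```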
After receiving the push request, the server 120 may query the user features, in the multiple preset platforms, of the user account carried in the push request, represent the user features of the user account in each preset platform in vector form to obtain an initial user vector (which may also be referred to as an initialization user vector), and then fuse the initial user vectors corresponding to the multiple preset platforms to obtain a fused user vector (which may also be referred to as a fusion user vector). The fused user vector is an enhanced representation of the user features of the plurality of preset platforms.
Illustratively, the user features include attribute features and behavioral features. Attribute characteristics include gender, age, membership grade. The behavioral characteristics include a sequence of clicks, i.e., a sequence of items that the user clicks on over a period of time (e.g., 3 days, 5 days, one week, etc.). And the plurality of preset platforms comprise shopping APP and video APP, and the user account initiating the push request is account a.
The user characteristics of account a in shopping APP are shown in table 1 below:
TABLE 1
User features | Gender | Age | Membership grade | Click sequence
Account a | Male | 25 | Gold | ID1, ID2, ID3
If ID1 is the ID of a computer, ID2 is the ID of a mobile phone, and ID3 is the ID of an earphone, then ID1, ID2, and ID3 indicate that the user clicked the computer, the mobile phone, and the earphone in sequence within a period of time.
The user characteristics of account a in video APP are shown in table 2 below:
TABLE 2
User features | Gender | Age | Membership grade | Click sequence
Account a | Male | 25 | Platinum | ID4, ID5, ID6
If ID4 is the ID of video 1, ID5 is the ID of video 2, and ID6 is the ID of video 3, ID4, ID5, and ID6 indicate that the user has clicked video 1, video 2, and video 3 in sequence within a period of time.
After receiving the push request, the server 120 may query the user characteristics of the account a in the shopping APP as shown in table 1, and query the user characteristics of the account a in the video APP as shown in table 2. Server 120 may then construct an initial user vector corresponding to the user feature in table 1 and an initial user vector corresponding to the user feature in table 2.
In a specific implementation manner, a corresponding vector may be assigned to a possible value condition of each user feature item in the user features, and then, when an initial user vector is determined, the vectors corresponding to the actual value of each user feature item may be spliced to obtain the initial user vector.
The vectors assigned to possible values of the individual user characteristic items referred to in the above tables 1 and 2 are shown in the following table 3, by way of example:
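TABLE 3
User feature item | Possible values | Assigned vectors | Vector size
Gender | male, female, unknown | A1, A2, A3 | 1×4
Age | 0-20, 21-40, 41-60, 61-80, 81-100 | B1, B2, B3, B4, B5 | 1×8
Membership grade | public, gold, platinum, diamond | C1, C2, C3, C4 | 1×4
Click sequence (material ID) | ID1, ID2, ..., ID100 | D1, D2, ..., D100 | 1×20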
Table 3 above shows that possible values of the user feature item "gender" include "male", "female" and "unknown", and these 3 values are assigned 1×4 vectors A1 to A3, respectively. The possible values of the user feature item "age" are divided into 5 age intervals of 0-20 years old, 21-40 years old, 41-60 years old, 61-80 years old and 81-100 years old, and the 5 age intervals are assigned 1×8 vectors B1 to B5, respectively. Possible values of the user feature item "membership grade" include "public", "gold", "platinum" and "diamond", and these 4 values are assigned 1×4 vectors C1 to C4, respectively. Possible values of the material ID in the user feature item "click sequence" include ID1, ID2, ..., ID100, i.e. 100 materials in total, and the 100 material IDs are assigned 1×20 vectors D1 to D100, respectively.
Then, for the user features shown in Table 1 above, the gender "male" takes the vector A1 in Table 3, the age 25 takes the vector B2 in Table 3, the membership grade "gold" takes the vector C2 in Table 3, and ID1, ID2, and ID3 in the click sequence take D1, D2, and D3 in Table 3 in order. Thus, the initial user vector for the user features in Table 1 may be represented as the vector obtained by splicing A1, B2, C2, D1, D2, and D3, i.e., a vector of size 1 × (4+8+4+20+20+20).
Similarly, for the user features shown in Table 2 above, the gender "male" takes the vector A1 in Table 3, the age 25 takes the vector B2 in Table 3, the membership grade "platinum" takes the vector C3 in Table 3, and ID4, ID5, and ID6 in the click sequence take D4, D5, and D6 in Table 3 in order. Thus, the initial user vector for the user features in Table 2 may be represented as the vector obtained by splicing A1, B2, C3, D4, D5, and D6, i.e., a vector of size 1 × (4+8+4+20+20+20).
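The lookup-and-splice construction can be sketched as follows. This is a minimal sketch under stated assumptions: the lookup tables are filled with placeholder random numbers (the actual vector values are not given), the per-item sizes follow the sizes stated for Table 3 (1×4 for gender, 1×8 for age, 1×4 for membership grade, 1×20 per material ID), and all names are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the placeholder tables are reproducible

def make_table(keys, size):
    """Assign a placeholder vector of the given size to every possible value."""
    return {key: [rng.random() for _ in range(size)] for key in keys}

AGE_INTERVALS = [(20, "0-20"), (40, "21-40"), (60, "41-60"), (80, "61-80"), (100, "81-100")]

GENDER = make_table(["male", "female", "unknown"], 4)                 # A1..A3
AGE = make_table([label for _, label in AGE_INTERVALS], 8)            # B1..B5
GRADE = make_table(["public", "gold", "platinum", "diamond"], 4)      # C1..C4
MATERIAL = make_table([f"ID{i}" for i in range(1, 101)], 20)          # D1..D100

def age_interval(age):
    """Map an age to one of the five age intervals."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for upper, label in AGE_INTERVALS:
        if age <= upper:
            return label
    raise ValueError("age outside the 0-100 range covered by the intervals")

def initial_user_vector(gender, age, grade, click_sequence):
    """Splice the vectors looked up for each user feature item, in order."""
    vector = []
    vector += GENDER[gender]
    vector += AGE[age_interval(age)]
    vector += GRADE[grade]
    for material_id in click_sequence:
        vector += MATERIAL[material_id]
    return vector

# Account a in the shopping APP (Table 1) and in the video APP (Table 2).
shopping_vector = initial_user_vector("male", 25, "gold", ["ID1", "ID2", "ID3"])
video_vector = initial_user_vector("male", 25, "platinum", ["ID4", "ID5", "ID6"])
```

The two vectors share the gender and age segments and differ in the membership-grade and click-sequence segments.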
In some scenarios, account a may not have sufficient user features in at least one of the plurality of preset platforms (which may be denoted as a second platform, which may also be the first platform). For example, the number of user features is less than a first threshold.
For this scenario, in some embodiments, when determining the initial user vector of account a in the second platform, the server 120 may use the mean vector corresponding to the user features of a plurality of user accounts (which may also be referred to as second accounts) in the second platform as the initial user vector of account a in the second platform. The plurality of user accounts in the second platform may be accounts whose number of user features is greater than a second threshold, so that the mean vector is obtained from accounts with sufficient user features. Alternatively, the mean vector corresponding to the user features of user accounts in the plurality of preset platforms may be used as the initial user vector of account a in the second platform. The mean vector refers to the mean of the plurality of initial user vectors corresponding to the plurality of user accounts, where the plurality of user accounts are accounts with sufficient user features. Thus, the user features of account a can be represented by the average over a plurality of users.
In other embodiments, the server 120 may directly use the preset vector or the random vector as the initial user vector of the account a in the second platform when determining the initial user vector of the account a in the second platform. In this way, the operation can be simplified.
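The fallbacks for an account with too few user features can be sketched as follows. The text presents the mean vector and the random (or preset) vector as alternative embodiments; trying the mean vector first and falling back to a random vector is an assumption made here for illustration, as are the function names and the value range of the random vector.

```python
import random

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def initial_vector(own_vector, feature_count, first_threshold,
                   reference_vectors=None, dim=None, rng=None):
    """Pick the initial user vector of an account on one platform.

    reference_vectors are the initial user vectors of accounts that are
    assumed to have already passed the second-threshold check.
    """
    if feature_count >= first_threshold:
        return list(own_vector)                # enough features: use the account's own vector
    if reference_vectors:
        return mean_vector(reference_vectors)  # mean-vector embodiment
    if dim is None:
        raise ValueError("dim is required when falling back to a random vector")
    rng = rng or random.Random()
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # random-vector embodiment

# Hypothetical vectors of two accounts with sufficient user features.
rich_accounts = [[1.0, 2.0], [3.0, 4.0]]
from_mean = initial_vector(None, feature_count=0, first_threshold=3,
                           reference_vectors=rich_accounts)
from_random = initial_vector(None, feature_count=0, first_threshold=3,
                             dim=4, rng=random.Random(0))
```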
In the examples of Table 1 and Table 2 above, the user features in the shopping APP and in the video APP both consist of "gender", "age", "membership grade", and "click sequence". That is, the user feature items included in the user features of the plurality of preset platforms are identical. Of course, the user feature items included in the user features of the plurality of preset platforms may also be only partially identical. For example, the behavior features of the user features in the shopping APP may further include a "purchase sequence". The embodiment of the present application is not particularly limited in this respect. In some embodiments, to facilitate the subsequent fusion calculation when the user feature items are only partially identical, the server 120 may, when generating an initial user vector, make a specific length interval in the vector represent a specific user feature item. If the user features of a platform do not include that user feature item, the interval may be filled with preset values.
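A fixed slot layout with preset filling can be sketched as follows. The slot sizes are arbitrary small numbers chosen for readability, the fill value 0.0 is an assumed preset value, and the names `SLOTS` and `layout_vector` are hypothetical.

```python
# Each user feature item owns a fixed interval of the vector, in this order.
SLOTS = [
    ("gender", 2),
    ("age", 2),
    ("membership_grade", 2),
    ("click_sequence", 3),
    ("purchase_sequence", 3),
]
PAD_VALUE = 0.0  # assumed preset fill value for feature items a platform lacks

def layout_vector(item_vectors):
    """Place each feature item's values in its slot, filling missing items."""
    vector = []
    for name, size in SLOTS:
        values = item_vectors.get(name)
        if values is None:
            values = [PAD_VALUE] * size
        if len(values) != size:
            raise ValueError(f"{name} must have exactly {size} values")
        vector += values
    return vector

# The video APP has no purchase sequence; the shopping APP has one.
video_items = {
    "gender": [0.1, 0.2],
    "age": [0.3, 0.4],
    "membership_grade": [0.5, 0.6],
    "click_sequence": [0.7, 0.8, 0.9],
}
shopping_items = dict(video_items, purchase_sequence=[1.0, 1.1, 1.2])

video_vector = layout_vector(video_items)
shopping_vector = layout_vector(shopping_items)
```

With this layout both vectors have the same length, and a given position always refers to the same user feature item, which is what the later fusion step relies on.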
After the initial user vector is constructed by the server 120, the initial user vectors corresponding to the multiple preset platforms may be fused to obtain the fused user vector. Fusion involves two processes:
In the first process, the server 120 corrects the initial user vector based on the importance degree of each user feature item of the user features in the preset platform. In the correction process, the server 120 may multiply the feature value of each user feature item in the initial user vector by a weight value corresponding to the importance degree, to obtain a corrected user vector.
For example, still referring to Table 1 and Table 3 above, suppose the initial user vector in the shopping APP is the vector obtained by concatenating A1, B2, C2, D1, D2, and D3, which may be written as [A1 B2 C2 D1 D2 D3]. If the weight corresponding to the importance degree of "gender" is k1, that of "age" is k2, that of "membership grade" is k3, and that of "click sequence" is k4, the initial user vector [A1 B2 C2 D1 D2 D3] may be corrected by multiplying the feature value A1 of "gender" by k1, the feature value B2 of "age" by k2, the feature value C2 of "membership grade" by k3, and the feature values D1, D2, and D3 of "click sequence" by k4. The corrected user vector is [k1×A1 k2×B2 k3×C2 k4×D1 k4×D2 k4×D3].
In some scenarios, the degree of influence of the same user feature item on the user selection of the material is different for different preset platforms, and therefore, the degree of importance of the same user feature item may be different in different preset platforms. For example, in shopping APP, "gender" has a greater impact on the user's choice of materials (e.g., merchandise), and so "gender" is of greater importance in shopping APP. In the video APP, the "gender" does not affect the user to a great extent on selecting the material (such as video), and thus, the importance of the "gender" in the video APP is smaller. Based on this, in some embodiments, to improve the rationality of the correction, the initial user vector corresponding to each preset platform may be corrected separately according to the importance degree of each user feature item of the user feature in the preset platform. For example, if the weights of "gender", "age", "membership grade", "click sequence" in the shopping APP are 0.3, 0.2, 0.1 and 0.4 in order, the feature values of "gender", "age", "membership grade", "click sequence" in the initial user vector corresponding to the shopping APP may be multiplied by 0.3, 0.2, 0.1 and 0.4 in order. The weights of the "gender", "age", "membership grade", "click sequence" in the video APP are 0.2, 0.1 and 0.5 in turn, and then the feature values of the "gender", "age", "membership grade", "click sequence" in the initial user vector corresponding to the video APP can be multiplied by 0.2, 0.1 and 0.5 in turn.
According to the first process, the user vector can be corrected according to the importance degree of each user characteristic item, so that the characteristic value of each user characteristic item in the corrected user vector is matched with the importance degree of the user characteristic item, and the reasonability of the user vector is improved.
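The first process can be illustrated with a short sketch. The numeric feature values below are placeholders invented for the example; the weights 0.3, 0.2, 0.1, and 0.4 are the shopping APP weights from the example above.

```python
def correct_vector(item_vectors, first_weights):
    """Scale every value of a feature item by that item's weight, then concatenate."""
    corrected = []
    for name, vec in item_vectors.items():
        corrected.extend(first_weights[name] * x for x in vec)
    return corrected


# Placeholder feature values (assumed); one sub-vector per user feature item.
shopping = {
    "gender": [1.0, 0.0],
    "age": [0.0, 1.0],
    "membership_grade": [1.0],
    "click_sequence": [2.0, 4.0, 6.0],
}
first_weights = {
    "gender": 0.3,
    "age": 0.2,
    "membership_grade": 0.1,
    "click_sequence": 0.4,
}
corrected = correct_vector(shopping, first_weights)
```

All values belonging to one feature item share one weight, which matches the case where D1, D2, and D3 of the click sequence are all multiplied by k4.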
In the second process, the server 120 fuses the corrected user vectors corresponding to the preset platforms based on the importance degrees of the user features of the preset platforms, so as to obtain the fused user vector. In the fusion process, the server 120 may multiply the corrected user vector corresponding to each preset platform by the weight value corresponding to that platform's importance degree, and then combine the weighted vectors to obtain the fused user vector. The importance degrees of the user features of the preset platforms may be the same, and correspondingly, the weight values corresponding to the preset platforms may be the same. For example, there are 2 preset platforms, and the weight value corresponding to the importance degree of the user features of each preset platform is 0.5. Alternatively, the importance degrees of the user features of the preset platforms may be different, and correspondingly, the weight values corresponding to the preset platforms may be different. For example, there are 2 preset platforms, the weight value corresponding to the importance degree of the user features of one preset platform is 0.4, and the weight value corresponding to the importance degree of the user features of the other preset platform is 0.6.
For example, if the corrected user vector corresponding to the shopping APP is U1, the corrected user vector corresponding to the video APP is U2, the weight value corresponding to the user features of the shopping APP is 0.4, and the weight value corresponding to the user features of the video APP is 0.6, the fused user vector U0 may be 0.4×U1+0.6×U2.
According to the second process, the user vectors can be fused according to the importance degrees of the preset platforms, so that the fused user vectors can comprise the user characteristics of the preset platforms, and the effect of enhancing the user vectors is achieved. And the fused user vector is also suitable for the importance degrees of the user characteristics of a plurality of preset platforms, so that the rationality of the user vector is further improved.
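A minimal sketch of the second process, using the platform weights 0.4 and 0.6 from the example above. The vector values are placeholders, and the length check is an added assumption that follows from using the same layout on every platform.

```python
def fuse(vectors, platform_weights):
    """Weighted element-wise sum of corrected user vectors, one per platform."""
    if len(vectors) != len(platform_weights):
        raise ValueError("need exactly one weight per platform vector")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all platform vectors must share one layout")
    return [
        sum(w * v[i] for v, w in zip(vectors, platform_weights))
        for i in range(length)
    ]


u1 = [1.0, 0.0, 2.0]  # corrected vector, shopping APP (placeholder values)
u2 = [0.0, 1.0, 1.0]  # corrected vector, video APP (placeholder values)
u0 = fuse([u1, u2], [0.4, 0.6])  # U0 = 0.4*U1 + 0.6*U2
```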
In a typical scenario, the plurality of preset platforms includes shopping APP and video APP shown in fig. 5, and the user initiates a push request from the shopping APP of the mobile phone 110, i.e., the first platform is the shopping APP. Meanwhile, the user account logged in the shopping APP of the mobile phone 110 is account a. The user of account a in shopping APP has very few features, e.g., little to no behavioral features, and even no attribute features. The number of user features of account a in video APP is normal. In this scenario, the server 120 may perform fusion processing on a mean vector (such as a mean vector u1 shown in fig. 5) corresponding to the user account included in the shopping APP and an initial user vector (such as a user vector u2 shown in fig. 5) of the account a in the video APP, to obtain a fused user vector. In this way, even if account a has few user features in the shopping APP that originated the push request, server 120 may generate a user vector in combination with the user features of account a in the video APP such that the generated user vector is associated with account a.
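The sparse-feature scenario can be sketched as below. The vectors are placeholder values, and equal platform weights of 0.5 are assumed only for illustration; in practice the weights would be the second weights discussed in this description.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length user vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fuse_pair(a, b, wa, wb):
    """Weighted element-wise sum of two vectors."""
    return [wa * x + wb * y for x, y in zip(a, b)]


# User vectors of accounts in the shopping APP (placeholder values).
shopping_users = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
u1 = mean_vector(shopping_users)  # stands in for sparse account a
u2 = [0.0, 4.0, 2.0]  # account a's own vector in the video APP
fused = fuse_pair(u1, u2, 0.5, 0.5)
```

The mean vector carries no information specific to account a, so the link to account a in the fused result comes entirely from `u2`.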
The server 120 needs to use the weight value of each user feature item and the weight value of each preset platform in the process of obtaining the fused user vector based on the initial user vector. In order to distinguish, the weight value of each user characteristic item may be marked as a first weight, and the weight value of the preset platform may be marked as a second weight. There may be a plurality of first weights and second weights. For example, there are n preset platforms, and each preset platform has m user feature items for the user feature, and then there may be n×m first weights and n second weights. The first weight and the second weight may be set empirically or may be obtained based on big data analysis. In some embodiments, it may also be obtained by machine learning. The following description is mainly made in a machine learning manner.
Before the first weight and the second weight are obtained by machine learning, a large number of user feature samples and material feature samples can be collected from each preset platform. A user feature sample has the same composition as the user features described above: it includes attribute features such as gender, age, and membership grade, and behavior features such as clicking, purchasing, and sharing, which are not repeated here. It should be noted that the attribute features or behavior features of the same user generally change over time, so the user features of the user may differ between periods. For example, if the user corresponding to account a in the shopping APP has the click sequence ID9, ID5, ID30 in the first week of September and the click sequence ID3, ID42, ID55, ID8 in the second week of September, the user features of account a in the shopping APP in the first week of September differ from those in the second week of September. Based on this, in some embodiments, multiple sets of user feature samples of the same account may be collected in the same preset platform. For example, the user feature sample of account a for the first week of September may be collected as gender "male", age "25 years", membership grade "gold", click sequence "ID9, ID5, ID30", and the user feature sample of account a for the second week of September may be collected as gender "male", age "25 years", membership grade "gold", click sequence "ID3, ID42, ID55, ID8".
The material feature sample includes a material ID, a material title, material keywords, behavior statistical features, and the like. The behavior statistical features include the number of times the material has been browsed, clicked, shared, collected, purchased, and so on. Then, a user vector sample is constructed for each user feature sample, yielding a large number of user vector samples corresponding to the large number of user feature samples. For a specific implementation of constructing the user vector sample corresponding to a user feature sample, reference may be made to the foregoing description of constructing the initial user vector, which is not repeated here. Likewise, a material vector sample is constructed for each material feature sample, yielding a large number of material vector samples corresponding to the large number of material feature samples. The procedure is similar to that for obtaining the initial user vector: a vector is assigned to each possible value of each material feature item, and when a material vector sample is constructed, the vectors corresponding to the actual values of the material feature items are concatenated to obtain the material vector sample. For example, a material feature sample is shown in table 4 below:
Table 4
Material ID | Number of browses | Number of collections
ID2 | 1000 | 50
And, the vectors assigned to the possible values of the individual material characteristic items are shown in the following table 5:
TABLE 5
Table 5 above shows that the possible values of the material feature item "number of browses" may be divided into 3 intervals, namely 0-500, 501-1000, and 1001 or more, and 1×4 vectors E1 to E3 are assigned to the 3 intervals respectively. The possible values of the material feature item "number of collections" may be divided into 5 intervals, namely 0-20, 21-50, 51-100, 101-500, and 501 or more, and 1×6 vectors F1 to F5 are assigned to the 5 intervals respectively. The possible values of the material feature item "material ID" include ID1, ID2 … … ID100, that is, 100 materials in total, and 1×20 vectors D1 to D100 are assigned to the 100 material IDs respectively.
Then, for the material feature sample shown in table 4 above, the material ID "ID2" takes the vector D2 in table 5, the number of browses "1000" takes the vector E2 in table 5, and the number of collections "50" takes the vector F2 in table 5. Thus, the material vector sample of the material feature sample in table 4 may be represented as the vector obtained by concatenating D2, E2, and F2, that is, a 1×(20+4+6) vector.
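The interval lookup and concatenation above can be sketched as follows. The interval boundaries and vector sizes come from the description of table 5; the embedding values themselves are placeholders, since the actual vectors are not given.

```python
from bisect import bisect_left

BROWSE_UPPER = [500, 1000]  # intervals: 0-500, 501-1000, 1001 or more
COLLECT_UPPER = [20, 50, 100, 500]  # 0-20, 21-50, 51-100, 101-500, 501 or more


def interval_index(count, upper_bounds):
    """0-based index of the interval that contains count."""
    return bisect_left(upper_bounds, count)


# Placeholder embeddings (assumed values): E is 3 x 4, F is 5 x 6, D is 100 x 20.
E = [[float(i)] * 4 for i in range(1, 4)]
F = [[float(i)] * 6 for i in range(1, 6)]
D = {f"ID{i}": [float(i)] * 20 for i in range(1, 101)}


def material_vector(material_id, browses, collections):
    """Concatenate the ID, browse-interval, and collection-interval vectors."""
    e = E[interval_index(browses, BROWSE_UPPER)]
    f = F[interval_index(collections, COLLECT_UPPER)]
    return D[material_id] + e + f


sample = material_vector("ID2", 1000, 50)  # D2, E2, F2 for the table 4 sample
```

`bisect_left` is used so that a count equal to an upper bound (for example 1000 or 50) stays in the lower interval, as the interval definitions require.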
The constructed large number of user vector samples and material vector samples can be used as training input samples for machine learning.
Moreover, after a large number of user feature samples and material feature samples are collected from a plurality of preset platforms, preference tags between the user feature samples and the material feature samples need to be labeled based on behavior features. If the account pointed by the user feature sample has browsing, clicking, sharing, collecting, purchasing and other actions on the material pointed by the material feature sample, the preference label between the user feature sample and the material feature sample can be marked as a first label, such as 1, true and the like. The first tag indicates that the user of the account to which the user characteristic sample points likes the material to which the material characteristic sample points. For example, in the case of logging in the account a on the shopping APP of the mobile phone 110, if the user clicks the computer in the shopping APP, the preference label between the user vector sample of the account a and the material feature sample of the computer may be labeled as 1, which indicates that the user using the account a likes the computer. If the account pointed by the user feature sample does not have browsing, clicking, sharing, collecting, purchasing and other actions on the material pointed by the material feature sample, the preference label between the user feature sample and the material feature sample can be marked as a second label, such as 0, false and the like. The second tag indicates that the user of the account to which the user characteristic sample points dislikes the material to which the material characteristic sample points. 
For example, in the case that the user logs in to the account a on the shopping APP of the mobile phone 110, if the user never clicks the earphone in the shopping APP, the preference label between the user vector sample of the account a and the material feature sample of the earphone may be marked as 0, which indicates that the user using the account a does not like the earphone.
The preference label obtained by the labeling can be used as a training output sample. It should be understood that the training input samples correspond one-to-one to the training output samples, and a training output sample may be understood as the standard output corresponding to a training input sample. Illustratively, a user vector sample 1 may be constructed from a user feature sample 1, and a material vector sample 1 may be constructed from a material feature sample 1. A preference label 1 between the user feature sample 1 and the material feature sample 1 may be labeled based on the behavior features of the account to which the user feature sample 1 points. Then user vector sample 1 and material vector sample 1 may be used as a set of training input samples, and preference label 1 may be used as the training output sample corresponding to that set of training input samples.
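The labeling rule and the pairing of inputs with outputs can be sketched as follows. The account names, material names, vectors, and the `interactions` set are hypothetical; any recorded behavior (browse, click, share, collect, purchase) is assumed to have been reduced to an (account, material) pair beforehand.

```python
def preference_label(account, material, interactions):
    """1 if the account has any recorded behavior on the material, else 0."""
    return 1 if (account, material) in interactions else 0


def build_training_set(user_vectors, material_vectors, interactions):
    """Pair every user vector with every material vector and attach a label."""
    samples = []
    for account, u in user_vectors.items():
        for material, m in material_vectors.items():
            label = preference_label(account, material, interactions)
            samples.append(((u, m), label))
    return samples


# Hypothetical data: account a clicked the computer but never the earphone.
interactions = {("a", "computer")}
user_vectors = {"a": [0.3, 0.2]}
material_vectors = {"computer": [1.0, 0.0], "earphone": [0.0, 1.0]}
training_set = build_training_set(user_vectors, material_vectors, interactions)
```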
In summary, a large number of user vector samples and material vector samples may be obtained as training input samples, and a corresponding large number of preference labels may be obtained as output samples for machine learning. It should be appreciated that a large number of training input samples and training output samples are available for each preset platform.
Before the first weight and the second weight are obtained by adopting a machine learning mode, a model framework of machine learning is also required to be built. Referring to fig. 6, a machine-learned model architecture may include, from input to output, a self-attention module, an attention module, and a preference calculation module.
The self-attention module, the attention module, and the preference calculation module are described below in connection with a training process, respectively.
First, a self-attention module. The self-attention module employs a self-attention mechanism. As shown in fig. 6, during the training process, the input of the self-attention module is the training input samples (i.e., the user vector samples and the material vector samples), and the output is the corrected user vector samples and the corrected material vector samples.
The self-attention module comprises a plurality of groups of self-attention sub-modules corresponding to a plurality of preset platforms, and each self-attention sub-module is used for correcting user vectors and material vectors of the corresponding preset platform. One set of self-attention sub-modules includes two self-attention sub-modules, one for correction of user vectors and the other for correction of material vectors.
Taking the case where the plurality of preset platforms are the shopping APP and the video APP as an example, referring to fig. 7, the self-attention module includes two groups of self-attention sub-modules. One group includes a self-attention sub-module 11 and a self-attention sub-module 12; the self-attention sub-module 11 is used for correcting user vectors of the shopping APP, and the self-attention sub-module 12 is used for correcting material vectors of the shopping APP. The other group includes a self-attention sub-module 21 and a self-attention sub-module 22; the self-attention sub-module 21 is used for correcting user vectors of the video APP, and the self-attention sub-module 22 is used for correcting material vectors of the video APP.
In the training process, the user vector samples and the material vector samples of each preset platform can be respectively used as the input of the corresponding self-attention sub-module. For example, a user vector sample of the shopping APP is taken as the input of the self-attention sub-module 11 shown in fig. 7, and a material vector sample of the shopping APP is taken as the input of the self-attention sub-module 12 shown in fig. 7. In the training process, the self-attention sub-module for correcting the user vector can continuously learn the weight values (namely the first weights) of the user feature items (such as "gender" and "age") in the corresponding preset platform, and the first weights can be used for correcting the user vector. For example, the self-attention sub-module 11 shown in fig. 7 may continuously learn the weight values of the user feature items in the shopping APP, and the self-attention sub-module 21 shown in fig. 7 may continuously learn the weight values of the user feature items in the video APP. Likewise, the self-attention sub-module for correcting the material vector can continuously learn the weight values (which can be recorded as third weights) of the material feature items (such as "material ID" and "number of browses") in the corresponding preset platform, and the third weights can be used for correcting the material vector. For example, the self-attention sub-module 12 shown in fig. 7 may continuously learn the weight values of the material feature items in the shopping APP, and the self-attention sub-module 22 shown in fig. 7 may continuously learn the weight values of the material feature items in the video APP.
The weight value of each user feature item and of each material feature item can be obtained by calculating a similarity, and the similarity can be calculated using functions such as a dot product, concatenation, or a perceptron.
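One common way to turn similarities into weights is a softmax over dot-product scores. The sketch below assumes that form; the application does not specify it. The `query` vector is a hypothetical learned parameter, and each feature item is assumed to be represented by a vector of the same length as the query.

```python
import math


def softmax(scores):
    """Numerically stable softmax; the outputs are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def item_weights(item_vectors, query):
    """Dot-product similarity of each feature item with the query, then softmax."""
    names = list(item_vectors)
    scores = [sum(a * b for a, b in zip(item_vectors[n], query)) for n in names]
    return dict(zip(names, softmax(scores)))


items = {"gender": [1.0, 0.0], "age": [0.0, 1.0], "click_sequence": [1.0, 1.0]}
query = [0.5, 1.0]  # hypothetical learned query vector
weights = item_weights(items, query)
```

During training, the query (and any projections applied to the item vectors) would be adjusted from the fed-back error, which in turn changes the weights.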
It should be appreciated that each of the self-attention sub-modules in the self-attention module is based on the training error fed back by the preference calculation module to achieve continuous learning of the first and third weights.
Then, after each iteration, the self-attention sub-module for user vector correction may obtain a corrected user vector sample using the first weight correction learned currently, and the self-attention sub-module for material vector correction may obtain a corrected material vector sample using the third weight correction learned currently.
For example, if the first weights of the currently learned gender, age, membership grade, and click sequence are k1', k2', k3', and k4' in order, the value of the gender in the user vector sample may be multiplied by k1', the value of the age multiplied by k2', the value of the membership grade multiplied by k3', and the value of the click sequence multiplied by k4'. If the third weights of the current learned material ID, the browsing times and the collection times are K1', K2' and K3 'in sequence, the value of the material ID in the material vector sample may be multiplied by K1', the browsing times by K2 'and the collection times by K3'.
Further, in order to improve the reasonability of the user vectors and the material vectors corrected by the Self-Attention sub-modules, referring to fig. 8, a Self-Attention mechanism layer (Self-Attention) and a normalization (Normalize) layer may be included in the Self-Attention sub-modules.
In the self-attention sub-module for user vector correction, the self-attention mechanism layer may continually learn the first weights and correct the user vector samples using the first weights currently learned. And in the self-attention sub-module for material vector correction, the self-attention mechanism layer can continuously learn the third weight and correct the material vector sample by using the third weight which is learned currently.
In the self-attention sub-module for user vector correction, the normalization layer may normalize data of each column in correction results obtained by the self-attention mechanism layer and corresponding to a large number of user vector samples, so that the data of each column may be reasonably distributed, for example, be normally distributed. In the self-attention sub-module for material vector correction, the normalization layer may normalize data of each column in correction results obtained by the self-attention mechanism layer and corresponding to a large number of material vector samples, so that the data of each column may be reasonably distributed, for example, be normally distributed.
Illustratively, the correction results are shown below.
Correction result 1: [ a11×k1' a12×k2' a13×k3' ]
Correction result 2: [ a21×k1' a22×k2' a23×k3' ]
Correction result 3: [ a31×k1' a32×k2' a33×k3' ]
Correction result 4: [ a41×k1' a42×k2' a43×k3' ]
Where aij represents the value of the j-th element of the i-th user vector sample, k1', k2', and k3' are the 3 first weights. Then, normalizing the data for each column includes: normalizing a11×k1', a21×k1', a31×k1 'and a41×k1'; normalizing a12×k2', a22×k2', a32×k2 'and a42×k2'; and, a13×k3', a23×k3', a33×k3', and a43×k3' are normalized.
In one particular implementation, the correction results in the self-attention mechanism layer may be added to the original user vector samples or material vector samples prior to the normalization process. For example, when the correction result is [a11×k1' a12×k2' a13×k3'], adding the original user vector sample gives [a11+a11×k1' a12+a12×k2' a13+a13×k3'].
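The addition of the original sample followed by per-column normalization can be sketched as below. Standardizing each column to zero mean and unit variance is an assumption; the description only requires that each column end up reasonably distributed. The sample values and weights are placeholders.

```python
from statistics import mean, pstdev


def residual_correct(sample, first_weights):
    """Add the weighted correction back onto the original: a + a * k."""
    return [a + a * k for a, k in zip(sample, first_weights)]


def normalize_columns(rows, eps=1e-8):
    """Standardize each column across all samples (assumed z-score form)."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [[(x - m) / (s + eps) for x, (m, s) in zip(row, stats)] for row in rows]


k = [0.5, 0.25, 1.0]  # placeholder first weights k1', k2', k3'
samples = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [5.0, 6.0, 7.0], [7.0, 6.0, 5.0]]
corrected = [residual_correct(s, k) for s in samples]
normalized = normalize_columns(corrected)
```

Normalization runs over columns rather than rows, so each position of the vector is compared across the batch of samples, as in the four correction results listed above.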
Of course, a fully connected layer (DNN) may also be included in the self-attention sub-module. The full connection layer can be used for combining a plurality of user vector samples of the same user account into one user vector sample, or the full connection layer can be used for performing data dimension reduction processing. This is not repeated here.
Second, an attention module. The attention module employs an attention mechanism. As shown in fig. 6, during the training process, the input of the attention module is the corrected user vector samples and corrected material vector samples output by the self-attention module, and the output is the cross-platform fused user vector samples and material vector samples.
The attention module comprises two attention sub-modules: one attention sub-module is used for fusing the user vectors of the plurality of preset platforms, and the other attention sub-module is used for fusing the material vectors of the plurality of preset platforms.
Taking a plurality of preset platforms as shopping APP and video APP as an example, referring to fig. 9, the attention module includes an attention sub-module 1 and an attention sub-module 2. The attention sub-module 1 is used for the corrected user vector fusion in the shopping APP and the video APP. The attention sub-module 2 is used for carrying out corrected material vector fusion in the shopping APP and the video APP.
In the training process, the user vector samples corrected by the plurality of preset platforms can be used as the input of one attention sub-module, and the material vector samples corrected by the plurality of preset platforms can be used as the input of another attention sub-module. For example, the shopping APP and video APP modified user vector samples are taken as inputs to the attention sub-module 1 shown in fig. 9, and the shopping APP and video APP modified material vector samples are taken as inputs to the attention sub-module 2 shown in fig. 9. In the training process, the attention sub-module for user vector fusion can continuously learn the weight value (namely the second weight) of each preset platform, and the second weight can be used for user vector fusion. For example, the attention sub-module 1 shown in fig. 9 may constantly learn the weight values of shopping APP and video APP, respectively. And the attention sub-module for material vector fusion can continuously learn the weight value (which can be recorded as a fourth weight) of each preset platform, and the fourth weight can be used for material vector fusion. For example, the attention sub-module 2 shown in fig. 9 may constantly learn the weight values of shopping APP and video APP, respectively.
It should be appreciated that each of the attention sub-modules in the attention module is enabled to learn the second weight and the fourth weight continuously based on the training error fed back by the preference calculation module.
Then, in the training process, after each iteration, the attention sub-module for user vector fusion may use the currently learned second weights to fuse the corrected user vector samples of the plurality of preset platforms, and specifically may fuse the corrected user vector samples of the same user account. If, for a certain user account, a user vector sample was collected and initialized in only one of the preset platforms, only the corrected user vector sample of that user account for that one preset platform is available. In this case, the fused user vector sample may be obtained by multiplying that user vector sample by the second weight of that preset platform. Likewise, after each iteration, the attention sub-module for material vector fusion may use the currently learned fourth weights to fuse the corrected material vector samples of the plurality of preset platforms, and specifically may fuse the corrected material vector samples of the same material. If, for a certain material, a material vector sample was collected and initialized in only one of the preset platforms, only the corrected material vector sample of that material for that one preset platform is available. In this case, the fused material vector sample may be obtained by multiplying that material vector sample by the fourth weight of that preset platform.
For example, taking a plurality of preset platforms as shopping APP and video APP, correcting the user vector sample of the account a in the shopping APP through the self-attention module to obtain a corrected user vector sample as U1', and correcting the user vector sample of the account a in the video APP through the self-attention module to obtain a corrected user vector sample as U2'. If the second weight of the corresponding shopping APP which is learned currently is m1', and the second weight of the corresponding video APP is m2', the user vector sample fused by aiming at the account a is U1'×m1' +U2'×m2'.
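The fusion rule, including the case where an account has a sample on only one platform, can be sketched as follows. The platform keys, vector values, and weight values are placeholders; `None` is used here as an assumed marker for a missing platform sample.

```python
def fuse_available(corrected, second_weights):
    """Weighted sum over the platforms that actually have a corrected sample."""
    present = {p: v for p, v in corrected.items() if v is not None}
    if not present:
        raise ValueError("no corrected sample on any platform")
    length = len(next(iter(present.values())))
    fused = [0.0] * length
    for platform, vec in present.items():
        for i, x in enumerate(vec):
            fused[i] += second_weights[platform] * x
    return fused


second_weights = {"shopping": 0.4, "video": 0.6}  # placeholder m1', m2'
both = fuse_available({"shopping": [1.0, 2.0], "video": [3.0, 4.0]}, second_weights)
only_shopping = fuse_available({"shopping": [1.0, 2.0], "video": None}, second_weights)
```

With both platforms present the result is U1'×m1'+U2'×m2'; with one platform present it reduces to that platform's sample times its own weight.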
In a specific implementation, the attention sub-module may include an attention mechanism layer (Attention) and one or more fully connected layers (DNN). As shown in fig. 10, fully connected layers are arranged both above and below the attention mechanism layer. In the attention sub-module for user vector fusion, the attention mechanism layer may continuously learn the second weights and fuse the user vector samples using the currently learned second weights. In the attention sub-module for material vector fusion, the attention mechanism layer may continuously learn the fourth weights and fuse the material vector samples using the currently learned fourth weights.
Third, a preference calculation module. As shown in fig. 6, in the training process, the input of the preference calculation module is the fused user vector sample and material vector sample output by the attention module, and the output is the preference probability of the user on the material.
Referring to fig. 11, the preference calculation module mainly includes a feature cross sub-module and a probability calculation sub-module. The feature cross sub-module is used for cross fusion of the user vector and the material vector. Further, the feature cross sub-module may include a fully connected layer (DNN) and an attention mechanism layer (Attention): the importance weights of different features may first be obtained through the fully connected layer, and the cross of the user vector and the material vector may then be obtained through the attention mechanism layer.
The probability calculation submodule is used for calculating the preference probability of the user pointed by the user vector on the material pointed by the material vector, namely the preference probability of the user on the material. In a specific implementation, the probability calculation sub-module may use a softmax function to obtain a probability distribution of preference probabilities.
After the preference probability of the user for the material is calculated, the error can be calculated in combination with the labeled preference label. For example, the error may be calculated using a cross entropy formula. When the error does not meet the preset condition, for example when the error is greater than a preset threshold, the error can be fed back to the self-attention module, the attention module, and the preference calculation module in the model architecture, so that each module can adjust its own parameters accordingly. For example, the self-attention module may adjust the first weights and the third weights, the attention module may adjust the second weights and the fourth weights, and the preference calculation module may adjust the importance weights of different features. Training iterates in this way until the error meets the preset condition, for example until the error is smaller than the preset threshold, at which point training can end. After the training is finished, the plurality of self-attention sub-modules for user vector correction in the self-attention module (also referred to as a plurality of first self-attention sub-modules, such as self-attention sub-module 11 and self-attention sub-module 21) and the attention sub-module for user vector fusion in the attention module (also referred to as a first attention sub-module, such as attention sub-module 1) can form a fusion model 1 (also referred to as a first fusion model).
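The softmax output, cross entropy error, and threshold check can be sketched as follows. The two-class logit layout (index 0 for "dislike", index 1 for "like"), the logit values, and the threshold are assumptions for illustration; the parameter update itself is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over the class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-probability assigned to the labeled class."""
    return -math.log(max(probs[label], eps))


def training_error(batch_logits, labels):
    """Mean cross entropy over a batch of (logits, preference label) pairs."""
    losses = [cross_entropy(softmax(l), y) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


THRESHOLD = 0.5  # hypothetical preset threshold
undecided = training_error([[0.0, 0.0]], [1])  # equal scores give ln 2
confident = training_error([[-4.0, 4.0], [4.0, -4.0]], [1, 0])
keep_training = undecided > THRESHOLD  # error still too large, feed it back
```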
Then, based on the initial user vector, the first weight and the second weight, the server 120 may specifically obtain the fused user vector by: and inputting a plurality of initial user vectors corresponding to a plurality of preset platforms into the fusion model 1, and obtaining the fused user vectors according to the correction of the first weight and the fusion of the second weight.
For example, referring to fig. 12, an initial user vector corresponding to a shopping APP is input to the self-attention sub-module 11 in the fusion model 1, an initial user vector corresponding to a video APP is input to the self-attention sub-module 21 in the fusion model 1, and finally the attention sub-module 1 in the fusion model 1 may output the fused user vector.
After the server 120 calculates the fused user vector, the preference of the user to the candidate materials (materials available on the first platform) can be calculated based on the fused user vector, and the materials with the highest preference degree are selected to be sent to the mobile phone 110 as personalized materials, and finally the mobile phone 110 can push the personalized materials to the user.
After the training is finished, a plurality of self-attention sub-modules (also referred to as a plurality of second self-attention sub-modules, such as self-attention sub-module 12 and self-attention sub-module 22, which are in one-to-one correspondence with a plurality of preset platforms) for material vector correction in the self-attention module, and an attention sub-module (also referred to as a second attention sub-module, such as attention sub-module 2) for material vector fusion in the attention module can form a fusion model 2 (also referred to as a second fusion model). In some embodiments, a plurality of material vector samples of the preset platform may be used as input of the fusion model 2, the fusion model 2 is operated to obtain a fused material vector, and the fused material vector is stored. The stored fused material vector may be used as a material index.
For example, referring to fig. 13, a material vector sample corresponding to a shopping APP is input to the self-attention sub-module 12 in the fusion model 2, a material vector sample corresponding to a video APP is input to the self-attention sub-module 22 in the fusion model 2, and finally the attention sub-module 2 in the fusion model 2 may output a fused material vector. The fused material vector output by the attention sub-module 2 in the fusion model 2 may then be stored in an object storage service library (Object Storage Service, OBS).
In this embodiment, after the fused user vector is obtained by calculation, the server 120 may calculate the preference of the user for the materials (may also be referred to as multiple candidate materials) that may be provided by the first platform based on the user vector and the stored material index, select multiple materials with the highest preference degree as personalized materials, send the personalized materials to the mobile phone 110, and finally the mobile phone 110 may push the personalized materials to the user. The preference degree of the user on the materials can be obtained by calculating the similarity between the fused user vector and the material index.
In a specific implementation, the process of calculating the preference of the user for the materials available on the first platform by the server 120 based on the user vector and the stored material index may be implemented based on a Faiss framework.
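The similarity-based selection described above can be sketched as a brute-force search. This is a minimal sketch, assuming inner-product similarity and a small in-memory material index; it deliberately does not call the Faiss API, and the function names and sample vectors are hypothetical.

```python
import heapq

def inner_product(a, b):
    # Similarity measure assumed for this sketch.
    return sum(x * y for x, y in zip(a, b))

def recall_top_k(user_vector, material_index, k):
    # Score every stored fused material vector against the fused user vector
    # and return the IDs of the k most similar materials, best first.
    scored = ((inner_product(user_vector, vec), material_id)
              for material_id, vec in material_index.items())
    return [material_id for _, material_id in heapq.nlargest(k, scored)]

# Hypothetical material index: material ID -> fused material vector.
material_index = {
    "m1": [1.0, 0.0],
    "m2": [0.0, 1.0],
    "m3": [0.7, 0.7],
}
personalized = recall_top_k([0.9, 0.1], material_index, k=2)
```

A vector search library such as Faiss would replace the exhaustive loop when the material index is large, but the selection rule stays the same: keep the materials whose fused vectors are most similar to the fused user vector.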
According to the embodiment of the application, for a scenario in which the user characteristics of the first platform are sparse, the user characteristics in the plurality of preset platforms can be fused to generate the user vector. After a push request is received, the user vector is obtained by real-time fusion and the matching personalized materials are calculated, so that the personalized materials can be pushed in real time. Because the user vector is obtained by real-time fusion, a change in the user characteristics is reflected in the user vector, which makes it convenient to track changes in the user's preferences and to recommend materials that meet the new preferences.
Of course, in other embodiments, off-line pushing of personalized materials may also be implemented. In this embodiment, after the fusion model 1 and the fusion model 2 are obtained by training, a plurality of user vector samples of a preset platform may be used as input of the fusion model 1, and the fusion model 1 is operated to obtain a fused user vector; and taking the material vector samples of a plurality of preset platforms as the input of the fusion model 2, and operating the fusion model 2 to obtain the fused material vector.
For example, referring to fig. 14, a user vector sample corresponding to a shopping APP is input to the self-attention sub-module 11 in the fusion model 1, a user vector sample corresponding to a video APP is input to the self-attention sub-module 21 in the fusion model 1, and finally the attention sub-module 1 in the fusion model 1 may output the fused user vector. And inputting the material vector sample corresponding to the shopping APP to the self-attention sub-module 12 in the fusion model 2, inputting the material vector sample corresponding to the video APP to the self-attention sub-module 22 in the fusion model 2, and finally outputting the fused material vector by the attention sub-module 2 in the fusion model 2.
In this embodiment, for each fused user vector, the similarity between the fused user vector and each fused material vector is calculated, so as to obtain a preset number of fused material vectors with the highest similarity to the fused user vector. In this way, an index relationship can be obtained between the user pointed to by the fused user vector and the materials pointed to by the fused material vectors with the highest similarity. The index relationship is then stored, for example in the Redis database shown in fig. 14. Subsequently, after receiving the push request, the server 120 may query the index relationship based on the user account carried in the push request, and determine the materials associated with the user account as personalized materials. Personalized materials can thus be determined based on the index relationship stored offline.
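The offline flow above can be sketched as a precomputed lookup table. This is a minimal sketch, assuming inner-product similarity; a plain Python dictionary stands in for the offline database, and the account names, material IDs and vectors are hypothetical.

```python
def similarity(a, b):
    # Inner-product similarity (assumption for this sketch).
    return sum(x * y for x, y in zip(a, b))

def build_offline_index(user_vectors, material_vectors, top_n):
    # For each fused user vector, keep the IDs of the top_n fused material
    # vectors with the highest similarity (the "index relationship").
    index = {}
    for account, user_vec in user_vectors.items():
        ranked = sorted(
            material_vectors,
            key=lambda mid: similarity(user_vec, material_vectors[mid]),
            reverse=True,
        )
        index[account] = ranked[:top_n]
    return index

def lookup_personalized(index, account):
    # At push time, only a key lookup by user account is needed.
    return index.get(account, [])

# Hypothetical fused vectors produced offline by the two fusion models.
user_vectors = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
material_vectors = {"m1": [1.0, 0.0], "m2": [0.0, 1.0], "m3": [0.6, 0.8]}

offline_index = build_offline_index(user_vectors, material_vectors, top_n=2)
materials_for_alice = lookup_personalized(offline_index, "alice")
```

The trade-off relative to the real-time mode is that the push request is answered by a single lookup, at the cost of results that only reflect the user characteristics available when the index was last built.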
To further enhance the understanding of the embodiments of the present application, the following describes the inventive solution in connection with a complete example shown in fig. 15. In this example, the server 120 further includes a data collection module, a data fusion module, a training optimization module, a result storage module, and a service push module.
The data collection module can collect user portraits, material portraits and behavioral interaction characteristics in a plurality of preset platforms (namely multiple domains, the same applies below). The user portrayal mainly comprises characteristics, namely attribute characteristics, of gender, age, membership grade and the like which do not relate to the user behavior. The material portrayal mainly comprises characteristics of material category, price, title, keywords and the like which do not relate to user behavior. The behavior interaction features comprise the features of clicking, browsing, sharing, collecting, purchasing and other behaviors of the user on the materials. Taking click behavior as an example, the behavior interaction features at least include a user identifier (such as a user account) of a user performing the click behavior and a material identifier (such as a material ID) of a clicked material.
It should be appreciated that the data collection module may update the user portraits, material portraits, and behavioral interaction characteristics in real-time or periodically.
The data fusion module can extract user characteristics and material characteristics from the data collection module. For example, the user characteristics of domain A (e.g., shopping APP), the user characteristics of domain B (e.g., video APP), and so on, as well as the material characteristics of domain A (e.g., shopping APP), the material characteristics of domain B (e.g., video APP), and so on, may be extracted. It should be appreciated that the user characteristics are extracted in units of user accounts, and the material characteristics are extracted in units of materials (e.g., one material ID corresponds to one material).
The data fusion module can also fuse the user vectors and the material vectors of the plurality of preset platforms by adopting a self-attention/attention mechanism to obtain fused user vectors and fused material vectors. For details, refer to the foregoing description of the self-attention module and the attention module, which is not repeated here.
The training optimization module can perform data cross calculation on the fused user vector and the fused material vector to obtain preference probability of the user on the materials. See in particular the description of the preference probability calculation module above, which is not repeated here.
The training optimization module can also perform loss optimization based on the calculated preference probability and the preference label, so that the fused user vector and the fused material vector are updated. The fused user vector and the fused material vector are updated by updating the parameters of the self-attention/attention mechanism in the data fusion module, that is, by the process of learning the first weight, the second weight, the third weight and the fourth weight described in the foregoing.
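The relation between preference probability, preference label and loss can be sketched as follows. This is only an illustrative sketch: the application does not specify the cross calculation or the loss function in this section, so the sigmoid of an inner product and the binary cross-entropy loss used here are assumptions, as are the function names and sample vectors.

```python
import math

def preference_probability(user_vec, material_vec):
    # Assumed cross calculation: inner product squashed to (0, 1) by a sigmoid.
    score = sum(u * m for u, m in zip(user_vec, material_vec))
    return 1.0 / (1.0 + math.exp(-score))

def binary_cross_entropy(prob, label, eps=1e-12):
    # Assumed loss: penalizes a probability that disagrees with the 0/1 preference label.
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))

# A zero user vector gives score 0, hence probability 0.5.
p_neutral = preference_probability([0.0, 0.0], [1.0, 2.0])
loss_neutral = binary_cross_entropy(p_neutral, 1)

# A user vector aligned with the material gives a higher probability,
# so the loss for a positive label is lower.
p_aligned = preference_probability([1.0, 1.0], [1.0, 2.0])
loss_aligned = binary_cross_entropy(p_aligned, 1)
```

During training, the gradient of such a loss with respect to the first to fourth weights would be used to update the self-attention/attention parameters; the sketch only evaluates the loss and does not perform the update.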
The result storage module is used for storing the data fusion result, such as the fused user vectors and/or the fused material vectors, so that the result can be queried when personalized materials are pushed later. The embodiment of the application can provide two pushing modes: one is to push personalized materials in real time, and the other is to push personalized materials offline; for the specific description, see the foregoing. Corresponding to the two pushing modes, the result storage module can use two database storage modes. Corresponding to the mode of pushing personalized materials in real time, the fused material vectors may be stored in an online recall database (e.g., the OBS described previously) shown in fig. 15. Corresponding to the mode of pushing personalized materials offline, the similarity between the fused user vectors and the fused material vectors can be calculated, and for each fused user vector, the plurality of fused material vectors with the highest similarity are selected, associated with that fused user vector, and stored in an offline recall database (such as Redis) shown in fig. 15.
After receiving the push request from the user, the handset 110 may send the push request to the server 120.
In some embodiments, if personalized materials are pushed in real time, the service pushing module in the server 120 responds to the push request by using the self-attention/attention mechanism in the data fusion module (such as the fusion model 1 obtained by training in the foregoing) to calculate, in real time, the fused user vector of the user account that initiated the push request. The service pushing module then calculates the similarity between this fused user vector and the fused material vectors in the online recall database, determines the materials pointed to by the plurality of fused material vectors with the highest similarity as personalized materials, and feeds them back to the mobile phone 110.
In other embodiments, if personalized materials are pushed offline, the service pushing module in the server 120 responds to the push request by directly obtaining, from the offline recall database, the fused material vectors stored in association with the fused user vector of the user account that initiated the push request, determines the materials pointed to by the obtained fused material vectors as personalized materials, and feeds them back to the mobile phone 110.
Embodiments of the present application also provide a chip system, as shown in fig. 16, the chip system 1600 includes at least one processor 1601 and at least one interface circuit 1602. The processor 1601 and the interface circuit 1602 may be interconnected by wires. For example, interface circuitry 1602 may be used to receive signals from other devices (e.g., a memory of an electronic apparatus). For another example, interface circuit 1602 may be used to send signals to other devices (e.g., processor 1601). For example, the interface circuit 1602 may read instructions stored in a memory and send the instructions to the processor 1601. The instructions, when executed by the processor 1601, may cause the first device to perform the steps performed by the handset 110 in the above-described embodiments, and cause the second device to perform the steps performed by the server 120 in the above-described embodiments. Of course, the system-on-chip may also include other discrete devices, which are not particularly limited in accordance with embodiments of the present application.
The embodiment also provides a computer readable storage medium, in which first computer instructions are stored, when the computer instructions run on the first device, the first device is caused to execute the steps executed by the mobile phone 110 in the above method, so as to implement information push.
The present embodiment further provides a computer readable storage medium, where a second computer instruction is stored, where the computer instruction when executed on a second device causes the second device to execute the step executed by the server 120 in the above method, so as to implement information push.
The present embodiment also provides a computer program product which, when run on a computer, causes the computer to perform the above-mentioned related steps to achieve the above-mentioned information push.
In addition, embodiments of the present application also provide an apparatus, which may be embodied as a chip, component or module, which may include a processor and a memory coupled to each other; the memory is configured to store computer-executable instructions, and when the device is operated, the processor may execute the computer-executable instructions stored in the memory, so that the chip performs the methods in the above method embodiments.
The communication system, the computer readable storage medium, the computer program product or the chip provided in this embodiment are used to execute the corresponding method provided above, so that the beneficial effects thereof can be referred to the beneficial effects in the corresponding method provided above, and will not be described herein.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated unit may be stored in a readable storage medium if implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the embodiments of the present application may be essentially or a part contributing to the prior art or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (may be a single-chip microcomputer, a chip or the like) or a processor (processor) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present application without departing from the spirit and scope of the technical solution of the present application.

Claims (11)

1. An information pushing method is characterized by comprising the following steps:
After receiving a push request initiated by a first account number in a first platform, acquiring multiple groups of user characteristics of the first account number in a plurality of preset platforms, wherein the preset platforms comprise the first platform;
Generating a plurality of initialization user vectors corresponding to the plurality of groups of user features;
Fusing the plurality of initialization user vectors to obtain a fused user vector of the first account;
According to the fusion user vector, personalized materials are determined, and the personalized materials are pushed to the first account;
The fusing the plurality of initialization user vectors to obtain the fused user vector of the first account includes:
Inputting the plurality of initialization user vectors into a corresponding plurality of first self-attention sub-modules, running the plurality of first self-attention sub-modules and a first attention sub-module, and outputting the fused user vector of the first account;
The first self-attention sub-modules are in one-to-one correspondence with the preset platforms, each first self-attention sub-module is used for correcting the value of each user characteristic item in the corresponding initialized user vector according to the first weight corresponding to each user characteristic item, and the first attention sub-module is used for fusing a plurality of vectors obtained by correction by the plurality of first self-attention sub-modules according to the second weight corresponding to each preset platform;
wherein the determining personalized materials according to the fused user vector comprises:
Calculating the similarity of the fusion user vector and a plurality of fusion material vectors corresponding to the plurality of candidate materials, and determining the preset number of candidate materials pointed by the fusion material vectors with the highest similarity as the personalized materials; the fusion material vector is obtained by fusing material characteristics of candidate materials in a plurality of preset platforms.
2. The method of claim 1, wherein generating a plurality of initialization user vectors for the plurality of sets of user features comprises:
If the number of the user features of the first account in the second platform is smaller than a first threshold, determining that an initialization user vector corresponding to the user features in the second platform is a random vector; or alternatively
If the number of the user features of the first account in the second platform is smaller than a first threshold, determining an initialization user vector corresponding to the user features in the second platform as follows: a mean vector of a plurality of second initialization user vectors corresponding to a plurality of second accounts in the second platform;
the second platform is any one of the preset platforms, and the plurality of second accounts are accounts with the number of user features in the second platform being greater than a second threshold.
3. The method of claim 1, wherein the user characteristic items include gender, age, membership grade, and/or click sequence.
4. The method of claim 1, wherein prior to said calculating the similarity of the fused user vector to a plurality of fused material vectors corresponding to a plurality of candidate materials, the method further comprises:
acquiring material characteristics of the plurality of candidate materials in the plurality of preset platforms;
For a first candidate material, generating a plurality of initialized material vectors corresponding to the first candidate material according to a plurality of groups of material characteristics of the first candidate material in the plurality of preset platforms, and fusing the plurality of initialized material vectors to obtain a fused material vector of the first candidate material, wherein the first candidate material is any one of the plurality of candidate materials;
And storing a plurality of fusion material vectors corresponding to the plurality of candidate materials.
5. The method of claim 4, wherein the fusing the plurality of initialized material vectors to obtain a fused material vector for the first candidate material comprises:
inputting the plurality of initialized material vectors into a second fusion model, operating the second fusion model, and outputting the fusion material vectors of the first candidate materials;
the second fusion model has a function of fusing a plurality of material vectors into one material vector.
6. The method of claim 5, wherein the second fusion model comprises a plurality of second self-attention sub-modules and a second attention sub-module, the plurality of second self-attention sub-modules being in one-to-one correspondence with the plurality of preset platforms;
the step of inputting the plurality of initialized material vectors into a second fusion model, operating the second fusion model, and outputting the fused material vectors of the first candidate materials comprises the following steps:
Inputting the plurality of initialized material vectors into a corresponding plurality of second self-attention sub-modules, operating the plurality of second self-attention sub-modules and the second attention sub-modules, and outputting the fused material vector of the first candidate material;
Each second self-attention sub-module is used for correcting the value of each material characteristic item in the corresponding initialized material vector according to the third weight corresponding to each material characteristic item, and the second attention sub-modules are used for fusing a plurality of vectors obtained by correcting the plurality of second self-attention sub-modules according to the fourth weight corresponding to each preset platform.
7. The method of claim 6, wherein the material characteristic items include category, price, number of views, number of shares, and/or number of clicks.
8. The method of claim 1, wherein the personalized material comprises audio, video, a picture, an article, or a post.
9. A communication system, characterized in that the system comprises a first device configured to perform the steps of receiving a push request and pushing personalized material according to the preceding claims 1-8 and a second device configured to perform the steps of determining personalized material according to the preceding claims 1-8.
10. A computer readable storage medium comprising computer instructions which, when run on a communication system, cause the communication system to perform the method of any of claims 1-8.
11. A chip system, wherein the chip system comprises an interface circuit and a processor; the interface circuit and the processor are interconnected through a circuit; the interface circuit is configured to receive a signal from a memory and to send a signal to the processor, the signal comprising computer instructions stored in the memory; the system-on-chip performs the method of any of claims 1-8 when the processor executes the computer instructions.
CN202211236646.0A 2022-10-10 Information pushing method and system Active CN116738034B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211236646.0A CN116738034B (en) 2022-10-10 Information pushing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211236646.0A CN116738034B (en) 2022-10-10 Information pushing method and system

Publications (2)

Publication Number Publication Date
CN116738034A CN116738034A (en) 2023-09-12
CN116738034B true CN116738034B (en) 2024-06-28

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108711075A (en) * 2018-05-22 2018-10-26 阿里巴巴集团控股有限公司 A kind of Products Show method and apparatus
CN113868542A (en) * 2021-11-25 2021-12-31 平安科技(深圳)有限公司 Attention model-based push data acquisition method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant