CN110275881B - Method and device for pushing object to user based on Hash embedded vector - Google Patents


Info

Publication number
CN110275881B
Authority
CN
China
Prior art keywords: user, vector, hash, predetermined, values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910300786.1A
Other languages
Chinese (zh)
Other versions
CN110275881A (en)
Inventor
陈超超
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201910300786.1A
Publication of CN110275881A
Application granted
Publication of CN110275881B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G06F16/2255 Hash tables
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G06F16/24 Querying

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the specification provide a method and a device for acquiring a user hash embedded vector in a platform, and a method and a device for pushing an object based on hash embedded vectors. The method for acquiring the user hash embedded vector comprises: acquiring the actual scores given by a first user in the platform to a plurality of objects in the platform, and the current object embedding vectors of those objects; acquiring the relationship strengths between the first user and each of a plurality of users in the platform, and the current user embedded vectors of the first user and of the plurality of users; updating the user embedded vector of the first user by a predetermined optimization algorithm based on a predetermined objective function such that the value of the objective function decreases; and, when the above steps do not need to be executed again, mapping each element of the updated user embedded vector of the first user to one of a predetermined number of predetermined values through a predetermined hash algorithm, to obtain the user hash embedded vector of the first user.

Description

Method and device for pushing object to user based on Hash embedded vector
Technical Field
The embodiment of the specification relates to the field of data processing, and more particularly, to a method and an apparatus for acquiring a user hash embedded vector in a platform, and a method and an apparatus for pushing an object to a user in the platform.
Background
In a recommendation system, user-object rating information and social information are two very important resources, and social matrix factorization is an efficient method of processing both kinds of information. Recommendation using social matrix factorization is mainly divided into two steps: an offline model training step, in which an embedded vector in a predetermined space is obtained for each user and each object by a social matrix factorization method from the existing user-object scores and user-user social relationships; and an online prediction step, in which, for each user, the inner product of the user's embedded vector and the embedded vector of each object is calculated as the user's predicted score for that object, the objects are sorted by predicted score, and the top-ranked objects are recommended to the user. However, in practical application scenarios, the online prediction step is very time consuming because of the huge amount of object data.
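The online prediction step of the prior-art approach described above can be sketched as follows. This is a minimal illustration, not code from the patent: the function names, the object identifiers, and the example vectors are all hypothetical. It shows why the cost grows with the number of objects, since one inner product is computed per object before sorting.

```python
from typing import Dict, List, Sequence


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two real-valued embedded vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_objects(user_vec: Sequence[float],
                 object_vecs: Dict[str, Sequence[float]],
                 top_n: int) -> List[str]:
    """Score every object by inner product and return the top_n object ids."""
    scores = {obj: inner_product(user_vec, vec) for obj, vec in object_vecs.items()}
    return sorted(scores, key=lambda obj: scores[obj], reverse=True)[:top_n]


# Hypothetical embedded vectors in a 3-dimensional space.
user = [0.9, -0.2, 0.4]
objects = {"a": [1.0, 0.0, 0.5], "b": [-0.5, 1.0, 0.0], "c": [0.2, 0.1, 0.9]}
top = rank_objects(user, objects, top_n=2)
print(top)
```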
Therefore, a more efficient object pushing method is needed.
Disclosure of Invention
The embodiments of the present specification aim to provide a scheme for more effectively acquiring a user hash embedded vector in a platform and a scheme for pushing an object to a user in the platform, so as to solve the deficiencies in the prior art.
To achieve the above object, one aspect of the present specification provides a method for obtaining a user hash embedded vector in a platform, the user hash embedded vector including elements of a predetermined dimension, and a value of each element being one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the method including:
acquiring the actual scores given by a first user in the platform to a plurality of objects in the platform, and the current object embedding vectors of the respective objects, wherein the current object embedding vectors are vectors in a vector space of the predetermined dimension;
acquiring the relationship strengths between the first user and each of a plurality of users in the platform, and the current user embedded vectors of the first user and of the plurality of users, wherein the current user embedded vectors are vectors in the vector space;
updating, by a predetermined optimization algorithm, the user embedded vector of the first user based on a predetermined objective function such that the value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term; in the prediction error term, the similarity between the user embedded vector of the first user and the object embedding vector of the corresponding object is constrained by the magnitude of the actual score, and the actual score is converted to a value within a second range of values so as to limit the range of the inner product of the user embedded vector of the first user and the corresponding object embedding vector, the second range of values being determined based on the first range of values; and the social constraint term includes each of the relationship strengths and the user embedded vectors of the first user and of the plurality of users;
judging whether the above steps need to be executed again; and
in a case where re-execution is not required, mapping each element of the updated user embedded vector of the first user to one of the predetermined number of predetermined values by a predetermined hash algorithm, to obtain the user hash embedded vector of the first user.
In one embodiment, the objective function further includes a regularization term, and the regularization term includes the user embedded vectors of the first user and of the plurality of users.
In one embodiment, the regularization term includes the second-order norm of the sum of the user embedded vectors of the first user and of the plurality of users.
Another aspect of the present specification provides a method of obtaining an object hash embedding vector in a platform, the object hash embedding vector including elements of a predetermined dimension, and a value of each element being one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first numerical range, the method including:
acquiring a current object embedding vector of a first object in the platform, wherein the current object embedding vector is a vector in the vector space with the preset dimensionality;
acquiring actual scores of a plurality of users in a platform on the first object respectively and current user embedded vectors of the users respectively, wherein the current user embedded vectors are vectors in the vector space;
updating, by a predetermined optimization algorithm, the object embedding vector of the first object based on a predetermined objective function such that the value of the objective function is reduced, wherein the objective function includes a prediction error term in which the similarity between the user embedded vector of a respective user and the object embedding vector of the first object is constrained by the magnitude of the actual score, and in which the actual score is converted to a value within a second range of values so as to limit the range of the inner product of the respective user embedded vector and the object embedding vector of the first object, the second range of values being determined based on the first range of values;
judging whether the above steps need to be executed again; and
in a case where re-execution is not required, mapping each element of the updated object embedding vector of the first object to one of the predetermined number of predetermined values by a predetermined hash algorithm, to obtain the object hash embedding vector of the first object.
In one embodiment, the objective function further includes a regularization term including the object embedding vector of each of a plurality of objects in the platform, wherein the first object is included in the plurality of objects.
In one embodiment, the predetermined number of predetermined values equally divides the first range of values.
In one embodiment, the predetermined number of predetermined values are -a and a, and the first range of values is [-a, a], where a is any positive integer.
In one embodiment, the predetermined hash algorithm maps a value less than or equal to 0 to -a, and maps a value greater than 0 to a.
In one embodiment, the vector space is a K-dimensional space, the inner product and [formula image BDA0002028159830000031] are included in the prediction error term, the second range of values is [formula image BDA0002028159830000032], and the range of values of the inner product is limited to [-Ka^2, Ka^2].
In one embodiment, the initial values of the elements of the user embedded vector and the object embedded vector are limited to the first range of values.
Another aspect of the present specification provides a method for pushing an object to a user in a platform, where a first hash table and a second hash table are preset in the platform, where the first hash table includes user hash embedded vectors of multiple users obtained by the foregoing method, and the second hash table includes object hash embedded vectors of multiple objects obtained by the foregoing method, and the method includes:
looking up a first user hash embedding vector for a first user of the plurality of users from a first hash table;
searching a first object hash embedding vector of each object in a preset object set from a second hash table, wherein the preset object set is composed of objects in the plurality of objects;
calculating a similarity between the first user hash embedded vector and each of the first object hash embedded vectors; and
determining, based on the similarity, an object in the preset object set to be pushed to the first user.
In one embodiment, the similarity is a Hamming distance.
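The lookup-and-compare flow of this push method can be sketched as follows. This is a minimal sketch under stated assumptions: the two hash tables are modelled as plain Python dictionaries, and the identifiers, the example codes, and the `top_n` parameter are hypothetical. A smaller Hamming distance is treated as a higher similarity. For codes in {-1, 1}^K this ordering matches the inner-product ordering, because the inner product equals K minus twice the Hamming distance.

```python
from typing import Dict, List, Sequence


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions at which two equal-length hash codes differ."""
    if len(x) != len(y):
        raise ValueError("hash codes must have the same dimension")
    return sum(1 for a, b in zip(x, y) if a != b)


def objects_to_push(user_id: str,
                    candidate_ids: List[str],
                    user_table: Dict[str, Sequence[int]],
                    object_table: Dict[str, Sequence[int]],
                    top_n: int) -> List[str]:
    """Look up the hash codes and return the top_n closest candidate objects."""
    user_code = user_table[user_id]                       # first hash table
    distances = {obj: hamming_distance(user_code, object_table[obj])
                 for obj in candidate_ids}                # second hash table
    return sorted(candidate_ids, key=lambda obj: distances[obj])[:top_n]


# Hypothetical hash tables with 4-dimensional codes in {-1, 1}.
user_table = {"user_1": [1, -1, 1, 1]}
object_table = {
    "obj_a": [1, -1, 1, -1],
    "obj_b": [-1, 1, -1, -1],
    "obj_c": [1, -1, 1, 1],
}
pushed = objects_to_push("user_1", ["obj_a", "obj_b", "obj_c"],
                         user_table, object_table, top_n=2)
print(pushed)
```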
Another aspect of the present specification provides an apparatus for obtaining a user hash embedded vector in a platform, the user hash embedded vector including elements of a predetermined dimension, and a value of each element being one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the apparatus comprising:
a first obtaining unit, configured to obtain actual scores of a plurality of objects in a platform by a first user in the platform and current object embedding vectors of the respective objects, where the current object embedding vectors are vectors in the predetermined-dimension vector space;
a second obtaining unit, configured to obtain relationship strengths of the first user and a plurality of users in the platform, respectively, and current user embedded vectors of the first user and the plurality of users, respectively, where the current user embedded vector is a vector in the vector space;
an updating unit configured to update the user embedded vector of the first user through a predetermined optimization algorithm based on a predetermined objective function so that the value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term; in the prediction error term, the similarity between the user embedded vector of the first user and the object embedding vector of the corresponding object is constrained by the magnitude of the actual score, and the actual score is converted to a value within a second range of values so as to limit the range of the inner product of the user embedded vector of the first user and the corresponding object embedding vector, the second range of values being determined based on the first range of values; and the social constraint term includes the relationship strengths and the user embedded vectors of the first user and of the plurality of users;
a judging unit configured to judge whether the above steps need to be executed again; and
a mapping unit configured to, in a case where re-execution is not required, map each element of the updated user embedded vector of the first user to one of the predetermined number of predetermined values by a predetermined hash algorithm, to obtain the user hash embedded vector of the first user.
Another aspect of the present specification provides an apparatus for obtaining an object hash embedding vector in a platform, where the object hash embedding vector includes elements of a predetermined dimension, and a value of each element is one of a predetermined number of predetermined values, and the predetermined number of predetermined values defines a first numerical value range, the apparatus including:
a first obtaining unit, configured to obtain a current object embedding vector of a first object in a platform, where the current object embedding vector is a vector in the vector space of the predetermined dimension;
a second obtaining unit, configured to obtain actual scores of the first object by a plurality of users in a platform respectively, and current user embedded vectors of the users, where the current user embedded vectors are vectors in the vector space;
an updating unit configured to update the object embedding vector of the first object through a predetermined optimization algorithm based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function includes a prediction error term in which similarity of a user embedding vector of a corresponding user and the object embedding vector of the first object is defined in a size of the actual score, and the actual score is converted to a value within a second numerical range to define a range of values of an inner product of the corresponding user embedding vector and the object embedding vector of the first object, wherein the second numerical range is determined based on the first numerical range;
a judging unit configured to judge whether the above steps need to be executed again; and
a mapping unit configured to, in a case where re-execution is not required, map each element of the updated object embedding vector of the first object to one of the predetermined number of predetermined values by a predetermined hash algorithm, to obtain the object hash embedding vector of the first object.
Another aspect of the present specification provides an apparatus for pushing an object to a user in a platform, where a first hash table and a second hash table are preset in the platform, where the first hash table includes user hash embedded vectors of multiple users obtained by the apparatus, and the second hash table includes object hash embedded vectors of multiple objects obtained by the apparatus, and the apparatus includes:
a first lookup unit configured to lookup a first user hash embedding vector of a first user of the plurality of users from a first hash table;
a second lookup unit configured to lookup a respective first object hash embedding vector of each object in a preset object set from a second hash table, wherein the preset object set is composed of objects in the plurality of objects;
a calculation unit configured to calculate a similarity between the first user hash embedded vector and each of the first object hash embedded vectors; and
a determining unit configured to determine, based on the similarity, an object pushed to the first user in the preset object set.
Another aspect of the present specification provides a computer readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform any one of the above methods.
Another aspect of the present specification provides a computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor implements any one of the above methods when executing the executable code.
In the push scheme based on hash mapping according to the embodiments of the specification, hash codes of users and objects are learned by a hash-based social matrix factorization method, and during online prediction the recommendation result is obtained by hash table lookup, so that the result is obtained faster than in the prior art and recommendations are timely.
Drawings
The embodiments of the present specification will become clearer from the following description with reference to the attached drawings:
FIG. 1 illustrates a hash-embedded-vector-based object push system 100 according to an embodiment of the present description;
FIG. 2 illustrates a method of obtaining a user hash embedded vector in a platform according to an embodiment of the present description;
FIG. 3 illustrates a method of obtaining an object hash embedding vector in a platform according to an embodiment of the present description;
FIG. 4 illustrates a method for pushing an object to a user in a platform according to an embodiment of the present description;
FIG. 5 illustrates an apparatus 500 for obtaining a user hash embedded vector in a platform according to an embodiment of the present specification;
FIG. 6 illustrates an apparatus 600 for obtaining an object hash embedding vector in a platform according to an embodiment of the present specification;
FIG. 7 illustrates an apparatus 700 for pushing an object to a user in a platform according to an embodiment of the present description.
Detailed Description
The embodiments of the present specification will be described below with reference to the accompanying drawings.
FIG. 1 illustrates an object push system 100 based on hash embedded vectors according to an embodiment of the present description. As shown in FIG. 1, the system 100 includes a platform server 11, a client 12, and a client 13. The server 11 includes a data acquisition unit 111, a calculation unit 112, and a push unit 113; the client 12 is, for example, the client of user 1 in the platform, and the client 13 is, for example, the client of user 2 in the platform. Although only two clients are shown, in practice the system may include the clients of many users in the platform. The data acquisition unit 111 acquires the scoring matrix R of the platform and the social matrix S between users based on the users' operations on their clients. Based on the scoring matrix R and the social matrix S, the calculation unit 112 obtains the embedding vectors of the users and the objects, for example by gradient descent on an objective function designed for learning hash embedded vectors, and then maps each embedding vector to a hash embedded vector through a hash algorithm, thereby obtaining a user hash embedded vector table and an object hash embedded vector table. The push unit 113 looks up, for example, the user hash embedded vector of user 1 and the object hash embedded vectors of the objects in a preset object set, determines the objects to push to user 1 based on, for example, the Hamming distances between the user hash embedded vector and the object hash embedded vectors, and pushes them in the display page of the client 12 of user 1.
The above-described process of obtaining the hash embedded vector and the process of pushing based on the hash embedded vector are described in detail below.
FIG. 2 illustrates a method of obtaining a user hash embedded vector in a platform according to an embodiment of the present specification, the user hash embedded vector including elements of a predetermined dimension, the value of each element being one of a predetermined number of predetermined values, and the predetermined number of predetermined values defining a first range of values, the method including:
in step S202, actual scores of a plurality of objects in a platform by a first user in the platform and current object embedding vectors of the plurality of objects are obtained, where the current object embedding vectors are vectors in the predetermined-dimension vector space;
in step S204, obtaining the relationship strength between the first user and the plurality of users in the platform, and the current user embedded vectors of the first user and the plurality of users, respectively, where the current user embedded vector is a vector in the vector space;
in step S206, updating the user-embedded vector of the first user through a predetermined optimization algorithm based on a predetermined objective function so that the value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term, wherein in the prediction error term, the similarity of the user-embedded vector of the first user and the object-embedded vector of the corresponding object is defined by the size of the actual score, and the actual score is converted to a value within a second value range to define a value range of an inner product of the user-embedded vector of the first user and the corresponding object-embedded vector, wherein the second value range is determined based on the first value range, and the social constraint term includes the relationship strength and the user-embedded vectors of the first user and the plurality of users;
in step S208, it is determined whether the above steps need to be performed again; and
in step S210, in a case where re-execution is not required, each element of the updated user embedded vector of the first user is mapped, by a predetermined hash algorithm, to one of the predetermined number of predetermined values to obtain the user hash embedded vector of the first user.
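The loop formed by steps S202 to S210 can be sketched for a single user as follows. This is only a sketch under explicit assumptions, not the patent's formula: the exact objective is given in the source as an image that is not available here, so the code assumes a prediction error of the form (R' - 1/2 - U·V/(2K))^2 and a social constraint of the form λ·S·||U_i - U_f||^2, uses plain gradient descent as the optimization algorithm, and uses "loss improvement below a tolerance" as the re-execution test. All names, weights, and example values are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    return sum(a * b for a, b in zip(x, y))


def user_loss(u: Vector,
              rated: Sequence[Tuple[float, Vector]],
              friends: Sequence[Tuple[float, Vector]],
              lam: float) -> float:
    """ASSUMED objective for one user: prediction error plus social constraint."""
    k = len(u)
    pred = sum((r - 0.5 - dot(u, v) / (2 * k)) ** 2 for r, v in rated)
    social = sum(s * sum((a - b) ** 2 for a, b in zip(u, uf)) for s, uf in friends)
    return pred + lam * social


def learn_user_hash(u0: Vector,
                    rated: Sequence[Tuple[float, Vector]],
                    friends: Sequence[Tuple[float, Vector]],
                    lam: float = 0.1, lr: float = 0.05,
                    max_iters: int = 500, tol: float = 1e-9
                    ) -> Tuple[List[float], List[int]]:
    """Update the embedded vector until no further pass is needed, then hash it."""
    k = len(u0)
    u = list(u0)
    prev = user_loss(u, rated, friends, lam)
    for _ in range(max_iters):
        grad = [0.0] * k
        for r, v in rated:                      # scores and object vectors (S202)
            err = r - 0.5 - dot(u, v) / (2 * k)
            for d in range(k):
                grad[d] -= err * v[d] / k       # derivative of err**2
        for s, uf in friends:                   # strengths and user vectors (S204)
            for d in range(k):
                grad[d] += 2 * lam * s * (u[d] - uf[d])
        u = [ud - lr * g for ud, g in zip(u, grad)]   # update (S206)
        cur = user_loss(u, rated, friends, lam)
        if prev - cur < tol:                    # assumed stopping test (S208)
            break
        prev = cur
    code = [1 if x > 0 else -1 for x in u]      # hash mapping (S210)
    return u, code


# Hypothetical data: scaled scores in [0, 1] paired with object vectors,
# and (relationship strength, user vector) pairs for related users.
rated = [(1.0, [1.0, -1.0, 1.0, 1.0]), (0.2, [-1.0, 1.0, 1.0, -1.0])]
friends = [(1.0, [0.5, -0.5, 0.2, 0.4])]
u0 = [0.0, 0.0, 0.0, 0.0]
u, code = learn_user_hash(u0, rated, friends)
print(code)
```

With a small step size the assumed objective is a convex quadratic in the user vector, so each pass lowers its value, which mirrors the requirement that the update reduce the objective function.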
The platform is, for example, a shopping platform or a movie platform. A scoring matrix R of the user set u for the object set v in the platform and a social matrix S among the users in the user set u are stored in a server of the platform. Each element R_ij of the matrix R represents the score given by user i to object j. The score may be a direct score given by the user to the object, or a score calculated from the user's operations on the object (e.g., number of clicks, number of purchases). Each element S_if of the matrix S indicates the strength of the relationship between user i and user f, for example the friend relationship between the users; the element S_if may therefore take the value 0 or 1, where 0 indicates that user i and user f are not friends and 1 indicates that they are friends. In one embodiment, the relationship strength is, for example, a Pearson correlation coefficient or a cosine similarity calculated from the scores given to objects by user i and user f respectively.
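As an illustration of the cosine-similarity option for the relationship strength, the following minimal sketch computes S_if from two users' score rows. It assumes, for simplicity, that the similarity is taken over the full score rows with unrated objects stored as 0; the patent does not specify how unrated objects are handled, and the function name and example rows are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine similarity of two score rows; 0.0 if either row is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical score rows of user i and user f over the same three objects.
scores_i = [3, 0, 4]
scores_f = [4, 0, 3]
strength = cosine_similarity(scores_i, scores_f)
print(round(strength, 2))
```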
Unlike the prior art, in which the embedded vectors of users and objects are learned by a social matrix factorization method, the method according to the present embodiment learns the hash embedded vector U_i of a user and the hash embedding vector V_j of an object. A hash embedded vector is a vector of a predetermined dimension (e.g., K dimensions), and the value of each of its elements is one of a predetermined number of predetermined values. For example, when the predetermined values are -1 and 1, the hash embedded vectors U_i and V_j can be expressed as U_i, V_j ∈ {-1, 1}^K. In that case, by suitably setting the objective function and the initial values of the embedded vectors, the element values of the computed embedded vectors can be made to approach a predetermined range (e.g., [-1, 1]), so that, after the embedded vectors are computed by an optimization algorithm such as gradient descent based on the objective function, each element value is mapped to -1 or 1 by a predetermined hash algorithm to obtain the final hash embedded vector. The predetermined values are not limited to two values or to -1 and 1; for example, they may be -2, 0, and 2, in which case the objective function, the initial values of the embedded vectors, and the predetermined hash algorithm are determined accordingly. The steps of the embodiment are described below taking the predetermined values -1 and 1 as an example.
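The final mapping from a real-valued embedded vector to a hash embedded vector can be sketched as a sign-style function. The sketch assumes the rule from the embodiment in which values less than or equal to 0 go to -a and values greater than 0 go to a; the function name and the example vector are hypothetical.

```python
from typing import List, Sequence


def sign_hash(vec: Sequence[float], a: int = 1) -> List[int]:
    """Map each element to -a (if <= 0) or a (if > 0)."""
    return [a if x > 0 else -a for x in vec]


hashed = sign_hash([0.8, -0.3, 0.0, 0.1])
print(hashed)
```

With a = 1 the result lies in {-1, 1}^K; passing a larger positive integer for `a` gives codes in {-a, a}^K.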
In the case where U_i, V_j ∈ {-1, 1}^K, a total objective function (or loss function) over all users u and all objects v in the scoring matrix, as shown in formula (1), may be used:
[Formula (1): formula image BDA0002028159830000101]
In the objective function, the first term is an error term between the actual score given by each user to each object and the prediction score based on the embedded vectors. Because the finally obtained vectors are required to satisfy U_i, V_j ∈ {-1, 1}^K, R'_ij is obtained by dividing R_ij in the scoring matrix by the highest score value, so that R'_ij lies in the range 0 to 1. Further, by including [formula image BDA0002028159830000102] in the first term, [formula image BDA0002028159830000103] is made to lie within the range [formula image BDA0002028159830000104], and [formula image BDA0002028159830000105] can be regarded as the actual score. In the first term, [formula image BDA0002028159830000106] can be regarded as the prediction score based on the embedded vectors. During optimization based on the objective function, [formula image BDA0002028159830000107] is therefore driven toward [formula image BDA0002028159830000109], which lies within [formula image BDA0002028159830000108]; that is, the range of [formula image BDA00020281598300001010] approaches [formula image BDA00020281598300001011], which means that [formula image BDA00020281598300001012] approaches [-K, K]. With this arrangement, over successive iterations the element values of U_i and V_j gradually approach -1 and 1, and each update of U_i or V_j makes the prediction error smaller. The element values of the initial embedding vectors of U_i and V_j can be taken from [-1, 1]; for example, every element can be set to 0, or the value of each element can be drawn at random from the range [-1, 1].
The second term of formula (1) is a social constraint term based on the similarity of the embedded vectors between users, where λ is a predetermined weight. The constraint term is substantially the same as in the prior art, except that here the embedded vectors are calculated according to formula (1), i.e., each element of an embedded vector approaches a value within the range [-1, 1] rather than a value in the full real number range. S_if is an element of the social matrix S and represents the strength of the relationship between user i and user f.
The third term of formula (1) is a regularization term for preventing overfitting, where σ is a predetermined weight. In formula (1), the regularization term includes the second-order norm of the sum of the user embedded vectors of all users (i.e., ||Σ_{i∈u} U_i||_2) and the second-order norm of the sum of the object embedding vectors of all objects (i.e., ||Σ_{j∈v} V_j||_2). This form controls how evenly the values of the finally obtained user or object hash embedded vectors are distributed, that is, it makes the proportions of -1 and 1 in the user hash embedded vectors as equal as possible, and likewise for the object hash embedding vectors. Although the regularization term of formula (1) is shown here, the embodiments of the present specification are not limited to it, and other regularization terms may be used, such as the sum of squares of the second-order norms of the individual user embedded vectors.
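Formula (1) itself is available in the source only as an image. Purely as a hedged reconstruction from the three terms described above, one form consistent with the text is the following. The offset 1/2 and the scaling 1/(2K) are assumptions inferred from the stated ranges (R'_ij in [0, 1] and the inner product approaching [-K, K]); the squared-difference form of the social term and the unsquared norms in the regularization term are likewise assumptions, so this should not be read as the patent's exact formula.

```latex
\min_{U, V} \;
\sum_{(i,j):\, R_{ij} \neq 0}
  \Bigl( R'_{ij} - \tfrac{1}{2} - \tfrac{1}{2K}\, U_i^{\top} V_j \Bigr)^{2}
+ \lambda \sum_{i \in u} \sum_{f \in u} S_{if}\, \lVert U_i - U_f \rVert_2^{2}
+ \sigma \Bigl( \bigl\lVert \textstyle\sum_{i \in u} U_i \bigr\rVert_2
              + \bigl\lVert \textstyle\sum_{j \in v} V_j \bigr\rVert_2 \Bigr)
```

Under this assumed form, R'_ij - 1/2 lies in [-1/2, 1/2], so driving U_i^T V_j / (2K) toward it keeps the inner product within [-K, K], which is the behaviour the text describes.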
Although the total objective function above is given for the example U_i, V_j ∈ {-1, 1}^K, the predetermined number of predetermined values in the embodiments of the present specification is not limited thereto. For example, U_i, V_j ∈ {-a, a}^K may be preset, where a is any positive integer. In this case the objective function may be set similarly to the above example: the first term may be set to [formula image BDA0002028159830000111], with R'_ij similarly converted to the range [0, a^2], so that [formula image BDA0002028159830000112] approaches [-Ka^2, Ka^2] and the minimum and maximum element values of U_i and V_j approach -a and a respectively. U_i, V_j ∈ {-a, 0, a}^K may also be preset, and the objective function set similarly. Although the predetermined values here are taken from [-a, a], they may instead be set to, for example, 0, 2, 4, and so on according to the needs of the scenario, in which case the objective function is set accordingly so that the element values of the computed embedded vectors approach the predetermined values; this is not described in detail here.
After the total objective function is determined, the user embedded vectors of the respective users and the object embedded vectors of the respective objects may be calculated based on the objective function, and hash mapping may be performed on the calculated embedded vectors to obtain the corresponding hash embedded vectors. The specific steps shown in fig. 2 are described below, taking the acquisition of the user hash embedding vector of the first user as an example.
First, in step S202, the actual scores given by a first user in the platform to a plurality of objects in the platform, and the current object embedding vectors of the respective objects, are obtained, where the current object embedding vectors are vectors in the vector space of the predetermined dimension.
The first user is, for example, a user i among the multiple users u, and the existing actual scores of the user i on the multiple objects v may be obtained from a scoring matrix obtained in advance by the platform. The scoring matrix is, for example, as shown in table 1 below:
        v_1   v_2   v_3
u_1      0     1     5
u_2      2     2     0
u_i      3     0     4
TABLE 1
Here u denotes a user in the platform and v denotes an object in the platform. Only three users and three objects are shown as examples; in practice, a scoring matrix may be created for all or some of the users and all or some of the objects in the platform. The score is, for example, on a 5-point scale, where 0 indicates that no corresponding score exists; in this case, the existing actual scores of user i are the actual scores of user i for object 1 and object 3.
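The reading of Table 1 can be sketched as follows. The nested list mirrors the three rows of the table, and the helper name `observed_scores` is a hypothetical name chosen for illustration; a score of 0 is treated as "no score", as stated above.

```python
# Table 1 as a nested list: rows are users u_1, u_2, u_i; columns are objects v_1, v_2, v_3.
R = [
    [0, 1, 5],
    [2, 2, 0],
    [3, 0, 4],
]

def observed_scores(score_matrix, user_row):
    """Return {object_index: score} for the objects the user has actually scored (score != 0)."""
    return {j: s for j, s in enumerate(score_matrix[user_row]) if s != 0}

# User i is the third row; only object 1 and object 3 (indices 0 and 2) carry scores.
print(observed_scores(R, 2))  # {0: 3, 2: 4}
```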
As known to those skilled in the art, the embedded vector of a user or object is typically computed over multiple loops, as is the case for the method shown in fig. 2. When steps S202-S206 of the method shown in fig. 2 are performed in the first loop, the current object embedding vectors of the plurality of objects are preset initial object embedding vectors, which are vectors in a vector space of a predetermined dimension (e.g., K dimensions). The K dimensions of the vector space may each correspond to a specific feature of the user, such as a specific age group, a specific gender, or a specific preference. For an object embedding vector, each element value indicates the applicability, relevance, etc. of the object to the corresponding dimension. For example, if one of the dimensions corresponds to the feature of a large discount, an element value greater than 0 in that dimension may indicate that the object has this feature, and an element value less than 0 may indicate that it does not. Alternatively, the K dimensions of the vector space may each correspond to a hidden feature, which may have no specific meaning. In one embodiment, a numerical range [-1, 1] is determined based on the predetermined values, such as -1 and 1 above, and the value of each element of the initial object embedding vector is restricted to [-1, 1]. While the loop shown in fig. 2 is executed for the first time for the first user, the method is similarly executed for each of the other users to update the current user embedded vector of each user, and, for each object, the method for updating the object embedded vector described below is executed to update the current object embedded vector of each object.
After the first loop of steps S202-S206 described above, the current object embedding vector of each of the plurality of objects is the object embedding vector respectively obtained by the corresponding step of the previous loop, and as described above, the values of the respective elements in the object embedding vector gradually approximate the numerical range [ -1,1] in the continuous loop.
In step S204, the relationship strengths between the first user and the plurality of users in the platform, and the current user embedded vectors of the first user and the plurality of users are obtained, where the current user embedded vector is a vector in the vector space.
The relationship strength is the element S_if of the social matrix S of the users. The social matrix may be obtained as described above, and details are not repeated here. The current user embedding vectors of the first user and the plurality of users are obtained similarly to the current object embedding vectors described above: in the first loop of steps S202-S206, the value of each element of a current user embedding vector may be acquired randomly (or specified) within a range determined by the predetermined values (e.g., [-1, 1]); in subsequent loops, the current user embedding vector is the user embedding vector obtained in the previous execution of the loop steps in fig. 2.
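A minimal sketch of the random initialization for the first loop, assuming a uniform draw within [-1, 1] (the specification only says the values may be acquired randomly or specified within that range). The function name `init_embedding` and its parameters are hypothetical.

```python
import random

def init_embedding(k, low=-1.0, high=1.0, rng=None):
    """Return a K-dimensional initial embedding with each element drawn from [low, high].

    The uniform distribution is an assumption; any values within the range would
    satisfy the description.
    """
    rng = rng or random.Random()
    return [rng.uniform(low, high) for _ in range(k)]

# Example: an initial embedding for K = 8, seeded so the run is repeatable.
initial_vector = init_embedding(8, rng=random.Random(0))
print(len(initial_vector))  # 8
```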
In step S206, the user-embedded vector of the first user is updated through a predetermined optimization algorithm based on a predetermined objective function, so that the value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term, wherein in the prediction error term, the similarity of the user-embedded vector of the first user and the object-embedded vector of the corresponding object is defined by the size of the actual score, and the actual score is converted to a value within a second value range to define a value range of an inner product of the user-embedded vector of the first user and the corresponding object-embedded vector, wherein the second value range is determined based on the first value range, and the social constraint term includes the relationship strength and the user-embedded vectors of the first user and the plurality of users.
In the case of, e.g., U_i, V_j ∈ {-1, 1}^K, the predetermined objective function may be the objective function shown in formula (1) above. In addition, since the method is applied to only one user, e.g., user i, the predetermined objective function may also be an objective function L_i for user i only, as shown in the following formula (2):
Figure BDA0002028159830000141
That is, L_i includes all the terms of L in formula (1) that contain U_i. R'_ij is the same as R'_ij in formula (1). Similarly to formula (1), in formula (2) the first term is a prediction error term, the second term is a social constraint term, and the third term is a regular term; for the specific description, refer to the description of formula (1) above, which is not repeated here.
The predetermined optimization algorithm is, for example, stochastic gradient descent (SGD). It is to be understood that the predetermined optimization algorithm is not limited to SGD, but may be any optimization algorithm known to those skilled in the art, such as the various variants of SGD, e.g., momentum algorithms.
In the case of optimization by, for example, SGD, the user-embedded vector U_i of the first user can be updated by the following formula (3):
Figure BDA0002028159830000142
Where γ is the iteration step size. Based on formulas (2) and (3), U_i can be updated by substituting each R'_ij, U_i, each V_j, each S_if, and each U_f acquired in steps S202 and S204 into formula (3), where U_i and each U_f together form the U_k.
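Because formulas (2) and (3) are only available as images here, the following is a sketch under stated assumptions rather than the patented update. It assumes a per-user objective of the form Σ_j (R'_ij − U_i·V_j)^2 + λ Σ_f S_if ||U_i − U_f||^2 + σ ||Σ_k U_k||^2, with R'_ij already converted to the range of the inner product, and applies one plain gradient step U_i ← U_i − γ ∂L_i/∂U_i. All function names, default weights, and sample values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def objective_user(u_i, scored, friends, lam=0.1, sigma=0.01):
    """ASSUMED per-user objective (not the patent's formula (2), which is not reproduced here).

    scored:  list of (r_prime, v_j) pairs for the objects user i has scored
    friends: list of (s_if, u_f) pairs for the other users
    """
    pred = sum((r - dot(u_i, v)) ** 2 for r, v in scored)
    social = lam * sum(s * sum((a - b) ** 2 for a, b in zip(u_i, u_f)) for s, u_f in friends)
    totals = [u_i[d] + sum(u_f[d] for _, u_f in friends) for d in range(len(u_i))]
    reg = sigma * sum(t * t for t in totals)
    return pred + social + reg

def sgd_step_user(u_i, scored, friends, gamma=0.01, lam=0.1, sigma=0.01):
    """One gradient step on the assumed objective: U_i <- U_i - gamma * dL_i/dU_i."""
    k = len(u_i)
    grad = [0.0] * k
    for r, v in scored:                      # gradient of the prediction error term
        err = r - dot(u_i, v)
        for d in range(k):
            grad[d] += -2.0 * err * v[d]
    for s, u_f in friends:                   # gradient of the social constraint term
        for d in range(k):
            grad[d] += 2.0 * lam * s * (u_i[d] - u_f[d])
    for d in range(k):                       # gradient of the regular term
        total = u_i[d] + sum(u_f[d] for _, u_f in friends)
        grad[d] += 2.0 * sigma * total
    return [u_i[d] - gamma * grad[d] for d in range(k)]

# Hypothetical values for K = 2.
u_i = [0.5, -0.5]
scored = [(2.0, [1.0, 1.0])]
friends = [(1.0, [0.2, 0.1])]
u_new = sgd_step_user(u_i, scored, friends)
print(objective_user(u_new, scored, friends) < objective_user(u_i, scored, friends))  # True
```

With a sufficiently small step size γ, each such step lowers the value of the assumed objective, which is the property step S206 requires of the update.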
In step S208, it is determined whether the above-described steps need to be performed again.
That is, it is determined whether the loop over steps S202-S206 shown in fig. 2 should end. In one embodiment, a predetermined number of loops may be set, and whether to end the loop is determined based on whether the number of executions of steps S202-S206 has reached the predetermined number. In one embodiment, an objective function threshold may be set, such that the loop over steps S202-S206 ends when the value of the objective function reaches the threshold.
In step S210, in a case where it is not required to be performed again, each element of the updated user embedded vector of the first user is respectively mapped to one of the predetermined number of predetermined values by a predetermined hash algorithm to obtain a user hash embedded vector of the first user.
The predetermined hash algorithm may be determined based on the preset predetermined number of predetermined values. For example, in the case of U_i, V_j ∈ {-1, 1}^K, the hash algorithm may be set as follows: when the value of an element is greater than 0, the element is mapped to 1, and when the value of the element is less than or equal to 0, the element is mapped to -1. It is to be understood that the hash algorithm described here is merely exemplary and not limiting, and that a variety of hash algorithms can be employed, such as mapping the value of each element to -1 or 1 via a random probability function, which are not listed here one by one. In the case where the predetermined number of predetermined values take other values, e.g., -2, 0, 2, the hash function may be set accordingly to map each element value to the corresponding predetermined value; e.g., values less than or equal to -2 may be mapped to -2, values between -2 and 2 may be mapped to 0, and values greater than 2 may be mapped to 2.
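The two example mappings described above can be sketched as follows. The function names are hypothetical, and treating a value of exactly 2 as "between -2 and 2" is an assumption, since the text does not say which side that boundary falls on.

```python
def hash_sign(vec):
    """Map each element to 1 if it is greater than 0, otherwise to -1."""
    return [1 if x > 0 else -1 for x in vec]

def hash_three_level(vec, a=2):
    """Map each element to -a, 0 or a using the thresholds of the {-2, 0, 2} example.

    Values <= -a map to -a, values > a map to a, everything else maps to 0
    (the handling of exactly a is an assumption).
    """
    out = []
    for x in vec:
        if x <= -a:
            out.append(-a)
        elif x > a:
            out.append(a)
        else:
            out.append(0)
    return out

print(hash_sign([0.7, -0.2, 0.0, 0.05]))               # [1, -1, -1, 1]
print(hash_three_level([-2.5, -2.0, 0.3, 2.0, 2.1]))   # [-2, -2, 0, 0, 2]
```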
Thus, through the hash mapping, a user hash embedded vector of the first user can be obtained, and the user hash embedded vector is equivalent to the hash code of the first user. By the same method, hash codes of respective users can be acquired, and thus a hash table about the hash codes of the users can be made.
In one embodiment, in the event that it is determined in step S208 that it is necessary to re-loop steps S202-S206, the method returns to step S202 for re-looping until it proceeds to step S210 after it is determined in step S208 that re-looping is not necessary.
Fig. 3 illustrates a method of obtaining an object hash embedding vector in a platform, the object hash embedding vector including elements of a predetermined dimension, and a value of each element being one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, according to an embodiment of the present specification, the method including:
in step S302, a current object embedding vector of a first object in the platform is obtained, where the current object embedding vector is a vector in the vector space of the predetermined dimension;
in step S304, actual scores of the first object by a plurality of users in the platform and respective current user embedded vectors of the plurality of users are obtained, where the current user embedded vectors are vectors in the vector space;
at step S306, updating the object embedding vector of the first object by a predetermined optimization algorithm based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function includes a prediction error term in which a similarity of the user embedding vector of the corresponding user and the object embedding vector of the first object is defined by a magnitude of the actual score, and the actual score is converted to a value within a second numerical range to define a range of values of an inner product of the object embedding vector of the first object and the corresponding user embedding vector, wherein the second numerical range is determined based on the first numerical range;
in step S308, it is determined whether the above steps need to be executed again; and
in step S310, in a case where the above steps do not need to be performed again, each element of the updated object embedding vector of the first object is mapped to one of the predetermined number of predetermined values by a predetermined hashing algorithm, respectively, to obtain an object hash embedding vector of the first object. In this method, for the obtaining of the respective quantities in step S302 and step S304, reference may be made to the description of step S202 and step S204 above, which is not repeated here.
In step S306, the object embedding vector of the first object is updated by a predetermined optimization algorithm based on a predetermined objective function such that the value of the objective function is reduced, wherein the objective function comprises a prediction error term in which the similarity of the user embedding vector of the respective user and the object embedding vector of the first object is defined by the magnitude of the actual score, and the actual score is converted to a value within a second value range to define a value range of an inner product of the respective user embedding vector and the object embedding vector of the first object, wherein the second value range is determined based on the first value range.
The objective function here may be the total objective function shown in formula (1) above, or may be the objective function L_j for the first object j shown in the following formula (4):
Figure BDA0002028159830000161
That is, L_j includes all the terms of L in formula (1) that contain V_j. R'_ij is the same as R'_ij in formula (1). In formula (4), the first term is a prediction error term and the second term is a regular term; for the specific description, refer to the description of formula (1) above, which is not repeated here.
Thus, optimization can likewise be performed based on this objective function. For example, in the case of optimization by SGD, the object embedding vector V_j of the first object may be updated by the following formula (5):
Figure BDA0002028159830000171
Where γ is the iteration step size. Based on formulas (4) and (5), V_j can be updated by substituting each R'_ij, each U_i, and each V_k acquired in steps S302 and S304 into formula (5), where the V_k include V_j.
The specific implementation of step S308 and step S310 can refer to the above description of step S208 and step S210, and is not repeated here.
Thus, through the above hash mapping, an object hash embedding vector of the first object can be obtained, which is equivalent to the hash encoding of the first object. By the same method, hash codes of respective objects can be acquired, and thus a hash table regarding the hash codes of the objects can be made.
In one embodiment, in the event that it is determined in step S308 that it is necessary to re-loop steps S302-S306, the method returns to step S302 for re-looping until it proceeds to step S310 after it is determined in step S308 that re-looping is not necessary.
Fig. 4 illustrates a method for pushing an object to a user in a platform according to an embodiment of the present disclosure, where a first hash table and a second hash table are preset in the platform, where the first hash table includes user hash embedded vectors of respective multiple users obtained by the method shown in fig. 2, and the second hash table includes object hash embedded vectors of respective multiple objects obtained by the method shown in fig. 3, and the method includes:
at step S402, look up a first user hash embedding vector for a first user of the plurality of users from a first hash table;
in step S404, looking up a first object hash embedding vector of each object in a preset object set from a second hash table, wherein the preset object set is composed of objects in the plurality of objects;
in step S406, calculating a similarity between the first user hash embedding vector and each of the first object hash embedding vectors; and
in step S408, based on the similarity, an object pushed to the first user is determined in the preset object set.
First, in step S402, a first user hash embedding vector of a first user of the plurality of users is looked up from a first hash table. The user hash embedded vector of the first user can be looked up in the first hash table through the user identification, the user number and the like of the first user to serve as the first user hash embedded vector.
In step S404, a first object hash embedding vector of each object in a preset object set is looked up from a second hash table, wherein the preset object set is composed of the objects in the plurality of objects. The preset object set is, for example, a plurality of commodities in a predetermined commodity category in the shopping platform, such as mobile phone commodities, household appliance commodities, and the like. Or, the preset object set is, for example, a commodity set that is desired to be promoted to each user, and the like. And searching the respective object hash embedding vector of each object in the preset object set from the second hash table to be used as each first object hash embedding vector through object identification, object number and the like.
In step S406, a similarity between the first user hash embedding vector and each of the first object hash embedding vectors is calculated.
In one embodiment, the similarity is a Hamming distance between vectors, that is, the number of identical elements (or different elements) of the first user hash embedding vector and each first object hash embedding vector is calculated. In one embodiment, this number may be obtained by an exclusive-or (XOR) operation. It is to be understood that the similarity is not limited to the Hamming distance, but may be another similarity, such as the Euclidean distance. In one embodiment, in the case of U_i, V_j ∈ {-1, 1}^K, that is, when the element values of the first user hash embedding vector and each first object hash embedding vector are -1 or 1, each -1 may be replaced with 0 before the XOR operation is performed, which facilitates storage in the computer.
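The XOR-based count can be sketched as follows, assuming the -1/1 codes are packed into the bits of an integer with -1 stored as 0. The helper names and the sample codes are hypothetical.

```python
def pack_bits(hash_vec):
    """Pack a vector with elements in {-1, 1} into an integer, storing -1 as bit 0 and 1 as bit 1."""
    bits = 0
    for x in hash_vec:
        bits = (bits << 1) | (1 if x == 1 else 0)
    return bits

def hamming_distance(a_bits, b_bits):
    """Number of differing positions: XOR sets a bit wherever the two codes disagree."""
    return bin(a_bits ^ b_bits).count("1")

user_code = [1, -1, 1, 1]
object_code = [1, 1, 1, -1]

differing = hamming_distance(pack_bits(user_code), pack_bits(object_code))
print(differing)                   # 2
print(len(user_code) - differing)  # 2
```

The number of identical elements is simply K minus the Hamming distance, so either count yields the same ranking of objects.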
In step S408, based on the similarity, an object pushed to the first user is determined in the preset object set.
In the case that the similarity is a Hamming distance, since the embedding vectors of the user and the object obtained in the embodiments of this specification are hash embedding vectors, each element value is limited to a finite number of predetermined values, so the hash embedding vectors of the user and the object take values in a finite set of vectors, and when the features of the user and the object overlap strongly, the probability that their hash embedding vectors are the same is high. Thus, the number of identical elements may be determined by XORing the hash embedding vector of the first user with that of each object; for example, the objects may be sorted by the number of identical elements and the top-ranked objects pushed to the first user, or the objects whose object hash embedding vector is the same as the user hash embedding vector of the first user may be pushed to the first user, and so on.
In the case that the similarity takes another form, such as the Euclidean distance, since the obtained embedding vectors of the user and the object are hash embedding vectors, each element value is limited to a finite number of predetermined values, which simplifies the calculation of the similarity and saves calculation time.
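Steps S402-S408 can be sketched end to end as follows, assuming the two hash tables are plain dictionaries keyed by user and object identifiers and that objects are ranked by the number of identical elements. The function names, identifiers, and codes are hypothetical sample data.

```python
def identical_elements(a, b):
    """Count the positions at which two hash embedding vectors hold the same value."""
    return sum(1 for x, y in zip(a, b) if x == y)

def push_objects(user_id, user_table, object_table, candidate_ids, top_n=1):
    user_code = user_table[user_id]                                   # S402: look up the user code
    codes = {oid: object_table[oid] for oid in candidate_ids}         # S404: look up the object codes
    scores = {oid: identical_elements(user_code, code)                # S406: similarity per object
              for oid, code in codes.items()}
    ranked = sorted(candidate_ids, key=lambda oid: scores[oid], reverse=True)
    return ranked[:top_n]                                             # S408: objects to push

user_table = {"u_i": [1, -1, 1, 1]}
object_table = {
    "v_1": [1, 1, 1, -1],
    "v_2": [1, -1, 1, -1],
    "v_3": [-1, 1, -1, -1],
}

print(push_objects("u_i", user_table, object_table, ["v_1", "v_2", "v_3"], top_n=2))
# ['v_2', 'v_1']
```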
Fig. 5 illustrates an apparatus 500 for obtaining user hash embedded vectors in a platform, the user hash embedded vectors including elements of a predetermined dimension, and each element having a value of one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, according to an embodiment of the present description, the apparatus including:
a first obtaining unit 51, configured to obtain actual scores of a plurality of objects in a platform by a first user in the platform and current object embedding vectors of the respective objects, where the current object embedding vectors are vectors in the vector space of the predetermined dimension;
a second obtaining unit 52, configured to obtain relationship strengths of the first user and the multiple users in the platform, respectively, and current user embedded vectors of the first user and the multiple users, where the current user embedded vectors are vectors in the vector space;
an updating unit 53 configured to update, based on a predetermined objective function, the user-embedded vector of the first user through a predetermined optimization algorithm so that a value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term, wherein in the prediction error term the similarity of the user-embedded vector of the first user and the object-embedded vector of the corresponding object is defined by the size of the actual score, and the actual score is converted to a value within a second value range to define a value range of an inner product of the user-embedded vector of the first user and the corresponding object-embedded vector, wherein the second value range is determined based on the first value range, and the social constraint term includes the relationship strengths and the user-embedded vectors of the first user and the plurality of users, respectively;
a judging unit 54 configured to judge whether the above steps need to be performed again; and
a mapping unit 55 configured to map each element of the updated user embedded vector of the first user to one of the predetermined number of predetermined values respectively by a predetermined hashing algorithm to obtain a user hash embedded vector of the first user without re-execution.
Fig. 6 shows an apparatus 600 for obtaining an object hash embedding vector in a platform, the object hash embedding vector including elements of a predetermined dimension, and a value of each element being one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, according to an embodiment of the present specification, the apparatus including:
a first obtaining unit 61, configured to obtain a current object embedding vector of a first object in a platform, where the current object embedding vector is a vector in the vector space of the predetermined dimension;
a second obtaining unit 62, configured to obtain actual scores of the first object by a plurality of users in a platform, and current user embedded vectors of the plurality of users, respectively, where the current user embedded vectors are vectors in the vector space;
an updating unit 63 configured to update the object embedding vector of the first object by a predetermined optimization algorithm based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function includes a prediction error term in which a similarity of the user embedding vector of the corresponding user and the object embedding vector of the first object is defined by a magnitude of the actual score, and the actual score is converted to a value within a second numerical range to define a range of values of an inner product of the user embedding vector of the corresponding user and the object embedding vector of the first object, wherein the second numerical range is determined based on the first numerical range;
a judging unit 64 configured to judge whether the above steps need to be executed again; and
a mapping unit 65 configured to map each element of the updated object embedding vector of the first object to one of the predetermined number of predetermined values, respectively, by a predetermined hashing algorithm to obtain an object hash embedding vector of the first object, without needing to be executed again.
Fig. 7 illustrates an apparatus 700 for pushing an object to a user in a platform according to an embodiment of the present disclosure, where a first hash table and a second hash table are preset in the platform, where the first hash table includes user hash embedded vectors of a plurality of users obtained by the method shown in fig. 2, and the second hash table includes object hash embedded vectors of a plurality of objects obtained by the method shown in fig. 3, and the apparatus includes:
a first lookup unit 71 configured to lookup a first user hash embedding vector of a first user of the plurality of users from a first hash table;
a second lookup unit 72 configured to lookup, from a second hash table, a first object hash embedding vector for each object in a preset object set, wherein the preset object set is composed of objects in the plurality of objects;
a calculating unit 73 configured to calculate a similarity between the first user hash embedding vector and each of the first object hash embedding vectors; and
a determining unit 74 configured to determine, based on the similarity, an object pushed to the first user in the preset object set.
Another aspect of the present specification provides a computer readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform any one of the above methods.
Another aspect of this specification provides a computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor executes the executable code to implement any of the above methods.
In the push scheme based on hash mapping according to the embodiment of the specification, hash codes of users and objects are learned by using a social matrix decomposition method based on hash, and a recommendation result is obtained by using a hash table lookup method during online prediction, so that the recommendation result can be quickly obtained compared with the prior art, and the recommendation timeliness is good.
It is to be understood that the terms "first," "second," and the like, herein are used for descriptive purposes only and not for purposes of limitation, to distinguish between similar concepts.
All the embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
It will be further appreciated by those of ordinary skill in the art that the elements and algorithm steps of the various examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the various examples have been described in a functional generic sense in the foregoing description for the purpose of clearly illustrating the interchangeability of hardware and software. Whether these functions are performed in hardware or software depends on the particular application of the solution and design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (26)

1. A method of obtaining a user hash embedded vector in a platform, the user hash embedded vector comprising elements of a predetermined dimension and each element having a value of one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the method comprising:
acquiring actual scores of a plurality of objects in a platform and current object embedding vectors of the objects respectively by a first user in the platform, wherein the current object embedding vectors are vectors in a vector space of the preset dimension;
acquiring the relationship strength between the first user and a plurality of users in the platform respectively, and the current user embedded vectors of the first user and the plurality of users respectively, wherein the current user embedded vector is a vector in the vector space;
updating, by a predetermined optimization algorithm, the user-embedded vector of the first user based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term, wherein in the prediction error term the similarity of the user-embedded vector of the first user to the object-embedded vector of the corresponding object is defined by the size of the actual score, and the actual score is converted to a value within a second range of values to define a range of values of an inner product of the user-embedded vector of the first user and the corresponding object-embedded vector, wherein the second range of values is determined based on the first range of values, and the social constraint term includes each of the relationship strengths and the user-embedded vectors of the first user and the plurality of users, respectively;
judging whether the steps need to be executed again or not; and
in a case where the re-execution is not required, mapping each element of the updated user embedded vector of the first user to one of the predetermined number of predetermined values respectively by a predetermined hash algorithm to obtain a user hash embedded vector of the first user.
2. The method of claim 1, wherein the objective function further comprises a regularization term comprising user embedded vectors of each of the first user and the plurality of users.
3. The method of claim 2, wherein the regularization term comprises a second order norm of a sum of user embedding vectors of each of the first user and the plurality of users.
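The update loop of claims 1 to 3 can be illustrated with a minimal sketch. The claims do not fix the exact form of the objective function or the optimization algorithm, so everything concrete here is an assumption made for illustration: a squared prediction error, a strength-weighted squared distance as the social constraint term, the squared second-order norm of the sum of user vectors as the regularization term, plain gradient descent as the optimizer, and a stopping rule based on the decrease of the objective. The names `user_objective`, `user_gradient`, `update_user`, `others_sum`, `alpha`, `beta`, and `lr` are hypothetical, and the scores in `rated` are assumed to have been converted to the second range of values already.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def user_objective(u, rated, friends, others_sum, alpha, beta):
    """Assumed objective: prediction error + social constraint + regularization."""
    # Prediction error: squared gap between converted score and inner product.
    pred = sum((t - dot(u, v)) ** 2 for t, v in rated)
    # Social constraint: strength-weighted squared distance to related users.
    social = sum(w * sum((a - b) ** 2 for a, b in zip(u, f)) for w, f in friends)
    # Regularization: squared second-order norm of the sum of user vectors.
    total = [a + b for a, b in zip(u, others_sum)]
    return pred + alpha * social + beta * dot(total, total)


def user_gradient(u, rated, friends, others_sum, alpha, beta):
    """Gradient of user_objective with respect to u."""
    grad = [0.0] * len(u)
    for t, v in rated:
        err = dot(u, v) - t
        for k in range(len(u)):
            grad[k] += 2 * err * v[k]
    for w, f in friends:
        for k in range(len(u)):
            grad[k] += 2 * alpha * w * (u[k] - f[k])
    for k in range(len(u)):
        grad[k] += 2 * beta * (u[k] + others_sum[k])
    return grad


def update_user(u, rated, friends, others_sum,
                alpha=0.1, beta=0.01, lr=0.01, max_iter=200, tol=1e-9):
    """Repeat gradient steps while they reduce the objective by more than tol."""
    best = user_objective(u, rated, friends, others_sum, alpha, beta)
    for _ in range(max_iter):
        grad = user_gradient(u, rated, friends, others_sum, alpha, beta)
        cand = [x - lr * g for x, g in zip(u, grad)]
        value = user_objective(cand, rated, friends, others_sum, alpha, beta)
        if best - value <= tol:  # no meaningful decrease: stop iterating
            break
        u, best = cand, value
    return u


# Illustrative data only: two rated objects, one related user, K = 3, a = 1.
u0 = [0.2, -0.1, 0.3]
rated = [(3.0, [0.5, 0.4, 0.9]), (-3.0, [-0.3, 0.8, -0.6])]
friends = [(0.8, [0.4, 0.1, 0.5])]
others_sum = [0.1, -0.2, 0.0]
u1 = update_user(u0, rated, friends, others_sum)
```

Because a candidate step is accepted only when it lowers the objective, the returned vector never has a higher objective value than the starting vector; the subsequent hashing step of claim 1 would then be applied to `u1`.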
4. A method of obtaining an object hash embedding vector in a platform, the object hash embedding vector comprising elements of a predetermined dimension and each element having a value of one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the method comprising:
acquiring a current object embedding vector of a first object in a platform, wherein the current object embedding vector is a vector in a vector space of the predetermined dimension;
acquiring actual scores of a plurality of users in a platform on the first object respectively and current user embedded vectors of the users respectively, wherein the current user embedded vectors are vectors in the vector space;
updating, by a predetermined optimization algorithm, an object embedding vector of the first object based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function comprises a prediction error term in which a similarity of a user embedding vector of a respective user to an object embedding vector of the first object is defined by the magnitude of the actual score, and the actual score is converted to a value within a second numerical range to define a range of values of an inner product of the respective user embedding vector and the object embedding vector of the first object, wherein the second numerical range is determined based on the first numerical range;
judging whether the steps need to be executed again; and
in a case where the re-execution is not required, mapping each element of the updated object embedding vector of the first object to one of the predetermined number of predetermined values, respectively, by a predetermined hash algorithm to obtain an object hash embedding vector of the first object.
5. The method of claim 4, wherein the objective function further comprises a regularization term comprising object embedding vectors for respective ones of a plurality of objects in the platform, wherein the first object is comprised in the plurality of objects.
6. The method according to claim 1 or 4, wherein the predetermined number of predetermined values equally divides the first range of values.
7. The method of claim 6, wherein the predetermined number of predetermined values are -a and a, and the first range of values is [-a, a], wherein a is any positive integer.
8. The method according to claim 7, wherein the predetermined hash algorithm maps a value less than or equal to 0 to -a and a value greater than 0 to a.
9. The method of claim 7, wherein the vector space is a K-dimensional space, the inner product and [formula image FDA0002028159820000031, not reproduced] are included in the prediction error term, the second range of values is [formula image FDA0002028159820000032, not reproduced], and the range of values of the inner product is limited to [-Ka^2, Ka^2].
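The bound on the inner product can be checked with a short sketch. It assumes the sign-style hash described in claim 8 (values greater than 0 map to a, all others to -a) and, because the claimed conversion formula is an image that is not reproduced here, a simple linear rescaling of the actual score as a stand-in. The function names `sign_hash` and `rescale_score` and the rating scale 1 to 5 are hypothetical.

```python
def sign_hash(vec, a=1):
    """Map each element to a if it is greater than 0, otherwise to -a."""
    return [a if x > 0 else -a for x in vec]


def rescale_score(score, lo, hi, k, a):
    """Assumed conversion: linearly map [lo, hi] onto [-k*a*a, k*a*a]."""
    bound = k * a * a
    return (score - lo) / (hi - lo) * 2 * bound - bound


def inner(x, y):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(x, y))


k, a = 4, 2
user_code = sign_hash([0.3, -1.2, 0.0, 2.5], a)    # [2, -2, -2, 2]
object_code = sign_hash([1.1, 0.4, -0.7, 0.9], a)  # [2, 2, -2, 2]

# Each of the K products is either a*a or -(a*a), so the sum stays
# inside [-K*a^2, K*a^2]; here that is [-16, 16].
print(inner(user_code, object_code))               # prints 8
print(rescale_score(1, 1, 5, k, a), rescale_score(5, 1, 5, k, a))  # prints -16.0 16.0
```

Converting scores into the same interval that the inner product can reach is what lets the prediction error term compare the two quantities directly.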
10. The method of claim 1 or 4, wherein initial values of respective elements of the user embedding vector and the object embedding vector are defined within the first range of values.
11. A method for pushing an object to a user in a platform, wherein a first hash table and a second hash table are preset in the platform, the first hash table includes user hash embedded vectors of a plurality of users obtained by the method of claim 1, and the second hash table includes object hash embedded vectors of a plurality of objects obtained by the method of claim 4, the method comprising:
looking up a first user hash embedding vector for a first user of the plurality of users from a first hash table;
searching a first object hash embedding vector of each object in a preset object set from a second hash table, wherein the preset object set is formed by the objects in the plurality of objects;
calculating a similarity between the first user hash embedding vector and each of the first object hash embedding vectors; and
determining, based on the similarity, an object in the preset object set to be pushed to the first user.
12. The method of claim 11, wherein the similarity is a Hamming distance.
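The lookup and ranking steps of claims 11 and 12 might look like the following sketch. It assumes the two hash tables are ordinary dictionaries keyed by user and object identifiers, and that the objects with the smallest Hamming distance to the user are the ones pushed; the claims only state that the decision is based on the similarity. The names `hamming`, `recommend`, `user_table`, `object_table`, and `top_n` are hypothetical.

```python
def hamming(x, y):
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for p, q in zip(x, y) if p != q)


def recommend(user_id, candidate_ids, user_table, object_table, top_n=1):
    """Rank candidate objects by Hamming distance to the user's hash code."""
    user_code = user_table[user_id]                       # first hash table lookup
    ranked = sorted(
        candidate_ids,
        key=lambda oid: hamming(user_code, object_table[oid]),  # second table lookup
    )
    return ranked[:top_n]


# Illustrative tables with a = 1 and K = 4.
user_table = {"u1": [1, -1, 1, 1]}
object_table = {
    "o1": [1, -1, 1, -1],   # distance 1 from u1
    "o2": [-1, 1, -1, -1],  # distance 4 from u1
    "o3": [1, -1, 1, 1],    # distance 0 from u1
}

print(recommend("u1", ["o1", "o2", "o3"], user_table, object_table, top_n=2))
# prints ['o3', 'o1']
```

Since every element takes one of two values, the comparison reduces to counting mismatched positions, which is cheaper than computing real-valued inner products over the full candidate set.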
13. An apparatus for obtaining a user hash embedded vector in a platform, the user hash embedded vector comprising elements of a predetermined dimension, and each element having a value of one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the apparatus comprising:
a first obtaining unit, configured to obtain actual scores given by a first user in the platform to a plurality of objects in the platform, respectively, and current object embedding vectors of the plurality of objects, where each current object embedding vector is a vector in a vector space of the predetermined dimension;
a second obtaining unit, configured to obtain relationship strengths of the first user and a plurality of users in the platform, respectively, and current user embedded vectors of the first user and the plurality of users, respectively, where the current user embedded vector is a vector in the vector space;
an updating unit configured to update the user embedded vector of the first user through a predetermined optimization algorithm based on a predetermined objective function so that a value of the objective function is reduced, wherein the objective function includes a prediction error term and a social constraint term; in the prediction error term, the similarity of the user embedded vector of the first user and the object embedding vector of the corresponding object is defined by the magnitude of the actual score, and the actual score is converted to a value within a second value range to define a value range of an inner product of the user embedded vector of the first user and the corresponding object embedding vector, wherein the second value range is determined based on the first value range; and the social constraint term includes each of the relationship strengths and the user embedded vectors of the first user and of the plurality of users;
a judging unit configured to judge whether the above steps need to be executed again; and
a mapping unit configured to, in a case where re-execution is not required, map each element of the updated user embedded vector of the first user to one of the predetermined number of predetermined values, respectively, by a predetermined hash algorithm to obtain a user hash embedded vector of the first user.
14. The apparatus of claim 13, wherein the objective function further comprises a regularization term comprising user embedding vectors for each of the first user and the plurality of users.
15. The apparatus of claim 14, wherein the regularization term comprises a second order norm of a sum of user embedding vectors of each of the first user and the plurality of users.
16. An apparatus for obtaining an object hash embedding vector in a platform, the object hash embedding vector comprising elements of a predetermined dimension, and each element having a value of one of a predetermined number of predetermined values, the predetermined number of predetermined values defining a first range of values, the apparatus comprising:
a first obtaining unit, configured to obtain a current object embedding vector of a first object in a platform, where the current object embedding vector is a vector in the vector space of the predetermined dimension;
a second obtaining unit, configured to obtain actual scores of the first object by a plurality of users in a platform, and current user embedded vectors of the plurality of users, respectively, where the current user embedded vectors are vectors in the vector space;
an updating unit configured to update the object embedding vector of the first object by a predetermined optimization algorithm based on a predetermined objective function such that a value of the objective function is reduced, wherein the objective function includes a prediction error term in which a similarity of a user embedding vector of a corresponding user and the object embedding vector of the first object is defined by a magnitude of the actual score, and the actual score is converted to a value within a second numerical range to define a range of values of an inner product of the user embedding vector of the corresponding user and the object embedding vector of the first object, wherein the second numerical range is determined based on the first numerical range;
a judging unit configured to judge whether the above steps need to be executed again; and
a mapping unit configured to, in a case where re-execution is not required, map each element of the updated object embedding vector of the first object to one of the predetermined number of predetermined values, respectively, by a predetermined hash algorithm to obtain an object hash embedding vector of the first object.
17. The apparatus of claim 16, wherein the objective function further comprises a regularization term comprising object embedding vectors for each of a plurality of objects in the platform, wherein the first object is comprised in the plurality of objects.
18. Apparatus according to claim 13 or 16, wherein the predetermined number of predetermined values equally divides the first range of values.
19. The apparatus of claim 18, wherein the predetermined number of predetermined values are -a and a, and the first range of values is [-a, a], wherein a is any positive integer.
20. The apparatus of claim 19, wherein the predetermined hash algorithm maps a value less than or equal to 0 to -a and a value greater than 0 to a.
21. The apparatus of claim 18, wherein the vector space is a K-dimensional space, the inner product and [formula image FDA0002028159820000051, not reproduced] are included in the prediction error term, the second range of values is [formula image FDA0002028159820000052, not reproduced], and the range of values of the inner product is limited to [-Ka^2, Ka^2].
22. The apparatus of claim 13 or 16, wherein initial values of respective elements of the user embedding vector and the object embedding vector are defined within the first range of values.
23. An apparatus for pushing an object to a user in a platform, wherein a first hash table and a second hash table are preset in the platform, the first hash table includes user hash embedded vectors of a plurality of users obtained by the apparatus of claim 13, and the second hash table includes object hash embedded vectors of a plurality of objects obtained by the apparatus of claim 16, the apparatus comprising:
a first lookup unit configured to lookup a first user hash embedding vector of a first user of the plurality of users from a first hash table;
a second lookup unit configured to lookup, from a second hash table, a first object hash embedding vector for each object in a preset object set, wherein the preset object set is composed of objects in the plurality of objects;
a calculation unit configured to calculate a similarity between the first user hash embedded vector and each of the first object hash embedded vectors; and
a determining unit configured to determine, based on the similarity, an object pushed to the first user in the preset object set.
24. The apparatus of claim 23, wherein the similarity is a Hamming distance.
25. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-12.
26. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-12.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910300786.1A CN110275881B (en) 2019-04-15 2019-04-15 Method and device for pushing object to user based on Hash embedded vector

Publications (2)

Publication Number Publication Date
CN110275881A CN110275881A (en) 2019-09-24
CN110275881B true CN110275881B (en) 2023-01-17

Family

ID=67959378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910300786.1A Active CN110275881B (en) 2019-04-15 2019-04-15 Method and device for pushing object to user based on Hash embedded vector

Country Status (1)

Country Link
CN (1) CN110275881B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766166B (en) * 2019-10-23 2021-03-23 支付宝(杭州)信息技术有限公司 Push model optimization method and device executed by user terminal
US11822447B2 (en) 2020-10-06 2023-11-21 Direct Cursus Technology L.L.C Methods and servers for storing data associated with users and digital items of a recommendation system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016110125A1 (en) * 2015-01-09 2016-07-14 北京大学 Hash method for high dimension vector, and vector quantization method and device
CN106997381A (en) * 2017-03-21 2017-08-01 海信集团有限公司 Recommend the method and device of video display to targeted customer
CN109409393A (en) * 2018-06-20 2019-03-01 苏州大学 A method of User Activity track is modeled using track insertion

Also Published As

Publication number Publication date
CN110275881A (en) 2019-09-24

Similar Documents

Publication Publication Date Title
WO2020182019A1 (en) Image search method, apparatus, device, and computer-readable storage medium
CN106886599B (en) Image retrieval method and device
CN107492008B (en) Information recommendation method and device, server and computer storage medium
CN109829775A (en) A kind of item recommendation method, device, equipment and readable storage medium storing program for executing
US20160321265A1 (en) Similarity calculation system, method of calculating similarity, and program
CN110275881B (en) Method and device for pushing object to user based on Hash embedded vector
US20190266474A1 (en) Systems And Method For Character Sequence Recognition
CN115982463A (en) Resource recommendation method, device, equipment and storage medium
EP3452916A1 (en) Large scale social graph segmentation
CN110008348B (en) Method and device for embedding network diagram by combining nodes and edges
CN112182144B (en) Search term normalization method, computing device, and computer-readable storage medium
CN108804470B (en) Image retrieval method and device
Insuwan et al. Improving missing values imputation in collaborative filtering with user-preference genre and singular value decomposition
CN114707063A (en) Commodity recommendation method and device, electronic equipment and storage medium
CN113569070A (en) Image detection method and device, electronic equipment and storage medium
CN109255079B (en) Cloud service personality recommendation system and method based on sparse linear method
CN109189773B (en) Data restoration method and device
CN114283300A (en) Label determining method and device, and model training method and device
CN112488355A (en) Method and device for predicting user rating based on graph neural network
CN111563783B (en) Article recommendation method and device
Galinier et al. Genetic algorithm to improve diversity in MDE
CN117252665B (en) Service recommendation method and device, electronic equipment and storage medium
CN113807749B (en) Object scoring method and device
CN114281944B (en) Document matching model construction method and device, electronic equipment and storage medium
CN112765458B (en) Mixed recommendation method based on metric decomposition and label self-adaptive weight distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200929

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200929

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant