CN110782287A - Entity similarity calculation method and device, article recommendation system, medium and equipment - Google Patents

Publication number
CN110782287A
Authority
CN
China
Prior art keywords
entity
entities
obtaining
similarity
user behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911024620.8A
Other languages
Chinese (zh)
Inventor
李懿
崔娜
李晓霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201911024620.8A priority Critical patent/CN110782287A/en
Publication of CN110782287A publication Critical patent/CN110782287A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Abstract

The embodiment of the invention relates to an entity similarity calculation method and device, an article recommendation system, a medium and equipment, relating to the technical field of big data processing, wherein the method comprises the following steps: acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results; obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network; and obtaining entity vectors of the entities according to the entity path set, and obtaining the similarity between the entities according to the entity vectors. The embodiment of the invention improves the accuracy of similarity calculation among the entities.

Description

Entity similarity calculation method and device, article recommendation system, medium and equipment
Technical Field
The embodiment of the invention relates to the technical field of big data processing, in particular to an entity similarity calculation method, an entity similarity calculation device, an article recommendation system, a computer readable storage medium and electronic equipment.
Background
With the continuous development and popularization of internet technology, the number of online shopping malls keeps growing. To attract more consumers and encourage them to purchase more, recommending products that meet users' needs has become a pressing problem for every major e-commerce platform.
Most existing recommendation methods use association rule analysis to mine the likelihood that two commodities are purchased by the same user. Based on this similarity measurement technique, other highly relevant entities can be recommended to the user according to the entity the user is currently acting on, helping the user find needed goods.
However, the above method has the following drawbacks: since the similarity is obtained by using the association rule analysis and the vectorization representation of the entity is not used, the accuracy of the similarity calculation result is low.
Therefore, it is desirable to provide a method and an apparatus for calculating entity similarity.
It is to be noted that the information disclosed in the above background section is only for enhancing the understanding of the background of the present invention, and therefore may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide an entity similarity calculation method, an entity similarity calculation device, an article recommendation system, a computer readable storage medium and an electronic device, and thereby to overcome, at least to some extent, the problem of low accuracy of similarity calculation results caused by the limitations and defects of the related art.
According to an aspect of the present disclosure, there is provided an entity similarity calculation method including:
acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results;
obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network;
and obtaining entity vectors of the entities according to the entity path set, and obtaining the similarity between the entities according to the entity vectors.
In an exemplary embodiment of the present disclosure, the obtaining a plurality of segmentation results by segmenting the user behavior record includes:
sequencing the user behavior records according to the generation time of the user behavior records;
segmenting the sorted user behavior records according to a preset time condition to obtain a plurality of segmentation results;
and the time difference between the generation times corresponding to any two user behavior records in each segmentation result is not greater than a preset time threshold.
In an exemplary embodiment of the present disclosure, obtaining an entity relationship network between the entities according to the segmentation results includes:
judging whether the relation between the user behavior records corresponding to any two entities meets a preset relation condition or not;
when determining that the relationship between the user behavior records corresponding to any two entities meets the preset relationship condition, determining that entity connection exists between the two entities;
and obtaining an entity relationship network between the entities according to entity connection between the entities existing in the segmentation results.
In an exemplary embodiment of the disclosure, the preset relationship condition is that any two entities belong to the same segmentation result, and user behavior records corresponding to any two entities are located at adjacent positions in the same segmentation result.
In an exemplary embodiment of the present disclosure, the entity similarity calculation method further includes:
calculating the number of confirmed entity connections between any two entities in each segmentation result;
and obtaining the weight of the entity connection of any two entities in the entity relationship network according to the number of the entity connection confirmed to exist between any two entities.
In an exemplary embodiment of the present disclosure, obtaining an entity path set corresponding to each of the entities according to the entity relationship network includes:
step S10, randomly selecting an entity i as a starting point of an entity path corresponding to the entity i, and selecting, in the entity relationship network, an entity j having the largest number of entity connections with the entity i as a next node;
step S20, traversing other entities in the entity relationship network, and obtaining a plurality of other nodes according to the weights of the other entities and the entity connection of the entity i in the entity relationship network;
step S30, obtaining an entity path corresponding to the entity i according to the other nodes, the entity j and the entity i;
step S40, the step S10 to the step S30 are circulated until entity paths corresponding to all the entities included in the entity relationship network are obtained;
step S50, obtaining the entity path set corresponding to each entity according to the entity path corresponding to each entity.
In an exemplary embodiment of the present disclosure, obtaining an entity vector of each entity according to the entity path set includes:
and processing the entity path corresponding to each entity in the entity path set by using a vectorization processing tool to obtain an entity vector of each entity.
In an exemplary embodiment of the present disclosure, obtaining a similarity between the entities according to the entity vectors includes:
and calculating cosine values among the entity vectors, and taking the cosine values as the similarity among the entities.
In an exemplary embodiment of the present disclosure, the entity similarity calculation method further includes:
obtaining a similarity matrix corresponding to the user behavior record according to the similarity between the entities;
and obtaining target recommendation data corresponding to the user behavior record according to the similarity matrix.
According to an aspect of the present disclosure, there is provided an entity similarity calculation apparatus including:
the behavior record segmentation module is used for acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results;
an entity relationship network determining module, configured to obtain an entity relationship network between the entities according to the segmentation results, and obtain an entity path set of each entity according to the entity relationship network;
and the similarity determining module is used for obtaining entity vectors of the entities according to the entity path set and obtaining the similarity between the entities according to the entity vectors.
According to an aspect of the present disclosure, there is provided an item recommendation system including:
the server is used for acquiring user behavior records generated when the user performs behavior operation on each entity and segmenting the user behavior records to obtain a plurality of segmentation results; and
obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network; and
obtaining entity vectors of the entities according to the entity path set, and obtaining similarity between the entities according to the entity vectors; and
obtaining target recommendation data corresponding to the user behavior record according to the similarity between the entities;
the terminal equipment is in communication connection with the server and is used for receiving behavior operation of the user on each entity; and
and displaying the target recommendation data.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the entity similarity degree calculation method according to any one of the above.
According to an aspect of the present disclosure, there is provided an electronic device including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the entity similarity calculation method of any one of the above via execution of the executable instructions.
The entity similarity calculation method and device provide the following benefits. First, user behavior records generated when a user performs behavior operations on each entity are acquired and segmented to obtain a plurality of segmentation results; an entity relationship network among the entities is obtained according to the segmentation results, and an entity path set of each entity is obtained according to the entity relationship network; entity vectors of the entities are obtained according to the entity path set, and the similarity among the entities is obtained according to the entity vectors. Because the similarity is obtained from the entity vectors, this addresses the prior-art problem that the similarity calculation result has low accuracy when the similarity is obtained by association rule analysis without a vectorized representation of the entities, and the accuracy of the similarity calculation among the entities is improved. Second, because the entity relationship network is obtained from the segmentation results of the user behavior records, the accuracy of the entity relationship network is improved. Third, because the entity vectors are obtained according to the entity path set and the similarity is obtained according to those vectors, the problem that the accuracy of the similarity is low when the features of the entities can only be expressed in a vector form with limited dimensionality is addressed.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 schematically illustrates a flowchart of an entity similarity calculation method according to an exemplary embodiment of the present invention;
FIG. 2 schematically illustrates a flowchart of a method for obtaining an entity relationship network between entities according to segmentation results, according to an exemplary embodiment of the present invention;
FIG. 3 schematically illustrates a flowchart of a method for obtaining a set of entity paths corresponding to entities from an entity relationship network, according to an exemplary embodiment of the present invention;
FIG. 4 schematically illustrates a flowchart of a method for item recommendation using the entity similarity calculation method according to an exemplary embodiment of the present invention;
FIG. 5 schematically illustrates a block diagram of an entity similarity calculation apparatus according to an exemplary embodiment of the present invention;
FIG. 6 schematically illustrates a block diagram of an item recommendation system according to an exemplary embodiment of the present invention;
FIG. 7 schematically illustrates an electronic device for implementing the entity similarity calculation method according to an exemplary embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the invention.
Furthermore, the drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Some entity similarity calculation methods rely on the following techniques:
The first is user behavior extraction: when a user uses a product, a large amount of behavior data is generated, and user behavior extraction refers to methods for extracting and processing that data. Common examples include behavior data aggregation and vectorized representation of behavior data.
The second is entity vectorization representation: in large-scale sparse data scenarios, the features of an entity are usually expressed with one-hot encoding, a denoising autoencoder, or the like, so that the features of the entity are represented as a vector of limited dimensionality and used for entity correlation calculation. Such a representation may reduce the storage space of the entity representation while increasing the compression rate of the encoding.
The third is the correlation metric: a correlation metric is a method of assessing the degree of similarity between two entities. For example, in an e-commerce recommendation scenario, association rule analysis is used to mine the likelihood that two commodities are purchased by the same user. Based on this similarity measurement technique, other highly relevant entities can be recommended to the user according to the entity the user is currently acting on, helping the user find needed goods.
However, the first technique does not represent user behavior as a behavior network, so the accuracy of the similarity calculation result is low; the second does not combine user behavior information with the entity vectorization, which also lowers the accuracy of the similarity calculation result; and the third does not use a high-dimensional vectorized representation of the entity, so the accuracy of its similarity calculation result is low as well.
In the present exemplary embodiment, a method for calculating entity similarity is first provided, where the method may be performed in a server, a server cluster, a cloud server, or the like, and may also be performed in a terminal device; of course, those skilled in the art may also operate the method of the present invention on other platforms as needed, and this is not particularly limited in this exemplary embodiment. Referring to fig. 1, the entity similarity calculation method may include the steps of:
Step S110, acquiring user behavior records generated when the user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results.
Step S120, obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of each entity according to the entity relationship network.
Step S130, obtaining entity vectors of all the entities according to the entity path set, and obtaining similarity between all the entities according to all the entity vectors.
The entity similarity calculation method above provides the following benefits. First, user behavior records generated when a user performs behavior operations on each entity are acquired and segmented to obtain a plurality of segmentation results; an entity relationship network among the entities is obtained according to the segmentation results, and an entity path set of each entity is obtained according to the entity relationship network; entity vectors of the entities are obtained according to the entity path set, and the similarity among the entities is obtained according to the entity vectors. Because the similarity is obtained from the entity vectors, this addresses the prior-art problem that the similarity calculation result has low accuracy when the similarity is obtained by association rule analysis without a vectorized representation of the entities, and the accuracy of the similarity calculation among the entities is improved. Second, because the entity relationship network is obtained from the segmentation results of the user behavior records, the accuracy of the entity relationship network is improved. Third, because the entity vectors are obtained according to the entity path set and the similarity is obtained according to those vectors, the problem that the accuracy of the similarity is low when the features of the entities can only be expressed in a vector form with limited dimensionality is addressed.
Hereinafter, each step involved in the entity similarity calculation method according to the exemplary embodiment of the present invention will be explained in detail with reference to the drawings.
In step S110, a user behavior record generated when a user performs behavior operations on each entity is obtained, and the user behavior record is segmented to obtain a plurality of segmentation results.
In this exemplary embodiment, first, a user behavior record generated when a user performs behavior operation on each entity may be obtained; in the case of electronic commerce, the entity may include various tradeable items displayed on the network interface, such as clothes, shoes, fruits, stationery, and the like; behavior operations may include, for example, clicking, browsing, purchasing, shopping, following, bookmarking, commenting, and adding follow-up comments, among others; the user behavior record may be, for example, a record generated when the user performs a behavior operation on an entity, such as purchasing a garment of brand XX or following a pair of shoes of brand XX.
Further, after obtaining the user behavior record, the user behavior record may be segmented to obtain a plurality of segmentation results, which specifically includes: firstly, sequencing the user behavior records according to the generation time of the user behavior records; secondly, segmenting the sorted user behavior records according to a preset time condition to obtain a plurality of segmentation results; and the time difference between the generation times corresponding to any two user behavior records in each segmentation result is not greater than a preset time threshold. In detail:
When a user performs a behavior operation on an entity, the system generally records data in the form of triples. For example, a record is represented as:
$r = \langle u, i, t \rangle \quad (r \in R)$;
where u represents a user, i represents an entity, t represents the timestamp at which the behavior was generated, and R represents all behavior records of the user. Further, all records of the same user are sorted in ascending order of time t; after sorting, wherever the time difference between two records is more than 30 minutes, the sequence is cut at that point, and each resulting piece serves as an access session (a segmentation result). In this way, all user behaviors can be converted into a set S of access sessions, where:
$\text{session} = \langle i_1, i_2, \ldots, i_n \rangle$;
An access session may be used to represent a user's behavior operations on entities as a time-ordered sequence, where $i_1, i_2, \ldots, i_n$ represent the user's behavior operations on the entities at different times within 30 minutes.
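The segmentation described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function name `split_into_sessions`, the use of integer timestamps in seconds, and the reading of "two records" as two consecutive records of the same user are all assumptions.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # the 30-minute threshold mentioned in the text


def split_into_sessions(records, max_gap=SESSION_GAP_SECONDS):
    """Group (user, entity, timestamp) triples into per-user access sessions.

    Assumption: a new session starts whenever the gap between two
    consecutive records of the same user exceeds max_gap seconds.
    """
    by_user = defaultdict(list)
    for user, entity, ts in records:
        by_user[user].append((ts, entity))

    sessions = []
    for user in sorted(by_user):
        events = sorted(by_user[user])  # ascending by timestamp
        current = [events[0][1]]
        for (prev_ts, _), (ts, entity) in zip(events, events[1:]):
            if ts - prev_ts > max_gap:
                sessions.append(current)  # close the session at the gap
                current = []
            current.append(entity)
        sessions.append(current)
    return sessions


# Hypothetical sample data: timestamps are seconds since an arbitrary origin.
records = [
    ("u1", "shirt", 0),
    ("u1", "shoes", 600),
    ("u1", "pen", 600 + 1801),  # more than 30 minutes after the previous record
    ("u2", "shoes", 100),
    ("u2", "socks", 200),
]
sessions = split_into_sessions(records)
```

With this sample, the first user's records fall into two sessions because of the gap before the last record, while the second user's records form a single session.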
In step S120, an entity relationship network between the entities is obtained according to the segmentation results, and an entity path set of each entity is obtained according to the entity relationship network.
In this exemplary embodiment, first, an entity relationship network corresponding to each entity is obtained according to each segmentation result. Specifically, referring to fig. 2, obtaining the entity relationship network corresponding to each entity according to each segmentation result may include steps S210 to S230, which are described in detail below.
In step S210, it is determined whether a relationship between the user behavior records corresponding to any two entities satisfies a preset relationship condition.
In step S220, when it is determined that the relationship between the user behavior records corresponding to any two of the entities satisfies the preset relationship condition, it is determined that an entity connection exists between the two entities.
In step S230, an entity relationship network between the entities is obtained according to the entity connection between the entities existing in each segmentation result.
Hereinafter, steps S210 to S230 will be explained.
First, the preset relationship condition is explained. The preset relationship condition may be, for example: any two entities belong to the same segmentation result, and the user behavior records corresponding to the two entities are at adjacent positions in that segmentation result. For example, if entities $i_k$ and $i_j$ both occur in the same access session and are adjacent, the relationship between the user behavior records corresponding to entities $i_k$ and $i_j$ is considered to satisfy the preset relationship condition.
Further, when it is determined that the relationship between the user behavior records corresponding to any two entities satisfies the preset relationship condition, it is confirmed that an entity connection exists between the two entities; then, after this determination has been made for all sessions in the session set S, an entity relationship network N may be formed, in which each node is an entity i.
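As a rough sketch of steps S210 to S230, the network can be built by connecting entities that are adjacent within a session. The function name `build_entity_network` is hypothetical, and treating connections as undirected and ignoring consecutive repeats of the same entity are assumptions that the text does not settle.

```python
def build_entity_network(sessions):
    """Return an adjacency map: entity -> set of entities connected to it.

    Two entities are connected when they are adjacent in the same session
    (the preset relationship condition). Assumption: connections are
    undirected, and a repeat of the same entity adds no connection.
    """
    network = {}
    for session in sessions:
        for entity in session:
            network.setdefault(entity, set())  # keep isolated entities as nodes
        for a, b in zip(session, session[1:]):
            if a == b:
                continue
            network[a].add(b)
            network[b].add(a)
    return network


# Hypothetical sessions, each a time-ordered list of entity identifiers.
example_sessions = [["shirt", "shoes"], ["pen"], ["shoes", "socks"]]
network = build_entity_network(example_sessions)
```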
Further, after the entity relationship network is obtained, the entity path set corresponding to each entity can be obtained according to the entity relationship network. Specifically, referring to fig. 3, obtaining the entity path set corresponding to each entity according to the entity relationship network may include steps S10-S50, which will be described in detail below.
In step S10, an entity i is randomly selected as a starting point of an entity path corresponding to the entity i, and an entity j having the largest number of entity connections with the entity i is selected as a next node in the entity relationship network.
In step S20, other entities in the entity relationship network are traversed, and a plurality of other nodes are obtained according to the weights of the entity connections between the other entities and the entity i in the entity relationship network.
In step S30, an entity path corresponding to the entity i is obtained according to the plurality of other nodes, the entity j, and the entity i.
In step S40, steps S10 to S30 are repeated until entity paths corresponding to all entities included in the entity relationship network are obtained.
In step S50, the entity path set corresponding to each entity is obtained according to the entity path corresponding to each entity.
Hereinafter, steps S10 to S50 will be explained.
First, in order to obtain a more accurate entity path set, the entity similarity calculation method may further include: firstly, calculating the number of confirmed entity connections between any two entities in each segmentation result; secondly, obtaining the weight of the entity connection of any two entities in the entity relationship network according to the number of entity connections confirmed to exist between them. Specifically, the whole entity relationship network may be traversed, and the number of times an entity connection is confirmed between entity i and entity j across all sessions in the session set S is used as the weight of the edge connecting entity i and entity j; the weights of the edges connecting entity i with other entities, and of the edges connecting the other entities with one another, are then obtained in turn.
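The edge weights can be sketched as a count of adjacent co-occurrences over all sessions. The function name `edge_weights` and the use of `frozenset` keys (which treats edges as undirected) are assumptions made for illustration.

```python
from collections import Counter


def edge_weights(sessions):
    """Count, for each pair of entities, how often they are adjacent in a session.

    Assumption: edges are undirected, so (a, b) and (b, a) share one counter.
    """
    weights = Counter()
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:
                weights[frozenset((a, b))] += 1
    return weights


# Hypothetical sessions: "a" and "b" are adjacent three times, "a" and "c" once.
weights = edge_weights([["a", "b", "a"], ["b", "a", "c"]])
```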
Further, after the weights are obtained, the entity paths corresponding to the entities may be obtained in sequence according to steps S10 to S30, and the entity path set may then be obtained from the entity paths corresponding to the entities. Specifically: first, an entity i is randomly selected from the network as the starting point of a path; second, the next node j of the path is selected probabilistically according to the weights of the edges connected to entity i, and is recorded in the path; then, the second step is repeated until the path length meets the set requirement.
It should be noted that, in the entity path set P generated by this method, one path is denoted as $p = \langle i_1, i_2, \ldots, i_n \rangle$. Because the method uses all user behaviors as the basis for path generation, it can directly simulate the general behavior patterns of users while avoiding the individual bias that may arise when user behaviors are used directly.
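One way to read the path generation is as a weighted random walk. The sketch below is an interpretation under stated assumptions, not the patent's procedure: each step moves from the current end of the path to a neighbor chosen with probability proportional to edge weight, every entity is used as a starting point a fixed number of times rather than being drawn at random, and the names `weighted_adjacency`, `random_walk` and `generate_paths` are hypothetical.

```python
import random
from collections import defaultdict


def weighted_adjacency(weights):
    """Turn {pair_of_entities: weight} into {a: {b: weight}, b: {a: weight}}."""
    adj = defaultdict(dict)
    for pair, w in weights.items():
        a, b = tuple(pair)
        adj[a][b] = w
        adj[b][a] = w
    return adj


def random_walk(adj, start, length, rng):
    """Walk from start, choosing each next node in proportion to edge weight."""
    path = [start]
    while len(path) < length:
        neighbors = adj.get(path[-1])
        if not neighbors:
            break  # dead end: the path stays shorter than requested
        nodes = list(neighbors)
        node_weights = [neighbors[n] for n in nodes]
        path.append(rng.choices(nodes, weights=node_weights, k=1)[0])
    return path


def generate_paths(adj, walks_per_node, length, seed=0):
    """Assumption: start walks_per_node walks from every entity in turn."""
    rng = random.Random(seed)
    paths = []
    for _ in range(walks_per_node):
        for start in sorted(adj):
            paths.append(random_walk(adj, start, length, rng))
    return paths


# Hypothetical weights: the a-b edge is three times as heavy as the a-c edge.
adj = weighted_adjacency({("a", "b"): 3, ("a", "c"): 1})
paths = generate_paths(adj, walks_per_node=2, length=4, seed=1)
```

Because a heavier edge is followed more often, entities that frequently appear next to each other in sessions also appear next to each other in many generated paths.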
In step S130, an entity vector of each entity is obtained according to the entity path set, and a similarity between the entities is obtained according to each entity vector.
In this exemplary embodiment, first, a vectorization processing tool is used to process an entity path corresponding to each entity included in the entity path set, so as to obtain an entity vector of each entity. Then, cosine values between the entity vectors are calculated, and the cosine values are used as similarity between the entities. In detail:
First, on the data of the entity path set P, a high-dimensional vectorized representation E of the entities can be learned with a vectorization learning method such as word2vec, where the representation of a given entity is $e = \langle a_1, a_2, \ldots, a_m \rangle$, each component $a_1, \ldots, a_m$ is a real number, and m is the dimension of the vector representation, which can also represent how many entities have entity connections with that entity in the entity relationship network.
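The text names word2vec as one possible vectorization method. To stay within the standard library, the sketch below swaps in a much simpler technique, plain co-occurrence counting within a window over the paths, purely to show how a path set can be turned into one vector per entity. It is not word2vec and does not learn dense embeddings; the function name `cooccurrence_vectors` and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(paths, window=2):
    """Map each entity to a vector of context counts over a fixed vocabulary.

    Simplified stand-in for word2vec: no training, only counts of which
    entities appear within `window` positions of each other on a path.
    """
    vocab = sorted({entity for path in paths for entity in path})
    counts = defaultdict(Counter)
    for path in paths:
        for pos, entity in enumerate(path):
            lo = max(0, pos - window)
            hi = min(len(path), pos + window + 1)
            for ctx_pos in range(lo, hi):
                if ctx_pos != pos:
                    counts[entity][path[ctx_pos]] += 1
    vectors = {e: [float(counts[e][c]) for c in vocab] for e in vocab}
    return vectors, vocab


# Hypothetical path set with three entities.
vectors, vocab = cooccurrence_vectors([["a", "b", "c"], ["a", "b"]], window=1)
```

In this stand-in the vector dimension equals the number of distinct entities; a learned embedding such as word2vec would instead use a dimension chosen as a hyperparameter.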
Secondly, after the entity vectors of the entities are obtained, cosine values among the entity vectors can be calculated, and then the cosine values are used as the similarity among the entities. Specifically, the similarity between two entities can be further calculated by using a cosine similarity formula according to the entity vectorization representation obtained in the above steps:
$$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}$$
wherein similarity represents the similarity between entities, and A and B represent the entity vectors of two entities.
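The cosine similarity between two entity vectors can be sketched in a few lines. This is a minimal sketch using only the standard library; the function name `cosine_similarity` is hypothetical, and returning 0.0 for a zero vector is an assumption the source does not address.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between entity vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical entity vectors A and B.
A = [1.0, 2.0, 3.0]
B = [2.0, 4.0, 6.0]
print(cosine_similarity(A, B))
```

Because the measure depends only on direction, vectors that are scalar multiples of each other, like A and B here, score as maximally similar.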
Further, in order to facilitate recommending corresponding goods for the user according to the similarity between the entities, the entity similarity calculation method may further include: obtaining a similarity matrix corresponding to the user behavior record according to the similarity between the entities; and obtaining target recommendation data corresponding to the user behavior record according to the similarity matrix. In detail:
First, the similarity between every pair of entities is calculated, and a similarity matrix M is generated from these similarities, where the element $M_{ij}$ of the matrix M represents the similarity between entity i and entity j. Further, for each of the k entities most recently operated on by the user, the entities with the highest similarity to that entity are looked up in the entity similarity matrix M and used as a recommendation candidate set. When the user next enters the recommendation page, the system presents the corresponding entities to the user according to the candidate set, thereby achieving the purpose of recommendation.
It should be further added that the method may be used not only for recommending goods but also for other recommendation scenarios, such as recommending content to a user who is reading an article or a novel; this example places no special limitation on this.
Hereinafter, a method of recommending an item using the entity similarity calculation method according to the exemplary embodiment of the present invention will be explained with reference to fig. 4. Referring to fig. 4, the item recommendation method may include the following steps:
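The matrix construction and candidate lookup described above can be sketched as follows. This is an illustration under stated assumptions: the matrix is stored as a nested dictionary, entities the user has already operated on are excluded from the candidates, and a candidate reached from several recent entities keeps its highest score. The names `similarity_matrix`, `recommend`, `top_n`, and the sample vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 for zero vectors (an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(vectors):
    """Build M as a nested dict: M[i][j] is the similarity of entities i and j."""
    ids = list(vectors)
    return {i: {j: cosine(vectors[i], vectors[j]) for j in ids if j != i} for i in ids}


def recommend(matrix, recent, top_n):
    """Collect the entities most similar to the user's recently operated entities."""
    seen = set(recent)
    best = {}
    for entity in recent:
        for cand, score in matrix.get(entity, {}).items():
            if cand in seen:
                continue  # assumption: do not recommend what was just operated on
            best[cand] = max(best.get(cand, score), score)
    # Sort by descending similarity, breaking ties by entity ID.
    return sorted(best, key=lambda c: (-best[c], c))[:top_n]


# Hypothetical entity vectors.
vectors = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [0.7, 0.7]}
M = similarity_matrix(vectors)
candidates = recommend(M, recent=["A"], top_n=2)
print(candidates)
```

A dense matrix over all entity pairs grows quadratically with the number of entities, so a production system would likely keep only the top few neighbors per entity; that choice is outside what the source specifies.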
step S401, a user logs in a certain E-commerce APP through terminal equipment;
step S402, the user performs behavior operations on entities in the e-commerce APP through the terminal device, which may include, for example, browsing or purchasing;
step S403, after receiving the behavior operation of the entity by the user, the server generates a user behavior record according to the behavior operation;
step S404, the server segments the user behavior records according to their generation time to form visits (sessions);
step S405, the server forms an entity relationship network according to each visit;
step S406, the server obtains an entity path set according to the entity relationship network;
step S407, the server obtains an entity vector according to the entity path set;
step S408, the server calculates entity similarity among the entities according to the entity vector;
step S409, the server obtains an entity similarity matrix according to the entity similarity, generates a recommendation result according to the entity similarity matrix, and sends the recommendation result to the terminal equipment;
and step S410, the terminal equipment receives the recommendation result and then displays the recommendation result so that the user can conduct behavior operation again on the entity included in the recommendation result.
In the entity similarity calculation method provided by the example embodiment of the present disclosure, an entity relationship network is first extracted from large-scale sparse user behavior data; an unbiased entity sequence is then generated by sampling on the entity relationship network, so that the entity vectorized representation calculated on this basis has higher accuracy; finally, calculating similarity from the entity vectorized representation improves the timeliness of the calculation, and the improved accuracy of the correlation calculation provides a better user experience.
The disclosure also provides an entity similarity calculation device. Referring to fig. 5, the entity similarity calculation means may include: a behavior record segmentation module 510, an entity relationship network determination module 520, and a similarity determination module 530. Wherein:
the behavior record segmentation module 510 may be configured to obtain a user behavior record generated when a user performs a behavior operation on each entity, and segment the user behavior record to obtain a plurality of segmentation results.
The entity relationship network determining module 520 may be configured to obtain an entity relationship network between the entities according to the segmentation results, and obtain an entity path set of each entity according to the entity relationship network.
The similarity determining module 530 may be configured to obtain an entity vector of each entity according to the entity path set, and obtain a similarity between the entities according to each entity vector.
In an exemplary embodiment of the present disclosure, the obtaining a plurality of segmentation results by segmenting the user behavior record includes:
sequencing the user behavior records according to the generation time of the user behavior records; segmenting the sorted user behavior records according to a preset time condition to obtain a plurality of segmentation results; and the time difference between the generation times corresponding to any two user behavior records in each segmentation result is not greater than a preset time threshold.
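The sort-then-segment procedure above can be sketched as follows. This is a minimal sketch, assuming the records belong to a single user, each record is a `(timestamp_seconds, entity_id)` tuple, and a new segment starts whenever a record is more than the threshold later than the first record of the current segment, which guarantees that any two records in a segment differ by no more than the threshold. The name `segment_records` and the 1800-second threshold are hypothetical.

```python
def segment_records(records, max_span):
    """Sort records by generation time and split them into segments whose
    first and last records are at most `max_span` seconds apart."""
    ordered = sorted(records, key=lambda r: r[0])
    segments = []
    for record in ordered:
        # Compare against the first record of the current segment so that
        # every pair of records in a segment satisfies the threshold.
        if segments and record[0] - segments[-1][0][0] <= max_span:
            segments[-1].append(record)
        else:
            segments.append([record])
    return segments


# Hypothetical behavior records for one user: (timestamp in seconds, entity ID).
records = [(100, "A"), (0, "B"), (1900, "C"), (2000, "D"), (1700, "E")]
segments = segment_records(records, max_span=1800)
for seg in segments:
    print([entity for _, entity in seg])
```

Comparing each record only with its immediate predecessor would be a looser rule: it bounds the gap between neighbors but lets a long chain of closely spaced records exceed the threshold overall.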
In an exemplary embodiment of the present disclosure, obtaining an entity relationship network between the entities according to the segmentation results includes:
judging whether the relation between the user behavior records corresponding to any two entities meets a preset relation condition or not; when determining that the relationship between the user behavior records corresponding to any two entities meets the preset relationship condition, determining that entity connection exists between the two entities; and obtaining an entity relationship network between the entities according to entity connection between the entities existing in the segmentation results.
In an exemplary embodiment of the disclosure, the preset relationship condition is that any two entities belong to the same segmentation result, and user behavior records corresponding to any two entities are located at adjacent positions in the same segmentation result.
In an exemplary embodiment of the present disclosure, the entity similarity calculation apparatus further includes:
the entity connection number calculation module may be configured to calculate the number of the entity connections confirmed to exist between any two entities in each of the segmentation results.
The weight confirmation module may be configured to obtain the weight of the entity connection of any two entities in the entity relationship network according to the number of the entity connections confirmed to exist between any two entities.
In an exemplary embodiment of the present disclosure, obtaining an entity path set corresponding to each of the entities according to the entity relationship network includes:
step S10, randomly selecting an entity i as a starting point of an entity path corresponding to the entity i, and selecting, as a next node, the entity j that has the largest number of entity connections with the entity i in the entity relationship network;
step S20, traversing other entities in the entity relationship network, and obtaining a plurality of other nodes according to the weights of the other entities and the entity connection of the entity i in the entity relationship network;
step S30, obtaining an entity path corresponding to the entity i according to the other nodes, the entity j and the entity i;
step S40, the step S10 to the step S30 are circulated until entity paths corresponding to all the entities included in the entity relationship network are obtained;
step S50, obtaining the entity path set corresponding to each entity according to the entity path corresponding to each entity.
In an exemplary embodiment of the present disclosure, obtaining an entity vector of each entity according to the entity path set includes:
and processing the entity path corresponding to each entity in the entity path set by using a vectorization processing tool to obtain an entity vector of each entity.
In an exemplary embodiment of the present disclosure, obtaining a similarity between the entities according to the entity vectors includes:
and calculating cosine values among the entity vectors, and taking the cosine values as the similarity among the entities.
In an exemplary embodiment of the present disclosure, the entity similarity calculation apparatus further includes:
the similarity matrix determining module may be configured to obtain a similarity matrix corresponding to the user behavior record according to a similarity between the entities;
and the target recommendation data determining module can be used for obtaining target recommendation data corresponding to the user behavior record according to the similarity matrix.
Hereinafter, the item recommendation system according to the exemplary embodiment of the present invention will be explained with reference to fig. 6.
Referring to fig. 6, the item recommendation system may include a server 601 and a terminal device 602. Wherein:
the server 601 may be configured to obtain a user behavior record generated when a user performs behavior operations on each entity and segment the user behavior record to obtain a plurality of segmentation results; and
obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network; and
obtaining entity vectors of the entities according to the entity path set, and obtaining similarity between the entities according to the entity vectors; and
obtaining target recommendation data corresponding to the user behavior record according to the similarity between the entities;
the terminal device 602 is in communication connection with the server, and may be configured to receive behavior operations performed on each entity by the user; and
and displaying the target recommendation data.
Further, in order to reduce the stress on the server, the item recommendation system may further include a storage device 603. The storage device may be configured to store the user behavior record and the recommendation result.
The specific details of each module in the entity similarity calculation apparatus have been described in detail in the corresponding entity similarity calculation method, and therefore are not described herein again.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the invention. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Moreover, although the steps of the methods of the present invention are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to make a computing device (which can be a personal computer, a server, a mobile terminal, or a network device, etc.) execute the method according to the embodiment of the present invention.
In an exemplary embodiment of the present invention, there is also provided an electronic device capable of implementing the above method.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 700 according to this embodiment of the invention is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. The components of the electronic device 700 may include, but are not limited to: at least one processing unit 710, at least one storage unit 720, and a bus 730 that couples various system components including the storage unit 720 and the processing unit 710.
Wherein the storage unit stores program code that is executable by the processing unit 710 such that the processing unit 710 performs the steps according to various exemplary embodiments of the present invention as described in the above section "exemplary method" of the present specification. For example, the processing unit 710 may perform step S110 as shown in fig. 1: acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results; step S120: obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network; step S130: and obtaining entity vectors of the entities according to the entity path set, and obtaining the similarity between the entities according to the entity vectors.
The storage unit 720 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 7201 and/or a cache memory unit 7202, and may further include a read only memory unit (ROM) 7203.
The storage unit 720 may also include a program/utility 7204 having a set (at least one) of program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 730 may be any representation of one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 700 may also communicate with one or more external devices 800 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 700, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 700 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 750. Also, the electronic device 700 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 760. As shown, the network adapter 760 communicates with the other modules of the electronic device 700 via the bus 730. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to make a computing device (which can be a personal computer, a server, a terminal device, or a network device, etc.) execute the method according to the embodiment of the present invention.
In an exemplary embodiment of the present invention, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary methods" of the present description, when said program product is run on the terminal device.
The program product for implementing the above method may employ a portable compact disc read-only memory (CD-ROM), include program code, and be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

Claims (13)

1. An entity similarity calculation method, comprising:
acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results;
obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network;
and obtaining entity vectors of the entities according to the entity path set, and obtaining the similarity between the entities according to the entity vectors.
2. The entity similarity calculation method according to claim 1, wherein the step of segmenting the user behavior record into a plurality of segmentation results comprises:
sequencing the user behavior records according to the generation time of the user behavior records;
segmenting the sorted user behavior records according to a preset time condition to obtain a plurality of segmentation results;
and the time difference between the generation times corresponding to any two user behavior records in each segmentation result is not greater than a preset time threshold.
3. The method of calculating entity similarity according to claim 2, wherein obtaining the entity relationship network between the entities according to the segmentation results comprises:
judging whether the relation between the user behavior records corresponding to any two entities meets a preset relation condition or not;
when determining that the relationship between the user behavior records corresponding to any two entities meets the preset relationship condition, determining that entity connection exists between the two entities;
and obtaining an entity relationship network between the entities according to entity connection between the entities existing in the segmentation results.
4. The entity similarity calculation method according to claim 3, wherein the preset relationship condition is that any two entities belong to the same segmentation result, and the user behavior records corresponding to the two entities are located at adjacent positions in the same segmentation result.
5. The entity similarity calculation method according to claim 3, further comprising:
calculating the number of confirmed entity connections between any two entities in each segmentation result;
and obtaining the weight of the entity connection of any two entities in the entity relationship network according to the number of the entity connection confirmed to exist between any two entities.
6. The entity similarity calculation method according to claim 5, wherein obtaining the entity path set corresponding to each of the entities according to the entity relationship network comprises:
step S10, randomly selecting an entity i as a starting point of an entity path corresponding to the entity i, and selecting, as a next node, the entity j that has the largest number of entity connections with the entity i in the entity relationship network;
step S20, traversing other entities in the entity relationship network, and obtaining a plurality of other nodes according to the weights of the other entities and the entity connection of the entity i in the entity relationship network;
step S30, obtaining an entity path corresponding to the entity i according to the other nodes, the entity j and the entity i;
step S40, the step S10 to the step S30 are circulated until entity paths corresponding to all the entities included in the entity relationship network are obtained;
step S50, obtaining the entity path set corresponding to each entity according to the entity path corresponding to each entity.
7. The method of claim 1, wherein obtaining the entity vector of each entity according to the entity path set comprises:
and processing the entity path corresponding to each entity in the entity path set by using a vectorization processing tool to obtain an entity vector of each entity.
8. The method of claim 1, wherein obtaining the similarity between the entities according to the entity vectors comprises:
and calculating cosine values among the entity vectors, and taking the cosine values as the similarity among the entities.
9. The entity similarity calculation method according to claim 7, further comprising:
obtaining a similarity matrix corresponding to the user behavior record according to the similarity between the entities;
and obtaining target recommendation data corresponding to the user behavior record according to the similarity matrix.
10. An entity similarity calculation apparatus, comprising:
the behavior record segmentation module is used for acquiring user behavior records generated when a user performs behavior operation on each entity, and segmenting the user behavior records to obtain a plurality of segmentation results;
an entity relationship network determining module, configured to obtain an entity relationship network between the entities according to the segmentation results, and obtain an entity path set of each entity according to the entity relationship network;
and the similarity determining module is used for obtaining entity vectors of the entities according to the entity path set and obtaining the similarity between the entities according to the entity vectors.
11. An item recommendation system, comprising:
the server is used for acquiring user behavior records generated when the user performs behavior operation on each entity and segmenting the user behavior records to obtain a plurality of segmentation results; and
obtaining an entity relationship network between the entities according to the segmentation results, and obtaining an entity path set of the entities according to the entity relationship network; and
obtaining entity vectors of the entities according to the entity path set, and obtaining similarity between the entities according to the entity vectors; and
obtaining target recommendation data corresponding to the user behavior record according to the similarity between the entities;
the terminal equipment is in communication connection with the server and is used for receiving behavior operation of the user on each entity; and
and displaying the target recommendation data.
12. A computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the entity similarity degree calculation method according to any one of claims 1 to 9.
13. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the entity similarity calculation method of any one of claims 1-9 via execution of the executable instructions.
CN201911024620.8A 2019-10-25 2019-10-25 Entity similarity calculation method and device, article recommendation system, medium and equipment Pending CN110782287A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911024620.8A CN110782287A (en) 2019-10-25 2019-10-25 Entity similarity calculation method and device, article recommendation system, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911024620.8A CN110782287A (en) 2019-10-25 2019-10-25 Entity similarity calculation method and device, article recommendation system, medium and equipment

Publications (1)

Publication Number Publication Date
CN110782287A true CN110782287A (en) 2020-02-11

Family

ID=69386718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911024620.8A Pending CN110782287A (en) 2019-10-25 2019-10-25 Entity similarity calculation method and device, article recommendation system, medium and equipment

Country Status (1)

Country Link
CN (1) CN110782287A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111526037A (en) * 2020-03-23 2020-08-11 北京三快在线科技有限公司 Configuration method and device of network node, electronic equipment and readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6266649B1 (en) * 1998-09-18 2001-07-24 Amazon.Com, Inc. Collaborative recommendations using item-to-item similarity mappings
CA2760308A1 (en) * 2009-04-29 2010-11-04 Amazon Technologies, Inc. Generating recommendations based on similarities between location information of multiple users
CN103208073A (en) * 2012-01-17 2013-07-17 阿里巴巴集团控股有限公司 Method and device for obtaining recommend commodity information and providing commodity information
CN103400286A (en) * 2013-08-02 2013-11-20 世纪禾光科技发展(北京)有限公司 Recommendation system and method for user-behavior-based article characteristic marking
KR20160064448A (en) * 2014-11-28 2016-06-08 이종찬 A recommendation method for items by using preference prediction of their similar group
KR101871747B1 (en) * 2017-04-07 2018-06-27 주식회사 화성 Similarity tendency based user-sightseeing recommendation system and method thereof
CN108898459A (en) * 2018-06-25 2018-11-27 中国联合网络通信集团有限公司 A kind of Method of Commodity Recommendation and device
CN109242164A (en) * 2018-08-22 2019-01-18 中国平安人寿保险股份有限公司 Optimize method and device, the computer storage medium, electronic equipment in product path
CN109871491A (en) * 2019-03-20 2019-06-11 江苏满运软件科技有限公司 Forum postings recommended method, system, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109360012B (en) Advertisement delivery channel selection method and device, storage medium and electronic equipment
CN109389442A (en) Method of Commodity Recommendation and device, storage medium and electric terminal
US20160307133A1 (en) Quality prediction
US10699198B2 (en) Method and system for cold-start item recommendation
EP4242955A1 (en) User profile-based object recommendation method and device
US11775412B2 (en) Machine learning models applied to interaction data for facilitating modifications to online environments
CN111666275B (en) Data processing method and device, electronic equipment and storage medium
CN109934646B (en) Method and device for predicting associated purchasing behavior of new commodity
CN111209351B (en) Object relation prediction method, object recommendation method, object relation prediction device, object recommendation device, electronic equipment and medium
CN106600360B (en) Method and device for sorting recommended objects
CN107644102B (en) Data feature construction method and device, storage medium and electronic equipment
CN112990625A (en) Method and device for allocating annotation tasks and server
CN111026973B (en) Commodity interest degree prediction method and device and electronic equipment
CN110348581B (en) User feature optimizing method, device, medium and electronic equipment in user feature group
CN110782287A (en) Entity similarity calculation method and device, article recommendation system, medium and equipment
CN107562461B (en) Feature calculation system, feature calculation method, storage medium, and electronic device
CN114818843A (en) Data analysis method and device and computing equipment
CN106651408B (en) Data analysis method and device
CN115204931A (en) User service policy determination method and device and electronic equipment
CN113763080A (en) Method and device for determining recommended article, electronic equipment and storage medium
CN115392992A (en) Commodity recommendation method, terminal device and computer-readable storage medium
CN113159877A (en) Data processing method, device, system and computer readable storage medium
CN111598638A (en) Click rate determination method, device and equipment
CN111915339A (en) Data processing method, device and equipment
CN110879853A (en) Information vectorization method and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2020-02-11