CN112182357B

CN112182357B - Data recommendation method, device, computer equipment and storage medium

Info

Publication number: CN112182357B
Application number: CN201910599002.XA
Authority: CN
Inventors: 张新宇; 杜颖
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-07-04
Filing date: 2019-07-04
Publication date: 2023-10-31
Anticipated expiration: 2039-07-04
Also published as: CN112182357A

Abstract

The application relates to a data recommendation method, a data recommendation device, computer equipment, and a computer storage medium. The method comprises: obtaining a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user; determining each click object code of the source click sequence, wherein a click object code is the code of a data object of the source click sequence; fusing the click object codes to obtain the code of the source click sequence; determining a target click sequence corresponding chain-to-chain to the source click sequence, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition; and recommending a data object to the target user based on the target click sequence. Compared with a single clicked data object, the source click sequence reflects the real interests of the target user more comprehensively and allows shifts in the user's interest to be mined, so the accuracy of data object recommendation can be improved.

Description

Data recommendation method, device, computer equipment and storage medium

Technical Field

The present application relates to the field of computer information processing technologies, and in particular, to a data recommendation method, apparatus, computer device, and storage medium.

Background

With the rapid development of information technology, information processing technology has reached every aspect of daily life. For example, a recommendation system is a tool that associates users with items. Based on user interaction data, a recommendation system can help users filter the information they are interested in out of a large and varied volume of items, and can provide personalized information services. Typical examples include merchandise recommendation, news information recommendation, and article recommendation.

Conventional data recommendation methods generally recall similar data objects based on a single data object that the user clicked and accessed, and then recommend data objects that the user may like based on the recall result. A single clicked data object reflects the user's interests only partially, so conventional data recommendation methods suffer from low accuracy.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a data recommendation method, apparatus, computer device, and storage medium that improve data recommendation accuracy.

A data recommendation method, the method comprising:

acquiring a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user;

determining a clicking object code of the source clicking sequence, wherein the clicking object code is a code of a data object of the source clicking sequence;

fusing all click object codes to obtain the code of the source click sequence;

determining a target click sequence corresponding chain-to-chain to the source click sequence, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition;

based on the target click sequence, a data object is recommended to the target user.

In one embodiment, the source click sequence further comprises a sequence of data objects accessed by non-target user clicks; the recommending the data object to the target user based on the target click sequence comprises the following steps:

acquiring an effective click sequence of the target user;

determining the target click sequence corresponding to the effective source click sequence as a recall sequence; the effective source click sequence is the source click sequence to which each data object of the effective click sequence belongs respectively;

and recommending the data object to the target user according to the recall sequence.

In one embodiment, the determining the target click sequence corresponding to the valid source click sequence as a recall sequence includes:

converting the chain-to-chain correspondence of the source click sequence and the target click sequence into the point-to-chain correspondence of each data object of the source click sequence and the target click sequence;

and determining the target click sequence corresponding to each data object in the effective source click sequence in the point-to-chain corresponding relation as a recall sequence.

In one embodiment, the obtaining the valid click sequence of the target user includes:

when a recommendation request is received, determining the target user according to the recommendation request;

acquiring historical access information of the target user before receiving the recommendation request;

and determining an effective click sequence according to at least two data objects recently accessed by the target user in the historical access information.

In one embodiment, the determining a target click sequence corresponding to the source click sequence chain-to-chain includes:

determining the similarity of the source click sequence and the candidate click sequence;

determining the candidate click sequence with the similarity meeting the similarity condition as a quasi click sequence;

and according to the similarity, splicing at least two quasi click sequences with the same data object to obtain a target click sequence.

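In one embodiment, the determining a target click sequence corresponding to the source click sequence chain-to-chain includes:

performing cell division on the candidate click sequences;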

determining a core cell to which the source click sequence belongs;

determining a target click sequence corresponding to the source click sequence chain-to-chain based on the candidate click sequences in the core cells; the target click sequence corresponding to the source click sequence from chain to chain comprises candidate click sequences, wherein the similarity of the candidate click sequences with the source click sequence in the core cell meets the similarity condition.

In one embodiment, after determining the core cell to which the source click sequence belongs, the method further includes: determining additional cells that satisfy a distance condition from the core cell;

the determining a target click sequence corresponding to the source click sequence chain-to-chain based on the candidate click sequences in the core cells includes: determining a target click sequence corresponding to the source click sequence chain-to-chain based on the candidate click sequences in the core cell and the additional cells;

the target click sequence corresponding to the source click sequence from chain to chain further comprises candidate click sequences, wherein the candidate click sequences in the additional cells and the similarity of the candidate click sequences with the source click sequence meet the similarity condition.
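The cell-based search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the cell centroids are assumed to be given (for example from a clustering step that the text does not specify), the distance condition is assumed to be a Euclidean distance between centroids not exceeding a hypothetical `max_cell_distance`, and all function names are invented for illustration.

```python
import math

def nearest_cell(code, centroids):
    """Index of the centroid closest to code (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(code, centroids[i]))

def divide_into_cells(candidate_codes, centroids):
    """Assign each candidate click sequence id to the cell of its nearest centroid."""
    cells = {i: [] for i in range(len(centroids))}
    for seq_id, code in candidate_codes.items():
        cells[nearest_cell(code, centroids)].append(seq_id)
    return cells

def candidates_to_compare(source_code, cells, centroids, max_cell_distance):
    """Collect candidates from the core cell and from additional nearby cells."""
    # Core cell: the cell whose centroid is closest to the source sequence code.
    core = nearest_cell(source_code, centroids)
    # Additional cells: assumed distance condition on centroid-to-centroid distance.
    additional = [
        i for i in range(len(centroids))
        if i != core and math.dist(centroids[i], centroids[core]) <= max_cell_distance
    ]
    selected = list(cells[core])
    for i in additional:
        selected.extend(cells[i])
    return core, additional, selected

centroids = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
candidate_codes = {"c1": [0.1, 0.0], "c2": [0.9, 0.1], "c3": [5.0, 4.8]}
cells = divide_into_cells(candidate_codes, centroids)
core, additional, selected = candidates_to_compare([0.2, 0.1], cells, centroids, 1.5)
print(core, additional, selected)
```

Only the selected candidates then need to be checked against the similarity condition, which avoids comparing the source click sequence with every candidate click sequence.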

In one embodiment, the determining the respective click object encodings of the source click sequence includes:

acquiring a relation matrix of each data object and an object label in a data object pool;

performing matrix decomposition on the relation matrix to obtain a matrix decomposition result;

and determining each click object code based on the matrix decomposition result.

In one embodiment, when the object code update condition is triggered, a corresponding relation matrix of each data object in the data object pool and the object label is obtained;

and carrying out matrix decomposition on the relation matrix, wherein the obtained matrix decomposition result comprises the codes of all the data objects in the data object pool.

In one embodiment, the matrix decomposition is performed on the relation matrix, and the obtained matrix decomposition result further includes the codes of the object labels in the data object pool;

the determining, based on the matrix decomposition result, the coding of each click object further includes:

weighting and summing the codes of all object labels of the newly added data object to obtain the codes of the newly added data object;

and the newly added data object is a data object which is newly added into the data object pool after the object coding update condition is triggered.
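The object coding embodiments above can be sketched as follows. The text does not name a particular matrix decomposition, so this sketch assumes a simple gradient-descent factorization of the object-label relation matrix into object codes and label codes; the rank, learning rate, regularization, and all function names are illustrative assumptions. A newly added data object is then coded as the weighted sum of the codes of its labels.

```python
import random

def factorize(relation, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Approximate relation (objects x labels) by P * Q^T using gradient descent.

    Rows of P are data object codes; rows of Q are object label codes.
    """
    rng = random.Random(seed)
    n_obj, n_lab = len(relation), len(relation[0])
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_obj)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_lab)]
    for _ in range(epochs):
        for i in range(n_obj):
            for j in range(n_lab):
                pred = sum(P[i][f] * Q[j][f] for f in range(rank))
                err = relation[i][j] - pred
                for f in range(rank):
                    p, q = P[i][f], Q[j][f]
                    P[i][f] += lr * (err * q - reg * p)
                    Q[j][f] += lr * (err * p - reg * q)
    return P, Q

def squared_error(relation, P, Q):
    """Sum of squared differences between relation and its reconstruction."""
    return sum(
        (relation[i][j] - sum(p * q for p, q in zip(P[i], Q[j]))) ** 2
        for i in range(len(relation)) for j in range(len(relation[0]))
    )

def encode_new_object(label_weights, label_codes):
    """Code of a newly added object: weighted sum of the codes of its labels.

    label_weights maps a label index to its (assumed) weight for the object.
    """
    rank = len(label_codes[0])
    return [sum(w * label_codes[j][f] for j, w in label_weights.items())
            for f in range(rank)]

# Rows are data objects, columns are object labels; 1 means the object has the label.
relation = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
object_codes, label_codes = factorize(relation)
new_code = encode_new_object({0: 0.5, 2: 0.5}, label_codes)
print(len(object_codes), len(label_codes), len(new_code))
```

Coding a new data object from its label codes avoids re-running the decomposition each time an object is added between updates.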

In one embodiment, the fusing the codes of the clicking objects to obtain the codes of the source clicking sequences includes:

acquiring data characteristics of each data object of the source click sequence;

according to the data characteristics, determining the weighting weight of each click object code of the source click sequence;

and carrying out weighted fusion on each click object code according to the weighted weight to obtain the codes of the source click sequence.

In one embodiment, the data characteristic includes access heat;

the step of determining the weighted weight of each click object code of the source click sequence according to the data characteristics comprises the following steps:

determining the weight of each data object of the source click sequence according to the access heat of each data object of the source click sequence;

the greater the access heat, the smaller the weighting weight.

A data recommendation device, the device comprising:

a source sequence acquisition module, used for acquiring a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user;

the object coding module is used for determining each click object code of the source click sequence, wherein the click object code is the code of the data object of the source click sequence;

the code fusion module is used for fusing the click object codes to obtain the code of the source click sequence;

the target sequence determining module is used for determining a target click sequence corresponding chain-to-chain to the source click sequence, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition;

and the object recommending module is used for recommending the data object to the target user based on the target click sequence.

A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:

fusing all click object codes to obtain codes of the source click sequence;

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

fusing all click object codes to obtain codes of the source click sequence;

Based on the data recommendation method, the data recommendation device, the computer equipment, and the storage medium of this embodiment, a source click sequence is obtained, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user; each click object code of the source click sequence is determined, wherein a click object code is the code of a data object of the source click sequence; the click object codes are fused to obtain the code of the source click sequence; a target click sequence corresponding chain-to-chain to the source click sequence is determined, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition; and a data object is recommended to the target user based on the target click sequence. In this manner, data objects are recommended to the target user based on the target click sequence corresponding chain-to-chain to the source click sequence. Compared with a single clicked data object, the source click sequence reflects the real interests of the target user more comprehensively and allows shifts in the user's interest to be mined, so the accuracy of data object recommendation can be improved.

Drawings

FIG. 1 is a diagram of an application environment of a data recommendation method in one embodiment;

FIG. 2 is a flow chart of a data recommendation method according to one embodiment;

FIG. 3 is an application scenario diagram of a data recommendation method in an embodiment;

FIG. 4 is an application scenario diagram of a data recommendation method in an embodiment;

FIG. 5 is a block diagram of a data recommendation device in one embodiment;

FIG. 6 is a schematic diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

FIG. 1 is a diagram of an application environment of a data recommendation method in one embodiment. The data recommendation method provided by the application can be applied to the application environment shown in FIG. 1, in which the user terminal 102 and the target terminal 104 communicate with the server 108 via a network. The user terminal 102 and the target terminal 104 may be desktop devices or mobile terminals, such as desktop computers, tablet computers, and smart phones. The server 108 may be a stand-alone physical server, a cluster of physical servers, or a virtual server.

The data recommendation method of one embodiment of the present application may run on the server 108. The user terminal 102 and the target terminal 104 access data objects on the server 108. Each click that accesses a data object generates an access record, so that source click sequences of non-target users and source click sequences of the target user are formed on the server 108. The server 108 obtains a source click sequence comprising a sequence of data objects clicked and accessed by the target user; determines each click object code of the source click sequence, wherein a click object code is the code of a data object of the source click sequence; fuses the click object codes to obtain the code of the source click sequence; determines a target click sequence corresponding chain-to-chain to the source click sequence, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition; and recommends a data object to the target user based on the target click sequence.

As shown in FIG. 2, in one embodiment, a data recommendation method is provided. The method may run on the server 108 in FIG. 1. The data recommendation method comprises the following steps:

S202, acquiring a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user.

The server obtains a source click sequence. A source click sequence is a sequence of data objects that a user clicked and accessed. The users may include the target user, so that a target click sequence corresponding chain-to-chain to the target user's source click sequence can be determined and data objects can then be recommended to the target user.

A source click sequence may refer to a sequence of data objects that a user clicked and accessed within a fixed time period. The fixed time period may be determined at a regular interval, for example every six hours or every other day.

The source click sequence is the base click sequence of the chain-to-chain correspondence: starting from the source click sequence, the corresponding target click sequence is determined chain-to-chain. A click sequence may include a user identification and at least two data object identifications. A click sequence may be determined from the historical access information of a user. For example, after the historical access information of the target user is obtained, consecutive click accesses whose time interval is not greater than a preset interval (for example, 20 minutes or 15 minutes) are grouped into one click sequence.

As shown in Table 1, the first row is a sample of a user's access information records. For example, (U1, A1, T1) indicates that user U1 clicked to access data object A1 at time T1, where U1 is a user identification, A1 is a data object identification, and T1 is an access time. The three access records in the first row indicate that user U1 clicked to access data object A1 at T1, data object A2 at T2, and data object A3 at T3. The second row is a click sequence sample: if the time interval between every two consecutive click accesses among the three data objects does not exceed the preset interval, the server merges the three access records into one click sequence, (U1, A1, A2, A3).

Table 1. Data statistics sample table

Access information sample	(U1, A1, T1), (U1, A2, T2), (U1, A3, T3) ……
Click sequence sample	(U1, A1, A2, A3) ……
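The merging of access records into click sequences illustrated by Table 1 can be sketched as follows. This is a minimal illustration, assuming that access times are numeric timestamps in seconds and that the preset interval is 20 minutes; the function name `build_click_sequences` and the sample records are hypothetical.

```python
from collections import defaultdict

def build_click_sequences(records, max_gap_seconds=20 * 60):
    """Group (user_id, object_id, timestamp) records into click sequences.

    Consecutive clicks by the same user stay in one sequence while the gap
    between them does not exceed max_gap_seconds. Sequences with fewer than
    two data objects are dropped.
    """
    by_user = defaultdict(list)
    for user_id, object_id, ts in records:
        by_user[user_id].append((ts, object_id))

    sequences = []
    for user_id, clicks in by_user.items():
        clicks.sort()  # order each user's clicks by access time
        current = [clicks[0][1]]
        last_ts = clicks[0][0]
        for ts, object_id in clicks[1:]:
            if ts - last_ts <= max_gap_seconds:
                current.append(object_id)
            else:
                if len(current) >= 2:
                    sequences.append((user_id, *current))
                current = [object_id]
            last_ts = ts
        if len(current) >= 2:
            sequences.append((user_id, *current))
    return sequences

# Assumed sample: the last click happens long after the first three.
records = [("U1", "A1", 0), ("U1", "A2", 600), ("U1", "A3", 1500), ("U1", "A9", 9000)]
print(build_click_sequences(records))  # [('U1', 'A1', 'A2', 'A3')]
```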

The data objects may be news information, articles, merchandise information, and the like. A data object may be identified by a data object identification, such as an article number, a news information number, or a merchandise number.

S204, determining the clicking object codes of the source clicking sequence, wherein the clicking object codes are codes of data objects of the source clicking sequence.

The server determines each click object code of the source click sequence. At least two data object identifications may be included in the source click sequence, i.e. the number of data objects of the source click sequence is at least 2.

Each data object of the source click sequence can be encoded by an encoder, so that each click object code of the source click sequence is obtained.

Each data object may be encoded in combination with the data characteristics of the data object. The data characteristics may include at least one of a data tag and the data content of the data object. Data tags are labels that categorize data objects, such as place names, food, restaurants, snacks, bars, hotels, and celebrities.

S206, fusing the codes of the clicking objects to obtain the codes of the source clicking sequences.

The server fuses the click object codes of the source click sequence to obtain the code of the source click sequence.

The click object codes may be fused by weighted fusion. For example, the click object codes in the source click sequence are fused according to their weights. The weights may be determined based on data characteristics of each data object of the source click sequence, such as access heat.
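A weighted fusion of this kind can be sketched as follows. The weighting formula is an assumption: the text only requires that a data object with greater access heat receive a smaller weight, and `1 / (1 + heat)` followed by normalization is one simple choice that satisfies this. The function name `fuse_click_codes` is hypothetical.

```python
def fuse_click_codes(codes, heats):
    """Fuse per-object codes into one sequence code by a weighted average.

    The weight 1 / (1 + heat) is an illustrative choice: it only guarantees
    that a more frequently accessed object gets a smaller weight.
    """
    if not codes or len(codes) != len(heats):
        raise ValueError("codes and heats must be non-empty and equal in length")
    raw = [1.0 / (1.0 + h) for h in heats]
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(codes[0])
    return [sum(w * c[i] for w, c in zip(weights, codes)) for i in range(dim)]

codes = [[1.0, 0.0], [0.0, 1.0]]  # codes of two clicked data objects
heats = [0, 3]                    # the second object is accessed more often
print(fuse_click_codes(codes, heats))
```

In this example the less popular first object dominates the fused code, which reflects the idea that a click on a rarely accessed object says more about the user's specific interests than a click on a widely accessed one.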

The click object codes may also be fused by splicing. For example, the click object codes of the source click sequence are spliced in sequence order. As another example, the click object codes of the source click sequence are spliced on the content that is repeated among them.

S208, determining a target click sequence corresponding chain-to-chain to the source click sequence. The code of the target click sequence and the code of the source click sequence satisfy a similarity condition.

The server determines candidate click sequences that satisfy the similarity condition with the source click sequence as the target click sequences corresponding chain-to-chain to the source click sequence. A candidate click sequence is a click sequence of a user other than the target user within the time period corresponding to the source click sequence. The time period corresponding to the source click sequence is the time period containing the times at which the data objects in the source click sequence were clicked and accessed. Time periods may be divided at a preset interval, such as every two hours, every hour, or every six hours.

A code of a target click sequence that satisfies the similarity condition with the code of the source click sequence may be the code of a candidate click sequence whose similarity to the code of the source click sequence is greater than or equal to a preset value. Alternatively, the codes of the candidate click sequences may be sorted in descending order of their similarity to the code of the source click sequence, and the codes ranked within a preset number of top positions satisfy the similarity condition.
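Both forms of the similarity condition can be sketched as follows. The text does not name a similarity measure, so cosine similarity is used here as an assumption; the function names, the threshold, and the sample codes are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_similar(source_code, candidates, threshold=None, top_k=None):
    """Return (candidate_id, similarity) pairs, most similar first.

    candidates maps a candidate click sequence id to its code. A threshold
    keeps candidates with similarity >= threshold; top_k keeps the first
    top_k candidates in descending order of similarity.
    """
    scored = sorted(
        ((cid, cosine(source_code, code)) for cid, code in candidates.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    if threshold is not None:
        scored = [pair for pair in scored if pair[1] >= threshold]
    if top_k is not None:
        scored = scored[:top_k]
    return scored

candidates = {"c1": [1.0, 0.0], "c2": [1.0, 1.0], "c3": [0.0, 1.0]}
by_threshold = select_similar([1.0, 0.0], candidates, threshold=0.7)
by_rank = select_similar([1.0, 0.0], candidates, top_k=1)
print([cid for cid, _ in by_threshold], [cid for cid, _ in by_rank])
```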

S210, recommending the data object to the target user based on the target click sequence.

After determining the target click sequences corresponding chain-to-chain to the source click sequence, the server may store each data object of these target click sequences in a data object list. The data object list may be used as the recall result, which is reordered and scattered before data objects are recommended to the target user.

The server may also take each click sequence within the fixed time period as a source click sequence and determine the target click sequences of each source click sequence. Then, for each data object in the effective click sequence of the target user, the server finds the source click sequence to which the data object belongs and the target click sequences of that source click sequence. The server stores each data object of these target click sequences in a data object list. The data object list may be used as the recall result, which is reordered and scattered before data objects are recommended to the target user.

The server may recommend data objects to the target user each time it receives a data recommendation request from the target user. Data objects may also be recommended to the target user at preset intervals.

Based on the data recommendation method of this embodiment, a source click sequence is obtained, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user; each click object code of the source click sequence is determined, wherein a click object code is the code of a data object of the source click sequence; the click object codes are fused to obtain the code of the source click sequence; a target click sequence corresponding chain-to-chain to the source click sequence is determined, wherein the code of the target click sequence and the code of the source click sequence satisfy a similarity condition; and a data object is recommended to the target user based on the target click sequence. In this manner, data objects are recommended to the target user based on the target click sequence corresponding chain-to-chain to the source click sequence. Compared with a single clicked data object, the source click sequence reflects the real interests of the target user more comprehensively and allows shifts in the user's interest to be mined, so the accuracy of data object recommendation can be improved.

The accuracy of data object recommendation may be determined from statistics on the probability that the target user clicks a recommended data object.

In one embodiment, the source click sequence further comprises a sequence of data objects accessed by non-target user clicks; recommending a data object to a target user based on the target click sequence, comprising: acquiring an effective click sequence of a target user; determining a target click sequence corresponding to the effective source click sequence as a recall sequence; the effective source click sequence is a source click sequence to which each data object of the effective click sequence belongs respectively; and recommending the data object to the target user according to the recall sequence.

In this embodiment, the source click sequences include the click sequences of all users within a fixed time period, and a target click sequence is determined chain-to-chain for each source click sequence. The effective click sequence of the target user may be a sequence of data objects that the target user clicked and accessed before the current time, and may reflect the interests of the target user.

In this embodiment, the server determines the source click sequence to which each data object of the effective click sequence belongs, takes the target click sequences corresponding to these source click sequences as recall sequences, and recommends data objects to the target user according to the recall sequences. For example, the server may store each data object of the recall sequences in a data object list, use the list as the recall result, and recommend data objects to the target user after reordering and scattering the recall result. To find the source click sequence to which a data object belongs, the server may use the data object identification in the effective click sequence to search the source click sequences for a data object with the same identification. If one exists, the target click sequence corresponding to that source click sequence is taken as the target click sequence corresponding to the data object of the effective click sequence.

Based on the data recommendation method of this embodiment, the server obtains the corresponding source click sequence for each data object in the effective click sequence, and then obtains the target click sequences chain-to-chain from each source click sequence. In this way, the chain-to-chain correspondence from source click sequence to target click sequence is used to establish a point-to-chain correspondence from each data object in the effective click sequence to the target click sequences. The effective click sequence and the target click sequence are thus related through two layers of chain-to-chain correspondence: the first layer is between the effective click sequence and the source click sequence, and the second layer is between the source click sequence and the target click sequence. The interests of the target user are therefore reflected more comprehensively and accurately, which further improves the accuracy of data object recommendation.

In one embodiment, determining the target click sequence corresponding to the valid source click sequence as the recall sequence includes: converting the chain-to-chain correspondence of the source click sequence and the target click sequence into the point-to-chain correspondence of each data object of the source click sequence and the target click sequence; and determining the target click sequence corresponding to each data object in the effective source click sequence in the point-to-chain corresponding relation as a recall sequence.

Converting the chain-to-chain correspondence between the source click sequence and the target click sequence into the point-to-chain correspondence between each data object of the source click sequence and the target click sequence may consist of associating each data object in the source click sequence with the target click sequence corresponding to that source click sequence.

By converting the chain-to-chain correspondence between the source click sequence and the target click sequence into the point-to-chain correspondence between each data object of the source click sequence and the target click sequence, and determining the target click sequence corresponding to each data object of the effective source click sequence in the point-to-chain correspondence as a recall sequence, the speed of looking up the target click sequence for each data object in the effective click sequence can be improved. This speeds up the determination of the recall sequences of the effective click sequence and thus the response speed of data object recommendation.
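The conversion from chain-to-chain to point-to-chain correspondence can be sketched as an index keyed by data object. This is an illustrative sketch: click sequences are represented as tuples of data object identifications without the user identification, and the names `build_point_to_chain` and `recall_sequences` are hypothetical.

```python
from collections import defaultdict

def build_point_to_chain(chain_to_chain):
    """Map each data object of a source sequence to that sequence's targets.

    chain_to_chain maps a source click sequence (tuple of object ids) to a
    list of target click sequences (tuples of object ids).
    """
    index = defaultdict(list)
    for source_seq, target_seqs in chain_to_chain.items():
        for obj in source_seq:
            for target in target_seqs:
                if target not in index[obj]:
                    index[obj].append(target)
    return dict(index)

def recall_sequences(effective_seq, index):
    """Look up the recall sequences for each data object of the effective sequence."""
    recalled = []
    for obj in effective_seq:
        for target in index.get(obj, []):
            if target not in recalled:
                recalled.append(target)
    return recalled

chain_to_chain = {
    ("A1", "A2", "A3"): [("A4", "A5", "A8")],
    ("A2", "A6"): [("A7", "A9")],
}
index = build_point_to_chain(chain_to_chain)
print(recall_sequences(("A2", "A10"), index))  # [('A4', 'A5', 'A8'), ('A7', 'A9')]
```

With the index built in advance, serving a request needs only one dictionary lookup per data object of the effective click sequence instead of a scan over all source click sequences.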

In one application scenario, the data recommendation method is applied to a news recommendation system. As shown in FIG. 3, the news recommendation system includes a target terminal and a server, and the server is provided with a user portrait service, a ranking (Rank) service, a re-ranking (Rerank) service, a chain-to-chain recall service, and other recall services. The user portrait service provides user portrait data to support the recall services, the ranking service, and the re-ranking service. A recall service may recall data objects of interest to the user from a data object pool in combination with the user portrait; the chain-to-chain recall service determines recall results from the data object pool based on the target click sequence corresponding chain-to-chain to the source click sequence, as described above. The ranking service ranks the recall results, and after the ranked results are scattered, the re-ranking service re-ranks the scattered results. The server recommends the re-ranked results to the target terminal on which the target user is logged in. After the user clicks and accesses a recommended data object, the target terminal can collect behavior statistics and report them to the server as feedback on the recommendation result. The server may thus receive feedback after recommending data objects to the target user and revise the recall services based on the feedback. For the chain-to-chain recall service, for example, the codes of the data objects may be updated, the fusion mode or the weights used to fuse the click object codes may be updated, and the similarity condition may be updated.

In one embodiment, obtaining a valid click sequence for a target user includes: when a recommendation request is received, determining a target user according to the recommendation request; acquiring historical access information of a target user before receiving a recommendation request; and determining the effective click sequence according to at least two data objects recently accessed by the target user in the historical access information.

The recommendation request carries a request user identifier, and the server can determine the target user according to the request user identifier in the recommendation request.

The server may obtain, based on the requesting user identification, the historical access information of the target user from before the recommendation request was received. The access information may be an access record that the server records upon receiving a click access request from the target user, or a click access record that is fed back by the terminal on which the target user is logged in and received by the server.

In this embodiment, at least two data objects recently accessed by the target user are determined to be data objects of an effective click sequence, so that the determined effective click sequence has better timeliness, and therefore, the data objects recommended based on the effective click sequence have better timeliness, more conform to the current interests of the target user, and have higher accuracy.
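Determining the effective click sequence from the most recent accesses can be sketched as follows. This is a minimal illustration: the record layout `(user_id, object_id, timestamp)`, the function name, and the default of three recent objects are assumptions, since the text only requires at least two recently accessed data objects.

```python
def effective_click_sequence(history, user_id, request_ts, n_recent=3):
    """Return the target user's most recently accessed data objects.

    history is a list of (user_id, object_id, timestamp) access records.
    Only records of user_id from before request_ts are considered, and an
    empty list is returned if fewer than two data objects are available.
    """
    if n_recent < 2:
        raise ValueError("an effective click sequence needs at least two data objects")
    clicks = sorted(
        (ts, obj) for uid, obj, ts in history if uid == user_id and ts < request_ts
    )
    recent = [obj for _, obj in clicks[-n_recent:]]
    return recent if len(recent) >= 2 else []

history = [
    ("U1", "A1", 10), ("U1", "A2", 20), ("U2", "A7", 25),
    ("U1", "A3", 30), ("U1", "A4", 50),
]
print(effective_click_sequence(history, "U1", request_ts=40, n_recent=2))  # ['A2', 'A3']
```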

In one embodiment, determining a target click sequence corresponding to a source click sequence chain-to-chain includes: determining the similarity between the source click sequence and the candidate click sequence; determining the candidate click sequence with the similarity meeting the similarity condition as a quasi click sequence; and splicing at least two quasi-click sequences with the same data object according to the similarity to obtain a target click sequence.

A candidate click sequence is a click sequence belonging to a user other than the user of the source click sequence. Candidate click sequences are acquired at the same time and in the same manner as the source click sequence; only the user differs. Each source click sequence serves as a candidate click sequence when other users are analyzed. For example, suppose the click sequence of user U1 is (U1, A1, A2, A3) and the click sequence of user U2 is (U2, A4, A5, A1). When analyzing user U1, (U1, A1, A2, A3) is the source click sequence and (U2, A4, A5, A1) is a candidate click sequence. When analyzing user U2, (U2, A4, A5, A1) is the source click sequence and (U1, A1, A2, A3) is a candidate click sequence.

In other embodiments, the candidate click sequences whose similarity satisfies the similarity condition may be determined directly as target click sequences. In this embodiment, click sequences that contain the same data object are additionally spliced, which refines the target click sequence and makes the data recommendation result more accurate. The candidate click sequences whose similarity satisfies the similarity condition may be the candidate click sequences whose similarity is greater than or equal to a preset value, or the candidate click sequences ranked within a preset number of top positions when sorted by similarity in descending order.
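The two forms of the similarity condition, a preset threshold and a preset number of top-ranked positions, can be sketched as below. The function names and the dictionary of precomputed similarities are assumptions made for illustration.

```python
def select_by_threshold(similarities, min_similarity):
    """Keep candidates whose similarity is at least the preset value."""
    return [cid for cid, sim in similarities.items() if sim >= min_similarity]


def select_top_n(similarities, n):
    """Keep the n candidates ranked highest by similarity (descending order)."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in ranked[:n]]


# Hypothetical similarities between one source click sequence and three candidates.
similarities = {"c1": 0.9, "c2": 0.8, "c3": 0.4}
by_threshold = select_by_threshold(similarities, 0.7)
top_two = select_top_n(similarities, 2)
```

With these values both conditions select the same two candidates; in general the two variants can differ, for example when fewer than `n` candidates reach the threshold.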

Further, the splicing may be performed according to the similarity. For example, assume that the source click sequence is (A1, A2, A3) and that the similarity condition is a similarity greater than 0.7. Assume that candidate click sequence 1 is (A4, A5, A8), with a similarity to the source click sequence of 0.9, and that candidate click sequence 2 is (A6, A7, A8), with a similarity to the source click sequence of 0.8. Candidate click sequence 1 and candidate click sequence 2 are then both quasi-click sequences, and both contain the same data object A8. Splicing the two quasi-click sequences according to the similarity yields a target click sequence that may be (A4, A5, A8, A6, A7), or (A4, A5, A8, A6), or (A4, A5, A8, A7). The principle for splicing at least two quasi-click sequences having the same data object according to the similarity may be as follows: when the length of the target click sequence is not limited, all data objects in the quasi-click sequences having the same data object are retained; when the length of the target click sequence is limited, the data objects in the quasi-click sequence with the greater similarity are retained first, and the data objects in the quasi-click sequence with the smaller similarity are then retained at random or according to another preset rule, for example in order of click access time.

Because the same data object exists in different quasi-click sequences, the fact that the two quasi-click sequences have certain correlation is indicated, and therefore the two quasi-click sequences are spliced, the spliced target click sequences are more accurate, and therefore the accuracy of data object recommendation can be further improved.
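A minimal sketch of similarity-ordered splicing is given below. It assumes one particular preset rule for the length-limited case, namely keeping objects of the less similar sequence in their original order; the text also allows random selection or ordering by click time. The function name `splice` and the `max_length` parameter are hypothetical.

```python
def splice(quasi_sequences, max_length=None):
    """Splice quasi-click sequences that share at least one data object.

    quasi_sequences: list of (sequence, similarity) pairs.
    Objects from more similar sequences are kept first; duplicates are dropped.
    """
    first = set(quasi_sequences[0][0])
    common = first.intersection(*(seq for seq, _ in quasi_sequences[1:]))
    if not common:
        raise ValueError("quasi-click sequences must share a data object")
    ordered = sorted(quasi_sequences, key=lambda pair: pair[1], reverse=True)
    merged = []
    for seq, _ in ordered:
        for obj in seq:
            if obj not in merged:
                merged.append(obj)
    # Assumed rule when the length is limited: truncate in splice order.
    return merged if max_length is None else merged[:max_length]


quasi = [(["A6", "A7", "A8"], 0.8), (["A4", "A5", "A8"], 0.9)]
unlimited = splice(quasi)
limited = splice(quasi, max_length=4)
```

With unlimited length every object of both sequences is retained; with `max_length=4` the more similar sequence is retained in full and one object of the less similar sequence is appended.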

In one embodiment, determining a target click sequence corresponding to a source click sequence chain-to-chain includes: performing cell division on the candidate click sequences; determining a core cell to which the source click sequence belongs; and determining the target click sequence based on the candidate click sequences in the core cell. The target click sequence corresponding to the source click sequence chain-to-chain includes candidate click sequences that are in the core cell and whose similarity to the source click sequence satisfies the similarity condition.

The candidate click sequences may be binned based on similarity between the candidate click sequences, each bin being considered a cluster. The cell to which the source click sequence belongs is referred to as a core cell. And determining the candidate click sequence with the similarity meeting the similarity condition with the source click sequence in the core cell as a target click sequence. Thus, the target click sequence meeting the similar condition with the source click sequence is not required to be determined from all the candidate click sequences, and the target click sequence is only required to be determined from the candidate click sequence in a smaller cell. Therefore, the speed of determining the target click sequence can be improved, and the response speed of data recommendation can be improved.

In one embodiment, after determining the core cell to which the source click sequence belongs, the method further includes: additional cells that satisfy the distance condition from the core cell are determined. Determining a target click sequence corresponding to a source click sequence chain-to-chain based on candidate click sequences in the core cells, comprising: a target click sequence corresponding to the source click sequence chain-to-chain is determined based on candidate click sequences in the core cell and the additional cells. The target click sequence corresponding to the source click sequence from chain to chain further comprises candidate click sequences, wherein the candidate click sequences are in additional cells and have similarity with the source click sequence meeting the similarity condition.

The additional cell is a cell satisfying a distance condition with the core cell, for example, may be a cell having a distance from the core cell smaller than a preset value. Thus, the target click sequence meeting the similar condition with the source click sequence is not required to be determined from all the candidate click sequences, and the target click sequence is only required to be determined from the candidate click sequences in a small number of cells. Therefore, the speed of determining the target click sequence can be improved, the response speed of data recommendation is improved, and meanwhile, the accuracy of the target click sequence is ensured.
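The cell-based search can be sketched in plain Python as follows. The sketch assumes that cell division has already produced one centroid per cell, that the distance condition is a Euclidean distance between centroids below a preset value, and that similarity is cosine similarity; none of these specific choices is stated in the text, and all function and parameter names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nearest_cell(code, centroids):
    return min(range(len(centroids)), key=lambda i: euclidean(code, centroids[i]))


def search_cells(source_code, candidate_codes, centroids,
                 max_cell_distance, min_similarity):
    # Cell division: assign every candidate sequence code to its nearest centroid.
    cells = {}
    for cid, code in candidate_codes.items():
        cells.setdefault(nearest_cell(code, centroids), []).append(cid)
    # Core cell: the cell to which the source sequence code belongs.
    core = nearest_cell(source_code, centroids)
    # Additional cells: cells whose centroid is close enough to the core centroid.
    probe = [core] + [
        i for i in range(len(centroids))
        if i != core and euclidean(centroids[i], centroids[core]) < max_cell_distance
    ]
    # Only candidates in the probed cells are compared with the source.
    hits = [(cid, cosine(source_code, candidate_codes[cid]))
            for cell in probe for cid in cells.get(cell, [])]
    hits = [hit for hit in hits if hit[1] >= min_similarity]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


centroids = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
candidates = {"c1": (0.9, 0.1), "c2": (0.2, 0.9), "c3": (-0.8, 0.1)}
result = search_cells((1.0, 0.2), candidates, centroids,
                      max_cell_distance=1.5, min_similarity=0.3)
target_ids = [cid for cid, _ in result]
```

In this example the third cell is farther from the core cell than the preset distance, so candidate `c3` is never compared with the source sequence.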

In one particular embodiment, the target click sequence corresponding to the source click sequence chain-to-chain may be determined using Faiss. Faiss is a library developed by Facebook AI Research that provides efficient similarity search and clustering for dense vectors, and it can quickly find vectors close to a target vector among a large number of vectors. Therefore, the target click sequence can be determined rapidly, and the response speed of data recommendation is improved.

In one embodiment, determining each click object code of the source click sequence comprises: acquiring a relation matrix of each data object and an object label in a data object pool; performing matrix decomposition on the relation matrix to obtain a matrix decomposition result; and determining each click object code based on the matrix decomposition result.

In this embodiment, a relation matrix is established between the data objects in the data object pool and their object labels. When determining the codes of the data objects of the source click sequence, a matrix decomposition result is obtained by decomposing the relation matrix, and the codes of the data objects of the source click sequence, namely the click object codes, are then determined based on the matrix decomposition result.

The data object pool contains the data objects that need to be considered when data recommendation is carried out, and it is highly time-sensitive. Over time, data objects in the data object pool may be moved out, and data objects outside the data object pool may be newly added to it. For example, a data object that has not been clicked by any user for a continuous period of time may be removed from the data object pool, and any data object newly added to the system needs to be added to the data object pool.

In the relation matrix, the data objects and the object labels in the data object pool may be regarded as the rows and columns of the matrix, respectively. In the matrix entries, when a data object has a certain object label, the position at the corresponding row and column may be marked as 1. For example, suppose that the data object pool includes data objects A and B and object labels a and b, that the object label of data object A includes a, and that the object label of data object B includes b. This relation matrix can be expressed as follows (rows: data objects A and B; columns: object labels a and b):

$$R = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Decomposing the relation matrix yields the codes of data objects A and B and the codes of object labels a and b, such that the inner product of the code of data object A and the code of object label a is equal to 1, the inner product of the code of data object A and the code of object label b is equal to 0, the inner product of the code of data object B and the code of object label a is equal to 0, and the inner product of the code of data object B and the code of object label b is equal to 1.

Based on the matrix decomposition result, each click object code of the source click sequence may be determined. Specifically, for each data object of the source click sequence, the code of the same data object in the data object pool is determined as the code of that data object, namely the click object code. Because the codes are obtained by matrix decomposition of the relation matrix, they are semantic-level codes of the data objects; that is, each click object code is a semantic-level code.

According to this data recommendation method, both the data objects and the object labels are taken into account when the data objects are coded, so that the codes reflect the correlation between data objects more closely and the coding of the data objects is more reasonable. Consequently, the code of the source click sequence is more reasonable, the target click sequence determined to correspond chain-to-chain to the source click sequence is more reasonable, and the data objects recommended based on the target click sequence are more accurate.
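The decomposition of an object-label relation matrix into object codes and label codes can be illustrated with a small gradient-descent factorization. The text does not specify which matrix decomposition algorithm is used; stochastic gradient descent on the squared reconstruction error is an assumption chosen here for brevity, and the function name and hyperparameters are hypothetical.

```python
import random


def factorize(relation, k=2, epochs=2000, lr=0.1, seed=0):
    """Factorize relation (objects x labels) into k-dimensional codes.

    Returns (object_codes, label_codes) such that the inner product of
    object_codes[i] and label_codes[j] approximates relation[i][j].
    """
    rng = random.Random(seed)
    n_objects, n_labels = len(relation), len(relation[0])
    object_codes = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_objects)]
    label_codes = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_labels)]
    for _ in range(epochs):
        for i in range(n_objects):
            for j in range(n_labels):
                predicted = sum(p * q for p, q in zip(object_codes[i], label_codes[j]))
                error = relation[i][j] - predicted
                for d in range(k):
                    p, q = object_codes[i][d], label_codes[j][d]
                    # Gradient step on the squared error for this matrix entry.
                    object_codes[i][d] += lr * error * q
                    label_codes[j][d] += lr * error * p
    return object_codes, label_codes


def inner(a, b):
    return sum(x * y for x, y in zip(a, b))


# Rows: data objects A and B; columns: object labels a and b.
relation = [[1, 0], [0, 1]]
object_codes, label_codes = factorize(relation)
```

After training, the inner product of the code of data object A with the code of label a should be close to 1 and its inner product with the code of label b close to 0, and similarly for data object B.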

In one embodiment, when the object code updating condition is triggered, a corresponding relation matrix of each data object in the data object pool and the object label is obtained; and carrying out matrix decomposition on the relation matrix, wherein the obtained matrix decomposition result comprises the codes of all the data objects in the data object pool.

The object code update condition may include updating once every preset time interval, and may further include updating upon receiving an object code update instruction, which may be sent to the server by an administrator or developer. This provides a mechanism for updating the data object codes and improves the stability of the coding. The reason is that the data in the data object pool, on which the matrix decomposition is performed, is continuously updated: as time goes by, some data objects are moved out of the data object pool and new data objects are added to it. If the original codes were kept unchanged indefinitely, the coding could not be continued. Providing a coding scheme with an update mechanism therefore improves the stability of the coding as a whole and, in turn, the accuracy of data recommendation.

In one embodiment, the matrix decomposition is performed on the relation matrix, and the obtained matrix decomposition result further comprises the codes of the object labels in the data object pool; based on the matrix decomposition result, determining the coding of each click object, and further comprising: weighting and summing the codes of all object labels of the newly added data object to obtain the codes of the newly added data object; the newly added data object is the data object which is newly added into the data object pool after the object code update condition is triggered.

When the object code update condition is triggered, the decomposition result obtained by decomposing the relation matrix includes the codes of the data objects that are in the data object pool at that time, but does not include the codes of data objects added to the data object pool after the object code update condition was triggered. To keep the existing codes unchanged in this case, the codes of the object labels of a newly added data object can be weighted and summed to obtain the code of the newly added data object. In this way, the newly added data object can be encoded, and its code remains consistent with the matrix decomposition result.

Data objects may be highly time-sensitive, for example when the data objects are news information or articles. New news information, articles and other data objects are continuously added to the data object pool. To encode a data object newly added to the data object pool, the codes of its object labels are weighted and summed, and the code of the newly added data object is determined from the result. Thus, the encoding of data objects can support highly time-sensitive data objects while maintaining the stability of the coding.

Further, the weighted summation is carried out on the codes of the object labels of the newly added data objects to obtain the codes of the newly added data objects, which comprises the following steps:

performing a weighted summation on the codes of the object labels of the newly added data object to obtain a first code; determining, according to the first code, data objects in the data object pool that are close in distance to the first code; and performing a weighted summation on the codes of those data objects to obtain a second code, which is used as the code of the newly added data object. When determining the data objects in the data object pool that are close in distance to the first code, the comparison is made in the dimension of the object label codes. Thus, the consistency between the code of the newly added data object and the codes of the data objects already in the data object pool can be ensured.

For example, suppose a data object pool includes articles 1, 2, 3, 4, 5, and so on; article 1 contains tag 1, tag 2 and tag 3; article 2 contains tag 4, tag 2 and tag 3; and article 3 contains tag 1, tag 5 and tag 3. The codes of articles 1, 2, 3, 4 and 5 obtained by matrix decomposition are A1, A2, A3, A4 and A5, respectively, and the codes of tags 1, 2, 3, 4 and 5 are t1, t2, t3, t4 and t5, respectively. Assume that the newly added article 6 contains tag 1, tag 2 and tag 3. All articles are encoded in the tag code dimension, so that the code of article 1 in that dimension is (t1+t2+t3)/3, the code of article 6 is (t1+t2+t3)/3, and so on. From this code of article 6, the data objects in the data object pool that are close in distance to it, such as articles 1, 2 and 3, are determined. The codes of these data objects are then weighted and summed to obtain the second code (A1+A2+A3)/3. That is, the final code of the newly added article 6 is (A1+A2+A3)/3. In this example the weighted sum uses equal weights, that is, an average. In other embodiments the weights may be determined in other ways, for example according to the distance between each nearby data object and the first code.
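The two-stage encoding of a newly added article can be sketched as below, following the article and tag example. All numeric codes are made-up two-dimensional values, article 4 and its tags are added only to provide a distant article for contrast, and the equal-weight average and the fixed neighbor count `n_neighbors` are assumptions.

```python
import math


def mean(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def encode_new_object(new_tags, tag_codes, object_tags, object_codes, n_neighbors=3):
    # First code: equal-weight sum of the new object's tag codes.
    first_code = mean([tag_codes[t] for t in new_tags])
    # Encode every existing object in the same tag code dimension.
    tag_space = {obj: mean([tag_codes[t] for t in tags])
                 for obj, tags in object_tags.items()}
    # Existing objects closest to the first code in the tag code dimension.
    neighbors = sorted(tag_space,
                       key=lambda obj: distance(tag_space[obj], first_code))[:n_neighbors]
    # Second code: equal-weight sum of the neighbors' matrix-decomposition codes.
    second_code = mean([object_codes[obj] for obj in neighbors])
    return first_code, neighbors, second_code


# Hypothetical tag codes t1..t5 and article codes A1..A4.
tag_codes = {1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0), 4: (-1.0, 0.0), 5: (0.0, -1.0)}
object_tags = {1: [1, 2, 3], 2: [4, 2, 3], 3: [1, 5, 3], 4: [4, 5]}
object_codes = {1: (0.9, 0.3), 2: (0.3, 0.6), 3: (0.6, 0.0), 4: (-0.9, -0.6)}

first_code, neighbors, code_6 = encode_new_object([1, 2, 3], tag_codes,
                                                  object_tags, object_codes)
```

Here `code_6` is the average of the codes of articles 1, 2 and 3, which corresponds to (A1+A2+A3)/3 in the example above.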

In one embodiment, fusing the codes of the clicking objects to obtain the codes of the source clicking sequences includes: acquiring data characteristics of each data object of the source click sequence; according to the data characteristics, determining the weighting weight of each click object code of the source click sequence; and carrying out weighted fusion on each click object code according to the weighted weight to obtain the code of the source click sequence.

The data characteristics of the data object may be any one or more of an object tag of the data object, a category of the data object, a content of the data object, a heat of access of the data object, and so forth. Different weights can be given to different object labels, types, contents and access hotness, then the weighting weights of the respective click object codes of the source click sequence are determined based on the weights, and the respective click object codes are weighted and fused according to the weighting weights, so that the codes of the source click sequence are obtained. The weighted weights are determined based on the data characteristics, and the relationship between the data objects and other data objects in the source click sequence can be more accurately embodied than by adopting uniform weighted fusion. Thus, the resulting source click sequence is more reasonably encoded. Therefore, the determined target click sequence can be more reasonable, and the accuracy of recommending the data object to the target user based on the target click sequence can be improved.

In one embodiment, the data characteristic includes access heat. According to the data characteristics, determining the weighted weight of each click object code of the source click sequence comprises the following steps: determining the weight of each data object of the source click sequence according to the access heat of each data object of the source click sequence; the greater the access heat, the smaller the weighting weight.

In this embodiment, fusing the codes of the clicking objects to obtain the codes of the source clicking sequences includes: acquiring the access heat of each data object of the source click sequence; determining the weighting weight of each click object code of the source click sequence according to the access hotness; and carrying out weighted fusion on each click object code according to the weighted weight to obtain the code of the source click sequence.

The access heat may be represented by the exposure count and the click access count of the data object. Because data objects with high access heat appear in a large number of click sequences, giving them lower weighting weights prevents them from interfering with the accuracy of the click sequence code, so that the data objects in the sequence can describe the user's points of interest. The code of the source click sequence is therefore more reasonable, the determined target click sequence is more reasonable, and the accuracy of data object recommendation is improved.
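A minimal sketch of heat-weighted fusion follows. The text only requires that a greater access heat give a smaller weighting weight; the specific function `1 / log(e + heat)` and the normalization of the weights are assumptions chosen for illustration.

```python
import math


def heat_weight(access_heat):
    # Hypothetical decreasing function: the weight shrinks as access heat grows.
    return 1.0 / math.log(math.e + access_heat)


def fuse_click_codes(codes, access_heats):
    """Weighted fusion of click object codes into one click sequence code."""
    weights = [heat_weight(h) for h in access_heats]
    total = sum(weights)
    dim = len(codes[0])
    return [sum(w * c[d] for w, c in zip(weights, codes)) / total for d in range(dim)]


# Two clicked objects: a niche one (heat 0) and a very popular one (heat 1000).
sequence_code = fuse_click_codes([[1.0, 0.0], [0.0, 1.0]], [0, 1000])
# With equal heat the fusion reduces to a plain average.
uniform_code = fuse_click_codes([[1.0, 0.0], [0.0, 1.0]], [5, 5])
```

In the first call the fused code is dominated by the niche object, which is the intended effect of down-weighting popular objects.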

As shown in fig. 4, in a specific application scenario, the data recommendation method is applied to article recommendation. It is assumed that each article corresponds to a vector of k dimensions. Article information is extracted from the article pool for the current time period, and matrix decomposition is performed on the relation matrix of articles and article tags. After matrix decomposition, each article is mapped to a vector, namely the code of the article. To encode a sequence of clicked articles, the codes of the articles in the sequence are weighted to form the click sequence code. The weight may be based on the access heat of the article: the higher the access heat, the lower the weight. Because popular articles appear in a large number of click sequences, reducing their weight prevents them from interfering with the accuracy of the click sequence code, so that the articles in the sequence can describe the user's points of interest. To maintain the stability of the coding, the article codes are updated at regular intervals. Articles are highly time-sensitive, and new articles continuously enter the article pool. A new article has no code when it first enters the article pool, so it must be encoded in order to introduce its information into the click sequence. After the code of the source click sequence is obtained, Faiss may be used to find target click sequences similar to the source click sequence. There may be multiple target click sequences, which gives a mapping from a single chain to multiple chains.

The list of articles of each target click sequence is taken as a recall result. Before the recall result is stored, the mapping from the source click sequence to the list is converted into a mapping from each article in the source click sequence to the list, and this mapping is stored in Redis so that online data recommendation requests can be answered quickly. Redis is an in-memory data store. The recalled articles are ranked and re-ranked, and then recommended to the target user.
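The conversion from a chain-to-chain mapping to a per-article mapping can be sketched as follows. A plain dictionary stands in for the Redis store so that the example is self-contained; the function names and the order-preserving deduplication are assumptions.

```python
def to_point_to_chain(chain_to_chain):
    """Convert {source sequence: [target sequences]} into {article: recall list}."""
    store = {}  # stand-in for a key-value store such as Redis
    for source_seq, target_seqs in chain_to_chain.items():
        # Recall result: the articles of all target sequences, without duplicates.
        recall = []
        for target in target_seqs:
            for article in target:
                if article not in recall:
                    recall.append(article)
        # Every article of the source sequence maps to that recall list.
        for article in source_seq:
            bucket = store.setdefault(article, [])
            for item in recall:
                if item not in bucket:
                    bucket.append(item)
    return store


def recall_for(effective_sequence, store):
    """Merge the recall lists of the articles in a user's effective click sequence."""
    merged = []
    for article in effective_sequence:
        for item in store.get(article, []):
            if item not in merged:
                merged.append(item)
    return merged


chain_to_chain = {
    ("A1", "A2", "A3"): [("A4", "A5", "A8"), ("A6", "A7", "A8")],
    ("A2", "A9"): [("A10", "A4")],
}
store = to_point_to_chain(chain_to_chain)
recalled = recall_for(["A9", "A1"], store)
```

At request time only key lookups by article are needed, which is why the per-article form suits a fast in-memory store.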

It should be understood that, although the steps in the flowchart of fig. 2 are shown in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of the steps is not strictly limited to this order, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 5, there is provided a data recommendation device corresponding to the above data recommendation method, including:

a source sequence acquisition module 502, configured to acquire a source click sequence, where the source click sequence includes a sequence of data objects that are clicked and accessed by a target user;

an object encoding module 504, configured to determine each click object encoding of the source click sequence, where the click object encoding is an encoding of a data object of the source click sequence;

a code fusion module 506, configured to fuse the click object codes to obtain a code of the source click sequence;

a target sequence determination module 508, configured to determine a target click sequence corresponding to the source click sequence chain-to-chain; the code of the target click sequence and the code of the source click sequence meet similar conditions;

and the object recommending module 510 is configured to recommend a data object to the target user based on the target click sequence.

Based on the data recommendation device of the embodiment, a source click sequence is acquired, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user; determining each click object code of the source click sequence, wherein the click object code is the code of the data object of the source click sequence; fusing the codes of the clicking objects to obtain codes of a source clicking sequence; determining a target click sequence corresponding to the source click sequence chain-to-chain; the coding of the target click sequence and the coding of the source click sequence meet similar conditions; based on the target click sequence, the data object is recommended to the target user. In this manner, data objects are recommended to the target user based on the target click sequence corresponding to the source click sequence chain-to-chain. Because the source click sequence is compared with the data object accessed by single click, the real interests of the target user can be more comprehensively reflected, and the interest conversion of the user can be mined, so that the accuracy of data object recommendation can be improved.

In one embodiment, the source click sequence further comprises a sequence of data objects clicked and accessed by non-target users. The device further comprises an effective sequence acquisition module and a recall sequence determination module, wherein:

the effective sequence acquisition module is used for acquiring an effective click sequence of the target user;

the recall sequence determining module is used for determining the target click sequence corresponding to the effective source click sequence as a recall sequence; the effective source click sequence is the source click sequence to which each data object of the effective click sequence belongs respectively;

and the object recommending module is used for recommending the data object to the target user according to the recall sequence.

In one embodiment, the recall sequence determination module comprises:

the relation conversion unit is used for converting the chain-to-chain correspondence of the source click sequence and the target click sequence into the point-to-chain correspondence of each data object of the source click sequence and the target click sequence;

and the recall sequence determining unit is used for determining the target click sequence corresponding to each data object in the effective source click sequence in the point-to-chain corresponding relation as a recall sequence.

In one embodiment, the active sequence acquisition module includes:

the target user determining unit is used for determining the target user according to the recommendation request when the recommendation request is received;

a history information acquisition unit, configured to acquire history access information of the target user before receiving the recommendation request;

and the effective sequence acquisition unit is used for determining an effective click sequence according to at least two data objects recently accessed by the target user in the historical access information.

In one embodiment, the target sequence determining module includes:

the similarity determining module is used for determining the similarity between the source click sequence and the candidate click sequence;

the quasi sequence determining module is used for determining the candidate click sequence with the similarity meeting the similarity condition as a quasi click sequence;

and the sequence splicing module is used for splicing at least two quasi-click sequences with the same data object according to the similarity to obtain a target click sequence.

In one embodiment, the target sequence determination module includes:

the cell division unit is used for carrying out cell division on the candidate click sequence;

A core unit determining unit, configured to determine a core cell to which the source click sequence belongs;

a target sequence determining unit configured to determine a target click sequence corresponding to the source click sequence chain-to-chain based on the candidate click sequences in the core cells; the target click sequence corresponding to the source click sequence from chain to chain comprises candidate click sequences, wherein the similarity of the candidate click sequences with the source click sequence in the core cell meets the similarity condition.

In one embodiment, the target sequence determining module further includes: an additional unit determining unit.

An additional unit determining unit configured to determine an additional cell satisfying a distance condition with the core cell;

a target sequence determining unit, configured to determine a target click sequence corresponding to the source click sequence chain-to-chain based on the candidate click sequences in the core cell and the additional cell;

In one embodiment, the object encoding module includes:

A relationship matrix acquisition unit, configured to acquire a relationship matrix of each data object and an object tag in the data object pool;

the matrix decomposition unit is used for carrying out matrix decomposition on the relation matrix to obtain a matrix decomposition result;

and the object code determining unit is used for determining each click object code based on the matrix decomposition result.

In one embodiment, the relationship matrix obtaining unit is configured to obtain a corresponding relationship matrix of each data object in the data object pool and the object tag when the object code update condition is triggered;

the matrix decomposition unit performs matrix decomposition on the relation matrix, and the obtained matrix decomposition result comprises codes of all data objects in the data object pool.

In one embodiment, the matrix decomposition unit performs matrix decomposition on the relationship matrix, and the obtained matrix decomposition result further includes the codes of the object labels in the data object pool;

the object encoding module further includes:

the new added object coding unit is used for carrying out weighted summation on the codes of all object labels of the new added data object to obtain the codes of the new added data object;

In one embodiment, the encoding fusion module 506 includes:

a data feature acquisition unit, configured to acquire a data feature of each data object of the source click sequence;

the weighting weight determining unit is used for determining the weighting weight of each click object code of the source click sequence according to the data characteristics;

and the code weighted fusion unit is used for carrying out weighted fusion on the click object codes according to the weighted weights to obtain the codes of the source click sequence.

In one embodiment, the data characteristic includes access heat;

the weighted weight determining unit is used for determining the weighted weight of each data object of the source click sequence according to the access heat of each data object of the source click sequence; the greater the access heat, the smaller the weighting weight.

As shown in fig. 6, in one embodiment, a computer device is provided, which may be a server. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external computer device through a network connection. The computer program is executed by a processor to implement a data recommendation method.

It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a computer device is provided. The computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the data recommendation method when executing the computer program.

The application provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

fusing all click object codes to obtain codes of the source click sequence;

acquiring an effective click sequence of the target user;

performing cell division on the candidate click sequence;

determining a core cell to which the source click sequence belongs;

In one embodiment, the determining the respective click object encodings of the source click sequence includes:

In one embodiment, the data characteristic includes access heat;

the greater the access heat, the smaller the weighting weight.
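The weighted fusion described above can be illustrated with a minimal sketch. The text only requires that the weighting weight decrease as the access heat increases; the specific formula used here (1 / log(e + heat), normalised to sum to 1), the function names, and the sample data are all assumptions made for illustration and are not taken from the application.

```python
import math


def weighting_weights(access_heat):
    """Return one weighting weight per data object.

    Assumption: weight = 1 / log(e + heat), normalised so the weights sum to 1.
    Any monotonically decreasing function of access heat would satisfy
    "the greater the access heat, the smaller the weighting weight".
    """
    raw = [1.0 / math.log(math.e + heat) for heat in access_heat]
    total = sum(raw)
    return [w / total for w in raw]


def fuse_click_object_codes(codes, access_heat):
    """Fuse the click object codes into one code for the source click sequence
    by taking their weighted sum, dimension by dimension."""
    weights = weighting_weights(access_heat)
    dim = len(codes[0])
    return [sum(w * code[i] for w, code in zip(weights, codes)) for i in range(dim)]


# Hypothetical 2-dimensional codes for three clicked data objects.
codes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical access heat; the first object is far "hotter" than the others.
access_heat = [10000, 10, 10]

weights = weighting_weights(access_heat)
sequence_code = fuse_click_object_codes(codes, access_heat)
```

In this sketch the very popular first object contributes less to the sequence code than the two less popular objects, so the fused code leans toward what distinguishes this user rather than toward what everyone clicks.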

In one embodiment, a computer readable storage medium is provided having stored thereon a computer program which when executed by a processor performs the steps of:

fusing all click object codes to obtain codes of the source click sequence;

acquiring an effective click sequence of the target user;

performing cell division on the candidate click sequence;

determining a core cell to which the source click sequence belongs;

In one embodiment, the data characteristic includes access heat;

the greater the access heat, the smaller the weighting weight.
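The cell division and core cell steps listed above can be sketched as follows. This is an illustrative sketch only: it assumes that the cell centers are already available (for example from a clustering step), that the similar condition is "smallest Euclidean distance between codes", and that a fixed number of nearest sequences is returned. The function names, the `top_k` parameter, and the sample data are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two codes of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cell(code, centers):
    """Index of the cell whose center is closest to the given code."""
    return min(range(len(centers)), key=lambda i: distance(code, centers[i]))


def divide_into_cells(candidate_codes, centers):
    """Cell division: assign every candidate click sequence to its nearest cell."""
    cells = {i: [] for i in range(len(centers))}
    for seq_id, code in candidate_codes.items():
        cells[nearest_cell(code, centers)].append(seq_id)
    return cells


def find_target_sequences(source_code, candidate_codes, centers, top_k=2):
    """Search only the core cell of the source click sequence and return the
    top_k candidates whose codes are closest to the source code."""
    cells = divide_into_cells(candidate_codes, centers)
    core = nearest_cell(source_code, centers)  # core cell of the source sequence
    ranked = sorted(cells[core], key=lambda s: distance(source_code, candidate_codes[s]))
    return ranked[:top_k]


# Hypothetical cell centers and candidate click sequence codes.
centers = [[0.0, 0.0], [10.0, 10.0]]
candidate_codes = {
    "seq_a": [0.5, 0.2],
    "seq_b": [9.0, 9.5],
    "seq_c": [1.0, 1.0],
    "seq_d": [0.1, 0.1],
}
source_code = [0.4, 0.3]

core = nearest_cell(source_code, centers)
targets = find_target_sequences(source_code, candidate_codes, centers)
```

Restricting the comparison to the core cell avoids computing a distance against every candidate click sequence; a search over additional cells near the core cell could be added in the same way by ranking the members of several cells instead of one.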

Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may perform the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples illustrate only a few embodiments of the application. Although they are described in some detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A data recommendation method, the method comprising:

acquiring a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user and a sequence of data objects clicked and accessed by a non-target user;

fusing all click object codes to obtain codes of the source click sequence;

acquiring an effective click sequence of the target user;

2. The method of claim 1, wherein the determining the target click sequence corresponding to the valid source click sequence as a recall sequence comprises:

3. The method of claim 1, wherein the obtaining the valid click sequence for the target user comprises:

4. The method of claim 1, wherein the determining a target click sequence corresponding to the source click sequence chain-to-chain comprises:

5. The method of claim 1, wherein the determining a target click sequence corresponding to the source click sequence chain-to-chain comprises:

performing cell division on the candidate click sequence;

determining a core cell to which the source click sequence belongs;

6. The method according to claim 5, wherein:

after determining the core cell to which the source click sequence belongs, the method further comprises: determining additional cells that satisfy a distance condition from the core cell;

7. The method of claim 1, wherein the determining a click object code of the source click sequence comprises:

8. The method according to claim 7, wherein:

when the object code updating condition is triggered, a corresponding relation matrix of each data object in the data object pool and the object label is obtained;

9. The method according to claim 8, wherein:

performing matrix decomposition on the relation matrix, wherein the obtained matrix decomposition result also comprises codes of object labels in the data object pool;

10. The method of claim 1, wherein said fusing each of said click object codes to obtain a code of said source click sequence comprises:

11. The method according to claim 10, wherein:

the data characteristics include access heat;

the greater the access heat, the smaller the weighting weight.

12. A data recommendation device, the device comprising:

a source sequence acquisition module, used for acquiring a source click sequence, wherein the source click sequence comprises a sequence of data objects clicked and accessed by a target user and a sequence of data objects clicked and accessed by a non-target user;

13. The apparatus of claim 12, wherein the recall sequence determination module comprises:

14. The apparatus of claim 12, wherein the active sequence acquisition module comprises:

15. The apparatus of claim 12, wherein the target sequence determination module comprises:

16. The apparatus of claim 15, wherein the target sequence determination module comprises:

17. The apparatus according to claim 16, wherein the target sequence determination module further comprises: an additional cell determining unit;

18. The apparatus of claim 12, wherein the object encoding module comprises:

19. The apparatus according to claim 18, wherein the relationship matrix obtaining unit is configured to obtain a correspondence matrix between each data object in the data object pool and the object tag when the object code update condition is triggered;

20. The apparatus of claim 19, wherein the matrix decomposition unit performs matrix decomposition on the relationship matrix, and the obtained matrix decomposition result further includes a code of each object tag in the data object pool;

the object encoding module further includes:

21. The apparatus of claim 12, wherein the encoding fusion module comprises:

22. The apparatus of claim 21, wherein the data characteristic comprises access heat;

23. A computer device comprising a memory storing a computer program and a processor implementing the steps of the method of any one of claims 1 to 11 when the computer program is executed.

24. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of claims 1 to 11.