CN111506822A - Data coding and information recommendation method, device and equipment - Google Patents

Data coding and information recommendation method, device and equipment

Info

Publication number
CN111506822A
CN111506822A (application CN202010471060.7A; granted publication CN111506822B)
Authority
CN
China
Prior art keywords
data
vector
encoding
short-term behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010471060.7A
Other languages
Chinese (zh)
Other versions
CN111506822B (en)
Inventor
张琳
蔡捷
梁忠平
温祖杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010471060.7A priority Critical patent/CN111506822B/en
Publication of CN111506822A publication Critical patent/CN111506822A/en
Application granted granted Critical
Publication of CN111506822B publication Critical patent/CN111506822B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a data encoding and information recommendation method, apparatus, and device. The method includes: inputting state data used for determining an attention weight into a first neural network to obtain a first encoding vector; inputting long-term behavior data of a user and first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior into a second neural network to obtain a second encoding vector; inputting short-term behavior data of the user and second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior into a third neural network to obtain a third encoding vector; determining an attention weight vector based on the first encoding vector; and encoding the second encoding vector and the third encoding vector based on the attention weight vector.

Description

Data coding and information recommendation method, device and equipment
Technical Field
The present application relates to computer technology, and in particular to a data encoding and information recommendation method, apparatus, and device.
Background
When a user accesses a service system, the service system will generally recommend information to the user.
When recommending information, the system generally refers to the user's long-term behavior data and short-term behavior data so as to recommend information matching the user's expectations.
Disclosure of Invention
In view of the above, the present application discloses a data encoding method, including:
inputting state data used for determining an attention weight into a first neural network to obtain a first encoding vector;
inputting long-term behavior data of a user and first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior into a second neural network to obtain a second encoding vector;
inputting short-term behavior data of the user and second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior into a third neural network to obtain a third encoding vector;
determining an attention weight vector according to the first encoding vector;
and encoding the second encoding vector and the third encoding vector based on the attention weight vector.
In an embodiment, the method further includes:
further encoding, with the state data, the intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector.
In an embodiment, the encoding the second encoding vector and the third encoding vector based on the attention weight vector includes:
multiplying the attention weight vector by the second encoding vector to obtain a first result;
multiplying the third encoding vector by the result of subtracting the attention weight vector from 1 to obtain a second result;
and adding the first result and the second result.
In an embodiment, the determining an attention weight vector according to the first encoding vector includes:
normalizing the data of each dimension in the first encoding vector;
and constructing the attention weight vector based on the normalized data of each dimension.
In an embodiment, the constructing the attention weight vector based on the normalized data of each dimension includes:
taking the normalized data of each dimension as a numerator and the sum of the normalized data of all dimensions as a denominator to obtain a weight value corresponding to each dimension;
and constructing the attention weight vector based on the weight values corresponding to the respective dimensions.
In an embodiment, the further encoding, with the state data, of the intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector includes any one or a combination of the following:
splicing the intermediate vector with the first encoding vector;
adding the intermediate vector to the first encoding vector;
multiplying the intermediate vector by the first encoding vector.
In an embodiment, the first neural network, the second neural network, and the third neural network are networks, or attention mechanism networks, constructed based on any one or a combination of the following networks:
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, a long-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time reaches a preset duration;
a short-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time does not reach the preset duration;
the state data may include any one or any combination of the following:
system state data corresponding to the current encoding; business activity state data corresponding to the current encoding; user state data of the user.
The application also discloses an information recommendation method, which comprises the following steps:
acquiring long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining an attention weight of a target user;
wherein the first interval duration data includes interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior; the second interval duration data includes interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior;
encoding the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data by using any one of the data encoding methods disclosed in claims 1 to 8 to obtain an encoding result;
and determining recommendation information corresponding to the current recommendation based on the encoding result, and outputting the recommendation information.
The present application also discloses a data encoding apparatus, comprising:
a first encoding module, configured to input state data used for determining an attention weight into a first neural network to obtain a first encoding vector;
a second encoding module, configured to input long-term behavior data of a user and first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior into a second neural network to obtain a second encoding vector;
a third encoding module, configured to input short-term behavior data of the user and second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior into a third neural network to obtain a third encoding vector;
an attention weight determination module, configured to determine an attention weight vector according to the first encoding vector;
and a fourth encoding module, configured to encode the second encoding vector and the third encoding vector based on the attention weight vector.
In an embodiment, the apparatus further includes:
and a further encoding module, configured to further encode, with the state data, the intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector.
In an embodiment, the fourth encoding module is configured to:
multiply the attention weight vector by the second encoding vector to obtain a first result;
multiply the third encoding vector by the result of subtracting the attention weight vector from 1 to obtain a second result;
and add the first result and the second result.
In an embodiment, the attention weight determination module includes:
a normalization processing module, configured to normalize the data of each dimension in the first encoding vector;
and a construction module, configured to construct the attention weight vector based on the normalized data of each dimension.
In an embodiment, the construction module is configured to:
take the normalized data of each dimension as a numerator and the sum of the normalized data of all dimensions as a denominator to obtain a weight value corresponding to each dimension;
and construct the attention weight vector based on the weight values corresponding to the respective dimensions.
In an embodiment, the further encoding module is configured to perform any one or a combination of the following:
splicing the intermediate vector with the first coding vector;
adding the intermediate vector to the first encoded vector;
multiplying the intermediate vector by the first encoded vector.
In an embodiment, the first neural network, the second neural network, and the third neural network are networks, or attention mechanism networks, constructed based on any one or a combination of the following networks:
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, a long-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time reaches a preset duration;
a short-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time does not reach the preset duration;
the state data may include any one or any combination of the following:
system state data corresponding to the current encoding; business activity state data corresponding to the current encoding; user state data of the user.
The application also discloses an information recommendation apparatus, including:
an acquisition module, configured to acquire long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining an attention weight of a target user;
wherein the first interval duration data includes interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior; the second interval duration data includes interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior;
an encoding module, configured to encode the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data by using any one of the data encoding methods disclosed in claims 1 to 8 to obtain an encoding result;
and a recommendation module, configured to determine recommendation information corresponding to the current recommendation based on the encoding result and output the recommendation information.
The present application also discloses a data encoding device, the device comprising:
a processor;
a memory for storing processor-executable instructions;
the processor executes the executable instructions to implement the data encoding method disclosed in any one of the above embodiments.
The application also discloses an information recommendation device, the device comprising:
a processor;
a memory for storing processor-executable instructions;
the processor executes the executable instructions to implement the information recommendation method disclosed in any one of the above embodiments.
According to the above technical solution, in encoding the user's long-term and short-term behaviors, on the one hand, the user's long-term behavior data is fusion-encoded with the first interval duration data, and the user's short-term behavior data is fusion-encoded with the second interval duration data, so that more comprehensive feature information is extracted;
on the other hand, since the attention weight vector is determined from the first encoding vector, and the second encoding vector and the third encoding vector are fusion-encoded based on the attention weight vector, features of the long-term and short-term behaviors are fused with differentiated emphasis, so that more useful features are extracted.
Naturally, recommendation information determined from data encoded with this encoding method is also more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate one or more embodiments of the present application or the technical solutions in the related art, the drawings needed in the description of the embodiments or the related art are briefly described below. Obviously, the drawings in the following description show only some of the embodiments described in one or more embodiments of the present application; other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flowchart of a data encoding method shown in the present application;
FIG. 2 is a flowchart of an information recommendation method shown in the present application;
FIG. 3 is a flowchart of another information recommendation method shown in the present application;
FIG. 4 is a block diagram of an information recommendation network shown in the present application;
FIG. 5 is a block diagram of a data encoding apparatus shown in the present application;
FIG. 6 is a block diagram of an information recommendation apparatus shown in the present application;
FIG. 7 is a hardware structure diagram of a data encoding device shown in the present application;
FIG. 8 is a hardware structure diagram of an information recommendation device shown in the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It should further be understood that the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining," depending on the context.
When a user accesses a service system, the service system will generally recommend information to the user.
When recommending information, the system generally refers to the user's long-term behavior data and short-term behavior data so as to recommend information matching the user's expectations.
The following briefly describes how the business system makes information recommendations with reference to the user's long-term behavior data and short-term behavior data.
In the related art, the business system needs to encode the long-term behavior of the user and the short-term behavior of the user.
At this time, the service system usually encodes the long-term behavior of the user and the short-term behavior of the user to obtain a long-term behavior encoding vector and a short-term behavior encoding vector.
After the respective encoding is finished, the business system inputs the long-term behavior encoding vector and the short-term behavior encoding vector into a multi-classifier trained in advance for calculation, and information to be recommended is obtained. The multi-classifier is obtained by training based on sample data labeled with a plurality of recommendation information.
After the information to be recommended is obtained, the service system can output the information to be recommended to the user.
As can be seen from the above, in the related art, when encoding the user's long-term and short-term behaviors, only the features of the long-term behaviors and of the short-term behaviors are extracted separately. On the one hand, the extracted features are therefore incomplete; on the other hand, the extracted features are not screened and no emphasis is highlighted. Naturally, the accuracy of recommendation information determined from data encoded in this way is also low.
Based on this, the present application proposes a data encoding method. In encoding the user's long-term and short-term behaviors, on the one hand, features relating to the interval durations between adjacent long-term behaviors and between adjacent short-term behaviors are extracted; on the other hand, an attention weight is determined based on the state data used for determining the attention weight, and the long-term and short-term behaviors are fused with differentiated emphasis when extracting features, so that more beneficial features are extracted. Naturally, an information recommendation network fed with data encoded by this method recommends information more accurately.
The following description will be given with reference to specific examples.
Referring to FIG. 1, FIG. 1 is a flowchart of the data encoding method shown in the present application. As shown in FIG. 1, the method may include:
S102, inputting state data used for determining an attention weight into a first neural network to obtain a first encoding vector;
S104, inputting long-term behavior data of a user and first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior into a second neural network to obtain a second encoding vector;
S106, inputting short-term behavior data of the user and second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior into a third neural network to obtain a third encoding vector;
S108, determining an attention weight vector according to the first encoding vector;
S110, encoding the second encoding vector and the third encoding vector based on the attention weight vector.
The data encoding method may be deployed in any system in the form of a software apparatus, for example an information recommendation system.
It is understood that the system may be carried on any terminal device, such as a PC, a mobile terminal, or a tablet. Implementing the method usually requires computing power provided by the device that carries it.
The following description takes the system in which the method is deployed as the execution subject.
The state data is the data used for determining the attention weight. In practical applications, the state data may include any one or several of the following items:
system state data corresponding to the current encoding; business activity state data corresponding to the current encoding; user state data of the user.
In practical applications, the system state data corresponding to the current encoding may be state information of the system performing the encoding, for example the system time and system version at the time of encoding. The business activity state data corresponding to the current encoding may describe a business activity being promoted by the system at the time of encoding (for example, a bank card promotional activity). The user state data of the user may include characteristics of the user, such as age, interests, gender, and credit record.
When encoding the state data, the system may first fuse the state data. Ways of fusing include, but are not limited to, addition, multiplication, and concatenation.
For example, the system may concatenate the system state data corresponding to the current encoding, the business activity state data corresponding to the current encoding, and the user state data of the user to obtain fused state data.
After obtaining the fused state data, the system may input it into the first neural network for encoding to obtain the first encoding vector.
The first neural network may be a neural network constructed based on any one or a combination of the following networks (for the specific structures, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In order to extract more relevant features from the state data, in an embodiment the first neural network may be an attention mechanism network constructed based on any one or a combination of the following networks (for the specific structure of the attention mechanism network, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, the fused data may also be encoded without a neural network. For example, the system may preset an encoding rule (e.g., normalization or standardization), and during encoding map the fused data according to that rule to obtain the first encoding vector.
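To make the state-encoding step concrete, the following is a minimal sketch in PyTorch (the framework choice, the module name `StateEncoder`, and all dimensions are assumptions for illustration; the application itself prescribes no particular implementation). It fuses the three kinds of state data by concatenation and encodes the result with a small feed-forward network to produce the first encoding vector.

```python
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    """Hypothetical first neural network: encodes fused state data."""
    def __init__(self, state_dim: int, code_dim: int = 100):
        # state_dim: total dimension of the concatenated state features
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, code_dim),
            nn.ReLU(),
            nn.Linear(code_dim, code_dim),
        )

    def forward(self, system_state, activity_state, user_state):
        # Fusion by concatenation, as in the example above; addition or
        # multiplication are equally valid fusion choices per the text.
        fused = torch.cat([system_state, activity_state, user_state], dim=-1)
        return self.net(fused)  # the first encoding vector
```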
The long-term behavior data may be a behavior sequence, recorded by the system, of behaviors for which the interval duration between the behavior occurrence time and the current encoding time reaches a preset duration.
The preset duration can be set according to actual business requirements. For example, the preset duration may be 10 hours; in that case, any recorded behavior whose occurrence time is at least 10 hours before the current encoding time is regarded as a long-term behavior.
The first interval duration data is the interval duration between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior.
For example, a first interval duration may be the interval between the time of the current recorded long-term behavior and the time of the previously recorded long-term behavior.
When encoding the user long-term behavior data and the first interval duration data, the system may first fuse the two. Ways of fusing include, but are not limited to, addition, multiplication, and concatenation.
For example, the system may concatenate the user long-term behavior data and the first interval duration data to obtain fused behavior data.
After obtaining the fused behavior data, the system may input it into the second neural network for encoding to obtain the second encoding vector.
The second neural network may be a network constructed based on any one or a combination of the following networks (for the specific structures, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In order to extract more relevant features from the behavior data, in an embodiment the second neural network may be an attention mechanism network constructed based on any one or a combination of the following networks (for the specific structure of the attention mechanism network, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, the fused behavior data may also be encoded without a neural network. For example, the system may preset an encoding rule (e.g., normalization or standardization), and during encoding map the fused behavior data according to that rule to obtain the second encoding vector.
The short-term behavior data may be a behavior sequence, recorded by the system, of behaviors for which the interval duration between the behavior occurrence time and the current encoding time does not reach the preset duration.
For example, if the preset duration is 10 hours, any recorded behavior whose occurrence time is less than 10 hours before the current encoding time is regarded as a short-term behavior.
The second interval duration data is the interval duration between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior.
For example, a second interval duration may be the interval between the time of the current recorded short-term behavior and the time of the previously recorded short-term behavior.
When encoding the user short-term behavior data and the second interval duration data, the system may first fuse the two. Ways of fusing include, but are not limited to, addition, multiplication, and concatenation.
For example, the system may concatenate the user short-term behavior data and the second interval duration data to obtain fused behavior data.
After obtaining the fused behavior data, the system may input it into the third neural network for encoding to obtain the third encoding vector.
The third neural network may be a network constructed based on any one or a combination of the following networks (for the specific structures, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In order to extract more relevant features from the behavior data, in an embodiment the third neural network may be an attention mechanism network constructed based on any one or a combination of the following networks (for the specific structure of the attention mechanism network, reference may be made to the related art; details are omitted here):
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, the fused behavior data may also be encoded without a neural network. For example, the system may preset an encoding rule (e.g., normalization or standardization), and during encoding map the fused behavior data according to that rule to obtain the third encoding vector.
It should be noted that, to simplify the network computation, in one embodiment the first encoding vector, the second encoding vector, and the third encoding vector may have the same dimension.
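Continuing the sketch above under the same assumptions (PyTorch, hypothetical names), a behavior encoder along the lines described here might concatenate each behavior with its interval-duration feature and run the fused sequence through a GRU, one member of the RNN family named above. The same module can serve as either the second or the third neural network, producing encoding vectors of the same dimension as the first.

```python
import torch
import torch.nn as nn

class BehaviorEncoder(nn.Module):
    """Hypothetical second/third neural network: encodes a behavior
    sequence fused with the interval durations between behaviors."""
    def __init__(self, behavior_dim: int, code_dim: int = 100):
        super().__init__()
        # +1 input feature for the scalar interval duration of each behavior
        self.rnn = nn.GRU(behavior_dim + 1, code_dim, batch_first=True)

    def forward(self, behaviors, intervals):
        # behaviors: (batch, seq_len, behavior_dim)
        # intervals: (batch, seq_len), duration since the preceding behavior
        fused = torch.cat([behaviors, intervals.unsqueeze(-1)], dim=-1)
        _, h = self.rnn(fused)
        return h.squeeze(0)  # (batch, code_dim) encoding vector
```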
The attention weight vector is a weight vector determined based on the first encoding vector.
In one embodiment, when determining the weight vector based on the first encoding vector, the system may normalize the data of each dimension in the first encoding vector.
In practical applications, the system may use a sigmoid function for the normalization. The function used for normalization may be set according to actual business requirements and is not limited here.
After normalizing each dimension of the first encoding vector, the system may construct an attention weight vector based on the normalized data of each dimension.
In practical applications, the system may construct the attention weight vector according to the dimension positions of the data in the first encoding vector.
For example, assume the first encoding vector has three dimensions whose normalized values are A, B, and C, occupying the 1st, 2nd, and 3rd dimensions respectively. The constructed attention weight vector may then be represented as {A, B, C}.
In an embodiment, when constructing the attention weight vector based on the normalized data of each dimension, the system may calculate a weight value for each dimension in a softmax-like manner and construct the attention weight vector from the resulting weight values.
For example, the system may take the normalized value of each dimension as the numerator and the sum of the normalized values of all dimensions as the denominator to obtain the weight value corresponding to that dimension.
After obtaining the weight values for all dimensions, the system may construct the attention weight vector according to the dimension positions of the data in the first encoding vector.
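As a sketch of the weight construction just described (the sigmoid normalization followed by a numerator/denominator scheme is taken from the text; the exact composition of the two steps is an assumption):

```python
import torch

def attention_weights(first_code: torch.Tensor) -> torch.Tensor:
    # Normalize each dimension of the first encoding vector, e.g. with sigmoid.
    s = torch.sigmoid(first_code)
    # Each normalized value as numerator, their sum as denominator: one
    # weight per dimension, and the weights of a vector sum to 1.
    return s / s.sum(dim=-1, keepdim=True)
```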
After obtaining the attention weight vector, the system may encode the second encoding vector and the third encoding vector.
In one embodiment, the system may multiply the attention weight vector by the second encoding vector to obtain a first result, and multiply the third encoding vector by the result of subtracting the attention weight vector from 1 to obtain a second result.
After obtaining the first result and the second result, the system may add them.
Note that, on the one hand, the order in which the first result and the second result are calculated is not limited. On the other hand, when merging the first result and the second result, other vector operations, such as multiplication or concatenation of the two results, may also be adopted.
It is also understood that, when encoding the second and third encoding vectors based on the attention weight vector, the system may instead multiply the second encoding vector by the result of subtracting the attention weight vector from 1 to obtain the first result, and multiply the attention weight vector by the third encoding vector to obtain the second result; this is not limited here.
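In code, the fusion of the second and third encoding vectors reduces to an element-wise gate, as in this sketch (either assignment of the weight and its complement is permissible, per the paragraph above):

```python
def fuse(w, second_code, third_code):
    first_result = w * second_code           # weight the long-term features
    second_result = (1.0 - w) * third_code   # weight the short-term features
    return first_result + second_result     # the intermediate vector
```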
According to the above technical solution, in encoding the user's long-term and short-term behaviors, on the one hand, the user's long-term behavior data is fusion-encoded with the first interval duration data, and the user's short-term behavior data is fusion-encoded with the second interval duration data, so that more comprehensive feature information is extracted;
on the other hand, since the attention weight vector is determined from the first encoding vector, and the second encoding vector and the third encoding vector are fusion-encoded based on the attention weight vector, features of the long-term and short-term behaviors are fused with differentiated emphasis, so that more useful features are extracted.
Naturally, recommendation information determined from data encoded with this encoding method is also more accurate.
In one embodiment, in order to extract more comprehensive features, after encoding the second encoding vector and the third encoding vector based on the attention weight vector, the system may further encode the resulting intermediate vector with the state data.
In practical applications, when further encoding the intermediate vector with the state data, the system may adopt any one or a combination of the following:
splicing the intermediate vector with the first coding vector;
adding the intermediate vector to the first encoded vector;
multiplying the intermediate vector by the first encoded vector.
Ways of fusing the intermediate vector and the first encoding vector are not limited to the above; for example, a dot product of the intermediate vector and the first encoding vector may also be used. They are not exhaustively listed here.
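For illustration, a sketch of the further encoding step covering the listed fusion variants (names are hypothetical; concatenation matches the worked example later in this description):

```python
import torch

def further_encode(intermediate, first_code, how: str = "concat"):
    if how == "concat":
        return torch.cat([intermediate, first_code], dim=-1)
    if how == "add":
        return intermediate + first_code
    if how == "mul":
        return intermediate * first_code
    raise ValueError(f"unknown fusion method: {how}")
```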
In this embodiment, since the state data used for determining the attention weight is fused with the intermediate vector during encoding, more beneficial features can be extracted. Naturally, an information recommendation network fed with data encoded by this method recommends information more accurately.
The application also provides an information recommendation method. The method encodes the target user's long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining the attention weight by means of the encoding method shown in any of the above embodiments, so that more features beneficial to determining the recommendation information are extracted and the accuracy of the recommendation information is improved.
Referring to FIG. 2, FIG. 2 is a flowchart of the information recommendation method shown in the present application. As shown in FIG. 2, the method includes:
S202, acquiring long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining an attention weight of a target user;
wherein the first interval duration data includes interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior; the second interval duration data includes interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior;
S204, encoding the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data by using the data encoding method disclosed in any of the above embodiments to obtain an encoding result;
S206, determining recommendation information corresponding to the current recommendation based on the encoding result, and outputting the recommendation information.
In executing S204, the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data used for determining the attention weight of the target user may be input into a trained encoding network for calculation to obtain the encoding result. In executing S206, the encoding result may be input into a trained multi-classifier for calculation to obtain the recommendation information corresponding to the current recommendation.
It should be noted that the encoding network and the multi-classifier may be two independent networks or may be combined into one information recommendation network; this is not limited here. The output of the encoding network is the input of the multi-classifier.
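A minimal sketch of the multi-classifier stage under the same assumptions (a single linear scoring layer; the application does not specify the classifier's internal structure):

```python
import torch.nn as nn

class RecommendationClassifier(nn.Module):
    """Hypothetical multi-classifier: maps the encoding result to scores
    over the candidate recommendation items."""
    def __init__(self, feature_dim: int, num_items: int):
        super().__init__()
        self.head = nn.Linear(feature_dim, num_items)

    def forward(self, encoding_result):
        return self.head(encoding_result)  # logits; argmax picks the item
```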
The present embodiment will be described below with reference to an actual scene.
Referring to FIG. 3, FIG. 3 is a flowchart of another information recommendation method shown in the present application. As shown in FIG. 3, the method may include:
S302, inputting long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining an attention weight of a target user into an information recommendation network to obtain information to be recommended;
wherein the first interval duration data includes interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior; the second interval duration data includes interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior.
The information recommendation network encodes the target user's long-term and short-term behavior data and the corresponding interval duration data by the data encoding method disclosed in any of the above embodiments.
And S304, outputting the information to be recommended.
Suppose a user is accessing a certain banking system. The bank service system is provided with an information recommendation network and can recommend related information to users based on the network.
Referring to FIG. 4, FIG. 4 is a block diagram of the information recommendation network shown in the present application.
As shown in fig. 4, the information recommendation network includes a first neural network for encoding state data for determining attention weights.
The information recommendation network further includes a second neural network for encoding the user's long-term behavior data and the first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior.
The information recommendation network further includes a third neural network for encoding the user's short-term behavior data and the second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior.
The first neural network, the second neural network, and the third neural network may be attention mechanism networks constructed based on RNN networks.
The information recommendation network further includes a fusion device for constructing a feature vector based on the attention mechanism;
the information recommendation network further includes a multi-classifier for determining the information to be recommended.
The training process of the information recommendation network is described below.
It should be noted that, for convenience, only the process of training the information recommendation network as a whole is described below.
It is to be understood that training the information recommendation network is in effect training the several sub-networks it contains; the training process of each individual sub-network is not described in detail here.
During training, the network parameters of the information recommendation network can be initialized, and a loss function (e.g., cross entropy) used for training and the number of training iterations can be determined.
In one embodiment, in order to train the network efficiently, the network parameters of the information recommendation network may be initialized from pre-trained parameters of the first, second, and third neural networks. The pre-training method may refer to the related art and is not limited here.
Of course, when initializing the network parameters of the information recommendation network, the network parameters may be randomly designated.
After the basic parameters are determined, a plurality of training samples marked with system recommendation information can be obtained.
In practical applications, the system may count the access behaviors of the user, and based on the counted access behaviors of the user, divide the access behaviors of the user into a long-term behavior of the user, a short-term behavior of the user, and an actual access behavior of the user (at this time, the actual access behavior of the user may be regarded as the recommendation information of the system).
Based on the statistical access behavior, a number of training samples may be constructed.
The structure of the training sample is not limited herein.
After the training samples are obtained, the loss can be back-propagated and the network parameters updated by gradient descent until the information recommendation network converges.
At this point, training for the information recommendation network is completed.
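A schematic training loop matching this description (cross-entropy loss, gradient-descent updates, a fixed iteration count; `inputs` and `labels` are placeholders, and the SGD optimizer and learning rate are assumptions):

```python
import torch
import torch.nn as nn

def train(model: nn.Module, inputs: tuple, labels: torch.Tensor,
          num_iterations: int = 1000) -> nn.Module:
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()  # the loss function named above
    for _ in range(num_iterations):
        logits = model(*inputs)         # forward pass on the training samples
        loss = loss_fn(logits, labels)  # labels: the users' actual behaviors
        optimizer.zero_grad()
        loss.backward()                 # back-propagation
        optimizer.step()                # gradient-descent parameter update
    return model
```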
The following describes a procedure of information recommendation using the above information recommendation network.
After receiving an access behavior of the target user, the system may concatenate data about the bank card activity currently being promoted by the system, timestamp data for the current moment, and the user state data of the target user to form a 100-dimensional environment vector. The dimension of the vector may be set according to the actual business and is not limited here.
Then, the system may input the environment vector into the first neural network for calculation to obtain a 100-dimensional first encoding vector.
The system may also take user behavior data from at least 10 hours before the current time as the user's long-term behavior data, and concatenate it with the first interval duration data between the occurrence time of each long-term behavior and the occurrence time of the preceding long-term behavior to form a 100-dimensional first behavior vector.
Then, the system can input the first behavior vector into the second neural network for calculation to obtain a 100-dimensional second encoding vector.
The system may also take user behavior data from within 10 hours of the current time as the user's short-term behavior data, and concatenate it with the second interval duration data between the occurrence time of each short-term behavior and the occurrence time of the preceding short-term behavior to form a 100-dimensional second behavior vector.
Then, the system may input the second behavior vector into the third neural network for calculation to obtain a 100-dimensional third encoding vector.
After obtaining the first, second, and third encoding vectors, the system may input them into the fusion device for vector fusion.
In the fusion device, the system may normalize each dimension of the first encoding vector and construct a 100-dimensional attention weight vector based on the normalized data.
After obtaining the attention weight vector, the system may multiply the attention weight vector by the second encoding vector to obtain a first result;
multiply the third encoding vector by the result of subtracting the attention weight vector from 1 to obtain a second result;
and add the first result and the second result to obtain a 100-dimensional intermediate vector.
After obtaining the intermediate vector, the system may concatenate it with the first encoding vector to obtain a 200-dimensional feature vector (i.e., the input of the multi-classifier).
After obtaining the feature vector, the system may input the feature vector into the multi-classifier for calculation to obtain information to be recommended.
After the information to be recommended is obtained, the system can output the information to be recommended. For example, the system may display the information to be recommended through an interface interacting with the target user.
At this point, the system has finished recommending information using the information recommendation network.
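Pulling the pieces together, the walkthrough above corresponds to a forward pass like the following sketch, which reuses the hypothetical modules and functions from the earlier sketches and the dimensions of this example (100-dimensional encodings, a 200-dimensional classifier input):

```python
import torch.nn as nn

class InformationRecommendationNet(nn.Module):
    """Hypothetical assembly of the network in FIG. 4."""
    def __init__(self, state_dim, behavior_dim, num_items, code_dim=100):
        super().__init__()
        self.state_enc = StateEncoder(state_dim, code_dim)
        self.long_enc = BehaviorEncoder(behavior_dim, code_dim)
        self.short_enc = BehaviorEncoder(behavior_dim, code_dim)
        self.classifier = RecommendationClassifier(2 * code_dim, num_items)

    def forward(self, sys_s, act_s, usr_s, long_b, long_dt, short_b, short_dt):
        v1 = self.state_enc(sys_s, act_s, usr_s)      # first encoding vector
        v2 = self.long_enc(long_b, long_dt)           # second encoding vector
        v3 = self.short_enc(short_b, short_dt)        # third encoding vector
        w = attention_weights(v1)                     # attention weight vector
        mid = fuse(w, v2, v3)                         # intermediate vector
        feat = further_encode(mid, v1, "concat")      # 200-dim feature vector
        return self.classifier(feat)                  # scores per item
```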
According to the above technical solution, the data encoding method shown in any of the above embodiments is adopted when encoding the user's long-term and short-term behaviors, so that more features beneficial to determining the recommendation information are extracted and the accuracy of information recommendation is improved.
Corresponding to any of the above embodiments, the present application also provides a data encoding apparatus. Referring to FIG. 5, FIG. 5 is a block diagram of the data encoding apparatus shown in the present application. As shown in FIG. 5, the apparatus 500 includes:
a first encoding module 510, configured to input state data used for determining an attention weight into a first neural network to obtain a first encoding vector;
a second encoding module 520, configured to input long-term behavior data of a user and first interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior into a second neural network to obtain a second encoding vector;
a third encoding module 530, configured to input short-term behavior data of the user and second interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior into a third neural network to obtain a third encoding vector;
an attention weight determination module 540, configured to determine an attention weight vector according to the first encoding vector;
and a fourth encoding module 550, configured to encode the second encoding vector and the third encoding vector based on the attention weight vector.
In an embodiment, the apparatus 500 further includes:
the encoding module 560 encodes an intermediate vector obtained by encoding the second encoded vector and the third encoded vector based on the attention weight vector, and further encodes the intermediate vector with the state data.
In an embodiment, the fourth encoding module 550 is configured to:
multiply the attention weight vector by the second encoding vector to obtain a first result;
multiply the third encoding vector by the result of subtracting the attention weight vector from 1 to obtain a second result;
and add the first result and the second result.
In an embodiment, the attention weight determination module 540 includes:
a normalization processing module, configured to normalize the data of each dimension in the first encoding vector;
and a construction module, configured to construct the attention weight vector based on the normalized data of each dimension.
In an embodiment, the construction module is configured to:
take the normalized data of each dimension as a numerator and the sum of the normalized data of all dimensions as a denominator to obtain a weight value corresponding to each dimension;
and construct the attention weight vector based on the weight values corresponding to the respective dimensions.
In an embodiment, the further encoding module 560 is configured to perform any one or a combination of the following:
splicing the intermediate vector with the first coding vector;
adding the intermediate vector to the first encoded vector;
multiplying the intermediate vector by the first encoded vector.
In an embodiment, the first neural network, the second neural network, and the third neural network are networks, or attention mechanism networks, constructed based on any one or a combination of the following networks:
RNN network, Transformer network, LSTM network, and CNN network.
In an embodiment, a long-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time reaches a preset duration;
a short-term behavior of the user is a behavior for which the interval duration between the behavior occurrence time and the current encoding time does not reach the preset duration;
the state data may include any one or any combination of the following:
system state data corresponding to the current encoding; business activity state data corresponding to the current encoding; user state data of the user.
The application also provides an information recommendation apparatus. Referring to FIG. 6, FIG. 6 is a block diagram of the information recommendation apparatus shown in the present application. As shown in FIG. 6, the apparatus 600 includes:
an acquisition module 610, configured to acquire long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data used for determining an attention weight of a target user;
wherein the first interval duration data includes interval duration data between the occurrence time of a long-term behavior and the occurrence time of the preceding long-term behavior; the second interval duration data includes interval duration data between the occurrence time of a short-term behavior and the occurrence time of the preceding short-term behavior;
an encoding module 620, configured to encode the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data by using any one of the data encoding methods disclosed in claims 1 to 8 to obtain an encoding result;
and a recommendation module 630, configured to determine recommendation information corresponding to the current recommendation based on the encoding result and output the recommendation information.
The data encoding apparatus embodiments shown in this application can be applied to a data encoding device. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus, as a logical entity, is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them. In terms of hardware, FIG. 7 shows a hardware structure diagram of the data encoding device of this application; besides the processor, memory, network interface, and non-volatile memory shown in FIG. 7, the electronic device in which the apparatus is located may further include other hardware according to its actual function, which is not described again here.
Referring to FIG. 7, a data encoding device is shown, the device comprising: a processor.
A memory for storing processor-executable instructions.
The processor executes the executable instructions to implement the data encoding method disclosed in any of the embodiments.
The present application proposes a computer-readable storage medium, in which a computer program is stored, the computer program being configured to implement the data encoding method disclosed in any of the above embodiments.
The information recommendation apparatus embodiments shown in this application can be applied to an information recommendation device. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus, as a logical entity, is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them. In terms of hardware, FIG. 8 shows a hardware structure diagram of the information recommendation device of this application; besides the processor, memory, network interface, and non-volatile memory shown in FIG. 8, the electronic device in which the apparatus is located may further include other hardware according to its actual function, which is not described again here.
Referring to fig. 8, an information recommendation device is shown. The device comprises: a processor; and a memory for storing processor-executable instructions. The processor implements the information recommendation method disclosed in any of the above embodiments by executing the executable instructions.
The present application also provides a computer-readable storage medium storing a computer program, where the computer program is used to implement the information recommendation method disclosed in any of the above embodiments.
One skilled in the art will recognize that one or more embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The embodiments in the present application are described in a progressive manner; for the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiments are substantially similar to the method embodiments, their descriptions are relatively brief; for relevant points, reference may be made to the corresponding parts of the descriptions of the method embodiments.
The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Embodiments of the subject matter and functional operations described in this application may be implemented in the following: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this application and their structural equivalents, or a combination of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows described above can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., an internal hard disk or a removable disk), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Although this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as merely describing features of particular disclosed embodiments. Certain features that are described in this application in the context of separate embodiments can also be implemented in combination in a single embodiment. In other instances, features described in connection with one embodiment may be implemented as discrete components or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Further, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The foregoing descriptions are merely preferred embodiments of one or more embodiments of the present application and are not intended to limit the scope of the one or more embodiments of the present application; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the one or more embodiments of the present application shall fall within the scope of the one or more embodiments of the present application.

Claims (20)

1. A method of data encoding comprising:
inputting state data for determining an attention weight into a first neural network to obtain a first encoding vector;
inputting long-term behavior data of a user and first interval duration data between the occurrence time of the long-term behavior and the occurrence time of the previous long-term behavior into a second neural network to obtain a second encoding vector;
inputting short-term behavior data of the user and second interval duration data between the occurrence time of the short-term behavior and the occurrence time of the previous short-term behavior into a third neural network to obtain a third encoding vector;
determining an attention weight vector according to the first encoding vector;
and encoding the second encoding vector and the third encoding vector based on the attention weight vector.
2. The method of claim 1, further comprising:
further encoding, with the state data, an intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector.
3. The method of claim 2, wherein the encoding the second encoding vector and the third encoding vector based on the attention weight vector comprises:
multiplying the attention weight vector by the second encoding vector to obtain a first result;
multiplying the result of subtracting the attention weight vector from 1 by the third encoding vector to obtain a second result;
and adding the first result and the second result.
4. The method of claim 1, wherein the determining an attention weight vector according to the first encoding vector comprises:
normalizing the data of each dimension in the first encoding vector;
and constructing the attention weight vector based on the normalized data of each dimension.
5. The method of claim 4, wherein the constructing the attention weight vector based on the normalized data of each dimension comprises:
taking the normalized data of each dimension as a numerator and the sum of the normalized data of all dimensions as a denominator, to obtain a weight value corresponding to the data of each dimension;
and constructing the attention weight vector based on the weight values corresponding to the data of each dimension.
6. The method of claim 2, wherein the further encoding, with the state data, of the intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector comprises any one or a combination of the following:
splicing the intermediate vector with the first encoding vector;
adding the intermediate vector to the first encoding vector;
multiplying the intermediate vector by the first encoding vector.
7. The method of claim 6, wherein the first neural network, the second neural network, and the third neural network are neural networks, or attention-mechanism-based neural networks, constructed based on any one or a combination of the following networks:
an RNN network, a Transformer network, an LSTM network, and a CNN network.
8. The method of claim 7, wherein the long-term behavior of the user comprises a behavior for which the interval duration between the occurrence time of the behavior and the current encoding time reaches a preset duration;
the short-term behavior of the user comprises a behavior for which the interval duration between the occurrence time of the behavior and the current encoding time does not reach the preset duration;
and the state data includes any one or a combination of the following:
system state data corresponding to the encoding; business activity state data corresponding to the encoding; and user state data of the user.
9. An information recommendation method, comprising:
acquiring, for a target user, long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data for determining an attention weight;
wherein the first interval duration data includes the interval duration between the occurrence time of the long-term behavior and the occurrence time of the previous long-term behavior, and the second interval duration data includes the interval duration between the occurrence time of the short-term behavior and the occurrence time of the previous short-term behavior;
encoding the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data using the data encoding method of any one of claims 1 to 8 to obtain an encoding result;
and determining recommendation information corresponding to the current recommendation based on the encoding result, and outputting the recommendation information.
10. A data encoding apparatus comprising:
a first encoding module, configured to input state data for determining an attention weight into a first neural network to obtain a first encoding vector;
a second encoding module, configured to input long-term behavior data of a user and first interval duration data between the occurrence time of the long-term behavior and the occurrence time of the previous long-term behavior into a second neural network to obtain a second encoding vector;
a third encoding module, configured to input short-term behavior data of the user and second interval duration data between the occurrence time of the short-term behavior and the occurrence time of the previous short-term behavior into a third neural network to obtain a third encoding vector;
an attention weight determination module, configured to determine an attention weight vector according to the first encoding vector;
and a fourth encoding module, configured to encode the second encoding vector and the third encoding vector based on the attention weight vector.
11. The apparatus of claim 10, further comprising:
a further encoding module, configured to further encode, with the state data, an intermediate vector obtained by encoding the second encoding vector and the third encoding vector based on the attention weight vector.
12. The apparatus of claim 11, wherein the fourth encoding module is configured to:
multiply the attention weight vector by the second encoding vector to obtain a first result;
multiply the result of subtracting the attention weight vector from 1 by the third encoding vector to obtain a second result;
and add the first result and the second result.
13. The apparatus of claim 10, wherein the attention weight determination module comprises:
a normalization processing module, configured to normalize the data of each dimension in the first encoding vector;
and a construction module, configured to construct the attention weight vector based on the normalized data of each dimension.
14. The apparatus of claim 13, wherein the construction module is configured to:
take the normalized data of each dimension as a numerator and the sum of the normalized data of all dimensions as a denominator, to obtain a weight value corresponding to the data of each dimension;
and construct the attention weight vector based on the weight values corresponding to the data of each dimension.
15. The apparatus of claim 11, wherein the further encoding module is configured to perform any one or a combination of the following:
splicing the intermediate vector with the first encoding vector;
adding the intermediate vector to the first encoding vector;
multiplying the intermediate vector by the first encoding vector.
16. The apparatus of claim 15, wherein the first neural network, the second neural network, and the third neural network are neural networks, or attention-mechanism-based neural networks, constructed based on any one or a combination of the following networks:
an RNN network, a Transformer network, an LSTM network, and a CNN network.
17. The apparatus of claim 16, wherein the long-term behavior of the user comprises a behavior for which the interval duration between the occurrence time of the behavior and the current encoding time reaches a preset duration;
the short-term behavior of the user comprises a behavior for which the interval duration between the occurrence time of the behavior and the current encoding time does not reach the preset duration;
and the state data includes any one or a combination of the following:
system state data corresponding to the encoding; business activity state data corresponding to the encoding; and user state data of the user.
18. An information recommendation apparatus comprising:
an acquisition module, configured to acquire, for a target user, long-term behavior data, short-term behavior data, first interval duration data, second interval duration data, and state data for determining an attention weight;
wherein the first interval duration data includes the interval duration between the occurrence time of the long-term behavior and the occurrence time of the previous long-term behavior, and the second interval duration data includes the interval duration between the occurrence time of the short-term behavior and the occurrence time of the previous short-term behavior;
an encoding module, configured to encode the long-term behavior data, the short-term behavior data, the first interval duration data, the second interval duration data, and the state data using the data encoding method of any one of claims 1 to 8 to obtain an encoding result;
and a recommendation module, configured to determine recommendation information corresponding to the current recommendation based on the encoding result and to output the recommendation information.
19. A data encoding device, the device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the data encoding method of any one of claims 1-8 by executing the executable instructions.
20. An information recommendation device, the device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the information recommendation method of claim 9 by executing the executable instructions.
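To make the claimed encoding flow concrete, the following is a minimal numerical sketch of claims 1 to 8, assuming single tanh layers in place of the first, second, and third neural networks (which per claims 7 and 16 may be built from RNN, Transformer, LSTM, or CNN networks); the min-max normalization, the dimension D, and the random weights are illustrative assumptions, not the disclosed implementation.

import numpy as np

rng = np.random.default_rng(0)
D = 4  # encoding dimension (illustrative)

# Single tanh layers stand in for the three neural networks of claim 1.
W1 = rng.normal(size=(D, D))      # state data        -> first encoding vector
W2 = rng.normal(size=(D, 2 * D))  # long-term inputs  -> second encoding vector
W3 = rng.normal(size=(D, 2 * D))  # short-term inputs -> third encoding vector

def attention_weights(v1: np.ndarray) -> np.ndarray:
    # Claims 4-5: normalize each dimension of the first encoding vector, then
    # take each normalized value over the sum of all of them as its weight.
    norm = (v1 - v1.min()) / (v1.max() - v1.min() + 1e-9)  # assumed normalization
    return norm / (norm.sum() + 1e-9)

def encode(state, long_beh, long_gaps, short_beh, short_gaps, fuse="concat"):
    v1 = np.tanh(W1 @ state)                                    # first encoding vector
    v2 = np.tanh(W2 @ np.concatenate([long_beh, long_gaps]))    # second encoding vector
    v3 = np.tanh(W3 @ np.concatenate([short_beh, short_gaps]))  # third encoding vector
    a = attention_weights(v1)
    mid = a * v2 + (1.0 - a) * v3        # claim 3: attention-weighted mix
    if fuse == "concat":                 # claim 6: splice with the first encoding vector
        return np.concatenate([mid, v1])
    if fuse == "add":                    # claim 6: add
        return mid + v1
    return mid * v1                      # claim 6: multiply

# Toy inputs: behavior features and interval durations, each of dimension D.
state = rng.normal(size=D)
long_beh, long_gaps = rng.normal(size=D), rng.normal(size=D)
short_beh, short_gaps = rng.normal(size=D), rng.normal(size=D)
print(encode(state, long_beh, long_gaps, short_beh, short_gaps))  # 2*D values

The mix a * v2 + (1 - a) * v3 is the core of the scheme: where the state-derived weights are close to 1, the long-term behavior encoding dominates, and where they are close to 0, the short-term behavior encoding dominates.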
CN202010471060.7A 2020-05-28 2020-05-28 Data coding and information recommending method, device and equipment Active CN111506822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010471060.7A CN111506822B (en) 2020-05-28 2020-05-28 Data coding and information recommending method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010471060.7A CN111506822B (en) 2020-05-28 2020-05-28 Data coding and information recommending method, device and equipment

Publications (2)

Publication Number Publication Date
CN111506822A true CN111506822A (en) 2020-08-07
CN111506822B CN111506822B (en) 2023-08-18

Family

ID=71864506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010471060.7A Active CN111506822B (en) 2020-05-28 2020-05-28 Data coding and information recommending method, device and equipment

Country Status (1)

Country Link
CN (1) CN111506822B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170293804A1 (en) * 2016-04-06 2017-10-12 Nec Laboratories America, Inc. Deep 3d attention long short-term memory for video-based action recognition
US20200020326A1 (en) * 2018-07-13 2020-01-16 Samsung Electronics Co., Ltd. Predicting user actions on ubiquitous devices
CN110929164A (en) * 2019-12-09 2020-03-27 北京交通大学 Interest point recommendation method based on user dynamic preference and attention mechanism
CN110941764A (en) * 2019-12-03 2020-03-31 腾讯科技(深圳)有限公司 Object recommendation method and device, computer equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Feifei; Zhang Shengtai: "Statistical feature analysis of the information publishing behavior of WeChat users in mobile social networks" *

Also Published As

Publication number Publication date
CN111506822B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN110366734B (en) Optimizing neural network architecture
CN112084383B (en) Knowledge graph-based information recommendation method, device, equipment and storage medium
US11636341B2 (en) Processing sequential interaction data
CN109903168B (en) Method for recommending insurance products based on machine learning and related equipment
CN111460290B (en) Information recommendation method, device, equipment and storage medium
CN110020662B (en) Training method and device for user classification model
CN112883227B (en) Video abstract generation method and device based on multi-scale time sequence characteristics
CN110489423B (en) Information extraction method and device, storage medium and electronic equipment
CN107403311B (en) Account use identification method and device
CN113538070B (en) User life value cycle detection method and device and computer equipment
CN113344067A (en) Method, device and equipment for generating customer portrait
CN113362852A (en) User attribute identification method and device
CN113204655A (en) Multimedia information recommendation method, related device and computer storage medium
CN114358023B (en) Intelligent question-answer recall method, intelligent question-answer recall device, computer equipment and storage medium
CN112580368B (en) Method, device, equipment and storage medium for identifying intention sequence of conversation text
CN107665202A (en) Method and device for constructing interest model and electronic equipment
CN110390015B (en) Data information processing method, device and system
CN113535912A (en) Text association method based on graph convolution network and attention mechanism and related equipment
CN111506822B (en) Data coding and information recommending method, device and equipment
CN113761184A (en) Text data classification method, equipment and storage medium
CN116030798A (en) Pre-training method of audio encoder, audio detection method and device
CN115269998A (en) Information recommendation method and device, electronic equipment and storage medium
CN112017634B (en) Data processing method, device, equipment and storage medium
CN111159397B (en) Text classification method and device and server
CN113033560A (en) Interest point analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant