WO2021093368A1 - User clustering and feature learning method, device, and computer-readable medium - Google Patents
User clustering and feature learning method, device, and computer-readable medium
- Publication number
- WO2021093368A1 (application PCT/CN2020/104002)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- transaction behavior
- clustering
- transaction
- deep learning
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
Definitions
- This application relates to the field of information technology, and in particular to a method, device, and computer-readable medium for user clustering and feature learning.
- A commonly used approach at present is to partition all users with a clustering algorithm and to understand the distribution of risky customers from the clustering results.
- Commonly used clustering algorithms such as k-means (K_means) and density-based spatial clustering of applications with noise (DBscan) perform well on certain data sets, but the features they cluster on mostly depend on manual experience; once the data set changes, their performance drops significantly and the clustering features must be determined manually again.
- One purpose of the present application is to provide a solution for user clustering and feature learning, so as to solve the problem in existing solutions that features for clustering cannot be obtained quickly while clustering is performed.
- the embodiment of the present application provides a user clustering and feature learning method.
- the method includes: acquiring user transaction behavior data and determining each user's transaction behavior sequence from that data, where a sequence element in the transaction behavior sequence represents the user's transaction behavior data within one time window; encoding each user's transaction behavior sequence with an encoder of a deep learning network to generate deep features; decoding the deep features with a decoder of the deep learning network to obtain a restored transaction behavior sequence, and clustering the users according to the deep features to obtain a clustering result; determining a learning target according to the loss function of the deep learning network and the objective function of the clustering, where the loss function of the deep learning network is determined from the difference between the restored transaction behavior sequence and the original transaction behavior sequence, and the objective function of the clustering is determined from the clustering result; and iteratively adjusting the parameters of the encoder and decoder of the deep learning network according to the learning target so that the learning target meets a preset condition.
- the embodiment of the present application also provides a user clustering and feature learning device.
- the device includes: a data acquisition module for acquiring user transaction behavior data, and determining the transaction behavior sequence of each user according to the transaction behavior data.
- the sequence elements in the transaction behavior sequence are used to represent the user's transaction behavior data within a time window;
- a deep learning module, used to encode each user's transaction behavior sequence with an encoder of a deep learning network to generate deep features, and to decode the deep features with a decoder of the deep learning network to obtain a restored transaction behavior sequence;
- a clustering module is used to cluster users according to the deep features to obtain clustering results;
- an iterative processing module, used to determine the learning target according to the loss function of the deep learning network and the objective function of the clustering, where the loss function of the deep learning network is determined from the difference between the restored transaction behavior sequence and the original transaction behavior sequence, and the objective function of the clustering is determined from the clustering result; and to iteratively adjust the parameters of the encoder and decoder of the deep learning network according to the learning target so that the learning target meets a preset condition.
- some embodiments of the present application also provide a computing device, which includes a memory for storing computer program instructions and a processor for executing them; when the computer program instructions are executed by the processor, the user clustering and feature learning method is triggered.
- other embodiments of the present application also provide a computer-readable medium having computer program instructions stored thereon, where the computer-readable instructions can be executed by a processor to implement the user clustering and feature learning method.
- the user clustering and feature learning solution provided by the embodiments of this application combines a clustering algorithm and a coding and decoding model in a deep learning network.
- the user's transaction behavior sequence can be determined from the user's transaction behavior data, and the encoder of the deep learning network then encodes each user's transaction behavior sequence to generate deep features;
- while users are clustered according to the deep features to obtain the clustering result, the decoder of the deep learning network decodes the deep features to obtain the restored transaction behavior sequence;
- the learning target is then determined according to the clustering result and the decoding result, and the parameters of the encoder and decoder of the deep learning network are iteratively adjusted according to the learning target, so that when clustering is completed the deep learning network has also been optimized to yield better deep features for clustering.
- FIG. 1 is a processing flowchart of a method for user clustering and feature learning provided by an embodiment of this application;
- FIG. 2 is a processing principle diagram of the decoding and encoding process in the embodiment of the application;
- FIG. 3 is a schematic structural diagram of a user clustering and feature learning device provided by an embodiment of this application.
- FIG. 4 is a schematic structural diagram of a computing device for implementing user clustering and feature learning provided by an embodiment of the application;
- both the terminal and the equipment serving the network include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- the memory may include non-permanent storage in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
- Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology.
- the information can be computer-readable instructions, data structures, program devices, or other data.
- Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by computing devices.
- the embodiment of the application provides a method for user clustering and feature learning.
- the method combines a clustering algorithm with an encoding-decoding model in a deep learning network; it can determine the learning target according to the clustering result and the decoding result, and iteratively adjust the parameters of the encoder and decoder of the deep learning network according to the learning target, so that while clustering is completed the deep learning network is also optimized to obtain better deep features for clustering, thereby solving the problem in existing solutions that features for clustering cannot be obtained quickly while clustering is performed.
- the execution subject of this method may be user equipment, network equipment, or a device formed by the integration of user equipment and network equipment through a network, and may also be a program running in the above-mentioned equipment.
- the user equipment includes, but is not limited to, various terminal devices such as computers, mobile phones, and tablet computers;
- the network equipment includes, but is not limited to, a network host, a single network server, a set of multiple network servers, or a set of computers based on cloud computing;
- the cloud is composed of a large number of hosts or network servers based on Cloud Computing.
- Cloud computing is a type of distributed computing: a virtual computer composed of a group of loosely coupled computers.
- Fig. 1 shows a user clustering and feature learning method provided by an embodiment of the present application.
- the method includes the following processing steps S101 to S106.
- Step S101 Obtain user transaction behavior data, and determine the transaction behavior sequence of each user according to the transaction behavior data.
- the user's transaction behavior data may be any data that can reflect the relevant behaviors performed by the user during the transaction process.
- the transaction behavior data may be multiple items of transaction behavior information corresponding to the user in multiple time windows, for example the transaction amount, the number of transactions, the number of transaction objects, the time period in which transactions are concentrated, and the main region of the transaction objects on consecutive days. Because similar users show certain similarities in transaction behavior (for example, they tend to trade in the same time period every day, with a similar number of transactions and a similar transaction amount per day), the user's transaction behavior data can be used as input data to realize clustering of users.
- the transaction behavior sequence is the data content presented in a preset form after data processing is performed based on the user's transaction behavior data.
- the sequence elements in the transaction behavior sequence are used to represent the transaction behavior data of the user in a time window, and the transaction behavior data in each time window can be expressed in the form of a vector.
- each sequence element is a vector composed of the transaction behavior data of one day. Since the transaction behavior data used in this embodiment comprise the transaction amount, the number of transactions, the number of transaction objects, and the time period in which transactions are concentrated within the time window, the vector corresponding to each sequence element is [transaction amount in the time window, number of transactions, number of transaction objects, time period in which transactions are concentrated].
- the transaction behavior sequence can be expressed as the following matrix:
- each row in the matrix represents a vector corresponding to a sequence element
- the vector elements in each row are, respectively, the transaction amount in the time window, the number of transactions, the number of transaction objects, and the time period in which transactions are concentrated. For example, the first row of the matrix, [10000, 20, 8, 17], is the transaction behavior data of the first day: the transaction amount is 10000, the number of transactions is 20, the number of transaction objects is 8, and transactions are concentrated in the 17th hour, i.e., 16:00:00-17:00:00.
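The construction of the per-user matrix described above can be sketched as follows; only the first row's values come from the example, the other rows are illustrative placeholders:

```python
import numpy as np

# A sketch of building one user's transaction behavior sequence, assuming
# per-day records of [transaction amount, number of transactions,
# number of transaction objects, peak transaction hour] as in the example.
daily_records = [
    [10000, 20, 8, 17],  # day 1: amount 10000, 20 transactions, 8 objects, hour 17
    [8500, 15, 6, 16],   # day 2 (illustrative values)
    [12000, 25, 9, 18],  # day 3 (illustrative values)
]
matrix_a = np.array(daily_records, dtype=float)  # one row per time window (day)
print(matrix_a.shape)  # (3, 4): 3 time windows, 4 behavior features each
```

Each row of `matrix_a` is one sequence element; stacking the rows gives the transaction behavior sequence matrix referred to as Matrix_A later in the text.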
- each sequence element (that is, the vector composed of the transaction behavior data in one time window) can also be embedded: each sequence element is treated like a word in natural language processing, and an operation similar to word embedding is performed on it.
- each sequence element in the transaction behavior sequence can then be represented by the above-mentioned N-dimensional vector.
- the sequence numbers of the sequence elements in the transaction behavior sequence {S_A1, S_A2, S_A3, S_A4, S_A5, S_A6, S_A7} are 1, 7, 3, 2, 11, 6, 100, respectively.
- the transaction behavior sequence after embedding can be expressed in the form of Table 2:
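The embedding lookup described above can be sketched as follows; the table values are random placeholders (in the method they would be learned), and the small dimension 8 stands in for a practical N such as 512 or 256:

```python
import numpy as np

# Sketch of the embedding step: each of the 2000 possible sequence elements
# is mapped to an N-dimensional vector via a lookup table (the "Table 1"
# mapping); a user's sequence is then a row-lookup (the "Table 2" form).
rng = np.random.default_rng(0)
vocab_size, n_dim = 2000, 8            # N would be e.g. 512 or 256 in practice
embedding_table = rng.normal(size=(vocab_size, n_dim))

# sequence numbers 1, 7, 3, 2, 11, 6, 100 of elements S_A1 ... S_A7
sequence_ids = [1, 7, 3, 2, 11, 6, 100]
embedded_sequence = embedding_table[sequence_ids]
print(embedded_sequence.shape)  # (7, 8)
```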
- Step S102 based on the encoder of the deep learning network, encode the transaction behavior sequence of each user to generate deep features.
- Step S103 Based on a decoder of the deep learning network, decode the deep features to obtain a restored sequence of transaction behaviors.
- a deep learning network based on an Encoder-Decoder is used to implement deep feature learning.
- the principle of the encoding-decoding model is that the parameters of the encoder and decoder can be adjusted iteratively; after this adjustment, the deep features obtained by encoding can be considered to have sufficient ability to distinguish samples, and clustering on these deep features then yields a better clustering effect.
- a multi-head attention mechanism can be used, and the input data can be positionally encoded (positional encoding), so that the encoder using the multi-head attention mechanism can obtain better deep features. Therefore, in some embodiments of the present application, when encoding each user's transaction behavior sequence with the encoder of the deep learning network to generate the deep features, the user's transaction behavior sequence can first be position-encoded to determine the relative position information of the sequence elements in the transaction behavior sequence, and the transaction behavior sequence carrying the relative position information is then input into the encoder of the deep learning network using the multi-head attention mechanism to obtain the deep features.
- the relative position information of a sequence element in the transaction behavior sequence can be determined from its sorted position in the sequence and the dimension of the element vector; the sorted position corresponds to the order of the time windows, for example the sequence element S[7] corresponds to the transaction behavior data of the second day.
- a trigonometric function can be used to determine the position encoding information: for example, when i is even the sin() function is used, and when i is odd the cos() function is used.
- the position encoding information can be determined, for example, in the following way (a common sinusoidal form consistent with the sin/cos choice above; the exact constant is not fixed by this description): PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)), where pos is the sorted position of the sequence element and d is the vector dimension.
- in this way a matrix Matrix_P of position encoding information can be obtained; the dimensions of Matrix_P are the same as those of the user transaction sequence matrix Matrix_A.
- a new matrix carrying the position encoding information, Matrix_N = Matrix_A + Matrix_P, can then be obtained.
- the new matrix Matrix_N carries the relative position information of sequence elements in the transaction behavior sequence, which can be input to the encoder of the deep learning network using the multi-head attention mechanism to obtain deep features.
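The position-encoding step and the construction of Matrix_N can be sketched as follows. This is a minimal sketch assuming the standard sinusoidal form; the constant 10000 and the matrix dimensions are illustrative, not fixed by the description above:

```python
import numpy as np

def positional_encoding(seq_len, dim):
    """Sinusoidal position code: sin() on even dimensions, cos() on odd
    ones, as described above. The base 10000 is the common default."""
    pe = np.zeros((seq_len, dim))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(dim)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    pe[:, 0::2] = np.sin(angle[:, 0::2])
    pe[:, 1::2] = np.cos(angle[:, 1::2])
    return pe

# Matrix_A: an (illustrative) embedded transaction behavior sequence
matrix_a = np.random.default_rng(0).normal(size=(7, 16))
matrix_p = positional_encoding(*matrix_a.shape)  # same dimensions as Matrix_A
matrix_n = matrix_a + matrix_p                   # Matrix_N = Matrix_A + Matrix_P
print(matrix_p.shape)  # (7, 16)
```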
- a multi-head attention mechanism may also be used in the decoding process: the deep features are first input into the decoder of the deep learning network that uses the multi-head attention mechanism to obtain the first sequence element of the restored transaction behavior sequence; then, iteratively, the deep features together with the previously decoded sequence elements are input into the decoder to restore the subsequent sequence elements, until the complete transaction behavior sequence is obtained by decoding.
- the deep feature obtained by encoding user A's transaction behavior sequence is denoted C. The deep feature is input into the decoder of the deep learning network using the multi-head attention mechanism for decoding: the first sequence element of the restored transaction behavior sequence, S_A1', is obtained first; then, iteratively, C and S_A1' are used as decoder inputs to obtain the second sequence element S_A2', and so on until all sequence elements of the transaction behavior sequence are obtained.
- Fig. 2 shows the processing principle of the decoding and encoding process in the embodiment of the present application.
- the input 210 of the encoder is the original transaction behavior sequence; before input to the encoder, position encoding 220 is performed.
- the encoder 230 includes a multi-head attention mechanism (Multi-head Attention) layer 231, a residual connection standardization (Add&norm) layer 232, and a forward feedback (Feed Forward) layer 233.
- the multi-head attention layer 231 applies h different projections to the three inputs Query, Key, and Value, each projection using a different linear transformation; a weight coefficient is then computed from Query and Key, a weighted sum of Value is taken according to this weight coefficient, and h self-attention results are obtained; these results are concatenated, and the output of the multi-head attention mechanism is produced through a linear mapping.
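The multi-head attention computation described above can be sketched as follows. This is a generic sketch of the mechanism, not the patent's specific network; all weight matrices here are random placeholders, and the scaled dot-product weighting is the conventional choice:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, k, v, w_q, w_k, w_v, w_o, h):
    """h linear projections of Query/Key/Value, per-head attention weights
    from Query and Key, weighted sum of Value, concatenation, and a final
    linear map -- the steps named in the description above."""
    heads = []
    for i in range(h):
        qi, ki, vi = q @ w_q[i], k @ w_k[i], v @ w_v[i]
        weights = softmax(qi @ ki.T / np.sqrt(ki.shape[-1]))  # weight coefficients
        heads.append(weights @ vi)                            # weighted sum of Value
    return np.concatenate(heads, axis=-1) @ w_o               # concat + linear map

rng = np.random.default_rng(0)
seq, d, h, dk = 7, 16, 4, 4
x = rng.normal(size=(seq, d))                 # self-attention: Q = K = V = x
w_q = rng.normal(size=(h, d, dk)); w_k = rng.normal(size=(h, d, dk))
w_v = rng.normal(size=(h, d, dk)); w_o = rng.normal(size=(h * dk, d))
out = multi_head_attention(x, x, x, w_q, w_k, w_v, w_o, h)
```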
- the Feed Forward layer 233 performs linear transformation on the input, and the dimensions of the input and output matrices are the same, which is used to further optimize the learning of deep features.
- the Add&norm layer 232 is used to mitigate the degradation problem in deep learning and avoid vanishing gradients: the output of the previous layer can be randomly dropped out (dropout), superimposed on the original input of that layer, and the result is then standardized.
- the outputs of the multi-head attention layer 231 and of the feed-forward layer 233 are each processed by an Add&norm layer 232.
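The Add&norm operation just described (dropout, residual addition, standardization) can be sketched as follows; the dropout rate and epsilon are illustrative values, not taken from the source:

```python
import numpy as np

def add_and_norm(sublayer_out, residual_in, drop_rate=0.1, rng=None, train=False):
    """Add&norm as described above: optional dropout on the sub-layer
    output, residual addition of the layer's original input, then
    standardization of the result."""
    y = sublayer_out
    if train and rng is not None:
        mask = rng.random(y.shape) >= drop_rate   # random inactivation
        y = y * mask / (1.0 - drop_rate)
    y = y + residual_in                           # residual connection
    mean = y.mean(axis=-1, keepdims=True)
    std = y.std(axis=-1, keepdims=True)
    return (y - mean) / (std + 1e-6)              # standardization

x = np.ones((2, 4))
out = add_and_norm(np.zeros((2, 4)), x)   # zero sub-layer output, residual of ones
```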
- the decoder 240 also includes a multi-head attention layer 241, a residual connection standardization (Add&norm) layer 242, and a feed forward layer 243.
- the difference from the encoder 230 is the input of the multi-head attention layer 241: its Key and Value inputs are the output of the encoder, i.e., the deep feature, while the Query input 250 is the restored transaction sequence shifted right by one position, i.e., the previous output of the decoder, on which position encoding is also performed.
- the Add&norm layer 242 and the feed-forward layer 243 in the decoder are similar to those in the encoder and are not repeated here.
- the decoder also includes a linear layer 244; a fully connected layer may be used to map the input back to the dimension and size of the original transaction behavior sequence, thereby completing the decoding.
- Step S104 clustering users according to the depth feature, and obtaining a clustering result.
- the clustering algorithm may be hierarchical density-based spatial clustering of applications with noise (HDBSCAN), k-means (K_means), density-based spatial clustering of applications with noise (DBscan), spectral clustering, etc. Since the HDBSCAN algorithm clusters highly similar targets together and produces hierarchical results, in some embodiments of this application HDBSCAN may be used to cluster the users based on the deep features and obtain the clustering result.
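Clustering the deep features can be sketched as follows. The embodiment prefers HDBSCAN; a minimal k-means is shown here instead because it is the simplest of the listed alternatives and needs no extra library, and the deterministic initialization is an assumption for reproducibility:

```python
import numpy as np

def kmeans(features, k, iters=50):
    """Minimal k-means over learned deep features (stand-in for the
    HDBSCAN / K_means / DBscan / spectral choices named above)."""
    # deterministic initialization: spread initial centers across the data
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers

# two well-separated blobs standing in for deep features of two user groups
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0.0, 0.1, (10, 3)),
                   rng.normal(5.0, 0.1, (10, 3))])
labels, _ = kmeans(feats, k=2)
```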
- Step S105 Determine the learning target according to the loss function of the deep learning network and the target function of the clustering.
- the loss function Loss(Decoder) of the deep learning network can be determined from the difference between the restored transaction behavior sequence and the original transaction behavior sequence; for example, in the embodiment of the present application it may be the squared difference between the restored and original transaction behavior sequences, which can be calculated with the following formula (reconstructed here from the definitions below): Loss(Decoder) = Σ_{i=1}^{M} (x_i − x_pi)²
- x_i represents the i-th feature in the original transaction behavior sequence
- x_pi represents the i-th feature in the restored transaction behavior sequence
- M represents the total number of features in the transaction behavior sequence.
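The reconstruction loss above can be sketched directly; whether a mean is taken over M is not stated in the description, so the plain sum is used here:

```python
import numpy as np

def decoder_loss(original, restored):
    """Squared-difference reconstruction loss: sum over all M features
    of (x_i - x_pi)^2, following the description above."""
    original = np.asarray(original, dtype=float)
    restored = np.asarray(restored, dtype=float)
    return float(((original - restored) ** 2).sum())

loss = decoder_loss([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(loss)  # 0.0 + 0.25 + 1.0 = 1.25
```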
- the objective function Object (clustering) of the clustering is determined according to the clustering result.
- it may be the sum of the standard deviations of the deep features corresponding to each category in the clustering result, which can be expressed by the following formulas (reconstructed here from the definitions below): Object(clustering) = λ · Σ_{j=1}^{m} std(C_j), where std(C_j) = Σ_{i=1}^{k} std(f_i)
- ⁇ is the adjusted value, which can be preset by the user according to the actual scene
- m is the number of clusters obtained after clustering processing
- std(C_j) is the standard deviation of the deep features in the j-th class
- k is the feature dimension of the sequences in each class
- f_i represents the deep feature of the i-th dimension in each class
- std(f_i) represents the standard deviation of the deep feature of the i-th dimension in a certain class.
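The clustering objective above can be sketched as follows; the interpretation std(C_j) = Σ_i std(f_i) (per-dimension standard deviations summed within each cluster) follows the definitions just given:

```python
import numpy as np

def clustering_objective(deep_features, labels, lam=1.0):
    """Object(clustering) = lambda * sum over the m clusters of std(C_j),
    where std(C_j) sums the per-dimension standard deviations std(f_i)
    over the k feature dimensions of that cluster."""
    deep_features = np.asarray(deep_features, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for j in np.unique(labels):
        cluster = deep_features[labels == j]
        total += cluster.std(axis=0).sum()   # std(C_j) = sum_i std(f_i)
    return lam * total

# compact clusters give a small objective; training drives it down
feats = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]])
obj = clustering_objective(feats, labels=[0, 0, 1, 1])
print(obj)  # 0.0: every cluster is perfectly tight
```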
- Step S106 Iteratively adjust the parameters of the encoder and decoder of the deep learning network according to the learning target, so that the learning target meets a preset condition.
- the preset condition may be that the learning target is smaller than a preset value, or that the learning target reaches its minimum, for example min: Loss(Decoder) + Object(clustering).
- the clustering result and deep features obtained when the learning target meets the preset condition can be used as the final output, so that appropriate features are learned automatically while clustering is completed, without relying on manual experience to obtain them.
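The overall iteration of steps S101-S106 can be summarized by the following control-flow sketch; `encode`, `decode`, `cluster`, and `update_params` are hypothetical helpers standing in for the encoder, decoder, clustering algorithm, and parameter update, and only the structure of minimizing Loss(Decoder) + Object(clustering) is shown:

```python
# Schematic training loop: encode, decode, cluster, evaluate the learning
# target, and adjust parameters until the preset condition is met.
def train(sequences, encode, decode, cluster, loss_fn, obj_fn,
          update_params, max_iters=100, tol=1e-3):
    labels, deep = [], []
    for _ in range(max_iters):
        deep = [encode(s) for s in sequences]
        restored = [decode(c) for c in deep]
        labels = cluster(deep)
        target = sum(loss_fn(s, r) for s, r in zip(sequences, restored)) \
                 + obj_fn(deep, labels)
        if target < tol:            # preset condition met
            break
        update_params(target)       # adjust encoder/decoder parameters
    return labels, deep

# trivial identity setup, just to exercise the loop
labels, deep = train(
    sequences=[[1.0], [2.0]],
    encode=lambda s: s, decode=lambda c: c,
    cluster=lambda deep: [0] * len(deep),
    loss_fn=lambda s, r: sum((a - b) ** 2 for a, b in zip(s, r)),
    obj_fn=lambda deep, labels: 0.0,
    update_params=lambda t: None)
print(labels)  # [0, 0]
```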
- an embodiment of the application also provides a user clustering and feature learning device.
- the method corresponding to the device is the user clustering and feature learning method of the foregoing embodiment, and its principle of solving the problem is similar to that method.
- the embodiment of the present application provides a user clustering and feature learning device, which combines a clustering algorithm with an encoding-decoding model in a deep learning network; it can determine the learning target according to the clustering result and the decoding result, and iteratively adjust the parameters of the encoder and decoder of the deep learning network according to the learning target, so that while clustering is completed the deep learning network is also optimized to obtain better deep features for clustering, thereby solving the problem in existing solutions that features for clustering cannot be obtained quickly while clustering is performed.
- the specific implementation of the device may be user equipment, network equipment, or a device formed by integrating user equipment and network equipment through a network, and may also be a program running in the above-mentioned equipment.
- the user equipment includes, but is not limited to, various terminal devices such as computers, mobile phones, and tablet computers;
- the network equipment includes, but is not limited to, a network host, a single network server, a set of multiple network servers, or a set of computers based on cloud computing;
- the cloud is composed of a large number of hosts or network servers based on Cloud Computing.
- Cloud computing is a type of distributed computing: a virtual computer composed of a group of loosely coupled computers.
- FIG. 3 shows a user clustering and feature learning device provided by an embodiment of the present application.
- the device includes a data acquisition module 310, a deep learning module 320, a clustering module 330, and an iterative processing module 340.
- the data acquisition module 310 is used to acquire user transaction behavior data, and determine the transaction behavior sequence of each user according to the transaction behavior data.
- the deep learning module 320 is used for an encoder based on a deep learning network to encode the transaction behavior sequence of each user to generate a deep feature; and a decoder based on the deep learning network to decode the deep feature to obtain a restored transaction behavior sequence.
- the clustering module 330 is configured to cluster the users according to the depth feature, and obtain a clustering result.
- the iterative processing module 340 is configured to determine the learning target according to the loss function of the deep learning network and the objective function of the clustering, and to iteratively adjust the parameters of the encoder and decoder of the deep learning network according to the learning target to Make the learning objective meet the preset conditions.
- the user's transaction behavior data may be any data that can reflect the relevant behaviors performed by the user during the transaction process.
- the transaction behavior data may be multiple items of transaction behavior information corresponding to the user in multiple time windows, for example the transaction amount, the number of transactions, the number of transaction objects, the time period in which transactions are concentrated, and the main region of the transaction objects on consecutive days. Because similar users show certain similarities in transaction behavior (for example, they tend to trade in the same time period every day, with a similar number of transactions and a similar transaction amount per day), the user's transaction behavior data can be used as input data to realize clustering of users.
- the transaction behavior sequence is the data content presented in a preset form after data processing is performed based on the user's transaction behavior data.
- the sequence elements in the transaction behavior sequence are used to represent the transaction behavior data of the user in a time window, and the transaction behavior data in each time window can be expressed in the form of a vector.
- each sequence element is a vector composed of the transaction behavior data of one day. Since the transaction behavior data used in this embodiment comprise the transaction amount, the number of transactions, the number of transaction objects, and the time period in which transactions are concentrated within the time window, the vector corresponding to each sequence element is [transaction amount in the time window, number of transactions, number of transaction objects, time period in which transactions are concentrated].
- the transaction behavior sequence can be expressed as the following matrix:
- each row in the matrix represents a vector corresponding to a sequence element
- the vector elements in each row are, respectively, the transaction amount in the time window, the number of transactions, the number of transaction objects, and the time period in which transactions are concentrated. For example, the first row of the matrix, [10000, 20, 8, 17], is the transaction behavior data of the first day: the transaction amount is 10000, the number of transactions is 20, the number of transaction objects is 8, and transactions are concentrated in the 17th hour, i.e., 16:00:00-17:00:00.
- each sequence element (that is, the vector composed of the transaction behavior data in one time window) can also be embedded: each sequence element is treated like a word in natural language processing, and an operation similar to word embedding is performed on it. The specific processing is as follows:
- suppose the sequence elements correspond to 2000 kinds of vectors in total; these 2000 different sequence elements can be mapped to 2000 N-dimensional vectors.
- N is the embedding dimension, which can be set according to actual conditions, for example 512 or 256.
- all 2000 sequence elements can be shown in the form of Table 1.
- each sequence element in the transaction behavior sequence can be represented by the above-mentioned N-dimensional vector.
- suppose the sequence numbers of the sequence elements in the transaction behavior sequence {S_A1, S_A2, S_A3, S_A4, S_A5, S_A6, S_A7} are 1, 7, 3, 2, 11, 6, 100, respectively; the transaction behavior sequence after embedding can then be expressed in the form of Table 2.
- the deep learning module 320 uses a deep learning network based on an Encoder-Decoder to implement deep feature learning.
- the principle of the encoding-decoding model is that the parameters of the encoder and decoder can be adjusted iteratively; after this adjustment, the deep features obtained by encoding can be considered to have sufficient ability to distinguish samples, and clustering on these deep features then yields a better clustering effect.
- the deep learning module can adopt a multi-head attention mechanism.
- the input data can be positionally encoded (positional encoding) so that an encoder using the multi-head attention mechanism can obtain better deep features.
- therefore, in some embodiments of the present application, when the encoder of the deep learning network encodes each user's transaction behavior sequence to generate deep features, the deep learning module can first position-encode the user's transaction behavior sequence to determine the relative position information of the sequence elements in the sequence, and then input the transaction behavior sequence carrying this relative position information into the encoder of the deep learning network using the multi-head attention mechanism to obtain the deep features.
- the relative position information of the sequence elements in the transaction behavior sequence can be determined according to the sorted position of the sequence elements in the transaction behavior sequence and the dimension of the element sequence.
- the position-code information of user A's transaction behavior sequence can be as shown in Table 3 below.
- a trigonometric function can be used to determine the position-code information; for example, the sin() function is used when i is an even number and the cos() function when i is an odd number.
- the position information can be determined in the following way:
- a matrix Matrix_P of the position-code information can be obtained.
- the dimensions of the matrix Matrix_P are the same as those of the user transaction sequence matrix Matrix_A.
- adding the two matrices yields a new matrix containing the position-code information: Matrix_N = Matrix_A + Matrix_P.
- the new matrix Matrix_N carries the relative position information of sequence elements in the transaction behavior sequence, which can be input to the encoder of the deep learning network using the multi-head attention mechanism to obtain deep features.
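A minimal sketch of the positional-encoding step and the Matrix_N = Matrix_A + Matrix_P addition described above. The text does not spell out the exact sin/cos scaling, so the Transformer-style 10000^(2i/d_model) denominator used here is an assumption:

```python
import math

def position_code(pos, i, d_model):
    """sin() for an even dimension index i, cos() for an odd i, as described above.
    The 10000 ** (2i / d_model) scaling is an assumed Transformer-style choice."""
    angle = pos / (10000 ** (2 * (i // 2) / d_model))
    return math.sin(angle) if i % 2 == 0 else math.cos(angle)

def add_positional_encoding(matrix_a):
    """Build Matrix_P (same dimensions as Matrix_A) and return Matrix_N = Matrix_A + Matrix_P."""
    d_model = len(matrix_a[0])
    matrix_p = [[position_code(pos, i, d_model) for i in range(d_model)]
                for pos in range(1, len(matrix_a) + 1)]  # sorted positions start at 1
    return [[a + p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(matrix_a, matrix_p)]

matrix_a = [[0.1] * 8 for _ in range(7)]   # placeholder 7-element sequence, N=8
matrix_n = add_positional_encoding(matrix_a)
```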
- the deep learning module may also adopt a multi-head attention mechanism.
- in the decoding process, the deep feature is first input into the decoder of the deep learning network using the multi-head attention mechanism to obtain the first sequence element of the restored transaction behavior sequence; then, iteratively, the deep feature together with the sequence element obtained by the previous decoding step is input into the decoder to restore the subsequent sequence elements, until the complete transaction behavior sequence is obtained by decoding.
- the deep feature obtained by encoding user A's transaction behavior sequence is denoted C.
- when this deep feature is input into the decoder of the deep learning network using the multi-head attention mechanism for decoding, the first sequence element of the restored transaction behavior sequence, S A1 ', is obtained; iterative processing then uses C and S A1 ' as the decoder input to obtain the second sequence element S A2 ', and so on until all sequence elements of the transaction behavior sequence are obtained.
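The iterative (autoregressive) decoding described above can be sketched generically; `decode_step` is a stand-in for the actual multi-head-attention decoder, which is not specified at this level of detail:

```python
def decode_sequence(deep_feature, decode_step, seq_len):
    """Autoregressively restore a transaction behavior sequence.

    decode_step(C, previous_element) -> next restored sequence element;
    it models the decoding function from the description and is a stand-in here.
    """
    restored = []
    previous = None
    for _ in range(seq_len):
        # the deep feature C plus the previously decoded element form the input
        previous = decode_step(deep_feature, previous)
        restored.append(previous)
    return restored

# toy stand-in: "decoding" just increments, so the loop structure is visible
toy_step = lambda c, prev: c + 1 if prev is None else prev + 1
restored_seq = decode_sequence(10, toy_step, 7)
```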
- Fig. 2 shows the processing principle of the decoding and encoding process in the embodiment of the present application.
- the input 210 of the encoder is the original sequence of transaction behaviors. Before input to the encoder, position encoding 220 is required.
- the encoder 230 includes a multi-head attention mechanism (Multi-head Attention) layer 231, a residual connection standardization (Add&norm) layer 232, and a forward feedback (Feed Forward) layer 233.
- the multi-head attention layer 231 performs h different projections on the three inputs Query, Key and Value (in the encoder, Query = Key = Value).
- each projection can use a different linear transformation; weight coefficients are then computed from Query and Key, and a weighted sum of Value is taken according to these coefficients, yielding h self-attention results; these results are concatenated and passed through a linear mapping to produce the output of the multi-head attention mechanism.
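A dependency-free sketch of the mechanism just described: per-head scaled dot-product attention, with the h results concatenated. The identity "projections" and the omitted final linear map are simplifying assumptions made here for brevity:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(query, key, value):
    """One head: weight coefficients from Query·Key, then a weighted sum of Value."""
    d = len(key[0])
    out = []
    for q in query:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in key]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, value))
                    for j in range(len(value[0]))])
    return out

def multi_head_attention(query, key, value, h):
    """h heads whose results are concatenated; real projections and the final
    linear mapping are omitted (assumptions of this sketch)."""
    heads = [attention(query, key, value) for _ in range(h)]
    return [sum((head[r] for head in heads), []) for r in range(len(query))]

x = [[1.0, 0.0], [0.0, 1.0]]          # in the encoder, Query = Key = Value
out = multi_head_attention(x, x, x, h=2)
```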
- the Feed Forward layer 233 applies a linear transformation to its input; the input and output matrices have the same dimensions, and the layer serves to further refine the learned deep features.
- the Add&norm layer 232 is used to counter the degradation problem in deep learning and avoid vanishing gradients.
- in practice, the output of the previous layer can be randomly deactivated (dropout), superimposed with the original input of that layer, and the result is then normalized.
- the outputs of the multi-head attention layer 231 and the feed-forward layer 233 are each processed by an Add&norm layer 232.
- the decoder 240 also includes a multi-head attention layer 241, a residual connection standardization (Add&norm) layer 242, and a feed forward layer 243.
- the difference from the encoder 230 is that the inputs of the multi-head attention layer 241 in the decoder 240 differ from those in the encoder 230.
- two of its inputs, Key and Value, are the output of the encoder, i.e., the deep feature.
- the other input, Query 250, is the restored transaction sequence shifted right by one position, i.e., the previous output of the decoder, and it is also position-encoded.
- the Add&norm layer 242 and the Feed Forward layer 243 in the decoder are similar to those in the encoder and are not described again here.
- the decoder also includes a linear layer 244, which can be a fully connected layer that maps the input back to the dimensions of the original transaction behavior sequence, thereby completing the decoding.
- the clustering algorithm used by the clustering module 330 can be chosen from hierarchical density-based spatial clustering of applications with noise (HDBSCAN), the k-means clustering algorithm (K_means), density-based spatial clustering of applications with noise (DBscan), the spectral clustering algorithm, and the like. Since the HDBSCAN algorithm can cluster highly similar targets together and produces hierarchical results, in some embodiments of this application, hierarchical density-based spatial clustering of applications with noise may be used to cluster the users according to the deep features and obtain the clustering results.
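Any of the listed algorithms can serve as the clustering step. As a dependency-free illustration, here is a minimal k-means (one of the named options) over toy "deep features"; a real deployment would more likely call a library implementation such as hdbscan or scikit-learn:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means over deep-feature vectors (lists of floats)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # assign each point to its nearest center (squared Euclidean distance)
        labels = [min(range(k),
                      key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
                  for pt in points]
        # recompute each center as the mean of its assigned points
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]]
labels = kmeans(features, k=2)
```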
- the loss function Loss(Decoder) of the deep learning network can be determined from the difference information between the restored transaction behavior sequence and the original transaction behavior sequence; for example, in an embodiment of the present application, it may be the squared difference between the restored and the original transaction behavior sequences.
- the squared difference can be calculated using the following formula:
- x i represents the i-th feature in the original transaction behavior sequence
- x pi represents the i-th feature in the restored transaction behavior sequence
- M represents the total number of features in the transaction behavior sequence.
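The squared-difference loss above translates directly to code. Whether the squared differences are summed or averaged is not fully specified here, so a plain sum is assumed:

```python
def decoder_loss(original, restored):
    """Loss(Decoder): sum over the M features of (x_i - x_pi)^2.

    A plain sum (rather than a mean) over the M features is assumed.
    """
    assert len(original) == len(restored)  # both sequences have M features
    return sum((x - xp) ** 2 for x, xp in zip(original, restored))

loss = decoder_loss([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```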
- the objective function Object(clustering) of the clustering is determined according to the clustering result.
- for example, it may be the sum of the standard deviations of the deep features corresponding to each category in the clustering result, expressed by the following formula:
- ⁇ is the adjusted value, which can be preset by the user according to the actual scene
- m is the number of clusters obtained after clustering processing
- std(C j ) is the standard deviation of the depth features in the jth class
- k is each The feature dimension of the sequence in the class
- f i represents the depth feature of the i-th dimension in each class
- std(f i ) represents the standard deviation of the depth feature of the i-th dimension in a certain class.
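The clustering objective can be sketched the same way: for each of the m clusters, sum the per-dimension standard deviations of its deep features and scale by λ. Population (rather than sample) standard deviation is an assumption of this sketch:

```python
import math

def cluster_std(cluster):
    """std(C_j): sum over the k feature dimensions of the (population) std of that dimension."""
    k = len(cluster[0])
    total = 0.0
    for i in range(k):
        dim = [feat[i] for feat in cluster]
        mean = sum(dim) / len(dim)
        total += math.sqrt(sum((v - mean) ** 2 for v in dim) / len(dim))
    return total

def clustering_objective(clusters, lam=1.0):
    """Object(clustering) = λ * Σ_j std(C_j); λ is the user-preset adjustment value."""
    return lam * sum(cluster_std(c) for c in clusters)

obj = clustering_objective([[[0.0, 0.0], [2.0, 0.0]], [[1.0, 1.0]]], lam=0.5)
```

A smaller objective means tighter clusters, which is why it is minimized jointly with the decoder loss.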
- the preset condition adopted by the iterative processing module 340 may be that the learning target is smaller than a preset value, e.g., Object(total) < L, or that the learning target reaches its minimum, e.g., min: Loss(Decoder) + Object(clustering).
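Combining the two terms gives the overall learning target and the stopping check; the threshold value used here stands in for the preset value L, which is not specified:

```python
def learning_target(decoder_loss_value, clustering_objective_value):
    """Object(total) = Loss(Decoder) + Object(clustering)."""
    return decoder_loss_value + clustering_objective_value

def meets_preset_condition(total, threshold=0.1):
    """One of the preset conditions described above: Object(total) < L.
    The threshold 0.1 is a made-up placeholder for L."""
    return total < threshold

total = learning_target(0.04, 0.03)
stop = meets_preset_condition(total)
```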
- the user clustering and feature learning device may take the clustering results and deep features obtained when the learning target meets the preset condition as the final output, thereby automatically learning suitable features while completing the clustering, instead of relying on manual, experience-based methods to obtain them.
- the user clustering and feature learning solution combines a clustering algorithm with the encoder-decoder model of a deep learning network.
- the user's transaction behavior sequence can first be determined from the user's transaction behavior data.
- the encoder of the deep learning network then encodes each user's transaction behavior sequence to generate deep features; while the users are clustered according to the deep features to obtain the clustering results, the decoder of the deep learning network decodes the deep features to obtain the restored transaction behavior sequence.
- the learning target is then determined from the clustering result and the decoding result, and the parameters of the encoder and decoder of the deep learning network are iteratively adjusted according to that target; while completing the clustering, the deep learning network can thus be optimized to obtain better deep features for clustering.
- a part of this application may be implemented as a computer program product, such as computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to this application through the operation of that computer.
- the program instructions that invoke the method of this application may be stored in a fixed or removable recording medium, and/or transmitted through a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of the computer device that runs according to the program instructions.
- some embodiments according to the present application include a computing device as shown in FIG. 4, which includes one or more memories 410 storing computer-readable instructions and a processor 420 for executing them.
- some embodiments of the present application also provide a computer-readable medium on which computer program instructions are stored; the computer-readable instructions can be executed by a processor to implement the methods and/or technical solutions of the foregoing embodiments of the present application.
- this application can be implemented in software and/or a combination of software and hardware.
- it can be implemented using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device.
- the software program of the present application may be executed by a processor to realize the above steps or functions.
- the software program (including related data structure) of the present application can be stored in a computer-readable recording medium, such as RAM memory, magnetic or optical drive, or floppy disk and similar devices.
- some steps or functions of the present application may be implemented by hardware, for example, as a circuit that cooperates with a processor to execute each step or function.
Abstract
This application provides a user clustering and feature learning solution that combines a clustering algorithm with the encoder-decoder model of a deep learning network. The user's transaction behavior sequence can first be determined from the user's transaction behavior data; an encoder of the deep learning network then encodes each user's transaction behavior sequence to generate deep features. While the users are clustered according to the deep features to obtain a clustering result, a decoder of the deep learning network decodes the deep features to obtain a restored transaction behavior sequence. A learning target is then determined from the clustering result and the decoding result, and the parameters of the encoder and decoder of the deep learning network are iteratively adjusted according to this target, so that, while the clustering is completed, the deep learning network can be optimized to obtain better deep features for clustering.
Description
本申请涉及信息技术领域,尤其涉及一种用户聚类及特征学习方法、设备、计算机可读介质。
随着互联网技术以及电子商务的发展,涌现了大量的电商平台,给消费者带来了便利。而电商平台中接入的用户数量也越来越多,虽然其中正常用户的数量一般占绝大多数,但其中也会隐藏一些实施非法行为的用户,此类用户会给电商平台以及使用电商平台的消费者带来风险。
为了能够识别出此类用户,目前常用的方式是使用聚类算法对所有用户进行划分,根据聚类结果了解风险客户的分布。而目前常用的聚类算法,如k均值聚类算法(K_means)、基于密度的噪声应用空间聚类算法(DBscan)等,虽然在一定数据集上有较好的表现,但是使用的聚类特征大都依赖人工的经验形成的,在数据集发生变化后其性能会显著降低,需要再次通过人工的方式重新确定聚类特征。
发明内容
本申请的一个目的是提供一种用户聚类及特征学习的方案,用以解决现有方案中无法在聚类的同时快速获得用于聚类的特征的问题。
本申请实施例提供了一种用户聚类及特征学习方法,该方法包括:获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交易行为序列,所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据;基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列,并根据所述深度特征对用户进行聚类,获取聚类结果;根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标,所述深度学习网络的损失函数根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,所述聚类的目标函数根据所述聚类结果确定;根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符 合预设条件。
本申请实施例还提供了一种用户聚类及特征学习设备,该设备包括:数据获取模块,用于获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交易行为序列,所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据;深度学习模块,用于基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;以及基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列;聚类模块,用于根据所述深度特征对用户进行聚类,获取聚类结果;
迭代处理模块,用于根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标,所述深度学习网络的损失函数根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,所述聚类的目标函数根据所述聚类结果确定;以及根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符合预设条件。
此外,本申请的一些实施例还提供了一种计算设备,该设备包括用于存储计算机程序指令的存储器和用于执行计算机程序指令的处理器,其中,当该计算机程序指令被该处理器执行时,触发所述用户聚类及特征学习方法。
本申请的另一些实施例还提供了一种计算机可读介质,其上存储有计算机程序指令,所述计算机可读指令可被处理器执行以实现所述用户聚类及特征学习方法。
本申请实施例提供的用户聚类及特征学习方案结合了聚类算法和深度学习网络中的编码解码模型,可以先基于用户的交易行为数据确定用户的交易行为序列,而后基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;在根据所述深度特征对用户进行聚类获取聚类结果的同时,基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列;而后根据聚类结果和解码结果确定学习目标,并根据学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,由此在完成聚类的同时,能够优化深度学习网络以获得更好的、用于实现聚类的深度特征。
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本申请的其它特征、目的和优点将会变得更明显:
图1为本申请实施例提供的一种用户聚类及特征学习方法的处理流程图;
图2为本申请的实施例中进行解码和编码过程的处理原理图;
图3为本申请实施例提供的一种用户聚类及特征学习设备的结构示意图;
图4为本申请实施例提供的一种用于实现用户聚类及特征学习的计算设备的结构示意图;
附图中相同或相似的附图标记代表相同或相似的部件。
下面结合附图对本申请作进一步详细描述。
在本申请一个典型的配置中,终端、服务网络的设备均包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的装置或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。
本申请实施例提供了一种用户聚类及特征学习方法,该方法结合了聚类算法和深度学习网络中的编码解码模型,能够根据聚类结果和解码结果确定学习目标,并根据学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,由此在完成聚类的同时,能够优化深度学习网络以获得更好的、用于实现聚类的深度特征,从而解决现有方案中无法在聚类的同时快速获得用于聚类的特征的问题。
在实际场景中,该方法的执行主体可以是用户设备、网络设备或者用户设备与网络设备通过网络相集成所构成的设备,此外也可以是运行于上述设备中的程序。所述用户 设备包括但不限于计算机、手机、平板电脑等各类终端设备;所述网络设备包括但不限于如网络主机、单个网络服务器、多个网络服务器集或基于云计算的计算机集合等实现。在此,云由基于云计算(Cloud Computing)的大量主机或网络服务器构成,其中,云计算是分布式计算的一种,由一群松散耦合的计算机集组成的一个虚拟计算机。
图1示出了本申请实施例提供的一种用户聚类及特征学习方法,该方法包括以下处理步骤S101至S106。
步骤S101,获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交易行为序列。
其中,所述用户的交易行为数据可以是任意能够反映出用户在交易过程中所实施的相关行为的数据。在本申请的一些实施例中,所述交易行为数据可以是用户在多个时间窗口所对应的多项交易行为信息,例如可以是连续几天内的交易金额、交易笔数、交易对象的数量、交易集中发生的时间段、交易对象的主要地域等。由于相似的用户之间,交易行为也会存在一定的相似性,例如都倾向于在每天同一时间段内进行交易,每天的交易笔数相似,每天的交易金额相似等,因此基于用户的交易行为数据作为输入数据,实现用户的聚类处理。
所述交易行为序列是基于用户的交易行为数据进行数据处理后,按照预设形式所呈现的数据内容。所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据,每个时间窗口内的交易行为数据可以采用向量的形式表示。
例如，对于一个用户A而言，以一周为统计周期，其交易行为序列可以包含7天内的交易行为数据，若每个时间窗口设定为1天，则该用户A的交易行为序列包含7个序列元素{S A1 ,S A2 ,S A3 ,S A4 ,S A5 ,S A6 ,S A7 }，每个序列元素即为1天中的交易行为数据组成的向量。若本实施例中所采用的交易行为数据包括了时间窗口内的交易金额、交易笔数、交易对象的数量和交易集中发生的时间段这4项交易行为信息，则每个序列元素对应的交易行为数据的向量为[时间窗口内的交易金额,交易笔数,交易对象的数量,交易集中发生的时间段]。此时，所述交易行为序列可以表示为如下的矩阵：
其中,矩阵中的每一个行表示一个序列元素对应的向量,每一个行中的向量元素依次分别为时间窗口内的交易金额,交易笔数,交易对象的数量,交易集中发生的时间段,例如,矩阵中的第一行[10000,20,8,17]即为第一天的交易行为数据,交易金额为10000,交易笔数为20,交易对象的数量为8,交易集中发生的时间段为第17个小时,即16:00:00-17:00:00。
在本申请的一些实施例中,也可以将每个序列元素(即一个时间窗口内的交易行为数据组成的向量)进行嵌入(embedding)处理,即,将每个序列元素视为自然语言处理中的单词,进行类似词嵌入的操作,具体的处理方式如下。
首先,对数据集中所有用户的每个时间窗口的交易行为数据进行编码,例如,本申请实施例中数据集中所有用户的各个时间窗口中交易行为数据有2000种不同的情况,即序列元素对应的向量一共有2000种,此时可以将这2000种不同的序列元素映射为2000个N维的向量。其中,N为embedding时的嵌入维度数,可以根据实际情况设定,例如可以设定512、256等。由此,所有的2000个序列元素可以如表1的形式所示:
序列元素序号 | 维度1 | 维度2 | 维度3 | …… | 维度N-1 | 维度N |
0 | 0.33645 | 0.823 | 0.9238 | …… | 0.7257 | 0.8446 |
1 | 0.54 | 0.701 | 0.957 | …… | 0.4029 | 0.923 |
2 | 0.844 | 0.854 | 0.17 | …… | 0.54029 | 0.7317 |
…… | …… | …… | …… | …… | …… | …… |
1998 | 0.029 | 0.364 | 0.4029 | …… | 0.446 | 0.257 |
1999 | 0.23 | 0.6731 | 0.29 | …… | 0.755 | 0.8462 |
表1
然后，可以根据每个用户实际包含的序列元素，将交易行为序列中的每个序列元素采用上述N维向量进行表示。例如，对于前述的用户A，其交易行为序列{S A1 ,S A2 ,S A3 ,S A4 ,S A5 ,S A6 ,S A7 }中的序列元素的序号分别为1、7、3、2、11、6、100。由此，进行嵌入处理后的交易行为序列可以表示为表2的形式：
排序位置 | 序列元素 | 维度1 | 维度2 | 维度3 | …… | 维度N-1 | 维度N |
1 | S[1] | 0.54 | 0.701 | 0.957 | …… | 0.4029 | 0.923 |
2 | S[7] | 0.113 | 0.657 | 0.732 | …… | 0.1001 | 0.255 |
3 | S[3] | 0.456 | 0.811 | 0.71 | …… | 0.565 | 0.875 |
4 | S[2] | 0.844 | 0.854 | 0.17 | …… | 0.54029 | 0.7317 |
5 | S[11] | 0.2315 | 0.2343 | 0.786 | …… | 0.1234 | 0.25 |
6 | S[6] | 0.213 | 0.752 | 0.875 | …… | 0.741 | 0.441 |
7 | S[100] | 0.23 | 0.6731 | 0.29 | …… | 0.755 | 0.8462 |
表2
由此,前述用户A的交易行为序列在进行embedding处理之后,可以表示为如下的矩阵Matrix_A:
步骤S102,基于深度学习网络的编码器(Encoder),将各个用户的交易行为序列进行编码,生成深度特征。
步骤S103,基于深度学习网络的解码器(Decoder),对所述深度特征进行解码,获得还原的交易行为序列。
在本申请的实施例中,是利用基于编码解码模型(Encoder-Decoder)的深度学习网络来实现深度特征的学习。编码解码模型的原理在于:可以通过迭代的方式调整编码器和解码器的参数,在解码还原的输入内容与原始的输入内容之间的差异小到足够的程度时,可以认为编码获得的深度特征具有足够的区分样本的能力,此时通过这些深度特征进行聚类,可以获得较好的聚类效果。
为了能够更好地获取各个交易行为序列内部各个向量之间的关系,从而提取聚类性能更好的深度特征,在编码和解码过程中,可以采用多头注意力机制(Multi-head attention)。
在采用多头注意力机制时,由于相同序列元素在处于交易行为序列的不同排序位置时,会体现出不同的信息,因此可以对输入的数据进行位置编码(Positional encoding),使得采用多头注意力机制的编码器能够获得更好的深度特征。由此,本申请一些实施例 中,在基于深度学习网络的编码器,将每个用户的交易行为序列进行编码,生成深度特征时,可以先对用户的交易行为序列进行位置编码,确定序列元素在交易行为序列中的相对位置信息,而后再将携带有相对位置信息的交易行为序列,输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
进行位置编码时,其目的在交易行为序列中插入位置编码信息,使得序列元素在交易行为序列中的相对位置信息能够被确定。在本申请的一些实施例中,可以根据序列元素在交易行为序列中的排序位置和元素序列的维度,确定序列元素在交易行为序列中的相对位置信息。以前述embedding处理后的用户A交易行为序列为例,所述排序位置对应于时间窗口顺序,例如第2天的交易行为数据对应的序列元素S[7],其排序位置为即为2,可以记为pos=2,元素序列的维度,即为embedding处理时所映射的向量的维度N,若本实施例中为512,则可以记为d_model=512,由此,位置编码信息可以表示为函数f(pos,i),其中,i∈[1,2,3,4,…,d_model]。
由此,用户A的交易行为序列的位置编码信息可以如下表3所示:
排序位置 | 维度1 | 维度2 | 维度3 | …… | 维度N-1 | 维度N |
1 | f(1,1) | f(1,2) | f(1,3) | …… | f(1,N-1) | f(1,N) |
2 | f(2,1) | f(2,2) | f(2,3) | …… | f(2,N-1) | f(2,N) |
3 | f(3,1) | f(3,2) | f(3,3) | …… | f(3,N-1) | f(3,N) |
4 | f(4,1) | f(4,2) | f(4,3) | …… | f(4,N-1) | f(4,N) |
5 | f(5,1) | f(5,2) | f(5,3) | …… | f(5,N-1) | f(5,N) |
6 | f(6,1) | f(6,2) | f(6,3) | …… | f(6,N-1) | f(6,N) |
7 | f(7,1) | f(7,2) | f(7,3) | …… | f(7,N-1) | f(7,N) |
表3
在实际场景中,确定位置编码信息时可以采用的三角函数,例如当i偶数时采用sin()函数,当i为奇数时采用cos()函数,此时位置信息可以由以下方式确定:
在获取到位置编码信息的具体数值之后,可以获得关于位置编码信息的矩阵 Matrix_P,该矩阵Matrix_P的维度与用户交易行为序列矩阵Matrix_A的维度相同,将两个矩阵相加之后即可获得包含位置编码信息的新矩阵Matrix_N=Matrix_A+Matrix_P。该新矩阵Matrix_N中携带有序列元素在交易行为序列中的相对位置信息,可以输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
本申请实施例中,在基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列时,也可以采用多头注意力机制。在解码过程中,首先将深度特征输入采用多头注意力机制的深度学习网络的解码器,获得还原的交易行为序列中的首个序列元素,而后进行迭代处理,将深度特征输入和前一次解码获得的序列元素,输入采用多头注意力机制的深度学习网络的解码器,还原的交易行为序列中的后续序列元素,直至解码获得完整的交易行为序列。
例如，本申请实施例中，用户A的交易行为序列在进行编码之后所获得深度特征表示为C，将该深度特征信息输入采用多头注意力机制的深度学习网络的解码器进行解码时，首先获得还原的交易行为序列中的首个序列元素，即S A1 '，而后进行迭代处理，将C和S A1 '作为解码器的输入，获得第二个序列元素S A2 '，直至获得所有交易行为序列中的所有序列元素。在本实施例中，后续序列元素可以表示为：S j '=f1(C,S j-1 ')，其中，S j '表示还原的第j个序列元素，即将前一次的输入右移一位之后作为本次的输入，f1()表示解码处理。
图2示出了本申请的实施例中进行解码和编码过程的处理原理。编码器的输入210为原始的交易行为序列,在输入编码器之前,需要进行位置编码220。
编码器230包括了多头注意力机制(Multi-head attention)层231,残差连接标准化(Add&norm)层232,前向反馈(Feed Forward)层233。首先,由Multi-head attention层231对三个输入Query、Key、Value做h次不同的投影,在编码器中Query=Key=Value,每次投影可以采用不同的线性变换,而后根据Query和Key计算权重系数,而后根据权重系数对Value进行加权求和,由此获得h个自注意力的结果,将这些结果拼接在一起,经过一个线性映射即可输出多头注意力机制的处理结果。Feed Forward层233对输入进行线性变换,其输入和输出的矩阵的维度是相同的,用于进一步优化学习深度特征。Add&norm层232用于解决深度学习中的退化问题,避免梯度消失,实际场景中可以对前一层的输出进行随机失活(dropout)处理之后,与前一层的原始输入进行叠加,而后对结果作标准化处理。Multi-head attention层231和Add&norm层232的输出均经过Add&norm层232的处理。
解码器240也包括了多头注意力机制(Multi-head attention)层241,残差连接标准化(Add&norm)层242,前向反馈(Feed Forward)层243。与编码器230中的区别在于,所述解码器240中Multi-head attention层241的输入与编码器230中不同,其中两个输入Key、Value即为编码器的输出结果,即深度特征,而另一输入250Query为还原的交易行为序列右移一位之后的序列元素,即解码器前一次的输出,并且也会进行位置编码。编码器中的Add&norm层242和Feed Forward层243与解码器中类似,此处不再赘述。此外,编码器还包括一线性(linear)层244,可以采用一全连接层,用于将输入进行映射,使其恢复到原始的交易行为序列的维度和大小,由此完成解码。
步骤S104,根据所述深度特征对用户进行聚类,获取聚类结果。其中,所述聚类算法可以选择基于层次密度的噪声应用空间聚类算法(HDBSCAN)、k均值聚类算法(K_means)、基于密度的噪声应用空间聚类算法(DBscan)、谱聚类(Spectral Clustering)算法等。由于HDBSCAN算法能将相似性很强的目标聚集在一起,且有层次结果,因此本申请的一些实施例中,可以采用基于层次密度的噪声应用空间聚类算法,根据所述深度特征对用户进行聚类,获取聚类结果。
在此,本领域技术人员应当理解,上述聚类的具体算法仅为举例,现有或今后出现的基于类似原理的其它形式如果能够适用于本申请,也应该包含在本申请的保护范围内,并以引用的形式包含于此。
步骤S105,根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标。所述深度学习网络的损失函数Loss(Decoder)可以根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,例如,本申请实施例中,可以是还原的交易行为序列与原始的交易行为序列的平方差,具体可以采用如下的计算公式:
其中，所述x i 表示原始的交易行为序列中的第i个特征，x pi 表示还原的交易行为序列中的第i个特征，M表示交易行为序列中的特征总数。
而所述聚类的目标函数Object(聚类)根据所述聚类结果确定，例如，本申请实施例中，可以是聚类结果中各个类别对应的深度特征的标准差之和，其公式表示如下：
其中，λ为调整值，可以由用户根据实际场景预先设定，m为聚类处理后获得的类数量，std(C j )为第j个类中深度特征的标准差，k为每个类中的序列的特征维度，f i 表示每个类中的第i维深度特征，std(f i )表示某个类中第i维深度特征的标准差。
在本申请的一些实施例中,学习目标Object(total)可以是前述损失函数与目标函数之和,即Object(total)=Loss(Decoder)+Object(聚类)。
步骤S106,根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符合预设条件。其中,所述预设条件可以是学习目标小于预设值,也可以是学习目标达到最小值,例如min:loss(Decoder)+Object(聚类)。
在本申请的一些实施例中，可以将学习目标符合预设条件时所获得的聚类结果以及深度特征，作为最终的输出内容，由此在完成聚类的同时，自动学习获得合适的特征，而非依赖人工的方式根据经验来得到。
基于同一发明构思,本申请实施例中还提供了一种用户聚类及特征学习设备,所述设备对应的方法是前述实施例中用户聚类及特征学习方法,并且其解决问题的原理与该方法相似。
本申请实施例提供了一种用户聚类及特征学习设备,该设备结合了聚类算法和深度学习网络中的编码解码模型,能够根据聚类结果和解码结果确定学习目标,并根据学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,由此在完成聚类的同时,能够优化深度学习网络以获得更好的、用于实现聚类的深度特征,从而解决现有方案中无法在聚类的同时快速获得用于聚类的特征的问题。
在实际场景中,该设备的具体实现可以是用户设备、网络设备或者用户设备与网络设备通过网络相集成所构成的设备,此外也可以是运行于上述设备中的程序。所述用户设备包括但不限于计算机、手机、平板电脑等各类终端设备;所述网络设备包括但不限于如网络主机、单个网络服务器、多个网络服务器集或基于云计算的计算机集合等实现。在此,云由基于云计算(Cloud Computing)的大量主机或网络服务器构成,其中,云计算是分布式计算的一种,由一群松散耦合的计算机集组成的一个虚拟计算机。
图3示出了本申请实施例提供的一种用户聚类及特征学习设备,该设备包括数据获取模块310、深度学习模块320、聚类模块330和迭代处理模块340。其中,所述数据获取模块310用于获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交 易行为序列。深度学习模块320用于基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;以及基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列。聚类模块330用于根据所述深度特征对用户进行聚类,获取聚类结果。迭代处理模块340用于根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标,以及根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符合预设条件。
其中,所述用户的交易行为数据可以是任意能够反映出用户在交易过程中所实施的相关行为的数据。在本申请的一些实施例中,所述交易行为数据可以是用户在多个时间窗口所对应的多项交易行为信息,例如可以是连续几天内的交易金额、交易笔数、交易对象的数量、交易集中发生的时间段、交易对象的主要地域等。由于相似的用户之间,交易行为也会存在一定的相似性,例如都倾向于在每天同一时间段内进行交易,每天的交易笔数相似,每天的交易金额相似等,因此基于用户的交易行为数据作为输入数据,实现用户的聚类处理。
所述交易行为序列是基于用户的交易行为数据进行数据处理后,按照预设形式所呈现的数据内容。所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据,每个时间窗口内的交易行为数据可以采用向量的形式表示。
例如，对于一个用户A而言，以一周为统计周期，其交易行为序列可以包含7天内的交易行为数据，若每个时间窗口设定为1天，则该用户A的交易行为序列包含7个序列元素{S A1 ,S A2 ,S A3 ,S A4 ,S A5 ,S A6 ,S A7 }，每个序列元素即为1天中的交易行为数据组成的向量。若本实施例中所采用的交易行为数据包括了时间窗口内的交易金额、交易笔数、交易对象的数量和交易集中发生的时间段这4项交易行为信息，则每个序列元素对应的交易行为数据的向量为[时间窗口内的交易金额,交易笔数,交易对象的数量,交易集中发生的时间段]。此时，所述交易行为序列可以表示为如下的矩阵：
其中,矩阵中的每一个行表示一个序列元素对应的向量,每一个行中的向量元素依次分别为时间窗口内的交易金额,交易笔数,交易对象的数量,交易集中发生的时间段, 例如,矩阵中的第一行[10000,20,8,17]即为第一天的交易行为数据,交易金额为10000,交易笔数为20,交易对象的数量为8,交易集中发生的时间段为第17个小时,即16:00:00-17:00:00。
在本申请的一些实施例中,也可以将每个序列元素(即一个时间窗口内的交易行为数据组成的向量)进行嵌入(embedding)处理,即,将每个序列元素视为自然语言处理中的单词,进行类似词嵌入的操作,具体的处理方式如下:
首先,对数据集中所有用户的每个时间窗口的交易行为数据进行编码,例如,本申请实施例中数据集中所有用户的各个时间窗口中交易行为数据有2000种不同的情况,即序列元素对应的向量一共有2000种,此时可以将这2000种不同的序列元素映射为2000个N维的向量。其中,N为embedding时的嵌入维度数,可以根据实际情况设定,例如可以设定512、256等。由此,所有的2000个序列元素可以如表1的形式所示。
然后，可以根据每个用户实际包含的序列元素，将交易行为序列中的每个序列元素采用上述N维向量进行表示。例如，对于前述的用户A，其交易行为序列{S A1 ,S A2 ,S A3 ,S A4 ,S A5 ,S A6 ,S A7 }中的序列元素的序号分别为1、7、3、2、11、6、100。由此，进行嵌入处理后的交易行为序列可以表示为表2的形式。
由此,前述用户A的交易行为序列在进行embedding处理之后,可以表示为如下的矩阵Matrix_A:
在本申请的实施例中,深度学习模块320是利用基于编码解码模型(Encoder-Decoder)的深度学习网络来实现深度特征的学习。编码解码模型的原理在于:可以通过迭代的方式调整编码器和解码器的参数,在解码还原的输入内容与原始的输入内容之间的差异小到足够的程度时,可以认为编码获得的深度特征具有足够的区分样本的能力,此时通过这些深度特征进行聚类,可以获得较好的聚类效果。
为了能够更好地获取各个交易行为序列内部各个向量之间的关系,从而提取聚类性能更好的深度特征,在编码和解码过程中,深度学习模块可以采用多头注意力机制(Multi-head attention)。
在采用多头注意力机制时,由于相同序列元素在处于交易行为序列的不同排序位置时,会体现出不同的信息,因此可以对输入的数据进行位置编码(Positional encoding),使得采用多头注意力机制的编码器能够获得更好的深度特征。由此,本申请一些实施例中,在基于深度学习网络的编码器,将每个用户的交易行为序列进行编码,生成深度特征时,深度学习模块可以先对用户的交易行为序列进行位置编码,确定序列元素在交易行为序列中的相对位置信息,而后再将携带有相对位置信息的交易行为序列,输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
进行位置编码时,其目的在交易行为序列中插入位置编码信息,使得序列元素在交易行为序列中的相对位置信息能够被确定。在本申请的一些实施例中,可以根据序列元素在交易行为序列中的排序位置和元素序列的维度,确定序列元素在交易行为序列中的相对位置信息。以前述embedding处理后的用户A交易行为序列为例,所述排序位置对应于时间窗口顺序,例如第2天的交易行为数据对应的序列元素S[7],其排序位置为即为2,可以记为pos=2,元素序列的维度,即为embedding处理时所映射的向量的维度N,若本实施例中为512,则可以记为d_model=512,由此,位置编码信息可以表示为函数f(pos,i),其中,i∈[1,2,3,4,…,d_model]。
由此,用户A的交易行为序列的位置编码信息可以如下表3所示。
在实际场景中,确定位置编码信息时可以采用的三角函数,例如当i偶数时采用sin()函数,当i为奇数时采用cos()函数,此时位置信息可以由以下方式确定:
在获取到位置编码信息的具体数值之后,可以获得关于位置编码信息的矩阵Matrix_P,该矩阵Matrix_P的维度与用户交易行为序列矩阵Matrix_A的维度相同,将两个矩阵相加之后即可获得包含位置编码信息的新矩阵Matrix_N=Matrix_A+Matrix_P。该新矩阵Matrix_N中携带有序列元素在交易行为序列中的相对位置信息,可以输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
本申请实施例中,在基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列时,深度学习模块也可以采用多头注意力机制。在解码过程中,首先将深度特征输入采用多头注意力机制的深度学习网络的解码器,获得还原的交易行为 序列中的首个序列元素,而后进行迭代处理,将深度特征输入和前一次解码获得的序列元素,输入采用多头注意力机制的深度学习网络的解码器,还原的交易行为序列中的后续序列元素,直至解码获得完整的交易行为序列。
例如，本申请实施例中，用户A的交易行为序列在进行编码之后所获得深度特征表示为C，将该深度特征信息输入采用多头注意力机制的深度学习网络的解码器进行解码时，首先获得还原的交易行为序列中的首个序列元素，即S A1 '，而后进行迭代处理，将C和S A1 '作为解码器的输入，获得第二个序列元素S A2 '，直至获得所有交易行为序列中的所有序列元素。在本实施例中，后续序列元素可以表示为：S j '=f1(C,S j-1 ')，其中，S j '表示还原的第j个序列元素，即将前一次的输入右移一位之后作为本次的输入，f1()表示解码处理。
图2示出了本申请的实施例中进行解码和编码过程的处理原理。编码器的输入210为原始的交易行为序列,在输入编码器之前,需要进行位置编码220。
编码器230包括了多头注意力机制(Multi-head attention)层231,残差连接标准化(Add&norm)层232,前向反馈(Feed Forward)层233。首先,由Multi-head attention层231对三个输入Query、Key、Value做h次不同的投影,在编码器中Query=Key=Value,每次投影可以采用不同的线性变换,而后根据Query和Key计算权重系数,而后根据权重系数对Value进行加权求和,由此获得h个自注意力的结果,将这些结果拼接在一起,经过一个线性映射即可输出多头注意力机制的处理结果。Feed Forward层233对输入进行线性变换,其输入和输出的矩阵的维度是相同的,用于进一步优化学习深度特征。Add&norm层232用于解决深度学习中的退化问题,避免梯度消失,实际场景中可以对前一层的输出进行随机失活(dropout)处理之后,与前一层的原始输入进行叠加,而后对结果作标准化处理。Multi-head attention层231和Add&norm层232的输出均经过Add&norm层232的处理。
解码器240也包括了多头注意力机制(Multi-head attention)层241,残差连接标准化(Add&norm)层242,前向反馈(Feed Forward)层243。与编码器230中的区别在于,所述解码器240中Multi-head attention层241的输入与编码器230中不同,其中两个输入Key、Value即为编码器的输出结果,即深度特征,而另一输入250Query为还原的交易行为序列右移一位之后的序列元素,即解码器前一次的输出,并且也会进行位置编码。编码器中的Add&norm层242和Feed Forward层243与解码器中类似,此处不再赘述。此外,编码器还包括一线性(linear)层244,可以采用一全连接层,用于将输 入进行映射,使其恢复到原始的交易行为序列的维度和大小,由此完成解码。
聚类模块330所采用的聚类算法可以选择基于层次密度的噪声应用空间聚类算法(HDBSCAN)、k均值聚类算法(K_means)、基于密度的噪声应用空间聚类算法(DBscan)、谱聚类(Spectral Clustering)算法等。由于HDBSCAN算法能将相似性很强的目标聚集在一起,且有层次结果,因此本申请的一些实施例中,可以采用基于层次密度的噪声应用空间聚类算法,根据所述深度特征对用户进行聚类,获取聚类结果。
在此,本领域技术人员应当理解,上述聚类的具体算法仅为举例,现有或今后出现的基于类似原理的其它形式如果能够适用于本申请,也应该包含在本申请的保护范围内,并以引用的形式包含于此。
所述深度学习网络的损失函数Loss(Decoder)可以根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,例如,本申请实施例中,可以是还原的交易行为序列与原始的交易行为序列的平方差,具体可以采用如下的计算公式:
其中，所述x i 表示原始的交易行为序列中的第i个特征，x pi 表示还原的交易行为序列中的第i个特征，M表示交易行为序列中的特征总数。
而所述聚类的目标函数Object(聚类)根据所述聚类结果确定，例如，本申请实施例中，可以是聚类结果中各个类别对应的深度特征的标准差之和，其公式表示如下：
其中，λ为调整值，可以由用户根据实际场景预先设定，m为聚类处理后获得的类数量，std(C j )为第j个类中深度特征的标准差，k为每个类中的序列的特征维度，f i 表示每个类中的第i维深度特征，std(f i )表示某个类中第i维深度特征的标准差。
在本申请的一些实施例中,学习目标Object(total)可以是前述损失函数与目标函数之和,即Object(total)=Loss(Decoder)+Object(聚类)。
迭代处理模块340所采用的预设条件可以是学习目标小于预设值,如Object(total)<L,也可以是学习目标达到最小值,如min:loss(Decoder)+Object(聚类)。
在本申请的一些实施例中，所述用户聚类及特征学习设备可以将学习目标符合预设条件时所获得的聚类结果以及深度特征，作为最终的输出内容，由此在完成聚类的同时，自动学习获得合适的特征，而非依赖人工的方式根据经验来得到。
综上所述,本申请实施例提供的用户聚类及特征学习方案,结合了聚类算法和深度学习网络中的编码解码模型,可以先基于用户的交易行为数据确定用户的交易行为序列,而后基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;在根据所述深度特征对用户进行聚类获取聚类结果的同时,基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列;而后根据聚类结果和解码结果确定学习目标,并根据学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,由此在完成聚类的同时,能够优化深度学习网络,以获得更好的、用于实现聚类的深度特征。
另外,本申请的一部分可被应用为计算机程序产品,例如计算机程序指令,当其被计算机执行时,通过该计算机的操作,可以调用或提供根据本申请的方法和/或技术方案。而调用本申请的方法的程序指令,可能被存储在固定的或可移动的记录介质中,和/或通过广播或其他信号承载媒体中的数据流而被传输,和/或被存储在根据程序指令运行的计算机设备的工作存储器中。在此,根据本申请的一些实施例包括一个如图4所示的计算设备,该设备包括存储有计算机可读指令的一个或多个存储器410和用于执行计算机可读指令的处理器420,其中,当该计算机可读指令被该处理器执行时,使得所述设备执行基于前述本申请的多个实施例的方法和/或技术方案。
此外,本申请的一些实施例还提供了一种计算机可读介质,其上存储有计算机程序指令,所述计算机可读指令可被处理器执行以实现前述本申请的多个实施例的方法和/或技术方案。
需要注意的是,本申请可在软件和/或软件与硬件的组合体中被实施,例如,可采用专用集成电路(ASIC)、通用目的计算机或任何其他类似硬件设备来实现。在一些实施例中,本申请的软件程序可以通过处理器执行以实现上文步骤或功能。同样地,本申请的软件程序(包括相关的数据结构)可以被存储到计算机可读记录介质中,例如,RAM存储器,磁或光驱动器或软磁盘及类似设备。另外,本申请的一些步骤或功能可采用硬件来实现,例如,作为与处理器配合从而执行各个步骤或功能的电路。
对于本领域技术人员而言,显然本申请不限于上述示范性实施例的细节,而且在不背离本申请的精神或基本特征的情况下,能够以其他的具体形式实现本申请。因此,无论从哪一点来看,均应将实施例看作是示范性的,而且是非限制性的,本申请的范围由 所附权利要求而不是上述说明限定,因此旨在将落在权利要求的等同要件的含义和范围内的所有变化涵括在本申请内。不应将权利要求中的任何附图标记视为限制所涉及的权利要求。此外,显然“包括”一词不排除其他单元或步骤,单数不排除复数。装置权利要求中陈述的多个单元或装置也可以由一个单元或装置通过软件或者硬件来实现。第一,第二等词语用来表示名称,而并不表示任何特定的顺序。
Claims (18)
- 一种用户聚类及特征学习方法,其中,该方法包括:获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交易行为序列,所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据;基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列,并根据所述深度特征对用户进行聚类,获取聚类结果;根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标,所述深度学习网络的损失函数根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,所述聚类的目标函数根据所述聚类结果确定;根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符合预设条件。
- 根据权利要求1所述的方法,其中,基于深度学习网络的编码器,将每个用户的交易行为序列进行编码,生成深度特征,包括:对用户的交易行为序列进行位置编码,确定序列元素在交易行为序列中的相对位置信息;将携带有相对位置信息的交易行为序列,输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
- 根据权利要求2所述的方法,其中,基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列,包括:将深度特征输入采用多头注意力机制的深度学习网络的解码器,获得还原的交易行为序列中的首个序列元素;将深度特征输入和前一次解码获得的序列元素,输入采用多头注意力机制的深度学习网络的解码器,还原的交易行为序列中的后续序列元素。
- 根据权利要求2所述的方法,其中,对用户的交易行为序列进行位置编码,确定序列元素在交易行为序列中的相对位置信息,包括:根据序列元素在交易行为序列中的排序位置和元素序列的维度,确定序列元素在交易行为序列中的相对位置信息。
- 根据权利要求1所述的方法,其中,所述交易行为数据包括在多个时间窗口所对应的多项交易行为信息。
- 根据权利要求1所述的方法,其中,所述深度学习网络的损失函数为还原的交易 行为序列与原始的交易行为序列的平方差。
- 根据权利要求1所述的方法,其中,所述聚类的目标函数为聚类结果中各个类别对应的深度特征的标准差之和。
- 根据权利要求1所述的方法,其中,根据所述深度特征对用户进行聚类,获取聚类结果,包括:采用基于层次密度的噪声应用空间聚类算法,根据所述深度特征对用户进行聚类,获取聚类结果。
- 一种用户聚类及特征学习设备,其中,该设备包括:数据获取模块,用于获取用户的交易行为数据,并根据所述交易行为数据确定各个用户的交易行为序列,所述交易行为序列中的序列元素用于表示所述用户在一个时间窗口内的交易行为数据;深度学习模块,用于基于深度学习网络的编码器,将各个用户的交易行为序列进行编码,生成深度特征;以及基于深度学习网络的解码器,对所述深度特征进行解码,获得还原的交易行为序列;聚类模块,用于根据所述深度特征对用户进行聚类,获取聚类结果;迭代处理模块,用于根据所述深度学习网络的损失函数和聚类的目标函数确定学习目标,所述深度学习网络的损失函数根据还原的交易行为序列与原始的交易行为序列之间的差异信息确定,所述聚类的目标函数根据所述聚类结果确定;以及根据所述学习目标对所述深度学习网络的编码器和解码器的参数进行迭代调整,以使所述学习目标符合预设条件。
- 根据权利要求9所述的设备,其中,所述深度学习模块,用于对用户的交易行为序列进行位置编码,确定序列元素在交易行为序列中的相对位置信息;将携带有相对位置信息的交易行为序列,输入采用多头注意力机制的深度学习网络的编码器,获得深度特征。
- 根据权利要求10所述的设备,其中,所述深度学习模块,用于将深度特征输入采用多头注意力机制的深度学习网络的解码器,获得还原的交易行为序列中的首个序列元素;将深度特征输入和前一次解码获得的序列元素,输入采用多头注意力机制的深度学习网络的解码器,还原的交易行为序列中的后续序列元素。
- 根据权利要求10所述的设备,其中,所述深度学习模块,用于根据序列元素在交易行为序列中的排序位置和元素序列的维度,确定序列元素在交易行为序列中的相对位置信息。
- 根据权利要求9所述的设备,其中,所述交易行为数据包括在多个时间窗口所对应的多项交易行为信息。
- 根据权利要求9所述的设备,其中,所述深度学习网络的损失函数为还原的交易行为序列与原始的交易行为序列的平方差。
- 根据权利要求9所述的设备,其中,所述聚类的目标函数为聚类结果中各个类别对应的深度特征的标准差之和。
- 根据权利要求1所述的设备,其中,所述聚类模块,用于采用基于层次密度的噪声应用空间聚类算法,根据所述深度特征对用户进行聚类,获取聚类结果。
- 一种计算设备,其中,该设备包括用于存储计算机程序指令的存储器和用于执行计算机程序指令的处理器,其中,当该计算机程序指令被该处理器执行时,触发所述设备执行权利要求1至8中任一项所述的方法。
- 一种计算机可读介质,其上存储有计算机程序指令,所述计算机可读指令可被处理器执行以实现如权利要求1至8中任一项所述的方法。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911115032.5A CN111062416B (zh) | 2019-11-14 | 2019-11-14 | 用户聚类及特征学习方法、设备、计算机可读介质 |
CN201911115032.5 | 2019-11-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021093368A1 (zh) | 2021-05-20 |
Family
ID=70298556
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/104002 WO2021093368A1 (zh) | 2019-11-14 | 2020-07-24 | User clustering and feature learning method, device, and computer-readable medium |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN111062416B (zh) |
TW (1) | TWI752485B (zh) |
WO (1) | WO2021093368A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111062416B (zh) * | 2019-11-14 | 2021-09-21 | Alipay (Hangzhou) Information Technology Co., Ltd. | User clustering and feature learning method, device, and computer-readable medium |
CN111340506A (zh) * | 2020-05-22 | 2020-06-26 | Alipay (Hangzhou) Information Technology Co., Ltd. | Risk identification method and apparatus for transaction behavior, storage medium, and computer device |
CN112000863B (zh) * | 2020-08-14 | 2024-04-09 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method, apparatus, device, and medium for analyzing user behavior data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105825269A (zh) * | 2016-03-15 | 2016-08-03 | Institute of Computing Technology, Chinese Academy of Sciences | A feature learning method and system based on parallel auto-encoders |
US10068251B1 (en) * | 2008-06-26 | 2018-09-04 | Amazon Technologies, Inc. | System and method for generating predictions based on wireless commerce transactions |
CN108734338A (zh) * | 2018-04-24 | 2018-11-02 | Alibaba Group Holding Ltd. | Credit risk prediction method and apparatus based on an LSTM model |
CN109389166A (zh) * | 2018-09-29 | 2019-02-26 | 聚时科技(上海)有限公司 | Deep transfer embedded clustering machine learning method based on local structure preservation |
CN111062416A (zh) * | 2019-11-14 | 2020-04-24 | Alipay (Hangzhou) Information Technology Co., Ltd. | User clustering and feature learning method, device, and computer-readable medium |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104915386B (zh) * | 2015-05-25 | 2018-04-27 | Institute of Automation, Chinese Academy of Sciences | A short text clustering method based on deep semantic feature learning |
CN106055699B (zh) * | 2016-06-15 | 2018-07-06 | Tencent Technology (Shenzhen) Co., Ltd. | A feature clustering method and apparatus |
CN106203624B (zh) * | 2016-06-23 | 2019-06-21 | Shanghai Jiao Tong University | Vector quantization system and method based on deep neural networks |
US10846308B2 (en) * | 2016-07-27 | 2020-11-24 | Anomalee Inc. | Prioritized detection and classification of clusters of anomalous samples on high-dimensional continuous and mixed discrete/continuous feature spaces |
US10657376B2 (en) * | 2017-03-17 | 2020-05-19 | Magic Leap, Inc. | Room layout estimation methods and techniques |
CN110298663B (zh) * | 2018-03-22 | 2023-04-28 | China UnionPay Co., Ltd. | Fraud transaction detection method based on sequential wide and deep learning |
CN108647730B (zh) * | 2018-05-14 | 2020-11-24 | Institute of Computing Technology, Chinese Academy of Sciences | A data partitioning method and system based on historical behavior co-occurrence |
CN109165950B (zh) * | 2018-08-10 | 2023-02-03 | Harbin Institute of Technology (Weihai) | An abnormal transaction identification method, device, and readable storage medium based on financial time-series features |
CN109753608B (zh) * | 2019-01-11 | 2023-08-04 | Tencent Technology (Shenzhen) Co., Ltd. | Method for determining user tags, and training method and apparatus for an auto-encoding network |
CN110260914B (zh) * | 2019-05-06 | 2020-06-19 | Hohai University | A region division method for engineering safety monitoring systems based on spatio-temporal features of measuring points |
CN110390358A (zh) * | 2019-07-23 | 2019-10-29 | Yang Yong | A deep learning method based on feature clustering |
- 2019
  - 2019-11-14 CN CN201911115032.5A patent/CN111062416B/zh active Active
- 2020
  - 2020-05-06 TW TW109115042A patent/TWI752485B/zh active
  - 2020-07-24 WO PCT/CN2020/104002 patent/WO2021093368A1/zh active Application Filing
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114220496A (zh) * | 2021-11-30 | 2022-03-22 | South China University of Technology | A deep-learning-based retrosynthesis prediction method, apparatus, medium, and device |
CN114792113A (zh) * | 2022-05-10 | 2022-07-26 | National University of Defense Technology | Graph clustering method and apparatus based on a multi-order neighbor information passing and fusion clustering network |
CN114973407A (zh) * | 2022-05-10 | 2022-08-30 | South China University of Technology | An RGB-D based video 3D human pose estimation method |
CN114973407B (zh) * | 2022-05-10 | 2024-04-02 | South China University of Technology | An RGB-D based video 3D human pose estimation method |
CN115098672A (zh) * | 2022-05-11 | 2022-09-23 | Hefei University of Technology | User demand discovery method and system based on multi-view deep clustering |
CN115463430A (zh) * | 2022-08-26 | 2022-12-13 | Hangzhou Electronic Soul Network Technology Co., Ltd. | Method, system, electronic apparatus, and storage medium for screening game user groups |
CN116129330A (zh) * | 2023-03-14 | 2023-05-16 | Alibaba (China) Co., Ltd. | Video-based image processing, behavior recognition, segmentation, and detection method and device |
CN116129330B (zh) * | 2023-03-14 | 2023-11-28 | Alibaba (China) Co., Ltd. | Video-based image processing, behavior recognition, segmentation, and detection method and device |
CN116068910A (zh) * | 2023-04-06 | 2023-05-05 | Jiangxi University of Finance and Economics | A big-data-based smart home control method and system |
CN116932766A (zh) * | 2023-09-15 | 2023-10-24 | Tencent Technology (Shenzhen) Co., Ltd. | Object classification method, apparatus, device, storage medium, and program product |
CN116932766B (zh) * | 2023-09-15 | 2023-12-29 | Tencent Technology (Shenzhen) Co., Ltd. | Object classification method, apparatus, device, storage medium, and program product |
Also Published As
Publication number | Publication date |
---|---|
TW202119254A (zh) | 2021-05-16 |
CN111062416A (zh) | 2020-04-24 |
TWI752485B (zh) | 2022-01-11 |
CN111062416B (zh) | 2021-09-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021093368A1 (zh) | User clustering and feature learning method, device, and computer-readable medium | |
US11157693B2 (en) | Stylistic text rewriting for a target author | |
US10713654B2 (en) | Enterprise blockchains and transactional systems | |
US11263223B2 (en) | Using machine learning to determine electronic document similarity | |
US20170212781A1 (en) | Parallel execution of blockchain transactions | |
WO2017143914A1 (zh) | Method and training system for training a model using training data | |
CN111651573B (zh) | Intelligent customer service dialogue reply generation method, apparatus, and electronic device | |
US11593622B1 (en) | Artificial intelligence system employing graph convolutional networks for analyzing multi-entity-type multi-relational data | |
US11003910B2 (en) | Data labeling for deep-learning models | |
CN112231592B (zh) | Graph-based network community discovery method, apparatus, device, and storage medium | |
WO2018032982A1 (zh) | Method and apparatus for detecting fund transaction paths in an electronic payment process | |
US11687535B2 (en) | Automatic computation of features from a data stream | |
CN107729944B (zh) | Vulgar image recognition method, apparatus, server, and storage medium | |
CN112470172B (zh) | Computational efficiency of symbol sequence analysis using random sequence embedding | |
US11507845B2 (en) | Hybrid model for data auditing | |
CN114943279A (zh) | Prediction method, device, and system for bidding cooperation relationships | |
US9201967B1 (en) | Rule based product classification | |
US20220335270A1 (en) | Knowledge graph compression | |
US20220147852A1 (en) | Mitigating partiality in regression models | |
US11861459B2 (en) | Automatic determination of suitable hyper-local data sources and features for modeling | |
US20200327615A1 (en) | Portfolio risk measures aggregation | |
CN113010666B (zh) | Abstract generation method and apparatus, computer system, and readable storage medium | |
CN112307334B (zh) | Information recommendation method and apparatus, storage medium, and electronic device | |
CN114707591A (zh) | Data processing method, and training method and apparatus for a data processing model | |
US10755130B2 (en) | Image compression based on textual image content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20887174 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20887174 Country of ref document: EP Kind code of ref document: A1 |