CN104636486B

CN104636486B - User feature extraction method and extraction device based on non-negative alternating direction transformation

Info

Publication number: CN104636486B
Application number: CN201510087359.1A
Authority: CN
Inventors: 许明; 罗辛; 张能锋; 袁野; 吴迪; 夏云霓
Original assignee: Worth Watching Cloud Technology Co ltd; Chongqing Institute of Green and Intelligent Technology of CAS
Current assignee: Shenzhen Wanjiaan Interconnected Technology Co ltd
Priority date: 2015-02-25
Filing date: 2015-02-25
Publication date: 2018-01-02
Anticipated expiration: 2035-02-25
Also published as: CN104636486A

Abstract

The invention provides a user feature extraction method and extraction device based on non-negative alternating direction transformation. The extraction device includes a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistics collected by the server and passes them to the data storage module for storage. The data storage module is connected to the execution module, which executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module. The invention acts directly on the set of known entries in the user behavior statistics matrix, can process extremely sparse user behavior statistics matrices with a large number of missing values, converges quickly, restores data with high accuracy, and can solve the user feature extraction problem in big data processing environments.

Description

A user feature extraction method and extraction device based on non-negative alternating direction transformation

Technical field

The invention relates to the technical field of computer big data processing, and in particular to a user feature extraction method and extraction device based on non-negative alternating direction transformation for e-commerce systems.

Background

Modern large-scale e-commerce systems have enormous numbers of users and volumes of information. In such systems, users' objective behaviors, such as clicking, browsing, commenting and searching, accumulate over the system's operating time into a huge historical user behavior data set. The data volume is at least at the terabyte level, which makes this a typical big data environment.

In a large-scale e-commerce system, a typical data structure is the user behavior statistics matrix, in which each row corresponds to a user and each column corresponds to an item. An item is any objective object in the system that a user may operate on, such as a news article, a picture or a product. Each matrix element holds the historical behavior data of a single user on a single item, quantified from that user's objective historical behavior on the item by mathematical statistical methods that conform to natural laws. Because the numbers of users and items are very large, the user behavior statistics matrix is also very large. At the same time, no user can operate on every item, and no item is operated on by every user; in general, the known entries of the matrix are far fewer than the unknown ones, so the matrix is extremely sparse.
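As an illustration only, the known entries of such a matrix can be held as a sparse mapping from (user, item) pairs to values, so that unknown cells take no storage and are never confused with zeros. The names `known`, `density` and `by_user` and the toy values below are hypothetical and do not come from the patent.

```python
# Hypothetical toy data: (user, item) -> behavior statistic r_{u,i}.
# Only observed pairs are stored; every other cell is unknown, not zero.
known = {
    (0, 0): 4.0,
    (0, 2): 1.5,
    (1, 1): 3.0,
    (2, 2): 5.0,
}
num_users, num_items = 3, 4


def density(known, num_users, num_items):
    """Fraction of matrix cells that hold a known value."""
    return len(known) / (num_users * num_items)


def by_user(known):
    """Group known entries per user: u -> list of (item, value)."""
    rows = {}
    for (u, i), r in known.items():
        rows.setdefault(u, []).append((i, r))
    return rows


print(density(known, num_users, num_items))
print(by_user(known)[0])
```

With realistic user and item counts the density is typically very small, which is the sparsity the text describes.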

During system operation, extracting user features from the known entries of the user behavior statistics matrix makes it possible to analyze user behavior effectively and to mine regularities such as user categories and behavior patterns. Keeping the user features non-negative during extraction is essential, because non-negative features better match the natural fact that user behavior data in e-commerce systems are positive, and therefore characterize user behavior better. Existing non-negative feature extraction techniques are mostly used in computer vision. Their basic approach is to treat a given figure or image as a full-rank matrix and factorize it under non-negativity constraints, thereby extracting local object features of the figure or image. User feature extraction in e-commerce systems, however, differs greatly from non-negative object feature extraction in computer vision. The matrices derived from figures and images in computer vision are full-rank matrices without missing values, so their non-negative matrix factorization can be handled with conventional iterative matrix operations. By contrast, the user behavior statistics matrix in an e-commerce system is usually extremely sparse and contains a large number of missing values; it cannot be processed by traditional matrix factorization and requires non-negative latent feature analysis that can operate on sparse matrices. Existing non-negative latent feature analysis methods, however, suffer from slow convergence and low data restoration accuracy.

Therefore, how to perform non-negative latent feature analysis with fast convergence and high data restoration accuracy on user behavior statistics matrices with a large number of missing values, so as to obtain user features that describe the natural regularities of user behavior well, is a key problem in analyzing the massive data generated by modern large-scale e-commerce systems.

Summary of the invention

To overcome the above defects of the prior art, the object of the present invention is to provide a user feature extraction method and extraction device based on non-negative alternating direction transformation. The invention acts directly on the set of known entries in the user behavior statistics matrix, can process extremely sparse user behavior statistics matrices with a large number of missing values, converges quickly, restores data with high accuracy, and can solve the user feature extraction problem in big data processing environments.

To achieve this object, the present invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:

S1. The server sends the extraction device an instruction to extract user features;

S2. The extraction device receives the instruction and initializes the parameters, which include: feature space dimension f, dual learning rate η, Lagrangian enhancement factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence threshold;

S3. The extraction device constructs the cumulative absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix;

S4. The extraction device applies constraints to the cumulative absolute error ε(P, Q, X, Y) to keep the entries of the matrices P and Q non-negative during training;

S5. The extraction device constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters;

S6. The extraction device judges whether the iteration control variable t has reached the upper limit n; if so, step S9 is executed; if not, step S7 is executed;

S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C of the user behavior statistics matrix; if so, step S9 is executed; if not, step S8 is executed;

S8. The extraction device iteratively trains P, Q, X, Y, Γ and Κ on the known data in the known data set C of the user behavior statistics matrix, and then returns to step S6;

S9. The extraction device outputs the user feature matrix X and the item feature matrix Y obtained through iterative training and stores them in the acquired feature storage unit of the data storage module.
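The control flow of steps S6 to S9 can be sketched as a simple loop. This is a minimal skeleton under stated assumptions, not the patented implementation: `run_extraction`, `train_once` and `loss` are hypothetical names, and the training pass itself is left abstract.

```python
def run_extraction(train_once, loss, n, threshold):
    """Loop skeleton for steps S6 to S9 (illustrative sketch).

    train_once: callable performing one training pass (step S8).
    loss: callable returning the current value of the unified loss L.
    n: iteration upper limit; threshold: convergence threshold.
    Returns the number of training passes executed.
    """
    t = 0
    previous = None
    while t < n:                      # S6: stop once t reaches n
        current = loss()
        if previous is not None and abs(current - previous) < threshold:
            break                     # S7: loss change is below the threshold
        previous = current
        train_once()                  # S8: one training pass over C
        t += 1
    return t                          # S9 would output X and Y at this point
```

Passing the training pass and the loss as callables keeps the loop independent of how P, Q, X, Y, Γ and Κ are actually updated.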

In this method, the feature space dimension f in step S2 is the dimension of the feature space in which the user features lie; it determines the dimension of the feature vectors and may be any positive integer.

The dual learning rate η is the learning rate used to train the Lagrange multipliers in the unified loss function; it is a floating-point number in the interval (0.0001, 0.05).

The Lagrangian enhancement factor λ is the factor that expresses the constraints in penalized form in the unified loss function; it is any decimal in the interval (0.01, 0.1).

The user feature matrix X holds the features to be extracted. It is an |A|×f matrix, where A denotes the set of users stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is that user's feature vector. In the embodiment of the present invention, each element of X is initialized to a random number in the open interval (0.4, 0.8).

The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of user features; each is an |A|×f matrix.

The item feature matrix Y holds the features to be extracted. It is a |B|×f matrix, where B denotes the set of items stored in the storage unit of the device. Each row of Y corresponds to one item, and each row vector of Y is the feature vector describing all users' operations on that item.

The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative training of user features; each is a |B|×f matrix.

The iteration control variable t controls the feature training process and is initialized to 0.

The iteration upper limit n bounds the number of training iterations and may be any positive integer.

The convergence threshold is the threshold parameter used to judge whether the iterative training has converged.
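A minimal initialization sketch for step S2 follows. The default values (f = 25, η = 0.001, λ = 0.05, n = 500, threshold 0.0001, zero-filled auxiliary matrices) are the example values given in the detailed description; the function name, the dictionary layout and the choice to initialize Y in the same range as X are assumptions made here for illustration.

```python
import random


def init_parameters(num_users, num_items, f=25, seed=0):
    """Build the step S2 parameter set with the embodiment's example values."""
    rng = random.Random(seed)

    def random_matrix(rows):
        # Values drawn from the range 0.4 to 0.8, as stated for X.
        return [[rng.uniform(0.4, 0.8) for _ in range(f)] for _ in range(rows)]

    def zero_matrix(rows):
        return [[0.0] * f for _ in range(rows)]

    params = {
        "f": f, "eta": 0.001, "lam": 0.05,
        "t": 0, "n": 500, "threshold": 0.0001,
        "X": random_matrix(num_users),
        # Assumption: Y is initialized like X; the text gives no range for Y.
        "Y": random_matrix(num_items),
    }
    for name in ("X_U", "X_D", "X_C"):
        params[name] = zero_matrix(num_users)
    for name in ("Y_U", "Y_D", "Y_C"):
        params[name] = zero_matrix(num_items)
    return params
```

Starting X from strictly positive values keeps the initial features inside the non-negative region that the later constraints enforce.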

This method acts directly on the set of known entries in the user behavior statistics matrix, can process extremely sparse user behavior statistics matrices with a large number of missing values, converges quickly, restores data with high accuracy, and can handle the user feature extraction problem in big data processing environments.

Preferably, the formula for calculating the absolute error in step S3 is:

s.t. P = X, P ≥ 0,

Q = Y, Q ≥ 0,

where C denotes the set of known entries in the user behavior statistics matrix; r_{u,i} denotes the element in row u, column i of the user behavior statistics matrix, that is, the historical behavior statistic of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix and Q is the item feature constraint matrix.
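The objective expression itself is not reproduced in this text. As a hedged sketch, the code below assumes the squared-error form commonly used in latent factor models, summed only over the known set C; the patent's actual expression may differ. `predict` and `cumulative_error` are hypothetical names.

```python
def predict(x_row, y_row):
    """Estimate r_{u,i} as the inner product of row u of X and row i of Y."""
    return sum(xk * yk for xk, yk in zip(x_row, y_row))


def cumulative_error(known, X, Y):
    """Half the sum of squared residuals over known entries only.

    Assumed form, not taken from the patent text. `known` maps
    (u, i) -> r_{u,i}; unknown cells are skipped entirely.
    """
    return 0.5 * sum(
        (r - predict(X[u], Y[i])) ** 2 for (u, i), r in known.items()
    )
```

Because the sum runs over C only, the cost of one evaluation grows with the number of known entries rather than with |A|×|B|.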

In step S3, constructing the cumulative absolute error ε(P, Q, X, Y) fully expresses both the error and the non-negativity constraints, and also sets up the conditions for introducing the dual parameters.

In step S4, constraining the cumulative absolute error ε(P, Q, X, Y) ensures that the relevant model parameters stay non-negative during training.

In step S5, the unified loss function is constructed by using the Lagrange multiplier method to combine the loss function with its constraints, so that the constraints are satisfied during training.

Preferably, step S4 comprises the following steps:

S4-1. For each element p_{u,k} of P, if it is not equal to the corresponding element x_{u,k} of X, set p_{u,k} = x_{u,k};

S4-2. For each element q_{i,k} of Q, if it is not equal to the corresponding element y_{i,k} of Y, set q_{i,k} = y_{i,k};

S4-3. For each element p_{u,k} of P, if it is less than 0, set p_{u,k} = 0;

S4-4. For each element q_{i,k} of Q, if it is less than 0, set q_{i,k} = 0.

Here p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P, and q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q.
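Steps S4-1 to S4-4 amount to copying the feature matrix into its constraint matrix and then clipping negative entries to zero. The sketch below applies this to one pair of matrices stored as lists of lists; `enforce_constraints` is a hypothetical helper name.

```python
def enforce_constraints(P, X):
    """Apply S4-1 then S4-3 in place: align P with X, then clip negatives.

    Calling it as enforce_constraints(Q, Y) covers S4-2 and S4-4.
    X itself is left unchanged.
    """
    for u, x_row in enumerate(X):
        for k, x in enumerate(x_row):
            if P[u][k] != x:      # S4-1: make p_{u,k} equal to x_{u,k}
                P[u][k] = x
            if P[u][k] < 0:       # S4-3: keep p_{u,k} non-negative
                P[u][k] = 0.0
    return P
```

After the call, P equals X wherever X is non-negative and is zero elsewhere.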

Preferably, the loss function in step S5 is calculated by the formula:

where Γ and Κ are dual parameters, γ_{u,k} denotes the element in row u, column k of Γ, and κ_{i,k} denotes the element in row i, column k of Κ. The formula uses the augmented Lagrangian method, which extends the Lagrange multiplier method with a penalty term for each constraint. ρ is a penalty parameter of the augmented Lagrangian method; it corresponds to the matrix X and is a constant in the computation.
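The loss formula is not reproduced in this text. For orientation only, a textbook augmented Lagrangian for the constraints P = X and Q = Y, written with the symbols defined above, has the following shape. This is an assumed generic form; the patent's formula may differ in signs, scaling or in how ε depends on P and Q.

```latex
L(P,Q,X,Y,\Gamma,K)
  = \varepsilon(P,Q,X,Y)
  + \sum_{u,k} \gamma_{u,k}\,\bigl(p_{u,k} - x_{u,k}\bigr)
  + \sum_{i,k} \kappa_{i,k}\,\bigl(q_{i,k} - y_{i,k}\bigr)
  + \sum_{u,k} \frac{\rho_u}{2}\,\bigl(p_{u,k} - x_{u,k}\bigr)^2
  + \sum_{i,k} \frac{\tau_i}{2}\,\bigl(q_{i,k} - y_{i,k}\bigr)^2
```

The linear terms carry the multipliers γ and κ, while the quadratic terms are the penalty terms weighted by ρ and τ.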

Preferably, the iterative training in step S8 comprises the following steps:

S8-1. Determine the iterative training objective: find all parameters P, Q, X, Y, Γ and Κ such that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and Κ on the known data set C of the user behavior statistics matrix, expressed as the formula:

Here τ is a penalty parameter of the augmented Lagrangian method; it corresponds to the matrix Y and is a constant in the computation.

S8-2. Using non-negative alternating direction transformation, train the individual elements of P, Q, X, Y, Γ and Κ sequentially; the training rule is expressed as the formula:

for k = 1 ~ f,

S8-3. For each element of P, Q, X, Y, Γ and Κ, perform the training update according to the following formula:

ρ_u = λ|C(u)|, τ_i = λ|C(i)|;

where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i respectively. In τ_i, τ is the penalty parameter of the augmented Lagrangian method corresponding to the matrix Y, and i corresponds to row i of Y.
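The relations ρ_u = λ|C(u)| and τ_i = λ|C(i)| can be computed directly from the known entries. In this small sketch, `penalty_parameters` is a hypothetical name and `known` is assumed to be a mapping keyed by (user, item) pairs.

```python
from collections import Counter


def penalty_parameters(known, lam):
    """Return (rho, tau) with rho[u] = lam * |C(u)| and tau[i] = lam * |C(i)|."""
    user_counts = Counter(u for (u, _i) in known)   # |C(u)| per user
    item_counts = Counter(i for (_u, i) in known)   # |C(i)| per item
    rho = {u: lam * count for u, count in user_counts.items()}
    tau = {i: lam * count for i, count in item_counts.items()}
    return rho, tau
```

Users and items with more known entries therefore receive proportionally larger penalty parameters.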

The present invention also provides an extraction device for the user feature extraction method based on non-negative alternating direction transformation, comprising a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistics collected by the server and passes them to the data storage module for storage. The data storage module is connected to the execution module, which executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.

The data receiving module collects the server's user behavior statistics, the execution module executes the user feature extraction instruction, and the data storage module stores both the user behavior statistics collected by the data receiving module and the user feature data extracted by the execution module. The device acts directly on the set of known entries in the user behavior statistics matrix, can process extremely sparse user behavior statistics matrices with a large number of missing values, and can solve the user feature extraction problem in big data processing environments.

Further, the data storage module includes an acquired feature storage unit and a statistical data storage unit. The acquired feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistical data storage unit is connected to the data receiving module and stores the user behavior statistics passed on by the data receiving module.

Storing the user behavior statistics and the extracted user feature data in separate units makes data retrieval more convenient, accurate and fast.

Further, the execution module includes an initialization unit, a training unit and an output unit.

The initialization unit initializes the parameters on which the user feature extraction process depends, including: feature space dimension f, dual learning rate η, Lagrangian enhancement factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence threshold.

The input of the training unit is connected to the initialization unit and to the data storage module. From the parameters initialized by the initialization unit and the user behavior statistics in the data storage module, the training unit constructs the user feature data, including the user feature matrix X and the item feature matrix Y. The training unit first constructs the cumulative absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix; it then constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters; finally it iteratively trains P, Q, X, Y, Γ and Κ until L(P, Q, X, Y, Γ, Κ) converges with respect to P, Q, X, Y, Γ and Κ on the known data set C of the user behavior statistics matrix, or until the iteration control variable t equals the iteration upper limit n.

The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit and stores it in the data storage module.

Dividing the execution module into three units makes parameter initialization, data extraction and data storage more accurate and faster when extracting user feature data.
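The module wiring described above can be sketched as three small classes. This is a structural illustration only: the class and attribute names are hypothetical, the storage units are plain dictionaries, and the training logic is injected as a callable rather than implemented.

```python
class DataStorageModule:
    """Holds the two storage units as in-memory dictionaries (assumption)."""

    def __init__(self):
        self.statistics = {}   # statistical data storage unit: (u, i) -> r
        self.features = {}     # acquired feature storage unit: name -> matrix


class DataReceivingModule:
    """Receives server-collected statistics and forwards them to storage."""

    def __init__(self, storage):
        self.storage = storage

    def receive(self, records):
        # records: iterable of (user, item, value) triples from the server
        for u, i, r in records:
            self.storage.statistics[(u, i)] = r


class ExecutionModule:
    """Runs feature extraction and writes X and Y back to storage."""

    def __init__(self, storage, trainer):
        self.storage = storage
        self.trainer = trainer   # callable: known entries -> (X, Y)

    def execute(self):
        X, Y = self.trainer(self.storage.statistics)
        self.storage.features["X"] = X
        self.storage.features["Y"] = Y
        return X, Y
```

Injecting the trainer keeps the storage and receiving code independent of any particular training rule.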

The beneficial effects of the present invention are as follows. By using non-negative alternating direction transformation, the method acts directly on the set of known entries in the user behavior statistics matrix and has the following advantages:

1. It can process extremely sparse user behavior statistics matrices with a large number of missing values;

2. It converges quickly and restores data with high accuracy, and can solve the user feature extraction problem in big data processing environments.

Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from the description or may be learned by practice of the invention.

Description of drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a schematic structural diagram of the present invention;

Fig. 2 is a schematic flow chart of the present invention;

Fig. 3 compares the convergence speed of user feature extraction before and after applying the embodiment of the present invention;

Fig. 4 compares the data restoration accuracy of user feature extraction before and after applying the embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and should not be construed as limiting it.

In the description of the present invention, unless otherwise specified and limited, the terms "installed", "coupled" and "connected" should be understood broadly. For example, a connection may be mechanical or electrical, may be internal communication between two elements, and may be direct or indirect through an intermediary. Those of ordinary skill in the art can understand the specific meaning of these terms according to the specific situation.

As shown in Figure 1, the present invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:

The feature space dimension f is the dimension of the feature space in which the user features lie; it determines the dimension of the feature vectors and may be any positive integer, for example 25.

The dual learning rate η is the learning rate used to train the Lagrange multipliers in the unified loss function; it is a floating-point number in the interval (0.0001, 0.05), for example 0.001.

The Lagrangian enhancement factor λ is the factor that expresses the constraints in penalized form in the unified loss function; it is any decimal in the interval (0.01, 0.1), for example 0.05.

The user feature matrix X holds the features to be extracted. It is an |A|×f matrix, where A denotes the set of users stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is that user's feature vector. In the embodiment of the present invention, each element of X is initialized to a random number in the open interval (0.4, 0.8), for example 0.47.

The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of user features; each is an |A|×f matrix. X_U caches the initial value of the X matrix during training, X_D caches its target value, and X_C caches its updated value. In the embodiment of the present invention, every element of X_U, X_D and X_C is initialized to 0.

The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative training of user features; each is a |B|×f matrix. Y_U caches the initial value of the Y matrix during training, Y_D caches its target value, and Y_C caches its updated value. In the embodiment of the present invention, every element of Y_U, Y_D and Y_C is initialized to 0.

The iteration upper limit n bounds the number of training iterations and may be any positive integer, for example 500.

The convergence threshold is the threshold parameter used to judge whether the iterative training has converged. In the embodiment of the present invention, it is set to any decimal in the open interval (0, 0.001), for example 0.0001.

The formula for calculating the absolute error in this step is:

s.t. P = X, P ≥ 0,

Q = Y, Q ≥ 0.

S4. The extraction device constrains the cumulative absolute error ε(P, Q, X, Y) using the constraints P = X, P ≥ 0, Q = Y and Q ≥ 0.

The formula for calculating the loss function in this step is:

S6. The extraction device judges whether the iteration control variable t has reached the upper limit n.

If the upper limit has been reached, step S9 is executed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained through iterative training and stores them in the acquired feature storage unit of the data storage module, completing the extraction of user features.

If the upper limit has not been reached, step S7 is executed.

In this step, the extraction device first adds 1 to the iteration control variable t and then judges whether t is greater than the iteration upper limit n.

S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C of the user behavior statistics matrix.

If so, step S9 is executed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained through iterative training and stores them in the acquired feature storage unit of the data storage module, completing the extraction of user features.

If not, step S8 is executed.

In this step, the extraction device judges convergence of the unified loss function L on the known data set C as follows: it compares the value of L before the current round of iterative training with the value of L before the previous round, and checks whether the absolute value of their difference is less than the convergence threshold. If it is, the training is judged to have converged; otherwise it is judged not to have converged.
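The convergence test in step S7 only needs the loss value from the previous round. A small stateful helper can express it; `ConvergenceMonitor` is a hypothetical name, and the default threshold 0.0001 is the example value from the embodiment.

```python
class ConvergenceMonitor:
    """Tracks the loss measured before each round and applies the S7 test."""

    def __init__(self, threshold=0.0001):
        self.threshold = threshold
        self.previous = None   # loss value before the previous round

    def update(self, loss_value):
        """Record this round's starting loss; return True once converged."""
        converged = (
            self.previous is not None
            and abs(loss_value - self.previous) < self.threshold
        )
        self.previous = loss_value
        return converged
```

The first call always returns False, since there is no earlier round to compare against.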

S8. The extraction device iteratively trains P, Q, X, Y, Γ and K on the known data in the known data set C of the user behavior statistics matrix, and then executes step S6 again. This loop repeats until step S9 is reached, in which the extraction device outputs the user feature matrix X and the item feature matrix Y obtained through iterative training and stores them in the acquired feature storage unit of the data storage module, completing the extraction of user features.
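The control flow of steps S6 to S9 can be sketched as a small driver loop. This is only an illustrative sketch: the function name `run_extraction` and the callables `loss` and `train_one_round` are hypothetical placeholders for evaluating the unified loss function L and for one round of the training in step S8; they are not named in the specification.

```python
from typing import Callable, Optional


def run_extraction(
    loss: Callable[[], float],            # hypothetical: evaluates L on the known set C
    train_one_round: Callable[[], None],  # hypothetical: one round of step S8
    n: int,                               # iteration upper limit
    threshold: float,                     # convergence threshold
) -> int:
    """Drive steps S6 to S9 and return the number of training rounds executed."""
    t = 0
    previous: Optional[float] = None  # L recorded before the previous round
    rounds = 0
    while True:
        # S6: add 1 to t, then check it against the iteration upper limit n.
        t += 1
        if t > n:
            break  # proceed to S9
        # S7: compare L before this round with L before the previous round.
        current = loss()
        if previous is not None and abs(current - previous) < threshold:
            break  # converged, proceed to S9
        previous = current
        # S8: one round of iterative training, then back to S6.
        train_one_round()
        rounds += 1
    # S9 (outputting and storing X and Y) is left to the caller.
    return rounds
```

With this arrangement the loop performs at most n training rounds, and the first round always runs because no earlier loss value exists to compare against.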

As a preferred solution of this embodiment, step S4 includes the following steps:

where p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P; x_{u,k} denotes the element in row u, column k of the user feature matrix X; q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q; and y_{i,k} denotes the element in row i, column k of the item feature matrix Y.
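Using the element definitions above together with sub-steps S4-1 to S4-4 as recited in claim 3, the constraint step can be sketched as follows. The function name `constrain` and the list-of-lists matrix representation are assumptions made for illustration only.

```python
def constrain(target, source):
    """Apply S4 in place: copy source into target, then clip negatives to 0.

    Call as constrain(P, X) for the user side and constrain(Q, Y) for the
    item side. Matrices are lists of rows (an assumed representation).
    """
    for r, row in enumerate(source):
        for k, value in enumerate(row):
            # S4-1 / S4-2: make the constraint element equal to the feature element.
            if target[r][k] != value:
                target[r][k] = value
            # S4-3 / S4-4: enforce non-negativity of the constraint element.
            if target[r][k] < 0:
                target[r][k] = 0.0
```

For example, with `X = [[0.5, -0.2]]` and `P = [[0.0, 0.0]]`, calling `constrain(P, X)` leaves `P` as `[[0.5, 0.0]]`.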

The iterative training in step S8 includes the following steps:

S8-1. Determine the iterative training target, namely all of the parameters P, Q, X, Y, Γ and K, and solve for them so that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and K on the known data set C in the user behavior statistics matrix, expressed as the formula:

for k = 1~f,

where t and t+1 denote the t-th and (t+1)-th rounds of iteration, respectively.

ρ_u = λ|C(u)|, τ_i = λ|C(i)|;

training update,

where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i, respectively.

After P, Q, X, Y, Γ and K have been iteratively trained in this way, step S6 is executed again, and the loop repeats until the extraction of user features is complete.
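The closed-form update formulas are not reproduced in this passage, so the following is only a hedged sketch of two of the updates in the alternating scheme recited in claim 5: the update of one row of P and the dual update of the matching row of Γ. It is derived here, not taken from the specification, by setting the derivative of the unified loss function of claim 4 with respect to p_{u,k} to zero, which gives p_{u,k} = x_{u,k} + γ_{u,k}/ρ_u, and then projecting onto p_{u,k} ≥ 0. The dual step uses ∂L/∂γ_{u,k} = x_{u,k} − p_{u,k}. The function name `update_user_row` and its arguments are hypothetical, and using ρ_u in place of ρ for user u is an assumption.

```python
def update_user_row(x_row, gamma_row, lam, num_known, eta):
    """Return (p_row, new_gamma_row) for one user u.

    lam       -- Lagrangian augmentation factor (lambda)
    num_known -- |C(u)|, the number of known entries for user u
    eta       -- dual learning rate
    """
    rho_u = lam * num_known  # rho_u = lambda * |C(u)|
    if rho_u <= 0:
        raise ValueError("user needs at least one known entry and lambda > 0")
    # Minimizer of gamma*(x - p) + (rho_u/2)*(x - p)^2 over p >= 0 (derived, assumed).
    p_row = [max(0.0, x + g / rho_u) for x, g in zip(x_row, gamma_row)]
    # Dual ascent step: the gradient of L with respect to gamma is (x - p).
    new_gamma = [g + eta * (x - p) for x, g, p in zip(x_row, gamma_row, p_row)]
    return p_row, new_gamma
```

The item side (Q, K, τ_i = λ|C(i)|) would follow the same pattern with Y in place of X.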

The present invention also proposes an extraction device for the user feature extraction method based on non-negative alternating direction transformation. As shown in FIG. 2, the device includes a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistics data collected by the server and passes the received data to the data storage module for storage. The data storage module is connected to the execution module; the execution module executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.

In this embodiment, the data storage module includes an acquired feature storage unit and a statistical data storage unit. The acquired feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistical data storage unit is connected to the data receiving module and stores the user behavior statistics data passed on by the data receiving module.

As a preferred solution of this embodiment, the execution module includes an initialization unit, a training unit and an output unit.

The initialization unit initializes the parameters on which the user feature extraction process depends. The initialization parameters include: the feature space dimension f, the dual learning rate η, the Lagrangian augmentation factor λ, the user feature matrix X, the user training auxiliary matrices X_U, X_D and X_C, the item feature matrix Y, the item training auxiliary matrices Y_U, Y_D and Y_C, the iteration control variable t, the iteration upper limit n, and the convergence threshold. The user feature matrix X and the user training auxiliary matrices X_U, X_D and X_C are matrices of |A| rows and f columns, built from the current user set A and the current feature space dimension f; each element of X is initialized to a random number in the interval (0.2, 0.6), and each element of X_U, X_D and X_C is initialized to 0. The item feature matrix Y and the item training auxiliary matrices Y_U, Y_D and Y_C are matrices of |B| rows and f columns, built from the current item set B and the current feature space dimension f; each element of Y is initialized to a random number in the interval (0.2, 0.6), and each element of Y_U, Y_D and Y_C is initialized to 0.
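The matrix initialization described above can be sketched as follows. The helper name `init_matrices` and the use of Python's standard `random` module are assumptions for illustration; the specification does not say how the random numbers are generated.

```python
import random


def init_matrices(num_rows, f, rng):
    """Return (feature, aux_u, aux_d, aux_c), each with num_rows rows and f columns.

    The feature matrix is filled with random values drawn between 0.2 and 0.6;
    the three auxiliary matrices are filled with zeros.
    """
    feature = [[rng.uniform(0.2, 0.6) for _ in range(f)] for _ in range(num_rows)]

    def zeros():
        # Build fresh rows each time so the matrices do not share storage.
        return [[0.0] * f for _ in range(num_rows)]

    return feature, zeros(), zeros(), zeros()


# Usage: |A| = 3 users, |B| = 4 items, feature space dimension f = 2.
rng = random.Random(0)
X, X_U, X_D, X_C = init_matrices(3, 2, rng)
Y, Y_U, Y_D, Y_C = init_matrices(4, 2, rng)
```

Starting X and Y at strictly positive values keeps the initial point inside the non-negative region that the constraint matrices P and Q enforce.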

The input of the training unit is connected to the initialization unit and to the data storage module. The training unit constructs the user feature data, comprising the user feature matrix X and the item feature matrix Y, from the parameters initialized by the initialization unit and the user behavior statistics data in the data storage module. Each row vector of X corresponds to the non-negative behavior features of one user; each row vector of Y corresponds to the non-negative operation features of all known users with respect to one item. To construct the user feature data, the training unit first constructs the cumulative absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix. It then uses the augmented Lagrangian multiplier method to construct the unified loss function L(P, Q, X, Y, Γ, K), where Γ and K are dual parameters, and solves for the parameters P, Q, X, Y, Γ and K that minimize the loss function on the known data set C in the user behavior statistics matrix. Finally, it uses the non-negative alternating direction transformation to iteratively train P, Q, X, Y, Γ and K until the unified loss function L(P, Q, X, Y, Γ, K) converges with respect to P, Q, X, Y, Γ and K on the known data set C, or until the iteration control variable t equals the iteration upper limit n;

The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit, comprising the user feature matrix X and the item feature matrix Y, and stores it in the data storage module.
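The quantities that the training unit works with can be illustrated by a direct evaluation of the cumulative error and of the unified loss function as recited in claims 2 and 4. This is a sketch under assumptions: the known data set C is represented as a dictionary mapping (u, i) to r_{u,i}, matrices are lists of rows, ρ and τ are passed as scalars, and all function names are hypothetical.

```python
def predict(X, Y, u, i):
    """Inner product of user row u and item row i."""
    return sum(x * y for x, y in zip(X[u], Y[i]))


def cumulative_error(C, X, Y):
    """Sum of squared errors over the known entries; C maps (u, i) to r_ui."""
    return sum((r - predict(X, Y, u, i)) ** 2 for (u, i), r in C.items())


def coupling(dual, primal, constraint, weight):
    """Multiplier term plus quadratic penalty for one pair such as (X, P)."""
    total = 0.0
    for d_row, a_row, b_row in zip(dual, primal, constraint):
        for d, a, b in zip(d_row, a_row, b_row):
            total += d * (a - b) + 0.5 * weight * (a - b) ** 2
    return total


def unified_loss(C, P, Q, X, Y, Gamma, K, rho, tau):
    """Half the cumulative error plus the coupling terms for (X, P) and (Y, Q)."""
    return (0.5 * cumulative_error(C, X, Y)
            + coupling(Gamma, X, P, rho)
            + coupling(K, Y, Q, tau))
```

Only the entries present in C contribute to the error term, which is why the computation scales with the number of known values rather than with the full size of the user behavior statistics matrix.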

In a specific implementation, the example analysis uses the number of training iterations as the measure of convergence speed for user feature extraction: the fewer the training iterations, the faster the extraction converges. The mean absolute error (MAE) is used as the measure of data restoration accuracy: the lower the MAE, the higher the data restoration accuracy of the user feature extraction.
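The MAE metric mentioned above can be computed as in the following sketch. The specification does not state which entries the MAE is evaluated on; evaluating it on a set of known entries, given as a dictionary mapping (u, i) to r_{u,i}, is an assumption, as is the function name `mean_absolute_error`.

```python
def mean_absolute_error(entries, X, Y):
    """Average of |r_ui - sum_k x_uk * y_ik| over the given known entries."""
    if not entries:
        raise ValueError("at least one known entry is required")
    errors = [
        abs(r - sum(x * y for x, y in zip(X[u], Y[i])))
        for (u, i), r in entries.items()
    ]
    return sum(errors) / len(errors)
```

For instance, with one user row `[1.0, 0.0]`, item rows `[0.5, 0.0]` and `[1.0, 0.0]`, and known values 1.0 and 2.0, the absolute errors are 0.5 and 1.0 and the MAE is 0.75.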

FIG. 3 compares the convergence speed of user feature extraction before and after applying this embodiment. After the embodiment of the present invention is applied, the number of iterations needed to extract user features under the non-negativity constraint drops markedly, and the convergence speed improves accordingly.

FIG. 4 compares the data restoration accuracy of user feature extraction before and after applying this embodiment. After the embodiment of the present invention is applied, the mean absolute error (MAE) of user feature extraction under the non-negativity constraint drops markedly, and the data restoration accuracy improves accordingly.

In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the present invention. The scope of the invention is defined by the claims and their equivalents.

Claims

1. A user feature extraction method based on non-negative alternating direction transformation, characterized by comprising the following steps:

S1. The server sends an instruction to the extraction device to extract user features;

S2. The extraction device receives the instruction and sets the initialization parameters. The initialization parameters include: the feature space dimension f, the dual learning rate η, the Lagrangian augmentation factor λ, the user feature matrix X, the user training auxiliary matrices X_U, X_D and X_C, the item feature matrix Y, the item training auxiliary matrices Y_U, Y_D and Y_C, the iteration control variable t, the iteration upper limit n, and the convergence threshold;

S3. The extraction device constructs a cumulative absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix, and Q is the item feature constraint matrix;

S4. The extraction device uses constraints to constrain the cumulative absolute error ε(P, Q, X, Y) to ensure the non-negativity of the parameters of the matrices P and Q during the training process;

S5. The extraction device constructs a unified loss function L(P, Q, X, Y, Γ, K), where Γ and K are dual parameters;

S6. The extraction device judges whether the iterative training control variable t has reached the upper limit n, if so, execute step S9, if not, execute step S7;

S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and K on the known data set C in the user behavior statistics matrix; if so, execute step S9; if not, execute step S8;

S8. The extraction device performs iterative training on P, Q, X, Y, Γ and K on the known data in the known data set C in the user behavior statistics matrix, and then executes step S6;

S9. The extraction device outputs the user feature matrix X and item feature matrix Y obtained through iterative training, and stores them in the acquired feature storage unit in the data module.

2. The user feature extraction method based on non-negative alternating direction transformation according to claim 1, characterized in that the cumulative absolute error described in step S3 is calculated as:

$$
\varepsilon(P,Q,X,Y)=\sum_{(u,i)\in C}\Bigl(r_{u,i}-\sum_{k=1}^{f}x_{u,k}\,y_{i,k}\Bigr)^{2},
$$

$$
\text{s.t.}\quad P=X,\; P\ge 0,\qquad Q=Y,\; Q\ge 0.
$$

where C denotes the known data set in the user behavior statistics matrix; r_{u,i} denotes the element in row u, column i of the user behavior statistics matrix, that is, the historical behavior statistics of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix and Q is the item feature constraint matrix.

3. The user feature extraction method based on non-negative alternating direction transformation according to claim 1, characterized in that step S4 comprises the following steps:

S4-1. For each element p_{u,k} in P, if it is not equal to the corresponding element x_{u,k} in X, set p_{u,k} = x_{u,k};

S4-2. For each element q_{i,k} in Q, if it is not equal to the corresponding element y_{i,k} in Y, set q_{i,k} = y_{i,k};

S4-3. For each element p_{u,k} in P, if it is less than 0, set p_{u,k} = 0;

S4-4. For each element q_{i,k} in Q, if it is less than 0, set q_{i,k} = 0.

4. The user feature extraction method based on non-negative alternating direction transformation according to claim 1, wherein the unified loss function calculation formula described in step S5 is:

$$
\begin{aligned}
L(P,Q,X,Y,\Gamma,K)={}&\frac{1}{2}\sum_{(u,i)\in C}\Bigl(r_{u,i}-\sum_{k=1}^{f}x_{u,k}\,y_{i,k}\Bigr)^{2}+\sum_{(u,k)}\gamma_{u,k}\,(x_{u,k}-p_{u,k})\\
&+\sum_{(i,k)}\kappa_{i,k}\,(y_{i,k}-q_{i,k})+\frac{\rho}{2}\sum_{(u,k)}(x_{u,k}-p_{u,k})^{2}+\frac{\tau}{2}\sum_{(i,k)}(y_{i,k}-q_{i,k})^{2},
\end{aligned}
$$

where Γ and K are dual parameters; r_{u,i} denotes the element in row u, column i of the user behavior statistics matrix, that is, the historical behavior statistics of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; γ_{u,k} denotes the element in row u, column k of Γ; κ_{i,k} denotes the element in row i, column k of K; p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P; and q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q. The formula uses the augmented Lagrangian multiplier method, which adds corresponding penalty terms to the Lagrange multiplier method, namely the two quadratic sums weighted by ρ/2 and τ/2 in the formula; ρ and τ are the penalty parameters of the augmented Lagrangian multiplier method.

5. The user feature extraction method based on non-negative alternating direction transformation according to claim 1, characterized in that the iterative training in step S8 comprises the following steps:

S8-1. Determine the iterative training target, namely all of the parameters P, Q, X, Y, Γ and K, such that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and K on the known data set C in the user behavior statistics matrix, expressed as the formula:

where r_{u,i} denotes the element in row u, column i of the user behavior statistics matrix, that is, the historical behavior statistics of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; γ_{u,k} denotes the element in row u, column k of Γ; κ_{i,k} denotes the element in row i, column k of K; p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P; and q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q. The formula uses the augmented Lagrangian multiplier method, which adds corresponding penalty terms to the Lagrange multiplier method; ρ and τ are the penalty parameters of the augmented Lagrangian multiplier method;

S8-2. Use the non-negative alternating direction transformation to train the individual elements of P, Q, X, Y, Γ and K in sequence, the training rule being expressed as the formulas:

for k = 1~f,

$$
\begin{cases}
X_{\cdot,k}^{t+1}=\underset{X_{\cdot,k}}{\arg\min}\;L\bigl(P^{t},Q^{t},X_{\cdot,1\sim k-1}^{t+1},X_{\cdot,k},X_{\cdot,k+1\sim f}^{t},Y_{\cdot,1\sim k-1}^{t+1},Y_{\cdot,k\sim f}^{t},\Gamma^{t},K^{t}\bigr)\\[4pt]
Y_{\cdot,k}^{t+1}=\underset{Y_{\cdot,k}}{\arg\min}\;L\bigl(P^{t},Q^{t},X_{\cdot,1\sim k-1}^{t+1},X_{\cdot,k\sim f}^{t},Y_{\cdot,1\sim k-1}^{t+1},Y_{\cdot,k},Y_{\cdot,k+1\sim f}^{t},\Gamma^{t},K^{t}\bigr)
\end{cases}
$$

$$
\begin{cases}
P^{t+1}=\underset{P}{\arg\min}\;L\bigl(P,Q^{t},X^{t+1},Y^{t+1},\Gamma^{t},K^{t}\bigr)\\[4pt]
Q^{t+1}=\underset{Q}{\arg\min}\;L\bigl(P^{t},Q,X^{t+1},Y^{t+1},\Gamma^{t},K^{t}\bigr)\\[4pt]
\Gamma^{t+1}=\Gamma^{t}+\eta\,\nabla_{\Gamma}L\bigl(P^{t+1},Q^{t+1},X^{t+1},Y^{t+1},\Gamma,K^{t}\bigr)\\[4pt]
K^{t+1}=K^{t}+\eta\,\nabla_{K}L\bigl(P^{t+1},Q^{t+1},X^{t+1},Y^{t+1},\Gamma^{t},K\bigr)
\end{cases};
$$

S8-3. For each element in P, Q, X, Y, Γ and K, perform the training update according to the following formulas:

ρ_u = λ|C(u)|, τ_i = λ|C(i)|;

where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i, respectively.

6. An extraction device for the user feature extraction method based on non-negative alternating direction transformation according to any one of claims 1-5, characterized in that it includes a data receiving module, a data storage module and an execution module, wherein,

The data receiving module is connected to the data storage module, the data receiving module is used to receive the user behavior statistics data collected by the server, and transfer the received user behavior statistics data collected by the server to the data storage module for storage,

The data storage module is connected with the execution module, and the execution module executes the user feature extraction instruction sent by the server, and stores the extracted user feature data into the data storage module.

7. The extraction device for the user feature extraction method based on non-negative alternating direction transformation according to claim 6, wherein the data storage module includes an acquired feature storage unit and a statistical data storage unit,

The acquired feature storage unit is connected to the execution module, and is used to store the user feature data extracted by the execution module;

The statistical data storage unit is connected with the data receiving module, and is used for storing user behavior statistical data delivered by the data receiving module.

8. The extraction device for the user feature extraction method based on non-negative alternating direction transformation according to claim 6, wherein the execution module includes an initialization unit, a training unit and an output unit,

The initialization unit initializes the parameters on which the user feature extraction process depends, and the initialization parameters include: the feature space dimension f, the dual learning rate η, the Lagrangian augmentation factor λ, the user feature matrix X, the user training auxiliary matrices X_U, X_D and X_C, the item feature matrix Y, the item training auxiliary matrices Y_U, Y_D and Y_C, the iteration control variable t, the iteration upper limit n, and the convergence threshold;

The input end of the training unit is connected to the initialization unit and to the data storage module respectively. The training unit constructs the user feature data, including the user feature matrix X and the item feature matrix Y, according to the parameters initialized by the initialization unit and the user behavior statistics data in the data storage module. The training unit first constructs the cumulative absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix, then constructs the unified loss function L(P, Q, X, Y, Γ, K), where Γ and K are dual parameters, and then iteratively trains P, Q, X, Y, Γ and K until the unified loss function L(P, Q, X, Y, Γ, K) converges with respect to P, Q, X, Y, Γ and K on the known data set C in the user behavior statistics matrix, or the iteration control variable t is equal to the iteration upper limit n;

The input end of the output unit is connected to the output end of the training unit, the output end of the output unit is connected to the data storage module, and the output unit outputs and stores the user characteristic data constructed by the training unit into the data storage module.