A user feature extraction method and extraction device based on non-negative alternating direction transformation
Technical field
The present invention relates to the technical field of big data processing by computer, and in particular to a user feature extraction method and extraction device based on non-negative alternating direction transformation for e-commerce systems.
Background technology
Modern e-commerce systems have very large numbers of users and volumes of information. In such systems, users' objective behaviors, such as clicking, browsing, commenting and searching, accumulate over the system's running time into a huge historical user behavior data set. The data volume is at least on the order of terabytes, which makes this a typical big data environment.
A typical data structure in e-commerce systems is the user behavior statistical matrix, in which each row corresponds to a user and each column corresponds to an item. An item is any objective object in the system that users may operate on, such as a news article, a picture or a commodity. Each matrix element corresponds to the historical behavior of a single user on a single item; its value is quantified from that user's objective historical behavior data on that item using statistical methods that conform to natural laws. Because the numbers of users and items in an e-commerce system are very large, the corresponding user behavior statistical matrix is also very large. At the same time, no user operates on all items, and no item is operated on by all users. In general, the known entries in the user behavior statistical matrix are far fewer than the unknown entries, so the matrix is extremely sparse.
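Because most entries are unknown, a matrix of this kind is usually held as a collection of known entries rather than as a dense array. The following is a minimal sketch in Python (standard library only), assuming a hypothetical list of (user, item, value) triples; the names `known_entries`, `index_known_entries` and `density` are illustrative and do not come from the original text.

```python
from collections import defaultdict

# Hypothetical known entries (user, item, statistic); every other cell is missing.
known_entries = [
    (0, 0, 3.0),
    (0, 2, 1.0),
    (1, 1, 4.0),
    (2, 2, 5.0),
]

def index_known_entries(entries):
    """Group known entries by user and by item; unknown cells are never stored."""
    by_user = defaultdict(dict)
    by_item = defaultdict(dict)
    for u, i, r in entries:
        by_user[u][i] = r
        by_item[i][u] = r
    return by_user, by_item

def density(entries, num_users, num_items):
    """Fraction of matrix cells that are known."""
    return len(entries) / (num_users * num_items)

by_user, by_item = index_known_entries(known_entries)
print(density(known_entries, 3, 3))
```

Storing only the known set keeps memory proportional to the number of known entries, which matters when the full matrix would be far too large to materialize.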
During system operation, extracting user features from the known data in the user behavior statistical matrix makes it possible to analyze user behavior effectively and to mine patterns such as user categories and behavior patterns. Keeping the user features non-negative during extraction is a key requirement, because non-negative user features better match the natural law that user behavior data in e-commerce systems are positive, and can therefore characterize user behavior better. Existing non-negative feature extraction techniques are used in the field of computer vision. Their basic approach is to treat a given figure or image as a complete matrix, factorize it under non-negativity constraints, and thereby extract the local object features of the figure or image. However, user feature extraction in e-commerce systems differs greatly from non-negative object feature extraction in computer vision. The matrices derived from figures and images in computer vision are complete and have no missing values, so their non-negative matrix factorization can be handled with conventional iterative matrix operations. The user behavior statistical matrix handled in e-commerce systems, by contrast, is normally extremely sparse and contains a large number of missing values; conventional matrix factorization cannot be applied, and a non-negative latent feature analysis that can operate on sparse matrices is required. Existing non-negative latent feature analysis methods, however, suffer from slow convergence and low data recovery accuracy.
Therefore, a key problem in analyzing the mass data produced by modern e-commerce systems is how to perform non-negative latent feature analysis, with fast convergence and high data recovery accuracy, on a user behavior statistical matrix that contains a large number of missing values, so as to obtain user features that describe the natural laws of user behavior well.
Summary of the invention
To overcome the above defects of the prior art, the object of the present invention is to provide a user feature extraction method and extraction device based on non-negative alternating direction transformation. The invention acts directly on the known data set in the user behavior statistical matrix, can process an extremely sparse user behavior statistical matrix with a large number of missing values, converges quickly, achieves high data recovery accuracy, and can solve the user feature extraction problem in a big data processing environment.
To achieve the above object, the invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:
S1. The server sends an instruction for user feature extraction to the extraction device;
S2. The extraction device receives the instruction and initializes the parameters, which comprise: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold;
S3. The extraction device constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix;
S4. The extraction device applies constraint conditions to the accumulated absolute error ε(P, Q, X, Y) to ensure that the parameters of matrices P and Q remain non-negative during training;
S5. The extraction device constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are the dual parameters;
S6. The extraction device judges whether the iteration control variable t has reached the upper limit n; if so, step S9 is performed, otherwise step S7 is performed;
S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix; if so, step S9 is performed, otherwise step S8 is performed;
S8. The extraction device iteratively trains P, Q, X, Y, Γ and Κ on the known data in the known data set C of the user behavior statistical matrix, and then performs step S6;
S9. The extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted feature storage unit of the data storage module.
In this method, the feature space dimension f in step S2 is the dimension of the feature space in which the user features lie; it determines the dimension of each feature vector and may be any positive integer.
The dual learning rate η is the learning rate used to train the Lagrange multipliers in the unified loss function; it is a floating-point number in the interval (0.0001, 0.05).
The Lagrangian augmentation factor λ is the factor that controls the augmentation of the constraint conditions in the unified loss function; it is any decimal in the interval (0.01, 0.1).
The user feature matrix X holds the features to be extracted and is an |A| × f matrix, where A denotes the user set stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is the feature vector of one user. In the embodiment of the invention, each element of X is initialized to a random number in the open interval (0.4, 0.8).
The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of the user features; each is an |A| × f matrix.
The item feature matrix Y holds the features to be extracted and is a |B| × f matrix, where B denotes the item set stored in the storage unit of the device. Each row of Y corresponds to one item, and each row vector of Y is the feature vector describing how all users operate on that item.
The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative feature training; each is a |B| × f matrix.
The iteration control variable t controls the feature training process and is initialized to 0.
The iteration upper limit n bounds the number of training iterations and may be any positive integer.
The convergence decision threshold is the threshold parameter used to judge whether the iterative training has converged.
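The initialization described above can be sketched as follows. This is an illustrative Python sketch using only the standard library, not the device's actual implementation: the function names, the dictionary layout and the `seed` argument are hypothetical, the default values for f, η, λ and n are the example values given in the embodiment, and initializing Y in the same interval as X is an assumption.

```python
import random

def open_uniform(rng, low, high):
    """Draw from the open interval (low, high), resampling a rare endpoint hit."""
    while True:
        value = rng.uniform(low, high)
        if low < value < high:
            return value

def init_parameters(num_users, num_items, f=25, low=0.4, high=0.8, seed=None):
    """Build X, Y with random positive entries and zero-valued auxiliary matrices."""
    rng = random.Random(seed)

    def random_matrix(rows):
        return [[open_uniform(rng, low, high) for _ in range(f)] for _ in range(rows)]

    def zero_matrix(rows):
        return [[0.0] * f for _ in range(rows)]

    params = {
        "f": f, "eta": 0.001, "lam": 0.05, "n": 500, "t": 0,
        "X": random_matrix(num_users),   # |A| x f user feature matrix
        "Y": random_matrix(num_items),   # |B| x f item feature matrix (interval assumed)
    }
    for name in ("X_U", "X_D", "X_C"):
        params[name] = zero_matrix(num_users)
    for name in ("Y_U", "Y_D", "Y_C"):
        params[name] = zero_matrix(num_items)
    return params
```

Starting from strictly positive entries keeps the initial features inside the non-negative region that the later constraints enforce.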
This method acts directly on the known data set in the user behavior statistical matrix. It can process an extremely sparse user behavior statistical matrix with a large number of missing values, converges quickly, achieves high data recovery accuracy, and can handle the user feature extraction problem in a big data processing environment.
Preferably, the accumulated absolute error ε(P, Q, X, Y) described in step S3 is computed subject to the constraints:
$$\text{s.t.}\quad P = X,\ P \ge 0,\quad Q = Y,\ Q \ge 0,$$
where C denotes the known data set in the user behavior statistical matrix; $r_{u,i}$ denotes the element in row u, column i of the user behavior statistical matrix, representing the historical behavior statistic of user u on item i; $x_{u,k}$ denotes the element in row u, column k of the user feature matrix X; $y_{i,k}$ denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix, and Q is the item feature constraint matrix.
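The error term can be written in several ways. One form consistent with the symbols defined above, assuming a squared-error measure summed over the known entries (the square and the factor 1/2 are assumptions and are not taken from the original text), is:

```latex
\varepsilon(P,Q,X,Y)
  = \frac{1}{2}\sum_{(u,i)\in C}\Bigl(r_{u,i}-\sum_{k=1}^{f} x_{u,k}\,y_{i,k}\Bigr)^{2}
\quad\text{s.t.}\quad P = X,\; P \ge 0,\; Q = Y,\; Q \ge 0 .
```

In this form the sum runs only over the known set C, so missing entries never contribute to the error, and the non-negativity requirement is carried by P and Q rather than directly by X and Y.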
The accumulated absolute error ε(P, Q, X, Y) constructed in step S3 fully expresses both the error and the non-negativity constraints, and at the same time provides the conditions for introducing the dual parameters.
Step S4 applies the constraint conditions to the accumulated absolute error ε(P, Q, X, Y), ensuring that the relevant model parameters remain non-negative during training.
The unified loss function constructed in step S5 uses the method of Lagrange multipliers to combine the loss function with the related constraint conditions, so that the constraints are satisfied during training.
Preferably, step S4 comprises the following steps:
S4-1. For each element $p_{u,k}$ in P, if it is not equal to the corresponding element $x_{u,k}$ in X, set $p_{u,k} = x_{u,k}$;
S4-2. For each element $q_{i,k}$ in Q, if it is not equal to the corresponding element $y_{i,k}$ in Y, set $q_{i,k} = y_{i,k}$;
S4-3. For each element $p_{u,k}$ in P, if it is less than 0, set $p_{u,k} = 0$;
S4-4. For each element $q_{i,k}$ in Q, if it is less than 0, set $q_{i,k} = 0$.
Here, $p_{u,k}$ denotes the element in row u, column k of the user feature constraint matrix P, and $q_{i,k}$ denotes the element in row i, column k of the item feature constraint matrix Q.
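Steps S4-1 to S4-4 can be sketched as a single helper applied once to the pair (P, X) and once to the pair (Q, Y). This is an illustrative Python sketch; the function name `enforce_constraints` and the list-of-lists matrix representation are assumptions.

```python
def enforce_constraints(constraint, feature):
    """Apply S4-1/S4-3 (or S4-2/S4-4) in place to one constraint matrix."""
    for row_index, feature_row in enumerate(feature):
        for k, value in enumerate(feature_row):
            # S4-1 / S4-2: make the constraint element equal to the feature element
            if constraint[row_index][k] != value:
                constraint[row_index][k] = value
            # S4-3 / S4-4: clamp any negative element to zero
            if constraint[row_index][k] < 0:
                constraint[row_index][k] = 0.0
    return constraint

# Usage: P tracks X and Q tracks Y, with negatives removed.
P = enforce_constraints([[9.0, 9.0]], [[-1.0, 2.0]])
Q = enforce_constraints([[0.0]], [[0.5]])
```

Because P and Q are handled independently, applying the equality step and the clamping step together per matrix gives the same result as running S4-1 to S4-4 in order.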
Preferably, the unified loss function described in step S5 is computed by a formula in which Γ and Κ are the dual parameters, $\gamma_{u,k}$ denotes the element in row u, column k of Γ, and $\kappa_{i,k}$ denotes the element in row i, column k of Κ. The formula adopts the augmented Lagrangian method, which adds an augmentation term for each constraint on top of the ordinary method of Lagrange multipliers. In the augmentation term, ρ is the augmentation parameter of the augmented Lagrangian method; it corresponds to matrix X and is treated as a constant during computation.
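For orientation, a generic augmented Lagrangian for the constraints P = X and Q = Y can be written as below. This is a hedged sketch only: the sign conventions, the 1/2 factors, the per-row parameters $\rho_u$ and $\tau_i$ (the latter being the analogous parameter for matrix Y), and the exact form of ε are assumptions and are not taken from the original text.

```latex
L(P,Q,X,Y,\Gamma,\mathrm{K})
  = \varepsilon(P,Q,X,Y)
  + \sum_{u,k}\gamma_{u,k}\,(p_{u,k}-x_{u,k})
  + \sum_{i,k}\kappa_{i,k}\,(q_{i,k}-y_{i,k})
  + \sum_{u}\frac{\rho_u}{2}\sum_{k}(p_{u,k}-x_{u,k})^{2}
  + \sum_{i}\frac{\tau_i}{2}\sum_{k}(q_{i,k}-y_{i,k})^{2}
```

The linear terms are the ordinary Lagrange multiplier terms, and the quadratic terms are the augmentation terms that penalize violation of the equality constraints.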
Preferably, the iterative training in step S8 comprises the following steps:
S8-1. Determine the iterative training objective: solve for all parameters P, Q, X, Y, Γ and Κ such that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix. The objective is expressed as a formula in which τ is the augmentation parameter of the augmented Lagrangian method appearing in the augmentation term; it corresponds to matrix Y and is treated as a constant during computation.
S8-2. Using the non-negative alternating direction transformation, train the individual elements of P, Q, X, Y, Γ and Κ in sequence; the training rule is expressed as a formula applied for k = 1, ..., f.
S8-3. Train and update each element of P, Q, X, Y, Γ and Κ according to the update formula, with
$$\rho_u = \lambda\,|C(u)|,\qquad \tau_i = \lambda\,|C(i)|,$$
where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i, respectively. In $\tau_i$, τ is the augmentation parameter of the augmented Lagrangian method; it corresponds to matrix Y, and i corresponds to the i-th row of Y.
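The per-user and per-item augmentation parameters follow directly from the sizes of C(u) and C(i). A minimal Python sketch, assuming the known set C is held as a hypothetical list of (user, item, value) triples:

```python
from collections import Counter

def augmentation_parameters(known_entries, lam):
    """Return rho[u] = lam * |C(u)| and tau[i] = lam * |C(i)|."""
    user_counts = Counter(u for u, _, _ in known_entries)   # |C(u)| per user
    item_counts = Counter(i for _, i, _ in known_entries)   # |C(i)| per item
    rho = {u: lam * count for u, count in user_counts.items()}
    tau = {i: lam * count for i, count in item_counts.items()}
    return rho, tau

entries = [(0, 0, 3.0), (0, 2, 1.0), (1, 1, 4.0)]
rho, tau = augmentation_parameters(entries, lam=0.05)
```

Scaling by the number of known entries means that users and items with more observations receive a proportionally larger augmentation parameter.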
The invention also provides an extraction device for the user feature extraction method based on non-negative alternating direction transformation, comprising a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistics collected by the server and passes them to the data storage module for storage. The data storage module is connected to the execution module; the execution module executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.
The data receiving module receives the user behavior statistics from the server, the execution module executes the user feature extraction instruction, and the data storage module stores both the user behavior statistics received by the data receiving module and the user feature data extracted by the execution module. The device can act directly on the known data set in the user behavior statistical matrix, can process an extremely sparse user behavior statistical matrix with a large number of missing values, and can solve the user feature extraction problem in a big data processing environment.
Further, the data storage module comprises an extracted feature storage unit and a statistics storage unit. The extracted feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistics storage unit is connected to the data receiving module and stores the user behavior statistics passed on by the data receiving module.
Storing the user behavior statistics received by the data receiving module and the user feature data extracted by the execution module in separate units makes data retrieval convenient, accurate and fast.
Further, the execution module comprises an initialization unit, a training unit and an output unit.
The initialization unit initializes the parameters on which the user feature extraction process depends, which comprise: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold.
The input of the training unit is connected to the initialization unit and to the data storage module. The training unit constructs the user feature data, comprising the user feature matrix X and the item feature matrix Y, from the parameters initialized by the initialization unit and the user behavior statistics in the data storage module. The training unit first constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix; it then constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are the dual parameters; it then iteratively trains P, Q, X, Y, Γ and Κ until the unified loss function L(P, Q, X, Y, Γ, Κ) converges with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix, or until the iteration control variable t equals the iteration upper limit n.
The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit and stores them in the data storage module.
Dividing the execution module into three units makes parameter initialization, data extraction and storage more accurate and faster when user feature data are extracted.
The beneficial effects of the invention are as follows. Through the non-negative alternating direction transformation, the method acts directly on the known data set in the user behavior statistical matrix and has the following advantages:
1. It can process an extremely sparse user behavior statistical matrix with a large number of missing values;
2. It converges quickly, achieves high data recovery accuracy, and can solve the user feature extraction problem in a big data processing environment.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a structural diagram of the invention;
Fig. 2 is a flow diagram of the invention;
Fig. 3 compares the convergence speed of user feature extraction before and after applying the embodiment of the invention;
Fig. 4 compares the data recovery accuracy of user feature extraction before and after applying the embodiment of the invention.
Embodiment
Embodiments of the invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
In the description of the invention, unless otherwise specified and limited, the terms "mounted", "connected" and "coupled" should be understood broadly: a connection may be mechanical or electrical, may be internal between two elements, and may be direct or indirect through an intermediary. Those of ordinary skill in the art can understand the specific meaning of these terms according to the circumstances.
As shown in Fig. 1, the invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:
S1. The server sends an instruction for user feature extraction to the extraction device;
S2. The extraction device receives the instruction and initializes the parameters, which comprise: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold.
The feature space dimension f is the dimension of the feature space in which the user features lie; it determines the dimension of each feature vector and may be any positive integer, for example 25.
The dual learning rate η is the learning rate used to train the Lagrange multipliers in the unified loss function; it is a floating-point number in the interval (0.0001, 0.05), for example 0.001.
The Lagrangian augmentation factor λ is the factor that controls the augmentation of the constraint conditions in the unified loss function; it is any decimal in the interval (0.01, 0.1), for example 0.05.
The user feature matrix X holds the features to be extracted and is an |A| × f matrix, where A denotes the user set stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is the feature vector of one user. In the embodiment of the invention, each element of X is initialized to a random number in the open interval (0.4, 0.8), for example 0.47.
The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of the user features; each is an |A| × f matrix. X_U caches the initial values of X during training, X_D caches the expected values of X during training, and X_C caches the updated values of X during training. In the embodiment of the invention, every element of X_U, X_D and X_C is initialized to 0.
The item feature matrix Y holds the features to be extracted and is a |B| × f matrix, where B denotes the item set stored in the storage unit of the device. Each row of Y corresponds to one item, and each row vector of Y is the feature vector describing how all users operate on that item.
The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative feature training; each is a |B| × f matrix. Y_U caches the initial values of Y during training, Y_D caches the expected values of Y during training, and Y_C caches the updated values of Y during training. In the embodiment of the invention, every element of Y_U, Y_D and Y_C is initialized to 0.
The iteration control variable t controls the feature training process and is initialized to 0.
The iteration upper limit n bounds the number of training iterations and may be any positive integer, for example 500.
The convergence decision threshold is the threshold parameter used to judge whether the iterative training has converged. In the embodiment of the invention, it is set to any decimal in the open interval (0, 0.001), for example 0.0001.
S3. The extraction device constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix.
The accumulated absolute error in this step is computed subject to the constraints:
$$\text{s.t.}\quad P = X,\ P \ge 0,\quad Q = Y,\ Q \ge 0,$$
where C denotes the known data set in the user behavior statistical matrix; $r_{u,i}$ denotes the element in row u, column i of the user behavior statistical matrix, representing the historical behavior statistic of user u on item i; $x_{u,k}$ denotes the element in row u, column k of the user feature matrix X; $y_{i,k}$ denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix, and Q is the item feature constraint matrix.
S4. The extraction device applies the constraint conditions P = X, P ≥ 0, Q = Y, Q ≥ 0 to the accumulated absolute error ε(P, Q, X, Y).
S5. The extraction device constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are the dual parameters.
The loss function in this step is computed by the same formula as described for step S5 above.
S6. The extraction device judges whether the iteration control variable t has reached the upper limit n.
If the upper limit has been reached, step S9 is performed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted feature storage unit of the data storage module, completing the extraction of user features.
If the upper limit has not been reached, step S7 is performed.
In this step, the extraction device first increments the iteration control variable t by 1 and then judges whether t is greater than the iteration upper limit n.
S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix.
If so, step S9 is performed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted feature storage unit of the data storage module, completing the extraction of user features.
If not, step S8 is performed.
In this step, the extraction device judges convergence as follows: it compares the value of the unified loss function L before the current round of iterative training starts with the value of L before the previous round started. If the absolute value of the difference is less than the convergence decision threshold, training is judged to have converged; otherwise it is judged not to have converged.
S8. The extraction device iteratively trains P, Q, X, Y, Γ and Κ on the known data in the known data set C of the user behavior statistical matrix and then performs step S6. This cycle repeats until step S9 is completed, in which the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted feature storage unit of the data storage module, completing the extraction of user features.
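The control flow of steps S6 to S9 can be sketched as the loop below. This is an illustrative Python sketch, not the device's implementation: `train_one_round` and `loss` are hypothetical callables standing in for the step S8 update and the evaluation of the unified loss function L, and the function returns only the number of completed training rounds.

```python
def run_extraction(train_one_round, loss, n, threshold):
    """Run the S6-S8 cycle and return how many training rounds were performed."""
    t = 0
    previous = None          # loss before the previous round started
    rounds = 0
    while True:
        t += 1               # S6: increment t first ...
        if t > n:            # ... then compare it with the iteration upper limit
            break
        current = loss()     # S7: loss before the current round starts
        if previous is not None and abs(current - previous) < threshold:
            break            # converged
        previous = current
        train_one_round()    # S8: one round of iterative training
        rounds += 1
    return rounds            # S9 (output and storage of X and Y) would follow here

# Toy usage: each round halves a scalar stand-in for the loss.
state = {"value": 1.0}
halve = lambda: state.update(value=state["value"] / 2)
rounds_done = run_extraction(halve, lambda: state["value"], n=100, threshold=0.001)
```

Comparing the loss at the start of consecutive rounds means that at least one round is always trained before convergence can be declared.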
As a preferred version of the present embodiment, step S4 comprises the following steps:
S4-1. For each element $p_{u,k}$ in P, if it is not equal to the corresponding element $x_{u,k}$ in X, set $p_{u,k} = x_{u,k}$;
S4-2. For each element $q_{i,k}$ in Q, if it is not equal to the corresponding element $y_{i,k}$ in Y, set $q_{i,k} = y_{i,k}$;
S4-3. For each element $p_{u,k}$ in P, if it is less than 0, set $p_{u,k} = 0$;
S4-4. For each element $q_{i,k}$ in Q, if it is less than 0, set $q_{i,k} = 0$.
Here, $p_{u,k}$ denotes the element in row u, column k of the user feature constraint matrix P; $x_{u,k}$ denotes the element in row u, column k of the user feature matrix X; $q_{i,k}$ denotes the element in row i, column k of the item feature constraint matrix Q; $y_{i,k}$ denotes the element in row i, column k of the item feature matrix Y.
The iterative training in step S8 comprises the following steps:
S8-1. Determine the iterative training objective: solve for all parameters P, Q, X, Y, Γ and Κ such that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix; the objective is expressed as a formula.
S8-2. Using the non-negative alternating direction transformation, train the individual elements of P, Q, X, Y, Γ and Κ in sequence; the training rule is expressed as a formula applied for k = 1, ..., f, in which t and t+1 denote the t-th and (t+1)-th rounds of iteration, respectively.
S8-3. Train and update each element of P, Q, X, Y, Γ and Κ according to the update formula, with
$$\rho_u = \lambda\,|C(u)|,\qquad \tau_i = \lambda\,|C(i)|,$$
where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i, respectively.
After P, Q, X, Y, Γ and Κ have been iteratively trained in this way, step S6 is performed again, and the cycle repeats until the extraction of user features is complete.
The invention also provides an extraction device for the user feature extraction method based on non-negative alternating direction transformation. As shown in Fig. 2, it comprises a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistics collected by the server and passes them to the data storage module for storage. The data storage module is connected to the execution module; the execution module executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.
The data receiving module receives the user behavior statistics from the server, the execution module executes the user feature extraction instruction, and the data storage module stores both the user behavior statistics received by the data receiving module and the user feature data extracted by the execution module. The device can act directly on the known data set in the user behavior statistical matrix, can process an extremely sparse user behavior statistical matrix with a large number of missing values, and can solve the user feature extraction problem in a big data processing environment.
In the present embodiment, the data storage module comprises an extracted feature storage unit and a statistics storage unit. The extracted feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistics storage unit is connected to the data receiving module and stores the user behavior statistics passed on by the data receiving module.
Storing the user behavior statistics received by the data receiving module and the user feature data extracted by the execution module in separate units makes data retrieval convenient, accurate and fast.
As a preferred version of the present embodiment, the execution module comprises an initialization unit, a training unit and an output unit.
The initialization unit initializes the parameters on which the user feature extraction process depends, which comprise: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold.
The user feature matrix X and the user training auxiliary matrices X_U, X_D and X_C are matrices with |A| rows and f columns, established from the current user set A and the current feature space dimension f. Each element of the user feature matrix X is initialized to a random number in the interval (0.2, 0.6), and each element of the user training auxiliary matrices X_U, X_D and X_C is initialized to 0. The item feature matrix Y and the item training auxiliary matrices Y_U, Y_D and Y_C are matrices with |B| rows and f columns, established from the current item set B and the current feature space dimension f. Each element of the item feature matrix Y is initialized to a random number in the interval (0.2, 0.6), and each element of the item training auxiliary matrices Y_U, Y_D and Y_C is initialized to 0.
The input of the training unit is connected to the initialization unit and to the data storage module. The training unit constructs the user feature data, comprising the user feature matrix X and the item feature matrix Y, from the parameters initialized by the initialization unit and the user behavior statistics in the data storage module. Each row vector of X corresponds to the non-negative behavior features of one user; each row vector of Y corresponds to the non-negative features describing how all known users operate on one item. Constructing the user feature data by training further comprises the following. The training unit first constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix. It then uses the augmented Lagrangian method to construct the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are the dual parameters, and solves for the parameters P, Q, X, Y, Γ and Κ that minimize the overall loss function on the known data set C in the user behavior statistical matrix. It then uses the non-negative alternating direction transformation to iteratively train P, Q, X, Y, Γ and Κ until the unified loss function L(P, Q, X, Y, Γ, Κ) converges with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix, or until the iteration control variable t equals the iteration upper limit n.
The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit, comprising the user feature matrix X and the item feature matrix Y, and stores them in the data storage module.
In a concrete implementation, the example analysis uses the number of training iterations as the index of convergence speed for user feature extraction: the fewer the training iterations, the faster the convergence. It uses the mean absolute error (MAE) as the index of data recovery accuracy: the lower the MAE, the higher the data recovery accuracy of the user feature extraction.
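The MAE index can be computed as in the following minimal Python sketch. It assumes that the recovered value for user u and item i is the inner product of row u of X and row i of Y, and that the evaluation entries are given as hypothetical (user, item, value) triples; both the function name and these choices are illustrative assumptions.

```python
def mean_absolute_error(entries, X, Y):
    """Average |r - prediction| over the given entries."""
    total = 0.0
    for u, i, r in entries:
        # Assumed prediction: inner product of the user and item feature vectors
        prediction = sum(x * y for x, y in zip(X[u], Y[i]))
        total += abs(r - prediction)
    return total / len(entries)

X = [[1.0, 2.0]]
Y = [[3.0, 1.0]]
mae = mean_absolute_error([(0, 0, 4.0), (0, 0, 7.0)], X, Y)
```

A lower value indicates that the extracted features reproduce the evaluated entries more closely.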
Fig. 3 compares the convergence speed of user feature extraction before and after applying the present embodiment. After the embodiment of the invention is applied, the number of iterations needed to extract user features under the non-negativity constraint drops markedly, and the convergence speed improves significantly.
Fig. 4 compares the data recovery accuracy of user feature extraction before and after applying the present embodiment. After the embodiment of the invention is applied, the mean absolute error (MAE) of user feature extraction under the non-negativity constraint drops markedly, and the data recovery accuracy improves significantly.
In this specification, a description that refers to the terms "an embodiment", "some embodiments", "an example", "a concrete example" or "some examples" means that the specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.