A user feature extraction method and extraction device based on non-negative alternating direction transformation
Technical field
The present invention relates to the technical field of computer big data processing, and more particularly to a user feature extraction method and extraction device based on non-negative alternating direction transformation for use in e-commerce systems.
Background art
Modern e-commerce systems have very large numbers of users and very large volumes of information. In such a system, the various objective behaviors of users, such as clicking, browsing, commenting and searching, accumulate as the system runs into a huge data set of historical user behavior. The data volume is at least on the order of terabytes, which makes this a typical big data environment.
In an e-commerce system, a typical data description structure is the user behavior statistical matrix. Each row corresponds to one user and each column to one item; an item is any objective object in the system that a user may operate on, such as a news article, a picture or a commodity. Each matrix element corresponds to the historical behavior data of a single user on a single item, and is computed by quantifying that user's objective historical behavior on that item with a mathematical statistics method that conforms to natural laws. Because the numbers of users and items in an e-commerce system are very large, the corresponding user behavior statistical matrix is also very large. At the same time, no user can operate on every item, and no item can be operated on by every user; in general, the known data in the user behavior statistical matrix are far fewer than the unknown data, so the matrix is extremely sparse.
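The sparsity described above can be illustrated with a minimal Python sketch. The toy data, the dictionary layout and the name `density` are assumptions made for illustration only; the text does not prescribe a storage format.

```python
# Hypothetical toy data: the known entries of a user behavior statistical
# matrix, stored as (user index, item index) -> value. Unknown entries are
# simply absent, so memory grows with the known data, not with the matrix size.
known = {
    (0, 0): 3.0,
    (0, 2): 1.0,
    (1, 1): 4.0,
    (2, 0): 2.0,
    (2, 3): 5.0,
}


def density(known_entries, num_users, num_items):
    """Return the fraction of matrix cells that hold known data."""
    return len(known_entries) / (num_users * num_items)


if __name__ == "__main__":
    # 5 known cells out of 3 x 4 = 12
    print(density(known, 3, 4))
```

In a production-scale matrix the density would be far lower than in this toy case, which is why only the known entries are iterated over.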
While the system is running, extracting user features from the known data in the user behavior statistical matrix makes it possible to analyze user behavior effectively and to mine regularities such as user categories and behavior patterns. Keeping the user features non-negative during extraction is key, because non-negative user features better conform to the natural law that user behavior data in an e-commerce system are positive, and therefore characterize user behavior better. Existing non-negative feature extraction techniques are used in computer vision. Their basic approach is to treat a given figure or image as a full-rank matrix and to factorize that matrix under non-negativity constraints, thereby extracting local object features of the figure or image. However, the user feature extraction problem in an e-commerce system differs greatly from the non-negative object feature extraction problem in computer vision. In computer vision, the matrix obtained from the figure or image is a full-rank matrix with no missing values, and the non-negative factorization of such a matrix can be handled by conventional iterative matrix operations. In an e-commerce system, by contrast, the user behavior statistical matrix is usually extremely sparse and contains a large number of missing values; it cannot be handled by traditional matrix factorization and instead requires non-negative latent factor analysis that operates on sparse matrices. Existing non-negative matrix latent factor analysis methods, however, suffer from slow convergence and low data recovery accuracy.
Therefore, how to perform non-negative latent factor analysis with fast convergence and high data recovery accuracy on the user behavior statistical matrix of an e-commerce system, which contains a large number of missing values, so as to obtain user features that describe the natural regularities of user behavior well, is a key problem to be solved in analyzing the massive data produced by modern e-commerce systems.
Summary of the invention
To overcome the defects of the prior art described above, the object of the present invention is to provide a user feature extraction method and extraction device based on non-negative alternating direction transformation. The invention acts directly on the set of known data in the user behavior statistical matrix, can handle an extremely sparse user behavior statistical matrix with a large number of missing values, converges quickly, recovers data with high accuracy, and can solve the user feature extraction problem in a big data processing environment.
To achieve the above object, the present invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:
S1. A server sends an instruction to perform user feature extraction to the extraction device;
S2. The extraction device receives the instruction and initializes parameters; the initialized parameters include: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold;
S3. The extraction device constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix;
S4. The extraction device applies the constraints to the accumulated absolute error ε(P, Q, X, Y) to guarantee that the parameters of matrices P and Q remain non-negative during training;
S5. The extraction device constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters;
S6. The extraction device judges whether the iteration control variable t has reached the upper limit n; if so, step S9 is performed; if not, step S7 is performed;
S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix; if so, step S9 is performed; if not, step S8 is performed;
S8. The extraction device iteratively trains P, Q, X, Y, Γ and Κ on the known data in the known data set C of the user behavior statistical matrix, and then performs step S6;
S9. The extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted-feature storage unit of the data storage module.
In this method, the feature space dimension f in step S2 is the dimension of the feature space in which the user features lie; it determines the dimension of each feature vector and is a positive integer.
The dual learning rate η is the learning rate used in the unified loss function to train the Lagrange multipliers; it is a floating-point number in the interval (0.0001, 0.05).
The Lagrangian augmentation factor λ is the factor in the unified loss function that controls the augmentation term applied to the constraints; it is a decimal in the interval (0.01, 0.1).
The user feature matrix X is a feature to be extracted. It is an |A| × f matrix, where A denotes the set of users stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is the feature vector of one user. In the embodiment of the present invention, the initial value of each element of X is set to a random number in the interval (0.4, 0.8).
The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of user features; each is an |A| × f matrix.
The item feature matrix Y is a feature to be extracted. It is a |B| × f matrix, where B denotes the set of items stored in the storage unit of the device. Each row of Y corresponds to one item, and each row vector of Y is the feature vector of the operations of all users on one item.
The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative training of user features; each is a |B| × f matrix.
The iteration control variable t is the variable that controls the feature training process; it is initialized to 0.
The iteration upper limit n is the variable that caps the number of training iterations; it is a positive integer.
The convergence decision threshold is the threshold parameter used to judge whether iterative training has converged.
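The initialization in step S2 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `init_parameters`, the dictionary layout, the name `tol` for the convergence decision threshold, the zero initialization of the auxiliary matrices in this snippet, the use of the same random range for Y, and the default values for f, n and `tol` are illustrative choices, not requirements stated above. Only the intervals for η, λ and the initial elements of X come from the text.

```python
import random


def init_parameters(num_users, num_items, f=25, eta=0.001, lam=0.05,
                    n=500, tol=0.0001, seed=0):
    """Build the parameter set of step S2 (hypothetical layout).

    eta must lie in (0.0001, 0.05) and lam in (0.01, 0.1), as stated in the text.
    """
    assert 0.0001 < eta < 0.05, "dual learning rate outside stated interval"
    assert 0.01 < lam < 0.1, "augmentation factor outside stated interval"
    rng = random.Random(seed)

    def rand_matrix(rows):
        # Random initial values in (0.4, 0.8); hitting an endpoint exactly
        # is possible in principle with uniform() but practically negligible.
        return [[rng.uniform(0.4, 0.8) for _ in range(f)] for _ in range(rows)]

    def zeros(rows):
        # Each row is a separate list so rows can be updated independently.
        return [[0.0] * f for _ in range(rows)]

    return {
        "f": f, "eta": eta, "lam": lam, "n": n, "tol": tol,
        "t": 0,  # iteration control variable starts at 0
        "X": rand_matrix(num_users),
        "X_U": zeros(num_users), "X_D": zeros(num_users), "X_C": zeros(num_users),
        "Y": rand_matrix(num_items),  # assumption: same range as X
        "Y_U": zeros(num_items), "Y_D": zeros(num_items), "Y_C": zeros(num_items),
    }
```

A call such as `init_parameters(3, 4)` would yield a 3 × 25 matrix X and a 4 × 25 matrix Y together with their auxiliary matrices.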
The method acts directly on the known data set in the user behavior statistical matrix, can handle an extremely sparse user behavior statistical matrix with a large number of missing values, converges quickly, recovers data with high accuracy, and can handle the user feature extraction problem in a big data processing environment.
Preferably, the accumulated absolute error ε(P, Q, X, Y) in step S3 is calculated subject to the constraints:
P = X, P ≥ 0,
Q = Y, Q ≥ 0,
where C denotes the known data set in the user behavior statistical matrix; r_{u,i} denotes the element in row u, column i of the user behavior statistical matrix, i.e. the historical behavior statistic of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix; and Q is the item feature constraint matrix.
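Because the error is defined only over the known data set C, it can be accumulated by visiting the known entries alone. The sketch below assumes that the prediction for entry (u, i) is the inner product of row u of X and row i of Y, and that the per-entry gaps are accumulated as half the sum of squares, a form common in latent factor models. Both are assumptions; replacing the square with `abs()` gives an absolute-value variant. The names `predict` and `accumulated_error` are hypothetical.

```python
def predict(X, Y, u, i):
    """Assumed model: inner product of user row u of X and item row i of Y."""
    return sum(x_uk * y_ik for x_uk, y_ik in zip(X[u], Y[i]))


def accumulated_error(known, X, Y):
    """Accumulate the error over the known entries only (assumed squared form).

    `known` maps (u, i) -> r_{u,i}; missing entries are never visited.
    """
    return 0.5 * sum((r - predict(X, Y, u, i)) ** 2
                     for (u, i), r in known.items())
```

The cost of one evaluation is proportional to |C| × f rather than to the full matrix size, which is what makes the approach workable on an extremely sparse matrix.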
The accumulated absolute error ε(P, Q, X, Y) constructed in step S3 fully expresses both the accumulated error and the non-negativity constraints, and at the same time provides the conditions for introducing the dual parameters.
Step S4 applies the constraints to the accumulated absolute error ε(P, Q, X, Y) to guarantee that the relevant model parameters remain non-negative during training.
The unified loss function constructed in step S5 unifies the loss function and its constraints by means of the method of Lagrange multipliers, so that the constraints are satisfied during training.
Preferably, step S4 comprises the following steps:
S4-1. For each element p_{u,k} in P, if it is not equal to the corresponding element x_{u,k} in X, set p_{u,k} = x_{u,k};
S4-2. For each element q_{i,k} in Q, if it is not equal to the corresponding element y_{i,k} in Y, set q_{i,k} = y_{i,k};
S4-3. For each element p_{u,k} in P, if it is less than 0, set p_{u,k} = 0;
S4-4. For each element q_{i,k} in Q, if it is less than 0, set q_{i,k} = 0.
Here p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P, and q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q.
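Steps S4-1 to S4-4 amount to copying the feature matrix into the constraint matrix and then clipping negative entries to zero. A minimal sketch, assuming the matrices are lists of rows and using the hypothetical name `apply_constraints`:

```python
def apply_constraints(P, X):
    """Apply S4-1 and S4-3 to P given X (call with Q, Y for S4-2 and S4-4).

    The matrix P is modified in place and also returned.
    """
    for u, x_row in enumerate(X):
        for k, x_uk in enumerate(x_row):
            if P[u][k] != x_uk:   # S4-1 / S4-2: enforce equality with X (or Y)
                P[u][k] = x_uk
            if P[u][k] < 0:       # S4-3 / S4-4: enforce non-negativity
                P[u][k] = 0.0
    return P
```

The net effect on each element is p_{u,k} = max(0, x_{u,k}), so P always stays non-negative even if an entry of X temporarily becomes negative during training.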
Preferably, the unified loss function in step S5 is constructed with the augmented Lagrangian method, where Γ and Κ are the dual parameters, γ_{u,k} denotes the element in row u, column k of Γ, and κ_{i,k} denotes the element in row i, column k of Κ. The augmented Lagrangian method adds an augmentation term for the corresponding constraint to the method of Lagrange multipliers. ρ is the augmentation parameter of the augmented Lagrangian method for the augmentation term that corresponds to matrix X; it is treated as a constant in the calculation.
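For orientation, the textbook augmented Lagrangian for the equality constraints P = X and Q = Y is sketched below. It is an assumption about the general shape only: the sign convention of the multiplier terms, the factor 1/2, and the use of per-row parameters ρ_u and τ_i inside the sums are illustrative choices, and the non-negativity constraints P ≥ 0, Q ≥ 0 are assumed to be handled separately by the projection of step S4.

```latex
L(P, Q, X, Y, \Gamma, \mathrm{K})
  = \varepsilon(P, Q, X, Y)
  + \sum_{u,k} \gamma_{u,k}\,\bigl(p_{u,k} - x_{u,k}\bigr)
  + \sum_{i,k} \kappa_{i,k}\,\bigl(q_{i,k} - y_{i,k}\bigr)
  + \sum_{u,k} \frac{\rho_u}{2}\,\bigl(p_{u,k} - x_{u,k}\bigr)^2
  + \sum_{i,k} \frac{\tau_i}{2}\,\bigl(q_{i,k} - y_{i,k}\bigr)^2
```

The first two sums are the ordinary Lagrange multiplier terms; the last two are the augmentation terms, which penalize any gap between the feature matrices and their constraint matrices.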
Preferably, the iterative training in step S8 comprises the following steps:
S8-1. Determine the iterative training target, i.e. adjust the parameters P, Q, X, Y, Γ and Κ so that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix. Here τ is the augmentation parameter of the augmented Lagrangian method for the augmentation term that corresponds to matrix Y; it is treated as a constant in the calculation.
S8-2. Using the non-negative alternating direction transformation, train the individual elements of P, Q, X, Y, Γ and Κ in sequence, for k = 1 to f.
S8-3. For each element of P, Q, X, Y, Γ and Κ, perform the training update with
ρ_u = λ|C(u)|, τ_i = λ|C(i)|,
where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i respectively, and τ_i is the augmentation parameter τ of the augmented Lagrangian method corresponding to row i of matrix Y.
The present invention also provides an extraction device for the user feature extraction method based on non-negative alternating direction transformation, comprising a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistical data collected by the server and passes the received data to the data storage module for storage. The data storage module is connected to the execution module; the execution module executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.
The data receiving module thus acquires the user behavior statistical data from the server, the execution module executes the user feature extraction instruction, and the data storage module stores both the user behavior statistical data collected by the data receiving module and the user feature data extracted by the execution module. The device acts directly on the known data set in the user behavior statistical matrix, can handle an extremely sparse user behavior statistical matrix with a large number of missing values, and can solve the user feature extraction problem in a big data processing environment.
Further, the data storage module comprises an extracted-feature storage unit and a statistical data storage unit. The extracted-feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistical data storage unit is connected to the data receiving module and stores the user behavior statistical data transmitted by the data receiving module.
Storing the user behavior statistical data collected by the data receiving module and the user feature data extracted by the execution module separately makes retrieving the data more convenient, accurate and fast.
Further, the execution module comprises an initialization unit, a training unit and an output unit.
The initialization unit initializes the parameters on which the user feature extraction process depends; the initialized parameters include: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold.
The input of the training unit is connected to the initialization unit and to the data storage module respectively. The training unit constructs the user feature data, including the user feature matrix X and the item feature matrix Y, from the parameters initialized by the initialization unit and the user behavior statistical data in the data storage module. The training unit first constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix; it then constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters; and it then iteratively trains P, Q, X, Y, Γ and Κ until the unified loss function L(P, Q, X, Y, Γ, Κ) converges with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix, or until the iteration control variable t equals the iteration upper limit n.
The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit and stores them in the data storage module.
Dividing the execution module into three units makes parameter initialization, data extraction and storage more accurate and faster when user feature data are extracted.
The beneficial effects of the invention are as follows. By using non-negative alternating direction transformation, the method acts directly on the known data set in the user behavior statistical matrix and has the following advantages:
1. It can handle an extremely sparse user behavior statistical matrix with a large number of missing values;
2. It converges quickly and recovers data with high accuracy, and can solve the user feature extraction problem in a big data processing environment.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic structural diagram of the invention;
Fig. 2 is a schematic flow chart of the invention;
Fig. 3 compares the convergence speed of user feature extraction before and after applying the embodiment of the invention;
Fig. 4 compares the data recovery accuracy of user feature extraction before and after applying the embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
In the description of the invention, it should be noted that, unless otherwise specified and limited, the terms "installed", "connected" and "connection" are to be understood broadly: a connection may, for example, be mechanical or electrical, may be internal between two elements, and may be direct or indirect through an intermediary. A person of ordinary skill in the art can understand the specific meaning of these terms according to the circumstances.
As shown in Fig. 1, the invention provides a user feature extraction method based on non-negative alternating direction transformation, comprising the following steps:
S1. A server sends an instruction to perform user feature extraction to the extraction device;
S2. The extraction device receives the instruction and initializes parameters; the initialized parameters include: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold.
Here, the feature space dimension f is the dimension of the feature space in which the user features lie; it determines the dimension of each feature vector and is a positive integer, for example 25.
The dual learning rate η is the learning rate used in the unified loss function to train the Lagrange multipliers; it is a floating-point number in the interval (0.0001, 0.05), for example 0.001.
The Lagrangian augmentation factor λ is the factor in the unified loss function that controls the augmentation term applied to the constraints; it is a decimal in the interval (0.01, 0.1), for example 0.05.
The user feature matrix X is a feature to be extracted. It is an |A| × f matrix, where A denotes the set of users stored in the storage unit of the device. Each row of X corresponds to one user, and each row vector of X is the feature vector of one user. In the embodiment of the present invention, the initial value of each element of X is set to a random number in the interval (0.4, 0.8), for example 0.47.
The user training auxiliary matrices X_U, X_D and X_C are data structures that assist the iterative training of user features; each is an |A| × f matrix. X_U caches the initial value of X during training, X_D caches the target value of X during training, and X_C caches the updated value of X during training. In the embodiment of the present invention, every element of X_U, X_D and X_C is initialized to 0.
The item feature matrix Y is a feature to be extracted. It is a |B| × f matrix, where B denotes the set of items stored in the storage unit of the device. Each row of Y corresponds to one item, and each row vector of Y is the feature vector of the operations of all users on one item.
The item training auxiliary matrices Y_U, Y_D and Y_C are data structures that assist the iterative training of user features; each is a |B| × f matrix. Y_U caches the initial value of Y during training, Y_D caches the target value of Y during training, and Y_C caches the updated value of Y during training. In the embodiment of the present invention, every element of Y_U, Y_D and Y_C is initialized to 0.
The iteration control variable t is the variable that controls the feature training process; it is initialized to 0.
The iteration upper limit n is the variable that caps the number of training iterations; it is a positive integer, for example 500.
The convergence decision threshold is the threshold parameter used to judge whether iterative training has converged. In the embodiment of the present invention, it is set to a decimal in the open interval (0, 0.001), for example 0.0001.
S3. The extraction device constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix.
In this step the accumulated absolute error ε(P, Q, X, Y) is calculated subject to the constraints:
P = X, P ≥ 0,
Q = Y, Q ≥ 0,
where C denotes the known data set in the user behavior statistical matrix; r_{u,i} denotes the element in row u, column i of the user behavior statistical matrix, i.e. the historical behavior statistic of user u on item i; x_{u,k} denotes the element in row u, column k of the user feature matrix X; y_{i,k} denotes the element in row i, column k of the item feature matrix Y; P is the user feature constraint matrix; and Q is the item feature constraint matrix.
S4. The extraction device applies the constraints to the accumulated absolute error ε(P, Q, X, Y); the constraints are P = X, P ≥ 0, Q = Y, Q ≥ 0.
S5. The extraction device constructs the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters.
In this step the unified loss function is constructed with the augmented Lagrangian method.
S6. The extraction device judges whether the iteration control variable t has reached the upper limit n.
If the upper limit has been reached, step S9 is performed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted-feature storage unit of the data storage module, completing the extraction of user features.
If the upper limit has not been reached, step S7 is performed.
In this step, the extraction device first increments the iteration control variable t by 1 and then judges whether t is greater than the iteration upper limit n.
S7. The extraction device judges whether the unified loss function L has converged with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix.
If so, step S9 is performed: the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted-feature storage unit of the data storage module, completing the extraction of user features.
If not, step S8 is performed.
In this step, the user feature extraction device decides convergence on the known data set C in the user behavior statistical matrix as follows: it compares the value of the unified loss function L before the current round of iterative training starts with its value before the previous round started. If the absolute value of the difference is less than the convergence decision threshold, training is judged to have converged; otherwise it is judged not to have converged.
S8. The extraction device iteratively trains P, Q, X, Y, Γ and Κ on the known data in the known data set C of the user behavior statistical matrix, and then performs step S6. This cycle repeats until step S9 is completed, in which the extraction device outputs the user feature matrix X and the item feature matrix Y obtained by iterative training and stores them in the extracted-feature storage unit of the data storage module, completing the extraction of user features.
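The control flow of steps S6 to S9 can be sketched independently of the training rule itself. In the sketch below, `loss_fn` and `train_one_round` are hypothetical callables standing in for the evaluation of the unified loss function L and for one round of step S8, and `tol` stands for the convergence decision threshold; none of these names appear in the text.

```python
def run_training(loss_fn, train_one_round, n, tol):
    """Drive the S6 -> S7 -> S8 loop and report why it stopped.

    Returns (t, reason), where reason is "iteration_limit" or "converged".
    """
    t = 0                 # iteration control variable, initialized to 0
    prev_loss = None      # loss measured before the previous round started
    while True:
        t += 1            # S6: increment t first, then compare with n
        if t > n:
            return t, "iteration_limit"
        loss = loss_fn()  # loss before the current round starts
        # S7: compare with the loss before the previous round started
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            return t, "converged"
        prev_loss = loss
        train_one_round()  # S8: one round of iterative training
```

With a toy loss that halves on every round, starting from 1.0 and with `tol = 0.01`, the loop stops as converged once two consecutive pre-round losses differ by less than 0.01; with a small `n` it stops at the iteration limit instead. In both cases the caller would then perform step S9 and store X and Y.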
As a preferred scheme of this embodiment, step S4 comprises the following steps:
S4-1. For each element p_{u,k} in P, if it is not equal to the corresponding element x_{u,k} in X, set p_{u,k} = x_{u,k};
S4-2. For each element q_{i,k} in Q, if it is not equal to the corresponding element y_{i,k} in Y, set q_{i,k} = y_{i,k};
S4-3. For each element p_{u,k} in P, if it is less than 0, set p_{u,k} = 0;
S4-4. For each element q_{i,k} in Q, if it is less than 0, set q_{i,k} = 0.
Here p_{u,k} denotes the element in row u, column k of the user feature constraint matrix P; x_{u,k} denotes the element in row u, column k of the user feature matrix X; q_{i,k} denotes the element in row i, column k of the item feature constraint matrix Q; and y_{i,k} denotes the element in row i, column k of the item feature matrix Y.
The iterative training in step S8 comprises the following steps:
S8-1. Determine the iterative training target, i.e. solve for the parameters P, Q, X, Y, Γ and Κ so that the unified loss function L is minimized with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix.
S8-2. Using the non-negative alternating direction transformation, train the individual elements of P, Q, X, Y, Γ and Κ in sequence, for k = 1 to f, where t and t+1 denote the t-th and (t+1)-th rounds of iteration respectively.
S8-3. For each element of P, Q, X, Y, Γ and Κ, perform the training update with
ρ_u = λ|C(u)|, τ_i = λ|C(i)|,
where C(u) and C(i) denote the subsets of the known data set C associated with user u and item i respectively.
After P, Q, X, Y, Γ and Κ have been iteratively trained in this way, step S6 is performed again, and the cycle repeats until the extraction of user features is complete.
The present invention also provides an extraction device for the user feature extraction method based on non-negative alternating direction transformation. As shown in Fig. 2, the device comprises a data receiving module, a data storage module and an execution module. The data receiving module is connected to the data storage module; it receives the user behavior statistical data collected by the server and passes the received data to the data storage module for storage. The data storage module is connected to the execution module; the execution module executes the user feature extraction instruction sent by the server and stores the extracted user feature data in the data storage module.
The data receiving module thus acquires the user behavior statistical data from the server, the execution module executes the user feature extraction instruction, and the data storage module stores both the user behavior statistical data collected by the data receiving module and the user feature data extracted by the execution module. The device acts directly on the known data set in the user behavior statistical matrix, can handle an extremely sparse user behavior statistical matrix with a large number of missing values, and can solve the user feature extraction problem in a big data processing environment.
In this embodiment, the data storage module comprises an extracted-feature storage unit and a statistical data storage unit. The extracted-feature storage unit is connected to the execution module and stores the user feature data extracted by the execution module; the statistical data storage unit is connected to the data receiving module and stores the user behavior statistical data transmitted by the data receiving module.
Storing the user behavior statistical data collected by the data receiving module and the user feature data extracted by the execution module separately makes retrieving the data more convenient, accurate and fast.
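The module layout described above can be sketched as a few small classes. This is only an illustration of how the two storage units are kept separate; the class names, attribute names and in-memory dictionaries are all assumptions, and a real device would likely back the units with persistent storage.

```python
from dataclasses import dataclass, field


@dataclass
class DataStorageModule:
    """Holds the two storage units separately (hypothetical layout)."""
    statistics_unit: dict = field(default_factory=dict)  # (user, item) -> statistic
    feature_unit: dict = field(default_factory=dict)     # "X" / "Y" -> matrix


@dataclass
class DataReceivingModule:
    """Receives statistics collected by the server and passes them to storage."""
    storage: DataStorageModule

    def receive(self, records):
        for user, item, value in records:
            self.storage.statistics_unit[(user, item)] = value


@dataclass
class ExecutionModule:
    """Reads statistics from storage and writes extracted features back."""
    storage: DataStorageModule

    def known_data(self):
        return self.storage.statistics_unit

    def store_features(self, X, Y):
        self.storage.feature_unit["X"] = X
        self.storage.feature_unit["Y"] = Y
```

Because the receiving module writes only to the statistics unit and the execution module writes only to the feature unit, the two kinds of data never share a container.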
As a preferred scheme of this embodiment, the execution module comprises an initialization unit, a training unit and an output unit.
The initialization unit initializes the parameters on which the user feature extraction process depends; the initialized parameters include: feature space dimension f, dual learning rate η, Lagrangian augmentation factor λ, user feature matrix X, user training auxiliary matrices X_U, X_D and X_C, item feature matrix Y, item training auxiliary matrices Y_U, Y_D and Y_C, iteration control variable t, iteration upper limit n, and a convergence decision threshold. The user feature matrix X and the user training auxiliary matrices X_U, X_D and X_C are built as matrices with |A| rows and f columns according to the current user set A and the current feature space dimension f; the initial value of each element of X is a random number in the interval (0.2, 0.6), and the initial value of each element of X_U, X_D and X_C is 0. The item feature matrix Y and the item training auxiliary matrices Y_U, Y_D and Y_C are built as matrices with |B| rows and f columns according to the current item set B and the current feature space dimension f; the initial value of each element of Y is a random number in the interval (0.2, 0.6), and the initial value of each element of Y_U, Y_D and Y_C is 0.
The input of the training unit is connected to the initialization unit and to the data storage module respectively. The training unit constructs the user feature data, including the user feature matrix X and the item feature matrix Y, from the parameters initialized by the initialization unit and the user behavior statistical data in the data storage module. Each row vector of X corresponds to the non-negative behavior features of one user; each row vector of Y corresponds to the non-negative operation features of all known users for one item. Constructing the user feature data by training further comprises the following. The training unit first constructs the accumulated absolute error ε(P, Q, X, Y), where P is the user feature constraint matrix and Q is the item feature constraint matrix. It then uses the augmented Lagrangian method to construct the unified loss function L(P, Q, X, Y, Γ, Κ), where Γ and Κ are dual parameters, and solves for the parameters P, Q, X, Y, Γ and Κ that minimize the unified loss function on the known data set C in the user behavior statistical matrix. It then applies the non-negative alternating direction transformation to iteratively train P, Q, X, Y, Γ and Κ until the unified loss function L(P, Q, X, Y, Γ, Κ) converges with respect to P, Q, X, Y, Γ and Κ on the known data set C in the user behavior statistical matrix, or until the iteration control variable t equals the iteration upper limit n.
The input of the output unit is connected to the output of the training unit, and the output of the output unit is connected to the data storage module. The output unit outputs the user feature data constructed by the training unit, including the user feature matrix X and the item feature matrix Y, and stores them in the data storage module.
In a specific implementation, the example analysis uses the number of training iterations as the index of convergence speed for user feature extraction: the fewer the training iterations, the faster the extraction converges. It uses the mean absolute error (MAE) as the index of data recovery accuracy for user feature extraction: the lower the MAE, the higher the data recovery accuracy.
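The mean absolute error can be computed as the average absolute gap between known values and the values recovered from the extracted features. The sketch below assumes that a recovered value is the inner product of the corresponding rows of X and Y and that the evaluation runs over a set of known entries supplied by the caller; how that evaluation set is chosen is not specified in the text, and the name `mean_absolute_error` is hypothetical.

```python
def mean_absolute_error(entries, X, Y):
    """MAE over `entries`, a non-empty mapping (u, i) -> known value r_{u,i}.

    The recovered value is assumed to be the inner product of row u of X
    and row i of Y.
    """
    if not entries:
        raise ValueError("MAE is undefined for an empty evaluation set")
    total = 0.0
    for (u, i), r in entries.items():
        recovered = sum(x_uk * y_ik for x_uk, y_ik in zip(X[u], Y[i]))
        total += abs(r - recovered)
    return total / len(entries)
```

A lower return value means that the extracted features reproduce the known behavior statistics more closely.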
Fig. 3 compares the convergence speed of user feature extraction before and after applying this embodiment. After the embodiment of the invention is applied, the number of iterations needed to extract user features under the non-negativity constraint drops markedly, and the convergence speed improves markedly.
Fig. 4 compares the data recovery accuracy of user feature extraction before and after applying this embodiment. After the embodiment of the invention is applied, the mean absolute error (MAE) of user feature extraction under the non-negativity constraint drops markedly, and the data recovery accuracy improves markedly.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and purpose of the invention. The scope of the invention is defined by the claims and their equivalents.