CN110472659A

CN110472659A - Data processing method, apparatus, computer-readable storage medium, and computer device

Info

Publication number: CN110472659A
Application number: CN201910604468.4A
Authority: CN
Inventors: 黄严汉
Original assignee: Ping An Life Insurance Company of China Ltd
Current assignee: Ping An Life Insurance Company of China Ltd
Priority date: 2019-07-05
Filing date: 2019-07-05
Publication date: 2019-11-19
Anticipated expiration: 2039-07-05
Also published as: CN110472659B

Abstract

This application relates to big data and provides a data processing method, an apparatus, a computer-readable storage medium, and a computer device. The method comprises: inputting a first dimensionality-reduced feature into a trained feature crossing model to obtain at least one target dimensionality-reduced feature pair; inputting each target dimensionality-reduced feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimensionality-reduced feature pair; inputting the first weight information and a second dimensionality-reduced feature into a trained weighting model, and outputting second weight information corresponding to the second dimensionality-reduced feature; and sorting object data according to the second weight information to generate a corresponding object data sequence, and sending the object data sequence to a corresponding terminal, so that the terminal displays the object data in the order given by the object data sequence. An accurate object data sequence can thus be obtained through feature processing, and the corresponding object data is displayed according to that sequence, which improves the accuracy of object data display.

Description

Data processing method, apparatus, computer-readable storage medium, and computer device

Technical field

This application relates to the field of computer technology, and in particular to a data processing method, an apparatus, a computer-readable storage medium, and a computer device.

Background

With the rapid development of computer technology, machine learning has been widely applied in many fields. Machine learning is an important branch of artificial intelligence that draws on several disciplines, such as statistics, matrix analysis, and optimization. In essence, it derives general rules from data through automatic analysis and uses the learned rules to make predictions on unknown data. The emergence of machine learning has greatly facilitated daily life.

Traditionally, when machine learning is used for feature selection and feature weight calculation, insufficient attention is paid to users' individual information. This easily leads to inaccurate feature processing, so that an accurate object data sequence cannot be obtained through feature processing.

Summary of the invention

In view of the above technical problem, it is necessary to provide a data processing method, an apparatus, a computer device, and a storage medium that can obtain an accurate object data sequence through feature processing and display the corresponding object data according to the object data sequence, thereby improving the accuracy of object data display.

A data processing method, the method comprising:

Obtaining behavioral data and object data, and featurizing the behavioral data and the object data to obtain a behavioral data feature and an object data feature;

Performing kernel principal component analysis on the behavioral data feature and the object data feature to obtain a first dimensionality-reduced feature corresponding to the behavioral data feature and a second dimensionality-reduced feature corresponding to the object data feature;

Inputting the first dimensionality-reduced feature into a trained feature crossing model to obtain at least one target dimensionality-reduced feature pair;

Inputting each target dimensionality-reduced feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimensionality-reduced feature pair;

Inputting the first weight information and the second dimensionality-reduced feature into a trained weighting model, and outputting second weight information corresponding to the second dimensionality-reduced feature;

Sorting the object data according to the second weight information to generate a corresponding object data sequence, and sending the object data sequence to a corresponding terminal, so that the terminal displays the object data in the order given by the object data sequence.

In one embodiment, the method further comprises:

Obtaining a first feature mean corresponding to the behavioral data feature and a second feature mean corresponding to the object data feature;

Removing the mean from the behavioral data feature and the object data feature according to the first feature mean and the second feature mean to obtain target features, the target features comprising the mean-removed behavioral data feature and object data feature;

Obtaining the eigenvalues and eigenvectors corresponding to the target features, and sorting the eigenvectors according to the eigenvalues to obtain a sorting result;

Building a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and mapping the behavioral data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality-reduced feature corresponding to the behavioral data feature and the second dimensionality-reduced feature corresponding to the object data feature.

In one embodiment, the method further comprises:

Obtaining a first dimensionality-reduced sub-feature from the first dimensionality-reduced feature;

Associating the first dimensionality-reduced sub-feature with each second dimensionality-reduced sub-feature in the first dimensionality-reduced feature to obtain each target dimensionality-reduced feature pair.

In one embodiment, the method further comprises:

Generating an object data packet from the object data, and generating an object data sequence packet from the object data sequence;

Sending the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence packet.

In one embodiment, the method further comprises:

Obtaining model features, and dividing the model features into training features, validation features, and test features;

Inputting the training features into a base feature crossing model for training to obtain a preliminary feature crossing model;

Inputting the validation features into the preliminary feature crossing model for validation to obtain a validation result;

Adjusting the parameters of the preliminary feature crossing model according to the validation result to obtain a target feature crossing model;

Inputting the test features into the target feature crossing model for testing to obtain a test result;

When the test result meets a preset test result, using the target feature crossing model as the feature crossing model.

A data processing apparatus, the apparatus comprising:

A data acquisition module, configured to obtain behavioral data and object data, and to featurize the behavioral data and the object data to obtain a behavioral data feature and an object data feature;

A feature dimensionality reduction module, configured to perform kernel principal component analysis on the behavioral data feature and the object data feature to obtain a first dimensionality-reduced feature corresponding to the behavioral data feature and a second dimensionality-reduced feature corresponding to the object data feature;

A feature crossing module, configured to input the first dimensionality-reduced feature into a trained feature crossing model to obtain at least one target dimensionality-reduced feature pair;

A weight calculation module, configured to input each target dimensionality-reduced feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimensionality-reduced feature pair;

A feature weighting module, configured to input the first weight information and the second dimensionality-reduced feature into a trained weighting model, and to output second weight information corresponding to the second dimensionality-reduced feature;

A data display module, configured to sort the object data according to the second weight information to generate a corresponding object data sequence, and to send the object data sequence to a corresponding terminal, so that the terminal displays the object data in the order given by the object data sequence.

In one embodiment, the apparatus further comprises:

A mean acquisition module, configured to obtain a first feature mean corresponding to the behavioral data feature and a second feature mean corresponding to the object data feature;

A feature processing module, configured to remove the mean from the behavioral data feature and the object data feature according to the first feature mean and the second feature mean to obtain target features, the target features comprising the mean-removed behavioral data feature and object data feature;

A vector sorting module, configured to obtain the eigenvalues and eigenvectors corresponding to the target features, and to sort the eigenvectors according to the eigenvalues to obtain a sorting result;

A feature mapping module, configured to build a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and to map the behavioral data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality-reduced feature corresponding to the behavioral data feature and the second dimensionality-reduced feature corresponding to the object data feature.

In one embodiment, the apparatus further comprises:

A sub-feature acquisition module, configured to obtain a first dimensionality-reduced sub-feature from the first dimensionality-reduced feature;

A feature pair acquisition module, configured to associate the first dimensionality-reduced sub-feature with each second dimensionality-reduced sub-feature in the first dimensionality-reduced feature to obtain each target dimensionality-reduced feature pair.

A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the program.

A computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above method.

With the above data processing method, apparatus, computer-readable storage medium, and computer device, the server obtains behavioral data and object data and featurizes them to obtain a behavioral data feature and an object data feature. It then performs kernel principal component analysis on both features to obtain a first dimensionality-reduced feature corresponding to the behavioral data feature and a second dimensionality-reduced feature corresponding to the object data feature. By performing kernel principal component analysis on the behavioral data feature and the object data feature, the server avoids the curse of dimensionality (the phenomenon in which the amount of computation grows exponentially as the number of dimensions increases), which reduces the server's computation and strengthens the generalization ability of the feature crossing model. The server then inputs the first dimensionality-reduced feature into the trained feature crossing model to obtain at least one target dimensionality-reduced feature pair, and inputs each target dimensionality-reduced feature pair into the trained weight calculation model to obtain the first weight information corresponding to each pair. Existing technical solutions usually obtain weight information for a single dimensionality-reduced feature, which easily makes the final object data sequence inaccurate. This technical solution instead obtains target dimensionality-reduced feature pairs and then the first weight information corresponding to each pair, which takes full account of the connections among a user's personal information and helps generate an accurate object data sequence. The server inputs the first weight information and the second dimensionality-reduced feature into the trained weighting model, outputs the second weight information corresponding to the second dimensionality-reduced feature, generates the object data sequence corresponding to the object data according to the second weight information, and sends the object data sequence to the corresponding terminal, so that the terminal displays the object data in the order given by the sequence. Through the cooperation of multiple processing steps (data featurization, feature dimensionality reduction, generation of dimensionality-reduced feature pairs, calculation of the weight information of those pairs, and generation of the object data sequence), an accurate object data sequence can be obtained through feature processing, and the corresponding object data is displayed according to it, which improves the accuracy of object data display.

Brief description of the drawings

Fig. 1 is a diagram of the application environment of the data processing method in one embodiment;

Fig. 2 is a flowchart of the data processing method in one embodiment;

Fig. 3 is a flowchart of the data processing method in another embodiment;

Fig. 4 is a flowchart of the data processing method in a further embodiment;

Fig. 5 is a structural block diagram of the data processing apparatus in one embodiment;

Fig. 6 is a structural block diagram of the computer device in one embodiment.

Detailed description

To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.

The data processing method provided in the embodiments of the present invention can be applied in the application environment shown in Fig. 1, in which the method is applied to a data processing system. The data processing system includes a terminal 110 and a server 120, which are connected via a network. The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a laptop, and the like. The server 120 may be implemented as an independent server or as a server cluster composed of multiple servers.

Based on the above data processing system, the server 120 obtains behavioral data and object data and featurizes them to obtain a behavioral data feature and an object data feature. The server 120 performs kernel principal component analysis on the behavioral data feature and the object data feature to obtain a first dimensionality-reduced feature corresponding to the behavioral data feature and a second dimensionality-reduced feature corresponding to the object data feature. The server 120 inputs the first dimensionality-reduced feature into the trained feature crossing model to obtain at least one target dimensionality-reduced feature pair, and inputs each target dimensionality-reduced feature pair into the trained weight calculation model to obtain the first weight information corresponding to each pair. The server 120 inputs the first weight information and the second dimensionality-reduced feature into the trained weighting model and outputs the second weight information corresponding to the second dimensionality-reduced feature. The server 120 sorts the object data according to the second weight information to generate a corresponding object data sequence and sends the object data sequence to the corresponding terminal 110, so that the terminal 110 displays the object data in the order given by the object data sequence.

In one embodiment, as shown in Fig. 2, a data processing method is provided. This embodiment is described mainly by taking the method as applied to the server 120 in Fig. 1 above as an example. Referring to Fig. 2, the data processing method specifically comprises the following steps:

S202: Obtain behavioral data and object data, and featurize the behavioral data and the object data to obtain a behavioral data feature and an object data feature.

Here, behavioral data refers to data obtained by the server that relates to behavior at the terminal, and object data refers to the object information to be sorted and displayed. Behavioral data includes, but is not limited to, website logs, search engine logs, account browsing logs, and external environment data. Website logs are the account-related behavior information that a website records when an account visits that target website. Search engine logs are the account-related behavior information on a search engine recorded by the search engine's logging system. Account browsing logs are the account-related behavior information on the search engine recorded through specific tools and channels. External environment data refers to data such as mobile Internet traffic, growth in mobile Internet accounts, and self-paid service plans. Object data includes, but is not limited to, product data, information data, and service data. A behavioral data feature is the feature information obtained after the server featurizes the behavioral data and converts it into a feature representation; an object data feature is the feature information obtained after the server featurizes the object data and converts it into a feature representation.

Specifically, the server may obtain the behavioral data and the object data from the terminal or from other distributed servers. Featurization means that the server builds features of different dimensions from the behavioral data and the object data, converting them into feature information on which kernel principal component analysis can be performed. Featurization includes feature construction, feature extraction, and feature selection. Feature construction builds new features from the raw data and requires finding features that have physical meaning. Feature extraction builds new features automatically, converting the original features into a set of features with clear physical or statistical meaning, or into kernel features, such as geometric features and texture. Feature selection picks the most statistically meaningful subset from the feature set and deletes irrelevant features, thereby reducing dimensionality. In essence, featurizing the behavioral data and the object data means representing them with a more efficient encoding (features). Information represented as features suffers little loss, and the regularities contained in the raw data (the behavioral data and the object data) are preserved. Converting the behavioral data and the object data into a behavioral data feature and an object data feature reduces the uncertainty factors in the raw data (white noise, abnormal data, missing data, and so on), so that an accurate object data sequence can be obtained through feature processing.

S204: Perform kernel principal component analysis on the behavioral data feature and the object data feature to obtain a first dimensionality-reduced feature corresponding to the behavioral data feature and a second dimensionality-reduced feature corresponding to the object data feature.

Here, the first dimensionality-reduced feature is the feature information obtained after the server reduces the dimensionality of the behavioral data feature, and the second dimensionality-reduced feature is the feature information obtained after the server reduces the dimensionality of the object data feature. Kernel principal component analysis transforms the raw data, through a linear transformation, into a representation whose dimensions are linearly independent, and can be used to extract the main feature components of the data.

Specifically, the server performs kernel principal component analysis on the behavioral data feature and the object data feature through the following steps: 1) arrange the raw data by columns into a matrix X with n rows and m columns; 2) zero-mean each row of X (each row represents an attribute field), that is, subtract the mean of that row; 3) compute the covariance matrix; 4) compute the eigenvalues of the covariance matrix and the corresponding eigenvectors r; 5) arrange the eigenvectors as rows of a matrix, from top to bottom in descending order of their eigenvalues, and take the first k rows to form the matrix P; 6) the product PX is the data after dimensionality reduction to k dimensions. By performing kernel principal component analysis on the behavioral data feature and the object data feature, the server can represent originally very high-dimensional data with a small number of representative dimensions without losing key information, which reduces the server's feature computation and improves its processing speed.
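The six steps above can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function name `pca_reduce` and its parameters are hypothetical, and power iteration with deflation is used here as a stand-in for a full eigendecomposition. Note also that the listed steps correspond to linear principal component analysis; a kernel variant would start from a centered kernel matrix instead of the covariance matrix, which is not shown.

```python
import math
import random


def pca_reduce(X, k, iters=200, seed=0):
    """Reduce X (n attribute rows x m sample columns) to k rows.

    Hypothetical helper: returns (Y, eigvals), where Y = P * X_centered.
    """
    n, m = len(X), len(X[0])
    # Step 2: zero-mean each row (each row is one attribute field).
    Xc = []
    for row in X:
        mean = sum(row) / m
        Xc.append([x - mean for x in row])
    # Step 3: covariance matrix C = (1/m) * Xc * Xc^T, size n x n.
    C = [[sum(Xc[a][t] * Xc[b][t] for t in range(m)) / m for b in range(n)]
         for a in range(n)]
    rng = random.Random(seed)
    P, eigvals = [], []
    # Steps 4-5: extract the k leading eigenpairs, largest eigenvalue first.
    for _ in range(k):
        v = [rng.random() + 0.5 for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(n)) for a in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining variance is numerically zero
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(n)) for a in range(n))
        eigvals.append(lam)
        P.append(v)
        # Deflation: remove the component just found from C.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(n)] for a in range(n)]
    # Step 6: Y = P * Xc is the data reduced to k dimensions.
    Y = [[sum(p[a] * Xc[a][t] for a in range(n)) for t in range(m)] for p in P]
    return Y, eigvals
```

For two perfectly correlated attributes, such as `[[1, 2, 3, 4], [2, 4, 6, 8]]`, a single component (`k=1`) captures all of the variance, so the two rows collapse into one.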

S206: Input the first dimensionality-reduced feature into the trained feature crossing model to obtain at least one target dimensionality-reduced feature pair.

Here, a target dimensionality-reduced feature pair is the combination obtained when the server pairs up the first dimensionality-reduced features two at a time. Since there is at least one first dimensionality-reduced feature, there is at least one target dimensionality-reduced feature pair. Specifically, the feature crossing model is a model trained in advance for crossing the first dimensionality-reduced features; by learning high-order crossed features efficiently, the server reduces the work of manual feature engineering. In prediction tasks, the relationship between features is mostly regarded as an "and" relationship rather than an "add" relationship. For example, compare the group of people who are male and like games with the group of males taken together with the group of people who like games: the former combination better reflects the meaning of feature crossing.
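The pairwise combination described above can be illustrated with a short sketch. The trained feature crossing model itself is not specified in the text, so this only enumerates the pairs; the function name `cross_feature_pairs` and the use of a product as the crossed value are assumptions made for illustration.

```python
from itertools import combinations


def cross_feature_pairs(features):
    """Enumerate every unordered pair of named features.

    `features` maps a feature name to its value. The product used as the
    crossed value is an illustrative assumption that mimics an "and"
    interaction; the source does not state how pairs are scored.
    """
    pairs = []
    for (name_a, value_a), (name_b, value_b) in combinations(features.items(), 2):
        pairs.append(((name_a, name_b), value_a * value_b))
    return pairs


# Three features yield three pairs: (a, b), (a, c), (b, c).
pairs = cross_feature_pairs({"a": 2.0, "b": 3.0, "c": 4.0})
```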

S208: Input each target dimensionality-reduced feature pair into the trained weight calculation model to obtain the first weight information corresponding to each target dimensionality-reduced feature pair.

Here, the weight calculation model is a model trained in advance for calculating the weight information of feature pairs. The first weight information is the weight information corresponding to a target dimensionality-reduced feature pair. Since there is at least one target dimensionality-reduced feature pair, there is at least one piece of first weight information.

Specifically, the server may determine the first weight information corresponding to each target dimensionality-reduced feature pair using an objective weighting method. Objective weighting methods include, but are not limited to, principal component analysis and the entropy method. Determining the first weight information by principal component analysis mainly comprises the following steps: (1) standardize the data, because different data have inconsistent units and must be made dimensionless; (2) perform factor analysis (principal component method) on the standardized data, using varimax rotation; (3) write out the main factor scores and the variance contribution rate of each main factor; (4) compute the indicator weights, that is, the first weight information corresponding to each target dimensionality-reduced feature pair. The entropy method is a mathematical method for judging the degree of dispersion of an indicator. The greater the dispersion, the greater that indicator's influence on the overall evaluation. The dispersion of an indicator can be judged by its entropy value, and the server then determines the first weight information corresponding to each target dimensionality-reduced feature pair according to that dispersion.
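As a rough illustration of the entropy method mentioned above, the sketch below uses the commonly cited formulation: each indicator column is normalized to proportions, its entropy is scaled by ln(n), and the weight is proportional to one minus that entropy. The function name `entropy_weights` and the input layout are assumptions; the source does not give the exact formula it uses.

```python
import math


def entropy_weights(matrix):
    """Entropy-method weights for the columns (indicators) of `matrix`.

    Assumes at least two rows, non-negative values, and a positive sum in
    every column. Rows are samples; columns are indicators.
    """
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        # Normalized entropy in [0, 1]; 1 means the column is uniform.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    spread = sum(divergences)
    if spread == 0:  # every indicator is uniform: fall back to equal weights
        return [1.0 / m] * m
    return [d / spread for d in divergences]
```

An indicator whose values are identical across samples carries no discriminating information and receives a weight of zero, which matches the statement that greater dispersion means greater influence.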

S210: Input the first weight information and the second dimensionality-reduced feature into the trained weighting model, and output the second weight information corresponding to the second dimensionality-reduced feature.

Here, the weighting model is a model trained in advance for weighting the second dimensionality-reduced feature, and the second weight information is the weight information corresponding to each second dimensionality-reduced feature. The weighting model may use the factor analysis weighting method, the information content weighting method, the independence weighting method, the standard deviation method, and so on to output the second weight information corresponding to the second dimensionality-reduced feature.

Specifically, the factor analysis weighting method uses factor analysis from mathematical statistics to calculate the cumulative contribution rate of the common factors for each indicator and determines the weights from it. The larger the cumulative contribution rate, the greater the indicator's effect on the common factors and the larger its weight. The information content weighting method determines weights from the discriminating information contained in each evaluation indicator. It uses the coefficient of variation: the larger the coefficient of variation, the larger the assigned weight. The coefficient of variation (CV) of each indicator is calculated and used as the weight score, which is then normalized to obtain the information content weight coefficient. The independence weighting method uses multiple regression from mathematical statistics to calculate the multiple correlation coefficient and determines the weights from it; the larger the multiple correlation coefficient, the larger the assigned weight. In the standard deviation method, the larger the standard deviation of an indicator, the greater the variation of its values, the more information it provides, the greater its role in the overall evaluation, and the larger its weight. Conversely, the smaller the standard deviation of an indicator, the smaller the variation of its values, the less information it provides, the smaller its role in the overall evaluation, and the smaller its weight should be.
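The information content weighting method described above (coefficient of variation followed by normalization) can be sketched as below. The function name `cv_weights` is hypothetical, and the use of the population standard deviation rather than the sample standard deviation is an assumption, since the source does not say which is intended.

```python
import statistics


def cv_weights(columns):
    """Coefficient-of-variation weights, one per indicator column.

    Assumes every column has a non-zero mean and that at least one column
    varies; `columns` is a list of value lists, one per indicator.
    """
    cvs = [statistics.pstdev(col) / abs(statistics.mean(col)) for col in columns]
    total = sum(cvs)
    # Normalize so that the weights sum to 1.
    return [cv / total for cv in cvs]
```

A constant indicator has a coefficient of variation of zero and therefore receives no weight, while indicators with the same relative spread receive equal weights.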

S212: Sort the object data according to the second weight information to generate a corresponding object data sequence, and send the object data sequence to the corresponding terminal, so that the terminal displays the object data in the order given by the object data sequence.

Here, the object data sequence is the sequence information generated after the server sorts the object data according to the second weight information. The server sends the object data sequence to the corresponding terminal, and the terminal displays the corresponding object data in the order given by the object data sequence received from the server. In this way, an accurate object data sequence can be obtained through feature processing, and the corresponding object data is displayed according to it, which improves the accuracy of object data display.
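A minimal sketch of this ranking step follows. It assumes that the second weight information has already been reduced to one score per object and that higher scores are displayed first; neither detail is stated in the text, and the name `build_object_sequence` is hypothetical.

```python
def build_object_sequence(objects, weights):
    """Return the objects ordered by descending weight.

    Assumes one weight per object. Ties keep their original relative
    order because Python's sort is stable.
    """
    order = sorted(range(len(objects)), key=lambda i: weights[i], reverse=True)
    return [objects[i] for i in order]


# The resulting list is what the server would send to the terminal.
sequence = build_object_sequence(["A", "B", "C"], [0.2, 0.7, 0.1])
```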

In one embodiment, the algorithms the server may use to sort the object data include, but are not limited to, bubble sort, selection sort, insertion sort, Shell sort, merge sort, quicksort, counting sort, heap sort, bucket sort, and radix sort. Bubble sort repeatedly walks through the object data to be sorted, comparing two items at a time and swapping them if they are in the wrong order. The walk is repeated until no more swaps are needed, at which point the sequence is sorted. Selection sort first finds the smallest (or largest) item in the unsorted sequence and stores it at the start of the sorted sequence, then keeps finding the smallest (or largest) item among the remaining unsorted items and appends it to the end of the sorted sequence, and so on until all items are sorted. Insertion sort builds an ordered sequence: for each unsorted item, it scans the sorted sequence from back to front, finds the corresponding position, and inserts the item.
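The three elementary algorithms described above can be sketched as follows. These are textbook versions operating on plain comparable values in ascending order; how the object data is keyed in practice is not specified, so the function names and inputs are illustrative.

```python
def bubble_sort(items):
    """Repeatedly swap adjacent out-of-order items until no swap occurs."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:  # already sorted: stop early
            break
    return data


def selection_sort(items):
    """Move the smallest remaining item to the end of the sorted prefix."""
    data = list(items)
    for i in range(len(data) - 1):
        smallest = min(range(i, len(data)), key=data.__getitem__)
        data[i], data[smallest] = data[smallest], data[i]
    return data


def insertion_sort(items):
    """Insert each item into the sorted prefix, scanning from back to front."""
    data = list(items)
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        while j >= 0 and data[j] > current:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = current
    return data
```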

In one embodiment, Shell sort first splits the whole sequence of records to be sorted into several subsequences and performs straight insertion sort on each of them. Specifically: select an increment sequence t1, t2, ..., tk, where ti > tj for i < j and tk = 1; perform k sorting passes over the object data sequence, one for each increment; in each pass, according to the corresponding increment ti, split the object data sequence to be sorted into several subsequences of length m and perform straight insertion sort on each sub-list. Only when the increment is 1 is the entire object data sequence handled as a single list, whose length is the length of the entire object data sequence.
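A compact sketch of Shell sort is given below. The increment sequence n/2, n/4, ..., 1 is an assumption chosen for simplicity; the text only requires a decreasing sequence that ends in 1.

```python
def shell_sort(items):
    """Gapped insertion sort with a halving increment sequence."""
    data = list(items)
    gap = len(data) // 2
    while gap > 0:
        # Each pass insertion-sorts the interleaved subsequences
        # data[0::gap], data[1::gap], ... in place.
        for i in range(gap, len(data)):
            current = data[i]
            j = i
            while j >= gap and data[j - gap] > current:
                data[j] = data[j - gap]
                j -= gap
            data[j] = current
        gap //= 2  # the final pass uses gap == 1 over the whole sequence
    return data
```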

In one embodiment, merge sort merges ordered subsequences to obtain a fully ordered sequence; that is, each subsequence is first made ordered, and then the subsequence segments are made ordered relative to one another. Quicksort uses one pass to partition the records to be sorted into two independent parts, such that the keys of the records in one part are all smaller than the keys in the other part, and then continues to sort the two parts separately until the whole sequence is ordered.
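Both divide-and-conquer algorithms can be sketched briefly. The versions below return new lists rather than sorting in place, and the quicksort uses a middle-element pivot with three-way partitioning; these are illustrative choices, not details taken from the text.

```python
def merge_sort(items):
    """Sort each half recursively, then merge the two ordered halves."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # "<=" keeps equal items in original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def quick_sort(items):
    """Partition around a pivot, then sort the two sides independently."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)
```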

In one embodiment, heap sort is a sorting algorithm designed around the heap data structure. A heap is a structure that approximates a complete binary tree and also satisfies the heap property: the key or index of a child node is always less than (or greater than) that of its parent node. Counting sort converts the input data values into keys stored in an additionally allocated array; as a sort with linear time complexity, counting sort requires the input data to be integers within a definite range. Bucket sort is an upgraded version of counting sort. It relies on a mapping function, and its efficiency depends on how that mapping function is chosen. Bucket sort works as follows: assuming the input data is uniformly distributed, the data is distributed into a limited number of buckets, and each bucket is then sorted separately (possibly with another sorting algorithm, or by applying bucket sort recursively).
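The three algorithms in this paragraph can be sketched as follows. The heap sort relies on Python's standard `heapq` min-heap instead of a hand-written heap, the counting sort assumes integer input, and the bucket sort uses a linear mapping into a fixed number of buckets (`bucket_count` is a hypothetical parameter); all of these are illustrative choices.

```python
import heapq


def heap_sort(items):
    """Build a min-heap, then pop the smallest item until it is empty."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def counting_sort(values):
    """Count occurrences of each integer in the range [min, max]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    counts = [0] * (hi - lo + 1)  # the additionally allocated array
    for v in values:
        counts[v - lo] += 1
    result = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)
    return result


def bucket_sort(values, bucket_count=4):
    """Map values linearly into buckets, sort each bucket, concatenate."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)
    width = (hi - lo) / bucket_count
    buckets = [[] for _ in range(bucket_count)]
    for v in values:
        index = min(int((v - lo) / width), bucket_count - 1)
        buckets[index].append(v)
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))  # any per-bucket sort would do here
    return result
```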

In one embodiment, radix sort first sorts by the lowest-order digit and collects the results; it then sorts by the next higher digit and collects the results again; and so on, up to the highest-order digit. Sometimes the attributes have a priority order, in which case the data are first sorted by the low-priority attribute and then by the high-priority attribute. In the final order, records with a higher value of the high-priority attribute come first, and among records with the same value of the high-priority attribute, those with a higher value of the low-priority attribute come first.
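A minimal sketch of both ideas follows: a least-significant-digit radix sort for non-negative integers, and a two-pass stable sort over attributes with different priorities. The names `radix_sort` and `sort_by_priority` and the record layout `(name, high_priority, low_priority)` are hypothetical and serve only as illustration.

```python
def radix_sort(values, base=10):
    """Sort non-negative integers digit by digit, lowest digit first."""
    data = list(values)
    if any(v < 0 for v in data):
        raise ValueError("this sketch handles non-negative integers only")
    if not data:
        return data
    largest = max(data)
    place = 1
    while largest // place > 0:
        buckets = [[] for _ in range(base)]
        for v in data:
            buckets[(v // place) % base].append(v)  # sort by current digit
        data = [v for bucket in buckets for v in bucket]  # collect
        place *= base
    return data


def sort_by_priority(records):
    """Sort (name, high_priority, low_priority) records, highest values first.

    The low-priority attribute is sorted first; the second, stable pass on the
    high-priority attribute keeps that order among equal high-priority values.
    """
    by_low = sorted(records, key=lambda r: r[2], reverse=True)
    return sorted(by_low, key=lambda r: r[1], reverse=True)
```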

In this embodiment, the server obtains behavior data and object data, performs featurization on the behavior data and the object data to obtain a behavior data feature and an object data feature, and performs kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature. By performing kernel principal component analysis on the behavior data feature and the object data feature, the server can avoid the curse of dimensionality (a phenomenon in which the amount of computation grows exponentially as the number of dimensions increases), which reduces the computational load on the server and enhances the generalization ability of the feature cross model. The server then inputs the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair, and inputs each target dimensionality reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimensionality reduction feature pair. Existing technical solutions usually obtain weight information for a single dimensionality reduction feature, which tends to make the finally generated object data sequence inaccurate. In this technical solution, the target dimensionality reduction feature pairs are obtained first and the first weight information corresponding to those pairs is then obtained, so that the relationships between items of the user's personal information are fully taken into account, which helps generate an accurate object data sequence. The server inputs the first weight information and the second dimensionality reduction feature into the trained weighting model, outputs second weight information corresponding to the second dimensionality reduction feature, generates an object data sequence corresponding to the object data according to the second weight information, and sends the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence. Through the cooperation of multiple processing steps, namely data featurization, feature dimensionality reduction, generation of dimensionality reduction feature pairs, calculation of the weight information of the dimensionality reduction feature pairs, and generation of the object data sequence, an accurate object data sequence can be obtained by processing the features, and the corresponding object data is displayed according to the object data sequence, which improves the accuracy of object data display.

In one embodiment, step 204 further includes: obtaining a first feature average value corresponding to the behavior data feature and a second feature average value corresponding to the object data feature; removing the average value from the behavior data feature and the object data feature according to the first feature average value and the second feature average value to obtain a target feature, the target feature including the behavior data feature and the object data feature after average removal; obtaining eigenvalues and eigenvectors corresponding to the target feature, and sorting the eigenvectors according to the eigenvalues to obtain a sorting result; establishing a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and mapping the behavior data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature.

Here, the first feature average value is the feature average value, determined by the server, that corresponds to the behavior data feature, and the second feature average value is the feature average value, determined by the server, that corresponds to the object data feature. The target feature is the behavior data feature and the object data feature obtained after the average value has been removed. An eigenvalue is the value corresponding to each behavior data feature and object data feature after average removal, and an eigenvector is the vector information corresponding to each behavior data feature and object data feature after average removal. The sorting result is the result obtained after the server sorts the eigenvectors according to the magnitudes of the eigenvalues. The cross-data-domain subspace is the space used for feature mapping; mapping the features into this space achieves the dimensionality reduction of the features.

Specifically, when the data need to be reduced to k dimensions, the following steps are performed: 1) Remove the average value (i.e., centering): subtract from each feature its own average value. 2) Compute the covariance matrix. Note: whether or not the result is divided by the sample count n or n-1 does not in fact affect the eigenvectors obtained. 3) Use eigenvalue decomposition to obtain the eigenvalues and eigenvectors of the covariance matrix. 4) Sort the eigenvalues from largest to smallest and select the largest k; then use the corresponding k eigenvectors as row vectors to form the eigenvector matrix P. 5) Transform the data into the new space spanned by the k eigenvectors, i.e., Y = PX, to obtain the dimension-reduced data, namely the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature.
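The five steps above can be sketched with NumPy as follows. Although the embodiment refers to kernel principal component analysis, the steps as listed describe ordinary (linear) principal component analysis, and the sketch follows the listed steps without a kernel. The function name `pca_reduce` and the input layout (one sample per row, at least two feature columns) are assumptions made for illustration.

```python
import numpy as np


def pca_reduce(features, k):
    """Reduce an (n_samples, n_features) array to k dimensions."""
    x = np.asarray(features, dtype=float)
    # 1) Remove the average value of each feature (centering).
    centered = x - x.mean(axis=0)
    # 2) Covariance matrix of the features (np.cov divides by n-1).
    covariance = np.cov(centered, rowvar=False)
    # 3) Eigenvalue decomposition; eigh suits symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # 4) Take the k largest eigenvalues; their eigenvectors become rows of P.
    order = np.argsort(eigenvalues)[::-1][:k]
    p = eigenvectors[:, order].T
    # 5) Y = PX, with the centered samples arranged as the columns of X.
    y = p @ centered.T
    return y.T  # back to one sample per row
```

For example, `pca_reduce([[1, 2], [2, 4], [3, 6], [4, 8]], k=1)` returns an array of shape `(4, 1)`; because the four samples lie on a single line, the one retained component carries all of the variance.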

In this embodiment, the average value is removed from the behavior data feature and the object data feature to obtain the target feature, the eigenvalues and eigenvectors corresponding to the target feature are obtained, and the eigenvectors are sorted according to the eigenvalues to obtain a sorting result. A cross-data-domain subspace is then established from the eigenvectors whose sorting result is greater than the preset threshold, and the behavior data feature and the object data feature are mapped into the cross-data-domain subspace to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature. This avoids the curse of dimensionality (a phenomenon in which the amount of computation grows exponentially as the number of dimensions increases), reduces the computational load on the server, and enhances the generalization ability of the feature cross model, so that the server can quickly and conveniently obtain an accurate object data sequence by processing the features.

In one embodiment, as shown in Fig. 3, step 206 further includes:

S206A: obtain a first sub dimensionality reduction feature among the first dimensionality reduction features.

S206B: associate the first sub dimensionality reduction feature with each second sub dimensionality reduction feature among the first dimensionality reduction features, to obtain each target dimensionality reduction feature pair.

Here, there is at least one first dimensionality reduction feature; when a given first dimensionality reduction feature is taken as the first sub dimensionality reduction feature, each of the remaining first dimensionality reduction features is a second sub dimensionality reduction feature. The server associates the first sub dimensionality reduction feature with each second sub dimensionality reduction feature among the first dimensionality reduction features, that is, combines them two by two, to obtain each target dimensionality reduction feature pair. For example, if the first dimensionality reduction features include "likes reading", "likes reading books", and "likes listening to music", the target dimensionality reduction feature pairs include "likes reading" and "likes reading books", "likes reading" and "likes listening to music", and "likes reading books" and "likes listening to music". By obtaining the first sub dimensionality reduction feature among the first dimensionality reduction features and associating it with each second sub dimensionality reduction feature among the first dimensionality reduction features, the server obtains the target dimensionality reduction feature pairs, which increases the attention paid to user interests and thereby further improves the accuracy of feature processing.
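The pairwise combination in this example can be sketched with the standard library. The function name `build_feature_pairs` is hypothetical, and the sketch only enumerates the pairs; it does not represent the trained feature cross model itself.

```python
from itertools import combinations


def build_feature_pairs(first_features):
    """Return every unordered pair of first dimensionality reduction features."""
    return list(combinations(first_features, 2))


pairs = build_feature_pairs(
    ["likes reading", "likes reading books", "likes listening to music"]
)
```

For n first dimensionality reduction features this yields n(n-1)/2 pairs, so three features give the three pairs listed in the example above.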

In this embodiment, the server obtains the first sub dimensionality reduction feature among the first dimensionality reduction features and associates it with each second sub dimensionality reduction feature among the first dimensionality reduction features to obtain each target dimensionality reduction feature pair. This increases the attention paid to user interests and makes feature processing more accurate; the corresponding object data sequence is then generated through this accurate feature processing, and the corresponding object data is displayed according to the object data sequence, which improves the accuracy of object data display.

In one embodiment, as shown in Fig. 4, step 212 further includes:

S212A: generate an object data packet according to the object data, and generate an object data sequence packet according to the object data sequence.

S212B: send the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence.

Here, the object data packet is a data packet that includes the data information corresponding to the object data. The object data sequence packet is a sequence packet that includes the sorting position corresponding to each item of object data. Specifically, the server generates the object data packet according to the object data and generates the object data sequence packet according to the object data sequence; the server sends the object data packet and the object data sequence packet to the corresponding terminal, and the terminal displays the object data in order according to the object data packet and the object data sequence.
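One possible shape for the two packets is sketched below using JSON. The packet layout, the field names `objects` and `positions`, and the functions `build_packets` and `display_order` are all assumptions made for illustration; the text does not specify a packet format.

```python
import json


def build_packets(object_data, second_weights):
    """Server side: build the object data packet and the sequence packet.

    object_data maps an object id to its data; second_weights maps the same
    ids to their second weight information (higher weight is shown first).
    """
    ranked_ids = sorted(object_data, key=lambda oid: second_weights[oid], reverse=True)
    data_packet = json.dumps({"objects": object_data})
    sequence_packet = json.dumps(
        {"positions": {oid: pos for pos, oid in enumerate(ranked_ids, start=1)}}
    )
    return data_packet, sequence_packet


def display_order(data_packet, sequence_packet):
    """Terminal side: return the object data in its sorting position order."""
    objects = json.loads(data_packet)["objects"]
    positions = json.loads(sequence_packet)["positions"]
    return [objects[oid] for oid in sorted(positions, key=positions.get)]
```

Keeping the sorting positions in a separate packet lets the terminal reorder the same object data when only the sequence changes.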

In this embodiment, the server generates the object data packet according to the object data, generates the object data sequence packet according to the object data sequence, and sends the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence. The corresponding object data can thus be displayed on the terminal, which improves the accuracy of object data display.

In one embodiment, the method further includes: obtaining model features and dividing the model features into training features, validation features, and test features; inputting the training features into a base feature cross model for training to obtain a preliminary feature cross model; inputting the validation features into the preliminary feature cross model for validation to obtain a validation result; adjusting the parameters of the preliminary feature cross model according to the validation result to obtain a target feature cross model; inputting the test features into the target feature cross model for testing to obtain a test result; and, once the test result meets a preset test result, using the target feature cross model as the feature cross model.

Here, the model features are the feature information used to train the feature cross model. The training features are the feature information used when training the model, the validation features are the feature information used when validating the model, and the test features are the feature information used when testing the model. The base feature cross model is the untrained model, and the preliminary feature cross model is the model obtained after initial training. The validation result is the model validation result obtained after the server inputs the validation features into the preliminary feature cross model for validation. The test result is the model test result obtained after the server inputs the test features into the target feature cross model for testing.

In this embodiment, the server obtains the model features, divides them into training features, validation features, and test features, inputs the training features into the base feature cross model for training to obtain the preliminary feature cross model, and inputs the validation features into the preliminary feature cross model for validation to obtain the validation result. The server then adjusts the parameters of the preliminary feature cross model according to the validation result to obtain the target feature cross model, and inputs the test features into the target feature cross model for testing to obtain the test result; once the test result meets the preset test result, the target feature cross model is used as the feature cross model. Training, validating, and testing the model continuously improves the performance of the feature cross model, so that the server obtains target dimensionality reduction feature pairs more accurately when using the feature cross model.
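The division of the model features into three subsets can be sketched as follows. The function name `split_model_features`, the 60/20/20 split ratios, and the fixed random seed are assumptions; the text does not state how the features are divided.

```python
import random


def split_model_features(model_features, train_ratio=0.6, validation_ratio=0.2, seed=0):
    """Shuffle the model features and split them into three disjoint subsets."""
    shuffled = list(model_features)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    train_end = int(len(shuffled) * train_ratio)
    validation_end = train_end + int(len(shuffled) * validation_ratio)
    training = shuffled[:train_end]
    validation = shuffled[train_end:validation_end]
    test = shuffled[validation_end:]  # whatever remains is used for testing
    return training, validation, test
```

Keeping the test features separate from the validation features means the final test result is measured on data that played no part in adjusting the model parameters.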

Fig. 5 is a schematic diagram of the data processing apparatus in one embodiment. The apparatus includes:

A data acquisition module 302, configured to obtain behavior data and object data, and to perform featurization on the behavior data and the object data to obtain a behavior data feature and an object data feature;

A feature dimensionality reduction module 304, configured to perform kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature;

A feature cross module 306, configured to input the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair;

A weight calculation module 308, configured to input each target dimensionality reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimensionality reduction feature pair;

A feature weighting module 310, configured to input the first weight information and the second dimensionality reduction feature into the trained weighting model and to output second weight information corresponding to the second dimensionality reduction feature;

A data display module 312, configured to sort the object data according to the second weight information and generate a corresponding object data sequence, and to send the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence.

In one embodiment, the feature dimensionality reduction module includes: an average value acquisition module, configured to obtain a first feature average value corresponding to the behavior data feature and a second feature average value corresponding to the object data feature; a feature processing module, configured to remove the average value from the behavior data feature and the object data feature according to the first feature average value and the second feature average value to obtain a target feature, the target feature including the behavior data feature and the object data feature after average removal; a vector sorting module, configured to obtain eigenvalues and eigenvectors corresponding to the target feature and to sort the eigenvectors according to the eigenvalues to obtain a sorting result; and a feature mapping module, configured to establish a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and to map the behavior data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature.

In one embodiment, the feature cross module includes: a sub dimensionality reduction feature acquisition module, configured to obtain a first sub dimensionality reduction feature among the first dimensionality reduction features; and a feature pair acquisition module, configured to associate the first sub dimensionality reduction feature with each second sub dimensionality reduction feature among the first dimensionality reduction features to obtain each target dimensionality reduction feature pair.

In one embodiment, the data display module is configured to generate an object data packet according to the object data and an object data sequence packet according to the object data sequence, and to send the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence packet.

In one embodiment, the feature cross module is further configured to: obtain model features and divide the model features into training features, validation features, and test features; input the training features into a base feature cross model for training to obtain a preliminary feature cross model; input the validation features into the preliminary feature cross model for validation to obtain a validation result; adjust the parameters of the preliminary feature cross model according to the validation result to obtain a target feature cross model; input the test features into the target feature cross model for testing to obtain a test result; and, once the test result meets a preset test result, use the target feature cross model as the feature cross model.

For specific limitations on the data processing apparatus, reference may be made to the limitations on the data processing method above, which are not repeated here. Each module of the above data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in hardware form in, or be independent of, the processor of the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to each module. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, or the like. The above data processing apparatus may be implemented in the form of a computer program.

In one embodiment, a computer device is provided, which may be a server or a terminal. When the computer device is a server, its internal structure may be as shown in Fig. 6. When the computer device is a terminal, its internal structure includes a display screen, an input device, a camera, a sound collection device, a loudspeaker, and the like. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a data processing method. Those skilled in the art will understand that the structure shown in Fig. 6 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution of this application is applied; a specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.

When the processor executes the program, the following steps are performed: obtaining behavior data and object data, and performing featurization on the behavior data and the object data to obtain a behavior data feature and an object data feature; performing kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature; inputting the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair; inputting each target dimensionality reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimensionality reduction feature pair; inputting the first weight information and the second dimensionality reduction feature into the trained weighting model and outputting second weight information corresponding to the second dimensionality reduction feature; and sorting the object data according to the second weight information and generating a corresponding object data sequence, and sending the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence.

For the limitations on the computer device, reference may be made to the specific limitations on the data processing method above, which are not repeated here.

With continued reference to Fig. 6, a computer-readable storage medium is also provided, on which a computer program is stored, such as the non-volatile storage medium shown in Fig. 6. When the program is executed by a processor, the following steps are performed: obtaining behavior data and object data, and performing featurization on the behavior data and the object data to obtain a behavior data feature and an object data feature; performing kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature; inputting the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair; inputting each target dimensionality reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimensionality reduction feature pair; inputting the first weight information and the second dimensionality reduction feature into the trained weighting model and outputting second weight information corresponding to the second dimensionality reduction feature; and sorting the object data according to the second weight information and generating a corresponding object data sequence, and sending the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence.

For the limitations on the computer-readable storage medium, reference may be made to the specific limitations on the data processing method above, which are not repeated here.

Those of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.

The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims

1. A data processing method, the method comprising:

obtaining behavior data and object data, and performing featurization on the behavior data and the object data to obtain a behavior data feature and an object data feature;

performing kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature;

inputting the first dimensionality reduction feature into a trained feature cross model to obtain at least one target dimensionality reduction feature pair;

inputting the first weight information and the second dimensionality reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimensionality reduction feature;

sorting the object data according to the second weight information and generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal, so that the terminal displays the object data in order according to the object data sequence.

2. The method according to claim 1, wherein performing kernel principal component analysis on the behavior data feature and the object data feature to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature comprises:

obtaining a first feature average value corresponding to the behavior data feature and a second feature average value corresponding to the object data feature;

removing the average value from the behavior data feature and the object data feature according to the first feature average value and the second feature average value to obtain a target feature, the target feature including the behavior data feature and the object data feature after average removal;

obtaining eigenvalues and eigenvectors corresponding to the target feature, and sorting the eigenvectors according to the eigenvalues to obtain a sorting result;

establishing a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and mapping the behavior data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature.

3. The method according to claim 1, wherein inputting the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair comprises:

associating the first sub dimensionality reduction feature with each second sub dimensionality reduction feature among the first dimensionality reduction features to obtain each target dimensionality reduction feature pair.

4. The method according to claim 1, wherein sending the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence, further comprises:

generating an object data packet according to the object data, and generating an object data sequence packet according to the object data sequence;

sending the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence packet.

5. The method according to claim 1, wherein inputting the first dimensionality reduction feature into the trained feature cross model to obtain at least one target dimensionality reduction feature pair comprises:

adjusting the parameters of the preliminary feature cross model according to the validation result to obtain a target feature cross model;

once the test result meets a preset test result, using the target feature cross model as the feature cross model.

6. A data processing apparatus, characterized in that the apparatus comprises:

a data acquisition module, configured to obtain behavior data and object data, and to featurize the behavior data and the object data to obtain a behavior data feature and an object data feature;

a feature dimensionality reduction module, configured to perform kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimensionality reduction feature corresponding to the behavior data feature and a second dimensionality reduction feature corresponding to the object data feature;

a feature crossover module, configured to input the first dimensionality reduction feature into a trained feature crossover model to obtain at least one target dimensionality reduction feature pair;

a weight calculation module, configured to input each target dimensionality reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimensionality reduction feature pair;

a feature weighting module, configured to input the first weight information and the second dimensionality reduction feature into a trained weighting model and to output second weight information corresponding to the second dimensionality reduction feature;

a data display module, configured to sort the object data according to the second weight information and generate a corresponding object data sequence, and to send the object data sequence to a corresponding terminal, so that the terminal displays the object data in order according to the object data sequence.
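The sorting step performed by the data display module can be illustrated with a minimal Python sketch. The function name `build_object_sequence`, the use of string identifiers for object data, and the choice of descending order (larger second weight shown first) are assumptions made for illustration; the claim only states that the object data is sorted according to the second weight information.

```python
from typing import List, Sequence


def build_object_sequence(object_ids: Sequence[str],
                          second_weights: Sequence[float]) -> List[str]:
    """Order object identifiers by their second weight information.

    Assumption: a larger weight means the object is displayed earlier.
    Objects with equal weights keep their input order, because Python's
    sort is stable even when reverse=True is used.
    """
    if len(object_ids) != len(second_weights):
        raise ValueError("each object needs exactly one weight")
    # Sort the indices by weight, then read the identifiers in that order.
    order = sorted(range(len(object_ids)),
                   key=lambda i: second_weights[i],
                   reverse=True)
    return [object_ids[i] for i in order]
```

The returned list plays the role of the object data sequence that would be sent to the terminal; how it is serialized and transmitted is outside the scope of this sketch.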

7. The apparatus according to claim 6, characterized in that the module comprises:

an average value obtaining module, configured to obtain a first feature average value corresponding to the behavior data feature and a second feature average value corresponding to the object data feature;

a feature processing module, configured to perform average-value processing on the behavior data feature and the object data feature according to the first feature average value and the second feature average value to obtain a target feature, the target feature comprising the behavior data feature and the object data feature after average-value processing;

a vector sorting module, configured to obtain eigenvalues and eigenvectors corresponding to the target feature, and to sort the eigenvectors according to the eigenvalues to obtain a sorting result;

a feature mapping module, configured to establish a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and to map the behavior data feature and the object data feature into the cross-data-domain subspace to obtain the first dimensionality reduction feature corresponding to the behavior data feature and the second dimensionality reduction feature corresponding to the object data feature.
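The sub-modules of claim 7 can be read as a sequence of numerical steps, sketched below with NumPy. This is a simplified illustration under several assumptions, not the claimed implementation: the average-value processing is taken to be mean-centering, a plain linear covariance matrix is used in place of the kernel matrix of kernel principal component analysis, both feature matrices are assumed to have the same number of columns, and "sorting result greater than a preset threshold" is interpreted as keeping eigenvectors whose eigenvalue exceeds `eig_threshold`. The function and parameter names are hypothetical.

```python
import numpy as np


def shared_subspace_projection(behavior, objects, eig_threshold=1e-3):
    """Project two feature matrices into one shared subspace.

    Rows are samples, columns are feature dimensions (assumed equal
    for both inputs). Linear PCA stands in for kernel PCA here.
    """
    behavior = np.asarray(behavior, dtype=float)
    objects = np.asarray(objects, dtype=float)

    # Average value obtaining module: one mean vector per data domain.
    mean_b = behavior.mean(axis=0)
    mean_o = objects.mean(axis=0)

    # Feature processing module (assumed to be mean-centering).
    centred_b = behavior - mean_b
    centred_o = objects - mean_o
    target = np.vstack([centred_b, centred_o])

    # Vector sorting module: eigen-decomposition of the covariance
    # matrix, with eigenvectors ordered by descending eigenvalue.
    cov = target.T @ target / max(len(target) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Feature mapping module: keep eigenvectors above the threshold
    # and map both domains into the subspace they span.
    basis = eigvecs[:, eigvals > eig_threshold]
    return centred_b @ basis, centred_o @ basis
```

The two returned arrays correspond to the first and second dimensionality reduction features. A kernel-based variant would replace the covariance matrix with a centered kernel matrix over the stacked samples.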

8. The apparatus according to claim 6, characterized in that the apparatus comprises:

a feature pair obtaining module, configured to associate the first sub dimensionality reduction feature in the first dimensionality reduction feature with the second sub dimensionality reduction feature, respectively, to obtain each target dimensionality reduction feature pair.
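One possible reading of the pairing in claim 8 is that every sub-feature of the first dimensionality reduction feature is associated with every other sub-feature, giving all unordered pairs. The Python sketch below illustrates that reading only; the function name `build_feature_pairs` and the example sub-feature names are hypothetical, and the trained feature crossover model of the claims may select pairs differently.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def build_feature_pairs(sub_features: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair of sub-features.

    Assumption: pairs are unordered and a sub-feature is never paired
    with itself, so n sub-features yield n * (n - 1) / 2 pairs.
    """
    return list(combinations(sub_features, 2))


# Hypothetical sub-feature names, for illustration only.
pairs = build_feature_pairs(["age", "clicks", "dwell_time"])
```

Each resulting pair would then be passed to the weight calculation model to obtain its first weight information.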

9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 5.

10. A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 5.