CN110472659B - Data processing method, device, computer readable storage medium and computer equipment - Google Patents

Data processing method, device, computer readable storage medium and computer equipment

Info

Publication number
CN110472659B
CN110472659B (application CN201910604468.4A)
Authority
CN
China
Prior art keywords
feature
features
object data
data
dimension reduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910604468.4A
Other languages
Chinese (zh)
Other versions
CN110472659A (en)
Inventor
黄严汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN201910604468.4A priority Critical patent/CN110472659B/en
Publication of CN110472659A publication Critical patent/CN110472659A/en
Application granted granted Critical
Publication of CN110472659B publication Critical patent/CN110472659B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]

Abstract

The present application relates to big data, and provides a data processing method, apparatus, computer readable storage medium and computer device. The method comprises: inputting the first dimension reduction feature into a trained feature cross model to obtain at least one target dimension reduction feature pair; inputting each target dimension reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair; inputting the first weight information and the second dimension reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimension reduction feature; and sorting the object data according to the second weight information, generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal, so that the terminal displays the object data in order according to the object data sequence. An accurate object data sequence can thus be obtained by processing the features, and the corresponding object data can be displayed according to that sequence, improving the accuracy of object data display.

Description

Data processing method, device, computer readable storage medium and computer equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, apparatus, computer readable storage medium, and computer device.
Background
With the rapid development of computer technology, machine learning has come to be widely applied across many fields. As an important branch of artificial intelligence, it draws on subjects such as statistics, matrix analysis and optimization. Its essence is to derive general rules from data through automatic analysis and to use the learned rules to predict unknown data; its emergence has greatly facilitated people's daily lives.
In conventional methods, when machine learning is used for feature screening and feature weight calculation, insufficient attention is paid to the user's personal information. This easily makes the feature processing inaccurate, so an accurate object data sequence cannot be obtained by processing the features.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data processing method, apparatus, computer device, and storage medium that are capable of obtaining an accurate object data sequence by processing features, and displaying corresponding object data according to the object data sequence, thereby improving accuracy of object data display.
A method of data processing, the method comprising:
acquiring behavior data and object data, and performing characterization processing on the behavior data and the object data to obtain behavior data characteristics and object data characteristics;
performing kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features;
inputting the first dimension reduction feature into a trained feature cross model to obtain at least one target dimension reduction feature pair;
inputting each target dimension reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair;
inputting the first weight information and the second dimension reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimension reduction feature;
and sorting the object data according to the second weight information, generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal so that the terminal displays the object data in order according to the object data sequence.
In one embodiment, the method further comprises:
acquiring a first characteristic average value corresponding to the behavior data characteristic and a second characteristic average value corresponding to the object data characteristic;
the behavior data features and the object data features are subjected to de-averaging according to the first feature average value and the second feature average value to obtain target features, wherein the target features comprise the behavior data features and the object data features after de-averaging;
acquiring a feature value and a feature vector corresponding to the target feature, and sequencing the feature vector according to the feature value to obtain a sequencing result;
and establishing a cross-data-domain subspace of feature vectors with the sorting result larger than a preset threshold, mapping the behavior data features and the object data features into the cross-data-domain subspace, and obtaining a first dimension reduction feature corresponding to the behavior data features and a second dimension reduction feature corresponding to the object data features.
In one embodiment, the method further comprises:
acquiring a first sub-dimension reduction feature in the first dimension reduction features;
and respectively associating the first sub-dimension-reduction features with second sub-dimension-reduction features in the first dimension-reduction features to obtain the target dimension-reduction feature pairs.
In one embodiment, the method further comprises:
generating an object data packet according to the object data, and generating an object data sequence packet according to the object data sequence;
and sending the object data packet and the object data sequence packet to corresponding terminals so that the terminals display the object data in sequence according to the object data packet and the object data sequence packet.
In one embodiment, the method further comprises:
obtaining model features, and dividing the model features into training features, verification features and test features;
inputting the training features into a basic feature cross model for training to obtain a preliminary feature cross model;
inputting the verification features into the preliminary feature cross model for verification to obtain a verification result;
adjusting parameters in the preliminary feature cross model according to the verification result to obtain a target feature cross model;
inputting the test features into the target feature cross model for testing to obtain a test result;
and when the test result meets a preset test result, taking the target feature cross model as the trained feature cross model.
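The train/verify/test division in this embodiment can be sketched as follows; the 70/15/15 split ratio and the function name `split_features` are illustrative assumptions, not values taken from the patent:

```python
def split_features(model_features, train_ratio=0.7, val_ratio=0.15):
    """Divide model features into training, verification and test features."""
    n = len(model_features)
    train_end = round(n * train_ratio)
    val_end = train_end + round(n * val_ratio)
    return (model_features[:train_end],          # training features
            model_features[train_end:val_end],   # verification features
            model_features[val_end:])            # test features

model_features = list(range(100))
training, verification, test = split_features(model_features)
print(len(training), len(verification), len(test))  # 70 15 15
```

In practice the features would be shuffled before splitting so that each subset is representative.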
A data processing apparatus, the apparatus comprising:
the data acquisition module is used for acquiring behavior data and object data, and carrying out characteristic processing on the behavior data and the object data to obtain behavior data characteristics and object data characteristics;
the feature dimension reduction module is used for carrying out kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features;
the feature intersection module is used for inputting the first dimension reduction feature into the trained feature intersection model to obtain at least one target dimension reduction feature pair;
the weight calculation module is used for inputting each target dimension reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair;
the feature weighting module is used for inputting the first weight information and the second dimension reduction feature into a trained weighting model and outputting second weight information corresponding to the second dimension reduction feature;
and the data display module is used for sorting the object data according to the second weight information, generating a corresponding object data sequence and sending the object data sequence to a corresponding terminal so that the terminal displays the object data in order according to the object data sequence.
In one embodiment, the apparatus further comprises:
the average value obtaining module is used for obtaining a first characteristic average value corresponding to the behavior data characteristic and a second characteristic average value corresponding to the object data characteristic;
the feature processing module is used for carrying out de-averaging on the behavior data features and the object data features according to the first feature average value and the second feature average value to obtain target features, wherein the target features comprise the behavior data features and the object data features after de-averaging;
the vector ordering module is used for acquiring a feature value and a feature vector corresponding to the target feature, and ordering the feature vector according to the feature value to obtain an ordering result;
and the feature mapping module is used for establishing a cross-data-domain subspace for feature vectors with the sorting result larger than a preset threshold value, mapping the behavior data features and the object data features into the cross-data-domain subspace, and obtaining a first dimension reduction feature corresponding to the behavior data features and a second dimension reduction feature corresponding to the object data features.
In one embodiment, the apparatus further comprises:
the sub-dimension-reduction feature acquisition module is used for acquiring a first sub-dimension-reduction feature in the first dimension-reduction features;
and the feature pair acquisition module is used for respectively associating the first sub-dimension reduction features with the second sub-dimension reduction features in the first dimension reduction features to obtain the target dimension reduction feature pairs.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when the program is executed.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
According to the data processing method, apparatus, computer readable storage medium and computer device, the server acquires the behavior data and the object data, performs characterization processing on them to obtain the behavior data features and the object data features, and performs kernel principal component analysis on these features to obtain the first dimension reduction features corresponding to the behavior data features and the second dimension reduction features corresponding to the object data features. By performing kernel principal component analysis on the behavior data features and the object data features, the server avoids the curse of dimensionality (the phenomenon that the amount of computation grows exponentially with the number of dimensions), which further reduces the server's computational load and strengthens the generalization ability of the feature cross model. The server inputs the first dimension reduction feature into the trained feature cross model to obtain at least one target dimension reduction feature pair, and inputs each target dimension reduction feature pair into the trained weight calculation model to obtain the first weight information corresponding to each pair; by contrast, the prior art typically obtains weight information only for single dimension reduction features.
The server then inputs the first weight information and the second dimension reduction feature into the trained weighting model, outputs the second weight information corresponding to the second dimension reduction feature, generates an object data sequence corresponding to the object data according to the second weight information, and sends the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the sequence. Through the cooperation of the processing steps (data characterization, feature dimension reduction, dimension reduction feature pair generation, weight information calculation, and object data sequence generation), the accuracy of object data display is improved.
Drawings
FIG. 1 is a diagram of an application environment for a data processing method in one embodiment;
FIG. 2 is a flow diagram of a data processing method in one embodiment;
FIG. 3 is a flow chart of a data processing method according to yet another embodiment;
FIG. 4 is a flow chart of a data processing method in yet another embodiment;
FIG. 5 is a block diagram of a data processing apparatus in one embodiment;
FIG. 6 is a block diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The data processing method provided by the embodiment of the invention can be applied to an application environment shown in fig. 1, and the data processing method is applied to a data processing system. The data processing system includes a terminal 110, a server 120. The terminal 110 and the server 120 are connected through a network, and the terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 120 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
Based on the above data processing system, the server 120 obtains the behavior data and the object data, and performs characterization processing on them to obtain the behavior data feature and the object data feature. The server 120 performs kernel principal component analysis on these features to obtain the first dimension reduction feature corresponding to the behavior data feature and the second dimension reduction feature corresponding to the object data feature. The server 120 inputs the first dimension reduction feature into the trained feature cross model to obtain at least one target dimension reduction feature pair, and inputs each target dimension reduction feature pair into the trained weight calculation model to obtain the first weight information corresponding to each pair. The server 120 then inputs the first weight information and the second dimension reduction feature into the trained weighting model and outputs the second weight information corresponding to the second dimension reduction feature. Finally, the server 120 sorts the object data according to the second weight information, generates a corresponding object data sequence, and sends it to the corresponding terminal 110, so that the terminal 110 displays the object data according to the object data sequence.
In one embodiment, as shown in FIG. 2, a data processing method is provided. The present embodiment is mainly exemplified by the application of the method to the server 120 in fig. 1. Referring to fig. 2, the data processing method specifically includes the steps of:
S202, behavior data and object data are obtained, and the behavior data and the object data are subjected to characteristic processing to obtain behavior data characteristics and object data characteristics.
The behavior data refers to data information related to terminal behaviors acquired by a server, and the object data refers to object information to be sorted for display. It is understood that the behavior data includes, but is not limited to, website logs, search engine logs, account browsing logs, and external environment data. A website log is the account-related behavior information recorded by a website when an account accesses that target website. A search engine log is the account's behavior information on a search engine, recorded by the search engine's logging system. An account browsing log is the account's behavior information recorded through specific tools and channels. The external environment data refers to information such as mobile internet traffic, mobile phone internet account growth, and tariff packages. The object data includes, but is not limited to, product data, information data, service data, and the like. The behavior data feature is the feature information obtained after the server performs characterization processing on the behavior data and converts it into a feature representation, and the object data feature is likewise the feature information obtained after the server performs characterization processing on the object data.
In particular, the server may obtain behavior data and object data from a terminal or another distributed server. Characterization processing means that the server constructs features of different dimensions from the behavior data and the object data, converting them into feature information on which kernel principal component analysis can be performed. The characterization process includes feature construction, feature extraction and feature selection. Feature construction builds new features from the raw data, among which features with physical significance need to be found. Feature extraction means automatically constructing new features, converting the raw data into a group of features with clear physical or statistical significance, for example geometric features or textures. Feature selection means selecting the most statistically significant subset from the feature set and deleting irrelevant features, thereby achieving dimension reduction. It will be appreciated that characterizing the behavior data and the object data is essentially encoding (representing) them more efficiently: representing the information with features loses little information while preserving the regularities contained in the raw data. By converting behavior data and object data into behavior data features and object data features, uncertainty factors in the raw data (white noise, abnormal data, missing data, etc.) can be reduced, and an accurate object data sequence can be obtained by processing the features.
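As a toy illustration of feature construction, raw behavior log records might be characterized into numeric features as below; the event names and counting scheme are invented for illustration and are not part of the patent:

```python
from collections import Counter

def characterize(behavior_logs):
    """Toy characterization: turn raw log records into numeric features
    (event-type counts plus a total), i.e. simple feature construction."""
    counts = Counter(record["event"] for record in behavior_logs)
    return {
        "n_search": counts.get("search", 0),
        "n_browse": counts.get("browse", 0),
        "n_total": sum(counts.values()),
    }

logs = [{"event": "search"}, {"event": "browse"}, {"event": "browse"}]
features = characterize(logs)
print(features)  # {'n_search': 1, 'n_browse': 2, 'n_total': 3}
```

A real pipeline would add feature extraction and feature selection on top of counts like these.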
S204, performing kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features.
The first dimension reduction feature is the feature information obtained by the server after feature dimension reduction of the behavior data feature, and the second dimension reduction feature is the feature information obtained after feature dimension reduction of the object data feature. Kernel principal component analysis first maps the original data through a kernel function and then transforms it into a set of linearly independent representations along each dimension, which can be used to extract the principal feature components of the data.
Specifically, the server performs kernel principal component analysis on the behavior data features and the object data features, the core of which comprises the following steps: 1) form the original data into a matrix X of n rows and m columns, one sample per column; 2) zero-mean each row of X (each row representing an attribute field) by subtracting the row's average; 3) compute the covariance matrix; 4) obtain the eigenvalues of the covariance matrix and their corresponding eigenvectors; 5) arrange the eigenvectors into rows from top to bottom by descending eigenvalue, and take the first k rows to form a matrix P; 6) Y = PX is then the data reduced to k dimensions. By performing kernel principal component analysis on the behavior data features and the object data features, the server can represent very high-dimensional original data with a few representative dimensions without losing key data information, which reduces the amount of computation on the features and improves the server's processing rate.
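The six steps above are the linear-PCA core of the procedure (a full kernel PCA would first map the samples through a kernel function). A minimal NumPy sketch, using the sample-per-column layout of step 1:

```python
import numpy as np

def pca_reduce(X, k):
    """Follow the six steps above: X holds one sample per column."""
    X = X - X.mean(axis=1, keepdims=True)   # step 2: zero-mean each row
    C = X @ X.T / X.shape[1]                # step 3: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # step 4: eigenvalues and eigenvectors
    order = np.argsort(eigvals)[::-1]       # step 5: sort by descending eigenvalue
    P = eigvecs[:, order[:k]].T             # top-k eigenvectors as rows of P
    return P @ X                            # step 6: Y = PX, data in k dimensions

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 40))   # 5 features, 40 samples
Y = pca_reduce(X, k=2)
print(Y.shape)  # (2, 40)
```

`np.linalg.eigh` returns eigenvalues in ascending order, hence the explicit descending sort in step 5.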
S206, inputting the first dimension reduction feature into the trained feature cross model to obtain at least one target dimension reduction feature pair.
The target dimension reduction feature pair is the combination information obtained by pairing the first dimension reduction features two by two; it can be understood that since there is at least one first dimension reduction feature, there is at least one target dimension reduction feature pair. Specifically, the feature cross model is a pre-trained model for feature crossing of the first dimension reduction features; by learning high-order crossed features efficiently, the server reduces the workload of manual feature engineering. In click-through-rate estimation, for example, the relation between features is usually an 'and' relation rather than an 'add' relation: the crossed feature 'male and enjoys games' characterizes a population better than the features 'male' and 'enjoys games' taken separately.
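The two-by-two pairing of first dimension reduction features can be sketched with a standard combinations routine; the feature names are illustrative, and this shows only the pairing step, not the trained feature cross model that selects which pairs matter:

```python
from itertools import combinations

def cross_features(reduced_features):
    """Pair the first dimension reduction features two by two,
    yielding candidate crossed ('and') feature pairs."""
    return list(combinations(reduced_features, 2))

pairs = cross_features(["gender", "plays_games", "age_band"])
print(pairs)
# [('gender', 'plays_games'), ('gender', 'age_band'), ('plays_games', 'age_band')]
```

For n features this produces n*(n-1)/2 candidate pairs, which is why a trained model is then used to keep only the informative crosses.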
S208, inputting each target dimension reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair.
The weight calculation model is a model trained in advance for calculating the weight information of the feature pairs. The first weight information is the weight information corresponding to a target dimension reduction feature pair; since there is at least one target dimension reduction feature pair, there is correspondingly at least one piece of first weight information.
Specifically, the server determines the first weight information corresponding to each target dimension reduction feature pair by using an objective weighting method, which includes, but is not limited to, principal component analysis, the entropy method and the like. Determining the first weight information by principal component analysis mainly comprises the following steps: (1) normalize the data first, which removes the dimensional disparity between different data. (2) Perform factor analysis (principal component method) on the normalized data, using variance-maximizing rotation. (3) Write out the principal factor scores and the variance contribution rate of each principal factor. (4) Solve for the index weights, i.e., the first weight information corresponding to each target dimension reduction feature pair. The entropy method is a mathematical method for judging the degree of dispersion of an index: the greater the dispersion, the greater the index's impact on the overall evaluation. The server judges each index's degree of dispersion by its entropy value and then determines the first weight information corresponding to each target dimension reduction feature pair accordingly.
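The entropy method just described can be sketched as below; the example matrix is invented, positive index values are assumed, and a production implementation would also normalize the raw indices first:

```python
import math

def entropy_weights(matrix):
    """Entropy method: an index (column) whose values are more dispersed
    has lower entropy and therefore receives a larger weight."""
    n = len(matrix)                          # number of evaluated objects
    m = len(matrix[0])                       # number of indices
    raw = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [v / total for v in col]         # share of each object in index j
        e = -sum(v * math.log(v) for v in p if v > 0) / math.log(n)
        raw.append(1 - e)                    # more dispersion -> lower entropy e
    s = sum(raw)
    return [w / s for w in raw]

scores = [[0.2, 5.0],                        # index 0 is identical for every object,
          [0.2, 1.0],                        # index 1 is widely dispersed
          [0.2, 9.0]]
w = entropy_weights(scores)
print(w[1] > w[0])  # True: the dispersed index gets the larger weight
```

An index that is constant across objects carries no discriminating information, so its entropy is maximal and its weight collapses toward zero.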
S210, inputting the first weight information and the second dimension reduction feature into the trained weighting model, and outputting the second weight information corresponding to the second dimension reduction feature.
The weighting model is a pre-trained model for weighting the second dimension-reduction features, and the second weight information is weight information corresponding to each second dimension-reduction feature. The weighting model specifically may output second weight information corresponding to the second dimension reduction feature by using a factor analysis weight method, an information amount weight method, an independence weight method, a standard deviation method, and the like.
Specifically, the factor analysis weighting method weights each index by its cumulative contribution rate to the common factors, calculated according to factor analysis in mathematical statistics: the larger the cumulative contribution rate, the larger the index's effect on the common factors, and the larger the determined weight. The information amount weighting method determines weights based on the discriminating information contained in each evaluation index: the larger the coefficient of variation, the larger the assigned weight; the coefficient of variation of each index is computed, taken as a weight score, and normalized to obtain the information weight coefficient. The independence weighting method uses multiple regression in mathematical statistics to compute the complex correlation coefficient for weighting: the larger the complex correlation coefficient, the larger the assigned weight. In the standard deviation method, the larger the standard deviation of an index, the greater the variation of the index's values, the more information it provides, the larger its role in the comprehensive evaluation, and hence the larger its weight; conversely, the smaller the standard deviation of an index, the smaller its variation, the less information it provides, and the smaller its weight should be.
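A minimal sketch of the information amount (coefficient-of-variation) weighting described above, with invented index values:

```python
import statistics

def cv_weights(columns):
    """Information amount weighting: weight each index by its
    coefficient of variation (stdev / mean), then normalize."""
    cvs = [statistics.pstdev(col) / statistics.fmean(col) for col in columns]
    total = sum(cvs)
    return [cv / total for cv in cvs]

index_a = [10, 10, 10, 10]   # no variation -> no discriminating information
index_b = [2, 8, 4, 6]       # varies -> carries information
w = cv_weights([index_a, index_b])
print(w)  # [0.0, 1.0]: only the varying index receives weight here
```

The same skeleton serves the standard deviation method by dropping the division by the mean.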
S212, sorting the object data according to the second weight information, generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal so that the terminal displays the object data in sequence according to the object data sequence.
The object data sequence is the sequence information generated after the server sorts the object data according to the second weight information. The server sends the object data sequence to the corresponding terminal, and the terminal displays the corresponding object data in order according to it. An accurate object data sequence can thus be obtained by processing the features, and the corresponding object data is displayed according to that sequence, improving the accuracy of object data display.
In one embodiment, the server sorts the object data using algorithms including, but not limited to, bubble sort, selection sort, insertion sort, Shell sort, merge sort, quicksort, counting sort, heap sort, bucket sort and radix sort. Bubble sort repeatedly walks through the object data to be sorted, compares two adjacent items at a time, and swaps them if their order is wrong; the pass over the array is repeated until no more swaps are needed, i.e., the array is sorted. Selection sort first finds the smallest (or largest) object data in the unsorted sequence and stores it at the beginning of the sorted sequence, then keeps finding the smallest (or largest) item among the remaining unsorted data and appends it to the end of the sorted sequence, and so on until all object data are sorted. Insertion sort builds an ordered sequence and, for each unsorted item, scans backwards through the ordered part to find the corresponding position and insert it.
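The bubble sort described first can be sketched as:

```python
def bubble_sort(data):
    """Bubble sort as described: repeatedly walk the list, compare
    adjacent items, and swap them when they are out of order."""
    items = list(data)
    n = len(items)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:        # a full pass with no swap: already sorted
            break
    return items

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

The early-exit flag stops the passes as soon as one walk completes without a swap.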
In one embodiment, Shell sort divides the whole record sequence to be sorted into several subsequences, each sorted by direct insertion, specifically: select an increment sequence t1, t2, …, tk, where t1 > t2 > … > tk = 1; according to the number k of increments, sort the object data sequence k times; in each pass, split the object data sequence to be sorted into subsequences of length m according to the corresponding increment ti, and perform direct insertion sort on each sub-table separately. When the increment factor reaches 1, the whole object data sequence is treated as one table, whose length is the length of the entire object data sequence.
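A sketch of Shell sort with halving increments ending at 1 — one common choice of increment sequence; the patent leaves the sequence t1, …, tk open:

```python
def shell_sort(a):
    """Shell sort: direct insertion sort within each gap-separated
    subsequence, for a decreasing sequence of gaps ending at 1."""
    a = list(a)
    gap = len(a) // 2
    while gap >= 1:
        for i in range(gap, len(a)):
            key, j = a[i], i - gap
            while j >= 0 and a[j] > key:
                a[j + gap] = a[j]
                j -= gap
            a[j + gap] = key
        gap //= 2
    return a

print(shell_sort([8, 3, 5, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```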
In one embodiment, merge sort merges already ordered subsequences to obtain a fully ordered sequence; that is, each subsequence is sorted first, and the subsequence segments are then merged. Quicksort partitions the records to be sorted into two independent parts in one pass, such that the keys of one part are all smaller than those of the other part; the two parts are then sorted in the same way until the whole sequence is ordered.
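Both descriptions can be sketched directly (an illustration only; list-based quicksort is chosen here for clarity over in-place partitioning):

```python
def merge_sort(a):
    """Sort each half, then merge the two ordered subsequences."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def quick_sort(a):
    """One pass splits the records into a part with keys below the pivot
    and a part with keys at or above it; each part is sorted recursively."""
    if len(a) <= 1:
        return list(a)
    pivot, rest = a[0], a[1:]
    return (quick_sort([x for x in rest if x < pivot]) + [pivot] +
            quick_sort([x for x in rest if x >= pivot]))

print(merge_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```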
In one embodiment, heapsort is a sorting algorithm designed around the heap data structure. A heap is a structure that approximates a complete binary tree while satisfying the heap property: the key of a child node is always smaller (or larger) than that of its parent node. Counting sort converts the input values into keys stored in an additionally allocated array; as a sort with linear time complexity, it requires the input to be integers within a definite range. Bucket sort is an upgraded version of counting sort. It exploits the mapping relation of a function, and the key to its efficiency is the choice of that mapping function. Bucket sort works as follows: assuming the input data is uniformly distributed, the data is divided into a limited number of buckets, and each bucket is sorted separately (possibly with another sorting algorithm, or by applying bucket sort recursively).
In one embodiment, radix sort first sorts by the lowest-order digit and collects the results; then sorts by the next higher digit and collects again; and so on, up to the most significant digit. Sometimes attributes have an order of priority: sorting proceeds by low priority first and then by high priority, so that the final order ranks items with higher priority first, and among items with the same high priority, those with higher low priority first.
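A least-significant-digit radix sort for non-negative integers, sketching the "sort by low order, then collect" loop described above (illustrative only):

```python
def radix_sort(a, base=10):
    """LSD radix sort: distribute into digit buckets starting from the
    lowest-order digit, collect, and repeat up to the most significant one."""
    a = list(a)
    if not a:
        return a
    digit = 1
    while max(a) // digit > 0:
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // digit) % base].append(x)
        a = [x for bucket in buckets for x in bucket]  # collect in bucket order
        digit *= base
    return a

print(radix_sort([170, 45, 75, 90, 2]))  # [2, 45, 75, 90, 170]
```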
In this embodiment, the server obtains the behavior data and the object data, performs characterization processing on them to obtain the behavior data features and object data features, and performs kernel principal component analysis on these features to obtain the first dimension-reduction features corresponding to the behavior data features and the second dimension-reduction features corresponding to the object data features. By performing kernel principal component analysis on the behavior data features and object data features, the server avoids the curse of dimensionality (the phenomenon that the amount of computation grows exponentially as the dimensionality increases), further reducing the server's computational load and enhancing the generalization capability of the feature cross model. The server inputs the first dimension-reduction features into the trained feature cross model to obtain at least one target dimension-reduction feature pair, and inputs each target dimension-reduction feature pair into the trained weight calculation model to obtain the first weight information corresponding to each pair, whereas the prior art typically obtains weight information only for a single dimension-reduction feature.
The server inputs the first weight information and the second dimension-reduction features into the trained weighting model, outputs the second weight information corresponding to the second dimension-reduction features, generates an object data sequence corresponding to the object data according to the second weight information, and sends the object data sequence to the corresponding terminal, so that the terminal displays the object data in order according to the object data sequence. Through the cooperation of the processing steps of data characterization, feature dimension reduction, dimension-reduction feature pair generation, weight information calculation, and object data sequence generation, the accuracy of object data display is improved.
In one embodiment, step 204 further comprises: acquiring a first feature average value corresponding to the behavior data features and a second feature average value corresponding to the object data features; de-averaging the behavior data features and object data features according to the first and second feature average values to obtain target features, where the target features comprise the de-averaged behavior data features and object data features; acquiring the eigenvalues and eigenvectors corresponding to the target features, and sorting the eigenvectors according to the eigenvalues to obtain a sorting result; and establishing a cross-data-domain subspace from the eigenvectors whose sorting result is larger than a preset threshold, then mapping the behavior data features and object data features into the cross-data-domain subspace to obtain the first dimension-reduction features corresponding to the behavior data features and the second dimension-reduction features corresponding to the object data features.
The first feature average value is the feature average of the behavior data features determined by the server, and the second feature average value is the feature average corresponding to the object data features. The target features are the behavior data features and object data features obtained after the average values are removed. The eigenvalue is the value corresponding to each de-averaged feature, and the eigenvector is the vector information corresponding to each de-averaged feature. The sorting result is obtained after the server sorts each feature by the magnitude of its eigenvalue. The cross-data-domain subspace is a space used for feature mapping; feature dimension reduction is achieved by mapping features into this space.
Specifically, when the data needs to be reduced to k dimensions, the following steps are required: 1) remove the mean (i.e. center the data), that is, subtract its mean from each feature; 2) compute the covariance matrix (note: whether one divides by the number of samples n or by n-1 has no effect on the eigenvectors obtained); 3) solve for the eigenvalues and eigenvectors of the covariance matrix by eigenvalue decomposition; 4) sort the eigenvalues from large to small, select the largest k of them, and form the k corresponding eigenvectors, as row vectors, into an eigenvector matrix P; 5) transform the data into the new space constructed by the k eigenvectors, i.e. Y = PX, to obtain the dimensionality-reduced data, namely the first dimension-reduction features corresponding to the behavior data features and the second dimension-reduction features corresponding to the object data features.
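The five steps above can be sketched as plain PCA with NumPy; this is an illustration under the assumption of a linear kernel — the patent's kernel variant would first map the data through a kernel function:

```python
import numpy as np

def pca_reduce(X, k):
    """Reduce n×d data X to n×k following steps 1)-5) above (plain PCA;
    the kernel variant in the patent is not reproduced here)."""
    Xc = X - X.mean(axis=0)                 # 1) remove the mean (centering)
    cov = np.cov(Xc, rowvar=False)          # 2) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # 3) eigenvalue decomposition
    order = np.argsort(eigvals)[::-1][:k]   # 4) indices of the k largest eigenvalues
    P = eigvecs[:, order].T                 #    eigenvectors as row vectors
    return Xc @ P.T                         # 5) Y = PX, applied row-wise

rng = np.random.default_rng(0)
Y = pca_reduce(rng.normal(size=(10, 5)), 2)
print(Y.shape)  # (10, 2)
```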
In this embodiment, the target features are obtained by de-averaging the behavior data features and object data features; the eigenvalues and eigenvectors corresponding to the target features are obtained, and the eigenvectors are sorted by eigenvalue to obtain the sorting result. A cross-data-domain subspace is then built from the eigenvectors whose sorting result is larger than the preset threshold, and the behavior data features and object data features are mapped into it to obtain the first dimension-reduction features corresponding to the behavior data features and the second dimension-reduction features corresponding to the object data features. This avoids the curse of dimensionality (the phenomenon that the amount of computation grows exponentially as the dimensionality increases), further reduces the server's computational load, and enhances the generalization capability of the feature cross model, so that the server can quickly and conveniently obtain an accurate object data sequence through feature processing.
In one embodiment, as shown in FIG. 3, step 206 further comprises:
S206A, acquiring a first sub-dimension reduction feature in the first dimension reduction features.
S206B, the first sub-dimension reduction features are respectively associated with the second sub-dimension reduction features in the first dimension reduction features, and each target dimension reduction feature pair is obtained.
When one of the first dimension-reduction features is taken as the first sub-dimension-reduction feature, the remaining first dimension-reduction features are the second sub-dimension-reduction features. The server associates the first sub-dimension-reduction feature with each second sub-dimension-reduction feature among the first dimension-reduction features, i.e. performs pairwise combination, to obtain each target dimension-reduction feature pair. For example, if the first dimension-reduction features include "likes reading" and "likes listening to music", the target dimension-reduction feature pairs include the pair of "likes reading" and "likes listening to music". By acquiring a first sub-dimension-reduction feature among the first dimension-reduction features and associating it with each second sub-dimension-reduction feature, the server obtains the target dimension-reduction feature pairs, which strengthens attention to the user's interests and further improves the accuracy of feature processing.
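The pairwise combination can be illustrated with Python's `itertools` (a sketch; the feature names are hypothetical):

```python
from itertools import combinations

def target_feature_pairs(first_dim_features):
    """Associate each feature (as the first sub-feature) with every
    remaining feature (the second sub-features), pairwise."""
    return list(combinations(first_dim_features, 2))

pairs = target_feature_pairs(["likes_reading", "likes_music", "likes_sports"])
print(pairs)
# [('likes_reading', 'likes_music'), ('likes_reading', 'likes_sports'),
#  ('likes_music', 'likes_sports')]
```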
In this embodiment, the server obtains a first sub-dimension-reduction feature among the first dimension-reduction features and associates it with each second sub-dimension-reduction feature among the first dimension-reduction features to obtain the target dimension-reduction feature pairs. This improves attention to the user's interests and makes the feature processing more accurate; through this accurate feature processing, a corresponding object data sequence can be generated and the corresponding object data displayed according to it, improving the accuracy of object data display.
In one embodiment, as shown in FIG. 4, step 212 further comprises:
S212A, generating an object data packet according to the object data, and generating an object data sequence packet according to the object data sequence.
S212B, sending the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in sequence according to the object data packet and the object data sequence packet.
The object data packet refers to a data packet including data information corresponding to object data. The object data sequence packet refers to a sequence packet including ordering positions corresponding to respective object data. Specifically, the server generates an object data packet according to the object data, generates an object data sequence packet according to the object data sequence, sends the object data packet and the object data sequence packet to corresponding terminals, and the terminals display the object data in sequence according to the object data packet and the object data sequence.
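A minimal sketch of the packet exchange, assuming JSON as the wire format and keyed object data — both assumptions, as the patent specifies neither:

```python
import json

def build_packets(object_data, object_sequence):
    """Server side: serialize the object data and its display order into
    two packets (JSON is an assumed encoding)."""
    return (json.dumps({"objects": object_data}),
            json.dumps({"order": object_sequence}))

def display_in_order(data_packet, sequence_packet):
    """Terminal side: unpack both packets and render objects in order."""
    objects = json.loads(data_packet)["objects"]
    order = json.loads(sequence_packet)["order"]
    return [objects[key] for key in order]

data_pkt, seq_pkt = build_packets({"a": "News A", "b": "News B"}, ["b", "a"])
print(display_in_order(data_pkt, seq_pkt))  # ['News B', 'News A']
```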
In this embodiment, the server generates the object data packet according to the object data, generates the object data sequence packet according to the object data sequence, and sends the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal sequentially displays the object data according to the object data packet and the object data sequence, and can display the corresponding object data at the terminal, thereby improving the accuracy of displaying the object data.
In one embodiment, the method further comprises: obtaining model features and dividing them into training features, verification features, and test features; inputting the training features into a basic feature cross model for training to obtain a preliminary feature cross model; inputting the verification features into the preliminary feature cross model for verification to obtain a verification result; adjusting parameters of the preliminary feature cross model according to the verification result to obtain a target feature cross model; inputting the test features into the target feature cross model for testing to obtain a test result; and, once the test result conforms to a preset test result, taking the target feature cross model as the feature cross model.
The model features are the feature information used to train the feature cross model: the training features are the feature information used when training the model, the verification features are the feature information used when verifying it, and the test features are the feature information used when testing it. The basic feature cross model is the untrained model, and the preliminary feature cross model is the model obtained after initial training. The verification result is the result of model verification obtained after the server inputs the verification features into the preliminary feature cross model, and the test result is the result of model testing obtained after the server inputs the test features into the target feature cross model.
In this embodiment, the server acquires the model features, divides them into training, verification, and test features, inputs the training features into the basic feature cross model for training to obtain the preliminary feature cross model, and inputs the verification features into the preliminary feature cross model for verification to obtain the verification result. The server then adjusts the parameters of the preliminary feature cross model according to the verification result to obtain the target feature cross model, inputs the test features into the target feature cross model for testing to obtain the test result, and, once the test result conforms to the preset test result, takes the target feature cross model as the feature cross model. By training, verifying, and testing the model, its performance can be continuously improved, so that the server can obtain accurate target dimension-reduction feature pairs using the feature cross model.
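The three-way split of model features can be sketched as follows; the 80/10/10 ratios are illustrative assumptions, since the patent does not fix them:

```python
def split_model_features(features, train_ratio=0.8, val_ratio=0.1):
    """Divide model features into training, verification, and test sets
    by contiguous slicing (ratios are hypothetical defaults)."""
    n = len(features)
    n_train, n_val = int(n * train_ratio), int(n * val_ratio)
    train = features[:n_train]
    val = features[n_train:n_train + n_val]
    test = features[n_train + n_val:]
    return train, val, test

train, val, test = split_model_features(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```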
FIG. 5 is a schematic diagram of a data processing apparatus according to one embodiment, the apparatus including:
the data acquisition module 302 is configured to acquire behavior data and object data, and perform characterization processing on the behavior data and the object data to obtain behavior data features and object data features;
The feature dimension reduction module 304 is configured to perform kernel principal component analysis on the behavior data feature and the object data feature to obtain a first dimension reduction feature corresponding to the behavior data feature and a second dimension reduction feature corresponding to the object data feature;
the feature intersection module 306 is configured to input the first dimension-reduction feature into the trained feature intersection model, to obtain at least one target dimension-reduction feature pair;
the weight calculation module 308 is configured to input each target dimension reduction feature pair into the trained weight calculation model, so as to obtain first weight information corresponding to each target dimension reduction feature pair;
the feature weighting module 310 is configured to input the first weight information and the second dimension-reduction feature into the trained weighting model, and output second weight information corresponding to the second dimension-reduction feature;
the data display module 312 is configured to sort the object data according to the second weight information, generate a corresponding object data sequence, and send the object data sequence to a corresponding terminal, so that the terminal displays the object data in sequence according to the object data sequence.
In one embodiment, the feature dimension reduction module comprises: the average value acquisition module is used for acquiring a first characteristic average value corresponding to the behavior data characteristic and a second characteristic average value corresponding to the object data characteristic; the feature processing module is used for carrying out de-averaging on the behavior data features and the object data features according to the first feature average value and the second feature average value to obtain target features, wherein the target features comprise the behavior data features and the object data features after the de-averaging; the vector ordering module is used for acquiring the feature value and the feature vector corresponding to the target feature, and ordering the feature vector according to the feature value to obtain an ordering result; the feature mapping module is used for establishing a cross-data-domain subspace for feature vectors with the sorting result larger than a preset threshold value, mapping the behavior data features and the object data features into the cross-data-domain subspace, and obtaining first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features.
In one embodiment, the feature crossing module comprises: the sub-dimension-reduction feature acquisition module is used for acquiring a first sub-dimension-reduction feature in the first dimension-reduction features; the feature pair acquisition module is used for respectively associating the first sub-dimension reduction features with the second sub-dimension reduction features in the first dimension reduction features to obtain target dimension reduction feature pairs.
In one embodiment, the data display module is configured to generate an object data packet according to the object data, and generate an object data sequence packet according to the object data sequence; and sending the object data packet and the object data sequence packet to the corresponding terminal so that the terminal displays the object data in sequence according to the object data packet and the object data sequence packet.
In one embodiment, the feature intersection module is further configured to obtain model features, and divide the model features into training features, verification features, and test features; inputting the training features into a basic feature cross model for training to obtain a preliminary feature cross model; inputting the verification features into a preparation feature cross model for verification to obtain a verification result; adjusting parameters in the preparation feature intersection model according to the verification result to obtain a target feature intersection model; inputting the test features into a target feature cross model for testing to obtain a test result; and taking the target characteristic crossing model as the characteristic crossing model until the test result accords with the preset test result.
For specific limitations of the data processing apparatus, reference may be made to the limitations of the data processing method above, which are not repeated here. Each module in the above data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware, may be independent of the processor in the computer device, or may be stored as software in a memory of the computer device so that the processor can call and execute the operations corresponding to each module. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, etc. The data processing apparatus described above may be implemented in the form of a computer program.
In one embodiment, a computer device is provided, which may be a server or a terminal. When the computer device is a server, its internal structure may be as shown in fig. 6. When the computer device is a terminal, its internal structure further comprises a display screen, an input device, a camera, a sound acquisition device, a speaker, and the like. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a data processing method. It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of some of the structures associated with the present application and does not limit the computer device to which the present application may be applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
Wherein the processor when executing the program performs the steps of: acquiring behavior data and object data, and performing characterization processing on the behavior data and the object data to obtain behavior data characteristics and object data characteristics; performing kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features; inputting the first dimension reduction feature into a trained feature cross model to obtain at least one target dimension reduction feature pair; inputting each target dimension reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair; inputting the first weight information and the second dimension reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimension reduction feature; and sequencing the object data according to the second weight information, generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal so that the terminal displays the object data in sequence according to the object data sequence.
The above definition of the computer device may refer to the above specific definition of the data processing method, which is not repeated here.
With continued reference to fig. 6, there is also provided a computer readable storage medium having stored thereon a computer program, such as the non-volatile storage medium shown in fig. 6, wherein the program when executed by a processor performs the steps of: acquiring behavior data and object data, and performing characterization processing on the behavior data and the object data to obtain behavior data characteristics and object data characteristics; performing kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features; inputting the first dimension reduction feature into a trained feature cross model to obtain at least one target dimension reduction feature pair; inputting each target dimension reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair; inputting the first weight information and the second dimension reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimension reduction feature; and sequencing the object data according to the second weight information, generating a corresponding object data sequence, and sending the object data sequence to a corresponding terminal so that the terminal displays the object data in sequence according to the object data sequence.
The definition of the computer readable storage medium is referred to above as a specific definition of the data processing method, and will not be repeated here.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium, and where the program, when executed, may include processes in the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-only memory (ROM), or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features of the above embodiments are described; however, as long as there is no contradiction between the combined technical features, they should be considered to be within the scope of this specification.
The above examples express only a few embodiments of the invention, which are described in detail but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention. Accordingly, the scope of protection of the invention is determined by the appended claims.

Claims (10)

1. A method of data processing, the method comprising:
acquiring behavior data and object data, and performing characterization processing on the behavior data and the object data to obtain behavior data characteristics and object data characteristics; the behavior data are data information related to terminal behaviors, acquired by a server, and the object data are object information for sequencing display; performing kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features;
acquiring a first sub-dimension reduction feature in the first dimension reduction features;
the first sub-dimension-reduction features are respectively associated with second sub-dimension-reduction features in the first dimension-reduction features, so that each target dimension-reduction feature pair is obtained;
inputting each target dimension reduction feature pair into a trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair;
inputting the first weight information and the second dimension reduction feature into a trained weighting model, and outputting second weight information corresponding to the second dimension reduction feature;
And sequencing the object data according to the second weight information, generating a corresponding object data sequence, generating an object data packet according to the object data, generating an object data sequence packet according to the object data sequence, and sending the object data packet and the object data sequence packet to a corresponding terminal so that the terminal sequentially displays the object data according to the object data packet and the object data sequence.
2. The method of claim 1, wherein the performing a kernel principal component analysis on the behavioral data characteristic and the object data characteristic to obtain a first dimension-reduction characteristic corresponding to the behavioral data characteristic and a second dimension-reduction characteristic corresponding to the object data characteristic comprises:
acquiring a first characteristic average value corresponding to the behavior data characteristic and a second characteristic average value corresponding to the object data characteristic;
the behavior data features and the object data features are subjected to de-averaging according to the first feature average value and the second feature average value to obtain target features, wherein the target features comprise the behavior data features and the object data features after de-averaging;
Acquiring a feature value and a feature vector corresponding to the target feature, and sequencing the feature vector according to the feature value to obtain a sequencing result;
and establishing a cross-data-domain subspace of feature vectors with the sorting result larger than a preset threshold, mapping the behavior data features and the object data features into the cross-data-domain subspace, and obtaining a first dimension reduction feature corresponding to the behavior data features and a second dimension reduction feature corresponding to the object data features.
3. The method of claim 1, wherein the second weight information corresponding to the second dimension-reduction feature is output by the weighting model by a factor analysis weighting method, an information amount weighting method, an independence weighting method, or a standard deviation method.
4. A method according to claim 3, wherein the factor analysis weighting method is a method of calculating a cumulative contribution rate of a commonality factor for each index according to a factor analysis method in mathematical statistics, and the larger the cumulative contribution rate, the larger the determined weight.
5. The method of claim 1, wherein said inputting the first dimension-reduction feature into the trained feature intersection model results in at least one target dimension-reduction feature pair, comprising:
obtaining model features, and dividing the model features into training features, verification features and test features;
inputting the training features into a basic feature cross model for training to obtain a preliminary feature cross model;
inputting the verification features into the preliminary feature cross model for verification to obtain a verification result;
adjusting parameters in the preliminary feature cross model according to the verification result to obtain a target feature cross model;
inputting the test features into the target feature cross model for testing to obtain a test result;
and when the test result meets a preset test result, taking the target feature cross model as the trained feature cross model.
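The first step of this training procedure, partitioning the model features into training, verification and test sets, can be sketched as follows. The 60/20/20 split ratio is an assumption for illustration; the claims do not state one:

```python
import numpy as np

def split_model_features(features, labels, seed=0):
    """Shuffle the model features and carve them into training,
    verification (validation) and test subsets, as in the claimed
    train / verify-and-adjust / test procedure."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(features))

    n_train = int(0.6 * len(idx))   # assumed 60% for training
    n_val = int(0.2 * len(idx))     # assumed 20% for verification
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]    # remainder for testing

    return ((features[train], labels[train]),
            (features[val], labels[val]),
            (features[test], labels[test]))
```

The verification set then drives parameter adjustment of the preliminary model, while the held-out test set decides whether the resulting model meets the preset test result.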
6. A data processing apparatus, the apparatus comprising:
the data acquisition module is used for acquiring behavior data and object data, and performing feature processing on the behavior data and the object data to obtain behavior data features and object data features; the behavior data is information related to terminal behavior acquired by a server, and the object data is object information to be displayed in a sorted order;
the feature dimension reduction module is used for carrying out kernel principal component analysis on the behavior data features and the object data features to obtain first dimension reduction features corresponding to the behavior data features and second dimension reduction features corresponding to the object data features;
the feature cross module is used for acquiring a first sub-dimension reduction feature in the first dimension reduction features, and correlating the first sub-dimension reduction feature with each second sub-dimension reduction feature in the first dimension reduction features to obtain each target dimension reduction feature pair;
the weight calculation module is used for inputting each target dimension reduction feature pair into the trained weight calculation model to obtain first weight information corresponding to each target dimension reduction feature pair;
the feature weighting module is used for inputting the first weight information and the second dimension reduction feature into a trained weighting model and outputting second weight information corresponding to the second dimension reduction feature;
the data display module is used for sorting the object data according to the second weight information to generate a corresponding object data sequence, generating an object data packet from the object data and an object data sequence packet from the object data sequence, and sending the object data packet and the object data sequence packet to the corresponding terminal, so that the terminal displays the object data in order according to the object data packet and the object data sequence packet.
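The ranking-and-packaging step of the data display module can be sketched as follows. The JSON packet layout and field names are assumptions for illustration; the patent does not specify a packet format:

```python
import json

def build_packets(object_data, weights):
    """Order object data by their second weight information
    (descending) and build the object data packet plus the object
    data sequence packet that are sent to the terminal."""
    # Indices of the objects, sorted by weight, highest first
    order = sorted(range(len(object_data)),
                   key=lambda i: weights[i], reverse=True)

    object_packet = json.dumps({"objects": object_data})
    sequence_packet = json.dumps({"sequence": order})
    return object_packet, sequence_packet
```

The terminal can then display the objects in the order given by the sequence packet without the server re-transmitting the object data itself.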
7. The apparatus of claim 6, wherein the feature dimension reduction module comprises:
the average value obtaining module is used for obtaining a first feature average value corresponding to the behavior data features and a second feature average value corresponding to the object data features;
the feature processing module is used for performing de-averaging on the behavior data features and the object data features according to the first feature average value and the second feature average value to obtain target features, wherein the target features comprise the de-averaged behavior data features and object data features;
the vector ordering module is used for acquiring eigenvalues and eigenvectors corresponding to the target features, and sorting the eigenvectors by eigenvalue to obtain a sorting result;
and the feature mapping module is used for establishing a cross-data-domain subspace from the eigenvectors whose sorting result is greater than a preset threshold, and mapping the behavior data features and the object data features into the cross-data-domain subspace to obtain a first dimension reduction feature corresponding to the behavior data features and a second dimension reduction feature corresponding to the object data features.
8. The apparatus of claim 6, wherein the feature cross module is further configured to: obtain model features, and divide the model features into training features, verification features and test features; input the training features into a basic feature cross model for training to obtain a preliminary feature cross model; input the verification features into the preliminary feature cross model for verification to obtain a verification result; adjust parameters in the preliminary feature cross model according to the verification result to obtain a target feature cross model; input the test features into the target feature cross model for testing to obtain a test result; and when the test result meets a preset test result, take the target feature cross model as the trained feature cross model.
9. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 5.
10. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 5.
CN201910604468.4A 2019-07-05 2019-07-05 Data processing method, device, computer readable storage medium and computer equipment Active CN110472659B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910604468.4A CN110472659B (en) 2019-07-05 2019-07-05 Data processing method, device, computer readable storage medium and computer equipment


Publications (2)

Publication Number Publication Date
CN110472659A CN110472659A (en) 2019-11-19
CN110472659B (en) 2024-03-08

Family

ID=68506806


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561337A (en) * 2020-12-16 2021-03-26 北京明略软件系统有限公司 Method and device for determining object type
CN113158022B (en) * 2021-01-29 2024-03-12 北京达佳互联信息技术有限公司 Service recommendation method, device, server and storage medium
CN112990669A (en) * 2021-02-24 2021-06-18 平安健康保险股份有限公司 Product data analysis method and device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392226A (en) * 2017-06-13 2017-11-24 上海交通大学 The modeling method of pilot's working condition identification model
CN108734568A (en) * 2018-04-09 2018-11-02 中国平安人寿保险股份有限公司 A kind of feature combination method, device, terminal device and storage medium
CN108764273A (en) * 2018-04-09 2018-11-06 中国平安人寿保险股份有限公司 A kind of method, apparatus of data processing, terminal device and storage medium
CN109242002A (en) * 2018-08-10 2019-01-18 深圳信息职业技术学院 High dimensional data classification method, device and terminal device
CN109960726A (en) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 Textual classification model construction method, device, terminal and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017050140A1 (en) * 2015-09-23 2017-03-30 歌尔股份有限公司 Method for recognizing a human motion, method for recognizing a user action and smart terminal




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant