CN113610107A - Feature optimization method and device - Google Patents
- Publication number
- CN113610107A (Application CN202110751068.3A)
- Authority
- CN
- China
- Prior art keywords
- feature
- mapping
- optimized
- dimension
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/211—Pattern recognition; design or setup of recognition systems; selection of the most significant subset of features
- G06F18/213—Pattern recognition; feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
- G06N3/044—Neural networks; architecture; recurrent networks, e.g. Hopfield networks
- G06N3/084—Neural networks; learning methods; backpropagation, e.g. using gradient descent
Abstract
The present disclosure provides a feature optimization method and apparatus. The method includes: inputting features to be optimized into a pre-constructed feature optimization model comprising an encoder and a decoder; performing dimensionality-reduction mapping on the feature to be optimized through the encoder to map it into a first intermediate feature; performing dimension-ascending mapping on the first intermediate feature through the decoder to map it into a second intermediate feature with the same dimension as the feature to be optimized; and optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized to determine an optimized feature. The method can efficiently capture cross-relationships among the features to be optimized, reduce feature dimensionality, and improve calculation efficiency.
Description
Technical Field
The disclosure relates to the technical field of risk control, and in particular to a feature optimization method and device.
Background
The multi-head loan feature is commonly used in the field of credit risk-control modeling. Existing feature processing mainly obtains it from a user's loan application behavior data across multiple financial institutions; according to the category of financial institution and different time slices, the features are further derived with operators such as addition, subtraction, multiplication, division, statistics, and aggregation, finally yielding the feature set used for credit risk-control modeling.
The main disadvantages of the existing feature processing methods are as follows:
first, feature derivation and selection mainly depend on manual experience and require a great deal of manpower and time;
second, the information that can be mined manually is limited, and deeper, valuable information in the features cannot be mined;
third, the number of features obtained manually is large and the features are correlated, so they easily interfere with each other during modeling; the calculation load is large and feature screening efficiency is low.
Disclosure of Invention
The embodiment of the disclosure provides a feature optimization method and device, which can efficiently acquire features having a cross relationship in features to be optimized, reduce feature dimensions, and improve calculation efficiency.
In a first aspect of the embodiments of the present disclosure, a method for feature optimization is provided, where the method includes:
inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, wherein the characteristic optimization model comprises an encoder and a decoder;
performing dimensionality reduction mapping on the feature to be optimized through the encoder, and mapping the feature to be optimized into a first intermediate feature;
performing dimension-ascending mapping on the first intermediate feature through the decoder, and mapping the first intermediate feature into a second intermediate feature with the same dimension as the feature to be optimized;
optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized, and determining an optimized feature.
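The four steps above can be sketched as a minimal forward pass. This is purely illustrative: the sizes D and H, the sigmoid activation, and the random weights are assumptions, not fixed by the claims.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sizes are illustrative assumptions; the claims do not fix them.
D, H = 1000, 128
x = rng.normal(size=D)                       # feature to be optimized

# Encoder performs dimensionality-reduction mapping
W_enc = rng.normal(scale=0.02, size=(H, D))
h = sigmoid(W_enc @ x)                       # first intermediate feature

# Decoder performs dimension-ascending mapping
W_dec = rng.normal(scale=0.02, size=(D, H))
x_hat = sigmoid(W_dec @ h)                   # second intermediate feature, same dim as x

# The feature difference drives the optimization of the input feature
feature_difference = x_hat - x
```

The reconstruction lands back in the input space, so the element-wise difference is directly comparable to the original feature.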
In an alternative embodiment of the method according to the invention,
the encoder includes a batch normalization processing layer, a first mapping layer and a second mapping layer,
the method for performing the dimension reduction mapping on the feature to be optimized through the encoder comprises the following steps:
carrying out batch normalization processing on the features to be optimized through the batch normalization processing layer to obtain normalized features;
mapping, by the first mapping layer, the normalized features to third intermediate features that are the same dimension as the first mapping layer;
mapping, by the second mapping layer, the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
In an alternative embodiment of the method according to the invention,
the method of mapping the normalized features to third intermediate features of the same dimension as the first mapping layer by the first mapping layer comprises:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the method of mapping the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer by the second mapping layer comprises:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
In an alternative embodiment of the method according to the invention,
the dimension of the first mapping layer is 512 dimensions, and the dimension of the second mapping layer is 128 dimensions.
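A minimal sketch of this two-stage dimension-reduction mapping with the 512- and 128-dimensional layers; the input dimension D, the sigmoid activation, and the random weights are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

D = 2048                        # dimension of the feature to be optimized (assumed)
x_norm = rng.normal(size=D)     # normalized feature from the batch normalization layer

# First mapping layer: 512-dimensional fully-connected layer
W1, b1 = rng.normal(scale=0.02, size=(512, D)), np.zeros(512)
third = sigmoid(W1 @ x_norm + b1)   # third intermediate feature (512 dims)

# Second mapping layer: 128-dimensional fully-connected layer
W2, b2 = rng.normal(scale=0.02, size=(128, 512)), np.zeros(128)
first = sigmoid(W2 @ third + b2)    # first intermediate feature (128 dims)

# Dimensions strictly decrease: 128 < 512 < D, as the embodiment requires
assert first.shape[0] < third.shape[0] < D
```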
In an alternative embodiment of the method according to the invention,
the method further includes training a feature optimization model,
the method for training the feature optimization model comprises the following steps:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model.
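The training procedure above might be sketched as follows. To keep the back-propagation gradients short, this sketch assumes an identity activation and plain gradient descent on the mean-squared reconstruction error; the disclosed model would use the encoder/decoder activations described elsewhere in this document.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H, m = 20, 5, 200
X = rng.normal(size=(m, D))          # pre-obtained training features

W_enc = rng.normal(scale=0.1, size=(D, H))
W_dec = rng.normal(scale=0.1, size=(H, D))

lr = 0.01
loss_first = loss_last = None
for step in range(500):
    Z = X @ W_enc                    # fourth intermediate features (dim reduction)
    X_hat = Z @ W_dec                # fifth intermediate features (dim ascent)
    err = X_hat - X                  # feature error
    loss = float((err ** 2).mean())
    if step == 0:
        loss_first = loss
    loss_last = loss
    # Back-propagation of the mean-squared reconstruction loss
    grad_out = 2.0 * err / err.size
    g_dec = Z.T @ grad_out
    g_enc = X.T @ (grad_out @ W_dec.T)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

# Proxy for the preset convergence condition: the feature error decreased
assert loss_last < loss_first
```

In practice the loop would stop once the error satisfies the preset convergence condition rather than after a fixed step count.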
In a second aspect of the embodiments of the present disclosure, there is provided a feature optimization apparatus, including:
the device comprises a first unit, a second unit and a third unit, wherein the first unit is used for inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, and the characteristic optimization model comprises an encoder and a decoder;
a second unit, configured to perform dimension reduction mapping on the feature to be optimized through the encoder, and map the feature to be optimized as a first intermediate feature;
a third unit, configured to perform, by the decoder, ascending-dimension mapping on the first intermediate feature, and map the first intermediate feature into a second intermediate feature having a dimension that is the same as that of the feature to be optimized;
and the fourth unit is used for optimizing the feature to be optimized and determining the optimized feature based on the feature difference between the second intermediate feature and the feature to be optimized.
In an alternative embodiment of the method according to the invention,
the encoder includes a batch normalization processing layer, a first mapping layer and a second mapping layer,
the second unit comprises a normalization unit, a first mapping unit and a second mapping unit,
the normalization unit is used for performing batch normalization processing on the features to be optimized through the batch normalization processing layer to obtain normalized features;
the first mapping unit is configured to map, by the first mapping layer, the normalized feature into a third intermediate feature having the same dimension as the first mapping layer;
the second mapping unit is configured to map, by the second mapping layer, the third intermediate feature into a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
In an alternative embodiment of the method according to the invention,
the first mapping unit is further configured to:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the second mapping unit is further configured to:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
In an alternative embodiment of the method according to the invention,
the dimension of the first mapping layer is 512 dimensions, and the dimension of the second mapping layer is 128 dimensions.
In an alternative embodiment of the method according to the invention,
the apparatus further comprises a fifth unit for training the feature optimization model;
the fifth unit is configured to:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model.
The present disclosure provides a method of feature optimization, the method comprising:
inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, wherein the characteristic optimization model comprises an encoder and a decoder;
performing dimensionality reduction mapping on the feature to be optimized through the encoder, and mapping the feature to be optimized into a first intermediate feature;
Dimension-reduction mapping of the features to be optimized by the encoder avoids mutual interference of high-dimensional features in the subsequent calculation process and reduces the amount of calculation. In addition, dimension-reduction mapping enables the deep-level relationships among the features to be optimized to be learned automatically, and hidden nonlinear information features that conventional feature derivation methods cannot mine can be discovered.
Performing dimension-ascending mapping on the first intermediate feature through the decoder, and mapping the first intermediate feature into a second intermediate feature with the same dimension as the feature to be optimized;
Because the decoder performs dimension-ascending mapping on the feature output by the encoder, the difference between the input and the output of the feature optimization model can be compared, so that the output result of the feature optimization model implies deep-level valuable information.
Optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized, and determining an optimized feature.
By iteratively optimizing the features in this way, the interference of inter-feature correlation on the model can be effectively avoided.
Drawings
FIG. 1 is a schematic flow chart diagram of a feature optimization method according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of a feature optimization device according to an embodiment of the disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present disclosure and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein.
It should be understood that, in various embodiments of the present disclosure, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
It should be understood that in the present disclosure, "including" and "having" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present disclosure, "plurality" means two or more. "and/or" is merely an association describing an associated object, meaning that three relationships may exist, for example, and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "comprises A, B and C" and "comprises A, B, C" means that all three of A, B, C comprise, "comprises A, B or C" means that one of A, B, C comprises, "comprises A, B and/or C" means that any 1 or any 2 or 3 of A, B, C comprises.
It should be understood that in this disclosure, "B corresponding to a", "a corresponds to B", or "B corresponds to a" means that B is associated with a, from which B can be determined. Determining B from a does not mean determining B from a alone, but may be determined from a and/or other information. And the matching of A and B means that the similarity of A and B is greater than or equal to a preset threshold value.
As used herein, "if" may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context.
The technical solution of the present disclosure is explained in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 1 schematically illustrates a flow chart of a feature optimization method according to an embodiment of the present disclosure, and as shown in fig. 1, the method includes:
step S101, inputting the characteristics to be optimized into a pre-constructed characteristic optimization model;
For example, the embodiment of the present disclosure takes the multi-head loan feature as the feature to be optimized; it should be noted that the embodiment of the present disclosure does not limit the category or number of features to be optimized.
The multi-head loan feature is a statistical index describing a user's loan behavior at different categories of financial institutions; for example, it can indicate that a certain user filed 5 loan applications in the banking industry within the past 30 days. Multi-head loan features are also commonly used in the field of credit risk-control modeling. Existing multi-head loan features are mainly obtained by processing a user's loan application behavior data from multiple financial institutions; according to the category of financial institution and different time slices, they are further derived through various calculation methods (such as addition, subtraction, multiplication, division, statistics, and aggregation), finally yielding the feature set required for credit risk-control modeling.
Illustratively, the feature optimization model of the embodiments of the present disclosure is constructed based on a neural network for reducing the dimensionality of the input features and obtaining an efficient feature representation of the input features;
It is to be understood that the function of the feature optimization model of the embodiment of the present disclosure may include optimizing the input features, and the model may be constructed based on a neural network; for example, it may include a self-encoder (autoencoder). It should be noted that this is only exemplary, and the embodiment of the present disclosure does not limit the specific type of the feature optimization model.
The self-encoder is a type of neural network trained with an unsupervised learning algorithm; a back-propagation algorithm may be used to train the network so that the dimension of its output is the same as the dimension of its input. In practical applications, specific settings can be added to the self-encoder so that it learns valuable representations of, and information about, the input features.
The feature optimization model of the embodiments of the present disclosure may include a self-encoder, and for convenience of description, the feature optimization model is referred to as a self-encoder in the following.
The self-encoder of the disclosed embodiment may include an input layer, an encoder, a decoder, and an output layer.
Illustratively, the input layer may be composed of the features to be optimized, where the features to be optimized may include different user information, such as the user's account age, behavior interval, and amount of multi-head loans; the embodiments of the present disclosure do not limit the type and number of features to be optimized.
Step S102, performing dimension-reduction mapping on the feature to be optimized through the encoder, and mapping the feature to be optimized into a first intermediate feature;
in an alternative embodiment, the encoder includes a batch normalization layer, a first mapping layer and a second mapping layer,
the method for performing the dimension reduction mapping on the feature to be optimized through the encoder comprises the following steps:
carrying out batch normalization processing on the features to be optimized through the batch normalization processing layer to obtain normalized features;
mapping, by the first mapping layer, the normalized features to third intermediate features that are the same dimension as the first mapping layer;
mapping, by the second mapping layer, the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
Illustratively, the encoder of the embodiments of the present disclosure may include three layers, respectively, a batch normalization processing layer, a first mapping layer, and a second mapping layer.
The batch normalization processing layer of the encoder can perform dimension normalization processing on the features to be optimized to obtain normalized features.
Specifically, the batch normalization processing layer of the encoder may normalize the feature to be optimized to a distribution with a mean value of 0 and a standard deviation of 1. For example, the dimension normalization performed by the batch normalization processing layer of the encoder may be represented by the following formula:

x̂_k^(i) = (x_k^(i) - μ_k) / (σ_k + ε)

where x̂_k^(i) denotes the k-th dimension of the i-th feature after normalization (i.e., the normalized feature), x_k^(i) denotes the k-th dimension of the i-th feature to be optimized before normalization, μ_k denotes the mean of the k-th dimension across the features to be optimized, σ_k denotes the standard deviation of the k-th dimension, ε denotes a constant different from 0, k ∈ [1, D], i ∈ [1, m], D denotes the dimension of the feature to be optimized, and m denotes the number of features to be optimized. A feature to be optimized can be represented as X ∈ R^D, where X denotes the feature to be optimized and D denotes its dimension.
Performing dimension normalization on the features to be optimized has two benefits. First, features with different scales can be compared within the same frame of reference, i.e., data of different magnitudes are unified so that comparison is meaningful. Second, it accelerates the convergence of the corresponding model during training and improves the running speed of the model.
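Assuming the per-dimension normalization described above, a NumPy sketch (the batch size, feature dimension, and value of ε are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# m features to be optimized, each of dimension D (illustrative sizes)
m, D = 100, 8
X = rng.normal(loc=5.0, scale=3.0, size=(m, D))

eps = 1e-5                          # the non-zero constant from the formula
mu = X.mean(axis=0)                 # mean of each dimension k
sigma = X.std(axis=0)               # standard deviation of each dimension k
X_norm = (X - mu) / (sigma + eps)   # normalized features

# Each dimension now has mean ~0 and standard deviation ~1
assert np.allclose(X_norm.mean(axis=0), 0.0, atol=1e-9)
assert np.allclose(X_norm.std(axis=0), 1.0, atol=1e-3)
```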
Specifically, the first mapping layer of the encoder may include a fully-connected layer with a dimension of 512 dimensions, and the second mapping layer of the encoder may include a fully-connected layer with a dimension of 128 dimensions, where the first mapping layer and the second mapping layer have similar structures and are both capable of performing dimension-reduction mapping on the input features.
In an alternative embodiment, the method of mapping the normalized feature to a third intermediate feature having the same dimension as the first mapping layer by the first mapping layer comprises:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the method of mapping the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer by the second mapping layer comprises:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
Illustratively, taking the determination of the third intermediate feature as an example, the formula for determining the third intermediate feature based on the activation function corresponding to the encoder, the first weight matrix corresponding to the first mapping layer, the first bias parameter corresponding to the first mapping layer, and the normalized feature may be as follows:
h=α(ωx+b)
wherein h denotes the third intermediate feature, α denotes an activation function corresponding to the encoder, ω denotes a first weight matrix corresponding to the first mapping layer, x denotes the normalized feature, and b denotes a first bias parameter corresponding to the first mapping layer.
The above manner is only an exemplary description of the method for determining the third intermediate feature, and the method for determining the first intermediate feature in the embodiment of the present disclosure may refer to the above contents, except that each parameter in the formula for determining the first intermediate feature corresponds to the second mapping layer, which is not described herein again in the embodiment of the present disclosure.
It can be understood that the traditional feature development method is costly: feature derivation mainly depends on manual experience and requires a great deal of manpower and time; the information obtainable by manually mining derived features is limited; and high-dimensional features easily interfere with each other in the subsequent modeling process, resulting in a large amount of calculation and low feature screening efficiency.
In the embodiment of the disclosure, the deep-level relationships among the features to be optimized are learned automatically through the multi-layer network structure of the self-encoder, and hidden nonlinear information features that conventional feature derivation methods cannot mine can be discovered. For example, two accounts that seem unrelated but actually belong to a specific organization can be identified and analyzed further, which cannot be done manually.
The features to be optimized may be further processed by a decoder after being encoded by the encoder.
Step S103, performing dimension-ascending mapping on the first intermediate feature through the decoder, and mapping the first intermediate feature into a second intermediate feature with the same dimension as the feature to be optimized;
specifically, a first intermediate feature may be subjected to dimension-up mapping by a decoder, and the first intermediate feature may be mapped to a second intermediate feature having the same dimension as the feature to be optimized.
For example, the dimension-ascending mapping of the first intermediate feature to obtain the second intermediate feature may be performed as follows:

x̂=α′(ω′h+b′)

wherein x̂ denotes the second intermediate feature, α′ denotes the activation function corresponding to the decoder, ω′ denotes a weight matrix with the same dimension as the feature to be optimized, h denotes the first intermediate feature, and b′ denotes the bias parameter corresponding to the decoder.
The decoder performs dimension-ascending mapping on the first intermediate feature to obtain a second intermediate feature with the same dimension as the feature to be optimized. As a special type of neural network, the self-encoder produces output of the same dimension as its input; by comparing the difference between input and output, the loss function of the self-encoder can be iteratively optimized, so that the output of the encoder captures deep-level valuable information.
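A minimal NumPy sketch of the decoder's dimension-ascending mapping; a linear (identity) output activation and toy 3-to-4 dimensions are assumed here, since the disclosure does not fix either.

```python
import numpy as np

def decode(h, weight_out, bias_out):
    # x_hat = alpha'(omega' @ h + b') with an assumed identity activation:
    # maps the first intermediate feature back up to the input dimension.
    return weight_out @ h + bias_out

rng = np.random.default_rng(1)
h = rng.standard_normal(3)             # first intermediate feature
omega_p = rng.standard_normal((4, 3))  # decoder weight matrix omega'
b_p = np.zeros(4)                      # decoder bias parameter b'
x_hat = decode(h, omega_p, b_p)        # second intermediate feature
# x_hat has the same dimension (4) as the feature to be optimized.
```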
Step S104, optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized, and determining an optimized feature;
in an alternative embodiment, the method further comprises training a feature optimization model,
the method for training the feature optimization model comprises the following steps:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model. Specifically, the loss function of the feature optimization model to be trained can be iteratively optimized through a back propagation algorithm according to a method shown in the following formula:
wherein x represents a training feature,a fifth intermediate characteristic is shown which is,the loss function is represented.
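The training step can be sketched as a gradient-descent loop on the squared reconstruction error. A single linear encoder/decoder pair, hand-derived gradients, and the learning-rate and epoch values are simplifying assumptions; a real implementation would typically use an autodiff framework.

```python
import numpy as np

def train_autoencoder(X, hidden_dim, lr=0.01, epochs=200):
    # Single-layer linear self-encoder trained on the squared
    # reconstruction error L(x, x_hat) = ||x - x_hat||^2, with
    # hand-derived back-propagation gradients.
    n, d = X.shape
    rng = np.random.default_rng(42)
    W_enc = rng.standard_normal((hidden_dim, d)) * 0.1
    W_dec = rng.standard_normal((d, hidden_dim)) * 0.1
    losses = []
    for _ in range(epochs):
        H = X @ W_enc.T                  # encode (dimension reduction)
        X_hat = H @ W_dec.T              # decode (dimension ascension)
        err = X_hat - X                  # feature error
        losses.append(float(np.mean(err ** 2)))
        grad_dec = (err.T @ H) / n               # dL/dW_dec
        grad_enc = ((err @ W_dec).T @ X) / n     # dL/dW_enc
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))     # pre-obtained training features
_, _, losses = train_autoencoder(X, hidden_dim=2)
```

With a small learning rate the recorded losses decrease as training iterates, i.e. the feature error shrinks toward the preset convergence condition.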
In practical applications, the loss function of the self-encoder can be iteratively optimized through a back propagation algorithm; the embodiment of the disclosure does not limit the method for iteratively optimizing the loss function of the self-encoder.
For example, the preset convergence condition of the embodiment of the present disclosure may include that the feature error between the fifth intermediate feature and the training feature converges to the vicinity of a preset value. It should be noted that the above convergence condition is only an exemplary description; the embodiment of the present disclosure does not limit the preset convergence condition.
Through iterative optimization of the loss function of the feature optimization model to be trained, valuable information implicit in the features to be optimized can be effectively learned, and interference of correlation among the features on modeling is avoided.
For example, after the optimized features are obtained, they may be input into a credit risk control model. The credit risk control model of the embodiment of the disclosure generally takes whether a user will default within a specified period as its prediction target; it is a binary classification model constructed by a machine learning algorithm and outputs the probability that the user defaults.
In the embodiment of the disclosure, the optimized features can be used as the feature input of the credit risk control model. Because the optimized features have lower dimensionality and more accurate representation, using them as input omits the step of screening noise out of the input features and improves both the calculation efficiency and the accuracy of the credit risk control model's output.
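For instance, a logistic scorer can stand in for the binary credit model; the disclosure does not fix the model type, so the logistic form, the 128-dimensional optimized feature, and the random weights below are all illustrative assumptions.

```python
import numpy as np

def default_probability(optimized_feature, w, b):
    # Hypothetical binary scorer: a logistic model mapping the
    # optimized feature to a probability of default in (0, 1).
    z = float(w @ optimized_feature + b)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
feat = rng.standard_normal(128)        # optimized feature (encoder output)
w = rng.standard_normal(128) * 0.01    # illustrative model weights
p = default_probability(feat, w, 0.0)  # predicted default probability
```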
The present disclosure provides a method of feature optimization, the method comprising:
inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, wherein the characteristic optimization model comprises an encoder and a decoder;
performing dimensionality reduction mapping on the feature to be optimized through the encoder, and mapping the feature to be optimized into a first intermediate feature;
Dimension-reduction mapping of the features to be optimized by the encoder avoids mutual interference of high-dimensional features in subsequent calculation and reduces the amount of calculation. In addition, the dimension-reduction mapping can automatically learn deep-level relationships among the features to be optimized and mine hidden nonlinear information features that conventional feature derivation methods cannot.
Performing dimension-ascending mapping on the first intermediate feature through the decoder, and mapping the first intermediate feature into a second intermediate feature with the same dimension as the feature to be optimized;
Because the decoder performs dimension-ascending mapping on the feature output by the encoder, the difference between the input and the output of the feature optimization model can be compared, so that the output result of the feature optimization model captures deep-level valuable information.
Optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized, and determining an optimized feature.
By iterative optimization, the interference of the correlation among the features on the model can be effectively avoided.
Fig. 2 schematically illustrates a structural diagram of a feature optimization device according to an embodiment of the present disclosure, and as shown in fig. 2, the device includes:
a first unit 21, configured to input a feature to be optimized into a pre-constructed feature optimization model, where the feature optimization model includes an encoder and a decoder;
a second unit 22, configured to perform dimension reduction mapping on the feature to be optimized through the encoder, and map the feature to be optimized into a first intermediate feature;
a third unit 23, configured to perform, by the decoder, ascending-dimension mapping on the first intermediate feature, and map the first intermediate feature into a second intermediate feature having the same dimension as the feature to be optimized;
a fourth unit 24, configured to optimize the feature to be optimized based on a feature difference between the second intermediate feature and the feature to be optimized, and determine an optimized feature.
In an alternative embodiment,
the encoder includes a batch normalization processing layer, a first mapping layer and a second mapping layer,
said second unit 22 comprises a normalization unit and a first mapping unit and a second mapping unit,
the standardization unit is used for carrying out batch standardization processing on the features to be optimized through the batch standardization layer to obtain standardized features;
the first mapping unit is configured to map, by the first mapping layer, the normalized feature into a third intermediate feature having the same dimension as the first mapping layer;
the second mapping unit is configured to map, by the second mapping layer, the third intermediate feature into a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
In an alternative embodiment,
the first mapping unit is further configured to:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the second mapping unit is further configured to:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
In an alternative embodiment,
the dimension of the first mapping layer is 512 dimensions, and the dimension of the second mapping layer is 128 dimensions.
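With the dimensions stated here, the encoder's forward pass (batch normalization, then the 512-dimensional first mapping layer, then the 128-dimensional second mapping layer) might look as follows; the ReLU activation and the 1000-dimensional input are assumptions.

```python
import numpy as np

def batch_norm(X, eps=1e-5):
    # Normalize each feature column to zero mean and unit variance.
    return (X - X.mean(axis=0)) / np.sqrt(X.var(axis=0) + eps)

def encode(X, W1, b1, W2, b2):
    # Batch normalization layer -> first mapping layer (512-d, giving the
    # third intermediate feature) -> second mapping layer (128-d, giving
    # the first intermediate feature). ReLU is an assumed activation.
    Xn = batch_norm(X)
    h3 = np.maximum(0.0, Xn @ W1.T + b1)
    return np.maximum(0.0, h3 @ W2.T + b2)

rng = np.random.default_rng(3)
d_in = 1000                                  # illustrative input dimension
X = rng.standard_normal((8, d_in))           # batch of features to optimize
W1 = rng.standard_normal((512, d_in)) * 0.01
W2 = rng.standard_normal((128, 512)) * 0.01
h = encode(X, W1, np.zeros(512), W2, np.zeros(128))
```

The decreasing widths (input > 512 > 128) reflect the constraint that the dimension of the second mapping layer is smaller than that of the first mapping layer, which in turn is smaller than that of the feature to be optimized.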
In an alternative embodiment,
the apparatus further comprises a fifth unit for training the feature optimization model;
the fifth unit is configured to:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model.
It should be noted that, for the beneficial effects of the feature optimization device in the embodiment of the present disclosure, reference may be made to the beneficial effects of the feature optimization method, and details of the embodiment of the present disclosure are not repeated here.
The present disclosure also provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the device may read the execution instructions from the readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
The readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media may be any available media that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be a read-only memory (ROM), a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In the above embodiments of the terminal or the server, it should be understood that the Processor may be a Central Processing Unit (CPU), other general-purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present disclosure may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.
Claims (10)
1. A method of feature optimization, the method comprising:
inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, wherein the characteristic optimization model comprises an encoder and a decoder;
performing dimensionality reduction mapping on the feature to be optimized through the encoder, and mapping the feature to be optimized into a first intermediate feature;
performing dimension-ascending mapping on the first intermediate feature through the decoder, and mapping the first intermediate feature into a second intermediate feature with the same dimension as the feature to be optimized;
optimizing the feature to be optimized based on the feature difference between the second intermediate feature and the feature to be optimized, and determining an optimized feature.
2. The method of claim 1, wherein the encoder includes a batch normalization layer, a first mapping layer, and a second mapping layer,
the method for performing the dimension reduction mapping on the feature to be optimized through the encoder comprises the following steps:
carrying out batch standardization processing on the features to be optimized through the batch standardization layer to obtain standardized features;
mapping, by the first mapping layer, the normalized features to third intermediate features that are the same dimension as the first mapping layer;
mapping, by the second mapping layer, the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
3. The method of claim 2,
the method of mapping the normalized features to third intermediate features of the same dimension as the first mapping layer by the first mapping layer comprises:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the method of mapping the third intermediate feature to a first intermediate feature having the same dimension as the second mapping layer by the second mapping layer comprises:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
4. The method of claim 3, wherein the first mapping layer has a dimension of 512 dimensions and the second mapping layer has a dimension of 128 dimensions.
5. The method of claim 1, further comprising training a feature optimization model,
the method for training the feature optimization model comprises the following steps:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model.
6. A feature optimization device, the device comprising:
the device comprises a first unit, a second unit and a third unit, wherein the first unit is used for inputting the characteristics to be optimized into a pre-constructed characteristic optimization model, and the characteristic optimization model comprises an encoder and a decoder;
a second unit, configured to perform dimension reduction mapping on the feature to be optimized through the encoder, and map the feature to be optimized as a first intermediate feature;
a third unit, configured to perform, by the decoder, ascending-dimension mapping on the first intermediate feature, and map the first intermediate feature into a second intermediate feature having a dimension that is the same as that of the feature to be optimized;
and the fourth unit is used for optimizing the feature to be optimized and determining the optimized feature based on the feature difference between the second intermediate feature and the feature to be optimized.
7. The apparatus of claim 6, wherein the encoder comprises a batch normalization layer, a first mapping layer, and a second mapping layer,
the second unit comprises a normalization unit, a first mapping unit and a second mapping unit,
the standardization unit is used for carrying out batch standardization processing on the features to be optimized through the batch standardization layer to obtain standardized features;
the first mapping unit is configured to map, by the first mapping layer, the normalized feature into a third intermediate feature having the same dimension as the first mapping layer;
the second mapping unit is configured to map, by the second mapping layer, the third intermediate feature into a first intermediate feature having the same dimension as the second mapping layer,
and the dimension of the second mapping layer is smaller than that of the first mapping layer, and the dimension of the first mapping layer is smaller than that of the feature to be optimized.
8. The apparatus of claim 7,
the first mapping unit is further configured to:
determining the third intermediate feature based on an activation function corresponding to an encoder, a first weight matrix corresponding to the first mapping layer, a first bias parameter corresponding to the first mapping layer, and the normalized feature;
the second mapping unit is further configured to:
determining the first intermediate feature based on the activation function corresponding to the encoder, the second weight matrix corresponding to the second mapping layer, the second bias parameter corresponding to the second mapping layer, and the third intermediate feature.
9. The apparatus of claim 8, wherein the first mapping layer has a dimension of 512 dimensions and the second mapping layer has a dimension of 128 dimensions.
10. The apparatus of claim 6, further comprising a fifth unit for training the feature optimization model;
the fifth unit is configured to:
based on the pre-obtained training features, performing dimensionality reduction mapping on the training features through an encoder of a feature optimization model to be trained, and mapping the training features into fourth intermediate features;
performing ascending-dimension mapping on the fourth intermediate feature through a decoder of a feature optimization model to be trained, and mapping the fourth intermediate feature into a fifth intermediate feature with the same dimension as the training feature;
and iteratively optimizing a loss function of the feature optimization model to be trained through a back propagation algorithm based on the feature errors of the fifth intermediate feature and the training feature so as to enable the feature errors to meet a preset convergence condition, and completing the training of the feature optimization model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110751068.3A CN113610107A (en) | 2021-07-02 | 2021-07-02 | Feature optimization method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113610107A true CN113610107A (en) | 2021-11-05 |
Family
ID=78303942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110751068.3A Pending CN113610107A (en) | 2021-07-02 | 2021-07-02 | Feature optimization method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113610107A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114399384A (en) * | 2022-03-25 | 2022-04-26 | 鲁担(山东)数据科技有限公司 | Risk strategy generation method, system and device based on privacy calculation |
CN116204843A (en) * | 2023-04-24 | 2023-06-02 | 北京芯盾时代科技有限公司 | Abnormal account detection method and device, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109815223A (en) * | 2019-01-21 | 2019-05-28 | 北京科技大学 | A kind of complementing method and complementing device for industry monitoring shortage of data |
WO2019219198A1 (en) * | 2018-05-17 | 2019-11-21 | Huawei Technologies Co., Ltd. | Device and method for clustering of input-data |
CN112529767A (en) * | 2020-12-01 | 2021-03-19 | 平安科技(深圳)有限公司 | Image data processing method, image data processing device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108763277B (en) | Data analysis method, computer readable storage medium and terminal device | |
WO2021164317A1 (en) | Sequence mining model training method, sequence data processing method and device | |
CN113610107A (en) | Feature optimization method and device | |
CN110751557A (en) | Abnormal fund transaction behavior analysis method and system based on sequence model | |
CN111507470A (en) | Abnormal account identification method and device | |
CN111428557A (en) | Method and device for automatically checking handwritten signature based on neural network model | |
CN111260189B (en) | Risk control method, risk control device, computer system and readable storage medium | |
CN113177700B (en) | Risk assessment method, system, electronic equipment and storage medium | |
WO2019242627A1 (en) | Data processing method and apparatus | |
CN112988840A (en) | Time series prediction method, device, equipment and storage medium | |
CN114078008A (en) | Abnormal behavior detection method, device, equipment and computer readable storage medium | |
CN115392937A (en) | User fraud risk identification method and device, electronic equipment and storage medium | |
CN114169439A (en) | Abnormal communication number identification method and device, electronic equipment and readable medium | |
CN113159213A (en) | Service distribution method, device and equipment | |
CN114418158A (en) | Cell network load index prediction method based on attention mechanism learning network | |
CN115204322B (en) | Behavior link abnormity identification method and device | |
CN115905648A (en) | Gaussian mixture model-based user group and financial user group analysis method and device | |
CN115422000A (en) | Abnormal log processing method and device | |
CN114936701A (en) | Real-time monitoring method and device for comprehensive energy consumption and terminal equipment | |
CN110322150B (en) | Information auditing method, device and server | |
CN113052692A (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
CN111400413A (en) | Method and system for determining category of knowledge points in knowledge base | |
CN110852392A (en) | User grouping method, device, equipment and medium | |
CN113837183B (en) | Multi-stage certificate intelligent generation method, system and medium based on real-time mining | |
CN117009883B (en) | Object classification model construction method, object classification method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||