CN113762005A

CN113762005A - Method, device, equipment and medium for training feature selection model and classifying objects

Info

Publication number: CN113762005A
Application number: CN202011242309.3A
Authority: CN
Inventors: 祖辰
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Priority date: 2020-11-09
Filing date: 2020-11-09
Publication date: 2021-12-07
Anticipated expiration: 2040-11-09
Also published as: CN113762005B

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for training a feature selection model and classifying objects. The training method of the feature selection model comprises the following steps: inputting a plurality of training samples and class marking results respectively corresponding to each training sample into a feature selection model, wherein each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features; and adjusting the network parameters in an objective function determination module of the feature selection model according to the objective function value output by the objective function determination module, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm. The technical scheme of the embodiment of the invention achieves high-accuracy feature selection by means of an adaptive loss function that is suitable for measuring data under various data distributions.

Description

Method, device, equipment and medium for training feature selection model and classifying objects

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for training and object classification of a feature selection model.

Background

Feature selection is an important component of many machine learning applications, particularly in bioinformatics and computer vision, which require efficient and powerful feature selection techniques to extract meaningful features and remove noisy and redundant features, so as to avoid degrading the performance of subsequent related algorithms.

Feature selection is the process of selecting a subset of relevant features. It is a key component in constructing robust models for numerous applications such as classification, regression and clustering, because it can accelerate the learning process of the model, improve the generalization capability of the model, and alleviate the influence of the curse of dimensionality on the model.

Researchers have proposed a number of feature selection techniques that are used in practical applications, such as filter feature selection techniques, wrapper feature selection techniques, embedded feature selection techniques, and so forth. The embedded feature selection technique embeds the feature selection process into the training process of the feature selection model, so that feature selection is completed at the same time as the training of the feature selection model.

In the process of implementing the invention, the inventor found that the following technical problem exists in the prior art: the embedded feature selection technique usually measures the difference between a predicted value and a true value by using a square loss function, which amplifies the loss value of outliers in the data, i.e., the square loss function is sensitive to outliers in the data, which may greatly affect the accuracy of feature selection.

Disclosure of Invention

The embodiment of the invention provides a method, a device, equipment and a medium for training and object classification of a feature selection model, so as to realize the effect of high-accuracy feature selection under various data distributions.

In a first aspect, an embodiment of the present invention provides a method for training a feature selection model, which may include:

inputting a plurality of training samples and class marking results respectively corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;

and adjusting the network parameters in the objective function determination module according to an objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.

In a second aspect, an embodiment of the present invention further provides an object classification method, which may include:

obtaining the features of an object to be classified, and selecting target features from the features based on a feature selection model trained according to the training method in any embodiment of the invention;

and classifying the object to be classified according to the target features.

In a third aspect, an embodiment of the present invention further provides a training apparatus for a feature selection model, which may include:

the data input module is used for inputting a plurality of training samples and class marking results corresponding to the training samples into the feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;

and the model training module is used for adjusting the network parameters in the objective function determination module according to an objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.

In a fourth aspect, an embodiment of the present invention further provides an object classification apparatus, which may include:

the feature selection module is used for obtaining the features of an object to be classified, and selecting target features from the features based on a feature selection model trained according to the training method in any embodiment of the invention;

and the object classification module is used for classifying the object to be classified according to the target features.

In a fifth aspect, an embodiment of the present invention further provides an electronic device, where the electronic device may include:

one or more processors;

a memory for storing one or more programs;

when executed by one or more processors, the one or more programs cause the one or more processors to implement a method for training a feature selection model or a method for object classification as provided in any of the embodiments of the present invention.

In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for training the feature selection model or the method for classifying the object provided in any embodiment of the present invention.

According to the technical scheme of the embodiment of the invention, training samples, each comprising a plurality of sample features, and the class marking results respectively corresponding to the training samples are input into a feature selection model, wherein the training samples may comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, and the feature selection model is a model for selecting target features from the sample features. Because the loss function measurement unit in the feature selection model is constructed based on an adaptive loss function that can adapt to data measurement under various data distributions, the loss value of an outlier in the data is not amplified, and the network parameters can therefore be accurately adjusted based on the output of the objective function determination module in which the loss function measurement unit is located. In this technical scheme, the adaptive loss function, which is suitable for data measurement under various data distributions, guarantees the adjustment precision of the network parameters under various conditions; the feature selection model trained with the accurately adjusted network parameters improves the selection precision of the target features; and the robustness of feature selection is high.

Drawings

FIG. 1 is a flowchart of a method for training a feature selection model according to a first embodiment of the present invention;

FIG. 2a is a schematic diagram comparing the curve of the adaptive loss function when σ is 0.1 with the curves of the other norms in the training method of the feature selection model according to the first embodiment of the present invention;

FIG. 2b is a schematic diagram comparing the curve of the adaptive loss function when σ is 1 with the curves of the other norms in the training method of the feature selection model according to the first embodiment of the present invention;

FIG. 2c is a schematic diagram comparing the curve of the adaptive loss function when σ is 10 with the curves of the other norms in the training method of the feature selection model according to the first embodiment of the present invention;

FIG. 3 is a flowchart of a method for training a feature selection model according to a second embodiment of the present invention;

FIG. 4 is a flowchart of an object classification method according to a third embodiment of the present invention;

FIG. 5 is a block diagram of a training apparatus for a feature selection model according to a fourth embodiment of the present invention;

FIG. 6 is a block diagram of an object classification apparatus according to a fifth embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device in a sixth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

Fig. 1 is a flowchart of a training method for a feature selection model according to a first embodiment of the present invention. The present embodiment is applicable to a case where a target feature is selected from a plurality of sample features based on an adaptive loss function adaptable to various training samples in a model training process. The method may be performed by a training apparatus for feature selection models provided in the embodiments of the present invention, the apparatus may be implemented by software and/or hardware, and the apparatus may be integrated on an electronic device, and the electronic device may be various user terminals or servers.

Referring to fig. 1, the method of the embodiment of the present invention specifically includes the following steps:

S110, inputting a plurality of training samples and class marking results corresponding to the training samples into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features.

Each training sample may be data obtained by performing feature extraction on the training object corresponding to the training sample, and therefore each training sample may include a plurality of sample features. For example, the training object may be biometric data, image data, text data, voice data, etc., and accordingly, the training sample may be a sample related to biological feature processing, image feature processing, voice feature processing, text feature processing, etc. The feature extraction of the training object may be implemented by Scale-Invariant Feature Transform (SIFT), wavelet transform, Histogram of Oriented Gradients (HOG), Natural Language Processing (NLP), Fourier transform, etc.

Specifically, in the field of biological feature processing, the feature processing modes that may be involved include filtering, segmentation, artifact removal, independent component analysis, time domain analysis, frequency domain analysis, sequence alignment, and the like. Taking the processing of gene data as an example, after the gene data of a certain object is obtained, the gene data and the pre-stored gene data (hereinafter, may be referred to as pre-stored data) may be compared based on a sequence comparison algorithm, and a plurality of gene features may be obtained in the comparison process. Then, for example, when the pre-stored data is gene data of each category, such as human gene data, canine gene data, feline gene data, etc., the category of the object to which the gene data belongs, such as human, canine, feline, etc., can be determined according to the above-mentioned gene characteristics; further illustratively, when performing paternity test based on the gene data, it is possible to determine whether or not the object to which the gene data belongs and the object to which the prestored data belongs have a paternity relationship based on the above-mentioned gene characteristics; etc., and are not specifically limited herein.

In the field of image feature processing, possible related feature processing methods include SIFT, wavelet transform, HOG, image segmentation, morphological analysis, and the like. Taking the processing of the fingerprint data as an example, after the fingerprint data of a certain user is obtained, in general, the fingerprint data is a color image, and the color image is converted into a gray image and is subjected to normalization processing to obtain a normalized image; solving a transverse gray image and a longitudinal gray image of the normalized image based on a Sobel operator; detecting the singular point positions of fingerprints in the transverse gray level image and the longitudinal gray level image by using a singular point detection algorithm based on Poincare index; the direction field information is solved by using the variable-size template, then the direction field information is corrected according to the direction field consistency, finally the corrected direction field information is subjected to smooth filtering through a mean filtering algorithm to obtain the final direction field information, and the final direction field information can be used as the fingerprint characteristics of the fingerprint data. Besides, histogram equalization can be carried out on the color image, and the structure tensor of the color image is calculated according to the result of the histogram equalization; determining an eigenvector corresponding to the eigenvalue of the structure tensor; processing the feature vector based on an arc tangent formula to obtain a point direction field of each pixel point in the color image, wherein the point direction field can also be used as a fingerprint feature of fingerprint data; etc., which are not specifically limited herein. 
In subsequent applications, it may be determined whether the fingerprint data belongs to fingerprint data of a target object, fingerprint data of a target class, or the like, based on the fingerprint feature.

In the field of text feature processing, the feature processing modes that may be involved include word segmentation, text cleaning, normalization, NLP, and the like. Taking the processing of text data as an example, after certain text data is obtained, the text data may be processed to obtain a similarity matrix of text vectors, and an initial equivalent division threshold of each piece of text data may be obtained from each row of elements of the similarity matrix, so as to perform initial equivalent division on the text data and further determine an initial cluster number and initial cluster centers, where the initial cluster number and the initial cluster centers can be used as text features of the text data; in addition, an artificial fish swarm algorithm can be used in combination, in which the state of each artificial fish is updated according to the global optimal information and the local optimal information so that a global optimal cluster center can be found, and the global optimal cluster center can also be used as a text feature of the text data; and so on. In subsequent applications, whether the text data comes from a certain writer, the emotional tendency it reflects and the like can be judged according to the text features.

In the field of speech feature processing, possible involved feature processing methods include fourier transform, Mel cepstral coefficients, linear prediction coefficients, line spectral frequencies, and so on. Taking the processing of the voice data as an example, after some voice data is obtained, windowing processing can be performed on the voice data to obtain a voice frame, and the voice frame can be used as a voice feature of the voice data; in addition, a threshold value can be obtained from the voice frame, and voiced data in the voice data can be obtained according to the threshold value, and the voiced data can be used as voice characteristics of the voice data; and the like. In subsequent applications, the speech features may be compared with pre-stored speech features, and it is determined whether the segment of speech data originates from a target object, reflects a target emotion, or the like according to the comparison result.
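As a minimal sketch of the windowing and threshold steps described above, the following Python example splits a signal into overlapping Hamming-windowed frames and marks the frames whose short-time energy exceeds a threshold as voiced. The frame length, the hop size, the energy-based threshold rule and the names frame_signal and voiced_frames are illustrative assumptions and are not specified in this embodiment.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row k holds the sample indices of frame k.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def voiced_frames(frames, ratio=0.5):
    """Mark frames whose energy exceeds ratio * mean energy (assumed rule)."""
    energy = np.sum(frames ** 2, axis=1)
    threshold = ratio * energy.mean()
    return energy > threshold, threshold

# Hypothetical test signal: 0.1 s of silence followed by 0.1 s of a 200 Hz tone
# at an assumed sampling rate of 8000 Hz.
sample_rate = 8000
silence = np.zeros(800)
tone = np.sin(2 * np.pi * 200 * np.arange(800) / sample_rate)
signal = np.concatenate([silence, tone])

frames = frame_signal(signal, frame_len=200, hop=100)
mask, threshold = voiced_frames(frames)
print(frames.shape, mask.astype(int))
```

Either the windowed frames or the subset frames[mask] could then serve as the voice features mentioned above.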

The feature selection model is any model capable of completing a feature selection task, and target features selected based on the feature selection model can be used for realizing classification, regression, clustering and other tasks. Therefore, each training sample has a corresponding class labeling result, where the class labeling result is a class where the training sample really belongs in a feature selection task such as classification, regression, clustering, and the like, for example, a target feature selected based on a feature selection model is used to classify an image into a kitten, a puppy, and the like, and the class labeling result corresponding to a training sample of an image may be a kitten or some data representation corresponding to a kitten.

In practical applications, optionally, as an example, assume that the training samples are $\{x_1, x_2, \ldots, x_n\}$, where n is the number of training samples and each training sample comprises d sample features. The sample set formed by the training samples can be represented by a sample matrix

$X = [x_1; x_2; \ldots; x_n] \in \mathbb{R}^{n \times d}$,

in which the ith row of X represents the ith training sample $x_i \in \mathbb{R}^{d}$.

Assuming that the training samples relate to c classes in total, the labeling set formed by the class labeling results can be represented by a labeling matrix

$Y = [y_1; y_2; \ldots; y_n] \in \mathbb{R}^{n \times c}$,

in which $y_i \in \{0, 1\}^{c}$ is the one-hot coding of the class labeling result of the ith training sample, i.e., if $x_i$ belongs to the jth category of the c categories, then the jth element of $y_i$ is $y_{ij} = 1$

and the remaining elements are all 0.
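The sample matrix and the one-hot labeling matrix described above can be sketched in Python as follows. The toy values, the helper name one_hot and the use of 0-based class indices (the text counts categories from 1) are illustrative assumptions.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Build the n x c labeling matrix Y from 0-based integer class indices."""
    labels = np.asarray(labels)
    Y = np.zeros((labels.shape[0], num_classes), dtype=float)
    Y[np.arange(labels.shape[0]), labels] = 1.0  # one 1 per row, rest stay 0
    return Y

# Hypothetical toy data: n = 4 training samples, d = 3 sample features, c = 3 classes.
X = np.array([[0.2, 1.0, 3.1],
              [0.1, 0.9, 2.8],
              [5.0, 0.3, 0.4],
              [2.2, 2.1, 0.0]])
labels = [0, 0, 2, 1]
Y = one_hot(labels, num_classes=3)

print(X.shape, Y.shape)
print(Y)
```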

S120, adjusting the network parameters in the objective function determination module according to an objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.

After the training samples and the class labeling results are input to the feature selection model, the feature selection model can perform class prediction on the training samples to obtain class prediction results, at this time, a loss function measurement unit in the objective function determination module can measure the class prediction results and the class labeling results belonging to the same training sample, and then an objective function value of the objective function determination module can be calculated according to the measurement results and other factors, wherein the other factors can be factors related to feature selection, and the objective function value can be an adjustment basis of network parameters in the objective function determination module. After the network parameters are adjusted, the adjusted network parameters can be used for feature selection (because the network parameters are involved in the selection process of the feature selection unit related to feature selection in the objective function determination module), and can also be used for completing the task of feature selection such as classification, regression, clustering, and the like.

It should be noted that the loss function measurement unit may be a calculation unit constructed based on an adaptive loss function. In the application scenarios that may be involved in the embodiments of the present invention, each training object is likely to contain data under multiple data distributions at the same time; for example, a certain training object may contain some data conforming to a Gaussian distribution and some data conforming to a Laplace distribution. A loss function constructed based on a single norm can hardly measure all such data well: for example, the squared error loss function based on the $\ell_2$ norm is sensitive to outliers in data under the Laplace distribution because its penalty is squared, while the error loss function based on the $\ell_1$ norm can hardly characterize data under the Gaussian distribution accurately.
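The sensitivity described above can be illustrated with a small numerical sketch. The residual values are hypothetical, and the comparison only shows how much of the total loss a single outlier contributes under a squared penalty versus an absolute penalty.

```python
import numpy as np

# Hypothetical residuals (prediction minus label); the last entry is an outlier.
residuals = np.array([0.1, -0.2, 0.15, -0.05, 8.0])

squared = residuals ** 2      # l2-norm-based squared error penalty
absolute = np.abs(residuals)  # l1-norm-based penalty

# Share of the total loss contributed by the outlier alone.
squared_share = squared[-1] / squared.sum()
absolute_share = absolute[-1] / absolute.sum()

print(f"outlier share of squared loss:  {squared_share:.4f}")
print(f"outlier share of absolute loss: {absolute_share:.4f}")

# For small residuals the situation reverses: the absolute penalty is the larger one.
print(squared[0] < absolute[0])
```

Under the squared penalty the outlier accounts for almost the entire loss, while under the absolute penalty the small residuals are penalized relatively more heavily, which is the trade-off the adaptive loss function is intended to balance.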

In order to accurately characterize data under various data distributions, the adaptive loss function proposed in the embodiment of the present invention includes a first expression related to a first norm and a second expression related to a second norm. The adaptive loss function is a loss function obtained by fusing the first norm and the second norm, which are norms respectively suited to data under different data distributions; illustratively, the first norm may be a norm suited to data under a Laplace distribution and the second norm may be a norm suited to data under a Gaussian distribution, or vice versa. The adaptive loss function can accurately measure the difference between predicted values and true values of data under various data distributions and is not sensitive to the loss values of outliers in the data under a given data distribution, so that Gaussian noise and Laplace noise present in the data can be overcome simultaneously. Accurate calculation of the objective function value is an important basis for accurately adjusting the network parameters under various data distributions; the accurately adjusted network parameters improve the learning performance of the feature selection model and the accuracy of subsequently selecting target features from the sample features, and the robustness of feature selection is high.

It should be noted that, as described above, the network parameter is involved in the selection process of the feature selection unit related to feature selection in the objective function determination module, so that the target feature can be accurately selected from the sample features according to the accurately adjusted network parameter, where the target feature may be a sample feature that has a large influence on the remaining tasks, for example, a sample feature that contributes to each of the classification subtasks in the plurality of classification tasks. The importance of each sample feature is automatically calculated in the model training process, and the target feature is selected from the multiple sample features according to the importance, so that redundant and irrelevant features are deleted from the multiple sample features, and the implementation performance of the subsequent other tasks is improved.
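The paragraph above does not state how the importance of each sample feature is computed. As a hedged sketch, the following Python example assumes a hypothetical d x c coefficient matrix W with one row per sample feature and uses the $\ell_2$ norm of each row as the importance score, which is a common convention for row-sparse linear models and is not confirmed by this passage. The helper name select_top_k_features and the values of W are illustrative.

```python
import numpy as np

def select_top_k_features(W, k):
    """Rank features by the l2 norm of their row in W and return the top-k indices.

    Using the row norm as the importance score is an assumption for illustration.
    """
    importance = np.linalg.norm(W, axis=1)
    order = np.argsort(-importance, kind="stable")  # descending importance
    return order[:k], importance

# Hypothetical coefficients: d = 4 sample features, c = 2 classes.
W = np.array([[0.9, -0.4],
              [0.0, 0.01],
              [0.3, 0.3],
              [-1.2, 0.5]])

selected, importance = select_top_k_features(W, k=2)
print(selected, importance)
```

Features whose rows are close to zero (here the second feature) contribute little to any class prediction and would be treated as redundant or irrelevant under this assumption.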

According to the technical scheme of the embodiment of the invention, training samples comprising a plurality of sample characteristics and class marking results respectively corresponding to the training samples are input into a characteristic selection model, wherein the training samples can comprise samples related to biological characteristic processing, image characteristic processing, voice characteristic processing and/or text characteristic processing, and the characteristic selection model is a model for selecting target characteristics from the characteristics of the samples; because the loss function measurement unit in the feature selection model is constructed based on the adaptive loss function which can adapt to data measurement under various data distributions, the loss value of an abnormal point in data cannot be amplified, and therefore, the network parameters can be accurately adjusted based on the output result of the objective function determination module where the loss function measurement unit is located. According to the technical scheme, the adjustment precision of the network parameters under various conditions is ensured through the adaptive loss function which can adapt to data measurement under various data distributions, the selection precision of the target characteristics is improved through the characteristic selection model obtained based on the accurately adjusted network parameter training, and the robustness of characteristic selection is high.

In practical applications, optionally, a fusion parameter for fusing the first norm and the second norm may be set in the first expression and/or the second expression, and the specific value of the fusion parameter determines which norm the fused adaptive loss function is biased toward. The reason is that, in practical applications, the distribution of the data in a given sample object is unknown: the data may conform to the Gaussian distribution, conform to the Laplace distribution, or even include both data conforming to the Gaussian distribution and data conforming to the Laplace distribution. Therefore, the balance between the Gaussian distribution and the Laplace distribution can be adjusted by setting the fusion parameter, so that the adaptive loss function can adapt to the data distribution in each situation.

To improve the accuracy of feature selection, a set of fusion parameters may be preset, some of which bias the adaptive loss function toward the norm suited to the Gaussian distribution and some of which bias it toward the norm suited to the Laplace distribution. On this basis, each fusion parameter corresponds to one adaptive loss function, that is, a corresponding feature selection model can be trained for each fusion parameter. Then, according to the application results (such as classification results, regression results or clustering results) of the differently trained feature selection models, it can be determined which value of the fusion parameter adapts best to this kind of data, and that value can then be used for this kind of data.
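The selection procedure described above can be sketched as a loop over candidate fusion parameters. The example below is a toy under stated assumptions: the adaptive loss is assumed to take the form $(1+\sigma) r^2 / (|r| + \sigma)$ per residual r, the "model" is a single center value found by brute-force search rather than a full feature selection model, and the application result is the negative mean absolute error on held-out data. The function names, the data and the candidate values are all illustrative.

```python
import numpy as np

def adaptive_loss(residuals, sigma):
    """Assumed adaptive loss: sum of (1 + sigma) * r^2 / (|r| + sigma), sigma > 0."""
    r = np.abs(residuals)
    return np.sum((1.0 + sigma) * r ** 2 / (r + sigma))

def train(train_data, sigma, grid):
    """Toy training: pick the center on a grid that minimizes the adaptive loss."""
    losses = [adaptive_loss(train_data - c, sigma) for c in grid]
    return grid[int(np.argmin(losses))]

def select_fusion_parameter(candidates, train_data, valid_data, grid):
    """Train one model per candidate sigma and keep the best-scoring one."""
    scores = {}
    for sigma in candidates:
        center = train(train_data, sigma, grid)
        scores[sigma] = -np.mean(np.abs(valid_data - center))  # higher is better
    best = max(scores, key=scores.get)
    return best, scores

rng = np.random.default_rng(0)
# Hypothetical data: Gaussian inliers around 2.0 plus a cluster of outliers at 30.
train_data = np.concatenate([rng.normal(2.0, 0.5, 200), np.full(20, 30.0)])
valid_data = rng.normal(2.0, 0.5, 100)
grid = np.linspace(0.0, 10.0, 1001)
candidates = [0.1, 1.0, 10.0]

best_sigma, scores = select_fusion_parameter(candidates, train_data, valid_data, grid)
print(best_sigma, scores)
```

In a real setting the toy train function would be replaced by training the feature selection model, and the score by the classification, regression or clustering result on data of the kind in question.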

Alternatively, the curve of the adaptive loss function may lie between the curve of the first norm and the curve of the second norm. Such an adaptive loss function is a norm between the first norm and the second norm and has the advantages of both: it is robust to outliers in data under the Laplace distribution and can effectively learn normal data under the Gaussian distribution. Illustratively, the first norm may be the $\ell_{2,1}$ norm and/or the second norm may be the $\ell_F$ norm, and of course vice versa.

On the basis of the technical solutions, in consideration of application scenarios possibly involved in the embodiments of the present invention, a d-dimensional vector is used

It l₁Norm sum l₂Norm is defined as

And

z_irepresenting the ith element in z. Based on l₂The norm squared error loss function is insensitive to small losses but very sensitive to outliers, since outliers with large losses will dominate the objective function, thus having a large impact on the learning performance of the feature selection model. Although the use is based on l₁The loss function of norm is insensitive to outliers, but it is sensitive to small losses (compared to l)₂Norm, which gives a relatively large penalty for small losses). In other words, |₁The norm may be suitable for data delineation under Laplace distribution, and l₂The norm may be suitable for data delineation under a gaussian distribution. In general, if the correct feature selection model has been selected, most of the data may have less loss to fit the model, while only a small amount of data has greater loss to fit the model, which may be considered outliers under the model. In practical application, the data may contain data partially conforming to Gaussian distribution and partially conforming to Laplacian distributionData that is gaussian distributed, i.e., a small loss of most data can be assumed to be gaussian distributed, while a large loss of some data is assumed to be laplace distributed. Based on such considerations, the vector

The adaptive loss function of (1) can be set at₁Norm sum l₂Between norms, an alternative arrangement is shown in equation (1):

wherein z is_iIs the ith element of z, σ is a fusion parameter, which is for the pair l₁Norm or is₂And (5) parameters for approximating the norm. The adaptive loss function may be smoothly intermediate between l₁Norm sum l₂Between norms, FIGS. 2 a-2 c show the adaptive loss function and/for different values of σ₁Norm sum l₂The difference of norm, from which it is clear that the curve of the adaptive loss function lies between l₁Curve sum of norm l₂Between the curves of the norm. Meanwhile, such an adaptive loss function has the following properties:

1. ‖z‖_σ is a non-negative convex function;

2. ‖z‖_σ is twice differentiable;

3. When σ tends to 0, the adaptive loss function ‖z‖_σ tends to the l₁ norm ‖z‖₁;

4. When σ tends to ∞, the adaptive loss function ‖z‖_σ tends to the l₂ norm ‖z‖₂.
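The following is a minimal sketch that assumes equation (1) takes the form ‖z‖_σ = Σ_i (1+σ)·z_i² / (|z_i| + σ). This form is an assumption chosen because it has the properties listed above; it is not quoted from the original formula. The function name `adaptive_loss_vector` is hypothetical. Under this assumed form, the large-σ limit evaluates to the squared l₂ norm, which is what the check below compares against.

```python
import numpy as np

def adaptive_loss_vector(z, sigma):
    """Assumed form: sum_i (1 + sigma) * z_i**2 / (|z_i| + sigma), with sigma > 0."""
    a = np.abs(np.asarray(z, dtype=float))
    return float(np.sum((1.0 + sigma) * a**2 / (a + sigma)))

z = np.array([0.1, -0.5, 3.0])
l1 = float(np.sum(np.abs(z)))          # l1 norm of z
l2_sq = float(np.sum(z**2))            # squared l2 norm of z

small = adaptive_loss_vector(z, 1e-9)  # sigma close to 0
large = adaptive_loss_vector(z, 1e9)   # sigma very large
print(small, l1)
print(large, l2_sq)
```

For intermediate values of σ the large residual 3.0 is penalized less than under the squared l₂ loss, while the small residual 0.1 is penalized less than under the l₁ loss.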

In practical data mining, fitting the multi-dimensional targets of the training samples often requires measuring the loss of a matrix, so the adaptive loss function of a vector can be extended to an adaptive loss function of a matrix. Specifically, for a matrix Z, its l_2,1 norm (the matrix extension of the l₁ norm) and its l_F norm (the matrix extension of the l₂ norm) are respectively defined as

‖Z‖_2,1 = Σ_i ‖zⁱ‖₂

and

‖Z‖_F = sqrt(Σ_i ‖zⁱ‖₂²)

where zⁱ denotes the i-th row vector of Z. On this basis, the adaptive loss function of the matrix Z can be set between the l_2,1 norm and the l_F norm; an optional form is shown in equation (2):

In combination with the application scenarios that may be involved in the embodiments of the present invention, the variables above have the following meanings: σ is the fusion parameter, ‖zⁱ‖₂ + σ is the first expression,

is the second expression, Z is the difference between the class prediction result of the training samples predicted by the feature selection model and the class labeling result corresponding to the class prediction result, zⁱ is the difference corresponding to the i-th training sample, and n is the number of training samples. Obviously, when the matrix Z degenerates to a vector z, equation (2) simplifies to equation (1), i.e., the adaptive loss function of the vector z is a special case of the adaptive loss function of the matrix Z. Like the adaptive loss function of the vector, the adaptive loss function of the matrix has the following properties:

1. ‖Z‖_σ is a non-negative convex function;

2. ‖Z‖_σ is twice differentiable;

3. When σ tends to 0, the adaptive loss function ‖Z‖_σ tends to the l_2,1 norm ‖Z‖_2,1;

4. When σ tends to ∞, the adaptive loss function ‖Z‖_σ tends to the l_F norm ‖Z‖_F.
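The matrix case can be sketched in the same way. The snippet assumes equation (2) takes the form ‖Z‖_σ = Σ_i (1+σ)·‖zⁱ‖₂² / (‖zⁱ‖₂ + σ), whose denominator is the first expression named above; the numerator is an assumption, and the function names are hypothetical. Under this assumed form the large-σ limit evaluates to the squared l_F norm.

```python
import numpy as np

def l21_norm(Z):
    """Sum of the l2 norms of the rows of Z."""
    return float(np.sum(np.linalg.norm(Z, axis=1)))

def frobenius_norm(Z):
    """Square root of the sum of squared entries of Z."""
    return float(np.sqrt(np.sum(Z**2)))

def adaptive_loss_matrix(Z, sigma):
    """Assumed form: sum_i (1 + sigma) * r_i**2 / (r_i + sigma), r_i = ||z^i||_2."""
    r = np.linalg.norm(Z, axis=1)
    return float(np.sum((1.0 + sigma) * r**2 / (r + sigma)))

Z = np.array([[3.0, 4.0],
              [0.0, 0.5],
              [1.0, 0.0]])            # row norms: 5, 0.5, 1

print(adaptive_loss_matrix(Z, 1e-9), l21_norm(Z))            # sigma close to 0
print(adaptive_loss_matrix(Z, 1e9), frobenius_norm(Z) ** 2)  # sigma very large

# A single-column matrix reduces to the vector case.
z = np.array([0.1, -0.5, 3.0])
vector_value = float(np.sum(1.7 * z**2 / (np.abs(z) + 0.7)))
matrix_value = adaptive_loss_matrix(z[:, None], 0.7)
```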

On this basis, optionally, in view of the application scenarios that may be involved in the embodiments of the present invention, an optional value of Z is Z = XW + 1b^T − Y, where XW + 1b^T is the class prediction result, Y is the class labeling result, X is the training samples, W and b are the network parameters, W is the regression coefficient, and b is the bias. Specifically, W can be regarded as the regression coefficient matrix, b can be regarded as the bias vector, and 1 is an all-ones vector of size n. XW + 1b^T means that the feature selection model obtains the class prediction result by regressing the training samples onto the class labeling results through W and b.
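As a small illustration of Z = XW + 1b^T − Y, the sketch below builds the residual matrix with NumPy. The shapes (X of size n × d with one training sample per row, W of size d × c, b of length c, Y of size n × c) and the one-hot encoding of Y are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 5, 4, 3                          # samples, features, classes (illustrative)

X = rng.normal(size=(n, d))                # training samples, one per row
Y = np.eye(c)[rng.integers(0, c, size=n)]  # one-hot class labels (assumed encoding)
W = rng.normal(size=(d, c))                # regression coefficients
b = rng.normal(size=c)                     # bias vector

ones = np.ones((n, 1))                     # all-ones vector of size n
prediction = X @ W + ones @ b[None, :]     # XW + 1 b^T
Z = prediction - Y                         # row i is the residual of sample i
print(Z.shape)
```

In NumPy the product `ones @ b[None, :]` is equivalent to broadcasting, so `X @ W + b` gives the same prediction.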

Example two

Fig. 3 is a flowchart of a training method for a feature selection model according to a second embodiment of the present invention. The present embodiment is optimized based on the above technical solutions. In this embodiment, optionally, the objective function determining module further includes a feature selecting unit constructed based on a sparse regularization term, where the sparse regularization term is a norm related to a network parameter; and the feature selection model is specifically used for determining output results which are output by the feature selection unit and respectively correspond to the features of the samples according to the adjusted network parameters, and selecting target features from the features of the samples according to the output results. The same or corresponding terms as those in the above embodiments are not explained in detail herein.

Referring to fig. 3, the method of this embodiment may specifically include the following steps:

S210, inputting a plurality of training samples and class labeling results respectively corresponding to the training samples into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, and each training sample comprises a plurality of sample features.

S220, adjusting network parameters in an objective function determination module according to objective function values output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function and a feature selection unit constructed based on a sparse regularization term, the adaptive loss function comprises a first formula related to a first norm and a second formula related to a second norm, and the sparse regularization term is the norm related to the network parameters; and the feature selection model is specifically used for determining output results which are output by the feature selection unit and respectively correspond to the features of the samples according to the adjusted network parameters, and selecting target features from the features of the samples according to the output results.

The feature selection unit is a unit for selecting features that is constructed based on a sparse regularization term, where the sparse regularization term is a norm related to the network parameters. After the feature selection model receives the training samples and the class labeling result corresponding to each training sample, the loss function metric unit can calculate a loss function metric value from the training samples, the class labeling results and the network parameters, and the feature selection unit can calculate a feature selection value from the network parameters; the result of combining the loss function metric value and the feature selection value can then be used as the objective function value output by the objective function determination module. In this way, the feature selection process is integrated into the model training process, and both the loss function metric value, which relates to loss calculation, and the feature selection value, which relates to feature selection, serve as a basis for adjusting the network parameters.

In practical applications, the sparse regularization term may be the l₁ norm; for example, the l₁-SVM, a method that produces sparse solutions, uses an l₁-norm regularization term to select features (the sample features corresponding to non-zero elements are the target features). The sparse regularization term may also be the l_2,1 norm; for example, an l_2,1-norm regularization term can be used for feature selection coupled across tasks. The l_2,1-norm regularization term selects features that are jointly sparse in the data: if a sample feature contributes strongly to every classification subtask of a multi-class classification task, the sample feature can be retained as a target feature, which produces a group-sparsity effect similar to that of the group lasso method. The sparse regularization term may also be the l_2,1 norm related to the regression coefficient W; for example, in view of the application scenarios that may be involved in the embodiments of the present invention, the feature selection unit constructed based on the l_2,1 norm related to the regression coefficient W may be λ‖W‖_2,1. The l_2,1 norm acting on W produces a row-sparsity effect, i.e., many rows of the finally optimized W consist entirely of zeros, and the non-zero rows correspond to the retained target features, which realizes feature selection. λ is a regularization parameter that balances the weights of the loss function metric unit and the feature selection unit and also controls the row sparsity of W under the l_2,1 norm: the larger λ is, the more rows of W are compressed to 0.

On this basis, the objective function determination module can be expressed by the following formula (3). The adjustment of the network parameters can be an iterative optimization process, and min means that, among the different values that W and b take during the iterations, the values of W and b that minimize the objective in formula (3) are selected.

min_W,b ‖XW + 1b^T − Y‖_σ + λ‖W‖_2,1    (3)
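A minimal sketch of evaluating the objective in formula (3) for given W and b follows. It assumes the adaptive loss takes the form Σ_i (1+σ)·‖zⁱ‖₂² / (‖zⁱ‖₂ + σ), which is an assumption, and the function name `objective_value` is hypothetical.

```python
import numpy as np

def objective_value(X, Y, W, b, sigma, lam):
    """Adaptive loss of Z = XW + 1b^T - Y (assumed form) plus lam * ||W||_{2,1}."""
    Z = X @ W + b - Y                    # broadcasting b over rows equals 1 b^T
    r = np.linalg.norm(Z, axis=1)        # per-sample residual norms
    loss = np.sum((1.0 + sigma) * r**2 / (r + sigma))
    reg = lam * np.sum(np.linalg.norm(W, axis=1))
    return float(loss + reg)

X = np.eye(2)
Y = np.zeros((2, 2))
W = np.array([[3.0, 4.0],
              [0.0, 0.0]])               # second row is zero: feature 2 is unused
b = np.zeros(2)

value = objective_value(X, Y, W, b, sigma=1.0, lam=0.5)
print(value)
```

Here the residual norms are 5 and 0, so the loss part is 2·25/6 and the regularization part is 0.5·5.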

It should be noted that the output result output by the feature selection unit for each sample feature may be a row of an adjusted network parameter: each row of the adjusted network parameter may correspond to one sample feature, so the output result can be regarded as a row vector of the matrix. Target features can therefore be selected from the sample features according to the output results respectively corresponding to the sample features. For example, the sample features corresponding to rows containing non-zero elements can be taken as the target features; alternatively, the output results can be sorted, and the target features can be selected according to the preset number of target features to be selected and the sorting result; and so on, which is not specifically limited herein.
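The two selection rules described above can be sketched as follows: keep the features whose rows are non-zero, or rank the rows by their l₂ norm and keep a preset number k. The function name `select_features` and the tolerance `tol` used to decide that a row is zero are illustrative assumptions.

```python
import numpy as np

def select_features(W, k=None, tol=1e-8):
    """Return indices of selected features from the rows of W.

    k=None: keep every feature whose row norm exceeds tol (non-zero rows).
    k=int : keep the k features with the largest row norms.
    """
    row_norms = np.linalg.norm(W, axis=1)
    if k is None:
        return np.flatnonzero(row_norms > tol)
    order = np.argsort(-row_norms, kind="stable")   # descending by row norm
    return np.sort(order[:k])

W = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [0.0, 0.0],
              [0.3, 0.0]])

nonzero_rows = select_features(W)        # features with non-zero rows
top_one = select_features(W, k=1)        # single strongest feature
print(nonzero_rows, top_one)
```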

According to the technical solution of this embodiment of the present invention, the feature selection unit is constructed from a sparse regularization term related to the network parameters. The sparse regularization term produces a sparsity effect on the network parameters, so the target features can be accurately selected from the sample features according to the output results of the feature selection unit.

In an optional technical solution, adjusting the network parameters in the objective function determination module according to the objective function value output by the objective function determination module in the feature selection model may include: inputting the network parameters into the feature selection model, and determining whether the objective function value output by the objective function determination module in the feature selection model has converged; if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to the adjustment result; and repeating the step of inputting the network parameters into the feature selection model until the objective function value converges, at which point the adjustment of the network parameters is finished. The parameter adjustment data may be intermediate variables related to the network parameters that are introduced to simplify the calculation and that can be used to update the network parameters.
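The control flow of this optional solution can be sketched as a generic loop that stops once the objective function value no longer changes. The names `train_until_converged`, `objective` and `update`, the relative tolerance, and the toy quadratic objective are all illustrative assumptions; the actual objective and update rule are those of the feature selection model.

```python
def train_until_converged(params, objective, update, tol=1e-6, max_iter=1000):
    """Repeat update(params) until the objective value stops changing."""
    prev = objective(params)
    for it in range(1, max_iter + 1):
        params = update(params)            # adjust parameters from the current ones
        cur = objective(params)
        if abs(prev - cur) <= tol * max(1.0, abs(prev)):
            return params, cur, it         # converged
        prev = cur
    return params, prev, max_iter          # stopped at the iteration limit

# Toy example: minimize (w - 3)^2 with a fixed-step gradient update.
objective = lambda w: (w - 3.0) ** 2
update = lambda w: w - 0.25 * 2.0 * (w - 3.0)

w_final, value, iterations = train_until_converged(0.0, objective, update)
print(w_final, value, iterations)
```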

To better understand the specific implementation of the above steps, the training method of the feature selection model of this embodiment is described below by taking formula (3) as an example. Optionally, taking the derivative of formula (3) yields formula (4):

where D is a diagonal matrix whose i-th diagonal element is

Wⁱ is the i-th row of W; let

and d_i be the parameter adjustment data (i.e., intermediate variables). Because the adaptive loss function is convex and differentiable, the derivative of formula (3), i.e., the whole of formula (4), equals 0 at the optimum; setting formula (4) to zero yields formula (5), from which the expressions of W and b can be obtained:

Because

and d_i both depend on the variable W to be optimized, optimizing these variables directly is very difficult; they can therefore be optimized with an iterative reweighting algorithm. The input data may include a sample matrix X composed of a plurality of training samples, a labeling matrix Y composed of the class labeling results, the number k of target features to be selected, and the regularization parameter λ; the output data may include the k target features. The iterative reweighting algorithm proceeds as follows:

________________________________________________________

1: t = 0 (t is the iteration counter)

2: initialize W_t = I (I is an identity matrix of size n × d) and b_t = 1

3: repeat (repeatedly execute the following steps)

4: compute the parameter adjustment data (the intermediate variables) from W_t and b_t

5: update W_t+1 and b_t+1 according to equation (5)

6: until convergence (until the objective function value converges)

7: sort the rows of W by ‖Wⁱ‖₂ in descending order, and take the sample features corresponding to the first k rows as the target features

_______________________________________________
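The listing above can be sketched in NumPy under several stated assumptions. The sketch assumes the adaptive loss Σ_i (1+σ)·r_i² / (r_i + σ) with r_i = ‖zⁱ‖₂, derives per-sample weights s_i = (1+σ)(r_i + 2σ) / (2(r_i + σ)²) and row weights d_i = 1 / (2‖Wⁱ‖₂) in the usual iteratively reweighted least-squares manner, and solves the resulting weighted ridge problem in closed form. These weights, the closed-form update, the ridge-style initialization (used instead of step 2), and the name `iterative_reweighting` are assumptions and may differ from formulas (4) and (5).

```python
import numpy as np

def iterative_reweighting(X, Y, k, lam=1.0, sigma=1.0, max_iter=100,
                          tol=1e-8, eps=1e-10):
    n, d = X.shape
    # Assumed initialization: ridge regression start (not the one in step 2).
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
    b = Y.mean(axis=0) - W.T @ X.mean(axis=0)
    prev = np.inf
    for _ in range(max_iter):
        # Step 4: intermediate variables from the current W and b.
        r = np.linalg.norm(X @ W + b - Y, axis=1)
        s = (1 + sigma) * (r + 2 * sigma) / (2 * (r + sigma) ** 2)
        dvec = 1.0 / (2.0 * np.maximum(np.linalg.norm(W, axis=1), eps))
        # Step 5: closed-form solution of the reweighted problem
        #   sum_i s_i ||W^T x_i + b - y_i||^2 + lam * sum_i d_i ||W^i||^2
        x_bar = s @ X / s.sum()
        y_bar = s @ Y / s.sum()
        Xc, Yc = X - x_bar, Y - y_bar
        A = Xc.T @ (s[:, None] * Xc) + lam * np.diag(dvec)
        W = np.linalg.solve(A, Xc.T @ (s[:, None] * Yc))
        b = y_bar - W.T @ x_bar
        # Step 6: stop when the objective value no longer changes.
        r = np.linalg.norm(X @ W + b - Y, axis=1)
        obj = (np.sum((1 + sigma) * r**2 / (r + sigma))
               + lam * np.sum(np.linalg.norm(W, axis=1)))
        if abs(prev - obj) <= tol * max(1.0, abs(obj)):
            break
        prev = obj
    # Step 7: rank features by row norm and keep the first k.
    selected = np.argsort(-np.linalg.norm(W, axis=1))[:k]
    return np.sort(selected), W, b

# Synthetic check: only features 0 and 1 determine the class label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[labels]                      # one-hot labeling matrix
selected, W, b = iterative_reweighting(X, Y, k=2)
print(selected)
```

On this synthetic data the rows of W for the two informative features have much larger norms than the remaining rows, so the ranking in step 7 recovers them.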

Example three

Fig. 4 is a flowchart of an object classification method provided in the third embodiment of the present invention. This embodiment is applicable to the case where an object to be classified is classified based on target features selected by a pre-trained feature selection model. The method may be performed by the object classification apparatus provided in an embodiment of the present invention; the apparatus may be implemented in software and/or hardware and may be integrated in an electronic device, which may be a terminal or a server.

Referring to fig. 4, the method of the embodiment of the present invention specifically includes the following steps:

S310, obtaining features of the object to be classified, and selecting target features from the features based on a feature selection model trained according to the training method in any embodiment of the present invention.

The object to be classified is an object that needs to be classified, such as biological data, image data, text data or voice data to be classified. The features are obtained by performing feature extraction on the object to be classified; as described in the first embodiment of the present invention, feature extraction can be implemented in various ways, which are not described again here.

After the features of the object to be classified are obtained, target features can be screened out from the features based on a feature selection model obtained through pre-training, wherein the target features are features with high contribution degree to a classification task of the object to be classified.

S320, classifying the object to be classified according to the target features.

The object to be classified can be classified according to the target features in various ways. For example, the target features can be processed further, and the classification result of the object to be classified can be obtained from the processing result; alternatively, the target features can be input into a pre-trained object classification model, and the classification result of the object to be classified can be obtained from the output of the object classification model; and so on, which is not specifically limited herein.
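As a hedged illustration of S310 and S320, the sketch below keeps only the selected feature columns and classifies the object with a nearest-centroid rule. The nearest-centroid classifier is an illustrative stand-in for the pre-trained object classification model, and the function names, the index list `selected` and the toy data are assumptions.

```python
import numpy as np

def fit_centroids(X_train, labels, selected):
    """Per-class mean of the selected features (stand-in classification model)."""
    classes = np.unique(labels)
    centroids = np.stack([X_train[labels == cls][:, selected].mean(axis=0)
                          for cls in classes])
    return classes, centroids

def classify(x, selected, classes, centroids):
    """Assign x to the class whose centroid is nearest on the selected features."""
    dist = np.linalg.norm(centroids - x[selected], axis=1)
    return classes[np.argmin(dist)]

# Feature 1 is noise; features 0 and 2 are the selected target features.
X_train = np.array([[0.0,  9.0, 0.0],
                    [0.2, -7.0, 0.1],
                    [5.0,  8.0, 5.0],
                    [5.2, -9.0, 4.9]])
labels = np.array([0, 0, 1, 1])
selected = np.array([0, 2])

classes, centroids = fit_centroids(X_train, labels, selected)
pred_a = classify(np.array([4.8, 100.0, 5.1]), selected, classes, centroids)
pred_b = classify(np.array([0.1, -50.0, 0.0]), selected, classes, centroids)
print(pred_a, pred_b)
```

Because the noisy feature is excluded, its extreme values (100 and −50) do not affect the predicted class.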

It should be noted that the above technical solution can be applied in many fields. For example, in the biological field, continuing with the gene data described above, a piece of gene data can be classified, according to its gene target features (i.e., the target features obtained by performing feature selection on the gene features), as human, canine or feline gene data, or as gene data of an object that has or does not have a relationship with a target object. In the image field, continuing with the fingerprint data described above, a piece of fingerprint data can be classified, according to its fingerprint target features, as belonging or not belonging to a target object, or as belonging or not belonging to a target class, where the target class may be human, animal, etc. In the text field, continuing with the text data described above, a piece of text data can be classified, according to its text target features, as text from writer A, writer B or writer C, and the emotional tendency reflected by the text data can be classified as happiness, pain or sadness. In the voice field, continuing with the voice data described above, a piece of voice data can be classified, according to its voice target features, as belonging or not belonging to a target object, and the emotion expressed by the voice data can be classified as belonging or not belonging to a target emotion. Of course, the above technical solution can also be applied to other fields, which is not specifically limited herein.

According to the technical solution of this embodiment of the present invention, the features of the object to be classified are obtained, and the target features that contribute strongly to the subsequent classification task are selected from these features based on the trained feature selection model; the object to be classified is then classified according to the target features to obtain its classification result. Because the target features that contribute strongly to the classification task are selected by the feature selection model before classification, the classification accuracy of the object to be classified is improved.

Example four

Fig. 5 is a block diagram of a feature selection model training apparatus according to a fourth embodiment of the present invention, which is configured to execute a feature selection model training method according to any of the embodiments described above. The device and the training method of the feature selection model of the embodiments belong to the same inventive concept, and details which are not described in detail in the embodiment of the training device of the feature selection model can refer to the embodiment of the training method of the feature selection model. Referring to fig. 5, the apparatus may specifically include: a data input module 410 and a model training module 420.

The data input module 410 is configured to input a plurality of training samples and class labeling results corresponding to each training sample into a feature selection model, where the training samples include samples related to biological feature processing, image feature processing, voice feature processing, and/or text feature processing, each training sample includes a plurality of sample features, and the feature selection model is configured to select a target feature from the sample features;

and a model training module 420, configured to adjust the network parameters in an objective function determination module according to the objective function value output by the objective function determination module in the feature selection model, where the objective function determination module includes a loss function metric unit constructed based on an adaptive loss function, and the adaptive loss function includes a first expression related to the first norm and a second expression related to the second norm.

Optionally, the curve of the adaptive loss function is a curve between the curve of the first norm and the curve of the second norm; and/or a fusion parameter for fusing the first norm and the second norm is set in the first expression and/or the second expression; and/or the first norm comprises the l_2,1 norm and/or the second norm comprises the l_F norm.

Optionally, the adaptive loss function is represented by the following formula:

where σ is the fusion parameter, ‖zⁱ‖₂ + σ is the first expression,

is the second expression, Z is the difference between the class prediction result of the training samples predicted by the feature selection model and the class labeling result corresponding to the class prediction result, zⁱ is the difference corresponding to the i-th training sample, and n is the number of training samples.

Optionally, Z = XW + 1b^T − Y, where XW + 1b^T is the class prediction result, Y is the class labeling result, X is the training samples, W and b are the network parameters, W is the regression coefficient, and b is the bias.

Optionally, the objective function determination module further includes a feature selection unit constructed based on a sparse regularization term, where the sparse regularization term is a norm related to the network parameters; and the feature selection model is specifically used for determining, according to the adjusted network parameters, the output results that are output by the feature selection unit and respectively correspond to the sample features, and selecting target features from the sample features according to the output results.

On this basis, optionally, the sparse regularization term is the l_2,1 norm related to the regression coefficient W.

Optionally, the model training module 420 may be specifically configured to:

inputting the network parameters into the feature selection model, and determining whether the objective function value output by the objective function determination module in the feature selection model is converged; if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to the adjustment result; and repeating the step of inputting the network parameters into the feature selection model until the objective function value is converged and the network parameter adjustment is finished.

In the training apparatus for a feature selection model provided in the fourth embodiment of the present invention, the data input module inputs training samples, each including a plurality of sample features, and the class labeling result corresponding to each training sample into the feature selection model, where the training samples may include samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, and the feature selection model is a model for selecting target features from the sample features. In the model training module, the loss function metric unit in the feature selection model is constructed based on an adaptive loss function that adapts to data measurement under various data distributions, so the loss values of outliers in the data are not amplified, and the output of the objective function determination module based on this loss function metric unit can be used to adjust the network parameters accurately. Through this adaptive loss function, the apparatus ensures the adjustment accuracy of the network parameters under various conditions; the feature selection model trained with the accurately adjusted network parameters selects target features more accurately, so the feature selection is more robust.

The training device of the feature selection model provided by the embodiment of the invention can execute the training method of the feature selection model provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

It should be noted that, in the embodiment of the training apparatus for feature selection model, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

Example five

Fig. 6 is a block diagram of an object classification apparatus according to a fifth embodiment of the present invention, which is configured to execute the object classification method according to any of the embodiments described above. The apparatus and the object classification method of the above embodiments belong to the same inventive concept; for details not described in the embodiment of the object classification apparatus, reference may be made to the embodiments of the object classification method. Referring to fig. 6, the apparatus may specifically include: a feature selection module 510 and an object classification module 520, wherein:

a feature selection module 510, configured to obtain features of an object to be classified, and select a target feature from the features based on a feature selection model obtained by training according to the training method described in any embodiment of the present invention;

and an object classification module 520, configured to classify the object to be classified according to the target feature.

In the object classification apparatus provided by the fifth embodiment of the present invention, the feature selection module obtains the features of the object to be classified and, based on the trained feature selection model, selects from these features the target features that contribute strongly to the subsequent classification task; the object classification module then classifies the object to be classified according to the target features to obtain its classification result. Because the target features that contribute strongly to the classification task are selected by the feature selection model before classification, the apparatus improves the classification accuracy of the object to be classified.

The object classification device provided by the embodiment of the invention can execute the object classification method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

It should be noted that, in the embodiment of the object classification apparatus, each included unit and module are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

Example six

Fig. 7 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention. As shown in fig. 7, the electronic device includes a memory 610, a processor 620, an input device 630, and an output device 640. The number of processors 620 in the electronic device may be one or more, and one processor 620 is taken as an example in fig. 7; the memory 610, processor 620, input device 630, and output device 640 in the electronic device may be connected by a bus or in other ways, and connection by a bus 650 is taken as an example in fig. 7.

The memory 610 may be used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the training method of the feature selection model in the embodiment of the present invention (for example, the data input module 410 and the model training module 420 in the training device of the feature selection model), or program instructions/modules corresponding to the object classification method in the embodiment of the present invention (for example, the feature selection module 510 and the object classification module 520 in the object classification device). The processor 620 executes software programs, instructions and modules stored in the memory 610 to perform various functional applications and data processing of the electronic device, i.e., implementing the above-described training method of the feature selection model or the object classification method.

The memory 610 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 610 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 610 may further include memory located remotely from processor 620, which may be connected to devices through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 630 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the device. The output device 640 may include a display device such as a display screen.

Example seven

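A seventh embodiment of the present invention provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method for training a feature selection model, the method comprising:

inputting a plurality of training samples and class labeling results respectively corresponding to the training samples into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;

and adjusting the network parameters in an objective function determination module according to the objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function metric unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.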

Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the training method of the feature selection model provided by any embodiments of the present invention.

Example eight

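An eighth embodiment of the present invention provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method for object classification, the method comprising:

obtaining features of an object to be classified, and selecting target features from the features based on a feature selection model trained according to the training method in any embodiment of the present invention;

and classifying the object to be classified according to the target features.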

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. With this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method for training a feature selection model, comprising:

inputting a plurality of training samples and a class labeling result corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting a target feature from the sample features;

and adjusting the network parameters in an objective function determination module according to an objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.

2. The method of claim 1, wherein the curve of the adaptive loss function lies between the curve of the first norm and the curve of the second norm; and/or,

the first expression and/or the second expression are provided with fusion parameters for fusing the first norm and the second norm; and/or,

the first norm comprises the $\ell_{2,1}$ norm, and/or the second norm comprises the $\ell_F$ norm.

3. The method according to claim 1 or 2, characterized in that the adaptive loss function is represented by the following formula:

is the second expression, Z is the difference between the class prediction result of the training samples predicted by the feature selection model and the class labeling result corresponding to that class prediction result, $z^{i}$ is the difference corresponding to the i-th training sample, and n is the number of training samples.

4. The method according to claim 3, wherein $Z = XW + \mathbf{1}b^{T} - Y$, where $XW + \mathbf{1}b^{T}$ is the class prediction result, Y is the class labeling result, X is the training samples, W and b are the network parameters, W is a regression coefficient, and b is a bias.
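As an illustration only, the residual of claim 4 and one possible adaptive loss can be sketched in Python as follows. The formula of claim 3 is not reproduced in this text, so the form used in `adaptive_loss` is an assumption: it is a known adaptive loss from the literature, $\sum_{i}(1+\sigma)\lVert z^{i}\rVert_2^{2}/(\lVert z^{i}\rVert_2+\sigma)$, which approaches the $\ell_{2,1}$ norm as the parameter $\sigma$ tends to 0 and the squared Frobenius norm as $\sigma$ grows. The function names, the parameter `sigma`, and the sample data are hypothetical and do not come from the claims.

```python
import math

def residual(X, W, b, Y):
    """Z = XW + 1 b^T - Y for X (n x d), W (d x c), b (length c), Y (n x c)."""
    n, d, c = len(X), len(W), len(b)
    return [
        [sum(X[i][k] * W[k][j] for k in range(d)) + b[j] - Y[i][j] for j in range(c)]
        for i in range(n)
    ]

def row_norm(z):
    """Euclidean norm of one row z^i of Z."""
    return math.sqrt(sum(v * v for v in z))

def adaptive_loss(Z, sigma):
    """ASSUMED form, not taken from the claims; requires sigma > 0."""
    total = 0.0
    for z in Z:
        r = row_norm(z)
        total += (1.0 + sigma) * r * r / (r + sigma)
    return total

def l21_norm(Z):
    """Sum of the row norms of Z."""
    return sum(row_norm(z) for z in Z)

def frobenius_sq(Z):
    """Squared Frobenius norm of Z."""
    return sum(v * v for z in Z for v in z)

# Hypothetical data: three samples, two features, two classes.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
Y = [[1.0, 0.0], [0.0, 1.0], [4.0, 5.0]]  # the last label row acts as an outlier

Z = residual(X, W, b, Y)
loss_mid = adaptive_loss(Z, 1.0)
```

For this data the only nonzero residual row has norm 5, so the assumed loss at `sigma = 1.0` lies between the $\ell_{2,1}$ value and the squared Frobenius value, which is consistent with the behaviour described in claim 2.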

5. The method of claim 1, wherein the objective function determination module further comprises a feature selection unit constructed based on a sparse regularization term, wherein the sparse regularization term is a norm related to the network parameter;

the feature selection model is specifically configured to determine, according to the adjusted network parameters, output results output by the feature selection unit and corresponding to the sample features, respectively, and select a target feature from the sample features according to the output results.

6. The method of claim 5, wherein the sparse regularization term is an $\ell_{2,1}$ norm related to a regression coefficient W.
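A minimal sketch of how an $\ell_{2,1}$ term on W can drive feature selection is given below. It assumes that each row of W corresponds to one sample feature and that features are ranked by the Euclidean norm of their row, which is a common convention and is not stated in the claims. The names `row_norms`, `select_features`, and `k`, and the example matrix, are hypothetical.

```python
import math

def row_norms(W):
    """Euclidean norm of each row of W (one row per sample feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l_{2,1} norm: the sum of the row norms of W."""
    return sum(row_norms(W))

def select_features(W, k):
    """Return the indices of the k features with the largest row norms."""
    scores = row_norms(W)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Hypothetical regression coefficients: four features, two classes.
W = [[0.0, 0.0], [3.0, 4.0], [0.1, 0.0], [1.0, 0.0]]
selected = select_features(W, 2)
```

Because the $\ell_{2,1}$ norm penalizes whole rows, rows belonging to uninformative features tend to shrink toward zero during training, so the surviving large rows indicate the target features under this assumption.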

7. The method of claim 1, wherein the adjusting the network parameters in the objective function determination module according to the objective function values output by the objective function determination module in the feature selection model comprises:

inputting the network parameters into the feature selection model, and determining whether the objective function value output by the objective function determination module in the feature selection model has converged;

if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to the adjustment result;

and repeating the step of inputting the network parameters into the feature selection model until the objective function value has converged, at which point the adjustment of the network parameters is finished.
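The iterative procedure of claim 7 can be sketched as a generic loop. The claims do not state how the parameter adjustment data is computed or how convergence is tested, so the sketch assumes a caller-supplied `adjustment` function and a convergence test on the change in the objective value; the toy quadratic objective and the gradient step used here are stand-ins, not the patent's objective function. All names are hypothetical.

```python
def train_until_converged(params, objective, adjustment, tol=1e-8, max_iter=10000):
    """Repeat: evaluate objective, test convergence, otherwise adjust params."""
    previous = None
    for iteration in range(max_iter):
        value = objective(params)
        # Assumed convergence test: the objective value has stopped changing.
        if previous is not None and abs(previous - value) <= tol:
            return params, value, iteration
        delta = adjustment(params)  # parameter adjustment data
        params = [p + d for p, d in zip(params, delta)]
        previous = value
    return params, objective(params), max_iter

# Toy stand-in objective with its minimum at (3, -1).
def objective(w):
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

# Stand-in adjustment: a gradient descent step with step size 0.1.
def adjustment(w):
    return [-0.1 * 2.0 * (w[0] - 3.0), -0.1 * 2.0 * (w[1] + 1.0)]

final_params, final_value, iterations = train_until_converged(
    [0.0, 0.0], objective, adjustment
)
```

The `max_iter` cap is a safeguard added for the sketch so that the loop terminates even if the objective value never settles.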

8. An object classification method, comprising:

acquiring features of an object to be classified, and selecting target features from the features based on a feature selection model trained according to the training method of any one of claims 1 to 7;

and classifying the object to be classified according to the target features.
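The two steps of claim 8 can be illustrated with a short sketch. The claim does not name a classifier, so a nearest-centroid rule is assumed here purely for illustration; the selected indices, the class centroids, and the feature values are hypothetical.

```python
def project(features, selected):
    """Keep only the target features chosen by the feature selection model."""
    return [features[i] for i in selected]

def nearest_centroid(x, centroids):
    """Assumed classifier: return the label whose centroid is closest to x."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))

selected = [1, 3]  # hypothetical indices of the target features
centroids = {"A": [0.0, 0.0], "B": [5.0, 5.0]}  # centroids in the selected space
features = [9.0, 4.5, -2.0, 5.5]  # all features of the object to be classified

target = project(features, selected)
label = nearest_centroid(target, centroids)
```

Only the selected features reach the classifier, so the values at the unselected indices have no influence on the predicted label.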

9. An apparatus for training a feature selection model, comprising:

a data input module, configured to input a plurality of training samples and a class labeling result corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting a target feature from the sample features;

and a model training module, configured to adjust the network parameters in an objective function determination module according to an objective function value output by the objective function determination module in the feature selection model, wherein the objective function determination module comprises a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.

10. An object classification apparatus, comprising:

a feature selection module, configured to acquire features of an object to be classified, and to select target features from the features based on a feature selection model trained according to the training method of any one of claims 1 to 7;

and an object classification module, configured to classify the object to be classified according to the target features.

11. An electronic device, comprising:

one or more processors;

a memory for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of training a feature selection model as claimed in any one of claims 1 to 7, or the object classification method as claimed in claim 8.

12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of training a feature selection model as claimed in any one of claims 1 to 7, or the object classification method as claimed in claim 8.