CN111046972A - Feature selection method and device - Google Patents

Feature selection method and device

Info

Publication number
CN111046972A
Authority
CN
China
Prior art keywords
feature
characteristic
value
correlation coefficient
discrete
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911354862.3A
Other languages
Chinese (zh)
Inventor
李虹锋
曹清鑫
樊丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN201911354862.3A priority Critical patent/CN111046972A/en
Publication of CN111046972A publication Critical patent/CN111046972A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G06F 18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a feature selection method and device. The method comprises: determining the discrete feature values of each feature in the overall sample; calculating, from the discrete feature values, the information gain obtained by dividing the overall sample by each feature; calculating the correlation coefficients between features from their discrete feature values; calculating, for each feature, a correlation coefficient comprehensive measure value with respect to all other features in the overall sample; and determining a feature comprehensive value for each feature from its information gain and its correlation coefficient comprehensive measure value, and selecting features in the overall sample according to the feature comprehensive values. The invention provides a method for calculating a score that measures each feature, enabling a comprehensive evaluation of every feature.

Description

Feature selection method and device
Technical Field
The invention relates to the field of data processing technology, and in particular to a feature selection method and device.
Background
Feature selection is an important data preprocessing step. In engineering practice based on machine learning, the curse of dimensionality is frequently encountered; it is caused by too many features participating in training. If important features can be selected from the original feature set, so that in the subsequent learning process the model is trained on only a subset of features, the curse of dimensionality can be greatly alleviated. Furthermore, removing features irrelevant to the modeling target reduces the difficulty of the learning task. For these two reasons, feature selection is usually performed on the modeling dataset, and the algorithm model is then trained on the reduced dataset.
Feature selection typically involves two key links: how to select a candidate feature subset, and how to evaluate the quality of each feature in that subset. The first link is a "subset search" problem; the second is a "subset evaluation" problem, i.e., weighing the "importance" of each feature in the subset to the modeling target through some quantitative calculation, and then performing feature selection accordingly. Existing feature selection methods generally focus on the importance of each modeling feature to the modeling target while ignoring the relationships between modeling features, so strong correlations may exist among the selected features, causing feature redundancy and harming the final modeling effect.
Disclosure of Invention
The present invention provides a feature selection method and apparatus to solve at least one technical problem in the background art.
In order to achieve the above object, according to one aspect of the present invention, there is provided a feature selection method including:
determining discrete characteristic values of all the characteristics in the overall sample;
calculating information gain obtained by dividing the overall sample by each feature according to the discrete feature value of each feature;
calculating a correlation coefficient among the features according to the discrete feature values of the features;
calculating, for each feature, a correlation coefficient comprehensive measure value with the other features according to the correlation coefficients between that feature and all other features in the overall sample;
and determining a feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, and selecting the features in the overall sample according to the feature comprehensive values.
Optionally, determining the discrete feature values of each feature in the overall sample specifically includes:
and discretizing the continuous characteristic in the overall sample to obtain a discrete characteristic value of the continuous characteristic.
Optionally, the calculating, according to the discrete feature value of each feature, an information gain obtained by dividing the total sample by each feature includes:
calculating the information entropy of a sample subset generated by dividing the overall sample according to each discrete characteristic value of each characteristic;
and calculating the information gain obtained by dividing the overall sample by each feature according to the information entropy of the sample subset generated by dividing the overall sample by each discrete feature value of each feature and the information entropy of the overall sample.
Optionally, the correlation coefficient comprehensive measure value of each feature with the other features is calculated according to the correlation coefficients between that feature and all other features in the overall sample, using the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
In order to achieve the above object, according to another aspect of the present invention, there is provided a feature selection apparatus including:
the discrete characteristic value determining unit is used for determining discrete characteristic values of all the characteristics in the overall sample;
the information gain calculation unit is used for calculating the information gain obtained by dividing the overall sample by each feature according to the discrete feature value of each feature;
the correlation coefficient calculation unit is used for calculating the correlation coefficient among the characteristics according to the discrete characteristic values of the characteristics;
the correlation coefficient comprehensive measure value calculating unit is used for calculating the correlation coefficient comprehensive measure value of each feature with the other features according to the correlation coefficients between that feature and all other features in the overall sample;
and the feature comprehensive value calculating unit is used for determining the feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, so as to select the features in the overall sample according to the feature comprehensive values.
Optionally, the discrete feature value determining unit includes:
and the discretization processing module is used for discretizing the continuous characteristic in the overall sample to obtain a discrete characteristic value of the continuous characteristic.
Optionally, the information gain calculating unit includes:
the information entropy calculation module is used for calculating the information entropy of a sample subset generated by dividing the overall sample by each discrete characteristic value of each characteristic;
and the information gain calculation module is used for calculating the information gain obtained by dividing the overall sample by each characteristic according to the information entropy of the sample subset generated by dividing the overall sample by each discrete characteristic value of each characteristic and the information entropy of the overall sample.
Optionally, the feature comprehensive value calculating unit calculates the correlation coefficient comprehensive measure value of each feature with the other features according to the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
In order to achieve the above object, according to another aspect of the present invention, there is also provided a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the above feature selection method when executing the computer program.
In order to achieve the above object, according to another aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed in a computer processor, implements the steps in the above-described feature selection method.
The invention has the following beneficial effect: based on the two statistical indexes of information gain and the correlation coefficient, and considering both the importance of each feature to the modeling target and the correlation between features, the invention provides a calculation method for measuring feature scores.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts. In the drawings:
FIG. 1 is a flow chart of a feature selection method of an embodiment of the invention;
FIG. 2 is a flow chart of calculating information gain according to an embodiment of the present invention;
FIG. 3 is a block diagram of a feature selection apparatus according to an embodiment of the present invention;
FIG. 4 is a block diagram of an information gain calculating unit according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a computer apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of the present invention and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Some of the terms appearing in the description and claims of the invention are explained below.
Overall sample: the totality of the objects under study in a statistical analysis.
Feature (independent variable): A quantitative index that varies is called a feature; its concrete expression is a specific feature value or variable value.
Feature matrix: The set of feature values of each individual over the features is called a feature matrix.
Continuous feature: A feature that can take any value within some interval is called a continuous feature; its values are continuous, and between any two adjacent values infinitely many further values can be taken.
Discrete feature: A feature whose value space is finite or countably infinite, with probability 1 distributed over the individual values with certain probabilities, is called a discrete feature.
Modeling target (dependent variable): Refers to the quantity that changes as the features (independent variables) change; it is the object to be studied and modeled.
k-means binning method: The clusters produced by k-means clustering are presented as a histogram whose horizontal axis represents the groups and whose bar heights represent the frequency of the corresponding group.
Information entropy: Borrowed from the definition of entropy in thermodynamics, where it describes the degree of disorder of a substance, information entropy measures uncertainty: the more disordered a system, the greater the uncertainty and the higher the entropy value. Entropy is the expected value of the amount of information contained, expressed by the following formula:

Ent(D) = -\sum_{k=1}^{K} p_k \log_2 p_k

where p_k is the proportion of samples of the k-th class in the sample set D.
Information gain: In probability theory and information theory, information gain is an asymmetric measure of the difference between two probability distributions P and Q; it describes the difference between coding with Q and coding with P. Typically, P represents the distribution of samples or observations, and Q represents a theory, model, description, or approximation of P.
Correlation statistic: A vector in which each component corresponds to an initial feature; the importance of a feature subset is determined by the sum of the components of the correlation statistic corresponding to the features in the subset.
Correlation coefficient: is a quantity for researching the degree of linear correlation between variables and can only reflect the linear correlation between the variables.
Correlation matrix: also called a correlation coefficient matrix, which is formed by the correlation coefficients between the columns of the matrix.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 is a flowchart of a feature selection method according to an embodiment of the present invention, and as shown in fig. 1, the feature selection method according to the embodiment includes steps S101 to S105.
Step S101, determining discrete characteristic values of all characteristics in the overall sample.
In the embodiment of the invention, this step directly takes the feature values of the discrete features in the overall sample as their discrete feature values. Continuous features need to be discretized first to obtain their discrete feature values. In an alternative embodiment of the invention, discretizing a continuous feature may include the following specific steps: discretize the original feature values with the k-means binning method and record the segment values, which can then be used as the discrete feature values.
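For illustration, the following Python sketch shows one way this k-means binning step could be realized. The patent names no library, so the use of scikit-learn's KBinsDiscretizer, the bin count, and the sample data are assumptions:

```python
# A minimal sketch of k-means binning for one continuous feature.
# Assumptions: scikit-learn as the implementation and n_bins=5; the
# patent specifies neither.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

X = np.random.default_rng(0).normal(size=(1000, 1))  # one continuous feature

# strategy='kmeans' derives the bin edges from 1-D k-means cluster
# centers, matching the k-means binning referenced in step S101.
discretizer = KBinsDiscretizer(n_bins=5, encode='ordinal', strategy='kmeans')
X_discrete = discretizer.fit_transform(X)  # segment index per sample

print(discretizer.bin_edges_[0])  # the resulting segment value intervals
```

The ordinal codes in X_discrete then serve as the discrete feature values for the subsequent steps.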
And step S102, calculating information gain obtained by dividing the overall sample by each feature according to the discrete feature value of each feature.
This step specifically comprises: calculating the information entropy of each sample subset generated by dividing the overall sample by each discrete feature value of each feature; and calculating the information gain obtained by dividing the overall sample by each feature according to the information entropy of those sample subsets and the information entropy of the overall sample.
Step S103, calculating the correlation coefficient among the characteristics according to the discrete characteristic value of each characteristic.
In the embodiment of the invention, the correlation coefficient between discrete features can be calculated by various methods in the prior art, such as the Pearson coefficient.
In an alternative embodiment of the present invention, calculating the correlation coefficient between the features may specifically include the following steps.
Assume that two continuous features A and B have been divided by step S101 into m and n segment value intervals, respectively, as follows:
{[a_0, a_1], [a_1, a_2], …, [a_{m-1}, a_m]}
{[b_0, b_1], [b_1, b_2], …, [b_{n-1}, b_n]}
                 [b_0, b_1]   [b_1, b_2]   …   [b_{n-1}, b_n]
[a_0, a_1]         N_11         N_12       …       N_1n
[a_1, a_2]         N_21         N_22       …       N_2n
…                   …            …         …        …
[a_{m-1}, a_m]     N_m1         N_m2       …       N_mn

TABLE 1
In Table 1, N_11 denotes the number of samples that fall both in the segment value interval [a_0, a_1] of feature A and in the segment value interval [b_0, b_1] of feature B; the other cells are defined analogously.
Setting o = mn for the number of cells in Table 1, let x_i = (a_{i-1} + a_i)/2 and y_j = (b_{j-1} + b_j)/2 denote the segment midpoints, and let

N = \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij}

be the total number of samples. Define:

\bar{x} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij} x_i, \quad \bar{y} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij} y_j

S_{xx} = \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij} (x_i - \bar{x})^2, \quad S_{yy} = \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij} (y_j - \bar{y})^2

S_{xy} = \sum_{i=1}^{m} \sum_{j=1}^{n} N_{ij} (x_i - \bar{x}) (y_j - \bar{y})

The correlation coefficient r_{AB} of features A and B is:

r_{AB} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}
This step calculates a correlation coefficient reflecting whether feature A and feature B have a linear correlation relationship; its value lies in the interval [-1, 1]. If r_{AB} > 0, features A and B have a positive linear correlation, i.e., A and B change in the same direction; if r_{AB} = 0, features A and B have no linear correlation, i.e., their variations are unrelated; if r_{AB} < 0, features A and B have an inverse linear correlation, i.e., A and B change in opposite directions. In the embodiment of the invention, the correlation coefficient r_{AB} of features A and B calculated by the above formula is also called a "pseudo correlation coefficient".
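A minimal Python sketch of this computation, treating each segment by its midpoint and weighting by the cell counts N_ij of Table 1; the midpoint representation and all data are illustrative assumptions:

```python
# Weighted-Pearson sketch of the "pseudo correlation coefficient" r_AB,
# assuming segment midpoints stand in for the binned values.
import numpy as np

def pseudo_corr(counts, a_edges, b_edges):
    """counts: (m, n) array of N_ij; a_edges: m+1 segment edges of A;
    b_edges: n+1 segment edges of B."""
    x = (np.asarray(a_edges[:-1]) + np.asarray(a_edges[1:])) / 2  # A midpoints
    y = (np.asarray(b_edges[:-1]) + np.asarray(b_edges[1:])) / 2  # B midpoints
    n_total = counts.sum()
    x_bar = counts.sum(axis=1) @ x / n_total
    y_bar = counts.sum(axis=0) @ y / n_total
    dx, dy = x - x_bar, y - y_bar
    s_xy = dx @ counts @ dy              # weighted co-deviation S_xy
    s_xx = counts.sum(axis=1) @ dx**2    # weighted deviation S_xx of A
    s_yy = counts.sum(axis=0) @ dy**2    # weighted deviation S_yy of B
    return s_xy / np.sqrt(s_xx * s_yy)

counts = np.array([[30, 5], [10, 55]])  # a toy 2x2 Table 1
print(pseudo_corr(counts, [0, 1, 2], [0, 10, 20]))  # ~0.69: same-direction change
```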
And step S104, calculating for each feature a correlation coefficient comprehensive measure value with the other features according to the correlation coefficients between that feature and all other features in the overall sample.
In the embodiment of the present invention, this step builds on the linear correlation coefficient r_{AB} of features A and B calculated in step S103. Assume the overall sample contains the features {P_1, P_2, …, P_n}. Two features are selected at a time, and the correlation coefficient r_{ij} of P_i and P_j in the overall sample is obtained by the calculation of step S103; repeating this for all pairs finally yields a correlation coefficient matrix, shown in Table 2 below:

        P_1     P_2     …    P_n
P_1     1       r_12    …    r_1n
P_2     r_21    1       …    r_2n
…       …       …       …    …
P_n     r_n1    r_n2    …    1

TABLE 2
This step then calculates the correlation coefficient comprehensive measure value of each feature with the other features. The value range of a correlation coefficient is [-1, 1]; the larger the absolute value of the correlation coefficient, the stronger the linear correlation between the two features, and the smaller the comprehensive measure value. In an alternative embodiment of the invention, this step may specifically calculate the correlation coefficient comprehensive measure value Z^{(i)} of feature i with the other features by the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
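A short sketch of this step, assuming the formula above for Z^{(i)}; the correlation matrix values are illustrative:

```python
# Comprehensive measure Z^(i) per feature from the correlation matrix
# of Table 2, assuming Z^(i) = sum over j != i of (1 - |r_ij|).
import numpy as np

def z_measures(R):
    """R: (n, n) correlation coefficient matrix with ones on the diagonal."""
    penalty = 1.0 - np.abs(R)       # per-pair term: 0 when |r_ij| = 1
    np.fill_diagonal(penalty, 0.0)  # exclude the j == i term
    return penalty.sum(axis=1)      # Z^(i) for i = 1..n

R = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
print(z_measures(R))  # feature 3 is least correlated, so its Z is largest
```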
And step S105, determining a feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, and selecting the features in the overall sample according to the feature comprehensive values.
In the embodiment of the present invention, this step determines the feature comprehensive value of each feature from the information gain obtained by dividing the overall sample by that feature, calculated in step S102, and the correlation coefficient comprehensive measure value of that feature with the other features, calculated in steps S103 and S104. The feature comprehensive value is a composite score for each feature. In an alternative embodiment of the present invention, the feature comprehensive value Score(i) of feature i is calculated as follows:

Score(i) = Gain(D, i) + Z^{(i)}

where Gain(D, i) is the information gain obtained by dividing the overall sample D by feature i, with value range [0, 1]; the larger the value, the better the feature improves the sample classification. Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, with value range (0, n-1], where n is the total number of features; the larger the value, the weaker the linear correlation between this feature and the others, and the less the modeling effect is harmed. The feature comprehensive value obtained by adding the two measures both the feature's classification effect on the samples and the linear correlation between features, providing a theoretical basis for further screening the most significant feature subset.
In an alternative embodiment of the present invention, this step further ranks the feature comprehensive values Score(i) of all features by score, as shown in Table 3 below:

Score(i)    Rank
Score(1)    1
Score(2)    2
……          ……
Score(n)    n

TABLE 3
In an alternative embodiment of the present invention, feature selection and feature screening may be performed according to Table 3, as sketched in the example below. Specifically, a threshold τ may be specified first, and the features whose feature comprehensive value Score(i) is smaller than τ removed; or the number k of features to select may be specified, and the features ranked after the k-th removed; and so on.
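The following sketch ties the steps together: it combines the information gain and the comprehensive measure value into Score(i), ranks the features, and applies both selection rules; all concrete numbers are illustrative assumptions:

```python
# Score(i) = Gain(D, i) + Z^(i), then selection by threshold tau or top-k.
import numpy as np

gains = np.array([0.45, 0.30, 0.20, 0.05])  # Gain(D, i) from step S102 (toy values)
z = np.array([1.0, 0.9, 1.7, 0.3])          # Z^(i) from step S104 (toy values)
scores = gains + z                           # feature comprehensive value Score(i)

order = np.argsort(-scores)                  # descending rank, as in Table 3
print("ranking:", order + 1)                 # 1-based feature indices

tau = 1.0                                    # threshold rule: drop Score(i) < tau
kept_by_threshold = np.flatnonzero(scores >= tau) + 1
k = 2                                        # top-k rule: keep the k best features
kept_top_k = order[:k] + 1
print(kept_by_threshold, kept_top_k)
```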
From the above description, it can be seen that, based on the two statistical indexes of information gain and the correlation coefficient, and considering both the importance of features to the modeling target and the correlation between features, the invention provides a calculation method for measuring feature scores.
Fig. 2 is a flowchart of calculating the information gain according to an embodiment of the present invention. As shown in Fig. 2, in the embodiment of the present invention, calculating in step S102 the information gain obtained by dividing the overall sample by each feature according to its discrete feature values specifically includes step S201 and step S202.
Step S201, calculating information entropy of a sample subset generated by dividing the total sample according to each discrete feature value of each feature.
Step S202, calculating the information gain obtained by dividing the overall sample by each feature according to the information entropy of the sample subset generated by dividing the overall sample by each discrete feature value of each feature and the information entropy of the overall sample.
In an alternative embodiment of the present invention, the information entropy of the overall sample can be calculated as follows. Assume the modeling target in the current sample set D takes K classes, and the proportion of samples of the k-th class is p_k; the dependent-variable proportions of D can then be represented as [p_1, p_2, p_3, …, p_K], where k = 1, 2, 3, …, K. The information entropy Ent(D) of the overall sample can be calculated as:

Ent(D) = -\sum_{k=1}^{K} p_k \log_2 p_k
For each feature I (independent variable), its information gain is calculated as follows.
Assume that the discrete feature I has V possible values {I^1, I^2, I^3, …, I^V}. If I is used to divide the original sample set D, V sample subsets are generated, where the v-th subset contains all the samples in D whose value on feature I is I^v, denoted D^v. The information entropy of each D^v can be calculated according to the formula for the information entropy of the overall sample. Considering that different subsets contain different numbers of samples, each subset is given the weight

|D^v| / |D|

i.e., the greater the number of samples, the greater the influence of the subset. The information gain obtained by dividing the sample set D by the feature I can then be calculated.
Further, the information gain obtained by dividing the overall sample D by the feature I can be calculated from the information entropy Ent(D) of the overall sample D and the weighted information entropies of the subsets D^v; the specific calculation formula may be:

Gain(D, I) = Ent(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|} Ent(D^v)

where Gain(D, I) is the information gain obtained by dividing the overall sample D by the feature I. In general, a larger information gain means that the feature I provides greater discrimination of the original sample set.
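A compact Python sketch of steps S201 and S202 following the formulas above; the toy labels and feature values are illustrative:

```python
# Ent(D) over the class labels, then
# Gain(D, I) = Ent(D) - sum_v (|D^v|/|D|) * Ent(D^v).
import numpy as np

def ent(labels):
    """Information entropy of a label vector, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain(feature_values, labels):
    """Information gain obtained by dividing the sample set by one discrete feature."""
    total, n = ent(labels), len(labels)
    for v in np.unique(feature_values):
        mask = feature_values == v                   # subset D^v
        total -= mask.sum() / n * ent(labels[mask])  # weight |D^v| / |D|
    return total

y = np.array([0, 0, 1, 1, 1, 0])  # modeling target (two classes)
x = np.array([0, 0, 1, 1, 1, 1])  # discrete values of feature I
print(gain(x, y))                  # ~0.459: I improves class separation
```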
As can be seen from the above embodiments, the feature selection method of the present invention achieves at least the following advantageous effects:
1. the invention provides a method for quantitatively screening modeling features based on the information gain and correlation coefficient theories of information theory and statistics, which is more scientific and efficient than manual feature selection;
2. the invention designs a method for measuring feature importance based on variable statistics, provides a method for calculating the comprehensive score value Score of each feature from the "pseudo information gain" and the "pseudo correlation coefficient", emphasizes the features that have the greatest influence on the sample classification effect, and reduces the blindness of modeling feature selection.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
Based on the same inventive concept, embodiments of the present invention further provide a feature selection apparatus, which can be used to implement the feature selection method described in the above embodiments, as described in the following embodiments. Because the principle of the feature selection apparatus for solving the problem is similar to that of the feature selection method, the embodiment of the feature selection apparatus can be referred to the embodiment of the feature selection method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 3 is a block diagram of a feature selection apparatus according to an embodiment of the present invention, and as shown in fig. 3, the feature selection apparatus according to the embodiment of the present invention includes: the device comprises a discrete characteristic value determining unit 1, an information gain calculating unit 2, a correlation coefficient calculating unit 3, a correlation coefficient comprehensive measure value calculating unit 4 and a characteristic comprehensive value calculating unit 5.
And the discrete characteristic value determining unit is used for determining the discrete characteristic value of each characteristic in the overall sample.
In an alternative embodiment of the present invention, the discrete feature value determination unit 1 includes: and the discretization processing module is used for discretizing the continuous characteristic in the overall sample to obtain a discrete characteristic value of the continuous characteristic.
And the information gain calculation unit is used for calculating, according to the discrete feature values of each feature, the information gain obtained by dividing the overall sample by that feature.
And the correlation coefficient calculating unit is used for calculating the correlation coefficient among the characteristics according to the discrete characteristic values of the characteristics.
And the correlation coefficient comprehensive measure value calculating unit is used for calculating the correlation coefficient comprehensive measure value of each feature with the other features according to the correlation coefficients between that feature and all other features in the overall sample.
In an optional embodiment of the present invention, the feature comprehensive value calculating unit specifically calculates the correlation coefficient comprehensive measure value of each feature with the other features according to the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
And the feature comprehensive value calculating unit is used for determining the feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, so as to select the features in the overall sample according to the feature comprehensive values.
Fig. 4 is a block diagram of the information gain calculating unit 2 according to the embodiment of the present invention, and as shown in fig. 4, the information gain calculating unit 2 according to the embodiment of the present invention includes: an information entropy calculation module 201 and an information gain calculation module 202.
And the information entropy calculation module 201 is used for calculating the information entropy of the sample subset generated by dividing the overall sample by each discrete characteristic value of each characteristic.
And the information gain calculation module 202 is configured to calculate an information gain obtained by dividing the overall sample by each feature according to the information entropy of the sample subset generated by dividing the overall sample by each discrete feature value of each feature and the information entropy of the overall sample.
To achieve the above object, according to another aspect of the present application, there is also provided a computer apparatus. As shown in fig. 5, the computer device comprises a memory, a processor, a communication interface and a communication bus, wherein a computer program that can be run on the processor is stored in the memory, and the steps of the method of the above embodiment are realized when the processor executes the computer program.
The processor may be a Central Processing Unit (CPU). The Processor may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and units, such as the corresponding program units in the above-described method embodiments of the present invention. The processor executes various functional applications of the processor and the processing of the work data by executing the non-transitory software programs, instructions and modules stored in the memory, that is, the method in the above method embodiment is realized.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more units are stored in the memory and when executed by the processor perform the method of the above embodiments.
The specific details of the computer device may be understood by referring to the corresponding related descriptions and effects in the above embodiments, and are not described herein again.
In order to achieve the above object, according to another aspect of the present application, there is also provided a computer-readable storage medium storing a computer program which, when executed in a computer processor, implements the steps in the above-described feature selection method. It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of feature selection, comprising:
determining discrete characteristic values of all the characteristics in the overall sample;
calculating information gain obtained by dividing the overall sample by each feature according to the discrete feature value of each feature;
calculating a correlation coefficient among the features according to the discrete feature values of the features;
calculating, for each feature, a correlation coefficient comprehensive measure value with the other features according to the correlation coefficients between that feature and all other features in the overall sample;
and determining a feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, and selecting the features in the overall sample according to the feature comprehensive values.
2. The feature selection method according to claim 1, wherein determining the discrete feature values of each feature in the overall sample specifically comprises:
and discretizing the continuous characteristic in the overall sample to obtain a discrete characteristic value of the continuous characteristic.
3. The feature selection method according to claim 1, wherein the calculating an information gain obtained by dividing the population sample according to each feature based on the discrete feature value of each feature comprises:
calculating the information entropy of a sample subset generated by dividing the overall sample according to each discrete characteristic value of each characteristic;
and calculating the information gain obtained by dividing the overall sample by each feature according to the information entropy of the sample subset generated by dividing the overall sample by each discrete feature value of each feature and the information entropy of the overall sample.
4. The feature selection method according to claim 1, wherein the correlation coefficient comprehensive measure value of each feature with the other features is calculated according to the correlation coefficients between that feature and all other features in the overall sample, using the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
5. A feature selection apparatus, comprising:
the discrete characteristic value determining unit is used for determining discrete characteristic values of all the characteristics in the overall sample;
the information gain calculation unit is used for calculating the information gain obtained by dividing the overall sample by each feature according to the discrete feature value of each feature;
the correlation coefficient calculation unit is used for calculating the correlation coefficient among the characteristics according to the discrete characteristic values of the characteristics;
the correlation coefficient comprehensive measure value calculating unit is used for calculating the correlation coefficient comprehensive measure value of each feature with the other features according to the correlation coefficients between that feature and all other features in the overall sample;
and the feature comprehensive value calculating unit is used for determining the feature comprehensive value of each feature according to the information gain obtained by dividing the overall sample by that feature and its correlation coefficient comprehensive measure value with the other features, so as to select the features in the overall sample according to the feature comprehensive values.
6. The feature selection device according to claim 5, wherein the discrete eigenvalue determination unit comprises:
and the discretization processing module is used for discretizing the continuous characteristic in the overall sample to obtain a discrete characteristic value of the continuous characteristic.
7. The feature selection device according to claim 5, wherein the information gain calculation unit includes:
the information entropy calculation module is used for calculating the information entropy of a sample subset generated by dividing the overall sample by each discrete characteristic value of each characteristic;
and the information gain calculation module is used for calculating the information gain obtained by dividing the overall sample by each characteristic according to the information entropy of the sample subset generated by dividing the overall sample by each discrete characteristic value of each characteristic and the information entropy of the overall sample.
8. The feature selection device according to claim 5, wherein the feature comprehensive value calculating unit calculates the correlation coefficient comprehensive measure value of each feature with the other features according to the following formula:

Z^{(i)} = \sum_{j=1, j \neq i}^{n} \left( 1 - \left| r_{ij} \right| \right)

where Z^{(i)} is the correlation coefficient comprehensive measure value of feature i with the other features, r_{ij} is the correlation coefficient of feature i and feature j, and n is the number of features in the overall sample.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when executed in a computer processor, implements the method of any one of claims 1 to 4.
CN201911354862.3A 2019-12-25 2019-12-25 Feature selection method and device Pending CN111046972A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911354862.3A CN111046972A (en) 2019-12-25 2019-12-25 Feature selection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911354862.3A CN111046972A (en) 2019-12-25 2019-12-25 Feature selection method and device

Publications (1)

Publication Number Publication Date
CN111046972A true CN111046972A (en) 2020-04-21

Family

ID=70240278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911354862.3A Pending CN111046972A (en) 2019-12-25 2019-12-25 Feature selection method and device

Country Status (1)

Country Link
CN (1) CN111046972A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898027A (en) * 2020-08-06 2020-11-06 北京字节跳动网络技术有限公司 Method, device, electronic equipment and computer readable medium for determining feature dimension


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220907

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.

TA01 Transfer of patent application right