CN116226628A

CN116226628A - Feature optimization method, device, equipment and medium

Info

Publication number: CN116226628A
Application number: CN202310181430.7A
Authority: CN
Inventors: 杨建雄; 杨晨曦
Original assignee: Beijing Si Tech Information Technology Co Ltd
Current assignee: Beijing Si Tech Information Technology Co Ltd
Priority date: 2023-02-20
Filing date: 2023-02-20
Publication date: 2023-06-06

Abstract

The invention discloses a feature optimization method, apparatus, device, and medium. The feature optimization method comprises: normalizing a user's telecommunication product usage behavior features to obtain features to be analyzed; determining feature importance data according to the features to be analyzed and a model to be optimized, and screening the features to be analyzed according to the feature importance data to obtain original features to be reconstructed; determining target optimization features according to the original features to be reconstructed, an entropy weight TOPSIS method, and an adaptive weight coefficient correction algorithm; and optimizing the model to be optimized according to the target optimization features, so that telecommunication products are recommended to users through the optimized model. The technical solution of the embodiments of the invention optimizes the input features of a learning model for telecommunication product recommendation and improves the training effect of that model.

Description

Feature optimization method, device, equipment and medium

Technical Field

The present invention relates to the field of supervised learning, and in particular, to a feature optimization method, apparatus, device, and medium.

Background

In supervised learning, the quality of a feature attribute is closely related to the accurate prediction of a variable. The following methods are typically used for feature optimization to improve feature quality.

At present, the input features of a learning model for telecommunication product recommendation are mainly optimized in the following ways. First, descriptive statistical analysis is used to analyze, clean, screen, retain, and remodel the features, so that abnormal data such as null values and outliers are handled with statistical knowledge; however, because the business judgment involved depends on the professional knowledge of the operators, the data processing is highly arbitrary and the training effect of the learning model is poor. Second, feature importance analysis is used to analyze the degree to which each feature influences the supervision result. Third, correlation analysis is used to analyze the degree of correlation among the features. Fourth, feature optimization is performed according to the causal relations among the features. The training effect achieved with these approaches is limited by the technology, the application scenario, and the business knowledge and ability of the operators, and cannot meet requirements.

Disclosure of Invention

The invention provides a feature optimization method, apparatus, device, and medium, to solve the problem that a learning model for telecommunication product recommendation trains poorly because the features input to it are poorly analyzed and optimized.

According to an aspect of the present invention, there is provided a feature optimization method including:

normalizing a user's telecommunication product usage behavior features to obtain features to be analyzed;

determining feature importance data according to the features to be analyzed and a model to be optimized, and screening the features to be analyzed according to the feature importance data to obtain original features to be reconstructed;

determining target optimization features according to the original features to be reconstructed, an entropy weight TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) method, and an adaptive weight coefficient correction algorithm;

and optimizing the model to be optimized according to the target optimization features, so as to recommend telecommunication products to users through the optimized model.

According to another aspect of the present invention, there is provided a feature optimizing apparatus including:

a to-be-analyzed feature acquisition module, configured to normalize a user's telecommunication product usage behavior features to obtain features to be analyzed;

a to-be-reconstructed original feature acquisition module, configured to determine feature importance data according to the features to be analyzed and a model to be optimized, and to screen the features to be analyzed according to the feature importance data to obtain original features to be reconstructed;

a target optimization feature determining module, configured to determine target optimization features according to the original features to be reconstructed, an entropy weight TOPSIS method, and an adaptive weight coefficient correction algorithm;

and a model optimization module, configured to optimize the model to be optimized according to the target optimization features, so as to recommend telecommunication products to users through the optimized model.

According to another aspect of the present invention, there is provided an electronic apparatus including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the feature optimization method of any one of the embodiments of the invention.

According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a feature optimization method according to any one of the embodiments of the present invention.

According to the technical solution of the embodiments of the invention, a user's telecommunication product usage behavior features are normalized to obtain features to be analyzed; feature importance data are determined according to the features to be analyzed and the model to be optimized; the features to be analyzed are screened according to the feature importance data to obtain original features to be reconstructed; target optimization features are determined according to the original features to be reconstructed, the entropy weight TOPSIS method, and the adaptive weight coefficient correction algorithm; and the model to be optimized is optimized according to the target optimization features, so that telecommunication products are recommended to users through the optimized model. In this solution, the entropy weight TOPSIS method reconstructs features and thereby enriches the features input to the model to be optimized, while the adaptive weight coefficient correction algorithm corrects the feature weights and thereby reduces the redundancy introduced by feature optimization. This solves the problem that existing learning models for telecommunication product recommendation train poorly because their input features are poorly analyzed and optimized: the input features of such a model can be optimized, and its training effect is improved.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a feature optimization method according to a first embodiment of the present invention;

FIG. 2 is a flowchart of a feature optimization method according to a second embodiment of the present invention;

FIG. 3 is a logic schematic diagram of a feature optimization method according to a second embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a feature optimization device according to a third embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an electronic device that may be used to implement an embodiment of the invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "original," "target," and the like in the description and claims of the present invention and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

Fig. 1 is a flowchart of a feature optimization method according to the first embodiment of the present invention. This embodiment is applicable to optimizing the input features of a learning model for telecommunication product recommendation. The method may be performed by a feature optimization apparatus, which may be implemented in hardware and/or software and may be configured in an electronic device. As shown in Fig. 1, the method includes:

S110, normalizing the user's telecommunication product usage behavior features to obtain features to be analyzed.

The telecommunication product usage behavior features may be the behavior features of a user when using telecommunication products. Illustratively, they may include, but are not limited to, the user's age, the user's gender, and the frequency with which the user uses a certain type of telecommunication product. The features to be analyzed may be the features obtained by normalizing the telecommunication product usage behavior features.

In the embodiment of the invention, the user's telecommunication product usage behavior features can first be obtained according to the telecommunication product recommendation requirement and then normalized to obtain the features to be analyzed, which avoids the complexity that differing dimensions (units and scales) would otherwise add to subsequent feature processing.

Optionally, enumerated-type features among the telecommunication product usage behavior features may first be encoded and then normalized.
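As a non-limiting illustration, the encoding and normalization of S110 can be sketched as follows. The embodiment does not name a particular normalization, so min-max scaling and integer encoding are assumptions here, and the function names and sample columns are hypothetical.

```python
from typing import Dict, List


def encode_enumerated(values: List[str]) -> List[float]:
    """Map each distinct category to an integer code (order of first appearance)."""
    codes: Dict[str, int] = {}
    for value in values:
        codes.setdefault(value, len(codes))
    return [float(codes[value]) for value in values]


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale a numeric column into [0, 1]; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical usage behavior columns for four users.
age = [20.0, 35.0, 50.0, 65.0]
gender = ["F", "M", "M", "F"]
monthly_uses = [0.0, 12.0, 3.0, 30.0]

# Enumerated features are encoded first, then every column is normalized.
features_to_analyze = {
    "age": min_max_normalize(age),
    "gender": min_max_normalize(encode_enumerated(gender)),
    "monthly_uses": min_max_normalize(monthly_uses),
}
```

After this step every column lies in the same range, so features measured in years, counts, or category codes can be compared directly in the later steps.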

And S120, determining feature importance data according to the features to be analyzed and the model to be optimized, and screening the features to be analyzed according to the feature importance data to obtain original features to be reconstructed.

The model to be optimized may be a learning model that is to recommend telecommunication products according to the features to be analyzed. The feature importance data may describe how important each feature to be analyzed is to the supervision result of the learning model. The original features to be reconstructed may be the features screened from the features to be analyzed according to the feature importance data, and are used for feature reconstruction.

In the embodiment of the invention, the features to be analyzed can be input into the model to be optimized, and the feature importance data of each feature can be output based on the interpretability of the model. The features to be analyzed are then screened according to these data, so that the features that contribute more to the supervision result, that is, the more important features, are selected as the original features to be reconstructed.

S130, determining target optimization features according to the original features to be reconstructed, the entropy weight TOPSIS method and the adaptive weight coefficient correction algorithm.

The adaptive weight coefficient correction algorithm may be an algorithm for correcting the weight coefficient corresponding to a feature. The target optimization features may be the combination of (i) new weighted features reconstructed from the original features to be reconstructed by using the entropy weight TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution, also called the superior-inferior solution distance method) method and the adaptive weight coefficient correction algorithm, and (ii) the original features to be reconstructed themselves.

In the embodiment of the invention, features can be randomly selected from the original features to be reconstructed and reconstructed based on the entropy weight TOPSIS method; the weight coefficients of the reconstructed features are then corrected by the adaptive weight coefficient correction algorithm, and the corrected reconstructed features are combined with the original features to be reconstructed to obtain the target optimization features. Because the reconstructed features are not themselves telecommunication product usage behavior features, correcting their weights can avoid the redundancy introduced by feature optimization.

S140, optimizing the model to be optimized according to the target optimization features, so as to recommend telecommunication products to users through the optimized model.

In the embodiment of the invention, the target optimization features can be input into the model to be optimized and used to train it, so that the trained model is able to recommend telecommunication products to users accurately.

In the technical solution of the embodiments of the invention, the acquisition, storage, and application of the user personal information involved (such as the user's telecommunication product usage behavior features) comply with the relevant laws and regulations and do not violate public order and good morals.

Example 2

Fig. 2 is a flowchart of a feature optimization method provided by a second embodiment of the present invention, where the present embodiment is implemented based on the foregoing embodiment, and a specific optional implementation manner of determining a target optimization feature according to an original feature to be reconstructed, an entropy weight TOPSIS method, and an adaptive weight coefficient correction algorithm is provided. As shown in fig. 2, the method includes:

S210, normalizing the user's telecommunication product usage behavior features to obtain features to be analyzed.

S220, determining feature importance data according to the features to be analyzed and the model to be optimized, and screening the features to be analyzed according to the feature importance data to obtain original features to be reconstructed.

In an alternative embodiment of the present invention, screening the feature to be analyzed according to the feature importance data to obtain the original feature to be reconstructed may include: setting a data screening proportion threshold value; determining the screening feature quantity according to the feature quantity of the feature to be analyzed and the data screening proportion threshold value; and screening the characteristics to be analyzed according to the characteristic importance data and the screening characteristic quantity to obtain the original characteristics to be reconstructed.

The data screening proportion threshold may be a preset upper limit on the proportion of features to be screened. It may be expressed as a percentage; for example, the data screening proportion threshold may be, but is not limited to, 80%. The screening feature quantity may be the number of original features to be reconstructed.

In the embodiment of the invention, the data screening proportion threshold value can be set according to the quantity requirement of feature screening, and then the feature quantity of the feature to be analyzed and the data screening proportion threshold value are multiplied to obtain the screening feature quantity, so that the feature to be analyzed is ranked according to the feature importance data, and the feature to be analyzed of the screening feature quantity with higher importance in the ranking is used as the original feature to be reconstructed.
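The screening step can be illustrated with the following minimal sketch. The function name and the importance scores are hypothetical, and rounding half up to a whole number of features is an assumed rule chosen to agree with the rounding described in the specific example of this embodiment.

```python
from typing import Dict, List


def screen_features(importance: Dict[str, float], ratio: float) -> List[str]:
    """Keep the top `ratio` share of features, ranked by importance (descending)."""
    # Screening feature quantity = feature count x proportion threshold,
    # rounded half up to a whole number of features (an assumed rounding rule).
    keep = int(len(importance) * ratio + 0.5)
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:keep]


# Hypothetical importance scores for five features to be analyzed.
importance = {"a": 0.40, "b": 0.05, "c": 0.25, "d": 0.20, "e": 0.10}
original_features = screen_features(importance, ratio=0.8)
```

With five features and an 80% threshold, four features are kept and the least important one is dropped.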

S230, generating a random feature sequence set according to the screening feature quantity and the original features to be reconstructed.

The random feature sequence set may be a set formed by combining all feature sequences of at least two features selected from original features to be reconstructed.

In the embodiment of the invention, at least two features and at most the screening feature quantity of features can be randomly selected from the original features to be reconstructed to construct feature sequences, thereby obtaining the random feature sequence set.
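One possible way to enumerate the random feature sequence set is sketched below. It assumes that a feature sequence is an unordered selection without repetition, which the embodiment does not state explicitly; the function and feature names are hypothetical.

```python
from itertools import combinations
from math import comb
from typing import List, Tuple


def build_sequence_set(features: List[str]) -> List[Tuple[str, ...]]:
    """Enumerate every selection of at least two and at most all features."""
    sequences: List[Tuple[str, ...]] = []
    for size in range(2, len(features) + 1):
        sequences.extend(combinations(features, size))
    return sequences


features = [f"feature_{i}" for i in range(1, 10)]  # nine screened features
sequence_set = build_sequence_set(features)

# Cross-check against the closed form: sum of C(9, m) for m = 2..9.
expected = sum(comb(len(features), m) for m in range(2, len(features) + 1))
```

Under this assumption, nine original features to be reconstructed give 502 feature sequences, since 2^9 - 1 - 9 = 502.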

S240, determining reconstruction features according to the entropy weight TOPSIS method and the random feature sequence set.

The reconstruction feature may be a new feature obtained by reconstructing a feature selected from a random sequence set based on an entropy weight TOPSIS method.

In the embodiment of the invention, the feature sequence to be reconstructed can be selected from the random feature sequence set according to the feature reconstruction requirement, and then the feature sequence selected from the random feature sequence set is subjected to reconstruction processing by an entropy weight TOPSIS method to obtain the reconstructed feature.

In an alternative embodiment of the present invention, determining the reconstruction feature according to the entropy weighted TOPSIS method and the random feature sequence set may include: randomly screening target random feature sequences with the number of the reconstructed features from the random feature sequence set; and carrying out feature reconstruction on the target random feature sequence according to the entropy weight TOPSIS method to obtain a reconstructed feature.

Wherein the number of reconstructed features may be used to represent the number of reconstructed features required. The target random feature sequence may be a partial feature sequence randomly screened from the random feature sequence set, and the number of feature sequences of the target random feature sequence is consistent with the number of reconstructed features.

In the embodiment of the invention, the number of the reconstructed features can be determined according to the feature reconstruction requirement, and then the feature sequence of the number of the reconstructed features is randomly screened out from the random feature sequence set to be used as the target random feature sequence, so that the target random feature sequence is subjected to feature reconstruction according to the entropy weight TOPSIS method to obtain the reconstructed features.
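The reconstruction step can be sketched as follows, using the common textbook form of the entropy weight method and TOPSIS rather than a form confirmed by this embodiment. The sketch assumes non-negative normalized inputs with at least two samples and treats every feature as a benefit-type criterion; all function names, the sample data, and the fixed random seed are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def entropy_weights(columns: Sequence[Sequence[float]]) -> List[float]:
    """Entropy weight of each column (non-negative values, at least two rows)."""
    n = len(columns[0])
    divergences = []
    for column in columns:
        total = sum(column)
        if total == 0:
            divergences.append(0.0)  # an all-zero column carries no information
            continue
        shares = [value / total for value in column]
        # 0 * ln(0) is treated as 0, so zero shares are skipped.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        return [1.0 / len(columns)] * len(columns)  # fall back to equal weights
    return [d / spread for d in divergences]


def topsis_scores(columns: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Closeness of each row to the ideal solution (all columns are benefits)."""
    weighted = []
    for column, weight in zip(columns, weights):
        norm = math.sqrt(sum(v * v for v in column))
        weighted.append([weight * v / norm if norm else 0.0 for v in column])
    best = [max(column) for column in weighted]
    worst = [min(column) for column in weighted]
    scores = []
    for row in zip(*weighted):
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


def reconstruct_features(data: Dict[str, List[float]],
                         sequence_set: List[Tuple[str, ...]],
                         count: int, seed: int = 0) -> Dict[str, List[float]]:
    """Randomly pick `count` sequences and turn each into one TOPSIS score column."""
    rng = random.Random(seed)
    reconstructed = {}
    for names in rng.sample(sequence_set, count):
        columns = [data[name] for name in names]
        reconstructed["+".join(names)] = topsis_scores(columns, entropy_weights(columns))
    return reconstructed


# Hypothetical normalized features for three users.
data = {"a": [0.0, 0.5, 1.0], "b": [0.0, 0.5, 1.0], "c": [1.0, 0.2, 0.0]}
sequence_set = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b", "c")]
reconstructed = reconstruct_features(data, sequence_set, count=2)
```

Each reconstructed feature is a single column with one closeness score per user, lying between 0 (farthest from the ideal solution) and 1 (at the ideal solution).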

S250, determining target optimization features according to the reconstruction features, the original features to be reconstructed and the adaptive weight coefficient correction algorithm.

In the embodiment of the invention, the weight coefficients of the reconstructed features can be corrected by the adaptive weight coefficient correction algorithm, and the original features to be reconstructed and the corrected reconstructed features are then combined to obtain the target optimization features.

In an alternative embodiment of the present invention, determining the target optimization features according to the reconstructed features, the original features to be reconstructed, and the adaptive weight coefficient correction algorithm may include: determining a reconstructed feature weight coefficient according to the adaptive weight coefficient correction algorithm, and correcting the weight coefficient of the reconstructed features according to the reconstructed feature weight coefficient to obtain corrected reconstructed features; and combining the corrected reconstructed features and the original features to be reconstructed into the target optimization features.

The reconstructed feature weight coefficient may be a weight coefficient calculated for the reconstructed features based on the adaptive weight coefficient correction algorithm. The corrected reconstructed features may be the result of correcting the weight coefficient of the reconstructed features by means of the reconstructed feature weight coefficient.

In the embodiment of the invention, the reconstruction feature weight coefficient of the reconstruction feature can be determined through a self-adaptive weight coefficient correction algorithm, so that multiplication operation is carried out on the reconstruction feature weight coefficient and the reconstruction feature to obtain a corrected reconstruction feature, and the corrected reconstruction feature and the original feature to be reconstructed are combined together to obtain the target optimization feature.
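The multiplication and combination described above can be sketched as follows. The function name, the column names, and the coefficient value 0.5 are hypothetical placeholders; in particular, 0.5 is not a value produced by the adaptive weight coefficient correction algorithm of this embodiment.

```python
from typing import Dict, List


def combine_features(original: Dict[str, List[float]],
                     reconstructed: Dict[str, List[float]],
                     coefficient: float) -> Dict[str, List[float]]:
    """Scale every reconstructed column and append it to the original columns."""
    target = {name: list(column) for name, column in original.items()}
    for name, column in reconstructed.items():
        # Weight coefficient correction: multiply each value by the coefficient.
        target[f"recon_{name}"] = [coefficient * value for value in column]
    return target


original = {"a": [0.0, 0.5, 1.0], "b": [1.0, 0.2, 0.0]}
reconstructed = {"a+b": [0.4, 0.6, 0.8]}

# Placeholder coefficient, assumed for illustration only.
target_features = combine_features(original, reconstructed, coefficient=0.5)
```

The original columns pass through unchanged, so the model to be optimized sees both the original features to be reconstructed and the down-weighted reconstructed features.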

In an alternative embodiment of the present invention, determining the reconstructed feature weight coefficients according to an adaptive weight coefficient correction algorithm may include: the reconstructed feature weight coefficients are determined based on the following formula:

wherein I represents a reconstructed feature weight coefficient, and j represents a reconstructed feature quantity.

And S260, optimizing the model to be optimized according to the target optimization characteristics so as to recommend telecommunication products for users through the optimized model to be optimized.

In an optional embodiment of the present invention, after optimizing the model to be optimized, the method may further include: updating the number of the reconstructed features at least once, and returning to execute the operation of randomly screening the target random feature sequences of the number of the reconstructed features from the random feature sequence set until the target random feature sequences matched with all the updated number of the reconstructed features are screened.

In the embodiment of the invention, after the model to be optimized is optimized, the number of reconstructed features can be updated at least once to obtain at least one updated number of reconstructed features. For each updated number, target random feature sequences of that number are randomly screened from the random feature sequence set, until target random feature sequences matching all of the updated numbers have been screened. Each group of target random feature sequences is then reconstructed by the entropy weight TOPSIS method, and the reconstructed feature weight coefficient for that group is determined according to the corresponding updated number of reconstructed features and the adaptive weight coefficient correction algorithm. Each reconstructed feature weight coefficient is used to correct the weight coefficients of the corresponding reconstructed features, and each set of corrected reconstructed features is combined with the original features to be reconstructed into a set of target optimization features. The model to be optimized is optimized with each set of target optimization features, the optimized model with the best performance is determined, and telecommunication products are recommended to users based on that best-performing model.
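The repeated update of the number of reconstructed features amounts to a search over candidate numbers, which can be sketched as below. The `evaluate` callback stands in for the whole pipeline of screening, reconstruction, weight correction, training, and scoring; it and the toy scoring function are assumptions made purely for illustration.

```python
from typing import Callable, Iterable, Tuple


def search_best_count(counts: Iterable[int],
                      evaluate: Callable[[int], float]) -> Tuple[int, float]:
    """Return the number of reconstructed features with the highest score."""
    best_count, best_score = None, float("-inf")
    for count in counts:
        score = evaluate(count)  # e.g. the F1 value of the model trained for this count
        if score > best_score:
            best_count, best_score = count, score
    if best_count is None:
        raise ValueError("no candidate numbers were supplied")
    return best_count, best_score


def toy_evaluate(count: int) -> float:
    """Hypothetical stand-in score that peaks at 30 reconstructed features."""
    return 1.0 - abs(count - 30) / 100.0


best_count, best_score = search_best_count(range(10, 51, 10), toy_evaluate)
```

In practice the callback would train and score the model to be optimized for each candidate number, and the model trained with the winning number would be kept for recommendation.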

In a specific example, after the user's telecommunication product usage behavior features are normalized to obtain the features to be analyzed, the feature importance data corresponding to each of the features to be analyzed can be determined based on a regression algorithm in the model to be optimized. The feature importance data corresponding to each of the features to be analyzed are assumed to be as shown in Table 1:

Table 1 Feature importance data of the features to be analyzed

Feature attribute	Feature importance data
Feature 1	2810
Feature 2	2384
Feature 3	1153
Feature 4	1092
Feature 5	674
Feature 6	648
Feature 7	348
Feature 8	335
Feature 9	311
Feature 10	149
Feature 11	96

Optionally, the feature importance data may be normalized by dividing the feature importance data of each feature to be analyzed by the sum of the feature importance data of all the features to be analyzed.
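Applied to the values of Table 1, this normalization can be sketched as follows; the variable names are illustrative only.

```python
# Feature importance data taken from Table 1.
importance = {
    "Feature 1": 2810, "Feature 2": 2384, "Feature 3": 1153, "Feature 4": 1092,
    "Feature 5": 674, "Feature 6": 648, "Feature 7": 348, "Feature 8": 335,
    "Feature 9": 311, "Feature 10": 149, "Feature 11": 96,
}

# Divide each value by the sum over all features to be analyzed.
total = sum(importance.values())
normalized = {name: value / total for name, value in importance.items()}
```

The values in Table 1 sum to 10000, so the normalized importance of Feature 1, for example, is 0.281, and the normalized values sum to 1.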

Further, the data screening proportion threshold is set to 80%, and the top 80% of the 11 features to be analyzed are selected. Because 11 × 80% = 8.8 is not a whole number, the result is rounded to 9, and the 9 most important features are taken as the original features to be reconstructed. Then m features (2 ≤ m ≤ 9) are selected from these 9 features to form feature sequences, and all the feature sequences that can be formed in this way are named the random feature sequence set. Target random feature sequences, equal in number to the number of reconstructed features, are randomly screened from the random feature sequence set and reconstructed by the entropy weight TOPSIS method to obtain that number of reconstructed features. The adaptive weight coefficient correction algorithm is then introduced to correct the weight coefficients of the reconstructed features, yielding new feature sequences, namely the corrected reconstructed features. The corrected reconstructed features and the original features to be reconstructed are combined into the target optimization features, which are input into the model to be optimized; model training yields the prediction/evaluation indexes of the user's telecommunication product usage behavior features in the model. Finally, the model outputs are averaged to give the final supervision result, which reduces the error introduced by any single experiment.

The feature optimization method provided by this scheme overcomes defects such as the lack of correlation-remodeling information and the low utilization of data features. By making efficient use of the original data (the telecommunication product usage behavior features), it addresses feature optimization and purification, and by mining more latent sample features it greatly improves the training and prediction performance of machine learning. Verified on multiple data sets, the scheme improves the F1 value by about 9.6% to 40.6% compared with the base model, which reflects its superiority and generality in feature optimization. The model indexes of the scheme on two data sets are given in Tables 2 and 3.

Table 2 model index under dataset 1

When the user's telecommunication product usage behavior features are input directly into the model to be optimized, the F1 value is 0.856. When target optimization features built from 10 to 50 reconstructed features are input into the model to be optimized, the worst F1 value is 0.911, the best F1 value is 0.949, and the average improvement is about 9.61%.

Table 3 model index under dataset 2

When the user's telecommunication product usage behavior features are input directly into the model, the F1 value is 0.489. When target optimization features built from 10 to 18 reconstructed features are input into the model to be optimized, the worst F1 value is 0.684, the best F1 value is 0.691, and the average F1 value is improved by about 40.55%.
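The reported average improvements can be sanity-checked against the worst and best F1 values with the short sketch below. It assumes that "improvement" means the relative gain over the baseline F1 value, which the text does not define explicitly; the function name is illustrative.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative F1 improvement over the baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


# F1 values as stated for Tables 2 and 3.
dataset_1 = {"baseline": 0.856, "worst": 0.911, "best": 0.949}
dataset_2 = {"baseline": 0.489, "worst": 0.684, "best": 0.691}

for name, d in (("dataset 1", dataset_1), ("dataset 2", dataset_2)):
    low = relative_gain(d["baseline"], d["worst"])
    high = relative_gain(d["baseline"], d["best"])
    print(f"{name}: relative gain between {low:.2f}% and {high:.2f}%")
```

Under that assumption, the stated averages of about 9.61% and 40.55% each fall between the gains computed from the worst and best F1 values of the corresponding data set.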

Fig. 3 is a logic schematic diagram of a feature optimization method provided in the second embodiment of the present invention. As shown in Fig. 3, the user's telecommunication product usage behavior features are first obtained and normalized to obtain the features to be analyzed. Feature importance data of the features to be analyzed are calculated, and the original features to be reconstructed, equal in number to the screening feature quantity, are determined according to the feature importance data. The number of reconstructed features is then set and reconstructed features are created, and it is determined whether the count of created reconstructed features is greater than the set number of reconstructed features. If not, m features are randomly selected from the original features to be reconstructed and a feature is reconstructed by the entropy weight TOPSIS method. If so, the target random feature sequences of the number of reconstructed features are obtained, their weight coefficients are corrected to obtain the target optimization features, the target optimization features are input into the model to be optimized to obtain supervision results, and the supervision results are output and averaged.

According to the technical scheme of this embodiment, the user's telecommunication product usage behavior features are normalized to obtain the features to be analyzed. Feature importance data is determined according to the features to be analyzed and the model to be optimized, and the features to be analyzed are screened according to the feature importance data to obtain the original features to be reconstructed. A random feature sequence set is then generated according to the screening feature quantity and the original features to be reconstructed, the reconstructed features are determined according to the entropy weight TOPSIS method and the random feature sequence set, and the target optimization features are determined according to the reconstructed features, the original features to be reconstructed, and the adaptive weight coefficient correction algorithm. The model to be optimized is then optimized according to the target optimization features, so that telecommunication products are recommended to users through the optimized model.

In this scheme, the entropy weight TOPSIS method reconstructs features and thereby enriches the features input to the model to be optimized, while the adaptive weight coefficient correction algorithm corrects the feature weights and reduces the redundancy introduced by feature optimization. This addresses the problem that existing learning models for telecommunication product recommendation train poorly because their input features are poorly analyzed and optimized: the input features of such a model can be optimized, and its training effect is improved.

Example III

Fig. 4 is a schematic structural diagram of a feature optimization device according to a third embodiment of the present invention. As shown in fig. 4, the device includes a to-be-analyzed feature acquisition module 310, a to-be-reconstructed original feature acquisition module 320, a target optimization feature determination module 330, and a model optimization module 340, wherein:

the to-be-analyzed feature acquisition module 310 is configured to normalize the user's telecommunication product usage behavior features to obtain the features to be analyzed;

the to-be-reconstructed original feature acquisition module 320 is configured to determine feature importance data according to the features to be analyzed and the model to be optimized, and to screen the features to be analyzed according to the feature importance data to obtain the original features to be reconstructed;

the target optimization feature determination module 330 is configured to determine the target optimization features according to the original features to be reconstructed, the entropy weight TOPSIS method, and the adaptive weight coefficient correction algorithm;

the model optimization module 340 is configured to optimize the model to be optimized according to the target optimization features, so as to recommend telecommunication products to the user through the optimized model.
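The normalization performed by module 310 can be illustrated with a short sketch. This passage does not state which normalization is used, so min-max scaling to [0, 1] is assumed here purely for illustration; the function name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(rows: list[list[float]]) -> list[list[float]]:
    """Rescale every feature column of `rows` to the range [0, 1]."""
    if not rows:
        return []
    n_features = len(rows[0])
    lows = [min(row[j] for row in rows) for j in range(n_features)]
    highs = [max(row[j] for row in rows) for j in range(n_features)]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column carries no information; map it to 0.0 (assumption).
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ])
    return normalized


# Hypothetical usage behavior features: one row per user, one column per feature.
usage = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
features_to_analyze = min_max_normalize(usage)
```

Scaling every column to a common range keeps features with large raw magnitudes from dominating any later distance-based step.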

Optionally, the to-be-reconstructed original feature acquisition module 320 is specifically configured to: set a data screening proportion threshold; determine the screening feature quantity according to the number of features to be analyzed and the data screening proportion threshold; and screen the features to be analyzed according to the feature importance data and the screening feature quantity to obtain the original features to be reconstructed.
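A minimal sketch of this screening step follows. It assumes that the screening feature quantity is the feature count multiplied by the proportion threshold and rounded up, and that the features with the highest importance are kept; neither detail is fixed by this passage. The name `screen_features` and the feature names are hypothetical.

```python
import math


def screen_features(importance: dict[str, float], ratio: float) -> list[str]:
    """Keep the top share of features, ranked by importance (highest first)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    # Screening feature quantity: feature count times the proportion threshold (assumed rounding: ceil).
    quantity = max(1, math.ceil(len(importance) * ratio))
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:quantity]


# Hypothetical importance scores, e.g. as reported by the model to be optimized.
importance = {"data_volume": 0.1, "call_minutes": 0.5, "sms_count": 0.3, "roaming_days": 0.1}
original_features = screen_features(importance, ratio=0.5)
```

With four features and a threshold of 0.5, two features are kept, here `call_minutes` and `sms_count`.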

Optionally, the target optimization feature determination module 330 includes a random feature sequence set generating unit, a reconstructed feature determining unit, and a target optimization feature determining unit. The random feature sequence set generating unit is configured to generate a random feature sequence set according to the screening feature quantity and the original features to be reconstructed; the reconstructed feature determining unit is configured to determine the reconstructed features according to the entropy weight TOPSIS method and the random feature sequence set; and the target optimization feature determining unit is configured to determine the target optimization features according to the reconstructed features, the original features to be reconstructed, and the adaptive weight coefficient correction algorithm.

Optionally, the reconstructed feature determining unit is configured to randomly screen, from the random feature sequence set, target random feature sequences equal in number to the reconstructed feature quantity, and to perform feature reconstruction on the target random feature sequences according to the entropy weight TOPSIS method to obtain the reconstructed features.
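The following sketch illustrates how one feature sequence might be turned into a reconstructed feature with the standard entropy weight TOPSIS procedure. It assumes that the inputs are non-negative and already normalized, that every feature is treated as a benefit-type criterion, and that the TOPSIS closeness coefficient of each sample is used as the reconstructed feature value. These choices, and all function names, are assumptions rather than details given in this passage.

```python
import math


def entropy_weights(rows: list[list[float]]) -> list[float]:
    """Entropy weight per column: columns with more dispersion get larger weights."""
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0 or n < 2:
            divergences.append(0.0)  # no usable information in this column
            continue
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:  # 0 * log(0) is taken as 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence <= 0:
        return [1.0 / m] * m  # fall back to equal weights
    return [d / total_divergence for d in divergences]


def topsis_scores(rows: list[list[float]], weights: list[float]) -> list[float]:
    """Closeness of each sample to the ideal solution, in [0, 1]."""
    weighted = [[w * x for w, x in zip(weights, row)] for row in rows]
    m = len(weights)
    best = [max(row[j] for row in weighted) for j in range(m)]
    worst = [min(row[j] for row in weighted) for j in range(m)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 0.5)
    return scores


def reconstruct_feature(rows: list[list[float]], sequence: list[int]) -> list[float]:
    """Build one reconstructed feature from the columns listed in `sequence`."""
    subset = [[row[j] for j in sequence] for row in rows]
    return topsis_scores(subset, entropy_weights(subset))


# Hypothetical normalized data: 3 users, 3 features; reconstruct from columns 0 and 1.
data = [[0.0, 0.0, 0.2], [0.5, 0.4, 0.9], [1.0, 1.0, 1.0]]
reconstructed = reconstruct_feature(data, [0, 1])
```

In this example the first user is worst on both selected columns and the third is best on both, so their reconstructed values are 0 and 1, with the second user in between.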

Optionally, the target optimization feature determining unit is configured to determine a reconstructed feature weight coefficient according to the adaptive weight coefficient correction algorithm, and to perform weight coefficient correction on the reconstructed features according to the reconstructed feature weight coefficient to obtain corrected reconstructed features; the corrected reconstructed features and the original features to be reconstructed together form the target optimization features.
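How the corrected reconstructed features and the original features to be reconstructed might be combined is sketched below. The weight coefficient is passed in as a caller-supplied placeholder because its formula is not reproduced in this text, and applying it by multiplication is an assumption; the function name `build_target_features` is hypothetical.

```python
def build_target_features(
    original_rows: list[list[float]],
    reconstructed_columns: list[list[float]],
    weight_coefficient: float,
) -> list[list[float]]:
    """Append weight-corrected reconstructed features to the original features."""
    n = len(original_rows)
    for column in reconstructed_columns:
        if len(column) != n:
            raise ValueError("each reconstructed feature needs one value per sample")
    target = []
    for i, row in enumerate(original_rows):
        # Assumed form of the correction: scale each reconstructed value by the coefficient.
        corrected = [weight_coefficient * column[i] for column in reconstructed_columns]
        target.append(list(row) + corrected)
    return target


# Two samples with two original features each, plus one reconstructed feature.
target = build_target_features([[1.0, 2.0], [3.0, 4.0]], [[0.5, 1.0]], weight_coefficient=0.5)
```

Each output row holds the original features followed by the scaled reconstructed features, so the model to be optimized receives both in a single feature vector.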

Optionally, the target optimization feature determining unit includes a reconstruction feature weight coefficient determining subunit, configured to determine a reconstruction feature weight coefficient based on the following formula:

wherein I represents the reconstructed feature weight coefficient, and j represents the reconstructed feature quantity.

Optionally, the feature optimization device further includes a reconstructed feature quantity updating module, configured to update the reconstructed feature quantity at least once and to return to the operation of randomly screening, from the random feature sequence set, target random feature sequences equal in number to the reconstructed feature quantity, until target random feature sequences matching every updated reconstructed feature quantity have been screened.
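The update loop can be sketched as a sweep over several reconstructed feature quantities. In the sketch below, `evaluate` is a hypothetical caller-supplied function standing in for "reconstruct features from the screened sequences, optimize the model, and return its metric"; the function name, the seed parameter, and the sample sequences are likewise assumptions.

```python
import random
from statistics import mean
from typing import Callable, Sequence


def sweep_reconstructed_counts(
    sequence_set: Sequence[tuple[int, ...]],
    counts: Sequence[int],
    evaluate: Callable[[list[tuple[int, ...]]], float],
    seed: int = 0,
) -> tuple[dict[int, float], float]:
    """Evaluate the model once per reconstructed feature quantity and average the results."""
    rng = random.Random(seed)
    results = {}
    for count in counts:
        if count > len(sequence_set):
            raise ValueError("not enough random feature sequences for this quantity")
        # Randomly screen `count` target random feature sequences from the set.
        target_sequences = rng.sample(list(sequence_set), count)
        results[count] = evaluate(target_sequences)
    return results, mean(results.values())


# Hypothetical sequence set (tuples of feature indices) and a dummy metric for illustration.
sequence_set = [(0, 1), (1, 2), (0, 2), (2, 3)]
results, average = sweep_reconstructed_counts(
    sequence_set, counts=[1, 2, 3], evaluate=lambda seqs: float(len(seqs))
)
```

Returning the per-quantity results alongside their mean mirrors the reporting of worst, best, and average metric values over a range of reconstructed feature quantities.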

The feature optimization device provided by the embodiment of the invention can execute the feature optimization method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Example IV

Fig. 5 shows a schematic diagram of the structure of an electronic device that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.

As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.

Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the feature optimization method.

In some embodiments, the feature optimization method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. One or more of the steps of the feature optimization method described above may be performed when the computer program is loaded into RAM 13 and executed by processor 11. Alternatively, in other embodiments, the processor 11 may be configured to perform the feature optimization method in any other suitable way (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.

The computing system may include clients and servers. A client and a server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS services.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. A method of feature optimization, comprising:

determining target optimization features according to the original features to be reconstructed, the entropy weight TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) method, and the adaptive weight coefficient correction algorithm;

2. The method according to claim 1, wherein screening the feature to be analyzed according to the feature importance data to obtain the original feature to be reconstructed comprises:

setting a data screening proportion threshold value;

determining the number of screening features according to the number of features of the features to be analyzed and the data screening proportion threshold;

and screening the features to be analyzed according to the feature importance data and the screening feature quantity to obtain the original features to be reconstructed.

3. The method according to claim 2, wherein determining target optimization features from the original features to be reconstructed, entropy weight TOPSIS method and adaptive weight coefficient correction algorithm comprises:

generating a random feature sequence set according to the screening feature quantity and the original features to be reconstructed;

determining reconstruction features according to the entropy weight TOPSIS method and the random feature sequence set;

and determining the target optimization feature according to the reconstruction feature, the original feature to be reconstructed and the adaptive weight coefficient correction algorithm.

4. The method according to claim 3, wherein determining reconstruction features according to the entropy weight TOPSIS method and the random feature sequence set comprises:

randomly screening, from the random feature sequence set, target random feature sequences equal in number to the reconstructed feature quantity;

and carrying out feature reconstruction on the target random feature sequence according to the entropy weight TOPSIS method to obtain a reconstructed feature.

5. The method of claim 4, wherein determining the target optimization feature based on the reconstructed feature, the original feature to be reconstructed, and the adaptive weight coefficient correction algorithm comprises:

determining a reconstruction feature weight coefficient according to the self-adaptive weight coefficient correction algorithm, and carrying out weight coefficient correction on the reconstruction feature according to the reconstruction feature weight coefficient to obtain a corrected reconstruction feature;

and forming the target optimization feature by the corrected reconstruction feature and the original feature to be reconstructed.

6. The method of claim 5, wherein determining reconstructed feature weight coefficients according to the adaptive weight coefficient correction algorithm comprises:

the reconstructed feature weight coefficients are determined based on the following formula:

7. The method of claim 4, further comprising, after optimizing the model to be optimized:

updating the reconstructed feature quantity at least once, and returning to the operation of randomly screening, from the random feature sequence set, target random feature sequences equal in number to the reconstructed feature quantity, until target random feature sequences matching every updated reconstructed feature quantity have been screened.

8. A feature optimization apparatus, comprising:

the to-be-reconstructed original feature acquisition module is used for determining feature importance data according to the to-be-analyzed features and the to-be-optimized model, and screening the to-be-analyzed features according to the feature importance data to obtain to-be-reconstructed original features;

the target optimization feature determining module is used for determining target optimization features according to the original features to be reconstructed, the entropy weight TOPSIS method and the adaptive weight coefficient correction algorithm;

9. An electronic device, the electronic device comprising:

at least one processor; and

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the feature optimization method of any one of claims 1-7.

10. A computer readable storage medium storing computer instructions for causing a processor to perform the feature optimization method of any one of claims 1-7.