CN112148764B - Feature screening method, device, equipment and storage medium - Google Patents

Feature screening method, device, equipment and storage medium

Info

Publication number
CN112148764B
Authority
CN
China
Prior art keywords
type
feature
mutual information
samples
coverage rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910576711.6A
Other languages
Chinese (zh)
Other versions
CN112148764A (en)
Inventor
王倩
徐晓飞
杨海华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910576711.6A
Publication of CN112148764A
Application granted
Publication of CN112148764B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)

Abstract

The application provides a feature screening method, device, equipment and storage medium. In the scheme, an electronic device acquires a plurality of samples to be screened, each sample comprising at least one type of feature. The mutual information and coverage rate of each type of feature in different time periods are acquired according to a preset time interval, a stability index of each type of feature is acquired according to the mutual information and coverage rate of that type of feature in each time period, and the features in the plurality of samples are screened according to the stability index of each type of feature. Because feature selection is performed by calculating a dynamic index that measures stability across different time periods, the modeling effect and the model accuracy can be effectively improved.

Description

Feature screening method, device, equipment and storage medium
Technical Field
Embodiments of the present application relate to the technical field of big data, and in particular to a feature screening method, device, equipment and storage medium.
Background
With the development of big data technology, mass data is continuously enriched, and more and more data is used in machine learning applications, where it is needed to build models. Machine learning can be used in various scenarios, such as recommendation systems and search systems on the internet. To train a more accurate model, it is necessary to evaluate various indexes of the features in the acquired data and to filter the features accordingly.
In the prior art, the data includes a plurality of features, and mutual information and coverage rate are important indexes for feature selection. A common approach is therefore to calculate the mutual information and coverage rate of each feature and to filter the features against a threshold defined on those values.
However, data used as features is often unstable in its distribution over time, and features with an unstable temporal distribution are difficult to screen out with this approach, so a model trained on such features performs poorly.
Disclosure of Invention
Embodiments of the present application provide a feature screening method, device, equipment and storage medium, which are used to solve the problem that features with an unstable temporal distribution are difficult to screen out, so that a model trained on such features performs poorly.
A first aspect of the present application provides a feature screening method, the method comprising:
obtaining a plurality of samples to be screened, wherein each sample comprises at least one type of feature;
acquiring, according to a preset time interval, the mutual information and coverage rate of each type of feature in different time periods;
acquiring a stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period;
and screening the features in the plurality of samples according to the stability index of each type of feature.
In a specific embodiment, each sample further includes time information of the features, and the acquiring, according to a preset time interval, of the mutual information and coverage rate of each type of feature in different time periods includes:
dividing each sample into sub-samples of a plurality of time periods according to the time information of the features in each sample and the time interval;
calculating the mutual information and coverage rate of each type of feature in each sub-sample.
In a specific embodiment, the acquiring of the stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period includes:
calculating, for each type of feature, the variance value of the mutual information and coverage rate corresponding to a plurality of time periods; the stability index includes the variance value.
In a specific embodiment, the screening of the features in the plurality of samples according to the stability index of each type of feature includes:
filtering out the features in the plurality of samples whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
A second aspect of the present application provides a feature screening apparatus, comprising:
an acquisition module, configured to acquire a plurality of samples to be screened, each sample comprising at least one type of feature;
a processing module, configured to acquire the mutual information and coverage rate of each type of feature in different time periods according to a preset time interval;
the processing module being further configured to acquire a stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period;
and a screening module, configured to screen the features in the plurality of samples according to the stability index of each type of feature.
Optionally, each sample further includes time information of the features, and the processing module is specifically configured to:
divide each sample into sub-samples of a plurality of time periods according to the time information of the features in each sample and the time interval;
calculate the mutual information and coverage rate of each type of feature in each sub-sample.
Optionally, the processing module is specifically configured to:
calculate, for each type of feature, the variance value of the mutual information and coverage rate corresponding to a plurality of time periods; the stability index includes the variance value.
Optionally, the screening module is specifically configured to:
filter out the features in the plurality of samples whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
A third aspect of the present application provides an electronic apparatus, comprising: a processor, a memory, and a computer program; the computer program is stored in the memory, and the processor executes the computer program to implement the method of screening features provided in any one of the first aspects.
A fourth aspect of the present application provides a computer-readable storage medium storing a computer program for implementing the screening method of the features provided in any one of the first aspects.
A fifth aspect of the application provides a computer program product comprising: a computer program stored in a readable storage medium, from which it can be read by at least one processor of an electronic device, the at least one processor executing the computer program causing the electronic device to perform the method of screening for features as described in the first aspect.
According to the feature screening method, device, equipment and storage medium provided by the present application, the electronic device acquires a plurality of samples to be screened, each sample comprising at least one type of feature. The mutual information and coverage rate of each type of feature in different time periods are acquired according to a preset time interval, a stability index of each type of feature is acquired according to the mutual information and coverage rate of that type of feature in each time period, and the features in the plurality of samples are screened according to the stability index of each type of feature. Because feature selection is performed by calculating a dynamic index that measures stability across different time periods, the modeling effect and the model accuracy can be effectively improved.
Drawings
In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a first embodiment of a feature screening method provided by the present application;
FIG. 2 is a flowchart of a second embodiment of a feature screening method provided by the present application;
FIG. 3 is a flowchart of an example of a feature screening method according to the present application;
FIG. 4 is a schematic structural diagram of a feature screening apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The data includes a plurality of features, and mutual information and coverage rate are important indexes for feature selection. A common approach is therefore to calculate the mutual information and coverage rate of each feature and to filter the features against a threshold defined on those values.
However, data used as features is often unstable in its distribution over time, and features with an unstable temporal distribution are difficult to screen out with this approach, so a model trained on such features performs poorly.
The application provides a feature screening method that can be applied in various technical fields such as finance, the internet and retail. With this scheme, features with an unstable temporal distribution can be screened out, so that later applications of the screened features (such as model training) are more accurate.
The feature screening method is described in detail below by means of several embodiments.
FIG. 1 is a flowchart of a first embodiment of the feature screening method provided by the present application. As shown in FIG. 1, the execution subject of the method may be a server, a cloud server, a terminal for performing data processing, a computer, or another electronic device, which is not limited by the present application. The feature screening method includes the following steps:
S101: a plurality of samples to be screened are obtained, each sample including at least one type of feature.
In this step, before model training or big data analysis is performed, a large amount of data, that is, the above samples, needs to be prepared. Each sample includes at least one type of feature. When feature data is collected, the time information of the feature data needs to be recorded. For example, when collecting the revenue of a user, the collection time needs to be recorded, such as the year, month and day, so that later data analysis can be performed according to the time information.
S102: Acquire the mutual information and coverage rate of each type of feature in different time periods according to the preset time interval.
In this step, mutual information is a useful information measure in information theory. It can be seen as the amount of information that one random variable contains about another random variable, or as the reduction in the uncertainty of one random variable that results from knowing another random variable. Coverage rate is a measure of test completeness and of test effectiveness.
In a specific implementation, each sample may be divided into sub-samples of a plurality of time periods according to the time interval and the time information of the features in each sample, and the mutual information and coverage rate of each type of feature in each sub-sample are then calculated.
That is, after a large number of samples are acquired, the samples are divided into sub-samples corresponding to a plurality of time periods according to the time information of each piece of feature data, and the mutual information among the features and the coverage rate of each type of feature are then calculated within the sub-sample of each time period.
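The splitting and per-period calculation described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the sample layout (dictionaries with a `time` field), the monthly period, the reading of coverage rate as the fraction of samples in which the feature is present, and the choice of a second variable named `label` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict
from datetime import date

def split_by_month(samples):
    """Group samples into sub-samples keyed by the (year, month) of their timestamp."""
    periods = defaultdict(list)
    for s in samples:
        periods[(s["time"].year, s["time"].month)].append(s)
    return dict(periods)

def coverage(sub_sample, feature):
    """Fraction of samples in which the feature is present (assumed definition)."""
    present = sum(1 for s in sub_sample if s.get(feature) is not None)
    return present / len(sub_sample)

def mutual_information(sub_sample, feature, target):
    """Plug-in estimate of I(X;Y) in nats, over samples where both values are present."""
    pairs = [(s[feature], s[target]) for s in sub_sample
             if s.get(feature) is not None and s.get(target) is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # sum over observed (x, y) of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

# Hypothetical samples: one feature "f", one second variable "label".
samples = [
    {"time": date(2019, 1, 5), "f": 1, "label": 1},
    {"time": date(2019, 1, 9), "f": 0, "label": 0},
    {"time": date(2019, 2, 3), "f": 1, "label": 0},
    {"time": date(2019, 2, 7), "f": None, "label": 1},
]
per_period = {
    period: (mutual_information(sub, "f", "label"), coverage(sub, "f"))
    for period, sub in split_by_month(samples).items()
}
```

Each entry of `per_period` holds the (mutual information, coverage rate) pair for one time period, which is the input that the stability index in the next step is computed from.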
S103: Acquire the stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period.
In this step, after the mutual information and coverage rate of each type of feature in the different sub-samples are obtained, the stability index of each type of feature can be calculated from the values of the mutual information and coverage rate of the same type of feature in the different sub-samples. The stability index is used to measure whether each feature is stable, and may be the mean, variance, mean square error or the like of the obtained mutual information and coverage rate, which is not limited in this scheme.
Preferably, in order to determine the stability of the feature more clearly, the obtained stability index may be normalized, and the stability index may be normalized to a range of 0 to 1.
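The text does not say how the index is normalized. One common choice is min-max scaling across features, sketched below as an assumption; the feature names and raw values are made up for illustration.

```python
def min_max_normalize(values):
    """Rescale a dict of raw stability indexes to the range [0, 1] across features."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All features have the same raw index; map them all to 0.0.
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}

# Hypothetical raw stability indexes for three features.
raw = {"age": 0.25, "income": 0.75, "clicks": 0.5}
normalized = min_max_normalize(raw)
```

After scaling, the indexes of different features are directly comparable and a single threshold between 0 and 1 can be applied to all of them.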
S104: Screen the features in the plurality of samples according to the stability index of each type of feature.
In this step, after the stability index of each type of feature is obtained, an appropriate threshold may be set according to how much stability the specific application scenario requires of the different features. Then, for each type of feature, the stability index is compared with the set threshold: if the stability index is higher than the threshold, that type of feature is determined to be relatively stable, and if the stability index is lower than the threshold, its stability is considered relatively poor.
In this way, the features in the acquired samples can be screened. For example, features with low stability can be filtered out so that features with good stability are retained for subsequent model training, which yields a model with higher accuracy.
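The comparison in S104 can be sketched as below. The sketch assumes the stability index is oriented so that a higher value means a more stable feature, as this step describes; an index based directly on variance would need to be inverted first. Keeping features that are exactly equal to the threshold is also an assumption, since the text does not cover that case.

```python
def screen_features(stability, threshold):
    """Split features into kept and removed by comparing each index with a threshold."""
    kept = [f for f, s in stability.items() if s >= threshold]
    removed = [f for f, s in stability.items() if s < threshold]
    return kept, removed

# Hypothetical normalized stability indexes (higher means more stable).
stability = {"age": 0.9, "income": 0.4, "clicks": 0.7}
kept, removed = screen_features(stability, threshold=0.6)
```

Only the features in `kept` would then be passed on to model training.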
In the feature screening method provided by this embodiment, the mutual information and coverage rate of each type of feature in different time periods are acquired according to the preset time interval, the stability index of each type of feature is acquired according to the mutual information and coverage rate of that type of feature in each time period, and the features in the plurality of samples are screened according to the stability index of each type of feature. Because feature selection is performed by calculating a dynamic index that measures stability across different time periods, the modeling effect and the model accuracy can be effectively improved.
FIG. 2 is a flowchart of a second embodiment of the feature screening method provided by the present application. As shown in FIG. 2, on the basis of the foregoing embodiment, the feature screening method provided by this embodiment specifically includes the following steps:
S101: a plurality of samples to be screened are obtained, each sample including at least one type of feature.
S102: Acquire the mutual information and coverage rate of each type of feature in different time periods according to the preset time interval.
These two steps are identical to those of the previous embodiment; for their specific implementation, refer to the first embodiment.
In a specific implementation, the acquired samples need to be divided into sub-samples of a plurality of time periods according to the time interval. For example, they may be divided by day, by month or by year. The choice can be made according to the nature of the feature itself; generally, one evaluation period of the feature may be used as one time period, which is not limited here.
In this scheme, it should be understood that mutual information is used to measure the interaction between two objects; here, between two types of features in the same sub-sample. In feature screening, it is used to measure how well a feature discriminates the subject. The definition of mutual information is close to that of cross entropy. Mutual information is a concept from information theory that represents the relationship between pieces of information and measures the statistical correlation of two random variables. Feature extraction based on mutual information rests on the following assumption: an entry that occurs frequently in one specific category but rarely in other categories has larger mutual information with that category. Mutual information is usually used as a measure between feature words and categories: if a feature word belongs to a category, their mutual information is the largest. The method does not require any assumption about the nature of the relationship between the feature words and the category.
S1031: For each type of feature, calculate the variance value of the mutual information and coverage rate corresponding to a plurality of time periods; the stability index includes the variance value.
In this step, after the mutual information and coverage rate of each type of feature in each sub-sample are obtained, the variance of the mutual information and coverage rate across the sub-samples may be calculated for each type of feature. This variance is the index that measures the stability of that type of feature.
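A minimal sketch of S1031 for a single feature follows. The text does not specify whether the population or the sample variance is meant, or how the two variances are combined, so the sketch uses the population variance and simply returns both values; these choices and the numbers are assumptions.

```python
from statistics import pvariance

def variance_indexes(per_period):
    """Return (variance of mutual information, variance of coverage) over time periods.

    per_period is a list of (mutual_information, coverage) pairs, one per period.
    """
    mi_values = [mi for mi, _ in per_period]
    cov_values = [cov for _, cov in per_period]
    return pvariance(mi_values), pvariance(cov_values)

# Hypothetical feature: mutual information drifts over three periods, coverage is constant.
per_period = [(0.2, 0.5), (0.4, 0.5), (0.6, 0.5)]
mi_var, cov_var = variance_indexes(per_period)
```

Here the coverage variance is zero while the mutual information variance is not, so this feature would look stable in coverage but less stable in its relationship to the second variable.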
S1041: Filter out the features in the plurality of samples whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
In this step, after the stability index of each type of feature is obtained, the features may be screened and filtered. In general, the stability index is compared with a set threshold, and features whose stability index is lower than the threshold are filtered out, so that features with higher stability are retained.
In a specific implementation, an appropriate threshold may be selected. For example, the stability index may be the standard deviation of the mutual information and coverage rate, that is, the standard deviation (standard deviation of the mean sequence / mean of the mean sequence) is taken as the stability index of the feature: the larger the standard deviation, the more unstable the feature, and the smaller the standard deviation, the more stable the feature. During feature screening, if the subsequent model training places a high requirement on model stability, a low threshold is selected. If the standard deviation of a feature is larger than the threshold, that type of feature is determined to be unstable in time, and if it is smaller than the threshold, that type of feature is determined to be stable in time. If the goal is to improve prediction accuracy, different thresholds can be tried and the one giving the highest accuracy taken. Different choices can be made for different application scenarios.
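The ratio in parentheses above (standard deviation divided by mean) is what is usually called the coefficient of variation; reading it that way is an interpretation, not something the text states. Under that reading, the check against a threshold can be sketched as follows, with the population standard deviation and the example values chosen as assumptions.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation of the per-period values divided by their mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / m

def is_stable(values, threshold):
    """A feature is treated as stable in time when its ratio is below the threshold."""
    return coefficient_of_variation(values) < threshold

# Hypothetical per-period coverage rates for two features.
steady = [0.50, 0.50, 0.50, 0.50]
drifting = [0.10, 0.30, 0.50, 0.70]
steady_ok = is_stable(steady, threshold=0.2)
drifting_ok = is_stable(drifting, threshold=0.2)
```

A lower threshold rejects more features, which matches the advice to pick a low threshold when model stability matters most.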
In the feature screening method provided by this embodiment, the samples are divided into time periods according to a certain time interval, and the coverage rate and mutual information of the features in each time period are calculated. The coverage rate and mutual information of the features across the time periods are then combined, the corresponding variance value is calculated, and the features are screened against a threshold on the variance value. In this way, features that are unstable in time can be screened out, the effect of the trained model is improved, and overfitting is avoided.
Based on the above two embodiments, the feature screening method provided by the present application is described below through a specific implementation.
FIG. 3 is a flowchart of an example of the feature screening method according to the present application. As shown in FIG. 3, the feature screening method specifically includes the following steps:
S201: samples and features are prepared.
In this step, m samples with time information and p-dimensional features are prepared in advance.
S202: the mutual information and coverage of the features at different time periods are calculated.
The mutual information and coverage rate of the features are calculated in different time periods. The samples are divided into N segments by time information in an equal-frequency manner, and the mutual information and coverage rate of the features are calculated in each time period. The formulas are as follows:
Mutual information calculation formula:
Coverage rate calculation formula: $u_n(x_i; y) = q_i / p$
In the above formulas, $I_n$ represents the mutual information of feature x and feature y in time period n, and $u_n$ represents the coverage rate of feature x in time period n.
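For reference, the textbook definition of mutual information for discrete variables, written in the notation used here and restricted to the samples of time period n, is given below. It is offered as an assumption about the intended form; the expression in the original application may differ in detail. The symbols a and b (the values taken by the two variables) and p_n (probabilities estimated within period n) are introduced only for this illustration.

```latex
I_n(x_i; y) = \sum_{a} \sum_{b} p_n(x_i = a,\, y = b)\,
  \log \frac{p_n(x_i = a,\, y = b)}{p_n(x_i = a)\, p_n(y = b)}
```

The value is zero when the two variables are independent within the period and grows as knowing one of them tells more about the other.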
S203: Calculate the stability index.
In this step, the variance values of the feature mutual information and coverage rate over the N time periods are used as the index that measures stability, and are normalized to the range [0, 1].
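The N time periods over which these variances are taken come from the equal-frequency division in S202. One way to read "equal frequency" is that each period holds the same number of samples (up to a remainder) after sorting by time; the sketch below follows that reading, which is an assumption, and uses integer timestamps purely for illustration.

```python
def equal_frequency_periods(samples, n_periods):
    """Sort samples by time and cut them into n_periods chunks of near-equal size."""
    ordered = sorted(samples, key=lambda s: s["time"])
    base, extra = divmod(len(ordered), n_periods)
    periods, start = [], 0
    for i in range(n_periods):
        # The first `extra` periods take one additional sample each.
        size = base + (1 if i < extra else 0)
        periods.append(ordered[start:start + size])
        start += size
    return periods

# Hypothetical samples with integer timestamps and one binary feature "f".
samples = [{"time": t, "f": t % 2} for t in [7, 3, 9, 1, 5, 8, 2]]
periods = equal_frequency_periods(samples, 3)
sizes = [len(p) for p in periods]
```

Compared with fixed calendar intervals, this keeps the per-period estimates of mutual information and coverage rate based on similar sample counts, so their variance is less affected by sparse periods.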
S204: Select features according to the stability index.
A reasonable threshold is defined according to the feature stability index obtained in the above steps, and the features that remain above the threshold are selected as the two groups of finally screened features; that is, the features whose stability index is lower than the threshold are removed.
S205: The feature selection ends.
In the feature screening method provided by the technical solution of the present application, feature selection is constrained by calculating a dynamic index that measures stability across different time periods. The features retained after filtering, whose mutual information and coverage rate are stably distributed, help to improve the modeling effect.
FIG. 4 is a schematic structural diagram of a first embodiment of the feature screening apparatus according to the present application. As shown in FIG. 4, the feature screening apparatus 10 includes:
an obtaining module 11, configured to obtain a plurality of samples to be screened, where each sample includes at least one type of feature;
The processing module 12 is configured to obtain mutual information and coverage rate of each type of feature in different time periods according to a preset time interval;
the processing module 12 is further configured to obtain a stability index of each type of feature according to the mutual information and the coverage rate of each type of feature in each time period;
and the screening module 13 is used for screening the characteristics in the plurality of samples according to the stability index of the characteristics of each type.
The feature screening apparatus provided in this embodiment is configured to execute the technical solution of the electronic device in the foregoing method embodiment. Each of the plurality of samples includes at least one type of feature. The apparatus acquires the mutual information and coverage rate of each type of feature in different time periods according to a preset time interval, acquires the stability index of each type of feature according to the mutual information and coverage rate of that type of feature in each time period, and screens the features in the plurality of samples according to the stability index of each type of feature. Because feature selection is performed by calculating a dynamic index that measures stability across different time periods, the modeling effect and the model accuracy can be effectively improved.
Based on the above embodiment, in one specific implementation, each sample optionally further includes time information of the features, and the processing module 12 is specifically configured to:
dividing each sample into sub-samples of a plurality of time periods according to the time information of the features in each sample and the time interval;
the mutual information and coverage of each type of feature in each sub-sample is calculated.
Optionally, the processing module 12 is specifically configured to:
calculate, for each type of feature, the variance value of the mutual information and coverage rate corresponding to a plurality of time periods; the stability index includes the variance value.
Optionally, the screening module 13 is specifically configured to:
filter out the features in the plurality of samples whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
The screening device of the features provided in any of the foregoing embodiments is used for executing the technical scheme of the electronic device in the foregoing method embodiment, and its implementation principle and technical effect are similar, and are not repeated herein.
FIG. 5 is a schematic structural diagram of an electronic device provided by the present application. As shown in FIG. 5, the electronic device 20 includes: a processor 21, a memory 22, and a computer program; the computer program is stored in the memory 22, and the processor 21 executes the computer program to implement the technical solution of the feature screening method of the electronic device in any of the foregoing method embodiments.
Alternatively, the memory 22 may be separate or integrated with the processor 21.
When the memory 22 is a device separate from the processor 21, the electronic apparatus may further include:
A bus 23 for connecting the processor 21 and the memory 22.
The application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program is used for realizing the technical scheme of the screening method of the characteristics of the electronic equipment in any method embodiment.
The present application also provides a computer program product comprising: a computer program stored in a readable storage medium, from which the computer program can be read by at least one processor of an electronic device, the at least one processor executing the computer program to cause the electronic device to perform the technical solution of any of the foregoing feature screening method embodiments.
In the specific implementation of the electronic device, it should be understood that the processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The program may be stored in a computer readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes: read-only memory (ROM), random access memory (RAM), flash memory, hard disk, solid state disk, magnetic tape, floppy disk, optical disc, and any combination thereof.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present application and do not limit it. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in those embodiments can still be modified, and that some or all of their technical features can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (7)

1. A feature screening method, the method comprising:
obtaining a plurality of samples to be screened, wherein each sample comprises at least one type of feature;
acquiring, according to a preset time interval, the mutual information and coverage rate of each type of feature in different time periods;
acquiring a stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period;
screening the features in the plurality of samples according to the stability index of each type of feature;
wherein each sample further comprises time information of the features, and the acquiring, according to the preset time interval, the mutual information and coverage rate of each type of feature in different time periods comprises:
dividing each sample into sub-samples of a plurality of time periods according to the time information of the features in each sample and the time interval;
calculating the mutual information and coverage rate of each type of feature in each sub-sample;
and wherein the acquiring the stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period comprises:
calculating, for each type of feature, the variance value of the mutual information and coverage rate corresponding to the plurality of time periods, wherein the stability index includes the variance value.
2. The method of claim 1, wherein the screening the features in the plurality of samples according to the stability index of each type of feature comprises:
filtering out, from the plurality of samples, the features whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
3. A feature screening apparatus, comprising:
an acquisition module, configured to obtain a plurality of samples to be screened, wherein each sample comprises at least one type of feature;
a processing module, configured to acquire, according to a preset time interval, the mutual information and coverage rate of each type of feature in different time periods;
the processing module being further configured to acquire a stability index of each type of feature according to the mutual information and coverage rate of each type of feature in each time period;
a screening module, configured to screen the features in the plurality of samples according to the stability index of each type of feature;
wherein each sample further comprises time information of the features, and the processing module is specifically configured to:
divide each sample into sub-samples of a plurality of time periods according to the time information of the features in each sample and the time interval;
calculate the mutual information and coverage rate of each type of feature in each sub-sample;
and wherein the processing module is specifically configured to:
calculate, for each type of feature, the variance value of the mutual information and coverage rate corresponding to the plurality of time periods, wherein the stability index includes the variance value.
4. The apparatus of claim 3, wherein the screening module is specifically configured to:
filter out, from the plurality of samples, the features whose stability index is smaller than a preset threshold, to obtain at least one type of feature whose stability is higher than the threshold.
5. An electronic device, comprising: a processor, a memory, and a computer program; wherein the computer program is stored in the memory, and the processor executes the computer program to implement the feature screening method of claim 1 or 2.
6. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for implementing the feature screening method of claim 1 or 2.
7. A computer program product, comprising: a computer program which, when executed by a processor, implements the feature screening method of claim 1 or 2.
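
For illustration only, the steps recited in claims 1 and 2 can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's implementation. The assumptions are: mutual information is measured between a discrete feature value and a per-sample label; coverage rate is the fraction of samples in a time period in which the feature is present; and time periods are fixed-width buckets of a numeric timestamp. The names `mutual_information`, `stability_indexes`, `screen_features`, the sample layout, and the two variance thresholds are all hypothetical. The sketch treats a lower variance as more stable and keeps features whose variances do not exceed the thresholds; how the claimed comparison against the preset threshold maps onto the variance value is likewise an assumption.

```python
import math
from collections import Counter, defaultdict
from statistics import pvariance


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts.
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def stability_indexes(samples, interval, feature_names):
    """Per-feature variance of mutual information and coverage across time periods.

    Assumed (hypothetical) sample layout:
    {"time": number, "label": value, "features": {name: value or None}}.
    Requires at least one sample.
    """
    # Divide the samples into sub-samples, one bucket per time period.
    periods = defaultdict(list)
    for s in samples:
        periods[int(s["time"] // interval)].append(s)

    result = {}
    for name in feature_names:
        mi_series, cov_series = [], []
        for key in sorted(periods):
            sub = periods[key]
            # Keep only the samples in which this feature is present.
            present = [(s["features"].get(name), s["label"])
                       for s in sub if s["features"].get(name) is not None]
            cov_series.append(len(present) / len(sub))
            mi_series.append(mutual_information([v for v, _ in present],
                                                [y for _, y in present]))
        # The stability index here is the pair of population variances.
        result[name] = {"mi_var": pvariance(mi_series),
                        "cov_var": pvariance(cov_series)}
    return result


def screen_features(stability, max_mi_var, max_cov_var):
    """Keep features whose variances stay within the (hypothetical) thresholds."""
    return [name for name, s in stability.items()
            if s["mi_var"] <= max_mi_var and s["cov_var"] <= max_cov_var]
```

With an `interval` of 10 and timestamps 0 to 19, for example, the samples fall into two time periods, and a feature that is present in every sample of the first period but only half of the second has a coverage series of 1.0 and 0.5, which gives it a nonzero coverage variance.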
CN201910576711.6A 2019-06-28 2019-06-28 Feature screening method, device, equipment and storage medium Active CN112148764B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910576711.6A CN112148764B (en) 2019-06-28 2019-06-28 Feature screening method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112148764A CN112148764A (en) 2020-12-29
CN112148764B true CN112148764B (en) 2024-05-07

Family

ID=73869457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910576711.6A Active CN112148764B (en) 2019-06-28 2019-06-28 Feature screening method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112148764B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007147166A2 (en) * 2006-06-16 2007-12-21 Quantum Leap Research, Inc. Consilence of data-mining
CN104346379A (en) * 2013-07-31 2015-02-11 克拉玛依红有软件有限责任公司 Method for identifying data elements on basis of logic and statistic technologies
CN105528465A (en) * 2016-02-03 2016-04-27 天弘基金管理有限公司 Credit status assessment method and device
WO2016067072A1 (en) * 2014-10-28 2016-05-06 Super Sonic Imagine Imaging methods and apparatuses for performing shear wave elastography imaging
WO2017101506A1 (en) * 2015-12-14 2017-06-22 乐视控股(北京)有限公司 Information processing method and device
CN106991447A (en) * 2017-04-06 2017-07-28 哈尔滨理工大学 A kind of embedded multi-class attribute tags dynamic feature selection algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
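Application of a mutual-information-based genetic algorithm in spectral band selection; Kong Qingqing; Gong Huili; Ding Xiangqian; Liu Ming; Spectroscopy and Spectral Analysis; 2018-01-15 (01); full text *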

Also Published As

Publication number Publication date
CN112148764A (en) 2020-12-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant