WO2023151488A1 - Model training method, training device, electronic device and computer-readable medium - Google Patents
Model training method, training device, electronic device and computer-readable medium
- Publication number
- WO2023151488A1 (PCT application PCT/CN2023/074026)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- feature
- data set
- features
- fault
- average distance
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present disclosure relates to the field of artificial intelligence, and in particular to a model training method, a training device, electronic equipment and a computer-readable medium.
- vibration signal is usually one of the main bases for diagnosing the equipment status.
- mechanical forced vibration includes periodic vibration, impact vibration, random vibration, etc.
- another type of vibration is the vibration response caused by structural response, self-excited vibration, or environmental vibration, such as fluid surge vibration, bearing oil-film vibration, response vibration of the component itself, and local vibration of the structure.
- the present disclosure aims to provide a model training method, a training device, an electronic device and a computer-readable medium, so as to improve the efficiency of model training.
- a model training method including:
- the initial data set is characterized by three dimensions of fault category, sample and feature, and the selection step includes:
- the average distance calculation is performed in the two dimensions of the sample and the fault category, and the representative value of the importance of each feature is obtained based on the average distance calculation results of the two dimensions;
- the key data set is composed of features whose importance characteristic value is greater than the first threshold.
- the method further includes: in the key data set, calculating the correlation between each feature and other features and filtering the features of the key data set based on the correlation between features.
- calculating the correlation between each feature and other features and filtering the features of the key data set based on the correlation between features includes:
- sorting the features of the key data set in descending order of importance and putting the feature with the highest importance into a feature subset; then, in order of importance, calculating the variance inflation factor of each feature in the key data set with respect to the feature subset
- calculating the average distance in the sample dimension includes:
- calculating the average distance in the fault category dimension includes:
- the average distance between the per-category averages of each feature is calculated and recorded as the second average distance.
- the characterization value of the importance of each feature obtained based on the calculation result of the average distance of the two dimensions includes:
- a representative value of the importance of each feature is obtained based on the compensation factor of each feature, the first average distance, and the second average distance.
- the calculation of the average distance between different samples of each feature under the same fault category adopts formula (1):
- the calculation of the average value of each feature in all samples under the same fault category adopts formula (3):
- the step of calculating the variance inflation factor of the feature subset includes: substituting into formula (5) to obtain the variance inflation factor of the feature,
- a model training device comprising:
- a feature extraction unit is used to construct an initial dataset of high-frequency vibration signals
- a feature screening unit configured to screen the initial data set to obtain key data sets
- a model training unit configured to use the key data set for model training to obtain a fault recognition model, wherein,
- the initial data set is characterized by three dimensions of fault category, sample and feature, and the feature screening unit is configured to:
- the average distance calculation is performed in the two dimensions of the sample and the fault category, and the representative value of the importance of each feature is obtained based on the average distance calculation results of the two dimensions;
- the key data set is composed of features whose importance characteristic value is greater than the first threshold.
- an electronic device including a memory and a processor, where the memory stores computer instructions executable by the processor; when the computer instructions are executed, the model training method described in any one of the above is implemented.
- a computer-readable medium storing computer instructions executable by an electronic device; when the computer instructions are executed, the model training method described in any one of the above is implemented.
- the data set is characterized by three dimensions of fault category, sample and feature; the average distance calculation is performed in the two dimensions of sample and fault category, and the characterization value of the importance of each feature is obtained based on the two average distance calculation results, so that the initial data set can be screened according to the characterization value of importance to obtain an optimized feature combination, which is then used for model training to improve training efficiency.
- FIG. 1 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure
- FIG. 2 is a flowchart of a method for screening initial data sets in combination with importance and relevance provided by an embodiment of the present disclosure
- Fig. 3 is a schematic structural diagram of a model training device provided by an embodiment of the present disclosure.
- Fig. 4 is a schematic structural diagram of an exemplary electronic device.
- the model training method provided by the embodiment of the present disclosure is shown in FIG. 1 .
- Step S11 is to construct an initial data set of high-frequency vibration signals.
- Step S12 is to screen the initial data set to obtain the key data set, that is, to perform feature screening on the initial data set containing rich features and filter out a set of features with high importance and low mutual correlation;
- Step S13 is to use the key data set for training to obtain a fault classification model. That is, a fault recognition model is built (using ML models such as support vector machine (SVM) or random forest (RF)), the data set screened in step S12 is provided to the fault recognition model, the model is trained, the weight parameters of the fault recognition model are continuously optimized during training, and training stops once the loss function meets expectations.
- ML models such as support vector machine SVM, random forest RF, etc.
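For illustration only (the disclosure does not prescribe a particular library, and the variable names below are placeholders), step S13 could be sketched with scikit-learn roughly as follows, where `X_key` stands for the screened key data set and `y` for the fault-category labels:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder data standing in for the screened key data set: X_key has one row
# per sample and one column per selected feature; y holds fault-category labels.
rng = np.random.default_rng(0)
X_key = rng.normal(size=(300, 8))
y = rng.integers(0, 3, size=300)

X_train, X_test, y_train, y_test = train_test_split(X_key, y, test_size=0.2, random_state=0)

# A random forest is used here; an SVM (sklearn.svm.SVC) would equally match the ML models named above.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("fault-recognition accuracy:", accuracy_score(y_test, model.predict(X_test)))
```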
- the high-frequency vibration signal usually comes from a vibration sensor, which converts the analog high-frequency vibration signal into a digital signal; various methods are then used to extract features from the digitized high-frequency vibration signal.
- there are various existing feature extraction techniques, which can be roughly divided into time-domain features, frequency-domain features, time-frequency-domain features, and features extracted for specific scenarios. Time-domain features are the most intuitive, and the following indicators are usually calculated: mean, root mean square, peak-to-peak value, impulse factor, margin factor, kurtosis, etc.
- frequency-domain features are mainly obtained by performing a Fourier transform on the signal, observing its spectrum from another perspective, and extracting various spectral features, such as mean value, maximum value, centroid frequency, spectral kurtosis, and spectral power.
- the time-frequency domain feature is mainly used in the start-stop stage of the equipment.
- the time-frequency spectrum of the signal is obtained to observe the change of the signal frequency with time, and the time-varying feature is extracted. All the extracted features form a data set, which is used as the input for feature selection in the next step.
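As a sketch only — the exact indicator set and any library choices are assumptions, not part of the disclosure — the time-domain and frequency-domain indicators named above could be computed for one digitized segment like this:

```python
import numpy as np

def extract_features(segment: np.ndarray, fs: float) -> dict:
    """Compute a few of the time-domain and frequency-domain indicators named
    above for one digitized vibration segment (illustrative selection only)."""
    feats = {
        "mean": float(np.mean(segment)),
        "rms": float(np.sqrt(np.mean(segment ** 2))),
        "peak_to_peak": float(np.ptp(segment)),
        "kurtosis": float(np.mean((segment - segment.mean()) ** 4) / segment.std() ** 4),
    }
    # Frequency-domain indicators from the magnitude spectrum
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    feats["spectrum_mean"] = float(spectrum.mean())
    feats["spectrum_max"] = float(spectrum.max())
    feats["centroid_frequency"] = float(np.sum(freqs * spectrum) / np.sum(spectrum))
    feats["spectral_power"] = float(np.sum(spectrum ** 2) / segment.size)
    return feats

# One synthetic segment standing in for a digitized sensor reading
fs = 25600.0
t = np.arange(0, 0.1, 1.0 / fs)
segment = np.sin(2 * np.pi * 1200 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
print(extract_features(segment, fs))
```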
- the screening step is the process of selecting some of the most effective features from the data pool to reduce the dimensionality of the data set. It is an important means to improve the performance of the learning algorithm and is also a key data preprocessing step in pattern recognition.
- through the screening step, feature combinations that optimize specific system metrics (such as model recognition accuracy) can be obtained. Depending on whether the method is independent of the subsequent machine learning algorithm, the feature selection methods used in the screening step fall into three main categories:
- (a) Filter methods: independent of the subsequent machine learning algorithm. Generally, the statistical properties of all training data are used to evaluate features directly, and various evaluation criteria are used to strengthen the correlation between features and classes while reducing the correlation among features. Common selection methods include the variance selection method, correlation coefficient method, chi-square test, and mutual information method.
- (b) Wrapper methods: the feature selection algorithm is treated as a component of the subsequent machine learning algorithm, and classification performance is used directly as the criterion for feature importance; the selected subset is ultimately used to construct the classification model, so features that yield higher classification performance are chosen directly, and the model here can be any of various machine learning algorithms.
- (c) Embedded methods: a machine learning model is trained directly to obtain a weight coefficient for each feature, and features are selected in descending order of these coefficients as a measure of importance.
- the initial data set is represented as data in three dimensions of fault category, sample and feature.
- the segment of high-frequency signal is processed into a sample including multiple features; each of these features can be a time-domain feature, a frequency-domain feature, or a time-frequency-domain feature, and the sample has an assigned fault category.
- the fault category to which the sample belongs can be obtained through manual labeling or a trained fault recognition model. Specifically, for a fault recognition model that has been trained, a sample can be input into the model to obtain the fault category corresponding to the sample.
- the average distance calculation is performed in the two dimensions of sample and fault category respectively, to obtain the first average distance of each feature in the sample dimension and the second average distance in the fault-category dimension.
- for the first average distance: first calculate, for each feature, the average distance between the samples within the same fault category, thereby obtaining one average distance per fault category for that feature; then take the mean of these per-category average distances as the first average distance of the feature;
- for the second average distance: first calculate, for each feature, the average value over all samples within the same fault category, thereby obtaining one average value per fault category for that feature; then calculate the average distance between these per-category averages as the second average distance of the feature. The characterization value of the feature's importance is then obtained from its first and second average distances.
- the characterization values of the importance of all features are thus obtained; these values are then compared, and features are selected from the initial data set according to the comparison result to form the key data set, for example by assigning to the key data set every feature whose characterization value exceeds the set threshold.
- the average distance is calculated in the two dimensions of sample and fault category, and the two average distances of each feature in the two dimensions are obtained, and the evaluation of each feature is obtained based on these two average distances.
- the representation value of the importance is used to select a combination of features from the data set to train the fault recognition model.
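A minimal sketch of the two-dimension average-distance evaluation described above, assuming for simplicity an equal number of samples per fault category; the way the two distances are combined (the disclosure also uses a compensation factor, and its formulas are not reproduced here) is only a plausible stand-in:

```python
import numpy as np

def importance_values(Q: np.ndarray) -> np.ndarray:
    """Two-dimension average-distance evaluation (illustrative sketch).

    Q has shape (K, I, J): K fault categories, I samples per category
    (equal counts assumed for simplicity), J features. Returns one importance
    value per feature; the ratio used at the end is only a plausible stand-in
    for the patent's combination with a compensation factor.
    """
    K, I, J = Q.shape
    d_within = np.zeros(J)   # first average distance (sample dimension)
    d_between = np.zeros(J)  # second average distance (fault-category dimension)
    means = Q.mean(axis=1)   # per-category mean of each feature, shape (K, J)
    for j in range(J):
        per_category = []
        for k in range(K):
            x = Q[k, :, j]
            # average pairwise distance between samples within category k
            per_category.append(np.abs(x[:, None] - x[None, :]).sum() / (I * (I - 1)))
        d_within[j] = np.mean(per_category)
        m = means[:, j]
        # average pairwise distance between the per-category means
        d_between[j] = np.abs(m[:, None] - m[None, :]).sum() / (K * (K - 1))
    return d_between / (d_within + 1e-12)  # larger value = more important (stand-in)

Q = np.random.default_rng(0).normal(size=(3, 50, 10))  # toy data set
print(importance_values(Q))
```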
- although the features selected by the above distance evaluation method have high importance, they may also be highly correlated with each other, which increases the complexity of the model and affects the final fault identification effect. Therefore, in some embodiments, the correlation between each feature and the other features is further calculated, and screening continues using this correlation.
- Fig. 2 is a flowchart of a method for screening an initial data set in combination with importance and relevance provided by an embodiment of the present disclosure, which specifically includes the following steps.
- step S121 create an empty feature subset.
- step S122 the first average distance and the second average distance of each feature are calculated, and the characteristic value of the importance of each feature is obtained accordingly.
- step S123 the features whose importance characteristic value is greater than the first threshold are formed into a first feature set.
- step S124 the features of the first feature set are sorted in descending order of importance, and the feature with the highest importance is taken out and put into the feature subset.
- step S125 the variance inflation factor of each feature in the first feature set with respect to the feature subset is calculated one by one in order of importance.
- step S126 it is judged whether the variance inflation factor of the feature is smaller than the second threshold, if yes, execute step S127, otherwise execute step S125.
- Steps S125 to S127 form a loop, and the number of repetitions of this loop is (N-1) times, where N is the number of features included in the first feature set.
- step S127 put the feature into a feature subset.
- step S128 the feature subset is output.
- in this embodiment, the first average distance and the second average distance of each feature are calculated as in the previous embodiment, and the characterization value of the importance of each feature is obtained accordingly. Features whose characterization value of importance is greater than the first threshold form the first feature set; the features of the first feature set are sorted in descending order of that value, and the feature with the highest value is taken out and put into the feature subset. Then, in descending order of importance, the variance inflation factor of each feature of the first feature set with respect to the feature subset is calculated, and features whose variance inflation factor is smaller than the second threshold are put into the feature subset; that is, when the variance inflation factor of a feature with respect to the feature subset is smaller than the second threshold, the correlation between that feature and the feature subset is considered relatively small, so the feature can be added to the subset.
- the number of features in the feature subset grows as the loop proceeds, i.e. the number of features used to compute the variance inflation factor gradually increases, so the computation cost and complexity of calculating the variance inflation factor increase with each iteration.
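A minimal sketch of the Fig. 2 loop (steps S121-S128), assuming formula (5)-style variance inflation factors computed from an ordinary least-squares fit; the threshold values and function names are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def vif_against_subset(x_j: np.ndarray, X_subset: np.ndarray) -> float:
    """Variance inflation factor of a candidate feature with respect to the
    already-selected subset, via the goodness of fit of x_j = b0 + b * X_subset."""
    r2 = LinearRegression().fit(X_subset, x_j).score(X_subset, x_j)
    return 1.0 / max(1.0 - r2, 1e-12)

def screen_features(X: np.ndarray, importance: np.ndarray,
                    first_threshold: float, second_threshold: float = 10.0) -> list:
    """Sketch of the Fig. 2 loop (steps S121-S128); threshold values are illustrative."""
    candidates = [j for j in range(X.shape[1]) if importance[j] > first_threshold]  # S123
    candidates.sort(key=lambda j: importance[j], reverse=True)                      # S124
    if not candidates:
        return []
    subset = [candidates.pop(0)]            # most important feature enters the subset first
    for j in candidates:                    # S125-S127, in descending order of importance
        if vif_against_subset(X[:, j], X[:, subset]) < second_threshold:
            subset.append(j)
    return subset                           # S128: output the feature subset

X = np.random.default_rng(1).normal(size=(200, 10))
imp = np.abs(np.random.default_rng(2).normal(size=10))
print(screen_features(X, imp, first_threshold=0.2))
```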
- I k is the number of samples of the kth fault category
- K is the number of fault categories
- J is the number of features for each sample.
- the average distance is calculated in the two dimensions of sample and fault category, and the specific operation steps to obtain the two average distances of each feature are as follows:
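The formula images are not reproduced in this text. Based on the textual descriptions and the variable definitions above, one plausible reconstruction of formulas (1)-(4) — offered as an assumption about the exact averaging conventions, not as the original formulas — is:

```latex
% Assumed reconstruction of formulas (1)-(4); the averaging conventions are inferred
% from the textual descriptions, not copied from the original formula images.
d_{k,j} = \frac{1}{I_k (I_k - 1)} \sum_{i=1}^{I_k} \sum_{i'=1}^{I_k} \left| q_{i,k,j} - q_{i',k,j} \right|
\qquad \text{(1) within-category average distance of feature } j

d_j^{(1)} = \frac{1}{K} \sum_{k=1}^{K} d_{k,j}
\qquad \text{(2) first average distance (sample dimension)}

\bar{q}_{k,j} = \frac{1}{I_k} \sum_{i=1}^{I_k} q_{i,k,j}
\qquad \text{(3) per-category mean of feature } j

d_j^{(2)} = \frac{1}{K (K - 1)} \sum_{k=1}^{K} \sum_{k'=1}^{K} \left| \bar{q}_{k,j} - \bar{q}_{k',j} \right|
\qquad \text{(4) second average distance (fault-category dimension)}
```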
- the variance inflation factor for the j-th feature can be calculated as follows:
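Formula (5) matches the standard definition of the variance inflation factor; with R_j^2 denoting the goodness of fit of the regression X_j = β_0 + βX' described in the claims, it reads:

```latex
% Standard variance inflation factor, consistent with the description of formula (5).
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```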
- an embodiment of the present disclosure provides a fault detection device, including a feature extraction unit 301 , a feature screening unit 302 and a model training unit 303 .
- Feature extraction unit 301 is used to construct the initial data set of high-frequency vibration signal
- the feature screening unit 302 is used to screen the initial data set to obtain key data sets
- the model training unit 303 is used to perform model training using key data sets to obtain a fault recognition model.
- the initial data set is characterized by the three dimensions of fault category, sample and feature, and the feature screening unit 302 is configured to: for each feature, perform the average distance calculation in the two dimensions of sample and fault category, obtain the characterization value of the importance of each feature based on the two average distance calculation results, and select the features whose characterization value of importance is greater than the first threshold to form the key data set.
- FIG. 4 is a schematic structural diagram of an exemplary electronic device 400 .
- the electronic device can be used to execute the application system including the above-mentioned fault identification model, and at the same time, the electronic device can be used to train the fault identification model.
- the server 400 includes a scheduler 401 , a storage unit 403 , an I/O interface 404 , and multiple model acceleration units 402 coupled via a bus 405 .
- the storage unit 403 may include readable media in the form of volatile storage units, such as random access memory units (RAM) and/or cache storage units.
- the storage unit 403 may also include a readable medium in the form of a non-volatile storage unit, for example, a read only memory unit (ROM), a flash memory, and various magnetic disk memories.
- the storage unit 403 can store various program modules and data; the program modules include an operating system and application programs such as those for text processing, video playback, and software editing and compiling.
- the executable codes of these application programs are read from the storage unit 403 by the scheduler 401 and executed, so as to realize the predetermined operations of these program modules.
- the scheduler 401 is generally a processor (CPU).
- the storage unit 403 stores the fault detection system described above.
- the bus 405 may represent one or more of several types of bus structures, including a memory-unit bus or memory-unit controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
- the server 400 can communicate with one or more external devices (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable users to interact with the server 400, and/or with any device (e.g., a router or modem) that enables the server 400 to communicate with one or more other computing devices. Such communication may occur through the input/output (I/O) interface 404.
- the server 400 can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter (not shown).
- the terminal 103 in FIG. 1 can access the server 400, for example through a network adapter.
- other hardware and/or software modules may be used in conjunction with the server 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
- the server 400 includes a plurality of model acceleration units 402 .
- the traditional processor architecture design is very effective in logic control, but not efficient enough in large-scale parallel computing, so it is not efficient for model calculations. Therefore, a model acceleration unit is developed, and different models can be adapted to different model acceleration units.
- the model acceleration unit is, for example, a neural network acceleration unit (NPU).
- NPU adopts a data-driven parallel computing architecture and is used as a processing unit for processing a large number of operations (such as convolution, pooling, etc.) of each neural network node.
- GPU graphics processing unit
- model acceleration units 402 are controlled by the scheduler 401 , and under the control of the scheduler 401 , multiple model acceleration units 402 can work together.
- the step S11 corresponding to the feature extraction unit 301 and the step S12 corresponding to the feature screening unit 302 can be executed by the CPU, and the step S13 corresponding to the model training unit 303 can be completed by the scheduler 401 and the multiple model acceleration units 402 working together.
- the scheduler 401 generally controls the overall process, places the execution of the fault classification model being trained on the multiple model acceleration units 402, and aggregates the execution results of the multiple model acceleration units 402.
- the data interaction between multiple model acceleration units 402 can be realized through the scheduler 401 , or the data interaction can be directly performed by multiple model acceleration units 402 .
- the model training method and model training device above use a supervised distance evaluation method to select an optimized feature combination from the data set to improve model training efficiency.
- the supervised distance evaluation method refers to the evaluation using the average distance of the dimension of the fault category (the data for the dimension needs to be marked or generated through a trained fault recognition model).
- an embodiment of the present disclosure also provides a computer-readable medium storing computer-readable instructions for the above-mentioned model training method.
- Embodiments of the present disclosure provide a model training method. Compared with the prior art, the embodiment of the present disclosure adopts a supervised distance evaluation method to select an optimized feature combination from a data set for model training, so as to achieve the purpose of improving model training efficiency and saving computing resources. Therefore, the model training method has certain practical value and economic value.
- modules or elements described or illustrated herein as separate may be combined into a single module or element, and modules or elements described or illustrated herein as a single may be split into a plurality of modules or elements.
Abstract
Provided are a model training method, a training device, an electronic device and a computer readable medium. The method comprises: constructing an initial dataset of high-frequency vibration signals; screening the initial dataset to obtain a key dataset; using the key dataset to perform model training and obtain a fault identification model, wherein the initial dataset is represented by three dimensions, namely fault category, sample and feature. Selection steps comprise: for each feature, performing average distance calculation respectively on two dimensions, namely the sample and the fault category, and obtaining, on the basis of the average distance, a representation value of the importance of each feature; and forming a key dataset with the features of which representation values are greater than a first threshold value. According to the embodiments, the representation value of the importance of each feature is calculated on the basis of the average distance of the two dimensions, namely the sample and the fault category, and the initial dataset is screened according to the representation value of the importance, and model training is carried out by using the screened features, thereby improving the model training efficiency.
Description
This application claims the priority of the Chinese patent application No. 202210127454.X, entitled "Model training method, training device, electronic device and computer-readable medium", filed with the China Patent Office on February 11, 2022, the entire contents of which are incorporated by reference into this application.
The present disclosure relates to the field of artificial intelligence, and in particular to a model training method, a training device, an electronic device and a computer-readable medium.
Due to structural, machining and installation defects in parts such as rotors, bearings, housings, seals and foundations, or due to external effects, a large amount of industrial equipment vibrates during operation, and excessive vibration is often the main cause of equipment damage. According to statistics, for the rotating and reciprocating machinery widely used in industry, failures caused by vibration account for more than 60% of the total failure rate. Vibration monitoring and analysis of mechanical equipment is therefore very important. Compared with other state parameters, such as the temperature, pressure or flow rate of lubricating oil or internal fluid, or the motor current, vibration parameters usually reflect the running state of the unit more directly, quickly and accurately, and the vibration signal is usually one of the main bases for diagnosing the equipment status.
With the development of industrial modernization in China, large rotating equipment is used more and more widely, from steel, coal, electric power and cement to subways, airplanes, trains and ships, and the stable operation of such rotating equipment is increasingly important to the development of the national economy. During long-term operation, various failures inevitably occur. If early failure symptoms are not detected in time, then as they develop and expand past a certain critical point, the equipment is prone to sudden and serious failures, leading to a large amount of unplanned maintenance work. Such failures cause economic losses at best and casualties at worst.
Minor faults such as wear and degradation in a part of industrial rotating equipment are difficult to monitor effectively by manual identification alone, because their macroscopic signs are weak, and manual identification is time-consuming and laborious. The vibration signal, however, is generated and persists as the machine runs; even when the machine is in good condition, slight excitation produces vibration. For mechanical equipment there are usually two types of vibration sources of different natures: one is mechanical forced vibration caused by, for example, mass imbalance of moving parts, misalignment of geometric axes, poor gear meshing, improper fit of transmission parts, or excessive journal-bearing clearance, including periodic vibration, impact vibration and random vibration, which also produces noise; the other is the vibration response caused by structural response, self-excited vibration or environmental vibration, such as fluid surge vibration, bearing oil-film vibration, response vibration of the component itself and local vibration of the structure. Once an early failure occurs, the corresponding vibration behavior and noise level undergo a series of changes. Therefore, monitoring and diagnosing vibration signals with scientific methods plays an important role in improving the stable operation of rotating equipment. A monitoring and diagnosis system built on modern fault diagnosis technology can monitor the operating state of equipment in real time; through processing and analysis of the data, the cause of an equipment fault can be found and possible faults can be predicted, providing a scientific basis for preventing accidents and scheduling maintenance, thereby saving maintenance costs and improving equipment reliability and safety.
In recent years, with the rise of deep learning, researchers have applied neural network models to machine fault monitoring and diagnosis and made important progress. In practice, however, the data sets generated from high-frequency vibration signals are large and heterogeneous; using them directly for model training would undoubtedly consume a large amount of resources and be inefficient.
Contents of the invention
In view of this, the present disclosure aims to provide a model training method, a training device, an electronic device and a computer-readable medium, so as to improve the efficiency of model training.
According to a first aspect of the present disclosure, a model training method is provided, including:
constructing an initial data set of high-frequency vibration signals;
screening the initial data set to obtain a key data set; and
performing model training using the key data set to obtain a fault recognition model,
wherein the initial data set is characterized by three dimensions of fault category, sample and feature, and the selection step includes:
for each feature, performing an average distance calculation in each of the two dimensions of sample and fault category, and obtaining a characterization value of the importance of each feature based on the average distance calculation results of the two dimensions; and
selecting features whose characterization value of importance is greater than a first threshold to form the key data set.
In some embodiments, the method further includes: in the key data set, calculating the correlation between each feature and the other features, and screening the features of the key data set based on the correlations between features.
In some embodiments, calculating the correlation between each feature and the other features in the key data set and screening the features of the key data set based on the correlations between features includes:
sorting the features of the key data set in descending order of importance and putting the feature with the highest importance into a feature subset;
in order of importance, calculating the variance inflation factor of each feature in the key data set with respect to the feature subset;
comparing the variance inflation factor of each feature in the key data set with a second threshold, and if the variance inflation factor of the feature is less than the second threshold, putting the feature into the feature subset; and
repeating the steps of calculating the variance inflation factor of each feature in the key data set with respect to the feature subset and of comparing it with the second threshold (putting the feature into the feature subset if its variance inflation factor is less than the second threshold), until the comparison of all features of the key data set is completed.
In some embodiments, for each feature, performing the average distance calculation in the sample dimension includes:
calculating the average distance between different samples of each feature under the same fault category; and
based on the average distance between different samples of each feature under the same fault category, calculating the average of each feature over the multiple fault categories and recording it as the first average distance.
For each feature, performing the average distance calculation in the fault-category dimension includes:
calculating the average value of each feature over all samples under the same fault category; and
based on the average value of each feature over all samples under the same fault category, calculating the average distance between the per-category averages of each feature and recording it as the second average distance.
In some embodiments, obtaining the characterization value of the importance of each feature based on the average distance calculation results of the two dimensions includes:
calculating the variance factor of the first average distance of each feature;
calculating the variance factor of the second average distance of each feature;
calculating a compensation factor based on the two variance factors of each feature; and
obtaining the characterization value of the importance of each feature based on the compensation factor, the first average distance, and the second average distance of each feature.
In some embodiments, the average distance between different samples of each feature under the same fault category is calculated using formula (1):
The average of each feature over the multiple fault categories is calculated using formula (2):
Here, the initial data set includes K fault categories and feature values q_{i,k,j}, i = 1, 2, ..., I_k; k = 1, 2, ..., K; j = 1, 2, ..., J, where q_{i,k,j} denotes the j-th feature value of the i-th sample in the k-th fault category, I_k is the number of samples of the k-th fault category, K is the number of fault categories, and J is the number of features of each sample.
In some embodiments, the average value of each feature over all samples under the same fault category is calculated using formula (3):
The average distance between the per-category averages of each feature across different fault categories is calculated using formula (4):
Here, the initial data set includes K fault categories and feature values q_{i,k,j}, i = 1, 2, ..., I_k; k = 1, 2, ..., K; j = 1, 2, ..., J, where q_{i,k,j} denotes the j-th feature value of the i-th sample in the k-th fault category, I_k is the number of samples of the k-th fault category, K is the number of fault categories, and J is the number of features of each sample.
In some embodiments, for each feature in the key data set, the step of calculating its variance inflation factor with respect to the feature subset includes substituting into formula (5) to obtain the variance inflation factor of the feature:
where R_j^2 represents the goodness of fit of the regression equation X_j = β_0 + βX', and X' denotes the features of the feature subset.
According to a second aspect of the present disclosure, a model training device is provided, including:
a feature extraction unit, configured to construct an initial data set of high-frequency vibration signals;
a feature screening unit, configured to screen the initial data set to obtain a key data set; and
a model training unit, configured to perform model training using the key data set to obtain a fault recognition model,
wherein the initial data set is characterized by three dimensions of fault category, sample and feature, and the feature screening unit is configured to:
for each feature, perform an average distance calculation in each of the two dimensions of sample and fault category, and obtain a characterization value of the importance of each feature based on the average distance calculation results of the two dimensions; and
select features whose characterization value of importance is greater than a first threshold to form the key data set.
According to a third aspect of the present disclosure, an electronic device is provided, including a memory and a processor, where the memory stores computer instructions executable by the processor, and when the computer instructions are executed, the model training method described in any one of the above is implemented.
According to a fourth aspect of the present disclosure, a computer-readable medium is provided, which stores computer instructions executable by an electronic device; when the computer instructions are executed, the model training method described in any one of the above is implemented.
According to the model training method provided by the embodiments of the present disclosure, the data set is characterized by the three dimensions of fault category, sample and feature; the average distance calculation is performed in the two dimensions of sample and fault category, and the characterization value of the importance of each feature is obtained based on the two average distance calculation results. The initial data set can therefore be screened according to the characterization value of importance to obtain an optimized feature combination, and the optimized feature combination is then used for model training, which improves training efficiency.
The above and other objects, features and advantages of the present disclosure will become clearer from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a model training method provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for screening an initial data set by combining importance and correlation, provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a model training device provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an exemplary electronic device.
The present disclosure is described below based on embodiments, but it is not limited to these embodiments. In the following detailed description, some specific details are set forth; the present disclosure can be fully understood by those skilled in the art without these details. To avoid obscuring the essence of the present disclosure, well-known methods, procedures and flows are not described in detail. In addition, the drawings are not necessarily drawn to scale.
The model training method provided by an embodiment of the present disclosure is shown in FIG. 1.
Step S11 is to construct an initial data set of high-frequency vibration signals.
Step S12 is to screen the initial data set to obtain the key data set, that is, to perform feature screening on the initial data set containing rich features and filter out a set of features with high importance and low mutual correlation.
Step S13 is to use the key data set for training to obtain a fault classification model. That is, a fault recognition model is built (using ML models such as support vector machine (SVM) or random forest (RF)), the data set screened in step S12 is provided to the fault recognition model, the model is trained, the weight parameters of the fault recognition model are continuously optimized during training, and training stops once the loss function meets expectations.
The high-frequency vibration signal usually comes from a vibration sensor, which converts the analog high-frequency vibration signal into a digital signal; various methods are then used to extract features from the digitized high-frequency vibration signal. Existing feature extraction techniques are diverse and can be roughly divided into time-domain features, frequency-domain features, time-frequency-domain features, and features extracted for specific scenarios. Time-domain features are the most intuitive; the following indicators are usually calculated: mean, root mean square, peak-to-peak value, impulse factor, margin factor, kurtosis, etc. Frequency-domain features are mainly obtained by performing a Fourier transform on the signal, observing its spectrum from another perspective, and extracting various spectral features, such as mean value, maximum value, centroid frequency, spectral kurtosis and spectral power. Time-frequency-domain features are mainly used in the start-stop stage of the equipment: the time-frequency spectrum of the signal is obtained to observe how the signal frequency changes over time, and time-varying features are extracted. All the extracted features form a data set, which serves as the input for feature selection in the next step.
The screening step is the process of selecting some of the most effective features from the data pool to reduce the dimensionality of the data set. It is an important means of improving the performance of the learning algorithm and a key data-preprocessing step in pattern recognition; through the screening step, feature combinations that optimize specific system metrics (such as model recognition accuracy) can be obtained. Depending on whether the method is independent of the subsequent machine learning algorithm, the feature selection methods used in the screening step fall into three main categories:
(a) Filter methods: independent of the subsequent machine learning algorithm. Generally, the statistical properties of all training data are used to evaluate features directly, and various evaluation criteria are used to strengthen the correlation between features and classes while reducing the correlation among features. Common selection methods include the variance selection method, correlation coefficient method, chi-square test and mutual information method.
(b) Wrapper methods: the feature selection algorithm is treated as a component of the subsequent machine learning algorithm, and classification performance is used directly as the criterion for feature importance. The selected subset is ultimately used to construct the classification model, so features that yield higher classification performance are chosen directly, resulting in a classification model with higher performance; the model here can be any of various machine learning algorithms.
(c) Embedded methods: a machine learning model is trained directly to obtain a weight coefficient for each feature, and features are selected in descending order of these coefficients as a measure of importance.
This disclosure mainly improves on filter-type feature selection. First, when constructing the initial data set, the initial data set is represented as data in three dimensions: fault category, sample and feature. Specifically, for a segment of digital high-frequency signal output by the sensor, the segment is processed into a sample including multiple features; each of these features can be a time-domain, frequency-domain or time-frequency-domain feature, and the sample has an assigned fault category. The fault category to which a sample belongs can be obtained through manual labeling or from an already-trained fault recognition model; for a trained fault recognition model, a sample can be input into the model to obtain the fault category corresponding to that sample.
Then, the average distance calculation is performed in the two dimensions of sample and fault category to obtain the first average distance of each feature in the sample dimension and the second average distance in the fault-category dimension. Specifically, for the first average distance, the average distance between the samples of each feature within the same fault category is calculated first, giving one average distance per fault category for each feature; the mean of these per-category average distances is then taken as the first average distance of that feature. For the second average distance, the average value of each feature over all samples within the same fault category is calculated first, giving one average value per fault category for each feature; the average distance between these per-category averages is then calculated as the second average distance of that feature. The characterization value of the importance of each feature is then obtained from its first and second average distances, yielding characterization values for all features. These values are compared, and features are selected from the initial data set according to the comparison result to form the key data set (also characterized by the three dimensions of fault category, sample and feature); for example, as long as the characterization value of the importance of a feature is greater than a set threshold, the feature is assigned to the key data set.
In summary, for each feature, the average distance is calculated in the two dimensions of sample and fault category, giving two average distances for each feature; the characterization value evaluating the importance of each feature is obtained from these two average distances and is used to select a feature combination from the data set for training the fault recognition model.
However, although the features selected by the above distance evaluation method have high importance, they may also be highly correlated with each other, which increases the complexity of the model and affects the final fault identification effect. Therefore, in some embodiments, the correlation between each feature and the other features is further calculated, and screening continues using this correlation.
图2是本公开实施例提供的结合重要性和相关度对初始数据集进行筛选的方法流程图,具体包括以下步骤。Fig. 2 is a flowchart of a method for screening an initial data set in combination with importance and relevance provided by an embodiment of the present disclosure, which specifically includes the following steps.
在步骤S121中,新建一个空的特征子集。In step S121, create an empty feature subset.
在步骤S122中,计算每个特征的第一平均距离和第二平均距离,并据此得到每个特征的重要性的表征值。In step S122, the first average distance and the second average distance of each feature are calculated, and the characteristic value of the importance of each feature is obtained accordingly.
在步骤S123中,将重要性的表征值大于第一阈值的特征组成第一特征集合。In step S123, the features whose importance characteristic value is greater than the first threshold are formed into a first feature set.
在步骤S124中,将第一特征集合的特征按照重要性从大到小排序并将重要性最高的特征取出放入特征子集。In step S124, the features of the first feature set are sorted in descending order of importance, and the feature with the highest importance is taken out and put into the feature subset.
在步骤S125中，按照重要性排序，逐个计算第一特征集合中的每个特征与特征子集的方差膨胀因子。In step S125, following the order of importance, the variance inflation factor of each feature in the first feature set with respect to the feature subset is calculated one by one.
在步骤S126中,判断该特征的方差膨胀因子是否小于第二阈值,如果是,则执行步骤S127,否则执行步骤S125。步骤S125至S127组成一个循环,该循环的重复次数为(N-1)次,N为第一特征集合包含的特征的数量。In step S126, it is judged whether the variance inflation factor of the feature is smaller than the second threshold, if yes, execute step S127, otherwise execute step S125. Steps S125 to S127 form a loop, and the number of repetitions of this loop is (N-1) times, where N is the number of features included in the first feature set.
在步骤S127中,将该特征放入到特征子集。In step S127, put the feature into a feature subset.
在步骤S128中,输出特征子集。In step S128, the feature subset is output.
在本实施例中，根据上一实施例的计算方式计算出每个特征的第一平均距离和第二平均距离，并据此得到每个特征的重要性的表征值，然后以重要性的表征值大于第一阈值为筛选条件得到第一特征集合，将第一特征集合的特征按照重要性的表征值从大到小排序并将重要性的表征值最高的特征取出放入特征子集，然后按照重要性从大到小的排序，计算第一特征集合中的每个特征与特征子集的方差膨胀因子，并且将方差膨胀因子小于第二阈值的特征放入到特征子集中。这意味着，当一个特征与特征子集的方差膨胀因子小于第二阈值时，认为该特征与特征子集的相关性相对较小，因此可将该特征放入到特征子集中。应理解，特征子集中的特征数量随着循环次数增加而增加，即用来计算方差膨胀因子的特征数量逐渐增加，因此关键数据集中的特征与特征子集的方差膨胀因子计算的计算量和复杂度也随之增加。In this embodiment, the first average distance and the second average distance of each feature are calculated as in the previous embodiment, and the characterization value of the importance of each feature is obtained accordingly. The first feature set is then obtained by using, as the screening condition, an importance characterization value greater than the first threshold; the features of the first feature set are sorted by importance characterization value in descending order, and the feature with the highest value is taken out and put into the feature subset. Then, following the descending order of importance, the variance inflation factor of each feature in the first feature set with respect to the feature subset is calculated, and any feature whose variance inflation factor is less than the second threshold is put into the feature subset. In other words, when the variance inflation factor of a feature with respect to the feature subset is less than the second threshold, the correlation between that feature and the feature subset is considered relatively small, so the feature can be added to the feature subset. It should be understood that the number of features in the feature subset grows as the loop iterates, i.e. the number of features used to calculate the variance inflation factor gradually increases, so the computational cost and complexity of calculating the variance inflation factor between the features of the key data set and the feature subset also increase accordingly.
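For illustration only, the screening loop of FIG. 2 (steps S121-S128) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the least-squares computation of the variance inflation factor, and both threshold values are illustrative choices and are not mandated by the present disclosure.

```python
# Minimal sketch of the screening loop of FIG. 2 (steps S121-S128).
# The OLS-based VIF computation and the two thresholds are illustrative assumptions.
import numpy as np

def variance_inflation_factor(X_subset: np.ndarray, x_j: np.ndarray) -> float:
    """VIF of feature x_j with respect to the features already in the subset,
    taken here as 1 / (1 - R^2) of the regression x_j = beta_0 + beta * X_subset."""
    A = np.column_stack([np.ones(len(x_j)), X_subset])   # design matrix with intercept beta_0
    beta, *_ = np.linalg.lstsq(A, x_j, rcond=None)
    residual = x_j - A @ beta
    ss_res = float(residual @ residual)
    ss_tot = float(((x_j - x_j.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0.0 else 0.0
    return 1.0 / max(1.0 - r2, 1e-12)

def screen_features(X: np.ndarray, importance: np.ndarray,
                    importance_threshold: float, vif_threshold: float) -> list:
    """X: (samples, features) matrix; importance: per-feature characterization values."""
    first_set = [j for j in range(X.shape[1]) if importance[j] > importance_threshold]  # S123
    first_set.sort(key=lambda j: importance[j], reverse=True)                           # S124
    subset = [first_set[0]]                    # S121/S124: seed with the most important feature
    for j in first_set[1:]:                    # S125-S127, repeated (N-1) times
        if variance_inflation_factor(X[:, subset], X[:, j]) < vif_threshold:            # S126
            subset.append(j)                   # S127
    return subset                              # S128
```

Called as, e.g., screen_features(X, importance, importance_threshold=0.5, vif_threshold=10.0), the sketch returns the column indices of the selected feature subset; a VIF threshold of 10 is only a commonly used rule of thumb, not a value specified by this disclosure.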
下面结合公式，更具体介绍如何在样本和故障类别两个维度进行平均距离计算，得到每个特征的两个平均距离，并基于这两个平均距离得到评价每个特征的重要性的表征值。The following describes in more detail, with reference to the formulas, how the average distance is calculated in the sample and fault-category dimensions to obtain the two average distances of each feature, and how the characterization value for evaluating the importance of each feature is obtained from these two average distances.
假设初始数据集包含K种故障类别：q_{i,k,j}，i=1,2,…,I_k；k=1,2,…,K；j=1,2,…,J，其中，q_{i,k,j}代表第k种故障类别中第i个样本的第j个特征值。I_k是第k个故障类别的样本数量，K是故障类别数量，J是每个样本的特征数量。Suppose the initial data set contains K fault categories: q_{i,k,j}, i=1,2,…,I_k; k=1,2,…,K; j=1,2,…,J, where q_{i,k,j} denotes the value of the j-th feature of the i-th sample in the k-th fault category, I_k is the number of samples of the k-th fault category, K is the number of fault categories, and J is the number of features per sample.
其中在样本和故障类别两个维度进行平均距离计算,得到每个特征的两个平均距离的具体操作步骤如下:Among them, the average distance is calculated in the two dimensions of sample and fault category, and the specific operation steps to obtain the two average distances of each feature are as follows:
(1)计算相同故障类别下不同样本间的平均距离:(1) Calculate the average distance between different samples under the same fault category:
(2)然后计算每个特征在K个故障类别的平均值:(2) Then calculate the average value of each feature over the K fault categories:
(3)计算相同故障类别下所有样本不同特征各自的平均值:(3) Calculate the average value of each feature over all samples under the same fault category:
(4)然后计算不同故障类别间各个特征平均值的平均距离:(4) Then calculate the average distance between the per-category averages of each feature across different fault categories:
然后基于这两个平均距离得到评价每个特征的重要性的表征值的具体操作步骤如下:Then, based on these two average distances, the specific operation steps to obtain the characterization value for evaluating the importance of each feature are as follows:
(5)定义第一平均距离的方差因子 (5) Define the variance factor of the first average distance
(6)定义第二平均距离的方差因子 (6) Define the variance factor of the second average distance
(7)综合考虑两个方差因子定义补偿因子:(7) Define a compensation factor by jointly considering the two variance factors:
(8)计算考虑补偿因子后的系数:(8) Calculate the coefficient after applying the compensation factor:
(9)归一化得到特征重要性指标:(9) Normalize to obtain the feature importance index:
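The formula images of steps (1)-(9) are not reproduced in this text. The following Python sketch is therefore offered only as one plausible reading of those steps, modeled on the widely used compensation distance evaluation technique that the textual description suggests; in particular, the max/min form of the variance factors in steps (5)-(7) is an assumption, and the exact expressions of the original publication may differ.

```python
# Illustrative sketch of steps (1)-(9); the exact formulas of the disclosure may differ.
import numpy as np

def _avg_pairwise_distance(values: np.ndarray) -> float:
    """Mean absolute distance over all pairs of distinct entries of a 1-D array."""
    n = len(values)
    if n < 2:
        return 0.0
    diff = np.abs(values[:, None] - values[None, :])
    return float(diff.sum() / (n * (n - 1)))

def importance_indicators(samples: list) -> np.ndarray:
    """samples[k] is an (I_k, J) array holding the samples of fault category k.
    Returns one normalized importance indicator per feature (step (9))."""
    K, J = len(samples), samples[0].shape[1]
    # (1) average distance between different samples of the same fault category
    d_within = np.array([[_avg_pairwise_distance(q[:, j]) for j in range(J)] for q in samples])  # (K, J)
    # (2) first average distance: mean of (1) over the K fault categories
    d1 = d_within.mean(axis=0)                                                                   # (J,)
    # (3) mean value of each feature over all samples of the same fault category
    means = np.array([q.mean(axis=0) for q in samples])                                          # (K, J)
    # (4) second average distance: average distance between the category means
    d2 = np.array([_avg_pairwise_distance(means[:, j]) for j in range(J)])                       # (J,)
    # (5)/(6) variance factors of the two average distances (assumed here to be max/min ratios)
    v1 = d_within.max(axis=0) / (d_within.min(axis=0) + 1e-12)
    off_diag = ~np.eye(K, dtype=bool)
    pair_dist = np.array([np.abs(means[:, None, j] - means[None, :, j])[off_diag] for j in range(J)])
    v2 = pair_dist.max(axis=1) / (pair_dist.min(axis=1) + 1e-12)
    # (7) compensation factor combining the two variance factors
    lam = 1.0 / (v1 / v1.max() + v2 / v2.max())
    # (8) compensated coefficient: large when categories are far apart (d2) while samples
    #     of the same category stay close together (d1)
    alpha = lam * d2 / (d1 + 1e-12)
    # (9) normalize so that the most important feature scores 1
    return alpha / alpha.max()
```

A characterization value computed in this way can then be supplied as the importance array to the screening procedure of FIG. 2 (e.g. the screen_features sketch given earlier).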
在一些实施例中，提出第j个特征的方差膨胀因子计算如下:In some embodiments, it is proposed that the variance inflation factor of the j-th feature be calculated as follows:
其中，R_j^2代表回归方程X_j=β_0+βX′的拟合优度，X′包括除了X_j之外要进行相关性比对的其他特征。对应图2的实施例，X′即是指特征子集。Here R_j^2 denotes the goodness of fit of the regression equation X_j = β_0 + βX′, and X′ comprises the other features, besides X_j, against which the correlation is to be compared. In the embodiment of FIG. 2, X′ refers to the feature subset.
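The formula itself (an image in the original publication) is not reproduced above. Given the reference to the goodness of fit R_j^2 of the regression X_j = β_0 + βX′, it is presumably the standard variance-inflation-factor expression, reconstructed here as an assumption:

```latex
% Assumed reconstruction of the VIF expression referenced above (formula (5) of the claims);
% R_j^2 is the coefficient of determination of the regression X_j = \beta_0 + \beta X'.
\mathrm{VIF}_j \;=\; \frac{1}{1 - R_j^{2}}
```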
下面依据图2的实施例再进行简单说明。首先最开始状态下，特征子集里只有一个重要性为最高的特征，然后从关键数据集中取出重要性次高的特征，并在该特征与特征子集里的重要性为最高的特征之间计算方差膨胀因子，如果方差膨胀因子小于第二阈值，则将该特征放入到特征子集中，然后从关键数据集中取出重要性为第三的特征，并计算该特征与特征子集中的重要性为最高和次高的两个特征的组合之间的方差膨胀因子，如果方差膨胀因子小于第二阈值，则将该特征放入到特征子集中，依次类推。A brief walk-through based on the embodiment of FIG. 2 follows. In the initial state, the feature subset contains only the single feature of highest importance. The feature of second-highest importance is then taken from the key data set, and the variance inflation factor is calculated between this feature and the highest-importance feature in the feature subset; if the variance inflation factor is less than the second threshold, the feature is put into the feature subset. Next, the feature of third-highest importance is taken from the key data set, and the variance inflation factor is calculated between this feature and the combination of the two features (highest and second highest importance) in the feature subset; if it is less than the second threshold, the feature is put into the feature subset, and so on.
相应地,本公开实施例提供一种故障检测装置,包括特征提取单元301,特征筛选单元302和模型训练单元303。Correspondingly, an embodiment of the present disclosure provides a fault detection device, including a feature extraction unit 301 , a feature screening unit 302 and a model training unit 303 .
特征提取单元301用于构建高频振动信号的初始数据集;Feature extraction unit 301 is used to construct the initial data set of high-frequency vibration signal;
特征筛选单元302用于对初始数据集进行筛选,以得到关键数据集;The feature screening unit 302 is used to screen the initial data set to obtain key data sets;
模型训练单元303用于采用关键数据集进行模型训练，以得到故障识别模型。其中，初始数据集采用故障类别、样本和特征三个维度表征，而特征筛选单元302包括：对于每个特征，在样本和故障类别两个维度分别进行平均距离计算，并基于两个维度的平均距离计算结果得到每个特征的重要性的表征值，选择重要性的表征值大于第一阈值的特征组成关键数据集。The model training unit 303 is configured to perform model training with the key data set to obtain the fault recognition model. The initial data set is characterized by the three dimensions of fault category, sample and feature, and the feature screening unit 302 is configured to: for each feature, calculate the average distance in the sample dimension and in the fault-category dimension respectively, obtain the characterization value of the importance of each feature based on the average-distance results of the two dimensions, and select the features whose importance characterization value is greater than the first threshold to form the key data set.
关于特征提取单元301,特征筛选单元302和模型训练单元303的更详细的说明,可参见上文描述。For a more detailed description of the feature extraction unit 301 , the feature screening unit 302 and the model training unit 303 , please refer to the above description.
图4是一个示例性的电子设备400的结构示意图。该电子装置可用于执行包含上文所述的故障识别模型的应用系统,同时该电子装置可用于对故障识别模型进行训练。如图上所示,服务器400包括经由总线405耦接的调度器401、存储单元403、I/O接口404、多个模型加速单元402。FIG. 4 is a schematic structural diagram of an exemplary electronic device 400 . The electronic device can be used to execute the application system including the above-mentioned fault identification model, and at the same time, the electronic device can be used to train the fault identification model. As shown in the figure, the server 400 includes a scheduler 401 , a storage unit 403 , an I/O interface 404 , and multiple model acceleration units 402 coupled via a bus 405 .
存储单元403可以包括易失性存储单元形式的可读介质,例如随机存取存储单元(RAM)和/或高速缓存存储单元。存储单元403还可以包括非易失性存储单元形式的可读介质,例如,只读存储单元(ROM)、闪存存储器和各种磁盘存储器。The storage unit 403 may include readable media in the form of volatile storage units, such as random access memory units (RAM) and/or cache storage units. The storage unit 403 may also include a readable medium in the form of a non-volatile storage unit, for example, a read only memory unit (ROM), a flash memory, and various magnetic disk memories.
存储单元403可以存储各种程序模块和数据，各种程序模块包括操作系统、提供诸如文本处理、视频播放、软件编辑和编译等功能的应用程序。这些应用程序的可执行代码被调度器401从存储单元403中读出并执行，以实现这些程序模块预定的操作。调度器401一般为处理器（CPU）。特别地，存储单元403存储有基于上文所述的故障检测系统。The storage unit 403 may store various program modules and data. The program modules include an operating system and application programs providing functions such as text processing, video playback, and software editing and compiling. The executable code of these application programs is read from the storage unit 403 and executed by the scheduler 401 to carry out the operations intended for these program modules. The scheduler 401 is typically a processor (CPU). In particular, the storage unit 403 stores the fault detection system described above.
总线405可以表示几类总线结构中的一种或多种，包括存储单元总线或者存储单元控制器、外围总线、图形加速端口、处理单元或者使用多种总线结构中的任意总线结构的局域总线。The bus 405 may represent one or more of several types of bus structures, including a memory-unit bus or memory-unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
服务器400可以与一个或多个外部设备（例如键盘、指向设备、蓝牙设备等）通信，还可与一个或者多个使得用户能与该服务器400交互的设备通信，和/或与使得服务器400能与一个或多个其它计算设备进行通信的任何设备（例如路由器、调制解调器等等）通信。这种通信可以通过输入/输出（I/O）接口404进行。并且，服务器400还可以通过网络适配器（未示出）与一个或者多个网络（例如局域网（LAN），广域网（WAN）和/或公共网络，例如因特网）通信。例如通过网络适配器，图1中的终端103可访问服务器400。应当明白，尽管图中未示出，基于服务器400可使用其它硬件和/或软件模块，包括但不限于：微代码、设备驱动器、冗余处理单元、外部磁盘驱动阵列、RAID系统、磁带驱动器以及数据备份存储系统等。The server 400 may communicate with one or more external devices (such as a keyboard, a pointing device or a Bluetooth device), with one or more devices that enable a user to interact with the server 400, and/or with any device (such as a router or a modem) that enables the server 400 to communicate with one or more other computing devices. Such communication may take place through the input/output (I/O) interface 404. Moreover, the server 400 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter (not shown); through the network adapter, for example, the terminal 103 in FIG. 1 can access the server 400. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the server 400, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
如图上所示，服务器400包括多个模型加速单元402。传统的处理器架构设计，逻辑控制方面十分有效，而在大规模并行计算方面则效率不够，因此对模型计算并不高效。为此，开发出模型加速单元，不同模型可适配于不同的模型加速单元。模型加速单元例如为神经网络加速单元(NPU)。NPU采用数据驱动并行计算的架构，是用于处理各神经网络节点的大量运算（例如卷积、池化等）的处理单元。再或者为图形处理单元(GPU)，用于专门做图像和图形相关的运算工作，由于图形处理单元采用大量用于专门做图形计算的计算单元，使显卡减少了对CPU的依赖，承担了CPU原来承担的一些计算密集的图形图像处理工作，因此对于图像数据的处理效率大大提高。多个模型加速单元402接受调度器401的控制，通过调度器401的控制，多个模型加速单元402能够协同工作。As shown in the figure, the server 400 includes multiple model acceleration units 402. Traditional processor architectures are effective for logic control but not efficient enough for large-scale parallel computing, and are therefore not efficient for model computation. For this reason, model acceleration units have been developed, and different models can be adapted to different model acceleration units. A model acceleration unit is, for example, a neural-network acceleration unit (NPU), which adopts a data-driven parallel-computing architecture and is a processing unit used to handle the large number of operations (such as convolution and pooling) of the nodes of a neural network. Alternatively, it may be a graphics processing unit (GPU), which is dedicated to image- and graphics-related computation; because a GPU contains a large number of computing units dedicated to graphics computation, the graphics card reduces its dependence on the CPU and takes over some of the computation-intensive graphics and image processing originally carried by the CPU, so the processing efficiency for image data is greatly improved. The multiple model acceleration units 402 are controlled by the scheduler 401 and, under its control, can work cooperatively.
以本公开各个实施例为例，参考图1和图3，特征提取单元301所对应的步骤S11和特征筛选单元302所对应的步骤S12可由CPU执行，而模型训练单元303所对应的S13则可以由调度器401和多个模型加速单元402协作完成，此时调度器401总控过程，将已经训练好的故障分类模型的执行放到多个模型加速单元402上并且汇总多个模型加速单元402的执行结果，多个模型加速单元402之间的数据交互可通过调度器401实现，也可由多个模型加速单元402直接进行数据交互。Taking the embodiments of the present disclosure as an example, and referring to FIG. 1 and FIG. 3, step S11 corresponding to the feature extraction unit 301 and step S12 corresponding to the feature screening unit 302 can be executed by the CPU, while step S13 corresponding to the model training unit 303 can be completed by the scheduler 401 and the multiple model acceleration units 402 in cooperation. In that case the scheduler 401 controls the overall process, distributes the execution of the trained fault classification model onto the multiple model acceleration units 402 and aggregates their execution results; the data exchange among the multiple model acceleration units 402 can be realized through the scheduler 401, or the model acceleration units 402 can exchange data with one another directly.
应理解，本公开实施例提供的模型训练方法和模型训练装置是利用监督式的距离评估方法从数据集中选择出优化的特征组合，以达到提高模型训练效率的目的。监督式的距离评估方法是指利用故障类别（该维度的数据需要标注或通过已经训练好的故障识别模型才能产生）维度的平均距离进行评估。It should be understood that the model training method and model training device provided by the embodiments of the present disclosure use a supervised distance evaluation method to select an optimized feature combination from the data set, so as to improve model training efficiency. A supervised distance evaluation method here means that the evaluation uses the average distance in the fault-category dimension (the data of this dimension must be labeled, or be produced by an already trained fault recognition model).
另外，本公开实施例还提供一个计算机可读介质，用于存储实现上述模型训练方法的计算机可读指令。In addition, an embodiment of the present disclosure further provides a computer-readable medium for storing computer-readable instructions that implement the above model training method.
本公开的商业价值Commercial value of this disclosure
本公开实施例提供了模型训练方法。相比现有技术,本公开实施例采用监督式的距离评估方法从数据集中选择出优化的特征组合进行模型训练从而可达到提高模型训练效率和节约计算资源的目的。因此该模型训练方法具有一定的实用价值和经济价值。Embodiments of the present disclosure provide a model training method. Compared with the prior art, the embodiment of the present disclosure adopts a supervised distance evaluation method to select an optimized feature combination from a data set for model training, so as to achieve the purpose of improving model training efficiency and saving computing resources. Therefore, the model training method has certain practical value and economic value.
需要领会,以上所述仅为本公开的优选实施例,并不用于限制本公开,对于本领域技术人员而言,本说明书的实施例存在许多变型。凡在本公开的精神和原理之内所作的任何修改、等同替换、改进等,均应包含在本公开的保护范围之内。It should be appreciated that the above descriptions are only preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. For those skilled in the art, there are many variations to the embodiments of the present disclosure. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure shall be included in the protection scope of the present disclosure.
应该理解,本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同或相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。It should be understood that each embodiment in this specification is described in a progressive manner, the same or similar parts of each embodiment can be referred to each other, and each embodiment focuses on the difference from other embodiments .
应该理解,上述对本说明书特定实施例进行了描述。其它实施例在权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。It should be understood that the foregoing describes specific embodiments of the present specification. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Multitasking and parallel processing are also possible or may be advantageous in certain embodiments.
应该理解,本文用单数形式描述或者在附图中仅显示一个的元件并不代表将该元件的数量限于一个。此外,本文中被描述或示出为分开的模块或元件可被组合为单个模块或元件,且本文中被描述或示出为单个的模块或元件可被拆分为多个模块或元件。It should be understood that describing an element herein in the singular or showing only one in a drawing does not mean limiting the number of that element to one. Furthermore, modules or elements described or illustrated herein as separate may be combined into a single module or element, and modules or elements described or illustrated herein as a single may be split into a plurality of modules or elements.
还应理解,本文采用的术语和表述方式只是用于描述,本说明书的一个或多个实施例并不应局限于这些术语和表述。使用这些术语和表述并不意味着排除任何示意和描述(或其中部分)的等效特征,应认识到可能存在的各种修改也应包含在权利要求范围内。其他修改、变化和替换也可能存在。相应的,权利要求应视为覆盖所有这些等效物。
It should also be understood that the terms and expressions used herein are for description only, and one or more embodiments of this specification should not be limited to these terms and expressions. The use of these terms and expressions does not mean to exclude any equivalent features shown and described (or parts thereof), and it should be recognized that various modifications may also be included within the scope of the claims. Other modifications, changes and substitutions are also possible. Accordingly, the claims should be read to cover all such equivalents.
Claims (11)
- 一种模型训练方法,包括:A model training method, comprising:构建高频振动信号的初始数据集;Construct an initial data set of high-frequency vibration signals;对所述初始数据集进行筛选,以得到关键数据集;Screening the initial data set to obtain key data sets;采用所述关键数据集进行模型训练,以得到故障识别模型,Using the key data set to perform model training to obtain a fault identification model,其中,所述初始数据集采用故障类别、样本和特征三个维度表征,则所述筛选包括:Wherein, the initial data set is characterized by three dimensions of fault category, sample and feature, then the screening includes:对于每个特征,在所述样本和所述故障类别两个维度分别进行平均距离计算,并基于两个维度的平均距离计算结果得到每个特征的重要性的表征值;以及For each feature, the average distance calculation is performed in the two dimensions of the sample and the fault category, and the representative value of the importance of each feature is obtained based on the average distance calculation results of the two dimensions; and选择重要性的表征值大于第一阈值的特征组成所述关键数据集。The key data set is composed of features whose importance characteristic value is greater than the first threshold.
- 根据权利要求1所述的模型训练方法,还包括:在所述关键数据集中,计算每个特征与其他特征的相关性并基于特征之间的相关性对所述关键数据集的特征进行筛选。The model training method according to claim 1, further comprising: in the key data set, calculating the correlation between each feature and other features and filtering the features of the key data set based on the correlation between features.
- 根据权利要求1所述的模型训练方法,其中,所述在所述关键数据集中,计算每个特征与其他特征的相关性并基于特征之间的相关性对所述关键数据集的特征进行筛选包括:The model training method according to claim 1, wherein, in the key data set, the correlation between each feature and other features is calculated and the features of the key data set are screened based on the correlation between features include:将所述关键数据集的特征按照重要性从大到小排序并将重要性最高的特征取出放入特征子集;Sorting the features of the key data set in descending order of importance and taking out the features with the highest importance and putting them into the feature subset;按照重要性排序,计算所述关键数据集中的每个特征与所述特征子集的方差膨胀因子;Sorting by importance, calculating the variance inflation factor of each feature in the key data set and the feature subset;将所述关键数据集中每个特征的方差膨胀因子与第二阈值进行比较,如果该特征的方差膨胀因子小于第二阈值,则将该特征放入到特征子集,comparing the variance inflation factor of each feature in the key data set with a second threshold, and if the variance inflation factor of the feature is less than the second threshold, putting the feature into a feature subset,重复执行所述计算所述关键数据集中的每个特征与所述特征子集的方差膨胀因子和将所述关键数据集中每个特征的方差膨胀因子与第二阈值进行比较,如果该特征的方差膨胀因子小于第二阈值,则将该特征放入到特征子集的步骤,直到完成所述关键数据集的所有特征的比较。repeating said calculating the variance inflation factor for each feature in said key dataset and said subset of features and comparing the variance inflation factor for each feature in said key dataset to a second threshold, if the variance of the feature If the expansion factor is smaller than the second threshold, then put the feature into the feature subset until the comparison of all the features of the key data set is completed.
- 根据权利要求1所述的模型训练方法,其中,对于每个特征,在所述样本维度进行平均距离计算包括:The model training method according to claim 1, wherein, for each feature, calculating the average distance in the sample dimension comprises:计算每个特征在相同故障类别下不同样本间的平均距离;Calculate the average distance between different samples of each feature under the same fault category;基于每个特征在相同故障类别下不同样本间的平均距离,计算每个特征在多个故障类别的平均值,并将其记为第一平均距离;Based on the average distance between different samples of each feature under the same fault category, calculate the average value of each feature in multiple fault categories, and record it as the first average distance;对于每个特征,在所述故障类别维度进行平均距离计算包括:For each feature, calculating the average distance in the fault category dimension includes:计算相同故障类别下所有样本中每个特征的平均值;Calculate the average value of each feature in all samples under the same fault category;基于相同故障类别下所有样本中每个特征的平均值,计算不同故障类别间每个特征平均值的平均距离,并将其记为第二平均距离。Based on the average value of each feature in all samples under the same fault category, the average distance of each feature average among different fault categories is calculated, and recorded as the second average distance.
- 根据权利要求4所述的模型训练方法,其中,所述并基于两个维度的平均距离计算结果得到每个特征的重要性的表征值包括:The model training method according to claim 4, wherein said obtaining the characterization value of the importance of each feature based on the calculation result of the average distance of two dimensions comprises:计算每个特征的第一平均距离的方差因子, Calculate the variance factor of the first mean distance for each feature,计算每个特征的第二平均距离的方差因子;Calculate the variance factor of the second mean distance for each feature;基于每个特征的两个方差因子计算补偿因子;Calculate compensation factors based on the two variance factors for each feature;基于每个特征的补偿因子、第一平均距离和第二平均距离得到每个特征的重要性的表征值。A representative value of the importance of each feature is obtained based on the compensation factor of each feature, the first average distance, and the second average distance.
- 根据权利要求4所述的模型训练方法,其中,所述计算每个特征在相同故障类别下不同样本间的平均距离采用公式(1):
The model training method according to claim 4, wherein the calculation of the average distance between different samples of each feature under the same fault category adopts formula (1):
所述计算每个特征在多个故障类别的平均值采用公式(2):
The average value of each feature in multiple fault categories is calculated using formula (2):
其中，所述初始数据集包括K种故障类别，q_{i,k,j}，i=1,2,…,I_k；k=1,2,…,K；j=1,2,…,J，其中，q_{i,k,j}代表第k种故障类别中第i个样本的第j个特征值，I_k是第k个故障类别的样本数量，K是故障类别数量，J是每个样本的特征数量。Wherein the initial data set includes K fault categories: q_{i,k,j}, i=1,2,…,I_k; k=1,2,…,K; j=1,2,…,J, where q_{i,k,j} denotes the value of the j-th feature of the i-th sample in the k-th fault category, I_k is the number of samples of the k-th fault category, K is the number of fault categories, and J is the number of features per sample.
- 根据权利要求4所述的模型训练方法，其中，所述计算相同故障类别下所有样本中每个特征的平均值采用公式(3):
The model training method according to claim 4, wherein the calculation of the average value of each feature in all samples under the same fault category adopts formula (3):
所述计算不同故障类别间每个特征平均值的平均距离采用公式(4):
The average distance of each characteristic mean value between the described calculation different fault categories adopts formula (4):
其中，所述初始数据集包括K种故障类别，q_{i,k,j}，i=1,2,…,I_k；k=1,2,…,K；j=1,2,…,J，其中，q_{i,k,j}代表第k种故障类别中第i个样本的第j个特征值，I_k是第k个故障类别的样本数量，K是故障类别数量，J是每个样本的特征数量。Wherein the initial data set includes K fault categories: q_{i,k,j}, i=1,2,…,I_k; k=1,2,…,K; j=1,2,…,J, where q_{i,k,j} denotes the value of the j-th feature of the i-th sample in the k-th fault category, I_k is the number of samples of the k-th fault category, K is the number of fault categories, and J is the number of features per sample.
- 根据权利要求3的模型训练方法，其中，对于所述关键数据集中的每个特征，与所述特征子集的方差膨胀因子的计算步骤包括:代入到公式(5)，以得到该特征的方差膨胀因子，The model training method according to claim 3, wherein, for each feature in the key data set, the step of calculating the variance inflation factor with respect to the feature subset comprises: substituting into formula (5) to obtain the variance inflation factor of the feature,
其中，R_j^2代表回归方程X_j=β_0+βX′的拟合优度，X′为所述特征子集的特征。where R_j^2 denotes the goodness of fit of the regression equation X_j = β_0 + βX′, and X′ denotes the features of the feature subset.
- 一种模型训练装置，包括:A model training device, comprising:特征提取单元，用于构建高频振动信号的初始数据集;a feature extraction unit, configured to construct an initial data set of high-frequency vibration signals;特征筛选单元，用于对所述初始数据集进行筛选，以得到关键数据集;a feature screening unit, configured to screen the initial data set to obtain a key data set;模型训练单元，用于采用所述关键数据集进行模型训练，以得到故障识别模型，其中，a model training unit, configured to perform model training with the key data set to obtain a fault recognition model, wherein所述初始数据集采用故障类别、样本和特征三个维度表征，则所述特征筛选单元包括:the initial data set is characterized by the three dimensions of fault category, sample and feature, and the feature screening unit is configured to:对于每个特征，在所述样本和所述故障类别两个维度分别进行平均距离计算，并基于两个维度的平均距离计算结果得到每个特征的重要性的表征值;以及for each feature, perform the average distance calculation in the two dimensions of the sample and the fault category respectively, and obtain the characterization value of the importance of each feature based on the average distance calculation results of the two dimensions; and选择重要性的表征值大于第一阈值的特征组成所述关键数据集。select the features whose importance characterization value is greater than the first threshold to form the key data set.
- 一种电子设备，包括存储器和处理器，所述存储器还存储有可由所述处理器执行的计算机指令，所述计算机指令被执行时，实现如权利要求1至8任一项所述的模型训练方法。An electronic device, comprising a memory and a processor, the memory further storing computer instructions executable by the processor, wherein, when the computer instructions are executed, the model training method according to any one of claims 1 to 8 is implemented.
- 一种计算机可读介质,所述计算机可读介质存储有可由电子设备执行的计算机指令,所述计算机指令被执行时,实现所述如权利要求1至8任一项所述的模型训练方法。 A computer-readable medium, the computer-readable medium stores computer instructions executable by an electronic device, and when the computer instructions are executed, the model training method according to any one of claims 1 to 8 is implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210127454.XA CN114169539A (en) | 2022-02-11 | 2022-02-11 | Model training method, training device, electronic device, and computer-readable medium |
CN202210127454.X | 2022-02-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023151488A1 true WO2023151488A1 (en) | 2023-08-17 |
Family
ID=80489724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/074026 WO2023151488A1 (en) | 2022-02-11 | 2023-01-31 | Model training method, training device, electronic device and computer-readable medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114169539A (en) |
WO (1) | WO2023151488A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114169539A (en) * | 2022-02-11 | 2022-03-11 | 阿里巴巴(中国)有限公司 | Model training method, training device, electronic device, and computer-readable medium |
CN114662702A (en) * | 2022-03-31 | 2022-06-24 | 北京百度网讯科技有限公司 | Fault detection method, device, electronic equipment and medium |
CN114764594A (en) * | 2022-04-02 | 2022-07-19 | 阿里巴巴(中国)有限公司 | Classification model feature selection method, device and equipment |
CN115048985B (en) * | 2022-05-17 | 2024-02-13 | 国网浙江省电力有限公司嘉兴供电公司 | Electrical equipment fault discrimination method |
CN117909840A (en) * | 2024-03-19 | 2024-04-19 | 之江实验室 | Model training method and device, storage medium and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111126426A (en) * | 2019-10-11 | 2020-05-08 | 平安普惠企业管理有限公司 | Feature selection method and device, computer equipment and storage medium |
WO2020199591A1 (en) * | 2019-03-29 | 2020-10-08 | 平安科技(深圳)有限公司 | Text categorization model training method, apparatus, computer device, and storage medium |
CN114169539A (en) * | 2022-02-11 | 2022-03-11 | 阿里巴巴(中国)有限公司 | Model training method, training device, electronic device, and computer-readable medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615894B (en) * | 2015-02-13 | 2018-09-28 | 上海中医药大学 | A kind of Chinese Medicine Diagnoses System based on k neighbour's label certain weights features |
CN107727395B (en) * | 2017-07-21 | 2019-12-03 | 中国矿业大学 | A kind of Method for Bearing Fault Diagnosis based on full variation and uncompensation distance assessment |
CN109974782B (en) * | 2019-04-10 | 2021-03-02 | 郑州轻工业学院 | Equipment fault early warning method and system based on big data sensitive characteristic optimization selection |
CN111522632A (en) * | 2020-04-14 | 2020-08-11 | 重庆邮电大学 | Hadoop configuration parameter selection method based on kernel clustering feature selection |
- 2022-02-11 CN CN202210127454.XA patent/CN114169539A/en active Pending
- 2023-01-31 WO PCT/CN2023/074026 patent/WO2023151488A1/en unknown
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020199591A1 (en) * | 2019-03-29 | 2020-10-08 | 平安科技(深圳)有限公司 | Text categorization model training method, apparatus, computer device, and storage medium |
CN111126426A (en) * | 2019-10-11 | 2020-05-08 | 平安普惠企业管理有限公司 | Feature selection method and device, computer equipment and storage medium |
CN114169539A (en) * | 2022-02-11 | 2022-03-11 | 阿里巴巴(中国)有限公司 | Model training method, training device, electronic device, and computer-readable medium |
Non-Patent Citations (1)
Title |
---|
HANHUI JIAO, HU MINGHUI, JIANG ZHINONG, FENG KUN: "Fast and intelligent identification method for faults of a centrifugal pump based on the compensation distance evaluation and one-dimensional convolution neural network", JOURNAL OF VIBRATION AND SHOCK, vol. 40, no. 10, 28 May 2021 (2021-05-28), pages 41 - 49, XP093085151 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117171629A (en) * | 2023-11-03 | 2023-12-05 | 西安热工研究院有限公司 | Electrical equipment discharge fault type identification method |
CN118154936A (en) * | 2024-02-01 | 2024-06-07 | 北京格致博雅生物科技有限公司 | Machine learning-based variety identification and classification method and system |
Also Published As
Publication number | Publication date |
---|---|
CN114169539A (en) | 2022-03-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023151488A1 (en) | Model training method, training device, electronic device and computer-readable medium | |
Zhang et al. | Intelligent fault diagnosis under varying working conditions based on domain adaptive convolutional neural networks | |
Li et al. | A novel method for imbalanced fault diagnosis of rotating machinery based on generative adversarial networks | |
Lu et al. | Deep model based domain adaptation for fault diagnosis | |
Yin et al. | Wasserstein Generative Adversarial Network and Convolutional Neural Network (WG‐CNN) for Bearing Fault Diagnosis | |
JP2020518938A (en) | Analysis of sequence data using neural network | |
Fu et al. | Broad auto-encoder for machinery intelligent fault diagnosis with incremental fault samples and fault modes | |
Zhang et al. | A novel data-driven method based on sample reliability assessment and improved CNN for machinery fault diagnosis with non-ideal data | |
Chen et al. | Fault diagnosis method of rotating machinery based on stacked denoising autoencoder | |
CN115112372A (en) | Bearing fault diagnosis method and device, electronic equipment and storage medium | |
Lu et al. | A modified active learning intelligent fault diagnosis method for rolling bearings with unbalanced samples | |
Zheng et al. | An unsupervised transfer learning method based on SOCNN and FBNN and its application on bearing fault diagnosis | |
CN117803579A (en) | Centrifugal pump fault diagnosis method, system, medium and equipment | |
Zheng et al. | Few-shot intelligent fault diagnosis based on an improved meta-relation network | |
Hu et al. | An improved metalearning framework to optimize bearing fault diagnosis under data imbalance | |
CN114818811B (en) | Aeroengine rolling bearing fault diagnosis method based on twin network metric learning | |
Hao et al. | New fusion features convolutional neural network with high generalization ability on rolling bearing fault diagnosis | |
CN115373912A (en) | Small sample disk fault prediction method in industrial cloud environment | |
Yang | Conditional generative adversarial networks (cgan) for abnormal vibration of aero engine analysis | |
Wallsberger et al. | Explainable Artificial Intelligence for a high dimensional condition monitoring application using the SHAP Method | |
Zhang et al. | Fault Diagnosis of Rolling Bearing Based on CNN with Attention Mechanism and Dynamic Learning Rate | |
Tagirova et al. | Data mining of the Dynamometry of oil Production Sucker Rod Pumping Unit | |
Kolar et al. | Rotating shaft fault prediction using convolutional neural network: a preliminary study | |
Zhang et al. | Efficient Bearing Fault Diagnosis by Fast-Residual Network With 2-D Representation of Vibration Signals | |
Lyu et al. | A Novel Fault Diagnosis Method Based on Feature Fusion and Model Agnostic Meta-Learning |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23752268; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE