CN115859055A - Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process - Google Patents


Info

Publication number
CN115859055A
Authority
CN
China
Prior art keywords
data
frequency
domain
extraction
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211701182.6A
Other languages
Chinese (zh)
Inventor
张政
杨林
王雪松
白伟
刘登峰
邹澳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202211701182.6A
Publication of CN115859055A
Legal status: Pending

Landscapes

  • Complex Calculations (AREA)

Abstract

A feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process comprises the following steps: step one, performing filtering and noise reduction on the cleaned sample data using a moving-average filtering method; step two, applying time-domain, frequency-domain and time-frequency-domain feature parameter extraction algorithms to the filtered sample data; step three, optimizing the three-domain feature extraction result by using the Isomap manifold learning algorithm to reduce the feature structure and improve operation efficiency, thereby completing the feature extraction of the multi-source heterogeneous big data. Being based on a moving-average filtering method, the invention speeds up filtering and noise reduction and improves feature extraction efficiency. To address data that is large in volume, highly random and low in value density, the three-domain features are extracted from multi-domain parameters and the result is optimized: the feature values obtained in the three domains are fused into a mixed domain, and the Isomap manifold learning algorithm is applied to that mixed domain, so that the sample data is reflected more comprehensively and data processing after feature extraction becomes more efficient.

Description

Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process
Technical Field
The invention belongs to the field of industrial data in aerospace product manufacturing, and in particular relates to a feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process.
Background
In actual aerospace production, large numbers of sensors are usually installed in plant buildings and on equipment to collect production data at regular intervals so that production can be planned and scheduled well. Owing to machining conditions and the working environment, the production data generated by these sensors comes from many sources, is large in volume, is acquired at high frequency, is high-dimensional and is of low quality, and it affects aircraft manufacturing efficiency and product quality. With current artificial intelligence technology, the data can first be cleaned and then subjected to feature extraction and analysis. Data cleaning handles abnormal values in the sample data so that noisy data does not distort the processing results. Feature extraction then derives features from the cleaned sample data, and the quality of the extracted features directly determines the quality of downstream models such as fault diagnosis and service-life prediction. Choosing a suitable method to extract high-quality data features in the aircraft manufacturing process is therefore very important.
The purpose of feature extraction is to map a sample set from a high-dimensional feature space to a low-dimensional feature space in which the mapped samples remain well separable. The quality of the feature extraction algorithm affects the final recognition performance of the system, and the reduced dimensionality of the data set after feature extraction can markedly shorten computation time.
With the emergence and continuing development of new technologies, the demand for acquiring and analyzing multi-source heterogeneous big data keeps growing. In machine manufacturing, however, most feature extraction research targets image processing such as machine vision, and little work addresses data generated by machining equipment. Moreover, most existing feature extraction methods for machining-equipment data use either time-series processing of multi-source signals or multi-domain processing of a single signal source such as a vibration signal. For example, patent application CN202210771293.8 discloses a multi-source heterogeneous data feature extraction method and a fault diagnosis method for large rotating machinery that uses time-series processing of multi-source signals, while the article "multi-domain feature extraction and rolling bearing intelligent diagnosis of extreme learning machine" applies multi-domain processing to a single signal such as a vibration signal. Given the high standards and strict requirements of aircraft machining, neither of these two signal analysis approaches can effectively extract features from the big data generated during machining. In addition, features extracted by multi-domain processing still require dimension reduction in order to reduce the feature structure, improve operation efficiency, and lay a foundation for data-driven fault diagnosis.
Disclosure of Invention
To overcome the defects of the prior art, the invention aims to provide a feature extraction method for multi-source heterogeneous big data in the aircraft manufacturing process that improves the efficiency of processing data sets generated during equipment manufacturing and uses a suitable dimension reduction algorithm to reduce the dimension of the high-dimensional mixed-domain features, thereby improving operation efficiency.
To achieve this purpose, the technical scheme of the invention is as follows:
a feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process comprises the following steps:
step one, performing filtering and noise reduction on the cleaned sample data based on a moving-average filtering method;
the filtering and noise reduction of the sample data is based on a moving-average filtering algorithm: first, the sampling value C of the sample data is determined, an accumulator S and an average value A are set up, and the number of samples, namely the sampling period N, is set; second, A is given an initial value; third, the accumulator is set to S = A × N; fourth, the sliding data input begins, namely S = S - A + C, in which the average value A is subtracted as the discarded value and the new sampling value C is accumulated; fifth, the arithmetic mean A = S/N is calculated to obtain the filtered value; finally, the average value is passed back to the third step and the loop repeats until the entire sample data set has been filtered;
step two, applying time-domain, frequency-domain and time-frequency-domain feature parameter extraction algorithms to the filtered sample data;
in the time domain, 12 time-domain feature parameters are selected: peak value, mean value, root mean square, square root amplitude, kurtosis, skewness, peak index, pulse index, waveform index, margin index, kurtosis index and skewness index; following statistical principles, the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; the feature values are then normalized by the max-min method:
[Equation image not recovered: max-min normalization formula]
finally, the representative Y-axis force signal data, X-axis vibration signal data and acoustic emission signal data in the sample data are visualized using the time-domain feature parameter values, giving feature plots of how the time-domain feature parameters of each signal change as the number of milling cycles increases;
in the frequency domain, power spectrum analysis is first carried out on the data; assume a stationary random process x(t), whose autocorrelation function is:
$$\Gamma_x(t) = E\left[x(t+\tau)\,x(\tau)\right] \tag{2}$$
according to the Wiener-Khinchin theorem:
[Equation image not recovered: Wiener-Khinchin relation between the power spectral density and the autocorrelation function]
the Wiener-Khinchin theorem states that, for a stationary random process, the power spectral density is the Fourier transform of the autocorrelation function; from the resulting power spectrum, 4 frequency-domain feature parameters are then selected, namely center-of-gravity frequency, frequency variance, mean square frequency and root mean square frequency, and the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; finally, the Y-axis force signal data, X-axis vibration signal data and acoustic emission signal data are likewise selected for visualization, giving feature plots of how the frequency-domain feature parameters of each signal change as the number of milling cycles increases;
in the time-frequency domain, wavelet packet decomposition is used to describe fully the frequency components and energy of the signal at each moment; first, the sym4 wavelet, which performs well in wavelet threshold denoising, is selected as the wavelet basis function; then wavelet packet decomposition and reconstruction are carried out on the high-frequency and low-frequency data of the sample signal; the original signal is decomposed into approximation coefficients and detail coefficients, which correspond to the low-frequency and high-frequency parts of the signal respectively; the frequency band resolution is determined from the peak spacing and peak frequency of the low-frequency band, and the number of wavelet decomposition layers is then determined in combination with the sampling frequency; finally, the energy mean is extracted:
when the original signal has length N and the number of decomposition layers is k, the length of each wavelet packet coefficient sequence [symbol given as an image in the original, not recovered] is shortened to $2^{-k}N$, and its energy is expressed as:
[Equation image not recovered: wavelet packet energy formula]
the required time-frequency feature values are thus obtained, and the first 8 energy means of the X-axis vibration signal are selected as the time-frequency-domain feature parameters;
step three, optimizing the three-domain feature extraction result by using the Isomap manifold learning algorithm to reduce the feature structure and improve operation efficiency, thereby completing the feature extraction of the multi-source heterogeneous big data;
the 12 time-domain feature parameters, 4 frequency-domain feature parameters and 8 time-frequency-domain feature parameters obtained during multi-domain feature extraction are fused into a 24-dimensional mixed-domain vector; the mixed-domain vector, namely the mixed-domain feature parameters, is taken as the input of the Isomap algorithm, which selects neighborhoods and constructs a neighborhood graph G, calculates the shortest paths, and constructs a d-dimensional embedding from the shortest-path distance matrix using the MDS algorithm; specifically, neighborhoods are defined by the k-neighborhood method, and a weighted undirected graph on the sample set is defined as the neighborhood graph G; when two points $x_i$ and $x_j$ on G are neighbors, the weight of the edge between them is $d_X(x_i, x_j)$; the shortest paths between pairs of points in G are calculated to obtain the distance matrix $D_G = \{d_G(x_i, x_j)\}$; applying the MDS algorithm to $D_G$ yields the d-dimensional embedded coordinates Y, which preserve the topological and geometric structure of the high-dimensional data manifold; the feature distribution after dimension reduction is plotted from these coordinates, showing the feature distribution of the sample data set in three-dimensional space.
The invention has the following beneficial effects:
1. it improves on the moving-average filtering algorithm to denoise the sample data, and raises the computational efficiency and modification efficiency of the filtering process;
2. based on a multi-domain feature parameter extraction algorithm, the sample data is analyzed comprehensively in the time domain, the frequency domain and the time-frequency domain, which overcomes the low resolution of single-domain analysis and fully reveals the data features; wavelet packet decomposition is used for the time-frequency analysis and gives strong time-frequency localization capability;
3. the Isomap algorithm is applied to the mixed domain formed after multi-domain feature extraction to perform further extraction and dimension reduction, which optimizes the three-domain feature extraction result, improves the precision of dimension reduction of the feature data, and lays a foundation for subsequently improving fault diagnosis accuracy and the like. The method therefore serves as a link between the preceding and following stages of big data analysis.
Drawings
FIG. 1 shows the process of multi-source heterogeneous data feature extraction.
FIG. 2 is a flow chart of the improved filtering algorithm based on moving-average filtering.
FIG. 3 shows the process of the three-domain feature extraction algorithm based on multi-domain feature parameters.
FIG. 4 shows dimension reduction of the nonlinear mixed-domain data features using the Isomap algorithm.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 shows the process of multi-source heterogeneous data feature extraction in an aircraft manufacturing process. The method first obtains and organizes the sample data after data cleaning. Because a large amount of noise still remains in the cleaned sample data and would interfere with feature extraction, the sample data is filtered and denoised with an improved filtering method based on moving-average filtering. Features are then extracted from the filtered multi-source signals using a multi-domain feature parameter extraction algorithm. Finally, the resulting multi-domain feature parameters are fused into mixed-domain feature parameters, and Isomap manifold learning is used to optimize the three-domain feature extraction result by further extracting and reducing the dimension of the nonlinear data features, which reduces the feature structure and improves operation efficiency. The specific scheme is as follows: a feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process comprises the following steps:
step one, performing filtering and noise reduction on the cleaned sample data based on a moving-average filtering method;
FIG. 2 shows the flow of the improved filtering algorithm based on moving-average filtering.
Even after cleaning, the acquired data set still contains various interference signals that make the data deviate from the true values, so the data must be filtered and denoised before feature extraction. Conventional sliding filtering is often used for this purpose. It follows a first-in first-out principle: N consecutive sampling values are treated as a queue of fixed length N; after each sampling, the new value is placed at the tail of the queue and the value at the head is discarded, and the arithmetic mean of the N values in the queue gives the new filtering result. However, this conventional algorithm uses a comparatively large amount of RAM and executes inefficiently. The invention therefore uses an improved sliding filtering algorithm to filter and denoise the sample data.
The filtering and noise reduction of the sample data is based on a moving-average filtering algorithm: first, the sampling value C of the sample data is determined, an accumulator S and an average value A are set up, and the number of samples, namely the sampling period N, is set; second, A is given an initial value; third, the accumulator is set to S = A × N; fourth, the sliding data input begins, namely S = S - A + C, in which the average value A is subtracted as the discarded value and the new sampling value C is accumulated; fifth, the arithmetic mean A = S/N is calculated to obtain the filtered value; finally, the average value is passed back to the third step and the loop repeats until the entire sample data set has been filtered.
In other words, the improved moving-average filter drops the first-in first-out queue: in the M-th calculation, the average value obtained in the (M-1)-th calculation is subtracted as the discarded value, and the arithmetic mean over N samples is then taken from the accumulator to obtain the filtering result. This removes the need to maintain a queue in RAM, saves RAM, and improves the execution efficiency and modification efficiency of the algorithm.
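The update steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `recursive_moving_average`, its argument names, and the choice to default the initial value of A to the first sample are all made up for the example.

```python
def recursive_moving_average(samples, n, initial_average=None):
    """Queue-free moving-average filter using the S = S - A + C update.

    samples: iterable of sampling values C
    n: number of samples per averaging period N
    initial_average: starting value of A (assumption: first sample if omitted)
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    samples = list(samples)
    if not samples:
        return []
    a = samples[0] if initial_average is None else initial_average
    filtered = []
    for c in samples:
        s = a * n        # step 3: accumulator holds N copies of the current average
        s = s - a + c    # step 4: discard one average value, accumulate the new sample
        a = s / n        # step 5: arithmetic mean gives the filtered value
        filtered.append(a)
    return filtered
```

Only A and S are kept between samples, so no N-element queue is needed. Algebraically the update is A ← A + (C - A)/N, so the output behaves like exponential smoothing with factor 1/N rather than an exact mean over the last N samples.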
Step two, applying time-domain, frequency-domain and time-frequency-domain feature parameter extraction algorithms to the filtered sample data;
FIG. 3 shows the process of the three-domain feature extraction algorithm based on multi-domain feature parameters.
The three-domain feature parameter extraction algorithm extracts from the raw data the effective features relevant to modeling. To strengthen the correlation between the model input and the modeling target, reduce redundancy, avoid the curse of dimensionality, and make subsequent data processing easier to interpret, the feature parameters must be chosen sensibly in each extraction method.
In the time domain, the invention combines dimensional feature indexes with dimensionless indexes, as is common practice, so that time-domain information is extracted more fully. Four indexes of the signal, namely peak value, mean value, root mean square value and square root amplitude, are selected as dimensional feature indexes; the margin index, skewness index, peak index, pulse index, kurtosis index and waveform index are selected as dimensionless feature indexes.
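As a hedged illustration of the time-domain indexes named here, the sketch below computes the twelve parameters for one signal segment and includes a max-min normalization helper. The expressions are the definitions commonly used in vibration analysis and are an assumption, since the document's own expressions are not available in the text; the function names and dictionary keys are likewise hypothetical.

```python
import math

def time_domain_features(x):
    """Twelve time-domain indexes of one segment (commonly used definitions, assumed)."""
    n = len(x)
    if n == 0:
        raise ValueError("signal segment is empty")
    abs_x = [abs(v) for v in x]
    peak = max(abs_x)                                    # peak value
    mean = sum(x) / n                                    # mean value
    abs_mean = sum(abs_x) / n                            # mean absolute value (helper)
    rms = math.sqrt(sum(v * v for v in x) / n)           # root mean square
    sra = (sum(math.sqrt(v) for v in abs_x) / n) ** 2    # square root amplitude
    kurtosis = sum(v ** 4 for v in x) / n                # dimensional kurtosis
    skewness = sum(v ** 3 for v in x) / n                # dimensional skewness
    if rms == 0.0:
        raise ValueError("dimensionless indexes are undefined for an all-zero segment")
    return {
        "peak": peak, "mean": mean, "rms": rms, "square_root_amplitude": sra,
        "kurtosis": kurtosis, "skewness": skewness,
        "peak_index": peak / rms,           # crest factor
        "pulse_index": peak / abs_mean,     # impulse factor
        "waveform_index": rms / abs_mean,   # shape factor
        "margin_index": peak / sra,         # clearance factor
        "kurtosis_index": kurtosis / rms ** 4,
        "skewness_index": skewness / rms ** 3,
    }

def min_max_normalize(values):
    """Scale a list of feature values to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In use, each feature would be computed once per milling cycle and the resulting series normalized with `min_max_normalize` before plotting, for example `min_max_normalize([f["rms"] for f in per_cycle_features])`, where `per_cycle_features` is a hypothetical list of the dictionaries returned above.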
In the frequency domain, the invention analyzes the sample data signals by Fourier transform and power spectrum analysis. A random process, however, does not in general satisfy the absolute integrability condition that the plain Fourier transform requires, so power spectrum analysis is adopted instead. According to the Wiener-Khinchin theorem, for a stationary random process the power spectral density is the Fourier transform of the autocorrelation function, and the frequency-domain features of the signal are extracted from the resulting power spectrum. The center-of-gravity frequency, frequency variance, mean square frequency and root mean square frequency are selected as the frequency-domain feature parameters.
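The four frequency-domain parameters can be sketched as power-weighted moments of the spectrum. The example below is an assumption-laden illustration: it estimates the power spectrum with a plain periodogram computed by a naive DFT (kept dependency-free; an FFT routine such as `numpy.fft.rfft` would normally be used), and the moment definitions are the usual textbook ones rather than expressions taken from the document. Function names and dictionary keys are hypothetical.

```python
import cmath
import math

def power_spectrum(x, fs):
    """One-sided periodogram of x sampled at fs Hz, via a naive O(n^2) DFT."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def spectral_features(freqs, power):
    """Power-weighted spectral moments (assumed standard definitions)."""
    total = sum(power)
    if total == 0.0:
        raise ValueError("spectrum has no power")
    fc = sum(f * p for f, p in zip(freqs, power)) / total        # center-of-gravity frequency
    msf = sum(f * f * p for f, p in zip(freqs, power)) / total   # mean square frequency
    vf = sum((f - fc) ** 2 * p for f, p in zip(freqs, power)) / total  # frequency variance
    return {
        "center_of_gravity_frequency": fc,
        "frequency_variance": vf,
        "mean_square_frequency": msf,
        "rms_frequency": math.sqrt(msf),
    }
```

For a pure tone that falls exactly on a DFT bin, the center-of-gravity and root mean square frequencies both equal the tone frequency and the frequency variance is close to zero, which is a convenient sanity check.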
Wavelet decomposition divides the original vibration signal into several sub-bands. However, wavelet decomposition suffers from low frequency resolution in the high-frequency band of the signal and low time resolution in the low-frequency band. Therefore, for the non-stationary sample signal data, the invention adopts in the time-frequency domain an extension of wavelet decomposition, namely wavelet packet decomposition, which improves extraction precision.
Wavelet packet decomposition is applied to the sample data and decomposes the high-frequency and low-frequency bands of the signal simultaneously, giving a finer decomposition. At each layer, the signal is decomposed into approximation coefficients and detail coefficients, which correspond to the low-frequency and high-frequency parts of the signal respectively. The original signal is decomposed recursively to the specified number of layers, and the energy mean of each set of wavelet packet coefficients is then extracted to obtain the required time-frequency feature values.
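The recursive decomposition and energy-mean extraction can be illustrated with a small dependency-free sketch. Two assumptions are made loudly here: the Haar wavelet is substituted for the sym4 basis named in the document purely to keep the code short, and "energy mean" is taken to be the sum of squared coefficients of a sub-band divided by its length. With three layers the sketch yields eight sub-bands; the actual number of layers in the method is determined from the band resolution and sampling frequency, which this example does not model.

```python
import math

def haar_wavelet_packet(x, levels):
    """Full wavelet packet tree with the Haar wavelet (stand-in for sym4).

    Returns the 2**levels coefficient lists of the last layer in natural
    (not frequency) order; every node, low- and high-frequency, is split again.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    inv_sqrt2 = 1.0 / math.sqrt(2.0)
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # approximation (low-pass) and detail (high-pass) coefficients
            approx = [(node[i] + node[i + 1]) * inv_sqrt2 for i in range(0, len(node), 2)]
            detail = [(node[i] - node[i + 1]) * inv_sqrt2 for i in range(0, len(node), 2)]
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def energy_means(nodes):
    """Mean energy of each sub-band: sum of squared coefficients / sub-band length."""
    return [sum(c * c for c in node) / len(node) for node in nodes]
```

With a real library, PyWavelets' `pywt.WaveletPacket(data, wavelet="sym4", maxlevel=k)` would be the natural replacement for `haar_wavelet_packet`; that substitution is a suggestion, not something the document specifies.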
In the time domain, 12 time-domain feature parameters are selected: peak value, mean value, root mean square, square root amplitude, kurtosis, skewness, peak index, pulse index, waveform index, margin index, kurtosis index and skewness index; following statistical principles, the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; the feature values are then normalized by the max-min method:
[Equation image not recovered: max-min normalization formula]
finally, the representative Y-axis force signal data, X-axis vibration signal data and acoustic emission signal data in the sample data are visualized using the time-domain feature parameter values, giving feature plots of how the time-domain feature parameters of each signal change as the number of milling cycles increases;
in the frequency domain, power spectrum analysis is first carried out on the data; assume a stationary random process x(t), whose autocorrelation function is:
$$\Gamma_x(t) = E\left[x(t+\tau)\,x(\tau)\right] \tag{2}$$
according to the Wiener-Khinchin theorem:
[Equation image not recovered: Wiener-Khinchin relation between the power spectral density and the autocorrelation function]
the Wiener-Khinchin theorem states that, for a stationary random process, the power spectral density is the Fourier transform of the autocorrelation function; from the resulting power spectrum, 4 frequency-domain feature parameters are then selected, namely center-of-gravity frequency, frequency variance, mean square frequency and root mean square frequency, and the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; finally, the Y-axis force signal data, X-axis vibration signal data and acoustic emission signal data are likewise selected for visualization, giving feature plots of how the frequency-domain feature parameters of each signal change as the number of milling cycles increases;
in the time-frequency domain, wavelet packet decomposition is used to describe fully the frequency components and energy of the signal at each moment; first, the sym4 wavelet, which performs well in wavelet threshold denoising, is selected as the wavelet basis function; then wavelet packet decomposition and reconstruction are carried out on the high-frequency and low-frequency data of the sample signal; the original signal is decomposed into approximation coefficients and detail coefficients, which correspond to the low-frequency and high-frequency parts of the signal respectively; the frequency band resolution is determined from the peak spacing and peak frequency of the low-frequency band, and the number of wavelet decomposition layers is then determined in combination with the sampling frequency; finally, the energy mean is extracted:
when the original signal has length N and the number of decomposition layers is k, the length of each wavelet packet coefficient sequence [symbol given as an image in the original, not recovered] is shortened to $2^{-k}N$, and its energy is expressed as:
[Equation image not recovered: wavelet packet energy formula]
the required time-frequency feature values are thus obtained, and the first 8 energy means of the X-axis vibration signal are selected as the time-frequency-domain feature parameters.
Step three, optimizing the three-domain feature extraction result by using the Isomap manifold learning algorithm to reduce the feature structure and improve operation efficiency. FIG. 4 illustrates the use of the Isomap algorithm to reduce the dimension of the nonlinear mixed-domain data features.
The 12 time-domain feature parameters, 4 frequency-domain feature parameters and 8 time-frequency-domain feature parameters obtained during multi-domain feature extraction are fused into a 24-dimensional mixed-domain vector. The mixed-domain vector, namely the mixed-domain feature parameters, is taken as the input of the Isomap algorithm, which selects neighborhoods and constructs a neighborhood graph G, calculates the shortest paths, and constructs a d-dimensional embedding from the shortest-path distance matrix using the MDS algorithm. Specifically, neighborhoods are defined by the k-neighborhood method, and a weighted undirected graph on the sample set is defined as the neighborhood graph G. When two points $x_i$ and $x_j$ on G are neighbors, the weight of the edge between them is $d_X(x_i, x_j)$. The shortest paths between pairs of points in G are calculated to obtain the distance matrix $D_G = \{d_G(x_i, x_j)\}$. Applying the MDS algorithm to $D_G$ yields the d-dimensional embedded coordinates Y, which preserve the topological and geometric structure of the high-dimensional data manifold. The feature distribution after dimension reduction is plotted from these coordinates, showing the feature distribution of the sample data set in three-dimensional space.
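The three Isomap stages described above (neighborhood graph, shortest paths, MDS embedding) can be sketched without external libraries. This is an illustrative sketch under assumptions, not the method's actual implementation: the function name `isomap` and its parameters are hypothetical, shortest paths use Floyd-Warshall, and the eigenvectors needed by classical MDS are found by simple power iteration with deflation, which is adequate only for small, well-conditioned examples. For the 24-dimensional mixed-domain vectors in the text, `points` would hold one 24-element vector per sample and `d` would be 3.

```python
import math

def isomap(points, k, d, iters=200):
    """Embed points in d dimensions: k-NN graph -> shortest paths -> classical MDS."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]

    # 1. Neighbourhood graph G: weighted, undirected, k nearest neighbours per point.
    inf = float("inf")
    geo = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            geo[i][j] = geo[j][i] = euclid[i][j]

    # 2. Shortest paths (Floyd-Warshall) give the geodesic distance matrix D_G.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = geo[i][m] + geo[m][j]
                if via < geo[i][j]:
                    geo[i][j] = via
    if any(inf in row for row in geo):
        raise ValueError("neighbourhood graph is disconnected; increase k")

    # 3. Classical MDS: double-centre the squared distances ...
    sq = [[v * v for v in row] for row in geo]
    row_mean = [sum(row) / n for row in sq]
    all_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + all_mean)
          for j in range(n)] for i in range(n)]

    # ... then take the top-d eigenpairs by power iteration with deflation.
    coords = [[0.0] * d for _ in range(n)]
    for axis in range(d):
        v = [float(i + 1) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]       # deflate the found component
    return coords
```

A quick check is to embed points sampled along a half circle into one dimension: the recovered coordinates keep the points in order, and their span is close to the arc length rather than the straight-line distance between the endpoints. For real workloads, scikit-learn's `sklearn.manifold.Isomap(n_neighbors=..., n_components=...)` would be a more practical choice.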
After multi-domain feature extraction, the resulting plots only represent the information within each individual domain. To reflect the feature information contained in the sample data set more comprehensively, the three-domain feature parameters are combined into high-dimensional mixed-domain feature parameters, and the Isomap manifold learning algorithm is applied to them for dimension reduction, which displays the sample features and optimizes the three-domain feature extraction result.
In conventional machine learning methods, the data points, the distances between them and the mapping functions are all defined in Euclidean space. In practice, however, the data points may not be distributed in a Euclidean space, so the conventional Euclidean metric is hard to apply to real-world nonlinear data, and a new assumption about the distribution of the data is needed. Manifold learning assumes that the data points lie on, or can form, a latent manifold embedded in the ambient Euclidean space. In short, manifold learning is a class of nonlinear dimension reduction methods that borrows the concept of a topological manifold and can be used to process high-dimensional data.
The Isomap algorithm, also called the isometric mapping algorithm, is an improvement on the multidimensional scaling (MDS) algorithm and is an efficient method among manifold learning algorithms. Its basic principle is to keep the distances between data points in the low-dimensional space as consistent as possible with those in the high-dimensional space. In the invention, the time-domain, frequency-domain and time-frequency-domain feature parameters together form the mixed-domain feature parameters, and the Isomap algorithm is used for data dimension reduction and further feature extraction. Building on multi-domain feature extraction, this reflects the feature information of the sample data more comprehensively.

Claims (6)

1. A feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process, characterized by comprising the following steps:
step one, performing filtering and noise reduction on the cleaned sample data based on a moving-average filtering method;
step two, applying time-domain, frequency-domain and time-frequency-domain feature parameter extraction algorithms to the filtered sample data;
and step three, optimizing the three-domain feature extraction result by using the Isomap manifold learning algorithm to reduce the feature structure and improve operation efficiency, thereby completing the feature extraction of the multi-source heterogeneous big data.
2. The feature extraction method for multi-source heterogeneous big data in an aircraft manufacturing process according to claim 1, wherein step one is specifically as follows:
the filtering and noise reduction of the sample data is based on a moving-average filtering algorithm: first, the sampling value C of the sample data is determined, an accumulator S and an average value A are set up, and the number of samples, namely the sampling period N, is set; second, A is given an initial value; third, the accumulator is set to S = A × N; fourth, the sliding data input begins, namely S = S - A + C, in which the average value A is subtracted as the discarded value and the new sampling value C is accumulated; fifth, the arithmetic mean A = S/N is calculated to obtain the filtered value; and finally, the average value is passed back to the third step and the loop repeats until the entire sample data set has been filtered.
3. The feature extraction method for multi-source heterogeneous big data in the aircraft manufacturing process according to claim 1, wherein the time-domain feature extraction in step two is specifically as follows:
12 time-domain feature parameters are selected: peak value, mean value, root mean square, square-root amplitude, kurtosis, skewness, peak index, pulse index, waveform index, margin index, kurtosis index and skewness index; following statistical principles, the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; the feature values are then normalized by the min-max method:
x' = (x - x_min) / (x_max - x_min)   (1)
where x is a feature value and x_min and x_max are the minimum and maximum values of that feature;
finally, the representative Y-axis force signal data, X-axis vibration signal data and acoustic emission signal data in the sample data are visualized according to the time-domain feature parameter values, giving feature diagrams that show how the time-domain feature parameters of each signal change as the number of milling cycles increases.
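As an illustration of claim 3, the sketch below computes the 12 time-domain parameters and applies min-max normalization. The claim names the parameters but does not give their expressions, so the definitions used here (for example kurtosis as the mean fourth power, and margin index as peak divided by square-root amplitude) are common conventions assumed for this example; the function and key names are likewise hypothetical.

```python
import math


def time_domain_features(x):
    """Return 12 time-domain parameters of signal x (assumed definitions)."""
    n = len(x)
    abs_x = [abs(v) for v in x]
    peak = max(abs_x)
    mean = sum(x) / n
    mean_abs = sum(abs_x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    sra = (sum(math.sqrt(v) for v in abs_x) / n) ** 2  # square-root amplitude
    kurt = sum(v ** 4 for v in x) / n                  # dimensional kurtosis
    skew = sum(v ** 3 for v in x) / n                  # dimensional skewness
    return {
        "peak": peak, "mean": mean, "rms": rms, "sqrt_amplitude": sra,
        "kurtosis": kurt, "skewness": skew,
        "crest_factor": peak / rms,          # peak index
        "impulse_factor": peak / mean_abs,   # pulse index
        "shape_factor": rms / mean_abs,      # waveform index
        "clearance_factor": peak / sra,      # margin index
        "kurtosis_index": kurt / rms ** 4,
        "skewness_index": skew / rms ** 3,
    }


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In use, each parameter would be computed once per milling cycle, and the resulting sequence for that parameter passed through `min_max_normalize` before plotting. An all-zero signal would divide by zero in the dimensionless indices, so such segments need handling upstream.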
4. The feature extraction method for multi-source heterogeneous big data in the aircraft manufacturing process according to claim 1, wherein the frequency-domain feature extraction in step two is specifically as follows:
in the frequency domain, power spectrum analysis is first performed on the data: assume a stationary random process x(t), whose autocorrelation function is:
Γ_x(τ) = E[x(t+τ)x(t)]   (2)
according to the Wiener-Khinchin theorem:
S_x(ω) = ∫_{-∞}^{+∞} Γ_x(τ) e^{-jωτ} dτ   (3)
the Wiener-Khinchin theorem states that, for a stationary random process, the power spectral density is the Fourier transform of the autocorrelation function; then, from the obtained power spectrum, 4 frequency-domain feature parameters are selected, namely centroid frequency, frequency variance, mean square frequency and root mean square frequency, and the corresponding expressions are used as the feature extraction algorithm to obtain the feature values; finally, the Y-axis force signal data, the X-axis vibration signal data and the acoustic emission signal data are likewise visualized, giving feature diagrams that show how the frequency-domain feature parameters of each signal change as the number of milling cycles increases.
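The four frequency-domain parameters of claim 4 can be illustrated as power-weighted moments of the spectrum. The sketch below is written under assumptions: it uses a plain periodogram (squared DFT magnitude) as the power spectrum estimate, and the moment definitions are the usual textbook ones, since the claim does not spell them out. The direct DFT is O(N²) and is only meant for short examples.

```python
import cmath
import math


def power_spectrum(x, fs):
    """Periodogram of x: one-sided frequencies in Hz and power values."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2 / n)
    return freqs, power


def frequency_domain_features(freqs, power):
    """Power-weighted spectral moments (assumed definitions)."""
    total = sum(power)
    fc = sum(f * p for f, p in zip(freqs, power)) / total       # centroid
    msf = sum(f * f * p for f, p in zip(freqs, power)) / total  # mean square
    vf = sum((f - fc) ** 2 * p for f, p in zip(freqs, power)) / total
    return {
        "centroid_frequency": fc,
        "frequency_variance": vf,
        "mean_square_frequency": msf,
        "rms_frequency": math.sqrt(msf),
    }
```

For a pure tone the centroid and root mean square frequency both sit at the tone frequency and the variance is near zero; broadband content raises the variance.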
5. The feature extraction method for multi-source heterogeneous big data in the aircraft manufacturing process according to claim 1, wherein the time-frequency-domain feature extraction in step two is specifically as follows:
in the time-frequency domain, wavelet packet decomposition is used to give a complete description of the frequency components and energy of the signal at each moment: first, the sym4 wavelet, which performs well in wavelet threshold denoising, is selected as the wavelet basis function; then wavelet packet decomposition and reconstruction are performed on the high-frequency and low-frequency data of the sample signal; the original signal is thereby decomposed into approximation coefficients and detail coefficients, which correspond to the low-frequency and high-frequency parts of the signal respectively; the frequency band resolution is determined from the peak interval and peak frequency of the low-frequency band, and the number of wavelet decomposition layers is then determined in combination with the sampling frequency; finally, the energy mean is extracted:
when the original signal length is N and the number of decomposition layers is k, the length of
Figure FDA0004024120360000031
is shortened to 2^(-k)·N, and the energy is then expressed as:
Figure FDA0004024120360000032
the required time-frequency feature values are thereby obtained, and the first 8 energy mean values of the X-axis vibration signal are selected as the time-frequency-domain feature parameters.
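To make the wavelet packet energy idea in claim 5 concrete, the sketch below decomposes a signal over 3 levels, giving 2^3 = 8 terminal nodes, and returns the mean squared coefficient of each node. Two assumptions are stated plainly: the Haar wavelet is swapped in for sym4 so the example needs no external library, and the energy mean is taken as the average of squared coefficients because the claimed expression survives only as a figure reference. Nodes are returned in natural (decomposition) order, not frequency order.

```python
import math


def haar_split(x):
    """One Haar analysis step: (approximation, detail), each half as long."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail


def wavelet_packet_energy_means(x, levels=3):
    """Mean energy of each of the 2**levels wavelet packet nodes (Haar)."""
    if len(x) == 0 or len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, both branches are split again.
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # Each node now has len(x) / 2**levels coefficients.
    return [sum(c * c for c in node) / len(node) for node in nodes]
```

A constant signal puts all of its energy in the first (all-approximation) node, which is a quick sanity check. With a wavelet library available, the same loop structure would apply with sym4 filters in place of `haar_split`.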
6. The feature extraction method for multi-source heterogeneous big data in the aircraft manufacturing process according to claim 1, wherein step three is specifically as follows:
the 12 time-domain feature parameters, 4 frequency-domain feature parameters and 8 time-frequency-domain feature parameters obtained during multi-domain feature extraction are fused into a 24-dimensional mixed-domain vector; the mixed-domain vector, namely the mixed-domain feature parameters, is taken as the input of the Isomap algorithm, which selects a neighborhood and constructs a neighborhood graph G, calculates the shortest paths, and constructs a d-dimensional embedding from the shortest-path distance matrix using the MDS algorithm; specifically, the neighborhood is defined by the k-nearest-neighbor method, and a weighted undirected graph on the sample set is defined as the neighborhood graph G; when two points x_i and x_j on G are neighbors, the weight of the edge between them is d_x(x_i, x_j); the shortest path between every two points in G is calculated to obtain the distance matrix D_G = {d_G(x_i, x_j)}; on D_G, the MDS algorithm is used to calculate the d-dimensional embedded coordinates Y that preserve the topological and geometric structure of the high-dimensional data manifold; and the feature distribution map after dimension reduction is obtained from these coordinates, which displays the feature distribution of the sample data set in three-dimensional space.
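The three Isomap stages named in claim 6 (k-nearest-neighbor graph, shortest paths, MDS embedding) can be sketched with the standard library alone. This is a simplified illustration under assumptions: Floyd-Warshall is used for the shortest paths, and the classical MDS eigenvectors are found by power iteration with deflation, which is adequate for small examples but not for large data sets. The function name and parameters are hypothetical.

```python
import math


def isomap(points, k, d, iters=500):
    """Embed points into d dimensions: kNN graph, geodesics, classical MDS."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    inf = float("inf")
    g = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):  # neighborhood graph G with Euclidean edge weights
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            g[i][j] = g[j][i] = dist[i][j]
    for m in range(n):  # Floyd-Warshall shortest paths give D_G
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    if any(v == inf for row in g for v in row):
        raise ValueError("neighborhood graph is disconnected; increase k")
    # Classical MDS: double-center the squared geodesic distances.
    sq = [[v * v for v in row] for row in g]
    row_mean = [sum(r) / n for r in sq]
    total = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * d for _ in range(n)]
    for axis in range(d):
        v = [math.sin(i + 1.0 + axis) for i in range(n)]  # deterministic start
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(iters):  # power iteration for the dominant eigenvector
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:
                break
            v = [c / norm for c in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
        for i in range(n):  # deflate so the next axis finds the next eigenpair
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords
```

For the setting in the claim, `points` would be the 24-dimensional mixed-domain vectors and `d` would be 3; the choice of `k` is not stated in the claim and would need tuning so that the neighborhood graph stays connected.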
CN202211701182.6A 2022-12-28 2022-12-28 Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process Pending CN115859055A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211701182.6A CN115859055A (en) 2022-12-28 2022-12-28 Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211701182.6A CN115859055A (en) 2022-12-28 2022-12-28 Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process

Publications (1)

Publication Number Publication Date
CN115859055A true CN115859055A (en) 2023-03-28

Family

ID=85655653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211701182.6A Pending CN115859055A (en) 2022-12-28 2022-12-28 Feature extraction method for multi-source heterogeneous big data in aircraft manufacturing process

Country Status (1)

Country Link
CN (1) CN115859055A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454155A (en) * 2023-12-26 2024-01-26 电子科技大学 IGBT acoustic emission signal extraction method based on SSAF and EMD

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454155A (en) * 2023-12-26 2024-01-26 电子科技大学 IGBT acoustic emission signal extraction method based on SSAF and EMD
CN117454155B (en) * 2023-12-26 2024-03-15 电子科技大学 IGBT acoustic emission signal extraction method based on SSAF and EMD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination