CN114822584A - Transmission device signal separation method based on integral improved generalized cross-correlation - Google Patents

Info

Publication number
CN114822584A
Authority
CN
China
Prior art keywords
correlation
matrix
cross
time
different
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210439737.8A
Other languages
Chinese (zh)
Inventor
李旭
栾峰
王涛
吴艳
韩月娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN202210439737.8A
Publication of CN114822584A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Discrete Mathematics (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a transmission device signal separation method based on integral-improved generalized cross-correlation. It is a new blind source separation method that combines a generalized cross-correlation algorithm with a non-negative matrix factorization algorithm to separate the sound signals of different transmission devices. The generalized cross-correlation algorithm is used to obtain the time differences of arrival and to determine the number of sources; non-negative matrix factorization is then used to determine which source each dictionary atom comes from, which provides the basis for generating the mask matrices of the different sources. The generalized cross-correlation is improved by an integration method, which raises the accuracy of the time-difference-of-arrival estimate, and a new initialization method for non-negative matrix factorization is designed, which reduces the factorization time. The method avoids the dependence of other blind source separation methods on an idealized mathematical model or on a trained neural network.

Description

Transmission device signal separation method based on integral improved generalized cross-correlation
Technical Field
The invention belongs to the technical field of transmission device signal separation, and relates to a transmission device signal separation method based on integral-improved generalized cross-correlation.
Background
A transmission device transmits the power of a power unit to equipment such as a working mechanism. Transmission devices are found on many kinds of machines and vehicles, and most of them, such as bearings, perform mainly rotary motion. Because the working environment of such equipment can be very harsh, its failure rate increases and normal operation is affected. To keep a machine running normally, its condition must be monitored so that problems are found and repaired in time. Dedicated sensors can be used to acquire the sound signal or vibration curve of a transmission device, but in some special scenarios sensors cannot be installed and non-contact monitoring is required. In a scenario with multiple transmission devices, blind source separation can be used to separate the sound signals of the individual devices for condition monitoring. Blind source separation recovers the waveforms of the source signals from the observed mixed signal without knowledge of the mixing process or of the source signals. The "blind" in blind source separation refers to two things: the source signals are unknown, and the parameters of the transmission channel are also unknown.
The origin of blind source separation can be traced back to the 1980s. The pioneering work was done mainly by Jutten and Herault, who, at a neural network conference held in the United States, proposed the H-J learning algorithm for blind source separation based on a feedback neural network model. The algorithm separated two aliased source signals, coped with uncertainty in the number of source signals and in the channel, and was implemented on a specially designed CMOS chip. Since then, many researchers in China and abroad have studied blind source separation in depth. One example is independent component analysis (ICA), which separates the source signals by restoring the mutual independence of the individual signals in the mixture. Many blind source separation algorithms have been built on ICA theory: a deconvolution problem can be converted into an instantaneous problem by oversampling and then separated with conventional ICA; wavelet ICA regularizes the wavelet decomposition of the signal by using ICA to search for independent features; and a constrained optimization problem can be solved with a constrained ICA algorithm using a Newton-like adaptive learning solution. In addition, FastICA was proposed on the basis of the ICA principle and a fixed-point algorithm that maximizes non-Gaussianity, and the blind source separation problem can also be converted into the estimation of a density function. However, all of the above algorithms assume that the sound follows a certain distribution and conforms to an idealized mathematical model. This assumption is difficult to satisfy in real scenes, so these methods are not sufficiently robust.
In addition, blind source separation methods based on deep learning depend on pre-training and cannot be applied immediately; their separation quality depends on the generalization ability of the network, and their stability is poor.
Disclosure of Invention
To solve the above technical problem, the present invention provides a transmission device signal separation method based on integral-improved generalized cross-correlation.
The invention provides a transmission device signal separation method based on integral-improved generalized cross-correlation, which comprises the following steps:
step 1: collecting the original dual-channel audio mixed signal and preprocessing it;
step 2: performing time-frequency analysis on the dual-channel audio mixed signal to obtain its time-frequency information, which comprises an amplitude spectrum and an angle spectrum;
step 3: estimating time delays with the integral-improved generalized cross-correlation algorithm to obtain the time delays of the different sound sources;
step 4: performing non-negative matrix factorization on the amplitude spectrum to obtain a dictionary matrix and a coefficient matrix;
step 5: combining the integral-improved generalized cross-correlation algorithm with the non-negative matrix factorization algorithm to generate a mask matrix;
step 6: multiplying the mask matrix and the coefficient matrix element by element to obtain a separated coefficient matrix;
step 7: performing inverse non-negative matrix factorization, that is, multiplying the dictionary matrix by the separated coefficient matrix, to obtain the amplitude spectra of the separated sound sources;
step 8: combining the amplitude spectra of the separated sound sources with the angle spectrum and performing an inverse short-time Fourier transform to obtain the time-domain signals of the separated sound sources, which completes the separation.
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 1 specifically is:
step 1.1: the original audio mixed signal is collected with a dual-channel microphone array. During collection, the sound sources are placed in different directions and emit sound simultaneously, simulating the sound emitted by different transmission devices;
step 1.2: to improve the signal-to-noise ratio of the mixed signal and thereby the quality of the separation, the original audio mixed signal is preprocessed: a polynomial least-squares fit is used to remove the trend term.
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 2 specifically is:
step 2.1: performing discrete short-time Fourier transform on the audio mixed signal:
Figure BDA0003613259260000031
wherein f is the frequency, t is the time, STFT(f, t) is the time-frequency information, k is the dummy variable of the summation, x() is the input signal, and g() is the window function, for which a Hamming window is used;
step 2.2: decomposing the time-frequency information by using the following formula:
Figure BDA0003613259260000032
wherein V_ft is the amplitude spectrum, which is divided into the amplitude spectrum V_lft of the left channel and the amplitude spectrum V_rft of the right channel, and φ_ft is the angle spectrum.
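The equation images of step 2 are not reproduced in this text. A form consistent with the symbols defined above is sketched below; it is a reconstruction under assumptions (in particular the normalization of f and the sign of the exponent), not the patent's exact notation:

```latex
\begin{align}
\mathrm{STFT}(f,t) &= \sum_{k=-\infty}^{\infty} x(k)\, g(k-t)\, e^{-j 2\pi f k} \\
\mathrm{STFT}(f,t) &= V_{ft}\, e^{j\phi_{ft}}, \qquad
V_{ft} = \lvert \mathrm{STFT}(f,t) \rvert, \qquad
\phi_{ft} = \arg \mathrm{STFT}(f,t)
\end{align}
```

The second line is the polar decomposition that splits each time-frequency value into its amplitude and its angle; it is applied to the left and right channels separately.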
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 3 specifically is:
step 3.1: the basic generalized cross-correlation algorithm is defined as follows:
Figure BDA0003613259260000041
wherein τ is the time delay, [Figure BDA0003613259260000042] is the cross-power spectrum, ψ_ft is the frequency weighting function, and G_τt is the cross-correlation function;
step 3.2: the basic generalized cross-correlation algorithm is improved by integration. Before the cross-correlation function G_τt is computed with formula (3), the cross-power spectrum [Figure BDA0003613259260000043] is integrated along the time axis t: a window length is chosen, the window is slid along each row of the cross-power spectrum [Figure BDA0003613259260000044], the mean of the elements inside the window is computed and assigned to the element at the window center, and G_τt is then computed from the smoothed cross-power spectrum;
step 3.3: the resulting cross-correlation function G_τt is summed along the t axis, which reduces it to a one-dimensional delay profile [Figure BDA0003613259260000045]. A peak detection algorithm is applied to [Figure BDA0003613259260000046]; the delays at the abscissas of the peaks are the estimated time delays τ_s (s = 1, 2, ..., n), where s indexes the sound sources and n is the number of sound sources.
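The equation images of step 3 are not reproduced in this text. For orientation, one common way of writing the quantities named in steps 3.1 to 3.3 is sketched below. The symbol C_ft for the cross-power spectrum, the overbar for its time-smoothed version, the window half-length L, the superscripts l and r for the two channels, the PHAT choice of ψ_ft, and the sign of the exponent are all assumptions made for illustration, not the patent's notation:

```latex
\begin{align}
C_{ft} &= V^{l}_{ft}\, e^{j\phi^{l}_{ft}} \left( V^{r}_{ft}\, e^{j\phi^{r}_{ft}} \right)^{*} \\
\bar{C}_{ft} &= \frac{1}{2L+1} \sum_{u=-L}^{L} C_{f,\,t+u} \\
G_{\tau t} &= \sum_{f} \operatorname{Re}\!\left\{ \psi_{ft}\, \bar{C}_{ft}\, e^{j 2\pi f \tau} \right\},
\qquad \psi_{ft} = \frac{1}{\lvert \bar{C}_{ft} \rvert} \\
G_{\tau} &= \sum_{t} G_{\tau t}
\end{align}
```

Under this convention the estimated delays τ_s are the abscissas of the peaks of G_τ, and averaging C_ft over neighbouring frames before weighting is what reduces the frame-to-frame variance of the delay estimate.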
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 4 specifically is:
step 4.1: initialization of the dictionary matrix: from the amplitude spectrum V_ft, the column vectors with the largest infinity norms are selected and averaged to form the columns of the dictionary matrix. The infinity norm of a column is defined as follows:
$$\lVert \mathrm{vec} \rVert_\infty = \max_{1 \le i \le \mathrm{len}} \lvert v_i \rvert$$
wherein ‖vec‖_∞ is the infinity norm of the vector vec, v_i is an element of the vector, and len is the length of the vector;
step 4.2: the coefficient matrix is initialized randomly;
step 4.3: after the dictionary matrix W_fd and the coefficient matrix H_dt have been initialized, the following update formulas are iterated a number of times to obtain the decomposed dictionary matrix and coefficient matrix:
Figure BDA0003613259260000048
Figure BDA0003613259260000051
wherein [Figure BDA0003613259260000052] is the dictionary matrix obtained at the m-th iteration and [Figure BDA0003613259260000053] is the coefficient matrix obtained at the m-th iteration.
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 5 specifically is:
step 5.1: the dictionary matrix produced by the non-negative matrix factorization is used to define a new frequency weighting function as follows:
Figure BDA0003613259260000054
wherein [Figure BDA0003613259260000055] is the new frequency weighting function;
step 5.2: the new frequency weighting function [Figure BDA0003613259260000056] is substituted into the definition of the integral-improved generalized cross-correlation algorithm of step 3, which combines the integral-improved generalized cross-correlation algorithm with the non-negative matrix factorization and gives the following formula:
Figure BDA0003613259260000057
wherein [Figure BDA0003613259260000058] is the new cross-correlation function;
step 5.3: the mask matrix is built on the fact that different sound sources correspond to different time delays τ_s. The time delays of the different sound sources are substituted in turn into [Figure BDA0003613259260000059], and each dictionary atom is attributed to the source for which [Figure BDA00036132592600000510] is largest: the corresponding element of that source's mask matrix is set to 1, and the corresponding elements of the other sources' mask matrices are set to 0. The mask matrix is defined as follows:
Figure BDA00036132592600000511
wherein M_dt is the mask matrix and s denotes the different sound sources.
In the transmission device signal separation method based on integral improved generalized cross-correlation of the present invention, the step 8 specifically is:
the amplitude spectra of the different sources obtained in step 7 are combined with the angle spectrum obtained in step 2, and an inverse short-time Fourier transform is performed to obtain the time-domain signals of the separated sources. The information of the different sound sources is thereby separated, that is, the sound signals of the different transmission devices are separated. The inverse short-time Fourier transform is given by the following formula:
Figure BDA0003613259260000061
wherein [Figure BDA0003613259260000062] denotes the separated sound signals of the different transmission devices, [Figure BDA0003613259260000063] is the amplitude spectrum of each sound source, and φ_ft is the angle spectrum.
The invention discloses a transmission device signal separation method based on integral improved generalized cross-correlation, which at least has the following beneficial effects:
(1) the integral-improved generalized cross-correlation algorithm is combined with a non-negative matrix factorization algorithm to separate the sound signals of multiple transmission devices by blind source separation, giving high precision, fast computation, and strong robustness;
(2) the generalized cross-correlation algorithm is enhanced by an integration method, so that the generalized cross-correlation result is more accurate and the time delay estimation error is smaller;
(3) a new initialization method for the non-negative matrix factorization algorithm is used when solving for the dictionary matrix and the coefficient matrix, which speeds up the factorization and improves the separation;
(4) the sounds generated by different transmission devices can be separated without installing additional sensors on the transmission devices, so the method suits scenarios in which sensors cannot be installed and the sound of the transmission devices must be acquired in a non-contact manner.
Drawings
FIG. 1 is a flow chart of a transmission signal separation method of the present invention based on integral-improved generalized cross-correlation;
FIG. 2 is a flow chart of the integral-improved generalized cross-correlation algorithm of the present invention;
FIG. 3a is a graph comparing the time domain waveforms of the source signal 1 and the source signal 1 separated by the method of the present invention in example 1;
FIG. 3b is a graph comparing the time-frequency spectrum of the source signal 1 in example 1 and the source signal 1 separated by the method of the present invention;
FIG. 3c is a comparison of the time domain waveforms of the source signal 2 and the source signal 2 isolated by the method of the present invention in example 1;
FIG. 3d is a comparison graph of the time-frequency spectrum of the source signal 2 in example 1 and the source signal 2 separated by the method of the present invention;
FIG. 4a is a comparison graph of the time domain waveforms of the source signal 1 separated by the ICA method and the source signal 1 in example 1;
FIG. 4b is a graph comparing the time-frequency spectrum of the source signal 1 separated by the ICA method and the source signal 1 in example 1;
FIG. 4c is a graph comparing time domain waveforms of the source signal 2 and the source signal 2 separated by the ICA method in example 1;
FIG. 4d is a comparison graph of the time-frequency spectrum of the source signal 2 separated by the ICA method and the source signal 2 in example 1;
FIG. 5 is a graph comparing experimental results on two sets of sound source data;
FIG. 6 is a graph comparing experimental results on the three-sound-source data set.
Detailed Description
As shown in FIG. 1, the invention relates to a transmission device signal separation method based on integral-improved generalized cross-correlation, which comprises the following steps:
Step 1: the original dual-channel audio mixed signal is collected and preprocessed, specifically:
step 1.1: the original audio mixed signal is collected with a dual-channel microphone array. During collection, the sound sources are placed in different directions and emit sound simultaneously, simulating the sound emitted by different transmission devices;
step 1.2: to improve the signal-to-noise ratio of the mixed signal and thereby the quality of the separation, the original audio mixed signal is preprocessed: a polynomial least-squares fit is used to remove the trend term.
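The trend removal of step 1.2 can be sketched as follows. This is a minimal illustration that assumes a second-order polynomial and a synthetic test signal; the polynomial order, the function name `remove_trend`, and the sampling rate are illustrative choices that the text does not specify.

```python
import numpy as np

def remove_trend(x, order=2):
    """Subtract a least-squares polynomial fit of the given order from x."""
    x = np.asarray(x, dtype=float)
    # A sample axis scaled to [-1, 1] keeps the polynomial fit well conditioned.
    n = np.linspace(-1.0, 1.0, x.size)
    coeffs = np.polyfit(n, x, order)
    return x - np.polyval(coeffs, n)

# Synthetic example: a 50 Hz tone riding on a slow quadratic drift.
fs = 8000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 50 * t)
drift = 0.5 + 0.8 * t - 0.3 * t ** 2
cleaned = remove_trend(tone + drift, order=2)
```

Each microphone channel would be detrended independently before the time-frequency analysis of step 2.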
Step 2: time-frequency analysis is performed on the dual-channel audio mixed signal to obtain its time-frequency information, which comprises an amplitude spectrum and an angle spectrum. Step 2 specifically comprises:
step 2.1: performing discrete short-time Fourier transform on the audio mixed signal:
Figure BDA0003613259260000081
wherein f is the frequency, t is the time, STFT(f, t) is the time-frequency information, k is the dummy variable of the summation, x() is the input signal, and g() is the window function, for which a Hamming window is used;
step 2.2: decomposing the time-frequency information by using the following formula:
Figure BDA0003613259260000082
wherein V_ft is the amplitude spectrum, which is divided into the amplitude spectrum V_lft of the left channel and the amplitude spectrum V_rft of the right channel, and φ_ft is the angle spectrum.
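Step 2 can be illustrated with a short NumPy sketch. The window length of 1024 and the hop of 128 samples follow from the window width and 87.5% overlap given in Example 1; the function name `stft`, the one-sided FFT, and the random test signal are assumptions made for the illustration.

```python
import numpy as np

def stft(x, win_len=1024, hop=128):
    """Hamming-windowed STFT, returned as a (frequency, time) matrix."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    # One-sided FFT of each windowed frame: win_len // 2 + 1 frequency bins.
    return np.fft.rfft(frames, axis=0)

rng = np.random.default_rng(0)
left = rng.standard_normal(8192)   # stand-in for one microphone channel
X = stft(left)
V = np.abs(X)                      # amplitude spectrum V_ft
phi = np.angle(X)                  # angle spectrum phi_ft
```

The same transform is applied to the right channel, which gives the pair of amplitude spectra and the angle spectrum used by the later steps.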
Step 3: the time delays are estimated with the integral-improved generalized cross-correlation algorithm to obtain the time delays of the different sound sources. The flow of the integral-improved generalized cross-correlation algorithm is shown in FIG. 2. Step 3 specifically comprises:
step 3.1: the basic generalized cross-correlation algorithm is defined as follows:
Figure BDA0003613259260000083
wherein τ is the time delay, [Figure BDA0003613259260000084] is the cross-power spectrum, ψ_ft is the frequency weighting function, and G_τt is the cross-correlation function;
step 3.2: the basic generalized cross-correlation algorithm is improved by integration. Before the cross-correlation function G_τt is computed with formula (3), the cross-power spectrum [Figure BDA0003613259260000085] is integrated along the time axis t: a window length is chosen, the window is slid along each row of the cross-power spectrum [Figure BDA0003613259260000086], the mean of the elements inside the window is computed and assigned to the element at the window center, and G_τt is then computed from the smoothed cross-power spectrum;
step 3.3: the resulting cross-correlation function G_τt is summed along the t axis, which reduces it to a one-dimensional delay profile [Figure BDA0003613259260000087]. A peak detection algorithm is applied to [Figure BDA0003613259260000088]; the delays at the abscissas of the peaks are the estimated time delays τ_s (s = 1, 2, ..., n), where s indexes the sound sources and n is the number of sound sources.
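The three sub-steps of step 3 can be sketched as below. Several points are assumptions, because the formula images are not available: PHAT is used as the frequency weighting ψ_ft, the cross-power spectrum is taken as left times conjugate right with a positive exponent (so a positive delay means the left channel lags), the smoothing window is 5 frames, and the number of sources is passed in rather than inferred from the peaks. All function names are illustrative.

```python
import numpy as np

def stft(x, win_len=1024, hop=128):
    """Hamming-windowed STFT, returned as a (frequency, time) matrix."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames, axis=0)

def smooth_along_time(C, win=5):
    """Sliding-window mean of every frequency row, written to the window centre."""
    kernel = np.ones(win) / win
    # mode="same" zero-pads, so the first and last win // 2 frames are biased low.
    return np.stack([np.convolve(row, kernel, mode="same") for row in C])

def delay_profile(Xl, Xr, freqs, taus, win=5):
    """Smooth the cross-power spectrum over time, apply GCC, then sum over t."""
    C = smooth_along_time(Xl * np.conj(Xr), win)           # (F, T)
    psi = 1.0 / np.maximum(np.abs(C), 1e-12)               # PHAT weighting (assumption)
    steering = np.exp(2j * np.pi * np.outer(taus, freqs))  # (n_tau, F)
    G = np.real(steering @ (psi * C))                      # G[tau, t]
    return G.sum(axis=1)                                   # one-dimensional delay profile

def pick_delays(profile, taus, n_sources):
    """Return the delays of the n_sources highest interior local maxima."""
    inner = profile[1:-1]
    peaks = np.flatnonzero((inner > profile[:-2]) & (inner >= profile[2:])) + 1
    best = peaks[np.argsort(profile[peaks])[::-1][:n_sources]]
    return np.sort(taus[best])

# Synthetic check: white noise reaching the left microphone 2 samples late.
fs = 16000                                   # assumed sampling rate in Hz
rng = np.random.default_rng(0)
s = rng.standard_normal(fs + 2)
right, left = s[2:], s[:-2]
freqs = np.fft.rfftfreq(1024, d=1.0 / fs)
taus = np.linspace(-0.08 / 343.0, 0.08 / 343.0, 128)   # 8 cm spacing, 343 m/s assumed
profile = delay_profile(stft(left), stft(right), freqs, taus)
estimated = pick_delays(profile, taus, n_sources=1)
```

With several simultaneous sources the profile shows one peak per source, and the number of significant peaks can serve as the estimate of the source count; how the patent thresholds the peaks is not stated.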
Step 4: non-negative matrix factorization is performed on the amplitude spectrum to obtain a dictionary matrix and a coefficient matrix, specifically:
step 4.1: initialization of the dictionary matrix: from the amplitude spectrum V_ft, the column vectors with the largest infinity norms are selected and averaged to form the columns of the dictionary matrix. The infinity norm of a column is defined as follows:
$$\lVert \mathrm{vec} \rVert_\infty = \max_{1 \le i \le \mathrm{len}} \lvert v_i \rvert$$
wherein ‖vec‖_∞ is the infinity norm of the vector vec, v_i is an element of the vector, and len is the length of the vector;
step 4.2: the coefficient matrix is initialized randomly;
step 4.3: after the dictionary matrix W_fd and the coefficient matrix H_dt have been initialized, the following update formulas are iterated a number of times to obtain the decomposed dictionary matrix and coefficient matrix:
Figure BDA0003613259260000092
Figure BDA0003613259260000093
wherein [Figure BDA0003613259260000094] is the dictionary matrix obtained at the m-th iteration and [Figure BDA0003613259260000095] is the coefficient matrix obtained at the m-th iteration.
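Step 4 can be sketched as follows. Two parts are assumptions. First, the grouping used to average the largest-norm columns (here, consecutive groups of four in rank order) is one possible reading of step 4.1. Second, since the update formula images are not available, the standard multiplicative updates for the Euclidean cost are used as a stand-in; the patent's updates may differ. The names `init_dictionary` and `nmf` are illustrative.

```python
import numpy as np

def init_dictionary(V, n_atoms, group=4):
    """Average groups of the columns of V with the largest infinity norms."""
    if V.shape[1] < n_atoms * group:
        raise ValueError("V needs at least n_atoms * group columns")
    norms = np.max(np.abs(V), axis=0)                  # infinity norm of each column
    top = np.argsort(norms)[::-1][:n_atoms * group]    # largest norms first
    groups = top.reshape(n_atoms, group)
    return np.stack([V[:, g].mean(axis=1) for g in groups], axis=1)

def nmf(V, n_atoms, n_iter=100, eps=1e-12, seed=0):
    """Factorize V ~= W @ H with multiplicative updates (Euclidean cost, assumed)."""
    rng = np.random.default_rng(seed)
    W = init_dictionary(V, n_atoms) + eps              # step 4.1
    H = rng.random((n_atoms, V.shape[1])) + eps        # step 4.2: random initialization
    for _ in range(n_iter):                            # step 4.3
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Small synthetic non-negative matrix of rank 8 standing in for V_ft.
rng = np.random.default_rng(1)
V = rng.random((64, 8)) @ rng.random((8, 200))
W, H = nmf(V, n_atoms=8, n_iter=100)
```

Starting the dictionary from averaged high-energy columns places the atoms near observed spectra from the first iteration, which is consistent with the stated aim of reducing the factorization time.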
Step 5: the integral-improved generalized cross-correlation algorithm is combined with the non-negative matrix factorization algorithm to generate a mask matrix, specifically:
step 5.1: the dictionary matrix produced by the non-negative matrix factorization is used to define a new frequency weighting function as follows:
Figure BDA0003613259260000096
wherein [Figure BDA0003613259260000097] is the new frequency weighting function;
step 5.2: the new frequency weighting function [Figure BDA0003613259260000098] is substituted into the definition of the integral-improved generalized cross-correlation algorithm of step 3, which combines the integral-improved generalized cross-correlation algorithm with the non-negative matrix factorization and gives the following formula:
Figure BDA0003613259260000101
wherein [Figure BDA0003613259260000102] is the new cross-correlation function;
step 5.3: the mask matrix is built on the fact that different sound sources correspond to different time delays τ_s. The time delays of the different sound sources are substituted in turn into [Figure BDA0003613259260000103], and each dictionary atom is attributed to the source for which [Figure BDA0003613259260000104] is largest: the corresponding element of that source's mask matrix is set to 1, and the corresponding elements of the other sources' mask matrices are set to 0. The mask matrix is defined as follows:
Figure BDA0003613259260000105
wherein M_dt is the mask matrix and s denotes the different sound sources.
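The mask construction of step 5 can be sketched as below. Because the formula images are missing, the new weighting function is assumed here to be the PHAT weighting multiplied by the dictionary atom W_fd, so that each atom gets its own cross-correlation value per delay and frame; this choice, the sign convention, and the name `atom_masks` are assumptions for illustration only. The input C is the (already time-smoothed) cross-power spectrum.

```python
import numpy as np

def atom_masks(W, C, freqs, source_taus, eps=1e-12):
    """Binary masks of shape (n_sources, n_atoms, n_frames)."""
    phat = C / np.maximum(np.abs(C), eps)                         # (F, T)
    steering = np.exp(2j * np.pi * np.outer(source_taus, freqs))  # (S, F)
    # G[s, d, t] = Re sum_f W[f, d] * phat[f, t] * steering[s, f]
    G = np.real(np.einsum("fd,ft,sf->sdt", W, phat, steering))
    winner = np.argmax(G, axis=0)                                 # (D, T)
    sources = np.arange(len(source_taus))[:, None, None]
    return (winner[None] == sources).astype(float)

# Toy case: low-frequency bins carry the delay of source 0, high ones of source 1.
freqs = np.fft.rfftfreq(64, d=1.0 / 16000)
source_taus = np.array([-1e-4, 1e-4])
low = freqs < 4000
true_tau = np.where(low, source_taus[0], source_taus[1])
C = np.exp(-2j * np.pi * freqs * true_tau)[:, None] * np.ones((1, 5))
W = np.stack([low.astype(float), (~low).astype(float)], axis=1)   # two atoms
masks = atom_masks(W, C, freqs, source_taus)
```

In the toy case the low-frequency atom is assigned to source 0 and the high-frequency atom to source 1 in every frame, and each atom-frame pair belongs to exactly one source.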
Step 6: multiplying the mask matrix and the coefficient matrix element by element to obtain a separated coefficient matrix;
Figure BDA0003613259260000106
Step 7: performing inverse non-negative matrix factorization, that is, multiplying the dictionary matrix by the separated coefficient matrix, to obtain the amplitude spectra of the separated sound sources;
Figure BDA0003613259260000107
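Steps 6 and 7 amount to one element-wise product and one matrix product per source. The sketch below assumes the masks are stacked along a leading source axis; the function name `separate_magnitudes` and the random test data are illustrative.

```python
import numpy as np

def separate_magnitudes(W, H, masks):
    """Return per-source amplitude spectra of shape (n_sources, F, T)."""
    H_sep = masks * H[None, :, :]               # step 6: mask each coefficient
    return np.einsum("fd,sdt->sft", W, H_sep)   # step 7: V_s = W @ H_s for every source

rng = np.random.default_rng(2)
W = rng.random((6, 3))                          # dictionary matrix W_fd
H = rng.random((3, 4))                          # coefficient matrix H_dt
winner = rng.integers(0, 2, size=(3, 4))        # source index per (atom, frame)
masks = np.stack([(winner == s).astype(float) for s in range(2)])
V_sep = separate_magnitudes(W, H, masks)
```

Because the binary masks assign every atom-frame pair to exactly one source, the separated amplitude spectra add up to the factorized mixture W @ H.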
Step 8: the amplitude spectra of the separated sound sources are combined with the angle spectrum, and an inverse short-time Fourier transform is performed to obtain the time-domain signals of the separated sound sources, which completes the separation. Specifically:
the amplitude spectra of the different sources obtained in step 7 are combined with the angle spectrum obtained in step 2, and an inverse short-time Fourier transform is performed to obtain the time-domain signals of the separated sources. The information of the different sound sources is thereby separated, that is, the sound signals of the different transmission devices are separated. The inverse short-time Fourier transform is given by the following formula:
Figure BDA0003613259260000108
wherein [Figure BDA0003613259260000111] denotes the separated sound signals of the different transmission devices, [Figure BDA0003613259260000112] is the amplitude spectrum of each sound source, and φ_ft is the angle spectrum.
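Step 8 can be sketched with a weighted overlap-add inverse transform. The normalization by the summed squared window is a common synthesis choice and an assumption here, as are the function names; the round trip below uses the mixture's own amplitude spectrum in place of a separated one, only to show that amplitude and angle together reconstruct the waveform.

```python
import numpy as np

def stft(x, win_len=1024, hop=128):
    """Hamming-windowed STFT, returned as a (frequency, time) matrix."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames, axis=0)

def istft(X, win_len=1024, hop=128):
    """Weighted overlap-add inverse of stft()."""
    window = np.hamming(win_len)
    n_frames = X.shape[1]
    length = win_len + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(X, n=win_len, axis=0)
    for i in range(n_frames):
        start = i * hop
        out[start:start + win_len] += frames[:, i] * window
        norm[start:start + win_len] += window ** 2
    # The Hamming window is never zero, so norm is positive everywhere.
    return out / np.maximum(norm, 1e-12)

rng = np.random.default_rng(3)
x = rng.standard_normal(8192)
X = stft(x)
V, phi = np.abs(X), np.angle(X)
y = istft(V * np.exp(1j * phi))   # in practice V is a separated amplitude spectrum
```

For an actual separation, the amplitude spectrum of each source from step 7 is substituted for V while the angle spectrum of the mixture is reused.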
The present invention is further illustrated by the following examples.
Example 1:
The input data is a two-source audio mixture recorded on site in a laboratory, and the output data are the separated single-channel audio signals.
First, the audio mixture is collected using a microphone array and two loudspeakers. The loudspeakers simulate the sound emitted by transmission devices, that is, a scene in which two transmission devices work simultaneously: the loudspeakers play audio at the same time and the microphone array records it. Next, the program parameters are set. The method uses the short-time Fourier transform for time-frequency analysis, with a window width of 1024 and a window overlap of 87.5%. The generalized cross-correlation algorithm is used to estimate the time difference of arrival, whose range is divided into 128 intervals. The size of the dictionary in the non-negative matrix factorization, that is, the column dimension of the dictionary matrix, is set to 128, and the maximum number of iterations of the iterative factorization is set to 100. The microphone spacing is set to 8 cm, as in the actual setup.
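The parameters of Example 1 imply a few derived quantities, worked out below. The speed of sound (343 m/s) is an assumption, and treating the 128 intervals as 128 evenly spaced candidate delays is one reading of the text; the variable names are illustrative.

```python
import numpy as np

win_len = 1024                   # STFT window width (from Example 1)
overlap = 0.875                  # window overlap (from Example 1)
hop = int(round(win_len * (1 - overlap)))   # samples between consecutive frames

mic_distance = 0.08              # microphone spacing in metres (from Example 1)
speed_of_sound = 343.0           # m/s, assumed (air at about 20 degrees C)
tau_max = mic_distance / speed_of_sound     # largest physically possible delay

n_delays = 128                   # resolution of the delay axis (from Example 1)
taus = np.linspace(-tau_max, tau_max, n_delays)
tau_step = taus[1] - taus[0]

n_atoms = 128                    # dictionary size (from Example 1)
max_iter = 100                   # upper limit on NMF iterations (from Example 1)
```

With these values the hop is 128 samples and the delay axis spans roughly ±233 microseconds, which bounds the search range of the peak detection in step 3.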
Fig. 3a compares the time-domain waveforms of source signal 1 in Example 1 and source signal 1 as separated by the method of the present invention; Fig. 3b compares the time-frequency spectra of the same two signals; Fig. 3c compares the time-domain waveforms of source signal 2 in Example 1 and source signal 2 as separated by the method of the present invention; Fig. 3d compares the time-frequency spectra of the same two signals.
As Figs. 3a-3d show, the transmission device signal separation method based on the improved generalized cross-correlation provided by the present invention effectively separates the mixed signal into clean source signals, with good separation in both the time domain and the frequency domain and good robustness.
Further, with the audio data set unchanged, the ICA method was used for a comparison experiment. Fig. 4a compares the time-domain waveforms of source signal 1 in Example 1 and source signal 1 as separated by the ICA method; Fig. 4b compares the time-frequency spectra of the same two signals; Fig. 4c compares the time-domain waveforms of source signal 2 in Example 1 and source signal 2 as separated by the ICA method; Fig. 4d compares the time-frequency spectra of the same two signals.
The experimental results in Figs. 4a-4d show that the ICA method separates the mixed signals poorly, with a low signal-to-noise ratio, and performs worse than the transmission device signal separation method based on the improved generalized cross-correlation provided by the invention. Compared with the ICA method, the blind source separation method that combines the improved generalized cross-correlation with non-negative matrix decomposition effectively solves the signal separation problem of transmission devices and has considerable practical significance and application value.
Example 2:
To further demonstrate the advantages of the transmission device signal separation method based on the improved generalized cross-correlation, multi-source mixed signals were recorded for 1.5 hours and divided by number of sources (two or three) into two audio data sets, each containing 270 two-channel audio segments of 10 seconds. The proposed method, the ICA method, and the principal component analysis (PCA) method were each tested on both data sets, and the results were compared. The methods were evaluated quantitatively with the BSS-EVAL toolkit, which is widely used in the field of blind source separation, using three metrics: the signal-to-artifacts ratio (SAR), the signal-to-distortion ratio (SDR), and the signal-to-interference ratio (SIR).
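All three metrics are energy ratios expressed in decibels. The sketch below shows only that basic form, as a plain signal-to-error ratio between a reference source and its estimate. It is a simplification and not the BSS-EVAL computation: BSS-EVAL additionally splits the error into interference and artifact components, which is not attempted here. The function name is hypothetical.

```python
import numpy as np

def signal_to_error_db(reference, estimate):
    """Energy of the reference over energy of the estimation error, in dB."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = estimate - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))

reference = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 1.1 * reference        # estimate with a 10% amplitude error
ratio_db = signal_to_error_db(reference, estimate)
```

A 10% amplitude error leaves an error energy of one hundredth of the reference energy, which corresponds to about 20 dB; higher values indicate a better estimate.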
As Figs. 5 and 6 show, the method provided by the present invention performs well on every evaluation metric on both audio data sets: on each data set, its SAR, SDR, and SIR all exceed those of the other two methods. Taking both data sets together, the multi-transmission-device signal separation method based on the improved generalized cross-correlation has the best performance and the best robustness of the three methods.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the scope of the present invention, which is defined by the appended claims.

Claims (7)

1. A method for separating transmission device signals based on integral improved generalized cross-correlation is characterized by comprising the following steps:
step 1: collecting original dual-channel audio mixed signals and preprocessing the mixed signals;
step 2: performing time-frequency analysis on the dual-channel audio mixed signal to obtain time-frequency information of the mixed signal, wherein the time-frequency information comprises an amplitude spectrum and an angle spectrum;
step 3: estimating time delays using the integral improved generalized cross-correlation algorithm to obtain the time delays of the different sound sources;
step 4: performing non-negative matrix decomposition on the amplitude spectrum to obtain a dictionary matrix and a coefficient matrix;
step 5: combining the integral improved generalized cross-correlation algorithm with the non-negative matrix decomposition algorithm to generate a mask matrix;
step 6: multiplying the mask matrix and the coefficient matrix element by element to obtain the separated coefficient matrices;
step 7: performing inverse non-negative matrix decomposition, i.e. multiplying the dictionary matrix by the separated coefficient matrices, to obtain the amplitude spectra of the separated sound sources;
step 8: combining the amplitude spectra of the separated sound sources with the angle spectrum and performing an inverse short-time Fourier transform to obtain the time-domain information of the separated sound sources, thereby completing the separation.
2. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 1, wherein the step 1 is specifically as follows:
step 1.1: collecting the original audio mixed signals with a dual-channel microphone array, wherein during collection different sound sources are placed in different directions and emit sound simultaneously to simulate the sound emitted by different transmission devices;
step 1.2: preprocessing the original audio mixed signals to improve their signal-to-noise ratio and thereby the separation quality, wherein a polynomial least-squares method is used to remove the trend term.
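Polynomial least-squares detrending as in step 1.2 can be sketched as follows. The polynomial order is an assumption, since the claim does not state one, and the function name `remove_trend` is illustrative.

```python
import numpy as np

def remove_trend(x, order=2):
    """Subtract the least-squares polynomial fit of the given order from x."""
    x = np.asarray(x, dtype=float)
    n = np.linspace(-1.0, 1.0, len(x))   # scaled sample axis keeps the fit well conditioned
    coeffs = np.polyfit(n, x, order)     # least-squares polynomial coefficients
    return x - np.polyval(coeffs, n)

n = np.linspace(-1.0, 1.0, 2000)
drift = 0.5 + 0.3 * n + 0.2 * n ** 2              # slow trend added to the recording
recorded = np.sin(2 * np.pi * 40 * n) + drift     # oscillating signal plus trend
cleaned = remove_trend(recorded, order=2)
```

Because the fit includes a constant term, the detrended signal has zero mean, and a signal consisting only of a polynomial trend of the fitted order is removed entirely.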
3. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 1, wherein the step 2 is specifically as follows:
step 2.1: performing discrete short-time Fourier transform on the audio mixed signal:
Figure FDA0003613259250000021
where f is frequency, t is time, STFT(f, t) is the time-frequency information, k is the auxiliary summation variable, x(·) is the input signal, and g(·) is the window function, for which a Hamming window is used;
step 2.2: decomposing the time-frequency information using the following formula:
Figure FDA0003613259250000022
where V_ft is the amplitude spectrum, comprising the left-channel amplitude spectrum V_lft and the right-channel amplitude spectrum V_rft, and φ_ft is the angle spectrum.
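Steps 2.1 and 2.2 can be sketched in NumPy as below, assuming the standard discrete STFT with a Hamming window and the polar decomposition of each complex coefficient into amplitude and angle. The window width and hop are the Example 1 settings; the random two-channel signal is a stand-in for a recording.

```python
import numpy as np

def stft(x, window_len=1024, hop=128):
    """Discrete STFT with a Hamming window; returns shape (frequency, time)."""
    win = np.hamming(window_len)
    n_frames = 1 + (len(x) - window_len) // hop
    idx = hop * np.arange(n_frames)[:, None] + np.arange(window_len)[None, :]
    return np.fft.rfft(x[idx] * win, axis=1).T

rng = np.random.default_rng(0)
stereo = rng.standard_normal((2, 4096))     # stand-in for a two-channel recording

S_l, S_r = stft(stereo[0]), stft(stereo[1])
V_l, V_r = np.abs(S_l), np.abs(S_r)         # amplitude spectra of left and right channels
phi_l = np.angle(S_l)                       # angle spectrum (shown for the left channel)
```

Multiplying an amplitude spectrum by the complex exponential of its angle spectrum recovers the original time-frequency coefficients, which is what makes the later recombination in step 8 possible.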
4. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 3, wherein the step 3 is specifically as follows:
step 3.1: the basic generalized cross-correlation algorithm is defined as follows:
Figure FDA0003613259250000023
where τ is the time delay,
Figure FDA0003613259250000024
is the cross-power spectrum, ψ_ft is the frequency weighting function, and G_τt is the cross-correlation function;
step 3.2: improving the basic generalized cross-correlation algorithm by integration, wherein, before the cross-correlation function G_τt is computed by formula (3), the cross-power spectrum
Figure FDA0003613259250000025
is integrated along the time axis t: specifically, a window length is chosen, a sliding window is moved along each row of the cross-power spectrum
Figure FDA0003613259250000026
the mean of the values inside the window is computed and assigned to the element at the window center, and the cross-correlation function G_τt is then computed;
step 3.3: summing the resulting cross-correlation function G_τt along the t axis, which turns it into a one-dimensional delay profile
Figure FDA0003613259250000027
and using a peak detection algorithm to find the peaks of
Figure FDA0003613259250000028
wherein the delays at the abscissas of the peaks are the estimated time delays τ_s (s = 1, 2, ..., n), s indexing the sound sources and n being the number of sound sources.
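Steps 3.1 to 3.3 can be sketched as follows. Several choices here are assumptions rather than details taken from the claim: the frequency weighting is taken to be the PHAT weighting, the smoothing window is 5 frames, the sampling rate is 16 kHz, the speed of sound is 343 m/s, and peak detection is a simple pick of the largest local maxima. The test data are synthetic spectra in which each source is active in its own half of the frames.

```python
import numpy as np

def integral_gcc(S_l, S_r, freqs, taus, win_len=5):
    """Time-smoothed GCC; returns G with shape (delay, time)."""
    cross = S_l * np.conj(S_r)                           # cross-power spectrum, (F, T)
    kernel = np.ones(win_len) / win_len
    # Step 3.2: centered moving average along time on every frequency row
    # (frames near the edges are averaged against implicit zeros).
    smooth = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, cross)
    phat = smooth / np.maximum(np.abs(smooth), 1e-12)    # ASSUMED PHAT weighting
    steer = np.exp(-2j * np.pi * np.outer(taus, freqs))  # (delay, F)
    return np.real(steer @ phat)

def pick_delays(G, taus, n_sources):
    """Step 3.3: sum over time, then keep the n largest local maxima."""
    profile = G.sum(axis=1)
    i = np.arange(1, len(profile) - 1)
    peaks = i[(profile[i] > profile[i - 1]) & (profile[i] >= profile[i + 1])]
    best = peaks[np.argsort(profile[peaks])[::-1][:n_sources]]
    return np.sort(taus[best])

fs = 16000.0                                     # ASSUMED sampling rate in Hz
freqs = np.fft.rfftfreq(1024, d=1.0 / fs)        # 513 bin frequencies
max_delay = 0.08 / 343.0                         # 8 cm spacing, assumed 343 m/s
taus = np.linspace(-max_delay, max_delay, 128)   # 128 candidate delays

rng = np.random.default_rng(0)
T = 40
A = rng.standard_normal((513, T)) + 1j * rng.standard_normal((513, T))
true_delays = np.array([-1.0e-4, 1.5e-4])        # synthetic source delays in seconds
active = np.where(np.arange(T) < T // 2, 0, 1)   # source 0 first half, source 1 second half
shift = np.exp(-2j * np.pi * freqs[:, None] * true_delays[active][None, :])
S_l, S_r = A, A * shift                          # right channel is a delayed left channel

estimated = pick_delays(integral_gcc(S_l, S_r, freqs, taus), taus, n_sources=2)
```

With this sign convention a positive delay means the right channel lags the left one. The estimates land on the grid points nearest the true delays, so their accuracy is limited by the spacing of the 128 candidate delays.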
5. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 4, wherein the step 4 is specifically as follows:
step 4.1: initializing the dictionary matrix: several column vectors with the largest infinity norm are selected from the amplitude spectrum V_ft and averaged to form a column of the dictionary matrix, the infinity norm of a column being defined as:
Figure FDA0003613259250000031
where ‖vec‖_∞ is the infinity norm of the vector vec, i.e. its largest absolute element, v_i is an element of the vector, and len is the length of the vector;
step 4.2: initializing the coefficient matrix randomly;
step 4.3: after the dictionary matrix W_fd and the coefficient matrix H_dt have been initialized, iterating several times with the following update formulas to solve for the decomposed dictionary matrix and coefficient matrix:
Figure FDA0003613259250000032
Figure FDA0003613259250000033
where
Figure FDA0003613259250000034
is the dictionary matrix obtained at the m-th iteration, and
Figure FDA0003613259250000035
is the coefficient matrix obtained at the m-th iteration.
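Steps 4.1 to 4.3 can be sketched as below, with two loudly stated assumptions. First, the grouping rule in `init_dictionary` (ranking columns by infinity norm and averaging consecutive groups of five) is only one possible reading of the initialization. Second, the update rules shown are the standard multiplicative updates for the Euclidean cost, swapped in for illustration; the update formulas of the claim itself are not reproduced here and may differ.

```python
import numpy as np

def init_dictionary(V, n_atoms, group=5):
    """Step 4.1 (one reading): average groups of the highest-infinity-norm columns."""
    inf_norm = np.max(np.abs(V), axis=0)                  # infinity norm of each column
    top = np.argsort(inf_norm)[::-1][: n_atoms * group]   # columns with the largest norm
    return V[:, top].reshape(V.shape[0], n_atoms, group).mean(axis=2)

def nmf(V, n_atoms, n_iter=100, seed=0, eps=1e-12):
    """Steps 4.2-4.3 with ASSUMED Euclidean multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = init_dictionary(V, n_atoms)
    H = rng.random((n_atoms, V.shape[1]))                 # random coefficient initialization
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)              # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)              # update dictionary
    return W, H

rng = np.random.default_rng(1)
V = rng.random((64, 8)) @ rng.random((8, 200))            # synthetic non-negative "spectrum"
W, H = nmf(V, n_atoms=8)
```

Multiplicative updates keep both factors non-negative as long as the initial values are non-negative, which is why no explicit projection step is needed.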
6. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 5, wherein the step 5 is specifically as follows:
step 5.1: defining a new frequency weighting function from the dictionary matrix produced by the non-negative matrix decomposition:
Figure FDA0003613259250000036
where
Figure FDA0003613259250000037
is the new frequency weighting function;
step 5.2: substituting the new frequency weighting function
Figure FDA0003613259250000038
into the definition of the integral improved generalized cross-correlation algorithm of step 3, thereby combining the integral improved generalized cross-correlation algorithm with the non-negative matrix decomposition and obtaining the following formula:
Figure FDA0003613259250000041
where
Figure FDA0003613259250000042
is the new cross-correlation function;
step 5.3: generating the mask matrix: since different sound sources correspond to different time delays τ_s, the time delays of the different sound sources are each substituted into
Figure FDA0003613259250000043
and each dictionary atom is attributed to the sound source for which
Figure FDA0003613259250000044
is largest; the element at the corresponding position of that sound source's mask matrix is set to 1, and otherwise to 0, the mask matrix being defined as:
Figure FDA0003613259250000045
where M_dt is the mask matrix and s denotes the different sound sources.
7. The method for separating the signals of the transmission device based on the integral improved generalized cross-correlation as claimed in claim 5, wherein the step 8 is specifically as follows:
combining the amplitude spectra of the different sources obtained in step 7 with the angle spectrum obtained in step 2, and performing an inverse short-time Fourier transform to obtain the time-domain information of the separated sources, thereby separating the sound sources, i.e. the sound signals of the different transmission devices, the inverse short-time Fourier transform being:
Figure FDA0003613259250000046
where
Figure FDA0003613259250000047
denotes the separated sound signals of the different transmission devices,
Figure FDA0003613259250000048
is the amplitude spectrum of each sound source, and φ_ft is the angle spectrum.
CN202210439737.8A 2022-04-25 2022-04-25 Transmission device signal separation method based on integral improved generalized cross-correlation Pending CN114822584A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210439737.8A CN114822584A (en) 2022-04-25 2022-04-25 Transmission device signal separation method based on integral improved generalized cross-correlation

Publications (1)

Publication Number Publication Date
CN114822584A (en) 2022-07-29

Family

ID=82508422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210439737.8A Pending CN114822584A (en) 2022-04-25 2022-04-25 Transmission device signal separation method based on integral improved generalized cross-correlation

Country Status (1)

Country Link
CN (1) CN114822584A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116597856A (en) * 2023-07-18 2023-08-15 山东贝宁电子科技开发有限公司 Voice quality enhancement method based on frogman intercom
CN116597856B (en) * 2023-07-18 2023-09-22 山东贝宁电子科技开发有限公司 Voice quality enhancement method based on frogman intercom
CN117825898A (en) * 2024-03-04 2024-04-05 国网浙江省电力有限公司电力科学研究院 GIS distributed vibration and sound combined monitoring method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination