CN110610717B - Separation method of mixed signals in complex frequency spectrum environment - Google Patents

Separation method of mixed signals in complex frequency spectrum environment

Info

Publication number
CN110610717B
CN110610717B CN201910810854.9A CN201910810854A
Authority
CN
China
Prior art keywords
sampling
layer
data
signal
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910810854.9A
Other languages
Chinese (zh)
Other versions
CN110610717A (en)
Inventor
余湋
马松
张毅
刘田
陈霄南
王军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Electronic Technology Institute No 10 Institute of Cetc
Original Assignee
Southwest Electronic Technology Institute No 10 Institute of Cetc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Electronic Technology Institute No 10 Institute of Cetc
Priority to CN201910810854.9A
Publication of CN110610717A
Application granted
Publication of CN110610717B
Active legal-status Current
Anticipated expiration legal-status


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0224 Processing in the time domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method for separating mixed signals in a complex frequency spectrum environment, which solves the problem of separating multiple highly overlapped mixed signals. The method has the following technical features. Based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; the down-sampling coding network compresses the data features and the up-sampling decoding network restores the data size. The receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain-windowed signal and its amplitude is computed, completing the spectrum reconstruction. After the receiver completes the time-domain windowing and spectrum reconstruction of the mixed signal, the IQ data and the amplitude spectrum form a tensor used as the network input, from which the IQ data and amplitude spectrum of the target signal are obtained. In a Gaussian-noise environment, the improved U-Net-based semantic segmentation network learns the signal features in the received data and separates the complex-baseband IQ data and amplitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.

Description

Separation method of mixed signals in complex frequency spectrum environment
Technical Field
The invention belongs to the field of spectrum sensing in wireless communication, and relates to a mixed-signal separation method based on a semantic segmentation network.
Background
Semantic segmentation is an important research branch of computer vision: given a picture, every pixel in the picture is assigned a class, so the segmented result is an image of several colors in which each color represents one class. Image semantic segmentation is an important branch of AI and a key step of image understanding in machine vision. Unlike classification, semantic segmentation must judge the category of every pixel of an image and segment it precisely. Deep learning is a branch of machine learning that mainly refers to deep neural network algorithms. A neural network is an artificial neuron system built by imitating human neurons: each neuron has multiple inputs and a single output, and that output serves as the input of the next neuron; a large number of individual neurons organized together form a neural network. Deep neural networks, however, are generally applied only to the separation or enhancement of sparse speech signals, or to the separation of signals with a certain periodicity. A semantic segmentation network can return pixel-level labels and segment the targets in the input data according to those labels. With the development of deep-learning semantic segmentation technology, semantic segmentation networks have also been applied to mixed-signal separation.
In the non-cooperative reception of communication signals, single-channel mixed signals are widespread in short-wave, ultra-short-wave and satellite channels for a variety of reasons: communication systems that adopt frequency reuse, complex electromagnetic environments, intentional or unintentional interference from other systems, or the limited receiving region and prior knowledge of a third-party receiver. Because the signals are aliased in both the time domain and the frequency domain, traditional time-domain or frequency-domain filtering cannot effectively separate the source signals from the mixture, which hampers signal analysis and information extraction. Separating multiple time-frequency-aliased signal components is inherently an ill-posed problem of estimating more unknowns from fewer observations. In a complex electromagnetic environment, the signal received by a sensor is very complex, consisting mainly of echo signals, interference signals, clutter and internal noise; the wide spectrum, unknown characteristics and complex, variable waveforms of these signals create practical difficulties for signal processing. For example, the signal received by a passive sonar may be a mixture of several completely unknown signals whose transmission channel is also unknown, or time-varying with temperature and ocean currents (as in a marine environment). Since the spectra of the received signals are aliased, they are difficult to separate in the frequency domain.
The main multi-signal separation methods at present transform the signal from the time domain to the frequency domain and separate and identify it there, or use time-frequency analysis tools such as the wavelet transform to detect and analyze the signal. These methods exploit differences between the signals in the time and frequency domains, i.e. they assume a relatively ideal signal environment in which the spectra do not alias. When the spectra are aliased, the original signals cannot be separated in the frequency domain, and separation, sorting and identification by these methods become very difficult. For a mixture whose source signals satisfy the independent-and-identically-distributed condition, when the probability density function of a source signal is severely smeared, a straight line of the mixed signal on the probability-density contour map may pass through two branches of the joint probability-density contour of the source signals; in that case the source signal cannot be separated from a single-channel mixture even if the mixing coefficients are known.
For the problem of signal separation in a complex spectrum environment where multiple signals are highly overlapped, the prior art has the following shortcomings. There is no effective means of separating mixed signals with high time-frequency aliasing, and existing methods impose sparsity and periodicity requirements on the signals to be separated; signals in practical communication systems that do not satisfy these requirements cannot be separated effectively. Mixed-signal separation is usually achieved by isolating the individual signal sources, relying on multiple receiving antennas and on clustering, matching and similar algorithms over the multi-channel received data to locate each source. However, when there are many signal sources and the data received by a single antenna is highly overlapped in the time-frequency domain, neither traditional methods nor existing deep-neural-network methods achieve effective separation.
Disclosure of Invention
The invention aims to provide a method for separating mixed signals in a complex frequency spectrum environment that has good separation performance and effectively suppresses noise, addressing the problem of separating signals that overlap heavily (severe time-frequency-domain overlap) and the shortcomings of existing mixed-signal separation techniques, so as to separate multiple highly overlapped mixed signals in a complex spectrum environment.
The above object of the present invention is achieved by the following measures. A method for separating mixed signals in a complex spectrum environment has the following technical features: based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; the down-sampling coding network compresses the data features and the up-sampling decoding network restores the data size. The receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain-windowed signal and its amplitude is computed, completing the spectrum reconstruction. After the receiver completes the time-domain windowing and spectrum reconstruction of the mixed signal, the IQ data and the amplitude spectrum form a tensor used as the network input, from which the IQ data and amplitude spectrum of the target signal are obtained. In a Gaussian-noise environment, the improved U-Net-based semantic segmentation network learns the signal features in the received data and separates the complex-baseband IQ data and amplitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.
Compared with the prior art, the invention has the following beneficial effects.
For the problem of separating signals that overlap heavily (severe time-frequency-domain overlap), the semantic segmentation network outputs the IQ data and amplitude spectrum of the target signal through an output network. Time-domain windowing and spectrum reconstruction are applied to the IQ data of the received mixed signal, and the windowing suppresses spectral leakage during spectrum reconstruction. The improved U-Net-based semantic segmentation network separates the baseband mixed signal in a Gaussian-noise environment, solving the difficulty of separating mixtures that overlap in both the time and frequency domains: even when the mixing mode is unknown, the features of the mixed signal can be extracted while training the semantic segmentation network of a target signal, and time- and frequency-domain separation of the mixture is achieved by traversing the semantic segmentation network of each target signal. The output locates the position of the target category, achieves good separation performance in the time-frequency domain, and effectively suppresses the influence of noise. Simulation results show that the invention has strong tracking capability and a wide application range, and achieves good separation performance at low signal-to-noise ratio in complex spectrum environments.
The semantic segmentation network is constructed on U-Net, the classical semantic segmentation structure, with modified convolution and pooling kernel sizes, loss function and output size. U-Net adopts an encoding-decoding structure and promotes multi-scale feature fusion through channel concatenation: each time data passes through an up-sampling layer, it is fused with the features of the down-sampling layer of the same data size. This yields better signal separation in low-SNR, complex spectrum environments. U-Net is well suited to small-sample, large-scale data, so it can be applied to the signal separation problem and can process long time-frequency sampling sequences. The method improves the U-Net semantic segmentation model by replacing the pixel-label cross-entropy with the mean square error of the time-frequency-domain waveform as the loss function, and by adjusting the convolution and pooling kernel sizes to suit one-dimensional time-frequency sampling sequences; the time-frequency-domain features of the signal are extracted during training, separation of mixed signals is realized, and the problems of the prior art are solved to a certain extent.
Drawings
FIG. 1 is a diagram of a semantic segmentation network for separating mixed signals under Gaussian noise environment according to an embodiment of the present invention;
fig. 2 is a block diagram of a down-sampling module according to an embodiment of the present invention;
FIG. 3 is a block diagram of an upsampling module provided by an embodiment of the present invention;
FIG. 4 is a complex baseband I waveform of a mixed signal according to an embodiment of the present invention;
FIG. 5 is a schematic comparison between the complex baseband I-path waveform of the time-domain Gaussian pulse signal separated by the present invention and the complex baseband I-path waveform of the original time-domain Gaussian pulse signal;
FIG. 6 is a schematic diagram of a mixed signal magnitude spectrum provided by an embodiment of the invention;
FIG. 7 is a schematic comparison of the amplitude spectrum of the multi-tone signal separated from the mixed signal by the present invention and the amplitude spectrum of the original multi-tone signal;
FIG. 8 is a time-frequency-domain mean-square-error plot of the multi-tone signal separated from the mixed signal according to the present invention.
In order to make the objects, technical means and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments.
Detailed Description
Referring to fig. 1, the semantic segmentation network designed by the present invention is based on U-Net, the classical semantic segmentation structure: a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules, and input data passes first through the down-sampling coding network and then through the up-sampling decoding network. Signal features are learned from the received data during training. The receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the received mixed signal: a fast Fourier transform (FFT) of length N is applied to the time-domain-windowed signal and its amplitude is computed, completing the spectrum reconstruction. After the receiver completes the time-domain windowing and spectrum reconstruction of the mixed signal, the IQ data and amplitude spectrum of the mixed signal form a tensor used as the input of the semantic segmentation network, from which the IQ data and amplitude spectrum of the target signal are obtained. In a Gaussian-noise environment, the improved U-Net-based semantic segmentation network learns the signal features in the received data and separates the complex-baseband IQ data and amplitude-spectrum data of the target signal from the mixed signal in parallel, recovering the source signal.
The first layer of the semantic segmentation network is the input layer, with input size (512, 1024, 1, 3); the input layer is followed by the down-sampling coding network, the fourth down-sampling module is followed by the up-sampling decoding network, and the fourth up-sampling module is followed by the output network. The data pass through the first, second, third and fourth down-sampling modules in sequence, then through the first, second, third and fourth up-sampling modules, and finally through a Conv2D convolution-ReLU activation layer and a Conv2D output layer. The input is the tensor (X_I, X_Q, Y), where X_I is the I-path data stream, X_Q is the Q-path data stream, and Y is the amplitude-spectrum data stream. The feature concatenations (skip connections) are arranged as follows: the second up-sampling module concatenates the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module; the third up-sampling module concatenates the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module; the fourth up-sampling module concatenates the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module; and after the output of the up-sampling layer of the fourth up-sampling module is concatenated with the input of the down-sampling layer of the first down-sampling module, the result passes through a convolution-ReLU activation layer with convolution kernel size (3, 1) and a convolution layer with kernel size (1, 1) whose channel number is 3.
The specific structure of the down-sampling coding network is as follows:
the first down-sampling module uses the basic structure of a down-sampling coding network, and the number of channels of the convolutional layer and the pooling layer is 32.
The second down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 64;
the third down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 128;
the fourth down-sampling module uses a basic structure of a down-sampling coding network, and the number of channels of the convolution layer and the pooling layer is 256;
the fourth down-sampling module is followed by an up-sampling decoding network, and the specific structure of the up-sampling decoding network is as follows:
the first up-sampling module uses the basic structure of an up-sampling decoding network, the number of channels of a convolution kernel is 512, and the number of channels of an up-sampling layer is 256.
The second up-sampling module splices the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 256, and the number of channels of an up-sampling layer is 128;
the third up-sampling module splices the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 128, and the number of channels of an up-sampling layer is 64;
and the fourth up-sampling module splices the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module. Using a basic structure of an up-sampling decoding network, wherein the number of channels of a convolution kernel is 64, and the number of channels of an up-sampling layer is 32;
the output network behind the fourth up-sampling module is spliced with the input of the down-sampling layer of the first down-sampling module, and then the number of channels of the convolutional layer is 32 through a convolutional-ReLU active layer with the convolutional kernel size of (3, 1); finally, IQ two paths and amplitude spectrum data of the separated target signal are respectively output by a layer of convolution layer with convolution kernel size of (1,1), wherein the number of the convolution layer channels is 3, and the three channels are respectively output. The output size is (512,1024,1, 3). The loss function of the network is a mean square error function, the optimizer is Adam, and the learning rate is 0.001.
The basic structure of the down-sampling coding network consists of two convolution layers, an activation layer and a strided pooling down-sampling layer; this basic structure is stacked 4 times to form the down-sampling coding network. The basic structure of the up-sampling decoding network consists of two convolution layers, an activation layer and an up-sampling layer; stacking it 4 times forms the up-sampling decoding network. After each up-sampling, the features are concatenated with the down-sampling coding layer of the same size, and the concatenated data pass through a three-channel convolution output layer whose channels respectively output the IQ data and amplitude-spectrum data of the separated target signal, (X_outputI, X_outputQ, Y_output), where X_outputI denotes the complex-baseband I-path data of the separated target signal, X_outputQ the complex-baseband Q-path data, and Y_output the amplitude-spectrum data.
The training set of the semantic segmentation network takes as input a sample set of tensors formed from the IQ data and amplitude-spectrum data of the mixed signal; the training labels are tensors (X_labelI, X_labelQ, Y_label) formed from the IQ data and amplitude spectrum of the target signal, where X_labelI denotes the complex-baseband I-path data of the target-signal label, X_labelQ the complex-baseband Q-path data, and Y_label the amplitude-spectrum data. The validation data set is generated in the same way as the training data set.
See fig. 2. The down-sampling coding network is composed of down-sampling modules. Each down-sampling module comprises two Conv2D-ReLU activated convolution layers and one MaxPooling2D max-pooling down-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the MaxPooling2D layer halves its input size to form its output. In this example the Conv2D-ReLU convolution kernel size is (3, 1) and the down-sampling stride of the max-pooling layer is (2, 1). In the down-sampling module structure shown in fig. 2, the module first extracts features through the Conv2D-ReLU convolution layers and then halves the data size through the max-pooling layer with pooling stride (2, 1).
See fig. 3. The up-sampling decoding network is composed of up-sampling modules. Each up-sampling module comprises two Conv2D-ReLU activated convolution layers and one up-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the up-sampling layer doubles its input size to form its output. In this example the convolution kernel size is (3, 1) and the up-sampling step of the up-sampling layer is (2, 1). In the up-sampling module structure illustrated in fig. 3, the module first extracts features through the Conv2D-ReLU convolution layers and then doubles the data size through the up-sampling layer with step (2, 1). Specifically, the ReLU activation function, the up-sampling operation and the mean-square-error function involved in the network structure are as follows. ReLU is the rectified linear unit activation function commonly used in convolutional neural networks, expressed as ReLU(x) = max(0, x). The parameters of the up-sampling operation in the up-sampling module are the step (size(0), size(1)): the operation repeats the data size(0) times along the rows of the input and size(1) times along its columns. Taking data of size (B, 1024, 1, 3) as an example, up-sampling with step (2, 1) yields data of size (B, 2048, 1, 3). The mean square error loss function is defined as follows:
MSE = (1/N) Σ_{i=1}^{N} ω_i (y_i - ŷ_i)²
where N is the data length, ω_i is the probability that y = y_i, y_i is the label data, and ŷ_i is the network output data.
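As a cross-check of the loss definition, below is a minimal sketch in TensorFlow. The optional weight tensor for ω_i is an assumption about how the probabilities would be supplied; omitting it reduces the function to the plain mean square error with which the network is compiled.

```python
import tensorflow as tf

def weighted_mse(y_true, y_pred, weights=None):
    # (1/N) * sum_i w_i * (y_i - yhat_i)^2; with weights=None this is the
    # ordinary mean square error between the label data and network output.
    err = tf.square(y_true - y_pred)
    if weights is not None:
        err = weights * err
    return tf.reduce_mean(err)
```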
The specific implementation comprises the following steps:
step 1: and the receiver performs time domain windowing and frequency spectrum reconstruction processing on the IQ two-path data of the received mixed signal. IQ two-path data of the mixed signal and the amplitude spectrum data form tensor data as network input; IQ two-path data of the target signal and the amplitude spectrum form tensor data as a network tag.
In an alternative embodiment, taking the I path as an example, let the I-path data stream received by the receiver be the mixed signal X_I(i), i = 1, 2, ..., N. A time-domain window function W_I(i), whose window length equals the number of sampling points N, is applied to give the time-domain windowed result:
X_windowedI(i) = X_I(i)·W_I(i), i = 1, 2, ..., N
Spectrum reconstruction is then performed on this result: a fast Fourier transform (FFT) of length N is applied to the time-domain-windowed signal and its amplitude is computed, which completes the spectrum reconstruction.
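A minimal NumPy sketch of this step, assuming the Hanning window used in the embodiment below; the helper name reconstruct_spectrum and the choice to window the I and Q paths with the same window are assumptions consistent with the I-path example above.

```python
import numpy as np

def reconstruct_spectrum(x_i, x_q, n=1024):
    # Time-domain window whose length matches the N sampling points.
    w = np.hanning(n)
    xw_i, xw_q = x_i * w, x_q * w          # X_windowed(i) = X(i) * W(i)
    # Length-N FFT of the windowed complex baseband and its amplitude:
    # this completes the spectrum reconstruction.
    mag = np.abs(np.fft.fft(xw_i + 1j * xw_q, n))
    return xw_i, xw_q, mag
```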
Step 2: the receiver constructs the semantic segmentation network and the sample set used for training, specifies a target signal, and trains the network. In a Gaussian-noise environment, this embodiment uses a semantic segmentation network whose input is the tensor (X_I, X_Q, Y) formed from the IQ data and amplitude-spectrum data of the mixed signal. The input layer of the network takes N as the sample length and B as the number of samples input each time, so the data are input in the sample format (B, N, 1, 3). The input data are compressed by the down-sampling coding network, the data size is then restored by the up-sampling decoding network, the complex-baseband IQ data and amplitude-spectrum data of the target signal are separated from the mixed signal in parallel, and finally the output layer outputs the baseband IQ data and amplitude-spectrum data of the target signal.
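Continuing the sketch, the (B, N, 1, 3) training tensors could be assembled as follows; the array names and the reuse of reconstruct_spectrum() and build_separation_net() from the earlier sketches are assumptions.

```python
import numpy as np

def make_tensor(x_i, x_q, mag):
    # Stack I, Q and amplitude spectrum as channels: three (B, N) arrays
    # become one (B, N, 1, 3) tensor in the network's sample format.
    return np.stack([x_i, x_q, mag], axis=-1)[:, :, np.newaxis, :]

# Hypothetical training flow: mixed_i/mixed_q are (B, N) arrays of received
# data, target_i/target_q the corresponding arrays of the target signal.
# x_train = make_tensor(*reconstruct_spectrum(mixed_i, mixed_q))
# y_train = make_tensor(*reconstruct_spectrum(target_i, target_q))
# model = build_separation_net(n=1024)
# model.fit(x_train, y_train, batch_size=B, epochs=...)
```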
Step 3: for a mixed signal that does not contain the target signal, the semantic segmentation network forms the IQ data and amplitude-spectrum data of the retained mixed signal into the tensor (X_I, X_Q, Y). The tensor (X_I, X_Q, Y) composed of the I path, Q path and amplitude spectrum of the mixed signal is fed to the input layer of the semantic segmentation network to obtain the I-path, Q-path and amplitude-spectrum tensor (X_outputI, X_outputQ, Y_output) of the target signal.
With the trained semantic segmentation network, the receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the mixed signal it receives, concatenates the IQ data with the amplitude spectrum, and inputs the result into the semantic segmentation network. One network model corresponds to exactly one target signal: a semantic segmentation network model that specifies a target signal separates only that target signal from the mixed signal. For a mixed signal containing the target signal, the network separates the tensor (X_outputI, X_outputQ, Y_output) composed of the IQ data and amplitude-spectrum data of the target signal.
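A sketch of this inference step under the same assumptions, with one trained model per target signal:

```python
import numpy as np

def separate_target(model, mixed_i, mixed_q):
    # Window the received IQ data, reconstruct the spectrum, form the input
    # tensor, and let this target's model separate its I, Q and amplitude data.
    x_in = make_tensor(*reconstruct_spectrum(mixed_i, mixed_q))
    y_out = model.predict(x_in)            # shape (B, N, 1, 3)
    i_hat, q_hat, mag_hat = np.moveaxis(y_out[:, :, 0, :], -1, 0)
    return i_hat, q_hat, mag_hat

# Traversing the model of each target signal separates every component:
# components = [separate_target(m, mixed_i, mixed_q) for m in models]
```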
In an alternative embodiment, the signal parameters and implementation steps are as follows:
Step 1: the receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the received mixed signal. A Hanning window of length N = 1024 is used, and the FFT length is N = 1024. This embodiment uses a mixture of a time-domain Gaussian pulse signal, a multi-tone signal, a linear sweep signal and a noise blocking signal.
Step 2: specify the target signal for the training set and construct the semantic segmentation network. The overall structure of the network is shown in fig. 1. The semantic segmentation network is trained with tensor samples formed from the processed I path, Q path and amplitude spectrum.
Fig. 4 shows the complex-baseband I-path waveform of the mixed signal of the embodiment; the signal-to-noise ratio of each signal is -5 dB, and the figure visually shows the severe time-domain overlap of the signals.
Fig. 5 compares the complex-baseband I-path waveform of the time-domain Gaussian pulse signal separated from the mixed signal with that of the original time-domain Gaussian pulse signal, visually showing the separation performance in the time domain. The small waveform difference shows that the invention separates well in the time domain.
Fig. 6 shows the amplitude-spectrum waveform of the mixed signal of the embodiment; the signal components and parameters are the same as in fig. 4, and the figure visually shows the severe frequency-domain overlap of the signals.
Fig. 7 compares the amplitude-spectrum waveform of the multi-tone signal separated from the mixed signal with that of the original multi-tone signal, visually showing the separation performance in the frequency domain. The small spectrum difference shows that the invention separates well in the frequency domain.
Fig. 8 plots the mean square error of the multi-tone signal separated from the mixed signal, computed as the sum of the time-domain and frequency-domain mean square errors of the separated signal; it quantifies the separation performance of the semantic segmentation network. The mean square error is below -20 dB at the lowest signal-to-noise ratio and below -40 dB at the highest, indicating good separation performance.
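For reference, the quality metric implied by fig. 8 could be computed as below; reading it as the sum of the time-domain (I and Q) and frequency-domain (amplitude-spectrum) mean square errors expressed in dB is an interpretation of the description above, not a formula stated in the patent.

```python
import numpy as np

def separation_mse_db(sep, ref):
    # sep and ref are (i, q, amplitude-spectrum) triples of equal shapes;
    # sum their mean square errors and convert the total to dB.
    mse = sum(np.mean((s - r) ** 2) for s, r in zip(sep, ref))
    return 10.0 * np.log10(mse)
```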
The parameters of the mixed signal used in the examples are shown in table 1.
Table 1 Mixed-signal parameter settings of the embodiment
[Table 1 is reproduced as an image in the original publication.]
The foregoing describes the preferred embodiment of the present invention. It is noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. Various modifications and improvements that do not depart from the spirit and substance of the invention will be apparent to those skilled in the art and are likewise considered to fall within the scope of the invention.

Claims (10)

1. A method for separating mixed signals in a complex frequency spectrum environment, having the following technical features: based on U-Net, a classical semantic segmentation network structure, a down-sampling coding network composed of down-sampling modules is connected to an up-sampling decoding network composed of up-sampling modules; the down-sampling coding network compresses the data features and the up-sampling decoding network restores the data size; the receiver applies time-domain windowing and spectrum reconstruction to the IQ data of the received mixed signal, performing a fast Fourier transform (FFT) of length N on the time-domain-windowed signal and computing its amplitude to complete the spectrum reconstruction; after the receiver completes the time-domain windowing and spectrum reconstruction of the mixed signal, the IQ data and the amplitude spectrum form a tensor used as the network input, from which the IQ data and amplitude spectrum of the target signal are obtained; in a Gaussian-noise environment, the improved U-Net-based semantic segmentation network learns the signal features in the received data, using the mean square error function as the loss function, and separates the complex-baseband IQ data and amplitude-spectrum data of the target signal from the mixed signal in parallel to recover the source signal.
2. The method for separating mixed signals in a complex spectrum environment according to claim 1, wherein: the first layer of the semantic segmentation network is an input layer with input size (512, 1024, 1, 3); the input layer is followed by the down-sampling coding network, the fourth down-sampling module is followed by the up-sampling decoding network, and the fourth up-sampling module is followed by the output network.
3. The method for separating mixed signals in a complex spectrum environment according to claim 1, wherein the feature concatenations of the down-sampling coding network are arranged as follows: the features of the first down-sampling module are concatenated before the Conv2D convolution-ReLU activation layer and the Conv2D output layer; the features of the second down-sampling module are concatenated with the fourth up-sampling module; the features of the third down-sampling module are concatenated with the third up-sampling module; and the features of the fourth down-sampling module are concatenated with the second up-sampling module.
4. The method for separating mixed signals in a complex spectrum environment according to claim 3, wherein: the data pass through the first, second, third and fourth down-sampling modules in sequence, then through the first, second, third and fourth up-sampling modules, and finally through the Conv2D convolution-ReLU activation layer and the Conv2D output layer; the input is the tensor (X_I, X_Q, Y); the second up-sampling module concatenates the output of the up-sampling layer of the first up-sampling module with the input of the down-sampling layer of the fourth down-sampling module; the third up-sampling module concatenates the output of the up-sampling layer of the second up-sampling module with the input of the down-sampling layer of the third down-sampling module; the fourth up-sampling module concatenates the output of the up-sampling layer of the third up-sampling module with the input of the down-sampling layer of the second down-sampling module; and after the output of the up-sampling layer of the fourth up-sampling module is concatenated with the input of the down-sampling layer of the first down-sampling module, the result passes through a convolution-ReLU activation layer with convolution kernel size (3, 1) and a convolution layer with kernel size (1, 1) whose channel number is 3, where X_I is the I-path data stream, X_Q is the Q-path data stream, and Y is the amplitude-spectrum data stream.
5. The method for separating mixed signals in a complex spectrum environment according to claim 3, wherein: the basic structure of the down-sampling coding network is formed by two convolution layers, an activation layer and a strided pooling down-sampling layer, and this basic structure is stacked 4 times to form the down-sampling coding network.
6. The method for separating mixed signals in a complex spectrum environment according to claim 1, wherein: the basic structure of the up-sampling decoding network is composed of two convolution layers, an activation layer and an up-sampling layer; this basic structure is stacked 4 times to form the up-sampling decoding network; after each up-sampling, the features are concatenated with the down-sampling coding layer of the same size, and the concatenated data pass through a three-channel convolution output layer whose channels respectively output the IQ data and amplitude-spectrum data of the separated target signal, (X_outputI, X_outputQ, Y_output), where X_outputI denotes the complex-baseband I-path data of the separated target signal, X_outputQ the complex-baseband Q-path data, and Y_output the amplitude-spectrum data.
7. The method for separating mixed signals in a complex spectrum environment according to claim 3, wherein: the down-sampling coding network is composed of down-sampling modules, each comprising two Conv2D-ReLU activated convolution layers and one MaxPooling2D max-pooling down-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the MaxPooling2D layer halves its input size to form its output.
8. The method for separating mixed signals in a complex spectrum environment according to claim 7, wherein: the down-sampling module first extracts features through the Conv2D-ReLU activated convolution layers and then halves the data size through a max-pooling layer with pooling stride (2, 1); the up-sampling module first extracts features through the Conv2D-ReLU convolution layers and then doubles the data size through an up-sampling layer with up-sampling step (2, 1).
9. The method for separating mixed signals in a complex spectrum environment according to claim 6, wherein: the up-sampling decoding network is composed of up-sampling modules, each comprising two Conv2D-ReLU activated convolution layers and one up-sampling layer; the convolution layers keep their input and output sizes equal by zero-padding, and the up-sampling layer doubles its input size to form its output.
10. The method for separating mixed signals in a complex spectrum environment according to claim 1, wherein: the I-path data stream received by the receiver is the mixed signal X_I(i), i = 1, 2, ..., N; a time-domain window function W_I(i), whose window length equals the number of sampling points N, is applied to give the time-domain windowed result:
X_windowedI(i) = X_I(i)·W_I(i), i = 1, 2, ..., N; spectrum reconstruction is then performed on the time-domain-windowed signal by applying a fast Fourier transform (FFT) of length N and computing its amplitude, which completes the spectrum reconstruction.
CN201910810854.9A 2019-08-30 2019-08-30 Separation method of mixed signals in complex frequency spectrum environment Active CN110610717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910810854.9A CN110610717B (en) 2019-08-30 2019-08-30 Separation method of mixed signals in complex frequency spectrum environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910810854.9A CN110610717B (en) 2019-08-30 2019-08-30 Separation method of mixed signals in complex frequency spectrum environment

Publications (2)

Publication Number Publication Date
CN110610717A CN110610717A (en) 2019-12-24
CN110610717B true CN110610717B (en) 2021-10-15

Family

ID=68890685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910810854.9A Active CN110610717B (en) 2019-08-30 2019-08-30 Separation method of mixed signals in complex frequency spectrum environment

Country Status (1)

Country Link
CN (1) CN110610717B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126332B (en) * 2019-12-31 2022-04-22 桂林电子科技大学 Frequency hopping signal classification method based on contour features
CN112272066B (en) * 2020-09-15 2022-08-26 中国民用航空飞行学院 Frequency spectrum data cleaning method used in airport terminal area very high frequency communication
CN112420065B (en) * 2020-11-05 2024-01-05 北京中科思创云智能科技有限公司 Audio noise reduction processing method, device and equipment
CN112434415B (en) * 2020-11-19 2023-03-14 中国电子科技集团公司第二十九研究所 Method for implementing heterogeneous radio frequency front end model for microwave photonic array system
CN113707164A (en) * 2021-09-02 2021-11-26 哈尔滨理工大学 Voice enhancement method for improving multi-resolution residual error U-shaped network
CN113782043B (en) * 2021-09-06 2024-06-14 北京捷通华声科技股份有限公司 Voice acquisition method, voice acquisition device, electronic equipment and computer readable storage medium
CN117935826B (en) * 2024-03-22 2024-07-05 深圳市东微智能科技股份有限公司 Audio up-sampling method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1008984A3 (en) * 1998-12-11 2000-08-02 Sony Corporation Wideband speech synthesis from a narrowband speech signal
US6463406B1 (en) * 1994-03-25 2002-10-08 Texas Instruments Incorporated Fractional pitch method
CN106537499A (en) * 2014-07-28 2017-03-22 弗劳恩霍夫应用研究促进协会 Apparatus and method for generating an enhanced signal using independent noise-filling

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6463406B1 (en) * 1994-03-25 2002-10-08 Texas Instruments Incorporated Fractional pitch method
EP1008984A3 (en) * 1998-12-11 2000-08-02 Sony Corporation Wideband speech synthesis from a narrowband speech signal
CN106537499A (en) * 2014-07-28 2017-03-22 弗劳恩霍夫应用研究促进协会 Apparatus and method for generating an enhanced signal using independent noise-filling

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Olga Slizovskaia et al., "End-to-end Sound Source Separation Conditioned on Instrument Labels," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 17 April 2019, pp. 306-310 *
Clement S. J. Doire et al., "Online Singing Voice Separation Using a Recurrent One-dimensional U-NET Trained with Deep Feature Losses," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 17 April 2019, pp. 3752-3754 *
Zhang, Shulin et al., "Simultaneous Arteriole and Venule Segmentation of Dual-Modal Fundus Images Using a Multi-Task Cascade Network," IEEE Access, 14 May 2019, pp. 57561-57565 *
Ma Song et al., "Mixed-Signal Spectrum Separation Under a Semantic Segmentation Network" (语义分割网络下的混合信号频谱分离), Telecommunication Engineering (电讯技术), 28 April 2020, pp. 413-420 *

Also Published As

Publication number Publication date
CN110610717A (en) 2019-12-24

Similar Documents

Publication Publication Date Title
CN110610717B (en) Separation method of mixed signals in complex frequency spectrum environment
CN107122738B (en) Radio signal identification method based on deep learning model and implementation system thereof
CN107220606B (en) Radar radiation source signal identification method based on one-dimensional convolutional neural network
CN107944442B (en) Based on the object test equipment and method for improving convolutional neural networks
CN108764077B (en) Digital signal modulation classification method based on convolutional neural network
CN111783558A (en) Satellite navigation interference signal type intelligent identification method and system
Wang et al. Radar emitter recognition based on the short time Fourier transform and convolutional neural networks
CN113312996B (en) Detection and identification method for aliasing short-wave communication signals
CN101510309A (en) Segmentation method for improving water parting SAR image based on compound wavelet veins region merge
CN114254141B (en) End-to-end radar signal sorting method based on depth segmentation
Xie et al. Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform
CN111680539A (en) Dynamic gesture radar recognition method and device
CN102496144A (en) NSCT (nonsubsampled contourlet transform) sonar image enhancement method based on HSV (hue, saturation and value) color space
CN112183300A (en) AIS radiation source identification method and system based on multi-level sparse representation
CN107864020A (en) The transform domain extracting method of underwater Small object simple component sound scattering echo
CN107564546A (en) A kind of sound end detecting method based on positional information
CN116894207A (en) Intelligent radiation source identification method based on Swin transducer and transfer learning
CN111181574A (en) End point detection method, device and equipment based on multi-layer feature fusion
CN106971392A (en) A kind of combination DT CWT and MRF method for detecting change of remote sensing image and device
CN110491408A (en) A kind of music signal based on sparse meta analysis is deficient to determine aliasing blind separating method
CN114580476A (en) Unmanned aerial vehicle signal identification model construction method and corresponding identification method and system
CN103323853A (en) Fish identification method and system based on wavelet packets and bispectrum
CN114841195A (en) Avionics space signal modeling method and system
Hinderer Blind source separation of radar signals in time domain using deep learning
Huynh-The et al. WaveNet: Towards Waveform Classification in Integrated Radar-Communication Systems with Improved Accuracy and Reduced Complexity

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant