WO2007109960A1

WO2007109960A1 - Method, system and data signal detector for realizing data service

Info

Publication number: WO2007109960A1
Application number: PCT/CN2007/000423
Authority: WO
Inventors: Tong Jin
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2006-03-24
Filing date: 2007-02-07
Publication date: 2007-10-04
Also published as: CN101043759A; CN101043759B

Abstract

A method for realizing a data service in voice-band data mode comprises the following steps: a. a data signal detector divides the input signal frames into data signal frames and non-data signal frames, a voice coder then performs voice coding on the data signal frames, and a non-voice coder performs silence compression coding on the non-data signal frames; b. a voice decoder performs voice decoding on the voice-coded data signal frames to output valid data signals, and a non-voice decoder decodes the silence-compression-coded non-data signal frames to reconstruct silence signals. A system and a data signal detector using this method are also provided.

Description

Method, system and data signal detector for implementing data services

The present invention relates to media stream delivery technologies, and more particularly to a method, system and data signal detector for implementing data services.

Background art

The rapid development of network technology has made it possible to transmit multimedia using packet technology, and the convergence of traditional communication technologies and packet technologies is becoming increasingly evident. Because media in traditional communication networks and media packets in modern packet communication networks are encoded differently, combining a traditional network with a packet network requires a codec to convert the coding mode of the media stream. The device that implements this conversion is the gateway.

Currently, the media streams processed by a gateway mainly include voice streams, data streams, video streams, and the like. A data stream mainly refers to the signals exchanged by data devices such as fax machines, modems, and text telephones. At present, the industry mainly uses two ways to transmit data streams through gateways: VBD (Voice-Band Data) and Relay.

In VBD mode, the data stream is treated as an ordinary voice stream and processed by a low-loss codec, that is, a codec that causes relatively little signal impairment. The recommended codec standards are ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) G.711 and ITU-T G.726. The advantage of this mode is that it is simple to implement and does not need to interpret the specific data signal: the data signal only needs to be processed as voice, which places very little demand on the processing power of the gateway.

With the continuous development of data services, new types of data service keep emerging. The VBD mode has attracted more and more attention because of its simple implementation and wide range of application. However, the problem of the large bandwidth occupied by VBD has not been satisfactorily solved, which is unacceptable for many bandwidth-constrained applications.

In the prior art, voice signals are usually processed with voice detection and silence compression techniques to reduce bandwidth occupation. A speech signal is composed of voice signals and non-voice signals. The voice signal is the signal during a period of speech, and the non-voice signal is the signal when no one is speaking and only background noise is present. Typically, more than 30% to 50% of a speech signal is background noise without any voice signal. By exploiting this feature, voice detection and silence compression in speech coding can greatly reduce the code rate without affecting speech quality. FIG. 1 is a block diagram of a communication system employing voice detection and silence compression techniques. The entire system includes three modules: a sender gateway, a communication channel, and a receiver gateway. The sender gateway has a voice activity detector (VAD), a voice encoder and a non-voice encoder. The receiver gateway has a voice decoder and a non-voice decoder, and the communication channel is generally an IP network.

After the speech signal enters the sender gateway, the VAD analyzes each input speech frame and classifies it as a voice signal frame or a non-voice signal frame according to whether it carries a voice signal. If the input frame is a voice signal frame, the VAD directs it to the voice encoder for voice coding. Otherwise, the VAD directs it to the non-voice encoder for silence compression coding. The information output by the voice encoder is called a voice packet, and the information output by the non-voice encoder is called a silence packet. The background noise feature information in a silence packet is used only to recover the background noise, so its code rate is very low, less than one tenth of the code rate of a voice packet.

After receiving a voice packet, the receiver gateway passes it to the voice decoder, which performs voice decoding and outputs the voice signal. After receiving a silence packet, the receiver gateway passes it to the non-voice decoder, which reconstructs the background noise excitation signal so that the reconstructed noise signal sounds more natural.

Voice detection and silence compression can greatly reduce the bandwidth occupied on the communication channel, but they have not been applied to the VBD mode of data services, and the ITU-T V.152 protocol explicitly stipulates that VAD is prohibited in VBD mode. The reason is that the basic function of a VAD is to distinguish voice from non-voice, where voice means the sound signal produced by the human vocal organs. The data signals sent by a data device have different characteristics from voice signals. If a VAD is applied to the detection of data signals, a data signal may be detected as background noise and sent to the non-voice encoder, causing signal damage.

Therefore, in the existing VBD mode, all signals sent by the data device are fed to the voice encoder as voice signals for low-loss codec processing. The block diagram is shown in FIG. 2.

However, in many data services the data device spends much of the time sending silence rather than valid data signals. For example, the most widely used fax machines, which support a maximum speed of 14400 bps, use a half-duplex fax procedure, which means that each fax machine sends silence for more than 50% of the fax session. The prior art treats all signals sent by these devices uniformly as data signals, and therefore wastes considerable bandwidth.

Summary of the invention

An embodiment of the present invention provides a method, a system, and a data signal detector for implementing data services in voice-band data (VBD) mode, so as to solve the prior-art problem that implementing data services through VBD occupies a large bandwidth.

To achieve the above object, the present invention adopts the following embodiment scheme:

A method for implementing a data service includes the following steps:

a. The data signal detector divides the input signal frames into data signal frames and non-data signal frames; the voice encoder performs voice coding on the data signal frames, and the non-voice encoder performs silence compression coding on the non-data signal frames;

b. The voice decoder performs voice decoding on the voice-coded data signal frames and outputs a valid data signal, and the non-voice decoder decodes the silence-compression-coded non-data signal frames and reconstructs the silence signal.

An energy decision threshold may be set, and the data signal detector divides the input signal frames into data signal frames and non-data signal frames by means of this threshold. Specifically, the input signal frame is divided into a plurality of sub-signal frames of equal or unequal length, each sub-signal frame corresponding to a signal window. The signal energy in each signal window is calculated and compared with the energy decision threshold. If the signal energy in every signal window is less than the energy decision threshold, the signal frame is a non-data signal frame; otherwise it is a data signal frame.

The method further includes: buffering the signal frame for a time T2 before encoding it, where T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

The method further includes: after the data signal detector determines that the input signal has entered the non-data signal stage, waiting for a time T1 + T2 before the non-voice encoder performs silence compression coding on the non-data signal frames, where T1 is greater than the period of data signal that would be lost when switching from voice coding of data signal frames by the voice encoder to silence compression coding of non-data signal frames by the non-voice encoder.

The system for implementing data services provided by the embodiment of the present invention includes a sender gateway, a receiver gateway, and a communication channel connecting the sender gateway and the receiver gateway. The sender gateway includes a voice encoder, a non-voice encoder, and a data signal detector. The data signal detector divides the signal frames into data signal frames and non-data signal frames, and directs the data signal frames to the voice encoder for voice coding and the non-data signal frames to the non-voice encoder for silence compression coding. The receiver gateway includes a voice decoder and a non-voice decoder. The voice decoder performs voice decoding on the voice-coded data signal frames and outputs a valid data signal, and the non-voice decoder decodes the silence-compression-coded non-data signal frames and reconstructs the silence signal.

The system further includes a buffer disposed between the signal input of the sender gateway and the voice encoder and non-voice encoder, configured to buffer each input signal frame for a time T2 before it is output for encoding. T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

In the system, after the data signal detector determines that the input signal has entered the non-data signal stage, a delay of T1 + T2 elapses before the non-voice encoder performs silence compression coding on the non-data signal frames. T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder. T1 is greater than the period of data signal that would be lost when switching from voice coding of data signal frames by the voice encoder to silence compression coding of non-data signal frames by the non-voice encoder.

The embodiment of the present invention further provides a data signal detector comprising a determining unit and a control unit. The determining unit is configured to determine whether the input signal frame carries a valid data signal. The control unit is configured to output the signal frame as a data signal frame or a non-data signal frame according to the result of the determination.

According to the above embodiment provided by the present invention, the signal output by the data device is divided into data signals and non-data signals, and each kind of signal is processed with a different coding mode. In data service applications, this greatly reduces the communication channel bandwidth occupied when the data service is implemented through VBD, and improves the usability of the VBD mode.

DRAWINGS

FIG. 1 is a block diagram of a communication system using voice detection and silence compression technology;

FIG. 2 is a block diagram of a communication system in the existing VBD mode;

FIG. 3 is a block diagram of a first embodiment of the system of the present invention;

FIG. 4 is a schematic diagram of a signal passing through a data signal detector;

FIG. 5 is a schematic diagram of the signal when the data signal is buffered for a time T2 before encoding in the system embodiment of FIG. 3;

FIG. 6 is a schematic diagram of the signal when switching is delayed by T1 + T2 after entry into silence is determined;

FIG. 7 is a block diagram of a second embodiment of the system according to the present invention.

Detailed description

A block diagram of a system embodiment of the present invention is shown in FIG. 3. The entire system includes three modules: a sender gateway, a communication channel, and a receiver gateway. The sender gateway has a data signal detector, a low-loss voice coder and a non-voice coder. The receiver gateway has a voice decoder and a non-voice decoder, and the communication channel is an IP network.

After the data signal enters the sender gateway, it is first divided into signal frames of equal length. The frame length is determined by the encoding protocol used by the encoder and is generally between 5 and 30 milliseconds. The data signal detector then analyzes each input signal frame: if the signal frame carries a valid data signal, it is treated as a data signal frame; otherwise it is treated as a non-data signal frame. If the signal frame is a data signal frame, the data signal detector directs it to the voice encoder for low-loss voice coding. Otherwise, the data signal detector directs it to the non-voice encoder for silence compression coding.
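The framing and routing step described above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the 8 kHz sampling rate, the 10 ms frame length, the names `split_into_frames` and `route_frames`, and the simple amplitude test passed in as the detector are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[Frame]:
    """Split a sample stream into equal-length frames.

    A trailing partial frame is dropped in this sketch.
    """
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def route_frames(
    frames: List[Frame], carries_data: Callable[[Frame], bool]
) -> List[Tuple[str, Frame]]:
    """Label each frame with the encoder the detector would select for it."""
    return [("voice" if carries_data(f) else "silence", f) for f in frames]


# Assumed values: 8 kHz sampling and 10 ms frames give 80 samples per frame.
FRAME_LEN = 80
signal = [0.5] * 160 + [0.0] * 80  # two frames of data followed by one of silence
frames = split_into_frames(signal, FRAME_LEN)

# Placeholder detector (hypothetical): any sample above a small amplitude.
labels = [kind for kind, _ in route_frames(frames, lambda f: any(abs(x) > 0.01 for x in f))]
print(labels)  # ['voice', 'voice', 'silence']
```

Passing the detector in as a function keeps the routing independent of the detection algorithm, which the embodiment leaves open to the specific gateway device.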

After receiving a voice packet, the receiver gateway passes it to the voice decoder, which performs voice decoding and outputs a valid data signal. After receiving a silence packet, the receiver gateway passes it to the non-voice decoder, which reconstructs the silence signal.

In this embodiment, the data signal detector analyzes the input signal frame using a small-window energy detection method:

For the signal from a data device, the signal energy during periods of valid data is relatively high and stable. During silence, the signal is not mixed with substantial background noise as a speech signal is, but is essentially pure silence; even if some noise or electrical echo is present, its energy is very small. Therefore, a reasonable energy decision threshold can be set to distinguish valid data from silence by the signal energy.

The energy of the data signal changes rapidly, and an abrupt change in energy may occur in the last few milliseconds of a signal frame. A signal frame can therefore be divided into several parts, that is, multiple sub-signal frames, each of which is called a signal window. The signal energy in each signal window is calculated separately. When the signal energy of all signal windows in a signal frame is lower than the set energy decision threshold, the frame is considered to be a silence frame. The size of the signal window and the energy decision threshold can be set according to the specific conditions of different gateway devices. The signal windows can be of the same size or of different sizes.
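The small-window energy decision can be sketched as follows. This is an illustration under stated assumptions, not the detector of the embodiment: the text does not define how signal energy is measured, so mean-square amplitude is assumed here, and the function names, the equal window length of 10 samples, and the threshold of 0.1 are hypothetical.

```python
from typing import Sequence


def window_energy(window: Sequence[float]) -> float:
    """Mean-square amplitude of one window (assumed energy measure)."""
    return sum(x * x for x in window) / len(window)


def is_data_frame(frame: Sequence[float], window_len: int, threshold: float) -> bool:
    """Return True unless every window in the frame is below the threshold.

    Equal-length windows are used here; the text also allows unequal windows.
    """
    for start in range(0, len(frame), window_len):
        window = frame[start:start + window_len]
        if window and window_energy(window) >= threshold:
            return True
    return False


# A frame that is silent except for a burst in its last 10 samples.
frame = [0.0] * 70 + [0.5] * 10

print(window_energy(frame) < 0.1)       # True: whole-frame energy stays below the threshold
print(is_data_frame(frame, 10, 0.1))    # True: the last small window exceeds it
print(is_data_frame([0.0] * 80, 10, 0.1))  # False: every window is below the threshold
```

The example shows why small windows are used: a burst confined to the end of the frame is diluted when energy is averaged over the whole frame, but stands out in its own window.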

However, as shown in FIG. 4, the data signal detector applies a decision threshold both when determining that the signal enters silence and when determining that it exits silence. As a result, a valid data signal of length t1 is lost when entering the silence phase, and a valid data signal of length t2 is lost when exiting the silence phase. Taking the exit from silence as an example, silence is considered to have ended only when the signal energy exceeds the set decision threshold, so the signal between the actual exit from silence and the detector's decision is lost, and an incomplete data signal is reconstructed. For data devices, such incomplete data signals are unacceptable: the receiving data device may conclude that the signal has jumped or that a cycle is incomplete and behave abnormally, which may even cause the data service to fail.

Because the moment at which the data signal detector determines the exit from silence is later than the moment at which silence is actually exited, this embodiment solves the problem of losing signal on exit from the silence phase by buffering the data signal for a time T2 (T2 > t2) before it is sent to the encoder. This ensures that, when the data signal detector determines the exit from silence and directs the data signal to the voice encoder, the valid data signal has not yet reached the encoder. The data signal reconstructed after buffering is shown in FIG. 5.

Similarly, because the moment at which the data signal detector determines entry into silence is earlier than the moment at which silence is actually entered, this embodiment solves the problem of losing signal on entry into the silence phase by waiting for a delay T1 (T1 > t1) after the data signal detector determines entry into silence before switching the data signal from the voice encoder to the non-voice encoder. Because the data signal is also buffered for T2 to solve the problem on exit from silence, the detector's decision precedes the moment at which the signal arriving at the encoder enters silence by t1 + T2. Therefore, after the data signal detector determines entry into silence, the switch from the voice encoder to the non-voice encoder is delayed by T1 + T2, which ensures that all valid data signals are sent through the voice encoder. The data signal reconstructed after the delay is shown in FIG. 6.
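The combined effect of the T2 buffer and the T1 + T2 switching delay can be checked with a small frame-level simulation. This is only a sketch under assumptions: time is counted in whole frames, the detector decisions are given as a list, and the names `select_encoders` and `lost_frames` and the example values (t1 = t2 = 1 frame, T1 = T2 = 2 frames) are hypothetical.

```python
from typing import List


def select_encoders(detected: List[bool], t1_hold: int, t2_buffer: int) -> List[str]:
    """Return, for each input frame k, the encoder that ends up coding it.

    Frame k leaves the buffer at time k + t2_buffer. The encoder in use at that
    time is the voice encoder if the detector reported data at any time within
    the preceding t1_hold + t2_buffer frames, which models the delayed switch.
    """
    n = len(detected)
    hold = t1_hold + t2_buffer
    choice = []
    for k in range(n):
        now = k + t2_buffer          # time at which frame k reaches the encoders
        lo = max(0, now - hold)      # oldest decision still inside the hold-off
        hi = min(n - 1, now)         # the detector has only seen the input so far
        choice.append("voice" if any(detected[lo:hi + 1]) else "silence")
    return choice


def lost_frames(truth: List[bool], choice: List[str]) -> List[int]:
    """Indices of frames that carry data but were sent to the silence encoder."""
    return [k for k, (t, c) in enumerate(zip(truth, choice)) if t and c != "voice"]


# Frames 2..7 carry valid data. The detector reports the exit from silence one
# frame late (t2 = 1) and the entry into silence one frame early (t1 = 1).
truth = [False, False, True, True, True, True, True, True, False, False, False, False]
detected = [False, False, False, True, True, True, True, False, False, False, False, False]

print(lost_frames(truth, select_encoders(detected, 0, 0)))  # [2, 7]
print(lost_frames(truth, select_encoders(detected, 2, 2)))  # []
```

With T1 > t1 and T2 > t2 no data frame is lost, at the cost of sending a few silence frames at each boundary through the voice encoder.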

The block diagram of a VBD-mode communication system that applies data signal detection and silence compression with the above technical solution is shown in FIG. 7.

It should be noted that, in the system of FIG. 7, a buffer is provided between the signal input of the sender gateway and the voice encoder and non-voice encoder. The buffer holds each input signal frame for a time T2 before it is output for encoding, to ensure that, when the data signal detector determines the exit from silence and directs the data signal to the voice encoder, the valid data signal has not yet reached the encoder. T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

In the system of FIG. 7, after the data signal detector determines that the input signal has entered the non-data signal stage, a delay of T1 + T2 elapses before the non-voice encoder performs silence compression coding on the non-data signal frames, to ensure that all valid data signals are sent through the voice encoder. T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

T1 is greater than the period of data signal that would be lost when switching from voice coding of data signal frames by the voice encoder to silence compression coding of non-data signal frames by the non-voice encoder.

In addition, in the system of FIG. 7, the data signal detector typically includes a determining unit and a control unit. The determining unit is configured to determine whether the input signal frame carries a valid data signal. The control unit is configured to output the signal frame, according to the result of the determination, as a data signal frame to the voice encoder or as a non-data signal frame to the non-voice encoder.

In the system described in FIG. 6, the values of T1 and T2 are determined by the performance of the data signal detection algorithm used by the system and are generally on the order of milliseconds, so the delay introduced by buffering the data signal for T2 has almost no impact on the data service. It can be seen that, with the above two measures, the data signal loss caused by the data signal detector's decisions on entering and exiting silence can be effectively avoided.

In the above embodiment, the signal output by the data device is divided into valid data signals and non-valid data signals, such as silence signals, and each kind of signal is processed with a different coding mode. In data service applications, this greatly reduces the communication channel bandwidth occupied when the data service is implemented through VBD, and improves the usability of the VBD mode.
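A rough estimate of the possible bandwidth reduction can be worked out as follows. The figures are assumptions for illustration only: a G.711 payload rate of 64 kbit/s, a silence share of 50% (the half-duplex fax case mentioned in the background), and a silence packet rate of one tenth of the coded rate (the ratio the background gives for voice silence packets, assumed here to carry over to data silence). Packet header overhead and the extra voice-coded frames caused by the T1 + T2 delay are ignored, and the function name `average_rate_kbps` is hypothetical.

```python
def average_rate_kbps(active_rate_kbps: float, silence_fraction: float, silence_ratio: float) -> float:
    """Average payload rate when silence is coded at a fraction of the active rate."""
    silence_rate = active_rate_kbps * silence_ratio
    return (1.0 - silence_fraction) * active_rate_kbps + silence_fraction * silence_rate


G711_RATE_KBPS = 64.0   # G.711 payload rate
SILENCE_FRACTION = 0.5  # assumed share of time the device sends silence
SILENCE_RATIO = 0.1     # assumed silence packet rate relative to the coded rate

baseline = G711_RATE_KBPS  # existing VBD mode: everything is voice coded
with_detector = average_rate_kbps(G711_RATE_KBPS, SILENCE_FRACTION, SILENCE_RATIO)
saving = 1.0 - with_detector / baseline

print(f"{with_detector:.1f} kbit/s, saving {saving:.0%}")  # 35.2 kbit/s, saving 45%
```

Under these assumptions the average payload rate falls from 64 kbit/s to about 35 kbit/s. The actual saving depends on the real silence share and on the silence coding used.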

Claims

1. A method for implementing data services by means of voiceband data, characterized in that the method comprises the following steps:

2. The method according to claim 1, wherein an energy decision threshold is set, and the data signal detector divides the input signal frames into data signal frames and non-data signal frames by means of the energy decision threshold.

3. The method according to claim 1, wherein the input signal frame is divided into a plurality of equally spaced or non-equally spaced sub-signal frames, each sub-signal frame corresponding to one signal window; the signal energy in each signal window is calculated separately and compared with the energy decision threshold; if the signal energy in every signal window is less than the energy decision threshold, the signal frame is a non-data signal frame, otherwise it is a data signal frame.

4. The method according to claim 1, 2, or 3, characterized by further comprising: buffering the signal frame for a time T2 before encoding it; wherein T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

5. The method according to claim 4, further comprising: after the data signal detector determines that the input signal has entered the non-data signal stage, delaying by a time T1 + T2 before the non-voice encoder performs silence compression coding on the non-data signal frames; wherein T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder;

6. A system for implementing data services by means of VBD, including a sender gateway, a receiver gateway, and a communication channel connecting the sender gateway and the receiver gateway, wherein the sender gateway includes a voice encoder, a non-voice encoder, and a data signal detector; the data signal detector divides the signal frames into data signal frames and non-data signal frames, and directs the data signal frames to the voice encoder for voice coding and the non-data signal frames to the non-voice encoder for silence compression coding; the receiver gateway includes a voice decoder and a non-voice decoder; the voice decoder performs voice decoding on the voice-coded data signal frames to output a valid data signal, and the non-voice decoder decodes the silence-compression-coded non-data signal frames to reconstruct the silence signal.

7. The system of claim 6, further comprising a buffer disposed between the signal input of the sender gateway and the voice encoder and non-voice encoder, configured to buffer each input signal frame for a time T2 before it is output for encoding; wherein T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder.

8. The system according to claim 6 or 7, wherein: after the data signal detector determines that the input signal has entered the non-data signal stage, a delay of T1 + T2 elapses before the non-voice encoder performs silence compression coding on the non-data signal frames; T2 is greater than the period of data signal that would be lost when switching from silence compression coding of non-data signal frames by the non-voice encoder to voice coding of data signal frames by the voice encoder;

9. A data signal detector, comprising a determining unit and a control unit, wherein the determining unit is configured to determine whether the input signal frame carries a valid data signal, and the control unit is configured to output the signal frame as a data signal frame or a non-data signal frame according to the result of the determination.