RU2426250C2 - Method and system for speech compensation in mobile communication system

Info

Publication number: RU2426250C2
Application number: RU2009129402/09A
Authority: RU
Inventors: Donghua LU (CN); Wei RUAN (CN); Jian CAO (CN); Hongwei LOU (CN); Wanchun ZHANG (CN)
Original assignee: ZTE Corporation
Priority date: 2007-01-10
Filing date: 2007-01-10
Publication date: 2011-08-10
Also published as: RU2009129402A

Abstract

FIELD: information technology.

SUBSTANCE: in the method for speech compensation in a mobile communication network, at each frame processing instant the network-side device determines whether a received or transmitted speech frame is an incorrect frame; if so, it determines whether the incorrect frame is a frame in a rate mode other than 1/8; and if so, the network-side device performs speech compensation for the incorrect frame. The system, which resides in the network-side device, has an incorrect frame detection unit and a speech compensation unit.

EFFECT: high quality of speech in mobile communication network.

Description

FIELD OF TECHNOLOGY

The present invention relates to speech compensation techniques, and more particularly to a method and system for performing speech compensation when a network-side device does not use a vocoder or uses one only partially.

BACKGROUND

In a mobile communication system, the vocoder on the network side typically performs two important functions. In the uplink, after the user equipment (UE) sends encoded compressed speech to the network side, the network-side vocoder decodes the received compressed speech to make it suitable for transmission over the network. In the downlink, the network-side vocoder compresses and encodes the speech stream carried by the network to make it suitable for transmission over the radio channel.

Taking CDMA2000 (Code Division Multiple Access 2000) as an example, the standard has three speech coding methods: EVRC (Enhanced Variable Rate Codec), QCELP-13k (Qualcomm Code Excited Linear Prediction, 13k) and QCELP-8k (Qualcomm Code Excited Linear Prediction, 8k). EVRC is the format mainly used for encoding and decoding. In a typical call between MS1 (mobile station 1) and MS2, both mobile stations use the same speech coding method (for example, EVRC).

The speech of subscriber MS1 reaches the ear of subscriber MS2 as follows: first, MS1 sends EVRC-encoded compressed speech frames to network side 1 over the uplink radio channel, and network side 1 uses vocoder 1 to decode the received EVRC frames and convert them into a PCM (pulse code modulation) code stream, after which circuit switching is performed; after network side 2 receives the switched PCM code stream from network side 1, it uses vocoder 2 to convert the PCM code stream into EVRC compressed speech frames and transmits them to MS2 over the downlink radio channel.

Speech coding by a vocoder is lossy, and each encoding and decoding pass reduces speech quality. Taking the call between MS1 and MS2 again as an example: because MS1 and MS2 use the same coding format, the two coding operations on the network side can be eliminated, and the speech of subscriber MS1 reaches the ear of subscriber MS2 as follows: first, MS1 sends EVRC-encoded compressed speech frames to network side 1 over the uplink radio channel; network side 1 switches the received EVRC frames directly and sends them to network side 2; network side 2 receives the switched EVRC compressed speech frames and sends them to MS2 over the downlink radio channel.

As this example shows, eliminating the two quality-degrading coding operations on the network side not only improves speech quality significantly but also saves vocoder resources on the network side and reduces the delay in transmitting and processing speech. In the early stage of mobile communication system development, voice calls were mainly between mobile and fixed-line subscribers, and the effect described above was less apparent. Traffic statistics show that calls between mobile subscribers now predominate, so the original vocoder configuration not only increases equipment cost but also degrades system performance. How to improve the network structure and the vocoder configuration control strategy has therefore become an important development issue.

In developing all-IP (all Internet Protocol) mobile communication technology, the main goal is to support traditional voice service and packet data service at lower cost and in a more flexible way. In supporting traditional voice service, an all-IP mobile network faces the problem of supporting different vocoder types simultaneously and inexpensively, that is, the problem of supporting so-called TrFO (Transcoder Free Operation) and RTO (Remote Transcoder Operation).

TrFO means the ability of the network to negotiate the codec type and vocoder mode before call setup through an out-of-band negotiation mechanism. With this negotiation, a call between mobile subscribers need not pass through a vocoder on the network side, which improves speech quality and saves expensive vocoder resources and the power they consume.

RTO is a special case of TrFO: when out-of-band negotiation finds that the coding methods on the two sides of the connection are incompatible, a vocoder is needed on the network side to convert the code type of one side into that of the other. The main difference between RTO and a circuit-switched TDM transport network is that a TDM network performs two coding conversions on the network side, whereas an RTO network performs only one.

Taking an RTO network as an example, suppose MS1 uses EVRC and MS2 uses QCELP-13k during a call between them. The speech of subscriber MS1 then reaches the ear of subscriber MS2 as follows: first, MS1 sends EVRC-encoded compressed speech frames to network side 1 over the uplink radio channel; network side 1 switches the received EVRC speech frames directly and sends them to network side 2; network side 2 receives the switched EVRC compressed speech frames, converts them into QCELP-13k compressed speech frames with a vocoder, and then sends them to MS2 over the downlink radio channel.
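The three call paths described so far differ in how many coding conversions the network side performs. The following minimal sketch only restates those counts; the function names and the mode labels "TDM", "TrFO" and "RTO" used as dictionary keys are illustrative assumptions, not identifiers from any standard.

```python
def network_side_conversions(mode: str) -> int:
    """Return how many coding conversions the network side performs.

    mode is a hypothetical label: "TDM", "TrFO" or "RTO".
    """
    conversions = {
        "TDM": 2,   # decode to PCM at side 1, re-encode at side 2
        "TrFO": 0,  # compressed frames are switched end to end
        "RTO": 1,   # one conversion between the two codecs
    }
    return conversions[mode]


def select_mode(codec_a: str, codec_b: str) -> str:
    """Pick TrFO when both ends use the same codec, otherwise RTO."""
    return "TrFO" if codec_a == codec_b else "RTO"
```

For the example call above, `select_mode("EVRC", "QCELP-13k")` gives `"RTO"`, for which one conversion is counted.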

Taking the CDMA2000 LMSD (Legacy MS Domain) as an example, out-of-band TrFO negotiation is implemented by negotiation signaling between the access network and the MSCe. Because the CDMA2000 LMSD uses IP switching, the network side can encapsulate the compressed speech data encoded in the user equipment directly as RTP (Real-time Transport Protocol) packets and transmit them over the IP network, so there is no need to convert the coding scheme to PCM for transmission over a TDM channel.

Taking EVRC as an example, its maximum bit rate is 8 kbit/s (the rate of a full-rate frame), and an EVRC stream also contains many half-rate and 1/8-rate frames. Statistical analysis shows that in EVRC calls full-rate frames make up about 30% of frames on average, at 22 bytes per 20 ms frame; half-rate frames about 30%, at 10 bytes per 20 ms frame; and 1/8-rate frames about 40%, at 2 bytes per 20 ms frame. In addition, because RTP supports multi-frame packing, several EVRC frames can be packed into one packet to reduce IP header overhead.

For example, with three EVRC frames packed into one RTP message, the average EVRC transmission rate in the network is 11.7 kbit/s including IP header overhead. A one-way PCM speech stream in the former TDM transport takes 64 kbit/s, so transmitting compressed speech in the all-IP configuration saves about (1 - 11.7/64) = 81.7% of the bandwidth of a PCM stream over a TDM channel. This example shows that TrFO can save a significant share of network bandwidth.
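The arithmetic in this example can be checked with a short sketch. It uses only the frame shares and sizes quoted above. The 11.7 kbit/s figure is taken from the text as given, because the header sizes behind it are not stated; the function and constant names are illustrative assumptions.

```python
FRAME_PERIOD_S = 0.020  # one speech frame every 20 ms

# (share of frames, payload bytes per frame), figures quoted in the text
EVRC_MIX = {
    "full": (0.30, 22),
    "half": (0.30, 10),
    "eighth": (0.40, 2),
}


def average_payload_kbps(mix=EVRC_MIX, period_s=FRAME_PERIOD_S) -> float:
    """Average payload bit rate in kbit/s, excluding any packet headers."""
    avg_bytes = sum(share * size for share, size in mix.values())
    return avg_bytes * 8 / period_s / 1000


def bandwidth_saving(compressed_kbps: float, pcm_kbps: float = 64.0) -> float:
    """Fraction of the PCM stream bandwidth saved by the compressed stream."""
    return 1 - compressed_kbps / pcm_kbps
```

With these inputs the payload alone averages about 4.16 kbit/s, so the remainder of the quoted 11.7 kbit/s is packet overhead, and `bandwidth_saving(11.7)` is about 0.817, matching the 81.7% stated above.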

However, TrFO has some problems in practice. For example, suppose MS1 and MS2 are in a TrFO call. If the radio channel quality is poor, network side 1 may be unable to correctly receive and parse some frames sent by MS1 over the uplink; that is, frame errors occur. In a TDM transport network such unparseable frames can be handled properly by the vocoder on the network side, whereas in TrFO, with no vocoder, network side 1 can only replace the errored frames with compensation frames defined in the protocol (in EVRC, for example, half-rate or full-rate frames with all bits zero are defined as compensation frames) and forward them to network side 2, which passes these protocol-defined compensation frames on to MS2. In addition, because of the transmission characteristics of an IP network, frames sent by network side 1 may be lost or delayed by jitter on their way to network side 2. If network side 2 does not receive a frame from network side 1 within a certain time, it likewise fills the gap with a protocol-defined compensation frame and sends it to MS2. Given the radio channel quality and network transmission quality on the MS2 side, these compensation frames would cause no problem if MS2 performed speech compensation for them; however, numerous experiments show that most UEs do not. These compensation frames therefore strongly affect the overall speech quality of TrFO.
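As an illustration of the protocol-defined filling described here, the sketch below builds an all-zero EVRC compensation frame. The frame sizes are the 22-byte and 10-byte figures quoted earlier for full-rate and half-rate frames; the function name and rate labels are assumptions made for this example only.

```python
# Payload sizes in bytes per 20 ms frame, as quoted earlier in the text
EVRC_FRAME_BYTES = {"full": 22, "half": 10}


def protocol_fill_frame(rate: str = "half") -> bytes:
    """Return an all-zero frame of the given rate, used to fill a frame error."""
    return bytes(EVRC_FRAME_BYTES[rate])
```

A frame built this way carries no speech information, which is why the result depends entirely on whether the receiving UE compensates for it.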

RTO has a similar problem. An RTO call does use a vocoder on the network side. Suppose MS1 and MS2 are in an RTO call: if the radio channel quality is poor, some frames sent by MS1 to network side 1 over the uplink may be errored, and network side 1 can use the vocoder on the network side to perform speech compensation for them. However, when the compensated speech frames travel to network side 2, frames may still be lost or delayed by jitter because of network transmission quality. In that case network side 2 fills the frame errors with protocol-defined compensation frames and sends them to MS2. These compensation frames therefore strongly affect the overall speech quality of RTO if MS2 cannot compensate for them effectively.

In summary, when the radio environment is good and network transmission quality is ideal, TrFO and RTO improve speech quality by reducing the number of vocoder encoding and decoding operations on the network side. But when the radio environment is poor and network transmission quality is low, they cannot use a vocoder on the network side for speech compensation as the original circuit-switched mobile communication system does. In that case speech compensation depends entirely on the vocoder in the user equipment.

User equipment from different manufacturers currently differs in whether it compensates the compensation speech frames it receives. Speech quality in TrFO and RTO therefore depends heavily on the compensation performance of the vocoder in the user equipment and on whether that vocoder compensates every kind of compensation speech frame, which significantly affects overall speech quality in TrFO and RTO.

Practice shows that if one frame in a run of consecutive full-rate frames is corrupted or lost, the user equipment receives a compensation frame when TrFO or RTO is used. Speech quality when the user equipment processes compensation frames is clearly worse than with a vocoder on the network side in a TDM transport network: syllables are swallowed, and the speech wavers and breaks up. The degree of degradation differs between user equipment with different vocoders.

SUMMARY OF THE INVENTION

The present invention provides a method and system for speech compensation in a mobile communication network that address the above problems by approximately compensating speech when transmission quality is poor and the network-side device does not use a vocoder or uses one only partially, thereby improving overall speech quality.

The technical solution adopted by the present invention is as follows.

A method of speech compensation in a mobile communication network comprises the following steps:

a) at each frame processing instant, the network-side device determines whether the received or transmitted speech frame is an incorrect frame; if so, it proceeds to the next step;

b) the network-side device performs speech compensation for the incorrect frame.

Further, the following step comes after step a:

a1) determining whether the incorrect frame is a frame in a rate state other than 1/8; if so, proceeding to the next step.

Further, determining in step a1 whether the incorrect frame is a frame in a rate state other than 1/8 comprises:

determining whether the last correct frame preceding the incorrect frame is a non-1/8-rate frame; if so, the incorrect frame is a frame in a non-1/8 rate state; otherwise it is not.

Further, the following step comes after step a1:

a2) determining whether the frame distance between the incorrect frame and its last correct frame is less than or equal to a compensation threshold; if so, proceeding to the next step.
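Steps a, a1 and a2 form a chain of checks that must all pass before step b runs. The following is a minimal sketch of that chain, not the patented implementation; the function name, the rate label "1/8" and the choice to skip compensation when no correct frame has been seen yet are assumptions made for illustration.

```python
from typing import Optional

EIGHTH_RATE = "1/8"  # assumed label for the lowest rate


def should_compensate(
    frame_is_incorrect: bool,
    last_correct_rate: Optional[str],
    frames_since_last_correct: int,
    compensation_threshold: int,
) -> bool:
    """Apply steps a, a1 and a2 in order; True means step b should run."""
    # Step a: only incorrect frames are candidates for compensation.
    if not frame_is_incorrect:
        return False
    # Step a1: the incorrect frame takes its rate state from the last
    # correct frame. A 1/8-rate state is not compensated. Treating an
    # unknown state (None) the same way is an assumption of this sketch.
    if last_correct_rate is None or last_correct_rate == EIGHTH_RATE:
        return False
    # Step a2: compensate only while the frame distance is within the threshold.
    return frames_since_last_correct <= compensation_threshold
```

For instance, an incorrect frame two frames after a full-rate correct frame, with a threshold of three, passes all checks.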

Further, the speech compensation for the incorrect frame in step b uses one of the following methods:

correct-frame duplication: replacing the current incorrect frame with the last correct frame;

1/4-rate frame filling: replacing the current incorrect frame with a 1/4-rate frame of arbitrary content;

approximation by modeling: replacing the current incorrect frame with a frame obtained by modeling.
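The three compensation methods listed above can be sketched as follows. This is illustrative only: the text does not give the size of a 1/4-rate frame or describe the modeling procedure, so `quarter_rate_size` and the `model` callable (here assumed to take the last correct frame as input) are hypothetical parameters.

```python
from typing import Callable, Optional


def duplicate_last_correct(last_correct: bytes) -> bytes:
    """Correct-frame duplication: reuse the last correct frame unchanged."""
    return bytes(last_correct)


def fill_quarter_rate(quarter_rate_size: int, fill_byte: int = 0) -> bytes:
    """1/4-rate filling: build a 1/4-rate frame whose content is arbitrary.

    quarter_rate_size is an assumed parameter; the real size depends on the codec.
    """
    return bytes([fill_byte]) * quarter_rate_size


def compensate(
    method: str,
    last_correct: bytes,
    quarter_rate_size: int = 0,
    model: Optional[Callable[[bytes], bytes]] = None,
) -> bytes:
    """Return the frame that replaces the current incorrect frame."""
    if method == "duplicate":
        return duplicate_last_correct(last_correct)
    if method == "quarter":
        return fill_quarter_rate(quarter_rate_size)
    if method == "model":
        if model is None:
            raise ValueError("a modeling function is required")
        return model(last_correct)
    raise ValueError(f"unknown method: {method}")
```

Duplication needs the last correct frame to be stored, while 1/4-rate filling needs no stored state at all, which may matter when choosing between them.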

Further, an incorrect frame is a blank frame, an erasure frame, or another frame whose rate is not defined in the protocol; a frame not received within the specified frame processing interval; or a frame that a protocol-defined vocoder would need to compensate after receiving it.
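The definition above can be expressed as a single predicate. The sketch below is one possible reading; the `ReceivedFrame` fields, the rate labels in `DEFINED_RATES` and the use of `None` for a frame that did not arrive in time are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

DEFINED_RATES = {"full", "half", "1/8"}  # assumed rate labels for the codec


@dataclass
class ReceivedFrame:
    rate: Optional[str]  # None when nothing arrived within the interval
    blank: bool = False
    erasure: bool = False
    needs_vocoder_compensation: bool = False


def is_incorrect(frame: ReceivedFrame) -> bool:
    """Return True if the frame matches any case of the definition."""
    if frame.rate is None:  # not received within the processing interval
        return True
    if frame.blank or frame.erasure:
        return True
    if frame.rate not in DEFINED_RATES:  # rate not defined in the protocol
        return True
    # A frame the protocol-defined vocoder would have to compensate.
    return frame.needs_vocoder_compensation
```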

Further, the speech frame is a forward or reverse speech frame.

If the speech frame is a forward speech frame, the last correct frame is the last correct forward speech frame.

If the speech frame is a reverse speech frame, the last correct frame is the last correct reverse speech frame.
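Because the last correct frame is tracked separately for forward and reverse frames, the network-side device needs per-direction state. The following is a minimal sketch under the assumption that the frame distance is counted in frame processing instants since the last correct frame in the same direction; the class and method names are hypothetical.

```python
from typing import Dict, Optional


class LastCorrectTracker:
    """Remember the last correct frame separately for each direction."""

    def __init__(self) -> None:
        self._last: Dict[str, bytes] = {}
        self._gap: Dict[str, int] = {}

    def on_correct(self, direction: str, frame: bytes) -> None:
        """Store a correct frame and reset the distance for its direction."""
        self._last[direction] = frame
        self._gap[direction] = 0

    def on_incorrect(self, direction: str) -> int:
        """Count one more incorrect frame and return the frame distance."""
        self._gap[direction] = self._gap.get(direction, 0) + 1
        return self._gap[direction]

    def last_correct(self, direction: str) -> Optional[bytes]:
        """Return the last correct frame for the direction, if any."""
        return self._last.get(direction)
```

An incorrect forward frame therefore never borrows the last correct reverse frame, and the two distances advance independently.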

The present invention also provides a system for speech compensation in a mobile communication network; the system resides in the network-side device and comprises:

an incorrect frame detection unit, which determines whether a speech frame received or transmitted by the network-side device is an incorrect frame, sends incorrect frames to the speech compensation unit, and sends correct frames to the speech frame processing unit of the network-side device; and

a speech compensation unit, which performs speech compensation for the incorrect frame and sends the compensated speech frame to the speech frame processing unit of the network-side device.

Further, said speech compensation unit comprises:

a speech compensation decision unit for receiving the invalid frames sent by the invalid frame detection unit, sending invalid frames in a non-1/8-rate state to the speech compensation processing unit, and sending the other invalid frames to the speech frame processing unit in the network-side device; and

a speech compensation processing unit for receiving the invalid frames sent by the speech compensation decision unit, performing speech compensation on those frames, and sending the compensated speech frames to the speech frame processing unit in the network-side device.

Further, said speech compensation decision unit determines whether the last valid frame preceding the received invalid frame is a non-1/8-rate frame; if so, the invalid frame is treated as an invalid frame in a non-1/8-rate state; otherwise, the invalid frame is not in a non-1/8-rate state.

Further, said speech compensation decision unit determines the frame distance between an invalid frame in a non-1/8-rate state and its last valid frame; invalid frames whose frame distance is less than or equal to the compensation threshold are passed to the speech compensation processing unit, and frames whose frame distance is greater than the compensation threshold are passed to the speech frame processing unit in the network-side device.

Further, the speech compensation performed on invalid frames by said speech compensation unit comprises one of the following:

using the last valid frame to replace the current invalid frame;

using a 1/4-rate frame with arbitrary content to replace the current invalid frame; or

using a frame obtained by simulation to replace the current invalid frame.

Further, said invalid frame detection unit treats a speech frame as an invalid frame if the speech frame received by the network-side device is an empty frame, an erased frame, another frame whose rate is not defined in the protocol, a frame not received within the designated frame processing time, or a frame that requires compensation after being received by the vocoder defined in the protocol.

Further, said speech frame received by the network-side device is a forward or reverse speech frame.

If the speech frame is a forward speech frame, said last valid frame is the last valid frame among the forward speech frames;

if the speech frame is a reverse speech frame, said last valid frame is the last valid frame among the reverse speech frames.

Further, said network-side device is a base station, a base station controller, a radio network controller, or a mobile switching center.

The system and method of the present invention can effectively solve the problem of speech that is unpleasant to the ear, including interruptions, vibration, and “swallowed” words during a call, caused by a poor radio channel environment or low transmission quality in the network when there is no vocoder on the network side or the network-side vocoder is used only partially during the call. The scheme of the present invention implements speech compensation in the network-side device, which effectively reduces the call's dependence on the user terminal and on vocoder characteristics, thereby meeting the differing speech quality requirements of user terminals.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a flowchart of a specific implementation of the speech compensation method in accordance with the present invention;

Fig. 2 is an illustration of a system for a specific implementation of speech compensation in accordance with the present invention;

Fig. 3 is a flowchart in accordance with Embodiment 1 of the present invention;

Fig. 4 is a flowchart in accordance with Embodiment 2 of the present invention;

Fig. 5 is a flowchart in accordance with Embodiment 3 of the present invention.

PREFERRED EMBODIMENTS OF THE PRESENT INVENTION

The present invention will now be described in more detail with reference to the accompanying drawings and embodiments.

The main idea of the present invention is as follows: during a call, full-rate and half-rate frames contribute most to the speech, so damaged or lost full-rate or half-rate frames strongly affect speech quality. Numerous experiments show that the loss or corruption of one or more full-rate frames causes speech interruption and “swallowed” words, especially when full-rate and half-rate frames occur consecutively; the loss or corruption of one or more half-rate frames causes vibration, again especially when full-rate and half-rate frames occur consecutively, which is uncomfortable to the ear; and the degree of discomfort depends on the encoding and decoding characteristics of the vocoder in the user equipment. The present invention therefore aims mainly to compensate full-rate or half-rate frames.

The present invention provides a method for implementing speech compensation in a mobile communication network for conditions in which the radio channel environment is poor or the transmission quality is low, and the network side does not use a vocoder or uses it only partially (for example, in TrFO or RTO). As shown in Fig. 1, the method comprises the following steps:

Step 1: At each forward speech frame processing instant, the network-side device determines whether the forward frame (being sent and processed, or received from the network side) is an invalid frame; or, at each reverse speech frame processing instant, the network side determines whether the reverse frame (received from the user equipment, or sent and processed by it) is an invalid frame;

If it is an invalid frame, processing continues at step 2;

Otherwise, the speech frame undergoes normal processing and output.

“Invalid frames” here means the following kinds of frames:

empty frames, erased frames, or other frames whose rate is not defined in the protocol;

or frames not received within the designated frame processing time (for example, lost frames or frames that arrive late because of jitter);

or frames that need to be compensated after being received by the vocoder defined in the protocol.
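The three kinds of invalid frame listed above can be sketched as a single predicate. This is a minimal illustration, not the patented implementation: the `Frame` type, the `vocoder_would_compensate` flag, the use of `None` for a frame that did not arrive in time, and the default set of protocol-defined rates are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumption: the rates that the call's protocol defines for speech frames.
DEFINED_RATES: FrozenSet[str] = frozenset({"1", "1/2", "1/8"})


@dataclass
class Frame:
    rate: str                               # e.g. "1", "1/2", "1/8", "blank", "erasure"
    payload: bytes = b""
    vocoder_would_compensate: bool = False  # hypothetical flag, set by the receiver


def is_invalid(frame: Optional[Frame],
               defined_rates: FrozenSet[str] = DEFINED_RATES) -> bool:
    """Return True if the frame falls into one of the three invalid kinds."""
    if frame is None:
        # Not received within the frame processing time (lost or late).
        return True
    if frame.rate not in defined_rates:
        # Empty frame, erased frame, or any rate the protocol does not define.
        return True
    # A frame the protocol's vocoder would itself have to compensate.
    return frame.vocoder_would_compensate
```

Passing a different `defined_rates` set adapts the check to another codec without changing the logic.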

Step 2: The network-side device determines whether speech compensation should be performed for the invalid frame. The criterion is whether the invalid frame is a frame in a non-1/8-rate state;

If the invalid frame is in a non-1/8-rate state, it has a relatively large effect on speech quality, and processing continues at step 3;

If the invalid frame is in a 1/8-rate state, it has little effect on speech quality and needs no compensation; the invalid frame undergoes normal processing and is output.

The method for determining whether an invalid frame is in a non-1/8-rate state is as follows.

The network-side device determines whether the last valid frame is a 1/8-rate frame. If the rate of the last valid frame is not 1/8, the invalid frame is treated as an invalid frame in a non-1/8-rate state; otherwise, the invalid frame is in a 1/8-rate state.

If the network-side device examines each forward speech frame in step 1, then in this step it examines the last valid frame among the forward speech frames; if the network-side device examines each reverse speech frame in step 1, then in this step it examines the last valid frame among the reverse speech frames.

A “valid frame” is a frame that can be encoded and decoded normally by the vocoder during a call; that is, any frame that is not an invalid frame is called a valid frame.

The “last valid frame” is the valid frame received or transmitted at the previous frame processing instant; if the frame received or transmitted at the previous frame processing instant is itself an invalid frame, the “last valid frame” is the valid frame received or transmitted at the instant immediately before that one, and so on.
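The rule for the rate state can be sketched as a backward search through the frames of one direction. The representation is an assumption made for the example: each earlier frame processing instant is recorded as a rate string, with `None` marking an instant whose frame was invalid. The text does not say what to do when no valid frame has been seen yet; the sketch assumes that case is not treated as a non-1/8-rate state.

```python
from typing import Optional, Sequence


def last_valid_rate(history: Sequence[Optional[str]]) -> Optional[str]:
    """Rate of the last valid frame.

    `history` lists earlier processing instants, oldest first; each entry is
    the frame's rate, or None if the frame at that instant was invalid.
    """
    for rate in reversed(history):
        if rate is not None:
            return rate
    return None  # no valid frame seen so far


def in_non_eighth_rate_state(history: Sequence[Optional[str]]) -> bool:
    """True if the current invalid frame counts as being in a non-1/8-rate state."""
    rate = last_valid_rate(history)
    # Assumption: with no valid frame in the history, do not compensate.
    return rate is not None and rate != "1/8"
```

Forward and reverse frames would each keep their own history, since the last valid frame is looked up per direction.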

Step 3: The network-side device determines whether the frame distance between the invalid frame and its last valid frame is less than or equal to the speech compensation threshold:

If so, processing continues at step 4;

Otherwise, no speech compensation is performed, and the invalid frame undergoes normal processing and output.

The compensation threshold depends on the compensation effect and on the characteristics of the mobile communication system; the threshold giving the best compensation effect can be found by comparing the speech quality obtained in several experiments. For example, if the compensation threshold is set to 6, six consecutive invalid frames are compensated; if it is set to 2, only two invalid frames are compensated, and the third of three consecutive invalid frames is not compensated.

The “frame distance”, for example between frame A and frame B, is the number of frames between frame A and frame B plus 1, within a group of consecutively arriving frames. For example, for consecutively arriving frames a, b, c, d, ..., the frame distance between frame a and frame d is 3.
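The frame distance and the threshold test can be checked with a few lines of arithmetic. In this sketch frames are identified by their position in the arriving sequence; the function names and the index-based representation are assumptions made for illustration.

```python
def frame_distance(index_a: int, index_b: int) -> int:
    """Number of frames strictly between the two positions, plus 1."""
    frames_between = abs(index_b - index_a) - 1
    return frames_between + 1


def should_compensate(last_valid_index: int, invalid_index: int, threshold: int) -> bool:
    """Step 3: compensate only if the frame distance does not exceed the threshold."""
    return frame_distance(last_valid_index, invalid_index) <= threshold


# Frames a, b, c, d at positions 0..3: b and c lie between a and d, so 2 + 1 = 3.
distance_a_to_d = frame_distance(0, 3)

# A valid frame at position 0 followed by three invalid frames, threshold 2:
# the first two invalid frames are compensated, the third is not.
decisions = [should_compensate(0, i, threshold=2) for i in (1, 2, 3)]
```

With this definition the first invalid frame after a valid one has distance 1, which is why a threshold of N compensates exactly N consecutive invalid frames.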

Step 4: The network-side device performs speech frame compensation for the invalid frame and uses the compensated speech frame in place of the invalid frame for processing and output. The speech frame compensation method used in the network-side device is one of the following: the valid-frame copy method, the 1/4-rate frame filling method, the simulation approximation method, and so on.

Valid-frame copy method: the last valid frame is used to replace the current invalid frame;

1/4-rate frame filling method: this method can be used only in calls with EVRC speech encoding and decoding; a 1/4-rate frame with arbitrary content replaces the current invalid frame that requires compensation;

Simulation approximation method: a frame is simulated from the rate and content of the last valid frame and the frame distance between the current invalid frame and its last valid frame, according to a rule obtained from simulation, and the simulated frame replaces the current invalid frame.

After compensation, the compensated speech frame undergoes normal processing and output.
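The three replacement strategies can be sketched as one dispatch function. Everything named here is hypothetical: the `Frame` type, the method labels, and the all-zero `QUARTER_RATE_FILLER` payload (the text only requires arbitrary content). The text does not give the simulation rule, so the sketch takes it as a caller-supplied function rather than inventing one.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Frame:
    rate: str
    payload: bytes


# Placeholder content for the 1/4-rate filler frame; the length is illustrative.
QUARTER_RATE_FILLER = bytes(5)

# A simulation rule maps (last valid frame, frame distance) to a replacement frame.
SimulationRule = Callable[[Frame, int], Frame]


def compensate(method: str, last_valid: Frame, distance: int,
               rule: Optional[SimulationRule] = None) -> Frame:
    """Build the frame that replaces the current invalid frame."""
    if method == "copy":
        # Valid-frame copy: reuse the last valid frame as-is.
        return Frame(last_valid.rate, last_valid.payload)
    if method == "quarter_fill":
        # 1/4-rate filling: only meaningful in EVRC calls, per the text.
        return Frame("1/4", QUARTER_RATE_FILLER)
    if method == "simulate":
        if rule is None:
            raise ValueError("the simulation method needs a rule")
        return rule(last_valid, distance)
    raise ValueError(f"unknown compensation method: {method}")
```

Keeping the rule as a parameter leaves room for whatever rule experiments produce without changing the dispatch logic.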

The present invention provides a system for implementing speech compensation in a mobile communication network. The system resides in a network-side device and applies to conditions in which the radio channel environment is poor or the transmission quality is low, and the network-side device does not use a vocoder or uses it only partially, as shown in Fig. 2. The system comprises:

an invalid frame detection unit for determining, at each frame processing instant, whether the forward or reverse frame received or transmitted by the network-side device is an invalid frame, for sending an invalid frame to the speech compensation unit, and for sending a valid frame to the speech frame processing unit in the network-side device;

a speech compensation unit comprising:

a speech compensation decision unit for receiving the invalid frames sent by the invalid frame detection unit, sending invalid frames in a non-1/8-rate state whose frame distance from the last valid frame is less than or equal to the compensation threshold to the speech compensation processing unit, and sending the other invalid frames to the speech frame processing unit in the network-side device;

a speech compensation processing unit for receiving the invalid frames sent by the speech compensation decision unit and performing speech compensation on those frames, that is, performing one of the following operations:

using a frame obtained by simulation to replace the current invalid frame.

The compensated speech frames are passed to the speech frame processing unit in the network-side device.
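The way the units hand frames to one another can be sketched as a small class with one method per unit. This is a simplified illustration under stated assumptions: an invalid frame is represented as `None`, the class and method names are invented for the example, and the processing unit uses only the valid-frame copy method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    rate: str
    payload: bytes = b""


class SpeechCompensationSystem:
    """Hypothetical wiring of the detection, decision, and processing units."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.last_valid: Optional[Frame] = None
        self.distance = 0  # frame distance from the last valid frame

    def detect_invalid(self, frame: Optional[Frame]) -> bool:
        # Invalid frame detection unit (assumption: None marks an invalid frame).
        return frame is None

    def decide(self) -> bool:
        # Decision unit: non-1/8-rate state and distance within the threshold.
        return (self.last_valid is not None
                and self.last_valid.rate != "1/8"
                and self.distance <= self.threshold)

    def process(self, frame: Optional[Frame]) -> Optional[Frame]:
        """Return the frame handed on to the speech frame processing unit."""
        if not self.detect_invalid(frame):
            self.last_valid, self.distance = frame, 0
            return frame
        self.distance += 1
        if self.decide():
            # Processing unit: valid-frame copy method.
            return Frame(self.last_valid.rate, self.last_valid.payload)
        return frame  # passed on uncompensated
```

One instance would be kept per direction of a call, so that forward and reverse frames each track their own last valid frame.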

The network-side device can be one of the following: a base station, a base station controller, a radio network controller, or a mobile switching center.

The present invention can be used for voice calls in which the network-side device does not use a vocoder for speech compensation or uses it only partially, including mobile communication systems implementing TrFO, RTO, or TFO (Tandem Free Operation) technology. The present invention can also be used in CDMA2000, WCDMA (Wideband Code Division Multiple Access), and TD-SCDMA (Time Division-Synchronous Code Division Multiple Access) mobile communication systems.

The present invention is described below in more detail through three embodiments.

Embodiment 1: Using the valid-frame copy method to implement speech compensation.

The speech compensation method used in this embodiment is the valid-frame copy method, and the frame distance threshold for speech compensation is 1; that is, under full-rate conditions, speech compensation is performed only for the first invalid frame after a valid frame and is not performed for the invalid frames that continue to arrive after that first invalid frame. As shown in Fig. 3, the specific steps are as follows:

101: At each forward speech frame processing instant, the network-side device examines the received forward speech frame;

If the forward speech frame received at this instant is an invalid frame, processing continues at step 102;

If the forward speech frame received at this instant is a valid frame, processing continues at step 104;

102: The network-side device examines the previously received frame, that is, the last frame received before the current one;

If the previously received frame is also an invalid frame, no special action is needed, and processing goes directly to step 104;

If the previous frame is not a full-rate frame, no special action is needed, and processing goes directly to step 104;

If the previous frame is a full-rate frame, processing continues at step 103; note that this full-rate frame is a valid frame.

103: The current invalid frame is discarded, and the previously received frame (that is, the full-rate frame mentioned above) is used in its place; processing continues at step 104;

104: The current forward speech frame undergoes normal processing and output.
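Steps 101 to 104 can be sketched as a loop over the forward frames. The representation is an assumption made for the example: each frame is a `(rate, payload)` tuple, `None` stands for an invalid frame, and the returned list is what would be handed to normal processing in step 104.

```python
from typing import Iterable, List, Optional, Tuple

FrameT = Tuple[str, bytes]  # (rate, payload); hypothetical representation


def embodiment_1(frames: Iterable[Optional[FrameT]]) -> List[Optional[FrameT]]:
    """Copy a valid full-rate frame over the first invalid frame that follows it."""
    output: List[Optional[FrameT]] = []
    previous: Optional[FrameT] = None  # frame as received at the previous instant
    for frame in frames:
        current = frame
        if frame is None:                                     # step 101: invalid
            if previous is not None and previous[0] == "1":   # step 102
                current = previous                            # step 103: replace
        output.append(current)                                # step 104
        previous = frame  # track the received frame, not the replacement
    return output


received = [("1", b"A"), None, None, ("1/2", b"B"), None, ("1", b"C"), None]
processed = embodiment_1(received)
```

Because `previous` holds the frame as received rather than its replacement, a second consecutive invalid frame sees an invalid predecessor and is left alone, which gives the threshold of 1 described for this embodiment.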

Although this embodiment describes only how the network-side device analyzes and compensates forward speech frames from the network side, it applies equally to the analysis and compensation by the network-side device of reverse speech frames received from the user equipment, which is not described in further detail here.

Embodiment 2: Using the 1/4-rate frame filling method to implement speech compensation.

This embodiment applies to calls that use EVRC speech encoding and decoding; the coding formats permitted in EVRC do not include 1/4-rate frames. Numerous experiments show that the vocoder in each user equipment performs speech compensation after receiving a 1/4-rate frame in an EVRC-coded call. As shown in Fig. 4, the specific steps of this embodiment are as follows:

201. At each forward speech frame processing instant, the network-side device determines the rate of the forward speech frame received from the network side:

If the forward speech frame received at this instant is an invalid frame, processing continues at step 202;

If the forward speech frame received at this instant is a normal speech frame, processing continues at step 205;

202. The network-side device determines the rate of the last valid frame:

If the last valid frame is a full-rate frame, processing continues at step 203;

If the last valid frame is not a full-rate frame, no special action is required, and processing continues at step 205;

203. The device determines the frame distance between the last valid frame and the current invalid frame:

If the frame distance is less than or equal to the predetermined compensation threshold, processing continues at step 204;

Otherwise, processing continues at step 205;

204. The current invalid frame is discarded, and a 1/4-rate frame with any content is used in its place. This 1/4-rate frame serves as the forward speech frame at this instant. Processing continues at step 205;

205. The current forward speech frame undergoes normal processing and output.

It can be seen from the above steps that the main idea of this embodiment is to replace a run of consecutive invalid frames immediately following a full-rate frame with 1/4-rate frames.

Each invalid frame whose frame distance from the last valid full-rate frame is less than or equal to the predetermined threshold is replaced with a 1/4-rate frame. No additional speech compensation is performed for frames whose frame distance exceeds the threshold; that is, if the number of consecutive invalid frames immediately following a valid full-rate frame exceeds the maximum threshold, the frames beyond that threshold receive no additional speech compensation. This maximum number of invalid frames is the compensation threshold. In practical applications, the compensation threshold in this method may be set to infinity, in which case all consecutive invalid frames immediately following a full-rate frame are replaced with 1/4-rate frames.

Although this embodiment describes only how the network-side device analyzes and compensates forward speech frames arriving from the network side, it applies equally to the analysis and compensation of reverse speech frames that the network-side device receives from the user equipment; that case is not described in further detail here.
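Steps 201 to 205 can be sketched as a small per-call state machine. The Python below is an illustration under stated assumptions rather than the patented implementation: `Rate`, `Frame`, `QuarterRateFiller` and `rates` are hypothetical names, an invalid frame is modelled as a frame whose rate is `None`, the frame distance is counted as the number of frame instants since the last valid frame, and the one-byte payload of the substitute frame is an arbitrary placeholder (the text allows any content).

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Rate(Enum):
    FULL = "1"
    HALF = "1/2"
    QUARTER = "1/4"
    EIGHTH = "1/8"


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame model; rate=None marks an invalid frame."""
    rate: Optional[Rate]
    payload: bytes = b""


class QuarterRateFiller:
    def __init__(self, threshold: float = math.inf) -> None:
        self.threshold = threshold            # compensation threshold
        self.last_valid_rate: Optional[Rate] = None
        self.distance = 0                     # frames since the last valid frame

    def process(self, frame: Frame) -> Frame:
        # Step 201: a normal frame goes straight to step 205.
        if frame.rate is not None:
            self.last_valid_rate = frame.rate
            self.distance = 0
            return frame
        self.distance += 1
        # Step 202: compensate only when the last valid frame was full rate.
        if self.last_valid_rate is not Rate.FULL:
            return frame
        # Step 203: compare the frame distance with the threshold.
        if self.distance > self.threshold:
            return frame
        # Step 204: substitute a 1/4-rate frame; its content is arbitrary.
        return Frame(Rate.QUARTER, b"\x00")


def rates(frames: List[Frame], threshold: float = math.inf) -> List[Optional[Rate]]:
    """Run a frame sequence through the filler and report the output rates."""
    filler = QuarterRateFiller(threshold)
    return [filler.process(f).rate for f in frames]
```

With the default threshold of infinity, every invalid frame that follows a full-rate frame is replaced, which matches the practical setting mentioned above. The substitute frame is deliberately not recorded as the last valid frame, so the distance keeps growing across a run of invalid frames.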

Embodiment 3: Implementing speech compensation by simulation-based approximation.

In this embodiment, statistics are gathered over a large amount of full-rate speech data from previously observed real conditions to obtain an approximation rule describing how frame content and rate change. When an invalid frame is compensated, a frame is simulated according to the approximation rule from the content and rate of the last valid frame and the frame distance between the invalid frame and the last valid frame, and the simulated frame replaces the rate and content of the invalid frame. In this description, frames obtained by simulation are called pseudo full-rate frames. The predetermined compensation threshold in this embodiment is 6. As shown in FIG. 5, the specific steps of this embodiment are as follows:

301. At each forward speech frame processing instant, the network-side device determines the rate of the forward speech frame received from the network side:

If the frame received at this instant is an invalid frame, processing continues at step 302;

If the frame received at this instant is a normal speech frame, processing continues at step 305;

302. The device determines the rate of the stored last valid frame:

If the last valid frame is a full-rate frame, processing continues at step 303;

If the last valid frame is not a full-rate frame, no special action is required, and processing continues at step 305;

303. The device determines the frame distance between the last valid frame and the current invalid frame:

If the frame distance is less than or equal to 6, processing continues at step 304;

Otherwise, processing continues at step 305;

304. The invalid frame is discarded, and a pseudo full-rate frame is simulated according to the statistically obtained approximation rule, using as parameters the content of the last valid frame and the frame distance between the last valid frame and the current invalid frame. The simulated pseudo full-rate frame replaces the invalid frame and serves as the forward speech frame at this instant; processing continues at step 305;

305. The current forward speech frame undergoes normal processing and output.

It can be seen from the above steps that the main idea of this embodiment is to replace consecutive invalid frames immediately following a full-rate frame with simulated speech frames. Up to 6 consecutive invalid frames immediately following a full-rate frame can be compensated by means of the statistical rule, according to the content of the full-rate frame and the frame distance between the invalid frame and the full-rate frame.
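Steps 301 to 305 can be sketched as follows. The text does not spell out the statistically derived approximation rule, so the `approximate` function below is a purely hypothetical stand-in that attenuates the parameters of the last valid full-rate frame by a fixed factor per frame of distance. The function names, the representation of frame content as a list of floats, and the `decay` parameter are all assumptions made for illustration. Only the threshold of 6 comes from the description.

```python
from typing import List, Optional, Sequence

COMPENSATION_THRESHOLD = 6  # value stated for this embodiment


def approximate(content: Sequence[float], distance: int, decay: float = 0.5) -> List[float]:
    """Hypothetical approximation rule: scale each parameter of the last
    valid full-rate frame by `decay` for every frame of distance."""
    return [value * decay ** distance for value in content]


def compensate(frames: Sequence[Optional[List[float]]]) -> List[Optional[List[float]]]:
    """Each item is the parameter list of a valid full-rate frame, or None
    for an invalid frame. Other rates are left out to keep the sketch short."""
    out: List[Optional[List[float]]] = []
    last_valid: Optional[List[float]] = None
    distance = 0
    for frame in frames:
        if frame is not None:                 # step 301: normal frame
            last_valid, distance = frame, 0
            out.append(frame)
            continue
        distance += 1                         # step 303: frame distance
        if last_valid is not None and distance <= COMPENSATION_THRESHOLD:
            # Step 304: replace the invalid frame with a pseudo full-rate frame.
            out.append(approximate(last_valid, distance))
        else:
            out.append(None)                  # beyond the threshold: no compensation
    return out
```

As in the description, the only state carried between frames is the content of the most recent valid full-rate frame and the running frame distance.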

Each of the three embodiments above has its own advantages, but in terms of quality, the speech obtained by simulation-based approximation is the best. Moreover, simulation-based approximation can compensate for some erased frames occurring within a run of consecutive full-rate frames, and its additional resource cost is modest: only the content of the most recent full-rate frame needs to be stored.

The three embodiments above mainly compensate frames that follow a full-rate frame; in practical applications, speech compensation may also be performed when the last valid frame is a full-rate or half-rate frame. In addition, the compensation threshold may be set according to the actual situation.

Of course, the present invention may have many other embodiments. Those skilled in the art may therefore make various modifications and variations that remain within the scope of the appended claims without departing from the idea and essence of the present invention.

INDUSTRIAL APPLICABILITY

The present invention provides a system and a method for performing speech compensation on the network side. They address the problem that overall speech quality is low and uncomfortable to the human ear when the network side uses no vocoder, or uses one only partially, for compensation and linear prediction of speech, so that speech quality depends heavily on whether the vocoder in the user equipment compensates certain frames and on the characteristics of that compensation.

When the radio channel environment is poor or network transmission quality is relatively low, the technical solution of the present invention can, to a certain extent, compensate speech even if the network side has no vocoder or uses one only partially. This reduces uncomfortable listening effects (swallowed words, vibration, interrupted speech), improves overall speech quality, and reduces the dependence of calls on the characteristics of the user equipment and its vocoder.

Claims

1. A method of speech compensation in a mobile communication network, in which:
a) at each frame processing instant, a network-side device determines whether a received or transmitted speech frame is an invalid frame, and if so, proceeds to the next step; and
b) the network-side device performs speech compensation for the invalid frame,
wherein the following step is performed after step (a):
a1) determining whether the invalid frame is a frame with a rate mode other than 1/8, and if so, proceeding to the next step.

2. The method according to claim 1, wherein determining in step (a1) whether the invalid frame is a frame with a rate mode other than 1/8 comprises:
determining whether the last valid frame preceding the invalid frame is not a 1/8-rate frame; if so, the invalid frame is a frame with a rate mode other than 1/8; otherwise, the invalid frame is not a frame with a rate mode other than 1/8.

3. The method according to claim 1, wherein the following step is performed after step (a1):
a2) determining whether the frame distance between the invalid frame and its last valid frame is less than or equal to a compensation threshold, and if so, proceeding to the next step.

4. The method according to claim 1, wherein performing speech compensation for the invalid frame in step (b) comprises one of the following:
valid-frame duplication, in which the last valid frame replaces the current invalid frame;
1/4-rate frame filling, in which a 1/4-rate frame with any content replaces the current invalid frame;
simulation-based approximation, in which a frame obtained by simulation replaces the current invalid frame.

5. The method according to claim 1, wherein said invalid frame is an empty frame, an erased frame, another frame whose rate is not defined in the protocol, a frame not received within the specified frame processing period, or a frame that requires compensation after being received by the vocoder defined in the protocol.

6. The method according to any one of claims 2 to 4, wherein said speech frame is a forward or reverse speech frame; and
if the speech frame is a forward speech frame, said last valid frame is the last valid forward speech frame;
if the speech frame is a reverse speech frame, said last valid frame is the last valid reverse speech frame.

7. A system for performing speech compensation in a mobile communication network, the system being located in a network-side device and comprising:
an invalid frame detection node for determining whether a speech frame received or transmitted by the network-side device is an invalid frame, for sending invalid frames to a speech compensation node, and for sending valid frames to a speech frame processing node in the network-side device; and
the speech compensation node, for performing speech compensation for invalid frames and sending the compensated speech frames to the speech frame processing node in the network-side device,
wherein the speech compensation node comprises:
a speech compensation decision node for receiving invalid frames sent by the invalid frame detection node, for sending invalid frames with a rate mode other than 1/8 to a speech compensation processing node, and for sending other invalid frames to the speech frame processing node in the network-side device; and
the speech compensation processing node, for receiving invalid frames sent by the speech compensation decision node, performing speech compensation for these invalid frames, and sending the compensated speech frames to the speech frame processing node in the network-side device.

8. The system according to claim 7, wherein said speech compensation decision node determines whether the last valid frame preceding a received invalid frame is not a 1/8-rate frame; if so, the invalid frame is considered an invalid frame with a rate mode other than 1/8; otherwise, the invalid frame is not a frame with a rate mode other than 1/8.

9. The system according to claim 7, wherein said speech compensation decision node determines the frame distance between an invalid frame with a rate mode other than 1/8 and its last valid frame; such invalid frames with a frame distance less than or equal to a compensation threshold are sent to the speech compensation processing node, and frames with a frame distance greater than the compensation threshold are sent to the speech frame processing node in the network-side device.

10. The system according to claim 7, wherein the speech compensation performed for invalid frames by said speech compensation node comprises one of the following:
using the last valid frame to replace the current invalid frame;
using a 1/4-rate frame with any content to replace the current invalid frame; or
using a frame obtained by simulation to replace the current invalid frame.

11. The system according to claim 7, wherein said invalid frame detection node treats a speech frame as an invalid frame if the speech frame received by the network-side device is an empty frame, an erased frame, another frame whose rate is not defined in the protocol, a frame not received within the specified frame processing period, or a frame that requires compensation after being received by the vocoder defined in the protocol.

12. The system according to any one of claims 8 to 10, wherein said speech frame received by the network-side device is a forward or reverse speech frame; and
if the speech frame is a forward speech frame, said last valid frame is the last valid forward speech frame;
and if the speech frame is a reverse speech frame, said last valid frame is the last valid reverse speech frame.

13. The system of claim 7, wherein said device on the network side is a base station, a base station controller, a radio network controller, or a mobile switching center.