JP2006211509A

JP2006211509A - Encoding apparatus, stc correction method used therein, encoding system, transmission system, and audio transmission/reception system

Info

Publication number: JP2006211509A
Application number: JP2005023298A
Authority: JP
Inventors: Naoki Kobayashi; 尚樹小林; Tsugumichi Nagana; 継道永名
Original assignee: NEC Corp; NEC Engineering Ltd
Current assignee: NEC Corp; NEC Engineering Ltd
Priority date: 2005-01-31
Filing date: 2005-01-31
Publication date: 2006-08-10
Anticipated expiration: 2025-01-31
Also published as: JP4624121B2

Abstract

PROBLEM TO BE SOLVED: To correct an STC value that is the source of a PTS value when the STC value is not correct.

SOLUTION: An audio encoding unit 10 of an encoding system comprises an encoding processing section 11, a PTS generating section 12, and an STC monitoring section 13. The STC monitoring section 13 computes the difference between the current STC value a_n given to the current audio frame and the previous STC value that was the source of the PTS value given to the audio frame one frame earlier. It then determines whether this difference lies within the allowable range of the STC value increase amount that applies when the STC value given to each of a plurality of audio frames always increases by the same value over time. If the difference is within the range, the current STC value a_n is adopted as the STC value c_n that becomes the source of the PTS value given to the current audio frame; if not, the value obtained by adding the STC value increase amount to the previous STC value is adopted as c_n.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an encoding apparatus, an STC correction method used therein, an encoding system, a transmission system, and an audio transmission/reception system, and more particularly to an automatic STC correction method for an encoding apparatus.

Conventionally, a system is known that consists of a transmission system and a reception system. The transmission system has video and audio encoding units that encode video and audio signals independently, and multiplexes the independently encoded video and audio signals onto a single transmission path. The reception system has video and audio decoding units that separate the encoded video and audio signals from the multiplexed signal and decode them independently (see, for example, Patent Document 1).

In this system, the transmission system adds time information to the encoded video and audio signals so that the reception system can reproduce the video and audio at the correct time. For example, in audio encoding, the transmission system creates reproduction time information called a PTS (Presentation Time Stamp), which is attached to each audio frame, based on time information called the STC (System Time Clock) that is common within the transmission system. It sends the created PTS to the reception system together with time reference information called the PCR (Program Clock Reference). The reception system generates its own STC from the PCR of the received audio frames and reproduces the audio signal composed of those frames based on the PTS time information.

When the STC value is passed to the audio encoding unit in the transmission system, situations may arise in which, for example, the STC value is corrupted, the supply of the STC value stops, or the STC reference is switched independently of the audio encoding unit. In such cases the PTS, which is the time information for each audio frame, is not attached correctly. As a result, synchronized playback of the audio signal decoded in the reception system may stall, or the signal may not be reproduced at the correct time.

To avoid this, one conceivable method is for the audio encoding unit to calculate the STC internally and automatically, and to adopt it as the time information of the audio frames. With this method alone, however, it is impossible to detect a deviation between the STC value of the system time information and the internally calculated STC value. Once a deviation arises between the two, the times of all subsequent audio frames deviate from the system time, and as a result the desired reproduction may not be possible in the reception system.

Patent Document 1: JP-A-10-51750

As described above, when a PTS value representing the audio reproduction time is attached, the correct time information of the audio frames is lost if the STC value that serves as the reference for the PTS value is corrupted, its supply stops, or the STC reference is switched independently of the audio encoding unit. The desired reproduction may then not be possible in the reception system. In the worst case, synchronization with the decoded video signal is lost, which can cause serious problems in system operation.

The present invention has been made in view of these conventional circumstances. Its object is to automatically correct the STC value when the STC value that is the source of the PTS value is not correct, in audio encoding where the time information of the audio frames always increases by the same value.

To achieve the above object, the STC correction method according to the present invention is used in an encoding apparatus that receives and encodes an audio signal and, for each of a plurality of time-sequentially consecutive audio frames constituting the encoded audio signal, attaches and outputs a PTS (Presentation Time Stamp) value representing reproduction time information, generated from an STC (System Time Clock) value representing time information given in time series. The method comprises a calculation step and a processing step. The calculation step computes the difference value between the current STC value given to the current audio frame among the plurality of audio frames and the previous STC value that was the source of the PTS value attached to the audio frame a predetermined number of frames before the current audio frame. The processing step determines whether the computed difference value is within the allowable range of the STC value increase amount that applies when the STC value given for each of the plurality of audio frames always increases by the same value in time series. When the difference value is determined to be within the allowable range, the current STC value is adopted as the STC value that becomes the source of the PTS value attached to the current audio frame. When the difference value is determined not to be within the allowable range, a correction value obtained by correcting the current STC value is adopted as that STC value.

In the present invention, the correction value is preferably the value obtained by adding the STC value increase amount to the previous STC value.
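As a rough illustration of the calculation step and processing step described above, the following Python sketch applies the rule to a single frame. The function name `select_stc`, the symmetric tolerance test, and the use of integer clock ticks are assumptions made for this example; the text specifies only that the difference must fall within an allowable range of the STC value increase amount.

```python
def select_stc(current_stc: int, previous_stc: int,
               increment: int, tolerance: int) -> int:
    """Return the STC value to use as the source of the current frame's PTS.

    All names are hypothetical. The allowable range is assumed to be
    increment - tolerance <= difference <= increment + tolerance.
    """
    difference = current_stc - previous_stc        # calculation step
    if abs(difference - increment) <= tolerance:   # within the allowable range
        return current_stc                         # adopt the current STC value
    return previous_stc + increment                # adopt the correction value


# Assumed example values: 1920 ticks per frame, tolerance of 10 ticks.
print(select_stc(11925, 10000, 1920, 10))  # difference 1925 is within range
print(select_stc(50000, 10000, 1920, 10))  # difference 40000 is out of range
```

The first call keeps the supplied value, while the second falls back to the previous value plus the ideal increase. The figure of 1920 ticks is only an example; it corresponds to a 1024-sample frame at 48 kHz counted on a 90 kHz clock.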

The processing step may include a step of switching the STC value that serves as the reference for the STC value increase amount to the current STC value when the difference value has been determined not to be within the allowable range of the STC value increase amount a predetermined number of consecutive times.

The encoding apparatus according to the present invention receives and encodes an audio signal and, for each of a plurality of time-sequentially consecutive audio frames constituting the encoded audio signal, attaches and outputs a PTS (Presentation Time Stamp) value representing reproduction time information, generated from an STC (System Time Clock) value representing time information given in time series. The apparatus comprises calculation means and processing means. The calculation means computes the difference value between the current STC value given to the current audio frame among the plurality of audio frames and the previous STC value that was the source of the PTS value attached to the audio frame a predetermined number of frames before the current audio frame. The processing means determines whether the computed difference value is within the allowable range of the STC value increase amount that applies when the STC value given for each of the plurality of audio frames always increases by the same value in time series. When the difference value is determined to be within the allowable range, the processing means adopts the current STC value as the STC value that becomes the source of the PTS value attached to the current audio frame. When the difference value is determined not to be within the allowable range, it adopts a correction value obtained by correcting the current STC value as that STC value.

The processing means may include means for switching the STC value that serves as the reference for the STC value increase amount to the current STC value when the difference value has been determined not to be within the allowable range of the STC value increase amount a predetermined number of consecutive times.

An encoding system according to the present invention comprises a plurality of the encoding apparatuses described in any of the above, wherein the plurality of encoding apparatuses have different allowable ranges of the STC value increase amount.

A transmission system according to the present invention has the encoding apparatus described in any of the above and transmits the audio signal encoded by that encoding apparatus.

An audio transmission/reception system according to the present invention has a transmission system that includes the encoding apparatus described in any of the above and transmits the audio signal encoded by that encoding apparatus, and a reception system that receives the audio signal transmitted by the transmission system. The reception system has a decoding apparatus that decodes the audio signal encoded by the encoding apparatus.

According to the present invention, when the STC value serving as the time information of the audio frames always increases by the same value, the amount of increase or decrease of the STC value that is the source of the PTS value attached to each audio frame is monitored based on the difference between the current STC value and the previous STC value. This makes it possible to detect fluctuation of the STC value, momentary deviation of the STC value, and switching of the STC reference, so that the STC value can be corrected when it is not supplied correctly. As a result, the validity of the PTS value attached to the audio frames can be ensured.

Embodiments of the present invention are described below with reference to the drawings.

The encoding system of this embodiment is applied, for example, to a transmission system that encodes a video signal with a video encoding method of a predetermined standard such as the MPEG-2 (Moving Picture Experts Group phase 2) video standard, encodes an audio signal with an audio encoding method of a predetermined standard such as MPEG-2 AAC (Advanced Audio Coding), and generates and transmits a TS (Transport Stream) signal or the like, consisting of a sequence of packets such as PES (Packetized Elementary Stream) packets based on the MPEG-2 systems standard, from the encoded video and audio data.

FIG. 1 shows the configuration of the main part of the encoding system of this embodiment.

The encoding system shown in FIG. 1 comprises an audio encoding unit 10 that encodes an audio signal with an audio encoding method such as MPEG-2 AAC, an STC unit 20 that supplies the audio encoding unit 10 with the STC, which is the time information common within the system, and an operation unit 30 with which an operator can perform operations such as various settings. The system also includes components that are not shown, such as an encoding apparatus that encodes a video signal and a multiplexing apparatus that multiplexes the encoded audio and video signals into a single TS signal for output. Because the components other than the audio encoding unit 10 are not directly related to the present invention, their description is omitted below.

The audio encoding unit 10 has an encoding processing unit 11 that encodes the input audio signal, and a PTS generation unit 12 that generates a PTS value, which is reproduction time information, from the STC value given by the STC unit 20 and supplies it to the encoding processing unit 11. In addition, as a processing unit that uses the STC correction method of the present invention, this embodiment provides an STC monitoring unit 13. The STC monitoring unit 13 detects deviation of the STC value a_n given by the STC unit 20 based on the difference value Δd between the current STC value a_n and the previous STC value, and, when it detects a momentary deviation of the STC value a_n, corrects the current STC value a_n to the correct STC value c_n. In this way it monitors the amount of increase and decrease of the STC value and corrects it as necessary.

The principle of the STC correction method used in the encoding system of the present invention is now described with reference to FIGS. 2 to 5.

This STC correction method automatically corrects the adopted STC value when there is a problem with the STC value to be adopted in audio encoding where the time of the audio frames always increases by the same value. It performs the automatic correction while monitoring 1) fluctuation of the STC value, 2) momentary deviation of the STC value, and 3) switching of the STC reference. To this end, the method has a) means for detecting deviation of the STC value from the difference between the current STC value and the previous STC value, b) means for changing the current STC value to the correct STC value when a momentary deviation of the STC value is detected, and c) means for adopting the current STC value as the calculation reference for the correct STC value when deviation of the STC value is detected continuously within a specified time.

In FIGS. 2 to 5, the horizontal axis represents real time and the vertical axis represents the STC value adopted by the audio encoding unit 10. AF1 is the first audio frame input to the audio encoding unit 10 as an audio signal to be encoded at a certain time. AF2 is the second audio frame, input to the audio encoding unit 10 as an audio signal to be encoded following the first audio frame AF1. AF3 is the third audio frame, input following the second audio frame AF2. AF4 is the fourth audio frame, input following the third audio frame AF3.

In addition, t1 is the time at the moment the audio encoding unit 10 acquires the STC value a1 for the first audio frame AF1, t2 is the time at the moment it acquires the STC value a2 for the second audio frame AF2, t3 is the time at the moment it acquires the STC value a3 for the third audio frame AF3, and t4 is the time at the moment it acquires the STC value a4 for the fourth audio frame AF4.

Further, a_n (n is a positive integer; n = 1 to 4 in the examples in the figures; the same applies below) is the current STC value supplied from the external STC generation unit 20 and acquired by the audio encoding unit 10 for the current audio frame input. b_n is the previous STC value adopted for the previous audio frame, and c_n is the STC value adopted for the current audio frame. Δd is the difference between the STC value adopted for the previous audio frame and the STC value supplied externally and acquired by the audio encoding unit 10 for the current audio frame input, that is, Δd = a_n - b_n. Δe is the increment of the ideal STC value that should be acquired for the current audio frame; it represents the ideal amount by which the STC value increases, by one unit time per audio frame input, from the previous STC value adopted for the previous audio frame.

The straight line L1 in the graphs of FIGS. 2 to 5 shows the time-series change of the STC value that serves as the reference for the ideal external STC value, which increases by one unit time per audio frame input. The straight line L2 in the graph of FIG. 5 shows the time-series change of the STC value after this reference has been switched to the current STC value a_n.

Although not shown in the figures, Δf denotes the tolerance of the difference value Δd between the current STC value a_n and the previous STC value b_n, and is a specified fixed value. Scale M1 indicates the range of the tolerance Δf for Δd at time t1 (upper limit = r2 - b1, lower limit = r1 - b1). Scale M2 indicates the range at time t2 (upper limit = r3 - b2, lower limit = r2 - b2), scale M3 the range at time t3 (upper limit = r4 - b3, lower limit = r3 - b3), and scale M4 the range at time t4 (upper limit = r5 - b4, lower limit = r4 - b4).
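The quantities defined above can be summarized in one place. The following is a restatement under assumptions: the symbols $\Delta f_{\min}$ and $\Delta f_{\max}$ are hypothetical names for the lower and upper limits read off the scales M1 to M4, and the relation $b_n = c_{n-1}$ reflects the definition of b_n as the value adopted for the previous audio frame. The reference switching performed after repeated out-of-range detections is left out here.

```latex
\begin{aligned}
\Delta d &= a_n - b_n, \qquad b_n = c_{n-1},\\[4pt]
c_n &=
\begin{cases}
a_n, & \Delta f_{\min} \le \Delta d \le \Delta f_{\max},\\
b_n + \Delta e, & \text{otherwise.}
\end{cases}
\end{aligned}
```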

Four cases are described below with reference to FIGS. 2 to 5.

(Case 1)
The example of FIG. 2 shows the ideal case in which the STC value a_n supplied from outside at each audio frame input increases by one unit time per audio frame.

In this case, the difference value Δd (= a1 - b1) between the current STC value a1 given from the STC generation unit 20 at time t1 and the previous STC value b1, which was the source of the PTS value attached to the previous audio frame, falls within the range of the tolerance Δf shown by scale M1 (upper limit = r2 - b1, lower limit = r1 - b1). The value a1 is therefore adopted as the STC value c1 that becomes the source of the PTS value to be attached to the current audio frame; that is, c1 = a1.

(Case 2)
The example of FIG. 3 shows a case in which the STC value a_n supplied from outside at audio frame input did not increase by exactly one unit time per audio frame, and illustrates fluctuation of the STC value and its correction method.

In this case, the difference value Δd (= a2 - b2) between the current STC value a2 given from the STC generation unit 20 at time t2 and the previous STC value b2, which was the source of the PTS value attached to the previous audio frame, falls within the range of the tolerance Δf shown by scale M2 (upper limit = r3 - b2, lower limit = r2 - b2). The value a2 is therefore adopted as the STC value c2 that becomes the source of the PTS value to be attached to the current audio frame; that is, c2 = a2.

The example of FIG. 3 shows the case where the increase of the external STC value is larger than the ideal increment Δe. When the increase is smaller than the ideal increment, the external STC value a_n is likewise adopted as long as the difference is within the allowable range Δf.

To notify the reception system (not shown) that a PTS value has been created using a corrected STC value rather than the true STC, a flag is superimposed on the user data of the PES header portion in the PES packet generated by the encoding processing unit 11. This allows the reception system to check validity against the corrected STC value. It is also desirable for the audio encoding unit 10 to output an alarm that reflects the circumstances of the correction, so that a host system (not shown) can take appropriate action.

(Case 3)
The example of FIG. 4 shows a case in which the external STC value a_n at audio frame input did not increase by one unit time per audio frame, and illustrates a momentary deviation of the STC value and its correction method.

In this case, the difference value Δd (= a3 - b3) between the current STC value a3 given from the STC generation unit 20 at time t3 and the previous STC value b3, which was the source of the PTS value attached to the previous audio frame, does not fall within the range of the tolerance Δf shown by scale M3. The value b3 + Δe is therefore adopted as the STC value c3 that becomes the source of the PTS value to be attached to the current audio frame. That is, c3 = b3 + Δe, which is the STC value increased by the ideal amount.

The example of FIG. 4 shows the case where the increase of the external STC value is larger than the ideal increment Δe. When the increase is smaller than Δe, the value b_n + Δe is likewise adopted as the STC value if the difference is not within the allowable range Δf.

To notify the reception system that a PTS value has been created using a corrected STC value rather than the true STC, a flag is superimposed on the user data of the PES header portion in the PES packet generated by the encoding processing unit 11. This allows the reception system (not shown) to check validity against the corrected STC value. It is also desirable for the audio encoding unit 10 to output an alarm that reflects the circumstances of the correction, so that a host system (not shown) can take appropriate action.

(Case 4)
FIG. 5 shows a situation in which the external STC value has been switched, and illustrates switching of the STC reference and its correction method. The straight line L2 in FIG. 5 represents the increase of the external STC after the switch.

In this case, at times t2 and t3, the difference values Δd (= a2 - b2) and Δd (= a3 - b3) between the current STC values a2 and a3 given from the STC generation unit 20 and the previous STC values b2 and b3, which were the sources of the PTS values attached to the previous audio frames, do not fall within the ranges of the tolerance Δf shown by scales M2 and M3. The values b2 + Δe and b3 + Δe are therefore adopted as the STC values c2 and c3 that become the sources of the PTS values to be attached to the current audio frames. That is, c2 = b2 + Δe and c3 = b3 + Δe, which are the STC values increased by the ideal amount.

However, if the ideally increased STC values (b2 + Δe, b3 + Δe) are adopted at times t2 and t3 and the ideally increased STC value (b4 + Δe) were also adopted at the next time t4, the ideal STC value (b_n + Δe) would be adopted three times in succession.

When the ideal STC value would be adopted consecutively in this way, the line of STC change that serves as the reference for the ideal increase of the external STC value is considered to have changed from line L1 to line L2. At time t4, the ideally increased STC value (b4 + Δe) is therefore discarded, and the current STC value a4 given from the STC generation unit 20 is adopted as the STC value c4 that becomes the source of the PTS value to be attached to the current audio frame. The count of three consecutive times is only an example and can be set to any value.
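The behavior in Case 4 can be traced with a short Python sketch. The generator name `adopt_stc_values`, the status labels, the symmetric tolerance test, and the numeric values are all assumptions chosen for illustration; the text itself only fixes the rule that the third consecutive out-of-range frame adopts the externally supplied value.

```python
def adopt_stc_values(external, first_previous, increment, tolerance,
                     max_consecutive=3):
    """Yield (adopted_value, status) for each externally supplied STC value.

    Hypothetical sketch: a frame is in range when
    abs((current - previous) - increment) <= tolerance.
    """
    previous = first_previous
    consecutive = 0                      # consecutive out-of-range frames
    for current in external:
        if abs((current - previous) - increment) <= tolerance:
            adopted, status, consecutive = current, "adopted", 0
        else:
            consecutive += 1
            if consecutive >= max_consecutive:
                # Reference is considered switched: take the external value.
                adopted, status, consecutive = current, "switched", 0
            else:
                adopted, status = previous + increment, "corrected"
        previous = adopted               # becomes b_n for the next frame
        yield adopted, status


# Assumed scenario: the external STC jumps by 100000 ticks from a2 onward.
trace = list(adopt_stc_values([1920, 103840, 105760, 107680, 109600],
                              first_previous=0, increment=1920, tolerance=10))
for value, status in trace:
    print(value, status)
```

In this trace the second and third frames are corrected to 3840 and 5760, the fourth frame adopts the external value 107680 as the new reference, and the fifth frame is again within range of that new reference.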

Even if the STC switches in the direction of decreasing STC value, opposite to the example of FIG. 5, the concept of judging the difference against the allowable range is the same. The operation also includes performing appropriate correction processing according to whether the STC value has increased or decreased.

To notify the reception system (not shown) that a PTS value has been created using a corrected STC value rather than the true STC, and that a switch of the STC reference has been determined, a flag is superimposed on the user data of the PES header portion in the PES packet generated by the encoding processing unit 11. This allows the reception system to check validity against the corrected STC value. It is also desirable for the audio encoding unit 10 to output an alarm that reflects the circumstances of the correction, so that a host system can take appropriate action.

FIG. 6 shows an example of the internal configuration of the STC monitoring unit 13 based on the principle of the STC correction method described above.

Functionally, the STC monitoring unit 13 has a control unit 131 serving as the control center and the following units operating under its control: a current STC value buffer 132, a previous STC value buffer 133, a subtractor 134, a continuous correction count setting table 135, an STC value calculation reference table 136, and a determination unit 137.

Of these, the current STC value buffer 132 temporarily stores the STC value a_n input from the STC generation unit 20 as the current STC value a_n.

The previous STC value buffer 133 temporarily stores the STC value c_n output from the STC monitoring unit 13 to the PTS generation unit 12 as the previous STC value b_n.

The subtractor 134 calculates the difference value Δd (= a_n − b_n) between the current STC value a_n and the previous STC value b_n held in the two buffers 132 and 133.

The continuous correction count x specified via the operation unit 30 or the like is set in the continuous correction count setting table 135.

The STC value calculation reference setting table 136 holds the reference value of the ideal STC value, which increases by one unit time with each audio frame input (for example, the STC value along line L1 in FIGS. 2 to 5), and the tolerance Δf indicating the allowable range of that increase.

The determination unit 137 detects the deviation between the current STC value a_n and the previous STC value b_n from the difference value Δd output by the subtractor 134 and the settings of the two tables 135 and 136, and determines the STC value c_n to be adopted accordingly.

FIG. 7 is a flowchart explaining the operation of the STC monitoring unit 13.

First, the continuous correction count x in the continuous correction count setting table 135 is initialized (step St1), and the STC value serving as the calculation reference and the tolerance Δf in the STC value calculation reference table 136 are initialized (step St2).

Next, the STC value a_n supplied from the STC generation unit 20 is input and temporarily stored in the current STC value buffer 132 as the current STC value a_n (step St3), and the difference value Δd (= a_n − b_n) between the current STC value a_n in the current STC value buffer 132 and the previous STC value b_n temporarily stored in the previous STC value buffer 133 is calculated (step St4).

Next, it is determined whether the difference value Δd between the current STC value a_n and the previous STC value b_n is 0 (step St5). If YES (Δd = 0), the current STC value a_n is adopted as the STC value c_n on which the PTS value assigned to the current audio frame is based (c_n = a_n) (step St6). This corresponds to Case 1 described above.

If the determination in step St5 is NO (Δd ≠ 0), it is determined whether the difference value Δd is within the tolerance Δf, that is, whether it satisfies the condition: upper limit of Δf ≥ Δd ≥ lower limit of Δf (step St7).

If this determination is YES (the condition upper limit of Δf ≥ Δd ≥ lower limit of Δf is satisfied), the current STC value a_n is adopted as the STC value c_n on which the PTS value assigned to the current audio frame is based (c_n = a_n) (step St8), and a control signal S_n concerning this is output to the PTS generation unit 12 (step St9). From the control signal S_n, the PTS generation unit 12 recognizes that the current STC value a_n adopted as the STC value c_n lies within the tolerance Δf but deviates from the ideal STC value, and performs necessary processing such as outputting an alarm about it. This corresponds to Case 2 described above.

If the determination in step St7 is NO (the condition upper limit of Δf ≥ Δd ≥ lower limit of Δf is not satisfied), the ideal STC value b_n + Δe is adopted as the STC value c_n on which the PTS value assigned to the current audio frame is based (c_n = b_n + Δe) (step St10), the continuous correction count x is incremented (step St11), and it is determined whether the condition x ≥ set value (for example, 3) is satisfied (step St12).

If this determination is YES (x ≥ set value is satisfied), the value b_n + Δe is discarded and the current STC value a_n is adopted as the STC value c_n to be output to the PTS generation unit 12 (c_n = a_n) (step St13); the current STC value a_n is adopted as the STC value serving as the reference of the ideal STC value, thereby switching the STC value calculation reference (step St14); and a control signal S_n concerning this is output to the PTS generation unit 12 (step St15). The continuous correction count x is also initialized (x → 0) (step St16) in preparation for detecting the next switch of the STC value calculation reference. From the control signal S_n, the PTS generation unit 12 recognizes that, although the current STC value a_n is outside the tolerance Δf, it has been adopted as the STC value c_n on which the PTS value assigned to the current audio frame is based, and that the STC value serving as the reference of the ideal STC value has been switched to the current STC value a_n, and performs necessary processing such as outputting an alarm about it. This corresponds to Case 4 described above.

If the above determination is NO (x ≥ set value is not satisfied), a control signal S_n concerning this is output to the PTS generation unit 12 (step St17). From the control signal S_n, the PTS generation unit 12 recognizes that the ideal STC value (b_n + Δe) has been adopted because the current STC value a_n is outside the tolerance Δf, and performs necessary processing such as outputting an alarm about it. This corresponds to Case 3 described above.
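The four cases of steps St5 to St17 can be summarized in one expression. In this sketch, Δf_lo and Δf_hi stand for the lower and upper limits of the tolerance Δf, and N stands for the set value compared with x in step St12; these three symbols are notation introduced here for illustration and do not appear in the specification. The count x is taken after the increment of step St11.

```latex
\Delta d = a_n - b_n, \qquad
c_n =
\begin{cases}
a_n, & \Delta d = 0
  & \text{(Case 1, step St6)} \\[2pt]
a_n, & \Delta d \neq 0,\ \Delta f_{\mathrm{lo}} \le \Delta d \le \Delta f_{\mathrm{hi}}
  & \text{(Case 2, step St8)} \\[2pt]
b_n + \Delta e, & \Delta d \notin [\Delta f_{\mathrm{lo}}, \Delta f_{\mathrm{hi}}],\ x < N
  & \text{(Case 3, step St10)} \\[2pt]
a_n, & \Delta d \notin [\Delta f_{\mathrm{lo}}, \Delta f_{\mathrm{hi}}],\ x \ge N
  & \text{(Case 4, step St13)}
\end{cases}
```

In every case the adopted c_n becomes the previous STC value b_n for the next frame (step St18).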

When the processing of steps St6, St9, and St15 is finished, the STC value c_n adopted according to Cases 1 to 4 described above is output to the PTS generation unit 12 and the previous STC value buffer 133 (step St18).

The processing of steps St3 to St18 is repeated until it is determined in step St19 that processing is to end.
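The loop of steps St3 to St18 might be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: the class name `StcMonitor`, its field names, and the numeric values are hypothetical. The sketch also makes three assumptions that the flowchart does not state: the first frame simply seeds the previous-value buffer, the "ideal" test of step St5 is read as "the increase equals the nominal per-frame increment Δe" with the tolerance Δf given as bounds on that increase, and the consecutive-correction counter is cleared whenever a frame is accepted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StcMonitor:
    """Hypothetical sketch of the STC monitoring unit 13 (FIG. 7 flowchart)."""
    delta_e: int                # nominal STC increase per audio frame (Δe)
    tol_low: int                # lower limit of the allowed increase (Δf)
    tol_high: int               # upper limit of the allowed increase (Δf)
    limit: int = 3              # set value for consecutive corrections
    prev: Optional[int] = None  # previous STC value b_n (buffer 133)
    count: int = 0              # consecutive correction count x

    def step(self, a_n: int) -> Tuple[int, int]:
        """Take the current STC value a_n; return (c_n, case number)."""
        if self.prev is None:
            # Assumption: the first frame has no predecessor, so accept it.
            self.prev = a_n
            return a_n, 1

        delta_d = a_n - self.prev                # step St4
        if delta_d == self.delta_e:              # step St5 (assumed reading)
            c_n, case = a_n, 1                   # step St6
            self.count = 0                       # assumption: streak broken
        elif self.tol_low <= delta_d <= self.tol_high:   # step St7
            c_n, case = a_n, 2                   # step St8
            self.count = 0                       # assumption: streak broken
        else:
            self.count += 1                      # step St11
            if self.count >= self.limit:         # step St12
                c_n, case = a_n, 4               # steps St13, St14: new reference
                self.count = 0                   # step St16
            else:
                c_n, case = self.prev + self.delta_e, 3   # step St10

        self.prev = c_n                          # step St18: feed buffer 133
        return c_n, case


if __name__ == "__main__":
    monitor = StcMonitor(delta_e=100, tol_low=90, tol_high=110, limit=3)
    for stc in [0, 100, 205, 1000, 1100, 1200, 1300]:
        print(stc, monitor.step(stc))
```

With these illustrative values, the jump to 1000 is corrected twice (Case 3, yielding 305 and 405) and accepted as the new reference on the third consecutive out-of-range frame (Case 4, yielding 1200), after which the stream is treated as ideal again.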

Thus, according to this embodiment, for the case where the STC value serving as the time information of an audio frame always increases by the same amount in time series, the audio encoding unit monitors the amount of increase or decrease of the STC value on which the PTS value assigned to the audio frame is based, and thereby detects STC value fluctuation, momentary STC value deviation, and switching of the STC value reference. When the STC value underlying the PTS value is incorrect, the adopted STC value can therefore be corrected automatically. This ensures the validity of the PTS value assigned to each encoded audio frame.

Consequently, a receiving system having a decoding device that decodes the encoded audio signal can output the decoded signal according to the desired time information. In addition, by detecting a switch of the STC and switching the reference STC at that time, the validity of the PTS value is ensured, and the output environment of the receiving system can be made correct and consistent with the transmission side equipped with the decoding device. Furthermore, attaching incorrect time information to the encoded audio signal is prevented, which forestalls trouble in operational systems that use the time information. It is also operationally convenient that, even when the time information of the system is switched, the encoder follows automatically without any additional settings or control.

The internal configuration and operation of the STC monitoring unit 13 in the audio encoding unit 10 described above (see FIGS. 7 and 8) are only an example. The present invention is not limited to them, and any internal configuration and operation may be applied as long as they follow the principle of the STC correction method (see FIGS. 2 to 5).

As another embodiment of the present invention, a configuration may be adopted in which the result of detecting an STC value deviation in the audio encoding unit is notified to the outside. This makes it easy for other parts, such as the part that performs PTS addition in the video encoding unit or the part that multiplexes the encoded video signal and the encoded audio signal, to implement the same processing as this audio encoding unit.

Providing switch means that can enable or disable the STC correction method also makes the following applied systems conceivable.

For example, as a system using only audio signals, a virtual effect with multiple audio channels becomes possible. For 4-channel audio encoding, two 2-channel audio encoding units are provided, and the tolerance Δf of the PTS value increase described above is prepared separately for each. For example, in one of the two audio encoding units the switch means enables the STC correction method so that the tolerance Δf of the PTS value increase is applied, while in the other the switch means disables the STC correction method so that no tolerance Δf is applied and the shifted STC value supplied from outside is attached as is. By supplying an intentionally shifted STC in this way, a sound resembling a musical round can be produced for sound effects and the like. The duration of this virtual effect can be varied through the freely settable tolerance Δf of the PTS value increase. The same can also be achieved in audio decoding, the counterpart of audio encoding.

FIG. 1 is a schematic block diagram showing the configuration of the main part of an encoding system according to an embodiment of the present invention.
FIG. 2 is a graph explaining the ideal case in which the STC value supplied from outside at audio frame input increases by one unit time for each audio frame.
FIG. 3 is a graph explaining the case in which the STC value supplied from outside at audio frame input does not increase by one unit time for each audio frame (within the allowable range).
FIG. 4 is a graph explaining the case in which the STC value supplied from outside at audio frame input does not increase by one unit time for each audio frame (outside the allowable range).
FIG. 5 is a graph explaining the case in which the reference of the STC value supplied from outside at audio frame input is switched.
FIG. 6 is a schematic block diagram showing the internal configuration of the STC monitoring unit.
FIG. 7 is a schematic flowchart explaining the operation of the STC monitoring unit.

Explanation of symbols

10 Audio encoding unit
11 Encoding processing unit
12 PTS generation unit
13 STC monitoring unit
20 STC generation unit
30 Operation unit
131 Control unit
132 Current STC value buffer
133 Previous STC value buffer
134 Subtractor
135 Continuous correction count setting table
136 STC value calculation reference table
137 Determination unit

Claims

An STC correction method for use in an encoding apparatus that generates and outputs PTS (Presentation Time Stamp) values representing the time information at which an audio signal is to be played back, on the basis of STC (System Time Clock) values representing time information given in time series to each of a plurality of audio frames that are consecutive in time series, the method comprising:
a calculation step of calculating a difference value between a current STC value given to a current audio frame among the plurality of audio frames and a previous STC value on which the PTS value assigned to an audio frame a predetermined number of frames before the current audio frame was based; and
a processing step of determining whether the calculated difference value is within an allowable range of an STC value increase amount for the case where the STC values given to the plurality of audio frames always increase by the same amount in time series, adopting the current STC value as the STC value on which the PTS value assigned to the current audio frame is based when the difference value is determined to be within the allowable range, and adopting a correction value obtained by correcting the current STC value as the STC value on which the PTS value assigned to the current audio frame is based when the difference value is determined not to be within the allowable range.

The STC correction method used in the encoding apparatus according to claim 1, wherein the correction value is a calculated value obtained by adding the STC value increase amount to the previous STC value.

The STC correction method for use in an encoding apparatus according to claim 1, further comprising a step of switching the STC value serving as the reference for the STC value increase amount to the current STC value when it is determined in the calculation step that the difference value is not within the allowable range of the STC value increase amount a predetermined number of times in succession.

An encoding apparatus that generates and outputs PTS (Presentation Time Stamp) values representing the time information at which an audio signal is to be played back, on the basis of STC (System Time Clock) values representing time information given in time series to each of a plurality of audio frames that are consecutive in time series, the apparatus comprising:
calculation means for calculating a difference value between a current STC value given to a current audio frame among the plurality of audio frames and a previous STC value on which the PTS value assigned to an audio frame a predetermined number of frames before the current audio frame was based; and
processing means for determining whether the calculated difference value is within an allowable range of an STC value increase amount for the case where the STC values given to the plurality of audio frames always increase by the same amount in time series, adopting the current STC value as the STC value on which the PTS value assigned to the current audio frame is based when the difference value is determined to be within the allowable range, and adopting a correction value obtained by correcting the current STC value as the STC value on which the PTS value assigned to the current audio frame is based when the difference value is determined not to be within the allowable range.

The encoding apparatus according to claim 4, wherein the correction value is a calculated value obtained by adding the STC value increase amount to the previous STC value.

The encoding apparatus according to claim 4, wherein the processing means further comprises means for switching the STC value serving as the reference for the STC value increase amount to the current STC value when the difference value is determined not to be within the allowable range of the STC value increase amount a predetermined number of times in succession.

An encoding system comprising a plurality of the encoding apparatuses according to any one of claims 4 to 6,
wherein the plurality of encoding apparatuses have mutually different allowable ranges of the STC value increase amount.

A transmission system comprising the encoding apparatus according to any one of claims 4 to 6, the transmission system transmitting an audio signal encoded by the encoding apparatus.

An audio transmission/reception system comprising:
a transmission system that comprises the encoding apparatus according to any one of claims 4 to 6 and transmits an audio signal encoded by the encoding apparatus; and
a receiving system that receives the audio signal sent by the transmission system,
wherein the receiving system includes a decoding device that decodes the audio signal encoded by the encoding apparatus.