JP5948793B2 - Sound processing apparatus, sound processing method and program

Info

Publication number: JP5948793B2
Application number: JP2011240596A
Authority: JP
Inventors: 博信山崎
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-11-01
Filing date: 2011-11-01
Publication date: 2016-07-06
Anticipated expiration: 2031-11-01
Also published as: JP2013097202A

Description

The present invention relates to a sound processing apparatus, a sound processing method, and a program for processing data embedded by echo hiding.

Steganography is a conventional technique for embedding predetermined data into image data or audio data in regions that are difficult for humans to perceive. The predetermined data is also referred to below as additional data.

Because steganography can add new information using only existing communication channels, it is used for copyright protection, transmission of confidential data, and similar purposes.

For example, the echo hiding method is one means of adding additional data to audio data. In this method, an echo with a delay time too short for humans to perceive is applied to the original audio data, and the delay time itself conveys the additional data.
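The embedding step described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names, the echo gain of 0.3, and the symbol-to-delay mapping are hypothetical and do not come from the patent, which does not specify concrete values.

```python
def add_echo(signal, delay_samples, gain=0.3):
    """Return the signal plus a copy of itself delayed by delay_samples and scaled by gain."""
    out = list(signal)
    for n in range(delay_samples, len(signal)):
        out[n] += gain * signal[n - delay_samples]
    return out


# Hypothetical mapping from data symbols to echo delays, in samples.
DELAY_FOR_SYMBOL = {"A": 40, "B": 50, "C": 60}


def embed_symbol(signal, symbol, gain=0.3):
    """Embed one symbol by applying the echo whose delay represents that symbol."""
    return add_echo(signal, DELAY_FOR_SYMBOL[symbol], gain)
```

The receiver only needs to recover which delay was used, so the choice of gain trades audibility of the echo against robustness of detection.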

On the receiver side, computing the cepstrum of the received audio data yields a significantly large value at the quefrency that corresponds to the delay time of the applied echo. The additional data can then be reproduced from a correspondence table between delay times and data that was agreed on with the transmitting side in advance.

The cepstrum is obtained by taking the logarithm of a Fourier transform result and then applying a further Fourier transform; it is used, for example, to find the pitch of audio data. The index of each cepstrum element is called the quefrency, and it has the same dimension as time.
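As a rough illustration of why an echo shows up as a cepstral peak, the following Python sketch computes a real cepstrum with a naive DFT. It rests on several assumptions that are not in the patent: the second transform is taken as an inverse DFT of the log magnitude (a common formulation; for a real, even log spectrum it differs from a forward transform only by scaling), the helper names dft and real_cepstrum are hypothetical, and the test signal is a toy one (a unit impulse plus a single echo delayed by 5 samples).

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of the frame."""
    spectrum = dft(frame)
    log_mag = [math.log(max(abs(v), floor)) for v in spectrum]
    return [v.real for v in dft(log_mag, inverse=True)]


# Toy frame: unit impulse plus an echo of gain 0.5 delayed by 5 samples.
frame = [0.0] * 64
frame[0] = 1.0
frame[5] = 0.5
cep = real_cepstrum(frame)

# Search the low-quefrency half (skipping index 0) for the largest value.
peak = max(range(1, 32), key=lambda q: cep[q])
print(peak)  # the largest value falls at quefrency 5, the echo delay
```

With real audio the cepstrum of the original signal is superimposed on the echo peak, which is one reason per-frame decoding results can be wrong and need the later determination steps.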

In the echo hiding method, the interval at which the additional data is updated is known, so the additional data can be extracted accurately by aligning the readout timing of the audio data with this update timing.

For example, there is a technique that determines segment boundaries according to whether the distance between an estimated phase shift of a watermarked segment and the average of that phase shift is equal to or greater than a predetermined threshold.

Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2009-515371

When additional data is added continuously to audio data by the echo hiding method, any discontinuity in the audio data causes a noticeable deterioration in sound quality. In practice, therefore, the echo is not switched abruptly at the update timing of the additional data. Instead, the echo for the previous data is faded out and the echo for the next data is faded in, so that the echo changes gradually.
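The gradual switch between echoes can be sketched as a crossfade of two delayed copies of the signal. The sketch below is an assumption-based illustration only: the patent does not state the fade shape, so a linear ramp is assumed, and the function names and parameters (fade_start, fade_len, gain) are hypothetical.

```python
def fade_weight(n, fade_start, fade_len):
    """Weight of the new echo at sample n: 0 before the fade, 1 after, linear in between."""
    return min(1.0, max(0.0, (n - fade_start) / fade_len))


def crossfade_echo(signal, old_delay, new_delay, fade_start, fade_len, gain=0.3):
    """Add an echo that moves gradually from old_delay to new_delay."""
    out = list(signal)
    for n in range(len(signal)):
        w_new = fade_weight(n, fade_start, fade_len)
        old = signal[n - old_delay] if n >= old_delay else 0.0
        new = signal[n - new_delay] if n >= new_delay else 0.0
        # The old echo fades out while the new echo fades in.
        out[n] += gain * ((1.0 - w_new) * old + w_new * new)
    return out
```

Frames that overlap the fade region contain a mixture of both echoes, which is why their decoding results are unreliable.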

In this case, if the readout timing of the audio data on the receiving device side deviates from the update timing of the additional data, the cepstrum is calculated over audio data to which different echoes have been added, and the additional data cannot be extracted properly.

Accordingly, an object of the disclosed technique is to provide a sound processing apparatus, a sound processing method, and a program that can improve the extraction accuracy of additional data from echo-hidden audio data while keeping the processing amount low.

A sound processing apparatus according to one aspect of the disclosure includes: a calculation unit that, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, performs a cepstrum calculation on each of a plurality of frames obtained by dividing audio data of a length corresponding to the update interval; a first judgment unit that judges whether the number of consecutive frames from which the same predetermined data is extracted, based on the cepstrum calculation result of each frame, is equal to or greater than a predetermined value, and decides the predetermined data for one update interval; an output unit that outputs the decided predetermined data; a center calculation unit that obtains, for a plurality of frames from which the same predetermined data is consecutively extracted, the center time of those frames as a whole; a first determination unit that adds the update interval to the obtained center time and determines a predetermined number of frames to be used for extracting the next additional data; and a second judgment unit that judges which predetermined data is extracted most often based on the cepstrum calculation results of the determined predetermined number of frames, and outputs the most frequently extracted predetermined data to the output unit.

According to the disclosed technique, the extraction accuracy of additional data from echo-hidden audio data can be improved while keeping the processing amount low.

FIG. 1 is a block diagram illustrating an example of the configuration of a sound processing apparatus according to the first embodiment.
FIG. 2 is a diagram illustrating an example of encode frames.
FIG. 3 is a diagram illustrating an example of decode frames.
FIG. 4 is a diagram illustrating an example of a cepstrum calculation result.
FIG. 5 is a block diagram illustrating an example of the functions of a determination unit according to the first embodiment.
FIG. 6 is a diagram for explaining additional data determination processing according to the first embodiment.
FIG. 7 is a flowchart illustrating an example of processing of the sound processing apparatus according to the first embodiment.
FIG. 8 is a block diagram illustrating an example of the functions of a determination unit according to the second embodiment.
FIG. 9 is a diagram for explaining additional data determination processing according to the second embodiment.
FIG. 10 is a flowchart illustrating an example of processing of a sound processing apparatus according to the second embodiment.
FIG. 11 is a block diagram illustrating an example of the functions of a determination unit according to the third embodiment.
FIG. 12 is a diagram for explaining additional data determination processing according to the third embodiment.
FIG. 13 is a flowchart illustrating an example of processing of a sound processing apparatus according to the third embodiment.
FIG. 14 is a block diagram illustrating an example of the configuration of a sound processing apparatus according to the fourth embodiment.
FIG. 15 is a diagram for explaining additional data determination processing according to the fourth embodiment.
FIG. 16 is a flowchart illustrating an example of processing of the sound processing apparatus according to the fourth embodiment.
FIG. 17 is a block diagram illustrating an example of the hardware of a mobile terminal device according to the fifth embodiment.

Hereinafter, adding an echo with a predetermined delay time to audio data as predetermined data is called encoding, and a series of audio data in which one piece of additional data is embedded is called an encode frame. As described above, the predetermined data is also called additional data. Extracting additional data from audio data on the side of the sound processing apparatus that receives the sound is called decoding, and a series of audio data to be decoded is called a decode frame.

[Example 1]
<Configuration>
FIG. 1 is a block diagram illustrating an example of the configuration of the sound processing apparatus 1 according to the first embodiment. The sound processing apparatus 1 illustrated in FIG. 1 includes a microphone unit 11, an audio data count unit 12, an audio data storage unit 13, a calculation unit 14, a decoding unit 15, a determination unit 16, and an output unit 17.

The microphone unit 11 converts sound into audio data in the form of an electrical signal and outputs the converted audio data to the audio data count unit 12 and the audio data storage unit 13. The microphone unit 11 may instead be, for example, a receiving unit that receives audio data.

Here, it is assumed that additional data has been added to the input sound. FIG. 2 is a diagram illustrating an example of encode frames. An echo signal delayed by a predetermined time is added to the original signal shown in FIG. 2 at intervals of t1. Which data was added can be determined from the difference in this predetermined time.

Each encode frame shown in FIG. 2 is a frame in which an echo signal representing one piece of additional data has been added to the original signal. Because an echo signal delayed by an arbitrary Δt is added at every interval t1, one piece of additional data is extracted per interval t1.

Returning to FIG. 1, the audio data count unit 12 counts the amount of audio data acquired from the microphone unit 11. The audio data is counted in units of decode frames, which are obtained by dividing the update interval t1 into a plurality of parts.

FIG. 3 is a diagram illustrating an example of decode frames. As shown in FIG. 3, a decode frame is, for example, one of the frames obtained by dividing one encode frame into a plurality of parts. When the number of decode frames reaches, for example, a threshold or more, the audio data count unit 12 notifies the calculation unit 14 that data ready for calculation has been stored. The threshold is, for example, the number of decode frames contained in one encode frame.
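The framing and counting described above might look like the following Python sketch. It is illustrative only: the class and function names are hypothetical, frames are assumed to be consecutive and non-overlapping, and the frame length and threshold used in any call are placeholders rather than values from the patent.

```python
def split_into_decode_frames(samples, frame_len):
    """Split samples into consecutive, non-overlapping frames; an incomplete tail is dropped."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


class DecodeFrameCounter:
    """Tracks stored samples and reports when enough complete decode frames are available."""

    def __init__(self, frame_len, threshold):
        self.frame_len = frame_len
        self.threshold = threshold
        self.samples_seen = 0

    def push(self, n_samples):
        """Record n_samples new samples; return True once the frame threshold is reached."""
        self.samples_seen += n_samples
        return self.samples_seen // self.frame_len >= self.threshold
```

A True return value from push plays the role of the notification sent to the calculation unit 14.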

One decode frame is, for example, several tens of milliseconds long, and one encode frame is, for example, 16 decode frames long. These decode frame and encode frame sizes are merely examples, and the sizes are not limited to them.

On receiving the notification from the audio data count unit 12, the calculation unit 14 reads the audio data from the audio data storage unit 13 and calculates a cepstrum for each decode frame. The calculation unit 14 outputs the cepstrum calculation result of each decode frame to the decoding unit 15.

Based on the cepstrum calculation result acquired from the calculation unit 14, the decoding unit 15 obtains the delay time that corresponds to the interval up to the peak value.

FIG. 4 is a diagram illustrating an example of a cepstrum calculation result. A cepstrum calculation result such as the one shown in FIG. 4 is obtained for each decode frame. The interval up to the quefrency at which the peak value occurs corresponds to the delay time of the echo signal.
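Reading the delay time off a cepstrum can be sketched as a peak search over a range of quefrency indices. The following Python fragment is a hedged illustration: the search bounds q_min and q_max, the function names, and the sample rate in the test call are assumptions, since the patent does not say how the peak search is bounded.

```python
def peak_quefrency(cepstrum, q_min, q_max):
    """Return the quefrency index in [q_min, q_max] with the largest cepstrum value."""
    return max(range(q_min, q_max + 1), key=lambda q: cepstrum[q])


def quefrency_to_seconds(q, sample_rate):
    """Convert a quefrency index (in samples) to a delay time in seconds."""
    return q / sample_rate
```

Excluding quefrency 0 and very small indices from the search is a common precaution, because they are dominated by the overall spectral envelope rather than by the echo.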

Returning to FIG. 1, the decoding unit 15 decodes the obtained delay time into additional data. The decoding unit 15 holds a correspondence table between delay times and additional data; each time it acquires a delay time from the calculation unit 14, it decodes that delay time into additional data and outputs a decoding result indicating the additional data to the determination unit 16. The decoding unit 15 can thus extract additional data for each decode frame.
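The table lookup can be sketched as follows. This is only an assumed illustration: the table contents, the use of sample counts as keys, and the tolerance for small peak-position errors are all hypothetical and are not described in the patent.

```python
# Hypothetical correspondence table: echo delay in samples -> additional data.
DELAY_TABLE = {40: "A", 50: "B", 60: "C"}


def decode_delay(delay_samples, table=DELAY_TABLE, tolerance=2):
    """Map a measured delay to the nearest table entry, or None if nothing is close enough."""
    nearest = min(table, key=lambda d: abs(d - delay_samples))
    if abs(nearest - delay_samples) > tolerance:
        return None
    return table[nearest]
```

Returning None for an unmatched delay lets later stages ignore frames, such as those in a fade region, whose peak does not correspond to any valid symbol.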

The determination unit 16 acquires the decoding results from the decoding unit 15 and, based on them, determines one piece of additional data for each update interval. The update interval of the additional data is assumed to be preset in the sound processing apparatus 1. Details of the determination unit 16 are described with reference to FIG. 5. The determination unit 16 outputs the determined additional data to the output unit 17.

The output unit 17 outputs the additional data acquired from the determination unit 16 to a subsequent processing unit, for example a display unit. Displaying the additional data allows the user to see the data that was added to the audio data.

<Function of determination unit>
FIG. 5 is a block diagram illustrating an example of the functions of the determination unit 16 according to the first embodiment. The determination unit 16 illustrated in FIG. 5 includes a determination unit 101 for determining the first additional data and a determination unit 102 for determining the second and subsequent additional data.

The determination unit 101 includes a continuous count unit 111 and an additional data first determination unit 112. The continuous count unit 111 acquires from the decoding unit 15, for example, the decoding result of each decode frame in the first encode frame, and counts the number of consecutive decode frames that yield the same decoding result (the same additional data). The continuous count unit 111 outputs this consecutive count to the additional data first determination unit 112 and the center calculation unit 121.

The additional data first determination unit 112 determines whether the consecutive count acquired from the continuous count unit 111 is equal to or greater than a predetermined value. Let this predetermined value be N; N is decided in advance based on the update interval t1 of the additional data. The upper bound on N is the number of decode frames in one encode frame minus the number of frames taken up by the fade at a data switch. N is therefore set to a value smaller than this upper bound.

When the consecutive count is N or more, the additional data first determination unit 112 decides that the consecutively decoded additional data is the predetermined data added to the encode frame. The additional data first determination unit 112 outputs the decided additional data to the output unit 17.
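The first determination can be sketched as a search for the first run of identical decoding results whose length is at least N. The Python sketch below is an assumption-based illustration: the function names are hypothetical, the run is measured to its end so that its extent is available for the later center calculation, and None stands for a frame with no valid decoding result.

```python
def max_threshold(frames_per_encode_frame, fade_frames):
    """Upper bound on N: decode frames per encode frame minus frames spent on the fade."""
    return frames_per_encode_frame - fade_frames


def first_run(decoded, n_min):
    """Return (symbol, start_index, length) of the first run with length >= n_min, else None."""
    start = 0
    for i in range(1, len(decoded) + 1):
        # A run ends at the end of the list or when the decoding result changes.
        if i == len(decoded) or decoded[i] != decoded[start]:
            length = i - start
            if length >= n_min and decoded[start] is not None:
                return decoded[start], start, length
            start = i
    return None
```

For instance, with 16 decode frames per encode frame and a hypothetical 4 fade frames, max_threshold gives 12, so N would be chosen below 12.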

As a result, there is no need to align the update timing of the additional data with the readout timing of the audio data, and the additional data can be extracted properly using decode frames whose readout timing is fixed in advance.

Once the first additional data has been determined, the determination unit 16 uses the determination unit 102 to determine the second and subsequent additional data. The decoding results of the decoding unit 15 for the second and subsequent encode frames are therefore input to the determination unit 102. The input of decoding results may be switched using, for example, a switch.

The determination unit 102 includes a center calculation unit 121, a first determination unit 122, and an additional data second determination unit 123. The center calculation unit 121 obtains the center of the run of consecutive frames reported by the continuous count unit 111. For example, the center calculation unit 121 calculates the center time of the decode frames over which the same decoding result continued. The center calculation unit 121 outputs the obtained center time to the first determination unit 122.

The first determination unit 122 repeatedly adds the update interval of the additional data to the center time acquired from the center calculation unit 121. It decides that a predetermined number of frames centered on each resulting time is the determination position for the subsequent additional data. The predetermined number is, for example, N, but it is not limited to N and any appropriate value may be set. The first determination unit 122 notifies the additional data second determination unit 123 of the predetermined number of frames included in the decided determination position.
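The center calculation and the choice of the next determination position can be sketched with integer sample positions. Everything concrete here is an assumption for illustration: the function names, the decode frame length of 160 samples, and the update interval of 8 decode frames are hypothetical, and the run (starting at frame index 1 and lasting 6 frames) merely mimics a run of six identical results.

```python
def run_center(start_frame, run_length, frame_len):
    """Center position, in samples, of a run of run_length frames starting at start_frame."""
    return start_frame * frame_len + (run_length * frame_len) // 2


def frames_around(center_sample, frame_len, count):
    """Indices of count consecutive frames centered on the frame containing center_sample."""
    center_frame = center_sample // frame_len
    first = center_frame - (count - 1) // 2
    return list(range(first, first + count))


FRAME_LEN = 160                  # hypothetical decode frame length in samples
UPDATE_INTERVAL = 8 * FRAME_LEN  # hypothetical update interval t1 in samples

t0 = run_center(1, 6, FRAME_LEN)   # center of the first run
t1_center = t0 + UPDATE_INTERVAL   # center of the next determination position
t2_center = t1_center + UPDATE_INTERVAL
window1 = frames_around(t1_center, FRAME_LEN, 3)
window2 = frames_around(t2_center, FRAME_LEN, 3)
```

Working in whole samples avoids floating-point rounding when deciding which frame contains the center position.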

The additional data second determination unit 123 determines the additional data using the frames at the determination position decided by the first determination unit 122. The additional data second determination unit 123 determines which additional data is extracted most often among the decoding results of the predetermined number of frames, and outputs that additional data to the output unit 17.

For the second and subsequent additional data, the additional data second determination unit 123 uses the frames included in the determination position decided by the first determination unit 122.
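The second determination amounts to a majority vote over the decoding results of the selected frames. A minimal Python sketch follows, with assumptions noted: the function name is hypothetical, None marks a frame without a valid decoding result, and tie-breaking (not specified in the patent) falls to whichever result was seen first.

```python
from collections import Counter


def majority_symbol(results):
    """Return the most frequent non-None decoding result, or None if there is none."""
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None
    # most_common orders equal counts by first occurrence.
    return votes.most_common(1)[0][0]
```

Because the window sits near the center of the update interval, most of its frames are expected to lie outside the fade region, so a single wrong frame is outvoted.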

As a result, the second and subsequent additional data can be determined using decode frames near the center of the update interval, which improves the extraction accuracy of the additional data. In addition, because the determination processing for the second and subsequent additional data does not need to count consecutive frames, it requires less processing than the determination processing for the first additional data.

<Example of additional data determination processing>
Next, an example of the additional data determination processing in the first embodiment is described. FIG. 6 is a diagram for explaining the additional data determination processing according to the first embodiment. En11 to En14 shown in FIG. 6 represent the encode frames in which additional data B to E, respectively, are encoded. In the example shown in FIG. 6, decoding starts from additional data B.

In FIG. 6, Den (n = 101, 102, ...) denotes a decode frame. In this example, the decoding results from the decoding unit 15 are “A” for De101 and “B” for De102 to De107. Because the decoding result “B” continues over De102 to De107, the continuous count unit 111 sets the consecutive count to 6. Here, the predetermined value N is, for example, 3.

Because the consecutive count is equal to or greater than the predetermined value N, the additional data first determination unit 112 outputs the decoding result “B” to the output unit 17.

Next, the center calculation unit 121 obtains the center T0 of the run of six consecutive frames. The center T0 is, for example, the center time of the decode frames De102 to De107.

The first determination unit 122 adds the update interval t1 of the additional data to the center time T0 and calculates the center time T1 of the next frame determination position. The first determination unit 122 then decides a predetermined number N of frames centered on the frame that contains the time T1.

The additional data second determination unit 123 determines the most frequently extracted additional data among the decoding results of the predetermined number N of frames including the time T1. Here, additional data “C” is extracted three times, so additional data “C” is output to the output unit 17.

For subsequent additional data, the first determination unit 122 adds the update interval t1 to the time T1 and calculates the center time T2 of the next frame determination position. The additional data second determination unit 123 determines the most frequently extracted additional data among the decoding results of the predetermined number N of frames including the time T2. Here, additional data “D” is output to the output unit 17.

Note that the center calculation unit 121 may instead calculate the center time of the predetermined frames used for determining each piece of additional data, in which case the first determination unit 122 decides the predetermined number of frames centered on the frame that contains the center time acquired from the center calculation unit 121.

In this way, once the center time T0 of the update interval of the first additional data has been estimated, repeatedly adding the update interval t1 allows the additional data to be determined near the center of each update interval, so the additional data can be extracted properly.

<Operation>
Next, the operation of the sound processing apparatus 1 in the first embodiment is described. FIG. 7 is a flowchart illustrating an example of the processing of the sound processing apparatus 1 according to the first embodiment. In step S101 shown in FIG. 7, the microphone unit 11 receives audio data and outputs it to the audio data count unit 12 and the audio data storage unit 13.

In step S102, the audio data count unit 12 determines whether the required number of decode frames has been received. If the required number has been received (YES in step S102), the process proceeds to step S103; if not (NO in step S102), the process returns to step S101.

In step S103, the calculation unit 14 calculates a cepstrum for each decode frame. The calculation unit 14 obtains a delay time from the cepstrum calculation result and outputs it to the decoding unit 15.

In step S104, the decoding unit 15 detects a peak in the cepstrum calculation result and obtains the delay time of the echo signal. The decoding unit 15 acquires the additional data corresponding to the obtained delay time from the correspondence table, in which delay times and additional data are associated with each other.

In step S105, the continuous count unit 111 of the determination unit 16 counts the number of consecutive decode frames that yield the same decoding result.

In step S106, the additional data first determination unit 112 determines whether the consecutive count is N or more. If it is N or more (YES in step S106), the process proceeds to step S107; if it is less than N (NO in step S106), the process returns to step S101.

In step S107, the output unit 17 outputs the additional data indicated by the decoding result whose consecutive count is N or more.

In step S108, the center calculation unit 121 and the first determination unit 122 decide the determination position of the next decode frames.

In step S109, the microphone unit 11 receives audio data and outputs it to the audio data count unit 12 and the audio data storage unit 13.

In step S110, the audio data count unit 12 determines whether the required number of decode frames has been received. If the required number has been received (YES in step S110), the process proceeds to step S111; if not (NO in step S110), the process returns to step S109.

In step S111, the calculation unit 14 calculates a cepstrum for each decode frame. The calculation unit 14 obtains a delay time from the cepstrum calculation result and outputs it to the decoding unit 15.

In step S112, the decoding unit 15 detects a peak in the cepstrum calculation result and obtains the delay time of the echo signal. The decoding unit 15 acquires the additional data corresponding to the obtained delay time from the correspondence table, in which delay times and additional data are associated with each other.

Steps S109 to S112 are described after step S108 for convenience of explanation. In practice, the next audio data is received in step S109 immediately following step S101, and the processing from step S110 onward is performed.

In step S113, the additional data second determination unit 123 determines whether the decoding result is one to be used for determination. For example, it determines whether the decoding result belongs to a decode frame at the determination position decided by the first determination unit 122. If the decoding result is to be used (YES in step S113), the process proceeds to step S114; if not (NO in step S113), the process returns to step S109 and the next audio data is received.

In step S114, the additional data second determination unit 123 counts the number of decoding results to be used for determination.

In step S115, the additional data second determination unit 123 determines whether this count has reached N. If the count is N (YES in step S115), the process proceeds to step S116; if not (NO in step S115), the process returns to step S109 and the next audio data is received.

In step S116, the additional data second determination unit 123 takes a majority vote over the N decoding results and determines the most frequently extracted additional data.

In step S117, the additional data second determination unit 123 resets the count.

After step S117, the process proceeds to step S107, where the output unit 17 outputs the additional data that is the result of the majority vote.

以降は、ステップＳ１０７〜Ｓ１１７の処理を繰り返し、判定された付加データを順に出力していく。 Thereafter, the processing of steps S107 to S117 is repeated, and the determined additional data is output in order.
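The counting and voting loop of steps S113 to S117 can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the class name `MajorityJudge`, the `push` method, and the `is_target` flag are hypothetical names introduced here, and the handling of ties (the value seen first wins) is an assumption, since the description does not specify it.

```python
from collections import Counter


class MajorityJudge:
    """Collects decoding results at judging positions and votes once N are gathered."""

    def __init__(self, n):
        self.n = n          # N: number of decoding results per vote
        self.results = []   # decoding results counted so far (S114)

    def push(self, result, is_target):
        """Feed one decoding result; return the voted data, or None if not ready."""
        if not is_target:                # S113: NO, not at a judging position
            return None
        self.results.append(result)      # S114: count the result
        if len(self.results) < self.n:   # S115: NO, keep receiving audio data
            return None
        # S116: majority vote; ties fall to the value seen first (assumption)
        winner, _ = Counter(self.results).most_common(1)[0]
        self.results = []                # S117: reset the count
        return winner
```

Feeding the results "C", "D", "C" at judging positions with N = 3 would return "C" on the third call and leave the counter empty for the next additional data.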

以上、実施例１によれば、処理量を抑えつつ、エコーハイディングされた音声データに対し、付加データの抽出精度を向上させることができる。 As described above, according to the first embodiment, it is possible to improve the extraction accuracy of additional data with respect to the audio data subjected to echo hiding while suppressing the processing amount.

［実施例２］
次に、実施例２における音処理装置について説明する。実施例２における音処理装置は、最初の付加データが複数のエンコードフレームに連続して付加される場合であっても、付加データを適切に抽出することができる。 [Example 2]
Next, the sound processing apparatus according to the second embodiment will be described. The sound processing apparatus according to the second embodiment can appropriately extract additional data even when the first additional data is continuously added to a plurality of encoded frames.

＜構成＞
実施例２における音処理装置の構成は、実施例１における音処理装置と同様の構成であるから、その説明を省略する。実施例２における音処理装置の構成を用いる場合は、図１に示す符号と同じ符号を用いて説明する。 <Configuration>
Since the configuration of the sound processing apparatus in the second embodiment is the same as that of the sound processing apparatus in the first embodiment, the description thereof is omitted. Where the configuration of the sound processing apparatus in the second embodiment is referred to, the same reference numerals as those shown in FIG. 1 are used.

＜判定部の機能＞
次に、実施例２における判定部１６の機能について説明する。図８は、実施例２における判定部１６の機能の一例を示すブロック図である。図８に示す判定部１６は、最初の付加データを判定するための判定部２０１、２つ目以降の付加データを判定するための判定部２０２を有する。 <Function of determination unit>
Next, the function of the determination unit 16 in the second embodiment will be described. FIG. 8 is a block diagram illustrating an example of the function of the determination unit 16 according to the second embodiment. The determination unit 16 illustrated in FIG. 8 includes a determination unit 201 for determining the first additional data and a determination unit 202 for determining the second and subsequent additional data.

判定部２０１は、連続回数カウント部２１１、付加データ第１判定部２１２を有する。連続回数カウント部２１１は、実施例１と同様に、デコード部１５からのデコード結果を取得し、同じデコード結果が続く回数をカウントする。連続回数カウント部２１１は、連続回数を付加データ第１判定部２１２及び中心算出部２２１に出力する。 The determination unit 201 includes a continuous number counting unit 211 and an additional data first determination unit 212. Similar to the first embodiment, the continuous number counting unit 211 acquires the decoding result from the decoding unit 15 and counts the number of times that the same decoding result continues. The continuous count unit 211 outputs the continuous count to the additional data first determination unit 212 and the center calculation unit 221.

付加データ第１判定部２１２は、連続回数カウント部２１１から取得した連続回数が、エンコードフレーム内のデコードフレームの数の所定倍を超えるか否かを判定する。また、付加データ第１判定部２１２は、連続回数分のデコードフレームの時間が、エンコードフレームの更新間隔の所定倍を超えたか否かを判定してもよい。 The additional data first determination unit 212 determines whether or not the continuous number obtained from the continuous number counting unit 211 exceeds a predetermined multiple of the number of decoded frames in the encoded frame. Further, the additional data first determination unit 212 may determine whether or not the decoding frame time corresponding to the continuous number of times exceeds a predetermined multiple of the encoding frame update interval.

この判定は、１つ目の付加データが連続して複数のエンコードフレームに付加されたか否かを判定するために行う。付加データ第１判定部２１２は、連続回数が所定倍を超えるとき、（連続回数／エンコードフレームに含まれるデコードフレーム数）又は（連続回数分のデコードフレームの時間／エンコードフレームの更新間隔）の整数値に１を加算した整数値Ｍを算出する。付加データ第１判定部２１２は、整数値Ｍの分だけ、デコード結果が示す付加データを出力部１７に出力する。 This determination is performed to determine whether or not the first additional data has been added continuously to a plurality of encoded frames. When the continuous count exceeds the predetermined multiple, the additional data first determination unit 212 calculates an integer value M by adding 1 to the integer part of (continuous count / number of decoded frames included in an encoded frame) or of (duration of the decoded frames for the continuous count / update interval of the encoded frames). The additional data first determination unit 212 outputs the additional data indicated by the decoding result to the output unit 17 M times.

付加データ第１判定部２１２は、連続回数が所定倍を超えないときは、実施例１と同様にして、連続回数が所定値Ｎ以上であるか否かを判定する。 When the number of consecutive times does not exceed a predetermined multiple, the additional data first determination unit 212 determines whether or not the number of consecutive times is equal to or greater than a predetermined value N as in the first embodiment.

これにより、最初に判定する付加データが複数のエンコードフレームに連続して付加されている場合でも、付加データを適切に抽出することができる。 Thereby, even when the additional data to be determined first is continuously added to a plurality of encoded frames, the additional data can be appropriately extracted.
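The decision made by the additional data first determination unit 212 can be illustrated with a short sketch. The function name `first_determination` and its parameters are hypothetical, and the default `multiple=1` is an assumption; the description only says "a predetermined multiple" without fixing its value.

```python
def first_determination(run_length, frames_per_encode, n, multiple=1):
    """Return how many times to output the first additional data (0 = not yet).

    run_length        -- continuous count of identical decoding results
    frames_per_encode -- number of decoded frames in one encoded frame
    n                 -- predetermined value N from the first embodiment
    multiple          -- the "predetermined multiple" (assumed default: 1)
    """
    if run_length > multiple * frames_per_encode:
        # The run spans more than one encoded frame:
        # M = integer part of (run_length / frames_per_encode) + 1
        return run_length // frames_per_encode + 1
    # Otherwise fall back to the first-embodiment check against N
    return 1 if run_length >= n else 0
```

For instance, a continuous count of 12 with 8 decoded frames per encoded frame gives M = 12 // 8 + 1 = 2, so the data would be output twice.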

判定部１６は、最初の付加データの判定が終わると、判定部２０２により、２つ目以降の付加データの判定を行う。よって、２つ目以降のエンコードフレームに対応するデコード部１５のデコード結果は、判定部２０２に入力される。デコード結果の入力の切替については、例えばスイッチなどを用いればよい。 When the determination of the first additional data is completed, the determination unit 16 determines the second and subsequent additional data using the determination unit 202. Therefore, the decoding result of the decoding unit 15 corresponding to the second and subsequent encoded frames is input to the determination unit 202. For switching the decoding result input, for example, a switch or the like may be used.

判定部２０２は、中心算出部２２１、第２決定部２２２、付加データ第２判定部２２３を有する。中心算出部２２１は、連続回数カウント部２１１から取得した連続回数の中心を求める。中心算出部２２１は、例えば、同じデコード結果が連続したデコードフレームの中心時刻を算出する。中心算出部２２１は、求めた中心時刻を第２決定部２２２に出力する。 The determination unit 202 includes a center calculation unit 221, a second determination unit 222, and an additional data second determination unit 223. The center calculation unit 221 determines the center of the continuous number of times acquired from the continuous number of times counting unit 211. For example, the center calculation unit 221 calculates the center time of the decoded frames in which the same decoding result is continued. The center calculation unit 221 outputs the obtained center time to the second determination unit 222.

第２決定部２２２は、中心算出部２２１から取得した中心時刻に付加データの更新間隔を加算していく。第２決定部２２２は、中心時刻が所定倍を超えたときの中心である場合、Ｍが奇数の場合、取得した中心は、エンコードフレームの更新間隔の中心付近であると判断し、中心時刻に（（Ｍ＋１）／２）倍の更新間隔を加算した時刻を算出する。 The second determination unit 222 adds the update interval of the additional data to the center time acquired from the center calculation unit 221. When the acquired center is that of a run whose continuous count exceeded the predetermined multiple and M is an odd number, the second determination unit 222 determines that the acquired center is near the center of an update interval of the encoded frames, and calculates the time obtained by adding ((M + 1) / 2) times the update interval to the center time.

また、第２決定部２２２は、Ｍが偶数の場合、取得した中心は、エンコードフレームの更新タイミング付近であると判断し、この中心に更新間隔の半分を加算して更新間隔の中心付近を求める。第２決定部２２２は、更新間隔の半分を加算した中心に、さらに（Ｍ／２）倍の更新間隔を加算した時刻を算出する。 Further, when M is an even number, the second determination unit 222 determines that the acquired center is near an update timing of the encoded frames, and adds half of the update interval to this center to obtain a point near the center of an update interval. The second determination unit 222 then calculates the time obtained by further adding (M / 2) times the update interval to that point.

よって、第２決定部２２２は、整数値Ｍが偶数の場合も奇数の場合も、取得した中心に対し、（（Ｍ＋１）／２）倍の更新間隔を加算した時刻を算出する。第２決定部２２２は、一度（（Ｍ＋１）／２）倍の更新間隔を加算すると、次からは１倍の更新間隔を加算していく。 Therefore, the second determination unit 222 calculates a time obtained by adding an update interval of ((M + 1) / 2) times to the acquired center regardless of whether the integer value M is an even number or an odd number. The second determination unit 222 once adds ((M + 1) / 2) times update intervals, and then adds 1 time update intervals.
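As a quick check of the even-M case, the two additions described above collapse into the same factor. The symbols are assumptions for this note: T_0 denotes the acquired center time, t_1 the update interval, and T_1 the resulting time.

```latex
\begin{align*}
T_1 &= \left(T_0 + \frac{t_1}{2}\right) + \frac{M}{2}\,t_1
     = T_0 + \frac{M+1}{2}\,t_1 .
\end{align*}
```

With M = 2, for example, this gives T_1 = T_0 + 1.5 t_1, which moves a center lying near an encoded-frame boundary to roughly the middle of the encoded frame after the one that starts at that boundary.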

第２決定部２２２は、加算した後の時刻を中心とした所定数のフレームを、以降の付加データの判定位置であると決定する。この所定数は、例えばＮとするが、Ｎに限らず適切な値が設定されればよい。第２決定部２２２は、決定した判定位置に含まれる所定数のフレームを、付加データ第２判定部２２３に通知する。 The second determination unit 222 determines a predetermined number of frames centered on the time after the addition as determination positions for subsequent additional data. The predetermined number is N, for example, but is not limited to N, and an appropriate value may be set. The second determination unit 222 notifies the additional data second determination unit 223 of the predetermined number of frames included in the determined determination position.

付加データ第２判定部２２３は、第２決定部２２２により決定された判定位置のフレームを用いて、付加データを判定する。付加データ第２判定部２２２は、所定数のフレームのデコード結果で最も多く抽出された付加データを判定し、最も多く抽出された付加データを前記出力部１７に出力する。 The additional data second determination unit 223 determines additional data using the frames at the determination position determined by the second determination unit 222. The additional data second determination unit 223 determines the additional data extracted most often among the decoding results of the predetermined number of frames, and outputs it to the output unit 17.

付加データ第２判定部２２３は、２つ目以降の付加データの判定については、第２決定部２２２により決定される判定位置に含まれるフレームを用いる。 The additional data second determination unit 223 uses a frame included in the determination position determined by the second determination unit 222 for determining the second and subsequent additional data.

これにより、１つ目の付加データが複数のエンコードフレームに連続して付加されている場合でも、２つ目以降の付加データの判定処理を、実施例１と同様に処理量を増やさずに適切に行うことができる。 As a result, even when the first additional data is continuously added to a plurality of encoded frames, the determination of the second and subsequent additional data can be performed appropriately without increasing the processing amount, as in the first embodiment.

＜付加データ判定処理の例＞
次に、実施例２における付加データの判定処理の例について説明する。図９は、実施例２における付加データの判定処理を説明するための図である。図９に示す付加データＥｎ２１〜２４は、各付加データＢ、Ｄ、Ｅがエンコードされたエンコードフレームを表す。図９に示す例では、付加データＢからデコードするとする。 <Example of additional data determination processing>
Next, an example of the additional data determination process in the second embodiment will be described. FIG. 9 is a diagram for explaining additional data determination processing according to the second embodiment. The additional data En21 to 24 shown in FIG. 9 represents an encoded frame in which the additional data B, D, and E are encoded. In the example shown in FIG. 9, it is assumed that decoding is performed from the additional data B.

図９に示すＤｅｎ（ｎ＝２０１〜）は、デコードフレームを表す。図９に示す例では、デコード部１５によるデコード結果が、Ｄｅ２０１は「Ａ」、Ｄｅ２０２〜２１４は「Ｂ」とする。連続回数カウント部２１１は、Ｄｅ２０２〜２１４でデコード結果が「Ｂ」で連続するので、連続回数を１２とする。このとき、１つのエンコードフレームに含まれるデコードフレームの数を８とすると、連続回数は、この「８」の数を超えている。 Den (n = 201 onward) shown in FIG. 9 represents a decoded frame. In the example shown in FIG. 9, the decoding result of the decoding unit 15 is “A” for De201 and “B” for De202 to De214. Since the decoding result “B” continues over De202 to De214, the continuous number counting unit 211 sets the continuous count to 12. If the number of decoded frames included in one encoded frame is 8, the continuous count exceeds this number, 8.

そこで、付加データ第１判定部２１２は、（１２／８）の整数値（商）１に１を加算して整数値Ｍ＝２を算出する。付加データ第１判定部２１２は、連続回数８のデコード結果が示す付加データ「Ｂ」をＭ＝２回分、出力部１７に出力する。 Therefore, the additional data first determination unit 212 calculates the integer value M = 2 by adding 1 to the integer value (quotient) 1 of (12/8). The additional data first determination unit 212 outputs the additional data “B” indicated by the decoding result of 8 consecutive times to the output unit 17 for M = 2 times.

次に、中心算出部２２１は、連続回数の中心Ｔ０を求める。図９に示す中心Ｔ０は、Ｍ＝２であるため、エンコードフレームＥｎ２１、Ｅｎ２２の更新タイミング付近になる。 Next, the center calculation unit 221 obtains the center T0 of the continuous run. Since M = 2, the center T0 shown in FIG. 9 is near the update timing between the encoded frames En21 and En22.

第２決定部２２２は、まず、中心時刻Ｔ０に付加データの更新間隔の半分（ｔ１／２）を加算し、更新間隔の中心付近を求める。第２決定部２２２は、次に、この（Ｔ０＋（ｔ１／２））の時刻に、（Ｍ／２）×ｔ１を加算する。つまり、次のフレーム判定位置の中心時刻Ｔ１は、次の式を用いて算出される。
Ｔ１＝Ｔ０＋（（Ｍ＋１）／２）×ｔ１
第２決定部２２２は、更新間隔Ｔ１を含むフレームを中心に所定数Ｎのフレームを決定する。 First, the second determination unit 222 adds half (t1 / 2) of the update interval of the additional data to the center time T0 to obtain the vicinity of the center of the update interval. Next, the second determination unit 222 adds (M / 2) × t1 to the time (T0 + (t1 / 2)). That is, the center time T1 of the next frame determination position is calculated using the following equation.
T1 = T0 + ((M + 1) / 2) × t1
The second determination unit 222 determines a predetermined number N of frames centered on the frame that includes the time T1.

付加データ第２判定部２２３は、時刻Ｔ１を含む所定数Ｎのフレームのデコード結果で、最も多く抽出された付加データを判定する。ここでは、付加データ「Ｄ」が３つなので、付加データ「Ｄ」が出力部１７に出力される。 The additional data second determination unit 223 determines the additional data extracted most by the decoding result of the predetermined number N of frames including the time T1. Here, since the additional data “D” is three, the additional data “D” is output to the output unit 17.

以降の付加データ判定については、第２決定部２２２は、時刻Ｔ１に更新間隔ｔ１を加算し、次のフレーム判定位置の中心時刻Ｔ２を算出する。付加データ第２判定部２２３は、時刻Ｔ２を含む所定数Ｎのフレームのデコード結果で、最も多く抽出された付加データを判定する。ここでは、付加データ「Ｅ」が出力部１７に出力される。 For the subsequent additional data determination, the second determination unit 222 adds the update interval t1 to the time T1, and calculates the center time T2 of the next frame determination position. The additional data second determination unit 223 determines the additional data extracted most by the decoding result of the predetermined number N of frames including the time T2. Here, the additional data “E” is output to the output unit 17.

なお、各付加データの判定に用いる所定フレームの中心時刻は、中心算出部２２１が算出し、第２決定部２２２は、中心算出部２２１から取得した中心時刻を含むフレームを中心とする所定数のフレームを決定するようにしてもよい。 The center time of a predetermined frame used for determining each additional data is calculated by the center calculation unit 221, and the second determination unit 222 has a predetermined number of frames centered on the frame including the center time acquired from the center calculation unit 221. A frame may be determined.

これにより、最初の付加データが複数のエンコードフレームに連続して付加された場合でも、付加データの判定に用いる所定数のフレームを適切に決定することができる。 Thereby, even when the first additional data is continuously added to a plurality of encoded frames, the predetermined number of frames used for determination of the additional data can be appropriately determined.
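The way the determination positions follow from T0, M, and t1 can be sketched as below. This is a sketch under stated assumptions rather than the patented implementation: the function name `judging_positions` is hypothetical, decoded frames are assumed to have a uniform duration `frame_len`, and frame k is assumed to cover the time span from k * frame_len up to (k + 1) * frame_len.

```python
def judging_positions(t0, m, t1, frame_len, n, count):
    """Return `count` windows of N decoded-frame indices for the 2nd, 3rd, ... data.

    t0        -- center time T0 of the first run of identical results
    m         -- integer value M computed for the first additional data
    t1        -- update interval of the additional data
    frame_len -- duration of one decoded frame (assumed uniform)
    n         -- predetermined number N of frames per window
    """
    windows = []
    center = t0 + (m + 1) / 2 * t1        # T1 = T0 + ((M + 1) / 2) * t1
    for _ in range(count):
        mid = int(center // frame_len)    # frame that contains the center time
        first = mid - (n - 1) // 2        # N frames centered on that frame
        windows.append(list(range(first, first + n)))
        center += t1                      # T2 = T1 + t1, and so on
    return windows
```

With T0 = 6.0, M = 2, t1 = 8.0, unit-length frames, and N = 3, the first window is centered on the frame containing time 18.0 and the second on the frame containing time 26.0.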

＜動作＞
次に、実施例２における音処理装置の動作について説明する。図１０は、実施例２における音処理装置の処理の一例を示すフローチャートである。図１０に示すステップＳ２０１〜Ｓ２０６の処理は、図７に示すステップＳ１０１〜Ｓ１０６の処理と同様であるため、その説明を省略する。 <Operation>
Next, the operation of the sound processing apparatus according to the second embodiment will be described. FIG. 10 is a flowchart illustrating an example of processing performed by the sound processing apparatus according to the second embodiment. The processing in steps S201 to S206 shown in FIG. 10 is the same as the processing in steps S101 to S106 shown in FIG. 7, and the description thereof is omitted.

ステップＳ２０７で、付加データ第１判定部２１２は、連続回数分のデコードフレームの時間を、エンコードフレームの更新間隔の整数値Ｍ倍に切り上げる。 In step S207, the additional data first determination unit 212 rounds up the decoding frame time corresponding to the continuous number of times to an integer value M times the encoding frame update interval.

ステップＳ２０８で、出力部１７は、付加データ第１判定部２１２が判定したＭ個の付加データを出力する。 In step S208, the output unit 17 outputs the M additional data determined by the additional data first determination unit 212.

ステップＳ２０９で、中心算出部２２１及び第２決定部２２２は、次のデコードフレームの判定位置を決定する。２つ目の付加データの判定位置は、中心時刻Ｔ０＋（（Ｍ＋１）／２）×更新間隔ｔ１を中心とする所定数のフレームである。 In step S209, the center calculation unit 221 and the second determination unit 222 determine the determination position of the next decoded frame. The determination position of the second additional data is a predetermined number of frames centered at the central time T0 + ((M + 1) / 2) × update interval t1.

ステップＳ２１０〜Ｓ２１８の処理は、図７に示すステップＳ１０９〜Ｓ１１７の処理と同様であるため、その説明を省略する。 The processing of steps S210 to S218 is the same as the processing of steps S109 to S117 shown in FIG. 7, and the description thereof is omitted.

ステップＳ２１８の処理後、ステップＳ２０８に進み、出力部１７は、多数決の結果である付加データを出力する。 After the process of step S218, the process proceeds to step S208, and the output unit 17 outputs additional data that is the result of the majority decision.

以降は、ステップＳ２０８〜Ｓ２１８の処理を繰り返し、判定された付加データを順に出力していく。なお、ステップＳ２０９で、３つ目以降の付加データを判定する際は、２つ目の付加データに対する中心時刻Ｔ１に更新間隔ｔ１を加算していけばよい。 Thereafter, the processing of steps S208 to S218 is repeated, and the determined additional data is output in order. When determining the third and subsequent additional data in step S209, the update interval t1 may be added to the center time T1 for the second additional data.

以上、実施例２によれば、処理量を抑えつつ、エコーハイディングされた音声データに対し、付加データの抽出精度を向上させることができる。また、実施例２によれば、最初の付加データが複数のエンコードフレームに連続して付加される場合であっても、付加データを適切に抽出することができる。 As described above, according to the second embodiment, it is possible to improve the extraction accuracy of the additional data with respect to the audio data subjected to echo hiding while suppressing the processing amount. Further, according to the second embodiment, even when the first additional data is continuously added to a plurality of encoded frames, the additional data can be appropriately extracted.

［実施例３］
次に、実施例３における音処理装置について説明する。実施例３における音処理装置は、２つ目以降の付加データを判定するために用いる中心時刻を修正することができる。 [Example 3]
Next, a sound processing apparatus according to the third embodiment will be described. The sound processing apparatus according to the third embodiment can correct the central time used for determining the second and subsequent additional data.

＜構成＞
実施例３における音処理装置の構成は、実施例１における音処理装置と同様の構成であるから、その説明を省略する。実施例３における音処理装置の構成を用いる場合は、図１に示す符号と同じ符号を用いて説明する。 <Configuration>
Since the configuration of the sound processing apparatus in the third embodiment is the same as that of the sound processing apparatus in the first embodiment, the description thereof is omitted. Where the configuration of the sound processing apparatus in the third embodiment is referred to, the same reference numerals as those shown in FIG. 1 are used.

＜判定部の機能＞
次に、実施例３における判定部１６の機能について説明する。図１１は、実施例３における判定部１６の機能の一例を示すブロック図である。図１１に示す判定部１６は、最初の付加データを判定するための判定部３０１、２つ目以降の付加データを判定するための判定部３０２を有する。 <Function of determination unit>
Next, the function of the determination unit 16 in the third embodiment will be described. FIG. 11 is a block diagram illustrating an example of the function of the determination unit 16 according to the third embodiment. The determination unit 16 illustrated in FIG. 11 includes a determination unit 301 for determining the first additional data, and a determination unit 302 for determining the second and subsequent additional data.

判定部３０１は、連続回数カウント部３１１、付加データ第１判定部３１２を有する。連続回数カウント部３１１、付加データ第１判定部３１２は、実施例１に記載の連続回数カウント部１１１、付加データ第１判定部１１２と同様であるため、その説明を省略する。 The determination unit 301 includes a continuous number counting unit 311 and an additional data first determination unit 312. Since the continuous number counting unit 311 and the additional data first determination unit 312 are the same as the continuous number counting unit 111 and the additional data first determination unit 112 described in the first embodiment, the description thereof is omitted.

判定部３０２は、中心算出部３２１、第３決定部３２２、付加データ第３判定部３２３を有する。中心算出部３２１は、実施例１に記載の中心算出部１２１と同様であるため、その説明を省略する。 The determination unit 302 includes a center calculation unit 321, a third determination unit 322, and an additional data third determination unit 323. The center calculation unit 321 is the same as the center calculation unit 121 described in the first embodiment, and a description thereof will be omitted.

第３決定部３２２は、中心算出部３２１から取得した中心時刻に付加データの更新間隔を加算していく。第３決定部３２２は、加算した後の時刻を中心として所定数のフレームを、以降の付加データの判定位置であると決定する。この所定数は、例えばＮとするが、Ｎに限らず適切な値が設定されればよい。 The third determination unit 322 adds the update interval of the additional data to the center time acquired from the center calculation unit 321. The third determination unit 322 determines a predetermined number of frames centering on the time after the addition as determination positions for subsequent additional data. The predetermined number is N, for example, but is not limited to N, and an appropriate value may be set.

第３決定部３２２は、決定した所定数のフレームを第１群とし、この第１群を、１つのデコードフレーム分、前にずらしたデコードフレームの群を第２群とし、第１群を、１つのデコードフレーム分、後にずらしたデコードフレームの群を第３群とする。 The third determination unit 322 sets the determined predetermined number of frames as a first group, sets the group of decoded frames obtained by shifting the first group one decoded frame earlier as a second group, and sets the group of decoded frames obtained by shifting the first group one decoded frame later as a third group.

第３決定部３２２は、決定した所定数のフレームの３つの群を、付加データ第３判定部３２３に通知する。なお、群の数は３つに限らない。 The third determination unit 322 notifies the additional data third determination unit 323 of the determined three groups of the predetermined number of frames. Note that the number of groups is not limited to three.

付加データ第３判定部３２３は、第３決定部３２２により決定された３つの郡の判定位置のフレームを用いて、付加データを判定する。付加データ第３判定部３２３は、それぞれの群で、所定数のフレームのデコード結果で最も多く抽出された付加データを判定する。 The additional data third determination unit 323 determines additional data using the frames at the determination positions of the three groups determined by the third determination unit 322. For each group, the additional data third determination unit 323 determines the additional data extracted most often among the decoding results of the predetermined number of frames.

付加データ第３判定部３２３は、この３つの群で、同じ付加データの抽出数で多数決をとる。付加データ第３判定部３２３は、抽出数が一番多い付加データを出力部１７に出力する。付加データ第３判定部３２３は、同じ付加データの抽出数が一番多い群の中に、第１群が含まれていれば、第１群の中心時刻を基準として以降の付加データの判定位置を決定させる。 The additional data third determination unit 323 takes a majority decision across these three groups based on the number of times the same additional data was extracted. The additional data third determination unit 323 outputs the additional data with the largest extraction count to the output unit 17. If the first group is among the groups with the largest extraction count of the same additional data, the additional data third determination unit 323 has the subsequent determination positions determined with the center time of the first group as the reference.

付加データ第３判定部３２３は、同じ付加データの抽出数が一番多い群の中に、第１群が含まれない場合、抽出数が一番多い郡の中心時刻を以降の付加データの判定位置の基準とするよう、第３決定部３２２に通知する。なお、付加データ第３判定部３２３は、第２群と第３群の両方で抽出数が一番多い場合、予め設定しておいたいずれかの群の中心時刻を用いればよい。 If the first group is not among the groups with the largest extraction count of the same additional data, the additional data third determination unit 323 notifies the third determination unit 322 to use the center time of the group with the largest extraction count as the reference for subsequent determination positions. When the second group and the third group both have the largest extraction count, the additional data third determination unit 323 may use the center time of whichever of these groups has been set in advance.

これにより、付加データの判定に用いる中心時刻を修正することができる。この修正は、実際のエンコードフレームの中心時刻と、判定された中心時刻との間に少しのズレがあるので、このズレが蓄積されるのを防ぐのに役立つ。 Thereby, the center time used for determination of additional data can be corrected. This correction is useful for preventing this deviation from accumulating because there is a slight deviation between the center time of the actual encoding frame and the determined center time.

＜付加データ判定処理の例＞
次に、実施例３における付加データの判定処理の例について説明する。図１２は、実施例３における付加データの判定処理を説明するための図である。図１２に示す付加データＥｎ３１〜３４は、各付加データＢ〜Ｅがエンコードされたエンコードフレームを表す。図１２に示す例では、付加データＢからデコードするとする。 <Example of additional data determination processing>
Next, an example of the additional data determination process in the third embodiment will be described. FIG. 12 is a diagram for explaining the additional data determination process according to the third embodiment. Additional data En31 to 34 shown in FIG. 12 represents an encoded frame in which each additional data B to E is encoded. In the example shown in FIG. 12, it is assumed that the additional data B is decoded.

図１２に示すＤｅｎ（ｎ＝３０１〜）は、デコードフレームを表す。図１２に示す例では、デコード部１５によるデコード結果が、例えばＤｅ３０１は「Ａ」、Ｄｅ３０２〜３０７は「Ｂ」とする。連続回数カウント部３１１は、Ｄｅ３０２〜３０７でデコード結果が「Ｂ」で連続するので、連続回数を６とする。ここで、所定値Ｎを例えば３とする。 Den (n = 301 onward) shown in FIG. 12 represents a decoded frame. In the example illustrated in FIG. 12, the decoding result of the decoding unit 15 is, for example, “A” for De301 and “B” for De302 to De307. Since the decoding result “B” continues over De302 to De307, the continuous number counting unit 311 sets the continuous count to 6. Here, the predetermined value N is set to 3, for example.

付加データ第１判定部３１２は、連続回数が所定値Ｎ以上であるため、このデコード結果「Ｂ」を出力部１７に出力する。 The additional data first determination unit 312 outputs the decoding result “B” to the output unit 17 because the number of consecutive times is equal to or greater than the predetermined value N.

次に、中心算出部３２１は、連続回数６の中心Ｔ０を求める。中心Ｔ０は、例えば、デコードフレームＤｅ３０２〜３０７の中心時刻である。 Next, the center calculation unit 321 obtains the center T0 of the run of 6 consecutive results. The center T0 is, for example, the center time of the decoded frames De302 to De307.

第３決定部３２２は、中心時刻Ｔ０に付加データの更新間隔ｔ１を加算し、次のフレーム判定位置の中心時刻Ｔ１を算出する。第３決定部３２２は、更新間隔Ｔ１を含むフレームを中心に所定数Ｎのフレームを第１群として決定する。 The third determination unit 322 adds the update interval t1 of the additional data to the center time T0, and calculates the center time T1 of the next frame determination position. The third determination unit 322 determines, as the first group, a predetermined number N of frames centered on the frame that includes the time T1.

第３決定部３２２は、第１群から１デコードフレーム前にずらした第２群、第１群から１デコードフレーム後にずらした第３群を決定する。例えば、図１２に示す例では、Ｄｅ３１０〜３１２が第１群、Ｄｅ３０９〜３１１が第２群、Ｄｅ３１１〜３１３が第３群となる。 The third determination unit 322 determines the second group, shifted one decoded frame earlier than the first group, and the third group, shifted one decoded frame later than the first group. For example, in the example shown in FIG. 12, De310 to De312 form the first group, De309 to De311 the second group, and De311 to De313 the third group.

付加データ第３判定部３２３は、群毎に、所定数Ｎのフレームのデコード結果で、多数決をとる。例えば、付加データ第３判定部３２３は、同じ付加データの抽出数を用いて判定する。ここでは、いずれの群も付加データ「Ｃ」が３つずつ抽出されるので、付加データ「Ｃ」が出力部１７に出力される。 For each group, the additional data third determination unit 323 takes a majority vote over the decoding results of the predetermined number N of frames. For example, the additional data third determination unit 323 makes the determination using the number of times the same additional data was extracted. Here, since the additional data “C” is extracted three times in every group, the additional data “C” is output to the output unit 17.

以降の付加データ判定については、第３決定部３２２は、時刻Ｔ１に更新間隔ｔ１を加算し、次のフレーム判定位置の中心時刻Ｔ２を算出する。第３決定部３２２は、中心時刻Ｔ２に基づき、第１群〜第３群のフレームを決定する。 For the subsequent additional data determination, the third determination unit 322 adds the update interval t1 to the time T1, and calculates the center time T2 of the next frame determination position. The third determination unit 322 determines the frames of the first group to the third group based on the central time T2.

付加データ第３判定部３２３は、第３群で、付加データ「Ｄ」が３つ抽出されるので、付加データ「Ｄ」を判定し、出力部１７に出力する。ここでは、第３群の付加データ「Ｄ」の抽出数「３」が、第１群の付加データ「Ｄ」の抽出数「２」、第２群の付加データ「Ｃ」の抽出数「２」よりも大きい。 Since the additional data “D” is extracted three times in the third group, the additional data third determination unit 323 determines the additional data “D” and outputs it to the output unit 17. Here, the extraction count “3” of the additional data “D” in the third group is larger than the extraction count “2” of the additional data “D” in the first group and the extraction count “2” of the additional data “C” in the second group.

よって、付加データ第３判定部３２３は、第３群の所定数のデコードフレームの中心を、中心時刻Ｔ２’とし、第３決定部３２２に通知する。第３決定部３２２は、中心時刻Ｔ２’を取得すると、このＴ２’に更新間隔ｔ１を加算し、次のフレーム判定位置の中心時刻Ｔ３を算出する。 Therefore, the additional data third determination unit 323 notifies the third determination unit 322 of the center of the predetermined number of decode frames in the third group as the center time T2 '. Upon obtaining the center time T2 ', the third determination unit 322 adds the update interval t1 to T2' and calculates the center time T3 of the next frame determination position.

なお、各付加データの判定に用いる所定フレームの中心時刻は、中心算出部３２１が算出し、第３決定部３２２は、中心算出部３２１から取得した中心時刻に基づき、各群の所定数のフレームを決定するようにしてもよい。 The center calculation unit 321 calculates the center time of a predetermined frame used for determining each additional data, and the third determination unit 322 determines the predetermined number of frames of each group based on the center time acquired from the center calculation unit 321. May be determined.

これにより、２つ目以降の付加データの判定において、中心時刻を修正することができ、より適切な付加データを抽出することができる。 Thereby, in the determination of the second and subsequent additional data, the center time can be corrected, and more appropriate additional data can be extracted.
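The three-group vote and the resulting correction of the center can be sketched as follows. This is a hedged illustration only: the function name `judge_with_correction`, the representation of decoding results as a list indexed by decoded frame, and the `preferred_shift` parameter (standing in for the group "set in advance" for a tie between the second and third groups) are all assumptions introduced here.

```python
from collections import Counter


def judge_with_correction(results, mid, n, preferred_shift=-1):
    """Vote in three N-frame windows around frame `mid`; return (data, shift).

    shift is 0 for the first group, -1 for the second (one frame earlier),
    +1 for the third (one frame later). The caller is assumed to keep every
    window inside the bounds of `results`.
    """
    half = (n - 1) // 2
    best = {}
    for shift in (0, -1, +1):
        first = mid + shift - half
        window = results[first:first + n]
        # per-group majority: most frequent data and its extraction count
        best[shift] = Counter(window).most_common(1)[0]
    top = max(hits for _, hits in best.values())
    winners = [s for s, (_, hits) in best.items() if hits == top]
    if 0 in winners:
        shift = 0                  # keep the current center
    elif len(winners) == 1:
        shift = winners[0]         # move the center to the winning group
    else:
        shift = preferred_shift    # second and third tie: preset choice
    return best[shift][0], shift
```

In a situation like the T2 step described above (first group two "D", second group two "C", third group three "D"), the sketch returns "D" with a shift of +1, and the next center would then be computed from the shifted center plus the update interval t1.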

＜動作＞
次に、実施例３における音処理装置の動作について説明する。図１３は、実施例３における音処理装置の処理の一例を示すフローチャートである。図１３に示すステップＳ３０１〜Ｓ３１４の処理は、図７に示すステップＳ１０１〜Ｓ１１４の処理と同様であるため、その説明を省略する。 <Operation>
Next, the operation of the sound processing apparatus according to the third embodiment will be described. FIG. 13 is a flowchart illustrating an example of processing performed by the sound processing apparatus according to the third embodiment. The processes in steps S301 to S314 shown in FIG. 13 are the same as the processes in steps S101 to S114 shown in FIG. 7, and the description thereof is omitted.

ステップＳ３１５で、付加データ第３判定部３２３は、判定対象のカウント回数がＮ＋２回であるか否かを判定する。カウント回数がＮ＋２回であれば（ステップＳ３１５−ＹＥＳ）ステップＳ３１６に進み、カウント回数がＮ＋２回でなければ（ステップＳ３１５−ＮＯ）ステップＳ３０９に戻り、次の音声データを受信する。 In step S315, the additional data third determination unit 323 determines whether or not the number of counts to be determined is N + 2. If the number of counts is N + 2 (step S315—YES), the process proceeds to step S316, and if the number of counts is not N + 2 (step S315—NO), the process returns to step S309 to receive the next audio data.

ステップＳ３１６で、付加データ第３判定部３２３は、第１群のＮ個のデコード結果の多数決をとり、最も多く抽出された付加データを判定する。 In step S316, the additional data third determination unit 323 takes a majority vote over the N decoding results of the first group and determines the additional data that was extracted most often.

ステップＳ３１７で、付加データ第３判定部３２３は、第２群のＮ個のデコード結果の多数決をとり、最も多く抽出された付加データを判定する。 In step S317, the additional data third determination unit 323 takes a majority vote over the N decoding results of the second group and determines the additional data that was extracted most often.

ステップＳ３１８で、付加データ第３判定部３２３は、第３群のＮ個のデコード結果の多数決をとり、最も多く抽出された付加データを判定する。 In step S318, the additional data third determination unit 323 takes a majority vote over the N decoding results of the third group and determines the additional data that was extracted most often.

ステップＳ３１９で、付加データ第３判定部３２３は、同じ付加データの抽出数が最も多い群を判定する。付加データ第３判定部３２３は、判定した群の最も多く抽出された付加データを出力部１７に出力する。 In step S319, the additional data third determination unit 323 determines the group having the largest number of extractions of the same additional data. The additional data third determination unit 323 outputs the additional data extracted most in the determined group to the output unit 17.

ステップＳ３２０で、付加データ第３判定部３２３は、カウントしていた回数をリセットする。 In step S320, the additional data third determination unit 323 resets the counted number.

ステップＳ３２０の処理後、ステップＳ３０７に進み、出力部１７は、多数決の結果である付加データを出力する。 After the process of step S320, the process proceeds to step S307, and the output unit 17 outputs additional data that is the result of the majority decision.

以降は、ステップＳ３０７〜Ｓ３２０の処理を繰り返し、判定された付加データを順に出力していく。なお、ステップＳ３０８で、３つ目以降の付加データを判定する際は、ステップＳ３１９で決定された群の中心時刻に対して更新間隔ｔ１を加算していけばよい。 Thereafter, the processes in steps S307 to S320 are repeated, and the determined additional data is output in order. When determining the third and subsequent additional data in step S308, the update interval t1 may be added to the central time of the group determined in step S319.

以上、実施例３によれば、処理量を抑えつつ、エコーハイディングされた音声データに対し、付加データの抽出精度を向上させることができる。また、２つ目以降の付加データの判定に用いる中心時刻を修正することができ、付加データの抽出精度をさらに向上させることができる。 As described above, according to the third embodiment, it is possible to improve the extraction accuracy of the additional data with respect to the audio data subjected to echo hiding while suppressing the processing amount. In addition, the central time used for determining the second and subsequent additional data can be corrected, and the additional data extraction accuracy can be further improved.

[Example 4]
Next, the sound processing device 4 according to the fourth embodiment will be described. The sound processing device 4 saves power by acquiring only the audio data that is used for determining additional data.

<Configuration>
FIG. 14 is a block diagram illustrating an example of the configuration of the sound processing device 4 according to the fourth embodiment. Components in FIG. 14 that are the same as those in FIG. 1 are denoted by the same reference numerals, and their description is omitted.

When determination of additional data is finished, the determination unit 21 notifies the power supply control unit 22 to cut power. The determination unit 21 also sets, in the timer unit 23, the time remaining until reading of the decode frames used for determining the next additional data is to start.

On being notified of the power-off by the determination unit 21, the power supply control unit 22 cuts power to the microphone unit 11, the audio data count unit 12, the audio data storage unit 13, the calculation unit 14, and the decoding unit 15.

On receiving a power supply instruction from the timer unit 23, the power supply control unit 22 supplies power to the microphone unit 11, the audio data count unit 12, the audio data storage unit 13, the calculation unit 14, and the decoding unit 15. The power supply control unit 22 thus has a control function that keeps these units from operating, including from reading audio data.

The timer unit 23 has a built-in clock and, when the time set by the determination unit 21 has elapsed, instructs the power supply control unit 22 to supply power.

<Function of the determination unit>
Apart from the functions described above, the determination unit 21 of the fourth embodiment may have the functions of any of the first to third embodiments. Using the method of any of the first to third embodiments, the determination unit 21 calculates the time until reading of the decode frames used for determining the next additional data is to start, and notifies the timer unit 23 of this time.

As a result, only the decode frames used for determining additional data need to be processed, so power consumption can be reduced.
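The interaction among the determination unit 21, the power supply control unit 22, and the timer unit 23 can be modeled roughly as below. This is a sketch under stated assumptions: the class and method names are hypothetical, time is simulated by explicit `tick` calls instead of a hardware clock, and the unit names are plain strings.

```python
class PowerController:
    """Stands in for the power supply control unit 22 (hypothetical model)."""
    def __init__(self, units):
        self.powered = {unit: True for unit in units}

    def power_off(self):
        for unit in self.powered:
            self.powered[unit] = False

    def power_on(self):
        for unit in self.powered:
            self.powered[unit] = True

class Timer:
    """Stands in for the timer unit 23; time advances only through tick()."""
    def __init__(self, controller):
        self.controller = controller
        self.remaining = None

    def set(self, delay):
        self.remaining = delay

    def tick(self, dt):
        if self.remaining is None:
            return
        self.remaining -= dt
        if self.remaining <= 0:
            self.remaining = None
            self.controller.power_on()  # instruct power supply to resume

def determination_done(timer, controller, time_to_next_read):
    """What the determination unit 21 does after each determination."""
    timer.set(time_to_next_read)
    controller.power_off()

units = ["microphone", "audio_count", "audio_storage", "calculation", "decoding"]
ctrl = PowerController(units)
timer = Timer(ctrl)
determination_done(timer, ctrl, 1.5)
off_after_notify = not any(ctrl.powered.values())
timer.tick(1.0)
still_off = not any(ctrl.powered.values())
timer.tick(0.5)
on_after_timeout = all(ctrl.powered.values())
```

Note that the timer and the power controller themselves stay powered in this model; only the five signal-path units are switched, as in the description above.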

<Example of additional data determination processing>
Next, an example of additional data determination processing according to the fourth embodiment will be described. FIG. 15 is a diagram for explaining the additional data determination processing according to the fourth embodiment. In the example of FIG. 15, the determination processing of the determination unit 21 is described using the reference numerals of FIG. 5 described in the first embodiment.

En41 to En44 shown in FIG. 15 represent the encode frames in which additional data B to E, respectively, are encoded. In the example of FIG. 15, decoding is assumed to start from additional data B.

De n (n = 401, 402, ...) shown in FIG. 15 represents a decode frame. In the example of FIG. 15, the decoding unit 15 yields "A" for De401 and "B" for De402 to De407. Because the decoding result "B" continues over De402 to De407, the continuous number counting unit 111 sets the consecutive count to 6. Here, the predetermined value N is assumed to be 3, for example.
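The consecutive-count check in this example can be sketched as follows. The function name `qualifying_runs` and the list-of-symbols input are assumptions made for illustration; the per-frame symbols are taken as already decoded from the cepstrum results, and `None` marks a frame from which nothing was extracted.

```python
from itertools import groupby

def qualifying_runs(symbols, n_min):
    """Return (symbol, first_index, last_index) for every maximal run of
    identical decoded symbols whose length is at least n_min."""
    runs, start = [], 0
    for symbol, group in groupby(symbols):
        length = len(list(group))
        if symbol is not None and length >= n_min:
            runs.append((symbol, start, start + length - 1))
        start += length
    return runs

# Index 0 corresponds to De401, indices 1 to 6 to De402 to De407.
decoded = ["A"] + ["B"] * 6
runs = qualifying_runs(decoded, n_min=3)
```

With N = 3, the single "A" frame is discarded and the six-frame run of "B" qualifies, which corresponds to the consecutive count of 6 in the example.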

Next, the center calculation unit 121 obtains the center T0 of the six consecutive frames. The center T0 is, for example, the center time of decode frames De402 to De407.

The first decision unit 122 adds the update interval t1 of the additional data to the center time T0 to calculate the center time T1 of the next frame determination position. The first decision unit 122 then selects the predetermined number N of frames centered on the frame that contains the center time T1. When determination processing for decode frame De408 is finished, the determination unit 21 notifies the power supply control unit 22 to cut power, and sets in the timer unit 23 the time remaining until reading of decode frame De409 is to start. The determination unit 21 repeats the same processing thereafter, notifying the power supply control unit 22 to cut power and setting the time in the timer unit 23.

As a result, when the next additional data is determined, audio data is not processed for any decode frames other than De409 to De411, which saves power.
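The arithmetic behind the window selection and the timer setting can be worked through in a short sketch. Everything numeric here is hypothetical: the frames are assumed to be contiguous, equal-length, and indexed from 0, and the frame length, update interval, and current time are made-up values that do not reproduce the De numbering of FIG. 15.

```python
import math

def next_window(run_first, run_last, frame_len, update_interval, n_frames):
    """Compute T0, T1 and the indices of the N frames centered on the
    frame containing T1 (n_frames is assumed to be odd)."""
    # T0: midpoint between the start of the first and the end of the last frame.
    t0 = (run_first * frame_len + (run_last + 1) * frame_len) / 2.0
    t1 = t0 + update_interval
    center_idx = math.floor(t1 / frame_len)
    first = center_idx - n_frames // 2
    return t0, t1, list(range(first, first + n_frames))

def sleep_time(now, window, frame_len):
    """Time to set in the timer: from now until the first frame of the window."""
    return max(0.0, window[0] * frame_len - now)

# A run over frame indices 1..6, 1-second frames, 8-second update interval, N = 3.
t0, t1, window = next_window(1, 6, frame_len=1.0, update_interval=8.0, n_frames=3)
delay = sleep_time(now=8.0, window=window, frame_len=1.0)
```

With these values T0 is 4.0 s, T1 is 12.0 s, the selected frames are indices 11 to 13, and the units can remain powered off for 3.0 s before reading resumes.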

<Operation>
Next, the operation of the sound processing device 4 according to the fourth embodiment will be described. FIG. 16 is a flowchart illustrating an example of the processing of the sound processing device 4 according to the fourth embodiment. The processing shown in FIG. 16 is the processing performed when the determination processing of the first embodiment is used.

Steps S401 to S408 shown in FIG. 16 are the same as steps S101 to S108 shown in FIG. 7, so their description is omitted.

In step S409, the determination unit 21 calculates the time from the current time until reading of the next decode frame is to start, and sets it in the timer unit 23.

In step S410, the determination unit 21 notifies the power supply control unit 22 to cut power. This cuts power to the units that do not need it.

In step S411, the timer unit 23 determines whether the set time has elapsed. If the set time has elapsed (step S411: YES), the process proceeds to step S412; if not (step S411: NO), the process returns to step S411.

In step S412, the power supply control unit 22 starts supplying power to the units whose power was cut. The additional data determination processing then resumes.

Steps S413 to S420 are the same as steps S109 to S117 shown in FIG. 7, so their description is omitted.

As described above, the fourth embodiment improves the accuracy of extracting additional data from echo-hidden audio data while keeping the processing load low, and also reduces power consumption.

[Example 5]
FIG. 17 is a block diagram illustrating an example of the hardware of the mobile terminal device 5 according to the fifth embodiment. The mobile terminal device 5 includes an antenna 501, a radio unit 502, a baseband processing unit 503, a control unit 504, a terminal interface unit 505, a main storage unit 506, an auxiliary storage unit 507, a microphone 508, and a display unit 509.

The antenna 501 transmits radio signals amplified by a transmission amplifier and receives radio signals from a base station. The radio unit 502 performs D/A conversion on the transmission signal spread by the baseband processing unit 503, converts it into a high-frequency signal by quadrature modulation, and amplifies that signal with a power amplifier. The radio unit 502 also amplifies the received radio signal, performs A/D conversion on it, and passes it to the baseband processing unit 503.

The baseband processing unit 503 performs baseband processing such as adding error correction codes to transmission data, data modulation, spread modulation, despreading of received signals, determination of the reception environment, threshold determination for each channel signal, and error correction decoding.

The control unit 504 performs radio control such as transmitting and receiving control signals. The control unit 504 also executes the sound processing program stored in the auxiliary storage unit 507 or the like, and performs the sound processing described in each embodiment.

The terminal interface unit 505 performs data adapter processing and interface processing with a handset and an external data terminal.

The main storage unit 506 is a ROM (Read Only Memory), a RAM (Random Access Memory), or the like, and is a storage device that stores or temporarily holds programs and data such as the OS (Operating System), which is the basic software executed by the control unit 504, and application software.

The auxiliary storage unit 507 is an HDD (Hard Disk Drive) or the like, and is a storage device that stores data related to application software and the like. The auxiliary storage unit 507 stores the sound processing program described above.

The microphone 508 converts sound, such as an announcement to which additional data has been added, into audio data in the form of an electrical signal. The display unit 509 is, for example, an LCD (Liquid Crystal Display), and displays content according to the display data input from the control unit 504. The display unit 509 displays, for example, the extracted additional data.

The functions of the units of the sound processing device of each embodiment, other than the microphone unit 11 and the audio data storage unit 13, can be realized by, for example, the control unit 504 and the main storage unit 506 serving as a work memory. The audio data storage unit 13 can be realized by, for example, the main storage unit 506. The microphone unit 11 can be realized by, for example, the microphone 508.

Some of the units of the sound processing device in each embodiment may be implemented in hardware and the rest in software. For example, the microphone unit 11, the decoding unit 15, the audio data storage unit 13, and the like may be implemented in hardware, and the others in software.

As described above, according to the fifth embodiment, the mobile terminal device 5 can perform the additional data determination processing of the first to fourth embodiments.

The disclosed technology is not limited to the mobile terminal device 5 and can also be implemented in other equipment. For example, the sound processing device described above is applicable to any processing device that has a control unit and a microphone or receiving unit for inputting audio data to which additional data has been added.

By recording a program for realizing the sound processing described in each of the above embodiments on a recording medium, a computer can be made to carry out the sound processing of each embodiment.

It is also possible to record this program on a recording medium and have a computer or mobile terminal device read that recording medium to realize the sound processing described above. Various types of recording media can be used, including media that record information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk, and semiconductor memories that record information electrically, such as a ROM or a flash memory. The recording medium does not include a carrier wave.

The device described in the above embodiments is used, for example, as follows: the same content as an in-train announcement is added to the announcement as additional data, and a hearing-impaired person displays the announcement indicated by the additional data on a display unit. The device of each embodiment can thus be applied to, for example, learning the content of an announcement.

Although the embodiments have been described in detail above, the invention is not limited to them, and various modifications and changes are possible within the scope described in the claims. All or several of the components of the above embodiments may also be combined, and the embodiments themselves may be combined with one another.

The following supplementary notes are further disclosed regarding the above embodiments.
(Supplementary note 1)
A sound processing apparatus comprising:
a calculation unit that, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, performs a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
a first determination unit that determines whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determines the predetermined data for one update interval; and
an output unit that outputs the determined predetermined data.
(Supplementary note 2)
The sound processing apparatus according to supplementary note 1, further comprising:
a center calculation unit that obtains the center time of the frames from which the same predetermined data is consecutively extracted;
a first decision unit that adds the update interval to the obtained center time and decides a predetermined number of frames to be used for extracting the next additional data; and
a second determination unit that determines the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputs the most-extracted predetermined data to the output unit.
(Supplementary note 3)
The sound processing apparatus according to supplementary note 1, wherein, when the duration of the frames from which the same predetermined data is consecutively extracted exceeds a predetermined multiple of the update interval, the first determination unit outputs the predetermined data to the output unit a number of times equal to the integer value obtained by adding 1 to the predetermined multiple.
(Supplementary note 4)
The sound processing apparatus according to supplementary note 3, further comprising:
a center calculation unit that obtains the center of the frames from which the same predetermined data is consecutively extracted;
a second decision unit that adds (the integer value + 1) / 2 times the update interval to the obtained center and decides a predetermined number of frames to be used for extracting the next additional data; and
a second determination unit that determines the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputs the most-extracted predetermined data to the output unit.
(Supplementary note 5)
The sound processing apparatus according to supplementary note 1, further comprising:
a center calculation unit that obtains the center of the frames from which the same predetermined data is consecutively extracted;
a third decision unit that adds the update interval to the obtained center and decides a plurality of groups, each consisting of a predetermined number of frames, to be used for extracting the next additional data; and
a third determination unit that determines, for each decided group, the predetermined data extracted most often based on the cepstrum calculation results of the predetermined number of frames, and outputs the most-extracted predetermined data to the output unit,
wherein the center calculation unit obtains the center of the frames of the group in which the predetermined data was extracted most often.
(Supplementary note 6)
The sound processing apparatus according to any one of supplementary notes 2, 4, and 5, further comprising a control unit that performs control so that the audio data is not read in frames other than the predetermined number of frames used for extracting the next additional data.
(Supplementary note 7)
The sound processing apparatus according to any one of supplementary notes 1 to 6, wherein
the echo signal is faded in and faded out before and after the update interval, and
the predetermined value is determined based on the number of decode frames corresponding to the time obtained by excluding the fade-in and fade-out times from the update interval.
(Supplementary note 8)
A sound processing method in which a computer executes processing comprising:
performing, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
determining whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determining the predetermined data for one update interval; and
outputting the determined predetermined data.
(Supplementary note 9)
A program for causing a computer to execute processing comprising:
performing, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
determining whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determining the predetermined data for one update interval; and
outputting the determined predetermined data.

1, 4 Sound processing device
5 Mobile terminal device
11 Microphone unit
12 Audio data count unit
13 Audio data storage unit
14 Calculation unit
15 Decoding unit
16, 21 Determination unit
17 Output unit
22 Power supply control unit
23 Timer unit
111, 211, 311 Continuous number counting unit
112, 212, 312 Additional data first determination unit
121, 221, 321 Center calculation unit
122 First decision unit
123, 223 Additional data second determination unit
222 Second decision unit
322 Third decision unit
323 Additional data third determination unit
504 Control unit
506 Main storage unit
507 Auxiliary storage unit
508 Microphone
509 Display unit

Claims

A sound processing apparatus comprising:
a calculation unit that, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, performs a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
a first determination unit that determines whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determines the predetermined data for one update interval;
an output unit that outputs the determined predetermined data;
a center calculation unit that obtains, for a plurality of frames among the frames from which the same predetermined data is consecutively extracted, the center time of the plurality of frames as a whole;
a first decision unit that adds the update interval to the obtained center time and decides a predetermined number of frames to be used for extracting the next additional data; and
a second determination unit that determines the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputs the most-extracted predetermined data to the output unit.

A sound processing apparatus comprising:
a calculation unit that, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, performs a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
a first determination unit that determines whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determines the predetermined data for one update interval; and
an output unit that outputs the determined predetermined data,
wherein, when the duration of the frames from which the same predetermined data is consecutively extracted exceeds a predetermined multiple of the update interval, the first determination unit outputs the predetermined data to the output unit a number of times equal to the integer value obtained by adding 1 to the predetermined multiple,
the sound processing apparatus further comprising:
a center calculation unit that obtains, for a plurality of frames from which the same predetermined data is consecutively extracted, the center of the plurality of frames as a whole;
a second decision unit that adds (the integer value + 1) / 2 times the update interval to the obtained center and decides a predetermined number of frames to be used for extracting the next additional data; and
a second determination unit that determines the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputs the most-extracted predetermined data to the output unit.

A sound processing apparatus comprising:
a calculation unit that, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, performs a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
a first determination unit that determines whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determines the predetermined data for one update interval;
an output unit that outputs the determined predetermined data;
a center calculation unit that obtains, for a plurality of frames among the frames from which the same predetermined data is consecutively extracted, the center of the plurality of frames as a whole;
a third decision unit that adds the update interval to the obtained center and decides a plurality of groups, each consisting of a predetermined number of frames, to be used for extracting the next additional data; and
a third determination unit that determines, for each decided group, the predetermined data extracted most often based on the cepstrum calculation results of the predetermined number of frames, and outputs the most-extracted predetermined data to the output unit,
wherein the center calculation unit obtains the center of the frames of the group in which the predetermined data was extracted most often.

The sound processing apparatus according to any one of claims 1 to 3, further comprising a control unit that performs control so that the audio data is not read in frames other than the predetermined number of frames used for extracting the next additional data.

A sound processing method in which a computer executes processing comprising:
performing, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
determining whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determining the predetermined data for one update interval;
outputting the determined predetermined data;
obtaining, for a plurality of frames among the frames from which the same predetermined data is consecutively extracted, the center time of the plurality of frames as a whole;
adding the update interval to the obtained center time and deciding a predetermined number of frames to be used for extracting the next additional data; and
determining the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputting the most-extracted predetermined data.

A program for causing a computer to execute processing comprising:
performing, for audio data to which predetermined data has been added at each update interval based on an echo signal delayed by a predetermined time, a cepstrum calculation on each of a plurality of frames into which audio data of a length corresponding to the update interval is divided;
determining whether the number of consecutive frames for which the same predetermined data is extracted based on the cepstrum calculation result of each frame is equal to or greater than a predetermined value, and determining the predetermined data for one update interval;
outputting the determined predetermined data;
obtaining, for a plurality of frames among the frames from which the same predetermined data is consecutively extracted, the center time of the plurality of frames as a whole;
adding the update interval to the obtained center time and deciding a predetermined number of frames to be used for extracting the next additional data; and
determining the predetermined data extracted most often based on the cepstrum calculation results of the decided predetermined number of frames, and outputting the most-extracted predetermined data.