JP6919261B2

JP6919261B2 - Sound data processing device, sound data processing method and program

Info

Publication number: JP6919261B2
Application number: JP2017058428A
Authority: JP
Inventors: 貴洋原
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2017-03-24
Filing date: 2017-03-24
Publication date: 2021-08-18
Anticipated expiration: 2037-03-24
Also published as: JP2018160872A; JP7201033B2; JP2021182747A

Description

The present invention relates to a sound data processing device, a sound data processing method, and a program.

Conventionally, in voice communication, streaming distribution of audio, and the like, where data packets containing sound data (digital waveform data) are received periodically and the sound data is reproduced, techniques for repairing portions where a data packet has been lost are known.

For example, Non-Patent Document 1 describes a technique for repairing a lost portion when a packet is lost in a voiced section: the sound data of the last few milliseconds before the loss is used as a template, the part of the template that best matches the not-yet-reproduced packet is searched for, and the sound data of the part found is pitch-adjusted and repeatedly embedded in the lost portion. Non-Patent Document 1 also describes taking a weighted average of the embedded sound data and the original data at the boundary between them, so that the two are joined smoothly.
Patent Document 1 and Patent Document 2 also describe techniques for periodically receiving data packets containing sound data and reproducing the sound data.

Patent Document 1: Japanese Unexamined Patent Publication No. 2014-110525
Patent Document 2: Japanese Unexamined Patent Publication No. 2014-110526

Non-Patent Document 1: Colin Perkins, translation supervised by Akimichi Ogawa, "Mastering TCP/IP: RTP", Ohmsha, Ltd., April 15, 2004, pp. 202-203

In a technique that repairs the sound data of a lost packet, such as the one described in Non-Patent Document 1, the sound data written at the time of repair must itself be continuous, with no packet loss, so that the repaired portion does not sound unnatural. However, in an environment where packet loss occurs frequently, it is not always easy to secure recent sound data whose continuity is guaranteed for use in repair.

Further, the technique described in Non-Patent Document 1 repairs portions where packets have been lost, but even when no packet is lost, a delay in arrival can leave the device unable to secure the samples of sound data needed for reproduction. Some countermeasure is needed to continue reproducing the sound data in such a case, but Non-Patent Document 1 does not present a technique for dealing with this situation.
That is, even if the missing sound data were to be obtained by a method similar to the repair, it is unclear at what timing and how much sound data should be obtained, so applying the technique of Non-Patent Document 1 does not yield efficient processing.

The present invention has been made in view of these circumstances. Its object is to enable a device that receives and outputs sound data to output alternative sound data reliably and with a low processing load, without giving the user much sense of discomfort, even when the sound data to be output cannot be received at the appropriate timing.

To achieve the above object, a sound data processing device of the present invention comprises: a first buffer, a second buffer, and a third buffer, each of which is a storage area for storing sound data that is digital waveform data; a receiving unit that receives sound data; a sound data storage unit that writes the sound data received by the receiving unit to the first buffer and the second buffer in the order of reception; an output unit that, upon detecting a predetermined request, outputs the sound data stored in the first buffer in the order of storage; an interpolation unit that, upon detecting that the amount of not-yet-output sound data stored in the first buffer has become equal to or less than a predetermined threshold, selects from the sound data stored in the second buffer a location that should follow the not-yet-output sound data, and writes the sound data of that location immediately after the latest sound data in the first buffer; and a buffer management unit that, upon detecting that a loss in the reception of sound data has occurred in the receiving unit, copies the sound data stored in the second buffer to the third buffer and clears the second buffer. When the interpolation unit cannot select, from the sound data stored in the second buffer, a location that should follow the not-yet-output sound data, it selects such a location from the sound data stored in the third buffer and writes the sound data of that location immediately after the latest sound data in the first buffer.

In such a sound data processing device, a temporary buffer, which is a storage area for storing sound data, is preferably further provided, and the interpolation unit, in response to detecting that the amount of not-yet-output sound data stored in the first buffer has become equal to or less than the predetermined threshold, preferably operates as follows: a) if the temporary buffer does not hold at least a predetermined amount of unused sound data, the interpolation unit writes the sound data of the location that should follow the not-yet-output sound data to the temporary buffer, and then writes the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer; b) if the temporary buffer holds at least the predetermined amount of unused sound data, the interpolation unit writes the predetermined amount of the sound data stored in the temporary buffer, continuing from the previous write, immediately after the latest sound data in the first buffer.

Another sound data processing device of the present invention comprises: a first buffer, a second buffer, and a temporary buffer, each of which is a storage area for storing sound data that is digital waveform data; a receiving unit that receives sound data; a sound data storage unit that writes the sound data received by the receiving unit to the first buffer and the second buffer in the order of reception; an output unit that, upon detecting a predetermined request, outputs the sound data stored in the first buffer in the order of storage; and an interpolation unit that, in response to detecting that the amount of not-yet-output sound data stored in the first buffer has become equal to or less than a predetermined threshold, operates as follows: a) if the temporary buffer does not hold at least a predetermined amount of unused sound data, the interpolation unit selects from the sound data stored in the second buffer a location that should follow the not-yet-output sound data, writes the sound data of that location to the temporary buffer, and then writes the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer; b) if the temporary buffer holds at least the predetermined amount of unused sound data, the interpolation unit writes the predetermined amount of the sound data stored in the temporary buffer, continuing from the previous write, immediately after the latest sound data in the first buffer.
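As a non-limiting illustration, the two cases a) and b) above can be sketched as follows. The class name `TempBufferFeeder`, the callback `select_continuation`, and the choice to discard any leftover unused samples when the temporary buffer is refilled are assumptions made for this sketch and are not taken from the description.

```python
class TempBufferFeeder:
    """Hands out fixed-size chunks from a temporary buffer (hypothetical sketch)."""

    def __init__(self, chunk):
        self.chunk = chunk  # the "predetermined amount" written per interpolation
        self.temp = []      # temporary buffer contents
        self.used = 0       # number of samples in temp already handed out

    def unused(self):
        return len(self.temp) - self.used

    def next_chunk(self, select_continuation):
        if self.unused() < self.chunk:
            # Case a): not enough unused data, so refill the temporary buffer
            # with the data chosen to follow the not-yet-output data.
            # Assumption: leftover unused samples are discarded on refill.
            self.temp = list(select_continuation())
            self.used = 0
        # Case b) (and the tail of case a)): continue from the previous write.
        out = self.temp[self.used:self.used + self.chunk]
        self.used += len(out)
        return out
```

With this structure, the comparatively expensive selection step runs only when the temporary buffer runs low, and later interpolation requests are served by simple copying.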

In each of the above sound data processing devices, when writing sound data to the first buffer, the interpolation unit preferably performs an amplitude adjustment that matches the amplitude of the sound data to be written to the amplitude of the not-yet-output sound data, and writes the amplitude-adjusted sound data to the first buffer.
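One possible form of such an amplitude adjustment is sketched below purely as an example. It assumes that "amplitude" is measured as the peak absolute sample value and that a single gain is applied to the whole candidate block; this passage does not specify how amplitude is measured, and the function name `match_amplitude` is hypothetical.

```python
def match_amplitude(candidate, reference):
    """Scale candidate so that its peak matches the peak of reference.

    Assumption: amplitude is taken to be the peak absolute sample value.
    """
    ref_peak = max((abs(s) for s in reference), default=0.0)
    cand_peak = max((abs(s) for s in candidate), default=0.0)
    if cand_peak == 0:
        # A silent candidate cannot be scaled; return it unchanged.
        return list(candidate)
    gain = ref_peak / cand_peak
    return [s * gain for s in candidate]
```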

Further, when a loss in the reception of sound data occurs in the receiving unit and sound data following the lost portion is received, the sound data storage unit preferably writes the received sound data, as far as the first buffer is concerned, to the position where that sound data would have been written had the loss not occurred. When the position to which the sound data following the lost portion is written lies beyond the end of the not-yet-output sound data, the interpolation unit preferably selects from the sound data stored in the second buffer a location that should follow the not-yet-output sound data and writes the sound data of that location immediately after the latest sound data in the first buffer, thereby filling the storage area up to the position where the sound data following the lost portion is written.
In addition to the specific aspects described above, the present invention can be implemented in any form, such as a system, a method, a program, or a recording medium.

According to the configuration of the present invention described above, when sound data is received and output, alternative sound data can be output reliably and with a low processing load, without giving the user much sense of discomfort, even when the sound data to be output cannot be received at the appropriate timing.

FIG. 1 is a diagram showing the hardware configuration of a PC that is an embodiment of the sound data processing device of the present invention.
FIG. 2 is a diagram showing the schematic configuration of the sound data processing function realized in the PC shown in FIG. 1.
FIG. 3 is a diagram showing in more detail the functional configuration of the sound data processing unit shown in FIG. 2.
FIG. 4 is a diagram for explaining the transmission and reception of sound data in the normal state, including preparation for interpolation.
FIG. 5 is a diagram showing the flow of the interpolation operation using the interpolation data in the interpolation buffer.
FIG. 6 is a diagram showing the flow of the backup and clear operations for the interpolation buffer.
FIG. 7 is a flowchart of the main processing executed by the CPU of the PC shown in FIG. 1.
FIG. 8 is a flowchart of the processing at the time of a sound data request, executed in step S15 of FIG. 7.
FIG. 9 is a flowchart of the processing continuing from FIG. 8.
FIG. 10 is a flowchart of the amplitude adjustment processing executed in step S33 of FIG. 9.
FIG. 11 is a flowchart of the processing at the time of packet arrival, executed in step S13 of FIG. 7.
FIG. 12 is a flowchart of the processing continuing from FIG. 11.
FIG. 13 is a diagram showing an example of the relationship between the untransmitted data in the audio buffer and the sound data of a newly arrived packet.
FIG. 14 is a diagram showing another example of that relationship.
FIG. 15 is a diagram, corresponding to FIG. 3, showing the configuration of the sound data processing unit in a first modification.
FIG. 16 is a diagram, corresponding to FIG. 3, showing the configuration of the sound data processing unit in a second modification.
FIG. 17 is a diagram, corresponding to FIG. 3, showing the configuration of the sound data processing unit in a third modification.

Hereinafter, embodiments for carrying out the present invention will be described in detail with reference to the drawings.
[Embodiment: FIGS. 1 to 14]
FIG. 1 shows the hardware configuration of an embodiment of the sound data processing device of the present invention.
The sound data processing device shown in FIG. 1 is, in terms of hardware, a PC (personal computer), which is a general-purpose computer. More specifically, the PC 100 includes a CPU 101, a flash memory 102, a RAM 103, a communication I/F 104, a display 105, operators 106, and a sound signal output unit 107, which are connected by a system bus 108.
Of these, the CPU 101 is a control unit that controls the operation of the entire PC 100. By executing required programs stored in the flash memory 102 and controlling the required hardware, it realizes various functions, including those described with reference to FIGS. 2 and 3.

The flash memory 102 is a rewritable non-volatile storage means that stores the control programs executed by the CPU 101, data that must be retained even when the power is turned off, and the like. An HDD (hard disk drive) may be used in combination.
The RAM 103 is a storage means that stores data to be held temporarily and serves as the work memory of the CPU 101.

The communication I/F 104 is an interface for communicating with an external device, such as a server device, that serves as the source of sound data. Any communication method may be adopted, whether wired or wireless, and whether peer-to-peer or networked.
The display 105 is a display unit, such as a liquid crystal display, that displays various screens under the control of the CPU 101.
The operators 106 are an operation unit for accepting operations from the user, and can consist of keys, switches, and the like in addition to a touch panel laminated on the display.

The sound signal output unit 107 is an interface to which a sound output device such as a speaker or headphones is connected and through which a sound signal is output to that device. Here, the sound signal output unit 107 is assumed to have a DA conversion function and to convert the digital sound data processed by the PC 100 into an analog sound signal for output, but a configuration that performs digital output is not excluded.

In this embodiment, the CPU 101 of the PC 100 executes the required programs and controls the required hardware to realize a sound data processing function that receives digital sound data in an audio format from a sound data source, such as an audio streaming server, and outputs that sound data to a sound output device, such as a speaker, in a format and at a timing suited to sound output. The PC 100 thereby functions as a sound data processing device and can cause the sound output device to output, in near real time, sound based on the sound data received from the sound data source.

Next, FIG. 2 shows the schematic configuration of the sound data processing function realized in the PC 100.
The control unit 120 shown in FIG. 2 corresponds to the functions realized by the CPU 101. The control unit 120 includes the functions of a network driver 121, an audio driver 122, and a sound data processing unit 200.

Of these, the network driver 121 has a function of transmitting and receiving sound data via the communication I/F 104. This embodiment focuses on the part of that function that sequentially receives a series of sound data divided into a plurality of packets. When the network driver 121 receives a packet containing sound data, it passes the packet to the sound data processing unit 200, which buffers the sound data it contains. One packet contains a predetermined number of samples of sound data, which is digital waveform data in an audio format.

The sound data processing unit 200 has the function of an output unit that buffers the sound data contained in the packets passed from the network driver 121 and, in response to requests from the audio driver 122, outputs it to the audio driver 122 a predetermined number of samples at a time. The sound data processing unit 200 also has a function of performing interpolation processing, which makes up for a shortage or loss on the basis of previously received sound data, when the number of buffered samples has become so small that an output request from the audio driver 122 might not be met, or when a packet loss has been detected. This interpolation processing is described in detail later.

The audio driver 122 has a function of supplying the sound signal output unit 107 with the sound data it needs to output a sound signal continuously. The audio driver 122 obtains the required number of samples of sound data (a constant value here, although this is not a limitation) from the sound data processing unit 200 at the required timing, and supplies the sound data of each sample to the sound signal output unit 107 at a timing suited to output from the sound signal output unit 107.

Next, FIG. 3 shows in more detail the functional configuration of the sound data processing unit 200 shown in FIG. 2.
As shown in FIG. 3, the sound data processing unit 200 includes the functions of a receiving unit 211, a storage unit 212, an output unit 213, an interpolation unit 214, and a buffer management unit 215. As storage areas for sound data, the sound data processing unit 200 also includes an audio buffer 221 (first buffer), an interpolation buffer 222 (second buffer), a backup buffer 223 (third buffer), and a temporary buffer 224. Each of these buffers can be provided in, for example, the RAM 103.

Of the above units, the receiving unit 211 has a function of receiving from the network driver 121 packets containing sound data (a constant number of samples here, although this is not a limitation). The operation involved in this reception is the operation of the receiving procedure. The packets carry serial numbers and should be received in numerical order. If the order is changed or a packet is lost (that is, the next packet to arrive has a later number that is not consecutive with the previously arrived packet), this can easily be detected from the serial numbers. The receiving unit 211 passes the sound data contained in each received packet to the storage unit 212 together with the serial number of the packet.
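The serial-number check described above can be illustrated by the following minimal sketch. The function name `classify_packet` and its return labels are hypothetical, and wraparound of the serial number is ignored for simplicity.

```python
def classify_packet(last_seq, seq):
    """Classify an arriving packet relative to the previously arrived one."""
    if seq == last_seq + 1:
        return "in_order"
    if seq > last_seq + 1:
        # Packets last_seq+1 .. seq-1 have not arrived: treated as a loss.
        return "gap"
    # A number at or before last_seq: the order was changed, or a duplicate.
    return "late_or_duplicate"
```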

Each packet is preferably given a time stamp that indicates, for example as the elapsed time from the start of reproduction, the timing at which the sound data contained in the packet should be reproduced. Given the time stamp of the start of the sound data, the reproduction timing of its end can be calculated from that time stamp and the number of samples of sound data contained in the packet. With such time stamps, even if an intermediate packet is lost while the number of samples per packet is not constant, it is possible to calculate, when a later packet arrives, how many samples after the end of the sound data in the already received packets the sound data in that packet should be reproduced.
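The calculation just described can be sketched as follows, under the assumption (made only for this example) that time stamps are expressed in units of samples counted from the start of reproduction. The function name `gap_in_samples` is hypothetical.

```python
def gap_in_samples(prev_start_ts, prev_num_samples, new_start_ts):
    """Return how many samples after the end of the previously received
    data the newly arrived packet's data should be reproduced.

    Assumption: time stamps are sample counts from the start of reproduction.
    """
    prev_end_ts = prev_start_ts + prev_num_samples  # end of the received data
    return new_start_ts - prev_end_ts
```

For instance, if the last received packet starts at sample 0 and holds 96 samples, and the next packet to arrive starts at sample 192, `gap_in_samples(0, 96, 192)` gives a gap of 96 samples that has to be filled.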

The storage unit 212 has a function of writing the sound data received from the receiving unit 211 to both the audio buffer 221 and the interpolation buffer 222. The audio buffer 221 and the interpolation buffer 222 are each ring buffers. Basically, the storage unit 212 writes one packet's worth of sound data into the area starting at the position of each buffer's write pointer, which indicates the address where the sample following the newest sample should be stored, and advances each write pointer by the amount written. For the audio buffer 221, however, there are cases, such as when a packet has been lost, in which writing should start from a position different from the current write pointer position.

The storage unit 212 also has a function of instructing the interpolation unit 214 to execute interpolation when it detects an event that triggers interpolation of the sound data in the audio buffer 221, such as a packet loss or a decrease in the number of samples stored in the audio buffer 221. What serves as a trigger is described in detail later. Further, the storage unit 212 has a function of notifying the buffer management unit 215 when it detects a packet loss.

Next, the output unit 213 has a function of reading, in response to detecting a predetermined sound data transmission request from the audio driver 122, the required number of samples of sound data (a constant value here, although this is not a limitation) from the audio buffer 221 in the order of storage and transmitting them to the audio driver 122. The output unit 213 reads the sound data starting from the position of the read pointer, which indicates the address of the oldest sample among the untransmitted sound data, proceeding toward newer samples, and advances the read pointer by the amount read.

The output unit 213 also has a function of reporting the position of the read pointer to the storage unit 212, so the storage unit 212 can determine in real time the number of samples stored in the audio buffer 221 from the address difference between the write pointer and the read pointer. When interpolation has been performed, the storage unit 212 also takes into account the number of samples written by the interpolation when determining the number of samples stored in the audio buffer 221.
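The interplay of the write pointer, the read pointer, and the stored sample count can be illustrated by the following simplified sketch. The class name `AudioRing` is hypothetical, and the use of ever-increasing counters reduced modulo the capacity (which avoids the ambiguity between a full and an empty buffer when the two pointers coincide) is a design choice of this example, not something stated in the description.

```python
class AudioRing:
    """Ring buffer with separate write and read positions (hypothetical sketch)."""

    def __init__(self, capacity):
        self.buf = [0] * capacity
        self.capacity = capacity
        self.write_pos = 0  # total number of samples written so far
        self.read_pos = 0   # total number of samples read so far

    def available(self):
        # Number of untransmitted samples: difference between the pointers.
        return self.write_pos - self.read_pos

    def write(self, samples):
        samples = list(samples)
        if self.available() + len(samples) > self.capacity:
            raise OverflowError("would overwrite untransmitted data")
        for s in samples:
            self.buf[self.write_pos % self.capacity] = s
            self.write_pos += 1  # advance the write pointer by the amount written

    def read(self, n):
        n = min(n, self.available())
        out = [self.buf[(self.read_pos + i) % self.capacity] for i in range(n)]
        self.read_pos += n  # advance the read pointer by the amount read
        return out
```

Data that has been read is not erased; it simply falls outside the region between the two pointers and is overwritten by later writes.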

The interpolation unit 214 has a function of executing interpolation on the basis of an instruction from the storage unit 212 and of notifying the storage unit 212, as the result of that execution, of the number of samples of sound data that the interpolation processing stored in the audio buffer 221. In this specification, interpolation means the following: when, for some reason (for example, a packet has been lost or has arrived late), a sufficient number of samples of the (not-yet-output) sound data to be output by the output unit 213 cannot be secured in the audio buffer 221, the missing sound data is generated from sound data that the sound data processing unit 200 has recently received, or from sound data that it received and backed up in the past, in such a way as to affect the audible impression of the output sound as little as possible, and is written to the audio buffer 221. This includes applying processing such as fade-in, fade-out, or crossfade as necessary. The details of the interpolation operation are described later; in this embodiment, the temporary buffer 224 is used for the interpolation.

The buffer management unit 215 has a function of executing, in response to a notification from the storage unit 212 that a packet loss has occurred, a backup that copies the sound data stored in the interpolation buffer 222 to the backup buffer 223, followed by a clear of the interpolation buffer 222. The significance of this operation is also described later.
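The backup-and-clear operation, together with the fallback to the backup buffer described earlier for the interpolation unit, can be sketched as follows. The class name `InterpolationStore` and the parameter `min_samples` are hypothetical; in particular, using a minimum sample count as the test for whether a continuation can be selected from the interpolation buffer is an assumption of this example.

```python
from collections import deque


class InterpolationStore:
    """Interpolation buffer plus backup buffer (hypothetical sketch)."""

    def __init__(self, capacity):
        # deque(maxlen=...) drops the oldest samples once it is full.
        self.interp = deque(maxlen=capacity)
        self.backup = []

    def store(self, samples):
        self.interp.extend(samples)

    def on_packet_loss(self):
        # Back up the continuous data received so far, then clear, so the
        # interpolation buffer only ever holds data without internal gaps.
        self.backup = list(self.interp)
        self.interp.clear()

    def source_for_interpolation(self, min_samples):
        # Assumption: selection from the interpolation buffer is possible
        # only if it holds at least min_samples; otherwise use the backup.
        if len(self.interp) >= min_samples:
            return list(self.interp)
        return list(self.backup)
```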

Next, the sound data processing operations executed by the units shown in FIG. 3 will be described with reference to FIGS. 4 to 6. The sample counts shown in these figures are examples, and the invention is of course not limited to them.
First, FIG. 4 shows the transmission and reception of sound data in the normal state, including preparation for interpolation.
As shown in FIG. 4, a plurality of received packets P supplied from the sound data source arrive at the sound data processing unit 200 one after another. Here, each received packet P contains 96 (= B1) samples of sound data.

In the sound data processing unit 200, the receiving unit 211 receives each received packet P and passes it to the storage unit 212, and the storage unit 212 writes the sound data contained in the packet to both the audio buffer 221 and the interpolation buffer 222. In both cases, the data is written at the position following the latest sample already stored at the time of writing. The same sound data is written to the audio buffer 221 and the interpolation buffer 222, but the data written to the interpolation buffer 222 is referred to as "interpolation data", in the sense that it is the data used for interpolation processing.

When writing to the interpolation buffer 222, old data is deleted once the buffer is full. When a ring buffer is used, however, it suffices to return the write pointer to the beginning when it reaches the end of the storage area; each new sample then overwrites what is at that moment the oldest sample. The position immediately after the write pointer is the storage position of the currently oldest sample, that is, the start position of the interpolation data.
This structure is basically the same for the audio buffer 221, but in the audio buffer 221 only the samples that have not yet been read and transmitted by the output unit 213 (referred to as "untransmitted data") are treated as validly stored sound data.

ここで、出力部２１３は、オーディオドライバ１２２からの要求に応じてオーディオバッファ２２１の読み出しポインタの位置から始まる１２８（＝Ｂ２）サンプルの音データを読み出して送信データＤとして出力し、読み出した分だけ読み出しポインタを後ろにずらす。従って、この読み出しポインタの位置が未送信データの先頭である。このとき、出力部２１３が読み出した音データ自体をオーディオバッファ２２１の記憶領域から削除する必要はないが、音データ処理部２００は、読み出しポインタより前で書き込みポインタ以降の領域を、有効なデータが格納されていない空の領域であるとして取り扱うので、実質的には削除したことになる。 Here, in response to a request from the audio driver 122, the output unit 213 reads 128 (= B2) samples of sound data starting from the position of the read pointer of the audio buffer 221, outputs them as transmission data D, and advances the read pointer by the amount read. Therefore, the position of the read pointer is the beginning of the untransmitted data. At this time, it is not necessary to delete the read sound data itself from the storage area of the audio buffer 221. However, the sound data processing unit 200 treats the area before the read pointer and at or after the write pointer as an empty area in which no valid data is stored, so the data is effectively deleted.
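The pointer handling described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the class name `AudioRing` and its methods are hypothetical, and the valid region is tracked with a read pointer plus a sample count instead of two explicit pointers.

```python
class AudioRing:
    """Ring buffer whose valid (untransmitted) region starts at `read`."""

    def __init__(self, capacity=512):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.read = 0    # start of the untransmitted data
        self.count = 0   # number of untransmitted samples

    def write(self, samples):
        # Append after the newest stored sample, wrapping at the end.
        if self.count + len(samples) > self.capacity:
            raise OverflowError("audio buffer full")
        w = (self.read + self.count) % self.capacity
        for s in samples:
            self.buf[w] = s
            w = (w + 1) % self.capacity
        self.count += len(samples)

    def read_out(self, n):
        # Return n samples from the read pointer and advance it. Nothing is
        # erased; the samples simply fall outside the valid region.
        if n > self.count:
            raise ValueError("not enough untransmitted data")
        out = [self.buf[(self.read + i) % self.capacity] for i in range(n)]
        self.read = (self.read + n) % self.capacity
        self.count -= n
        return out


ring = AudioRing()
ring.write(list(range(96)))        # one received packet (B1 = 96 samples)
ring.write(list(range(96, 192)))   # a second packet
data = ring.read_out(128)          # one driver request (B2 = 128 samples)
```

After the two writes and one read, 64 samples remain as untransmitted data, and the read pointer marks their first sample.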

なお、通常状態では、オーディオバッファ２２１に格納される未送信データは２５６サンプル程度になるように各部の動作タイミングが調整される。未送信データの量が多くなりすぎると、音データがオーディオバッファ２２１に長時間滞留することになり、パケットの受信から音の出力までのタイムラグが増加してしまう一方、未送信データの量が少なすぎると、パケットの到着が少し遅れただけでオーディオバッファ２２１の音データが枯渇することになり、（補間はできるとはいえ）音出力に支障を来すことになる。ここでは、これらのバランスを考慮して、未送信データ量の目標値を定めている。また、オーディオバッファ２２１のサイズは、音データの出力遅延等により想定より多い未送信データが滞留する可能性もあることを考慮して、目標値の２倍の５１２サンプル分としている。 In the normal state, the operation timing of each part is adjusted so that the untransmitted data stored in the audio buffer 221 amounts to about 256 samples. If the amount of untransmitted data becomes too large, the sound data stays in the audio buffer 221 for a long time, and the time lag from packet reception to sound output increases. Conversely, if the amount of untransmitted data is too small, the sound data in the audio buffer 221 is exhausted even when the arrival of a packet is only slightly delayed, and the sound output is hindered (although interpolation is possible). Here, the target value of the amount of untransmitted data is set in consideration of this balance. Further, the size of the audio buffer 221 is set to 512 samples, which is twice the target value, in consideration of the possibility that more untransmitted data than expected may accumulate due to an output delay of the sound data or the like.
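To give a feel for the latency trade-off, the sample counts above can be converted to time. The sampling rate is not stated in this passage, so the 48 kHz used below is purely an assumption for illustration, and the helper name `samples_to_ms` is hypothetical.

```python
SAMPLE_RATE_HZ = 48_000  # assumption: the sampling rate is not given in this passage


def samples_to_ms(n_samples, rate_hz=SAMPLE_RATE_HZ):
    """Duration of n_samples at rate_hz, in milliseconds."""
    return 1000.0 * n_samples / rate_hz


packet_ms = samples_to_ms(96)     # one received packet (B1)
target_ms = samples_to_ms(256)    # target amount of untransmitted data
capacity_ms = samples_to_ms(512)  # size of the audio buffer 221
```

Under that assumed rate, the target backlog corresponds to a little over 5 ms of buffering delay, and the buffer can hold twice that.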

一方、補間用データのサイズにはこのような制約はないので、補間バッファ２２２のサイズについては、補間処理のために十分な量の補間用データが確保できることと、メモリ資源の有効活用とを考慮して、適当なサイズとすればよい。補間用データには連続性が求められるため、パケットの脱落が頻発する環境ではあまり大きなサイズの補間用データを作成できないことにも留意するとよい。ここでは、これらを考慮して補間バッファ２２２のサイズは１０２４サンプル分としている。 On the other hand, since there is no such restriction on the size of the interpolation data, the size of the interpolation buffer 222 may be set to any appropriate size, taking into account both securing a sufficient amount of interpolation data for the interpolation processing and making effective use of memory resources. It should also be noted that, since continuity is required of the interpolation data, very large interpolation data cannot be accumulated in an environment where packets are frequently dropped. Here, in consideration of these points, the size of the interpolation buffer 222 is set to 1024 samples.

次に図５に、補間バッファ２２２の補間用データを用いた補間動作の流れを示す。
補間動作が行われるのは、大きく分けて、オーディオバッファ２２１内の未送信データが減ってしまい、出力部２１３が読み出すための未送信データが不足する（又は不足が予想される）場合及び、パケットが欠落したことにより、欠落箇所の手前の音データと欠落箇所の後の音データとの間を埋める必要が生じた場合である。図５に示すのは、前者の場合の例であり、これが起こるのは、例えばパケットの到着が遅延している場合等である（その後パケットの欠落が判明する場合もある）。 Next, FIG. 5 shows the flow of the interpolation operation using the interpolation data of the interpolation buffer 222.
The interpolation operation is performed, roughly speaking, in two cases: when the untransmitted data in the audio buffer 221 has decreased so that the untransmitted data to be read by the output unit 213 is insufficient (or is expected to become insufficient), and when a packet has been lost so that the gap between the sound data before the missing part and the sound data after the missing part must be filled. FIG. 5 shows an example of the former case, which occurs, for example, when the arrival of a packet is delayed (the packet may later turn out to be missing).

いずれにせよ、保存部２１２は、読み出しポインタと書き込みポインタのアドレスから、図５（ａ）に示すようにオーディオバッファ２２１に十分な量の未送信データが格納されていないことを検出すると、補間部２１４に対して補間の実行を指示し、補間部２１４はこの指示に応じて補間処理を実行する。 In any case, when the storage unit 212 detects from the addresses of the read pointer and the write pointer that a sufficient amount of untransmitted data is not stored in the audio buffer 221, as shown in FIG. 5A, it instructs the interpolation unit 214 to execute interpolation, and the interpolation unit 214 executes the interpolation processing in response to this instruction.

この補間処理において、一時バッファ２２４にはまだデータが格納されていないとすると、補間部２１４はまず、図５（ａ）に示すように、補間バッファ２２２に格納されている補間用データの中から、オーディオバッファ２２１に残っている未送信データと似た部分をサーチする。未送信データの量が多い場合は、新しい方から所定サンプル数のみを用いてもよい。また、未送信データの量が少なすぎる場合は、読み出しポインタより前の位置の、既に送信済みのデータを未送信データと繋げて、その繋げたデータと似た部分をサーチしてもよい。 In this interpolation process, assuming that no data is yet stored in the temporary buffer 224, the interpolation unit 214 first searches the interpolation data stored in the interpolation buffer 222, as shown in FIG. 5A, for a part similar to the untransmitted data remaining in the audio buffer 221. When the amount of untransmitted data is large, only a predetermined number of the newest samples may be used. When the amount of untransmitted data is too small, the already transmitted data located before the read pointer may be joined to the untransmitted data, and a part similar to the joined data may be searched for.

また、ここでは、オーディオバッファ２２１における最新の未送信データがパケット由来のものであれば、その部分は補間用データの最新の部分と一致することと、補間処理を行うためには、発見した部分の後ろに十分な量の補間用データが存在する必要があることとを考慮し、サーチは、補間用データの前半部分に対してのみ行う。しかし、範囲は半分に限られず、より狭い範囲や広い範囲に対して行うことも妨げられない。 Further, here, the search is performed only on the first half of the interpolation data, in consideration of two points: if the latest untransmitted data in the audio buffer 221 is derived from a packet, that part matches the latest part of the interpolation data; and, in order to perform the interpolation processing, a sufficient amount of interpolation data must exist after the part that is found. However, the range is not limited to half, and the search may also be performed on a narrower or wider range.

また、サーチのアルゴリズムは、例えば、補間用データ中で、少しずつずらした位置の、比較対象の未送信データと同じサンプル数の連続した音データをそれぞれ候補として用意し、未送信データ側の各サンプル値と補間用データ側の各サンプル値とで積和を取って正規化した値を、双方のデータの相関を表す類似度として求め、類似度が最も大きい候補を、「似た部分」のサーチ結果とするものが考えられる。一定以上の類似度の候補が見つかった場合に、その時点でサーチを終了してもよい。
類似度Ｌは、例えば、未送信データ側の各サンプル値をＸ＝（ｘ_１，ｘ_２，・・・，ｘ_ｎ）、補間用データ側の各サンプル値をＹ＝（ｙ_１，ｙ_２，・・・，ｙ_ｎ）として、Ｘ，Ｙをそれぞれベクトルとして見た場合に、Ｌ＝（Ｘ・Ｙ）／（｜Ｘ｜｜Ｙ｜）によりベクトル同士がなす角のコサイン値として求めることが考えられる。ただし、Ｘ・ＹはベクトルＸとベクトルＹの内積であり、｜Ｘ｜はベクトルＸの大きさである。しかし、共分散や相関係数など他の方法で類似度を求めることも妨げられない。 In addition, the search algorithm prepares, for example, continuous sound data of the same number of samples as the untransmitted data to be compared at positions shifted little by little in the interpolation data as candidates, and each of the untransmitted data side. The value obtained by taking the sum of products of the sample value and each sample value on the interpolation data side and normalizing it is obtained as the similarity representing the correlation between the two data, and the candidate with the highest similarity is selected as the "similar part". The search result can be considered. If a candidate with a certain degree of similarity is found, the search may be terminated at that point.
The similarity L can be obtained, for example, as follows. Let the sample values on the untransmitted data side be X = (x_1, x_2, ..., x_n) and the sample values on the interpolation data side be Y = (y_1, y_2, ..., y_n). Regarding X and Y as vectors, L = (X·Y) / (|X||Y|) gives the cosine of the angle between the two vectors. Here, X·Y is the inner product of the vector X and the vector Y, and |X| is the magnitude of the vector X. However, obtaining the similarity by other methods, such as covariance or a correlation coefficient, is not precluded.
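The search described above can be sketched in a few lines. This is a minimal illustration assuming a one-sample step between candidates and plain Python lists; the function names `cosine_similarity` and `find_similar_region`, the `threshold` parameter, and the sinusoidal test signal are all hypothetical and are not taken from the embodiment.

```python
import math


def cosine_similarity(x, y):
    """L = (X.Y) / (|X||Y|); returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm > 0.0 else 0.0


def find_similar_region(interp, tail, search_end, threshold=None):
    """Slide a window of len(tail) over interp[0:search_end) start positions.

    Returns (best_start, best_similarity). If threshold is given, the search
    stops at the first candidate whose similarity reaches it.
    """
    n = len(tail)
    best_start, best_sim = -1, -1.0
    for start in range(0, min(search_end, len(interp) - n + 1)):
        sim = cosine_similarity(tail, interp[start:start + n])
        if sim > best_sim:
            best_start, best_sim = start, sim
        if threshold is not None and sim >= threshold:
            break
    return best_start, best_sim


# Test signal: a sine wave with a period of 16 samples.
interp = [math.sin(2 * math.pi * i / 16) for i in range(128)]
tail = interp[40:48]                       # stands in for the untransmitted data
start, sim = find_similar_region(interp, tail, search_end=len(interp) // 2)
```

Because the test signal is periodic, several window positions match the tail equally well; any of them is a valid similar region, and restricting `search_end` to the first half leaves interpolation data available after the match.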

いずれにせよ、サーチ結果の「似た部分」を、図５では類似領域２３１として表している。補間処理の基本的な考え方は、この類似領域２３１がオーディオバッファ２２１の未送信データ（の末尾）と似ていることから、類似領域２３１に続く補間用データも、未送信データに続くべき音データと似ていると推定し、類似領域２３１に続く補間用データを、未送信データに続ける音データとしてオーディオバッファ２２１に書き込む、というものである。 In any case, the "similar part" found by the search is represented as the similar region 231 in FIG. 5. The basic idea of the interpolation process is as follows: since the similar region 231 resembles (the end of) the untransmitted data in the audio buffer 221, the interpolation data following the similar region 231 is presumed to resemble the sound data that should follow the untransmitted data, and so the interpolation data following the similar region 231 is written to the audio buffer 221 as the sound data that continues the untransmitted data.

そして、類似領域２３１が特定されると、補間部２１４は、図５（ｂ）に示すように、補間バッファ２２２に格納されている補間用データのうち、類似領域２３１以降の部分（類似領域２３１自体も含む）を、まず一時バッファ２２４にコピーする。一時バッファ２２４のサイズは、補間バッファ２２２と同じにするとよい。また、類似領域２３１より後ろの部分が、未送信データに続けるべき箇所の音データであり、類似領域２３１自体は、未送信データと補間用データとの接続を滑らかに行うべく、未送信データとクロスフェードさせるために用いる音データである。 Then, when the similar region 231 is specified, as shown in FIG. 5B, the interpolation unit 214 includes the portion of the interpolation data stored in the interpolation buffer 222 after the similar region 231 (similar region 231). (Including itself) is first copied to the temporary buffer 224. The size of the temporary buffer 224 may be the same as that of the interpolation buffer 222. Further, the portion after the similar area 231 is the sound data of the portion to be continued to the untransmitted data, and the similar area 231 itself is the untransmitted data in order to smoothly connect the untransmitted data and the interpolation data. This is sound data used for crossfading.

図５（ｂ）の後、補間部２１４は、図５（ｃ）に示すように、一時バッファ２２４の先頭にある類似領域２３１の音データと、オーディオバッファ２２１の未送信データとをクロスフェードさせた音データを生成し、オーディオバッファ２２１の未送信データをその生成した音データに置き換える（上書きする）。そして、その直後（時系列で次以降のサンプルを書き込むべき領域）に、一時バッファ２２４中の、類似領域２３１の直後の所定サンプル数の音データをコピーする。このことにより、オーディオバッファ２２１には、未送信データの末尾と補間用データとが滑らかに繋がった音データが格納されることになる。コピーするサンプル数は、ここでは２５６（＝Ｂ３）サンプルとするが、この値には限られないし、一定であることにも限られない。また、一時バッファ２２４においては、オーディオバッファ２２１の場合と同様、音データを出力（コピー）した場合に、その分だけ読み出しポインタを後ろにずらし、出力した音データは、バッファ内に存在しないものとして取り扱う。 After FIG. 5B, the interpolation unit 214 generates sound data by crossfading the sound data of the similar region 231 at the head of the temporary buffer 224 with the untransmitted data in the audio buffer 221, as shown in FIG. 5C, and replaces (overwrites) the untransmitted data in the audio buffer 221 with the generated sound data. Immediately after that (in the area where the next and subsequent samples in chronological order should be written), it copies a predetermined number of samples of sound data that immediately follow the similar region 231 in the temporary buffer 224. As a result, the audio buffer 221 stores sound data in which the end of the untransmitted data and the interpolation data are smoothly connected. The number of samples to be copied is 256 (= B3) here, but it is not limited to this value, nor does it need to be constant. Further, in the temporary buffer 224, as in the case of the audio buffer 221, when sound data is output (copied), the read pointer is advanced by that amount, and the output sound data is treated as no longer existing in the buffer.
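The crossfade-and-copy step of FIG. 5 (c) might look like the following sketch. The text does not specify the fade curve, so the linear weighting here is an assumption, as are the function names `crossfade` and `splice` and the use of plain lists in place of pointer-managed buffers.

```python
def crossfade(old, new):
    """Linear crossfade: the weight shifts from `old` to `new` across the span."""
    n = len(old)
    if len(new) != n:
        raise ValueError("spans must have equal length")
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)          # weight of the interpolation data
        out.append((1.0 - w) * old[i] + w * new[i])
    return out


def splice(untransmitted, temp, region_len, copy_len):
    """Blend temp[:region_len] into the tail of the untransmitted data, then
    append the next copy_len samples.

    Returns (new_untransmitted, remaining_temp); the consumed samples are
    dropped from temp, mirroring the read-pointer advance in the text.
    """
    if not 0 < region_len <= len(untransmitted):
        raise ValueError("region_len out of range")
    head = untransmitted[:-region_len]
    blended = crossfade(untransmitted[-region_len:], temp[:region_len])
    appended = temp[region_len:region_len + copy_len]
    return head + blended + appended, temp[region_len + copy_len:]


audio = [1.0] * 4                       # untransmitted data (toy values)
temp = [0.0] * 4 + [0.5] * 6            # similar region followed by later data
audio, temp = splice(audio, temp, region_len=4, copy_len=3)
```

In the embodiment `copy_len` would be B3 = 256; the small values here only keep the example readable.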

ここまでで、一度の補間処理が完了する。なお、図５（ｃ）のクロスフェードとコピーに当たっては、補間用データの振幅（レベル）を調整することが望ましいが、この点については図１０を用いて後述する。
なお、図５（ｃ）〜（ｅ）では、元々の由来を分かりやすくするために、補間処理によりオーディオバッファ２２１に書き込まれた音データに補間用データと同じハッチングを付している。しかし、補間処理によりオーディオバッファ２２１に書き込まれた音データは、以後、元々オーディオバッファ２２１に格納されていた音データと区別せずに、一連の未送信データとして取り扱われる。 Up to this point, one interpolation process is completed. It is desirable to adjust the amplitude (level) of the interpolation data when cross-fading and copying in FIG. 5 (c), and this point will be described later with reference to FIG.
In FIGS. 5 (c) to 5 (e), the sound data written in the audio buffer 221 by the interpolation process is hatched in the same manner as the interpolation data in order to make it easier to understand the original origin. However, the sound data written in the audio buffer 221 by the interpolation process is subsequently treated as a series of untransmitted data without distinguishing it from the sound data originally stored in the audio buffer 221.

また、図５（ｃ）の後もパケットが到着せず、再度オーディオバッファ２２１に十分な量の未送信データが格納されなくなった状態を、図５（ｄ）に示した。保存部２１２は、このことを検出すると、再度補間部２１４へ補間の実行を指示する。
補間部２１４は、この指示に応じて補間処理を実行する。そしてこのときには、図５（ｅ）に示すように、一時バッファ２２４内に、まだオーディオバッファ２２１にコピーしていない未使用の補間用データが、十分な量（コピー１回分の２５６サンプル以上）残っている。 Further, FIG. 5 (d) shows a state in which the packet did not arrive even after FIG. 5 (c) and a sufficient amount of untransmitted data was not stored in the audio buffer 221 again. When the storage unit 212 detects this, it instructs the interpolation unit 214 to execute the interpolation again.
The interpolation unit 214 executes the interpolation process in response to this instruction. At this time, as shown in FIG. 5E, a sufficient amount (256 samples or more, enough for one copy) of unused interpolation data that has not yet been copied to the audio buffer 221 remains in the temporary buffer 224.

従って、補間部２１４は、図５（ａ）のような類似領域２３１のサーチを行うことなく、図５（ｅ）に示すように、一時バッファ２２４に格納されている未使用の補間用データの、先頭から所定サンプルを、オーディオバッファ２２１の、未送信データの直後にコピーする。このときには、未送信データの末尾は、今回コピーしようとする補間用データと元々繋がっていたデータであるので、クロスフェードを行わずに繋げても滑らかにつながり、出力音に違和感が生じる可能性は低いと考えられる。 Therefore, as shown in FIG. 5 (e), the interpolation unit 214 copies a predetermined number of samples from the beginning of the unused interpolation data stored in the temporary buffer 224 to the position immediately after the untransmitted data in the audio buffer 221, without searching for the similar region 231 as in FIG. 5 (a). At this time, the end of the untransmitted data is data that was originally contiguous with the interpolation data about to be copied, so the two connect smoothly even without crossfading, and the output sound is considered unlikely to feel unnatural.

以上のように、類似領域２３１をサーチした際に、サーチ結果に基づき補間用データをなるべく多く一時バッファ２２４に記憶させておけば、一時バッファ２２４に十分な量の補間用データが格納されている間は、補間処理においてサーチを省略しても、十分な品質の補間を行うことができる。補間処理においては類似領域２３１のサーチが負荷の大きい処理であるので、これを省略できれば、負荷軽減の効果は大きい。
この実施形態では、補間バッファ２２２のサイズはオーディオバッファ２２１の２倍であるが、上述のように補間バッファ２２２のサイズ上限の制約は小さいため、さらに大きいサイズの補間バッファ２２２を用いてもよい。大きなサイズの補間バッファ２２２を用いる場合には、一時バッファ２２４にコピーできる補間用データのサイズもその分大きくなることが期待でき、負荷軽減の効果は一層大きくなる。 As described above, when the similar area 231 is searched, if as much interpolation data as possible is stored in the temporary buffer 224 based on the search result, a sufficient amount of interpolation data is stored in the temporary buffer 224. In the meantime, even if the search is omitted in the interpolation process, sufficient quality interpolation can be performed. In the interpolation process, the search for the similar region 231 is a process with a large load, so if this can be omitted, the effect of reducing the load is great.
In this embodiment, the size of the interpolation buffer 222 is twice that of the audio buffer 221. However, since there is little constraint on the upper limit of the size of the interpolation buffer 222, as described above, a larger interpolation buffer 222 may be used. When a large interpolation buffer 222 is used, the amount of interpolation data that can be copied to the temporary buffer 224 can be expected to increase accordingly, and the effect of load reduction is further enhanced.

次に図６に、補間バッファ２２２のバックアップ及びクリア動作の流れを示す。
ここで、補間バッファ２２２に格納される補間用データは、補間後の音データにノイズが混じらないよう、連続性（途中に欠落がないこと）が保証された音データであることが求められる。音データ処理部２００に到着するパケットに欠落がない限りは、各パケットの音データを到着順に補間バッファ２２２に書き込んでいくことでこの連続性は保証される。 Next, FIG. 6 shows the flow of backup and clear operations of the interpolation buffer 222.
Here, the interpolation data stored in the interpolation buffer 222 is required to be sound data whose continuity (no omission in the middle) is guaranteed so that noise is not mixed in the sound data after interpolation. As long as there is no omission in the packets arriving at the sound data processing unit 200, this continuity is guaranteed by writing the sound data of each packet to the interpolation buffer 222 in the order of arrival.

図６（ａ）は、この連続性が保証された状態の補間用データを示し、その末尾は、第ｎパケット由来の音データである。
この状態で、保存部２１２が次に第（ｎ＋２）パケットを受け取った場合を考える。このことは、第（ｎ＋１）パケットが（後で到着する可能性はあるが）欠落したことを意味するものである。そして、この第（ｎ＋２）パケットを補間用バッファ２２２に続けて書き込んでしまうと、補間用データの連続性が保証されなくなってしまう。そこで、保存部２１２は、パケットの欠落を検出すると、バッファ管理部２１５にこれを通知する。 FIG. 6A shows interpolation data in a state where this continuity is guaranteed, and the end thereof is sound data derived from the nth packet.
In this state, consider the case where the storage unit 212 next receives the (n + 2)th packet. This means that the (n + 1)th packet is missing (although it may arrive later). If this (n + 2)th packet were written to the interpolation buffer 222 as a continuation, the continuity of the interpolation data would no longer be guaranteed. Therefore, when the storage unit 212 detects a missing packet, it notifies the buffer management unit 215.

そして、この通知を受けたバッファ管理部２１５は、まず図６（ｂ）に示すように、連続性が保証された状態の補間用データを、補間バッファ２２２からバックアップバッファ２２３にコピーする。バックアップバッファ２２３の補間用データもなるべく新しいものの方がよいので、このコピーは上書きコピーでよい。バックアップバッファ２２３は補間バッファ２２２と同サイズである。また、バッファ管理部２１５はその後、補間バッファ２２２をクリアする。 Then, upon receiving this notification, the buffer management unit 215 first copies the interpolation data in a state where continuity is guaranteed from the interpolation buffer 222 to the backup buffer 223, as shown in FIG. 6B. Since the interpolation data of the backup buffer 223 should be as new as possible, this copy may be an overwrite copy. The backup buffer 223 has the same size as the interpolation buffer 222. Further, the buffer management unit 215 then clears the interpolation buffer 222.

これらの処理が完了すると、バッファ管理部２１５はその旨を保存部２１２に通知し、保存部２１２は、この通知を受けた後で、第（ｎ＋２）パケットの音データを補間バッファ２２２に書き込む。補間バッファ２２２はクリアされているから、このとき書き込んだ音データが、この時点での補間用データの先頭となる。 When these processes are completed, the buffer management unit 215 notifies the storage unit 212 to that effect, and the storage unit 212 writes the sound data of the (n + 2) packet to the interpolation buffer 222 after receiving this notification. Since the interpolation buffer 222 is cleared, the sound data written at this time becomes the head of the interpolation data at this point.
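The backup-and-clear sequence of FIG. 6 can be sketched as below. This is an illustration under assumptions: packets are taken to carry a sequence number (the text only speaks of the nth and (n + 2)th packets), and the class name `InterpolationStore` and its attributes are hypothetical.

```python
class InterpolationStore:
    """Keeps only contiguous data in the interpolation buffer."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.interp = []       # interpolation buffer 222
        self.backup = []       # backup buffer 223
        self.last_seq = None   # sequence number of the last stored packet

    def on_packet(self, seq, samples):
        """Store a packet; returns True if a gap was detected before it."""
        gap = self.last_seq is not None and seq != self.last_seq + 1
        if gap:
            # Save the last contiguous run (overwriting the old backup),
            # then clear the interpolation buffer.
            self.backup = list(self.interp)
            self.interp = []
        self.interp.extend(samples)
        # Drop the oldest samples once the capacity is exceeded.
        del self.interp[:max(0, len(self.interp) - self.capacity)]
        self.last_seq = seq
        return gap


store = InterpolationStore(capacity=8)
store.on_packet(1, [1, 1, 1])
store.on_packet(2, [2, 2, 2])
lost = store.on_packet(4, [4, 4, 4])   # packet 3 never arrived
```

After the gap, the interpolation buffer starts afresh with packet 4, while the backup holds the contiguous data from packets 1 and 2.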

以上のように、パケットの欠落が生じた時点で補間バッファ２２２を一旦クリアしてしまえば、簡単な処理で、補間用データの連続性を保証しつつ、補間用データとして、直近の音データを用いることができる。しかし、図６（ｃ）に示すような、補間バッファ２２２のクリア直後の状態で補間処理を行う必要が生じると、十分な長さの補間用データがなく、適切な補間処理を行うことができない可能性がある。 As described above, once the interpolation buffer 222 is cleared at the point when a packet is lost, the most recent sound data can be used as the interpolation data by a simple process while the continuity of the interpolation data is guaranteed. However, if interpolation processing becomes necessary in the state immediately after the interpolation buffer 222 has been cleared, as shown in FIG. 6C, there is no interpolation data of sufficient length, and appropriate interpolation processing may not be possible.

バックアップバッファ２２３は、このような事態を防止するために設けたものである。すなわち、補間部２１４は、補間処理に際して、図５（ａ）のサーチで補間バッファ２２２内に未送信データと似たデータを発見できない場合には、図６（ｄ）に示すように、同様なサーチをバックアップバッファ２２３に格納された補間用データに対して行う。バックアップバッファ２２３のデータは、補間バッファ２２２のデータに比べれば少々古いものの、少し前に実際に受信した音データであり、バックアップバッファ２２３のデータを用いても、十分に信頼性の高い補間処理を行うことができる。
すなわち、バックアップバッファ２２３を設けることにより、補間用データの連続性保証と、常に補間処理が可能な状態とを、低い処理負荷で両立させることができる。 The backup buffer 223 is provided to prevent such a situation. That is, when the interpolation unit 214 cannot find data similar to the untransmitted data in the interpolation buffer 222 by the search of FIG. 5A during the interpolation processing, the interpolation unit 214 is similar as shown in FIG. 6D. The search is performed on the interpolation data stored in the backup buffer 223. Although the data in the backup buffer 223 is a little older than the data in the interpolation buffer 222, it is the sound data actually received a while ago, and even if the data in the backup buffer 223 is used, the interpolation processing with sufficiently high reliability can be performed. It can be carried out.
That is, by providing the backup buffer 223, it is possible to achieve both the guarantee of continuity of the interpolation data and the state in which the interpolation processing can always be performed with a low processing load.

なお、補間バッファ２２２に十分な量の補間用データがある場合であっても、未送信データとの類似度が十分高い領域を発見できなかった場合には、バックアップバッファ２２３をサーチするようにしてもよい。また、バックアップバッファ２２３を複数（ｎ個）設けて、直前ｎ回の補間バッファ２２２のクリア時の補間用データをそれぞれ保持しておき、新しい方から順に補間処理時のサーチをトライするようにすることも考えられる。 Even if the interpolation buffer 222 has a sufficient amount of interpolation data, if a region having a sufficiently high degree of similarity to the untransmitted data cannot be found, the backup buffer 223 is searched. May be good. Further, a plurality (n) backup buffers 223 are provided to hold the interpolation data at the time of clearing the interpolation buffer 222 immediately before n times, and the search at the time of interpolation processing is tried in order from the newest one. It is also possible.

次に、図７乃至図１２を用いて、以上の音データ処理装置１００においてＣＰＵ１０１が実行する、音データの入出力及び補間に関連する処理について説明する。なお、これらのフローチャートに示す処理は、音データ処理部２００の機能と対応するものであり、ＣＰＵ１０１が所要のプログラムを実行することにより行うものであるが、その一部又は全部を処理回路により実現することも妨げられない。 Next, with reference to FIGS. 7 to 12, the processing related to input / output and interpolation of sound data executed by the CPU 101 in the sound data processing device 100 described above will be explained. The processing shown in these flowcharts corresponds to the functions of the sound data processing unit 200 and is performed by the CPU 101 executing the required program, but realizing part or all of it with a processing circuit is not precluded.

まず図７に、メイン処理のフローチャートを示す。
ＣＰＵ１０１は、音データ処理部２００の機能の起動時に、図７のフローチャートに示すメイン処理を開始し、以後、音データ処理部２００の機能が有効である間はこの処理の実行を続ける。
図７の処理において、ＣＰＵ１０１はまず初期処理を実行する（Ｓ１１）。この処理は、ネットワークドライバ１２１と音データ処理部２００とを接続して音データの取得に係る通信機能を有効にする処理、オーディオドライバ１２２と音データ処理部２００とを接続して音データの出力機能を有効にする処理、各バッファのサイズ設定処理等を含む。 First, FIG. 7 shows a flowchart of the main process.
The CPU 101 starts the main process shown in the flowchart of FIG. 7 when the function of the sound data processing unit 200 is activated, and continues to execute this process while the function of the sound data processing unit 200 is effective thereafter.
In the process of FIG. 7, the CPU 101 first executes the initial process (S11). This process is a process of connecting the network driver 121 and the sound data processing unit 200 to enable the communication function related to sound data acquisition, and connecting the audio driver 122 and the sound data processing unit 200 to output sound data. Includes processing to enable functions, processing to set the size of each buffer, and so on.

以後、ＣＰＵ１０１は、音データを含むパケットが到着したことに応じて図１１，図１２に示すパケット到着時の処理を実行し（Ｓ１２，Ｓ１３）、オーディオドライバ１２２から音データの要求があったことに応じて図８，図９に示す音データ要求時の処理を実行する（Ｓ１４，Ｓ１５）。 After that, the CPU 101 executes the processing at the time of packet arrival shown in FIGS. 11 and 12 in response to the arrival of the packet containing the sound data (S12, S13), and the audio driver 122 requests the sound data. The processing at the time of requesting the sound data shown in FIGS. 8 and 9 is executed according to the above (S14, S15).

次に、図８及び図９に、図７のステップＳ１５で実行する音データ要求時の処理のフローチャートを示す。
この処理において、ＣＰＵ１０１はまず、オーディオドライバ１２２へ送信するＢ２サンプルの未送信データがオーディオバッファ２２１に格納されているか否か判断する（Ｓ２１）。通常状態ではこの判断はＹｅｓになるが、この場合には、ＣＰＵ１０１は、オーディオバッファ２２１の先頭からＢ２サンプルの未送信データを読み出してオーディオドライバ１２２に渡し（Ｓ２２）、読み出したデータをオーディオバッファ２２１から削除して（Ｓ２３）、元の処理に戻る。また、ステップＳ２３の削除は、上述したように、読み出しポインタの移動により実質的に行うことができる。これらの処理は、図４の出力側の動作と対応するものである。 Next, FIGS. 8 and 9 show a flowchart of processing at the time of requesting sound data executed in step S15 of FIG.
In this process, the CPU 101 first determines whether or not the untransmitted data of the B2 sample to be transmitted to the audio driver 122 is stored in the audio buffer 221 (S21). In the normal state, this determination is Yes, but in this case, the CPU 101 reads the untransmitted data of the B2 sample from the beginning of the audio buffer 221 and passes it to the audio driver 122 (S22), and the read data is passed to the audio buffer 221. Delete from (S23) and return to the original process. Further, as described above, the deletion in step S23 can be substantially performed by moving the read pointer. These processes correspond to the operations on the output side of FIG.

一方、ステップＳ２１でＮｏの場合には、補間が必要であることがわかる。そこで、ＣＰＵ１０１は、Ｂ３サンプルの未使用データが一時バッファ２２４に格納されているか否か判断する（Ｓ２４）。ここでＹｅｓであれば、図５（ａ）に示したサーチを行う必要はないので、ＣＰＵ１０１は、一時バッファ２２４の先頭からＢ３サンプルを、オーディオバッファ２２１の最新のデータの直後にコピーし（Ｓ２５）、コピーしたデータを一時バッファ２２４から削除する（Ｓ２６）。以上で補間処理が完了し、オーディオドライバ１２２に対して音データを送信できる状態になるので、ＣＰＵ１０１はステップＳ２２以降の処理を実行する。 On the other hand, if No in step S21, it can be seen that interpolation is required. Therefore, the CPU 101 determines whether or not the unused data of the B3 sample is stored in the temporary buffer 224 (S24). If Yes, the search shown in FIG. 5A does not need to be performed, so the CPU 101 copies the B3 sample from the beginning of the temporary buffer 224 immediately after the latest data in the audio buffer 221 (S25). ), The copied data is deleted from the temporary buffer 224 (S26). Since the interpolation process is completed and the sound data can be transmitted to the audio driver 122, the CPU 101 executes the processes after step S22.

また、ステップＳ２４でＮｏであれば、ＣＰＵ１０１は、図５（ａ）に示したサーチを行う。すなわち、補間バッファ２２２に格納されている、古い方から所定範囲のサンプルの中で、オーディオバッファ２２１中の未送信データと似た部分をサーチする（Ｓ２７）。ここで適当な部分がみつからなければ（Ｓ２８のＮｏ）、バックアップバッファ２２３に格納されているデータに対しても同様なサーチを行う（Ｓ２９）。 If No in step S24, the CPU 101 performs the search shown in FIG. 5 (a). That is, in the sample in the predetermined range from the oldest stored in the interpolation buffer 222, a portion similar to the untransmitted data in the audio buffer 221 is searched (S27). If an appropriate part is not found here (No in S28), the same search is performed for the data stored in the backup buffer 223 (S29).

これらのいずれでも適当な部分がみつからない場合（Ｓ３０のＮｏ）、補間を行うことはできないため、ＣＰＵ１０１は、エラー処理として、オーディオバッファ２２１中の未送信データをフェードアウトさせるように加工して（Ｓ３１）、ステップＳ２２に進む。ステップＳ３１の処理は、未送信データの末尾で音が急に途切れてノイズが発生することを防止するためのものであり、このケースでは、フェードアウト後次のパケットが到着するまでは、ステップＳ２２で無音の音データをオーディオドライバ１２２に渡すことになる。 If no suitable part is found in any of these (No in S30), interpolation cannot be performed. Therefore, as error processing, the CPU 101 processes the untransmitted data in the audio buffer 221 so as to fade out (S31). ), Proceed to step S22. The process of step S31 is for preventing the sound from being suddenly interrupted at the end of the untransmitted data and causing noise. In this case, in step S22 until the next packet arrives after fading out. The silent sound data will be passed to the audio driver 122.
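The error handling of step S31 can be sketched as follows. The text only says that the untransmitted data is processed to fade out; the linear gain ramp and the padding with silence up to the request length are assumptions made for this illustration, and `fade_out_and_pad` is a hypothetical name.

```python
def fade_out_and_pad(untransmitted, request_len=128):
    """Fade the remaining samples linearly to zero, then pad with silence
    so that request_len samples can still be handed to the driver."""
    n = len(untransmitted)
    faded = [s * (n - 1 - i) / max(n - 1, 1) for i, s in enumerate(untransmitted)]
    return faded + [0.0] * max(0, request_len - n)


out = fade_out_and_pad([1.0] * 5, request_len=8)
```

The gain falls from 1 at the first remaining sample to 0 at the last, so the signal reaches silence without an abrupt step.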

また、ステップＳ２７又はＳ２９のサーチで適切な部分（類似領域２３１）がみつかった場合（Ｓ２８又はＳ３０のＹｅｓ）、処理は図９のステップＳ３２に進む。なお、適切な部分とは、未送信データとの類似度（相関）が十分に高く、かつ後続に適切な長さの（少なくともＢ３サンプルの）補間用データが存在する部分である。
図９の処理において、ＣＰＵ１０１はまず、ステップＳ２７又はＳ２９で発見した類似領域２３１以降の補間用データを、一時バッファ２２４にコピーする（Ｓ３２）。この処理は、図５（ｂ）と対応し、コピー元は補間バッファ２２２の場合とバックアップバッファ２２３の場合とがある。 If an appropriate portion (similar region 231) is found in the search in step S27 or S29 (Yes in S28 or S30), the process proceeds to step S32 in FIG. The appropriate portion is a portion in which the degree of similarity (correlation) with the untransmitted data is sufficiently high and the interpolation data of an appropriate length (at least of the B3 sample) is present thereafter.
In the process of FIG. 9, the CPU 101 first copies the interpolation data after the similar region 231 found in step S27 or S29 to the temporary buffer 224 (S32). This process corresponds to FIG. 5B, and the copy source may be the interpolation buffer 222 or the backup buffer 223.

次に、ＣＰＵ１０１は、図１０に示す振幅調整処理を実行する（Ｓ３３）。この処理については後述する。
その後、ＣＰＵ１０１は、一時バッファ２２４の先頭にある類似領域２３１のデータを、オーディオバッファ２２１の未送信データとクロスフェードさせ（Ｓ３４）、一時バッファ２２４の類似領域２３１の後ろのＢ３サンプル分の音データを、オーディオバッファ２２１のクロスフェード済みデータの続きの領域にコピーする（Ｓ３５）。さらに、ここでクロスフェード又はコピーしたサンプルのデータを、一時バッファ２２４から削除する（Ｓ３６）。この削除も、上述したように、読み出しポインタの移動により実質的に行うことができる。以上のステップＳ３４乃至Ｓ３６の処理は、図５（ｃ）と対応する。 Next, the CPU 101 executes the amplitude adjustment process shown in FIG. 10 (S33). This process will be described later.
After that, the CPU 101 crossfades the data in the similar area 231 at the head of the temporary buffer 224 with the untransmitted data in the audio buffer 221 (S34), and the sound data for the B3 sample behind the similar area 231 in the temporary buffer 224. Is copied to the area following the crossfaded data in the audio buffer 221 (S35). Further, the data of the sample crossfaded or copied here is deleted from the temporary buffer 224 (S36). As described above, this deletion can also be substantially performed by moving the read pointer. The above steps S34 to S36 correspond to FIG. 5 (c).

以上でステップＳ２４がＮｏの場合の補間処理が完了し、オーディオドライバ１２２に対して音データを送信できる状態になるので、ＣＰＵ１０１は次にステップＳ２２以降の処理を実行する。
以上の処理により、オーディオバッファ２２１中の未送信データをオーディオドライバ１２２へ出力する出力手順の処理と、オーディオバッファ２２１中の未送信データが不足する場合の補間処理に係る補間手順の処理とを実行することができる。 With the above, the interpolation process when step S24 is No is completed, and the sound data can be transmitted to the audio driver 122. Therefore, the CPU 101 next executes the processes after step S22.
By the above processing, the processing of the output procedure for outputting the untransmitted data in the audio buffer 221 to the audio driver 122 and the processing of the interpolation procedure related to the interpolation processing when the untransmitted data in the audio buffer 221 is insufficient are executed. can do.
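The branching of FIGS. 8 and 9 can be summarized as a small decision function. This sketch only names which branch would be taken; the function name `choose_action`, its parameters, and the returned labels are hypothetical, and the search results are passed in as booleans instead of being computed.

```python
def choose_action(untransmitted, temp_unused, found_in_interp, found_in_backup,
                  b2=128, b3=256):
    """Return the branch of the sound data request process that applies.

    untransmitted: samples of untransmitted data in the audio buffer
    temp_unused:   samples of unused data in the temporary buffer
    """
    if untransmitted >= b2:
        return "output"               # S21 Yes: S22, S23
    if temp_unused >= b3:
        return "copy_from_temp"       # S24 Yes: S25, S26, then S22
    if found_in_interp:
        return "splice_from_interp"   # S27, S28 Yes: S32 to S36, then S22
    if found_in_backup:
        return "splice_from_backup"   # S29, S30 Yes: S32 to S36, then S22
    return "fade_out"                 # S30 No: S31, then S22


action = choose_action(untransmitted=64, temp_unused=300,
                       found_in_interp=False, found_in_backup=False)
```

The ordering makes the cost structure visible: the search, which is the expensive step, is reached only when both the audio buffer and the temporary buffer are short of data.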

なお、オーディオバッファ２２１中の未送信データの不足は、音データの要求をトリガに判定あるいは検出する必要はない。その他のタイミングでも随時監視し、不足を検出した場合に、ステップＳ２５及びＳ２６あるいはステップＳ２７乃至Ｓ３６の補間処理を実行してもよい。補間処理に割けるリソースが少ない場合には、オーディオドライバ１２２からの音データの要求がある前に補間処理を開始し、処理時間を十分確保することも有効である。 It is not necessary to determine or detect the shortage of untransmitted data in the audio buffer 221 by using the request for sound data as a trigger. It may be monitored at any time at other timings, and when a shortage is detected, the interpolation processing of steps S25 and S26 or steps S27 to S36 may be executed. When the resources devoted to the interpolation processing are small, it is also effective to start the interpolation processing before the sound data is requested from the audio driver 122 to secure a sufficient processing time.

次に、図１０に、図９のステップＳ３３で実行する振幅調整処理のフローチャートを示す。
図１０の処理において、ＣＰＵ１０１はまず、オーディオバッファ２２１中の未送信データの最大振幅と、ステップＳ３２でコピーされた一時バッファ２２４中の類似領域２３１の音データの最大振幅とを求める（Ｓ５１，Ｓ５２）。対象範囲内に、振幅として信頼性のある値を求められる程度のサンプル数がない場合には、サンプル値の絶対値の最大値を、最大振幅として採用してもよい。 Next, FIG. 10 shows a flowchart of the amplitude adjustment process executed in step S33 of FIG.
In the process of FIG. 10, the CPU 101 first obtains the maximum amplitude of the untransmitted data in the audio buffer 221 and the maximum amplitude of the sound data in the similar region 231 in the temporary buffer 224 copied in step S32 (S51, S52). ). If the number of samples is not sufficient to obtain a reliable value for the amplitude within the target range, the maximum absolute value of the sample value may be adopted as the maximum amplitude.

次に、ＣＰＵ１０１は、ステップＳ５２で求めた類似領域２３１の音データの最大振幅の方が大きい場合に（Ｓ５３のＹｅｓ）、ステップＳ５１，Ｓ５２で求めた２つの最大振幅の比だけ、一時バッファ２２４に格納された補間用データ全体の振幅を下げる（Ｓ５４）。すなわち、補間用データの振幅を、未送信データの振幅に合わせて調整する。未送信データの最大振幅の方が大きい場合には（Ｓ５３のＮｏ）、振幅の調整は行わない。
以上の後、元の処理に戻る。 Next, when the maximum amplitude of the sound data in the similar region 231 obtained in step S52 is the larger of the two (Yes in S53), the CPU 101 lowers the amplitude of the entire interpolation data stored in the temporary buffer 224 by the ratio of the two maximum amplitudes obtained in steps S51 and S52 (S54). That is, the amplitude of the interpolation data is adjusted to match the amplitude of the untransmitted data. If the maximum amplitude of the untransmitted data is the larger (No in S53), the amplitude is not adjusted.
After the above, the process returns to the original process.
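Steps S51 to S54 can be sketched as follows. For simplicity the maximum absolute sample value is used as the maximum amplitude, which the text mentions as an acceptable substitute; the function name `adjust_amplitude` and the list-based buffers are hypothetical, and both inputs are assumed to be non-empty.

```python
def adjust_amplitude(untransmitted, temp, region_len):
    """Scale the whole temporary buffer down if its similar region is louder
    than the untransmitted data; otherwise return it unchanged."""
    peak_audio = max(abs(s) for s in untransmitted)          # S51
    peak_region = max(abs(s) for s in temp[:region_len])     # S52
    if peak_region > peak_audio:                             # S53
        gain = peak_audio / peak_region
        return [s * gain for s in temp]                      # S54
    return list(temp)


quieter = adjust_amplitude([0.5, -0.25], [1.0, -0.5, 0.8], region_len=2)
unchanged = adjust_amplitude([1.0], [0.5, 0.2], region_len=1)
```

Note that the gain is applied to the entire temporary buffer, not just the similar region, so later copies made without a new search stay at the adjusted level.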

With this amplitude adjustment, the sound of the untransmitted data and the sound of the interpolation data can be made to connect more smoothly to the ear. That is, the unnaturalness that the output sound gives the listener at the interpolated location can be reduced.
This effect can be obtained to some extent even without the determination in step S53, that is, even if the amplitude is adjusted regardless of which of the untransmitted data and the sound data of the similar region 231 has the larger maximum amplitude. However, with decaying sounds, which are common in music, a point where the volume rises partway through stands out to the human ear, whereas a drop to a lower volume partway through does not sound particularly unnatural. For this reason, lowering the amplitude of the interpolation data only when the sound data of the similar region 231 has the larger maximum amplitude gives a result that sounds less unnatural to the listener.

Next, FIGS. 11 and 12 show a flowchart of the packet arrival process executed in step S13 of FIG. 7. This process is described with reference also to FIGS. 13 and 14.
In this process, the CPU 101 first determines whether either a packet continuous with the previous packet or the first packet has arrived. In either case (Yes in S61), it determines that interpolation processing is unnecessary. This is the case where, as shown in FIG. 13(a), the data of the (n+1)-th packet can be written to the audio buffer 221 immediately after the data of the n-th packet. For the first packet, the sound data can be written to the head of the audio buffer 221 without interpolation processing.

In the normal state, step S61 should result in Yes. In this case, the CPU 101 writes the sound data contained in the arrived packet immediately after the latest sound data in the audio buffer 221 (S62). It also deletes one packet's worth of sound data from the interpolation buffer 222, oldest first (S63), writes the sound data contained in the arrived packet immediately after the latest sound data in the interpolation buffer 222 (S64), and returns to the original process.
The data written in step S62 and the data written in step S64 are the same. As noted in the description of FIG. 4, a configuration is also possible in which the write in step S64 carries out step S63 at the same time. The processing of steps S62 to S64 corresponds to the write-side operation of FIG. 4.
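The normal-path writes of steps S62 to S64 can be illustrated with a short sketch. The class `ReceiveBuffers` and its attribute names are hypothetical, and a bounded `collections.deque` is assumed as the interpolation buffer so that a single write also discards the oldest samples once the buffer is full, which is one way to realize the combined S63/S64 configuration mentioned above.

```python
from collections import deque


class ReceiveBuffers:
    """Hypothetical stand-ins for audio buffer 221 and interpolation buffer 222."""

    def __init__(self, interp_capacity):
        self.audio = []                               # audio buffer 221 (unbounded here)
        self.interp = deque(maxlen=interp_capacity)   # interpolation buffer 222

    def write_consecutive_packet(self, samples):
        # S62: append right after the latest sound data in the audio buffer.
        self.audio.extend(samples)
        # S63 + S64 in one operation: once the deque is full, extending it
        # drops as many of the oldest samples as are newly written.
        self.interp.extend(samples)
```

In this sketch nothing is deleted while the interpolation buffer is still filling up; a fixed-size ring buffer that always overwrites the oldest packet slot would behave the same once full.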

On the other hand, if the result of step S61 is No, for example because a packet has been lost, the CPU 101 examines whether interpolation processing is necessary. First, the CPU 101 determines whether the following condition is satisfied: a packet loss has occurred, and the trailing end (latest sample) of the untransmitted data in the audio buffer 221 is sound data from a previously arrived packet (S65).

The result here is Yes when, as shown in FIG. 13(b), the (n+k)-th packet (k being a natural number of 2 or more) has arrived next after the n-th packet, and no interpolation processing has been performed since the sound data of the n-th packet was written to the audio buffer 221. In this case, interpolation processing must be performed to fill the space between the sound data of the n-th packet and the sound data of the (n+k)-th packet with interpolation data.

Therefore, in this case, the CPU 101 first calculates the position in the audio buffer 221 at which the sound data of the arrived packet should properly be stored (S66). If the number of samples per packet is constant, this position is the current write pointer position shifted back by an amount determined by k and the number of samples per packet. Also, as described above, when each packet carries a time stamp, the number of samples by which the position should be shifted back from the current write pointer position can be calculated from the time indicated by the time stamp.
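The time-stamp variant of step S66 might look like the following sketch. It assumes, purely for illustration, that time stamps count samples and that the receiver tracks the time stamp it expects at the current write pointer; `storage_position`, `expected_ts`, and `packet_ts` are hypothetical names that do not appear in the embodiment.

```python
def storage_position(write_pos, expected_ts, packet_ts):
    """Position at which the arrived packet's sound data should be stored (S66).

    write_pos   -- current write pointer position, in samples
    expected_ts -- time stamp expected at the write pointer, in samples (assumed)
    packet_ts   -- time stamp carried by the arrived packet, in samples (assumed)
    """
    missing_samples = packet_ts - expected_ts
    if missing_samples < 0:
        # An older packet arrived late; this sketch does not handle that case.
        raise ValueError("packet precedes the write pointer")
    return write_pos + missing_samples
```

With a constant number of samples per packet, the same offset could instead be derived from the packet numbers.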

Next, as in steps S27 to S30 of FIG. 8, the CPU 101 searches the interpolation data stored in the interpolation buffer 222 or the backup buffer 223 for a similar region that resembles the latest B4 samples of sound data in the audio buffer 221 (S67). B4 is the number of samples in the range to be crossfaded in the next step S68, and may be set as appropriate with search accuracy also taken into account.
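One way to picture the search of step S67 is a sliding-window comparison. The sketch below is an assumption-laden illustration: the similarity measure (sum of absolute differences) and the optional `max_error` acceptance threshold are choices made here, not details given in this passage, and `find_similar_region` is a hypothetical name.

```python
def find_similar_region(interpolation_data, tail, max_error=None):
    """Return the offset of the window most similar to `tail`, or None.

    interpolation_data -- contents of the interpolation or backup buffer
    tail               -- latest B4 samples of the audio buffer
    max_error          -- optional acceptance threshold (assumption)
    """
    b4 = len(tail)
    best_offset, best_error = None, None
    for offset in range(len(interpolation_data) - b4 + 1):
        # Sum of absolute differences between the window and the tail.
        error = sum(abs(interpolation_data[offset + i] - tail[i]) for i in range(b4))
        if best_error is None or error < best_error:
            best_offset, best_error = offset, error
    if best_offset is None:
        return None                      # buffer shorter than B4 samples
    if max_error is not None and best_error > max_error:
        return None                      # nothing similar enough was found
    return best_offset
```

A larger B4 makes the match more selective at the cost of more comparisons per offset, which is the trade-off behind choosing B4 with search accuracy in mind.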

Next, the CPU 101 writes the sound data (interpolation data) of the interpolation buffer 222 or the backup buffer 223 from the similar region found in step S67 onward into the audio buffer 221, crossfading it with the latest B4 samples of the untransmitted data stored in the audio buffer 221, in the amount needed to fill the sound data missing because of the packet loss (S68). Since this interpolation is performed only once, the temporary buffer 224 is not used. If no similar portion is found, the process of step S68 cannot be executed; if, however, the sound data following the similar portion merely contains too few samples, the write of step S68 is performed for as much data as is available.
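The write of step S68 can be sketched as a crossfade over the last B4 samples followed by a plain copy. The linear fade shape and the names `crossfade` and `fill_gap` are assumptions made for illustration; the passage does not specify the fade curve.

```python
def crossfade(old, new):
    """Linear crossfade of two equal-length sequences: old fades out, new fades in."""
    n = len(old)
    weights = [(i + 1) / (n + 1) for i in range(n)]
    return [o * (1 - w) + s * w for o, s, w in zip(old, new, weights)]


def fill_gap(audio, interpolation_data, offset, b4, gap_len):
    """Blend the similar region into the audio tail, then append filler samples.

    Returns the number of filler samples actually written, which may be
    smaller than gap_len when the interpolation data runs out.
    """
    similar = interpolation_data[offset:offset + b4]
    audio[-b4:] = crossfade(audio[-b4:], similar)
    filler = interpolation_data[offset + b4:offset + b4 + gap_len]
    audio.extend(filler)
    return len(filler)
```

The return value is what a caller would compare against the required number of samples to decide whether enough data was written.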

Then, the CPU 101 determines whether a sufficient number of samples could be written in step S68 (S69). If Yes, it judges that the interpolation processing was executed properly, and writes the sound data of the arrived packet to the position in the audio buffer 221 at which that packet's sound data should properly be stored, crossfading it with the end of the sound data written in step S68 (S70). The position at which the data should properly be stored is the position at which it would be stored had there been no packet loss. This completes the writing to the audio buffer 221.

After that, because the continuity of the interpolation data in the interpolation buffer 222 can no longer be guaranteed owing to the packet loss, the CPU 101 copies the sound data in the interpolation buffer 222 to the backup buffer 223 and clears the interpolation buffer 222 (S71). This process corresponds to FIG. 6. The interpolation processing in steps S67 and S68 was thus performed using interpolation data whose continuity could still be guaranteed.

The CPU 101 also clears the temporary buffer 224 (S72). This is because, as a result of step S70, the end of the untransmitted data is no longer interpolation data from the previous interpolation processing, so the data in the temporary buffer 224 can no longer be used for the next interpolation processing. The process then proceeds to step S63, writes to the interpolation buffer 222, and returns to the original process.
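The buffer housekeeping of steps S71 and S72 amounts to a copy and two clears. In the sketch below the three buffers are assumed to be plain Python lists and `handle_continuity_break` is a hypothetical name.

```python
def handle_continuity_break(interp, backup, temp):
    """Back up and reset the buffers after a packet loss has been handled."""
    backup[:] = interp   # S71: copy interpolation buffer 222 into backup buffer 223
    interp.clear()       # S71: continuity of its contents is no longer guaranteed
    temp.clear()         # S72: temporary buffer 224 cannot be reused next time
```

The slice assignment copies the samples, so clearing the interpolation buffer afterwards leaves the backup intact.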

On the other hand, if the result of step S69 is No, the process proceeds to step S75 in FIG. 12 to attempt further interpolation processing.
If the result of step S65 is No, the process proceeds to step S73 in FIG. 12. Here, the CPU 101 determines whether the following condition is satisfied: a packet loss has occurred, and the trailing end of the untransmitted data in the audio buffer 221 is interpolation data written by earlier interpolation processing (S73). If the address range to which interpolation data was written is recorded during interpolation processing, the determination in step S73 can be made by referring to that record.

The result here is Yes when interpolation processing was performed after the sound data of the previously received packet was written to the audio buffer 221. In this case, interpolation processing is needed either to fill a gap in the sound data with interpolation data or, conversely, to remove surplus interpolation data.
In either case, if the result of step S73 is Yes, the CPU 101 first calculates the position at which the sound data of the arrived packet should properly be stored (S74). This process is the same as step S66.

After that, the CPU 101 performs processing according to the positional relationship between the range in which interpolation data is stored and the position to which the newly arrived (n+k)-th packet should be written (branch at S75). Four positional relationships, shown in FIGS. 14(a) to 14(d), are envisaged.
Namely: the case where there is a gap between the interpolation data, shown with dot hatching, and the storage position of the packet's sound data, as in FIG. 14(a); the case where the interpolation data and the storage position are exactly adjacent, as in FIG. 14(b); the case where the interpolation data and the storage position partially overlap, as in FIG. 14(c); and the case where the storage position is contained within the interpolation data, as in FIG. 14(d). When the process has proceeded from step S69 in FIG. 11 to step S75, the case of FIG. 14(a) is expected to apply.
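The four-way branch at S75 reduces to comparing three indices. The sketch below assumes half-open sample ranges and assumes that the storage position never starts before the interpolation data; the function name and the returned labels are hypothetical.

```python
def classify_position(interp_end, store_start, store_end):
    """Classify the storage range [store_start, store_end) against the
    interpolation data, which is assumed to end just before interp_end."""
    if store_start > interp_end:
        return "gap"         # FIG. 14(a)
    if store_start == interp_end:
        return "adjacent"    # FIG. 14(b)
    if store_end > interp_end:
        return "overlap"     # FIG. 14(c)
    return "contained"       # FIG. 14(d)
```

For example, `classify_position(100, 120, 140)` yields `"gap"`, the case expected when the flow arrives from step S69.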

To describe the processing executed in each case: when there is a gap between the interpolation data and the storage position, the CPU 101 executes the same processing as steps S67 to S70 of FIG. 11 (S76). In this case, the latest B4 samples of sound data in the audio buffer 221 are interpolation data rather than sound data derived from a packet, but the processing may be the same as steps S67 to S70. This processing writes the sound data of the newly arrived packet to the audio buffer 221 and fills the gap between it and the already stored interpolation data with further interpolation data.

When the interpolation data and the storage position are exactly adjacent, the CPU 101 rewrites the audio buffer 221 so that the trailing end of the interpolation data fades out and, immediately after it, the sound data of the newly arrived packet fades in (S77). In this case the interpolation data and the packet's sound data do not overlap anywhere, so crossfading is not possible, and a fade-out followed by a fade-in is used instead.
When the interpolation data and the storage position partially overlap, the CPU 101 writes the sound data of the newly arrived packet to the proper storage position obtained in step S74, crossfading it with the interpolation data already stored in the audio buffer 221 (S78).
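For the adjacent case (S77), the fade-out followed by a fade-in might be sketched as below. The linear fade shape, the `fade_len` parameter, and the function names are assumptions for illustration; `fade_len` is assumed to be at least 1 and no longer than either the buffered data or the packet.

```python
def fade_out(samples):
    """Linear fade whose gain falls from just under 1 to 0."""
    n = len(samples)
    return [s * (1 - (i + 1) / n) for i, s in enumerate(samples)]


def fade_in(samples):
    """Linear fade whose gain rises from just above 0 to 1."""
    n = len(samples)
    return [s * ((i + 1) / n) for i, s in enumerate(samples)]


def join_adjacent(audio, packet, fade_len):
    """Fade out the tail of the buffered data, then append the faded-in packet."""
    audio[-fade_len:] = fade_out(audio[-fade_len:])
    audio.extend(fade_in(packet[:fade_len]) + list(packet[fade_len:]))
```

Unlike a crossfade, the two signals never sound together here, so the level dips briefly to zero at the joint.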

When the storage position is contained within the interpolation data, the CPU 101 writes the sound data of the newly arrived packet to the proper storage position obtained in step S74, crossfading it with the interpolation data already stored in the audio buffer 221, and deletes the interpolation data located after the sound data of the newly arrived packet (S79). This is because the already stored interpolation data is presumed to have been interpolated on the basis of the sound data of earlier packets, and is therefore considered unsuitable as data to follow the sound data of the newly arrived packet.

After any of steps S76 to S79, the CPU 101 proceeds to step S71 of FIG. 11, as in the case of step S72 of FIG. 11. That is, it copies the interpolation buffer 222 to the backup buffer 223 and clears the interpolation buffer 222 and the temporary buffer 224 (S71, S72).
If the result of step S73 is No, a possible cause is that a past packet has arrived late; in this case, the CPU 101 performs error processing (S80) and ends the processing of FIGS. 11 and 12 without writing to the audio buffer 221 or the interpolation buffer 222.
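The overall branching of FIGS. 11 and 12 (S61, S65, S73, S80) can be summarized in a small decision function. The sketch deliberately takes the three conditions as ready-made booleans, since how each is derived from packet numbers, time stamps, or recorded address ranges is not fixed here; the function name and the returned labels are hypothetical.

```python
def select_arrival_branch(first_or_consecutive, loss_detected, tail_is_interpolation):
    """Pick the processing branch for an arrived packet."""
    if first_or_consecutive:
        return "write_normally"              # S61 Yes: S62 to S64
    if loss_detected and not tail_is_interpolation:
        return "interpolate_after_packet"    # S65 Yes: S66 to S72
    if loss_detected and tail_is_interpolation:
        return "reconcile_with_interpolation"  # S73 Yes: S74 to S79
    return "error"                           # S73 No: S80, nothing is written
```

Keeping the conditions separate from the branch selection makes it easy to swap in a different loss-detection rule without touching the flow itself.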

The above processing carries out the sound data storage procedure, which writes received sound data to the audio buffer 221 and the interpolation buffer 222 in order of reception; the buffer management procedure, which backs up and clears the interpolation buffer 222 when a packet loss is detected; and the interpolation processing that fills the locations of lost packets.

[Modifications: FIGS. 15 to 17]
This concludes the description of the embodiment, but the specific configuration of the device, the specific processing procedures, the format and number of samples of the sound data handled, the communication method, and so on are of course not limited to those described in the above embodiment.
Nor are embodiments of the present invention limited to those having all of the units shown in FIG. 3.

For example, FIG. 15 shows an example without the temporary buffer 224. In this example, in the interpolation operation of FIG. 5, the interpolation data is copied directly from the interpolation buffer 222 to the audio buffer 221 at the point of FIG. 5(b). With this configuration, the similar region 231 is searched for each time interpolation processing is performed, but with hardware of sufficient processing capacity this can be done without significant delay.

FIG. 16 shows an example without the backup buffer 223. In this example, when a packet loss is detected, the contents of the interpolation buffer 222 are cleared without being backed up. This impairs interpolation processing for a short while after the clear, but in an environment where packet loss is rare, the time during which interpolation is impaired is short and the effect on the sound output is small.

FIG. 17 shows an example with neither the backup buffer 223 nor the temporary buffer 224. In this example, both the modification described for FIG. 15 and the modification described for FIG. 16 are applied.
In addition, the amplitude adjustment process shown in FIG. 10 is not essential and may be omitted.

The above embodiment described an example with a single audio buffer 221. However, the present invention is also applicable to a device in which the packets received by the sound data processing device contain sound data for a plurality of channels, and that data is stored in and output from an audio buffer 221 provided for each channel. In this case, the interpolation buffer 222, the backup buffer 223, and the temporary buffer 224 may also be provided for each channel, with the interpolation operation performed per channel. The values of B2 and B3 in the processing of FIGS. 8 and 9 and of B4 in the processing of FIG. 11 may also differ from channel to channel.
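A per-channel arrangement could be modeled with one state object per channel. The sketch below is illustrative only: `ChannelState` and `demultiplex` are hypothetical names, the buffers are plain lists, and the packet payload is assumed to consist of frames holding one sample per channel.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelState:
    """Buffers and sample-count parameters held separately for one channel."""
    b2: int
    b3: int
    b4: int
    audio: list = field(default_factory=list)    # audio buffer 221
    interp: list = field(default_factory=list)   # interpolation buffer 222
    backup: list = field(default_factory=list)   # backup buffer 223
    temp: list = field(default_factory=list)     # temporary buffer 224


def demultiplex(frames, channels):
    """Append each frame's per-channel sample to that channel's buffers."""
    for frame in frames:
        for state, sample in zip(channels, frame):
            state.audio.append(sample)
            state.interp.append(sample)
```

Because each channel carries its own B2, B3, and B4, the search and crossfade lengths can be tuned per channel, for instance shorter for a channel that carries mostly transient material.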

The above embodiment described an example in which the present invention is realized by a general-purpose computer, but it may of course be realized using dedicated hardware. The present invention is also applicable not only to playing back streamed sound or video with sound, but also to receiving and outputting sound data in voice communication (calls) or voice communication with images over a telephone line or Internet connection.

The use of the output sound data or sound signal is not limited to sound output by a sound-producing device such as a speaker; the present invention is also applicable when the output is used for recording or for transfer to yet another device.
The functions of the sound data processing device of the above embodiment may also be distributed among a plurality of devices as desired.
The configurations and modifications described above may also be combined as appropriate to the extent that they do not conflict.

As is clear from the above description, with the present invention, when sound data is received and output, the operation of outputting substitute sound data without giving the user much sense of unnaturalness, even when the sound data to be output cannot be received at the appropriate time, can be performed reliably and with a low processing load. High-quality sound data can therefore be output even with hardware of low processing capacity.

100: PC (sound data processing device), 101: CPU, 102: flash memory, 103: RAM, 104: communication I/F, 105: display, 106: operator, 107: sound signal output unit, 108: system bus, 120: control unit, 121: network driver, 122: audio driver, 200: sound data processing unit, 211: receiving unit, 212: storage unit, 213: output unit, 214: interpolation unit, 215: buffer management unit, 221: audio buffer, 222: interpolation buffer, 223: backup buffer, 224: temporary buffer, 231: similar region

Claims

A sound data processing device comprising:
a first buffer, a second buffer, and a third buffer, each being a storage area for storing sound data that is digital waveform data;
a receiving unit that receives sound data;
a sound data storage unit that writes the sound data received by the receiving unit to the first buffer and the second buffer in order of reception;
an output unit that outputs the sound data stored in the first buffer in order of storage when a predetermined request is detected;
an interpolation unit that, on detecting that the amount of unoutput sound data stored in the first buffer is equal to or less than a predetermined threshold, selects from the sound data stored in the second buffer a portion that should follow the unoutput sound data, and writes the sound data of that portion immediately after the latest sound data in the first buffer; and
a buffer management unit that, on detecting that reception of sound data by the receiving unit has been lost, copies the sound data stored in the second buffer to the third buffer and clears the second buffer,
characterized in that, when the interpolation unit cannot select from the sound data stored in the second buffer a portion that should follow the unoutput sound data, the interpolation unit selects from the sound data stored in the third buffer a portion that should follow the unoutput sound data, and writes the sound data of that portion immediately after the latest sound data in the first buffer.

The sound data processing device according to claim 1,
further comprising a temporary buffer that is a storage area for storing sound data,
characterized in that, on detecting that the amount of unoutput sound data stored in the first buffer is equal to or less than the predetermined threshold, the interpolation unit:
a) if the temporary buffer does not hold at least a predetermined amount of unused sound data, writes the sound data of the portion that should follow the unoutput sound data to the temporary buffer, and then writes the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer; and
b) if the temporary buffer holds at least the predetermined amount of unused sound data, writes the predetermined amount of the sound data stored in the temporary buffer, continuing from the previous write, immediately after the latest sound data in the first buffer.

A sound data processing device characterized by comprising:
a first buffer, a second buffer, and a temporary buffer, each being a storage area for storing sound data that is digital waveform data;
a receiving unit that receives sound data;
a sound data storage unit that writes the sound data received by the receiving unit to the first buffer and the second buffer in order of reception;
an output unit that outputs the sound data stored in the first buffer in order of storage when a predetermined request is detected; and
an interpolation unit that, in response to detecting that the amount of unoutput sound data stored in the first buffer is equal to or less than a predetermined threshold,
a) if the temporary buffer does not hold at least a predetermined amount of unused sound data, selects from the sound data stored in the second buffer a portion that should follow the unoutput sound data, writes the sound data of that portion to the temporary buffer, and then writes the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer, and
b) if the temporary buffer holds at least the predetermined amount of unused sound data, writes the predetermined amount of the sound data stored in the temporary buffer, continuing from the previous write, immediately after the latest sound data in the first buffer.

The sound data processing device according to any one of claims 1 to 3,
characterized in that, when writing sound data to the first buffer, the interpolation unit adjusts the amplitude of the sound data to be written to match the amplitude of the unoutput sound data, and writes the amplitude-adjusted sound data to the first buffer.

The sound data processing device according to any one of claims 1 to 4,
wherein, when reception of sound data by the receiving unit has been lost and sound data following the lost portion is then received, the sound data storage unit writes the received sound data to the position in the first buffer at which it would be written had there been no loss, and
characterized in that, when the position at which the sound data following the lost portion is written lies after the end of the unoutput sound data, the interpolation unit selects from the sound data stored in the second buffer a portion that should follow the unoutput sound data and writes the sound data of that portion immediately after the latest sound data in the first buffer, thereby filling the storage area up to the position at which the sound data following the lost portion is written.

A sound data processing method in which a first buffer, a second buffer, and a third buffer are each a storage area for storing sound data that is digital waveform data, the method comprising:
a reception procedure of receiving sound data;
a sound data storage procedure of writing the sound data received in the reception procedure to the first buffer and the second buffer in order of reception;
an output procedure of outputting the sound data stored in the first buffer in order of storage when a predetermined request is detected;
an interpolation procedure of, on detecting that the amount of unoutput sound data stored in the first buffer is equal to or less than a predetermined threshold, selecting from the sound data stored in the second buffer a portion that should follow the unoutput sound data, and writing the sound data of that portion immediately after the latest sound data in the first buffer; and
a buffer management procedure of, on detecting that reception of sound data in the reception procedure has been lost, copying the sound data stored in the second buffer to the third buffer and clearing the second buffer,
characterized in that the interpolation procedure is a procedure of, when a portion that should follow the unoutput sound data cannot be selected from the sound data stored in the second buffer, selecting from the sound data stored in the third buffer a portion that should follow the unoutput sound data, and writing the sound data of that portion immediately after the latest sound data in the first buffer.

A sound data processing method in which a first buffer, a second buffer, and a temporary buffer are each a storage area for storing sound data that is digital waveform data, the method characterized by comprising:
a reception procedure of receiving sound data;
a sound data storage procedure of writing the sound data received in the reception procedure to the first buffer and the second buffer in order of reception;
an output procedure of outputting the sound data stored in the first buffer in order of storage when a predetermined request is detected; and
an interpolation procedure of, in response to detecting that the amount of unoutput sound data stored in the first buffer is equal to or less than a predetermined threshold,
a) if the temporary buffer does not hold at least a predetermined amount of unused sound data, selecting from the sound data stored in the second buffer a portion that should follow the unoutput sound data, writing the sound data of that portion to the temporary buffer, and then writing the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer, and
b) if the temporary buffer holds at least the predetermined amount of unused sound data, writing the predetermined amount of the sound data stored in the temporary buffer, continuing from the previous write, immediately after the latest sound data in the first buffer.

A program for causing a computer to execute,
where a first buffer, a second buffer, and a third buffer are each a storage area for storing sound data that is digital waveform data:
a reception procedure of receiving sound data;
a sound data saving procedure of writing the sound data received in the reception procedure to the first buffer and the second buffer in order of reception;
an output procedure of outputting the sound data stored in the first buffer in order of storage each time a predetermined request is detected;
an interpolation procedure of, upon detecting that the amount of not-yet-output sound data stored in the first buffer is at or below a predetermined threshold, selecting, from the sound data stored in the second buffer, a portion that should follow the not-yet-output sound data, and writing the sound data of that portion immediately after the latest sound data in the first buffer; and
a buffer management procedure of, upon detecting in the reception procedure that reception of sound data has been missed, copying the sound data stored in the second buffer to the third buffer and clearing the second buffer,
wherein, when no portion that should follow the not-yet-output sound data can be selected from the sound data stored in the second buffer, the interpolation procedure selects such a portion from the sound data stored in the third buffer and writes the sound data of that portion immediately after the latest sound data in the first buffer.
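
The interplay of the three buffers in this claim can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the invention: the claim does not fix how the portion to be continued is selected, so the sketch assumes a sum-of-squared-differences match against the last samples of the first buffer. The names `find_continuation`, `ThreeBufferPlayer`, `fill`, `tail_len`, and `max_error` are hypothetical.

```python
from typing import List, Optional


def find_continuation(history: List[int], tail: List[int],
                      length: int, max_error: int) -> Optional[List[int]]:
    """Return the `length` samples that follow the best match of `tail` in `history`.

    Matching by sum of squared differences is an assumption of this sketch.
    Returns None when no candidate matches within `max_error`.
    """
    n = len(tail)
    best_pos, best_err = None, None
    # Each candidate must leave `length` samples after the matched region.
    for pos in range(len(history) - n - length + 1):
        err = sum((history[pos + i] - tail[i]) ** 2 for i in range(n))
        if best_err is None or err < best_err:
            best_pos, best_err = pos, err
    if best_pos is None or best_err > max_error:
        return None
    start = best_pos + n
    return history[start:start + length]


class ThreeBufferPlayer:
    def __init__(self, threshold: int, fill: int, tail_len: int, max_error: int):
        self.first: List[int] = []   # playback buffer
        self.read_pos = 0            # samples before this index were already output
        self.second: List[int] = []  # data received since the last detected loss
        self.third: List[int] = []   # copy of the second buffer taken at the last loss
        self.threshold, self.fill = threshold, fill
        self.tail_len, self.max_error = tail_len, max_error

    def receive(self, samples: List[int]) -> None:
        # Saving procedure: write to the first and second buffers in order of reception.
        self.first.extend(samples)
        self.second.extend(samples)

    def on_loss(self) -> None:
        # Buffer management: copy the second buffer to the third, then clear the second.
        self.third = list(self.second)
        self.second.clear()

    def output(self, count: int) -> List[int]:
        # Output procedure: hand out stored samples in order of storage.
        out = self.first[self.read_pos:self.read_pos + count]
        self.read_pos += len(out)
        return out

    def interpolate(self) -> bool:
        # Act only when the not-yet-output amount is at or below the threshold.
        if len(self.first) - self.read_pos > self.threshold:
            return False
        tail = self.first[-self.tail_len:]  # last samples written to the first buffer
        # Try the second buffer first and fall back to the third buffer.
        for source in (self.second, self.third):
            portion = find_continuation(source, tail, self.fill, self.max_error)
            if portion is not None:
                self.first.extend(portion)
                return True
        return False


# Usage: a periodic waveform, a detected loss, then interpolation from the third buffer.
player = ThreeBufferPlayer(threshold=2, fill=4, tail_len=2, max_error=0)
player.receive([0, 10, 20, 10] * 3)
player.output(10)             # two samples remain not yet output
player.on_loss()              # the second buffer is now empty
filled = player.interpolate() # True: the portion comes from the third buffer
```

In this example the second buffer is empty after the loss, so the search there fails and the continuation is taken from the third buffer, which still holds the waveform received before the loss.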

A program for causing a computer to execute,
where a first buffer, a second buffer, and a temporary buffer are each a storage area for storing sound data that is digital waveform data:
a reception procedure of receiving sound data;
a sound data saving procedure of writing the sound data received in the reception procedure to the first buffer and the second buffer in order of reception;
an output procedure of outputting the sound data stored in the first buffer in order of storage each time a predetermined request is detected; and
an interpolation procedure of, upon detecting that the amount of not-yet-output sound data stored in the first buffer is at or below a predetermined threshold,
a) when the temporary buffer does not hold at least a predetermined amount of unused sound data, selecting, from the sound data stored in the second buffer, a portion that should follow the not-yet-output sound data, writing the sound data of that portion to the temporary buffer, and then writing the predetermined amount of the sound data stored in the temporary buffer immediately after the latest sound data in the first buffer; and
b) when the temporary buffer holds at least the predetermined amount of unused sound data, writing the predetermined amount of the sound data stored in the temporary buffer, continuing from where the previous write ended, immediately after the latest sound data in the first buffer.
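
Branches a) and b) of the interpolation procedure can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation of the invention: the selection of the portion to be continued is left to a caller-supplied function, because the claim does not specify it, and the names `TempBufferPlayer`, `chunk`, `temp_used`, and `select_portion` are hypothetical.

```python
from typing import Callable, List


class TempBufferPlayer:
    def __init__(self, threshold: int, chunk: int,
                 select_portion: Callable[[List[int], List[int]], List[int]]):
        self.first: List[int] = []   # playback buffer
        self.read_pos = 0            # samples before this index were already output
        self.second: List[int] = []  # received data, used as the selection source
        self.temp: List[int] = []    # temporary buffer holding the selected portion
        self.temp_used = 0           # samples of `temp` already written to `first`
        self.threshold = threshold
        self.chunk = chunk           # the "predetermined amount" written per call
        # Hypothetical selector: (second buffer, first buffer) -> selected portion.
        self.select_portion = select_portion

    def receive(self, samples: List[int]) -> None:
        # Saving procedure: write to the first and second buffers in order of reception.
        self.first.extend(samples)
        self.second.extend(samples)

    def output(self, count: int) -> List[int]:
        # Output procedure: hand out stored samples in order of storage.
        out = self.first[self.read_pos:self.read_pos + count]
        self.read_pos += len(out)
        return out

    def interpolate(self) -> bool:
        # Act only when the not-yet-output amount is at or below the threshold.
        if len(self.first) - self.read_pos > self.threshold:
            return False
        if len(self.temp) - self.temp_used < self.chunk:
            # Branch a): too little unused data, so select a new portion.
            portion = self.select_portion(self.second, self.first)
            if len(portion) < self.chunk:
                return False  # nothing usable could be selected
            self.temp = list(portion)
            self.temp_used = 0
        # Branches a) and b): write the next chunk after the latest data in `first`.
        self.first.extend(self.temp[self.temp_used:self.temp_used + self.chunk])
        self.temp_used += self.chunk
        return True
```

With a portion of six samples and `chunk=2`, for example, the selector runs once and the following two calls are served from the temporary buffer through branch b), so the comparatively costly selection is not repeated on every call.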