JP6733450B2 - Video conferencing device, information processing method and program

Info

Publication number: JP6733450B2
Application number: JP2016180731A
Authority: JP
Inventors: 和紀北澤
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2016-09-15
Filing date: 2016-09-15
Publication date: 2020-07-29
Anticipated expiration: 2036-09-15
Also published as: JP2018046454A

Description

The present invention relates to a video conference device, an information processing method, and a program.

Video conference devices that hold video conferences with remote locations via a communication network such as the Internet are in widespread use. To meet the need to share not only video and audio but also hand-drawn material with the partner site, interactive whiteboard type video conference devices having a touch panel function have been proposed and are already known. With such a device, drawing data representing the trajectories of characters, symbols, figures, and the like written on the touch panel display can be shared with the partner site.

In such an interactive whiteboard type video conference device, a writing sound (hereinafter referred to as “writing noise”) is generated when a pen or finger touches the touch panel during writing. When a microphone for video conferencing is built into the interactive whiteboard or placed near it, this writing noise is picked up by the microphone and heard at the partner site, which hinders conversation. It is therefore important to attenuate such noise by providing an audio filter.

As a technique for detecting and suppressing such noise, a method has been proposed in which a beam is formed by an audio signal processing device intended to properly detect special sudden noise that, like keyboard sound, lasts relatively long and decays non-monotonically (see Patent Document 1). The device includes an amplitude detection unit that detects a noise start point of an audio signal containing a noise signal by comparing the amplitude value of the audio signal with a threshold; a frequency feature calculation unit that calculates a frequency feature representing the frequency characteristics of the audio signal at least after the noise start point; and a noise determination unit that, based on the frequency feature, determines as a noise section any section of the audio signal after the noise start point that continuously contains high-frequency components at or above a reference frequency.
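The first step of the prior-art approach described above, finding the noise start point by comparing amplitude with a threshold, might look roughly like the following. This is a minimal sketch only: the function name `find_noise_start`, the sample representation (a list of floats), and the use of absolute amplitude are assumptions for illustration and are not taken from Patent Document 1. The frequency-feature steps are deliberately omitted.

```python
def find_noise_start(samples, threshold):
    """Return the index of the first sample whose absolute amplitude
    reaches the threshold, or None if no sample does.

    Hypothetical helper: illustrates threshold comparison only.
    """
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return index
    return None


# Example: the third sample (index 2) is the first to reach the threshold.
start = find_noise_start([0.0, 0.1, -0.8, 0.2], threshold=0.5)
```

Even this first step is only part of the prior-art pipeline; the frequency analysis that follows it is what the present invention seeks to avoid.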

However, although the technique described in Patent Document 1 can detect and suppress specific unwanted noise in the input audio, it requires a complicated process, such as calculating the frequency feature of the sound and determining as a noise section any section of the audio signal that continuously contains high-frequency components at or above a reference frequency.

The present invention has been made in view of the above, and an object thereof is to provide a video conference device, an information processing method, and a program capable of accurately reducing writing noise by simple processing.

To solve the above problem and achieve the object, the present invention includes: a detection unit that performs hovering detection, that is, detects that a rod-shaped object has approached a touch panel to within a predetermined distance; and an application unit that, when the detection unit has performed the hovering detection, applies to audio data picked up by an audio input unit an audio filter for reducing writing noise generated when the rod-shaped object writes on the touch panel. The detection unit further performs contact detection, that is, detects that the rod-shaped object has touched the touch panel, and the application unit cancels the application of the audio filter when, after the contact detection by the detection unit, it is further detected that the rod-shaped object has left the touch panel.
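The control flow stated above can be pictured as a small state machine. The following is an illustrative sketch, not the patented implementation: the names `PanelEvent` and `FilterSwitch` are hypothetical, and the handling of cases the text does not cover (a contact with no prior hover, or a release with no prior contact) is an assumption made here for the sake of a complete example.

```python
from enum import Enum


class PanelEvent(Enum):
    HOVER = "hover"      # rod-shaped object came within the predetermined distance
    CONTACT = "contact"  # rod-shaped object touched the panel
    RELEASE = "release"  # rod-shaped object left the panel


class FilterSwitch:
    """Tracks whether the writing-noise audio filter should be applied."""

    def __init__(self):
        self.filter_on = False
        self.contacted = False

    def handle(self, event):
        if event is PanelEvent.HOVER:
            # Hovering detection: apply the filter before any contact sound occurs.
            self.filter_on = True
        elif event is PanelEvent.CONTACT:
            # Remember the contact; the filter state itself is unchanged (assumption).
            self.contacted = True
        elif event is PanelEvent.RELEASE and self.contacted:
            # Release after contact: cancel the filter.
            self.filter_on = False
            self.contacted = False
        return self.filter_on


switch = FilterSwitch()
states = [switch.handle(e) for e in (PanelEvent.HOVER, PanelEvent.CONTACT, PanelEvent.RELEASE)]
```

Because the filter is switched on at hover time, it is already active when the contact transient reaches the microphone, and no analysis of the audio signal itself is needed to decide when to filter.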

According to the present invention, writing noise can be accurately reduced by simple processing.

FIG. 1 is a diagram showing an example of the external appearance of the video conference device according to the first embodiment.
FIG. 2 is a diagram showing an example of the hardware configuration of the video conference device according to the first embodiment.
FIG. 3 is a diagram showing an example of the configuration of the conference system according to the first embodiment.
FIG. 4 is a diagram showing an example of the functional block configuration of the video conference device according to the first embodiment.
FIG. 5 is a diagram showing an example of a graph of the writing noise level when the audio filter is not applied.
FIG. 6 is a diagram showing an example of a graph of the writing noise level when the audio filter is applied.
FIG. 7 is a diagram explaining the hovering detection function.
FIG. 8 is a flowchart showing an example of the flow of processing for applying the audio filter to writing noise in the video conference device according to the first embodiment.
FIG. 9 is a diagram showing an example of the functional block configuration of the video conference device according to the second embodiment.
FIG. 10 is a flowchart showing an example of the flow of processing for applying the audio filter to writing noise in the video conference device according to the second embodiment.
FIG. 11 is a diagram showing an example of the functional block configuration of the video conference device according to the third embodiment.
FIG. 12 is a flowchart showing an example of the flow of processing for applying the audio filter to writing noise in the video conference device according to the third embodiment.
FIG. 13 is a flowchart showing an example of the flow of processing for applying the audio filter to writing noise in the video conference device according to the fourth embodiment.

Embodiments of a video conference device, an information processing method, and a program according to the present invention will be described in detail below with reference to FIGS. 1 to 13. The present invention is not limited by the following embodiments, and the constituent elements of the following embodiments include those that a person skilled in the art could easily conceive, those that are substantially the same, and those within the so-called range of equivalents. Furthermore, various omissions, substitutions, changes, and combinations of the constituent elements can be made without departing from the gist of the following embodiments.

[First Embodiment]
(Hardware configuration of the video conference device)
FIG. 1 is a diagram showing an example of the external appearance of the video conference device according to the first embodiment. FIG. 2 is a diagram showing an example of the hardware configuration of the video conference device according to the first embodiment. The hardware configuration of the video conference device 10 according to the present embodiment will be described with reference to FIGS. 1 and 2.

As shown in FIG. 2, the video conference device 10 according to the present embodiment includes a CPU (Central Processing Unit) 201, a memory 202, a storage device 203, a touch panel display 204, a camera 205, a microphone 206, a speaker 207, an operation unit 208, and a network I/F (Interface) 209.

The CPU 201 is an integrated circuit that controls the overall operation of the video conference device 10. The CPU 201 encodes the video data captured by the camera 205, the audio data picked up by the microphone 206, and the drawing data displayed on the touch panel display 204 for transmission to the video conference device 10 at the partner site. The CPU 201 also decodes the video data, audio data, and drawing data received from the video conference device 10 at the partner site and sends them to the touch panel display 204 and the camera 205. That is, the CPU 201 has a codec function. To realize the codec function, a known compression coding technique such as H.264/AVC or H.264/SVC may be used, for example.

The memory 202 is a storage device, such as a volatile RAM (Random Access Memory), used as a work area for the CPU 201.

The storage device 203 is a non-volatile auxiliary storage device that stores various programs that implement the operation of the video conference device 10, various data such as video data and audio data, and information on the audio filter described later. The storage device 203 is, for example, a flash memory, an HDD (Hard Disk Drive), or an SSD (Solid State Drive).

The touch panel display 204 is a display device with hovering and contact detection functions. It has a function of displaying video data and the like on the display device; a hovering detection function (described in detail later) that detects that a user's finger or a pen (for example, the pen 301 shown in FIG. 1) has approached the touch panel to within a predetermined distance and sends the detected coordinates to the CPU 201; and a contact detection function that detects that the finger or pen has touched the touch panel and sends the detected coordinates to the CPU 201. The touch panel of the touch panel display 204 is, for example, a capacitive panel. As shown in FIG. 1, the touch panel display 204 functions as an electronic blackboard on which writing is possible: for example, tracing the surface of the panel with the pen 301 displays trajectories representing characters, symbols, figures, and the like. The drawing data written on the touch panel display 204 can thus be shared with the video conference device 10 at the partner site. The touch panel display 204 can also display video data captured by the camera 205 of the video conference device 10 at the partner site in the other-site display area 204a, which is part of the entire display area. The display device of the touch panel display 204 is, for example, an LCD (Liquid Crystal Display) or an organic EL (Organic Electro-Luminescence) display.

The pen 301 for writing on the touch panel display 204 is not limited to an accessory of the video conference device 10 or a product specified by the manufacturer of the video conference device 10. The pen used for writing is also not limited to one type per video conference device 10; it is assumed that multiple types of pens may be used with the same video conference device 10. Nor is the pen tip of the pen 301 limited to a specific material. Examples of pen tip materials include felt, plastic, rubber, metal, and resin.

Here, because a certain distance is maintained between the touch panel display 204 of the video conference device 10 and the user, the writing noise heard directly by the user is slight, and the writer and the participants nearby do not find it particularly unpleasant. For conference participants at the partner site, on the other hand, the writing noise generated at the touch panel display 204 of the local site is picked up as audio by the microphone 206, transmitted to the video conference device 10 at the partner site, and output from the speaker 207, so they may find it unpleasant. For this reason, it is important to reduce writing noise reliably when the user writes with a finger or the pen 301. Note that the term “writing noise” may refer either to the sound itself generated by writing or to the audio data representing it. In addition, a finger or the pen 301 used to write on the touch panel display 204 may be collectively referred to as a “stylus” for convenience.

The camera 205 is an imaging device that includes a lens and a solid-state image sensor that converts light into electric charge to digitize an image (video) of a subject. The video data of the conference participants captured by the camera 205 is transmitted to the video conference device 10 at the partner site and displayed on its touch panel display 204. As shown in FIG. 1, the camera 205 is installed, for example, above the touch panel display 204 and captures the participants facing the video conference device 10. A CMOS (Complementary Metal Oxide Semiconductor) sensor, a CCD (Charge Coupled Device), or the like is used as the solid-state image sensor. The position of the camera 205 is not limited to above the touch panel display 204; it may be installed elsewhere.

The microphone 206 is a sound pickup device that inputs the voices of the participants in the conference. As shown in FIG. 1, the microphone 206 is fixed, for example, above the touch panel display 204 and inputs the voices of the participants facing the video conference device 10. The position of the microphone 206 is not limited to above the touch panel display 204; it may be fixed elsewhere.

The speaker 207 is a device that outputs sound under the control of the CPU 201. As shown in FIG. 1, the speaker 207 is installed, for example, below the touch panel display 204 and outputs, as sound, the audio data picked up by the video conference device 10 at the partner site and received via the network. The position of the speaker 207 is not limited to below the touch panel display 204; it may be installed elsewhere.

The operation unit 208 is an input device, such as buttons, for performing various operations on the video conference device 10. As shown in FIG. 1, the operation unit 208 is installed, for example, beside the touch panel display 204. The operation unit 208 is not limited to hardware buttons and the like as shown in FIG. 1; for example, it may be implemented as software switches on the touch panel display 204 using the contact detection function of the touch panel display 204.

The network I/F 209 is an interface for exchanging data with external devices over a network such as the Internet (for example, the network 2 shown in FIG. 3, described later). The network I/F 209 is, for example, a NIC (Network Interface Card). Standards supported by the network I/F 209 include, for a wired LAN (Local Area Network), Ethernet (registered trademark) standards such as 10BASE-T, 100BASE-TX, and 1000BASE-T, and, for a wireless LAN, 802.11a/b/g/n, for example.

The hardware configuration of the video conference device 10 is not limited to the configuration shown in FIG. 2. For example, the video conference device 10 may include a ROM (Read Only Memory), which is a non-volatile storage device storing firmware and the like for controlling the video conference device 10, or an I/F for external devices for connecting a PC (Personal Computer), an external storage device, or the like. In addition, although FIG. 2 shows the memory 202, the storage device 203, the touch panel display 204, the camera 205, the microphone 206, the speaker 207, the operation unit 208, and the network I/F 209 connected to the CPU 201, the configuration is not limited to this; for example, these components may be connected so that they can communicate with one another over buses such as an address bus and a data bus.

Although FIG. 1 shows a configuration in which the camera 205, the microphone 206, and the speaker 207 are built into the main body of the video conference device 10, at least one of them may be a separate unit. For example, when the camera 205 is a separate external device rather than a built-in one, it may be connected to the video conference device 10 via the I/F for external devices mentioned above.

(Configuration of the conference system)
FIG. 3 is a diagram showing an example of the configuration of the conference system according to the first embodiment. The configuration of the conference system 1 according to the present embodiment will be described with reference to FIG. 3.

As shown in FIG. 3, the conference system 1 according to the present embodiment includes two or more video conference devices (video conference devices 10_1 to 10_4 in FIG. 3), a conference server 20, and a conference reservation server 30. When referring to an arbitrary one of the two or more video conference devices (10_1, 10_2, ...) shown in FIG. 3, or when referring to them collectively, the term “video conference device 10” is used. Each of the video conference devices 10_1 to 10_4 can communicate with the other video conference devices 10, the conference server 20, and the conference reservation server 30 via the network 2, such as the Internet.

The video conference device 10 is an interactive whiteboard type conference device that establishes a session with another video conference device 10 under the control of the conference server 20 and transmits and receives video data, audio data, and drawing data via the established session. During a video conference (hereinafter sometimes simply called a “conference”), when transmitting, the video conference device 10 sends video data, audio data, and drawing data to the conference server 20, and the conference server 20 sends the video data, audio data, and drawing data to the video conference device 10 at the partner site. When receiving, the video conference device 10 receives from the conference server 20 the video data, audio data, and drawing data of the video conference device 10 at the partner site.

For example, when the video conference devices 10_1 to 10_3 hold a video conference in the conference system 1 shown in FIG. 3, data transmitted by the video conference device 10_1 is sent via the conference server 20 to each of the video conference devices 10_2 and 10_3, and is not sent to the video conference device 10_4. Similarly, data transmitted by each of the video conference devices 10_2 and 10_3 is sent via the conference server 20 to the video conference devices 10 participating in the video conference, and is not sent to the video conference device 10 that is not participating (the video conference device 10_4). In this way, a video conference among multiple video conference devices 10 is realized in the conference system 1.
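The relay behavior in this example can be sketched as follows. This is only an illustration under assumptions: the function name `relay`, the use of string device identifiers, and the dictionary return value are hypothetical and are not described in the patent, which does not specify how the conference server 20 is implemented.

```python
def relay(sender, participants, payload):
    """Return a mapping {receiver: payload} for every conference
    participant other than the sender.

    A sender that is not in the conference delivers to no one (assumption).
    """
    if sender not in participants:
        return {}
    return {device: payload for device in participants if device != sender}


# Devices 10_1 to 10_3 are in the conference; 10_4 is not.
conference = {"10_1", "10_2", "10_3"}
deliveries = relay("10_1", conference, b"frame")
```

Here `deliveries` contains entries for `10_2` and `10_3` only, matching the example in which the video conference device 10_4 receives nothing.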

The conference server 20 is a server device that monitors whether each video conference device 10 is connected to the conference server 20, controls the calling of each video conference device 10 at the start of a conference, and controls information processing during the conference.

The conference reservation server 30 is a server device on which a user hosting a conference, or the like, registers (reserves) conference information in advance (date and time, venue, participating users, roles, information processing devices to be used, and so on). The video conference device 10 queries the conference reservation server 30 and, when it finds the conference information for the relevant conference, joins the video conference based on that conference information. The conference reservation server 30 may be configured so that, for example, a management PC or the like connected via the network 2 can perform settings such as registering the conference information described above.

The configuration of the conference system 1 shown in FIG. 3 is an example. For instance, although the conference server 20 and the conference reservation server 30 are shown as separate server devices, the configuration is not limited to this, and they may be implemented as a single server device.

(Functional block configuration of the video conference device)
FIG. 4 is a diagram showing an example of the functional block configuration of the video conference device according to the first embodiment. FIG. 5 is a diagram showing an example of a graph of the writing noise level when the audio filter is not applied. FIG. 6 is a diagram showing an example of a graph of the writing noise level when the audio filter is applied. FIG. 7 is a diagram explaining the hovering detection function. The configuration and operation of the functional blocks of the video conference device 10 according to the present embodiment will be described in detail with reference to FIGS. 4 to 7.

As shown in FIG. 4, the video conference device 10 according to the present embodiment includes an acquisition unit 101, a control unit 102, an application unit 103, a transmission unit 107, a reception unit 108, a communication unit 109, an imaging control unit 110, an imaging unit 111, a display control unit 112, a display unit 113, a panel detection unit 114, an audio output control unit 115, an audio output unit 116, an audio input unit 117, a storage unit 118, and an operation input unit 119.

The acquisition unit 101 is a functional unit that acquires conference information from the conference reservation server 30 via the communication unit 109 and the network 2. Specifically, the acquisition unit 101 transmits, for example, an acquisition request for conference information together with information on the date and time, venue, and terminal used for the conference to the conference reservation server 30 via the communication unit 109 and the network 2. On receiving the acquisition request, the conference reservation server 30 transmits the conference information corresponding to the received date and time, venue, and terminal to the acquisition unit 101 via the network 2 and the communication unit 109. The acquisition unit 101 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.
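The lookup performed in response to an acquisition request might be sketched as below. The record layout and names (`ConferenceInfo`, `find_conference`, and the field names) are hypothetical; the patent only states that conference information is matched on the date and time, venue, and terminal used, and does not describe a data format or protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConferenceInfo:
    start: str           # date and time of the conference (format is an assumption)
    venue: str           # where the conference is held
    terminal: str        # terminal (video conference device) to be used
    participants: tuple  # participating users


def find_conference(reservations, start, venue, terminal):
    """Return the first reservation matching all three keys, or None."""
    for info in reservations:
        if (info.start, info.venue, info.terminal) == (start, venue, terminal):
            return info
    return None


reservations = [
    ConferenceInfo("2016-09-15T10:00", "Room A", "10_1", ("user1", "user2")),
    ConferenceInfo("2016-09-15T13:00", "Room B", "10_2", ("user3",)),
]
match = find_conference(reservations, "2016-09-15T13:00", "Room B", "10_2")
```

A `None` result would correspond to the case in which no matching conference information is found on the reservation server.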

The control unit 102 is a functional unit that comprehensively controls the processing performed by the video conference device 10 for a video conference. The control unit 102 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The application unit 103 is a functional unit that applies, or cancels the application of, an audio filter corresponding to the writing noise of a stylus (a finger, the pen 301, or the like; a rod-shaped object) to the audio data input to (picked up by) the audio input unit 117. The audio filter is a digital filter that attenuates specific frequency components and the like contained in writing noise. As shown in FIG. 5, writing noise is generated at the moment the pen 301 touches the touch panel display 204 (time t1 in FIG. 5). Unlike other sounds, this writing noise is characterized by a sudden change in loudness (the “writing noise level” in FIG. 5) within the short time in which the pen 301 touches the touch panel display 204. The application unit 103 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

Here, the writing noise produced when a stylus (the pen 301 or a finger; in the following, the writing instrument is assumed to be the pen 301) contacts the touch panel display 204 is described with reference to FIG. 5. First, at a time when the pen 301 is not in contact with the touch panel display 204 (time t0), no writing noise is generated. At the moment the pen 301 contacts the touch panel display 204 (time t1), writing noise is generated and its level rises sharply.

During the period in which writing is performed with the pen 301 (times t1 to t2), the writing noise level is lower than at the moment of contact, but vibration noise and the like caused by the writing are generated. In the period after writing ends and the pen 301 leaves the touch panel display 204 (for example, times t2 to t3), no writing noise is generated.

FIG. 6 is a graph of the writing noise of FIG. 5 when the application unit 103 applies the audio filter to the audio data, containing the writing noise, that is input by the audio input unit 117. Times t0 to t3 in FIG. 6 correspond to times t0 to t3 in FIG. 5, respectively. As shown in FIG. 6, the level of the writing noise with the filter applied is effectively reduced compared with the unfiltered writing noise shown in FIG. 5.
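The effect of a digital filter on the contact transient at time t1 can be illustrated with a small sketch. The description does not specify the filter design, so the one-pole low-pass filter below, the function name `one_pole_lowpass`, and the coefficient `alpha` are assumptions chosen only for illustration.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth `samples` with y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    This is an assumed stand-in for the audio filter; a smaller alpha
    attenuates sharp transients more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


# A single-sample click standing in for the contact transient at time t1.
click = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
filtered = one_pole_lowpass(click, alpha=0.25)
```

With `alpha=0.25`, the peak of the click drops from 1.0 to 0.25, which mirrors the qualitative difference between FIG. 5 and FIG. 6. A practical filter would be tuned to the measured spectrum of the writing noise.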

The transmission unit 107 is a functional unit that transmits the video data captured by the imaging unit 111, the audio data input by the audio input unit 117, and the drawing data displayed on the display unit 113 to the video conference device 10 at the partner site via the communication unit 109 and the network 2. Specifically, the transmission unit 107 encodes, for example, the video data, audio data, and drawing data, and transmits them to the video conference device 10 at the partner site. As the encoding method, as mentioned above, a known compression coding technique such as H.264/AVC or H.264/SVC may be used, for example. The transmission unit 107 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The reception unit 108 is a functional unit that receives video data, audio data, and drawing data from the video conference device 10 at the partner site via the network 2 and the communication unit 109. Specifically, the reception unit 108 decodes, for example, the received video data, audio data, and drawing data, sends the decoded video data and drawing data to the display control unit 112, and sends the decoded audio data to the audio output control unit 115. A known method may be used for decoding. The reception unit 108 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The communication unit 109 is a functional unit that performs data communication with other video conference devices 10, the conference server 20, and the conference reservation server 30 via the network 2. The communication unit 109 is realized, for example, by the network I/F 209 shown in FIG. 2.

The imaging control unit 110 is a functional unit that controls the operation of the imaging unit 111. Specifically, the imaging control unit 110 controls, for example, the starting and stopping of image capture by the imaging unit 111, and acquires the video data captured by the imaging unit 111. The imaging control unit 110 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The imaging unit 111 is a functional unit that captures video in the direction facing the video conference device 10. The imaging unit 111 is realized, for example, by the camera 205 shown in FIG. 2. Although the imaging direction of the imaging unit 111 is described here as the direction facing the video conference device 10, it is not limited to this, and the imaging direction may be variable to any direction.

The display control unit 112 is a functional unit that controls the display unit 113 to display various images and videos. The display control unit 112 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The display unit 113 is a functional unit that displays various images and videos under the control of the display control unit 112. The display unit 113 is realized, for example, by the display device of the touch panel display 204 shown in FIG. 2.

The panel detection unit 114 is a functional unit that accepts writing input by a stylus, such as a pen or a finger of a user (for example, a conference participant). The panel detection unit 114 has a hovering detection function that detects that the stylus has approached within a predetermined distance (the distance d1 shown in FIG. 7, described later) of the touch panel of the touch panel display 204 (hovering detection). The panel detection unit 114 also has a contact detection function that detects that the stylus has contacted the touch panel of the touch panel display 204 (contact detection). The panel detection unit 114 is realized, for example, by the hovering detection function and the contact detection function of the touch panel display 204 shown in FIG. 2. The panel detection unit 114 performs hovering detection and contact detection by capturing changes in the capacitance of the touch panel of the touch panel display 204; specifically, it is realized by the CPU 201 shown in FIG. 2 executing a program.

Here, the hovering detection function and the contact detection function of the panel detection unit 114 are described with reference to FIG. 7. FIG. 7 shows the relationship between time and the distance from the touch panel when the user brings a stylus (a finger or a pen) toward the touch panel, touches it, and then moves the stylus away. As the stylus approaches, at time t10 its distance from the touch panel (touch panel display 204) reaches the distance d1 at which the panel detection unit 114 can detect it with the hovering detection function. From time t10 to t11, the panel detection unit 114 detects, by the hovering detection function, that the stylus has approached within the predetermined distance (distance d1). When the stylus is brought still closer, it contacts the touch panel at time t11. While the stylus is in contact with the touch panel (times t11 to t12), the panel detection unit 114 detects the contact by the contact detection function. During this contact period (times t11 to t12), the distance between the touch panel and the stylus does not change. Then, in the period from when the stylus leaves the touch panel until it reaches the distance d1 (times t12 to t13), the panel detection unit 114 detects, by the hovering detection function, that the stylus is within the predetermined distance (distance d1). After time t13, once the stylus is farther than the distance d1 from the touch panel, the panel detection unit 114 detects neither approach nor contact.
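The three detection states described for FIG. 7 can be summarized as a small classification sketch. The actual panel senses capacitance changes rather than measuring a distance directly, so the function `classify_distance`, the `PanelState` enum, and the numeric distances below are illustrative assumptions.

```python
from enum import Enum


class PanelState(Enum):
    NONE = "none"        # farther than d1: nothing detected
    HOVER = "hover"      # within d1 but not touching: hovering detection
    CONTACT = "contact"  # touching the panel: contact detection


def classify_distance(distance, d1):
    """Map a stylus-to-panel distance to the state the panel would report."""
    if distance < 0 or d1 <= 0:
        raise ValueError("distance must be >= 0 and d1 must be > 0")
    if distance == 0:
        return PanelState.CONTACT
    if distance <= d1:
        return PanelState.HOVER
    return PanelState.NONE


# Approach, touch, and release, loosely following t10 to t13 in FIG. 7.
d1 = 10.0
trajectory = [20.0, 10.0, 5.0, 0.0, 0.0, 5.0, 15.0]
states = [classify_distance(d, d1) for d in trajectory]
```

The resulting sequence is none, hover, hover, contact, contact, hover, none, matching the order of the periods in FIG. 7.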

The audio output control unit 115 is a functional unit that controls the audio output unit 116 to output various sounds. The audio output control unit 115 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The audio output unit 116 is a functional unit that outputs various sounds under the control of the audio output control unit 115. The audio output unit 116 is realized, for example, by the speaker 207 shown in FIG. 2.

The audio input unit 117 is a functional unit that inputs audio. The audio input unit 117 is realized, for example, by the microphone 206 shown in FIG. 2.

The storage unit 118 is a functional unit that stores the various programs that implement the operation of the video conference device 10, together with various data such as video data, audio data, drawing data, and audio filter information. The storage unit 118 is realized, for example, by the storage device 203 shown in FIG. 2.

The operation input unit 119 is a functional unit that accepts various operation inputs from a user (for example, a conference participant). The operation input unit 119 is realized, for example, by the operation unit 208 shown in FIG. 2.

Note that the acquisition unit 101, the control unit 102, the application unit 103, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, the storage unit 118, and the operation input unit 119 of the video conference device 10 shown in FIG. 4 represent functions conceptually, and the device is not limited to this configuration. For example, a plurality of functional units shown as independent in the video conference device 10 of FIG. 4 may be configured as a single functional unit. Conversely, the function of a single functional unit in the video conference device 10 of FIG. 4 may be divided and configured as a plurality of functional units.

Some or all of the acquisition unit 101, the control unit 102, the application unit 103, the transmission unit 107, the reception unit 108, the imaging control unit 110, the display control unit 112, the panel detection unit 114, and the audio output control unit 115 of the video conference device 10 may be realized by a hardware circuit such as an FPGA (Field-Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit) instead of by a software program.

(Processing of writing noise by the video conference device)
FIG. 8 is a flowchart showing an example of the flow of processing in which the video conference device according to the first embodiment applies an audio filter to writing noise. The flow of this processing in the video conference device 10 according to the present embodiment is described with reference to FIG. 8.

<Step S11>
Based on the conference information acquired from the conference reservation server 30 by the acquisition unit 101, the control unit 102 starts a video conference with the video conference device at the partner site indicated by the conference information. The process then proceeds to step S12.

<Step S12>
The panel detection unit 114 uses the hovering detection function to detect whether a stylus (a finger, a pen, or the like) has approached within the predetermined distance of the touch panel (hovering detection). If hovering is detected (step S12: Yes), the process proceeds to step S13. If hovering is not detected (step S12: No), the panel detection unit 114 continues the detection operation.

<Step S13>
The application unit 103 applies the audio filter corresponding to the writing noise of the stylus to the audio data input to (picked up by) the audio input unit 117. The process then proceeds to step S14.

<Step S14>
The panel detection unit 114 uses the contact detection function to detect whether the stylus has contacted the touch panel (contact detection). If contact is detected (step S14: Yes), the process proceeds to step S15. If contact is not detected (step S14: No), the process proceeds to step S17.

<Step S15>
The panel detection unit 114 uses the contact detection function to detect whether the stylus has left the touch panel. If it detects that the stylus has left the touch panel (step S15: Yes), the process proceeds to step S16. If it detects that the stylus is still in contact with the touch panel (step S15: No), the panel detection unit 114 continues to detect whether the stylus has left.

<Step S16>
Because no writing noise is generated once the stylus has left the touch panel, the application unit 103 cancels the application of the audio filter corresponding to the writing noise. The process then proceeds to step S19.

<Step S17>
The control unit 102 determines whether a predetermined time has elapsed since the application unit 103 applied the audio filter. If the predetermined time has elapsed (step S17: Yes), the process proceeds to step S18. If it has not elapsed (step S17: No), the process returns to step S14.

<Step S18>
If no contact is detected even after the predetermined time has elapsed since the hovering of the stylus was detected, the application unit 103 judges that the stylus has moved away from the touch panel and cancels the application of the audio filter corresponding to the writing noise. The process then proceeds to step S19.

<Step S19>
After the application unit 103 cancels the application of the audio filter, the control unit 102 determines whether the video conference has ended. If the video conference has ended (step S19: Yes), the series of processes for applying the audio filter to writing noise ends. If the video conference has not ended (step S19: No), the process returns to step S12.

Through steps S11 to S19 above, the video conference device 10 according to the present embodiment performs the processing of applying the audio filter to writing noise.
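The control flow of steps S12 to S18 can be sketched as a small polling state machine. This is a minimal illustration under stated assumptions, not the device's implementation: the class name `WritingNoiseFilterGate`, the `poll` method, the timeout value, and the idea of polling with explicit timestamps are all hypothetical, and the sketch assumes the panel keeps reporting hovering while the stylus is near the panel.

```python
class WritingNoiseFilterGate:
    """Tracks whether the writing-noise filter should currently be applied."""

    def __init__(self, hover_timeout):
        self.hover_timeout = hover_timeout  # the "predetermined time" of S17
        self.filter_on = False
        self._contact_seen = False
        self._applied_at = None

    def poll(self, now, hovering, touching):
        """Advance one polling cycle and return True if the filter is applied."""
        if not self.filter_on:
            # S12: wait for hovering detection. S13: apply the filter.
            if hovering:
                self.filter_on = True
                self._contact_seen = False
                self._applied_at = now
            return self.filter_on
        if self._contact_seen:
            # S15: wait for the stylus to leave. S16: cancel the filter.
            if not touching:
                self.filter_on = False
            return self.filter_on
        if touching:
            # S14: Yes, contact detected; continue with S15 on later polls.
            self._contact_seen = True
        elif now - self._applied_at >= self.hover_timeout:
            # S17: Yes, timed out without contact. S18: cancel the filter.
            self.filter_on = False
        return self.filter_on


# Hover, touch, and release.
gate = WritingNoiseFilterGate(hover_timeout=1.0)
touch_trace = [
    gate.poll(0.0, hovering=False, touching=False),
    gate.poll(0.1, hovering=True, touching=False),
    gate.poll(0.2, hovering=True, touching=True),
    gate.poll(0.4, hovering=True, touching=False),
]

# Hover without ever touching: the filter is cancelled after the timeout.
idle = WritingNoiseFilterGate(hover_timeout=1.0)
idle_trace = [idle.poll(t, hovering=True, touching=False) for t in (0.0, 0.5, 1.0)]
```

As in the flowchart, once the filter is cancelled the gate returns to waiting for hovering (step S12), so a stylus that is still hovering on the next poll causes the filter to be applied again.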

As described above, in the present embodiment, when the panel detection unit 114 detects that the stylus has approached within the predetermined distance of the touch panel (hovering detection), the application unit 103 applies the audio filter corresponding to the writing noise of the stylus to the audio data input to (picked up by) the audio input unit 117. Applying the audio filter in this way makes it possible to reduce writing noise accurately with simple processing.

In addition, when the stylus is detected to have left the touch panel after contact was detected, the application unit 103 cancels the application of the audio filter corresponding to the writing noise. Likewise, when no contact is detected for the predetermined time after hovering of the stylus is detected, the application unit 103 cancels the application of the audio filter. This suppresses the effect of applying the audio filter to audio data that contains no writing noise.

[Second Embodiment]
The video conference device according to the second embodiment is described with a focus on the differences from the video conference device 10 according to the first embodiment. The first embodiment described an operation in which, when hovering is detected, an audio filter corresponding to writing noise is applied regardless of whether the stylus is a finger or a pen. The present embodiment describes an operation in which, when hovering is detected, an audio filter suited to the type of stylus is applied.

The configuration of the conference system and the hardware configuration of the video conference device according to the present embodiment are the same as those described in the first embodiment.

(Functional block configuration of the video conference device)
FIG. 9 is a diagram showing an example of the functional block configuration of the video conference device according to the second embodiment. The configuration and operation of the functional blocks of the video conference device 10a according to the present embodiment are described in detail with reference to FIG. 9.

As shown in FIG. 9, the video conference device 10a according to the present embodiment includes the acquisition unit 101, the control unit 102, an application unit 103a, a determination unit 104, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, the storage unit 118, and the operation input unit 119. The functions of the acquisition unit 101, the control unit 102, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, and the operation input unit 119 are the same as those described in the first embodiment.

The application unit 103a is a functional unit that applies, or cancels the application of, either an audio filter corresponding to the writing noise of a finger used as the stylus or an audio filter corresponding to the writing noise of a pen used as the stylus, to the audio data input to (picked up by) the audio input unit 117. In general, a pen has a smaller contact area than a finger and is often made of a harder material. The characteristics of the writing noise (such as its frequency components and level) therefore differ between a pen and a finger. The application unit 103a accordingly applies the audio filter that corresponds to whether the stylus is a finger or a pen, which allows writing noise to be reduced even more accurately. The application unit 103a is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The determination unit 104 is a functional unit that, when the panel detection unit 114 detects hovering, determines whether the stylus approaching the touch panel is a finger or a pen. The determination unit 104 makes this determination based on, for example, the difference in the area detected when the panel detection unit 114 detects hovering. For example, the determination unit 104 determines that the stylus is a finger when the detected area is equal to or greater than a predetermined threshold, and that the stylus is a pen when the detected area is less than the threshold. The determination unit 104 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.
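The threshold rule used by the determination unit 104 can be written as a one-line comparison. In this sketch the function name `classify_stylus`, the string labels, and the example area values are assumptions; only the rule itself (at or above the threshold means finger, below means pen) comes from the description.

```python
def classify_stylus(detected_area, area_threshold):
    """Return "finger" if the detected area reaches the threshold, else "pen"."""
    if detected_area < 0 or area_threshold <= 0:
        raise ValueError("detected_area must be >= 0 and area_threshold > 0")
    return "finger" if detected_area >= area_threshold else "pen"


# Example values in arbitrary area units (hypothetical, for illustration only).
kinds = [classify_stylus(a, area_threshold=30.0) for a in (80.0, 30.0, 5.0)]
```

Note that a detected area exactly equal to the threshold is classified as a finger, since the description uses "equal to or greater than".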

The storage unit 118 is a functional unit that stores the various programs that implement the operation of the video conference device 10a, together with various data such as video data, audio data, drawing data, and information on the audio filter corresponding to finger writing noise and the audio filter corresponding to pen writing noise. The storage unit 118 is realized, for example, by the storage device 203 shown in FIG. 2.

Note that the acquisition unit 101, the control unit 102, the application unit 103a, the determination unit 104, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, the storage unit 118, and the operation input unit 119 of the video conference device 10a shown in FIG. 9 represent functions conceptually, and the device is not limited to this configuration. For example, a plurality of functional units shown as independent in the video conference device 10a of FIG. 9 may be configured as a single functional unit. Conversely, the function of a single functional unit in the video conference device 10a of FIG. 9 may be divided and configured as a plurality of functional units.

Some or all of the acquisition unit 101, the control unit 102, the application unit 103a, the determination unit 104, the transmission unit 107, the reception unit 108, the imaging control unit 110, the display control unit 112, the panel detection unit 114, and the audio output control unit 115 of the video conference device 10a may be realized by a hardware circuit such as an FPGA or an ASIC instead of by a software program.

(Processing of writing noise by the video conference device)
FIG. 10 is a flowchart showing an example of the flow of processing in which the video conference device according to the second embodiment applies an audio filter to writing noise. The flow of this processing in the video conference device 10a according to the present embodiment is described with reference to FIG. 10.

<Step S31>
Based on the conference information acquired from the conference reservation server 30 by the acquisition unit 101, the control unit 102 starts a video conference with the video conference device at the partner site indicated by the conference information. The process then proceeds to step S32.

<Step S32>
The panel detection unit 114 uses the hovering detection function to detect whether a stylus (a finger, a pen, or the like) has approached within the predetermined distance of the touch panel (hovering detection). If hovering is detected (step S32: Yes), the process proceeds to step S33. If hovering is not detected (step S32: No), the panel detection unit 114 continues the detection operation.

<Step S33>
When the panel detection unit 114 detects hovering, the determination unit 104 determines whether the stylus approaching the touch panel is a finger or a pen. If the stylus is determined to be a finger (step S33: Yes), the process proceeds to step S34. If the stylus is determined to be a pen (step S33: No), the process proceeds to step S35.

<Step S34>
When the determination unit 104 determines that the stylus is a finger, the application unit 103a applies the audio filter corresponding to finger writing noise to the audio data input to (picked up by) the audio input unit 117. The process then proceeds to step S36.

<Step S35>
When the determination unit 104 determines that the stylus is a pen, the application unit 103a applies the audio filter corresponding to pen writing noise to the audio data input to (picked up by) the audio input unit 117. The process then proceeds to step S36.

<Steps S36 to S41>
The operations of steps S36 to S41 are the same as those of steps S14 to S19 shown in FIG. 8 of the first embodiment, respectively.

Through steps S31 to S41 above, the video conference device 10a according to the present embodiment performs the processing of applying the audio filter to writing noise.
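The branch in steps S33 to S35 amounts to looking up a stored filter by stylus type. The sketch below assumes a hypothetical `FilterProfile` record and a `FILTER_PROFILES` table standing in for the filter information held in the storage unit 118; the profile names and the `alpha` coefficients are placeholders, not values from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterProfile:
    name: str
    alpha: float  # placeholder smoothing coefficient, not a value from the source


# Hypothetical stand-in for the per-stylus filter information in storage.
FILTER_PROFILES = {
    "finger": FilterProfile("finger_writing_noise", 0.5),
    "pen": FilterProfile("pen_writing_noise", 0.25),
}


def select_filter(stylus_kind):
    """Return the filter profile for the given stylus kind (S34 or S35)."""
    try:
        return FILTER_PROFILES[stylus_kind]
    except KeyError:
        raise ValueError(f"unknown stylus kind: {stylus_kind!r}") from None


finger_profile = select_filter("finger")
pen_profile = select_filter("pen")
```

Keeping the profiles in a table rather than in an if/else branch means further stylus types can be supported by adding entries without changing the selection logic.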

As described above, in the present embodiment, when the panel detection unit 114 detects the hovering of a finger used as the stylus, the application unit 103a applies the audio filter corresponding to finger writing noise to the audio data input to (picked up by) the audio input unit 117. When the panel detection unit 114 detects the hovering of a pen used as the stylus, the application unit 103a applies the audio filter corresponding to pen writing noise to that audio data. Applying an audio filter suited to the type of stylus in this way makes it possible to reduce writing noise even more accurately with simple processing.

Although a human finger and a pen are given as examples of the stylus, the stylus is not limited to these. The hovering detection may distinguish among any plurality of stylus types that produce different writing noise.

[Third Embodiment]
The video conference device according to the third embodiment is described below, focusing on the differences from the video conference device 10 according to the first embodiment. The first embodiment described the operation of applying an audio filter corresponding to writing noise when hovering of the stylus is detected, but did not specifically address the strength of the audio filter. The present embodiment describes an operation in which, when hovering is detected, the audio filter is applied with a strength that depends on the distance between the detection position on the touch panel and the microphone.

(Functional block configuration of the video conference device)
FIG. 11 is a diagram showing an example of the functional block configuration of the video conference device according to the third embodiment. The configuration and operation of the functional blocks of the video conference device 10b according to the present embodiment are described in detail with reference to FIG. 11.

As shown in FIG. 11, the video conference device 10b according to the present embodiment includes an acquisition unit 101, a control unit 102, an application unit 103b, a calculation unit 105, a transmission unit 107, a reception unit 108, a communication unit 109, an imaging control unit 110, an imaging unit 111, a display control unit 112, a display unit 113, a panel detection unit 114, an audio output control unit 115, an audio output unit 116, an audio input unit 117, a storage unit 118, and an operation input unit 119. The functions of the acquisition unit 101, the control unit 102, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, and the operation input unit 119 are the same as those described in the first embodiment.

The application unit 103b is a functional unit that applies or releases an audio filter corresponding to the writing noise of the stylus for the audio data input to (picked up by) the audio input unit 117. The filter is applied with a strength that depends on the distance, calculated by the calculation unit 105, between the hovering detection position on the touch panel and the microphone 206 (audio input unit 117). In general, the closer the contact position is to the microphone 206, the higher the sound pressure level of the writing noise reaching the microphone 206, and the louder the writing noise heard at the partner site. Conversely, when contact is detected at a position far from the microphone 206, the sound pressure level falls with distance, so the writing noise heard at the partner site is quieter. Changing the strength of the audio filter according to the distance therefore reduces writing noise more accurately and also limits the effect of the filter on sounds other than writing noise contained in the audio data. The application unit 103b may, for example, determine the strength of the audio filter by referring to a table, stored in the storage unit 118 as described later, that associates the distance between the detection position and the microphone 206 with the strength of the audio filter. The application unit 103b is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.

The calculation unit 105 is a functional unit that calculates the physical distance between the detection position of the stylus and the microphone 206 (audio input unit 117) from the coordinates on the touch panel at which the panel detection unit 114 detected hovering of the stylus. The calculation unit 105 is realized, for example, by the CPU 201 shown in FIG. 2 executing a program.
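As an illustration of the distance calculation, the following Python sketch converts panel coordinates to millimetres and takes the straight-line distance to the microphone. The document does not specify the coordinate system or the formula, so the pixel-to-millimetre scale `mm_per_px`, the microphone position `mic_xy_mm` expressed in the same plane as the panel, and the function name `distance_to_microphone` are all assumptions made for this example.

```python
import math


def distance_to_microphone(
    detect_xy_px: tuple[float, float],
    mic_xy_mm: tuple[float, float],
    mm_per_px: float,
) -> float:
    """Return the distance in mm from the hover position to the microphone.

    detect_xy_px: detected stylus coordinates in panel pixels (assumed).
    mic_xy_mm:    microphone position in mm, measured from the panel origin
                  in the plane of the panel (assumed).
    mm_per_px:    physical size of one panel pixel in mm (assumed).
    """
    x_mm = detect_xy_px[0] * mm_per_px
    y_mm = detect_xy_px[1] * mm_per_px
    return math.hypot(x_mm - mic_xy_mm[0], y_mm - mic_xy_mm[1])
```

A microphone mounted outside the display area can be represented with coordinates that lie beyond the panel bounds, for example a negative `y` value for a microphone above the top edge.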

The storage unit 118 is a functional unit that stores the various programs that realize the operation of the video conference device 10b, together with various data such as video data, audio data, drawing data, audio filter information, and a table that associates the distance between the detection position of the stylus and the microphone 206 with the strength of the audio filter. The storage unit 118 is realized, for example, by the storage device 203 shown in FIG. 2.

Note that the acquisition unit 101, the control unit 102, the application unit 103b, the calculation unit 105, the transmission unit 107, the reception unit 108, the communication unit 109, the imaging control unit 110, the imaging unit 111, the display control unit 112, the display unit 113, the panel detection unit 114, the audio output control unit 115, the audio output unit 116, the audio input unit 117, the storage unit 118, and the operation input unit 119 of the video conference device 10b shown in FIG. 11 represent functions conceptually, and the configuration is not limited to this arrangement. For example, a plurality of functional units illustrated as independent units in the video conference device 10b shown in FIG. 11 may be configured as a single functional unit. Conversely, the function of a single functional unit in the video conference device 10b shown in FIG. 11 may be divided and configured as a plurality of functional units.

Some or all of the acquisition unit 101, the control unit 102, the application unit 103b, the calculation unit 105, the transmission unit 107, the reception unit 108, the imaging control unit 110, the display control unit 112, the panel detection unit 114, and the audio output control unit 115 of the video conference device 10b may be realized by a hardware circuit such as an FPGA or an ASIC rather than by a software program.

(Processing of writing noise by the video conference device)
FIG. 12 is a flowchart showing an example of the flow of processing for applying an audio filter to writing noise in the video conference device according to the third embodiment. The flow of processing by which the video conference device 10b according to the present embodiment applies the audio filter to writing noise is described with reference to FIG. 12.

<Step S51>
Based on the conference information acquired by the acquisition unit 101 from the conference reservation server 30, the control unit 102 starts a video conference with the video conference device at the partner site indicated by the conference information. The process then proceeds to step S52.

<Step S52>
Using its hovering detection function, the panel detection unit 114 detects whether the stylus (a finger, a pen, or the like) has come within a predetermined distance of the touch panel (hovering detection). When hovering is detected (step S52: Yes), the process proceeds to step S53. When hovering is not detected (step S52: No), the detection operation continues.

<Step S53>
The calculation unit 105 calculates the physical distance between the detection position of the stylus and the microphone 206 (audio input unit 117) from the coordinates on the touch panel at which the panel detection unit 114 detected hovering of the stylus. The process then proceeds to step S54.

<Step S54>
The application unit 103b applies an audio filter corresponding to the writing noise of the stylus to the audio data input to (picked up by) the audio input unit 117, with a strength that depends on the distance, calculated by the calculation unit 105, between the hovering detection position on the touch panel and the microphone 206 (audio input unit 117). In this case, the application unit 103b determines the strength of the audio filter by referring, for example, to a table stored in the storage unit 118 that associates the distance between the detection position and the microphone 206 with the strength of the audio filter. The process then proceeds to step S55.
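The table lookup mentioned in step S54 can be sketched as follows in Python. The document states only that a table associates distance with filter strength. The distance bands, the strength values on a 0.0 to 1.0 scale, and the name `filter_strength` are hypothetical choices for illustration, with stronger filtering assigned to positions closer to the microphone.

```python
import bisect

# Hypothetical table: upper bound of each distance band in mm (inclusive),
# in ascending order. Distances beyond the last bound use the final strength.
DISTANCE_UPPER_BOUNDS_MM = [200.0, 500.0, 1000.0]

# Hypothetical strengths, one per band plus one for "beyond the last bound".
# 1.0 means the strongest filtering and values near 0.0 the weakest.
STRENGTHS = [1.0, 0.7, 0.4, 0.2]


def filter_strength(distance_mm: float) -> float:
    """Look up the filter strength for a stylus-to-microphone distance."""
    band = bisect.bisect_left(DISTANCE_UPPER_BOUNDS_MM, distance_mm)
    return STRENGTHS[band]
```

Keeping the mapping in a table rather than in a formula lets the strengths be tuned per device model without changing the lookup code.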

<Steps S55 to S60>
The operations of steps S55 to S60 are the same as those of steps S14 to S19 shown in FIG. 8 of the first embodiment, respectively.

Through steps S51 to S60 above, the video conference device 10b according to the present embodiment applies the audio filter to writing noise.

As described above, in the present embodiment, when the panel detection unit 114 detects hovering of the stylus, the calculation unit 105 calculates the physical distance between the detection position of the stylus and the microphone 206 (audio input unit 117) from the detected coordinates of the stylus on the touch panel. The application unit 103b then applies an audio filter corresponding to the writing noise of the stylus to the audio data input to (picked up by) the audio input unit 117, with a strength that depends on the distance calculated by the calculation unit 105. Changing the strength of the audio filter according to the distance between the detection position and the microphone 206 in this way reduces writing noise more accurately and also limits the effect of the filter on sounds other than writing noise contained in the audio data.

The operation of the present embodiment, in which the audio filter is applied with a strength that depends on the distance between the detection position and the microphone 206, can also be applied to, for example, the second embodiment.

[Fourth Embodiment]
The video conference device according to the fourth embodiment is described below, focusing on the differences from the video conference device 10b according to the third embodiment. The third embodiment described an operation in which, when hovering is detected, the audio filter is applied with a strength that depends on the distance between the detection position on the touch panel and the microphone. The present embodiment describes an operation in which the strength of the audio filter is lowered according to a predetermined scheme.

The configuration of the conference system and the hardware configuration of the video conference device according to the present embodiment are the same as those described in the first embodiment. The functional block configuration of the video conference device according to the present embodiment is the same as that described in the third embodiment.

(Processing of writing noise by the video conference device)
FIG. 13 is a flowchart showing an example of the flow of processing for applying an audio filter to writing noise in the video conference device according to the fourth embodiment. The flow of processing by which the video conference device according to the present embodiment applies the audio filter to writing noise is described with reference to FIG. 13.

<Steps S71 to S77>
The operations of steps S71 to S77 are the same as those of steps S51 to S57 shown in FIG. 12 of the third embodiment, respectively, with the following exceptions. In step S75, the panel detection unit 114 uses its contact detection function to detect whether the stylus has touched the touch panel (contact detection). When contact is detected (step S75: Yes), the process proceeds to step S76. When contact is not detected (step S75: No), the process proceeds to step S78. After step S77, the process proceeds to step S82.

<Step S78>
The control unit 102 determines whether a predetermined time has elapsed since the application unit 103b applied the audio filter at its current strength. When the predetermined time has elapsed (step S78: Yes), the process proceeds to step S79. When it has not elapsed (step S78: No), the process returns to step S75.

<Step S79>
The application unit 103b lowers the strength of the currently applied audio filter by one level (by a predetermined amount) and applies it. In other words, the longer the time from hovering detection to contact detection, that is, the slower the speed at which the stylus approaches the touch panel (hereinafter sometimes called the "contact speed"), the weaker the applied audio filter becomes. The reason is that, in general, a fast contact speed means a strong impact of the stylus on the touch panel and loud writing noise, whereas a slow contact speed means a weak impact and quiet writing noise. Because this step applies a weaker audio filter the longer the time from hovering detection to contact detection, writing noise can be reduced more accurately according to the contact speed. The control unit 102 also resets the elapsed time measured in step S78. The process then proceeds to step S80.

As described above, the strength of the audio filter is lowered by one level (by a predetermined amount) each time a predetermined time elapses between hovering detection and contact detection, but the operation is not limited to this. For example, the strength of the audio filter may be lowered continuously in accordance with the time elapsed from hovering detection until contact detection.
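The two ways of lowering the strength can be compared with a short Python sketch. The document does not give concrete intervals, step sizes or a decay curve, so the millisecond units, the linear ramp in `continuous_strength`, and all parameter names here are assumptions chosen only to make the difference between stepwise and continuous reduction concrete.

```python
def stepped_strength(
    initial: float, elapsed_ms: int, step_interval_ms: int, step_amount: float
) -> float:
    """Lower the strength by step_amount each time step_interval_ms elapses."""
    steps = elapsed_ms // step_interval_ms
    return max(0.0, initial - steps * step_amount)


def continuous_strength(initial: float, elapsed_ms: int, timeout_ms: int) -> float:
    """Lower the strength continuously, here linearly down to zero at timeout.

    The linear shape is an assumption; the document only says "continuously".
    """
    remaining = max(0.0, 1.0 - elapsed_ms / timeout_ms)
    return initial * remaining
```

With `initial=1.0`, `step_interval_ms=100` and `step_amount=0.25`, a contact 250 ms after hovering detection would be filtered at strength 0.5 under the stepwise scheme, while the linear scheme with `timeout_ms=1000` would give 0.75.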

<Step S80>
The control unit 102 determines whether a timeout has occurred, that is, whether a predetermined time has elapsed since the application unit 103b applied the audio filter in step S74 with the strength corresponding to the distance calculated by the calculation unit 105. When a timeout has occurred (step S80: Yes), the process proceeds to step S81. When no timeout has occurred (step S80: No), the process returns to step S75.

<Step S81>
When contact is not detected within the predetermined time after the audio filter was applied and a timeout occurs, the application unit 103b judges that the stylus has moved away from the touch panel and releases the audio filter corresponding to writing noise. The control unit 102 also resets the elapsed time measured in step S80. The process then proceeds to step S82.

<Step S82>
After the application unit 103b has released the audio filter, the control unit 102 determines whether the video conference has ended. When the video conference has ended (step S82: Yes), the series of processes for applying the audio filter to writing noise ends. When the video conference has not ended (step S82: No), the process returns to step S72.

Through steps S71 to S82 above, the video conference device according to the present embodiment applies the audio filter to writing noise.
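The waiting loop formed by steps S75 and S78 to S81 can be simulated with the following Python sketch. It is a simplified model under stated assumptions: time advances in fixed ticks, the contact time is supplied as an input instead of being read from a panel, and the function name `wait_for_contact` and all parameters are hypothetical. Steps S76 and S77, which follow a detected contact, are outside the scope of the sketch.

```python
from typing import Optional, Tuple


def wait_for_contact(
    contact_at_ms: Optional[int],
    initial_strength: float,
    step_interval_ms: int,
    step_amount: float,
    timeout_ms: int,
    tick_ms: int = 10,
) -> Tuple[str, Optional[float]]:
    """Simulate the loop after the filter is applied in step S74.

    Returns ("contact", strength) if contact occurs before the timeout,
    or ("timeout", None) if the filter is released without contact.
    contact_at_ms is None when the stylus never touches the panel.
    """
    strength = initial_strength
    now_ms = 0          # time since the filter was applied (S74)
    since_step_ms = 0   # time since the strength last changed

    while True:
        # S75: has the stylus touched the panel?
        if contact_at_ms is not None and now_ms >= contact_at_ms:
            return ("contact", strength)
        # S78: has the step interval elapsed at the current strength?
        if since_step_ms >= step_interval_ms:
            # S79: lower the strength by one level and reset the step timer.
            strength = max(0.0, strength - step_amount)
            since_step_ms = 0
            # S80: overall timeout since the filter was first applied?
            if now_ms >= timeout_ms:
                # S81: treat the stylus as withdrawn and release the filter.
                return ("timeout", None)
        now_ms += tick_ms
        since_step_ms += tick_ms
```

As in the flowchart described above, the timeout check is reached only after a strength reduction, so the effective timeout is rounded up to a multiple of the step interval.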

As described above, in the present embodiment, when the panel detection unit 114 detects hovering of the stylus, the calculation unit 105 calculates the physical distance between the detection position of the stylus and the microphone 206 (audio input unit 117) from the detected coordinates of the stylus on the touch panel. The application unit 103b then applies an audio filter corresponding to the writing noise of the stylus to the audio data input to (picked up by) the audio input unit 117, with a strength that depends on the distance calculated by the calculation unit 105. In addition, the longer the time from hovering detection to contact detection, that is, the slower the contact speed, the weaker the audio filter that is applied. Writing noise can thereby be reduced more accurately according to the contact speed.

The operation of the present embodiment, in which the audio filter is applied with a strength that depends on the distance between the detection position and the microphone 206 and is weakened as the contact speed becomes slower, can also be applied to, for example, the second embodiment.

In each of the embodiments described above, when at least one of the functional units of the video conference device is realized by executing a program, the program is provided pre-installed in a ROM or the like. The program executed by the video conference device according to each embodiment may also be provided recorded, as a file in an installable or executable format, on a computer-readable storage medium such as a CD-ROM (Compact Disc Read Only Memory), a flexible disk (FD), a CD-R (Compact Disc-Recordable), or a DVD (Digital Versatile Disc). The program may also be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network, or may be provided or distributed via a network such as the Internet. The program executed by the video conference device of each embodiment has a module configuration that includes at least one of the functional units described above. As actual hardware, the CPU 201 reads the program from the storage device described above (for example, the storage device 203) and executes it, whereby the functional units are loaded onto and generated on the main storage device (for example, the memory 202).

1 Conference system
2 Network
10, 10a, 10b, 10_1 to 10_4 Video conference device
20 Conference server
30 Conference reservation server
101 Acquisition unit
102 Control unit
103, 103a, 103b Application unit
104 Determination unit
105 Calculation unit
107 Transmission unit
108 Reception unit
109 Communication unit
110 Imaging control unit
111 Imaging unit
112 Display control unit
113 Display unit
114 Panel detection unit
115 Audio output control unit
116 Audio output unit
117 Audio input unit
118 Storage unit
119 Operation input unit
201 CPU
202 Memory
203 Storage device
204 Touch panel display
204a Other-site display area
205 Camera
206 Microphone
207 Speaker
208 Operation unit
209 Network I/F
301 Pen

JP 2012-027186 A

Claims

A video conference device comprising:
a detection unit that performs hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel; and
an application unit that, when the hovering detection is made by the detection unit, applies, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object,
wherein the detection unit further performs contact detection to detect that the rod-shaped object has touched the touch panel, and
the application unit releases the application of the audio filter when the contact detection is not made within a predetermined time after the hovering detection is made by the detection unit.

The video conference device according to claim 1, wherein the application unit further releases the application of the audio filter when, after the contact detection is made by the detection unit, it is detected that the rod-shaped object has left the touch panel.

A video conference device comprising:
a detection unit that performs hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel;
an application unit that, when the hovering detection is made by the detection unit, applies, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object; and
a determination unit that, when the hovering detection is made by the detection unit, determines the type of the rod-shaped object,
wherein the application unit applies the audio filter corresponding to the type of the rod-shaped object determined by the determination unit.

The video conference device according to claim 3, wherein, when the hovering detection is made by the detection unit, the determination unit determines the type of the rod-shaped object based on a difference in the detection area of the hovering detection.

The video conference device according to claim 3 or 4, wherein the determination unit determines whether the rod-shaped object is a pen or a human finger.

The video conference device according to any one of claims 1 to 5, further comprising a calculation unit that, when the hovering detection is made by the detection unit, calculates the distance between the audio input unit and the detection position of the rod-shaped object on the touch panel at which the hovering detection was made,
wherein the application unit applies the audio filter to the audio data with a strength corresponding to the distance calculated by the calculation unit.

The video conference device according to claim 6, wherein the application unit applies the audio filter while lowering its strength according to the time elapsed since the hovering detection was made by the detection unit.

The video conference device according to claim 7, wherein the application unit lowers the strength of the audio filter by a predetermined amount each time a predetermined time elapses after the hovering detection is made by the detection unit.

The video conference device according to claim 7, wherein the application unit continuously lowers the strength of the audio filter in accordance with the time elapsed since the hovering detection was made by the detection unit.

The video conference device according to any one of claims 1 to 9, further comprising a display unit that is integrated with the touch panel and displays the trajectory of writing made on the touch panel.

An information processing method comprising:
a detection step of performing hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel;
a step of applying, when the hovering detection is made, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object;
a step of performing contact detection to detect that the rod-shaped object has touched the touch panel; and
a step of releasing the application of the audio filter when the contact detection is not made within a predetermined time after the hovering detection is made.

An information processing method comprising:
a detection step of performing hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel;
a step of applying, when the hovering detection is made, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object;
a determination step of determining, when the hovering detection is made, the type of the rod-shaped object; and
a step of applying the audio filter corresponding to the determined type of the rod-shaped object.

A program for causing a computer to function as:
a detection unit that performs hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel; and
an application unit that, when the hovering detection is made by the detection unit, applies, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object,
wherein the detection unit further performs contact detection to detect that the rod-shaped object has touched the touch panel, and
the application unit releases the application of the audio filter when the contact detection is not made within a predetermined time after the hovering detection is made by the detection unit.

A program for causing a computer to function as:
a detection unit that performs hovering detection to detect that a rod-shaped object has come within a predetermined distance of a touch panel;
an application unit that, when the hovering detection is made by the detection unit, applies, to audio data picked up from an audio input unit, an audio filter for reducing writing noise generated when writing on the touch panel with the rod-shaped object; and
a determination unit that, when the hovering detection is made by the detection unit, determines the type of the rod-shaped object,
wherein the application unit applies the audio filter corresponding to the type of the rod-shaped object determined by the determination unit.