JPH1141577A - Speaker position detector - Google Patents

Speaker position detector

Info

Publication number
JPH1141577A
JPH1141577A JP9193630A JP19363097A
Authority
JP
Japan
Prior art keywords
speaker
sound source
sensor
map
person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP9193630A
Other languages
Japanese (ja)
Inventor
Hironori Kitagawa
博紀 北川
Naoji Matsuo
直司 松尾
Shigemi Osada
茂美 長田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP9193630A priority Critical patent/JPH1141577A/en
Publication of JPH1141577A publication Critical patent/JPH1141577A/en
Withdrawn legal-status Critical Current


Landscapes

  • Studio Devices (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

PROBLEM TO BE SOLVED: To detect a speaker's position with high accuracy by judging it from two pieces of information: the detected sound source position and the detected person position. SOLUTION: A sound source position detection part 3-1 of a control part 3 prepares a sound source position map from the sound source information input from a microphone array 1. Based on the image information input by a sensor 2 for image input, a human body position detection part 3-2 prepares a human body position map. Each map partitions the range of space detectable by the microphone array 1 and the sensor 2 into domains and holds, for each domain, the calculated probability that the sound source or human body is present. A speaker position discrimination part 3-3 calculates the product of the probabilities in corresponding domains of the sound source and human body position maps and discriminates the domain having the largest product as the speaker position. Any one of an ultrasonic sensor, an infrared sensor, or a television camera is used as the sensor 2.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a new speaker position detecting device, used in video conferencing and the like, that improves the accuracy with which the speaker's position is detected. Video conferencing is currently spreading rapidly. In a video conference, a high-precision means of detecting the speaker's position is needed in order to point the camera at the speaker, focus the camera on the speaker, and pick up only the necessary speech.

[0002]

2. Description of the Related Art

In the speaker position detecting devices used in conventional video conference systems, a directional microphone is installed for each seat; when someone speaks, the person sitting in front of the microphone receiving the loudest input is determined to be the speaker. In another example, disclosed in Japanese Patent Application Publication No. JP-A-5-244587, a plurality of microphones arranged horizontally and radially at equal intervals determine the speaker's direction in the XZ plane; in addition, an image input from a camera is processed by an edge detection circuit to recognize contour patterns, pattern matching against pre-registered person shapes determines the person's position, and a motion detection circuit then detects lip movement to identify the speaker's position.

[0003]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, among the conventional examples above, the method that fixes a one-to-one correspondence between microphones and people cannot cope with a moving speaker. The other conventional example requires the front of the speaker's face to be visible in order to detect lip movement, and its microphones can detect the sound source only as a direction in the XZ plane.

[0004] The object of the present invention is to detect the speaker's position with high accuracy, without requiring a one-to-one correspondence between microphones and people, even when the speaker's face is not visible from the front or the speaker is talking while moving.

[0005]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the speaker position detecting device of the present invention comprises: means for processing input signals from a sensor such as a television camera, an ultrasonic sensor, or an infrared sensor to detect a person's position; means for processing input signals from a microphone array to detect a sound source's position; and means for determining the speaker's position by processing these two kinds of information together, so that a speaker at an arbitrary position can be detected.

[0006] In the above device, the speaker position detecting means may also obtain the position in the XY directions only from the image and the position in the Z direction only from the sound source. Since the image processing then takes place on a two-dimensional plane, the image processing time is shortened while the speaker's position can still be detected. Alternatively, the XYZ coordinates of the sound source position may be obtained first and only the sensor signals around that position processed; this makes it possible to omit creating the sound source position map and to skip the image processing for person detection outside the vicinity of the sound source, speeding up person position detection with almost no loss of speaker detection accuracy.

[0007] Furthermore, by mounting the sensor on a turntable, the sensor can be rotated toward the sound source position even when the speaker is in the sensor's blind spot, so that speakers in the blind spot can also be detected. The device may additionally be provided with a speaker-position calibration function, so that whether the speaker has been identified correctly can be checked on a display device and corrected when it is wrong.

[0008]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The basic configuration of the present invention is described with reference to FIG. 1. From the sound source information input by the microphone array 1, the sound source position detection unit 3-1 of the control unit 3 creates a sound source position map and passes it to the speaker position determination unit 3-3. From the image information input by the image input sensor 2, the person position detection unit 3-2 creates a person position map and passes it to the speaker position determination unit 3-3. Taking the horizontal, vertical, and depth directions as seen from the sensor to be the X, Y, and Z directions respectively, each map divides the range of space detectable by the microphone array and the sensor into cells at regular intervals along the X, Y, and Z axes and holds, for each cell, the computed probability that a sound source or a person is present there; the sound source position map and the person position map cover the same space. The speaker position determination unit 3-3 computes the product of the probabilities in each pair of corresponding cells of the two maps, determines the cell with the largest product to be the speaker's position, and passes the speaker position information to other equipment.
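The cell-wise product and arg-max described above can be sketched as follows (an illustration only, not part of the patent; Python with NumPy, the grid size, and the probability values are assumptions):

```python
import numpy as np

def locate_speaker(sound_map: np.ndarray, person_map: np.ndarray):
    """Fuse the two probability maps: multiply the sound-source and
    person probabilities cell by cell and return the cell with the
    largest product as the speaker's position."""
    assert sound_map.shape == person_map.shape
    joint = sound_map * person_map
    cell = np.unravel_index(np.argmax(joint), joint.shape)
    return tuple(int(i) for i in cell), float(joint[cell])

# Toy 4x3x2 grid. A loud cell with no person in it loses to a quieter
# cell that also contains a person.
sound = np.full((4, 3, 2), 0.01)
sound[1, 2, 0] = 0.9      # person speaking here
sound[3, 0, 1] = 0.5      # loud source (e.g. a loudspeaker), no person
person = np.full((4, 3, 2), 0.05)
person[1, 2, 0] = 0.8
cell, score = locate_speaker(sound, person)
print(cell)               # (1, 2, 0)
```

This is why the product is used rather than either map alone: a cell must be both acoustically active and occupied by a person to win.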

[0009] An outline of the processing of the present invention is described with reference to the flowchart of FIG. 2. First, in step S1, when there is speech input, the sound source position detection unit 3-1 analyzes the input signal of the microphone array 1, computes the sound source probability for each cell of the sound source position map, and completes the sound source map. Next, in step S2, the person position detection unit 3-2 applies image processing to the input signal from the sensor 2, computes the person probability for each cell of the person position map, and completes the person position map. Steps S1 and S2 both run continuously, and neither necessarily precedes the other. Once the two maps are complete, in step S3 the speaker position determination unit 3-3 computes the product of the corresponding cells of the sound source position map and the person position map and determines the cell with the largest product to be the speaker's position. Once the speaker's position has been identified, in step S4 the speaker position determination unit 3-3 passes it to other equipment.

[0010]

Embodiment 1

FIG. 3 shows an embodiment of the present invention. When speech occurs, the sound source position detection unit 3-1 divides the space within the detection range into cells at regular intervals, computes for each cell the probability that the sound source is present from the audio information supplied by the microphone array 1, and creates the sound source position map. The XYZ coordinates of the sound source can be obtained from the audio information from the microphone array 1.
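The patent does not say how the array recovers the source coordinates; a common technique is time-difference-of-arrival (TDOA) estimation between microphone pairs, sketched here under that assumption (the signals, sample rate, and cross-correlation approach are all illustrative):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximately, at room temperature

def tdoa_seconds(sig_a: np.ndarray, sig_b: np.ndarray, fs: float) -> float:
    """Time difference of arrival between two microphones, taken from
    the peak of the cross-correlation; positive means the sound
    reached mic A first."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)  # samples by which B trails A
    return lag / fs

# Synthetic check: the same click reaches mic B 5 samples after mic A.
fs = 8000.0
a = np.zeros(64); a[20] = 1.0
b = np.zeros(64); b[25] = 1.0
dt = tdoa_seconds(a, b, fs)
print(round(dt * fs))  # 5
# With mic spacing d, the bearing follows from sin(theta) = SPEED_OF_SOUND * dt / d;
# delays across several microphone pairs pin down X, Y, and Z together.
```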

[0011] The person position detection unit 3-2 continuously divides the space within the detection range into cells at regular intervals based on the information from the sensor 2, obtains the probability that a person is present in each cell, and creates the person position map. Person positions are detected using template matching. For the sensor 2, any one of an ultrasonic sensor 2-2, an infrared sensor 2-3, or a television camera 2-4 is used, and the sensor 2 is mounted on a turntable 4. When the ultrasonic sensor 2-2 is used, it is combined with an ultrasonic transmitter 2-1.
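The patent names template matching without detailing it; one standard formulation is normalized cross-correlation, sketched below (an assumption for illustration; the template shape and image are synthetic, and the scores would feed the per-cell person probabilities):

```python
import numpy as np

def match_scores(image: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Slide a template over a grayscale image and score every position
    with normalized cross-correlation in [-1, 1]; higher means a closer
    match to the registered person shape."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            out[y, x] = (p * t).sum() / denom if denom > 0 else 0.0
    return out

# A 3x3 "head" template planted at row 2, column 4 of a flat background.
template = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=float)
image = np.zeros((8, 10)); image[2:5, 4:7] = template
scores = match_scores(image, template)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print(best)  # (2, 4) — exactly where the template was planted
```

In practice a library routine (e.g. an optimized matcher) would replace the double loop; the brute-force version keeps the arithmetic visible.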

[0012] When the sound source position detection range is wider than the sensing range of the sensor, the speaker position determination unit 3-3 instructs the turntable control unit 3-4 to rotate the turntable 4, associates the rotation angle with the sensing range, and aligns the sound source position detection range with the sensor's sensing range, thereby eliminating the sensor's blind spots. The sound source position map and the person position map cover the same space, are divided into cells at regular intervals, and correspond cell for cell, one to one.
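The rotate-toward-the-source step can be sketched as follows (illustrative only; the coordinate convention, field-of-view width, and function names are assumptions, not the patent's):

```python
import math

def pan_angle_deg(x: float, z: float) -> float:
    """Horizontal angle (degrees) from the sensor's optical axis to a
    sound source, with X to the right and Z pointing away from the
    sensor; this is the angle to command the turntable to."""
    return math.degrees(math.atan2(x, z))

def turntable_command(source_xz, current_pan_deg, fov_deg=60.0):
    """Rotate only when the source lies outside the sensor's field of
    view, i.e. in a blind spot; otherwise leave the sensor where it is."""
    target = pan_angle_deg(*source_xz)
    if abs(target - current_pan_deg) > fov_deg / 2:
        return target            # new pan angle to command
    return current_pan_deg       # source already visible; do not move

cmd = turntable_command((1.0, 1.0), 0.0)   # source at ~45 degrees, outside +/-30
print(round(cmd))                          # 45
print(turntable_command((0.2, 2.0), 0.0))  # 0.0 — within the FOV, no rotation
```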

[0013] When the sound source position map and the person position map are both complete, the speaker position determination unit 3-3 computes the product of the two maps' probabilities for each cell and identifies the cell with the largest product as the speaker's position. When the speaker position determination unit 3-3 has the calibration function, the input image from the sensor and the part judged to be the speaker's position are shown on the display device 6, a person judges whether the identified position is valid, and the correct speaker position can be entered from the input device 5. The speaker position determination unit 3-3 passes the identified speaker position information to other equipment.

[0014] The other equipment might be, for example, a video conference system: based on the speaker position information, the system can select, switch, rotate, or zoom the conference's input cameras, emphasize only the speaker's voice, point the display toward the speaker, and so on. In the embodiment described above, the sound source position map and the person position map are created in full and the speaker probability is computed for every cell, but it is not necessary to compute the probability for every cell. For example, by first obtaining the sound source position from the microphone array and restricting the sensor's person detection to the map cells within a certain range of that position, processing can be sped up considerably with almost no loss of speaker detection accuracy. Conversely, one could detect the person positions with the sensor first and compute the sound source probabilities only around the detected person positions to identify the speaker; in that case, however, the time-consuming image processing by the sensor comes first, so little speed-up can be expected.
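The narrowing step above amounts to building a region-of-interest mask around the sound source's cell and running person detection only inside it; a sketch (the grid size, radius, and Chebyshev-distance neighborhood are illustrative choices):

```python
import numpy as np

def roi_mask(shape, center, radius):
    """Boolean mask of grid cells within `radius` cells (Chebyshev
    distance) of the sound source's cell — the only cells where the
    costly person detection needs to run."""
    grids = np.indices(shape)
    center = np.reshape(center, (-1,) + (1,) * len(shape))
    dist = np.max(np.abs(grids - center), axis=0)
    return dist <= radius

shape = (20, 20, 10)                  # full map: 4000 cells
mask = roi_mask(shape, center=(5, 7, 3), radius=2)
print(int(mask.sum()))                # 125 — a 5x5x5 neighborhood instead of 4000 cells
```

Cells outside the mask simply keep probability zero, so the product-and-arg-max fusion step is unchanged.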

[0015] As another method, instead of creating the maps by dividing three-dimensional space as described above, processing can be sped up by creating the person position map on the two-dimensional XY plane, obtaining the Z-axis direction from the sound source position map, and identifying their intersection as the speaker's position.
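Combining the 2-D person map with the 1-D depth estimate reduces to two small arg-max operations whose indices intersect (a sketch under assumed toy data, not the patent's implementation):

```python
import numpy as np

def speaker_from_2d_and_depth(person_xy: np.ndarray, sound_z: np.ndarray):
    """Cheaper variant: person detection yields a 2-D XY probability map,
    the microphone array yields a 1-D probability over depth cells (Z);
    the speaker cell is the intersection of the two maxima."""
    y, x = np.unravel_index(np.argmax(person_xy), person_xy.shape)
    z = int(np.argmax(sound_z))
    return int(x), int(y), z

person_xy = np.zeros((4, 6)); person_xy[1, 3] = 0.9   # person at x=3, y=1
sound_z = np.array([0.1, 0.2, 0.6, 0.1])              # source most likely at z=2
print(speaker_from_2d_and_depth(person_xy, sound_z))  # (3, 1, 2)
```

The trade-off is that the image-side work drops from a 3-D grid to one plane, at the cost of assuming the loudest depth and the most person-like XY cell belong to the same person.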

[0016]

EFFECTS OF THE INVENTION

In the present invention, judging from the two kinds of information together — the sound source position detected by the microphone array and the person position detected by processing the sensor's input signal — makes it possible to detect the speaker's position with higher accuracy than before, and at the same time to identify speakers who are moving or whose lips cannot be detected in the image, which was previously impossible. In addition, by mounting the sensor on a turntable, the speaker's position can be determined even when the speaker is in the sensor's blind spot, by rotating the turntable.

[Brief description of the drawings]

FIG. 1 is a basic configuration diagram of the present invention.

FIG. 2 is a flowchart outlining the processing.

FIG. 3 shows an embodiment of the present invention.

[Explanation of symbols]

1 Microphone array
2 Sensor for image input
2-1 Ultrasonic wave generator
2-2 Ultrasonic sensor
2-3 Infrared sensor
2-4 Television camera
3 Control unit
3-1 Sound source position detection unit
3-2 Person position detection unit
3-3 Speaker position determination unit
3-4 Camera control unit
4 Turntable
5 Input device such as a keyboard and mouse
6 Display device such as a display

Claims (3)

[Claims]

1. A speaker position detecting device comprising: means for processing an input signal from any one sensor such as a television camera, an ultrasonic sensor, or an infrared sensor to detect a person's position; means for processing an input signal from a microphone array to detect a sound source's position; and means for determining the speaker's position by processing the above two kinds of information together, whereby a speaker at an arbitrary position within the sensing range can be detected.
2. The speaker position detecting device according to claim 1, wherein, taking the horizontal direction within the sensing range as seen from the sensor to be X, the vertical direction to be Y, and the depth direction to be Z, the speaker position detecting means obtains the position in the XY directions from the image and the position in the Z direction from the sound source to detect the speaker's position.
3. The speaker position detecting device according to claim 1, wherein the sensor is made rotatable, so that even when a speaker is in the sensor's blind spot, the speaker in the blind spot can be detected by rotating the sensor based on the sound source position.
JP9193630A 1997-07-18 1997-07-18 Speaker position detector Withdrawn JPH1141577A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP9193630A JPH1141577A (en) 1997-07-18 1997-07-18 Speaker position detector

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP9193630A JPH1141577A (en) 1997-07-18 1997-07-18 Speaker position detector

Publications (1)

Publication Number Publication Date
JPH1141577A true JPH1141577A (en) 1999-02-12

Family

ID=16311147

Family Applications (1)

Application Number Title Priority Date Filing Date
JP9193630A Withdrawn JPH1141577A (en) 1997-07-18 1997-07-18 Speaker position detector

Country Status (1)

Country Link
JP (1) JPH1141577A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000077537A1 (en) * 1999-06-11 2000-12-21 Japan Science And Technology Corporation Method and apparatus for determining sound source
WO2001095314A1 (en) * 2000-06-09 2001-12-13 Japan Science And Technology Corporation Robot acoustic device and robot acoustic system
WO2002008782A1 (en) * 2000-07-20 2002-01-31 Robert Bosch Gmbh Method for the acoustic localization of persons in an area of detection
US6516066B2 (en) 2000-04-11 2003-02-04 Nec Corporation Apparatus for detecting direction of sound source and turning microphone toward sound source
US6583723B2 (en) 2001-02-23 2003-06-24 Fujitsu Limited Human interface system using a plurality of sensors
JP2003189273A (en) * 2001-12-20 2003-07-04 Sharp Corp Speaker identifying device and video conference system provided with speaker identifying device
JP2004126941A (en) * 2002-10-02 2004-04-22 P To Pa:Kk Image display apparatus, image display method and program
JP2004126784A (en) * 2002-09-30 2004-04-22 P To Pa:Kk Image display apparatus, image display method and program
US6795558B2 (en) * 1997-06-26 2004-09-21 Fujitsu Limited Microphone array apparatus
JP2005141687A (en) * 2003-11-10 2005-06-02 Nippon Telegr & Teleph Corp <Ntt> Method, device, and system for object tracing, program, and recording medium
JP2006245725A (en) * 2005-03-01 2006-09-14 Yamaha Corp Microphone system
KR100754385B1 (en) 2004-09-30 2007-08-31 삼성전자주식회사 Apparatus and method for object localization, tracking, and separation using audio and video sensors
JP2008113164A (en) * 2006-10-30 2008-05-15 Yamaha Corp Communication apparatus
JP2008145574A (en) * 2006-12-07 2008-06-26 Nec Access Technica Ltd Sound source direction estimation device, sound source direction estimation method and robot device
JP2009517936A (en) * 2005-11-30 2009-04-30 ノエミ バレンズエラ ミリアム Method for recording and playing back sound sources with time-varying directional characteristics
JP2010010857A (en) * 2008-06-25 2010-01-14 Oki Electric Ind Co Ltd Voice input robot, remote conference support system, and remote conference support method
JP2010251916A (en) * 2009-04-13 2010-11-04 Nec Casio Mobile Communications Ltd Sound data processing device and program
US7852369B2 (en) * 2002-06-27 2010-12-14 Microsoft Corp. Integrated design for omni-directional camera and microphone array
JP2011071702A (en) * 2009-09-25 2011-04-07 Fujitsu Ltd Sound pickup processor, sound pickup processing method, and program
US8249298B2 (en) 2006-10-19 2012-08-21 Polycom, Inc. Ultrasonic camera tracking system and associated methods
JP2014511476A (en) * 2011-02-10 2014-05-15 アトラス・コプコ・インダストリアル・テクニーク・アクチボラグ Positioning system for determining the position of an object
JP2018501671A (en) * 2015-11-27 2018-01-18 シャオミ・インコーポレイテッド Camera head photographing angle adjusting method and apparatus

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6795558B2 (en) * 1997-06-26 2004-09-21 Fujitsu Limited Microphone array apparatus
WO2000077537A1 (en) * 1999-06-11 2000-12-21 Japan Science And Technology Corporation Method and apparatus for determining sound source
US7035418B1 (en) 1999-06-11 2006-04-25 Japan Science And Technology Agency Method and apparatus for determining sound source
US6516066B2 (en) 2000-04-11 2003-02-04 Nec Corporation Apparatus for detecting direction of sound source and turning microphone toward sound source
US7215786B2 (en) 2000-06-09 2007-05-08 Japan Science And Technology Agency Robot acoustic device and robot acoustic system
WO2001095314A1 (en) * 2000-06-09 2001-12-13 Japan Science And Technology Corporation Robot acoustic device and robot acoustic system
WO2002008782A1 (en) * 2000-07-20 2002-01-31 Robert Bosch Gmbh Method for the acoustic localization of persons in an area of detection
US7224809B2 (en) 2000-07-20 2007-05-29 Robert Bosch Gmbh Method for the acoustic localization of persons in an area of detection
US6583723B2 (en) 2001-02-23 2003-06-24 Fujitsu Limited Human interface system using a plurality of sensors
US6686844B2 (en) 2001-02-23 2004-02-03 Fujitsu Limited Human interface system using a plurality of sensors
JP2003189273A (en) * 2001-12-20 2003-07-04 Sharp Corp Speaker identifying device and video conference system provided with speaker identifying device
US7852369B2 (en) * 2002-06-27 2010-12-14 Microsoft Corp. Integrated design for omni-directional camera and microphone array
JP2004126784A (en) * 2002-09-30 2004-04-22 P To Pa:Kk Image display apparatus, image display method and program
JP2004126941A (en) * 2002-10-02 2004-04-22 P To Pa:Kk Image display apparatus, image display method and program
JP2005141687A (en) * 2003-11-10 2005-06-02 Nippon Telegr & Teleph Corp <Ntt> Method, device, and system for object tracing, program, and recording medium
JP4490076B2 (en) * 2003-11-10 2010-06-23 日本電信電話株式会社 Object tracking method, object tracking apparatus, program, and recording medium
KR100754385B1 (en) 2004-09-30 2007-08-31 삼성전자주식회사 Apparatus and method for object localization, tracking, and separation using audio and video sensors
US7536029B2 (en) 2004-09-30 2009-05-19 Samsung Electronics Co., Ltd. Apparatus and method performing audio-video sensor fusion for object localization, tracking, and separation
JP2006245725A (en) * 2005-03-01 2006-09-14 Yamaha Corp Microphone system
JP2009517936A (en) * 2005-11-30 2009-04-30 ノエミ バレンズエラ ミリアム Method for recording and playing back sound sources with time-varying directional characteristics
US8249298B2 (en) 2006-10-19 2012-08-21 Polycom, Inc. Ultrasonic camera tracking system and associated methods
JP2008113164A (en) * 2006-10-30 2008-05-15 Yamaha Corp Communication apparatus
JP2008145574A (en) * 2006-12-07 2008-06-26 Nec Access Technica Ltd Sound source direction estimation device, sound source direction estimation method and robot device
JP2010010857A (en) * 2008-06-25 2010-01-14 Oki Electric Ind Co Ltd Voice input robot, remote conference support system, and remote conference support method
JP2010251916A (en) * 2009-04-13 2010-11-04 Nec Casio Mobile Communications Ltd Sound data processing device and program
JP2011071702A (en) * 2009-09-25 2011-04-07 Fujitsu Ltd Sound pickup processor, sound pickup processing method, and program
JP2014511476A (en) * 2011-02-10 2014-05-15 アトラス・コプコ・インダストリアル・テクニーク・アクチボラグ Positioning system for determining the position of an object
JP2018501671A (en) * 2015-11-27 2018-01-18 シャオミ・インコーポレイテッド Camera head photographing angle adjusting method and apparatus
US10375296B2 (en) 2015-11-27 2019-08-06 Xiaomi Inc. Methods apparatuses, and storage mediums for adjusting camera shooting angle

Similar Documents

Publication Publication Date Title
JPH1141577A (en) Speaker position detector
EP1715717B1 (en) Moving object equipped with ultra-directional speaker
JP3195920B2 (en) Sound source identification / separation apparatus and method
US20020140804A1 (en) Method and apparatus for audio/image speaker detection and locator
CN109565629B (en) Method and apparatus for controlling processing of audio signals
JP2004514359A (en) Automatic tuning sound system
CN111918018B (en) Video conference system, video conference apparatus, and video conference method
KR101808714B1 (en) Vehicle Center Fascia Control Method Based On Gesture Recognition By Depth Information And Virtual Touch Sensor
US20140086551A1 (en) Information processing apparatus and information processing method
EP2031905A2 (en) Sound processing apparatus and sound processing method thereof
JP2023024471A (en) Information processor and method for processing information
CN113014844A (en) Audio processing method and device, storage medium and electronic equipment
US11514108B2 (en) Content search
KR20130046759A (en) Apparatus and method for recogniting driver command in a vehicle
CN113853529A (en) Apparatus, and associated method, for spatial audio capture
JPH11313272A (en) Video/audio output device
US20230186642A1 (en) Object detection method
JP2016161626A (en) Control device, program, and projection system
JPH09182044A (en) Television conference system
JP2000041228A (en) Speaker position detector
JP2001194141A (en) Method for detecting position of moving body
JP2003078818A (en) Telop device
US20230421984A1 (en) Systems and methods for dynamic spatial separation of sound objects
EP4250768A1 (en) An apparatus for mapping sound source direction
WO2023054047A1 (en) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
A300 Application deemed to be withdrawn because no request for examination was validly filed

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 20041005