JPH09275533A - Signal processor - Google Patents

Signal processor

Info

Publication number
JPH09275533A
Authority
JP
Japan
Prior art keywords
line
signal
sound
sight
sound source
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
JP8085214A
Other languages
Japanese (ja)
Inventor
Hideo Nakaya
秀雄 中屋
Tetsujiro Kondo
哲二郎 近藤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp

Landscapes

  • Position Input By Displaying (AREA)
  • Television Receiver Circuits (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

PROBLEM TO BE SOLVED: To let a viewer correctly pick out the sound from the source the viewer is watching, by detecting the viewer's line of sight on a monitor screen, selecting the audio signal from the sound source located in the line-of-sight direction, and weighting the audio signals accordingly. SOLUTION: The part of the display screen shown on a monitor device 7 that a viewer 13 is watching is detected. For instance, a compact video camera 11 is fixed near the monitor device 7 on which the image is displayed and photographs the face of the viewer 13. The eye areas are detected in the whole-face image, and the eyelid and eyeball areas are then separated within the eye areas. A line-of-sight detection circuit 12 detects the direction of the line of sight of the viewer 13 from changes in eyelid shape and changes in the positional relation of the eyeballs. The position on the display screen captured by the line of sight is then identified. If a sound source is included in the identified part, the audio signal generated by that sound source is identified. The weight of this signal is increased, the weights of the other signals are reduced, and the synthesized stereo signal is output.

Description

Detailed Description of the Invention

[0001]

Field of the Invention: The present invention relates to a signal processing device, and more particularly to a signal processing device, such as a television receiver that outputs video and audio simultaneously, which applies line-of-sight processing that weights the audio according to the viewer's line-of-sight direction, so that individual sounds can be separated and distinguished.

[0002]

Description of the Related Art: In an ordinary acoustic space, we can pick out the audio signal we are attending to. For example, even at a party venue where several speakers are talking at once, a listener can follow a particular speaker's words to a considerable degree. In conventional video/audio signal processing devices such as television receivers, however, it has been quite difficult for a viewer to pick out a particular sound when several sound sources are sounding simultaneously.

[0003] For example, in a television drama, no method has conventionally been used to bring forward, from a scene in which several on-screen characters are speaking at the same time, the voice of the particular character the viewer is watching. Television dramas therefore have their characters speak in turn.

[0004] Likewise, with a television monitoring device or a video conference device, even when the user wants to focus on a particular part of the scene or a particular speaker and pick out the sound nearby, raising the volume also amplifies the surrounding noise and makes the sound harder, not easier, to hear. The audio the viewer hears therefore has no depth, no three-dimensional sound space is presented to the viewer, and it can be difficult for the viewer to correctly pick out the signal from a particular speaker or sound source.

[0005]

Problems to Be Solved by the Invention: As described above, conventional video/audio signal processing devices have the problem that it is difficult for the viewer to correctly pick out the signal from a particular speaker or sound source.

[0006] The present invention is intended to solve this problem of the prior art. Its object is to determine from the viewer's line-of-sight direction which sound source the viewer is attending to, and to weight the volume of the sound source in that direction, so that the viewer can correctly pick out the signal from the particular speaker or sound source being attended to.

[0007]

Means for Solving the Problems: To achieve the above object, the present invention provides a signal processing device, such as a television receiver or a video playback device, comprising image monitor means for displaying a video signal and receiving means for outputting an audio signal, characterized by further comprising: line-of-sight detection means for detecting the line-of-sight direction, on the monitor device, of a viewer of the signal processing device; sound source identification means for identifying the sound source located in the line-of-sight direction detected by the line-of-sight detection means; and audio signal weighting means for weighting the audio signals by selecting and amplifying the audio signal from the sound source identified by the sound source identification means and suppressing the audio signals from the other sound sources.

[0008] The invention also provides a signal processing device, such as a video conference device or a television monitoring device, comprising first imaging means for imaging an object, image monitor means for displaying the video signal captured by the first imaging means, sound receiving means for collecting the audio signal emitted by the object, and receiving means for outputting the audio signal collected by the sound receiving means, characterized by further comprising: line-of-sight detection means for detecting the line-of-sight direction, on the monitor device, of a viewer of the signal processing device; sound source identification means for identifying the sound source located in the line-of-sight direction detected by the line-of-sight detection means; and audio signal weighting means for weighting the audio signals by selecting and amplifying the audio signal from the sound source identified by the sound source identification means and suppressing the audio signals from the other sound sources. Alternatively, the device is characterized by comprising receiving direction moving means for moving the imaging direction of the first imaging means and the sound collecting direction of the sound receiving means toward the line-of-sight direction detected by the line-of-sight detection means.

[0009] The invention further provides a signal processing method that outputs a video signal and an audio signal simultaneously, characterized by comprising: a line-of-sight detection function that detects the line-of-sight direction of a viewer watching the video signal output; a sound source identification function that identifies the sound source located in the line-of-sight direction detected by the line-of-sight detection function; and an audio signal output function that selects and amplifies the audio signal from the sound source identified by the sound source identification function, suppresses the audio signals from the other sound sources, and outputs the audio signals thus weighted.

[0010] In this way, the signal from the particular speaker or sound source that the viewer is attending to on the screen of the monitor device is weighted so that its sound is brought forward, and the viewer can distinguish that sound from surrounding noise and hear it correctly.

[0011]

Embodiments of the Invention: A signal processing device according to the present invention is described in detail below with reference to the accompanying drawings. FIG. 1 is a block diagram of a television receiver with a built-in video recorder, which is one embodiment of the signal processing device of the present invention. In FIG. 1, 1 is an antenna that receives broadcast television radio signals; 2 is a tuner, the receiving circuit for the television radio signals; 3 is a video recorder that supplies a television signal in place of the tuner 2; 4 is a changeover switch that switches between the tuner 2 and the video recorder 3; 5 is a video/audio separation circuit that separates the video signal and the audio signal from the television signal; 6 is a video processing circuit that performs demodulation, amplification, color reproduction, and the like of the video signal; 7 is a monitor device that reproduces the video signal; 8 is an audio processing circuit that performs demodulation, stereo amplification, and weighting of the audio signal; 9 is a left speaker; 10 is a right speaker; 11 is a video camera that captures the viewer 13; 12 is a line-of-sight detection circuit that detects the line of sight of the viewer 13 from the image captured by the video camera 11; and 13 is the viewer.

[0012] Next, the operation of the signal processing device shown in FIG. 1 is described. Assume that a television broadcast signal received by the antenna 1 and selected by the tuner, or a playback signal from video software, is input through the changeover switch 4 to the video/audio separation circuit 5; that the video signal is processed by the video processing circuit 6 and displayed on the monitor device 7; and that a plurality of audio signals are processed by the audio processing circuit 8 and reproduced in stereo from the left and right speakers 9 and 10.

[0013] In this state, it is first detected which part of the display screen shown on the monitor device 7 the viewer 13 is watching. For example, a small video camera 11 is fixed near the monitor device 7 on which the image is displayed, and captures the face of the viewer 13. The eye region is detected in the whole-face image of the viewer 13, the eyelid region and the eyeball region are further separated within the eye region, and the line-of-sight detection circuit 12 detects the direction of the line of sight of the viewer 13 from changes in the shape of the eyelids and changes in the positional relationship of the eyeballs. For this detection, the method disclosed in JP-A-6-6786, for example, can be used.

[0014] Once the line-of-sight detection circuit 12 has detected the direction of the line of sight in this way, the position on the display screen at which the line of sight is directed is identified. If an object generating an audio signal is present at that position, the audio signal generated by that object is identified, its weight is increased, the weights of the other audio signals are decreased, and a synthesized stereo signal is output. For this weighting of audio signals, the method disclosed in JP-A-6-301390, for example, can be used.
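As an illustration only, the selection described in paragraph [0014] might be sketched as follows in Python. This sketch is not part of the disclosure and is not the method of JP-A-6-301390. The normalized screen coordinates, the `SoundSource` type, and the `radius`, `boost`, and `cut` parameters are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class SoundSource:
    """Hypothetical on-screen sound source (not defined in the source text)."""
    name: str
    x: float  # horizontal screen position, 0.0 (left) to 1.0 (right)
    y: float  # vertical screen position, 0.0 (top) to 1.0 (bottom)


def gaze_weights(sources, gaze_x, gaze_y, radius=0.15, boost=2.0, cut=0.5):
    """Return one weight per source.

    The source nearest the gaze point gets `boost` and the others get `cut`,
    provided the nearest source lies within `radius` of the gaze point.
    If no source is that close, every weight stays at 1.0.
    """
    if not sources:
        return []
    distances = [((s.x - gaze_x) ** 2 + (s.y - gaze_y) ** 2) ** 0.5
                 for s in sources]
    nearest = min(range(len(sources)), key=distances.__getitem__)
    if distances[nearest] > radius:
        return [1.0] * len(sources)
    return [boost if i == nearest else cut for i in range(len(sources))]
```

With two sources at the left and right of the screen, a gaze point near the left source would raise its weight and lower the other, while a gaze point far from both leaves the weights unchanged.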

[0015] FIG. 2 shows an example of the audio weighting circuit included in the audio processing circuit 8. In FIG. 2, 21 is a weighting coefficient determination circuit; 22-1 to 22-n are left-side audio processing circuits (Lw1 to Lwn); 23-1 to 23-n are right-side audio processing circuits (Rw1 to Rwn); 24 is a left-side adder circuit; 25 is a right-side adder circuit; 26 is a left-side amplifier circuit; and 27 is a right-side amplifier circuit.

[0016] The weighting coefficient determination circuit 21 of the weighting circuit in FIG. 2 uses the line-of-sight signal from the line-of-sight detection circuit 12 to detect which of the sound sources on the screen the line of sight of the viewer 13 is directed at. It determines weighting coefficients that increase the weight of the audio signal of that sound source, and sends these coefficients to the audio processing circuits 22-1 to 22-n and 23-1 to 23-n, which apply the weighting.

[0017] The three-dimensional audio signals S1 to Sn from the sound sources are weighted for each of the left and right speakers in the audio processing circuits 22-1 to 22-n and 23-1 to 23-n, using the weighting coefficients specified by the weighting coefficient determination circuit 21, and are thereby separated into left and right. The outputs of the left-side audio processing circuits 22-1 to 22-n are then mixed in the left-side adder circuit 24, amplified in the left-side amplifier circuit 26, and output from the left speaker 9. The outputs of the right-side audio processing circuits 23-1 to 23-n are mixed in the right-side adder circuit 25, amplified in the right-side amplifier circuit 27, and output from the right speaker 10.
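The signal path of FIG. 2 can be read as a pair of weighted sums followed by amplification. The Python sketch below shows that reading under stated assumptions: each signal S1 to Sn is a list of samples of equal length, the per-source coefficients Lw and Rw are plain multipliers, and the amplifier circuits 26 and 27 are modeled as a single `gain` factor. The function name and parameters are hypothetical.

```python
def mix_stereo(signals, left_weights, right_weights, gain=1.0):
    """Weight each source per channel, sum the results, then amplify.

    signals        -- list of n equal-length sample lists (S1..Sn)
    left_weights   -- n coefficients standing in for Lw1..Lwn
    right_weights  -- n coefficients standing in for Rw1..Rwn
    gain           -- stand-in for the amplifier circuits 26 and 27
    """
    if not (len(signals) == len(left_weights) == len(right_weights)):
        raise ValueError("one left and one right weight is needed per signal")
    if not signals:
        return [], []
    length = len(signals[0])
    if any(len(sig) != length for sig in signals):
        raise ValueError("all signals must have the same number of samples")

    left = [0.0] * length
    right = [0.0] * length
    for sig, lw, rw in zip(signals, left_weights, right_weights):
        for t, sample in enumerate(sig):
            left[t] += lw * sample    # role of adder circuit 24
            right[t] += rw * sample   # role of adder circuit 25
    return [gain * v for v in left], [gain * v for v in right]
```

Raising the pair of coefficients for the attended source, and lowering the others, gives the emphasis described above while the ratio between each source's left and right coefficients keeps its position in the stereo image.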

[0018] With this configuration, the sound is rendered three-dimensionally for the viewer 13, and the sound from the sound source the viewer 13 is attending to is emphasized. In addition, when a sound source moves within the screen, or when the line of sight of the viewer 13 shifts to another sound source, the viewer hears the sound source as if it were moving or as if the source had switched, and can enjoy the content of the video more fully.

[0019] FIG. 3 is a block diagram of another embodiment, in which the present invention is applied to a monitoring system. In FIG. 3, 31 denotes a plurality of surveillance cameras, 32 is a video changeover switch, and 33 denotes strongly directional microphones provided one for each surveillance camera 31. Each directional microphone 33 may be made up of a plurality of unit microphones 34 that together achieve directivity, as shown in the figure, or may be a single directional microphone. 35 is an audio signal changeover switch that switches the signals from the directional microphones 33 in conjunction with the video changeover switch 32.

[0020] The images from the surveillance cameras 31 are selected by the video processing circuit 6 and displayed on the monitor device 7. The screen of the monitor device 7 may show the images from several surveillance cameras 31 simultaneously, or may show the images from about two surveillance cameras 31 while switching between cameras alternately in sequence. The audio signals are likewise switched and output in sequence, in conjunction with the switching of the images.

[0021] While the observer is watching these images, the small video camera 11 captures the observer, and when there is an object that warrants attention, the line-of-sight detection circuit 12 determines this from the movement of the observer's line of sight. When there is an object the observer is gazing at, the line-of-sight detection circuit 12 identifies the directional microphone 33 corresponding to the detected line-of-sight direction, raises the output gain of that directional microphone 33, and thereby increases its weight, so that the sound emitted by the object of interest (if the object is a person, the person's voice and the sounds accompanying the person's movements) is brought forward.
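One way to picture the microphone selection of paragraph [0021] is the case where the monitor shows several camera images at once. The Python sketch below assumes, purely for illustration, that the images are tiled in a regular grid and that tile i corresponds to camera i and microphone i. The grid layout, the function name, and the gain values are assumptions not stated in the source.

```python
def microphone_gains(gaze_x, gaze_y, columns, rows,
                     focus_gain=2.0, other_gain=0.5):
    """Map a normalized gaze point to a tile index and per-microphone gains.

    gaze_x, gaze_y -- gaze position on the monitor, each from 0.0 to 1.0
    columns, rows  -- assumed grid layout of the camera images
    Returns (index of the attended tile, list of gains in tile order).
    """
    # Clamp so that a gaze point on or beyond the edge maps to a valid tile.
    col = min(max(int(gaze_x * columns), 0), columns - 1)
    row = min(max(int(gaze_y * rows), 0), rows - 1)
    focused = row * columns + col
    gains = [focus_gain if i == focused else other_gain
             for i in range(columns * rows)]
    return focused, gains
```

For a 2 by 2 layout, a gaze point in the upper right quadrant selects tile 1 and raises only that microphone's gain.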

[0022] The object of interest can thus be monitored in close-up both visually and aurally, so the monitored object can be captured more reliably and more realistically, which helps prevent accidents and crime.

[0023] FIG. 4 is a block diagram of another example in which the present invention is applied to a monitoring system. In FIG. 4, 31 is a surveillance camera mounted on a pan head that can be moved, for example, by a motor. 33 is a directional microphone mounted on the same pan head as the surveillance camera 31. 36 is a motor drive device that rotates the pan head carrying the surveillance camera 31 and the directional microphone 33, changing the direction in which they point. In addition, 6 is a video processing circuit; 7 is a monitor device that reproduces the video signal; 8 is an audio processing circuit; 9 is a speaker; 11 is a video camera that captures the viewer 13; 12 is a line-of-sight detection circuit that detects the line of sight of the viewer 13 from the image captured by the video camera 11; and 13 is the viewer. These are equivalent to those described for FIG. 3.

[0024] A small video camera 11 is fixed near the monitor device 7 on which the image is displayed, and captures the face of the viewer 13. The eye region is detected in the whole-face image of the viewer 13, and the line-of-sight detection circuit 12 detects the direction of the line of sight of the viewer 13 from the position of the eyelids and the positional relationship of the dark part of the eye within the white of the eye, that is, of the eyeball.

[0025] When the line-of-sight detection circuit 12 detects the direction of the line of sight, it identifies which position on the display screen the line of sight is directed at, and operates the motor drive device 36 to turn the surveillance camera 31 and the directional microphone 33 toward that direction.
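The source does not say how the screen position is converted into a rotation of the pan head. As a sketch under explicit assumptions, the Python function below treats the surveillance camera as an ideal pinhole camera with a known horizontal field of view and computes the pan offset that would bring the gazed-at column of the image to the center. The function name, the 60 degree default, and the pinhole model are all hypothetical.

```python
import math


def pan_offset_degrees(gaze_x, horizontal_fov_deg=60.0):
    """Pan angle needed to center the gazed-at image column.

    gaze_x              -- horizontal gaze position, 0.0 (left) to 1.0 (right)
    horizontal_fov_deg  -- assumed horizontal field of view of the camera
    Positive values mean turning to the right, negative to the left.
    """
    half_fov = math.radians(horizontal_fov_deg / 2.0)
    # Position on the image plane, scaled so the edges lie at +/- tan(half_fov).
    offset = (2.0 * gaze_x - 1.0) * math.tan(half_fov)
    return math.degrees(math.atan(offset))
```

A gaze point at the center of the screen gives an offset of zero, and a gaze point at the right edge gives half the field of view, which the motor drive device 36 could then apply to the pan head.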

[0026] As a result, the sound source of interest comes to lie at the center of the screen of the monitor device 7 and in the pointing direction of the directional microphone 33, which makes monitoring easier and makes it possible to pick out the audio signal.

[0027] The embodiments shown in FIGS. 3 and 4 can also be used in monitor systems other than surveillance systems, such as video conferencing. In a video conference, images and sound are transmitted between conference rooms at separate locations, and the meeting proceeds while the participants watch one another's faces on the television monitor screen. Conventionally, the camera that shows a person in close-up is switched toward whichever participant is speaking. If the video camera 11 is instead used to switch this camera according to the line-of-sight direction of a particular participant, or of the majority of participants, the camera and the directional microphone 33 can be pointed at, for example, the speaker's hands or a display the speaker is indicating, which allows processing better suited to the actual situation.
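Following the line of sight of the majority of participants amounts to a simple vote. The Python sketch below assumes that each participant's gaze has already been resolved to a label for the thing being looked at, or to `None` when no target was detected. The labels and the function name are hypothetical, and breaking ties in favor of the target seen first is a choice made for the example.

```python
from collections import Counter


def majority_target(gaze_targets):
    """Return the target most participants are looking at, or None.

    gaze_targets -- one entry per participant: a hashable label for the
                    gazed-at target, or None if no target was detected.
    Ties are resolved in favor of the target that appears first in the list.
    """
    votes = Counter(t for t in gaze_targets if t is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

The result could then be used to choose which camera and directional microphone 33 to switch to.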

[0028]

Effects of the Invention: As described above, in the present invention, a signal processing device such as a television receiver, a video playback device, a video conference device, or a television monitoring device, comprising image monitor means for displaying a video signal and receiving means for outputting an audio signal, is provided with: line-of-sight detection means for detecting the line-of-sight direction, on the monitor device, of a viewer of the signal processing device; sound source identification means for identifying the sound source located in the line-of-sight direction detected by the line-of-sight detection means; and audio signal weighting means for weighting the audio signals by selecting and amplifying the audio signal from the sound source identified by the sound source identification means and suppressing the audio signals from the other sound sources. Alternatively, a signal processing device such as a video conference device or a television monitoring device, comprising imaging means for imaging an object, image monitor means for displaying the video signal captured by the imaging means, sound receiving means for collecting the audio signal emitted by the object, and receiving means for outputting the audio signal collected by the sound receiving means, is provided with: line-of-sight detection means for detecting the line-of-sight direction, on the monitor device, of a viewer of the signal processing device; and receiving direction moving means for moving the imaging direction of the imaging means and the sound collecting direction of the sound receiving means toward the line-of-sight direction detected by the line-of-sight detection means. Further, a signal processing method that outputs a video signal and an audio signal simultaneously is provided with: a line-of-sight detection function that detects the line-of-sight direction of a viewer watching the video signal output; a sound source identification function that identifies the sound source located in the line-of-sight direction detected by the line-of-sight detection function; and an audio signal output function that selects and amplifies the audio signal from the sound source identified by the sound source identification function, suppresses the audio signals from the other sound sources, and outputs the audio signals thus weighted. With such a device, or by adopting such a method, the sound source the viewer is attending to is determined from the viewer's line-of-sight direction, and the volume of the sound source in that direction is weighted automatically, or the imaging direction and sound collecting direction are changed, so that the signal from the particular speaker or sound source the viewer is attending to can be heard correctly.

Brief Description of the Drawings

FIG. 1 is a block diagram of an embodiment in which the signal processing device of the present invention is used in a television receiver.

FIG. 2 is a block diagram of the audio weighting circuit used in the embodiment of FIG. 1.

FIG. 3 is a block diagram of an embodiment in which the signal processing device of the present invention is used in a monitoring system.

FIG. 4 is a block diagram of another embodiment in which the signal processing device of the present invention is used in a monitoring system.

Explanation of Symbols

1: antenna; 2: tuner; 3: video recorder; 4: changeover switch; 5: video/audio separation circuit; 6: video processing circuit; 7: monitor device; 8: audio processing circuit; 9: left speaker; 10: right speaker; 11: video camera; 12: line-of-sight detection circuit; 13: viewer; 21: weighting coefficient determination circuit; 22: left-side audio processing circuit; 23: right-side audio processing circuit; 24: left-side adder circuit; 25: right-side adder circuit; 26: left-side amplifier circuit; 27: right-side amplifier circuit; 31: surveillance camera; 32: video changeover switch; 33: directional microphone; 34: unit microphone; 35: audio signal changeover switch; 36: motor drive device.

Claims (5)

【特許請求の範囲】[Claims] 【請求項1】 映像信号を表示する画像モニタ手段と、
音声信号を出力する受話手段とを具備するテレビ受像
機、ビデオ再生装置などの信号処理装置において、 信号処理装置の視聴者の前記モニタ装置上の視線方向を
検出する視線検出手段と、 前記視線検出手段が検出した視線方向に位置する発音源
を特定する音源特定手段と、 前記音源特定手段が特定した前記発音源からの音声信号
を選択して増幅し、他の音源からの音声信号を抑圧し
て、音声信号に重みづけを行う音声信号重み付け手段と
を具備することを特徴とする信号処理装置。
1. Image monitor means for displaying a video signal,
In a signal processing device such as a television receiver and a video reproducing device, which is provided with a receiving means for outputting an audio signal, a line-of-sight detecting means for detecting a line-of-sight direction of a viewer of the signal processing device on the monitor device, and the line-of-sight detection A sound source specifying unit that specifies a sound source located in the line-of-sight direction detected by the unit, and selects and amplifies a sound signal from the sound source specified by the sound source specifying unit to suppress a sound signal from another sound source. And a voice signal weighting means for weighting the voice signal.
【請求項2】 対象を撮像する第1の撮像手段と、前記
第1の撮像手段が撮像した映像信号を表示する画像モニ
タ手段と、対象が発する音声信号を集音する受音手段
と、前記受音手段が集音した音声信号を出力する受話手
段とを具備するテレビ会議装置、テレビ監視装置などの
信号処理装置において、 信号処理装置の視聴者の前記モニタ装置上の視線方向を
検出する視線検出手段と、 前記視線検出手段が検出した視線方向に位置する発音源
を特定する音源特定手段と、 前記音源特定手段が特定した前記発音源からの音声信号
を選択して増幅し、他の音源からの音声信号を抑圧し
て、音声信号に重みづけを行う音声信号重み付け手段と
を具備することを特徴とする信号処理装置。
2. A first image pickup unit for picking up an image of a target, an image monitor unit for displaying a video signal picked up by the first image pickup unit, a sound receiving unit for collecting an audio signal emitted by the target, and In a signal processing device such as a video conference device and a television monitoring device, which is provided with a receiving means for outputting a sound signal collected by a sound receiving means, a line of sight for detecting a line-of-sight direction of a viewer of the signal processing device on the monitor device. Detecting means, sound source identifying means for identifying a sound source located in the line-of-sight direction detected by the line-of-sight detecting means, and selecting and amplifying a sound signal from the sound source identified by the sound source identifying means, and selecting another sound source. And a voice signal weighting means for suppressing the voice signal from the voice signal and weighting the voice signal.
3. A signal processing device, such as a video conference device or a television monitoring device, provided with a first image pickup means for capturing an image of a target, an image monitor means for displaying the video signal captured by the first image pickup means, a sound receiving means for collecting audio signals emitted by the target, and a receiving means for outputting the audio signals collected by the sound receiving means, the signal processing device comprising: a line-of-sight detecting means for detecting the line-of-sight direction of a viewer of the signal processing device on the monitor device; and a receiving direction moving means for moving the imaging direction of the first image pickup means and the sound collecting direction of the sound receiving means to the line-of-sight direction detected by the line-of-sight detecting means.
4. The signal processing device according to any one of claims 1 to 3, wherein the line-of-sight detecting means comprises: a second image pickup means for capturing a face image of the viewer; an eye region separating means for separating and detecting the eyeball and eyelid regions from the face image captured by the second image pickup means; an extracting means for extracting changes in eyeball direction and changes in eyelid shape from the eyeball and eyelid regions separated and detected by the eye region separating means; and a line-of-sight direction specifying means for specifying the line-of-sight direction from the changes in eyeball direction and eyelid shape extracted by the extracting means.
5. A signal processing system for simultaneously outputting a video signal and an audio signal, comprising: a line-of-sight detection function for detecting the line-of-sight direction of a viewer who is watching the video signal output; a sound source specifying function for specifying a sound source located in the line-of-sight direction detected by the line-of-sight detection function; and an audio signal output function for selecting and amplifying the audio signal from the sound source specified by the sound source specifying function, suppressing audio signals from other sound sources, and outputting the audio signals thus weighted.
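The chain recited in claims 1, 2 and 5 (detect the line of sight, specify the sound source in that direction, then amplify that source and suppress the others) can be illustrated with a minimal sketch. It is not taken from the patent. The names `SoundSource`, `specify_source`, `weights_for` and `mix_stereo`, the normalized screen coordinates, the selection radius, the gain values and the linear stereo pan are all assumptions chosen for illustration. The gaze point is taken as given; how it is detected (claim 4) is outside the sketch.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class SoundSource:
    """Hypothetical sound source with a known position on the monitor screen."""
    name: str
    x: float              # horizontal screen position, 0.0 (left) to 1.0 (right)
    y: float              # vertical screen position, 0.0 (top) to 1.0 (bottom)
    samples: List[float]  # mono audio samples for this source


def specify_source(gaze: Tuple[float, float],
                   sources: Sequence[SoundSource],
                   radius: float = 0.15) -> Optional[SoundSource]:
    """Return the source nearest to the gaze point, or None if none lies within `radius`.

    The radius is an assumed tolerance, not a value from the patent.
    """
    best, best_dist = None, radius
    for src in sources:
        dist = math.hypot(src.x - gaze[0], src.y - gaze[1])
        if dist <= best_dist:
            best, best_dist = src, dist
    return best


def weights_for(sources: Sequence[SoundSource],
                selected: Optional[SoundSource],
                boost: float = 2.0,
                cut: float = 0.5) -> List[float]:
    """Amplify the selected source and suppress the rest (assumed gain values).

    With no selected source, every source keeps unit gain.
    """
    if selected is None:
        return [1.0] * len(sources)
    return [boost if src is selected else cut for src in sources]


def mix_stereo(sources: Sequence[SoundSource],
               weights: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Sum the weighted sources into left/right channels with a linear pan by x."""
    n = min(len(src.samples) for src in sources)
    left, right = [0.0] * n, [0.0] * n
    for src, w in zip(sources, weights):
        for i in range(n):
            v = w * src.samples[i]
            left[i] += v * (1.0 - src.x)
            right[i] += v * src.x
    return left, right


if __name__ == "__main__":
    speakers = [
        SoundSource("A", 0.25, 0.5, [1.0, 1.0]),
        SoundSource("B", 0.75, 0.5, [1.0, 1.0]),
    ]
    target = specify_source((0.3, 0.5), speakers)  # viewer looks near source A
    gains = weights_for(speakers, target)
    print(target.name if target else None, gains, mix_stereo(speakers, gains))
```

When the gaze point falls on no source, the sketch leaves all gains at 1.0, so the output reverts to an unweighted mix. That fallback is a design choice of the sketch; the claims do not state what happens in that case.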
JP8085214A 1996-04-08 1996-04-08 Signal processor Abandoned JPH09275533A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP8085214A JPH09275533A (en) 1996-04-08 1996-04-08 Signal processor

Publications (1)

Publication Number Publication Date
JPH09275533A (en) 1997-10-21

Family

ID=13852335

Family Applications (1)

Application Number Title Priority Date Filing Date
JP8085214A Abandoned JPH09275533A (en) 1996-04-08 1996-04-08 Signal processor

Country Status (1)

Country Link
JP (1) JPH09275533A (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11234640A (en) * 1998-02-17 1999-08-27 Sony Corp Communication control system
KR100639750B1 (en) * 1998-10-09 2006-10-31 소니 가부시끼 가이샤 Communication apparatus and method
WO2000022823A1 (en) * 1998-10-09 2000-04-20 Sony Corporation Communication apparatus and method
US6606111B1 (en) 1998-10-09 2003-08-12 Sony Corporation Communication apparatus and method thereof
JP2000138913A (en) * 1998-10-30 2000-05-16 Sony Corp Information processing unit its method and served medium
GB2351425A (en) * 1999-01-20 2000-12-27 Canon Kk Video conferencing apparatus
WO2001039479A1 (en) * 1999-11-24 2001-05-31 Sony Corporation Communication system
US6608644B1 (en) 1999-11-24 2003-08-19 Sony Corporation Communication system
KR100697757B1 (en) * 1999-11-24 2007-03-21 소니 가부시끼 가이샤 Communication system
EP1613069A1 (en) * 2003-04-08 2006-01-04 Sony Corporation Reproduction device and reproduction method
EP1613069A4 (en) * 2003-04-08 2006-04-12 Sony Corp Reproduction device and reproduction method
EP1613069B1 (en) * 2003-04-08 2020-02-26 Sony Corporation Reproduction device and reproduction method
US8126155B2 (en) 2003-07-02 2012-02-28 Fuji Xerox Co., Ltd. Remote audio device management system
JP2005045779A (en) * 2003-07-02 2005-02-17 Fuji Xerox Co Ltd Method and interface tool for managing audio device, and computer program product executed by computer which manages audio device
JP4501556B2 (en) * 2003-07-02 2010-07-14 富士ゼロックス株式会社 Method, apparatus and program for managing audio apparatus
US8411165B2 (en) 2003-10-20 2013-04-02 Sony Corporation Microphone apparatus, reproducing apparatus, and image taking apparatus
CN100430997C (en) * 2004-09-20 2008-11-05 Lg电子株式会社 Adjustable display of mobile communications terminal
JP2008005208A (en) * 2006-06-22 2008-01-10 Nec Corp Camera automatic control system for athletics, camera automatic control method, camera automatic control unit, and program
JP2008312002A (en) * 2007-06-15 2008-12-25 Yamaha Corp Television conference apparatus
JP2009060220A (en) * 2007-08-30 2009-03-19 Konica Minolta Holdings Inc Communication system and communication program
JP2011066467A (en) * 2009-09-15 2011-03-31 Brother Industries Ltd Television conference terminal apparatus, method for controlling voice of the same, and voice control program
JP2015142185A (en) * 2014-01-27 2015-08-03 日本電信電話株式会社 Viewing method, viewing terminal and viewing program
JP2017083661A (en) * 2015-10-28 2017-05-18 株式会社リコー Communication system, communication device, communication method, and program
WO2018066376A1 (en) * 2016-10-05 2018-04-12 ソニー株式会社 Signal processing device, method, and program
CN112262367A (en) * 2018-04-09 2021-01-22 脸谱公司 Audio selection based on user engagement
JP2021518072A (en) * 2018-04-09 2021-07-29 フェイスブック,インク. Audio selection based on user involvement
JP2020088618A (en) * 2018-11-27 2020-06-04 株式会社リコー Video conference system, communication terminal, and method for controlling microphone of communication terminal
CN114604800B (en) * 2022-03-31 2023-12-15 江苏西顿科技有限公司 Explosion-proof AGV car

Similar Documents

Publication Publication Date Title
JPH09275533A (en) Signal processor
EP2046032B1 (en) A method and an apparatus for obtaining acoustic source location information and a multimedia communication system
US8064754B2 (en) Method and communication apparatus for reproducing a moving picture, and use in a videoconference system
US6275258B1 (en) Voice responsive image tracking system
KR101861590B1 (en) Apparatus and method for generating three-dimension data in portable terminal
KR100986228B1 (en) Camera apparatus and image recording/reproducing method
KR20150031179A (en) Audio accessibility
US8390665B2 (en) Apparatus, system and method for video call
JP3157769B2 (en) TV voice control device
JP2009065587A (en) Voice-recording device and voice-reproducing device
JPH0955925A (en) Picture system
JP2007274462A (en) Video conference apparatus and video conference system
JP2003032776A (en) Reproduction system
JP5214394B2 (en) camera
JPH11234640A (en) Communication control system
JP4244416B2 (en) Information processing apparatus and method, and recording medium
WO2010061791A1 (en) Video control device, and image capturing apparatus and display apparatus which are provided with same
JP5750668B2 (en) Camera, playback device, and playback method
JP2009065490A (en) Video conference apparatus
JP7111202B2 (en) SOUND COLLECTION CONTROL SYSTEM AND CONTROL METHOD OF SOUND COLLECTION CONTROL SYSTEM
JPH07162827A (en) Multi-spot communication method and communication terminal
JP2630041B2 (en) Video conference image display control method
JP2522787B2 (en) 3D television recording device
JP2007312181A (en) Imaging sound pickup signal reproduction system
JPH07193798A (en) Video/acoustic communication equipment

Legal Events

Date Code Title Description
2004-05-26 A977 Report on retrieval (Free format text: JAPANESE INTERMEDIATE CODE: A971007)
2004-06-08 A131 Notification of reasons for refusal (Free format text: JAPANESE INTERMEDIATE CODE: A131)
2004-08-06 A521 Request for written amendment filed (Free format text: JAPANESE INTERMEDIATE CODE: A523)
2004-09-07 A131 Notification of reasons for refusal (Free format text: JAPANESE INTERMEDIATE CODE: A131)
2004-11-04 A521 Request for written amendment filed (Free format text: JAPANESE INTERMEDIATE CODE: A523)
2004-11-30 A131 Notification of reasons for refusal (Free format text: JAPANESE INTERMEDIATE CODE: A131)
2005-01-14 A762 Written abandonment of application (Free format text: JAPANESE INTERMEDIATE CODE: A762)