JPH09322136A - Image transmitter

Info

Publication number: JPH09322136A
Application number: JP8134843A
Authority: JP
Inventors: Kazuyoshi Izumi; 和芳泉
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1996-05-29
Filing date: 1996-05-29
Publication date: 1997-12-12

Abstract

PROBLEM TO BE SOLVED: To automatically select, from among a plurality of speakers, the specific speaker who is currently speaking, in an image transmission device for a video conference system or the like. SOLUTION: The image transmission device has a communication control unit that communicates over a telephone line, a digital signal processing unit 5 that compresses and decompresses communication data, a video signal processing unit 4 that displays a transmitted or received image, and a camera unit 1 that serves as the input unit for the image to be transmitted and can be driven vertically and horizontally. The device is provided with image recognition means for identifying image data from the camera unit. The image recognition means applies recognition processing to the image data from the camera to identify one speaker among the plurality of speakers, and the camera is automatically moved toward the identified speaker.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to image transmission devices such as still image transmission devices, moving image transmission devices, videophones, and video conference systems.

[0002]

[Description of the Related Art] FIG. 7 shows a functional block diagram of a conventional image transmission device. In the camera unit 101, light reflected from a subject passes through a lens 102 and forms an image on a solid-state image sensor (hereinafter, CCD) 103. The CCD 103 converts the incident light into an electric signal by photoelectric conversion, and the signal is input to the video signal processing unit 104 as serial data along the horizontal direction of the image. There the signal is processed into a luminance signal and color difference signals, converted into a digital signal of several bits by an analog-to-digital (A/D) converter, and transferred to the digital signal processing unit 105.

[0003] The digital signal input to the digital signal processing unit 105 is encoded and written to a field memory as compressed data. The audio signal from the microphone 107 is likewise converted into a digital signal here, encoded, and written to memory. The video data written to the field memory is transferred to the video signal processing unit 104 for display and, at the same time, when the user so indicates, transferred to the communication control unit 106 and transmitted to the communication partner over an ordinary analog line or an ISDN line.

[0004] The video data transferred to the video signal processing unit 104 is converted into an analog signal by a digital-to-analog (D/A) converter, encoded, and output to an external monitor as a composite signal. Meanwhile, for the video/audio data transferred to the communication control unit 106, the communication control unit 106 performs network control and protocol control for the ordinary analog line or ISDN line, and transmission begins.

[0005] The above is the signal flow on the sending side; the receiving side follows the reverse process. That is, video/audio data arriving over the ordinary analog line or ISDN line is input to the digital signal processing unit 105 through the communication control unit 106. This received data is encoded, compressed data. The digital signal processing unit 105 writes the compressed data to the field memory and at the same time decodes it and transfers it to the video signal processing unit 104, where it is converted into an analog signal by the D/A converter, converted into a composite signal by encoding, and output to the external monitor.

The camera unit 101 used here was either a camera supplied with the image transmission device or a commercially available camera connected to it. Consequently, when the camera had to be turned toward a specific speaker among several, some supplied cameras allowed remote motorized operation, but commercially available cameras had no control line for remote operation and could only be pointed at the speaker by hand. In either case, to photograph the person speaking in a conference, the prior art required the operator to check the speaker's position on the monitor and move the camera lens up, down, left, and right, manually or by motor, until the speaker was at the center of the monitor.

[0006]

[Problems to Be Solved by the Invention] In the prior art, photographing the person speaking during a conference required adjusting the camera position manually or by motor while watching the monitor, and participants could not concentrate on the agenda or topic of the conference until the adjustment was complete. Alternatively, the only available method was to extract a specific speaker and enlarge the image electronically, which lowered the resolution. The present invention has been made in view of these problems and provides an image transmission device that can automatically select, from among a plurality of speakers, the specific speaker who is currently speaking.

[0007]

[Means for Solving the Problems] The invention of claim 1 is an image transmission device comprising a communication control unit that communicates over a telephone line, a digital signal processing unit that compresses and decompresses communication data, a video signal processing unit that displays a transmitted or received image, and a camera unit that can be driven up, down, left, and right and is used as the input unit for the image to be transmitted, characterized in that image recognition means for identifying image data from the camera unit is provided, the image recognition means applies recognition processing to the image data from the camera to identify one speaker among a plurality of speakers, and the camera is automatically moved toward the identified speaker. The invention of claim 2 is characterized in that, in the invention of claim 1, the speaker is identified by the movement of the mouth.

[0008] With this image transmission device, an image transmission device such as a video conference system can automatically select the specific speaker who is currently speaking from among a plurality of speakers, so a conference can proceed as efficiently as an ordinary conference in which the speakers are all present in the same room.

[0009]

[Embodiments of the Invention] Embodiments of the present invention are described below. FIG. 1 is a functional block diagram of an image transmission device according to the present invention. In the image transmission device of FIG. 1, an image captured by the camera unit 1 first passes through the solid-state image sensor (CCD) 8, is converted into a luminance signal and color difference signals by the video signal processing unit 4, undergoes analog-to-digital (A/D) conversion, and is input to the digital signal processing unit 5.

[0010] The digital signal input to the digital signal processing unit 5 is encoded and written to a field memory as compressed data. The audio signal from the microphone 7 is likewise converted into a digital signal here, encoded, and written to the field memory. The digital signal processing unit 5 mainly performs image feature extraction, motion vector detection, and recognition processing on the feature-extracted image; it also generates the rotation control signal that controls rotation of the camera unit 1, controls the image memory, and compresses and decompresses images. That is, the video signal input through the lens 2 of the camera unit 1 shows the several speakers taking part in a conference or meeting. The mouth of each speaker is extracted, whether it is currently moving is determined, and the speaker with the most vigorous movement is thereby identified among the speakers. The electromagnet 3 of the camera unit 1 is then driven to change the camera's orientation so that the identified speaker is positioned at the center of the video signal from the lens 2. In other words, one of the several speakers is identified and automatically tracked so that the camera always faces that speaker.
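The selection and centering steps in this paragraph can be sketched as follows. This is only an illustrative sketch: the `MouthRegion` record, the function names, and the pixel-offset convention are assumptions introduced here, and the patent does not specify how motion magnitude is represented or how offsets are converted into drive currents.

```python
from dataclasses import dataclass


@dataclass
class MouthRegion:
    """Hypothetical record for one speaker's mouth area in the frame."""
    speaker_id: int
    center_x: int   # pixel column of the mouth region's center
    center_y: int   # pixel row of the mouth region's center
    motion: float   # motion magnitude measured for this region


def select_active_speaker(regions):
    """Return the region with the largest motion, or None if the list is empty."""
    if not regions:
        return None
    return max(regions, key=lambda r: r.motion)


def centering_offset(region, frame_width, frame_height):
    """Return (dx, dy), the region's distance from the frame center in pixels.

    Positive dx means the speaker is right of center; positive dy means below.
    """
    dx = region.center_x - frame_width // 2
    dy = region.center_y - frame_height // 2
    return dx, dy


regions = [
    MouthRegion(speaker_id=0, center_x=80, center_y=120, motion=0.4),
    MouthRegion(speaker_id=1, center_x=200, center_y=110, motion=2.7),
    MouthRegion(speaker_id=2, center_x=290, center_y=125, motion=0.1),
]
active = select_active_speaker(regions)
offset = centering_offset(active, frame_width=320, frame_height=240)
print(active.speaker_id, offset)  # prints: 1 (40, -10)
```

A drive controller would then move the camera until both offsets are close to zero; that mapping depends on the actuator and is left out of the sketch.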

[0011] Whether a speaker's mouth is moving is determined using a motion vector detection method. Motion vector detection methods fall broadly into the following three types.

Representative point matching method: Representative points are extracted from one of two temporally consecutive images. While their position is shifted relative to the other image, the absolute differences are taken and accumulated over all representative points. The displacement that minimizes the accumulated value is taken to be the one at which the two images are most highly correlated, and that displacement is obtained as the motion vector.
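A minimal software sketch of representative point matching, under assumptions not stated in the text: frames are plain 2D lists of luminance values, the search window is a small square of candidate shifts, and the function name and parameters are hypothetical.

```python
def representative_point_motion(prev, curr, points, search=2):
    """Estimate one motion vector (dx, dy) between two frames.

    prev, curr: 2D lists of luminance values with the same size.
    points: (x, y) representative points sampled from the previous frame.
    search: maximum displacement tested in each direction.
    """
    height, width = len(prev), len(prev[0])
    for x, y in points:
        if not (search <= x < width - search and search <= y < height - search):
            raise ValueError("points must lie at least `search` pixels inside the frame")

    best_shift, best_sum = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Accumulate absolute differences over all representative points.
            total = sum(abs(curr[y + dy][x + dx] - prev[y][x]) for x, y in points)
            if best_sum is None or total < best_sum:
                best_shift, best_sum = (dx, dy), total
    # The shift with the smallest accumulated difference is the motion vector.
    return best_shift


def blank(width, height):
    return [[0] * width for _ in range(height)]


prev = blank(8, 8)
prev[3][3], prev[3][4] = 200, 100      # bright feature in the previous frame
curr = blank(8, 8)
curr[5][4], curr[5][5] = 200, 100      # same feature moved right 1, down 2

print(representative_point_motion(prev, curr, [(3, 3), (4, 3), (2, 4)]))  # prints: (1, 2)
```

Only the sampled points are compared rather than every pixel, which is what keeps the cost of this method low compared with full block matching.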

[0012] Gradient method: The motion vector is obtained from two temporally consecutive images as the ratio of the temporal gradient to the spatial gradient of the luminance values. One variant, the iterative gradient method, simplifies the calculation and improves detection accuracy through iteration.
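The ratio used by the gradient method can be written out under the usual brightness-constancy assumption, which the text does not state explicitly. The symbols $I$ (luminance) and $u$, $v$ (horizontal and vertical displacement per frame interval) are introduced here for illustration only.

```latex
% Brightness constancy: a point keeps its luminance as it moves between frames.
\[ I(x + u,\; y + v,\; t + 1) = I(x, y, t) \]

% A first-order expansion gives the gradient constraint.
\[ \frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0 \]

% In one dimension this reduces to the ratio of temporal to spatial gradient.
\[ u = -\,\frac{\partial I / \partial t}{\partial I / \partial x} \]
```

In two dimensions a single pixel yields one equation in two unknowns, so the constraint is typically solved over a small block of pixels, for example by least squares.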

[0013] Phase correlation method: The motion vector is obtained by exploiting the fact that the phase of the Fourier transform coefficients of two temporally consecutive images reflects velocity. This method requires an enormous amount of computation.

The circuit described here uses the representative point matching method.
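For comparison, the phase correlation method listed above can be illustrated in one dimension. This is a sketch under simplifying assumptions that are not part of the patent: the signal is a single scan line, the shift is circular, and a naive discrete Fourier transform is used for clarity rather than speed. The function names are hypothetical.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]


def phase_correlation_shift(prev, curr):
    """Return the circular shift s such that curr[k] == prev[(k - s) % n]."""
    n = len(prev)
    cross = []
    for a, b in zip(dft(prev), dft(curr)):
        product = b * a.conjugate()
        magnitude = abs(product)
        # Keep only the phase; skip frequencies that carry no energy.
        cross.append(product / magnitude if magnitude > 1e-12 else 0)
    # The inverse transform of the phase-only spectrum peaks at the shift.
    surface = [
        sum(cross[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)).real / n
        for k in range(n)
    ]
    return max(range(n), key=lambda k: surface[k])


prev_line = [0, 0, 5, 9, 3, 0, 0, 0]
curr_line = [0, 0, 0, 0, 5, 9, 3, 0]   # prev_line moved two samples to the right
print(phase_correlation_shift(prev_line, curr_line))  # prints: 2
```

Because the shift is circular, a result greater than half the line length corresponds to a displacement in the opposite direction. The two forward transforms and one inverse transform per comparison illustrate why the text calls this method computationally expensive.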

[0014] The digitized image data from the camera unit 1 is compressed by the digital signal processing unit 5, transferred to the communication control unit 6, which performs line control, and transmitted over the public telephone line.

FIG. 2 is a block diagram showing the configuration of the motion vector detection unit in the image transmission device, and FIG. 3 is a detailed block diagram of the motion vector detector used in that detection unit. In FIGS. 2 and 3, 8 is the CCD, 9 is a CCD signal processing unit that processes the signal from the CCD 8, 10 is a motion vector detector that detects motion vectors from the luminance signal supplied by the CCD signal processing unit 9, and 11 is a central processing unit (CPU). The luminance signal obtained by processing the signal from the CCD 8 is converted into a digital signal by the A/D converter 12 in the motion vector detector 10 and then passes through a line interpolator 13, which corrects the positional offset between the lines of the odd and even fields. After passing through a two-dimensional low-pass filter (LPF) 14, the signal is sampled and stored in the representative point memory 15 as representative points. In the next field, these are read out into the representative point line memory 16 as the representative points of the previous field, and the absolute difference circuit 17 takes the absolute difference between them and the luminance value of each pixel of the current field output from the low-pass filter 14. The result is input to the adder 18 in the next stage. The output of the cumulative addition memory 19 is connected to the other input terminal of the adder 18, and the sums are stored back in the cumulative addition memory 19. When this series of operations has been carried out over the entire TV scan, the cumulative addition memory 19 holds an accumulation function indexed by two-dimensional address. The address of the cell holding the minimum value in the cumulative addition memory 19 is then detected as the motion vector. From the detected motion vectors, the central processing unit 11 identifies the speaker with the most motion approximately once per second. Once the speaker is identified, the electromagnet 3 attached to the camera (see FIG. 1) is driven so that the speaker comes to the center of the image from the camera.

[0015] FIG. 4 is an enlarged front view of the camera unit, and FIG. 5 is an enlarged side view. Electromagnets 22, 23, and 24 are provided inside the camera unit section 21, which houses the lens 20, and at its rear and front. Electromagnets 25 and 26 for driving the camera are placed close to the left and right sides of the camera unit section 21, and electromagnets 27 and 28 for driving the camera vertically are placed close to its underside. In the initial state no current flows through the coil of any electromagnet; when the direction of the camera unit section 21 is to be changed, current is supplied to each coil individually. For example, to turn the camera unit section 21 to the left (left as seen facing the camera), current is passed through the electromagnet 28. The permanent magnet 22 at the front of the camera unit section 21 is then attracted, and the lens moves to the left. The amount of movement is controlled so that the speaker identified from the video signal from the lens comes to the center. For vertical movement, for example when photographing a subject located higher up, the current through the electromagnet 27 on the front side of the camera unit's underside is reduced so that the rear of the camera unit section 21 drops, and as a result the lens points upward. To point the lens downward, the reverse is done: the magnetic force of the electromagnet 28 on the rear side of the camera's underside is reduced and that of the electromagnet 27 is increased.

[0016] FIG. 6 shows a detailed circuit block diagram of the image transmission device of the present invention. In FIG. 6, 1 is the camera unit described above, comprising the CCD, the lens, and so on, and 30 is a video output unit that outputs an NTSC signal. 4 is the video signal processing unit described above; it comprises the CCD signal processing unit 9, which performs CCD signal processing, the motion vector detector 10, which detects motion vectors from the signal supplied by the CCD signal processing unit 9, an A/D conversion unit 31, an encoder 32, a D/A conversion unit 33, which performs D/A conversion on the signal from the encoder 32, and so on. 5 is the digital signal processing unit described above; it comprises a memory control unit 34, a buffer memory 35, an image codec DSP 36, a buffer memory 37, an image memory 38, a camera drive control unit 39, an image recognition unit 40, the microprocessor (CPU) 11, and so on. 6 is the communication control unit described above; it comprises a dual-port memory 41, a data selector 42, a serial/parallel conversion control unit 43, a line control controller 44, a microprocessor (CPU) 45, and so on.

[0017] In this circuit, the output signal from the CCD of the camera unit 1 is converted into a luminance signal and color difference signals by the CCD signal processing unit 9. Each signal is converted from analog to digital by the A/D conversion unit 31 and stored in the image memory 38. At this time, part of the luminance signal from the CCD signal processing unit 9 is transferred to the motion vector detector 10, and the result is input to the microprocessor 11 as a motion vector. In the digital signal processing unit 5, the memory control unit 34 controls the writing of image data to, and reading of image data from, the image memory 38. The image codec DSP 36 encodes and decodes the image data, and the buffer memories 35 and 37 make the data transfer efficient.

[0018] Also in this circuit, the image recognition unit 40 recognizes the speaker from the mouth movement of the persons captured by the camera, and the camera drive control unit 39 performs the control that makes the camera follow the speaker.

[0019] The signal from the CCD is first stored in the image memory 38. When data is transferred to the line, it is read from the image memory 38 and transferred through the buffer memory 35 to the image codec DSP 36, where decoding is performed, and then transferred through the buffer memory 37 to the communication control unit 6. Data received from the communication line is decoded in exactly the reverse flow and written to the image memory 38.

[0020] In the above circuit, the microprocessor 11 controls this entire flow and, at the same time, selects a specific speaker from among the plurality of speakers based on the motion vectors from the video signal processing unit 4 and performs the camera drive control that moves the camera toward that speaker. The decoded data from the digital signal processing unit 5 passes through the dual-port memory 41 of the communication control unit 6, is converted from parallel data to serial data, and is transmitted to the general public line.
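The parallel-to-serial conversion performed before transmission can be illustrated with a short sketch. The byte width, the bit order, and the function names are assumptions made for illustration; the patent does not specify the framing used by the serial/parallel conversion control unit 43.

```python
def parallel_to_serial(data, msb_first=True):
    """Flatten a bytes object into a list of bits for a serial line."""
    bits = []
    for byte in data:
        order = range(7, -1, -1) if msb_first else range(8)
        bits.extend((byte >> i) & 1 for i in order)
    return bits


def serial_to_parallel(bits, msb_first=True):
    """Rebuild bytes from a bit list produced by parallel_to_serial."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        if not msb_first:
            chunk = chunk[::-1]  # restore most-significant-bit-first order
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


print(parallel_to_serial(b"\xa5"))  # prints: [1, 0, 1, 0, 0, 1, 0, 1]
```

The receiving side applies the inverse conversion, which matches the reverse flow described for received data.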

[0021]

[Effects of the Invention] According to the present invention, an image transmission device such as a video conference system can automatically select the specific speaker who is currently speaking from among a plurality of speakers and automatically move the camera toward that speaker, so a conference can proceed as efficiently as an ordinary conference in which the speakers are all present in the same room.

[Brief description of drawings]

FIG. 1 is a functional block diagram of an image transmission device according to the present invention.

FIG. 2 is a block diagram showing the configuration of the motion vector detection unit in the device of the present invention.

FIG. 3 is a detailed block diagram of the motion vector detector in the device of the present invention.

FIG. 4 is an enlarged front view of the camera unit in the device of the present invention.

FIG. 5 is an enlarged side view of the camera unit in the device of the present invention.

FIG. 6 is a detailed circuit block diagram of the image transmission device of the present invention.

FIG. 7 is a functional block diagram of a conventional image transmission device.

[Explanation of symbols]

1 camera unit
2 lens
3 electromagnet for driving the camera unit
4 video signal processing unit
5 digital signal processing unit
6 communication control unit
7 built-in/external microphone
8 solid-state image sensor (CCD)
10 motion vector detector
40 image recognition unit

Claims

1. An image transmission device comprising a communication control unit that communicates over a telephone line, a digital signal processing unit that compresses and decompresses communication data, a video signal processing unit that displays a transmitted or received image, and a camera unit that can be driven up, down, left, and right and is used as the input unit for the image to be transmitted, characterized in that image recognition means for identifying image data from the camera unit is provided, the image recognition means applies recognition processing to the image data from the camera to identify one speaker among a plurality of speakers, and the camera is automatically moved toward the identified speaker.

2. The image transmission device according to claim 1, wherein the speaker is identified by the movement of the mouth.