JP2006154898A - Portable terminal

Info

Publication number: JP2006154898A
Application number: JP2004340164A
Authority: JP
Inventors: Mamoru Watarido; 守渡戸
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2004-11-25
Filing date: 2004-11-25
Publication date: 2006-06-15
Anticipated expiration: 2024-11-25
Also published as: JP4703173B2

Abstract

PROBLEM TO BE SOLVED: To provide a novel character input mode in a camera-equipped cellphone terminal without adding special hardware.

SOLUTION: The camera 33 mounted on the cellphone terminal 100 captures images while the terminal body is moved so as to trace a character. A motion vector detecting part 211 detects motion vectors indicating the motion of the imaged object, and a motion vector distribution detecting part 213 detects the distribution state of the motion vectors in the moving image. A dictionary database 6A is searched on the basis of the direction, magnitude, distribution state, and the like of the detected motion vectors to perform character recognition.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a portable terminal equipped with an imaging device, such as a CCD camera, capable of capturing moving images, and with an input unit for inputting characters and control codes.

Portable information terminals, including mobile phones, are small and highly portable, so their input units cannot carry as many keys as a personal computer. Input is therefore designed around a small number of keys (for example, the roughly 15 keys of a telephone, joystick keys, and the like). Some so-called PDAs (personal digital assistants) are also known to use handwritten character recognition based on pen input on a touch panel.

Patent Document 1 discloses a mobile phone with a line sensor arranged at its bottom, which inputs characters by performing character recognition. Patent Document 2 proposes a portable terminal in which each of 15 keys has a touch sensor divided into four parts, so that characters can be input by tracing across the whole key area as on a touch panel. Patent Document 3 proposes a portable terminal equipped with a motion sensor, in which characters and the like are input by moving the terminal itself back and forth and left and right.
However, adding special hardware such as a line sensor or a touch sensor for character input increases the volume of the portable terminal and raises the price of the mobile phone terminal.

Patent Document 4 discloses a technique in which a camera attached to a mobile phone terminal recognizes the user's gestures to switch screens and the like. However, this technique goes no further than screen switching and similar operations; it does not enable character input.

Patent Document 1: JP 2003-199163 A (paragraph [0011] and others)
Patent Document 2: JP 2001-333166 A (paragraph [0016] and others)
Patent Document 3: JP 2000-78262 A (paragraphs [0009] to [0017] and others)
Patent Document 4: JP 2000-83302 A (paragraphs [0157] to [0181] and others)

An object of the present invention is to provide a portable terminal that offers a novel character input mode without adding special hardware.

The portable terminal according to the present invention is a portable terminal having a function of inputting input information, comprising: a main body; an imaging unit that is provided integrally with the main body and captures moving images; a motion vector detection unit that detects motion vectors indicating the motion of an imaged object in a moving image captured while the main body is moved; a motion vector distribution detection unit that detects the distribution state of the motion vectors in the moving image; a dictionary database that stores characters in association with the magnitude and direction of each of a plurality of the motion vectors and with the distribution state; and a reading unit that reads out, from the dictionary database, the character corresponding to the magnitude, direction, and distribution state detected by the motion vector detection unit and the motion vector distribution detection unit.

According to the present invention, a portable terminal can provide a novel character input mode without adding special hardware.

Next, embodiments of the present invention will be described in detail with reference to the drawings. The following description mainly covers the input of characters such as numerals and letters of the alphabet, but figures, symbols, and control codes such as a mail transmission command can of course also be treated as "characters" and input in the same way.
FIG. 1 is a block diagram showing the circuit configuration of a mobile phone terminal according to an embodiment of the present invention. The mobile phone terminal 100 comprises a radio unit 1, a baseband unit 2, an input/output unit 3, a power supply unit 4, and a vibrator 5. A memory card 6 for storing the dictionary database 6A and the like according to the present embodiment can be connected to the mobile phone terminal 100.

In FIG. 1, a radio frequency signal arriving from a base station (not shown) over a radio channel is received by an antenna 11 and then input to a receiving circuit (RX) 13 via an antenna duplexer (DUP) 12. The receiving circuit 13 includes a high-frequency amplifier, a frequency converter, and a demodulator. The radio frequency signal is low-noise amplified by a low-noise amplifier, then mixed in the frequency converter with a reception local oscillation signal generated by a frequency synthesizer (SYN) 14 and thereby frequency-converted into a reception intermediate frequency signal or a reception baseband signal, and the output signal is digitally demodulated by the demodulator. The demodulation scheme uses, for example, quadrature demodulation corresponding to QPSK together with spectrum despreading using a spreading code. The frequency of the reception local oscillation signal generated by the frequency synthesizer 14 is specified by a main control unit 21 provided in the baseband unit 2.

The demodulated signal output from the demodulator is input to the baseband unit 2. The baseband unit 2 includes the main control unit 21, a demultiplexing unit 22, an audio encoding/decoding unit (hereinafter called the audio codec) 23, an image processing unit 24, an LCD control unit 25, and a memory unit 26. The memory unit 26 includes a moving image data storage unit 261 that stores moving image data obtained by a camera 33, a motion vector data storage unit 262 that stores detected motion vectors, a reference motion vector storage unit 263 that stores detected reference motion vectors, a character recognition result storage unit 264 that stores character recognition results, an audio data storage unit 265 that stores audio data, and the like. Motion vectors and reference motion vectors are described later.

The main control unit 21 identifies whether the demodulated signal is control information or image information. Image information is supplied to the demultiplexing unit 22, where it is separated into audio data and image data. The audio data is supplied to the audio codec 23 and decoded, and the reproduced audio signal is output as sound from a speaker 32 of the input/output unit 3. The image data is supplied to the image processing unit 24 and decoded, and the reproduced image signal is supplied via the LCD control unit 25 to an LCD 34 of the input/output unit 3 and displayed.

Audio data and image data stored in the memory unit 26 are reproduced and displayed in the same way: they are input to the audio codec 23 and the image processing unit 24, respectively. The audio data is decoded by the audio codec 23 and output as sound from the speaker 32. The image data is decoded by the image processing unit 24, supplied to the LCD 34 via the LCD control unit 25, and displayed.

Meanwhile, the user's transmit voice signal output from a microphone 31 of the input/output unit 3 is input to the audio codec 23 of the baseband unit 2, where it is audio-encoded and then input to the demultiplexing unit 22. The image signal output from the camera 33 is input to the image processing unit 24 of the baseband unit 2, where it undergoes image encoding and is then input to the demultiplexing unit 22. The demultiplexing unit 22 multiplexes the encoded audio data and image data in a predetermined format, and the multiplexed transmission data is passed from the main control unit 21 to a transmission circuit (TX) 15 of the radio unit 1.
The camera 33 may be capable of either frame capture or field capture. In frame capture, for example 30 frames are captured per second and each frame constitutes one image. In field capture, one picture is captured so that it can be divided into odd and even lines (interlaced capture), and two fields make up one frame. The following description assumes that the camera in FIG. 1 performs frame capture.

The transmission circuit 15 includes a modulator, a frequency converter, and a transmission power amplifier. The transmission data is digitally modulated by the modulator, then mixed by the frequency converter with a transmission local oscillation signal generated by the frequency synthesizer 14 and frequency-converted into a radio frequency signal. The modulation scheme uses QPSK together with spectrum spreading using a spreading code. The resulting transmission radio frequency signal is amplified to a predetermined transmission level by the transmission power amplifier, supplied to the antenna 11 via the antenna duplexer 12, and transmitted from the antenna 11 toward a base station (not shown).

The power supply unit 4 includes a battery 41 such as a lithium-ion battery, a charging circuit 42 for charging the battery 41 from a commercial power supply output (100 V AC), and a voltage generation circuit (PS) 43. The voltage generation circuit 43 consists of, for example, a DC/DC converter and generates a predetermined power supply voltage Vcc from the output voltage of the battery 41.

To perform character input according to the present embodiment, the main control unit 21 includes, as shown in FIG. 1, a motion vector detection unit 211, a motion vector narrowing processing unit 212, a motion vector distribution detection unit 213, a speech recognition unit 215, and a character recognition unit 216. In the present embodiment, as shown in FIG. 2, characters are input by capturing an imaged object (in FIG. 2, a finger F of the user's left hand) with the camera 33 while moving the mobile phone terminal 100 so as to trace the character to be input. For example, to input the numeral "6", the user presses a character input activation key 35A to start the camera-based character input mode, points the camera 33 at the finger F and, while capturing the finger F and holding down a character input key 35B, moves the mobile phone terminal 100 so as to trace a "6", as shown in FIG. 2. The character input key 35B is pressed at the start point of the trajectory tracing the "6" and released at its end point. The motion vector detection unit 211 detects the motion vectors indicating the movement, relative to the camera 33, of the finger F captured while the character input key 35B was held down, and character input is performed on that basis.
That is, the character input key 35B functions as a means of inputting information indicating the start and end of one stroke of a character. A character consisting of two or more strokes can be input by releasing the character input key 35B when one stroke is finished and pressing it again when inputting the next stroke.
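The role of the character input key 35B as a stroke delimiter can be sketched as follows. This is only an illustration under stated assumptions: the event format (a list of `(time, "press")` and `(time, "release")` tuples), the frame timestamps, and the function name `split_into_strokes` are hypothetical and do not come from the specification.

```python
def split_into_strokes(frame_times, key_events):
    """Group frame indices into strokes delimited by press/release events.

    frame_times: capture time of each frame, in seconds (assumed format).
    key_events:  (time, kind) tuples, kind being "press" or "release"
                 (assumed format for the character input key).
    """
    strokes = []
    start = None
    for time, kind in sorted(key_events):
        if kind == "press":
            start = time  # the stroke begins when the key goes down
        elif kind == "release" and start is not None:
            # Keep every frame captured while the key was held down.
            strokes.append(
                [i for i, t in enumerate(frame_times) if start <= t <= time]
            )
            start = None
    return strokes
```

Each inner list holds the frames of one stroke, so a two-stroke character such as "+" would yield two lists.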

The imaged object, such as the finger F, is preferably a stationary object, that is, an object that does not move from a given position while the camera 33 is capturing. Any stationary object other than the finger F in the example of FIG. 2 may be used, provided it can be captured and allows motion vectors to be detected clearly.
Instead of holding down the character input key 35B throughout the input of one stroke, the start and end of a stroke may each be indicated by pressing the character input key 35B once. Alternatively, the start and end of a stroke may be indicated by speech input through the microphone 31.

The motion vector detection unit 211 has a function of computing the magnitude and direction of motion vectors indicating the motion of pixels in predetermined regions of each frame of the moving image acquired by the camera 33. In this embodiment, a motion vector means a vector representing the direction and distance by which the image has moved between the current frame and, for example, the preceding or following frame serving as a reference frame. When the MPEG (Moving Picture Experts Group) standard is followed, I pictures are set as reference frames at a predetermined interval, and the motion vectors of the preceding and following frames (B pictures, P pictures) are computed with respect to the I picture. Since image compression coding is not the purpose of this embodiment, the motion vector is used only as an indicator for detecting the movement of the finger F relative to the camera 33 and need not be used for compression coding. It is nevertheless possible to detect motion vectors from the compressed image after compression coding.

As an example, as shown in FIG. 3, the motion vector detection unit 211 divides one frame of X horizontal pixels by Y vertical pixels (for example, X = 360, Y = 240) into macroblocks of XB horizontal pixels by YB vertical pixels (for example, XB = YB = 4). It is configured to compute one motion vector per macroblock, VN (= X/XB × Y/YB) motion vectors in total. For example, as shown in FIG. 3, suppose there are a current frame N+T and a past frame N, and an image that was in macroblock M in frame N has moved to macroblock S in frame N+T. The motion vector Vm for macroblock S can then be computed from the difference between the coordinate values of macroblocks M and S, or from other data indicating the difference in position. Motion vectors can be computed in the same way for the other macroblocks.
The macroblocks are not limited to rectangles and may be, for example, circles, ellipses, triangles, or polygons with five or more sides. Motion vectors can also be computed across a plurality of frames using forward prediction and backward prediction.
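The macroblock scheme above can be illustrated with a minimal block-matching sketch. The specification does not say how the matching block M is found, so the exhaustive search and the sum-of-absolute-differences cost used here are assumptions, as are the function names and the `search` range. Frames are plain lists of rows of grayscale values. With X = 360, Y = 240 and XB = YB = 4, the function would return VN = 90 × 60 = 5400 vectors.

```python
def block_sad(prev, curr, px, py, cx, cy, size):
    """Sum of absolute differences between a block of prev and a block of curr."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(prev[py + dy][px + dx] - curr[cy + dy][cx + dx])
    return total


def motion_vectors(prev, curr, block=4, search=2):
    """Return {(block_x, block_y): (vx, vy)}, one motion vector per macroblock.

    For each macroblock S of the current frame, the best-matching block M is
    searched in the previous frame; the vector is the position of S minus the
    position of M, in pixels.
    """
    height, width = len(curr), len(curr[0])
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best = (0, 0)
            best_cost = block_sad(prev, curr, bx, by, bx, by, block)
            for oy in range(-search, search + 1):
                for ox in range(-search, search + 1):
                    sx, sy = bx - ox, by - oy  # candidate position of M
                    if not (0 <= sx <= width - block
                            and 0 <= sy <= height - block):
                        continue
                    cost = block_sad(prev, curr, sx, sy, bx, by, block)
                    if cost < best_cost:
                        best_cost, best = cost, (ox, oy)
            vectors[(bx // block, by // block)] = best
    return vectors
```

An exhaustive search is chosen here only because it is easy to read; a real terminal would more plausibly reuse whatever motion estimation its video encoder already provides.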

Alternatively, specific regions within one frame can be defined as motion vector extraction regions, and a representative motion vector can be obtained for each such region. For example, as shown in FIG. 4, first to fourth motion vector extraction regions 302 to 305 of substantially the same shape and size can be defined substantially symmetrically about the center of the screen, and representative motion vectors 302A to 305A can be obtained for the respective regions. In some cases, as shown in FIG. 5, the regions need not be left-right symmetric, and the individual extraction regions 302' to 305' may differ in size or shape. It is preferable not to set extraction regions in areas, such as the corners of the screen, where motion vectors are unlikely to arise and noise is likely to intrude.
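A representative motion vector per extraction region could be obtained, for instance, by averaging. The specification does not state how the representative vector is derived, so the mean used below is an assumption, and the region format (rectangles in macroblock coordinates) and the name `representative_vectors` are hypothetical.

```python
def representative_vectors(vectors, regions):
    """Average the motion vectors that fall inside each extraction region.

    vectors: {(block_x, block_y): (vx, vy)} for one frame.
    regions: {name: (x0, y0, x1, y1)} rectangles in block coordinates,
             with x1 and y1 exclusive (assumed format).
    """
    result = {}
    for name, (x0, y0, x1, y1) in regions.items():
        inside = [v for (bx, by), v in vectors.items()
                  if x0 <= bx < x1 and y0 <= by < y1]
        if inside:
            count = len(inside)
            result[name] = (sum(v[0] for v in inside) / count,
                            sum(v[1] for v in inside) / count)
        else:
            result[name] = (0.0, 0.0)  # no vectors detected in this region
    return result
```

Blocks outside every region, such as the screen corners, are simply ignored, which matches the advice not to place extraction regions where noise is likely.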

The motion vector detection unit 211 detects not only the direction of each motion vector but also its magnitude. With direction alone it becomes difficult to distinguish, for example, the numerals "0" and "6", and the risk of misrecognition increases. The magnitude of a motion vector is data corresponding to the speed at which the mobile phone terminal 100 is moving. When a "6" is traced, the loop is smaller and closes sooner than when a "0" is traced, so the speed of movement around the loop also differs. Detecting the speed of movement, that is, the magnitude of the motion vector, therefore makes it easier to distinguish similar characters such as "0" and "6".
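For reference, the magnitude and direction of a single motion vector can be derived from its pixel components as below. The component representation `(vx, vy)` and the choice of degrees for the angle are assumptions made for illustration.

```python
import math


def magnitude_and_direction(vx, vy):
    """Return (magnitude in pixels, direction in degrees) of a motion vector.

    The direction is measured from the positive x axis toward the positive
    y axis and lies in the range -180 to 180 degrees.
    """
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))
```

Dividing the magnitude by the frame interval gives a speed in pixels per second, which is the quantity that would differ between a tightly closed loop and a wide one.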

The motion vector narrowing processing unit 212 has a function of performing predetermined data removal processing on the motion vectors detected by the motion vector detection unit 211 to obtain reference motion vectors, that is, a narrowed-down set of motion vector data. The data removal processing performed is, for example:
(1) processing that absorbs the variation among the motion vectors computed within one frame by the motion vector detection unit 211 and removes disturbance data;
(2) processing that thins out frames to remove disturbance data;
(3) processing that detects the edges of an object image in the moving image and removes all motion vectors other than those near the edges; or
(4) a combination of these.

For example, when characters are input according to the present embodiment inside a moving vehicle, buildings, cars, and the like may move in the background and appear on the screen. The movement of such objects shows up within a frame as variation relative to the motion vectors as a whole. In such a case, the variation can be absorbed and the influence of the images of such objects removed by computing the mean of all VN motion vectors or the correlation coefficients among a plurality of representative vectors. Alternatively, narrowing can be performed by obtaining the standard deviation σ of the VN motion vectors, keeping only the motion vectors that lie within a predetermined range KI·σ (KI being a positive coefficient), and removing the motion vectors outside that range.
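The standard-deviation narrowing can be sketched as follows. The specification does not say what the range KI·σ is measured from, so this sketch assumes it is the distance of each vector from the mean vector, with σ taken as the root-mean-square of those distances; the function name and the default coefficient are likewise hypothetical.

```python
import math


def narrow_by_sigma(vectors, k=2.0):
    """Keep the vectors within k * sigma of the mean vector.

    vectors: list of (vx, vy) motion vectors from one frame.
    k:       stands in for the positive coefficient KI (value assumed).
    Returns (kept vectors, mean vector).
    """
    if not vectors:
        return [], (0.0, 0.0)
    count = len(vectors)
    mean_x = sum(v[0] for v in vectors) / count
    mean_y = sum(v[1] for v in vectors) / count
    distances = [math.hypot(v[0] - mean_x, v[1] - mean_y) for v in vectors]
    # Root-mean-square distance from the mean, used as sigma.
    sigma = math.sqrt(sum(d * d for d in distances) / count)
    kept = [v for v, d in zip(vectors, distances) if d <= k * sigma]
    return kept, (mean_x, mean_y)
```

A vector caused by, say, a passing car in the background tends to sit far from the mean and is dropped, while the vectors produced by moving the terminal cluster together and survive.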

Alternatively, the number of motion vectors can be narrowed down by thinning the AA frames contained in the captured moving image to a smaller number of frames AP. Conceivable ways of thinning AA frames to AP include: (1) thinning to a fixed number of frames between the start and end points of one stroke, regardless of the stroke's duration; (2) making the time interval between the frames remaining after thinning a predetermined period TT; (3) making the period TT non-uniform; and (4) computing the correlation of the motion vectors in adjacent frames and dropping frames whose correlation is at or above a predetermined threshold TSH.
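Two of the thinning methods, (1) and (4), can be sketched as below. These are assumptions made for illustration: the specification does not define the correlation measure, so cosine similarity between flattened vector fields is used, each frame is compared with the last frame kept rather than strictly with its neighbor, and the function names and default threshold are hypothetical.

```python
import math


def thin_to_fixed_count(frames, count):
    """Method (1): pick `count` frames spread evenly over the stroke."""
    if count <= 0:
        return []
    if count >= len(frames):
        return list(frames)
    if count == 1:
        return [frames[0]]
    step = (len(frames) - 1) / (count - 1)
    return [frames[round(i * step)] for i in range(count)]


def cosine(a, b):
    """Cosine similarity between two equally long lists of (vx, vy)."""
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    norm_a = math.sqrt(sum(ax * ax + ay * ay for ax, ay in a))
    norm_b = math.sqrt(sum(bx * bx + by * by for bx, by in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def thin_by_correlation(fields, threshold=0.95):
    """Method (4): drop frames highly correlated with the last kept frame.

    `threshold` stands in for TSH (value assumed).
    """
    kept = []
    for field in fields:
        if kept and cosine(kept[-1], field) >= threshold:
            continue  # nearly the same motion as the last kept frame
        kept.append(field)
    return kept
```

Method (1) makes every stroke the same length regardless of how fast it was drawn, whereas method (4) keeps frames only where the motion actually changes.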

Alternatively, the movement of small objects travelling across the frame can be inferred from the correlation between frames and removed. Removing in this way the data judged not to be useful for estimating the motion vectors caused by the relative movement of the finger F improves the character recognition accuracy of the character recognition unit 216.

The motion vector distribution detection unit 213 has a function of detecting how the motion vectors, after detection and narrowing in each frame, are distributed over that frame. For example, as shown in FIG. 6, when the mobile phone terminal 100 is moved so as to trace a character and the finger F consequently moves across the screen over frames 1101 to 1106, the range over which the motion vectors are distributed also moves, from frame to frame, along the direction of movement of the finger F. The distribution state of the motion vectors in each frame can be detected by, for example, computing a vector 1108 that indicates the direction in which the distribution moves, as shown in FIG. 6.
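One way to picture the distribution state is to track the centroid of the macroblocks that carry significant motion and to take the displacement of that centroid over the frames. This is only a sketch under assumptions: the centroid, the `min_magnitude` cut-off, and the function names are hypothetical, and the specification does not state how vector 1108 is actually computed.

```python
import math


def distribution_centroid(vectors, min_magnitude=1.0):
    """Centroid (in block coordinates) of the blocks with significant motion.

    vectors: {(block_x, block_y): (vx, vy)} for one frame.
    Returns None when no block moves by at least min_magnitude pixels.
    """
    active = [(bx, by) for (bx, by), (vx, vy) in vectors.items()
              if math.hypot(vx, vy) >= min_magnitude]
    if not active:
        return None
    count = len(active)
    return (sum(p[0] for p in active) / count,
            sum(p[1] for p in active) / count)


def distribution_shift(frames):
    """Displacement of the centroid from the first to the last usable frame."""
    centroids = [c for c in (distribution_centroid(f) for f in frames)
                 if c is not None]
    if len(centroids) < 2:
        return (0.0, 0.0)
    return (centroids[-1][0] - centroids[0][0],
            centroids[-1][1] - centroids[0][1])
```

The returned displacement plays a role comparable to the vector indicating the direction in which the distribution moves.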

The speech recognition unit 215 has a function of recognizing speech information input through the microphone 31 as auxiliary information for the character recognition based on the motion vectors obtained by capturing the finger F or the like while moving the mobile phone terminal 100. Many kinds of speech input are conceivable; for example, (1) speech giving the reading of the character being input, or (2) speech indicating the shape of the character being input. As an example of the former, to distinguish characters of similar shape such as "0" and "6", the reading of the character can be spoken into the microphone 31. As an example of the latter, when inputting the similarly shaped "+" and "T", the two can be distinguished by speaking a word indicating an intersection (for example, "cross") into the microphone 31 at the point where the strokes intersect, while moving the mobile phone terminal 100.

The character recognition unit 216 has a function of searching the dictionary database 6A for the character that corresponds to the direction and magnitude of the motion vectors detected by the motion vector detection unit 211 and narrowed down by the motion vector narrowing processing unit 212, and to the distribution state of the motion vectors detected by the motion vector distribution detection unit 213, and of reading out the search result.
The dictionary database 6A stores data on the direction, magnitude, and distribution state of motion vectors in association with various characters. The dictionary database 6A can be trained by the same procedure as character input and recognition, but data input with another imaging device having the same resolution and optical characteristics can also be used. The training method for the dictionary database 6A is not restricted in any way, and need not involve an imaging device, as long as motion vector information can be input.
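The dictionary lookup can be pictured as a nearest-template search. The specification does not describe the matching rule or the storage layout of the dictionary database 6A, so everything below is an assumption for illustration: each template is a fixed-length list of `(vx, vy)` vectors, the distance is a sum of Euclidean distances, and the distribution state is left out for brevity.

```python
import math


def lookup_character(features, dictionary):
    """Return the character whose template is closest to `features`.

    features:   list of (vx, vy) vectors describing the input strokes.
    dictionary: {character: template}, each template a list of (vx, vy)
                (assumed layout).
    Templates of a different length are skipped; None is returned when
    nothing is comparable.
    """
    best_char, best_distance = None, float("inf")
    for char, template in dictionary.items():
        if len(template) != len(features):
            continue
        distance = sum(math.hypot(fx - tx, fy - ty)
                       for (fx, fy), (tx, ty) in zip(features, template))
        if distance < best_distance:
            best_char, best_distance = char, distance
    return best_char
```

Training then amounts to recording a template for each character with the same capture procedure as recognition.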

The imaging device used for training may have a resolution and optical characteristics different from those of the camera 33. The motion vector detection method used for training (the setting of macroblocks, motion vector extraction regions, and so on) may also differ from the one used for character input and recognition. In that case, the motion vector narrowing processing unit 212 matches the number M2 of motion vectors detected by the motion vector detection unit 211 to the number M1 of motion vectors stored as data in the dictionary database 6A (for example, by narrowing both down to M3 motion vectors, where M3 is no greater than M1 or M2), and character recognition can be performed with the motion vectors after this matching.
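Matching the two vector counts could be done by resampling both sequences to a common length M3. The linear interpolation below is an assumption, as is the choice M3 = min(M1, M2); the specification only requires M3 to be no greater than M1 or M2.

```python
def resample(vectors, count):
    """Resample a sequence of (vx, vy) vectors to `count` entries.

    Entries are linearly interpolated between neighboring vectors.
    """
    if count <= 0 or not vectors:
        return []
    if len(vectors) == 1 or count == 1:
        return [vectors[0]] * count
    last = len(vectors) - 1
    result = []
    for i in range(count):
        position = i * last / (count - 1)
        low = int(position)
        high = min(low + 1, last)
        t = position - low  # fractional distance between low and high
        result.append((vectors[low][0] * (1 - t) + vectors[high][0] * t,
                       vectors[low][1] * (1 - t) + vectors[high][1] * t))
    return result


def match_counts(detected, stored):
    """Bring both sequences to M3 = min(M1, M2) vectors (choice assumed)."""
    m3 = min(len(detected), len(stored))
    return resample(detected, m3), resample(stored, m3)
```

After `match_counts`, the two sequences have equal length and can be compared entry by entry.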

Next, the procedure for inputting characters with the mobile phone terminal of the present embodiment will be described with reference to the flowchart of FIG. 7, taking as an example the input of the two-stroke character "+". To start character input, the user first presses the character input activation key 35A (S11). The user places an object to be imaged, such as the finger F, in front of the lens of the camera 33, points the camera 33 at the finger F, begins pressing the character input key 35B (S12), and then moves the mobile phone terminal 100 so as to trace the horizontal stroke of the "+" (S13). While the mobile phone terminal 100 is being moved, speech is input through the microphone 31 as needed (S14), but no speech input is needed when tracing this horizontal stroke.

When the horizontal movement is finished, the character input key 35B is released (S15). The motion vector detection unit 211 detects the motion vectors indicating the movement, relative to the camera 33, of the finger F captured while the character input key 35B was held down (S16). After the motion vector narrowing processing unit 212 has narrowed down the motion vectors (S17), the motion vector distribution detection unit 213 detects their distribution state (S18). When a character consisting of two or more strokes is input, steps S12 to S18 are repeated until all the strokes making up the character have been input (S19). When inputting the vertical line that is the second stroke of "+", the user speaks a word indicating an intersection ("cross" or the like) into the microphone 31 partway through the vertical (top-to-bottom) movement of the mobile phone terminal 100. The character recognition unit 216 recognizes the input character on the basis of the content of this speech, the timing at which it was uttered, and the motion vector information (S20).

To input the character “T”, which resembles “+”, the user utters the voice indicating an intersection (“cross” or the like) at the start of the vertical movement of the mobile phone terminal 100 rather than partway through it. Varying the timing of the utterance according to the input character in this way makes it possible to distinguish similar characters from each other. It is also possible to input a voice giving the reading of the input character itself (“plus”, “tee”, etc.).
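
As a rough sketch of how utterance timing could separate “+” from “T”, the following compares the time of the “cross” utterance with the start and end of the vertical stroke. The 25% cut-off for “at the start” and the function name are assumptions made for this example; the embodiment does not specify a threshold.

```python
def classify_by_utterance_timing(
    stroke_start: float,
    stroke_end: float,
    utterance_time: float,
    start_fraction: float = 0.25,  # assumed cut-off, not from the source
) -> str:
    """Return "T" if the utterance falls in the opening part of the
    vertical stroke, and "+" if it falls later in the stroke."""
    duration = stroke_end - stroke_start
    if duration <= 0:
        raise ValueError("stroke_end must be later than stroke_start")
    if not stroke_start <= utterance_time <= stroke_end:
        raise ValueError("utterance must fall within the vertical stroke")
    # Relative position of the utterance inside the stroke, 0.0 to 1.0.
    position = (utterance_time - stroke_start) / duration
    return "T" if position <= start_fraction else "+"
```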

Although embodiments of the invention have been described above, the present invention is not limited to them, and various modifications and additions can be made without departing from the spirit of the invention. For example, in the above embodiment a moving image is captured by one camera 33 having one lens, but moving images may instead be selected from a plurality of temporally synchronous or asynchronous series obtained by a camera having a plurality of lenses, or the plural series of images may be superimposed and used as the moving image. When the images are superimposed, each may first be multiplied by a separately determined weighting coefficient.
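
The weighted superimposition of several image series can be sketched as a per-pixel weighted sum. The example below assumes grayscale frames of identical size held as nested lists; the function name and the choice not to normalize the weights are illustrative assumptions.

```python
from typing import List, Sequence

Frame = List[List[float]]  # grayscale frame as rows of pixel values


def superimpose(frames: Sequence[Frame], weights: Sequence[float]) -> Frame:
    """Multiply each frame by its weighting coefficient and add the results."""
    if not frames or len(frames) != len(weights):
        raise ValueError("need one weight per frame and at least one frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != rows or any(len(row) != cols for row in frame):
            raise ValueError("all frames must have the same size")
    # Weighted sum computed independently for every pixel position.
    return [
        [sum(w * f[r][c] for f, w in zip(frames, weights)) for c in range(cols)]
        for r in range(rows)
    ]
```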

In the above embodiment, the dictionary database 6A stores motion vector information (direction, magnitude, and distribution state) in association with characters; instead, the moving direction and speed of the mobile phone terminal 100 calculated from the motion vector information may be stored in the dictionary database 6A. The moving direction of the mobile phone terminal 100 is opposite to the direction of the motion vector. The dictionary database 6A may also store information on the voice input during character input, for use in character recognition.
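
Because the terminal moves in the direction opposite to the observed motion vector, the conversion amounts to negating the vector and dividing its magnitude by the frame interval. The sketch below assumes the vector is expressed in pixels per frame, so the resulting speed is in pixels per second; the function name and units are assumptions for illustration.

```python
import math
from typing import Tuple


def terminal_motion(
    motion_vector: Tuple[float, float], frame_interval_s: float
) -> Tuple[Tuple[float, float], float]:
    """Return (terminal displacement, speed) derived from one motion vector."""
    if frame_interval_s <= 0:
        raise ValueError("frame_interval_s must be positive")
    dx, dy = motion_vector
    # The terminal moves opposite to the apparent motion of the imaged object.
    displacement = (-dx, -dy)
    speed = math.hypot(dx, dy) / frame_interval_s
    return displacement, speed
```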

In the above embodiment, the microphone 31 is used to input information for distinguishing similar characters from each other, but the invention is not limited to this, and the terminal may be configured so that various other information can be input. For example, information on the character type (alphabetic characters, hiragana, katakana, numerals, etc.) may be input by voice.

In the above example, the key 35A is operated to end the character input mode of the present embodiment. Instead, the character input mode may be ended when, for example, the lens of the camera 33 is kept pressed (covered) continuously for a predetermined period of TE seconds (for example, about 2 to 5 seconds).
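
One way to detect that the lens has been held covered is to check whether the captured frames stay dark for TE seconds without interruption, which is how the claims describe it (a black image continuing for a predetermined time). The following sketch assumes a mean-brightness value per frame on a 0 to 255 scale and a darkness threshold of 16; both are illustrative assumptions, as is the function name.

```python
import math
from typing import Iterable


def should_end_input_mode(
    mean_brightness_per_frame: Iterable[float],
    frame_interval_s: float,
    te_seconds: float = 3.0,         # source suggests about 2 to 5 seconds
    black_threshold: float = 16.0,   # assumed darkness threshold
) -> bool:
    """Return True once frames stay black for at least te_seconds in a row."""
    frames_needed = math.ceil(te_seconds / frame_interval_s)
    consecutive_dark = 0
    for brightness in mean_brightness_per_frame:
        if brightness < black_threshold:
            consecutive_dark += 1
            if consecutive_dark >= frames_needed:
                return True
        else:
            # Any bright frame interrupts the run of black frames.
            consecutive_dark = 0
    return False
```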

As shown in FIG. 8, in a flip-type portable information terminal device in which a flip 34 and a main body 37 can be opened and closed by a hinge 38, the character input button 35B of the present embodiment may be provided on the outer surface of the flip 34 (the surface visible when the flip is closed).

In the above embodiment, character recognition is performed based on the direction and magnitude of the motion vectors detected by the motion vector detection unit 211 and the distribution state of the motion vectors detected by the motion vector distribution detection unit 213. However, the present invention is not limited to this; for example, character recognition may be performed based on the distribution state alone. In that case, the direction or magnitude of each motion vector may be treated as part of the data constituting the “distribution state” and detected by the motion vector distribution detection unit 213.
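
Recognition from the distribution state alone can be pictured as a nearest-match search over stored templates. The sketch below treats a distribution state as an ordered list of (dx, dy) feature values and uses squared Euclidean distance as the similarity measure; the template contents, the distance measure, and the function name are all assumptions for illustration and do not reflect the actual contents of the dictionary database 6A.

```python
from typing import Dict, List, Optional, Tuple

Vector = Tuple[float, float]


def recognize(
    observed: List[Vector], dictionary: Dict[str, List[Vector]]
) -> Optional[str]:
    """Return the character whose stored distribution is closest to the
    observed one, or None if no template has the same number of features."""
    best_char: Optional[str] = None
    best_distance = float("inf")
    for char, template in dictionary.items():
        if len(template) != len(observed):
            continue  # only compare distributions of equal length
        distance = sum(
            (ox - tx) ** 2 + (oy - ty) ** 2
            for (ox, oy), (tx, ty) in zip(observed, template)
        )
        if distance < best_distance:
            best_char, best_distance = char, distance
    return best_char
```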

In the present invention, the pixel size may also take various sizes or formats: for example, ITU-T H.261 CIF, MPEG-1 SIF, NTSC images, PAL images, HDTV images, or images of any other size may be used.

A block diagram showing the circuit configuration of a mobile phone terminal according to an embodiment of the invention.
A conceptual diagram showing the character input mode of the present embodiment being executed.
An example of the operation of the motion vector detection unit 211.
An example of a motion vector extraction region set within a frame.
An example of a motion vector extraction region set within a frame.
An example of the operation of the motion vector distribution detection unit 213.
A flowchart showing the procedure of the character input mode of the mobile phone terminal 100 of the present embodiment.
One modification of the present embodiment.

Explanation of symbols

1 ... radio unit
2 ... baseband unit
3 ... input/output unit
4 ... power supply unit
5 ... vibrator
6 ... memory card
11 ... antenna
12 ... antenna duplexer (DUP)
13 ... reception circuit (RX)
14 ... frequency synthesizer (SYN)
15 ... transmission circuit
21 ... main control unit
22 ... multiplexing/demultiplexing unit
23 ... audio codec
24 ... image processing unit
25 ... LCD control unit
26 ... memory unit
31 ... microphone
32 ... speaker
33 ... camera
34 ... LCD
35 ... keyboard
41 ... battery
42 ... charging circuit
43 ... voltage generation circuit
211 ... motion vector detection unit
212 ... motion vector narrowing processing unit
213 ... motion vector distribution detection unit
215 ... voice recognition unit
216 ... character recognition unit

Claims

A mobile terminal having a function for inputting information, comprising:
a main body;
an imaging unit that is provided integrally with the main body and captures a moving image;
a motion vector detection unit that detects motion vectors indicating the motion of an imaging target in the moving image captured while the main body is moved;
a motion vector distribution detection unit that detects a distribution state of the motion vectors in the moving image;
a dictionary database that stores characters in association with the magnitude and direction of each of a plurality of motion vectors and with the distribution state; and
a reading unit that reads out from the dictionary database the character corresponding to the magnitude and direction detected by the motion vector detection unit and the distribution state detected by the motion vector distribution detection unit.

The mobile terminal according to claim 1, further comprising a voice identification unit that identifies voice, wherein the reading unit reads out from the dictionary database the character corresponding to the magnitude and direction and to the voice.

The mobile terminal according to claim 1, further comprising a stroke start/end input unit for inputting information indicating the start and end of the drawing of one stroke of a character having two or more strokes, wherein the motion vector detection unit operates between the start and end indicated by the stroke start/end input unit.

The mobile terminal according to claim 1, further comprising a motion vector narrowing processing unit that narrows down a plurality of motion vectors detected by the motion vector detection unit according to a predetermined narrowing criterion and outputs a reference motion vector.

The mobile terminal according to claim 4, wherein the motion vector narrowing-down processing unit executes processing for absorbing variations in motion vectors detected by the motion vector detection unit.

The mobile terminal according to claim 4, wherein the motion vector narrowing processing unit performs thinning of the images.

The mobile terminal according to claim 4, wherein the motion vector narrowing-down processing unit detects an edge of an object image in the motion image and executes a process of removing motion vectors other than the motion vector near the edge.

The mobile terminal according to claim 1, further comprising means for forcibly terminating the character input mode when it is determined that a black image continues in the moving image for a predetermined time or longer.

A mobile terminal comprising:
a portable main body having a function for inputting information;
an imaging unit that is provided integrally with the main body and captures a moving image;
a motion vector distribution detection unit that sequentially detects motion vectors indicating the motion of an imaging target in the moving image captured while the main body is moved, and detects a distribution state of the motion vectors in the moving image;
a dictionary database that stores distribution states of a plurality of motion vectors, each associated in advance with a character or symbol; and
a search unit that searches the dictionary database for the character or symbol associated with the distribution state most similar to the distribution state detected by the motion vector distribution detection unit.

The mobile terminal according to claim 9, wherein the motion vector distribution detection unit includes means for detecting at least one of the magnitude and direction of a motion vector as a feature amount, and means for outputting a set of a plurality of detected feature amounts as the distribution state.