JP2005057360A

JP2005057360A - Picture photographing apparatus and program

Info

Publication number: JP2005057360A
Application number: JP2003206389A
Authority: JP
Inventors: Masashi Koga; 昌史古賀; Tatsuya Kameyama; 達也亀山; Kazumi Rissen; 和巳立仙
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2003-08-07
Filing date: 2003-08-07
Publication date: 2005-03-03

Abstract

PROBLEM TO BE SOLVED: To provide a mobile terminal having a digital picture input means, in which a proper file name is attached to a digital picture and the resulting digital picture is stored.

SOLUTION: The mobile terminal recognizes characters from a received digital picture, uses a character string including a result of character recognition for a candidate of a file name to be given to the digital picture, and decides the file name on the basis of the confirmation of a user.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は，カメラを有する携帯型の端末における入力手段に関する技術分野に属する。
【０００２】
【従来の技術】
従来より，カメラで撮った画像をデジタル化し，ファイルとして蓄積することができるデジタルスチルカメラ，カメラ付き携帯電話，カメラ付きＰＤＡなどの装置が実用化されている。以下，これらの装置をデジタルカメラと総称する。多くのデジタルカメラは，蓄積した画像を表示する機能や，不要な画像を削除したりする機能も兼ね備えている。通常，ファイルの名称は日付，通し番号などに基づいて付けられる。また，多くの場合，デジタルカメラに蓄積した画像は，パーソナルコンピュータに転送し，整理，加工，印刷を行う。
また，特開平０７−０７２５４６号公報（特許文献１）のように，カメラから撮った画像から文字を認識し，認識した結果を画像とあわせて記録する例もある。
【０００３】
【特許文献１】特開平０７−０７２５４６号公報
【非特許文献１】Ｒ．Ｍ．Ｋ．Ｓｉｎｈａ，Ｂ．Ｐｒａｓａｄａ，Ｇ．Ｆ．Ｈｏｕｌｅ，Ｍ．Ｓａｂｏｕｒｉｎ， “ＨｙｂｒｉｄＣｏｎｔｅｘｔｕａｌＴｅｘｔＲｅｃｏｇｎｉｔｉｏｎｗｉｔｈＳｔｒｉｎｇＭａｔｃｈｉｎｇ，” ＩＥＥＥＴｒａｎｓａｃｔｉｏｎｓｏｎＰａｔｔｅｒｎＡｎａｌｙｓｉｓａｎｄＭａｃｈｉｎｅＩｎｔｅｌｌｉｇｅｎｃｅ，Ｖｏｌ．１５，Ｎｏ．９，Ｄｅｃｅｍｂｅｒ１９９３
【非特許文献２】Ａ．Ｋ．Ｊａｉｎ，Ｂ．Ｙｕ， “ＡｕｔｏｍａｔｉｃＴｅｘｔＬｏｃａｔｉｏｎｉｎＩｍａｇｅｓａｎｄＶｉｄｅｏＦｒａｍｅｓ，” ＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ，Ｖｏｌ．３１，Ｎｏ．１２，ｐｐ．２０５５−２０７６，１９９８
【非特許文献３】Ｃ．−Ｌ．Ｌｉｕ，Ｍ．ＫｏｇａａｎｄＨ．Ｆｕｊｉｓａｗａ， ”Ｌｅｘｉｃｏｎ−ｄｒｉｖｅｎＳｅｇｍｅｎｔａｔｉｏｎａｎｄＲｅｃｏｇｎｉｔｉｏｎｏｆＨａｎｄｗｒｉｔｔｅｎＣｈａｒａｃｔｅｒＳｔｒｉｎｇｓｆｏｒＪａｐａｎｅｓｅＡｄｄｒｅｓｓＲｅａｄｉｎｇ，” ＩＥＥＥＴｒａｎｓ．ＰａｔｔｅｒｎＡｎａｌｙｓｉｓａｎｄＭａｃｈｉｎｅＩｎｔｅｌｌｉｇｅｎｃｅ，Ｖｏｌ．２４，Ｎｏ．１１，Ｎｏｖ．２００２，ｐｐ．４２５−１４３７
【０００４】
【発明が解決しようとする課題】
本発明が解決しようとする課題は，デジタルカメラで撮ったファイルの検索・利用を容易にすることである。多くの画像を装置中に格納した際には，表示したり削除したりするファイルをファイル名称のみを手がかりとして探すことは困難である。なぜなら，日付や通し番号のみからは画像にどんなものが写っているかを知ることが出来ないからである。このため，多くの場合，日付や通し番号で大まかに見当をつけた後，記憶に頼って順に画像を表示させ，所望の画像を探さざるを得ない。この過程を支援するために，縮小画像を多数画像に表示させることもある。しかし，この方法では表示される画像が小さく，画像の細部，例えば写っている文字や人の顔などを確認することが出来ない。
【０００５】
【課題を解決するための手段】
上記の課題を解決するための第一の手段は，デジタルカメラに文字を認識する手段を搭載し，画像中から文字を認識した結果に基づきファイル名を決定，もしくは画像ファイルの一部にキーワードを埋め込むことである。多くの場合，画像中には文字があり，文字が撮影場所や撮影時の重要な手がかりとなっている。例えば，観光地での記念写真は名所旧跡を示す看板を撮影することが多い。このため，画像中の文字は画像を探す上での重要な手がかりである。また，最近はデジタルカメラをメモ帳代わりに用いることも多く，この場合には画像中の文字は画像を探す上での重要な手がかりである。
【０００６】
上記の手段の導入に伴い，いくつかの技術的な問題が派生する．そこで，以下のような手段でそれらを解決する。
第一に，画像中に多くの文字列が存在する場合があり，どれがファイル名，あるいはキーワードとして適切かが自明でない場合がある。そこで，文字の大きさ，文字の位置などに基づきファイル名もしくはキーワードを決定する手段を設ける。
【０００７】
第二に，文字の大きさや位置などでは適切なファイル名称もしくはキーワードを決定できない場合がある。そこで，予め指定したキーワードの集合を記憶する手段と，キーワードを画像中から認識する手段を設け，画像中から認識されたキーワードをファイル名もしくは画像ファイルに埋め込むキーワードに含むようにする。画像中からキーワードを認識する手段としては，文字認識結果とキーワードの集合を照合する方法，キーワードの集合を文字認識における言語辞書として用いる方法などがある。
【０００８】
第三に，文字認識結果だけではファイル名称を特定できない場合がある。例えば，同じ文字認識結果が得られる画像が２つ以上ある場合などである。そこで，文字認識結果と日付もしくは通し番号を組合わせた文字列を生成する手段を設け，これにより生成したファイル名とするようにする。
【０００９】
上記の課題を解決するための第二の手段は，予め指定した人物の顔や物体を認識する手段と，認識対像の顔の人名や物体名を記憶する手段とをデジタルカメラに設け，画像中から対像とする物体が認識された場合には，その顔や物体を表す文字列をファイル名，もしくは画像に埋め込むキーワードとすることである。
【００１０】
上記の課題を解決するための第三の手段は，画像の特徴を表す属性を画像から抽出する手段と，画像の特徴に応じて画像を予め定められているカテゴリに分類する手段をデジタルカメラに設け，分類結果に応じてファイル名を決定，もしくはキーワードを画像ファイルに埋め込むようにすることである。ここで，カテゴリは例えば，「晴天」「雨天」「夜」「室内」など撮影状況に関するものとしておく。こうした分類に基づいて付与されたファイル名もしくはキーワードは，画像に写っている内容を表しており，ファイルを探す上での大きな手がかりとなる。
【００１１】
第三の解決手段の導入に伴い，画像の撮影条件により，画像の特徴が変化し分類が困難になるという問題が生じる。例えば，通常カメラのピント，絞り，シャッター速度，ホワイトバランス，センサーの感度などは撮影時に調整するが，これに伴い，画像の明るさ，尖鋭さ，色ヒストグラムなどの特徴が変化する。そこで，分類の際にはこれらの撮影時に調整に用いたパラメータも特徴量として用いるようにする。
【００１２】
【発明の実施の形態】
図１に本発明の第一の実施例における画像の入力から画像ファイルの保存に至る処理の流れをデータフロー図で示す。本実施例では，カメラで画像を入力（１０４）後，従来どおり通し番号（１０６）や日付・時刻（１０７）に基づきファイル名を決定し（１０８），画像をファイルに出力する（１０９）。また，従来の方式に加え，画像中からキーワードを認識（１０５）した結果を併用し，ファイル名を決定することもできる。また，認識されたキーワードのうち，ファイル名の決定に用いたもの以外は，画像ファイル中のタグ部に埋め込んでもより。埋め込まれたキーワードは，検索時にインデックスとして利用可能である。キーワード辞書１０３中の単語は，予めデジタルカメラの記憶装置に記憶しておくか，キーワード設定処理１０２により登録する。ここでは，入力画像の形式はＲＧＢカラー画像とするが，他のカラー画像形式，もしくはグレー画像を用いてもよい。
【００１３】
キーワードとしては，例えば，名所旧跡の名称を登録しておく。この場合は，観光地などで名所旧跡の看板が写るように撮った画像に対しては，その名所旧跡名がファイル名として付与されることとなる。また，一般的に用いられる単語全てを辞書に登録しておいてもよい。この場合は，内容に関わらず画像中の文字列が認識され，ファイル名に反映されることとなる。
本実施例では，キーワード認識処理としては，例えば，非特許文献１のような方式を用いる。図２にキーワード認識処理のデータフローの一例を示す。まず２０１において入力画像から文字行を切出す。文字行切出しには，例えば非特許文献２のような方式を用いる。次に２０２において文字行中から個々の文字を切出す。複数の文字行が切出された場合には，それらの複数の文字行を以降の処理の対像とする。次に，ステップ２０３において切出した個々の文字が何の文字であるかを識別する。この際，文字識別辞書２０６を参照する。最後にステップ２０４において，文字識別した結果を文字列として解釈する。この際，キーワード辞書１０３を参照する。最後にキーワード選択２０６において，認識されたキーワードの画像中での大きさ，位置，文字列認識結果の尤もらしさなどに基づき，最終的にキーワードとして相応しいものを選択し，出力する。キーワード選択２０６の入力は認識されたキーワードの集合であり，出力はファイル名に相応しい順に並べられたキーワード認識結果の集合である。本実施例では，文字切出し，文字識別，後処理を逐次的に実行しているが，非特許文献３にあるように，これらを統合した処理を実行してもよい。また，言語情報を用いずに文字認識を実行した後，通常のテキストマッチングのアルゴリズムを用いてキーワード辞書中の単語と文字認識結果を照合してもよい
図３に，本発明の実施例におけるハードウエアの構成を示す。画像は，レンズ，絞りなどからなる光学装置３０２によって撮像された後，例えばＣＣＤ素子などの光電変換素子３０３で電気信号に変換される。さらに得られた電気信号はアナログ・デジタル変換器３０４にてデジタル信号に変換し，さらに例えばＤＳＰなどの信号処理素子３０５により，色空間変換，フィルタ処理などの処理を施す。この結果は，ＲＡＭ３０９へと転送する。演算装置３０７は，ＲＯＭ３０８に格納されている処理手順と文字識別辞書などのデータを参照し，ＲＡＭ３０９に格納されている画像を入力としてキーワード認識処理１０５を実行する。さらに演算装置３０７は，ＲＯＭ３０８に格納されている処理手順に従い，通番計数処理１０６を実行するとともに，時計３１２を参照してファイル名決定処理１０８を実行する。画像ファイルは入出力装置３１０を介して，メモリーカード３１１に格納する。また，入力装置３１３は，キーワードを入力する際に用いる。また，表示装置３０６は，撮影時の画像の確認，ファイル名決定処理１０８の結果の表示に用いる。また，通信装置３１４は，キーワード辞書や画像ファイルの転送などのため，パーソナルコンピュータとの接続に用いる。
【００１４】
図４に，本発明の実施例における装置の外観を示す。筐体４０１の前面には，光学装置３０２のレンズ部４０２を配置する。上部には，入力装置３１３の一部であって画像入力１０４を指示するためのシャッター４０３と，電源スイッチ４０４を配置する。側面には，パーソナルコンピュータとの接続に用いる通信装置３１４の端子４０７と，メモリーカード３１１の挿入口４０８を配置する。背面には，入力装置３１３の一部である，キャンセルボタン４０５，カーソルキー４０６を配置する。さらに背面には，表示装置３０６の表示面４０９を配置する。カーソルキー４０６は，上下左右の端を押すと方向を指示する信号を入力装置３１３に送り，中央を押すと確認などの別の信号を入力装置３１３へ送る。なお、以上は本発明を実施するための装置の一例であって、本発明は上記の配置に限定されるものではない。例えば、従来のカメラ型の装置ではなく、ＰＤＡや携帯電話などの携帯端末で、撮像装置等の、画像データ入力手段をもつものであってもよい。これらの携帯端末においては、通信装置として、無線ＬＡＮやセルラ通信の無線通信装置を備えるため、ネットワークと通信を行って画像のダウンロード・転送を行ってもよい。
【００１５】
図５に本発明の第一の実施例における操作の手順を示す。まず，操作者はデジタルカメラをパーソナルコンピュータへ接続し，ダウンロード指示を行う（５０１）。これに応じ，パーソナルコンピュータ上に格納されていたキーワード辞書がデジタルカメラに転送される（５０２）。次に，操作者はパーソナルコンピュータとデジタルカメラの接続を外し，カーソルキー４０５により，キーワード設定１０２を起動する（５０３）。キーワード設定１０２においては，まず，キーワード辞書１０３に格納されているキーワードを一覧表示する（５０４）。操作者はこれらから必要なものを取捨選択し，必要に応じ，カーソルキーで新たなキーワードを登録する（５０５）。次に，操作者はカーソルキー４０６を用い撮影モードを指定する（５０６）。ここで撮影モードとは，画像ファイル名を従来の方法で設定するか，キーワード認識で設定するかの違いを指定する。図中５０７以降は，キーワード認識で画像ファイル名を設定するモードが選択されたことを前提とした操作の手順を示している。モード指定後，レンズ４０２を撮影対像に向け，シャッター４０３を押下することにより，画像入力処理１０４，キーワード認識１０５，ファイル名決定１０８が起動される（５０７）。ファイル名決定１０８においては，認識した結果得られたキーワードを表示し（５０８），必要に応じ操作者がファイル名に用いるキーワードを指定する。その結果を利用し，ファイル名決定処理１０８は，通番や日付などの情報を参照し，ファイル名を決定する。決定したファイル名は操作者に表示する（５１０）．続いて，ファイル出力処理１０９にて，画像を符号化し，キーワード認識結果を埋め込み，ファイル名決定処理１０８で決定したファイル名に出力する。画像の符号化には，ＪＰＥＧ等の標準的な方式を用いる。また，キーワード認識結果は，ＪＰＥＧなど標準的な画像ファイル形式のタグ部に格納する。再び，デジタルカメラをパーソナルコンピュータに接続後，操作者が画像ファイル転送指示５１１を行うことで，デジタルカメラからパーソナルコンピュータへファイルを転送する処理を起動する（５１２）。通信機能を用いてネットワークに接続し、キーワードのダウンロードや画像ファイルの転送のためにネットワークを介して接続可能なサーバにアクセスしてもよい。
【００１６】
図６に，キーワード入力起動５０３およびモード指定５０６の操作を行う際の表示面４０９の状態を示す。まず，電源スイッチ４０４により電源が投入されている状態で，カーソルキー４０６のどこかを押下すると，メニュー６０１が表示される。メニュー中での選択項目は，操作者がカーソルキーの上下部分を押すことで変更し，確定の際にはカーソルキー４０６の中央を押す。いずれも選択しない場合には，キャンセルボタン４０５を押す。キーワード入力を起動する際には，操作者が「キーワード設定６０４」を選択し，カーソルキー４０６の中央を押す。５０６のモード指定を行う際には，操作者が「撮影（自動ファイル名）」６０３を選択し，カーソルキー４０６の中央を押す。通常の撮影を行う際には「撮影」６０２を選択する。
【００１７】
図７に，キーワード表示５０４およびキーワード指定５０５を行う際の表示面４０９の状態を示す。まず，操作者がメニュー６０１でキーワード設定６０４を指定すると，図７（ａ）の７０１のようなメニューが表示される。７０１は，予めパーソナルコンピュータから転送した複数のキーワード辞書から，キーワード認識に用いるものを選択するためのものである。この例では，「名所・旧跡」と「地名」が有効になっている。ここで，操作者がカーソルキー４０６の上下を押すことにより，選択項目を変更し，カーソルキー４０６の左を押すことでキーワード辞書が有功か無効かを切り替える。操作者がカーソルキー４０６の中央を押すことで，変更内容を確定して元の状態に戻る。さらに，操作者がある項目を選んだ状態でカーソルキー６０４の右を押すと，キーワード辞書に登録されているキーワードを一覧する。
図７（ａ）で操作者が７０２の項目を選択してカーソルキー６０４の右を押した際には図７（ｂ）に示すようなメニュー７０３が現れる。この状態で操作者がカーソルキー４０６の上下を押すことにより，選択項目を変更でき，さらに，カーソルキー４０６の左を押すことで選択したキーワードが有功か無効かを切り替える。さらに，カーソルキー４０６の右を押すことで，新たなキーワードを操作者がカーソルキー４０６を用いて入力するモードに入る。このモードでは，画面中に仮想的なキーボードを表示するなどして，文字を入力するようにする。また，図７（ｂ）の状態で操作者がカーソルキー４０６の中央を押すことで，変更内容を確定して元の状態に戻る。例えば，図７（ｂ）の状態では「××渓谷」７０４が選択されている。ここで，カーソルキー４０６の左を操作者が押すと，キーワード「××渓谷」は無効となる。
【００１８】
図８に，入力画像の一例を示す。この例では，看板にかかれた「△△山頂」８０１と，服に印刷された「□□ウエア」８０２の二つの文字列が写っている。「△△山頂」８０１はこの写真が撮影された状況の手がかりとなり，画像ファイル名として適切である。一方，「□□ウエア」８０２は必ずしも撮影された状況の手がかりとはならず，画像ファイル名としては不適切である。
【００１９】
図９に，キーワード選択処理２０５の処理手順を示す。本処理の入力は認識されたキーワードの集合であり，出力はファイル名に相応しい順に並べられたキーワード認識結果の集合である。まず，ループ９０１において，各キーワード認識結果について，特徴量算出９０２と確信度算出９０３を実行する。特徴量算出９０２は，各キーワード認識結果のファイル名としての相応しさを調べるために必要なｎ個の特徴量
Ｆ＝（ｆ１，ｆ２，ｆ３，．．．，ｆｎ）
を求める処理である。ここではｎ＝５とし，以下のような特徴量を用いる。
ｆ１：認識されたキーワードの中心の画像上でのＸ座標（画素）
ｆ２：認識されたキーワードの中心の画像上でのＹ座標（画素）
ｆ３：認識されたキーワードの中心の画像上での幅（画素）
ｆ４：認識されたキーワードの中心の画像上での高さ（画素）
ｆ５：後処理で得られる文字列としての尤度
確信度算出９０３は，得られた特徴量に基づきキーワードのファイル名としての相応しさを示す値すなわち確信度を算出する処理である。次に，ステップ９０４にて，確信度の値が予め定められている閾値θ以下のキーワード認識結果を削除する。これは，相応しくないキーワードを用いて誤ったファイル名をつけてしまうことを防ぐためである。次に，ステップ９０５にて，残ったキーワード認識結果を確信度順に並べ替える。
【００２０】
図１０に，認識結果表示５０８を実行した際の表示面４０９の状態を示す。ここでは，１００１，１００２のようなキーワードを認識した結果を，認識された文字列のすぐ下に表示する。さらに，最も確信度の高いキーワードを，１００１のようにハイライトして表示する。ここで，カーソルキー４０６の下を押すとより確信度の低いキーワードを，上を押すと確信度の高いキーワードを順にハイライトする。最終的に，カーソルキー４０６の中央を押した時点でハイライトされているキーワードをファイル名の決定に用いる。
【００２１】
図１１に，ファイル名決定１０８に処理手順を示す。本処理の入力は，キーワード認識１０５の結果であるファイル名に相応しい順に並べられたキーワード認識結果の集合，日付・時刻管理１０７から得られる日付の情報，および保存済み画像ファイルの名称の集合である。まず，ステップ１１０１において，図１０に示すようにキーワード認識結果を表示する。次に，ステップ１１０２において図１０で説明したような方式で，操作者によるキーワード選択により，ファイル名の決定に用いるキーワードを選択する。次に，ステップ１１０３にて，格納済みの画像ファイルの名称を調べ，ステップ１１０２で確定したキーワードと同一のキーワードを用いたファイル名で，かつ，日付が新たに撮影した画像の日付と同一のものを探索する。該当する画像ファイルが見つかった場合には，さらに該当する画像ファイル名称の中で，最も通番の大きいものを選択する。次に，ステップ１１０４にて，ステップ１１０３で得られた通番を１加えたものを新たな通番とする。もし，ステップ１１０３で該当する画像ファイルが見出されなかった場合には，通番の値を１とし、または通番をつけないこととする。最後に，ステップ１１０５にて，日付とキーワードと通番よりファイル名を合成する。
【００２２】
図１２にファイル名決定表示５１０実行した際の表示面４０９の状態を示す。この状態では，画面中に現れたウインドウ１２０１上に決定したファイル名と画像ファイルに埋め込むキーワードを表示する。この例では，この画像は２００２年１２月１７日に△△山頂で撮った２枚目の画像であるとしており，ファイル名は，日付とキーワードと通番を組合わせた「２００２１２１７△△山頂０２」と決定されている。日付、時刻などの情報は装置内に内蔵される時計から入手される。
【００２３】
図１３に，本発明の第二の実施例における画像入力から画像ファイル出力に至る処理の流れをデータフロー図にて説明する。まず，画像入力設定１３１１に従い，画像を入力する（１３０２）。ここで画像入力設定とは，カメラのピント，絞り，シャッター速度，ホワイトバランス，センサーの感度，コントラストなど，画像入力の際に必要なパラメータの設定のことである。次に，入力画像中から顔を検出し，予め顔辞書１３０３に登録してある顔を認識する。また，撮影状況辞書１３０４に登録してある情報と画像入力設定１３１１の出力を参照し，入力画像がいかなる状況で撮影されたかを認識する（１３０６）。次に，顔認識１３０５，撮影状況認識１３０６，日付・時刻管理１３０７の結果，すでに格納されている画像ファイル１３１０を参照し，１０８と同じの手順で画像ファイル名を決定する（１３０８）。さらに，決定したファイル名に基づき，画像ファイルを出力する（１３０９）。
【００２４】
図１４に，顔認識１３０５の処理手順を示す。まず，ステップ１４０１にて入力画像中より顔領域を検出する。次にループ１４０２にて，ステップ１４０１で得られた全ての顔領域について，ステップ１４０３，ループ１４０４を繰り返す。ステップ１４０３は顔領域の特徴量を算出する処理である。特徴量は複数算出し，本ステップの出力は特徴ベクトルとなる。ループ１４０４では，顔辞書１３０３に登録してある全ての顔について，ステップ１４０３で得られた特徴ベクトルの尤度を計算する（１４０５）。顔辞書１３０３には，登録した顔の数だけ，尤度関数と顔に対応する名称（例えば氏名）の対を記憶している。この尤度関数を用い，ステップ１４０５で尤度を計算する。ループ１４０２を終了後，ステップ１４０６にて最も尤度の値が高い顔を検出し，さらにステップ１４０７にてその顔に対応する名称を出力する。この出力がファイル名決定に利用される。
【００２５】
図１５に，撮影状況認識１３０５の処理手順を示す。まず，ステップ１５０１にて入力画像から撮影状況の特徴量を算出する。特徴量は複数算出し，本ステップの出力は特徴ベクトルとなる。例えば，特徴量としては，各色成分ごとのヒストグラム値，自己相関係数，モーメントなどを用いる。次に，ループ１５０２にて，撮影状況辞書１３０４に登録してある全ての撮影状況について，ステップ１５０１で得られた特徴ベクトルの尤度を計算する（１５０３）。撮影状況辞書１３０４には，登録した撮影状況の数だけ，尤度関数と撮影状況に対応する名称（例えば「晴天」「雨天」「夜」「室内」など）の対を記憶している。この尤度関数を用い，ステップ１５０３では尤度を計算する。次に，ステップ１５０４にて，最も尤度が高いものを選択する。さらにステップ１５０５にて，最も尤度が高い撮影状況の名称を出力する。
【００２６】
【発明の効果】
従来は困難であった画像の内容が容易に類推可能な画像ファイル名を自動的に付与することが可能となる。
【図面の簡単な説明】
【図１】第一の実施例における画像入力から画像ファイル出力に至る処理の流れを示すデータフロー図。
【図２】本発明の第一の実施例におけるキーワード認識処理の流れを示すデータフロー図。
【図３】本発明の実施例におけるハードウエアの構成図。
【図４】本発明の実施例における装置の外観図。
【図５】本発明の第一の実施例における操作の手順を示す図。
【図６】キーワード設定，撮影のモードを指定する際の表示面の状態を示す図。
【図７】キーワード辞書表示，キーワード指定の際の表示面の状態を示す図。
【図８】入力画像の模式図。
【図９】キーワード選択処理の処理手順を示す図。
【図１０】キーワード認識結果表示時の表示面。
【図１１】本発明の第一の実施例におけるファイル名決定の処理手順を示す図。
【図１２】本発明の第一の実施例におけるファイル名表示時の表示面。
【図１３】本発明の第二の実施例における画像入力から画像ファイル出力に至る処理の流れを示すデータフロー図。
【図１４】本発明の第二の実施例における顔認識の処理手順を示す図。
【図１５】本発明の第二の実施例における撮影状況認識の処理手順を示す図。
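[Explanation of Reference Numerals]
101 ... processing from image input to image file output in the first embodiment, 102 ... keyword setting, 103 ... keyword dictionary, 104 ... image input, 105 ... keyword recognition, 106 ... serial number counting, 107 ... date / time management, 108 ... file name determination, 109 ... file output, 110 ... image file, 201 ... character line extraction, 202 ... character extraction, 203 ... character identification, 204 ... post-processing, 205 ... keyword selection, 206 ... character identification dictionary, 301 ... digital camera, 302 ... optical device, 303 ... photoelectric conversion element, 304 ... analog / digital converter, 305 ... signal processing element, 306 ... display device, 307 ... arithmetic unit, 308 ... ROM, 309 ... RAM, 310 ... input / output device, 311 ... memory card, 312 ... clock, 313 ... input device, 314 ... communication device, 401 ... housing, 402 ... lens, 403 ... shutter, 404 ... power switch, 405 ... cancel button, 406 ... cursor key, 407 ... communication device terminal, 408 ... memory card insertion slot, 409 ... display surface, 501 ... download instruction, 502 ... keyword transfer, 503 ... keyword input activation, 504 ... keyword display, 505 ... keyword designation, 506 ... mode designation, 507 ... shooting, 508 ... recognition result display, 509 ... designation of keyword for file name, 510 ... file name display, 511 ... file transfer instruction, 512 ... image file transfer, 601 ... mode designation menu, 602 ... shooting, 603 ... shooting (automatic file name), 604 ... keyword setting, 701 ... keyword dictionary menu, 702 ... selected keyword dictionary, 703 ... keyword list window, 704 ... selected keyword, 801 ... recognized keyword “△△ Summit”, 802 ... recognized keyword “□□ Wear”, 901 ... loop over keyword recognition results, 902 ... step of calculating feature quantities, 903 ... step of calculating the certainty factor, 904 ... step of thresholding the certainty factor, 905 ... step of sorting keywords, 1001, 1002 ... keyword recognition results, 1101 ... step of displaying keyword recognition results, 1102 ... step of selecting a keyword, 1103 ... step of searching for the largest existing serial number, 1104 ... step of adding 1 to the serial number, 1105 ... step of composing the file name, 1201 ... window displaying the file name, 1301 ... processing from image input to image file output in the second embodiment of the present invention, 1302 ... image input, 1303 ... face dictionary, 1304 ... shooting situation dictionary, 1305 ... face recognition, 1306 ... shooting situation recognition, 1307 ... date / time management, 1308 ... file name determination, 1309 ... file output, 1310 ... image file, 1401 ... step of detecting face regions in the input image, 1402 ... loop over all face regions, 1403 ... step of calculating feature quantities, 1404 ... loop over all faces registered in the face dictionary, 1405 ... step of calculating likelihood, 1406 ... step of selecting the face with the maximum likelihood, 1407 ... step of outputting the name of the face with the maximum likelihood, 1501 ... step of calculating feature quantities, 1502 ... loop over all shooting situations, 1503 ... step of calculating likelihood, 1504 ... step of detecting the shooting situation with the highest likelihood, 1505 ... step of outputting the name of the shooting situation with the highest likelihood.
[0001]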
[Technical Field of the Invention]
The present invention belongs to a technical field relating to input means in a portable terminal having a camera.
[0002]
[Prior art]
Devices that digitize images taken with a camera and store them as files, such as digital still cameras, camera-equipped mobile phones, and camera-equipped PDAs, are already in practical use. Hereinafter, these devices are collectively referred to as digital cameras. Many digital cameras also have functions for displaying stored images and deleting unnecessary ones. File names are normally assigned based on the date, a serial number, or the like. In many cases, images stored in a digital camera are transferred to a personal computer, where they are organized, processed, and printed.
In addition, as in Japanese Patent Application Laid-Open No. 07-072546 (Patent Document 1), there is an example in which characters are recognized from an image taken with a camera and the recognition result is recorded together with the image.
[0003]
[Patent Document 1] Japanese Patent Application Laid-Open No. 07-072546
[Non-Patent Document 1] R. M. K. Sinha, B. Prasada, G. F. Houle, M. Sabourin, “Hybrid Contextual Text Recognition with String Matching,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 9, December 1993
[Non-Patent Document 2] A. K. Jain, B. Yu, “Automatic Text Location in Images and Video Frames,” Pattern Recognition, Vol. 31, No. 12, pp. 2055-2076, 1998
[Non-Patent Document 3] C.-L. Liu, M. Koga and H. Fujisawa, “Lexicon-driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 24, No. 11, Nov. 2002, pp. 1425-1437
[0004]
[Problems to be solved by the invention]
The problem to be solved by the present invention is to make it easier to search for and use files taken with a digital camera. When many images are stored in the device, it is difficult to find a file to display or delete using only the file name as a clue, because the date or serial number alone does not tell what is shown in the image. For this reason, in many cases the user has to narrow down the candidates roughly by date or serial number and then display the images one after another, relying on memory, to find the desired image. To support this process, many reduced images are sometimes displayed on the screen at once. With this method, however, the displayed images are small, and details such as characters or people's faces in the image cannot be checked.
[0005]
[Means for Solving the Problems]
The first means for solving the above problem is to equip the digital camera with means for recognizing characters, and to determine the file name based on the result of recognizing characters in the image, or to embed a keyword in part of the image file. In many cases an image contains characters, and those characters are an important clue to where and when the image was taken. For example, commemorative photos at sightseeing spots often include signs indicating famous places and historic sites. For this reason, the characters in an image are an important clue when searching for the image. Recently, digital cameras are also often used in place of a memo pad, and in this case too the characters in the image are an important clue when searching for the image.
[0006]
With the introduction of the above means, several technical problems arise. Therefore, they are solved by the following means.
First, there may be many character strings in the image, and it may not be obvious which is appropriate as a file name or keyword. Therefore, means for determining a file name or a keyword based on the character size, the character position, etc. is provided.
[0007]
Secondly, there are cases where an appropriate file name or keyword cannot be determined based on the size and position of characters. Therefore, means for storing a set of keywords specified in advance and means for recognizing the keyword from the image are provided so that the keyword recognized from the image is included in the file name or the keyword embedded in the image file. As means for recognizing a keyword from an image, there are a method of collating a character recognition result with a set of keywords, a method of using a set of keywords as a language dictionary in character recognition, and the like.
[0008]
Third, there are cases where a file name cannot be uniquely determined from the character recognition result alone, for example when two or more images yield the same character recognition result. Therefore, means is provided for generating a character string that combines the character recognition result with the date or a serial number, and the generated string is used as the file name.
[0009]
The second means for solving the above problem is to provide the digital camera with means for recognizing the face of a person or an object specified in advance and means for storing the name of the person or object to be recognized, and, when a target face or object is recognized in the image, to use a character string representing that face or object as the file name or as a keyword embedded in the image.
[0010]
A third means for solving the above problem is to provide the digital camera with means for extracting attributes representing the features of an image from the image and means for classifying the image into predetermined categories according to those features, and to determine the file name or embed a keyword in the image file according to the classification result. Here, the categories relate to shooting conditions, for example “sunny”, “rainy”, “night”, and “indoor”. A file name or keyword assigned on the basis of such a classification represents what is shown in the image and is a strong clue when searching for a file.
[0011]
Introducing the third means raises the problem that the features of an image change with the shooting conditions, which makes classification difficult. For example, the camera's focus, aperture, shutter speed, white balance, sensor sensitivity, and so on are normally adjusted at the time of shooting, and features such as image brightness, sharpness, and color histogram change accordingly. Therefore, the parameters used for this adjustment at the time of shooting are also used as feature quantities in the classification.
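The idea of classifying an image from its features together with the capture settings can be illustrated with a small sketch. This is not the classifier of the invention: the diagonal Gaussian model, the function names `log_likelihood` and `classify_scene`, and the example features (mean brightness and log2 of the shutter time) are all assumptions chosen for illustration.

```python
import math


def log_likelihood(features, means, variances):
    """Log-likelihood of a feature vector under an independent (diagonal) Gaussian."""
    total = 0.0
    for x, mu, var in zip(features, means, variances):
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


def classify_scene(image_features, capture_settings, scene_models):
    """Return the category whose model best explains image features plus capture settings.

    scene_models maps a category name to (means, variances) over the combined
    vector; the model form is a hypothetical stand-in for a trained classifier.
    """
    combined = list(image_features) + list(capture_settings)
    return max(
        scene_models,
        key=lambda name: log_likelihood(combined, *scene_models[name]),
    )


# Hypothetical models over [mean brightness, log2(shutter time in seconds)].
MODELS = {
    "sunny": ([0.6, -8.0], [0.04, 1.0]),
    "night": ([0.4, -2.0], [0.04, 1.0]),
}
```

With these made-up models, two images of the same mean brightness are separated only by the shutter time: `classify_scene([0.5], [-7.5], MODELS)` selects "sunny", while `classify_scene([0.5], [-2.5], MODELS)` selects "night".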
[0012]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 is a data flow diagram showing the flow of processing from image input to image file storage in the first embodiment of the present invention. In this embodiment, after inputting an image with a camera (104), a file name is determined based on a serial number (106) and date / time (107) as usual (108), and the image is output to a file (109). In addition to the conventional method, the file name can also be determined by using the result of the keyword recognition (105) from the image. Of the recognized keywords, those other than those used to determine the file name can be embedded in the tag portion of the image file. The embedded keyword can be used as an index when searching. The words in the keyword dictionary 103 are stored in advance in the storage device of the digital camera or registered by the keyword setting process 102. Here, the format of the input image is an RGB color image, but another color image format or a gray image may be used.
[0013]
As keywords, for example, the names of famous places and historic sites are registered. In this case, an image taken at a sightseeing spot so that the signboard of a famous place appears in it is given the name of that place as its file name. Alternatively, all commonly used words may be registered in the dictionary. In this case, character strings in the image are recognized regardless of their content and are reflected in the file name.
In the present embodiment, a method such as that of Non-Patent Document 1 is used for the keyword recognition process. FIG. 2 shows an example of the data flow of the keyword recognition process. First, in 201, character lines are extracted from the input image, using, for example, a method such as that of Non-Patent Document 2. Next, in 202, individual characters are extracted from each character line. When a plurality of character lines have been extracted, all of them are subject to the subsequent processing. Next, in step 203, each extracted character is identified with reference to the character identification dictionary 206. Then, in step 204, the character identification results are interpreted as character strings with reference to the keyword dictionary 103. Finally, in the keyword selection 205, the keywords most suitable for the file name are selected and output based on the size and position of each recognized keyword in the image, the likelihood of the character string recognition result, and the like. The input of the keyword selection 205 is a set of recognized keywords, and the output is a set of keyword recognition results arranged in order of suitability as a file name. In the present embodiment, character extraction, character identification, and post-processing are executed sequentially, but processing that integrates them may be executed instead, as in Non-Patent Document 3. Alternatively, character recognition may be executed without using linguistic information, and the character recognition result may then be collated with the words in the keyword dictionary using an ordinary text matching algorithm.
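The last alternative, collating plain character recognition output with the keyword dictionary, can be sketched as follows. This is a minimal illustration and not the method of Non-Patent Document 1: the function name `match_keywords`, the similarity cutoff `min_ratio`, and the choice of Python's `difflib.SequenceMatcher` as the text matching algorithm are assumptions.

```python
from difflib import SequenceMatcher


def match_keywords(recognized_lines, keyword_dictionary, min_ratio=0.8):
    """Return (keyword, line_index, score) for dictionary words found in recognized text.

    recognized_lines: strings produced by character recognition, one per text line.
    keyword_dictionary: iterable of registered keywords.
    min_ratio: hypothetical cutoff that tolerates a few misread characters.
    """
    matches = []
    for index, line in enumerate(recognized_lines):
        for keyword in keyword_dictionary:
            best = 0.0
            # Slide a window of the keyword's length over the recognized line.
            for start in range(max(1, len(line) - len(keyword) + 1)):
                window = line[start:start + len(keyword)]
                best = max(best, SequenceMatcher(None, window, keyword).ratio())
            if best >= min_ratio:
                matches.append((keyword, index, best))
    # Most similar matches first.
    return sorted(matches, key=lambda m: m[2], reverse=True)
```

For instance, `match_keywords(["WELCOME TO EAGLE PEAK", "ACME WEAR"], ["EAGLE PEAK", "BLUE LAKE"])` reports only "EAGLE PEAK", found in line 0. The cutoff trades missed keywords against false matches and would need tuning for a real recognizer.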
FIG. 3 shows a hardware configuration in the embodiment of the present invention. An image is picked up by an optical device 302 including a lens and a diaphragm, and then converted into an electric signal by a photoelectric conversion element 303 such as a CCD element. Further, the obtained electric signal is converted into a digital signal by an analog / digital converter 304 and further subjected to processing such as color space conversion and filter processing by a signal processing element 305 such as a DSP. The result is transferred to the RAM 309. The arithmetic unit 307 refers to the processing procedure stored in the ROM 308 and data such as a character identification dictionary, and executes the keyword recognition process 105 using the image stored in the RAM 309 as an input. Further, the arithmetic unit 307 executes the serial number counting process 106 according to the processing procedure stored in the ROM 308 and executes the file name determination process 108 with reference to the clock 312. The image file is stored in the memory card 311 via the input / output device 310. The input device 313 is used when inputting a keyword. The display device 306 is used to check the image at the time of shooting and to display the result of the file name determination process 108. The communication device 314 is used for connection with a personal computer for transferring a keyword dictionary or an image file.
[0014]
FIG. 4 shows the appearance of the apparatus in the embodiment of the present invention. The lens unit 402 of the optical device 302 is disposed on the front surface of the housing 401. On the top are disposed a shutter 403, which is part of the input device 313 and is used to instruct the image input 104, and a power switch 404. On the side surface are disposed a terminal 407 of the communication device 314, used for connection with a personal computer, and an insertion slot 408 for the memory card 311. On the back are disposed a cancel button 405 and a cursor key 406, which are part of the input device 313, as well as the display surface 409 of the display device 306. When one of its top, bottom, left, or right edges is pressed, the cursor key 406 sends a signal indicating that direction to the input device 313; when its center is pressed, it sends a different signal, such as confirmation. The above is one example of an apparatus for carrying out the present invention, and the present invention is not limited to this arrangement. For example, instead of a conventional camera-type device, the apparatus may be a portable terminal such as a PDA or a mobile phone that has image data input means such as an imaging device. Because such portable terminals include a wireless communication device for wireless LAN or cellular communication, they may download and transfer images by communicating with a network.
[0015]
FIG. 5 shows the operation procedure in the first embodiment of the present invention. First, the operator connects the digital camera to a personal computer and issues a download instruction (501). In response, the keyword dictionary stored on the personal computer is transferred to the digital camera (502). Next, the operator disconnects the digital camera from the personal computer and starts the keyword setting 102 using the cursor key 406 (503). The keyword setting 102 first displays a list of the keywords stored in the keyword dictionary 103 (504). The operator selects the necessary ones from the list and, if needed, registers new keywords with the cursor key (505). Next, the operator designates the shooting mode using the cursor key 406 (506). The shooting mode specifies whether the image file name is set by the conventional method or by keyword recognition. Step 507 and the subsequent steps in the figure show the operation procedure on the assumption that the mode for setting the image file name by keyword recognition has been selected. After the mode is designated, pointing the lens 402 at the subject and pressing the shutter 403 starts the image input process 104, the keyword recognition 105, and the file name determination 108 (507). The file name determination 108 displays the keywords obtained by recognition (508), and the operator designates the keyword to be used for the file name as necessary (509). Using the result, the file name determination process 108 refers to information such as the serial number and date and determines the file name. The determined file name is displayed to the operator (510). Subsequently, the file output process 109 encodes the image, embeds the keyword recognition result, and writes the file under the name determined by the file name determination process 108. A standard method such as JPEG is used for image encoding, and the keyword recognition result is stored in the tag portion of a standard image file format such as JPEG. After the digital camera is connected to the personal computer again, the operator issues an image file transfer instruction 511, which starts the process of transferring the files from the digital camera to the personal computer (512). Alternatively, the communication function may be used to connect to a network and access a server reachable over the network in order to download keywords or transfer image files.
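One way to picture storing keywords in a tag portion of a JPEG file is the sketch below. The text does not say which tag is meant, so the use of a JPEG comment (COM, marker 0xFFFE) segment, the `;` separator, and the function names are assumptions made only for illustration; an actual device might use a different metadata field.

```python
def _skip_app_segments(data):
    """Return the offset just past SOI and any APPn segments that follow it."""
    pos = 2
    while len(data) >= pos + 4 and data[pos] == 0xFF and 0xE0 <= data[pos + 1] <= 0xEF:
        # The two-byte segment length counts itself but not the marker.
        pos += 2 + int.from_bytes(data[pos + 2:pos + 4], "big")
    return pos


def embed_keywords(jpeg_bytes, keywords):
    """Insert a COM segment holding the keywords after the leading APPn segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    payload = ";".join(keywords).encode("utf-8")
    if len(payload) + 2 > 0xFFFF:
        raise ValueError("keyword payload too long for one segment")
    segment = b"\xff\xfe" + (len(payload) + 2).to_bytes(2, "big") + payload
    pos = _skip_app_segments(jpeg_bytes)
    return jpeg_bytes[:pos] + segment + jpeg_bytes[pos:]


def read_keywords(jpeg_bytes):
    """Read back keywords written by embed_keywords; return [] if none are present."""
    pos = _skip_app_segments(jpeg_bytes)
    if jpeg_bytes[pos:pos + 2] != b"\xff\xfe":
        return []
    length = int.from_bytes(jpeg_bytes[pos + 2:pos + 4], "big")
    return jpeg_bytes[pos + 4:pos + 2 + length].decode("utf-8").split(";")
```

Because the keywords travel inside the file, a search tool on the personal computer could index them after transfer without running recognition again. Keywords that themselves contain `;` would need escaping, which this sketch omits.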
[0016]
FIG. 6 shows the state of the display surface 409 during the keyword input activation 503 and the mode designation 506. First, when any part of the cursor key 406 is pressed while the power has been turned on with the power switch 404, a menu 601 is displayed. The operator changes the selected item in the menu by pressing the upper or lower part of the cursor key and confirms the selection by pressing the center of the cursor key 406. To select nothing, the operator presses the cancel button 405. To start keyword input, the operator selects “keyword setting” 604 and presses the center of the cursor key 406. To perform the mode designation 506, the operator selects “shooting (automatic file name)” 603 and presses the center of the cursor key 406. For normal shooting, the operator selects “shooting” 602.
[0017]
FIG. 7 shows the state of the display surface 409 during the keyword display 504 and the keyword designation 505. First, when the operator chooses the keyword setting 604 in the menu 601, a menu such as 701 in FIG. 7(a) is displayed. The menu 701 is used to select, from a plurality of keyword dictionaries transferred in advance from the personal computer, those to be used for keyword recognition. In this example, “famous places / historic sites” and “place names” are enabled. The operator changes the selected item by pressing the top or bottom of the cursor key 406, and toggles the selected keyword dictionary between enabled and disabled by pressing the left of the cursor key 406. Pressing the center of the cursor key 406 confirms the changes and returns to the original state. Furthermore, when the operator presses the right of the cursor key 406 with an item selected, the keywords registered in that keyword dictionary are listed.
When the operator selects the item 702 in FIG. 7(a) and presses the right of the cursor key 406, a menu 703 as shown in FIG. 7(b) appears. In this state, the operator can change the selected item by pressing the top or bottom of the cursor key 406 and can toggle the selected keyword between enabled and disabled by pressing the left of the cursor key 406. Pressing the right of the cursor key 406 enters a mode in which the operator inputs a new keyword using the cursor key 406. In this mode, characters are entered by, for example, displaying a virtual keyboard on the screen. Pressing the center of the cursor key 406 in the state of FIG. 7(b) confirms the changes and returns to the original state. For example, “×× Valley” 704 is selected in the state of FIG. 7(b). If the operator presses the left of the cursor key 406 here, the keyword “×× Valley” is disabled.
[0018]
FIG. 8 shows an example of an input image. In this example, two character strings appear: “△△ Summit” 801, written on a signboard, and “□□ Wear” 802, printed on clothing. “△△ Summit” 801 is a clue to the situation in which the photograph was taken and is appropriate as an image file name. In contrast, “□□ Wear” 802 is not necessarily a clue to that situation and is inappropriate as an image file name.
[0019]
FIG. 9 shows the processing procedure of the keyword selection processing 205. The input of this process is a set of recognized keywords, and the output is the set of keyword recognition results arranged in order of their appropriateness as a file name. First, in a loop 901, a feature amount calculation 902 and a certainty factor calculation 903 are executed for each keyword recognition result. The feature amount calculation 902 is a process for obtaining the n feature amounts
F = (f1, f2, f3, ..., fn)
needed to check the appropriateness of each keyword recognition result as a file name. Here, n = 5 and the following feature amounts are used.
f1: X coordinate (pixel) on the image of the center of the recognized keyword
f2: Y coordinate (pixel) on the image of the center of the recognized keyword
f3: Width (pixels) on the image of the recognized keyword
f4: Height (pixels) on the image of the recognized keyword
f5: Likelihood as a character string obtained by post-processing
The certainty factor calculation 903 is a process of calculating, from the obtained feature amounts, a value indicating the appropriateness of the keyword as a file name, that is, a certainty factor. Next, in step 904, keyword recognition results whose certainty factor is equal to or smaller than a predetermined threshold θ are deleted. This prevents an inappropriate keyword from producing an unsuitable file name. Next, in step 905, the remaining keyword recognition results are sorted in descending order of certainty factor.
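The keyword selection procedure of FIG. 9 can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not specify how the certainty factor is computed from f1 to f5, so the `certainty` function here (favoring central, large, well-recognized keywords), its weights, and the names `KeywordResult` and `select_keywords` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KeywordResult:
    text: str
    cx: float          # f1: X coordinate of the keyword center (pixels)
    cy: float          # f2: Y coordinate of the keyword center (pixels)
    width: float       # f3: width of the keyword on the image (pixels)
    height: float      # f4: height of the keyword on the image (pixels)
    likelihood: float  # f5: likelihood as a character string (assumed in [0, 1])


def certainty(r, image_w, image_h):
    """Hypothetical certainty factor; the source does not define this formula."""
    # Distance of the keyword center from the image center, scaled so that
    # the center gives 0 and an image corner gives 1.
    dx = (r.cx - image_w / 2) / (image_w / 2)
    dy = (r.cy - image_h / 2) / (image_h / 2)
    centrality = 1.0 - min(1.0, (dx * dx + dy * dy) ** 0.5 / 2 ** 0.5)
    # Share of the image area covered by the keyword, scaled and capped at 1.
    size = min(1.0, 10.0 * (r.width * r.height) / (image_w * image_h))
    return r.likelihood * (0.5 * centrality + 0.5 * size)


def select_keywords(results, image_w, image_h, theta):
    # Loop 901: feature amounts (902) are the fields of each result;
    # the certainty factor (903) is computed from them.
    scored = [(certainty(r, image_w, image_h), r) for r in results]
    # Step 904: delete results whose certainty is equal to or below theta.
    kept = [(c, r) for c, r in scored if c > theta]
    # Step 905: sort the remaining results in descending order of certainty.
    kept.sort(key=lambda cr: cr[0], reverse=True)
    return [r for _, r in kept]
```

With a large keyword near the image center and a small one near a corner, a suitable threshold keeps only the former, which mirrors the signboard versus clothing example of FIG. 8.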
[0020]
FIG. 10 shows the state of the display surface 409 when the recognition result display 508 is executed. Here, the keyword recognition results, such as 1001 and 1002, are displayed immediately below the corresponding recognized character strings. The keyword with the highest certainty factor is highlighted, as shown at 1001. Pressing the cursor key 406 down moves the highlight to keywords with successively lower certainty factors, and pressing it up moves the highlight to keywords with successively higher certainty factors. Finally, the keyword that is highlighted when the center of the cursor key 406 is pressed is used to determine the file name.
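The highlight movement described for FIG. 10 amounts to a cursor over the ranked keyword list. A minimal Python sketch follows; the class name `CandidateSelector` and its method names are hypothetical, and stopping at the ends of the list (rather than wrapping around) is an assumption, since the text does not say what happens there.

```python
class CandidateSelector:
    """Cursor over keywords already sorted by descending certainty factor."""

    def __init__(self, ranked_keywords):
        if not ranked_keywords:
            raise ValueError("at least one keyword candidate is required")
        self.keywords = list(ranked_keywords)
        self.index = 0  # the highest-certainty keyword is highlighted first

    def press_down(self):
        # Move the highlight to the next lower-certainty keyword (assumed to stop at the end).
        self.index = min(self.index + 1, len(self.keywords) - 1)

    def press_up(self):
        # Move the highlight to the next higher-certainty keyword (assumed to stop at the top).
        self.index = max(self.index - 1, 0)

    def press_center(self):
        # Confirm the highlighted keyword for use in the file name.
        return self.keywords[self.index]
```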
[0021]
FIG. 11 shows the processing procedure of the file name determination 108. The inputs of this processing are the set of keyword recognition results produced by the keyword recognition 105, arranged in order of appropriateness as a file name, the date information obtained from the date / time management 107, and the set of names of the stored image files. First, in step 1101, the keyword recognition results are displayed as shown in FIG. 10. Next, in step 1102, the operator selects the keyword to be used for the file name by the method described with reference to FIG. 10. Next, in step 1103, the names of the stored image files are searched for files whose names use the same keyword as the one determined in step 1102 and whose date is the same as that of the newly captured image. If such image files are found, the file name with the largest serial number among them is selected. Next, in step 1104, the serial number obtained in step 1103 plus 1 is set as the new serial number. If no corresponding image file is found in step 1103, the serial number is set to 1 or no serial number is assigned. Finally, in step 1105, the file name is synthesized from the date, keyword, and serial number.
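Steps 1103 to 1105 can be illustrated with the following Python sketch. The function name `determine_file_name`, the exact layout (eight-digit date, then keyword, then a two-digit serial number with no separators or file extension), and the choice to start at 1 when no match exists are assumptions made for the example; the text only states that the name is synthesized from the date, keyword, and serial number.

```python
import re
from datetime import date


def determine_file_name(keyword, shot_date, existing_names):
    # Assumed layout: YYYYMMDD + keyword + serial number.
    prefix = shot_date.strftime("%Y%m%d") + keyword
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    # Step 1103: collect serial numbers of stored files with the same date and keyword.
    serials = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            serials.append(int(match.group(1)))
    # Step 1104: largest existing serial number plus 1, or 1 if none was found.
    next_serial = max(serials) + 1 if serials else 1
    # Step 1105: synthesize the file name from date, keyword, and serial number.
    return f"{prefix}{next_serial:02d}"
```

For example, if a file for the same date and keyword already exists with serial number 01, the new image receives serial number 02, as in the example of FIG. 12.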
[0022]
FIG. 12 shows the state of the display surface 409 when the file name determination display 510 is executed. In this state, the determined file name and the keyword embedded in the image file are displayed in the window 1201 that appears on the screen. In this example, the image is assumed to be the second image taken at the summit of △△ on December 17, 2002, and the file name has been determined as “20021217 △△ summit 02”, combining the date, keyword, and serial number. Information such as the date and time is obtained from a clock built into the apparatus.
[0023]
FIG. 13 is a data flow diagram illustrating the flow of processing from image input to image file output in the second embodiment of the present invention. First, an image is input (1302) according to the image input setting 1311. Here, the image input setting is the set of parameters required for image input, such as camera focus, aperture, shutter speed, white balance, sensor sensitivity, and contrast. Next, faces are detected in the input image, and any face registered in advance in the face dictionary 1303 is recognized (1305). Further, the situation in which the input image was shot is recognized (1306) by referring to the information registered in the shooting situation dictionary 1304 and to the image input setting 1311. Next, referring to the results of the face recognition 1305, the shooting situation recognition 1306, and the date / time management 1307, as well as to the image files 1310 already stored, the image file name is determined (1308) by the same procedure as 108. Finally, an image file is output (1309) under the determined file name.
[0024]
FIG. 14 shows the processing procedure of the face recognition 1305. First, in step 1401, face areas are detected in the input image. Next, in a loop 1402, step 1403 and the loop 1404 are repeated for every face area obtained in step 1401. Step 1403 is a process of calculating the feature amounts of the face area. A plurality of feature amounts are calculated, and the output of this step is a feature vector. In the loop 1404, the likelihood of the feature vector obtained in step 1403 is calculated (1405) for every face registered in the face dictionary 1303. The face dictionary 1303 stores, for each registered face, a pair consisting of a likelihood function and a name (for example, a person's name) corresponding to that face. The likelihood in step 1405 is calculated using this likelihood function. After the end of the loop 1402, the face with the highest likelihood value is detected in step 1406, and the name corresponding to that face is output in step 1407. This output is used to determine the file name.
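The nested loops 1402 and 1404 and the final maximum-likelihood selection can be sketched in Python as below. Face detection (step 1401) and feature extraction (step 1403) are assumed to have been done already, so the function takes feature vectors as input. The unnormalized Gaussian score standing in for a likelihood function, and the names `gaussian_likelihood` and `recognize_face`, are assumptions for illustration; the text does not specify the form of the likelihood functions.

```python
import math


def gaussian_likelihood(mean, sigma):
    """Build a stand-in likelihood function (unnormalized isotropic Gaussian)."""
    def likelihood(features):
        dist2 = sum((a - b) ** 2 for a, b in zip(features, mean))
        return math.exp(-dist2 / (2.0 * sigma ** 2))
    return likelihood


def recognize_face(face_feature_vectors, face_dictionary):
    """face_dictionary is a list of (likelihood function, name) pairs."""
    best_name, best_value = None, float("-inf")
    for features in face_feature_vectors:            # loop 1402
        for likelihood_fn, name in face_dictionary:  # loop 1404
            value = likelihood_fn(features)          # step 1405
            if value > best_value:                   # tracks the maximum for step 1406
                best_value, best_name = value, name
    return best_name                                 # step 1407 (None if nothing to compare)
```

A usage example with two hypothetical dictionary entries: `recognize_face([[4.8, 5.1]], [(gaussian_likelihood([0, 0], 1.0), "person A"), (gaussian_likelihood([5, 5], 1.0), "person B")])` selects the entry whose mean is closest to the feature vector.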
[0025]
FIG. 15 shows the processing procedure of the shooting situation recognition 1306. First, in step 1501, feature amounts of the shooting situation are calculated from the input image. A plurality of feature amounts are calculated, and the output of this step is a feature vector. For example, histogram values, autocorrelation coefficients, and moments of each color component are used as feature amounts. Next, in the loop 1502, the likelihood of the feature vector obtained in step 1501 is calculated (1503) for every shooting situation registered in the shooting situation dictionary 1304. The shooting situation dictionary 1304 stores, for each registered shooting situation, a pair consisting of a likelihood function and a name corresponding to that shooting situation (for example, “sunny weather”, “rainy weather”, “night”, or “indoor”). The likelihood in step 1503 is calculated using this likelihood function. Next, in step 1504, the shooting situation with the highest likelihood is selected. In step 1505, the name of that shooting situation is output.
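A minimal Python sketch of steps 1501 to 1505 follows. It uses only one of the feature types mentioned above, the first-order moment (mean) of each color component, and a distance-based score as a stand-in likelihood. The names `color_means`, `make_likelihood`, and `recognize_situation`, the RGB tuple input format, and the prototype values in the usage note are all assumptions for illustration.

```python
import math


def color_means(pixels):
    """Step 1501 (simplified): mean of each RGB component, scaled to [0, 1]."""
    if not pixels:
        raise ValueError("the image must contain at least one pixel")
    n = len(pixels)
    return [sum(p[c] for p in pixels) / (255.0 * n) for c in range(3)]


def make_likelihood(prototype):
    """Stand-in likelihood: larger when the features are closer to the prototype."""
    def likelihood(features):
        dist2 = sum((a - b) ** 2 for a, b in zip(features, prototype))
        return math.exp(-dist2)
    return likelihood


def recognize_situation(pixels, situation_dictionary):
    """situation_dictionary is a non-empty list of (likelihood function, name) pairs."""
    features = color_means(pixels)
    # Loop 1502 and step 1503: likelihood for every registered shooting situation.
    scored = [(fn(features), name) for fn, name in situation_dictionary]
    # Steps 1504 and 1505: pick the highest likelihood and output its name.
    return max(scored, key=lambda s: s[0])[1]
```

For instance, with hypothetical prototypes `[0.1, 0.1, 0.15]` for “night” and `[0.7, 0.75, 0.8]` for “sunny weather”, a uniformly dark image is assigned “night”.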
[0026]
[Effect of the invention]
The invention makes it possible to automatically assign image file names from which the contents of the image can easily be inferred, which was previously difficult.
[Brief description of the drawings]
FIG. 1 is a data flow diagram showing a flow of processing from image input to image file output in the first embodiment.
FIG. 2 is a data flow diagram showing the flow of keyword recognition processing in the first embodiment of the present invention.
FIG. 3 is a hardware configuration diagram according to the embodiment of the present invention.
FIG. 4 is an external view of an apparatus according to an embodiment of the present invention.
FIG. 5 is a diagram showing an operation procedure in the first embodiment of the present invention.
FIG. 6 is a diagram showing a state of a display screen when specifying a keyword setting and a shooting mode.
FIG. 7 is a diagram showing a state of a display surface when displaying a keyword dictionary and specifying a keyword.
FIG. 8 is a schematic diagram of an input image.
FIG. 9 is a diagram showing a processing procedure for keyword selection processing.
FIG. 10 is a display screen when a keyword recognition result is displayed.
FIG. 11 is a diagram showing a processing procedure for determining a file name in the first embodiment of the present invention.
FIG. 12 is a display screen when a file name is displayed in the first embodiment of the present invention.
FIG. 13 is a data flow diagram showing a flow of processing from image input to image file output in the second embodiment of the present invention.
FIG. 14 is a diagram showing a face recognition processing procedure in the second embodiment of the present invention.
FIG. 15 is a diagram showing a processing procedure for shooting situation recognition in the second embodiment of the present invention.
[Explanation of symbols]
101 ... Processing from image input to image file output in the first embodiment, 102 ... Keyword setting, 103 ... Keyword dictionary, 104 ... Image input, 105 ... Keyword recognition, 106 ... Serial number counting, 107 ... Date / time management, 108 ... File name determination, 109 ... File output, 110 ... Image file, 201 ... Character line extraction, 202 ... Character extraction, 203 ... Character identification, 204 ... Post-processing, 205 ... Keyword selection, 206 ... Character identification dictionary, 301 ... Digital camera, 302 ... Optical device, 303 ... Photoelectric conversion element, 304 ... Analog-digital converter, 305 ... Signal processing element, 306 ... Display, 307 ... Arithmetic unit, 308 ... ROM, 309 ... RAM, 310 ... Input / output device, 311 ... Memory card, 312 ... Clock, 313 ... Input device, 314 ... Communication device, 401 ... Housing, 402 ... Lens, 403 ... Shutter, 404 ... Power switch, 405 ... Cancel button, 406 ... Cursor key, 407 ... Communication device terminal, 408 ... Memory card insertion slot, 409 ... Display surface, 501 ... Download support, 502 ... Keyword transfer, 503 ... Keyword input activation, 504 ... Keyword display, 505 ... Keyword specification, 506 ... Mode specification, 507 ... Shooting, 508 ... Recognition result display, 509 ... File name keyword specification, 510 ... File name display, 511 ... Image file transfer, 601 ... Mode designation menu, 602 ... Shoot, 603 ... Shoot (automatic file name), 604 ... Keyword designation, 701 ... Menu for selecting a keyword dictionary, 702 ... Selected keyword dictionary, 703 ... Keyword list window, 704 ... Selected keyword, 801 ... Recognized keyword “ΔΔ mountaintop”, 802 ... Character string “□□ wear”, 901 ... Loop over each keyword recognition result, 902 ... Feature amount calculation step, 903 ... Certainty factor calculation step, 904 ... Certainty factor threshold processing, 905 ... Step of rearranging in order of certainty factor, 1001, 1002 ... Keyword recognition result, 1101 ... Displaying the keyword recognition results, 1102 ... Selecting a keyword, 1103 ... Searching for the existing maximum serial number, 1104 ... Adding 1 to the serial number, 1105 ... Synthesizing a file name, 1201 ... Window for displaying the file name, 1301 ... Processing from image input to image file output in the second embodiment of the present invention, 1302 ... Image input, 1303 ... Face dictionary, 1304 ... Shooting situation dictionary, 1305 ... Face recognition, 1306 ... Shooting situation recognition, 1307 ... Date / time management, 1308 ... File name determination, 1309 ... Image file output, 1310 ... Image file, 1401 ... Step of detecting face areas from the input image, 1402 ... Loop over all face areas, 1403 ... Step of calculating feature amounts, 1404 ... Loop over all faces registered in the face dictionary, 1405 ... Step of calculating likelihood, 1406 ... Step of detecting the face with the maximum likelihood value, 1407 ... Outputting the name of the face with the maximum likelihood value, 1501 ... Step of calculating feature amounts, 1502 ... Loop over all shooting situations, 1503 ... Calculation of likelihood, 1504 ... Detecting the shooting situation with the highest likelihood value, 1505 ... Outputting the name of the shooting situation with the highest likelihood value.

Claims

1. An image photographing apparatus comprising an image pickup device that photoelectrically converts an image and captures it as a digital signal, and a storage device that encodes the digital signal, assigns a name to it, and stores it, the apparatus further comprising:
means for recognizing characters from the input image; and means for creating a candidate for the image name that includes a character string obtained as a result of the character recognition and, when the candidate is determined to be valid, using the candidate as the name of the image.

2. The image photographing apparatus according to claim 1, further comprising means for storing a set of words, wherein the word is recognized from the image and used to create a candidate for the name of the image.

3. The image photographing apparatus according to claim 1, further comprising means for acquiring a date or time, wherein the character string obtained as a result of the character recognition and a character string representing the date or time are combined to form part of the image name candidate.

4. The image photographing apparatus according to claim 3, wherein a character string obtained as a result of character recognition and a serial number are combined to form a part of an image name candidate.

5. The image photographing apparatus according to claim 1, wherein when two or more character strings are recognized from the image, the character string used for the image name candidate is determined based on the positions of the recognized character strings.

6. The image photographing apparatus according to claim 1, wherein when two or more character strings are recognized from the image, the character string used for the image name candidate is determined based on the sizes of the recognized character strings.

7. A program describing: a process of recognizing characters from an image stored in a file; a process of creating a candidate for a new image file name that includes the character string obtained as a result of the character recognition; and a procedure of determining the candidate as the name of the image and storing the image under the newly determined file name.

8. The program according to claim 7, wherein a procedure is described for accessing means for storing a set of words, recognizing a word of the set from the image, and using the word to determine the name of the image.

9. The program according to claim 7, wherein a processing procedure is described for accessing means for acquiring a date or time and combining the character string obtained as a result of the character recognition with a character string representing the date or time to form part of the image name candidate.

10. The program according to claim 9, wherein a processing procedure is described in which a character string obtained as a result of character recognition and a serial number are combined to form part of the name of the image.

11. The program according to claim 7, wherein a processing procedure is described for determining, when two or more character strings are recognized from the image, the character string to be used for the image name candidate based on the positions of the recognized character strings.

12. The program according to claim 7, wherein a processing procedure is described for determining, when two or more character strings are recognized from the image, the character string to be used for the image name candidate based on the sizes of the recognized character strings.

13. An image photographing apparatus comprising an image pickup device that photoelectrically converts an image and captures it as a digital signal, and a storage device that encodes the digital signal, assigns a name to it, and stores it, the apparatus further comprising:
means for detecting a face area from the input digital signal; means for calculating a feature quantity of the detected face area; means for storing a set of pairs each consisting of a face name and a likelihood function that calculates a likelihood from the feature quantity of a face area; and means for determining the name of the image so as to include the name of the face having the highest likelihood for the detected face area.

14. A program describing: a process of detecting a face area from an image; a process of calculating a feature quantity of the detected face area; a process of accessing means that stores a set of pairs each consisting of a face name and a likelihood function that calculates a likelihood from the feature quantity of a face area; a process of determining the name of a new image file so as to include the name of the face having the highest likelihood for the detected face area; and a process of storing the image under the newly determined file name.

15. An image photographing apparatus comprising an image pickup device that photoelectrically converts an image and captures it as a digital signal, and a storage device that encodes the digital signal, assigns a name to it, and stores it, the apparatus further comprising:
means for calculating a feature quantity of the shooting situation from the input digital signal; means for storing a set of pairs each consisting of a shooting situation name and a likelihood function that calculates a likelihood from the feature quantity of the shooting situation; and means for determining the name of the image so as to include the name of the shooting situation having the highest likelihood.

16. A program describing: a process of calculating a feature quantity of the shooting situation from an image; a process of accessing means that stores a set of pairs each consisting of a shooting situation name and a likelihood function that calculates a likelihood from the feature quantity of the shooting situation; a procedure of determining the name of a new image file so as to include the name of the shooting situation having the highest likelihood; and a process of storing the image under the newly determined name.