JP2002014973A

JP2002014973A - Video retrieving system and method, and recording medium with video retrieving program recorded thereon

Info

Publication number: JP2002014973A
Application number: JP2000194972A
Authority: JP
Inventors: Hidekatsu Kuwano; 秀豪桑野; Yukinobu Taniguchi; 行信谷口; Haruhiko Kojima; 治彦児島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2000-06-28
Filing date: 2000-06-28
Publication date: 2002-01-18

Abstract

PROBLEM TO BE SOLVED: To reduce errors in similarity judgment caused by the gap between image and audio features and the semantic content of the video. SOLUTION: A character information recognition part 2 recognizes character information, such as telop characters, displayed in video stored in a memory, and outputs the character codes of plural recognition result candidates for each character displayed on a screen. A recognition result similarity calculation part 4 takes, from the plural recognition results stored in a recognition result storing part 3, the character code string candidates recognized from each of two video frames, and calculates the similarity between the character recognition results for every such pair. A result outputting part 5 outputs the character code information and the similarity of the character recognition results.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a video search device that judges the semantic similarity of video content or image content and searches for video based on the result of that judgment.

[0002]

PRIOR ART

Techniques for judging the semantic similarity of video and image content and displaying video based on the result have long been studied; those proposed so far include the techniques described in references [1], [2], and [3]. The techniques of references [1] and [2] both define video similarity from signal information in the video such as color, shape, motion, and sound, and can identify similar videos effectively when the features of the video signal reflect the semantic content of the video. Reference [3] uses text information related to the video content that exists separately from the video, such as closed captions, drama scripts, and news manuscripts; it computes the similarity between the texts and treats it as the similarity between the videos, thereby associating similar videos.

[0003] List of References

[1] Eiji Kasuya, Masato Murai, Hideyoshi Tominaga, "Video Scene Retrieval Method Based on Extraction of Significant Information from Compressed Video", Proceedings of the IEICE General Conference, D-12-74, p. 273 (1998-03)
[2] Hidehiko Iwasa, Hidenori Yamamoto, Naokazu Yokotani, Haruo Takemura, "Similar Image Retrieval Considering Compositional Similarity Between Images", Proceedings of the Annual Conference of the Institute of Image Electronics Engineers of Japan, pp. 47-48 (1990-06)
[3] Shingo Miyauchi, Noboru Babaguchi, Tadahiro Kitahashi, "Event Detection in Video Media by Extracting Features from Language Streams", Proceedings of the IEICE General Conference, D-12-78, p. 248 (2000-03)
[4] Shoji Kurakake, Hidekatsu Kuwano, Hiroyuki Arai, Kenji Ogura, "Study of Telop Character Recognition for Video Retrieval", NTT R&D, vol. 47, pp. 11-16 (1998-01)

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, the techniques described in references [1], [2], and [3] have the problem that they sometimes cannot identify similar videos.

[0005] With the techniques of references [1] and [2], the features of the video signal do not necessarily match the semantic content of the video, so the similarity judgment can fail. For example, when similarity is judged from color information between a video showing a red car and a video showing a yellow car, both carry the same meaning, "a car is shown", but because the cars differ in color the two are judged dissimilar. Likewise, when several videos share the same semantic content but have entirely different signal features, a method based on signal features has difficulty finding their semantic similarity. For example, a video showing "a red sunset sky" and a video showing "a city lined with huge buildings" differ completely in signal features such as color, shape, motion, and sound; even if both share the meaning "scenery in a dream that a certain person had", judging them similar from signal features alone is difficult.

[0006] Furthermore, the technique of reference [3] requires text information separate from the video to be prepared in advance by some means, and cannot be applied when only the video exists. In general, text information related to video content is held only by particular video users such as video production companies, so it is difficult to apply the technique of reference [3] to arbitrary video.

[0007] An object of the present invention is to provide a video search device, a video search method, and a recording medium on which a video search program is recorded, each of which reduces errors in similarity judgment arising from the gap between image and audio features and the semantic content of the video.

[0008]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, a first video search device of the present invention comprises: a video input storage unit that inputs video data and stores it in a memory; a character information recognition unit that, for the video data input from the video input storage unit, extracts the subtitles displayed in each video frame and recognizes character code string candidates for those subtitles; a recognition result storage unit in which the character code string candidates are stored; a recognition result similarity calculation unit that calculates, for every combination of two video frames, a similarity based on the degree to which the characters constituting the character string candidates match; and a result output unit that outputs the character code string candidates and the similarity.

[0009] A second video search device of the present invention comprises: a video input storage unit that inputs video data and stores it in a memory; a character information recognition unit that, for the video data input from the video input storage unit, extracts the subtitles displayed in each video frame and recognizes character code string candidates for those subtitles; a recognition result storage unit in which the character code string candidates are stored; a recognition result similarity calculation unit that calculates, for every combination of two video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification unit that classifies the calculated similarities according to a predetermined rule and outputs similarity information, which is identification information for each resulting group; a similarity information storage unit that stores in a memory the similarity information output by the similarity classification unit; and a search result output unit that searches the similarity information according to a request given in advance and outputs the result.

[0010] A third video search device of the present invention comprises: a video input storage unit that inputs video data and stores it in a memory; a character information recognition unit that, for the video data input from the video input storage unit, extracts the subtitles displayed in each video frame and recognizes character code string candidates for those subtitles; a recognition result storage unit in which the character code string candidates are stored; a recognition result similarity calculation unit that calculates, for every combination of two video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification unit that classifies the calculated similarities according to a predetermined rule and outputs similarity information, which is identification information for each resulting group; a similarity information storage unit that stores in a memory the similarity information output by the similarity classification unit; a video storage unit that attaches to the input video data information allowing the video to be played back from any point specified by a predetermined method, and stores the video; and a search result output unit that searches the similarity information stored in the similarity information storage unit according to a request given in advance, requests the video information corresponding to the search result from the video storage unit, and outputs it together with the search result.

[0011] The present invention automatically recognizes character information, such as telop characters (superimposed captions), displayed in video and uses the similarity of the resulting character codes to associate videos with similar semantic content more accurately than before. It also makes it possible to associate similar videos even when no text information related to the video content exists in advance.

[0012] Video commonly displays character information such as telop characters to explain its content: news titles, personal names, and place names in news video, recipe information in cooking programs, and much else across many kinds of video. The character codes obtained by running recognition on this character information are closer to the semantic content of the video than signal features such as color, shape, motion, and sound, and are therefore a more effective indicator than signal features for evaluating the similarity of semantic content. Take the two examples given under "Problems to Be Solved by the Invention". In the first, if characters such as 「車」 ("car") are displayed in both videos, then although color information cannot establish the similarity, applying the present invention shows that the character code 「車」 obtained by character recognition is common to both, so the two can be judged similar. In the second, if the text "scenery in a dream that a certain person had" is displayed in both videos, the present invention can likewise judge the videos similar from the character codes that the recognition results have in common.

[0013] Because the present invention uses, for the video similarity calculation, the text obtained by automatically recognizing character information such as telop characters displayed in the video, text information related to the video content can be obtained automatically from the video itself. Similar videos can therefore be identified from the text of the character recognition results even when text information such as closed captions does not exist in advance.

[0014] The present invention also uses, as the text information obtained by character recognition, a plurality of character codes from the recognition result for each character displayed in the video. Compared with using only one recognized character code per character, the recognition result is more likely to contain the correct character code, so a more accurate similarity can be defined.

[0015] In broadcast video, highly topical footage is aired many times, for example in news programs. By using the similarity of the recognition results of character information such as telop characters to associate similar videos within broadcast video, and by outputting them grouped by similarity, the present invention lets video users gauge how topical a piece of information is from the output.

[0016]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention are described below with reference to the drawings.

[0017] Referring to FIG. 1, the similar video search and display device of the first embodiment of the present invention comprises a video input storage unit 1, a character information recognition unit 2, a recognition result storage unit 3, a recognition result similarity calculation unit 4, and a result output unit 5.

[0018] The video input storage unit 1 inputs the video data to be processed and stores it in a memory. For example, a video signal from a video playback device such as a video player is stored in the memory of a personal computer by means of a video capture board installed in that computer.

[0019] The character information recognition unit 2 applies a predetermined method to the video stored in the computer's memory by the video input storage unit 1, recognizes character information such as telop characters displayed in the video, and outputs the character codes of a plurality of recognition result candidates for each character displayed on a screen (an individual still image making up the video data). The predetermined method can be, for example, computer software implementing the method presented in reference [4].

[0020] The recognition result storage unit 3 is, for example, a computer hard disk, and stores the character code information of the character recognition results output by the character information recognition unit 2.

[0021] The recognition result similarity calculation unit 4 takes, from the plurality of character recognition results stored in the recognition result storage unit 3, the character code string candidates recognized from each of two video frames, and calculates the similarity between the character recognition results for every combination of two frames. For example, suppose the following results exist:

・Character recognition result A: あす開幕高校野球 ("High school baseball opens tomorrow")
・Character recognition result B: 高校野球、雨で中止 ("High school baseball cancelled due to rain")

Here A and B are labels that distinguish the video frames. The two results share the four character codes 「高校野球」 ("high school baseball"). If the similarity between the two is defined as the number of character codes they have in common, the similarity is 4.

For ease of explanation, this example assumes that the recognition result contains exactly one character code per character. In practice, character recognition makes errors, and the correct character codes are not always obtained as they are above. The character information recognition unit 2 therefore uses a plurality of character codes as recognition result candidates for each character on the screen. Taking the same example with two character codes per character, results such as the following may be obtained:

・Character recognition result A: あす閉幕、高校野球 (first-candidate character code string); あた開草−甘林町理 (second-candidate character code string)
・Character recognition result B: 高林野球、雨て虫止 (first-candidate character code string); 甘校町利−雲で中宝 (second-candidate character code string)

If only the first-candidate strings existed, the similarity under the above definition would be 3, from the three character codes 「高」, 「野」, and 「球」. When the second candidates are used as well, 「校」 also matches (it is the second candidate for the second character of B), so all four characters of 「高校野球」 are common and the similarity is 4. Obtaining a plurality of recognition results for each displayed character in this way improves the accuracy of the similarity.
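The counting in paragraph [0021] can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, each displayed character is modeled as a set of candidate codes, two displayed characters are treated as matching when their candidate sets share a code, and punctuation or symbol positions (such as 「、」) are skipped. Those last two choices are assumptions made so that the sketch reproduces the values 3 and 4 given in the text.

```python
import unicodedata


def is_separator(ch):
    # Assumption of this sketch: punctuation (P*) and symbol (S*) characters
    # are not counted as character codes when measuring similarity.
    return unicodedata.category(ch)[0] in ("P", "S")


def to_positions(*candidate_strings):
    """Zip the candidate strings into one set of candidate codes per displayed character."""
    return [
        set(cands)
        for cands in zip(*candidate_strings)
        if not any(is_separator(c) for c in cands)
    ]


def similarity(a_positions, b_positions):
    """Count the displayed characters of A that share a candidate code with some character of B."""
    return sum(1 for a in a_positions if any(a & b for b in b_positions))


# Recognition results from paragraph [0021]: first and second candidate strings.
a_first, a_second = "あす閉幕、高校野球", "あた開草−甘林町理"
b_first, b_second = "高林野球、雨て虫止", "甘校町利−雲で中宝"

first_only = similarity(to_positions(a_first), to_positions(b_first))
with_second = similarity(to_positions(a_first, a_second),
                         to_positions(b_first, b_second))
print(first_only, with_second)
```

With only the first candidates the sketch counts 高, 野, and 球; adding the second candidates lets 校 match as well, which is the improvement the paragraph describes.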

[0022] The result output unit 5 outputs the character code information of the character recognition results and the similarity.

[0023] FIG. 2 is a flowchart explaining the processing of the recognition result similarity calculation unit 4. It shows one concrete example of the procedure for calculating the similarity between two character recognition results A and B. In this example the similarity is defined as the number of character codes common to both, and character recognition results A and B consist of K and L character codes, respectively.

Step 41: Read character recognition result A.
Step 42: Read character recognition result B.
Step 43: Initialize counter variable I to 1.
Step 44: Initialize counter variable J to 1.
Step 45: Initialize similarity S to 0.
Step 46: Determine whether counter variable I is K or less. If so, proceed to step 47; if I is greater than K, proceed to step 53.
Step 47: Determine whether counter variable J is L or less. If so, proceed to step 48; if J is greater than L, proceed to step 51.
Step 48: Determine whether the I-th character code of character recognition result A and the J-th character code of character recognition result B are the same. If they are the same, proceed to step 49; if they differ, proceed to step 50.
Step 49: Increase similarity variable S by 1.
Step 50: Increase counter variable J by 1.
Step 51: Increase counter variable I by 1.
Step 52: Set counter variable J to 1.
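The flowchart steps of paragraph [0023] map onto a nested loop. The Python sketch below follows the step numbers in comments; the function name is hypothetical. The text does not describe step 53 or the transitions after steps 49, 50, and 52, so the sketch assumes that step 49 continues to step 50, step 50 returns to step 47, step 52 returns to step 46, and step 53 ends the procedure with S as the result.

```python
def flowchart_similarity(result_a, result_b):
    """Count matching (I, J) pairs of character codes, following steps 41 to 52."""
    k = len(result_a)          # step 41: read A (K character codes)
    l = len(result_b)          # step 42: read B (L character codes)
    i = 1                      # step 43
    j = 1                      # step 44
    s = 0                      # step 45
    while i <= k:              # step 46: otherwise go to step 53
        while j <= l:          # step 47: otherwise go to step 51
            if result_a[i - 1] == result_b[j - 1]:  # step 48
                s += 1         # step 49
            j += 1             # step 50 (assumed to loop back to step 47)
        i += 1                 # step 51
        j = 1                  # step 52 (assumed to loop back to step 46)
    return s                   # assumed meaning of step 53: finish with S


score = flowchart_similarity("あす開幕高校野球", "高校野球、雨で中止")
print(score)
```

Because every pair (I, J) is compared, a character that occurs more than once in either result is counted once per matching pair; for results without repeated characters, S equals the number of common character codes.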

[0024] Referring to FIG. 3, the similar video search and display device of the second embodiment of the present invention comprises a video input storage unit 1, a character information recognition unit 2, a recognition result storage unit 3, a recognition result similarity calculation unit 4, a similarity classification unit 6, a similarity information storage unit 7, and a search result output unit 8.

[0025] Like the first embodiment, this embodiment obtains the similarities among multiple items of character information displayed in the video. It is an example of a device configuration that then adds processing to classify the similarities and display results based on the classification. The units from the video input storage unit 1 through the recognition result similarity calculation unit 4 are the same as in the first embodiment. The similarity classification unit 6 classifies the similarity information of the character recognition results obtained by the recognition result similarity calculation unit 4 according to a predetermined rule. For example, if the text 「高校野球組合わせ発表」 ("High school baseball matchups announced") exists as character recognition result C, the recognition result similarity calculation unit 4 finds that the combinations A and B, B and C, and A and C each share identical character codes and, under the similarity definition above, each have a similarity of 4. If the rule is, for example, to regard all combinations that share identical character codes and have a similarity of 4 or more as one similar group, then A, B, and C are all judged to be similar videos.
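The grouping rule in the example of paragraph [0025] can be sketched as follows. The text only requires "a predetermined rule", so everything here is an assumption for illustration: the function names are hypothetical, similarity is the simple count of common character codes, and results are merged transitively (any pair with similarity of at least the threshold ends up in the same group).

```python
from itertools import combinations


def common_count(a, b):
    """Similarity as the number of characters of a that also appear in b."""
    return sum(1 for ch in a if ch in b)


def group_similar(results, threshold=4):
    """Group result identifiers whose pairwise similarity reaches the threshold."""
    parent = {key: key for key in results}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in combinations(results, 2):
        if common_count(results[x], results[y]) >= threshold:
            parent[find(x)] = find(y)  # merge the two groups

    groups = {}
    for key in results:
        groups.setdefault(find(key), []).append(key)
    return sorted(groups.values())


results = {
    "A": "あす開幕高校野球",
    "B": "高校野球、雨で中止",
    "C": "高校野球組合わせ発表",
    "D": "天気予報",  # hypothetical unrelated caption, added for contrast
}
groups = group_similar(results)
print(groups)
```

The identifiers in each returned group correspond to what paragraph [0026] calls similarity information.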

[0026] The similarity information storage unit 7 is, for example, a computer hard disk, and stores the similarity information of the character recognition results obtained by the similarity classification unit 6. Similarity information is, for example, the identification numbers of the character recognition results in each group of similar character recognition results. In the example above, the identifiers A and B of character recognition results A and B can be used as similarity information.

[0027] The search result output unit 8 searches the similarity information of the video stored in the similarity information storage unit 7 according to a request given in advance, and outputs the result.

[0028] The configuration of FIG. 3 may also be applied when similarity information calculated for initial video data already exists in the similarity information storage unit and new video data is then input. In that case, rather than calculating similarity only among the character recognition results obtained from the new video, the similarity is recalculated so as to include the similarity information for the characters of the existing initial video data.

[0029] FIG. 4 is a block diagram of the similar video search and display device of the third embodiment of the present invention, which adds a video storage unit 9 to the video search device of the second embodiment.

[0030] The video storage unit 9 attaches, to video input into the memory of a personal computer or the like, information that allows the video to be played back from any point specified by a predetermined method, and stores the video in a storage device such as a computer hard disk. The predetermined method can be, for example, to record the information of each frame making up the video together with the corresponding storage location of the video in the video storage device.

[0031] The search result output unit 8 issues a search condition to the similarity information storage unit 7, and the similarity information storage unit 7 returns the results matching the condition to the search result output unit 8. These results are character information only, not video. Therefore, so that the search result output unit 8 can output the video in which the returned character information is actually displayed, it requests the corresponding video information from the video storage unit 9 based on the results returned by the similarity information storage unit 7, and the video storage unit 9 responds to that request.

[0032] FIG. 5 is a block diagram showing an example of the character information recognition unit 2, the recognition result storage unit 3, and the recognition result similarity calculation unit 4 in FIGS. 1, 3, and 4.

[0033] The character information recognition unit 2 comprises a character recognition calculation unit 11, a character code output unit 12, and a reliability output unit 13.

[0034] For the input video, the character recognition calculation unit 11 applies a predetermined character recognition method to the character information displayed in the video and obtains the character codes of the recognition result together with a numerical value (reliability) expressing how reliable each character code is as a recognition result. The character recognition method can be, for example, computer software implementing the method presented in reference [4]. As for the reliability of the recognized character codes, suppose that reliability ranges from 0 to 1 and that larger values mean higher reliability. Then for the earlier example 「あす開幕高校野球」, a value is obtained for each character code, such as あ: 0.78, す: 0.75, 開: 0.89, 幕: 0.70, 高: 0.65, 校: 0.91, 野: 0.87, 球: 0.95. In this case 「球」 is the most reliable recognition result. The reliability is obtained by the character recognition technique described in reference [4] or a similar one. In the technique of reference [4], a character region extracted from the video serves as the input pattern and a character region in dictionary data prepared in advance for character recognition serves as the dictionary pattern; characters are recognized by computing the similarity between the input pattern and the dictionary pattern. The reliability is a value obtained while computing that similarity and is proportional to it.
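As a small illustration of the per-character reliabilities in paragraph [0034], the sketch below stores the example values as (character code, reliability) pairs and picks out the most reliable one. The pair representation and the function name are assumptions made for this example; the values are those given in the text.

```python
def most_reliable(recognized):
    """Return the (character code, reliability) pair with the highest reliability."""
    return max(recognized, key=lambda pair: pair[1])


# Reliability values from the example 「あす開幕高校野球」 in paragraph [0034].
recognized = [
    ("あ", 0.78), ("す", 0.75), ("開", 0.89), ("幕", 0.70),
    ("高", 0.65), ("校", 0.91), ("野", 0.87), ("球", 0.95),
]

best = most_reliable(recognized)
print(best)
```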

【００３５】文字コード出力部１２は、文字認識計算部
１１で得られた認識結果の文字コード情報を出力する。
信頼度出力部１３は、文字認識計算部１１で得られた各
文字コードの持つ信頼度の数値を出力する。The character code output unit 12 outputs character code information of the recognition result obtained by the character recognition calculation unit 11.
The reliability output unit 13 outputs a numerical value of the reliability of each character code obtained by the character recognition calculation unit 11.

【００３６】認識結果格納部３は、文字コード格納部２
１と信頼度格納部２２から構成される。文字コード格納
部２１は文字認識結果の文字コードが格納される。信頼
度格納部２２は文字認識結果の文字コードの信頼度が格
納される。 The recognition result storage unit 3 consists of a character code storage unit 21 and a reliability storage unit 22. The character code storage unit 21 stores the character codes of the character recognition results. The reliability storage unit 22 stores the reliabilities of those character codes.

【００３７】認識結果類似性判断部４は文字コード類似
度計算部３１と類似性信頼度計算部３２から構成され
る。The recognition result similarity determination unit 4 includes a character code similarity calculation unit 31 and a similarity reliability calculation unit 32.

【００３８】文字コード類似度計算部３１は、文字コー
ド格納部２１に格納された複数の文字認識結果から２つ
ずつ文字認識結果を取り出し、全ての２つの組合わせに
対し、文字認識結果同士の文字コードの類似度を計算す
る。本処理部は、認識結果類似度計算部４と同様に、・文字認識結果A：あす開幕高校野球・文字認識結果B：高校野球、雨で中止があった場合、A, Bにともに含まれる同一の文字コード
として「高校野球」という４個の文字コードがある。こ
れに対し、類似度を同一文字コードの個数として定義し
た場合、A, Bの文字認識結果の類似度は４と計算され
る。類似性信頼度計算部３２は、文字コード格納部２１
に格納された複数の文字認識結果から２つずつの文字認
識結果を取り出し、各組合わせについて、文字コードの
信頼度から両者の類似性の信頼度を計算する。前記の例
を利用して本処理部の処理例を説明する。前記の例は、
文字認識結果Aと文字認識結果Bの各文字コードの信頼度
値が例えば以下の各文字コードの右隣の括弧内の数値の
ように求められたとする。 The character code similarity calculation unit 31 takes the character recognition results stored in the character code storage unit 21 two at a time and, for every such pair, calculates the similarity between the character codes of the two results. As with the recognition result similarity calculation unit 4, suppose there are the following two results:
・Character recognition result A: あす開幕高校野球 ("High school baseball opens tomorrow")
・Character recognition result B: 高校野球、雨で中止 ("High school baseball, canceled due to rain")
The four character codes "高校野球" ("high school baseball") are contained in both A and B. If the similarity is defined as the number of identical character codes, the similarity between A and B is calculated as 4. The similarity reliability calculation unit 32 takes the character recognition results stored in the character code storage unit 21 two at a time and, for each pair, calculates the reliability of their similarity from the reliabilities of the character codes. A processing example is described using the example above. Suppose that the reliability of each character code in results A and B is obtained as the value shown in parentheses to the right of each character code below.

【００３９】・文字認識結果A：あ（０.７８）す（０.
７５）開（０.８９）幕（０.７０）高（０.６５）校（０.９１）野（０.８７）球（０.９
５）・文字認識結果B：高（０.８８）校（０.７８）野（０.
６７）球（０.９６）、（０.５５）雨（０.７７）で
（０.８１）中（０.７２）止（０.９２）両者の類似性の信頼度を共通する文字コードの１文字当
たりの信頼度値として定義した場合、各文字認識結果中
の「高校野球」に相当する文字コード列の信頼度値の合
計は、それぞれ、文字認識結果Aで３.３８、文字認識結
果Bで３.２９となり、両者を足し合わせ、一文字当たり
の信頼度の平均値を算出すると０.８３となり、両者の
類似性信頼度は０.８３として計算される。
・Character recognition result A: あ (0.78) す (0.75) 開 (0.89) 幕 (0.70) 高 (0.65) 校 (0.91) 野 (0.87) 球 (0.95)
・Character recognition result B: 高 (0.88) 校 (0.78) 野 (0.67) 球 (0.96) 、 (0.55) 雨 (0.77) で (0.81) 中 (0.72) 止 (0.92)
If the reliability of the similarity between the two is defined as the reliability per character of the character codes they have in common, the reliabilities of the character code string "高校野球" sum to 3.38 in character recognition result A and 3.29 in character recognition result B. Adding the two sums and taking the average reliability per character gives 0.83, so the similarity reliability of the two results is calculated as 0.83.
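The arithmetic in this example can be written out as follows. The symbols $R$, $C$, $T_A(c)$ and $T_B(c)$ are notation introduced here for illustration only ($C$ is the set of character codes common to A and B, and $T_A(c)$, $T_B(c)$ are the reliabilities of code $c$ in each result); they are not the patent's own notation.

```latex
R = \frac{\sum_{c \in C} \bigl(T_A(c) + T_B(c)\bigr)}{2\,|C|}
  = \frac{(0.65 + 0.91 + 0.87 + 0.95) + (0.88 + 0.78 + 0.67 + 0.96)}{2 \times 4}
  = \frac{3.38 + 3.29}{8}
  = \frac{6.67}{8}
  \approx 0.83
```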

【００４０】図６は認識結果類似度計算部４が、文字コ
ード類似度計算部３１と類似性信頼度計算部３２で構成
される場合に、図２のフローチャートへ追加されるステ
ップを示したものである。 FIG. 6 shows the steps added to the flowchart of FIG. 2 when the recognition result similarity calculation unit 4 consists of the character code similarity calculation unit 31 and the similarity reliability calculation unit 32.

【００４１】この例では、文字認識結果の類似性信頼度
を共通する文字コードの信頼度の和に基づいた値として
計算する場合を示す。ステップ６１では類似性信頼度E
を０に初期化する。ステップ６２では類似性信頼度Eに
文字コードIの信頼度値T１と文字コードJの信頼度T２の
和を足し合わせる。 In this example, the similarity reliability of the character recognition results is calculated as a value based on the sum of the reliabilities of the character codes they have in common. In step 61, the similarity reliability E is initialized to 0. In step 62, the sum of the reliability T1 of character code I and the reliability T2 of character code J is added to the similarity reliability E.
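Steps 61 and 62 can be sketched as a double loop over the characters of the two results. This is a minimal sketch under assumptions: the function name `similarity_and_reliability`, the pairwise equality test, and the final per-character averaging are illustrative choices (the text above only specifies the initialization of E and the accumulation of T1 + T2), and each character is assumed to occur at most once per result.

```python
def similarity_and_reliability(result_a, result_b):
    """Compare two recognition results given as lists of (code, reliability).

    Returns (similarity, e): the number of matching character codes and the
    accumulated sum of the reliabilities of the matching codes.
    """
    similarity = 0
    e = 0.0  # step 61: initialize the similarity reliability E to 0
    for code_i, t1 in result_a:
        for code_j, t2 in result_b:
            if code_i == code_j:
                similarity += 1
                e += t1 + t2  # step 62: add T1 + T2 to E
    return similarity, e


# Example values from the description.
result_a = [("あ", 0.78), ("す", 0.75), ("開", 0.89), ("幕", 0.70),
            ("高", 0.65), ("校", 0.91), ("野", 0.87), ("球", 0.95)]
result_b = [("高", 0.88), ("校", 0.78), ("野", 0.67), ("球", 0.96),
            ("、", 0.55), ("雨", 0.77), ("で", 0.81), ("中", 0.72),
            ("止", 0.92)]

similarity, e = similarity_and_reliability(result_a, result_b)
# Average over the matched characters of both results (assumed normalization).
per_char = e / (2 * similarity)
print(similarity, round(e, 2), round(per_char, 2))
```

With the example values this yields a similarity of 4 and a per-character reliability of about 0.83, matching the figures in the description.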

【００４２】図７は図６の検索結果出力部８を示すブロ
ック図である。 FIG. 7 is a block diagram showing the search result output unit 8 of FIG. 6.

【００４３】映像格納表示部８２は、類似性情報格納部
７に格納された類似性情報から、類似すると判断された
文字認識結果に対応する映像を映像蓄積部９から取り出
し、コンピュータ上の映像用バッファ８１に配置する。
前記の例を利用すると、「あす開幕高校野球」という文
字が表示されている映像と、「高校野球、雨で中止」と
いう文字が表示されている映像がコンピュータのハード
ディスク上から取り出され、コンピュータのメモリなど
に一時的に配置される。 The video storage/display unit 82 uses the similarity information stored in the similarity information storage unit 7 to retrieve, from the video storage unit 9, the videos corresponding to character recognition results judged to be similar, and places them in the video buffer 81 on the computer. In the example above, the video in which the characters "あす開幕高校野球" ("High school baseball opens tomorrow") are displayed and the video in which the characters "高校野球、雨で中止" ("High school baseball, canceled due to rain") are displayed are read from the computer's hard disk and placed temporarily in the computer's memory or the like.

【００４４】映像格納表示部８２はさらに、映像用バッ
ファ８１に格納された類似する映像の組合せ毎にコンピ
ュータのディスプレイ画面に整理して表示する。本処理
を実施した場合のコンピュータのディスクプレイ画面の
一例を図８に示す。図８に示すように、類似する映像毎
に、例えば、文字認識結果の件数が多い類似グループ順
に画面の上部から下部に並べて類似する文字コード情報
や件数情報とともに表示することで、映像内容が放送映
像の場合、映像利用者は放送映像中の話題性の高い情報
を効率的に把握することが可能となる。図８では画面上
部に「話題の映像ベスト５」というタイトルを表示した
例であるが、これは、利用者に表示内容をわかりやすく
説明する効果がある。また、図８では、画面上部に類似
グループ毎の文字情報を件数順に並べて表示し、画面下
部に各類似グループに属する映像を放送局、日付、時間
情報とともに表示した例を示す。 The video storage/display unit 82 further arranges the similar videos stored in the video buffer 81 by combination and displays them on the computer's display screen. FIG. 8 shows an example of the computer's display screen when this processing is performed. As shown in FIG. 8, the groups of similar videos are displayed, for example, from the top of the screen to the bottom in descending order of the number of character recognition results in each similar group, together with the similar character code information and the counts. When the video content is broadcast video, this lets the video user efficiently grasp highly topical information in the broadcast. FIG. 8 is an example in which the title "話題の映像ベスト５" ("Top 5 topical videos") is displayed at the top of the screen, which helps explain the displayed content to the user. FIG. 8 also shows an example in which the character information of each similar group is listed at the top of the screen in order of count, and the videos belonging to each similar group are displayed at the bottom of the screen together with broadcast station, date, and time information.
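The ordering described for FIG. 8 amounts to sorting similar groups by the number of character recognition results they contain. The sketch below is illustrative only: the `SimilarGroup` structure, its field names, the `rank_groups` helper, and the sample groups and counts are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class SimilarGroup:
    """A group of videos whose recognized caption text was judged similar."""
    common_text: str
    videos: list = field(default_factory=list)  # (station, date, time) tuples

    @property
    def count(self) -> int:
        """Number of recognition results (videos) in this group."""
        return len(self.videos)


def rank_groups(groups, top_n=5):
    """Order groups by descending count and keep the first top_n."""
    return sorted(groups, key=lambda g: g.count, reverse=True)[:top_n]


# Hypothetical sample data.
groups = [
    SimilarGroup("group X", [("Station C", "day 1", "09:00")]),
    SimilarGroup("高校野球", [("Station A", "day 1", "07:00"),
                          ("Station B", "day 1", "12:00"),
                          ("Station A", "day 2", "18:00")]),
    SimilarGroup("group Y", [("Station B", "day 2", "21:00"),
                             ("Station C", "day 2", "22:00")]),
]

ranked = rank_groups(groups)
for g in ranked:
    print(g.count, g.common_text)
```

Sorting by count is one simple reading of "order of the number of results"; ties keep their input order because Python's sort is stable.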

【００４５】また、類似映像表示部８２では、映像類似
度の分類結果だけでなく、分類前の類似度の計算結果の
値だけを利用した映像表示も行う。例えば、一つの映像
を指定し、指定された映像と類似度の値が予め決められ
た値より大きい映像を表示する。「高校野球、あす開
幕」と表示された映像を選択すると「雨で中止、高校野
球」という文字が表示される映像など類似性のある映像
が表示される。 The similar video display unit 82 can display videos not only on the basis of the similarity classification results but also using only the similarity values calculated before classification. For example, one video is designated, and the videos whose similarity to the designated video exceeds a predetermined value are displayed. Selecting the video showing "高校野球、あす開幕" ("High school baseball, opens tomorrow") displays similar videos, such as the one showing the characters "雨で中止、高校野球" ("Canceled due to rain, high school baseball").
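The threshold-based display could look like the following sketch. It assumes, purely for illustration, that similarity is the number of shared character codes computed as a multiset overlap; `char_similarity`, `similar_videos`, the video identifiers, the third caption, and the threshold value are all hypothetical.

```python
from collections import Counter


def char_similarity(text_a: str, text_b: str) -> int:
    """Number of character codes the two strings share (multiset overlap)."""
    return sum((Counter(text_a) & Counter(text_b)).values())


def similar_videos(query_id, captions, threshold):
    """Return ids of videos whose similarity to the query exceeds threshold."""
    query_text = captions[query_id]
    return [
        vid for vid, text in captions.items()
        if vid != query_id and char_similarity(query_text, text) > threshold
    ]


# Recognized caption text per video; "video3" is a made-up unrelated caption.
captions = {
    "video1": "高校野球、あす開幕",
    "video2": "雨で中止、高校野球",
    "video3": "天気予報",
}

hits = similar_videos("video1", captions, threshold=3)
print(hits)
```

Note that this simple count also treats the punctuation mark "、" as a shared character; a practical system might exclude punctuation before comparing.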

【００４６】図９は図１に示した第１の実施形態の映像
検索装置を、パソコンなどのコンピュータ上で実施する
場合のブロック図である。入力装置９１は映像データを
入力するためのものである。記憶装置９２は入力された
映像データ、文字情報認識結果が格納される。記憶装置
９３はハードディスクである。出力装置９４は映像フレ
ームが出力されるディスクプレイである。記録媒体９５
は映像入力記憶部１から結果出力部５まで（認識結果格
納部３を除く）の処理からなる映像検索プログラムが記
録されている、フロッピィ・ディスク、CD−ROM、光磁
気ディスクなどである。データ処理装置９６はCPU、各
種インタフェースを含み、記録媒体９５から映像検索プ
ログラムを読み込み、これを実行する。 FIG. 9 is a block diagram of the case in which the video search device of the first embodiment shown in FIG. 1 is implemented on a computer such as a personal computer. The input device 91 is used to input video data. The storage device 92 stores the input video data and the character information recognition results. The storage device 93 is a hard disk. The output device 94 is a display on which video frames are output. The recording medium 95 is a floppy disk, CD-ROM, magneto-optical disk, or the like on which is recorded a video search program comprising the processing of the video input storage unit 1 through the result output unit 5 (excluding the recognition result storage unit 3). The data processing device 96 includes a CPU and various interfaces, reads the video search program from the recording medium 95, and executes it.

【００４７】なお、第２、第３の実施形態の映像検索装
置も同様に、コンピュータ上で実施できる。The video search apparatuses of the second and third embodiments can be similarly implemented on a computer.

【００４８】[0048]

【発明の効果】以上説明したように、本発明によれば、
映像中のテロップ文字などの文字情報を文字認識処理で
認識した結果の文字コードや認識結果としての信頼性の
情報などを分類し、意味内容の類似する映像をグループ
化することで、映像の色、形、動き、音などの信号情報
の特徴を利用した従来技術では不可能であった、意味内
容の類似性を判断することが可能となる。これにより、
例えば、放送映像などを対象とする場合、放送映像中の
話題性の高い情報をグループ化された映像情報を参照す
ることで知ることが可能となる。また、本機能を備えた
コンピュータをサーバとし、映像利用者に対し、コンピ
ュータネットワークを通じてクライアント用端末からサ
ーバ上の類似するグループ毎の映像情報を参照する機能
を提供し、本機能を利用したユーザから利用料金を徴収
するというコンピュータネットワークを利用した課金シ
ステムを構築することも可能となる。 [Effects of the Invention] As described above, according to the present invention, the character codes obtained by recognizing character information in the video, such as telop (caption) characters, through character recognition processing, together with information such as the reliability of the recognition results, are classified, and videos with similar semantic content are grouped. This makes it possible to judge the similarity of semantic content, which was not possible with conventional techniques that use features of signal information such as the color, shape, motion, and sound of the video. As a result, when broadcast video is the target, for example, highly topical information in the broadcast can be found by referring to the grouped video information. It is also possible to build a billing system over a computer network in which a computer equipped with this function acts as a server, video users are given a function for referring, from client terminals over the network, to the video information of each similar group on the server, and a usage fee is collected from the users of this function.

[Brief description of the drawings]

【図１】本発明の第１の実施形態の類似映像検索表示装
置のブロック図である。FIG. 1 is a block diagram of a similar image search and display device according to a first embodiment of the present invention.

【図２】図１中の認識結果類似度計算部４の処理を示す
フローチャートである。 FIG. 2 is a flowchart showing the processing of the recognition result similarity calculation unit 4 in FIG. 1.

【図３】本発明の第２の実施形態の類似映像検索表示装
置のブロック図である。FIG. 3 is a block diagram of a similar image search and display device according to a second embodiment of the present invention.

【図４】本発明の第３の実施形態の類似映像検索表示装
置のブロック図である。FIG. 4 is a block diagram of a similar image search and display device according to a third embodiment of the present invention.

【図５】文字情報認識部２、認識結果格納部３、認識結
果類似度計算部４のブロック図である。FIG. 5 is a block diagram of a character information recognition unit 2, a recognition result storage unit 3, and a recognition result similarity calculation unit 4.

【図６】認識結果類似性判断部５の処理を示すフローチ
ャートである。 FIG. 6 is a flowchart showing the processing of the recognition result similarity determination unit 5.

【図７】映像出力部８のブロック図である。 FIG. 7 is a block diagram of the video output unit 8.

【図８】類似映像表示部８２の表示例を示す図である。FIG. 8 is a diagram showing a display example of a similar image display unit 82.

【図９】図１の映像検索装置をコンピュータ上で実施す
る場合のブロック図である。FIG. 9 is a block diagram when the video search device of FIG. 1 is implemented on a computer.

[Explanation of symbols]

1 映像入力記憶部 (video input storage unit)
2 文字情報認識部 (character information recognition unit)
3 認識結果格納部 (recognition result storage unit)
4 認識結果類似度計算部 (recognition result similarity calculation unit)
5 結果出力部 (result output unit)
6 類似度分類部 (similarity classification unit)
7 類似性情報格納部 (similarity information storage unit)
8 検索結果出力部 (search result output unit)
9 映像蓄積部 (video storage unit)
11 文字認識計算部 (character recognition calculation unit)
12 文字コード出力部 (character code output unit)
13 信頼度出力部 (reliability output unit)
21 文字コード格納部 (character code storage unit)
22 信頼度格納部 (reliability storage unit)
31 文字コード類似度計算部 (character code similarity calculation unit)
32 類似性信頼度計算部 (similarity reliability calculation unit)
41〜52, 61, 62 ステップ (steps)
81 映像用バッファ (video buffer)
82 映像格納表示部 (video storage/display unit)
91 入力装置 (input device)
92, 93 記憶装置 (storage devices)
94 出力装置 (output device)
95 記録媒体 (recording medium)
96 データ処理装置 (data processing device)

Continuation of front page
(51) Int.Cl.7 (identification symbol) / FI: G06T 1/00 (200) / G06T 1/00 200E; H04N 5/76 / H04N 5/76 B
(72) Inventor: Haruhiko Kojima (児島治彦), c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-terms (reference): 5B050 AA08 AA09 BA10 BA15 EA04 GA08; 5B064 AA10 CA08; 5B075 ND08 ND12 NK10 NR12 PQ02 PQ29 PQ36 PQ62 PQ74 PR06 QM08; 5C052 AC08 DD04 FE01

Claims

[Claims]

1. A video search device comprising: a video input storage unit that receives video data and stores it in a memory; a character information recognition unit that, from the video data input from the video input storage unit, extracts for each video frame the subtitles displayed in the video frame and recognizes character code string candidates for the subtitles; a recognition result storage unit that stores the character code string candidates; a recognition result similarity calculation unit that, for every combination of two of the video frames, calculates a similarity based on the degree to which the characters constituting the character string candidates match; and a result output unit that outputs the character code string candidates and the similarity.

2. A video search device comprising: a video input storage unit that receives video data and stores it in a memory; a character information recognition unit that, from the video data input from the video input storage unit, extracts for each video frame the subtitles displayed in the video frame and recognizes character code string candidates for the subtitles; a recognition result storage unit that stores the character code string candidates; a recognition result similarity calculation unit that, for every combination of two of the video frames, calculates a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification unit that classifies the similarity based on a predetermined rule and outputs similarity information, which is identification information of each classified group; a similarity information storage unit that stores the similarity information output by the similarity classification unit; and a search result output unit that searches the similarity information in response to a request given in advance and outputs the result.

3. A video search device comprising: a video input storage unit that receives video data and stores it in a memory; a character information recognition unit that, from the video data input from the video input storage unit, extracts for each video frame the subtitles displayed in the video frame and recognizes character code string candidates for the subtitles; a recognition result storage unit that stores the character code string candidates; a recognition result similarity calculation unit that, for every combination of two of the video frames, calculates a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification unit that classifies the similarity based on a predetermined rule and outputs similarity information, which is identification information of each classified group; a similarity information storage unit that stores the similarity information output by the similarity classification unit; a video storage unit that attaches to the input video data information enabling the video to be reproduced from an arbitrary point designated by a predetermined method, and stores the video; and a search result output unit that searches the similarity information stored in the similarity information storage unit in response to a request given in advance, requests the video information corresponding to the search result from the video storage unit, and outputs it together with the search result.

4. The device according to any one of claims 1 to 3, wherein the character information recognition unit includes: a character recognition calculation unit that performs character recognition processing, using a predetermined character recognition method, on the character information displayed in the video of an input video frame, and obtains character code information of the recognition result and a reliability, which is a numerical value representing the reliability of the character code information as a recognition result; a character code output unit that outputs the character code information of the recognition result obtained by the character recognition calculation unit; and a reliability output unit that outputs the reliability of each character code obtained by the character recognition calculation unit; the recognition result storage unit includes: a character code storage unit that stores the character codes of the character recognition results; and a reliability storage unit that stores the reliabilities of the character codes of the character recognition results; and the recognition result similarity calculation unit includes: a character code similarity calculation unit that takes out the character recognition results stored in the character code storage unit two at a time and calculates the similarity of the character codes for each combination; and a similarity reliability calculation unit that takes out the character recognition results stored in the character code storage unit two at a time and, for each combination, calculates the reliability of the similarity between the two from the reliabilities of the character codes stored in the reliability storage unit.

5. The device according to claim 3, wherein the search result output unit includes: a video buffer; and a video storage/display unit that, based on the similarity information stored in the similarity information storage unit, takes out from the video storage unit the videos corresponding to character recognition results judged to be similar, stores them in the video buffer, and arranges and displays on a display screen each combination of similar videos stored in the video buffer.

6. A video search method comprising: a video input storage step of receiving video data and storing it in a memory; a character information recognition step of extracting, from the video data input in the video input storage step, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing step of storing the character code string candidates in a memory; a recognition result similarity calculation step of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; and a result output step of outputting the character code string candidates and the similarity.

7. A video search method comprising: a video input storage step of receiving video data and storing it in a memory; a character information recognition step of extracting, from the video data input in the video input storage step, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing step of storing the character code string candidates in a memory; a recognition result similarity calculation step of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification step of classifying the similarity based on a predetermined rule and outputting similarity information, which is identification information of each classified group; a similarity information storing step of storing in a memory the similarity information output in the similarity classification step; and a search result output step of searching the similarity information in response to a request given in advance and outputting the result.

8. A video search method comprising: a video input storage step of receiving video data and storing it in a memory; a character information recognition step of extracting, from the video data input in the video input storage step, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing step of storing the character code string candidates in a memory; a recognition result similarity calculation step of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification step of classifying the similarity based on a predetermined rule and outputting similarity information, which is identification information of each classified group; a similarity information storing step of storing in a memory the similarity information output in the similarity classification step; a video storage step of attaching to the input video data information enabling the video to be reproduced from an arbitrary point designated by a predetermined method, and storing the video; and a search result output step of searching the similarity information stored in the similarity information storing step in response to a request given in advance, requesting the video information corresponding to the search result from the video storage step, and outputting it together with the search result.

9. The character information recognizing step performs a character recognition process on character information displayed in a video on an input video frame using a predetermined character recognition method, Character code information and a character recognition calculating step of obtaining a reliability which is a numerical value representing reliability as a recognition result of the character code information; and a character for outputting character code information of the recognition result obtained in the character recognition calculating step A code output step, and a reliability output step of outputting the reliability of each character code obtained in the character recognition calculation step, wherein the recognition result storing step stores the character code of the character recognition result in a memory. Character code storing step, and a reliability storing step of storing the reliability of the character code of the character recognition result in a memory, wherein the recognition result similarity Calculating a character code similarity calculating step of extracting two character recognition results from a plurality of character recognition results stored in the character code storing step and calculating similarity of character codes for each combination; Two character recognition results are extracted from the plurality of character recognition results stored in the character code storage step,
9. A similarity reliability calculating step of calculating a similarity reliability of the two from the character code reliability stored in the reliability storage unit step for each combination. The method of claim 1.

10. The search result output step retrieves, from the video storage step, a video corresponding to a character recognition result determined to be similar from the similarity information stored in the memory in the similarity information storage step, from the video storage step. 8. The method according to claim 7, further comprising the steps of: storing in a display buffer; and organizing and displaying on a display screen for each combination of similar images stored in the buffer.

11. A recording medium on which is recorded a video search program for causing a computer to execute: a video input storage procedure of receiving video data and storing it in a memory; a character information recognition procedure of extracting, from the video data input in the video input storage procedure, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing procedure of storing the character code string candidates in a memory; a recognition result similarity calculation procedure of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; and a result output procedure of outputting the character code string candidates and the similarity.

12. A recording medium on which is recorded a video search program for causing a computer to execute: a video input storage procedure of receiving video data and storing it in a memory; a character information recognition procedure of extracting, from the video data input in the video input storage procedure, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing procedure of storing the character code string candidates in a memory; a recognition result similarity calculation procedure of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification procedure of classifying the similarity based on a predetermined rule and outputting similarity information, which is identification information of each classified group; a similarity information storing procedure of storing in a memory the similarity information output in the similarity classification procedure; and a search result output procedure of searching the similarity information in response to a request given in advance and outputting the result.

13. A recording medium on which is recorded a video search program for causing a computer to execute: a video input storage procedure of receiving video data and storing it in a memory; a character information recognition procedure of extracting, from the video data input in the video input storage procedure, for each video frame the subtitles displayed in the video frame and recognizing character code string candidates for the subtitles; a recognition result storing procedure of storing the character code string candidates in a memory; a recognition result similarity calculation procedure of calculating, for every combination of two of the video frames, a similarity based on the degree to which the characters constituting the character string candidates match; a similarity classification procedure of classifying the similarity based on a predetermined rule and outputting similarity information, which is identification information of each classified group; a similarity information storing procedure of storing in a memory the similarity information output in the similarity classification procedure; a video storage procedure of attaching to the input video data information enabling the video to be reproduced from an arbitrary point designated by a predetermined method, and storing the video; and a search result output procedure of searching the similarity information stored in the similarity information storing procedure in response to a request given in advance, requesting the video information corresponding to the search result from the video storage procedure, and outputting it together with the search result.

14. The recording medium according to any one of claims 11 to 13, wherein the character information recognition procedure includes: a character recognition calculation procedure of performing character recognition processing, using a predetermined character recognition method, on the character information displayed in the video of an input video frame, and obtaining character code information of the recognition result and a reliability, which is a numerical value representing the reliability of the character code information as a recognition result; a character code output procedure of outputting the character code information of the recognition result obtained in the character recognition calculation procedure; and a reliability output procedure of outputting the reliability of each character code obtained in the character recognition calculation procedure; the recognition result storing procedure includes: a character code storing procedure of storing the character codes of the character recognition results in a memory; and a reliability storing procedure of storing the reliabilities of the character codes of the character recognition results in a memory; and the recognition result similarity calculation procedure includes: a character code similarity calculation procedure of taking out the character recognition results stored in the character code storing procedure two at a time and calculating the similarity of the character codes for each combination; and a similarity reliability calculation procedure of taking out the character recognition results stored in the character code storing procedure two at a time and, for each combination, calculating the reliability of the similarity between the two from the reliabilities of the character codes stored in the reliability storing procedure.

15. The recording medium according to claim 12, wherein the search result output procedure includes: a procedure of taking out from the video storage procedure, based on the similarity information stored in the memory by the similarity information storing procedure, the videos corresponding to character recognition results judged to be similar, and storing them in a display buffer; and a procedure of arranging and displaying on a display screen each combination of similar videos stored in the buffer.