JP2005227944A

JP2005227944A - Character information acquisition device

Info

Publication number: JP2005227944A
Application number: JP2004034673A
Authority: JP
Inventors: Ryuichi Shibuya; 竜一澁谷
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-02-12
Filing date: 2004-02-12
Publication date: 2005-08-25

Abstract

PROBLEM TO BE SOLVED: To provide highly accurate character string information while suppressing malfunctions when character information such as a URL is transmitted as part of a video signal.

SOLUTION: When a character string is detected in the video signal, the audio signal is subjected to speech recognition to detect a character string, and the transmitted character information is extracted. When the two character strings match, that character string is output. When the rate of coincidence between the two character strings is high, the character string is corrected with a preset correction rate and the corrected character string is output; when the rate of coincidence is low, the character string detected from the video signal is output.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to an Internet information acquisition device that acquires URL information, e-mail addresses, and the like contained in a displayed image and, more specifically, to a video display device with an Internet connection function that incorporates such an Internet information acquisition device.

With the rapid spread of personal computers and Internet-capable mobile phones in recent years, use of the Internet has been expanding. Television programs now display, in video or audio, the URL of a website that provides information on the program itself or on the products and places it introduces, so that viewers can access the website and obtain further information. The original commercials that companies broadcast also frequently display the URLs of websites presenting product information, company profiles, and the like. A technique for extracting character information, such as a telop (on-screen caption) sent as part of the video, and using it for content viewing is disclosed in, for example, Japanese Patent Application Laid-Open No. 2003-069914.
JP 2003-069914 A

However, when a character string that forms part of an image is extracted, the characters themselves are stationary but their background is often a moving picture. Extraction accuracy is therefore poor, the accuracy of the resulting character information can be impaired, and problems such as failure to acquire the desired content data are likely to occur.

A first aspect of the invention comprises: a character string detection device that detects a character string transmitted as part of the image of an input video signal; a first special character recognition device that receives the output of the character string detection device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string; a first character string candidate storage device that receives the output of the first special character recognition device and can hold its contents for a certain period; a speech recognition device that converts an input audio signal into a character string; a second special character recognition device that receives the output of the speech recognition device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string; a second character string candidate storage device that receives the output of the second special character recognition device and can hold its contents for a certain period; a character string correction device that receives the outputs of the first and second character string candidate storage devices and that outputs the character string if the two inputs match, corrects the character string at a preset correction rate and outputs the corrected character string if the rate of match between the two character strings is high, and outputs the output of the first character string candidate storage device if the rate of match is low; and a special character storage device that receives the output of the character string correction device, holds the input character string, and outputs it at the request of a CPU. When a character string sent as part of a video signal is extracted and used by the CPU, the audio signal information is used to suppress malfunctions and to provide highly accurate character string information.
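The special character recognition step described above (find a preset keyword, then output only a fixed-length stretch of text around it) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the function name `extract_around_keyword` and the `before` and `after` lengths are hypothetical, and the default keyword "http" is the one used as an example in Embodiment 1.

```python
def extract_around_keyword(text: str, keyword: str = "http",
                           before: int = 0, after: int = 40) -> list[str]:
    """Return one snippet per keyword occurrence.

    Each snippet holds up to `before` characters preceding the keyword,
    the keyword itself, and up to `after` characters following it.
    The lengths are illustrative presets, not values from the patent.
    """
    snippets = []
    start = text.find(keyword)
    while start != -1:
        lo = max(0, start - before)
        hi = min(len(text), start + len(keyword) + after)
        snippets.append(text[lo:hi])
        # Continue searching after the current hit.
        start = text.find(keyword, start + 1)
    return snippets
```

The same function could serve both the video-side and the audio-side recognition devices, since the text describes them as operating identically on different inputs.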

A second aspect of the invention comprises: a character string information detection device that detects a character string transmitted as part of the image of an input video signal, attaches to the character string the certainty with which each character was detected, and outputs the result; a third special character recognition device that receives the output of the character string information detection device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string together with its certainty; a third character string candidate storage device that receives the output of the third special character recognition device and can hold its contents for a certain period; a speech recognition information output device that converts an input audio signal into a character string, attaches to the character string the certainty with which each character was converted, and outputs the result; a fourth special character recognition device that receives the output of the speech recognition information output device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string together with its certainty; a fourth character string candidate storage device that receives the output of the fourth special character recognition device and can hold its contents for a certain period; a second character string correction device that receives the outputs of the third and fourth character string candidate storage devices and that outputs the character string if the two inputs match, selects and outputs the characters of higher certainty on the basis of the certainty information attached to the character strings if the two do not match, and outputs the output of the third character string candidate storage device if the rate of match between the two character strings is low; and a special character storage device that receives the output of the second character string correction device, holds the input character string, and outputs it at the request of a CPU. When a character string sent as part of a video signal is extracted and used by the CPU, the audio signal information is used to suppress malfunctions and to provide highly accurate character string information.
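Both aspects rely on candidate storage devices that hold their contents "for a certain period". One way such time-bounded retention might look is sketched below. The class name `CandidateStore`, the `retention_s` parameter, and the injectable `clock` are all hypothetical; the text does not specify how long candidates are kept or how they are expired.

```python
import time
from collections import deque


class CandidateStore:
    """Hold extracted candidates for a fixed retention period (assumed design)."""

    def __init__(self, retention_s: float, clock=time.monotonic):
        self._retention = retention_s
        self._clock = clock          # injectable so the store can be tested
        self._items = deque()        # (timestamp, candidate), oldest first

    def add(self, candidate) -> None:
        """Store a candidate together with its arrival time."""
        self._items.append((self._clock(), candidate))

    def candidates(self) -> list:
        """Drop expired entries, then return the remaining candidates in order."""
        cutoff = self._clock() - self._retention
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()
        return [candidate for _, candidate in self._items]
```

In this sketch the retention period would be chosen long enough for the other channel's candidate to arrive, which is the requirement the embodiments place on the video-side store.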

According to the present invention, character string information such as a URL transmitted as part of a video signal can be detected with high accuracy and used for content viewing and the like.

(Embodiment 1)
This embodiment is described with reference to FIG. 1. The character string detection device 101 converts the input video signal into patterns and obtains a detected character string 10A, which is input to the first special character recognition device 102. A special character or character string to be extracted (hereinafter, a keyword) is set in advance, and the input is compared against it. Here, assuming that URL information is to be extracted, "http" is set as the keyword. When the keyword is present in 10A, device 102 outputs, for example, the continuous run of characters following the keyword that could be used in a URL, as information 10B containing the URL information. Output 10B is input to the first character string candidate storage device 103, which must hold it for the time expected to elapse before information is stored in the second character string candidate storage device described below. URL information candidates extracted in this way accumulate in 103.

The input audio signal is converted by the speech recognition device 104 into a detected character string 10D, which is input to the second special character recognition device 105. Device 105 may operate identically to 102, so its description is omitted. Device 105 yields character string information 10E containing URL information, which is input to the second character string candidate storage device 106. Device 106 may operate identically to 103, so its description is omitted. URL information candidates extracted in this way accumulate in 106.

The character string correction device 107 reads the information stored in 103 and 106 and compares the two successively. If the two match completely, 107 outputs either one. If they do not match completely but many characters match, then, for example, the matching characters are used as they are and the output 10F of 106 is used for the characters that do not match. If few characters match, 107 treats the audio as carrying no corresponding information and outputs the output of 103 unchanged. The output signal 10G is input to the special character storage device 108 and, for example, transferred by the CPU from 108 to another storage device, so that it can be used as highly accurate URL information for content viewing and the like.
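The three-way rule applied by device 107 can be sketched as below. This is an illustrative reading, not the patent's implementation: the function name `correct_string` and the 0.7 threshold separating "many" from "few" matching characters are assumptions (the text gives no numeric value), and the comparison is done position by position, so strings of different length are treated here as having no usable audio counterpart.

```python
def correct_string(video: str, audio: str, high_ratio: float = 0.7) -> str:
    """Combine a video-derived and an audio-derived candidate.

    video: candidate held by store 103; audio: candidate held by store 106.
    high_ratio: assumed threshold for "many characters match".
    """
    # Case 1: complete match, so either string can be output.
    if video == audio:
        return video
    # Assumption: without alignment logic, unequal lengths cannot be
    # compared character by character, so fall back to the video string.
    if len(video) != len(audio):
        return video
    matches = sum(v == a for v, a in zip(video, audio))
    if matches / len(video) >= high_ratio:
        # Case 2: keep matching characters; take the audio-side character
        # wherever the two differ. With equal lengths this reproduces the
        # audio string, but it is written per character to mirror the text.
        return "".join(v if v == a else a for v, a in zip(video, audio))
    # Case 3: few matches, so the audio is treated as unrelated.
    return video
```

A real device would likely need some form of alignment between the two candidates, since speech recognition and on-screen text rarely yield strings of identical length; that step is outside what the text describes.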

(Embodiment 2)
This embodiment is described with reference to FIGS. 2 and 3. The character string information detection device 201 converts the input video signal into patterns and obtains a detected character string. There are several ways to detect the characters. Suppose that pattern recognition is applied to the patterned signal and the most likely candidate is selected from among several. If, for example, the number of candidates at that point is taken as the certainty information, then a smaller number of candidates means a higher certainty. Attaching this number to each detected character yields character string information 20A that includes certainty (see FIG. 3). 20A is input to the third special character recognition device 202. Device 202 operates like 102, except that the certainty data is ignored when comparing against the keyword. The output 20B of 202 is input to the third character string candidate storage device 203. Device 203 operates like 103 but also holds the certainty data, and therefore requires a larger capacity than 103. URL information candidates with certainty extracted in this way accumulate in 203.

The input audio signal is converted into a detected character string by the speech recognition information output device 204. Here too there are several ways to detect the characters. Suppose that pattern recognition is applied to the patterned signal and the most likely candidate is selected from among several. If the number of candidates at that point is taken as the certainty information, a smaller number of candidates means a higher certainty. Attaching this number to each detected character yields character string information 20D, that is, speech recognition information that includes certainty (see FIG. 3). 20D is input to the fourth special character recognition device 205. Device 205 may operate identically to 202, so its description is omitted. Device 205 yields character string information 20E containing URL information, which is input to the fourth character string candidate storage device 206. Device 206 may operate identically to 203, so its description is omitted. URL information candidates with certainty extracted in this way accumulate in 206.

The character string correction device 207 reads the information stored in 203 and 206 and compares the two successively. If the detected characters of the two match completely, 207 outputs either one. If they do not match completely but many characters match, then, for example, the matching characters are used as they are, and for the characters that do not match, the attached certainty data determines which side is selected. Because a smaller certainty value means a higher certainty, the side with the smaller value is selected. If few characters match, 207 treats the audio as carrying no corresponding information and outputs the output of 203 unchanged. The output signal 20G is input to the special character storage device 208 and, for example, transferred by the CPU from 208 to another storage device, so that it can be used as highly accurate URL information for content viewing and the like.
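The certainty-based selection performed by device 207 can be sketched as follows. The names `ScoredChar` and `merge_by_certainty`, the 0.7 threshold, the equal-length requirement, and the tie-breaking rule (prefer the video side) are assumptions made for illustration; only the convention that a smaller candidate count means higher certainty comes from the text.

```python
from typing import NamedTuple, Sequence


class ScoredChar(NamedTuple):
    char: str
    candidates: int  # number of recognition candidates; fewer = more certain


def merge_by_certainty(video: Sequence[ScoredChar],
                       audio: Sequence[ScoredChar],
                       high_ratio: float = 0.7) -> str:
    """Merge video-side (203) and audio-side (206) candidates per character."""
    v_text = "".join(c.char for c in video)
    a_text = "".join(c.char for c in audio)
    # Complete match: output either one.
    if v_text == a_text:
        return v_text
    # Assumption: unequal lengths are not aligned here; use the video side.
    if len(video) != len(audio):
        return v_text
    matches = sum(v.char == a.char for v, a in zip(video, audio))
    # Few matches: treat the audio as unrelated and output the video side.
    if matches / len(video) < high_ratio:
        return v_text
    merged = []
    for v, a in zip(video, audio):
        # Matching characters are kept; on a mismatch the side with the
        # smaller candidate count wins (ties go to video, an assumption).
        if v.char == a.char or v.candidates <= a.candidates:
            merged.append(v.char)
        else:
            merged.append(a.char)
    return "".join(merged)
```

Compared with Embodiment 1, a mismatch no longer defaults to the audio side: each disputed character is decided individually, which is why the stores in this embodiment must also keep the certainty data.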

The character information acquisition device according to the present invention can detect, with high accuracy, character string information such as a URL transmitted as part of a video signal and make it available for content viewing and the like. It is useful as a video display device with an Internet connection function that incorporates an Internet information acquisition device.

FIG. 1: Block diagram of the character information acquisition device in the first embodiment of the present invention
FIG. 2: Block diagram of the character information acquisition device in the second embodiment of the present invention
FIG. 3: Conceptual diagram of detected characters with certainty information attached

Explanation of symbols

101 Character string detection device
102, 105, 202, 205 Special character recognition device
103, 106, 203, 206 Character string candidate storage device
104 Speech recognition device
107, 207 Character string correction device
108, 208 Special character storage device
109, 209 CPU
201 Character string information detection device
204 Speech recognition information output device

Claims

A character information acquisition device comprising: a character string detection device that detects a character string transmitted as part of the image of an input video signal; a first special character recognition device that receives the output of the character string detection device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string; a first character string candidate storage device that receives the output of the first special character recognition device and can hold its contents for a certain period; a speech recognition device that converts an input audio signal into a character string; a second special character recognition device that receives the output of the speech recognition device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string; a second character string candidate storage device that receives the output of the second special character recognition device and can hold its contents for a certain period; a character string correction device that receives the outputs of the first character string candidate storage device and the second character string candidate storage device and that outputs the character string if the two inputs match, corrects the character string at a preset correction rate and outputs the corrected character string if the rate of match between the two character strings is high, and outputs the output of the first character string candidate storage device if the rate of match between the two character strings is low; and a special character storage device that receives the output of the character string correction device, holds the input character string, and outputs the character string at the request of a CPU; wherein, when a character string sent as part of a video signal is extracted and used by the CPU, audio signal information is used to suppress malfunctions and to provide highly accurate character string information.

A character information acquisition device comprising: a character string information detection device that detects a character string transmitted as part of the image of an input video signal, attaches to the character string the certainty with which each character was detected, and outputs the result; a third special character recognition device that receives the output of the character string information detection device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string together with its certainty; a third character string candidate storage device that receives the output of the third special character recognition device and can hold its contents for a certain period; a speech recognition information output device that converts an input audio signal into a character string, attaches to the character string the certainty with which each character was converted, and outputs the result; a fourth special character recognition device that receives the output of the speech recognition information output device and, when it contains a character string matching a preset special character, outputs only a character string of preset length before and after that character string together with its certainty; a fourth character string candidate storage device that receives the output of the fourth special character recognition device and can hold its contents for a certain period; a second character string correction device that receives the outputs of the third character string candidate storage device and the fourth character string candidate storage device and that outputs the character string if the two inputs match, selects and outputs the characters of higher certainty on the basis of the certainty information attached to the character strings if the two character strings do not match, and outputs the output of the third character string candidate storage device if the rate of match between the two character strings is low; and a special character storage device that receives the output of the second character string correction device, holds the input character string, and outputs the character string at the request of a CPU; wherein, when a character string sent as part of a video signal is extracted and used by the CPU, audio signal information is used to suppress malfunctions and to provide highly accurate character string information.