JP2005267249A

JP2005267249A - Data processing system, server and communication device

Info

Publication number: JP2005267249A
Application number: JP2004078651A
Authority: JP
Inventors: Daiki Maeda; 大輝前田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2004-03-18
Filing date: 2004-03-18
Publication date: 2005-09-29

Abstract

PROBLEM TO BE SOLVED: To provide a data processing system, a server and a communication device capable of improving the accuracy of character recognition in image data captured by the communication device. SOLUTION: When country information is transmitted together with image data from a mobile terminal 10, a probability model generation part 204, on instruction from a control part 209, weights the languages used for character recognition based on the country information and generates a probability model of languages. When the character string to be recognized consists of a plurality of words, the probability model generation part 204 divides the recognition-result character string transferred from a character recognition part 203 into words based on a language dictionary 206, extracts candidates for the next word from the word immediately before the character to be recognized or the word before that, weights the extracted candidates, and generates a probability model of the next word. The character recognition part 203 performs character recognition according to the weights given by the language probability model and the word probability model. COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a data processing system, a server, and a communication device in which the server performs character recognition and translation on image data captured by the communication device.

With the recent development of information and communication technology, communication devices such as mobile terminals are rapidly gaining functionality. One example is the spread of mobile terminals with built-in digital cameras. In addition, as mobile terminals move to 3G (3rd Generation), the many communication standards that previously coexisted are being unified, making it increasingly likely that a mobile terminal used every day can also be used abroad.

Against this background, many systems have been proposed that let a user visiting a foreign country, for example while traveling, photograph displayed text with the digital camera built into a mobile terminal, recognize the characters in the captured image data, and then translate them. In general, however, character recognition requires fairly high-resolution images, as typified by OCR (Optical Character Reader). Although the pixel counts of digital cameras built into mobile terminals have increased dramatically in recent years, they have not yet reached the level needed for highly accurate character recognition. Moreover, the processing power of the CPU (Central Processing Unit) built into a mobile terminal is insufficient for character recognition, and further improvement in recognition accuracy is desired.

As a conventional technique that compensates for this lack of processing power and aims to improve recognition accuracy, there is a mobile-terminal image processing system in which an image captured by the camera of a mobile terminal is sent to a server, and the server performs character recognition and translation. In this system, the type of facility in which the mobile terminal is located is either identified from map data and the current position of a mobile terminal equipped with a GPS (Global Positioning System) function, or specified by the user from the mobile terminal. Recognition accuracy is improved by switching to a recognition dictionary and a language dictionary suited to the identified or specified facility type (see, for example, Patent Document 1). The mobile terminal in this system also binarizes the captured image data into black and white so as to separate the characters of the color image from the background, which reduces the amount of data transferred to the server and shortens the transmission time.

Japanese Unexamined Patent Application Publication No. 2003-178067 (JP 2003-178067 A)

The conventional technique described above has the problem that, unless the facility in which the mobile terminal is located is identified or specified, the scope of the recognition dictionary and the language dictionary cannot be narrowed, and no improvement in recognition accuracy can be expected.

There is the further problem that, if the captured color image is not binarized correctly, character recognition is more likely to fail and the text is more likely to be translated incorrectly.

An object of the present invention is to provide a data processing system, a server, and a communication device that can improve the accuracy of character recognition on image data captured by the communication device.

The present invention is a data processing system comprising a server having a character recognition function and a translation function, and a communication device having a function of transferring image data captured for character recognition and translation to the server and displaying the translation result transferred from the server, wherein
the communication device transfers country information, indicating the country in which the communication device is located, to the server together with the image data, and
the server assigns weights to the languages used for character recognition based on the transferred country information, and performs character recognition on the image data according to the weights assigned to the languages.

The present invention is also a data processing system comprising a server having a character recognition function and a translation function, and a communication device having a function of transferring image data captured for character recognition and translation to the server and displaying the translation result transferred from the server, wherein
the communication device transfers language information, indicating a language specified by the user, to the server together with the image data, and
the server performs character recognition on the image data using the language indicated by the transferred language information.

The present invention is further characterized in that, when performing character recognition, the server extracts candidates for the word containing the character to be recognized, based on the word immediately before that character or the word two before it, each consisting of characters that have already been recognized; assigns weights to the extracted candidates; and recognizes the character according to the weights assigned to the candidate words.

The present invention is further characterized in that the communication device transfers to the server image data obtained by applying color reduction processing to the captured image data.

The present invention is also a server having a character recognition function and a translation function, wherein
the server receives, from a communication device, image data and country information indicating the country in which the communication device is located, assigns weights to the languages used for character recognition based on the received country information, and performs character recognition on the image data according to the weights assigned to the languages.

The present invention is also a server having a character recognition function and a translation function, wherein
the server receives, from a communication device, image data and language information indicating a language specified by the user, and performs character recognition on the image data using the language indicated by the received language information.

The present invention is also a communication device having a function of transferring image data captured for character recognition and translation to a server and displaying the translation result transferred from the server, wherein
the communication device transfers country information, indicating the country in which the communication device is located, together with the image data.

The present invention is also a communication device having a function of transferring image data captured for character recognition and translation to a server and displaying the translation result transferred from the server, wherein
the communication device transfers language information, indicating a language specified by the user, together with the image data.

The present invention is further characterized in that the communication device transfers image data obtained by applying color reduction processing to the captured image data.

According to the present invention, the communication device automatically informs the server of the country in which it is located, so the server can assign a large weight to the language mainly used in that country when recognizing characters. Because other languages that are not mainly used also receive a small but nonzero weight, characters that are rarely used in that country can still be recognized, and the accuracy of character recognition can be improved.

According to the present invention, character recognition is performed in the language specified by the user, so languages unrelated to the characters to be recognized can be excluded, and the accuracy of character recognition can be improved.

According to the present invention, when performing character recognition, candidates for the word containing the character to be recognized are extracted based on the word immediately before that character or the word two before it, each consisting of characters that have already been recognized. Frequently appearing words among the extracted candidates are given a large weight, and the character is recognized according to the weights assigned to the candidate words. The candidates for the character to be recognized can therefore be narrowed down, and the accuracy of character recognition can be improved.

According to the present invention, the transferred image data has undergone color reduction only to an extent that does not hinder character recognition. Recognition accuracy is therefore higher than with binarization, while the amount of data transferred to the server is smaller than without color reduction, which shortens the transmission time.

According to the present invention, based on country information automatically reported by the communication device, the server can assign a large weight to the language mainly used in that country when recognizing characters. Because other languages that are not mainly used also receive a small but nonzero weight, characters that are rarely used in that country can still be recognized, and the accuracy of character recognition can be improved.

According to the present invention, country information indicating the country in which the communication device is located can be provided to the server, so the accuracy of character recognition can be improved.

According to the present invention, language information indicating the language specified by the user can be provided to the server, so the accuracy of character recognition can be improved.

According to the present invention, image data that has undergone color reduction only to an extent that does not hinder character recognition can be provided to the server. Recognition accuracy is therefore higher than with binarization, while the amount of data transferred to the server is smaller than without color reduction, which shortens the transmission time.

FIG. 1 is a block diagram showing the configuration of a mobile terminal 10, which is a communication device according to one embodiment of the present invention. The mobile terminal 10 is a communication device such as a mobile phone and includes a transmission unit 101, a reception unit 102, a photographing unit 103, a display unit 104, a key input unit 105, an image processing unit 106, a display driver 107, a memory 108, and a control unit 109. Although not shown, the mobile terminal 10 is also equipped with an audio input unit such as a microphone and an audio output unit such as a speaker.

The control unit 109 controls the mobile terminal 10 as a whole and consists of, for example, a CPU (Central Processing Unit) implemented by a microcomputer, a memory that stores programs, and a memory that temporarily stores information needed for processing.

The key input unit 105 has keys for entering telephone number digits, e-mail text, and the like, as well as operation buttons for the built-in digital camera. Information and instructions entered by the user through the key input unit 105 are transferred to the control unit 109.

When the user wants a translation, the user first points the photographing unit 103 at the subject and operates the key input unit 105 to capture an image containing text. The photographing unit 103 is a built-in digital camera consisting of a camera module that uses, for example, a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. It captures an image when the control unit 109 relays the user's instruction received from the key input unit 105. The captured image undergoes image compression and color reduction in the image processing unit 106 and is then stored as image data in the memory 108, a readable and writable storage medium such as a semiconductor memory.

After capturing the image, the user enters the target language for translation and a translation instruction, which are reported to the control unit 109. The control unit 109 then instructs the transmission unit 101 to send the image data stored in the memory 108 to the server together with the target language entered by the user. The transmission unit 101, which sends data by wireless communication, reads the image data from the memory 108 and sends it to the server together with the target language.

Because the mobile terminal 10 moves, it communicates with base stations periodically to keep track of which base station is nearby and so maintain a good connection. In doing so, the mobile terminal 10 receives location information from the base station and holds it in memory. This location information includes information on the country in which the base station is located (hereinafter, country information). The country information may be, for example, a country name or a number that identifies the country. Similar country information is also held in the SIM (Subscriber Identity Module) card, an IC (Integrated Circuit) card that stores subscriber information and is inserted into the mobile terminal 10 to identify the user; this information may be used instead. The control unit 109 sends the country information when it sends the image data to the server.

If the user can tell which language the photographed text is written in, the user may specify that language through the key input unit 105, and the specified language may be transferred to the server as language information instead of country information. The language information may be, for example, a language name or a language number that identifies the language. To specify the language, the user may select from a list of language names shown on the display unit 104 or enter the language name directly through the key input unit 105.
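As a minimal Python sketch, the data sent with each translation request could be modeled as follows. The class name `TranslationRequest`, its field names, and the rule that exactly one of country information or language information accompanies the image are illustrative assumptions; the source only states that language information may be sent instead of country information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranslationRequest:
    """Hypothetical message sent from the mobile terminal to the server."""
    image_data: bytes                    # captured (possibly color-reduced) image
    target_language: str                 # language to translate into
    country_info: Optional[str] = None   # e.g. a country name or country number
    language_info: Optional[str] = None  # e.g. a language name or language number

    def __post_init__(self):
        # Assumption: language information replaces country information,
        # so exactly one of the two must be present.
        if (self.country_info is None) == (self.language_info is None):
            raise ValueError("provide exactly one of country_info or language_info")
```

A request built from base-station data would set `country_info`, while a request in which the user picked the source language would set `language_info`.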

The image data sent to the server may also be color-reduced, to an extent that does not lower the character recognition rate, so that less data is transmitted. How far colors can be reduced depends on the state of the captured image and the environment. Several color reduction levels may therefore be prepared, for example for bright places, dark places, dense colors, and faint colors, and the user may choose among them while checking whether translation succeeds.
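The color reduction step can be sketched in Python as a uniform per-channel quantization. The source does not specify the reduction algorithm, so the quantization method, the function name `reduce_colors`, and the preset level counts in `PRESETS` are assumptions made purely for illustration.

```python
def reduce_colors(pixels, levels):
    """Quantize each 8-bit RGB channel to `levels` evenly spaced values.

    pixels: list of (r, g, b) tuples with values in 0..255.
    levels: number of values kept per channel (2 gives pure on/off channels).
    """
    if not 2 <= levels <= 256:
        raise ValueError("levels must be between 2 and 256")
    step = 255 / (levels - 1)

    def quantize(value):
        # Snap the value to the nearest of the evenly spaced levels.
        return round(round(value / step) * step)

    return [tuple(quantize(c) for c in px) for px in pixels]


# Hypothetical presets for the shooting conditions mentioned in the text;
# the numbers are placeholders, not values taken from the source.
PRESETS = {"bright": 4, "dark": 16, "dense": 8, "faint": 32}
```

Fewer levels mean fewer distinct colors and therefore better compressibility, at the cost of detail that recognition may need, which is why the text lets the user try several settings.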

The reception unit 102, which receives data by wireless communication, receives from the server the result of recognizing and then translating the text in the transmitted image data, stores the received translation result in the memory 108, and notifies the control unit 109 that the translation result has arrived. The control unit 109 then instructs the display driver 107 to display the translation result. The display driver 107 drives the display unit 104, such as a liquid crystal display; it reads the translation result from the memory 108 and shows it to the user on the display unit 104.

FIG. 2 is a block diagram showing the configuration of a server 20 according to one embodiment of the present invention. The server 20 is a computer in a data processing system that collects and distributes data such as e-mail and voice information exchanged over a wireless communication network or the Internet. It includes a transmission unit 201, a reception unit 202, a character recognition unit 203, a probability model generation unit 204, a translation unit 205, a language dictionary 206, a translation dictionary 207, a memory 208, and a control unit 209.

The reception unit 202 receives data via wireless communication or the Internet. When it receives image data sent from the mobile terminal, it stores the image data in the memory 208, a readable and writable storage medium such as a semiconductor memory, and notifies the control unit 209 that image data has been received, passing along the target language and the country information or language information.

The control unit 209 controls the server 20 as a whole and consists of, for example, a CPU implemented by a microcomputer, a memory that stores programs, and a memory that temporarily stores information needed for processing. When the reception unit 202 notifies it that image data has been received, the control unit 209 passes the country information or language information to the character recognition unit 203.

The character recognition unit 203 reads the image data from the memory 208, extracts the text regions, and performs character recognition on the extracted characters. For each character, it selects as the recognized character the most probable one among the characters registered in the language dictionary 206, which is a multilingual dictionary. The character recognition unit 203 transfers the recognition result to the translation unit 205. Using the translation dictionary 207, the translation unit 205 translates the text into the target language indicated by the control unit 209, stores the translation result in the memory 208, and notifies the control unit 209 when translation is complete.

On receiving the translation-complete notification, the control unit 209 instructs the transmission unit 201 to send the translation result to the mobile terminal 10. The transmission unit 201, which sends data via wireless communication or the Internet, reads the translation result from the memory 208 and sends it to the mobile terminal 10.

When country information is sent together with the image data, the probability model generation unit 204 weights the languages registered in the language dictionary 206, based on the country information passed from the control unit 209, using weights calculated from the proportions in which languages are used in the indicated country. For example, if the country information indicates Japan, the language mainly used in that country receives a large weight: Japanese 0.8, English 0.1, and other languages 0.1. The probability model generation unit 204 passes this language probability model (Japanese 0.8, English 0.1, other languages 0.1) to the character recognition unit 203.
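One way to picture the language probability model is as a lookup from country information to per-language weights, sketched here in Python. The table `LANGUAGE_SHARE`, the country and language codes, the Swiss figures, and the fallback for unknown countries are all assumptions for illustration; only the Japan example (0.8 / 0.1 / 0.1) comes from the text.

```python
# Hypothetical table of language usage shares per country.
# Only the "JP" row mirrors the example in the text.
LANGUAGE_SHARE = {
    "JP": {"ja": 0.8, "en": 0.1},
    "CH": {"de": 0.6, "fr": 0.2, "it": 0.1},  # placeholder values
}


def language_model(country, share_table=LANGUAGE_SHARE):
    """Return language weights for a country, with the remainder under 'other'."""
    shares = share_table.get(country)
    if shares is None:
        # Assumption: with no data for the country, no language is preferred.
        return {"other": 1.0}
    model = dict(shares)
    # Whatever share is left over is kept as a small weight for all other
    # languages, so rarely used scripts are never ruled out completely.
    model["other"] = round(1.0 - sum(shares.values()), 10)
    return model
```

For `"JP"` this yields the weights of the worked example: Japanese 0.8, English 0.1, other languages 0.1.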

Given this language probability model (Japanese 0.8, English 0.1, other languages 0.1), when the character recognition unit 203 selects the most probable character among those registered in the language dictionary 206, it uses as each candidate's probability the candidate's own probability multiplied by the weight of the language to which the candidate belongs.

For example, suppose the candidates for a character are the Latin letter "u" and the hiragana 「い」. If the unweighted probabilities are 0.1 for 「い」 and 0.9 for "u", multiplying by the language weights (Japanese 0.8 and English 0.1, respectively) gives 0.08 for 「い」 and 0.09 for "u"; "u" has the larger value, so the character is recognized as "u". If the unweighted probabilities are 0.5 for 「い」 and 0.5 for "u", the weighted values are 0.4 for 「い」 and 0.05 for "u"; 「い」 has the larger value, so the character is recognized as 「い」.
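The arithmetic of this example can be reproduced with a short Python sketch. The function name `pick_character` and the tuple layout of the candidates are assumptions; the weights and probabilities are the ones given in the text.

```python
def pick_character(candidates, language_weights):
    """Pick the candidate with the highest language-weighted probability.

    candidates: list of (character, language, unweighted_probability).
    language_weights: dict of language -> weight, with an 'other' fallback.
    """
    def score(candidate):
        _, language, probability = candidate
        weight = language_weights.get(language, language_weights.get("other", 0.0))
        return probability * weight

    best = max(candidates, key=score)
    return best[0]


weights_for_japan = {"ja": 0.8, "en": 0.1, "other": 0.1}

# First case: 0.1 * 0.8 = 0.08 for "い" versus 0.9 * 0.1 = 0.09 for "u".
first = pick_character([("い", "ja", 0.1), ("u", "en", 0.9)], weights_for_japan)

# Second case: 0.5 * 0.8 = 0.4 for "い" versus 0.5 * 0.1 = 0.05 for "u".
second = pick_character([("い", "ja", 0.5), ("u", "en", 0.5)], weights_for_japan)
```

Here `first` is `"u"` and `second` is `"い"`, matching the two outcomes described above.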

The larger a language's weight, therefore, the more likely a character is to be recognized as belonging to that language. Because other languages that are not mainly used also receive a small but nonzero weight, characters rarely used in that country can still be recognized. Countries with several official languages can also be handled, because each official language is weighted according to the proportion in which it is used.

When language information is sent with the image data instead of country information, the character recognition unit 203 selects the recognized character only from the characters in the language dictionary 206 that belong to the language indicated by the language information. Characters of languages that were not specified are excluded from recognition, which reduces the chance of misrecognition.
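Restricting recognition to the user-specified language amounts to filtering the candidate list before choosing the best match, as in this Python sketch. The function name, the candidate tuple layout, and the choice to return `None` when no candidate belongs to the specified language are assumptions.

```python
def pick_character_in_language(candidates, language):
    """Pick the most probable candidate among those of the specified language.

    candidates: list of (character, language, unweighted_probability).
    Returns None if no candidate belongs to the specified language.
    """
    # Candidates from languages the user did not specify are dropped entirely.
    allowed = [c for c in candidates if c[1] == language]
    if not allowed:
        return None
    return max(allowed, key=lambda c: c[2])[0]
```

With candidates `("い", "ja", 0.1)` and `("u", "en", 0.9)` and the language `"ja"`, the result is `"い"` even though "u" has the higher unweighted probability.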

When the character string to be recognized consists of several words, the server 20 uses the language dictionary 206 to infer, from the immediately preceding word or the word before it, candidates for the next word, that is, the word containing the character to be recognized. This improves the accuracy of character recognition.

The probability model generation unit 204 splits the recognition-result character string transferred from the character recognition unit 203 into words, extracts candidates for the word expected to follow the word immediately before the character to be recognized or the word before that, gives large weights to candidates that are likely to appear, generates a probability model of the next word, and passes it to the character recognition unit 203. When the character recognition unit 203 selects the most probable character among those registered in the language dictionary 206, it uses as each candidate's probability the candidate's own probability multiplied by the weight of the word to which the candidate belongs.
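A next-word probability model of this kind could be derived from word-pair frequencies, as in the following Python sketch. The source does not say how the weights are computed, so the bigram count table, the function name `next_word_model`, and the parameters `top_n` and `other_mass` (the weight reserved for all words outside the top candidates) are assumptions chosen only for illustration.

```python
from collections import Counter


def next_word_model(previous_word, bigram_counts, top_n=2, other_mass=0.6):
    """Build weights for the words most likely to follow `previous_word`.

    bigram_counts: dict of word -> dict of following word -> count (hypothetical data).
    The top_n followers share (1 - other_mass) in proportion to their counts;
    every other word falls under the 'other' weight.
    """
    followers = Counter(bigram_counts.get(previous_word, {}))
    top = followers.most_common(top_n)
    total = sum(count for _, count in top)
    if total == 0:
        # No statistics for this word: treat every next word alike.
        return {"other": 1.0}
    model = {word: (1.0 - other_mass) * count / total for word, count in top}
    model["other"] = other_mass
    return model
```

With hypothetical counts in which 「は」 and 「が」 follow 「私」 equally often, the defaults give each of them a weight of 0.2 and leave 0.6 for all other words.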

For example, suppose that the immediately preceding word is the Japanese word "私" ("I") and that the candidates appearing frequently as the word containing the character to be recognized are "は" and "が". A probability model is then generated that weights the words likely to appear next: a weight of 0.2 for "は", 0.2 for "が", and 0.6 for all others. Now consider "は" and "に" as candidates for the recognized character. If the probabilities before weighting are 0.8 for "は" and 0.2 for "に", multiplying by the weights (0.2 for "は" and 0.6 for others) gives 0.16 for "は" and 0.12 for "に"; the value for "は" is larger, so the character is recognized as "は". If the probabilities before weighting are 0.5 for "は" and 0.5 for "に", multiplying by the weights gives 0.1 for "は" and 0.3 for "に"; the value for "に" is larger, so the character is recognized as "に".

In this case, without weighting, only "は" would ever be recognized. Because the word candidates inferred from the preceding words are also weighted and treated as candidates, the character can in some cases be recognized as "に".
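The arithmetic of this example can be reproduced with a short sketch. The function name `rescore` and the data layout are assumptions for illustration; only the weights (0.2, 0.2, 0.6) and the raw probabilities come from the example above.

```python
def rescore(raw_probs, word_weights, other_weight):
    """Multiply each candidate's raw probability by its word weight.

    Candidates that are not listed in `word_weights` receive `other_weight`.
    Returns the best candidate and the full table of weighted scores.
    """
    scores = {
        ch: p * word_weights.get(ch, other_weight)
        for ch, p in raw_probs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Word probability model for the word following "私", as in the example.
word_weights = {"は": 0.2, "が": 0.2}
OTHER_WEIGHT = 0.6

# Case 1: raw probabilities 0.8 / 0.2.
best1, scores1 = rescore({"は": 0.8, "に": 0.2}, word_weights, OTHER_WEIGHT)
# Case 2: raw probabilities 0.5 / 0.5.
best2, scores2 = rescore({"は": 0.5, "に": 0.5}, word_weights, OTHER_WEIGHT)

for best, scores in ((best1, scores1), (best2, scores2)):
    table = ", ".join(f"{ch}={v:.2f}" for ch, v in scores.items())
    print(f"{table} -> {best}")
```

The first case yields 0.16 against 0.12 and selects "は"; the second yields 0.10 against 0.30 and selects "に", matching the figures in the text.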

A data processing system according to one embodiment of the present invention can be realized by combining the mobile terminal 10 and the server 20 described above. The number of mobile terminals 10 is not limited to one; there may be a plurality of them.

FIG. 3 is a flowchart showing the procedure for character recognition and translation of image data in the data processing system according to one embodiment of the present invention. The processing of the mobile terminal 10 is shown on the left and the processing of the server 20 on the right. Processing starts when the user, wishing to translate characters contained in captured image data, begins operating the mobile terminal 10.

In step S1, the mobile terminal 10 performs imaging processing in which it photographs a subject according to the user's instruction and captures the result as image data. In step S2, a translation request from the user is accepted. In step S3, the image is compressed in order to reduce the amount of data transmitted to the server 20; color reduction may be applied before compression, to an extent that does not hinder character recognition.
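The idea of reducing colors before compressing can be sketched as below. The text does not specify a color-reduction method or a compression format, so the bit-masking quantizer, the use of `zlib`, the function names, and the synthetic pixel data are all assumptions chosen only to make the example self-contained.

```python
import zlib


def reduce_colors(pixels: bytes, bits: int = 2) -> bytes:
    """Keep only the top `bits` bits of each 8-bit channel value.

    Fewer distinct values generally make the data more compressible.
    The right `bits` setting depends on how much detail recognition needs.
    """
    mask = (0xFF << (8 - bits)) & 0xFF
    return bytes(value & mask for value in pixels)


def prepare_for_upload(pixels: bytes, bits: int = 2) -> bytes:
    """Color-reduce the raw pixel data, then compress it (cf. step S3)."""
    return zlib.compress(reduce_colors(pixels, bits))


# Synthetic 8-bit "image" data standing in for a captured photograph.
pixels = bytes((i * 7 + (i % 5)) % 256 for i in range(4096))

plain_size = len(zlib.compress(pixels))
reduced_size = len(prepare_for_upload(pixels))
print("compressed without color reduction:", plain_size, "bytes")
print("compressed after color reduction:  ", reduced_size, "bytes")
```

In practice the number of retained bits would have to be tuned so that character strokes remain distinguishable from the background.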

In step S4, the captured image data is transmitted to the server 20 together with country information, obtained from base station information or the like, indicating the country in which the mobile terminal is located. In step S5, the server 20 receives the transmitted image data and country information.

In step S6, a language probability model is generated in which the languages used for character recognition are weighted based on the transmitted country information. For example, when the country information indicates Japan, the model assigns a weight of 0.8 to Japanese, 0.1 to English, and 0.1 to other languages. In step S7, it is checked whether there is more than one word to be recognized. If there is, the process proceeds to step S8; otherwise, it proceeds to step S9.
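A language probability model of this kind could be represented as a simple lookup table. In the sketch below only the weights for Japan (0.8, 0.1, 0.1) come from the text; the table layout, the fallback for unknown countries, the function names, and the candidate scores are assumptions.

```python
# Language probability models keyed by country.
# Only the entry for Japan is taken from the example in the text.
LANGUAGE_MODELS = {
    "Japan": {"Japanese": 0.8, "English": 0.1, "other": 0.1},
}


def build_language_model(country):
    """Return language weights for `country` (cf. step S6).

    Assumption: an unknown country yields a neutral model that weights
    every language equally.
    """
    return LANGUAGE_MODELS.get(country, {"other": 1.0})


def language_weight(model, language):
    """Look up the weight of `language`, falling back to the 'other' weight."""
    return model.get(language, model["other"])


model = build_language_model("Japan")

# Hypothetical candidates: (character, language, raw recognition probability).
candidates = [("し", "Japanese", 0.4), ("L", "English", 0.6)]
scores = {ch: p * language_weight(model, lang) for ch, lang, p in candidates}
best = max(scores, key=scores.get)
print(best)  # し
```

Although "L" has the higher raw score, the Japanese weight of 0.8 against the English weight of 0.1 tips the decision toward "し".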

In step S8, based on the language dictionary 206, candidates for the next word, that is, the word containing the character to be recognized, are extracted from the immediately preceding word or the word before it. The extracted candidates are weighted, and a probability model of the word expected to come next is generated. For example, if the immediately preceding word is the Japanese word "私" ("I") and the candidates that frequently appear as the word containing the character to be recognized are "は" and "が", a word probability model is generated that weights the words likely to appear next: 0.2 for "は", 0.2 for "が", and 0.6 for all others.

In step S9, character recognition is performed using the weights of both the language probability model and the word probability model or, if no word probability model exists, using the weights of the language probability model alone. In step S10, translation is performed based on the recognition result. In step S11, the translation result is transmitted to the mobile terminal 10.
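The fallback in step S9 can be sketched as follows. The text does not state how the two models are combined, so multiplying the raw probability by both weights is an assumption, as are the function name `recognize`, the data layout, and the raw probabilities.

```python
def recognize(candidates, language_model, word_model=None):
    """Pick the candidate with the highest weighted score (cf. step S9).

    `candidates` is a list of (character, language, raw_probability).
    The language weight is always applied; the word weight is applied only
    when a word probability model is available.
    Assumption: the weights are combined by multiplication.
    """
    best_char, best_score = None, -1.0
    for ch, lang, p in candidates:
        score = p * language_model.get(lang, language_model["other"])
        if word_model is not None:
            score *= word_model.get(ch, word_model["other"])
        if score > best_score:
            best_char, best_score = ch, score
    return best_char


language_model = {"Japanese": 0.8, "English": 0.1, "other": 0.1}
word_model = {"は": 0.2, "が": 0.2, "other": 0.6}
candidates = [("は", "Japanese", 0.55), ("に", "Japanese", 0.45)]

# Single word: only the language probability model is available.
print(recognize(candidates, language_model))              # は
# Multiple words: the word probability model is applied as well.
print(recognize(candidates, language_model, word_model))  # に
```

Both candidates share the same language weight here, so the word probability model alone decides between them when it is present.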

In step S12, the mobile terminal 10 receives the translation result transmitted from the server 20. In step S13, the received translation result is displayed as text data on a display unit such as a liquid crystal display.

FIG. 4 is a flowchart showing the procedure for character recognition and translation of image data in a data processing system according to another embodiment of the present invention. In this procedure, when the user knows the language of the photographed characters, the user specifies that language instead of country information being used, and character recognition is performed using the specified language. As in FIG. 3, processing is started by a user operation.

In step S20, the mobile terminal 10 performs imaging processing in which it photographs a subject according to the user's instruction and captures the result as image data. In step S21, a translation request from the user is accepted. In step S22, the language name entered by the user via the key input unit 105 of the mobile terminal 10 is acquired.

In step S23, the image is compressed in order to reduce the amount of data transmitted to the server 20; color reduction may be applied before compression, to an extent that does not hinder character recognition. In step S24, the captured image data is transmitted to the server 20 together with language information indicating the language name entered by the user. In step S25, the server 20 receives the image data and language information transmitted from the mobile terminal 10.

In step S26, based on the transmitted language information, the language dictionary used for character recognition is restricted to the dictionary corresponding to that language information. In step S27, it is checked whether there is more than one word to be recognized. If there is, the process proceeds to step S28; otherwise, it proceeds to step S29.

In step S28, based on the language dictionary 206, candidates for the next word, that is, the word containing the character to be recognized, are extracted from the immediately preceding word or the word before it. The extracted candidates are weighted, and a probability model of the word expected to come next is generated. For example, if the immediately preceding word is the Japanese word "私" ("I") and the candidates that frequently appear as the word containing the character to be recognized are "は" and "が", a word probability model is generated that weights the words likely to appear next: 0.2 for "は", 0.2 for "が", and 0.6 for all others.

In step S29, character recognition is performed using the language dictionary corresponding to the transmitted language information together with the word probability model or, if no word probability model exists, using only that language dictionary. In step S30, translation is performed based on the recognition result. In step S31, the translation result is transmitted to the mobile terminal 10.

In step S32, the mobile terminal 10 receives the translation result transmitted from the server 20. In step S33, the received translation result is displayed as text data on a display unit such as a liquid crystal display.

A block diagram showing the configuration of the mobile terminal 10, which is a communication device according to one embodiment of the present invention.
A block diagram showing the configuration of the server 20 according to one embodiment of the present invention.
A flowchart showing the procedure for character recognition and translation of image data in the data processing system according to one embodiment of the present invention.
A flowchart showing the procedure for character recognition and translation of image data in the data processing system according to another embodiment of the present invention.

Explanation of symbols

10 Mobile terminal
20 Server
101, 201 Transmission unit
102, 202 Reception unit
103 Imaging unit
104 Display unit
105 Key input unit
106 Image processing unit
107 Display driver
108, 208 Memory
109, 209 Control unit
203 Character recognition unit
204 Probability model generation unit
205 Translation unit
206 Language dictionary
207 Translation dictionary

Claims

1. A data processing system comprising a server having a character recognition function and a translation function, and a communication device having a function of transferring image data captured for character recognition and translation to the server and displaying a translation result transferred from the server, wherein:
the communication device transfers country information indicating the country in which the communication device is located to the server together with the image data, and
the server assigns weights to the languages used for character recognition based on the transferred country information and performs character recognition of the image data according to the weights assigned to the languages.

2. A data processing system comprising a server having a character recognition function and a translation function, and a communication device having a function of transferring image data captured for character recognition and translation to the server and displaying a translation result transferred from the server, wherein:
the communication device transfers language information indicating a language designated by a user to the server together with the image data, and
the server performs character recognition of the image data using the language indicated by the transferred language information.

3. The data processing system according to claim 1, wherein, when performing character recognition, the server extracts candidates for the word containing the character to be recognized, based on the word that consists of already-recognized characters and immediately precedes the character to be recognized, or on the word before that word, assigns weights to the extracted candidates, and performs character recognition according to the weights assigned to the candidate words.

4. The data processing system according to claim 1, wherein the communication device transfers to the server image data obtained by performing color reduction processing on the captured image data.

5. A server having a character recognition function and a translation function, wherein the server receives, from a communication device, image data and country information indicating the country in which the communication device is located, assigns weights to the languages used for character recognition based on the received country information, and performs character recognition of the image data according to the weights assigned to the languages.

6. A server having a character recognition function and a translation function, wherein the server receives, from a communication device, image data and language information indicating a language designated by a user, and performs character recognition of the image data using the language indicated by the received language information.

7. The server according to claim 5, wherein, when performing character recognition, the server extracts candidates for the word containing the character to be recognized, based on the word that consists of already-recognized characters and immediately precedes the character to be recognized, or on the word before that word, assigns weights to the extracted candidates, and performs character recognition according to the weights assigned to the candidate words.

8. A communication device having a function of transferring image data captured for character recognition and translation to a server and displaying a translation result transferred from the server, wherein country information indicating the country in which the communication device is located is transferred together with the image data.

9. A communication device having a function of transferring image data captured for character recognition and translation to a server and displaying a translation result transferred from the server, wherein language information indicating a language designated by a user is transferred together with the image data.

10. The communication device according to claim 8, wherein the communication device transfers image data obtained by performing color reduction processing on the captured image data.