JP2021197165A

JP2021197165A - Information processing apparatus, information processing method and computer readable storage medium

Info

Publication number: JP2021197165A
Application number: JP2021082555A
Authority: JP
Inventors: ジャン・イン; Ying Zhang; 留安汪; Liu An Wang; 俊孫; Shun Son
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2020-06-16
Filing date: 2021-05-14
Publication date: 2021-12-27
Also published as: CN113807377A

Abstract

To provide an information processing apparatus, an information processing method, and a computer-readable storage medium. SOLUTION: An information processing apparatus comprises: a probability vector acquisition unit that acquires an M-dimensional probability vector for each of N segments obtained by dividing a target to be classified; a candidate class selection unit that selects, as candidate classes of each segment, the classes corresponding to the K largest elements among the elements other than the H-th element of that segment's M-dimensional probability vector; a path vector generation unit that generates path vectors on the basis of the candidate classes of each segment and calculates a score for each path vector on the basis of the probabilities corresponding to the elements included in the path vector and the degree of association between adjacent elements; and a classification result acquisition unit that acquires the path vector having the highest score as the classification result of the target to be classified. The degree of association between adjacent elements is calculated on the basis of semantic information between the adjacent elements and a variable weight related to the distance between the segments corresponding to the adjacent elements. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to the field of information processing, and specifically to an information processing apparatus, an information processing method, and a computer-readable storage medium.

Classification techniques are widely applied in, for example, image recognition, character recognition, and speech recognition.

The following is a brief summary of the present disclosure, intended to give a basic understanding of some of its aspects. This summary is not an exhaustive overview of the present disclosure. It is not intended to identify key or critical parts of the present disclosure, nor to limit its scope. Its sole purpose is to present some concepts in simplified form as a prelude to the more detailed description given later.

An object of the present disclosure is to provide an improved information processing apparatus, information processing method, and computer-readable storage medium.

In one aspect of the present disclosure, an information processing apparatus is provided that includes: a probability vector acquisition unit that acquires an M-dimensional probability vector for each of N segments obtained by dividing an object to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1; a candidate class selection unit that, for each of the N segments, selects as candidate classes of the segment the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector, where H and K are natural numbers with 1 ≤ H ≤ M and 1 ≤ K ≤ M − 1, and the H-th class corresponding to the H-th element is a class that carries no semantic information; a path vector generation unit that generates path vectors based on the candidate classes of each of the N segments and, for each generated path vector, calculates a score of the path vector based on the probabilities corresponding to the elements included in the path vector and the degree of association between adjacent elements; and a classification result acquisition unit that acquires the path vector with the highest score as the classification result of the object to be classified. The degree of association between adjacent elements is calculated based on semantic information between the adjacent elements in the path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements.

In another aspect of the present disclosure, an information processing method is provided that includes: a probability vector acquisition step of acquiring an M-dimensional probability vector for each of N segments obtained by dividing an object to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1; a candidate class selection step of selecting, for each of the N segments, as candidate classes of the segment the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector, where H and K are natural numbers with 1 ≤ H ≤ M and 1 ≤ K ≤ M − 1, and the H-th class corresponding to the H-th element is a class that carries no semantic information; a path vector generation step of generating path vectors based on the candidate classes of each of the N segments and, for each generated path vector, calculating a score of the path vector based on the probabilities corresponding to the elements included in the path vector and the degree of association between adjacent elements; and a classification result acquisition step of acquiring the path vector with the highest score as the classification result of the object to be classified. The degree of association between adjacent elements is calculated based on semantic information between the adjacent elements in the path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements.

In other aspects of the present disclosure, there are further provided computer program code and a computer program product for implementing the above method of the present disclosure, as well as a computer-readable storage medium on which computer program code for implementing the above method is recorded.

Other aspects of the embodiments of the present disclosure are described below, and preferred embodiments in particular are described in detail; however, the present disclosure is not limited to these embodiments.

To aid understanding of the principles and advantages of the present disclosure, embodiments of the present disclosure are described with reference to the drawings. In all drawings, the same or similar components are indicated by the same or similar reference numerals. The drawings described here illustrate preferred embodiments only; they do not show all possible embodiments and are not intended to limit the scope of the present disclosure.

A block diagram showing an example of the functional configuration of the information processing apparatus according to an embodiment of the present disclosure.
FIGS. 2A and 2B are diagrams showing an example of an object to be classified and an example of a classification result, respectively.
A diagram showing an example of three scenarios considered when updating a path vector in the case of character recognition.
A flowchart showing an example of the flow of processing that the path vector generation unit of the information processing apparatus according to an embodiment of the present disclosure executes in the i-th (i ≥ 2) round of processing.
A flowchart showing an example of the flow of the information processing method 400 according to an embodiment of the present disclosure.
A block diagram showing an exemplary configuration of a personal computer applicable to embodiments of the present disclosure.

Exemplary embodiments of the present disclosure are described in detail below with reference to the drawings. For convenience of explanation, the specification does not show every feature of an actual implementation. In an actual implementation, a specific embodiment may be modified to achieve the developer's particular goals, for example to satisfy system-related and business-related constraints. Moreover, although such development work may be very complex and time-consuming, it is merely routine work for those skilled in the art having the benefit of this disclosure.

To keep the present disclosure clear, the drawings show only the apparatus components and processing steps closely related to the present disclosure, and omit details unrelated to it.

Embodiments of the present disclosure are described in detail below with reference to the drawings.

First, an example of the functional configuration of the information processing apparatus according to an embodiment of the present disclosure is described with reference to FIG. 1. FIG. 1 is a block diagram showing an example of the functional configuration of the information processing apparatus according to an embodiment of the present disclosure. As shown in FIG. 1, the information processing apparatus 100 according to the embodiment may include a probability vector acquisition unit 102, a candidate class selection unit 104, a path vector generation unit 106, and a classification result acquisition unit 108.

The probability vector acquisition unit 102 acquires an M-dimensional probability vector for each of the N segments obtained by dividing the object to be classified. M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1. For example, the probability vector acquisition unit 102 may acquire the M-dimensional probability vector of each segment using a neural network such as a pre-trained convolutional recurrent neural network (CRNN) (see, for example, Shi B, Bai X, Yao C. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(11): 2298-2304). The method of acquiring the M-dimensional probability vector of each segment is not limited to this; those skilled in the art may adopt other methods according to actual needs, and their description is omitted here.

As an example, the object to be classified may be an image containing text. In this case, the first to M-th classes may each represent a different character (for example, a Chinese character, a letter of the English alphabet, or a symbol), and the H-th class, which corresponds to the H-th element, may represent a blank character (spacing character). The object to be classified is not limited to this. For example, the object to be classified may be speech. In this case, the classes other than the H-th class among the first to M-th classes may each represent different text (for example, a Chinese character or an English word), and the H-th class may represent a blank (corresponding to a pause in the speech).

For example, N may be related to the size of the object to be classified, although it is not limited to this. For example, when the object to be classified is an image containing text, the larger the image, the larger N. Likewise, when the object to be classified is speech, the longer the duration of the speech, the larger N.

For each of the N segments, the candidate class selection unit 104 may select, as candidate classes of the segment, the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector. Here, H and K are natural numbers with 1 ≤ H ≤ M and 1 ≤ K ≤ M − 1, and the H-th class corresponding to the H-th element may be a class that carries no semantic information. For example, the candidate class selection unit 104 may sort the elements other than the H-th element of each segment's M-dimensional probability vector in descending order and select the classes corresponding to the top K elements as the candidate classes of that segment. Those skilled in the art may set the value of K according to actual needs.
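The selection described above can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function name `select_candidate_classes` is hypothetical, and class indices are 0-based here whereas the text numbers classes from 1.

```python
from typing import List, Sequence


def select_candidate_classes(prob: Sequence[float], h: int, k: int) -> List[int]:
    """Return the 0-based indices of the K largest elements of `prob`,
    ignoring the element at index `h` (the class without semantic information)."""
    m = len(prob)
    if not 0 <= h < m:
        raise ValueError("h must index an element of prob")
    if not 1 <= k <= m - 1:
        raise ValueError("k must satisfy 1 <= k <= M - 1")
    # Drop the H-th element, then sort the remaining classes by probability,
    # largest first. The sort is stable, so ties keep the lower index first.
    others = [c for c in range(m) if c != h]
    others.sort(key=lambda c: prob[c], reverse=True)
    return others[:k]


# Example: index 0 plays the role of the blank class and is never a candidate,
# even though it has the largest probability.
candidates = select_candidate_classes([0.5, 0.2, 0.05, 0.25], h=0, k=2)
```

With these inputs, `candidates` holds indices 3 and 1, in that order.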

The path vector generation unit 106 may generate path vectors based on the candidate classes of each of the N segments and, for each generated path vector, calculate a score of the path vector based on the probabilities corresponding to the elements included in the path vector and the degree of association between adjacent elements. Here, the degree of association between adjacent elements may be calculated based on semantic information between the adjacent elements in the path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements.

The classification result acquisition unit 108 may acquire the path vector with the highest score among the path vectors as the classification result of the object to be classified.

Classification techniques are widely applied in, for example, image recognition, character recognition, and speech recognition. In character recognition, the core technique is to extract features from, for example, an image containing text and to recognize characters from them. Because the extracted features usually contain no semantic information, techniques have been proposed that introduce semantic information into the decoding process, such as long short-term memory (LSTM) networks (see, for example, Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780) and n-gram models (see, for example, Brown P F, Desouza P V, Mercer R L, et al. Class-based n-gram models of natural language [J]. Computational Linguistics, 1992, 18(4): 467-479). In the prior art, the semantic information between two specific adjacent non-blank characters, words, or phrases is constant regardless of the actual distance between them. Here, "adjacent" means that there is no character between the two specific non-blank characters, words, or phrases, or that there is no character other than blanks between them.

As described above, the information processing apparatus 100 according to the embodiment of the present disclosure calculates the degree of association between adjacent elements based on semantic information between the adjacent elements in a path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements. When the object to be classified is an image containing text, the distance between the segments corresponding to adjacent elements corresponds to the actual distance between those elements. Using the above calculation is therefore equivalent to applying, to the semantic information between adjacent elements, a variable weight related to their actual distance, so the text can be recognized with the actual distance between adjacent elements taken into account.

In one embodiment of the present disclosure, the semantic information between adjacent elements is represented by a value calculated using a pre-trained n-gram model. Other methods may also be used to acquire or represent the semantic information between adjacent elements. For example, a recurrent neural network (RNN) such as an LSTM may be used to acquire the semantic information between adjacent elements.

In one embodiment of the present disclosure, the variable weight is set to 1 when the distance between the segments corresponding to adjacent elements is less than or equal to a predetermined threshold, and to a value smaller than 1 when that distance is greater than the predetermined threshold, in which case the variable weight decreases as the distance increases. For example, the distance between the segments corresponding to adjacent elements may be expressed as the difference between the corresponding segment numbers. For example, if two adjacent elements are the characters "B" and "C", and the segment numbers corresponding to "B" and "C" are TS_"B" = 2 and TS_"C" = 4, respectively, the distance between the corresponding segments may be expressed as TS_"C" − TS_"B" = 2.

For example, when the distance between the segments corresponding to adjacent elements is greater than the predetermined threshold, the variable weight may be set to the reciprocal of the difference between that distance and the predetermined threshold, although it is not limited to this. For example, the variable weight, denoted weight, may be calculated according to equation (1).

In equation (1), D represents the distance between the segments corresponding to the adjacent elements, and D_t represents the predetermined threshold. Those skilled in the art may determine the predetermined threshold D_t according to actual needs. For example, D_t may be determined based on the number of segments occupied by one blank character.
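Equation (1) itself is not reproduced in this text. Going only by the surrounding description (a weight of 1 at or below the threshold, and the reciprocal of the difference between the distance and the threshold above it), one form consistent with the prose would be the following. This is a reconstruction under that assumption, not the published formula.

```latex
\mathrm{weight} =
\begin{cases}
1, & D \le D_t \\[4pt]
\dfrac{1}{D - D_t}, & D > D_t
\end{cases}
```

Note that with integer segment distances this form gives exactly 1 at D = D_t + 1, so the weight is strictly below 1 only for D − D_t > 1; the published equation may handle that boundary differently.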

As described above, in the prior art the semantic information between two specific adjacent non-blank characters, words, or phrases is constant regardless of the actual distance between them. However, text to be recognized often contains separators (for example, blanks), and the semantic relationship between the text before and after a separator is weak. For example, as shown in FIG. 2A, the contextual semantic relationship between the phrase "1-1" and the following phrase "（真柄建設" is relatively weak. When the prior art is used to recognize the text shown in FIG. 2A, however, a fixed piece of semantic information is introduced between the phrase "1-1" and the phrase "（真柄建設", and that semantic information is independent of whether there is a blank between the two phrases and of the size of the blank (that is, the number of blank characters).

The information processing apparatus according to the embodiment of the present disclosure may calculate the degree of association between adjacent elements based on semantic information between the adjacent elements in a path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements. When the object to be classified is an image containing text, the distance between the segments corresponding to adjacent elements corresponds to the actual distance between those elements, and a separator (for example, a blank) between adjacent elements increases that actual distance. In other words, the information processing apparatus according to the embodiment may apply, to the semantic information between adjacent elements, a variable weight related to the separator (for example, a blank) between them. For example, in the text shown in FIG. 2A, when the distance between the segments corresponding to the adjacent characters "1" and "（" is greater than the predetermined threshold, a variable weight calculated according to equation (1) may be applied to the semantic information between the adjacent characters "1" and "（".

When the distance between the segments corresponding to adjacent elements is greater than the predetermined threshold, the variable weight is set to a value smaller than 1 and decreases as the distance increases. As a result, when text is recognized and there is a separator (for example, a blank) between adjacent elements, a degree of association that decreases as the separator grows is applied between those elements, which can improve character recognition performance (for example, recognition accuracy).
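The distance-dependent weighting can be illustrated with a small sketch. Several things here are assumptions rather than statements from the text: the names `variable_weight` and `degree_of_association` are hypothetical, the weight uses the reciprocal form described above, the semantic value is looked up in a toy table standing in for a pre-trained n-gram model, and the weight is combined with the semantic value by multiplication.

```python
import math
from typing import Dict, Tuple


def variable_weight(distance: int, threshold: int) -> float:
    """Weight 1 up to the threshold, then the reciprocal of the excess distance."""
    if distance <= threshold:
        return 1.0
    return 1.0 / (distance - threshold)


def degree_of_association(
    prev_cls: str,
    next_cls: str,
    prev_seg: int,
    next_seg: int,
    threshold: int,
    semantic_value: Dict[Tuple[str, str], float],
) -> float:
    """Scale the semantic value of an adjacent pair by the variable weight.

    The segment distance is the difference between the segment numbers.
    Multiplying weight and semantic value is an assumption of this sketch.
    """
    distance = next_seg - prev_seg
    return variable_weight(distance, threshold) * semantic_value[(prev_cls, next_cls)]


# Toy semantic table (hypothetical values, e.g. bigram log-probabilities).
table = {("B", "C"): -1.2}

# "B" at segment 2 and "C" at segment 4: distance 2, within the threshold.
near = degree_of_association("B", "C", 2, 4, threshold=2, semantic_value=table)
# "B" at segment 2 and "C" at segment 6 with threshold 1: weight 1 / (4 - 1).
far = degree_of_association("B", "C", 2, 6, threshold=1, semantic_value=table)
```

In this sketch the semantic contribution of the pair shrinks toward zero as the gap between the two segments widens, which mirrors the intent that text separated by blanks should be tied together less strongly.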

In one embodiment of the present disclosure, the information processing apparatus 100 may further include a preprocessing unit 110. The preprocessing unit 110 preprocesses the object to be classified. Here, the preprocessing may include at least one of noise removal, normalization, binarization, and tilt correction. Because noise removal, normalization, binarization, and tilt correction are well known in the art, their details are omitted.

In one embodiment of the present disclosure, the path vector generation unit 106 may generate path vectors and obtain their scores through the following N rounds of processing.

In the first round of processing, the path vector generation unit 106 may generate K + 1 path vectors based on the candidate classes of the first of the N segments and the H-th class corresponding to the H-th element, and may generate a score for each path vector based on the probability corresponding to the element in that path vector.

For example, as shown in FIG. 4, in the i-th (i ≥ 2) round of processing, the path vector generation unit 106 performs the following steps. It selects the L path vectors with the highest scores as candidate path vectors (step S4102), where L is a natural number greater than 1. Among two or more identical candidate path vectors, it excludes all but the one with the highest score (step S4104). It updates the scores of the candidate path vectors remaining after the exclusion based on at least the M-dimensional probability vector of the i-th segment (step S4106). Then, for each remaining candidate path vector, it appends each candidate class of the i-th segment to that candidate path vector to generate new path vectors, and calculates the score of each newly generated path vector based on the pre-update score of the remaining candidate path vector, the probability corresponding to the newly appended candidate class, and the degree of association between the newly appended candidate class and the remaining candidate path vector (step S4108). Those skilled in the art may set the value of L according to actual needs.

In the i-th (i ≥ 2) round of processing, if the candidate path vectors selected by the path vector generation unit 106 in step S4102 include two or more identical candidate path vectors, and more than one of those identical candidate path vectors has the highest score (that is, two or more of them share the highest score), any one of them may be kept in step S4104.
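Steps S4102 and S4104 can be sketched as below. This covers only the selection and de-duplication; the score update of step S4106 and the extension of step S4108 are left out because their exact formulas are not given in this passage. The function name `prune_paths` and the representation of a path vector as a tuple of class labels are assumptions of the sketch.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]


def prune_paths(scored: List[Tuple[Path, float]], beam_width: int) -> List[Tuple[Path, float]]:
    """Keep the `beam_width` best-scoring paths, then drop duplicate paths."""
    # Step S4102: select the L path vectors with the highest scores.
    top = sorted(scored, key=lambda item: item[1], reverse=True)[:beam_width]
    # Step S4104: among identical paths keep only the best one. Because `top`
    # is sorted by descending score, the first occurrence of a path is its
    # best-scoring copy; on a tie, whichever copy comes first is kept.
    best: Dict[Path, float] = {}
    for path, score in top:
        if path not in best:
            best[path] = score
    return list(best.items())


paths = [(("A",), -0.5), (("B",), -1.0), (("A",), -0.7), (("C",), -2.0)]
kept = prune_paths(paths, beam_width=3)
```

Here the three best entries are the two copies of the path ("A",) and the path ("B",); after de-duplication `kept` contains ("A",) with score -0.5 followed by ("B",) with score -1.0.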

The iterative processing performed by the path vector generation unit 106 is described below, taking as an example the case where the object to be classified is an image containing text. For convenience of explanation, assume that the text contains only uppercase English letters and the blank, that is, that there are only M = 27 classes: blank, A, B, C, D, E, F, G, ..., Z.

In the first round of processing, the route vector generation unit 106 generates K + 1 = 3 route vectors [A], [B] and [blank] based on the candidate classes of the first of the N segments (in this example, K = 2 and the candidate classes of the first segment are assumed to be A and B) and on the H-th class (that is, the first class, blank), and generates a score for each route vector based on the probability corresponding to its element. For example, the probabilities corresponding to the elements A, B and blank in the M-dimensional probability vector of the first segment may be used directly as the scores of the route vectors [A], [B] and [blank], respectively. In practical applications, the probability corresponding to the element of each route vector may be processed further; for example, the logarithm of each probability may be taken and the result used as the score of the corresponding route vector. Suppose, for example, that the M-dimensional probability vector of the first segment acquired by the probability vector acquisition unit 102 is P_1 = [p_{blank1}, p_A1, p_B1, p_C1, p_D1, p_E1, p_F1, p_G1, ..., p_Z1], where p_{blank1}, p_A1, p_B1, p_C1, p_D1, p_E1, p_F1, p_G1, ..., p_Z1 represent the probabilities that the first segment belongs to the classes blank, A, B, C, D, E, F, G, ..., Z, respectively. In this case, the scores of the route vectors [A], [B] and [blank] are log(p_A1), log(p_B1) and log(p_{blank1}), respectively. Those skilled in the art may also process the probability corresponding to each element by other methods as needed and use the result as the score of the corresponding route vector, provided that the score of the route vector increases as the probability of the corresponding element increases.
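The first round can be sketched as follows, using the log-probability variant described above. The function name `first_round_scores`, the dictionary layout of the probability vector, and the example probabilities are illustrative assumptions, not values from the disclosure.

```python
import math
from typing import Dict, Iterable, Tuple


def first_round_scores(probs: Dict[str, float],
                       candidate_classes: Iterable[str],
                       blank: str = "blank") -> Dict[Tuple[str, ...], float]:
    """Build K + 1 one-element route vectors and score each by log-probability."""
    # One route vector per candidate class, plus one for the H-th class (blank).
    # Assumes every probability used here is strictly positive.
    return {(c,): math.log(probs[c]) for c in list(candidate_classes) + [blank]}


# Hypothetical probabilities for the first segment (only a few classes shown).
p1 = {"blank": 0.10, "A": 0.60, "B": 0.25, "C": 0.05}
scores = first_round_scores(p1, ["A", "B"])
```

Because the logarithm is monotonically increasing, the score still grows with the probability of the corresponding element, as the text requires.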

In the second round of processing, the route vector generation unit 106 may select, from the current route vectors (that is, the route vectors generated in the first round of processing), the L route vectors with the highest scores (in this example, L = 2) as candidate route vectors (the candidate route vectors selected in the second round are assumed to be [A] and [blank]) (step S4102). In this example, the candidate route vectors selected in the second round differ from each other, so the exclusion processing of step S4104 (that is, excluding, from two or more identical candidate route vectors, all but the one with the highest score) is not performed. The route vector generation unit 106 may update the scores of the candidate route vectors [A] and [blank] based on at least the M-dimensional probability vector of the second segment (step S4106), append each candidate class of the second segment (assumed to be C and D) to each of the candidate route vectors [A] and [blank] to generate the new route vectors [A, C], [A, D], [blank, C] and [blank, D], and calculate the scores of the newly generated route vectors based on the pre-update scores of the candidate route vectors [A] and [blank], the probabilities corresponding to the newly added candidate classes C and D, and the degrees of association between the newly added candidate classes C and D and the candidate route vectors [A] and [blank] (step S4108).

For example, the route vector generation unit 106 may calculate the score Score("AC") of the newly generated route vector [A, C] according to the following equation (2).

In equation (2), Score("A") represents the pre-update score of the candidate route vector [A], p_C2 represents the probability corresponding to class C in the M-dimensional probability vector of the second segment acquired by the probability vector acquisition unit 102, and LM("C"/"A") * weight represents the degree of association between the newly added candidate class C and the candidate route vector [A]. Here, LM("C"/"A") represents the semantic information between "A" and "C", and weight represents the variable weight, which may be calculated according to equation (1) above.

The route vector generation unit 106 may calculate the scores of the newly generated route vectors [A, D], [blank, C] and [blank, D] in the same way. The semantic information between blank and each of the letters A, B, C, D, E, F, G, ..., Z may be represented by 0.
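Equation (2) itself is not reproduced in this text, so the sketch below is only one plausible reading of the three terms named above. It assumes that scores are kept in the log domain and that the terms are combined by addition; the function name `extended_score`, the additive form, and all numeric values are assumptions for illustration.

```python
import math


def extended_score(prefix_score: float, class_prob: float,
                   lm_value: float, weight: float) -> float:
    """Score of a route vector formed by appending one class to a prefix.

    ASSUMPTION: additive combination in the log domain. The patent's
    equation (2) may combine these terms differently.
    """
    association = lm_value * weight  # degree of association, LM(...) * weight
    return prefix_score + math.log(class_prob) + association


# Hypothetical values: Score("A") = log(0.6), p_C2 = 0.5, LM("C"/"A") = -1.2.
score_ac = extended_score(math.log(0.6), 0.5, lm_value=-1.2, weight=1.0)
# For a prefix consisting of blank, the semantic information is taken as 0.
score_blank_c = extended_score(math.log(0.1), 0.5, lm_value=0.0, weight=1.0)
```

Under this reading, setting the semantic information to 0 for blank simply removes the association term, so [blank, C] is scored from the prefix score and the class probability alone.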

In the third round of processing, the route vector generation unit 106 may select the L (L = 2) route vectors with the highest scores as candidate route vectors from the current route vectors, namely the candidate route vectors selected in the second round (or, if the exclusion processing was performed, the candidate route vectors remaining after the exclusion), that is, [A] and [blank], together with the route vectors newly generated in the second round, that is, [A, C], [A, D], [blank, C] and [blank, D], and may then perform the same processing as in the second round. The description is omitted here. Note that the exclusion processing must be performed when two or more identical candidate route vectors exist.

In subsequent rounds, the route vector generation unit 106 may perform the same processing as in the second and third rounds; the description is omitted here. Again, the exclusion processing must be performed when two or more identical candidate route vectors exist.

The N rounds of processing described above combine a beam search method (see, for example, Hannun A Y, Maas A L, Jurafsky D, et al. First-pass large vocabulary continuous speech recognition using bi-directional recurrent DNNs [J]. Computer Science, 2014) with a degree of association based on the semantic information between adjacent elements and a variable weight, which can improve classification accuracy. In addition, in the i-th round of processing (i ≥ 2), only the L route vectors with the highest scores are selected as candidate route vectors and only these are processed further, which reduces the amount of computation in the classification processing and saves computation time.

For convenience of explanation, a route vector containing only the blank is written as [blank]. In practice, however, a route vector normally contains no blank characters, so a route vector said to contain only the blank is actually an empty route vector, that is, a route vector containing no characters. Similarly, a route vector that is obtained from the route vector [blank] and whose first element is blank does not actually contain a blank character. For example, the route vector [blank, C] is actually [C].

For example, in character recognition the decoding rule requires adjacent duplicate characters to be merged and the blank character to be removed, so the following three scenarios may be considered when updating the score of a route vector. In the example shown in FIG. 3, for 「新世界」, the three scenarios can be summarized as follows. (1) Situation 1 (scenario 1): a character that duplicates the last character, that is, 「界」, appears. (2) Situation 2 (scenario 2): blank appears, giving 「新世界」 + blank. (3) Situation 3 (scenario 3): the route vector 「新世界」 exists and 「界」 appears. Because all three scenarios decode to 「新世界」, the scores of these route vectors must be summed and used to update the score of the route vector 「新世界」. In the example shown in FIG. 3, Situation 3 shows only one blank character between 「新世」 and 「界」, but in practice there may be one or more blank characters between them.
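The decoding rule referred to above (merge adjacent duplicates, then remove blanks) can be sketched in a few lines of Python. The function name `collapse` and the use of the string "blank" as the blank symbol are illustrative assumptions.

```python
from typing import List, Sequence


def collapse(path: Sequence[str], blank: str = "blank") -> List[str]:
    """Merge adjacent duplicate symbols, then drop the blank symbol."""
    decoded: List[str] = []
    previous = None
    for symbol in path:
        # A symbol is emitted only if it differs from the symbol immediately
        # before it and is not the blank.
        if symbol != previous and symbol != blank:
            decoded.append(symbol)
        previous = symbol
    return decoded


repeated_last = collapse(["新", "世", "界", "界"])      # last character repeated
trailing_blank = collapse(["新", "世", "界", "blank"])  # blank after the text
blank_between = collapse(["新", "世", "blank", "界"])   # blank before last character
```

All three per-segment sequences decode to the same text, which is why their scores are accumulated into a single route vector. A blank between two identical characters keeps them distinct: `collapse(["A", "blank", "A"])` yields `["A", "A"]`.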

As an example, in the i-th round of processing (i ≥ 2), the route vector generation unit 106 may perform the following processing when updating the score of each remaining candidate route vector.

When a remaining candidate route vector contains one element and that element is not the H-th class (in the character recognition example, the H-th class is blank), the route vector generation unit 106 may update the score of that remaining candidate route vector based on the probability corresponding to its first element in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment.

Further, when a remaining candidate route vector contains one element that is not the H-th class, and the remaining candidate route vectors in the i-th round include one that contains only the H-th class, the score of the former is updated based on the probability corresponding to its first element in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, and the score of the remaining candidate route vector that contains only the H-th class.

When a remaining candidate route vector contains only the H-th class, its score is updated based on the H-th element of the M-dimensional probability vector of the i-th segment.

For example, in the above example in which the classification target is an image containing text, the second round of processing has a remaining candidate route vector [blank] that contains only the H-th class (that is, blank). The route vector generation unit 106 may therefore update the score of the candidate route vector [A] based on the probability p_A2 corresponding to the first element A of the remaining candidate route vector [A] in the M-dimensional probability vector of the second segment (p_A2 is the element corresponding to class A in the M-dimensional probability vector of the second segment acquired by the probability vector acquisition unit 102), the H-th element of the M-dimensional probability vector of the second segment (that is, the first element p_{blank2}), and the score of the remaining candidate route vector that contains only the H-th class (that is, [blank]). For example, the score of the candidate route vector [A] may be updated according to the following equations (3) to (6).

In equation (5), LM("blank"/"A") represents the semantic information between "blank" and "A", and weight represents the variable weight, which may be calculated according to equation (1) above. As mentioned above, LM("blank"/"A") may be set to 0. In equation (6), Score'("A") represents the updated score of the candidate route vector [A].

The probability vector acquisition unit 102 may update the scores of other candidate route vectors containing one element in the same way as the score of the candidate route vector [A] is updated above. When a remaining candidate route vector contains one element that is not the H-th class and no remaining candidate route vector contains only the H-th class (that is, scenario 3 does not occur), scenario 3 need not be considered when updating the score of that remaining candidate route vector. For example, for the candidate route vector [A], if there is no remaining candidate route vector [blank], equation (6) above may be converted into the following equation (7).

When updating the score of the candidate route vector [blank], scenarios 1, 2 and 3 are in fact a single scenario, so only one scenario needs to be considered. For example, in the second round of processing, the score of the candidate route vector [blank] may be updated according to the following equation (8).

In equation (8), Score'("blank") represents the updated score of the candidate route vector [blank].

When a remaining candidate route vector contains m elements (m ≥ 2), the route vector generation unit 106 may update its score based on the probability corresponding to its m-th element in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment.

Further, when a remaining candidate route vector contains m elements (m ≥ 2), and the remaining candidate route vectors in the i-th round include one that contains m−1 elements whose first to (m−1)-th elements are respectively identical to the first to (m−1)-th elements of the m-element vector, the score of the m-element vector is updated based on the probability corresponding to its m-th element in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, the score of the remaining candidate route vector containing the m−1 elements, and the degree of association between the m-th element of the m-element vector and the remaining candidate route vector containing the m−1 elements.

For example, in the above specific example in which the classification target is an image containing text, assume that the remaining candidate route vectors in the fourth round of processing are [A, B] and [A, B, C]. For the remaining candidate route vector [A, B, C] (for which m = 3), there is a remaining candidate route vector [A, B] containing m−1 = 2 elements, and the first and second elements of [A, B] are respectively identical to the first and second elements of [A, B, C]. The route vector generation unit 106 may therefore update the score of the candidate route vector [A, B, C] based on the probability p_C4 corresponding to its m-th (m = 3) element C in the M-dimensional probability vector of the fourth segment (p_C4 is the element corresponding to class C in the M-dimensional probability vector of the fourth segment acquired by the probability vector acquisition unit 102), the H-th element of the M-dimensional probability vector of the fourth segment (that is, the first element p_{blank4}), the score of the candidate route vector [A, B], and the degree of association between the m-th (m = 3) element C of the candidate route vector [A, B, C] and the candidate route vector [A, B]. For example, the score of the candidate route vector [A, B, C] may be updated according to the following equations (9) to (12).

In equation (11), LM("C"/"AB") * weight represents the degree of association between the third element C of the candidate route vector [A, B, C] and the candidate route vector [A, B], where LM("C"/"AB") represents the semantic information between "AB" and "C", and weight represents the variable weight, which may be calculated according to equation (1) above. In equation (12), Score'("ABC") represents the updated score of the candidate route vector [A, B, C].

The probability vector acquisition unit 102 may update the scores of other candidate route vectors containing m elements (m ≥ 2) in the same way as the score of the candidate route vector [A, B, C] is updated above. When a remaining candidate route vector contains m elements (m ≥ 2) and there is no remaining candidate route vector that contains m−1 elements whose first to (m−1)-th elements are respectively identical to the first to (m−1)-th elements of the m-element vector (that is, scenario 3 does not occur), scenario 3 need not be considered when updating the score of that remaining candidate route vector. For example, for the candidate route vector [A, B, C], if there is no remaining candidate route vector [A, B], equation (12) above may be converted into the following equation (13).

Updating the scores of the remaining candidate route vectors as described above makes it possible to merge adjacent duplicate characters and remove the blank character in the decoding process of character recognition, which further improves the accuracy of character recognition.

FIG. 2B shows the result obtained by recognizing the text in FIG. 2A using the information processing apparatus 100 according to the embodiment of the present disclosure. As can be seen from FIG. 2B, the information processing apparatus 100 according to the embodiment of the present disclosure recognizes the text of FIG. 2A correctly.

The information processing apparatus according to the embodiment of the present disclosure has been described above with reference to FIGS. 1 to 3. The present disclosure further provides an embodiment of an information processing method corresponding to the above embodiment of the information processing apparatus.

FIG. 5 is a flowchart showing an example of the flow of the information processing method 400 according to the embodiment of the present disclosure. As shown in FIG. 5, the information processing method 400 may include a probability vector acquisition step S406, a candidate class selection step S408, a route vector generation step S410 and a classification result acquisition step S412. The information processing method starts at start step S402 and ends at end step S414.

In the probability vector acquisition step S406, an M-dimensional probability vector may be acquired for each of the N segments obtained by dividing the classification target. M is the number of classes, the first to M-th elements of each M-dimensional probability vector represent the probabilities that the corresponding segment belongs to the first to M-th classes, respectively, and M and N are natural numbers greater than 1. For example, in the probability vector acquisition step S406, the M-dimensional probability vector of each segment may be acquired by a neural network such as a pre-trained convolutional recurrent neural network (CRNN). The probability vector acquisition step S406 may be performed, for example, by the probability vector acquisition unit 102 of the information processing apparatus 100 described above, and a detailed description is omitted here.

In the candidate class selection step S408, for each of the N segments, the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector may be selected as the candidate classes of that segment. Here, H and K are natural numbers with 1 ≤ H ≤ M and 1 ≤ K ≤ M−1, and the H-th class corresponding to the H-th element may be a class that carries no semantic information. For example, in the candidate class selection step S408, the elements other than the H-th element of each segment's M-dimensional probability vector may be sorted in descending order, and the classes corresponding to the first K elements may be selected as the candidate classes of the corresponding segment. Those skilled in the art may set the value of K according to actual needs. The candidate class selection step S408 may be performed, for example, by the candidate class selection unit 104 of the information processing apparatus 100 described above, and a detailed description is omitted here.
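The descending sort described for step S408 can be sketched as follows. The function name `select_candidate_classes` and the example probabilities are illustrative assumptions; note that the sketch uses zero-based indices, whereas the text numbers classes from 1.

```python
from typing import List, Sequence


def select_candidate_classes(prob_vector: Sequence[float],
                             h_index: int, k: int) -> List[int]:
    """Return the indices of the K most probable classes, excluding class H."""
    # Rank every class except the H-th one by probability, highest first.
    ranked = sorted(
        (i for i in range(len(prob_vector)) if i != h_index),
        key=lambda i: prob_vector[i],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical segment with M = 4 classes; index 0 plays the role of blank.
top_two = select_candidate_classes([0.50, 0.05, 0.30, 0.15], h_index=0, k=2)
```

Here blank has the largest probability, yet it is never returned as a candidate class; it enters the route vectors separately as the H-th class.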

In the route vector generation step S410, route vectors may be generated based on the candidate classes of each of the N segments, and, for each generated route vector, a score may be calculated based on the probability corresponding to each element of the route vector and the degree of association between adjacent elements. Here, the degree of association between adjacent elements may be calculated based on the semantic information between the adjacent elements in the route vector and a variable weight related to the distance between the segments corresponding to the adjacent elements. The route vector generation step S410 may be performed, for example, by the route vector generation unit 106 of the information processing apparatus 100 described above, and a detailed description is omitted here.

In the classification result acquisition step S412, the route vector with the highest score among the route vectors may be acquired as the classification result of the classification target. The classification result acquisition step S412 may be performed, for example, by the classification result acquisition unit 108 of the information processing apparatus 100 described above, and a detailed description is omitted here.

Classification techniques are widely applied to, for example, image recognition, character recognition and speech recognition. In character recognition, the core technique is to extract features from an image containing text or the like and recognize characters from them. Because the extracted features usually contain no semantic information, techniques that introduce semantic information into the decoding process, such as long short-term memory (LSTM) networks and n-gram models, have been proposed. In the prior art, the semantic information between two specific mutually adjacent non-blank characters, words or phrases is constant regardless of the actual distance between them. Here, "adjacent" means that there is no character between the two specific non-blank characters, words or phrases, or that there is no character other than blanks between them.

As described above, the information processing method 400 according to the embodiment of the present disclosure calculates the degree of association between adjacent elements based on the semantic information between the adjacent elements in the route vector and a variable weight related to the distance between the segments corresponding to the adjacent elements. When the classification target is an image containing text, the distance between the segments corresponding to adjacent elements corresponds to the actual distance between those elements. Using the above calculation of the degree of association is therefore equivalent to applying, to the semantic information between adjacent elements, a variable weight related to their actual distance, so the text can be recognized with the actual distance between adjacent elements taken into account.

In one embodiment of the present disclosure, the semantic information between adjacent elements is represented by a value calculated with a pre-trained n-gram model.

In one embodiment of the present disclosure, the variable weight is set to 1 when the distance between the segments corresponding to adjacent elements is less than or equal to a predetermined threshold, and is set to a value less than 1 when that distance is greater than the predetermined threshold, in which case the variable weight decreases as the distance increases.

For example, when the distance between the segments corresponding to adjacent elements is greater than the predetermined threshold, the variable weight may be set to, but is not limited to, the reciprocal of the difference between the distance and the predetermined threshold. For example, the variable weight, denoted weight, may be calculated according to equation (1) above.
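
The weighting rule described in words above can be sketched as follows. This is a minimal illustration of the textual description only, not a reproduction of equation (1); the function name variable_weight and its parameters are assumptions introduced here. Note that, in this form, the reciprocal falls below 1 only once the distance exceeds the threshold by more than one unit.

```python
def variable_weight(distance: float, threshold: float) -> float:
    """Distance-dependent weight for the semantic information of two adjacent elements.

    Assumed form: 1 up to the threshold, then the reciprocal of
    (distance - threshold), which shrinks as the distance grows.
    """
    if distance <= threshold:
        return 1.0
    return 1.0 / (distance - threshold)
```

For instance, with a threshold of 5, a distance of 3 keeps the weight at 1, while a distance of 9 gives a weight of 0.25.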

Those skilled in the art may determine the predetermined threshold according to actual needs.

As described above, in the prior art, the semantic information between two specific mutually adjacent non-blank characters, words, or phrases is constant regardless of the actual distance between them. However, the text to be recognized often contains delimiters (for example, blanks), and the sentences before and after a delimiter are only weakly related semantically.

The information processing method 400 according to the embodiment of the present disclosure may calculate the degree of association between adjacent elements based on the semantic information between the adjacent elements in a route vector and on a variable weight related to the distance between the segments corresponding to the adjacent elements. When the target to be classified is an image containing text, the distance between the segments corresponding to adjacent elements corresponds to the actual distance between those elements, and a delimiter (for example, a blank) between adjacent elements increases that actual distance. In other words, the information processing apparatus according to the embodiment of the present disclosure may attach, to the semantic information between adjacent elements, a variable weight related to the delimiter (for example, a blank) between them. For example, in the text shown in FIG. 2A, when the distance between the segments corresponding to the adjacent characters "1" and "(" is greater than the predetermined threshold, the variable weight calculated according to equation (1) above may be attached to the semantic information between the adjacent characters "1" and "(".
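
As a rough illustration of how a degree of association might combine the two quantities, the sketch below multiplies a bigram value by the distance-dependent weight. The product form, the toy bigram table, the fallback value, and all names are assumptions made for this example; the disclosure only states that the weight is attached to the semantic information.

```python
# Hypothetical bigram log-probabilities standing in for a pre-trained n-gram model.
BIGRAM_LOGPROB = {("1", "("): -2.0, ("(", "a"): -1.0}
UNSEEN_LOGPROB = -10.0  # assumed fallback for pairs missing from the table


def variable_weight(distance, threshold):
    """1 up to the threshold, then the reciprocal of the excess distance."""
    if distance <= threshold:
        return 1.0
    return 1.0 / (distance - threshold)


def association_degree(prev_element, element, distance, threshold):
    """Semantic value of the pair, scaled by the distance-dependent weight (assumed product)."""
    semantic = BIGRAM_LOGPROB.get((prev_element, element), UNSEEN_LOGPROB)
    return variable_weight(distance, threshold) * semantic
```

With log-probabilities, a smaller weight pulls the value toward 0, so a wide gap between "1" and "(" lets the language model influence the route score less than it would for tightly spaced characters.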

In one embodiment of the present disclosure, the information processing method 400 may further include a preprocessing step S404. In the preprocessing step S404, preprocessing is performed on the target to be classified. The preprocessing may include at least one of noise removal, normalization, binarization, and tilt correction. Because noise removal, normalization, binarization, and tilt correction are well known in the art, their details are omitted here.
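
Purely for orientation, two of the listed preprocessing operations can be sketched on a grayscale image held as nested lists. The min-max normalization, the fixed binarization threshold of 0.5, and the function names are assumptions chosen for this example; the disclosure does not prescribe any particular technique.

```python
def normalize(image):
    """Rescale pixel values to the range [0, 1] (min-max normalization)."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a constant image
    return [[(v - lo) / span for v in row] for row in image]


def binarize(image, threshold=0.5):
    """Map each normalized pixel to 1 (at or above threshold) or 0 (below)."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]
```

A call such as binarize(normalize(image)) would then feed a two-level image to the later steps.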

In one embodiment of the present disclosure, in the route vector generation step S410, the route vectors may be generated and their scores obtained through the following N rounds of processing.

In the first round of processing, K + 1 route vectors may be generated based on the candidate classes of the first of the N segments and the H-th class corresponding to the H-th element, and the score of each route vector may be generated based on the probability corresponding to the element in that route vector.
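
The first round can be sketched as follows, assuming 0-based class indices (the text counts from 1), a probability list for the first segment, and the raw probability used directly as the initial score. The names select_candidates, first_round, and blank_index (the index of the H-th class) are introduced here for illustration only.

```python
def select_candidates(probs, blank_index, k):
    """Return the indices of the K largest probabilities, excluding blank_index."""
    indexed = [(p, c) for c, p in enumerate(probs) if c != blank_index]
    indexed.sort(key=lambda pc: pc[0], reverse=True)
    return [c for _, c in indexed[:k]]


def first_round(probs, blank_index, k):
    """Build the K + 1 initial route vectors (as tuples) with their scores."""
    routes = {}
    for c in select_candidates(probs, blank_index, k):
        routes[(c,)] = probs[c]
    # The class without semantic information always contributes one more route.
    routes[(blank_index,)] = probs[blank_index]
    return routes
```

With K = 2 this yields three single-element routes: two for the most probable non-blank classes and one for the H-th class.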

For example, as shown in FIG. 4, the i-th round of processing (i ≥ 2) proceeds as follows. The L route vectors with the highest scores are selected as candidate route vectors, where L is a natural number greater than 1 (step S4102). Among two or more identical candidate route vectors, all but the candidate route vector with the highest score are excluded (step S4104). The scores of the candidate route vectors remaining after the exclusion are updated based on at least the M-dimensional probability vector of the i-th segment (step S4106). Then, for each remaining candidate route vector, each candidate class of the i-th segment is appended to that remaining candidate route vector to generate new route vectors, and the score of each newly generated route vector is calculated based on the pre-update score of the remaining candidate route vector, the probability corresponding to the newly appended candidate class, and the degree of association between the newly appended candidate class and the remaining candidate route vector (step S4108).

Note that, in the i-th round of processing (i ≥ 2), when the candidate route vectors selected in step S4102 include two or more identical candidate route vectors and more than one of those identical candidate route vectors has the highest score (that is, two or more of them share the highest score), any one of them may be retained in step S4104.
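
Steps S4102 and S4104 can be illustrated with the short sketch below. It assumes that route vectors are represented as tuples paired with a score, and the function name prune and the parameter beam_width (playing the role of L) are introduced only for this example. When identical routes tie on the highest score, the first one encountered is kept, which is one way of retaining "any one of them".

```python
def prune(routes, beam_width):
    """Keep the beam_width best routes, then drop lower-scored duplicates.

    routes: list of (route_tuple, score) pairs.
    Returns a list of (route_tuple, score) pairs with unique routes.
    """
    # Step S4102: select the L routes with the highest scores.
    top = sorted(routes, key=lambda rs: rs[1], reverse=True)[:beam_width]
    # Step S4104: among identical routes, keep only the highest-scored one.
    best = {}
    for route, score in top:
        if route not in best or score > best[route]:
            best[route] = score
    return list(best.items())
```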

The N rounds of processing described above can improve classification accuracy by combining beam search with a degree of association that is based on the semantic information between adjacent elements and on the variable weight. In addition, in the i-th round of processing (i ≥ 2), selecting the L route vectors with the highest scores as candidate route vectors and performing the subsequent processing only on those candidates reduces the computational cost of classification and saves computation time.

For example, in character recognition, the decoding rule requires adjacent duplicate characters to be merged and the blank character (blank) to be removed, so the following three scenarios may be considered when updating the score of a route vector.
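
The decoding rule mentioned above (merge adjacent duplicates, then drop blanks) can be sketched as follows. The function name collapse and the use of "-" as the blank symbol in the usage note are assumptions for illustration.

```python
def collapse(labels, blank):
    """Merge runs of identical adjacent labels, then remove the blank label."""
    out = []
    prev = None
    for label in labels:
        # A label is emitted only when it starts a new run and is not blank.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```

For a per-segment label sequence such as a, a, -, a, b, - the result is a, a, b: the first two a's merge, while the blank between the second and third a keeps them distinct.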

As an example, in the i-th round of processing (i ≥ 2), the following processing may be performed when updating the score of each remaining candidate route vector.

When the remaining candidate route vector contains m elements (m ≥ 2), the score of the remaining candidate route vector may be updated based on the probability corresponding to the m-th element of the remaining candidate route vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment.
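
One simple reading of this update is sketched below. The passage names the two inputs but not the formula, so the multiplicative form is an assumption: the route is taken to stay unchanged when the i-th segment either repeats its last element or emits the H-th class, and the old score is multiplied by the sum of those two probabilities. The name update_unextended_score and the 0-based indexing are likewise introduced for this example.

```python
def update_unextended_score(score, route, probs_i, blank_index):
    """Update the score of a route that is not extended in round i.

    Assumed rule: the route survives if segment i repeats its last
    element or emits the class without semantic information.
    """
    last = route[-1]
    return score * (probs_i[last] + probs_i[blank_index])
```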

Note that although the functional configurations and operations of the information processing apparatus and the information processing method according to the embodiments of the present disclosure have been described above, these functional configurations and operations are merely exemplary and do not limit the present disclosure. Those skilled in the art may modify the above embodiments in accordance with the principles of the present disclosure, for example by adding, deleting, or combining functional modules of the embodiments, and such modifications fall within the scope of the present disclosure.

In addition, because the apparatus embodiments here correspond to the method embodiments above, for content not described in detail in the apparatus embodiments, reference may be made to the corresponding description of the method embodiments, and that description is omitted here.

The present disclosure further provides a storage medium and a program product. Machine-executable instructions in the storage medium and the program product according to the embodiments of the present disclosure may carry out the above method. For content not described in detail here, reference may be made to the corresponding description of the method embodiments, and that description is omitted here.

Accordingly, the present disclosure also includes a storage medium on which a program product containing machine-executable instructions is recorded. The storage medium includes, but is not limited to, a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, and the like.

Note that the above processing and apparatus may also be implemented by software and/or firmware. When implemented by software and/or firmware, a program constituting the software for carrying out the above method may be installed from a storage medium or a network onto a computer having a dedicated hardware configuration, for example the general-purpose personal computer 500 shown in FIG. 6, and the computer can execute various functions when various programs are installed on it.

In FIG. 6, a central processing unit (CPU) 501 executes various kinds of processing in accordance with a program stored in a read-only memory (ROM) 502 or a program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 also stores, as needed, data required for the CPU 501 to execute the various kinds of processing.

The CPU 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output interface 505 is also connected to the bus 504.

An input unit 506 (including a keyboard, a mouse, and the like), an output unit 507 (including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker, and the like), the storage unit 508 (including, for example, a hard disk), and a communication unit 509 (including a network interface card such as a LAN card, a modem, and the like) are connected to the input/output interface 505. The communication unit 509 performs communication processing via a network such as the Internet.

A driver 510 may be connected to the input/output interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 510 as needed, and a computer program read from it is installed in the storage unit 508 as needed.

When the above processing is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 511.

Note that these storage media are not limited to the removable medium 511 shown in FIG. 6, which stores the program and is distributed separately from the apparatus in order to provide the program to the user. Examples of the removable medium 511 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disk (including a MiniDisc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 502, a hard disk included in the storage unit 508, or the like, which stores the program and is provided to the user together with the apparatus that contains it.

Although preferred embodiments of the present disclosure have been described above with reference to the drawings, the above embodiments and examples are illustrative and not restrictive. Those skilled in the art may make various modifications, improvements, or equivalents to the present disclosure within the spirit and scope of the claims. Such modifications, improvements, and equivalents fall within the scope of protection of the present disclosure.

For example, the functions included in one unit in the above embodiments may be realized by separate devices. The functions realized by a plurality of units in the above embodiments may each be realized by separate devices. Furthermore, one of the above functions may be realized by a plurality of units. Such configurations are within the scope of the present disclosure.

The method of the present disclosure is not limited to being performed in the chronological order described in the specification, and may be performed in another chronological order, in parallel, or independently. The order of execution of the method described in this specification therefore does not limit the technical scope of the present disclosure.

Further, with respect to the embodiments including the examples described above, the following appendices are disclosed, but the disclosure is not limited to these appendices.
(Appendix 1)
An information processing apparatus comprising:
a probability vector acquisition unit that acquires an M-dimensional probability vector for each of N segments obtained by dividing a target to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1;
a candidate class selection unit that, for each of the N segments, selects as candidate classes of the segment the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector, where H and K are natural numbers, 1 ≤ H ≤ M, 1 ≤ K ≤ M-1, and the H-th class corresponding to the H-th element is a class that contains no semantic information;
a route vector generation unit that generates route vectors based on the candidate classes of each of the N segments and, for each generated route vector, calculates a score of the route vector based on the probability corresponding to each element included in the route vector and the degree of association between adjacent elements; and
a classification result acquisition unit that acquires the route vector with the highest score among the route vectors as the classification result of the target to be classified,
wherein the degree of association between adjacent elements is calculated based on the semantic information between the adjacent elements in the route vector and on a variable weight related to the distance between the segments corresponding to the adjacent elements.
(Appendix 2)
The information processing apparatus according to Appendix 1, wherein the semantic information between adjacent elements is represented by a value calculated via a pre-trained n-gram model.
(Appendix 3)
The information processing apparatus according to Appendix 1, wherein the variable weight is set to 1 when the distance between the segments corresponding to adjacent elements is less than or equal to a predetermined threshold, and
the variable weight is set to a value less than 1 when that distance is greater than the predetermined threshold, the variable weight decreasing as the distance increases.
(Appendix 4)
The information processing apparatus according to any one of Appendices 1 to 3, wherein the route vector generation unit generates the route vectors and obtains the scores of the route vectors through the following N rounds of processing:
in the first round of processing, the route vector generation unit generates K + 1 route vectors based on the candidate classes of the first of the N segments and the H-th class, and generates the score of each route vector based on the probability corresponding to the element in that route vector; and
in the i-th round of processing (i ≥ 2), the route vector generation unit
selects the L route vectors with the highest scores as candidate route vectors, where L is a natural number greater than 1,
excludes, among two or more identical candidate route vectors, the candidate route vectors other than the one with the highest score,
updates the scores of the candidate route vectors remaining after the exclusion based on at least the M-dimensional probability vector of the i-th segment, and
for each remaining candidate route vector, appends each candidate class of the i-th segment to that remaining candidate route vector to generate new route vectors, and calculates the score of each newly generated route vector based on the pre-update score of the remaining candidate route vector, the probability corresponding to the newly appended candidate class, and the degree of association between the newly appended candidate class and the remaining candidate route vector.
(Appendix 5)
The information processing apparatus according to Appendix 4, wherein, in the i-th round of processing (i ≥ 2), when updating the score of each remaining candidate route vector, the route vector generation unit:
when the remaining candidate route vector contains one element and that element is not the H-th class,
updates the score of the remaining candidate route vector based on the probability corresponding to the first element of the remaining candidate route vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate route vectors in the i-th round include a remaining candidate route vector containing only the H-th class, updates the score of the remaining candidate route vector based on the probability corresponding to the first element of the remaining candidate route vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, and the score of the remaining candidate route vector containing only the H-th class;
when the remaining candidate route vector contains only the H-th class, updates the score of the remaining candidate route vector based on the H-th element of the M-dimensional probability vector of the i-th segment; and
when the remaining candidate route vector contains m elements (m ≥ 2),
updates the score of the remaining candidate route vector based on the probability corresponding to the m-th element of the remaining candidate route vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate route vectors in the i-th round include a remaining candidate route vector that contains m-1 elements and whose first to (m-1)-th elements are respectively identical to the first to (m-1)-th elements of the remaining candidate route vector, updates the score of the remaining candidate route vector based on the probability corresponding to the m-th element of the remaining candidate route vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, the score of the remaining candidate route vector containing the m-1 elements, and the degree of association between the m-th element of the remaining candidate route vector and the remaining candidate route vector containing the m-1 elements.
(Appendix 6)
The information processing apparatus according to any one of Appendices 1 to 3, wherein the probability vector acquisition unit acquires the M-dimensional probability vector of each of the N segments using a pre-trained convolutional recurrent neural network.
(Appendix 7)
The information processing apparatus according to any one of Appendices 1 to 3, further comprising a preprocessing unit that performs preprocessing on the target to be classified,
wherein the preprocessing includes at least one of noise removal, normalization, binarization, and tilt correction.
(Appendix 8)
The information processing apparatus according to any one of Appendices 1 to 3, wherein the target to be classified is an image containing text,
the first to M-th classes respectively represent different characters, and
the H-th class represents a blank character.
(Appendix 9)
The information processing apparatus according to any one of Appendices 1 to 3, wherein N is related to the size of the target to be classified.
(Appendix 10)
An information processing method comprising:
a probability vector acquisition step of acquiring an M-dimensional probability vector for each of N segments obtained by dividing a target to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1;
a candidate class selection step of selecting, for each of the N segments, as candidate classes of the segment the classes corresponding to the K largest elements among the elements other than the H-th element of the segment's M-dimensional probability vector, where H and K are natural numbers, 1 ≤ H ≤ M, 1 ≤ K ≤ M-1, and the H-th class corresponding to the H-th element is a class that contains no semantic information;
a route vector generation step of generating route vectors based on the candidate classes of each of the N segments and, for each generated route vector, calculating a score of the route vector based on the probability corresponding to each element included in the route vector and the degree of association between adjacent elements; and
a classification result acquisition step of acquiring the route vector with the highest score among the route vectors as the classification result of the target to be classified,
wherein the degree of association between adjacent elements is calculated based on the semantic information between the adjacent elements in the route vector and on a variable weight related to the distance between the segments corresponding to the adjacent elements.
(Appendix 11)
The information processing method according to Appendix 10, wherein the semantic information between adjacent elements is represented by a value calculated via a pre-trained n-gram model.
(Appendix 12)
The information processing method according to Appendix 10, wherein the variable weight is set to 1 when the distance between the segments corresponding to adjacent elements is less than or equal to a predetermined threshold, and
the variable weight is set to a value less than 1 when that distance is greater than the predetermined threshold, the variable weight decreasing as the distance increases.
(Appendix 13)
The information processing method according to any one of Appendices 10 to 12, wherein, in the path vector generation step, the path vectors are generated and their scores are obtained through the following N rounds of processing.
In the processing of the first round, K + 1 path vectors are generated based on the candidate classes of the first of the N segments and the H-th class, and a score is generated for each path vector based on the probability corresponding to the element in that path vector.
In the processing of the i-th (i ≧ 2) round,
the L path vectors with the highest scores are selected as candidate path vectors, where L is a natural number greater than 1;
among two or more identical candidate path vectors, all but the one with the highest score are excluded;
the scores of the candidate path vectors remaining after the exclusion are updated based at least on the M-dimensional probability vector of the i-th segment; and
for each of the remaining candidate path vectors, each candidate class of the i-th segment is appended to the remaining candidate path vector to generate a new path vector, and the score of the newly generated path vector is calculated based on the pre-update score of the remaining candidate path vector, the probability corresponding to the newly added candidate class, and the degree of association between the newly added candidate class and the remaining candidate path vector.
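The selection, exclusion and extension sub-steps of one round can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: the name `select_and_extend` is hypothetical, the new score is assumed to be the product of the pre-update score, the candidate probability and the degree of association, and the in-place score update of the remaining candidate path vectors is deliberately left out.

```python
from typing import Callable, Dict, List, Tuple

Path = Tuple[int, ...]
Scored = Tuple[Path, float]


def select_and_extend(
    scored_paths: List[Scored],
    candidate_probs: Dict[int, float],
    association: Callable[[Path, int], float],
    beam_width: int,
) -> Tuple[List[Scored], List[Scored]]:
    """Return (remaining candidate path vectors, newly generated path vectors)."""
    # 1. Select the L highest-scoring path vectors as candidates.
    top = sorted(scored_paths, key=lambda ps: ps[1], reverse=True)[:beam_width]
    # 2. Among identical candidates keep only the best-scoring copy.
    best: Dict[Path, float] = {}
    for path, score in top:
        if path not in best or score > best[path]:
            best[path] = score
    remaining = list(best.items())
    # 3. (Score update of the remaining candidates is omitted in this sketch.)
    # 4. Append every candidate class of the current segment to every remaining candidate.
    #    The product below is an assumed way of combining the three quantities.
    extended = [
        (path + (cls,), score * prob * association(path, cls))
        for path, score in remaining
        for cls, prob in candidate_probs.items()
    ]
    return remaining, extended


# Hypothetical inputs: class numbers, scores and a constant association are made up.
paths = [((1,), 0.5), ((1,), 0.3), ((2,), 0.4), ((3,), 0.1)]
remaining, extended = select_and_extend(
    paths, {4: 0.5}, association=lambda path, cls: 1.0, beam_width=3
)
print(remaining)
print(extended)
```

Returning both lists reflects that the next round draws its L candidates from the remaining path vectors as well as the newly generated ones.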
(Appendix 14)
The information processing method according to Appendix 13, wherein, in the processing of the i-th (i ≧ 2) round, the score of each remaining candidate path vector is updated as follows.
When the remaining candidate path vector contains one element and that element is not the H-th class,
the score of the remaining candidate path vector is updated based on the probability corresponding to the first element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate path vectors of the i-th round include a remaining candidate path vector containing only the H-th class, the score of the remaining candidate path vector is updated based on the probability corresponding to the first element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, and the score of the remaining candidate path vector containing only the H-th class.
When the remaining candidate path vector contains only the H-th class, the score of the remaining candidate path vector is updated based on the H-th element of the M-dimensional probability vector of the i-th segment.
When the remaining candidate path vector contains m (m ≧ 2) elements,
the score of the remaining candidate path vector is updated based on the probability corresponding to the m-th element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate path vectors of the i-th round include a remaining candidate path vector containing m-1 elements whose first to (m-1)-th elements are identical to the first to (m-1)-th elements of the remaining candidate path vector, the score of the remaining candidate path vector is updated based on the probability corresponding to the m-th element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, the score of the remaining candidate path vector containing the m-1 elements, and the degree of association between the m-th element of the remaining candidate path vector and the remaining candidate path vector containing the m-1 elements.
(Appendix 15)
The information processing method according to any one of Appendices 10 to 12, wherein, in the probability vector acquisition step, the M-dimensional probability vector of each of the N segments is acquired by a pre-trained convolutional recurrent neural network.
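The network itself is not reproduced here. As a minimal sketch, assuming the network emits one row of M raw scores per segment and that a softmax turns each row into a probability vector (a common arrangement, but an assumption about this disclosure), the shape of the output can be illustrated as follows; `to_probability_vectors` and the sample scores are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert M raw scores into M probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_probability_vectors(segment_scores: List[List[float]]) -> List[List[float]]:
    """Map N rows of raw scores to N M-dimensional probability vectors."""
    return [softmax(row) for row in segment_scores]


# Hypothetical raw scores for N = 2 segments and M = 3 classes.
segment_scores = [[2.0, 0.5, 0.1], [0.0, 0.0, 3.0]]
prob_vectors = to_probability_vectors(segment_scores)
print(prob_vectors)
```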
(Appendix 16)
The information processing method according to any one of Appendices 10 to 12, further including a preprocessing step of performing preprocessing on the object to be classified,
wherein the preprocessing includes at least one of noise removal, normalization, binarization, and tilt correction.
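Two of the listed preprocessing operations, normalization and binarization, can be sketched on a grayscale image stored as nested lists. The min-max scaling, the fixed threshold of 0.5 and the function names are assumptions chosen for illustration; the disclosure does not specify how each operation is carried out.

```python
from typing import List

Image = List[List[float]]


def normalize(image: Image) -> Image:
    """Rescale pixel values linearly into the range [0, 1] (min-max normalization)."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = hi - lo
    if span == 0:
        # A flat image carries no contrast; map every pixel to 0.
        return [[0.0 for _ in row] for row in image]
    return [[(px - lo) / span for px in row] for row in image]


def binarize(image: Image, threshold: float = 0.5) -> List[List[int]]:
    """Map each pixel to 1 if it reaches the threshold, otherwise 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


# Hypothetical 2 x 3 grayscale image with 8-bit intensities.
gray = [[0, 128, 255], [64, 192, 32]]
binary = binarize(normalize(gray))
print(binary)
```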
(Appendix 17)
The information processing method according to any one of Appendices 10 to 12, wherein the object to be classified is an image containing text,
the first to M-th classes each represent a different character, and
the H-th class represents a blank character.
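When the classes are characters, a path vector of class numbers can be turned into text by looking up each class and dropping the blank. The sketch below assumes 1-based class numbers and a hypothetical four-entry alphabet; whether a final path vector still contains H-th class elements is not stated in this section, so the filter is a precaution.

```python
from typing import Sequence


def path_vector_to_text(path: Sequence[int], alphabet: Sequence[str], h: int) -> str:
    """Look up the character of each class in the path vector, skipping the H-th (blank) class."""
    # alphabet[c - 1] is the character of class c; the entry for class H is never used.
    return "".join(alphabet[c - 1] for c in path if c != h)


# Hypothetical alphabet with M = 4 classes, where class 1 is the blank (H = 1).
alphabet = ["_", "c", "a", "t"]
text = path_vector_to_text([2, 1, 3, 4], alphabet, h=1)
print(text)
```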
(Appendix 18)
The information processing method according to any one of Appendices 10 to 12, wherein N is related to the size of the object to be classified.
(Appendix 19)
A computer-readable storage medium storing program instructions, wherein the information processing method according to any one of Appendices 10 to 18 is executed when the program instructions are executed by a computer.

Claims

An information processing apparatus comprising:
a probability vector acquisition unit that acquires an M-dimensional probability vector for each of N segments obtained by dividing an object to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1;
a candidate class selection unit that selects, for each of the N segments, the classes corresponding to the K largest elements among the elements other than the H-th element in the M-dimensional probability vector of the segment as the candidate classes of the segment, where H and K are natural numbers, 1 ≦ H ≦ M, 1 ≦ K ≦ M-1, and the H-th class corresponding to the H-th element is a class that does not include semantic information;
a path vector generation unit that generates path vectors based on the candidate classes of each of the N segments and, for each generated path vector, calculates a score of the path vector based on the probability corresponding to each element included in the path vector and the degree of association between adjacent elements; and
a classification result acquisition unit that acquires the path vector having the highest score among the path vectors as the classification result of the object to be classified,
wherein the degree of association between adjacent elements is calculated based on semantic information between the adjacent elements in the path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements.

The information processing apparatus according to claim 1, wherein the semantic information between adjacent elements is represented by a value calculated via a pre-trained n-gram model.

The information processing apparatus according to claim 1, wherein, if the distance between the segments corresponding to adjacent elements is less than or equal to a predetermined threshold, the variable weight is set to 1, and
if the distance between the segments corresponding to the adjacent elements is greater than the predetermined threshold, the variable weight is set to a value less than 1 that decreases as the distance increases.

The information processing apparatus according to any one of claims 1 to 3, wherein the path vector generation unit generates the path vectors and obtains their scores through the following N rounds of processing.
In the processing of the first round, the path vector generation unit generates K + 1 path vectors based on the candidate classes of the first of the N segments and the H-th class, and generates a score for each path vector based on the probability corresponding to the element in that path vector.
In the processing of the i-th (i ≧ 2) round, the path vector generation unit
selects the L path vectors with the highest scores as candidate path vectors, where L is a natural number greater than 1;
excludes, among two or more identical candidate path vectors, all but the one with the highest score;
updates the scores of the candidate path vectors remaining after the exclusion based at least on the M-dimensional probability vector of the i-th segment; and
for each of the remaining candidate path vectors, appends each candidate class of the i-th segment to the remaining candidate path vector to generate a new path vector, and calculates the score of the newly generated path vector based on the pre-update score of the remaining candidate path vector, the probability corresponding to the newly added candidate class, and the degree of association between the newly added candidate class and the remaining candidate path vector.

The information processing apparatus according to claim 4, wherein, in the processing of the i-th (i ≧ 2) round, the path vector generation unit updates the score of each remaining candidate path vector as follows.
When the remaining candidate path vector contains one element and that element is not the H-th class,
the score of the remaining candidate path vector is updated based on the probability corresponding to the first element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate path vectors of the i-th round include a remaining candidate path vector containing only the H-th class, the score of the remaining candidate path vector is updated based on the probability corresponding to the first element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, and the score of the remaining candidate path vector containing only the H-th class.
When the remaining candidate path vector contains only the H-th class, the score of the remaining candidate path vector is updated based on the H-th element of the M-dimensional probability vector of the i-th segment.
When the remaining candidate path vector contains m (m ≧ 2) elements,
the score of the remaining candidate path vector is updated based on the probability corresponding to the m-th element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment and on the H-th element of the M-dimensional probability vector of the i-th segment, or,
when the remaining candidate path vectors of the i-th round include a remaining candidate path vector containing m-1 elements whose first to (m-1)-th elements are identical to the first to (m-1)-th elements of the remaining candidate path vector, the score of the remaining candidate path vector is updated based on the probability corresponding to the m-th element of the remaining candidate path vector in the M-dimensional probability vector of the i-th segment, the H-th element of the M-dimensional probability vector of the i-th segment, the score of the remaining candidate path vector containing the m-1 elements, and the degree of association between the m-th element of the remaining candidate path vector and the remaining candidate path vector containing the m-1 elements.

The information processing apparatus according to any one of claims 1 to 3, wherein the probability vector acquisition unit acquires M-dimensional probability vectors of each of the N segments by a pre-trained convolutional recurrent neural network.

The information processing apparatus according to any one of claims 1 to 3, further comprising a preprocessing unit that performs preprocessing on the object to be classified,
wherein the preprocessing includes at least one of noise removal, normalization, binarization, and tilt correction.

The information processing apparatus according to any one of claims 1 to 3, wherein the object to be classified is an image containing text,
the first to M-th classes each represent a different character, and
the H-th class represents a blank character.

An information processing method comprising:
a probability vector acquisition step of acquiring an M-dimensional probability vector for each of N segments obtained by dividing an object to be classified, where M is the number of classes, the first to M-th elements of each M-dimensional probability vector respectively represent the probabilities that the corresponding segment belongs to the first to M-th classes, and M and N are natural numbers greater than 1;
a candidate class selection step of selecting, for each of the N segments, the classes corresponding to the K largest elements among the elements other than the H-th element in the M-dimensional probability vector of the segment as the candidate classes of the segment, where H and K are natural numbers, 1 ≦ H ≦ M, 1 ≦ K ≦ M-1, and the H-th class corresponding to the H-th element is a class that does not include semantic information;
a path vector generation step of generating path vectors based on the candidate classes of each of the N segments and, for each generated path vector, calculating a score of the path vector based on the probability corresponding to each element included in the path vector and the degree of association between adjacent elements; and
a classification result acquisition step of acquiring the path vector having the highest score among the path vectors as the classification result of the object to be classified,
wherein the degree of association between adjacent elements is calculated based on semantic information between the adjacent elements in the path vector and a variable weight related to the distance between the segments corresponding to the adjacent elements.

A computer-readable storage medium storing program instructions, wherein the information processing method according to claim 9 is executed when the program instructions are executed by a computer.