JP2019528520A

JP2019528520A - Classification network training apparatus, character recognition apparatus and method for character recognition

Info

Publication number: JP2019528520A
Application number: JP2019504733A
Authority: JP
Inventors: ファヌ・ウエイ; 俊孫
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2016-08-31
Filing date: 2016-08-31
Publication date: 2019-10-10
Anticipated expiration: 2036-08-31
Also published as: WO2018039970A1; CN109478229B; CN109478229A; JP6696622B2

Abstract

A classification network training apparatus for character recognition, a character recognition apparatus, and a method are provided. The apparatus and method construct sample pairs from unlabeled samples to train a symmetric network, initialize a classification network with the parameters of the trained symmetric network, and train the initialized classification network with labeled samples. This improves the recognition accuracy of the classification network and effectively reduces labeling cost. [Selected drawing] FIG. 1

Description

The present invention relates to the field of information technology, and in particular to a classification network training apparatus for character recognition, a character recognition apparatus, and a method.

The need to preserve documents and to advance informatization is driving a growing demand for digitizing document materials, so recognizing characters in document images is becoming increasingly important. Recognizing special characters, such as the Chinese characters in ancient documents, is very important for digitizing classical literature, organizing classical books, and preserving culture. However, recognizing Chinese characters in ancient documents is far more difficult than recognizing modern Chinese characters. First, ancient documents contain far more distinct Chinese characters than modern usage does. Second, the structure of these characters is far more complex than that of modern simplified Chinese characters. Third, the characters have multiple variant forms; that is, a large number of characters were written differently in different historical periods. Fourth, the use of different writing instruments (for example, the brush) or of woodblock printing gives the characters multiple styles. Finally, photographed or scanned images of classical books are more severely degraded than images of modern Chinese characters.

In recent years, deep learning methods (for example, convolutional neural networks) have significantly outperformed conventional methods in the field of optical character recognition (OCR). The supervised learning methods based on convolutional neural networks (CNNs) that are mainly used today usually require training data on the order of millions of samples. Because sufficient labeled samples are lacking for recognizing Chinese characters in ancient documents, it is necessary to acquire a large number of unlabeled samples by scanning or photographing, segment them with an automatic character segmentation method, and have people label them by hand in order to obtain labeled samples for training a convolutional neural network.

Note that the above description of the technical background is provided only so that the technical solutions of the present invention can be explained clearly and completely and understood by those skilled in the art. These solutions are described merely as background of the present invention and should not be regarded as well known to those skilled in the art.

Training a convolutional neural network with the conventional method described above requires a large amount of manual labeling, which takes a long time and considerable labor and cost.

Embodiments of the present invention provide a classification network training apparatus for character recognition, a character recognition apparatus, and a method that construct sample pairs from unlabeled samples to train a symmetric network, initialize a classification network with the parameters of the trained symmetric network, and train the initialized classification network with labeled samples, thereby improving the recognition accuracy of the classification network and effectively reducing labeling cost.

A first aspect of the embodiments of the present invention provides a classification network training apparatus for character recognition, the apparatus including: extraction means for extracting features of each unlabeled sample containing a character; construction means for constructing sample pairs based on the extracted features of the unlabeled samples; first training means for training a symmetric network based on the constructed sample pairs; initialization means for initializing a classification network for character recognition with the parameters of the trained symmetric network; and second training means for training the initialized classification network with labeled samples containing characters.

A second aspect of the embodiments of the present invention provides a character recognition apparatus including a classification network for character recognition trained by the apparatus according to the first aspect.

A third aspect of the embodiments of the present invention provides a classification network training method for character recognition, the method including: extracting features of each unlabeled sample containing a character; constructing sample pairs based on the extracted features of the unlabeled samples; training a symmetric network based on the constructed sample pairs; initializing a classification network for character recognition with the parameters of the trained symmetric network; and training the initialized classification network with labeled samples containing characters.

An advantageous effect of the present invention is that constructing sample pairs from unlabeled samples to train a symmetric network, initializing a classification network with the parameters of the trained symmetric network, and training the initialized classification network with labeled samples improves the recognition accuracy of the classification network and effectively reduces labeling cost.

Specific embodiments of the present invention are disclosed in detail in the following description and drawings, which indicate the ways in which the principles of the invention may be employed. The embodiments of the invention are not limited in scope thereby. They include various changes, modifications, and equivalents within the spirit and terms of the appended claims.

Features described and/or shown for one embodiment may be used in the same or a similar way in one or more other embodiments, may be combined with features of other embodiments, or may replace features of other embodiments.

The term "comprise/include", when used herein, specifies the presence of a feature, element, step, or component, and does not preclude the presence or addition of one or more other features, elements, steps, or components.

The accompanying drawings are included to aid understanding of the embodiments of the present invention. They constitute a part of this specification, illustrate the embodiments, and together with the written description explain the principles of the invention. The drawings described here merely illustrate embodiments of the invention, and those skilled in the art can readily obtain other drawings from them.
FIG. 1 is a diagram showing the classification network training apparatus for character recognition of Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the construction unit 102 of Embodiment 1 of the present invention.
FIG. 3 is a diagram showing the first determination unit 201 of Embodiment 1 of the present invention.
FIG. 4 is another diagram showing the first determination unit 201 of Embodiment 1 of the present invention.
FIG. 5 is a diagram showing the symmetric network of Embodiment 1 of the present invention.
FIG. 6 is a diagram showing the character recognition apparatus of Embodiment 2 of the present invention.
FIG. 7 is a diagram showing the electronic device of Embodiment 3 of the present invention.
FIG. 8 is a block diagram showing the system configuration of the electronic device of Embodiment 3 of the present invention.
FIG. 9 is a diagram showing the classification network training method for character recognition of Embodiment 4 of the present invention.

The above and other features of the present invention will be understood from the drawings and the following description. The specification and drawings disclose specific embodiments of the invention, that is, some of the embodiments in which the principles of the invention may be employed. The invention is not limited to the described embodiments and includes all modifications, variations, and equivalents falling within the scope of the claims.

＜Embodiment 1＞
FIG. 1 is a diagram showing a classification network training apparatus for character recognition according to Embodiment 1 of the present invention. As shown in FIG. 1, the training apparatus 100 includes an extraction unit 101, a construction unit 102, a first training unit 103, an initialization unit 104, and a second training unit 105.

The extraction unit 101 extracts the features of each unlabeled sample containing a character.

The construction unit 102 constructs sample pairs based on the extracted features of the unlabeled samples.

The first training unit 103 trains a symmetric network based on the constructed sample pairs.

The initialization unit 104 initializes a classification network for character recognition with the parameters of the trained symmetric network.

The second training unit 105 trains the initialized classification network with labeled samples containing characters.

According to this embodiment, sample pairs are constructed from unlabeled samples to train a symmetric network, a classification network is initialized with the parameters of the trained symmetric network, and the initialized classification network is trained with labeled samples. This improves the recognition accuracy of the classification network and effectively reduces labeling cost.

In this embodiment, the unlabeled samples and labeled samples containing characters may be obtained by conventional methods. For example, each sample may be obtained by segmenting an image that contains multiple characters with a conventional character segmentation method.

In this embodiment, the characters may be of any form, for example modern characters, Chinese characters from ancient documents, or characters of other languages. The embodiments of the present invention do not limit the type of character. A classification network for character recognition trained according to the embodiments may be used to recognize characters of any form and is not limited to special characters such as those in ancient documents.

In this embodiment, the numbers of unlabeled samples and labeled samples containing characters may be set according to the actual situation and are not limited here.

In this embodiment, the extraction unit 101 may extract character features directly from the unlabeled samples containing characters, or it may input the unlabeled samples into a network trained with labeled samples containing characters and use the output as the extracted features.

For example, the extraction unit 101 may use a conventional method to extract character features such as strokes and texture directly as the extracted features.

Alternatively, the extraction unit 101 may train a network with labeled samples, input the unlabeled samples into the trained network, and use the output as the extracted features. The network may be, for example, a convolutional neural network (CNN). The convolutional neural network may also be a classifier, in which case the extracted feature is the classification result of the input unlabeled sample.

In this embodiment, after the extraction unit 101 extracts the features of each unlabeled sample, the construction unit 102 constructs sample pairs based on those features. The configuration of the construction unit 102 and the method of constructing sample pairs are described below by way of example.

FIG. 2 is a diagram showing the construction unit 102 of Embodiment 1 of the present invention. As shown in FIG. 2, the construction unit 102 includes a first determination unit 201.

The first determination unit 201 determines first similar sample pairs and first dissimilar sample pairs based on the extracted features of the unlabeled samples.

In this embodiment, the numbers of first similar sample pairs and first dissimilar sample pairs determined by the first determination unit 201 may be set according to actual needs.

The configuration of the first determination unit 201 and the method of determining the first similar sample pairs and first dissimilar sample pairs are described below by way of example.

FIG. 3 is a diagram showing the first determination unit 201 of Embodiment 1 of the present invention. As shown in FIG. 3, the first determination unit 201 includes a second calculation unit 301, a fourth determination unit 302, and a fifth determination unit 303.

The second calculation unit 301 calculates the distance between the features of any two of the unlabeled samples.

The fourth determination unit 302 determines any two unlabeled samples whose feature distance is smaller than a predetermined threshold to be a first similar sample pair.

The fifth determination unit 303 determines any two unlabeled samples whose feature distance is equal to or greater than the predetermined threshold to be a first dissimilar sample pair.

In this embodiment, the second calculation unit 301 may use a conventional method to calculate the distance between the features of any two unlabeled samples, and the predetermined threshold may be set according to actual needs.
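As a rough illustration of the threshold rule described above, the following Python sketch labels every unordered pair of unlabeled samples as similar or dissimilar. The function name `build_pairs_by_distance`, the choice of Euclidean distance, and the toy feature vectors are assumptions made for illustration; the text only calls for a conventional distance and a threshold chosen according to actual needs.

```python
import math
from itertools import combinations


def build_pairs_by_distance(features, threshold):
    """Split all unordered sample pairs into similar and dissimilar pairs.

    features  : list of equal-length numeric feature vectors, one per sample
    threshold : pairs with distance < threshold are similar, all others dissimilar
    Returns two lists of (i, j) index pairs with i < j.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(features)), 2):
        # Euclidean distance is an assumption; any conventional metric could be used.
        distance = math.dist(features[i], features[j])
        if distance < threshold:
            similar.append((i, j))
        else:
            dissimilar.append((i, j))
    return similar, dissimilar


# Toy example: samples 0 and 1 are close together, sample 2 is far from both.
toy_features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
toy_similar, toy_dissimilar = build_pairs_by_distance(toy_features, threshold=1.0)
```

Enumerating all pairs grows quadratically with the number of samples, so a practical system would likely sample pairs or use a nearest-neighbor index instead; that refinement is outside what the text describes.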

In this embodiment, when the feature that the extraction unit 101 extracts for each unlabeled sample is the classification result of that sample, the first determination unit 201 may use another method to determine the first similar sample pairs and first dissimilar sample pairs.

FIG. 4 is another diagram showing the first determination unit 201 of Embodiment 1 of the present invention. As shown in FIG. 4, the first determination unit 201 includes a sixth determination unit 401 and a seventh determination unit 402.

The sixth determination unit 401 determines any two unlabeled samples with the same classification result to be a first similar sample pair.

The seventh determination unit 402 determines any two unlabeled samples with different classification results to be a first dissimilar sample pair.
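The classification-result rule can be sketched in the same way. Here `build_pairs_by_prediction` is a hypothetical helper name, and the predicted classes are toy values standing in for the output of a classifier trained on labeled samples.

```python
from itertools import combinations


def build_pairs_by_prediction(predicted_classes):
    """Pair samples by predicted class: same class is similar, different is dissimilar.

    predicted_classes : list with one predicted class per unlabeled sample
    Returns two lists of (i, j) index pairs with i < j.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(predicted_classes)), 2):
        if predicted_classes[i] == predicted_classes[j]:
            similar.append((i, j))
        else:
            dissimilar.append((i, j))
    return similar, dissimilar


# Toy predictions for three unlabeled samples (illustrative values only).
toy_predictions = ["山", "山", "川"]
pred_similar, pred_dissimilar = build_pairs_by_prediction(toy_predictions)
```

Because the pairs inherit any mistakes of the classifier that produced the predictions, some of them will be mislabeled.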

The methods by which the first determination unit 201 determines the first similar sample pairs and first dissimilar sample pairs have been described above by way of example, but the embodiments of the present invention are not limited to them.

In this embodiment, the construction unit 102 shown in FIG. 2 may further include a second determination unit 202, a first calculation unit 203, and a third determination unit 204.

The second determination unit 202 determines, based on the labels of the labeled samples, whether any two of the labeled samples form a second similar sample pair or a second dissimilar sample pair.

The first calculation unit 203 calculates the ratio of the number of determined second similar sample pairs to the number of determined second dissimilar sample pairs.

The third determination unit 204 determines the ratio of the number of first similar sample pairs to the number of first dissimilar sample pairs so that it equals the ratio of the number of second similar sample pairs to the number of second dissimilar sample pairs.

Determining the ratio of first similar to first dissimilar sample pairs of the unlabeled samples from the ratio of second similar to second dissimilar sample pairs of the labeled samples in this way can further improve the classification accuracy of the trained classification network.

In this embodiment, the second determination unit 202 determines, based on the labels of the labeled samples, whether any two of the labeled samples form a second similar sample pair or a second dissimilar sample pair. For example, any two labeled samples with the same label are determined to be a second similar sample pair, and any two labeled samples with different labels are determined to be a second dissimilar sample pair.

In this embodiment, the first calculation unit 203 calculates the ratio of the number of second similar sample pairs to the number of second dissimilar sample pairs determined by the second determination unit 202, and the third determination unit 204 determines the ratio of the number of first similar sample pairs to the number of first dissimilar sample pairs so that it equals that ratio.

For example, when the first determination unit 201 has already determined a sufficiently large number of first similar sample pairs and first dissimilar sample pairs, the third determination unit 204 selects first similar sample pairs and first dissimilar sample pairs from among them so that the ratio of their numbers equals the ratio of the number of second similar sample pairs to the number of second dissimilar sample pairs. Alternatively, the third determination unit 204 first determines the ratio of the number of first similar sample pairs to the number of first dissimilar sample pairs, and the first determination unit 201 then determines the first similar sample pairs and first dissimilar sample pairs according to that ratio.
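The first of these two options (select from an already large pool of candidate pairs) might look like the following Python sketch. The helper names `count_labeled_pairs` and `select_matching_ratio`, the random subsampling, and the toy data are all assumptions for illustration; the text does not say how the selection is carried out.

```python
import random
from itertools import combinations
from math import gcd


def count_labeled_pairs(labels):
    """Count similar (same label) and dissimilar (different label) labeled pairs."""
    n_similar = n_dissimilar = 0
    for a, b in combinations(labels, 2):
        if a == b:
            n_similar += 1
        else:
            n_dissimilar += 1
    return n_similar, n_dissimilar


def select_matching_ratio(similar, dissimilar, n_similar_ref, n_dissimilar_ref, seed=0):
    """Subsample candidate pairs so similar:dissimilar equals the reference ratio.

    Returns the largest selection whose counts are an exact integer multiple of
    the reduced reference ratio; both lists are empty if no such selection exists.
    """
    if n_similar_ref <= 0 or n_dissimilar_ref <= 0:
        raise ValueError("both reference counts must be positive")
    g = gcd(n_similar_ref, n_dissimilar_ref)
    a, b = n_similar_ref // g, n_dissimilar_ref // g   # reduced ratio a : b
    k = min(len(similar) // a, len(dissimilar) // b)   # largest usable multiple
    rng = random.Random(seed)                          # fixed seed for repeatability
    return rng.sample(similar, k * a), rng.sample(dissimilar, k * b)


# Labeled samples: 2 similar pairs and 8 dissimilar pairs, so the ratio is 1 : 4.
toy_labels = ["a", "a", "b", "b", "c"]
n_sim_ref, n_dis_ref = count_labeled_pairs(toy_labels)

# Candidate pairs from the unlabeled samples (placeholder index pairs).
candidate_similar = [(0, 1), (2, 3), (4, 5)]
candidate_dissimilar = [(i, i + 10) for i in range(10)]
chosen_similar, chosen_dissimilar = select_matching_ratio(
    candidate_similar, candidate_dissimilar, n_sim_ref, n_dis_ref
)
```

Matching the ratio exactly can discard many candidates when the pool sizes are awkward. Relaxing it to an approximate match would be a design choice beyond what the text specifies.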

In this embodiment, the first training unit 103 trains a symmetric network based on the sample pairs constructed by the construction unit 102. The symmetric network is, for example, a Siamese network having two symmetrically arranged convolutional neural networks (CNNs). FIG. 5 is a diagram showing the symmetric network of Embodiment 1 of the present invention. As shown in FIG. 5, the two convolutional neural networks CNN1 and CNN2 in the Siamese network 500 are arranged symmetrically. CNN1 and CNN2 may use a conventional CNN structure, and their structures and parameters are completely identical.

In this embodiment, the symmetric network may be trained on the constructed sample pairs by a conventional method. For example, the constructed sample pairs are input to the Siamese network pair by pair, with one sample of each pair input to CNN1 and the other to CNN2, and the loss of the Siamese network is calculated at its output from the contrastive loss between CNN1 and CNN2. This loss is then propagated back through the layers of CNN1 and CNN2 to correct the parameters of each layer. These steps are repeated until the loss of the Siamese network satisfies a predetermined convergence condition, at which point training ends.
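The text names a contrastive loss at the output of the Siamese network but gives no formula. The sketch below assumes one widely used margin-based form, in which similar pairs are penalized by their squared distance and dissimilar pairs only when they lie closer than a margin. The function names, the 0.5 scaling, the Euclidean distance, and the default margin are illustrative assumptions, and the embeddings stand in for the outputs of CNN1 and CNN2.

```python
import math


def contrastive_loss(emb1, emb2, is_similar, margin=1.0):
    """Margin-based contrastive loss for a single pair of embeddings (assumed form).

    similar pair    : 0.5 * d^2
    dissimilar pair : 0.5 * max(0, margin - d)^2
    where d is the Euclidean distance between the two embeddings.
    """
    distance = math.dist(emb1, emb2)
    if is_similar:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def mean_pair_loss(pairs, margin=1.0):
    """Average the loss over a non-empty list of (emb1, emb2, is_similar) triples."""
    return sum(contrastive_loss(a, b, y, margin) for a, b, y in pairs) / len(pairs)


# A similar pair that is far apart is penalized; a dissimilar pair that is
# already farther apart than the margin contributes nothing.
toy_pairs = [
    ((0.0, 0.0), (3.0, 4.0), True),
    ((0.0, 0.0), (3.0, 4.0), False),
]
toy_loss = mean_pair_loss(toy_pairs, margin=1.0)
```

In an actual training loop a deep learning framework would compute this loss on the two branch outputs and backpropagate it through the shared parameters. Only the loss itself is shown here.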

In this embodiment, the initialization unit 104 initializes the classification network for character recognition with the parameters of the trained symmetric network. For example, the classification network for character recognition is a convolutional neural network (CNN), which may use a conventional structure. The parameters of either one of the convolutional neural networks in the trained Siamese network are used to initialize the convolutional neural network that serves as the classification network.

In this embodiment, the parameters used for initialization may include the parameters of each convolutional layer of the convolutional neural network and may further include the parameters of the fully connected layers.
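A framework-free sketch of this initialization step is shown below, with each network represented as a plain dictionary that maps layer names to parameter lists. The `conv`/`fc` naming convention, the function name `init_classifier_from_branch`, and the `include_fc` switch are hypothetical, chosen only to mirror the two options mentioned in the text (convolutional layers only, or convolutional plus fully connected layers).

```python
import copy


def init_classifier_from_branch(branch_params, classifier_params, include_fc=False):
    """Copy trained Siamese-branch parameters into a classifier's parameter dict.

    Layers whose names start with "conv" are always copied; layers whose names
    start with "fc" are copied only if include_fc is True. Layers that exist
    only in the classifier (for example an output layer) keep their values.
    The inputs are not modified; a new dictionary is returned.
    """
    initialized = copy.deepcopy(classifier_params)
    for name, value in branch_params.items():
        if name not in initialized:
            continue
        is_conv = name.startswith("conv")
        is_fc = name.startswith("fc")
        if is_conv or (include_fc and is_fc):
            initialized[name] = copy.deepcopy(value)
    return initialized


# Toy parameters: one branch of a trained Siamese network and a fresh classifier.
trained_branch = {"conv1": [1.0, 2.0], "fc1": [3.0]}
fresh_classifier = {"conv1": [0.0, 0.0], "fc1": [0.0], "out": [0.0]}

conv_only = init_classifier_from_branch(trained_branch, fresh_classifier)
conv_and_fc = init_classifier_from_branch(trained_branch, fresh_classifier, include_fc=True)
```

Since CNN1 and CNN2 share identical parameters, it does not matter which branch supplies `trained_branch`.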

In this embodiment, the second training unit 105 trains the initialized classification network with labeled samples containing characters to obtain the trained classification network for character recognition. The second training unit 105 may use a conventional method to train the initialized classification network.

For example, labeled samples containing characters are input one by one to the initialized convolutional neural network, and the network loss is calculated at the output. The loss is then propagated back through the layers of the convolutional neural network to correct the parameters of each layer. These steps are repeated until the network loss of the convolutional neural network satisfies a predetermined convergence condition, at which point training ends.

In this embodiment, the training apparatus 100 shown in FIG. 1 may further include a judgment unit 106.

The judgment unit 106 judges whether the trained classification network satisfies a predetermined condition. If it does not, the trained classification network is used to extract the features of each unlabeled sample containing a character; if it does, the trained classification network is output.

In this embodiment, the judgment unit 106 is an optional component and is shown with a dashed frame in FIG. 1.

In this embodiment, the predetermined condition may be set according to actual needs. For example, the condition may be that the number of iterations has reached a predetermined count, or that the classification accuracy of the trained classification network has converged, that is, the difference between the classification accuracy of the currently trained classification network and that of the previously trained classification network is smaller than a predetermined threshold.

Thus, when the currently trained classification network does not satisfy the predetermined condition, it is used to extract the features of each unlabeled sample containing a character. That is, until the trained classification network satisfies the predetermined condition, the extraction unit 101 extracts the features of each unlabeled sample with the currently trained classification network, the sample pairs are reconstructed, the symmetric network is trained, the classification network is initialized, and the classification network is trained. This iterative processing can further improve the recognition accuracy of the trained classification network.
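The control flow of this iteration can be sketched as follows. Every stage is passed in as a callable, so the sketch shows only the loop and the two example stopping conditions (a maximum number of rounds and convergence of the accuracy); the function name `train_iteratively`, its parameters, and the stub stages in the usage example are all hypothetical.

```python
def train_iteratively(extract, build_pairs, train_symmetric, init_classifier,
                      train_classifier, evaluate, max_rounds=5, tol=1e-3):
    """Repeat the extract / pair / pre-train / initialize / train cycle.

    Stops when the accuracy changes by less than tol between consecutive rounds
    or when max_rounds is reached. In the first round extract() receives None,
    meaning "use the initial feature extractor" (an assumption of this sketch).
    Returns the last trained classifier and the number of rounds performed.
    """
    classifier = None
    previous_accuracy = None
    rounds_done = 0
    for rounds_done in range(1, max_rounds + 1):
        features = extract(classifier)             # extraction unit 101
        pairs = build_pairs(features)              # construction unit 102
        symmetric = train_symmetric(pairs)         # first training unit 103
        initialized = init_classifier(symmetric)   # initialization unit 104
        classifier = train_classifier(initialized) # second training unit 105
        accuracy = evaluate(classifier)            # judgment unit 106
        if previous_accuracy is not None and abs(accuracy - previous_accuracy) < tol:
            break
        previous_accuracy = accuracy
    return classifier, rounds_done


# Usage with stub stages: the "classifier" is just a round counter, and the
# accuracies are made-up numbers that level off in the third round.
stub_accuracies = [0.80, 0.90, 0.9005, 0.95]
final_classifier, rounds_used = train_iteratively(
    extract=lambda clf: clf,
    build_pairs=lambda feats: feats,
    train_symmetric=lambda pairs: pairs,
    init_classifier=lambda sym: sym,
    train_classifier=lambda init: (init or 0) + 1,
    evaluate=lambda clf: stub_accuracies[clf - 1],
)
```

With these stub values the loop stops after the third round, because the accuracy changes by less than `tol` between the second and third rounds.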

<Example 2>
An embodiment of the present invention further provides a character recognition device including a classification network for character recognition trained by the training device described in Example 1.

FIG. 6 is a diagram showing the character recognition device of Example 2 of the present invention. As shown in FIG. 6, the character recognition device 600 includes a classification network 601 for character recognition.

The classification network 601 is trained by a training device. The configuration and functions of the training device are the same as those described in Example 1, and their description is omitted here.

For example, the character recognition device includes a storage unit, which stores the classification network for character recognition trained by the training device described in Example 1.

<Example 3>
An embodiment of the present invention further provides an electronic device. FIG. 7 is a diagram showing the electronic device of Example 3 of the present invention. As shown in FIG. 7, the electronic device 700 includes a training device 701 or a character recognition device 702. The configuration and functions of the training device 701 are the same as those described in Example 1, the configuration and functions of the character recognition device 702 are the same as those described in Example 2, and their description is omitted here.

FIG. 8 is a block diagram showing the system configuration of the electronic device of Example 3 of the present invention. As shown in FIG. 8, the electronic device 800 may include a central processing unit (central control unit) 801 and a storage device 802, the storage device 802 being connected to the central processing unit 801. The figure is merely exemplary; other types of configurations may be used to supplement or replace this configuration in order to implement telecommunication functions or other functions.

As shown in FIG. 8, the electronic device 800 may further include an input unit 803, a display 804, and a power supply 805.

In one aspect, the functions of the training device of Example 1 may be integrated into the central processing unit 801. Here, the central processing unit 801 may be configured to: extract the features of each unlabeled sample containing characters; construct sample pairs based on the extracted features of each unlabeled sample; train a symmetric network based on the constructed sample pairs; initialize a classification network for character recognition using the parameters of the trained symmetric network; and train the initialized classification network using labeled samples containing characters.

Here, the central processing unit 801 may be further configured to: determine whether the trained classification network satisfies a predetermined condition; if the trained classification network does not satisfy the predetermined condition, extract the features of each unlabeled sample containing the characters using the trained classification network; and, if the trained classification network satisfies the predetermined condition, output the trained classification network.

Here, the step of extracting the features of each unlabeled sample containing the characters includes: extracting character features directly from the unlabeled samples containing characters; or inputting the unlabeled samples into a network trained with labeled samples containing characters and using the output as the extracted features.

Here, the step of constructing sample pairs based on the extracted features of each unlabeled sample includes determining first similar sample pairs and first dissimilar sample pairs based on the extracted features of each unlabeled sample. The step may further include: determining, based on the labels of the labeled samples, any two of the labeled samples as a second similar sample pair or a second dissimilar sample pair; calculating the ratio of the number of determined second similar sample pairs to the number of second dissimilar sample pairs; and determining the ratio of the number of first similar sample pairs to the number of first dissimilar sample pairs so that it equals the ratio of the number of second similar sample pairs to the number of second dissimilar sample pairs.
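The ratio matching described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the text does not say how the first ratio is brought into agreement with the second, so the truncation of the pair lists used here is only one possible approach (adjusting the distance threshold would be another), and the function names `labeled_pair_ratio` and `match_ratio` are hypothetical.

```python
import math
from itertools import combinations

def labeled_pair_ratio(labels):
    """Count second similar / second dissimilar pairs among labeled samples."""
    n_sim = n_dis = 0
    for a, b in combinations(labels, 2):
        if a == b:
            n_sim += 1   # same label: second similar sample pair
        else:
            n_dis += 1   # different labels: second dissimilar sample pair
    return n_sim, n_dis

def match_ratio(similar, dissimilar, n_sim, n_dis):
    """Trim the first (unlabeled) pair lists so that
    len(similar) / len(dissimilar) equals n_sim / n_dis.

    Assumption: pairs are simply dropped from the end of each list.
    """
    if n_sim == 0 and n_dis == 0:
        raise ValueError("at least two labeled samples are required")
    g = math.gcd(n_sim, n_dis)
    a, b = n_sim // g, n_dis // g      # reduced target ratio a : b
    if a == 0:
        return [], list(dissimilar)
    if b == 0:
        return list(similar), []
    k = min(len(similar) // a, len(dissimilar) // b)
    return list(similar[:k * a]), list(dissimilar[:k * b])
```

For example, four labeled samples with labels a, a, b, c yield one similar and five dissimilar pairs, so four candidate similar pairs and twelve candidate dissimilar pairs from the unlabeled samples would be trimmed to two and ten.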

Here, the step of determining the first similar sample pairs and the first dissimilar sample pairs based on the extracted features of each unlabeled sample includes: calculating the distance between the features of any two of the unlabeled samples; determining any two unlabeled samples whose feature distance is smaller than a predetermined threshold as a first similar sample pair; and determining any two unlabeled samples whose feature distance is equal to or greater than the predetermined threshold as a first dissimilar sample pair.
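The distance-based pair construction above can be illustrated with a short Python sketch. The Euclidean distance is an assumption made for the example, since the embodiment only refers to a conventional method for the distance; the function name `build_pairs_by_distance` is likewise hypothetical.

```python
import math
from itertools import combinations

def build_pairs_by_distance(features, threshold):
    """Split all pairs of unlabeled samples into similar and dissimilar pairs.

    features  -- list of equal-length numeric feature vectors
    threshold -- predetermined threshold on the feature distance
    Returns two lists of index pairs (i, j) with i < j.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(features)), 2):
        # Euclidean distance (an assumed choice of distance measure).
        d = math.dist(features[i], features[j])
        if d < threshold:
            similar.append((i, j))      # first similar sample pair
        else:
            dissimilar.append((i, j))   # distance >= threshold
    return similar, dissimilar
```

A pair whose distance is exactly equal to the threshold falls into the dissimilar group, matching the "equal to or greater than" wording of the text.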

Here, the extracted feature of each unlabeled sample is the classification result of that unlabeled sample, and the step of determining the first similar sample pairs and the first dissimilar sample pairs based on the extracted features of each unlabeled sample includes: determining any two unlabeled samples having the same classification result as a first similar sample pair; and determining any two unlabeled samples having different classification results as a first dissimilar sample pair.
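When the extracted feature is the classification result itself, the pair construction reduces to an equality test, as in the following minimal Python sketch. The function name `build_pairs_by_prediction` and the representation of classification results as a plain list are assumptions made for illustration.

```python
from itertools import combinations

def build_pairs_by_prediction(predicted_labels):
    """Pair unlabeled samples according to their classification results.

    predicted_labels -- classification result of each unlabeled sample
    Returns two lists of index pairs (i, j) with i < j.
    """
    similar, dissimilar = [], []
    for i, j in combinations(range(len(predicted_labels)), 2):
        if predicted_labels[i] == predicted_labels[j]:
            similar.append((i, j))      # same classification result
        else:
            dissimilar.append((i, j))   # different classification results
    return similar, dissimilar
```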

Here, the symmetric network is a Siamese network having two symmetrically arranged convolutional neural networks, and the classification network for character recognition is a convolutional neural network.

Here, the step of initializing the classification network for character recognition using the parameters of the trained symmetric network includes initializing the convolutional neural network serving as the classification network using the parameters of either one of the convolutional neural networks in the trained Siamese network.
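The initialization described above amounts to copying the parameters of one branch into the classification network. The following framework-free Python sketch shows the idea with parameters held in plain dictionaries. The layer names, the dictionary representation, and the decision to leave layers that exist only in the classifier (such as an output layer) at their own initial values are all assumptions for illustration and are not taken from the embodiment.

```python
import copy

def init_classifier_from_siamese(siamese_branches, classifier_params, branch=0):
    """Initialize classifier parameters from one branch of a Siamese network.

    siamese_branches  -- list of two dicts mapping layer name -> parameters
    classifier_params -- dict mapping layer name -> parameters
    branch            -- which branch to copy from (either may be used)
    Returns a new dict; the inputs are not modified.
    """
    source = siamese_branches[branch]
    initialized = dict(classifier_params)
    for name, value in source.items():
        if name in initialized:
            # Deep copy so later training of the classifier does not
            # change the stored Siamese parameters.
            initialized[name] = copy.deepcopy(value)
    # Layers present only in the classifier keep their own initial values
    # (an assumption; the text does not address such layers).
    return initialized
```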

In another aspect, the training device described in Example 1 may be configured separately from the central processing unit 801. For example, the training device may be a chip connected to the central processing unit 801, and the functions of the training device may be realized under the control of the central processing unit 801.

The electronic device 800 in this embodiment need not include all the components shown in FIG. 8.

As shown in FIG. 8, the central processing unit 801, also referred to as a controller or an operation control unit, may include a microprocessor or other processing device and/or logic device. The central processing unit 801 receives input and controls the operation of each part of the electronic device 800.

The storage device 802 may be, for example, one or more of a buffer, a flash memory, a hard disk, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. The central processing unit 801 may execute a program stored in the storage device 802 to store or process information. The other components are similar to those of the prior art, and their description is omitted here. Each part of the electronic device 800 may be realized by dedicated hardware, firmware, software, or a combination thereof without departing from the scope of the present invention.

<Example 4>
An embodiment of the present invention further provides a method for training a classification network for character recognition, which corresponds to the training device for a classification network for character recognition of Example 1. FIG. 9 is a diagram showing the method for training a classification network for character recognition of Example 4 of the present invention. As shown in FIG. 9, the method includes the following steps.

Step 901: Extract the features of each unlabeled sample containing characters.

Step 902: Construct sample pairs based on the extracted features of each unlabeled sample.

Step 903: Train a symmetric network based on the constructed sample pairs.

Step 904: Initialize a classification network for character recognition using the parameters of the trained symmetric network.

Step 905: Train the initialized classification network using labeled samples containing characters.

Step 906: Determine whether the trained classification network satisfies a predetermined condition. If the result is "NO", return to step 901 and extract the features of each unlabeled sample containing characters using the trained classification network. If the result is "YES", proceed to step 907.

Step 907: Output the trained classification network.
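The control flow of steps 901 to 907 can be sketched in Python as a loop. This is only an outline under stated assumptions: every callable passed in (`extract`, `build_pairs`, `train_symmetric`, `init_classifier`, `train_classifier`, `is_done`) is a hypothetical placeholder for the corresponding processing of Example 1, and none of these names appears in the embodiment.

```python
def train_classification_network(unlabeled, labeled, extract, build_pairs,
                                 train_symmetric, init_classifier,
                                 train_classifier, is_done, classifier=None):
    """Iterate steps 901-906 until the predetermined condition is met.

    All callables are hypothetical placeholders supplied by the caller.
    On the first round `classifier` is None, so `extract` must fall back
    to some other feature extraction (for example, direct extraction).
    """
    round_index = 0
    while True:
        features = extract(unlabeled, classifier)           # step 901
        pairs = build_pairs(features)                       # step 902
        symmetric = train_symmetric(pairs)                  # step 903
        classifier = init_classifier(symmetric)             # step 904
        classifier = train_classifier(classifier, labeled)  # step 905
        round_index += 1
        if is_done(round_index, classifier):                # step 906
            return classifier                               # step 907
```

Passing the stopping test in as `is_done` keeps the loop independent of whether an iteration count or an accuracy-convergence criterion is used.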

In this embodiment, the method of extracting features, the method of constructing sample pairs, the method of training the symmetric network, the method of initializing the classification network, the method of training the classification network, and the method of determining whether the trained classification network satisfies the predetermined condition are the same as those described in Example 1, and their description is omitted here.

An embodiment of the present invention further provides a computer-readable program which, when executed in a training device for a classification network for character recognition or in an electronic device, causes a computer to execute the training method described in Example 4 in the training device or the electronic device.

An embodiment of the present invention further provides a storage medium storing a computer-readable program for causing a computer to execute the training method described in Example 4 in a training device for a classification network for character recognition or in an electronic device.

The training method executed in the training device for a classification network for character recognition described with reference to the embodiments of the present invention may be implemented in hardware, in a software module executed by a processor, or in a combination of both. For example, one or more of the functional blocks shown in FIG. 1, or one or more combinations of those functional blocks, may correspond to software modules of a computer program flow or to hardware modules. These software modules may correspond to the respective steps shown in FIG. 9. These hardware modules may be realized by implementing the software modules in hardware using, for example, a field-programmable gate array (FPGA).

The software module may be located in RAM, flash memory, ROM, EPROM, EEPROM, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known to those skilled in the art. The storage medium may be connected to the processor so that the processor can read information from and write information to the storage medium, or the storage medium may be a component of the processor. The processor and the storage medium may be located in an ASIC. The software module may be stored in the memory of a mobile terminal or on a memory card inserted into the mobile terminal. For example, when a device (for example, a mobile terminal) uses a relatively large-capacity MEGA-SIM card or a large-capacity flash memory device, the software module may be stored on the MEGA-SIM card or the large-capacity flash memory device.

One or more of the functional blocks shown in FIG. 1, and/or one or more combinations of those functional blocks, may be realized by a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any suitable combination thereof for performing the functions described in the present application. They may also be realized by a combination of computing devices, for example a combination of a DSP and a microprocessor, a combination of multiple microprocessors, one or more microprocessors in communication with a DSP, or any other such configuration.

The present invention has been described above with reference to specific embodiments, but the above description is merely illustrative and does not limit the scope of protection of the present invention. Various modifications and changes may be made to the present invention without departing from its spirit and principles, and such modifications and changes also fall within the scope of the present invention.

In this embodiment, the second calculation unit 301 may calculate the distance between the features of any two unlabeled samples using a conventional method, and the predetermined threshold may be set according to actual requirements.

Claims

A classification network training device for character recognition, comprising:
extraction means for extracting features of each unlabeled sample containing characters;
construction means for constructing sample pairs based on the extracted features of each unlabeled sample;
first training means for training a symmetric network based on the constructed sample pairs;
initialization means for initializing a classification network for character recognition using parameters of the trained symmetric network; and
second training means for training the initialized classification network using labeled samples containing characters.

The apparatus according to claim 1, further comprising determination means for determining whether the trained classification network satisfies a predetermined condition, wherein, if the trained classification network does not satisfy the predetermined condition, the features of each unlabeled sample containing the characters are extracted using the trained classification network, and, if the trained classification network satisfies the predetermined condition, the trained classification network is output.

The apparatus according to claim 1, wherein the extraction means
extracts character features directly from the unlabeled samples containing characters, or
inputs the unlabeled samples into a network trained with labeled samples containing characters and uses the output as the extracted features.

The apparatus according to claim 1, wherein the construction means comprises
first determining means for determining a first similar sample pair and a first dissimilar sample pair based on the extracted features of each unlabeled sample.

The apparatus according to claim 4, wherein the construction means further comprises:
second determining means for determining any two of the labeled samples as a second similar sample pair or a second dissimilar sample pair based on the labels of the labeled samples;
first calculating means for calculating a ratio of the number of the determined second similar sample pairs to the number of the second dissimilar sample pairs; and
third determining means for determining a ratio of the number of the first similar sample pairs to the number of the first dissimilar sample pairs so that it is equal to the ratio of the number of the second similar sample pairs to the number of the second dissimilar sample pairs.

The apparatus according to claim 4, wherein the first determining means comprises:
second calculating means for calculating a distance between the features of any two of the extracted unlabeled samples;
fourth determining means for determining any two unlabeled samples whose distance between features is smaller than a predetermined threshold as the first similar sample pair; and
fifth determining means for determining any two unlabeled samples whose distance between features is equal to or greater than the predetermined threshold as the first dissimilar sample pair.

The apparatus according to claim 4, wherein the feature extracted by the extraction means is a classification result of the unlabeled sample, and
the first determining means comprises:
sixth determining means for determining any two unlabeled samples having the same classification result as the first similar sample pair; and
seventh determining means for determining any two unlabeled samples having different classification results as the first dissimilar sample pair.

The symmetric network is a Siamese network having two convolutional neural networks arranged symmetrically,
The apparatus of claim 1, wherein the classification network for character recognition is a convolutional neural network.

The apparatus according to claim 8, wherein the initialization means initializes the convolutional neural network serving as the classification network using parameters of either one of the convolutional neural networks in the trained Siamese network.

A character recognition device comprising a classification network for character recognition trained by the device according to claim 1.

A classification network training method for character recognition, comprising:
extracting features of each unlabeled sample containing characters;
constructing sample pairs based on the extracted features of each unlabeled sample;
training a symmetric network based on the constructed sample pairs;
initializing a classification network for character recognition using parameters of the trained symmetric network; and
training the initialized classification network using labeled samples containing characters.

The method of claim 11, further comprising:
determining whether the trained classification network satisfies a predetermined condition;
if the trained classification network does not satisfy the predetermined condition, extracting the features of each unlabeled sample containing the characters using the trained classification network; and
if the trained classification network satisfies the predetermined condition, outputting the trained classification network.

The method of claim 11, wherein extracting the features of each unlabeled sample containing the characters comprises:
extracting character features directly from the unlabeled samples containing characters, or
inputting the unlabeled samples into a network trained with labeled samples containing characters and using the output as the extracted features.

The method of claim 11, wherein constructing sample pairs based on the extracted features of each unlabeled sample comprises
determining a first similar sample pair and a first dissimilar sample pair based on the extracted features of each unlabeled sample.

The method of claim 14, wherein constructing sample pairs based on the extracted features of each unlabeled sample further comprises:
determining any two of the labeled samples as a second similar sample pair or a second dissimilar sample pair based on the labels of the labeled samples;
calculating a ratio of the number of the determined second similar sample pairs to the number of the second dissimilar sample pairs; and
determining a ratio of the number of the first similar sample pairs to the number of the first dissimilar sample pairs so that it is equal to the ratio of the number of the second similar sample pairs to the number of the second dissimilar sample pairs.

Determining the first similar sample pair and the first dissimilar sample pair based on the extracted features of each unlabeled sample comprises:
calculating a distance between the features of any two of the extracted unlabeled samples;
determining any two unlabeled samples whose distance between features is smaller than a predetermined threshold as the first similar sample pair; and
determining any two unlabeled samples whose distance between features is equal to or greater than the predetermined threshold as the first dissimilar sample pair.

The extracted feature of each unlabeled sample is a classification result of the unlabeled sample, and
determining the first similar sample pair and the first dissimilar sample pair based on the extracted features of each unlabeled sample comprises:
determining any two unlabeled samples having the same classification result as the first similar sample pair; and
determining any two unlabeled samples having different classification results as the first dissimilar sample pair.

The symmetric network is a Siamese network having two convolutional neural networks arranged symmetrically,
The method according to claim 11, wherein the classification network for character recognition is a convolutional neural network.

The method of claim 18, wherein initializing the classification network for character recognition using the parameters of the trained symmetric network comprises
initializing the convolutional neural network serving as the classification network using parameters of either one of the convolutional neural networks in the trained Siamese network.