TWI786946B - Method for detection and recognition of characters on the surface of metal - Google Patents


Info

Publication number: TWI786946B (application TW110142347A)
Authority: TW (Taiwan)
Other versions: TW202321982A (Chinese, zh)
Inventors: 林義隆, 呂偉嘉, 徐紹恩
Applicant/Assignee: 國立雲林科技大學
Priority: TW110142347A
Status: Application granted
Prior art keywords: bounding box, training, image, character
Abstract

A method for detection and recognition of characters on the surface of metal includes: a recognized-character establishment step, which builds a database of pre-recognized characters; an image capturing step, which obtains training images of the surfaces of a plurality of metal products; an image augmentation step, which expands the data to increase the number of training-image samples; an input-image limitation step, which adjusts and standardizes the size of the input training images; an image-feature extraction step, which establishes a learning model and corrects it with class samples; a bounding-box definition step, which defines the range to be recognized and generates a bounding box and a first target parameter; a character-content training step, which extracts the characters inside the bounding box for training and generates a second target parameter; and an error-correction and string-generation step, in which the first and second target parameters are combined, errors are corrected, and the recognized characters are assembled into the output text string. The invention can perform machine learning from a small amount of diverse data and achieves accurate recognition without requiring precise alignment of the characters.

Description

Method for Character Recognition on the Surface of Metal Products

The present invention relates to a character recognition method, in particular one applied to the surface of metal products, which effectively reduces recognition failures caused by surface reflections or by characters that are too small.

Characters on the surface of metal products, such as those on valve products, are mostly serial numbers related to the product model. Because most metal valve workpieces are small in volume or surface area, counting them is laborious and inefficient for workers and warehouse staff. Traditional optical or image recognition methods likewise suffer large recognition errors and inaccuracy, whether operated manually or automatically, because of the reflective nature of metal surfaces.

Taiwan Patent Publication No. I703508 discloses a character-image recognition method and system for documents. To reduce input errors and improve input efficiency for paper medical certificates and related documents, it improves the recognition rate when the characters on such documents are blurred or stained.

Taiwan Patent Publication No. I723541 provides a sheet-metal identification method comprising the following steps: receiving a sheet-metal image; normalizing the image to obtain a converted image file; obtaining image feature-point data from the converted image file; and comparing the feature-point data against a plurality of sheet-metal feature models in a database to obtain at least one target sheet-metal feature model. Sheet metal can thereby be identified automatically, addressing the inefficiency of manual identification.

However, the prior art above does not address recognition of characters on the surfaces of metal products, in particular on common metal valve parts, such as product model numbers and serial numbers. The relatively small fonts, metal reflections and shadows, and the need for precise alignment before machine recognition make manual recognition difficult and machine recognition error-prone. In view of this, the prior art needs improvement.

One object of the present invention is to provide a method for recognizing characters on the surface of metal products that can recognize characters accurately without precise alignment.

Another object of the present invention is to provide a method for recognizing characters on the surface of metal products that can dynamically adjust the length of the recognized character string.

To achieve the above and other objects, the character recognition method of the present invention comprises: a recognition-character establishment step, which builds a database of pre-recognized characters; an image capturing step, which obtains training images of the surfaces of a plurality of metal products and crops them; an image augmentation step, which expands the number of training-image samples by data augmentation; an input-image limitation step, which adjusts and standardizes the size of the input training images; an image-feature extraction step, which establishes a learning model and corrects it with class samples; a bounding-box definition step, which defines the range to be recognized and generates a bounding box and a first target parameter; a character-content training step, which extracts the characters inside the bounding box for training and generates a second target parameter; and an error-correction and string-generation step, which combines the first and second target parameters, performs error correction, and recognizes the characters to produce the output text string.

In some embodiments of the present invention, the image augmentation step uses contrast adjustment to increase the number of training-image samples.

In some embodiments of the present invention, the image augmentation step uses brightness adjustment to increase the number of training-image samples.

In some embodiments of the present invention, the image-feature extraction step uses a Prototypical Network as the initial training model, comprising two convolutional layers and two pooling layers, in the order convolutional layer, pooling layer, convolutional layer, pooling layer.

In some embodiments of the present invention, the class samples of the image-feature extraction step are divided into two data sets, a support set and a query set.

In some embodiments of the present invention, the image-feature extraction step further comprises: feeding the support set into the Prototypical Network to output prototype features; feeding the query set into the Prototypical Network to output the model's learning parameters; and repeatedly feeding both sets into the network, correcting the output error, and feeding the corrected error back into the network for loop training until convergence.

In some embodiments of the present invention, the bounding box delimits the region of the characters to be recognized. The bounding-box definition step passes the output features of the Prototypical Network through three fully connected layers, generates the first target parameter using the mean-squared-error method, and corrects the parameters using the steepest gradient method.

In some embodiments of the present invention, the mean squared error of the first target parameter is

MSE = [(x_d − x_p)² + (y_d − y_p)² + (w_d − w_p)² + (h_d − h_p)²] / 4

where w_d is the width of the target bounding box, h_d its height, w_p the width of the predicted bounding box, h_p its height, (x_d, y_d) the center-point coordinates of the target bounding box, and (x_p, y_p) the center-point coordinates of the predicted bounding box. By adjusting the mean squared error until x_p = x_d, y_p = y_d, w_p = w_d and h_p = h_d, the predicted bounding box is made to overlap the target bounding box, so that all characters to be recognized fall within the recognition area.

In some embodiments of the present invention, the first-target-parameter update formula of the steepest gradient method is

θ_{k+1} = θ_k − ∇_θ L(θ)

where θ is the parameter being corrected, k is the correction step index, and ∇_θ L(θ) is the derivative of the loss with respect to the parameter.

In some embodiments of the present invention, the character-content training step trains with two fully connected layers and two bidirectional long short-term memory (BiLSTM) layers, in the order fully connected layer, BiLSTM layer, BiLSTM layer, fully connected layer, and computes the loss with connectionist temporal classification (CTC).

FIG. 1 is a flow chart of one embodiment of the character recognition method of the present invention; please refer to FIG. 1. The method of this embodiment comprises: a recognition-character establishment step (S0), an image capturing step (S1), an image augmentation step (S2), an input-image limitation step (S3), an image-feature extraction step (S4), a bounding-box definition step (S5), a character-content training step (S6), and an error-correction and string-generation step (S7).

The recognition-character establishment step (S0) builds a database of pre-recognized characters. The database can be built from whichever character set is actually to be recognized, such as Latin letters, Arabic numerals, Japanese characters or other symbols. In this embodiment, the 26 English letters (A–Z), the 10 Arabic numerals (0–9) and 1 blank (space), 37 character classes in total, form the recognition-character data of the database, against which the occurrence probability of each character is compared during subsequent training.
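
As an illustrative sketch only (not part of the patent), the 37-class character set described above can be built in a few lines; the name `CHARSET` is our own.

```python
import string

# 26 uppercase letters + 10 digits + 1 blank (space) = 37 character classes
CHARSET = list(string.ascii_uppercase) + list(string.digits) + [" "]
```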

The image capturing step (S1) obtains training images of the surfaces of multiple metal products and deletes unnecessary image regions to increase machine-learning efficiency. In this embodiment, step (S1) obtains 2,700 training images, cropped to 500×200 pixels.

The image augmentation step (S2) increases the number of image training samples by data augmentation. This embodiment follows a small-but-diverse training approach: a small amount of sample data is augmented to enlarge the data set and build the image-sample classes, saving the time of collecting large numbers of samples.

Preferably, the image augmentation step (S2) can increase the training samples by contrast adjustment. In one embodiment, the brightness value f(x, y) of every pixel (x, y) in the image is multiplied by a contrast parameter a, which in this embodiment varies over the range −1.1 to 1.1. This adjustment is equivalent to changing the slope of the linear equation g(x, y) = a·f(x, y) + b, with b a constant, widening or narrowing the range of the image's color distribution.

Preferably, the image augmentation step (S2) can also increase the training samples by brightness adjustment. In one embodiment, the brightness value of every pixel in the image is shifted by an offset b in the range −30 to 30. This adjustment is equivalent to changing the intercept of the linear equation g(x, y) = a·f(x, y) + b, where b is a constant applied to every pixel, raising or lowering the image's entire color distribution.
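
A minimal sketch (not the patent's implementation) of the two augmentations above, treating both as the linear map g = a·f + b on pixel values, clipped to the usual 0–255 range; the function name and defaults are our own assumptions.

```python
def adjust_pixels(pixels, a=1.0, b=0.0):
    """Linear pixel transform g = a*f + b.

    a != 1 changes the slope (contrast adjustment); b != 0 shifts the
    intercept (brightness adjustment). Results are clipped to [0, 255].
    """
    return [max(0.0, min(255.0, a * p + b)) for p in pixels]

# Contrast: multiply every brightness value by the contrast parameter a.
contrast_halved = adjust_pixels([100, 200], a=0.5)   # [50.0, 100.0]

# Brightness: shift every brightness value by a constant offset b.
brightened = adjust_pixels([10, 250], b=30)          # [40.0, 255.0]
```

Applying each transform with several parameter values to the same source image is what multiplies the number of training samples.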

The input-image limitation step (S3) adjusts and standardizes the size of the input training images: the images produced by the augmentation step (S2) are resized so that every image fed to the training model has a single standardized size, which facilitates model training.

FIG. 2 is a schematic diagram of 3 classes with 2 samples each for one embodiment of the method of the present invention, and FIG. 3 is a schematic diagram of its Prototypical Network training; please refer to FIG. 2 and FIG. 3. The image-feature extraction step (S4) establishes a learning model and corrects it with class samples; preferably, the model is built from convolutional and pooling layers and corrected through the class samples. Step (S4) uses a Prototypical Network as the initial training model for few-shot, diverse learning. In this embodiment, the 2,700 training images obtained in the image capturing step (S1) are divided into 27 classes of training-image sample sets. The initial training model contains two convolutional layers and two pooling layers, four layers in all, in the order convolutional layer, pooling layer, convolutional layer, pooling layer.
Then 10 classes are randomly sampled from the 27 classes, and 10 image samples are selected from each of those 10 classes, 100 image samples in total. These are divided into two data sets, a support set and a query set, each taking 5 image samples from each of the 10 classes, i.e., 50 image samples per set. The Prototypical Network derives its class features from the support set.
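
The patent does not state the kernel sizes of the four layers. As an illustrative sketch only, the following helper tracks the feature-map size through an assumed conv–pool–conv–pool stack (3×3 convolutions with padding 1 and stride 1, 2×2 pools with stride 2) applied to the 200×500-pixel crops mentioned earlier.

```python
def feature_map_size(h, w, layers):
    """Apply the standard output-size formula
    out = (in + 2*pad - kernel) // stride + 1
    for each (kernel, stride, pad) layer, in order, to both dimensions."""
    for kernel, stride, pad in layers:
        h = (h + 2 * pad - kernel) // stride + 1
        w = (w + 2 * pad - kernel) // stride + 1
    return h, w

# conv(3x3, s1, p1) -> pool(2x2, s2) -> conv(3x3, s1, p1) -> pool(2x2, s2)
STACK = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]
```

Under these assumed kernels, each pooling layer halves both dimensions, so a 200×500 crop comes out as a 50×125 feature map.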

Feeding the support set into the Prototypical Network yields outputs that establish the prototype features of each sample class. A prototype feature is a representative feature of the images in a class: the mean of the features of the same class is that class's prototype, and this feature mean is the so-called feature centroid.

Feeding the query set into the Prototypical Network yields outputs whose error against the class prototypes is used to update the network's learning parameters. The support set and the query set are then repeatedly fed into the network, the output error is corrected, and the corrected error is fed back into the network for loop training until convergence.

FIG. 4 is a schematic diagram of the bounding box in one embodiment of the method of the present invention, and FIG. 5 is a schematic diagram of its bounding-box generation and character training; please refer to FIG. 4 and FIG. 5. The bounding-box definition step (S5) uses the result of the image-feature extraction step (S4) to delimit the range to be recognized and generates a bounding box and a first target parameter. The bounding box delimits the region of the characters to be recognized: step (S5) passes the output features of the Prototypical Network through three fully connected (Dense) layers, generates the first target parameter with the mean squared error (MSE), and corrects the parameters with the steepest gradient method. The mean squared error of the first target parameter is

MSE = [(x_d − x_p)² + (y_d − y_p)² + (w_d − w_p)² + (h_d − h_p)²] / 4

where w_d is the width of the target bounding box, h_d its height, w_p the width of the predicted bounding box, h_p its height, (x_d, y_d) the center-point coordinates of the target bounding box, and (x_p, y_p) those of the predicted bounding box. By adjusting the mean squared error until x_p = x_d, y_p = y_d, w_p = w_d and h_p = h_d, i.e., until the predicted bounding box overlaps the target bounding box, all characters to be recognized are enclosed within the recognition area, avoiding missed characters or misjudgments caused by incomplete characters. The first target parameter is corrected with

θ_{k+1} = θ_k − ∇_θ L(θ)

where θ is the parameter being corrected, k the correction step index, and ∇_θ L(θ) the derivative of the loss with respect to the parameter.
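
A small sketch of the loss and update above, with each box encoded as (x, y, w, h); the function names and the explicit learning-rate argument are our own assumptions, not the patent's.

```python
def bbox_mse(pred, target):
    """Mean squared error over the four box parameters (x, y, w, h)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / 4.0

def gradient_step(pred, target, lr=1.0):
    """One steepest-descent update: theta <- theta - lr * dMSE/dtheta.

    For the MSE above, dMSE/dp_i = (p_i - t_i) / 2.
    """
    return [p - lr * (p - t) / 2.0 for p, t in zip(pred, target)]
```

Iterating the update drives the predicted box onto the target box (x_p = x_d, and so on), at which point the MSE is zero and the two boxes overlap.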

The character-content training step (S6) uses the bounding-box coordinate range obtained in step (S5), extracts the characters inside the bounding box for training, and generates a second target parameter. In this embodiment the target string has five characters, drawn from the English letters A–Z, the Arabic numerals 0–9 and the blank (space), 37 characters in all; the second target parameter is the value with the largest probability among the 37 characters. Step (S6) trains with two fully connected (Dense) layers and two bidirectional long short-term memory (BiLSTM) layers, in the order fully connected layer, BiLSTM layer, BiLSTM layer, fully connected layer, and computes the loss with connectionist temporal classification (CTC loss). In this embodiment, 26 samples are taken per unit time; for each sampled frame the most probable of the 37 characters is found, consecutively repeated characters across the samples are merged and removed, and the required five characters remain.

Please continue to refer to FIG. 5. The error-correction and string-generation step (S7) combines the first and second target parameters, performs error correction, and recognizes the characters to produce the output text string. The string is generated from the bounding-box error position and center-point coordinates, with reference to the maximum occurrence probabilities of the characters in the pre-recognized character database. In this embodiment the recognized string is "0KMMS", sampled 26 times; "M" repeats once at the 23rd sample and is therefore deleted, so the result is recognized as 0KMMS rather than 0KMMMS.
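
The merge-and-delete rule above is the greedy decoding conventionally used with CTC; a minimal sketch (ours, with a space standing in for the CTC blank symbol):

```python
from itertools import groupby

def ctc_greedy_decode(frames, blank=" "):
    """frames: the most probable character of each sampled time step.

    CTC greedy decoding: merge consecutive duplicates, then drop blanks.
    A blank between two identical characters keeps them as two characters.
    """
    return "".join(ch for ch, _ in groupby(frames) if ch != blank)
```

With a 26-frame sequence such as "  000KKKK  MMM  MM  SSSS  " this yields "0KMMS": the repeats inside each run are merged, while the blank between the two M runs preserves both M characters.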

In summary, the recognition method of the present invention improves recognition accuracy with a small amount of diverse machine-learning training data and offers alignment-free recognition: characters can be recognized accurately without constraining the object's position. The bounding box can also be adjusted dynamically, so the character length can be recognized dynamically, e.g., 5 or 7 characters. The method is widely applicable to character recognition involving small characters, varying character lengths and reflective surfaces, in particular image recognition of characters on the surfaces of metal valves.

The embodiments described above merely illustrate the technical ideas and features of the present invention so that those skilled in the art can understand and practice it; they do not limit the patent scope of the invention. All equivalent changes or modifications made in accordance with the spirit of the present invention and the content of the specification shall fall within the patent scope of the invention.

S0: recognition-character establishment step
S1: image capturing step
S2: image augmentation step
S3: input-image limitation step
S4: image-feature extraction step
S5: bounding-box definition step
S6: character-content training step
S7: error-correction and string-generation step

FIG. 1 is a flow chart of one embodiment of the character recognition method of the present invention;
FIG. 2 is a schematic image of 3 classes with 2 samples each in one embodiment of the method;
FIG. 3 is a schematic diagram of Prototypical Network training in one embodiment of the method;
FIG. 4 is a schematic diagram of the bounding box in one embodiment of the method;
FIG. 5 is a schematic diagram of bounding-box generation and character training in one embodiment of the method.


Claims (9)

一種金屬製品表面之文字辨識方法,包含:一辨識字元建立步驟(S0),建立一預辨識字元資料庫;一影像擷取步驟(S1),取得複數金屬製品表面之訓練影像,裁切該些訓練影像;一影像擴增步驟(S2),資料擴增該些訓練影像之樣本數;一輸入影像限制步驟(S3),調整並標準化輸入之訓練影像之大小;一影像特徵擷取步驟(S4),建立學習模型,透過類別樣本修正該學習模型,使用一原生網路模型作為初始訓練模型;一邊界框界定步驟(S5),界訂欲辨識範圍,產生一邊界框及一第一目標參數,利用該原生網路模型之輸出模型特徵,經過三個全連接層,再利用平均平方誤差法,生成該第一目標參數,並使用最陡梯度法修正參數,該邊界框係作為欲辨識字元區域範圍之界;一字元內容訓練步驟(S6),擷取該邊界框內之字元進行訓練,產生一第二目標參數;及一誤差修正及字串生成步驟(S7),合併該第一目標參數及該第二目標參數進行誤差修正後辨識字元,產生一欲辨識文字字串。 A method for character recognition on the surface of a metal product, comprising: a character recognition building step (S0), establishing a pre-recognition character database; an image capture step (S1), obtaining multiple training images of the metal product surface, and cutting These training images; an image augmentation step (S2), data amplification of the number of samples of these training images; an input image restriction step (S3), adjusting and standardizing the size of the input training image; an image feature extraction step (S4), establish a learning model, modify the learning model through category samples, and use a native network model as an initial training model; a bounding box definition step (S5), define the desired recognition range, and generate a bounding box and a first Target parameters, using the output model features of the original network model, through three fully connected layers, and then using the average square error method to generate the first target parameters, and using the steepest gradient method to modify the parameters, the bounding box is used as the desired Recognize the boundaries of the character area range; a character content training step (S6), extract the characters in the bounding box for training, and generate a second target parameter; and an error correction and word string generation step (S7), Combining the first target parameter and the second target parameter and performing error correction to identify characters to generate a character string to be recognized. 
如請求項1所述之金屬製品表面之文字辨識方法,其中,該影像擴增步驟(S2)係使用對比調整法擴增該些訓練影像之樣本數。 The text recognition method on the surface of a metal product as described in Claim 1, wherein the image amplification step (S2) is to use a contrast adjustment method to increase the number of samples of the training images. 如請求項1所述之金屬製品表面之文字辨識方法,其中,該影像擴增步驟(S2)係使用亮度調整法擴增該些訓練影像之樣本數。 The text recognition method on the surface of a metal product as described in Claim 1, wherein the image amplification step (S2) is to use a brightness adjustment method to amplify the number of samples of the training images. 如請求項1所述之金屬製品表面之文字辨識方法,其中,該原生網路模型包含兩個卷積層及兩個池化層,依序為卷積層、池化層、卷積層及池化層。 The text recognition method on the surface of metal products as described in Claim 1, wherein the native network model includes two convolutional layers and two pooling layers, which are convolutional layers, pooling layers, convolutional layers, and pooling layers in sequence . 如請求項1所述之金屬製品表面之文字辨識方法,其中,該影像特徵擷取步驟(S4)之類別樣本分成兩種資料集合,分別為支撐資料集合及查詢資料集合。 The method for character recognition on the surface of metal products as described in Claim 1, wherein the category samples in the image feature extraction step (S4) are divided into two data sets, which are support data set and query data set. 
6. The method for detection and recognition of characters on the surface of metal as described in claim 5, wherein the image feature extraction step (S4) further comprises: inputting the support data set into the native network model and outputting prototype features; inputting the query data set into the native network model and outputting model learning parameters; and repeatedly feeding the support data set and the query data set into the native network model, performing error correction on the outputs, and feeding the corrected errors back into the native network model for loop training until convergence.

7. The method for detection and recognition of characters on the surface of metal as described in claim 1, wherein the mean squared error of the first target parameter is

MSE = (1/4)[(x_p - x_d)² + (y_p - y_d)² + (w_p - w_d)² + (h_p - h_d)²],

where w_d is the width of the target bounding box, h_d is the height of the target bounding box, w_p is the width of the predicted bounding box, h_p is the height of the predicted bounding box, (x_d, y_d) are the center coordinates of the target bounding box, and (x_p, y_p) are the center coordinates of the predicted bounding box; by minimizing the mean squared error so that x_p = x_d, y_p = y_d, w_p = w_d and h_p = h_d, the target bounding box and the predicted bounding box are made to overlap, and all characters to be recognized are enclosed within the recognition region.
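The mean squared error over the four box parameters, driven to zero by the steepest gradient method of the following claim, pulls the predicted box onto the target box. A minimal numpy sketch of that fit; the learning rate and iteration count are illustrative assumptions, since the patent's update rule states no step size:

```python
import numpy as np

def box_mse(pred, target):
    """Mean squared error over (x, y, w, h) of predicted vs. target box."""
    return ((pred - target) ** 2).mean()

def box_mse_grad(pred, target):
    """Gradient of the mean squared error with respect to the predicted box."""
    return 2.0 * (pred - target) / pred.size

target = np.array([50.0, 40.0, 120.0, 30.0])   # (x_d, y_d, w_d, h_d)
pred = np.array([10.0, 10.0, 60.0, 10.0])      # initial (x_p, y_p, w_p, h_p)

lr = 0.5  # illustrative step size
for _ in range(200):
    pred = pred - lr * box_mse_grad(pred, target)  # steepest-descent update

# After the loop, the predicted box overlaps the target box
print(np.round(pred, 2), box_mse(pred, target))
```

In the patent, the gradient flows back through three fully connected layers and the native network model rather than acting on the box coordinates directly; the sketch isolates the loss-minimization step.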
8. The method for detection and recognition of characters on the surface of metal as described in claim 1, wherein the correction formula of the steepest gradient method for the first target parameter is θ_{k+1} = θ_k - ∇_θL(θ), where θ is the parameter being corrected, k is the correction step index, and ∇_θL(θ) is the gradient of the error with respect to θ.

9. The method for detection and recognition of characters on the surface of metal as described in claim 1, wherein the character content training step (S6) uses two fully connected layers and two bidirectional long short-term memory (BiLSTM) layers for training and learning, in the order fully connected layer, BiLSTM layer, BiLSTM layer, fully connected layer, and uses connectionist temporal classification (CTC) for loss calculation.
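The CTC loss of claim 9 lets the FC-BiLSTM-BiLSTM-FC stack emit one prediction per time step without per-character alignment, which is how the method avoids precise alignment during recognition. At inference, standard greedy CTC decoding collapses repeated symbols and removes the blank token; a minimal sketch, where the blank symbol and the frame sequence are made-up examples rather than values from the patent:

```python
import itertools

BLANK = "-"  # CTC blank symbol (placeholder choice)

def ctc_greedy_decode(frames):
    """Collapse consecutive repeats, then drop blanks: 'AA-BB-B' -> 'ABB'."""
    collapsed = [sym for sym, _ in itertools.groupby(frames)]
    return "".join(sym for sym in collapsed if sym != BLANK)

# Per-time-step argmax outputs for an imaginary stamped serial "S355J2"
frames = "SS--33-555--5-JJ--22"
print(ctc_greedy_decode(frames))  # -> S355J2
```

Note that the blank between the repeated "5" frames is what allows a genuine double character to survive the collapse step.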
TW110142347A 2021-11-15 2021-11-15 Method for detection and recognition of characters on the surface of metal TWI786946B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110142347A TWI786946B (en) 2021-11-15 2021-11-15 Method for detection and recognition of characters on the surface of metal


Publications (2)

Publication Number Publication Date
TWI786946B true TWI786946B (en) 2022-12-11
TW202321982A TW202321982A (en) 2023-06-01

Family

ID=85794986

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110142347A TWI786946B (en) 2021-11-15 2021-11-15 Method for detection and recognition of characters on the surface of metal

Country Status (1)

Country Link
TW (1) TWI786946B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190205752A1 (en) * 2017-12-29 2019-07-04 Bull Sas Method for training a neural network for recognition of a character sequence and associated recognition method
CN110147788A (en) * 2019-05-27 2019-08-20 东北大学 A kind of metal plate and belt Product labelling character recognition method based on feature enhancing CRNN
US20200034647A1 (en) * 2018-07-27 2020-01-30 JENOPTIK Traffic Solutions UK Ltd Method and apparatus for recognizing a license plate of a vehicle
CN110738207A (en) * 2019-09-10 2020-01-31 西南交通大学 character detection method for fusing character area edge information in character image
CN111401142A (en) * 2020-02-25 2020-07-10 杭州测质成科技有限公司 Aero-engine blade metal surface etching character recognition method based on deep learning
CN113255659A (en) * 2021-01-26 2021-08-13 南京邮电大学 License plate correction detection and identification method based on MSAFF-Yolov3


Also Published As

Publication number Publication date
TW202321982A (en) 2023-06-01

Similar Documents

Publication Publication Date Title
CN111325203B (en) American license plate recognition method and system based on image correction
CN109767463B (en) Automatic registration method for three-dimensional point cloud
WO2021139175A1 (en) Electric power operation ticket character recognition method based on convolutional neural network
WO2017031716A1 (en) Method for analyzing and recognizing handwritten mathematical formula structure in natural scene image
CN105608454B (en) Character detecting method and system based on text structure component detection neural network
CN106296969B (en) The recognition methods and system of bank note
CN108491786B (en) Face detection method based on hierarchical network and cluster merging
CN110147788B (en) Feature enhancement CRNN-based metal plate strip product label character recognition method
CN109241983A (en) A kind of cigarette image-recognizing method of image procossing in conjunction with neural network
CN109190625B (en) Large-angle perspective deformation container number identification method
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
WO2023045277A1 (en) Method and device for converting table in image into spreadsheet
WO2019056346A1 (en) Method and device for correcting tilted text image using expansion method
CN110223310A (en) A kind of line-structured light center line and cabinet edge detection method based on deep learning
CN111626292A (en) Character recognition method of building indication mark based on deep learning technology
CN111353961A (en) Document curved surface correction method and device
CN107679469A (en) A kind of non-maxima suppression method based on deep learning
CN113657377B (en) Structured recognition method for mechanical bill image
CN116311317A (en) Paragraph information restoration method after paper document electronization
TWI786946B (en) Method for detection and recognition of characters on the surface of metal
CN110175548A (en) Remote sensing images building extracting method based on attention mechanism and channel information
CN109508712A (en) A kind of Chinese written language recognition methods based on image
WO2024011873A1 (en) Target detection method and apparatus, electronic device, and storage medium
Ziaratban et al. A novel two-stage algorithm for baseline estimation and correction in Farsi and Arabic handwritten text line
CN113516103A (en) Table image inclination angle determining method based on support vector machine