CN113239932A - Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device) - Google Patents
Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device)
- Publication number
- CN113239932A CN113239932A CN202110560479.4A CN202110560479A CN113239932A CN 113239932 A CN113239932 A CN 113239932A CN 202110560479 A CN202110560479 A CN 202110560479A CN 113239932 A CN113239932 A CN 113239932A
- Authority
- CN
- China
- Prior art keywords
- tesseract
- ocr
- file
- pfd
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/02—Recognising information on displays, dials, clocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a Tesseract-OCR-based method for identifying the airspeed scale in an aircraft primary flight display (PFD), and belongs to the field of instrument identification. The Tesseract-OCR-based method for identifying the airspeed scale in the aircraft PFD preprocesses the picture and uses a Tesseract-OCR targeted training data set to extract and store the features of incomplete characters, which solves the problem that the recognition rate is low, or recognition fails entirely, when incomplete characters appear in the rolling digital display of the airspeed scale in an aircraft instrument; by further combining an LSTM neural network model, the incomplete-character features can be trained effectively and the recognition rate of incomplete characters is improved.
Description
Technical Field
The invention belongs to the field of instrument recognition, and particularly relates to a Tesseract-OCR (optical character recognition) based method for recognizing the airspeed scale in an aircraft primary flight display (PFD).
Background
In instrument recognition, optical character recognition (OCR) is a commonly used approach. In 1985 the HP laboratory began developing the Tesseract OCR engine, whose basic principle is to determine character shapes by detecting patterns of dark and light pixels and then to translate those shapes into computer text through character recognition. The shortcomings of the existing Tesseract OCR algorithm are as follows: (1) when incomplete characters appear in the rolling digital display of the airspeed scale in an aeronautical instrument, the stock data set is of no use and the characters may not be recognized at all; (2) even when a targeted data set is trained, the recognition rate is not high.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a Tesseract-OCR-based method for identifying the airspeed scale in an aircraft primary flight display (PFD).
To achieve this aim, the invention adopts the following technical scheme:
A Tesseract-OCR-based method for identifying the airspeed scale in an aircraft primary flight display (PFD) comprises the following steps:
(1) sequentially carrying out graying and binarization preprocessing on the collected picture to obtain a preprocessed picture;
(2) carrying out character segmentation based on the preprocessed picture to obtain complete characters and incomplete characters;
(3) carrying out targeted training on the incomplete characters by using Tesseract-OCR to obtain a data set;
(4) integrating an LSTM neural network model by using the data set to obtain a new data set;
(5) and carrying out graying and binarization preprocessing on the picture to be recognized, and calling and combining the new data set to recognize on Tesseract-OCR.
Further, the step (1) performs graying using the conversion formula from RGB to a gray-scale image:
GRAY=RED*0.299+GREEN*0.588+BLUE*0.133.
Further, in the step (1) the image is binarized by setting a preset threshold that divides it into two parts, foreground and background.
Further, binarization is performed by adopting the following formula:
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise;
wherein f(x, y) is the original image; T is the gray threshold; g(x, y) is the binary image obtained by thresholding.
Further, in the step (2), character segmentation is carried out on the preprocessed picture by adopting a vertical projection method.
Further, the specific process of the step (3) is as follows:
training the incomplete character set with jTessBoxEditor;
correcting the Box files of the incomplete characters one by one, and defining a font characteristic file after the correction is finished;
and creating a batch preprocessing file in the directory where the sample pictures are located, executing the batch preprocessing file to obtain the finally generated language file, and copying the language file into the tessdata folder of the program to obtain a trained database.
Further, the step (4) specifically comprises:
(401) extracting lstmf files from tif and box files generated from the data set for training of an LSTM neural network model;
(402) extracting an LSTM file from the traineddata file to obtain an LSTM neural network model;
(403) beginning training when the stage file eng.lstm is generated, and ending training when the error rate of the LSTM neural network model is lower than 0.01;
(404) and generating a checkpoint file after training is finished, combining the language file generated by the data set and the checkpoint file to generate a new language file, and placing the new language file in a tessdata folder to obtain a new data set.
Further, the number of training times in step (403) is 6000.
Compared with the prior art, the invention has the following beneficial effects:
the identification method of the air speed scale in the aircraft main display PFD based on Tesseract-OCR comprises the steps of preprocessing a picture, extracting and storing the characteristics of incomplete characters by utilizing a Tesseract-OCR targeted training data set, and solving the problems that the identification rate is low and even the identification cannot be realized when the incomplete characters exist in roller type digital display of the air speed scale in an aircraft instrument; and then, by combining the LSTM neural network model, incomplete character features can be effectively trained, and the recognition rate of incomplete characters is improved.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a picture after graying of an embodiment;
FIG. 3 is a picture after binarization according to the embodiment;
fig. 4 is a picture in an embodiment, wherein fig. 4(a), fig. 4(b), fig. 4(c), and fig. 4(d) are an original image, a binarized image, a histogram of character projection, and a divided character image, respectively;
FIG. 5 is a partially cut character set according to an embodiment;
FIG. 6 is a diagram illustrating the recognition results of the embodiment, wherein fig. 6(a) is the result of recognition using the built-in data set, fig. 6(b) is the result of recognition using the trained data set, and fig. 6(c) is the result of recognition using the data set combined with the LSTM neural network;
FIG. 7 is a graph comparing the recognition rates of examples.
Detailed Description
In order to make the technical solutions of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In view of the complexity of aeronautical instruments, the invention performs targeted data-set training for the case in which incomplete characters appear in the rolling digital display of the airspeed scale, solves the problem that incomplete characters cannot be recognized, and improves the recognition rate for incomplete characters by combining an LSTM neural network model.
The invention is described in further detail below with reference to the accompanying drawings:
referring to fig. 1, fig. 1 is a schematic flow chart of the present invention, and a method for identifying an air velocity scale in a PFD of an aircraft main display based on Tesseract-OCR includes the following steps:
(1) preprocessing: graying and binarizing the collected picture so that it can be recognized more reliably;
(2) character segmentation: carrying out character segmentation on the preprocessed picture, dividing the preprocessed picture into complete characters and incomplete characters, and carrying out targeted character set training on the incomplete characters so as to carry out subsequent recognition;
(3) training data set: training the incomplete characters obtained after segmentation to obtain a data set;
(4) training an LSTM neural network model: extracting an lstmf file from the files generated by the previously trained sample data set for LSTM training; extracting the lstm file from the traineddata file; starting training from the generated intermediate file and training about 6000 times until the error rate falls below 0.01; combining the checkpoint file generated after training with the language file generated by the previously trained data set to produce a new language file, which is placed in the tessdata folder;
(5) character recognition: and preprocessing the picture to be recognized, and calling the trained data set through codes to finish recognition.
In summary, the Tesseract-OCR-based method for recognizing the airspeed scale in an aircraft primary flight display (PFD) provided by the invention first preprocesses the acquired picture so that it can be recognized more reliably, which improves the recognition rate of the algorithm. Because the numbers on the PFD in the aircraft cockpit are shown on a rolling display, targeted character-set training is required: jTessBoxEditor is used to extract and store the features of incomplete characters, and Tesseract-OCR is then combined with a neural network model so that the extracted incomplete-character features can be trained effectively and the recognition rate is improved.
In the embodiment of the present invention, step (1) specifically includes the following steps:
the collected picture is preprocessed, wherein the preprocessing comprises graying and binarization, so that subsequent preprocessing is facilitated, as the human eyes have the strongest green sensitivity and the weakest blue, the green channel weight is the largest, the blue channel weight is the smallest, and the conversion formula from RGB to a gray scale map is as follows:
GARY=RED*0.299+GREEN*0.588+BLUE*0.133
in the formula, GARY represents gray; RED represents the RED component; GREEN represents the GREEN component; BLUE represents the BLUE component.
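As an illustration only, the conversion above can be written in a few lines of Python with OpenCV; the file names are placeholders and the channel weights are taken verbatim from the formula in the description (the built-in cv2.cvtColor with COLOR_BGR2GRAY instead uses the standard 0.299/0.587/0.114 weights).

```python
# Minimal grayscale-conversion sketch (the file name "pfd.png" is assumed);
# the channel weights follow the formula quoted above.
import cv2
import numpy as np

img = cv2.imread("pfd.png")                      # OpenCV loads channels in B, G, R order
b, g, r = cv2.split(img.astype(np.float32))
gray = (0.299 * r + 0.588 * g + 0.133 * b).clip(0, 255).astype(np.uint8)
cv2.imwrite("pfd_gray.png", gray)
```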
After the image to be recognized has been grayed, some interference factors still need to be removed for better recognition, so binarization preprocessing is performed; the resulting image is black and white, in which the foreground is highlighted and separated from the background.
The binarization preprocessing is as follows:
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise;
wherein f(x, y) is the original image; T is the gray threshold; g(x, y) is the binary image obtained by thresholding.
Referring to fig. 2 and fig. 3, fig. 2 is the picture after graying and fig. 3 is the picture after binarization. In the embodiment of the present invention, a threshold value of 194 is selected as the segmentation value for better preprocessing.
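For illustration, a fixed-threshold binarization matching the description can be sketched as follows; the threshold 194 is the embodiment's value, the input file is the gray image from the previous sketch, and the output uses 255 rather than 1 for foreground pixels so it can be saved directly as an image.

```python
# Fixed-threshold binarization sketch; T = 194 is the segmentation value
# chosen in this embodiment, and "pfd_gray.png" is the gray image from above.
import cv2

gray = cv2.imread("pfd_gray.png", cv2.IMREAD_GRAYSCALE)
T = 194
_, binary = cv2.threshold(gray, T, 255, cv2.THRESH_BINARY)  # 255 where f(x, y) > T, else 0
cv2.imwrite("pfd_binary.png", binary)
```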
In step (2) of the embodiment of the present invention, the preprocessed image is cropped to the region to be recognized, and character segmentation is then performed by a vertical projection method. The specific vertical-projection segmentation method is as follows:
Character cutting uses the vertical projection method in OpenCV. The principle is to analyse the distribution histogram of the pixels of the binary image in order to find the boundary points between adjacent characters for segmentation; when an image consists only of an object and a background, the gray-level histogram shows two distinct peaks.
In that case a valley value has to be found for segmentation: the first and second peaks are located, and the valley between them determines the threshold. OTSU, also known as the maximum between-class variance method, was proposed by the Japanese scholar Otsu in the late 1970s. The method studies the gray-level characteristics of an image and, according to its gray-level distribution, divides the image into a foreground part and a background part, then calculates the between-class variance of the two. The larger this value, the greater the difference between the foreground and the background. If the classification is wrong, part of the target image is assigned to the background, or part of the background is assigned to the target, and the variance becomes small. Therefore, when the separation between the object and the background is large, the error rate is reduced.
The principle of the OTSU algorithm is as follows: when segmenting the foreground and background, let the threshold be t, let the ratio of the number of foreground pixels to the total number of image pixels be w0 and the average gray level of the foreground be u0, and similarly let the ratio of the number of background pixels to the total number of image pixels be w1 and the corresponding average gray level be u1; the total average gray level is then:
u = w0*u0 + w1*u1
The between-class variance of the foreground and background images is:
g = w0*(u0 - u)^2 + w1*(u1 - u)^2
In actual calculation, an equivalent formula can be used:
g = w0*w1*(u0 - u1)^2
When the between-class variance of the foreground and background images, i.e. g in the formula, is at its maximum, the corresponding t is the optimal segmentation threshold of the image. This method is applied to the instrument image according to the characteristics of aircraft instrument images. The character-cutting result is shown in fig. 4.
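A minimal Python sketch of the projection-based segmentation described above, under the assumptions that Otsu's method selects the threshold and that the binarized image contains white characters on a black background; file names are placeholders.

```python
# Vertical-projection character segmentation sketch. Otsu's method
# (cv2.THRESH_OTSU) picks the threshold automatically; the column sums of the
# binary image are then scanned for zero-valued valleys between characters.
import cv2
import numpy as np

gray = cv2.imread("pfd_gray.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Column-wise count of foreground pixels (assumes white characters on black).
projection = np.count_nonzero(binary == 255, axis=0)

chars, start = [], None
for x, count in enumerate(projection):
    if count > 0 and start is None:         # entering a character region
        start = x
    elif count == 0 and start is not None:  # leaving a character region
        chars.append(binary[:, start:x])
        start = None
if start is not None:                       # character touching the right edge
    chars.append(binary[:, start:])

for i, ch in enumerate(chars):
    cv2.imwrite(f"char_{i}.png", ch)        # complete and incomplete characters
```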
In the embodiment of the invention, in step (3), 1000 sample pictures are collected and preprocessed, the preprocessed pictures are character-segmented with the vertical projection method, and the resulting incomplete character set is trained with jTessBoxEditor; part of the segmented character set is shown in fig. 5.
After Tesseract-OCR is downloaded, the Java JDK is installed first and jTessBoxEditor afterwards. The Box files then have to be corrected one by one, and the more data there is, the larger the workload. Once the corrections are finished, the font feature file is defined, a batch preprocessing file is created in the directory where the sample pictures are located and executed to obtain the finally generated language file, and this language file is copied into the tessdata folder of the program, giving a trained database that is ready to use.
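As a rough sketch only, the commands that such a batch preprocessing file typically runs (the classic Tesseract 3.x box/tr training workflow) can be driven from Python as below; the language code "num" and the sample file name are assumptions, and the box file is assumed to have already been corrected in jTessBoxEditor.

```python
# Sketch of the classic box/tr training workflow that the batch preprocessing
# file automates; it assumes the corrected num.font.exp0.box file from
# jTessBoxEditor is already present. "num" and "num.font.exp0" are placeholders.
import shutil
import subprocess

lang, stem = "num", "num.font.exp0"

# font_properties: font name followed by italic/bold/fixed/serif/fraktur flags
with open("font_properties", "w") as fp:
    fp.write("font 0 0 0 0 0\n")

for cmd in (
    f"tesseract {stem}.tif {stem} box.train",             # produce the .tr feature file
    f"unicharset_extractor {stem}.box",                   # build the character set
    f"mftraining -F font_properties -U unicharset -O {lang}.unicharset {stem}.tr",
    f"cntraining {stem}.tr",
):
    subprocess.run(cmd.split(), check=True)

# Prefix the generated cluster files with the language code and pack them into
# num.traineddata, which is then copied into Tesseract's tessdata folder.
for name in ("inttemp", "pffmtable", "normproto", "shapetable"):
    shutil.copy(name, f"{lang}.{name}")
subprocess.run(["combine_tessdata", f"{lang}."], check=True)
```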
Step (4) of the embodiment of the present invention specifically includes:
(401) extracting lstmf files from tif and box files generated from the data set in the step (3) for LSTM neural network model training;
(402) extracting lstm files from traineddata files;
(403) starting training from the generated stage file eng.lstm, and ending the training until the error rate is lower than 0.01, wherein the training is carried out for 6000 times;
(404) after training, a checkpoint file is generated; it is combined with the language file generated by the previously trained sample data set to produce a new language file, the combined file is placed in the tessdata folder, and recognition is then invoked from code to test the recognition rate.
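A hedged, command-level sketch of steps (401)-(404) using the stock Tesseract LSTM training tools is given below; the base language "eng", the output name "airspeed", and all paths are assumptions, and train_listfile.txt is assumed to list the generated .lstmf files.

```python
# Sketch of the LSTM fine-tuning stage, driven from Python; the flag names are
# those of the standard lstmtraining / combine_tessdata tools.
import subprocess

def run(cmd):
    subprocess.run(cmd.split(), check=True)

# (401) render the training pages to .lstmf files ("lstm.train" is a stock config)
run("tesseract num.font.exp0.tif num.font.exp0 --psm 6 lstm.train")

# (402) extract the LSTM network from the base traineddata file
run("combine_tessdata -e tessdata/eng.traineddata eng.lstm")

# (403) fine-tune; training runs for 6000 iterations (error-rate target about 0.01)
run("lstmtraining --model_output output/airspeed "
    "--continue_from eng.lstm "
    "--traineddata tessdata/eng.traineddata "
    "--train_listfile train_listfile.txt "
    "--max_iterations 6000")

# (404) fold the final checkpoint back into a new language file for tessdata
run("lstmtraining --stop_training "
    "--continue_from output/airspeed_checkpoint "
    "--traineddata tessdata/eng.traineddata "
    "--model_output tessdata/airspeed.traineddata")
```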
In the embodiment of the invention, after the picture to be recognized has been preprocessed, the built-in data set, the trained data set, and the trained data set combined with the LSTM model are each called from code to perform recognition, and the recognition rate is tested.
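For illustration, such recognition can be invoked with pytesseract (a thin wrapper around the tesseract command line); the language name "airspeed" and the tessdata path are placeholders for the trained data set produced above.

```python
# Recognition sketch comparing the built-in data set with the trained one.
import cv2
import pytesseract

img = cv2.imread("pfd_binary.png")
config = r'--tessdata-dir "tessdata" --psm 7 -c tessedit_char_whitelist=0123456789'

print(pytesseract.image_to_string(img, lang="eng", config=config))       # built-in data set
print(pytesseract.image_to_string(img, lang="airspeed", config=config))  # trained data set
```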
Referring to fig. 6, fig. 6(a) shows the recognition result using the built-in data set, fig. 6(b) the result using the self-trained data set, and fig. 6(c) the result using the data set combined with the LSTM neural network; it can be seen that the result in fig. 6(c) is the best. Referring to fig. 7, it can be seen that the recognition rate of the invention is greatly improved.
In the Tesseract-OCR-based method for identifying the airspeed scale in an aircraft primary flight display (PFD), the collected picture is first preprocessed with OpenCV to facilitate the subsequent steps. Because of the complexity of aircraft instruments, targeted character-set training is carried out and the features of incomplete characters are extracted and stored, which solves the problem that incomplete characters cannot be recognized; by further combining an LSTM neural network model, the character-set features are used effectively and the recognition rate of incomplete characters is improved.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical solution according to the technical idea proposed by the present invention falls within the protection scope of the claims of the present invention.
Claims (8)
1. A method for identifying an air speed scale in an aircraft main display PFD based on Tesseract-OCR is characterized by comprising the following steps:
(1) sequentially carrying out graying and binarization preprocessing on the collected picture to obtain a preprocessed picture;
(2) carrying out character segmentation based on the preprocessed picture to obtain complete characters and incomplete characters;
(3) carrying out targeted training on the incomplete characters by using Tesseract-OCR to obtain a data set;
(4) integrating an LSTM neural network model by using the data set to obtain a new data set;
(5) and carrying out graying and binarization preprocessing on the picture to be recognized, and calling and combining the new data set to recognize on Tesseract-OCR.
2. The Tesseract-OCR based recognition method of the air velocity scale in the PFD of the main display of the airplane as claimed in claim 1, wherein the step (1) performs graying using the conversion formula from RGB to a gray-scale image:
GRAY=RED*0.299+GREEN*0.588+BLUE*0.133.
3. The Tesseract-OCR based identification method of the air velocity scale in the aircraft main display PFD according to claim 1, wherein in the step (1) binarization is performed by setting a preset threshold that divides the image into two parts, foreground and background.
4. The Tesseract-OCR-based identification method of the air velocity scale in the PFD of the main display of the aircraft according to claim 3, characterized in that binarization is performed by using the following formula:
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise;
wherein f(x, y) is the original image; T is the gray threshold; g(x, y) is the binary image obtained by thresholding.
5. The Tesseract-OCR-based identification method of the air speed scale in the PFD of the main display of the airplane as claimed in claim 1, wherein the step (2) uses a vertical projection method to perform character segmentation on the preprocessed picture.
6. The Tesseract-OCR-based identification method of the air speed scale in the PFD of the main display of the airplane as claimed in claim 1, wherein the specific process of the step (3) is as follows:
training the incomplete character set with jTessBoxEditor;
correcting the Box files of the incomplete characters one by one, and defining a font characteristic file after the correction is finished;
and creating a batch preprocessing file in the directory where the sample pictures are located, executing the batch preprocessing file to obtain the finally generated language file, and copying the language file into the tessdata folder of the program to obtain the trained database.
7. The Tesseract-OCR-based identification method of the air speed scale in the aircraft main display PFD according to claim 6, wherein the step (4) specifically comprises:
(401) extracting lstmf files from tif and box files generated from the data set for training of an LSTM neural network model;
(402) extracting an LSTM file from the traineddata file to obtain an LSTM neural network model;
(403) beginning training when the stage file eng.lstm is generated, and ending training when the error rate of the LSTM neural network model is lower than 0.01;
(404) and generating a checkpoint file after training is finished, combining the language file generated by the data set and the checkpoint file to generate a new language file, and placing the new language file in a tessdata folder to obtain a new data set.
8. The Tesseract-OCR based recognition method for air velocity scales in PFD main display of airplane as claimed in claim 7, wherein the number of training times in step (403) is 6000.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110560479.4A CN113239932A (en) | 2021-05-21 | 2021-05-21 | Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device) |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110560479.4A CN113239932A (en) | 2021-05-21 | 2021-05-21 | Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device) |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113239932A true CN113239932A (en) | 2021-08-10 |
Family
ID=77138124
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110560479.4A Pending CN113239932A (en) | 2021-05-21 | 2021-05-21 | Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device) |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113239932A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439861A (en) * | 2022-09-30 | 2022-12-06 | 北京中盛益华科技有限公司 | Water gauge recognition method based on OCR |
CN118014072A (en) * | 2024-04-10 | 2024-05-10 | 中国电建集团昆明勘测设计研究院有限公司 | Construction method and system of knowledge graph for hydraulic and hydroelectric engineering |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014040169A1 (en) * | 2012-09-14 | 2014-03-20 | Broadbandtv, Corp. | Intelligent supplemental search engine optimization |
CN110363095A (en) * | 2019-06-20 | 2019-10-22 | 华南农业大学 | A kind of recognition methods for table font |
WO2020248513A1 (en) * | 2019-06-11 | 2020-12-17 | 苏州玖物互通智能科技有限公司 | Ocr method for comprehensive performance test |
-
2021
- 2021-05-21 CN CN202110560479.4A patent/CN113239932A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014040169A1 (en) * | 2012-09-14 | 2014-03-20 | Broadbandtv, Corp. | Intelligent supplemental search engine optimization |
WO2020248513A1 (en) * | 2019-06-11 | 2020-12-17 | 苏州玖物互通智能科技有限公司 | Ocr method for comprehensive performance test |
CN110363095A (en) * | 2019-06-20 | 2019-10-22 | 华南农业大学 | A kind of recognition methods for table font |
Non-Patent Citations (4)
Title |
---|
PAK LAOWATANACHAI et al.: "Optical-based Limit Order Book Modelling using Deep Neural Networks", 2020 8TH INTERNATIONAL ELECTRICAL ENGINEERING CONGRESS (IEECON) *
从你的全世界路过: "Tesseract-OCR 4.0 LSTM Training Workflow (under Windows)", HTTPS://ZHUANLAN.ZHIHU.COM/P/77013854 *
MIAO Chengqiang: "Design and Implementation of a Gas Meter Digit Recognition System Based on Convolutional Neural Networks", China Masters' Theses Full-text Database, Information Science and Technology Series *
ZHAO Guosheng et al.: "Python Web Crawler Technology and Practice", 31 January 2021 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439861A (en) * | 2022-09-30 | 2022-12-06 | 北京中盛益华科技有限公司 | Water gauge recognition method based on OCR |
CN118014072A (en) * | 2024-04-10 | 2024-05-10 | 中国电建集团昆明勘测设计研究院有限公司 | Construction method and system of knowledge graph for hydraulic and hydroelectric engineering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108764074B (en) | Subjective item intelligently reading method, system and storage medium based on deep learning | |
CN103824053B (en) | The sex mask method and face gender detection method of a kind of facial image | |
CN103824091B (en) | A kind of licence plate recognition method for intelligent transportation system | |
CN113239932A (en) | Tesseract-OCR-based identification method for air velocity scale in PFD (flight display device) | |
CN104778470A (en) | Character detection and recognition method based on component tree and Hough forest | |
CN109189965A (en) | Pictograph search method and system | |
CN105260428A (en) | Picture processing method and apparatus | |
CN110008912B (en) | Social platform matching method and system based on plant identification | |
CN112749696A (en) | Text detection method and device | |
Brisinello et al. | Optical Character Recognition on images with colorful background | |
JP2008251029A (en) | Character recognition device and license plate recognition system | |
CN115240210A (en) | System and method for auxiliary exercise of handwritten Chinese characters | |
De Nardin et al. | Few-shot pixel-precise document layout segmentation via dynamic instance generation and local thresholding | |
CN107729863B (en) | Human finger vein recognition method | |
CN113011431A (en) | Chinese character stroke segmentation and extraction method and system based on MaskRCNN | |
CN110766001B (en) | Bank card number positioning and end-to-end identification method based on CNN and RNN | |
Bala et al. | Image simulation for automatic license plate recognition | |
CN109558875A (en) | Method, apparatus, terminal and storage medium based on image automatic identification | |
CN113421256B (en) | Dot matrix text line character projection segmentation method and device | |
Munir et al. | Automatic character extraction from handwritten scanned documents to build large scale database | |
CN115359562A (en) | Sign language letter spelling recognition method based on convolutional neural network | |
CN113505784A (en) | Automatic nail annotation analysis method and device, electronic equipment and storage medium | |
Rani et al. | Object Detection in Natural Scene Images Using Thresholding Techniques | |
CN112712080A (en) | Character recognition processing method for acquiring image by moving character screen | |
CN110298350A (en) | A kind of efficient block letter Uighur words partitioning algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210810 |
|
RJ01 | Rejection of invention patent application after publication |