CN116543405A - Aviation plan notification information extraction method, device, equipment and storage medium - Google Patents

Aviation plan notification information extraction method, device, equipment and storage medium Download PDF

Info

Publication number
CN116543405A
Authority
CN
China
Prior art keywords
data
image
text
generate
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310538985.2A
Other languages
Chinese (zh)
Inventor
王志鹏
秦星达
齐宝东
李俊
李泽轮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hezhong Sizhuang Space Time Material Union Technology Co ltd
Original Assignee
Beijing Hezhong Sizhuang Space Time Material Union Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Hezhong Sizhuang Space Time Material Union Technology Co ltd filed Critical Beijing Hezhong Sizhuang Space Time Material Union Technology Co ltd
Priority to CN202310538985.2A priority Critical patent/CN116543405A/en
Publication of CN116543405A publication Critical patent/CN116543405A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06V30/10 Character recognition
    • G06V30/12 Detection or correction of errors, e.g. by rescanning the pattern
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1463 Orientation detection or correction, e.g. rotation of multiples of 90 degrees
    • G06V30/16 Image preprocessing
    • G06V30/164 Noise filtering
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V30/26 Techniques for post-processing, e.g. correcting the recognition result

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)

Abstract

The application relates to an aviation plan notification information extraction method, device, equipment and storage medium, applied to the technical field of information extraction. The method comprises the following steps: acquiring a training sample set and a pre-training model, and performing transfer learning on the pre-training model based on the training sample set to generate a trained information extraction model; acquiring a notification image to be extracted and preprocessing it to generate first data, wherein the preprocessing comprises row-line extraction; inputting the first data into the information extraction model to generate second data; and post-processing the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery. The method and device have the effect of improving recognition accuracy for aviation plan notifications with dense characters and no table lines.

Description

Aviation plan notification information extraction method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of information extraction technologies, and in particular, to an aviation planning notification information extraction method, device, equipment, and storage medium.
Background
The flight schedule notification sheet is a flight plan sheet containing information such as serial number, flight date, flight number, flight properties, and aircraft number. For security reasons, however, the sheet cannot be transmitted over the internet and is sent only by conventional facsimile; therefore, when the information in the flight schedule notification needs to be structured, it must be recognized and digitized via OCR technology.
However, the flight plan notification sheet differs from a common table: some sheets have dense single-page information with no table lines on the page, and because of the photographing environment the images contain interference factors such as skew, distortion and shadows. As a result, recognition accuracy is poor when a common table recognition method is applied to aviation notification sheet information.
Disclosure of Invention
In order to improve the recognition accuracy for aviation plan notifications with dense characters and no table lines, the application provides an aviation plan notification information extraction method, device, equipment and storage medium.
In a first aspect, the present application provides an aviation planning notification information extraction method, which adopts the following technical scheme:
an aviation planning notice information extraction method comprises the following steps:
acquiring a training sample set and a pre-training model, and performing transfer learning on the pre-training model based on the training sample set to generate a trained information extraction model;
acquiring a notice image to be extracted, preprocessing the notice image to be extracted to generate first data, wherein the preprocessing comprises row line extraction;
inputting the first data into the information extraction model to generate second data;
and carrying out post-processing on the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery.
By adopting the above technical scheme, the training sample set is used to perform transfer-learning training on the pre-training model, generating an information extraction model whose recognition is more accurate and which is trained on notification images of the type to be extracted, meeting the extraction requirement for that content. The notification image requiring information extraction is preprocessed into first data convenient for the information extraction model to process; the first data is input into the information extraction model for extraction and recognition, yielding second data; after the second data is obtained it is post-processed, that is, the recognized second data is recalibrated and error-corrected, reducing the generation of errors. The recognition accuracy for aviation plan notifications with dense characters and no table lines is thereby improved.
Optionally, the pre-training model includes a first pre-training model and a second pre-training model, and the information extraction model includes a first information extraction model and a second information extraction model; performing transfer learning on the pre-training model based on the training sample set to generate a trained information extraction model includes:
inputting the training sample set to the first pre-training model to generate a problem data set;
acquiring a target demand, and labeling and correcting the problem data set based on the target demand to generate a labeling data set;
performing transfer learning on the first pre-training model based on the annotation data set to generate a first information extraction model;
performing deformation processing on the training sample set to generate a deformed sample set;
acquiring a character recognition basic dictionary, and modifying the character recognition basic dictionary based on a preset modification requirement to generate a target dictionary;
and performing transfer learning on the second pre-training model based on the deformed sample set and the target dictionary to generate a second information extraction model.
By adopting the above technical scheme, a problem data set is first obtained from the training sample set and the first pre-training model. Comparing the problem data set with the target demand identifies the data that need modification and retraining; these data are collected to generate the annotation data set. Performing transfer learning on the first pre-training model with the annotation data set yields the first information extraction model, which reduces the production of problem data and makes recognition more accurate. The training samples in the deformed sample set are closer to the actual notification images to be extracted, so the recognition accuracy of the resulting second information extraction model is also higher, making the final recognition result more accurate.
Optionally, the preprocessing further comprises edge clipping, image denoising and image rotation; preprocessing the notification image to be extracted further comprises:
removing blank parts around the characters in the notification image to be extracted based on a contour detection method to generate an edge-clipped image; removing noise points in the edge-clipped image to generate a denoised image;
and estimating the character direction of the denoised image, and rotating the denoised image based on the character direction estimation result to generate a rotated image.
Optionally, estimating the character direction of the denoised image and rotating the denoised image based on the character direction estimation result to generate the rotated image includes:
acquiring a preset step angle value and a value range, and determining a stepping range of the step value based on the preset step angle value and the value range;
acquiring the row pixel values of the denoised image, and calculating rotation scores based on the row pixel values, the stepping range and a preset calculation formula;
selecting the maximum value among the rotation scores, and taking the angle value corresponding to the maximum value as the character direction estimation result;
and rotating the denoised image based on the character direction estimation result to generate the rotated image.
Optionally, the preprocessing further includes image cutting; after estimating the character direction of the denoised image and rotating it based on the character direction estimation result to generate the rotated image, the method further comprises:
acquiring a judgment threshold, and performing row-line extraction based on the judgment threshold and the row pixel values to generate target row lines;
and acquiring a target segmentation number, and performing image cutting on the rotated image based on the target segmentation number and the target row lines to generate the first data.
Optionally, the second data includes text box position coordinates, and the dynamic form layout recovery includes determining the row in which the text box is located and the column in which the text box is located; post-processing the second data to generate extraction information includes:
determining the height and width of the text box based on the text box position coordinates;
determining the row of the text box based on the height;
acquiring text characteristics of texts in a text box and information characteristics of each column in the aviation planning notice to be extracted;
determining a column in which the text box is located based on the text feature, the information feature and the width;
acquiring text data in the text box, and performing error correction processing on the text data to generate target data;
extraction information is generated based on the row of the text box, the column of the text box and the target data.
Optionally, the post-processing further includes information text error correction; the error correction processing of the text data to generate target data comprises the following steps:
acquiring airport information, and creating a text error correction dictionary based on the airport information;
acquiring the occurrence frequency of the airport information in the text error correction dictionary;
and performing text correction processing on the text data based on the occurrence frequency and the text correction dictionary to generate target data.
In a second aspect, the present application provides an aviation planning notification information extraction apparatus, which adopts the following technical scheme:
an aviation planning notice information extraction device, comprising:
the extraction model training module is used for acquiring a training sample set and a pre-training model, performing transfer learning on the pre-training model based on the training sample set, and generating a trained information extraction model;
the device comprises a first data generation module, a first data extraction module and a second data extraction module, wherein the first data generation module is used for acquiring a notice image to be extracted, preprocessing the notice image to be extracted, and generating first data, wherein the preprocessing comprises line extraction;
the second data generation module is used for inputting the first data into the information extraction model to generate second data;
and the extraction information generation module is used for carrying out post-processing on the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery.
By adopting the above technical scheme, the training sample set is used to perform transfer-learning training on the pre-training model, generating an information extraction model whose recognition is more accurate and which is trained on notification images of the type to be extracted, meeting the extraction requirement for that content. The notification image requiring information extraction is preprocessed into first data convenient for the information extraction model to process; the first data is input into the information extraction model for extraction and recognition, yielding second data; after the second data is obtained it is post-processed, that is, the recognized second data is recalibrated and error-corrected, reducing the generation of errors. The recognition accuracy for aviation plan notifications with dense characters and no table lines is thereby improved.
In a third aspect, the present application provides an electronic device, which adopts the following technical scheme:
an electronic device comprising a processor coupled with a memory;
the processor is configured to execute a computer program stored in the memory, so that the electronic device executes the computer program of the aviation planning notification information extraction method according to any one of the first aspects.
In a fourth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
a computer-readable storage medium storing a computer program capable of being loaded by a processor and executing the aviation planning notification information extraction method of any one of the first aspects.
Drawings
Fig. 1 is a flow chart of an aviation planning notification information extraction method according to an embodiment of the present application.
Fig. 2 is a block diagram of an aviation planning notice information extraction device according to an embodiment of the present application.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings.
The embodiment of the application provides an aviation plan notification information extraction method that can be executed by an electronic device. The electronic device may be a server or a terminal device; the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a desktop computer, etc.
Fig. 1 is a flow chart of an aviation planning notification information extraction method according to an embodiment of the present application.
As shown in fig. 1, the main flow of the method is described as follows (steps S101 to S104):
step S101, a training sample set and a pre-training model are obtained, the pre-training model is subjected to transfer learning based on the training sample set, and a trained information extraction model is generated.
For step S101, the training sample set is input into the first pre-training model to generate a problem data set; target demands are acquired, and the problem data set is annotated and corrected based on the target demands to generate an annotation data set; transfer learning is performed on the first pre-training model based on the annotation data set to generate a first information extraction model; deformation processing is performed on the training sample set to generate a deformed sample set; a basic character recognition dictionary is acquired and modified based on a preset modification requirement to generate a target dictionary; and transfer learning is performed on the second pre-training model based on the deformed sample set and the target dictionary to generate a second information extraction model.
In this embodiment, the pre-training model includes a first pre-training model and a second pre-training model, and the information extraction model includes a first information extraction model and a second information extraction model. The first model performs character detection: it detects whether characters exist and where they are located. The second model performs character recognition: it identifies the specific content of the characters. When extracting information from an aviation plan notification with dense characters and no table lines, the two models are used together, each cooperating with the other.
For character recognition, a training sample set closer to the actual situation is required to improve accuracy. Because the amount of available data may not meet the training requirement, the file samples in a common training sample set are deformed to generate the deformed sample set; the deformation processing includes, but is not limited to, illumination processing, distortion processing and black-line addition. Besides deforming the training sample set, the basic dictionary used for character recognition is modified: according to the specific content of an aviation plan notification sheet, the Chinese characters and lower-case English letters in the basic dictionary are not used, so removing them is taken as the preset modification requirement. Removing the Chinese characters and lower-case English letters from the basic character recognition dictionary yields the target dictionary. Transfer learning is then performed on the second pre-training model, using the deformed sample set and the target dictionary, to generate the second information extraction model.
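The dictionary modification step can be sketched as follows. This is a minimal illustration only: the helper name and the representation of the dictionary as a list of single characters are assumptions, not specified by the patent.

```python
# Minimal sketch of the target-dictionary construction described above.
# Assumption: the recognition dictionary is a list of single characters;
# per the preset modification requirement, common CJK ideographs and
# lower-case English letters are dropped, keeping digits, upper-case
# letters, punctuation, etc.
def build_target_dictionary(base_chars):
    def keep(ch):
        if 'a' <= ch <= 'z':            # lower-case English letter: removed
            return False
        if '\u4e00' <= ch <= '\u9fff':  # common CJK ideograph: removed
            return False
        return True
    return [ch for ch in base_chars if keep(ch)]

print(build_target_dictionary(list("ABCabc123航班")))  # ['A', 'B', 'C', '1', '2', '3']
```

In practice the filtered dictionary would be written back to the character-list file consumed by the recognition model.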
After the first and second information extraction models are obtained, they need to be deployed as multiple model instances according to the actual aviation plan notification extraction requirement, in order to improve the speed and efficiency of text extraction. Note that the first and second information extraction models are used as a matched pair, so the same number of instances of each must be deployed; the specific number of deployments is not limited.
Step S102, obtaining a notice image to be extracted, and preprocessing the notice image to be extracted to generate first data, wherein the preprocessing comprises row line extraction.
For step S102, the text lines in the notification image to be extracted are dense and the line spacing is small; with partial distortion added, a common cutting method cannot meet the cutting requirement. The image is therefore preprocessed: row lines are extracted using a pixel statistics method, the extracted lines are taken as target row lines, and suitable lines are then selected from the target row lines as dividing lines. The premise of row-line extraction is that the character-direction offset is as small as possible and the interference of noise points is reduced, so the image must be prepared before it is divided.
Thus, the preprocessing also includes image preprocessing, comprising edge clipping, image denoising and image rotation: blank parts around the characters in the notification image to be extracted are removed based on a contour detection method to generate an edge-clipped image; noise points in the edge-clipped image are removed to generate a denoised image; the character direction of the denoised image is estimated, and the denoised image is rotated based on the estimation result to generate a rotated image.
The specific operations of edge clipping and image denoising are as follows. First, the blank parts around the characters are detected by contour detection, and the detected blank parts are clipped away to obtain the edge-clipped image. The edge-clipped image is then denoised: it is converted to gray scale and binarized, and noise points with small connected domains are removed by a connected-domain calculation, completing the denoising and generating the denoised image. After the denoised image is obtained, it is rotated according to the character direction in the image.
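The connected-domain denoising step can be sketched as follows. This is a pure-Python stand-in for illustration (a real pipeline would use an image-processing library); the function name and the size threshold are assumptions.

```python
# Sketch of the connected-domain denoising described above: ink components
# smaller than min_size pixels are treated as noise points and erased.
from collections import deque

def remove_small_components(img, min_size):
    """img: list of lists with 1 = ink, 0 = background (binarized image)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in img]            # work on a copy
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1 and not seen[y][x]:
                comp, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:                      # BFS flood fill (4-connectivity)
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < min_size:      # too small a connected domain: noise
                    for cy, cx in comp:
                        out[cy][cx] = 0
    return out
```

For example, with `min_size=2`, an isolated single ink pixel is erased while a two-pixel stroke fragment survives.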
Further, a preset step angle value and a value range are acquired, and the stepping range of the step value is determined based on them; the row pixel values of the denoised image are acquired, and the rotation scores are calculated based on the row pixel values, the stepping range and a preset calculation formula; the maximum rotation score is selected, and the angle value corresponding to it is taken as the character direction estimation result; and the denoised image is rotated based on the character direction estimation result to generate the rotated image.
For character direction estimation, in order to reduce the amount of calculation, the rotation angle is limited to plus or minus 10 degrees according to actual shooting conditions, so a simpler processing method, the step-angle scoring method, can be adopted. A step angle delta_angle is set, the image is rotated step by step to -10 + i * delta_angle degrees, the score at the current angle is calculated, and the angle with the highest score is selected as the final angle; the score is calculated from the sum of squares of the row pixel differences.
Here i indexes the stepping range calculated from the preset step angle value and the value range; i is a non-negative integer with value range [0, 20/delta_angle]. For example, if the preset step angle delta_angle is 0.5, the value range of i is [0, 40]. The rotation score is then calculated according to the preset calculation formula.
The preset calculation formula is score_i = sum over m from 1 to M-1 of (rows_{m+1} - rows_m)^2, where rows_m = sum over n from 1 to N of p(m, n). Here i is the step index above, p(m, n) is the pixel value at row m and column n of the denoised image rotated to -10 + i * delta_angle degrees, rows_m is the sum of the pixel values of row m (the row pixel value), score_i is the rotation score at that angle, M is the number of rows of characters in the notification image to be extracted, and N is its number of columns. After all i have been calculated, a set of rotation scores is obtained; the angle value corresponding to the maximum score is selected as the character direction estimation result, and the corresponding direction is the direction to which the denoised image should be rotated. The denoised image is then rotated to generate the rotated image, after which the next steps of row-line extraction and dynamic image cutting are performed.
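The scoring formula can be illustrated with a small sketch. The score is the sum of squared differences between adjacent row pixel sums, which peaks when text rows are horizontal; the toy 4x4 images below stand in for real rotated bitmaps.

```python
# Illustration of the rotation score above: for each candidate angle the
# image's row sums are computed, and the score is the sum of squared
# differences between adjacent row sums. Horizontal text lines give sharp
# row-profile contrast and hence the highest score.
def rotation_score(img):
    rows = [sum(r) for r in img]                              # row pixel sums
    return sum((rows[m + 1] - rows[m]) ** 2 for m in range(len(rows) - 1))

aligned = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]  # horizontal lines
skewed  = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]  # diagonal smear
print(rotation_score(aligned) > rotation_score(skewed))  # True
```

In the full method this score would be evaluated once per step angle, and the angle with the maximum score taken as the character direction estimate.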
In this embodiment, a judgment threshold is acquired, and row-line extraction is performed based on the judgment threshold and the row pixel values to generate target row lines; the target segmentation number is acquired, and the rotated image is cut based on the target segmentation number and the target row lines to generate the first data.
The dynamic image cutting process first acquires as many row lines as possible, then cuts according to the designated target segmentation number as far as possible; if the desired number of cuts cannot be made, the segmentation number is reduced until valid cutting positions are obtained.
Row-line extraction finds, according to the row spacing, lines that do not cover characters, with at most one row line between two text rows. Specifically, row lines are obtained by a pixel statistics method: the average pixel value of each row of the rotated image is counted, and whether a row is a row line is judged against a set judgment threshold. For example, with judgment threshold thresh = 10, the criterion is that the average pixel value avg_i of the current row and the average avg_{i+1} of the next row satisfy avg_i < thresh < avg_{i+1}; such a row is determined to be a target row line. After the target row lines are obtained, cutting can be performed: according to the target segmentation number, suitable row lines are selected from the target row lines as dividing lines. For example, suppose the image is designated to be divided into 3 blocks, which requires 2 dividing lines as close as possible to 1/3 and 2/3 of the height of the rotated image. If no suitable row lines are found, the number of blocks is reduced to 2 and a row line as close as possible to 1/2 height is sought, and so on until the cutting positions are acquired. The cut is then completed, the cut images and the actual number of cuts are returned, and the cut images are taken as the first data.
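The row-line criterion and the back-off logic of dynamic cutting can be sketched as follows. Function names are hypothetical; the row-line test follows the stated condition avg_i < thresh < avg_{i+1}, and `choose_cuts` reduces the block count whenever it cannot find enough distinct dividing lines.

```python
# Sketch of row-line extraction and dynamic cutting described above.
def find_row_lines(img, thresh):
    """A row i is a candidate row line when avg_i < thresh < avg_{i+1}."""
    avgs = [sum(r) / len(r) for r in img]
    return [i for i in range(len(avgs) - 1) if avgs[i] < thresh < avgs[i + 1]]

def choose_cuts(row_lines, height, n_blocks):
    """Pick n_blocks-1 dividing lines nearest the ideal fractional heights;
    back off to fewer blocks when distinct lines cannot be found."""
    while n_blocks > 1:
        targets = [height * k // n_blocks for k in range(1, n_blocks)]
        cuts = sorted({min(row_lines, key=lambda r: abs(r - t))
                       for t in targets}) if row_lines else []
        if len(cuts) == n_blocks - 1:        # one distinct line per cut: done
            return cuts, n_blocks
        n_blocks -= 1                        # dynamic back-off
    return [], 1
```

For instance, with candidate row lines at heights 3 and 7 in a 12-row image, a 3-block request succeeds; with a single candidate at height 5, the method backs off to 2 blocks.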
Step S103, inputting the first data into the information extraction model, and generating the second data.
In this embodiment, the first data are input into the deployed information extraction models to obtain the second data; the number of first-data items must be less than or equal to the number of deployed information extraction models. The information extraction model processes the first data, and the resulting second data comprise the predicted texts, the positions of the text boxes, and the position of each character in each text, returned in JSON format.
The text array contains all recognized texts. Each text comprises the text content, the coordinates (x1, y1, x2, y2) of the upper-left and lower-right corners of its text box, and an array of character information for the characters in the text box; each character entry comprises the character and the coordinates (x1, y1, x2, y2) of the upper-left and lower-right corners of its character box.
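An illustrative instance of that JSON layout is shown below. The field names (`texts`, `text`, `box`, `chars`, `char`) and the sample flight number are assumptions for illustration; the patent specifies only the structure, not the key names.

```python
# Hypothetical example of the JSON-format second data described above.
import json

second_data = {
    "texts": [
        {
            "text": "CA1234",                     # recognized text content
            "box": [120, 40, 210, 62],            # x1, y1, x2, y2 of the text box
            "chars": [                            # per-character information
                {"char": "C", "box": [120, 40, 134, 62]},
                {"char": "A", "box": [135, 40, 149, 62]},
            ],
        }
    ]
}
print(second_data["texts"][0]["text"])  # CA1234
assert json.loads(json.dumps(second_data)) == second_data  # valid JSON round-trip
```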
And step S104, carrying out post-processing on the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery.
For step S104, the second data includes text box position coordinates, and the dynamic form layout recovery includes determining the row in which each text box is located and the column in which it is located: determining the height and width of the text box based on the text box position coordinates; determining the row of the text box based on the height; acquiring the text features of the text in the text box and the information features of each column in the aviation plan notice to be extracted; determining the column of the text box based on the text features, the information features and the width; acquiring the text data in the text box and performing error correction processing on it to generate target data; and generating the extraction information based on the row of the text box, the column of the text box and the target data.
Further, the post-processing also comprises information text error correction; acquiring airport information, and creating a text error correction dictionary based on the airport information; acquiring the occurrence frequency of airport information in a text error correction dictionary; and performing text correction processing on the text data based on the occurrence frequency and the text correction dictionary to generate target data.
In this embodiment, since the notice cannot be extracted by common form recognition based on ruled lines, the form layout is restored in the data post-processing stage. The overall idea is to determine the row and column of each piece of text from the recognized text positions and the specific format type, thereby restoring the true layout. Because of small row and column spacing, distortion, and similar problems, the layout cannot be recovered in a single pass of analysis; the process must be divided into several stages and adjusted dynamically along the way, which is why it is called dynamic form layout recovery. Dynamic form layout recovery includes determining the row and the column in which each text box is located, for which the height and width of the text box must be determined from its position coordinates.
The row in which a text box is located is determined by extracting row data: adjacent text boxes whose heights overlap the most are classified into the same row, so the height overlap must be computed first. A simple method is to divide the height difference of two text boxes by the height of one of them, but the recognized text boxes may differ considerably in size, making the computed overlap too large or too small. The average box height is therefore used instead, computed as avg_box_high = (H1 + H2 + ... + Hn) / n, where avg_box_high is the average box height, n is the number of all text boxes, and Hi is the height of the i-th text box.
Based on the overlap calculation, the 1st text box is taken alone as row 1, and traversal starts from the 2nd text box: its overlap with the text boxes of the determined rows is computed, and all text boxes with an overlap greater than 0.5 are found. If no text box has an overlap greater than 0.5, the current text box starts a new row; if several text boxes have an overlap greater than 0.5, the row to which the current text box belongs is determined by the average overlap.
For example, suppose two rows have already been determined, with 2 text boxes in row 1 and 3 text boxes in row 2; the overlap of the 6th text box with the 2 + 3 = 5 text boxes of the determined rows must then be calculated. Assuming the 2 overlaps with row 1 are 7.2 and 7.4, and the overlaps with the matching text boxes of row 2 are 5.2 and 5.4 (the third being 3.4), the average overlap with row 1 is (7.2 + 7.4)/2 = 7.3 and the average overlap with row 2 is (5.2 + 5.4)/2 = 5.3. Since the average overlap with row 1 is greater than that with row 2, the text box belongs to row 1.
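The row-grouping procedure above can be sketched as follows. Boxes are assumed to be (x1, y1, x2, y2) tuples; since the exact overlap formula is not spelled out, this sketch reads the overlap measure as the vertical intersection of two boxes normalized by the average box height, which is an assumption.

```python
def avg_box_height(boxes):
    """avg_box_high = (1/n) * sum of box heights, per the formula in the text."""
    return sum(y2 - y1 for _, y1, _, y2 in boxes) / len(boxes)

def height_overlap(a, b, avg_h):
    """Vertical overlap of two boxes, normalized by the average box height
    (an assumed concrete reading of the patent's overlap measure)."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    return max(inter, 0) / avg_h

def group_rows(boxes, min_overlap=0.5):
    """Greedy row assignment: the first box opens row 1; each later box joins
    the row with the highest average overlap above the threshold, or opens
    a new row if no overlap exceeds 0.5."""
    avg_h = avg_box_height(boxes)
    rows = [[boxes[0]]]
    for box in boxes[1:]:
        best_row, best_score = None, min_overlap
        for row in rows:
            score = sum(height_overlap(box, other, avg_h) for other in row) / len(row)
            if score > best_score:
                best_row, best_score = row, score
        if best_row is None:
            rows.append([box])      # no row overlaps enough: start a new row
        else:
            best_row.append(box)
    return rows
```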
After the line is determined, the insertion position of the text box is determined according to the abscissa of the upper left corner of the text box.
After the rows are adjusted, the columns need to be aligned, that is, the column in which each text box is located must be determined. Since not all data items have values, recognition problems can make some data unreasonable: for example, a serial number and a flight number may be joined into a single text box, or the date part of a date-time may be separated from the specific time into two text boxes, so the columns require data adjustment.
Column adjustment is performed on the basis of text features together with the text boxes and their coordinates. First, the information features of each type of information are identified: the serial numbers in column 1 should be digits, and the serial numbers of different rows accumulate; the flight numbers in column 2 should be letters and digits; columns 3 and 4 are mainly digits, each column containing no fewer than 10 of them. Correction can thus be performed based on the information features of the different columns and the text features of the text in each box. For example, for the joined serial-number/flight-number problem, the box of each word in the text can be obtained first; the gap between the serial number and the flight number is evident from the spacing of the word boxes, so the text box can be split there.
When the position of a text box is adjusted according to its abscissa, the whole process again runs from left to right and from top to bottom: the width overlap with all previously placed text boxes is computed, in the same way as the height overlap; if the overlap is high, the box joins that column, and if the overlap is too low, a new column is created.
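The column-placement step can be sketched in the same shape as the row case. The 0.5 threshold and the width-overlap normalization by an average box width simply mirror the height case and are assumptions, not values given in the text.

```python
def width_overlap(a, b, avg_w):
    """Horizontal overlap of two (x1, y1, x2, y2) boxes, normalized by an
    average box width -- the same construction as the height overlap."""
    inter = min(a[2], b[2]) - max(a[0], b[0])
    return max(inter, 0) / avg_w

def assign_column(box, columns, avg_w, min_overlap=0.5):
    """Join the column whose boxes overlap the new box most in width,
    otherwise create a new column. Returns the column index."""
    best, best_score = None, min_overlap
    for idx, col in enumerate(columns):
        score = sum(width_overlap(box, other, avg_w) for other in col) / len(col)
        if score > best_score:
            best, best_score = idx, score
    if best is None:
        columns.append([box])       # overlap too low everywhere: new column
        return len(columns) - 1
    columns[best].append(box)
    return best
```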
Information text error correction mainly comprises two parts: creating a dictionary and correcting by comparison. The characters in a plan notice are mainly digits and letters, some letters resemble digits (for example I and 1), and the data items are relatively short (for example ZSHC); if ZSHC is misrecognized as Z5HC, a common single-item matching algorithm may be unable to correct it. A combined-information error correction scheme is therefore adopted: since the airport code, the route code, and the aircraft type information mutually constrain one another, they are combined into an information group of "airport code + route code + aircraft type", which improves the error-correction success rate. The text error correction dictionary is created by splicing and combining such information groups from the actual airport information.
For text matching, a dynamic-frequency text edit distance is used: on top of the ordinary text edit distance, the frequency with which the airport information occurs in the text error correction dictionary is added, and this frequency can be adjusted periodically. Because the information of different airports occurs with different frequencies, this further improves the error-correction success rate.
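A minimal sketch of the dynamic-frequency edit distance: an ordinary Levenshtein distance with a frequency bonus subtracted, so that among equally distant dictionary entries the more frequent one wins. The weighting factor alpha is a hypothetical parameter, not given in the text.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert/delete/substitute, all cost 1)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,      # insert into a
                                     prev + (ca != cb))  # substitute / match
    return dp[-1]

def correct(text, dictionary, freq, alpha=0.1):
    """Dynamic-frequency correction: rank dictionary candidates by edit
    distance minus a small frequency bonus (alpha is an assumed weight)."""
    return min(dictionary,
               key=lambda cand: edit_distance(text, cand) - alpha * freq.get(cand, 0))
```

With frequencies periodically updated from real traffic, the misread Z5HC resolves to the more frequent ZSHC even when another candidate is equally close by raw edit distance.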
Fig. 2 is a block diagram of an aviation planning notification information extraction device 200 according to an embodiment of the present application.
As shown in fig. 2, the aviation planning notice information extraction apparatus 200 mainly includes:
the extraction model training module 201 is configured to obtain a training sample set and a pre-training model, perform transfer learning on the pre-training model based on the training sample set, and generate a trained information extraction model;
a first data generating module 202, configured to obtain a notification image to be extracted, and pre-process the notification image to be extracted to generate first data, where the pre-process includes line extraction;
a second data generating module 203, configured to input the first data into the information extraction model, and generate second data;
and the extraction information generation module 204 is configured to perform post-processing on the second data to generate extraction information, where the post-processing includes dynamic form layout recovery.
As an alternative implementation of this embodiment, the extraction model training module 201 is specifically configured to input the training sample set into the first pre-training model and generate a problem data set; acquire target requirements, and perform annotation correction on the problem data set based on the target requirements to generate an annotation data set; perform migration learning on the first pre-training model based on the annotation data set to generate a first information extraction model; perform deformation processing on the training sample set to generate a deformed sample set; acquire a character recognition basic dictionary, and modify it based on a preset modification requirement to generate a target dictionary; and perform migration learning on the second pre-training model based on the deformed sample set and the target dictionary to generate a second information extraction model.
As an alternative implementation of the present embodiment, the first data generating module 202 includes:
the clipping image generation module is used for removing blank parts around the text in the to-be-extracted notice image based on a contour detection method, and generating an edge clipping image;
the denoised image generating module is used for removing noise points in the edge clipping image to generate a denoised image;
and the rotating image generating module is used for estimating the text direction of the denoised image, and rotating the denoised image based on the text direction estimation result to generate a rotated image.
In this optional embodiment, the rotating image generating module is specifically configured to obtain a preset angle precision value and a value range, and determine a stepping range of the stepping value based on them; acquire row pixel values of the denoised image, and calculate rotation scores based on the row pixel values, the stepping range and a preset calculation formula; select the maximum value among the rotation scores, and take the angle value corresponding to the maximum as the text direction estimation result; and rotate the denoised image based on the text direction estimation result to generate a rotated image.
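The rotation-score idea can be illustrated with a projection-profile sketch: candidate angles in a step range are scored by how sharply the row sums peak, and the best-scoring angle is taken as the text direction estimate. This is not the patent's formula — the variance score is an assumed concrete choice of the "preset calculation formula", and a cheap vertical-shear stand-in replaces true rotation to keep the sketch self-contained.

```python
import numpy as np

def shear_rotate(img, angle_deg):
    """Cheap small-angle 'rotation' by vertical shear -- a stand-in for a
    real image rotation, good enough to illustrate the scoring idea."""
    h, w = img.shape
    out = np.zeros_like(img)
    for x in range(w):
        shift = int(np.round(np.tan(np.radians(angle_deg)) * x))
        out[:, x] = np.roll(img[:, x], shift)
    return out

def estimate_skew(img, lo=-5.0, hi=5.0, step=0.5):
    """Score each candidate angle by the variance of the row sums (sharper
    text-line peaks -> higher variance) and return the best angle."""
    best_angle, best_score = 0.0, -1.0
    angle = lo
    while angle <= hi:
        rows = shear_rotate(img, angle).sum(axis=1)  # row pixel values
        score = rows.var()
        if score > best_score:
            best_angle, best_score = angle, score
        angle += step
    return best_angle
```

Undoing the estimated skew (rotating by the negated estimate) then yields the rotated image used in the subsequent row-line extraction.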
As an alternative implementation of this embodiment, the first data generating module 202 further includes:
the row line extraction module is used for acquiring a judgment threshold value, extracting row lines based on the judgment threshold value and the row pixel value, and generating a target row line; the data generation module is used for acquiring the target segmentation number, performing image cutting on the rotating image based on the target segmentation number and the target row line, and generating first data.
As an alternative implementation of the present embodiment, the extraction information generation module 204 includes:
the height and width acquisition module is used for determining the height and width of the text box based on the position coordinates of the text box;
the text line position determining module is used for determining the line where the text box is located based on the height;
the information feature acquisition module is used for acquiring text features of texts in the text boxes and information features of each column in the aviation planning notice to be extracted;
the text column position determining module is used for determining the column of the text box based on the text characteristics, the information characteristics and the width;
the target data generation module is used for acquiring text data in the text box and performing error correction processing on the text data to generate target data; and the final information generation module is used for generating extraction information based on the line where the text box is located, the column where the text box is located and the target data.
In this optional embodiment, the target data generating module is specifically configured to obtain airport information, and create a text error correction dictionary based on the airport information; acquiring the occurrence frequency of airport information in a text error correction dictionary; and performing text correction processing on the text data based on the occurrence frequency and the text correction dictionary to generate target data.
In one example, a module in any of the above apparatuses may be one or more integrated circuits configured to implement the above methods, for example: one or more application specific integrated circuits (application specific integrated circuit, ASIC), or one or more digital signal processors (digital signal processor, DSP), or one or more field programmable gate arrays (field programmable gate array, FPGA), or a combination of at least two of these integrated circuit forms.
For another example, when a module in an apparatus may be implemented in the form of a scheduler of processing elements, the processing elements may be general-purpose processors, such as a central processing unit (central processing unit, CPU) or other processor that may invoke a program. For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
Fig. 3 is a block diagram of an electronic device 300 according to an embodiment of the present application.
As shown in FIG. 3, the electronic device 300 includes a processor 301 and a memory 302, and may further include one or more of an information input/information output (I/O) interface 303, a communication component 304, and a communication bus 305.
The processor 301 is configured to control the overall operation of the electronic device 300 to complete all or part of the steps of the above-mentioned aviation planning notification information extraction method; the memory 302 is used to store various types of data to support operation at the electronic device 300, which may include, for example, instructions for any application or method operating on the electronic device 300, as well as application-related data. The Memory 302 may be implemented by any type or combination of volatile or non-volatile Memory devices, such as one or more of static random access Memory (Static Random Access Memory, SRAM), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), erasable programmable Read-Only Memory (Erasable Programmable Read-Only Memory, EPROM), programmable Read-Only Memory (Programmable Read-Only Memory, PROM), read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk, or optical disk.
The I/O interface 303 provides an interface between the processor 301 and other interface modules, which may be a keyboard, a mouse, buttons, etc. These buttons may be virtual buttons or physical buttons. The communication component 304 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (Near Field Communication, NFC for short), 2G, 3G or 4G, or a combination of one or more of them; the corresponding communication component 304 may thus comprise a Wi-Fi component, a Bluetooth component, and an NFC component.
The electronic device 300 may be implemented by one or more application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), digital signal processors (Digital Signal Processor, abbreviated as DSP), digital signal processing devices (Digital Signal Processing Device, abbreviated as DSPD), programmable logic devices (Programmable Logic Device, abbreviated as PLD), field programmable gate arrays (Field Programmable Gate Array, abbreviated as FPGA), controllers, microcontrollers, microprocessors, or other electronic components for performing the aviation planning notification information extraction method as set forth in the above embodiments.
Communication bus 305 may include a pathway to transfer information between the aforementioned components. The communication bus 305 may be a PCI (Peripheral Component Interconnect, peripheral component interconnect standard) bus or an EISA (Extended Industry Standard Architecture ) bus, or the like. The communication bus 305 may be divided into an address bus, a data bus, a control bus, and the like.
The electronic device 300 may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like, and may also be a server, and the like.
The application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the steps of the aviation planning notice information extraction method when being executed by a processor.
The computer readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing description covers only the preferred embodiments of the present application and is presented as an explanation of the principles of the technology employed. Persons skilled in the art will appreciate that the scope of the application is not limited to the specific combinations of features described above; it is intended to cover other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the application, for example embodiments in which the above features are interchanged with (but not limited to) technical features with similar functions disclosed in this application.

Claims (10)

1. An aviation planning notice information extraction method is characterized by comprising the following steps:
acquiring a training sample set and a pre-training model, and performing migration learning on the pre-training model based on the training sample set to generate a trained information extraction model;
acquiring a notice image to be extracted, preprocessing the notice image to be extracted to generate first data, wherein the preprocessing comprises row line extraction;
inputting the first data into the information extraction model to generate second data;
and carrying out post-processing on the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery.
2. The method of claim 1, wherein the pre-training model comprises a first pre-training model and a second pre-training model, and the information extraction model comprises a first information extraction model and a second information extraction model; the performing migration learning on the pre-training model based on the training sample set and generating a trained information extraction model comprises:
inputting the training sample set to the first pre-training model to generate a problem data set;
acquiring a target demand, and labeling and correcting the problem data set based on the target demand to generate a labeling data set;
performing migration learning on the first pre-training model based on the annotation data set to generate a first information extraction model;
performing deformation processing on the training sample set to generate a deformed sample set;
acquiring a character recognition basic dictionary, and modifying the character recognition basic dictionary based on a preset modification requirement to generate a target dictionary;
and performing migration learning on the second pre-training model based on the deformation sample set and the target dictionary to generate a second information extraction model.
3. The method of claim 1, wherein the preprocessing further comprises edge cropping, image denoising, and image rotation; the preprocessing the notification image to be extracted further comprises:
removing blank parts around the text in the to-be-extracted notice image based on a contour detection method, and generating an edge clipping image;
removing noise points in the edge clipping image to generate a noise-removed image;
and estimating the text direction of the denoised image, and rotating the denoised image based on a text direction estimation result to generate a rotated image.
4. The method of claim 3, wherein estimating the text direction of the denoised image, rotating the processed denoised image based on the text direction estimation, and generating a rotated image comprises:
acquiring a preset angle precision value and a value range, and determining a stepping range of a stepping value based on the preset angle precision value and the value range;
acquiring row pixel values of the denoised image, and calculating rotation scores based on the row pixel values, the stepping range and a preset calculation formula;
selecting the maximum value in the rotation scores, and taking an angle value corresponding to the maximum value as a character direction estimation result;
and rotating the denoised image based on the text direction estimation result to generate a rotated image.
5. The method of claim 4, wherein the preprocessing further comprises image cutting; after estimating the text direction of the denoised image and rotating the denoised image based on the text direction estimation result to generate a rotated image, the method further comprises:
acquiring a judgment threshold value, extracting row lines based on the judgment threshold value and the row pixel value, and generating a target row line;
and acquiring a target segmentation number, and performing image cutting on the rotating image based on the target segmentation number and the target row line to generate first data.
6. The method of claim 1, wherein the second data comprises text box position coordinates, and wherein the dynamic form layout recovery comprises determining a row in which the text box is located and a column in which the text box is located; the post-processing the second data to generate extraction information includes:
determining the height and width of the text box based on the text box position coordinates;
determining the row of the text box based on the height;
acquiring text characteristics of texts in a text box and information characteristics of each column in the aviation planning notice to be extracted;
determining a column in which the text box is located based on the text feature, the information feature and the width;
acquiring text data in the text box, and performing error correction processing on the text data to generate target data;
and generating extraction information based on the row in which the text box is located, the column in which the text box is located, and the target data.
7. The method of claim 6, wherein the post-processing further comprises information text error correction; the error correction processing of the text data to generate target data comprises the following steps:
acquiring airport information, and creating a text error correction dictionary based on the airport information;
acquiring the occurrence frequency of the airport information in the text error correction dictionary;
and performing text correction processing on the text data based on the occurrence frequency and the text correction dictionary to generate target data.
8. An aviation planning notice information extraction device, characterized by comprising:
the extraction model training module is used for acquiring a training sample set and a pre-training model, performing migration learning on the pre-training model based on the training sample set, and generating a trained information extraction model;
the device comprises a first data generation module, a first data extraction module and a second data extraction module, wherein the first data generation module is used for acquiring a notice image to be extracted, preprocessing the notice image to be extracted, and generating first data, wherein the preprocessing comprises line extraction;
the second data generation module is used for inputting the first data into the information extraction model to generate second data;
and the extraction information generation module is used for carrying out post-processing on the second data to generate extraction information, wherein the post-processing comprises dynamic form layout recovery.
9. An electronic device comprising a processor coupled to a memory;
the processor is configured to execute a computer program stored in the memory to cause the electronic device to perform the method of any one of claims 1 to 7.
10. A computer readable storage medium comprising a computer program or instructions which, when run on a computer, cause the computer to perform the method of any of claims 1 to 7.
CN202310538985.2A 2023-05-12 2023-05-12 Aviation plan notification information extraction method, device, equipment and storage medium Pending CN116543405A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310538985.2A CN116543405A (en) 2023-05-12 2023-05-12 Aviation plan notification information extraction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310538985.2A CN116543405A (en) 2023-05-12 2023-05-12 Aviation plan notification information extraction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116543405A true CN116543405A (en) 2023-08-04

Family

ID=87444962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310538985.2A Pending CN116543405A (en) 2023-05-12 2023-05-12 Aviation plan notification information extraction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116543405A (en)

Similar Documents

Publication Publication Date Title
CN109829453B (en) Method and device for recognizing characters in card and computing equipment
CN110766014B (en) Bill information positioning method, system and computer readable storage medium
WO2018010657A1 (en) Structured text detection method and system, and computing device
EP3712812A1 (en) Recognizing typewritten and handwritten characters using end-to-end deep learning
US11361570B2 (en) Receipt identification method, apparatus, device and storage medium
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
CN110866495A (en) Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium
CN111080660B (en) Image segmentation method, device, terminal equipment and storage medium
CN109255300B (en) Bill information extraction method, bill information extraction device, computer equipment and storage medium
CN110942004A (en) Handwriting recognition method and device based on neural network model and electronic equipment
US11836969B2 (en) Preprocessing images for OCR using character pixel height estimation and cycle generative adversarial networks for better character recognition
CN110807455A (en) Bill detection method, device and equipment based on deep learning and storage medium
CN109447080B (en) Character recognition method and device
CN110738238B (en) Classification positioning method and device for certificate information
US11023764B2 (en) Method and system for optical character recognition of series of images
CN113486828A (en) Image processing method, device, equipment and storage medium
CN110490190A (en) A kind of structured image character recognition method and system
CN111639566A (en) Method and device for extracting form information
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN109697442B (en) Training method and device of character recognition model
CN109741273A (en) A kind of mobile phone photograph low-quality images automatically process and methods of marking
CN109508716B (en) Image character positioning method and device
CN114463767A (en) Credit card identification method, device, computer equipment and storage medium
CN111507181B (en) Correction method and device for bill image and computer equipment
CN116030472A (en) Text coordinate determining method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination