CN115578729A - AI intelligent process arrangement method for digital staff

Info

Publication number: CN115578729A
Application number: CN202211457579.5A
Authority: CN
Inventors: 冯珺; 彭梁英; 王红凯; 王艺丹; 张辰; 章九鼎; 张楠; 孙镇
Original assignee: State Grid Zhejiang Electric Power Co Ltd; Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Current assignee: State Grid Zhejiang Electric Power Co Ltd; Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority date: 2022-11-21
Filing date: 2022-11-21
Publication date: 2023-01-06
Anticipated expiration: 2042-11-21
Also published as: CN115578729B

Abstract

The invention discloses a digital employee AI intelligent process arrangement method, comprising the following steps: collecting an original image of a paper file bearing process information, and performing difference graying on the original image to obtain a plurality of difference grayed images; rotating the difference grayed images by preset angles to obtain a plurality of rotated grayscale images; expanding (morphologically dilating) the rotated grayscale images, and detecting the character straight line formed by each line of expanded characters using the Hough transform to obtain a character walking diagram, that is, a map of the text-line directions; performing perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction map; and extracting arrow marks in the correction map, performing affine transformation on the correction map with the arrow marks as auxiliary information, rotating it to obtain a restored map, binarizing the restored map, inputting it to a character recognition module for recognition, and sequentially extracting the process information to complete the arrangement. The invention obtains an accurate character direction, avoids recognition errors caused by unusual angles and similar factors, and helps improve processing speed and accuracy.

Description

AI intelligent process arrangement method for digital staff

Technical Field

The invention relates to the technical field of data processing, in particular to a digital employee AI intelligent process arrangement method.

Background

At present, manually entering the process information recorded in paper documents into a computer is inefficient, so image recognition is a common solution. OCR (Optical Character Recognition) refers to a process in which an electronic device examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text using a character recognition method. For printed characters, it optically converts the text of a paper document into a black-and-white dot-matrix image file, and recognition software then converts the characters in the image into a text format for further editing by word processing software. The main indicators of OCR system performance are the rejection rate, the false recognition rate, the recognition speed, product stability, usability and the like. An RPA digital employee deeply integrates traditional character recognition with machine learning, can parse data from non-standard documents, and helps convert handwritten text into a machine-readable format. In most cases, OCR is used to simplify paper-based services and convert them into digital services, for example for PDFs, scanned documents, paper invoices, faxes and handwritten documents.

However, when paper documents are recognized, a document may be placed irregularly or at a misaligned angle, so the direction of the characters actually photographed may vary. Moreover, some documents contain tables or flowcharts, for which conventional character direction judgment cannot be used. The prior art does not handle this problem well, and accurate recognition can be difficult, particularly when the angle deviation is large.

Disclosure of Invention

To address the difficulty in the prior art of judging the angle or direction of characters when recognizing paper documents, the invention provides a digital employee AI intelligent process arrangement method. The method mainly concerns preprocessing for character recognition: it automatically corrects the angle and direction of the characters, avoids recognition errors or failures caused by special tables, flowcharts and similar factors, helps improve processing speed and accuracy, and produces accurate, clear results that are convenient for subsequent recognition.

The technical scheme of the invention is as follows.

The digital employee AI intelligent process arrangement method comprises the following steps:

S1: collecting an original image of a paper file bearing process information, and performing difference graying on the original image to obtain a plurality of difference grayed images;

S2: rotating the difference grayed images by preset angles to obtain a plurality of rotated grayscale images;

S3: expanding (morphologically dilating) the rotated grayscale images, and detecting the character straight line formed by each line of expanded characters using the Hough transform to obtain a character walking diagram, that is, a map of the text-line directions;

S4: performing perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction map;

S5: extracting arrow marks in the correction map, performing affine transformation on the correction map with the arrow marks as auxiliary information, rotating it to obtain a restored map, binarizing the restored map, inputting it to a character recognition module for recognition, and sequentially extracting the process information to complete the arrangement.

Through difference graying, the invention prevents the unclear images that a single graying method may produce. Through rotation by the preset angles, it ensures that at least one image lies at a small angle to the upright position, which reduces the probability of error in the subsequent transformations. Finally, through a series of transformations and with the help of the arrow marks when recognizing process information, it avoids recognition errors or failures caused by special tables, flowcharts and the like, and thus helps improve processing speed and accuracy.

Preferably, performing difference graying on the original image includes:

performing average graying on the original image, taking the average of the RGB values as the gray value, to obtain an average grayed image;

performing maximum graying on the original image, taking the maximum of the RGB values as the gray value, to obtain a maximum grayed image;

and performing weighted average graying on the original image, combining the RGB values with preset weights, to obtain a weighted average grayed image.
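The three graying variants listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the list-of-rows image layout and the dictionary keys are assumptions made here, and the weights are taken as given inputs.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255 (assumed layout)


def average_gray(p: Pixel) -> float:
    """Average graying: mean of the R, G and B values."""
    r, g, b = p
    return (r + g + b) / 3


def maximum_gray(p: Pixel) -> int:
    """Maximum graying: largest of the R, G and B values."""
    return max(p)


def weighted_gray(p: Pixel, weights: Tuple[float, float, float]) -> float:
    """Weighted average graying with per-channel weights that sum to 1."""
    r, g, b = p
    wr, wg, wb = weights
    return wr * r + wg * g + wb * b


def difference_gray(image: List[List[Pixel]],
                    weights: Tuple[float, float, float]) -> Dict[str, list]:
    """Return the three grayed versions of `image` (a list of rows of pixels)."""
    return {
        "average": [[average_gray(p) for p in row] for row in image],
        "maximum": [[maximum_gray(p) for p in row] for row in image],
        "weighted": [[weighted_gray(p, weights) for p in row] for row in image],
    }
```

Calling `difference_gray(image, weights)` once per original image yields the plurality of difference grayed images that the later steps operate on.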

Preferably, the preset weights are obtained as follows:

calculating the ratio of pixels whose R value is greater than a critical value to the total pixels in the original image to obtain a first ratio, calculating the ratio of pixels whose G value is greater than the critical value to the total pixels to obtain a second ratio, and calculating the ratio of pixels whose B value is greater than the critical value to the total pixels to obtain a third ratio;

and determining the preset weight of each of the R, G and B values in proportion to the first ratio, the second ratio and the third ratio.

In this scheme, taking the ratio of pixels whose R value is greater than the critical value as an example, the larger the first ratio, the larger the share of the R component across the whole image and the greater its influence on the image; therefore, when the preset weights are determined in proportion, the R value receives a larger weight, and conversely, the smaller the ratio, the smaller the weight. This strengthens the differences arising from the color characteristics of the image and is particularly suitable for image processing for character recognition: compared with ordinary images, in an image consisting mainly of text the color parameters of the characters and the background usually show a distinct gap, and this method amplifies the difference caused by that gap. The critical value is generally set to about 128 and can be adjusted as needed.
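A minimal sketch of the weight calculation described above, assuming the image is available as a flat list of (R, G, B) tuples. The function name `preset_weights` and the equal-weight fallback for an image in which no channel exceeds the critical value are assumptions made here; the source does not cover that case.

```python
def preset_weights(pixels, critical=128):
    """Derive RGB weights from the share of pixels whose channel exceeds `critical`."""
    total = len(pixels)
    if total == 0:
        raise ValueError("image has no pixels")
    # First, second and third ratio: fraction of pixels with R, G, B > critical.
    ratios = [sum(1 for p in pixels if p[c] > critical) / total for c in range(3)]
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        # Assumption: fall back to equal weights when no channel exceeds the value.
        return (1 / 3, 1 / 3, 1 / 3)
    # Scale the three ratios in proportion so that the weights sum to 1.
    return tuple(r / ratio_sum for r in ratios)
```

For an image that is 70% (180, 250, 100) and 30% (100, 180, 250), the ratios are 0.7, 1 and 0.3, so the sketch returns weights of roughly 0.35, 0.5 and 0.15.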

Preferably, rotating the difference grayed images by preset angles to obtain a plurality of rotated grayscale images includes:

setting the preset angles to -90 degrees, 90 degrees and 180 degrees, selecting each preset angle in turn for each difference grayed image, and rotating the image to obtain a plurality of rotated grayscale images. Generally, an image at an uncertain angle is most easily recognized when its angle to the expected upright position is less than 45 degrees. In practice, however, the image may lie on its side or be upside down, which seriously increases the recognition difficulty. This rotation therefore ensures that at least one image within 45 degrees of the upright position is obtained, which raises the probability of accurate recognition and facilitates character recognition.
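The effect of the preset rotations can be illustrated with a short sketch. It assumes that the unrotated image (0 degrees) is kept alongside the three rotated copies, which the 45-degree guarantee appears to require, and that clockwise rotation is positive. The names `rotate_grid`, `signed_offset` and `best_preset` are hypothetical.

```python
def rotate_grid(grid, angle):
    """Rotate a 2-D list by a multiple of 90 degrees (clockwise positive)."""
    angle %= 360
    if angle == 0:
        return [list(row) for row in grid]
    if angle == 90:   # clockwise quarter turn
        return [list(row) for row in zip(*grid[::-1])]
    if angle == 180:  # half turn
        return [list(row[::-1]) for row in grid[::-1]]
    if angle == 270:  # same as -90: counter-clockwise quarter turn
        return [list(row) for row in zip(*grid)][::-1]
    raise ValueError("only multiples of 90 degrees are supported")


def signed_offset(angle):
    """Map an angle in degrees to the range [-180, 180)."""
    return (angle + 180) % 360 - 180


def best_preset(skew, presets=(0, -90, 90, 180)):
    """Pick the preset that leaves the smallest offset from upright.

    Assumption: 0 (the unrotated image) is among the presets.
    """
    return min(presets, key=lambda p: abs(signed_offset(skew + p)))
```

Whatever the initial skew, one of the four copies ends up no more than 45 degrees from upright; for example, a page photographed at 135 degrees is brought to 45 degrees by the -90 degree preset.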

Preferably, performing perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction map includes:

taking any character straight line in the character walking diagram as a reference straight line, and locally stretching or compressing the pixels of the rotated grayscale image before expansion so that the remaining character straight lines become parallel to the reference straight line, thereby obtaining the correction map.

Preferably, extracting arrow marks in the correction map, performing affine transformation on the correction map with the arrow marks as auxiliary information, and rotating it to obtain a restored map includes:

determining the direction of each arrow mark in the same correction map to obtain a plurality of unit vectors, calculating the total vector of the unit vectors, and determining the pointing direction (x, y) of the total vector;

rotating the correction map until the character straight lines in it are horizontal and y in the pointing direction (x, y) of the total vector is less than or equal to 0, to obtain a candidate map;

and screening according to the actual rotation angle of each candidate map relative to the original image, and retaining at least one qualified candidate map as the restored map.

This scheme is specifically optimized for recognizing flowcharts with arrows. A flowchart as a whole generally runs from top to bottom, but the arrows of local branches do not all point the same way, so the judgment is based on the pointing direction of the total vector: when y is less than or equal to 0 after rotation, the total vector has no upward component, and the condition is satisfied whether the total vector leans left or right. This step filters out images that are upside down after rotation.
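The arrow-based check can be sketched as follows, assuming each detected arrow is already reduced to a direction angle in a coordinate system whose y axis points up (0 degrees is +x, 90 degrees is +y), so that y <= 0 means the total vector does not point upward. Arrow detection itself is out of scope here, and the names, the angular tolerance and the small epsilon for floating-point noise are assumptions.

```python
import math


def total_vector(arrow_angles_deg):
    """Sum one unit vector per arrow; angles in degrees, y axis pointing up."""
    x = sum(math.cos(math.radians(a)) for a in arrow_angles_deg)
    y = sum(math.sin(math.radians(a)) for a in arrow_angles_deg)
    return x, y


def rotate_angles(arrow_angles_deg, rotation_deg):
    """Arrow directions after rotating the whole map by `rotation_deg`."""
    return [(a + rotation_deg) % 360 for a in arrow_angles_deg]


def is_candidate(arrow_angles_deg, line_angle_deg, tol_deg=1.0, eps=1e-9):
    """True when text lines are horizontal and the total vector has y <= 0."""
    # A text line has no direction, so fold its angle into [-90, 90).
    offset = (line_angle_deg + 90) % 180 - 90
    _, y = total_vector(arrow_angles_deg)
    # eps absorbs rounding noise such as sin(180 degrees) being about 1e-16.
    return abs(offset) <= tol_deg and y <= eps
```

A flowchart with two downward arrows and one arrow each to the left and right gives a total vector of about (0, -2) and passes the check; the same map turned by 180 degrees gives about (0, 2) and is rejected even though its text lines are still horizontal.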

Preferably, screening according to the actual rotation angle of each candidate map relative to the original image and retaining at least one qualified candidate map as the restored map includes:

determining the actual rotation angles, relative to the original image, of the different candidate maps obtained by processing the same original image, calculating the numerical distribution of the actual rotation angles, retaining the actual rotation angles whose values differ by no more than 10%, deleting the candidate maps corresponding to the remaining actual rotation angles, and taking the retained candidate maps as the restored maps. Even when the character straight lines are horizontal, the image may still have been processed upside down; the arrow-based judgment makes this less likely, and the further screening essentially eliminates it.

Preferably, the actual rotation angle is calculated as follows:

recording the preset angle p by which each rotated grayscale image was rotated;

recording the rotation angle q applied when the correction map is rotated to obtain the candidate map;

and calculating the actual rotation angle C = q + p, where clockwise rotation is recorded as positive and counter-clockwise rotation as negative.
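The angle bookkeeping and the screening step can be sketched together. Two points are assumptions rather than statements from the source: the sum C = q + p is wrapped into [-180, 180) so that equivalent rotations compare equal, and "within 10%" is read as within 10% of a full turn (36 degrees) of the median angle, since the source does not say what the percentage is relative to. The function and parameter names are hypothetical.

```python
from statistics import median


def actual_rotation(p, q):
    """C = q + p in degrees, clockwise positive, wrapped to [-180, 180)."""
    return (p + q + 180) % 360 - 180


def screen(candidates, tolerance=0.10):
    """Keep candidates whose actual rotation agrees with the majority.

    candidates: list of (name, p, q) tuples for one original image.
    """
    angles = {name: actual_rotation(p, q) for name, p, q in candidates}
    reference = median(angles.values())
    limit = tolerance * 360  # assumption: 10% of a full turn
    kept = []
    for name, c in angles.items():
        # Circular difference between this angle and the reference.
        diff = abs((c - reference + 180) % 360 - 180)
        if diff <= limit:
            kept.append(name)
    return kept
```

With candidates whose (p, q) pairs are (0, 30), (90, -58), (-90, 121) and (180, 32), the first three agree on an actual rotation of about 30 degrees, while the fourth works out to -148 degrees and is discarded as an upside-down result.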

The substantial effects of the invention include the following. The digital employee performs AI character recognition on the image to be recognized, and difference graying yields several grayscale images that emphasize different color characteristics, so that the result with the clearest features can be obtained. Rotation by the preset angles yields at least one image within 45 degrees of the upright position, which raises the probability of accurate recognition. An overall judgment of the arrow directions assists the correction process. Through these progressive steps and their combined action, errors in angle and direction are gradually reduced, the success rate of correction increases, the character direction is finally judged accurately without character inversion, and the method is suitable for the initial recognition of flowcharts.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions will be clearly and completely described below with reference to the embodiments, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be understood that, in the various embodiments of the present invention, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

It should be understood that in the present application, "comprising" and "having" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that, in the present invention, "a plurality" means two or more. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "Comprises A, B and C" and "comprises A, B, C" mean that all three of A, B and C are comprised; "comprises A, B or C" means that one of A, B and C is comprised; and "comprises A, B and/or C" means that any one, any two or all three of A, B and C are comprised.

The technical solution of the present invention will be described in detail below with specific examples. Embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.

Example:

The digital employee AI intelligent process arrangement method, as shown in FIG. 1, includes the following steps:

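S1: Collect an original image of a paper file bearing process information, and perform difference graying on the original image to obtain a plurality of difference grayed images. This comprises:

performing average graying on the original image, taking the average of the RGB values as the gray value, to obtain an average grayed image;

performing maximum graying on the original image, taking the maximum of the RGB values as the gray value, to obtain a maximum grayed image;

and performing weighted average graying on the original image, combining the RGB values with preset weights, to obtain a weighted average grayed image.

The preset weights are obtained as follows:

calculating the ratio of pixels whose R value is greater than a critical value to the total pixels in the original image to obtain a first ratio, calculating the ratio of pixels whose G value is greater than the critical value to the total pixels to obtain a second ratio, and calculating the ratio of pixels whose B value is greater than the critical value to the total pixels to obtain a third ratio;

and determining the preset weight of each of the R, G and B values in proportion to the first ratio, the second ratio and the third ratio.

In this scheme, taking the ratio of pixels whose R value is greater than the critical value as an example, the larger the first ratio, the larger the share of the R component across the whole image and the greater its influence on the image; therefore, when the preset weights are determined in proportion, the R value receives a larger weight, and conversely, the smaller the ratio, the smaller the weight. This strengthens the differences arising from the color characteristics of the image and is particularly suitable for image processing for character recognition: compared with ordinary images, in an image consisting mainly of text the color parameters of the characters and the background usually show a distinct gap, and this method amplifies the difference caused by that gap.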

In most cases a conventional graying method achieves the expected effect, but sometimes it does not. Consider a photographed image in which the background and text colors are very similar: the background is greenish, with RGB value (180,250,100), and covers 70% of the image, while the text is bluish, with RGB value (100,180,250), and covers 30% of the image. With average graying, the background and the text receive the same gray value (530/3, about 176.7), which clearly hinders subsequent recognition; with maximum graying, they again receive the same gray value (250). In most cases these methods do not produce identical gray values for text and background, but in a case like this one another graying method is needed.

With the weighted average graying of this embodiment and a critical value of 128, the pixels whose R value is greater than 128 make up 70% of the image, so the first ratio is 0.7; the G value of every pixel is greater than 128, so the second ratio is 1; similarly, the third ratio is 0.3. Determining the preset weights in proportion gives 0.35 for R, 0.5 for G and 0.15 for B, so the gray value of the background is 203 and the gray value of the text is 162.5. Because the original colors are very similar, the difference after graying is still not striking to the naked eye, but it is clear compared with the conventional methods, and a clearer image can be obtained by adjusting the critical value.
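The arithmetic behind these figures can be written out step by step. The symbols w_R, w_G and w_B for the preset weights are notation introduced here for illustration only.

```latex
\begin{aligned}
(w_R,\ w_G,\ w_B) &= \frac{(0.7,\ 1,\ 0.3)}{0.7 + 1 + 0.3} = (0.35,\ 0.5,\ 0.15) \\
\text{background } (180, 250, 100):\quad & 0.35 \cdot 180 + 0.5 \cdot 250 + 0.15 \cdot 100 = 63 + 125 + 15 = 203 \\
\text{text } (100, 180, 250):\quad & 0.35 \cdot 100 + 0.5 \cdot 180 + 0.15 \cdot 250 = 35 + 90 + 37.5 = 162.5 \\
\text{average graying, both colors}:\quad & \tfrac{180 + 250 + 100}{3} = \tfrac{100 + 180 + 250}{3} \approx 176.7 \\
\text{maximum graying, both colors}:\quad & \max(180, 250, 100) = \max(100, 180, 250) = 250
\end{aligned}
```

The weighted scheme therefore separates the two colors by 40.5 gray levels, whereas average and maximum graying leave them indistinguishable.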

S2: Rotate the difference grayed images by preset angles to obtain a plurality of rotated grayscale images. This comprises:

setting the preset angles to -90 degrees, 90 degrees and 180 degrees, selecting each preset angle in turn for each difference grayed image, and rotating the image to obtain a plurality of rotated grayscale images. Generally, an image at an uncertain angle is most easily recognized when its angle to the expected upright position is less than 45 degrees. In practice, however, the image may lie on its side or be upside down, which seriously increases the recognition difficulty. This rotation therefore ensures that at least one image within 45 degrees of the upright position is obtained, which raises the probability of accurate recognition and facilitates character recognition.

S3: Expand (morphologically dilate) the rotated grayscale images, and detect the character straight line formed by each line of expanded characters using the Hough transform to obtain a character walking diagram, that is, a map of the text-line directions.

The Hough transform is the most common method for tilt correction. The image is first expanded so that the separate characters of a line merge into a continuous straight line, which makes the line easy to detect.
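As a rough illustration of this step, the sketch below expands a small binary image with a horizontal structuring element and then finds the strongest line with a basic Hough accumulator. It assumes the image has already been binarized into rows of 0 and 1 and that "expansion" means morphological dilation; the function names and the one-degree angle step are choices made here, and a production system would normally rely on an image-processing library instead.

```python
import math


def dilate_horizontal(img, radius=1):
    """Binary dilation with a horizontal 1 x (2*radius + 1) structuring element."""
    height, width = len(img), len(img[0])
    return [[1 if any(img[y][max(0, x - radius):x + radius + 1]) else 0
             for x in range(width)]
            for y in range(height)]


def hough_peak(img, angle_step=1):
    """Return (theta_deg, rho) of the strongest line.

    Lines are parameterized as rho = x*cos(theta) + y*sin(theta), where theta
    is the angle of the line's normal; the line itself runs at theta - 90.
    """
    votes = {}
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if not value:
                continue
            for theta in range(0, 180, angle_step):
                t = math.radians(theta)
                rho = round(x * math.cos(t) + y * math.sin(t))
                votes[(theta, rho)] = votes.get((theta, rho), 0) + 1
    return max(votes, key=votes.get)
```

For a row of separated character pixels, dilation closes the gaps, and the accumulator then peaks at theta = 90, meaning the text line runs at 0 degrees (horizontal). Collecting one such line per row of text gives the set of character straight lines that make up the character walking diagram.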

S4: Perform perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction map. This comprises:

taking any character straight line in the character walking diagram as a reference straight line, and locally stretching or compressing the pixels of the rotated grayscale image before expansion so that the remaining character straight lines become parallel to the reference straight line, thereby obtaining the correction map. The process is similar to keystone (trapezoidal) correction and corrects the angular distortion caused by the shooting position.
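One common way to realize such a keystone-style correction is a four-point perspective transform (homography). The source does not specify this technique, so the sketch below is a stand-in under stated assumptions: the end points of the reference line and of one other character straight line are used as the four source points, and they are mapped to the corners of a rectangle. All function names are hypothetical.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 perspective matrix (last entry fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1*x + h2*y + h3) / (h7*x + h8*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def warp_point(hm, x, y):
    """Apply the perspective matrix to one point."""
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)
```

With a slanted top text line from (10, 10) to (90, 0) and a horizontal reference line from (0, 100) to (100, 100), mapping these four end points to the corners of a 100 by 100 square makes the top line horizontal, and therefore parallel to the reference line. Warping every pixel of the unexpanded image with the same matrix would produce the correction map.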

S5: Extract arrow marks in the correction map, perform affine transformation on the correction map with the arrow marks as auxiliary information, rotate it to obtain a restored map, binarize the restored map, input it to a character recognition module for recognition, and sequentially extract the process information to complete the arrangement. The affine transformation and rotation bring the tilted picture to the horizontal position. This comprises:

determining the direction of each arrow mark in the same correction map to obtain a plurality of unit vectors, calculating the total vector of the unit vectors, and determining the pointing direction (x, y) of the total vector;

rotating the correction map until the character straight lines in it are horizontal and y in the pointing direction (x, y) of the total vector is less than or equal to 0, to obtain a candidate map;

and screening according to the actual rotation angle of each candidate map relative to the original image, and retaining at least one qualified candidate map as the restored map.

In addition, the restored map is extracted as follows:

determining the actual rotation angles, relative to the original image, of the different candidate maps obtained by processing the same original image, calculating the numerical distribution of the actual rotation angles, retaining the actual rotation angles whose values differ by no more than 10%, deleting the candidate maps corresponding to the remaining actual rotation angles, and taking the retained candidate maps as the restored maps.

The actual rotation angle is calculated as follows:

recording the preset angle p by which each rotated grayscale image was rotated;

recording the rotation angle q applied when the correction map is rotated to obtain the candidate map;

and calculating the actual rotation angle C = q + p, where clockwise rotation is recorded as positive and counter-clockwise rotation as negative.

Even when the character straight lines are horizontal, the image may still have been processed upside down; the arrow-based judgment makes this less likely, and the further screening essentially eliminates it.

It should be noted that, through the above steps, errors in angle and direction are gradually reduced, the success rate of correction increases, the character direction is finally judged accurately without character inversion, and the method is suitable for the initial recognition of flowcharts. The steps work together so that the combined effect exceeds the sum of the individual effects; omitting any step prevents the other steps from achieving their best effect and leads to inaccurate results.

Through difference graying, this embodiment prevents the unclear images that a single graying method may produce. Through rotation by the preset angles, it ensures that at least one image lies at a small angle to the upright position, which reduces the probability of error in the subsequent transformations. Finally, through a series of transformations and with the help of the arrow marks when recognizing process information, it prevents character inversion, avoids recognition errors or failures caused by special tables, flowcharts and the like, and helps improve processing speed and accuracy.

In the embodiments provided in this application, it should be understood that the disclosed structures and methods may be implemented in other ways, or some features may be omitted, or not implemented.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

The above describes only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. The digital employee AI intelligent process arrangement method is characterized by comprising the following steps:

s3: expanding the rotated grayscale image, and detecting a character straight line formed by each line of expanded characters by using Hough transform to obtain a character walking diagram;

s4: carrying out perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction diagram;

2. The digital employee AI intelligent process arrangement method according to claim 1, wherein performing difference graying on the original image comprises:

3. The digital employee AI intelligent process arrangement method according to claim 2, wherein the preset weights are obtained by a process comprising:

4. The digital employee AI intelligent process arrangement method according to claim 1, wherein rotating the difference grayed images by preset angles to obtain a plurality of rotated grayscale images comprises:

setting the preset angles to -90 degrees, 90 degrees and 180 degrees, selecting each preset angle in turn for each difference grayed image, and rotating the image to obtain a plurality of rotated grayscale images.

5. The digital employee AI intelligent process arrangement method according to claim 1, wherein performing perspective transformation on the rotated grayscale image before expansion according to the character walking diagram to obtain a correction map comprises:

6. The digital employee AI intelligent process arrangement method according to claim 5, wherein extracting arrow marks in the correction map, performing affine transformation on the correction map with the arrow marks as auxiliary information, and rotating it to obtain a restored map comprises:

7. The digital employee AI intelligent process arrangement method according to claim 6, wherein screening according to the actual rotation angle of each candidate map relative to the original image and retaining at least one qualified candidate map as the restored map comprises:

8. The digital employee AI intelligent process arrangement method according to claim 6 or 7, wherein the actual rotation angle is calculated by a process comprising:

recording a preset angle p of rotation of each rotated grayscale image;