CN113158755A - Method for improving accuracy of bank pipelining recognition - Google Patents


Info

Publication number
CN113158755A
Authority
CN
China
Prior art keywords
picture
bank
coordinates
dimensional table
data item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110174145.3A
Other languages
Chinese (zh)
Inventor
李潇
董伯文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Fuli Technology Co Ltd
Original Assignee
Shanghai Fuli Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Fuli Technology Co Ltd
Priority to CN202110174145.3A
Publication of CN113158755A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of financial risk control, and in particular to a method for improving the accuracy of bank statement recognition. The method proceeds as follows: S1: scanning the paper bank statement into an electronic file and inputting the file into a computer; S2: rotating the scanned picture so that its content is substantially horizontal; S3: once the picture is substantially horizontal, acquiring the abscissas and ordinates of the two-dimensional table formed by the content of the bank statement picture; S4: dividing the picture according to the abscissas and ordinates so that each data item corresponds to one small picture; S5: recognizing the content of the small pictures one by one and organizing the recognized content into tabular data in text form, thereby completing data recognition. The method improves the accuracy of bank statement recognition by providing a picture processing method for the preprocessing of electronic pictures scanned from paper bank statements.

Description

Method for improving the accuracy of bank statement recognition
Technical Field
The invention relates to the technical field of financial risk control, and in particular to a method for improving the accuracy of bank statement recognition.
Background
In the field of financial risk control, bank statement analysis and its results are a very important risk control strategy and indicator. Bank statement analysis involves a large amount of calculation, so performing it with a computer system greatly improves its efficiency. In current banking systems, however, statements are in many cases provided on paper. If the analysis is to be performed by computer, the paper bank statement must first be scanned and recognized. Common bank statement recognition methods recognize the whole statement image at once, and their character recognition rate is not high. The invention provides a bank statement recognition method for improving the accuracy of bank statement recognition.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method for improving the accuracy of bank statement recognition: a picture processing method applied to the preprocessing of electronic pictures scanned from paper bank statements.
To achieve this purpose, a method for improving the accuracy of bank statement recognition is designed, characterized in that the specific process is as follows:
S1: scanning the paper bank statement into an electronic file and inputting the file into a computer;
S2: rotating the scanned picture so that its content is substantially horizontal;
S3: once the picture is substantially horizontal, acquiring the abscissas and ordinates of the two-dimensional table formed by the content of the bank statement picture;
S4: dividing the picture according to the abscissas and ordinates so that each data item corresponds to one small picture;
S5: recognizing the content of the small pictures one by one and organizing the recognized content into tabular data in text form, thereby completing data recognition.
The specific process of S2 is as follows:
S21: determining whether the scanned picture is a bank statement with a two-dimensional table; if so, performing steps S22 to S26; otherwise, performing steps S27 to S211;
S22: if the scanned picture is a bank statement with a two-dimensional table, the picture contains a number of short line segments; among them, the line segments whose y-coordinate is close to the minimum y-coordinate of all line segments are selected, and these segments form the first line of the two-dimensional table;
S23: in the first line, the two endpoints of the leftmost line segment are recorded as (x1, y1) and (x2, y2), and the two endpoints of the rightmost line segment are recorded as (x3, y3) and (x4, y4);
S24: (x1, y1) and (x4, y4) are taken as the start point and end point of the two-dimensional table of data in the bank statement picture, and the line through them serves as the reference line for rotating the picture;
S25: the slope is calculated from the coordinates of this line: rate = (y4 - y1)/(x4 - x1); if the absolute value of the slope is greater than 0.005, the picture needs to be rotated;
S26: the picture is rotated about its center point by the angle calculated from the slope rate;
S27: if the scanned picture is not a bank statement with a two-dimensional table, the picture contains no table line segments, so the rotation is adjusted using the arrangement of the data item content as the reference;
S28: binarization, dilation, erosion and inversion are applied to the picture so that the parts containing data content are highlighted;
S29: rectangles of relatively small area are found, and the data items in the picture are located according to the coordinates of these rectangles;
S210: from all the rectangles, the rectangles belonging to the data item table are located according to the approximate position of the data items in the picture;
S211: the coordinates of the rectangles in the first row are selected, and the picture is rotated according to the method of steps S22 to S26 until the data items are substantially horizontal.
The specific process of S3 is as follows:
S31: if the scanned picture is a bank statement with a two-dimensional table, acquiring the coordinates of all vertical lines in the picture, storing the x-coordinates of the endpoints of the vertical lines in a list, and sorting the list;
S32: grouping numbers of similar magnitude into sub-lists, all the sub-lists together forming a list of lists;
S33: averaging the numbers in each sub-list, the averages forming a list;
S34: if the scanned picture is not a bank statement with a two-dimensional table, acquiring the rectangle coordinates of the vast majority of the data item content in the picture;
S35: arranging the majority of the data items into a two-dimensional table;
S36: excluding the non-transaction data items according to the position of the data item content in the picture, and obtaining the rectangle coordinates of the key data item content.
In step S4, the picture is divided: a two-dimensional grid is drawn on the picture according to the abscissas and ordinates obtained in step S3, and the picture is cut along the grid to obtain the corresponding data pictures, each of which corresponds to a single data item.
Compared with the prior art, the invention improves the accuracy of bank statement recognition by providing a picture processing method for the preprocessing of electronic pictures scanned from paper bank statements.
Drawings
FIG. 1 is a software flow diagram of the present invention.
FIG. 2 is a schematic diagram of a bank statement with a two-dimensional table.
FIG. 3 is a schematic diagram of a bank statement without a two-dimensional table.
FIG. 4 is a schematic diagram of a bank statement before processing by the present invention.
FIG. 5 is a schematic diagram of a bank statement after processing by the present invention.
Detailed Description
The invention is further illustrated below with reference to the accompanying drawings.
As shown in FIG. 1, a method for improving the accuracy of bank statement recognition specifically includes the following steps:
S1: scanning the paper bank statement into an electronic file and inputting the file into a computer;
S2: rotating the scanned picture so that its content is substantially horizontal;
S3: once the picture is substantially horizontal, acquiring the abscissas and ordinates of the two-dimensional table formed by the content of the bank statement picture;
S4: dividing the picture according to the abscissas and ordinates so that each data item corresponds to one small picture;
S5: recognizing the content of the small pictures one by one and organizing the recognized content into tabular data in text form, thereby completing data recognition.
The specific flow of S2 is as follows:
S21: determining whether the scanned picture is a bank statement with a two-dimensional table; if so, performing steps S22 to S26; otherwise, performing steps S27 to S211;
S22: if the scanned picture is a bank statement with a two-dimensional table, the picture contains a number of short line segments; among them, the line segments whose y-coordinate is close to the minimum y-coordinate of all line segments are selected, and these segments form the first line of the two-dimensional table;
S23: in the first line, the two endpoints of the leftmost line segment are recorded as (x1, y1) and (x2, y2), and the two endpoints of the rightmost line segment are recorded as (x3, y3) and (x4, y4);
S24: (x1, y1) and (x4, y4) are taken as the start point and end point of the two-dimensional table of data in the bank statement picture, and the line through them serves as the reference line for rotating the picture;
S25: the slope is calculated from the coordinates of this line: rate = (y4 - y1)/(x4 - x1); if the absolute value of the slope is greater than 0.005, the picture needs to be rotated;
S26: the picture is rotated about its center point by the angle calculated from the slope rate;
S27: if the scanned picture is not a bank statement with a two-dimensional table, the picture contains no table line segments, so the rotation is adjusted using the arrangement of the data item content as the reference;
S28: binarization, dilation, erosion and inversion are applied to the picture so that the parts containing data content are highlighted;
S29: rectangles of relatively small area are found, and the data items in the picture are located according to the coordinates of these rectangles;
S210: from all the rectangles, the rectangles belonging to the data item table are located according to the approximate position of the data items in the picture;
S211: the coordinates of the rectangles in the first row are selected, and the picture is rotated according to the method of steps S22 to S26 until the data items are substantially horizontal.
The specific flow of S3 is as follows:
S31: if the scanned picture is a bank statement with a two-dimensional table, acquiring the coordinates of all vertical lines in the picture, storing the x-coordinates of the endpoints of the vertical lines in a list, and sorting the list;
S32: grouping numbers of similar magnitude into sub-lists, all the sub-lists together forming a list of lists;
S33: averaging the numbers in each sub-list, the averages forming a list;
S34: if the scanned picture is not a bank statement with a two-dimensional table, acquiring the rectangle coordinates of the vast majority of the data item content in the picture;
S35: arranging the majority of the data items into a two-dimensional table;
S36: excluding the non-transaction data items according to the position of the data item content in the picture, and obtaining the rectangle coordinates of the key data item content.
In step S4, the picture is divided: a two-dimensional grid is drawn on the picture according to the abscissas and ordinates obtained in step S3, and the picture is cut along the grid to obtain the corresponding data pictures, each of which corresponds to a single data item.
Because a bank statement is usually presented as a two-dimensional table, the invention divides the picture according to the layout of the bank statement picture: the picture is cut into separate small pictures by column heading (such as transaction date, transaction amount, balance and remarks) and by transaction record, so that each small picture contains exactly one data item. Picture processing and recognition are then tailored to the type of each data item. For example, for data item pictures in categories such as transaction amount, balance and transaction date, the range of possible recognition results can be narrowed (the result can only be numeric), and a machine learning recognition model can be trained specifically for them, which improves the recognition rate of the key data item content.
To divide the data item content of a bank statement picture in two dimensions, the picture is processed as follows. First, the picture is rotated so that its content is substantially horizontal. Scanned bank statement pictures may be skewed, so the picture must be substantially horizontal before it can be cut by coordinates (step S2 in FIG. 1). Second, once the picture is substantially horizontal, the abscissas and ordinates of the two-dimensional table formed by the content of the bank statement picture are acquired (step S3 in FIG. 1). Third, the picture is divided according to the abscissas and ordinates so that each data item becomes one small picture (step S4 in FIG. 1). Finally, the content of the small pictures is recognized one by one, and the recognized content is organized into tabular data in text form, completing data recognition. Because each small picture contains a single type of content at this step, measures that improve recognition accuracy can be applied to specific content (step S5 in FIG. 1).
In step S2, a line must be found in the picture to serve as the reference for rotation; rotating the picture until this reference line is close to horizontal makes the entire content horizontal. Rotation is handled in two cases: case one is a bank statement with a two-dimensional table, as shown in FIG. 2; case two is a bank statement without a two-dimensional table, as shown in FIG. 3.
In the first case, horizontal line segments are searched for in the picture. Because of the statement itself, the scanning quality and similar factors, the search yields a number of short line segments, which together make up all the horizontal lines of the two-dimensional table. Among these segments, those whose y-coordinate is close to the minimum y-coordinate (less than 20 pixels away) are selected; these segments form the first line of the two-dimensional table. In the first line, the two endpoints of the leftmost line segment are recorded as (x1, y1) and (x2, y2), and the two endpoints of the rightmost line segment are recorded as (x3, y3) and (x4, y4). (x1, y1) and (x4, y4) are taken as the start point and end point of the two-dimensional table of data in the bank statement picture, and the line through them is the reference line for rotating the picture. The slope (the tangent of the line's angle) is calculated from the coordinates of this line: rate = (y4 - y1)/(x4 - x1). If the absolute value of the slope is large (greater than 0.005), the picture needs to be rotated. The picture is rotated about its center point by the angle calculated from the slope rate. The first line is then acquired again and the absolute value of its slope is recalculated and checked, repeating until the picture is substantially horizontal.
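The slope check of case one can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: line detection is assumed to have already produced the segments, and the names `first_table_line`, `skew`, `y_tol` and `threshold` are hypothetical. The 20-pixel tolerance and the 0.005 threshold are the values given in the text.

```python
import math

def first_table_line(segments, y_tol=20):
    """Return the start and end points of the topmost table line.

    segments: list of ((xa, ya), (xb, yb)) near-horizontal line segments.
    """
    y_min = min(min(a[1], b[1]) for a, b in segments)
    # Keep the segments lying within y_tol pixels of the smallest y-coordinate.
    top = [s for s in segments if min(s[0][1], s[1][1]) - y_min < y_tol]
    # Order the two endpoints of each segment from left to right.
    top = [tuple(sorted(s)) for s in top]
    leftmost = min(top, key=lambda s: s[0][0])
    rightmost = max(top, key=lambda s: s[1][0])
    return leftmost[0], rightmost[1]          # (x1, y1), (x4, y4)

def skew(segments, threshold=0.005):
    """Return (rate, angle in degrees, whether rotation is needed)."""
    (x1, y1), (x4, y4) = first_table_line(segments)
    rate = (y4 - y1) / (x4 - x1)              # assumes x4 != x1
    angle = math.degrees(math.atan(rate))
    return rate, angle, abs(rate) > threshold

# Hypothetical segments: two pieces of the top line and one lower line.
segments = [((10, 100), (400, 102)), ((420, 102), (1000, 105)),
            ((10, 300), (1000, 305))]
rate, angle, needs_rotation = skew(segments)
print(rate, angle, needs_rotation)
```

The direction in which the picture must be turned to cancel this angle depends on the coordinate and rotation conventions of the image library used, so the sign should be checked against that library before rotating.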
In the second case, the picture to be analyzed contains no two-dimensional table, so the rotation is adjusted using the arrangement of the data item content as the reference. First, operations such as binarization, dilation, erosion and inversion are applied to the picture so that the parts containing data content are highlighted, as shown in FIG. 4. In FIG. 4, rectangles of relatively small area can be found. The data items in the picture can be located according to the coordinates of these rectangles (the located rectangles are shown in FIG. 5). From all the rectangles in FIG. 5, the rectangles belonging to the data item table can be located according to the approximate position of the data items in the picture. The coordinates of the rectangles in the first row are selected, and the picture is rotated in the same way as in case one so that the data items are substantially horizontal.
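The highlighting and rectangle search of case two can be illustrated with a small pure-Python sketch on a grayscale pixel grid. All function names and the threshold and radius values are assumptions made for illustration; the dilation and erosion here act horizontally only, and a practical implementation would more likely rely on an image library such as OpenCV.

```python
def binarize_inverted(gray, thresh=128):
    # Threshold and invert in one pass: dark ink -> 1, light paper -> 0.
    return [[1 if px < thresh else 0 for px in row] for row in gray]

def _window(mask, radius, combine):
    # Apply `combine` (any or all) over a horizontal window around each pixel.
    width = len(mask[0])
    return [[1 if combine(row[max(0, x - radius):x + radius + 1]) else 0
             for x in range(width)] for row in mask]

def dilate(mask, radius=1):
    return _window(mask, radius, any)    # grows regions, merging nearby glyphs

def erode(mask, radius=1):
    return _window(mask, radius, all)    # shrinks regions back

def bounding_boxes(mask):
    """Inclusive (x0, y0, x1, y1) boxes of 4-connected foreground regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, seen[y][x] = [(x, y)], True
            x0 = x1 = x
            y0 = y1 = y
            while stack:
                cx, cy = stack.pop()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

# Hypothetical 3 x 14 picture: ink at columns 2 and 4 (one data item)
# and at columns 9 and 10 (another data item) on the middle row.
row = [255] * 14
for x in (2, 4, 9, 10):
    row[x] = 0
gray = [[255] * 14, row, [255] * 14]
boxes = bounding_boxes(erode(dilate(binarize_inverted(gray))))
print(boxes)
```

Dilation followed by erosion bridges the small gap between neighbouring glyphs, so each data item yields one rectangle rather than one rectangle per character.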
Step S3 is the most critical step: it acquires the abscissas and ordinates along which the picture content is divided in two dimensions. Since the picture has been rotated in step S2 so that the data content is substantially horizontal, acquiring the coordinates of the rectangle containing each smallest-unit data item ensures that the picture is divided correctly in step S4.
In the first case, the coordinates of all vertical lines in the picture are obtained by the same method as in step S2. The x-coordinates of the endpoints of the vertical lines (the start and end points of each line) are stored in a list and sorted, for example: [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 40, 40, 41, 41, 42, 42, 43, 43, 131, 131, 132, 132, 133, 133, 297, 297, 298, 298, 299, 299, 300, 300, 422, 422, 423, 423, 424, 424, 612, 612, 613, 613, 614, 614, 615, 615, 615, 615, 1438, 1438, 1439, 1439, 1440, 1440, 1441, 1441, 1741, 1741, 1741, 1741, 1742, 1742, 1743, 1743, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2283, 2283, 2284, 2284, 2285, 2285, 2554, 2554, 2555, 2555, 2556, 2556, 2557, 2557, 2684, 2684, 2685, 2685, 2686, 2686, 2687, 2687, 2688, 2688, 2776, 2776, 2777, 2777, 2778, 2778, 2779, 2779, 2780, 2780, 2781, 2781, 2782, 2782, 2783, 2783, 2784, 2784].
Numbers of similar magnitude are grouped into sub-lists, and all the sub-lists together form a list of lists, for example:
[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8];
[40, 40, 41, 41, 42, 42, 43, 43];
[131, 131, 132, 132, 133, 133];
[297, 297, 298, 298, 299, 299, 300, 300];
[422, 422, 423, 423, 424, 424];
[612, 612, 613, 613, 614, 614, 615, 615, 615, 615];
[1438, 1438, 1439, 1439, 1440, 1440, 1441, 1441];
[1741, 1741, 1741, 1741, 1742, 1742, 1743, 1743];
[2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016];
[2283, 2283, 2284, 2284, 2285, 2285];
[2554, 2554, 2555, 2555, 2556, 2556, 2557, 2557];
[2684, 2684, 2685, 2685, 2686, 2686, 2687, 2687, 2688, 2688];
[2776, 2776, 2777, 2777, 2778, 2778, 2779, 2779, 2780, 2780, 2781, 2781, 2782, 2782, 2783, 2783, 2784, 2784].
The numbers in each sub-list are averaged, and the averages form a list, for example: [4, 42, 132, 298, 423, 614, 1440, 1742, 2014, 2284, 2556, 2686, 2780].
Note that a vertical line near the edge of the picture is not a line of the two-dimensional table but the edge of the paper. The coordinates near the picture edges are therefore excluded, giving: [42, 132, 298, 423, 614, 1440, 1742, 2014, 2284, 2556, 2686]. The resulting values are the abscissas of the two-dimensional table in the picture.
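The grouping, averaging and edge exclusion described above can be sketched as follows. The function names and the `max_gap`, `margin` and `extent` parameters are assumptions for illustration; the text does not state the grouping threshold, the picture width or how close to the edge a line must be to count as a paper edge.

```python
def cluster_line_coordinates(values, max_gap=20):
    """Group sorted coordinates; a gap larger than max_gap starts a new group."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups

def table_coordinates(values, extent, margin=20, max_gap=20):
    """Average each group, then drop averages within `margin` of either edge.

    extent: picture width (for abscissas) or height (for ordinates).
    """
    centers = [round(sum(g) / len(g))
               for g in cluster_line_coordinates(values, max_gap)]
    return [c for c in centers if margin <= c <= extent - margin]

# Small hypothetical input: a paper edge near x = 0 and two table lines.
xs = [0, 0, 1, 1, 40, 40, 41, 41, 42, 42, 43, 43, 131, 131, 132, 132, 133, 133]
print(cluster_line_coordinates(xs))
print(table_coordinates(xs, extent=200))
```

Python's built-in `round` rounds halves to the nearest even integer, so a group averaging 41.5 becomes 42 and one averaging 298.5 becomes 298, which agrees with the averaged values listed in the example above.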
In the same way, the ordinates of the two-dimensional table in the picture can be acquired.
Through the above operations, the abscissas and ordinates of the two-dimensional table in the picture are obtained in step S3.
In case two, the picture contains no table line segments by which the data items can be divided. However, the rectangle coordinates (marked in FIG. 5) of the vast majority of the data item content in the picture were already acquired in step S2. The key data item content (the transaction record data) forms a two-dimensional table. The non-transaction data items are excluded according to the position of the key data item content in the picture, and the rectangle coordinates of the key data item content are obtained. From the distribution of the coordinate data, an aggregation pattern similar to that of case one can still be observed for the coordinates of data items of the same type. According to this aggregation pattern, the abscissas and ordinates that divide the data item content are acquired.
In step S4, the picture is divided. A two-dimensional grid is drawn on the picture according to the abscissas and ordinates obtained in step S3, and the picture is cut along the grid to obtain the corresponding data pictures, each of which corresponds to a single data item.
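Turning the two coordinate lists into per-item crop regions can be sketched as below. The function name `grid_cells` and the sample ordinates are hypothetical; the abscissas are the first three values from the example in step S3.

```python
def grid_cells(xs, ys):
    """Return rows of (left, top, right, bottom) boxes between adjacent grid lines."""
    xs, ys = sorted(xs), sorted(ys)
    return [[(x0, y0, x1, y1) for x0, x1 in zip(xs, xs[1:])]
            for y0, y1 in zip(ys, ys[1:])]

# Three vertical lines and three horizontal lines give a 2 x 2 grid of cells.
cells = grid_cells([42, 132, 298], [100, 150, 200])
for table_row in cells:
    print(table_row)
```

Each box follows the (left, upper, right, lower) order that, for instance, Pillow's `Image.crop` accepts, so every box can be passed to a cropping routine to produce the small picture for one data item.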
In step S5, the picture corresponding to each data item is processed and recognized. In this step the range of possible recognition results for certain key data items (such as amounts and transaction dates) can be narrowed, which improves the accuracy of OCR. Each picture can also be recognized by several methods and the results cross-validated, which can greatly improve recognition accuracy.
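One way to picture the narrowing and cross-validation of step S5 is a post-processing pass over the text returned by several recognizers. The sketch below is an assumption-laden illustration: the field names, the regular expressions and the majority-vote rule are hypothetical and are not specified by the patent, and the recognizers themselves are outside its scope.

```python
import re
from collections import Counter

# Hypothetical per-field constraints on what a valid result may look like.
FIELD_PATTERNS = {
    "amount": re.compile(r"-?\d+(\.\d{1,2})?"),
    "date": re.compile(r"\d{4}-?\d{2}-?\d{2}"),
}

def normalize(field, text):
    """Strip separators and return the text only if it fits the field pattern."""
    cleaned = text.replace(",", "").replace(" ", "")
    return cleaned if FIELD_PATTERNS[field].fullmatch(cleaned) else None

def vote(field, candidates):
    """Cross-validate several recognition results by majority among valid ones."""
    valid = [c for c in (normalize(field, t) for t in candidates) if c is not None]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]

# Three hypothetical recognizer outputs for the same amount picture.
print(vote("amount", ["1,234.50", "1234.50", "l234.50"]))
```

Results that cannot belong to the field (here, an amount containing the letter "l") are discarded before voting, which is the sense in which restricting the result range helps.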

Claims (4)

1. A method for improving the accuracy of bank statement recognition, characterized in that the specific process is as follows:
S1: scanning the paper bank statement into an electronic file and inputting the file into a computer;
S2: rotating the scanned picture so that its content is substantially horizontal;
S3: once the picture is substantially horizontal, acquiring the abscissas and ordinates of the two-dimensional table formed by the content of the bank statement picture;
S4: dividing the picture according to the abscissas and ordinates so that each data item corresponds to one small picture;
S5: recognizing the content of the small pictures one by one and organizing the recognized content into tabular data in text form, thereby completing data recognition.
2. The method for improving the accuracy of bank statement recognition according to claim 1, wherein the specific process of S2 is as follows:
S21: determining whether the scanned picture is a bank statement with a two-dimensional table; if so, performing steps S22 to S26; otherwise, performing steps S27 to S211;
S22: if the scanned picture is a bank statement with a two-dimensional table, the picture contains a number of short line segments; among them, the line segments whose y-coordinate is close to the minimum y-coordinate of all line segments are selected, and these segments form the first line of the two-dimensional table;
S23: in the first line, the two endpoints of the leftmost line segment are recorded as (x1, y1) and (x2, y2), and the two endpoints of the rightmost line segment are recorded as (x3, y3) and (x4, y4);
S24: (x1, y1) and (x4, y4) are taken as the start point and end point of the two-dimensional table of data in the bank statement picture, and the line through them serves as the reference line for rotating the picture;
S25: the slope is calculated from the coordinates of this line: rate = (y4 - y1)/(x4 - x1); if the absolute value of the slope is greater than 0.005, the picture needs to be rotated;
S26: the picture is rotated about its center point by the angle calculated from the slope rate;
S27: if the scanned picture is not a bank statement with a two-dimensional table, the picture contains no table line segments, so the rotation is adjusted using the arrangement of the data item content as the reference;
S28: binarization, dilation, erosion and inversion are applied to the picture so that the parts containing data content are highlighted;
S29: rectangles of relatively small area are found, and the data items in the picture are located according to the coordinates of these rectangles;
S210: from all the rectangles, the rectangles belonging to the data item table are located according to the approximate position of the data items in the picture;
S211: the coordinates of the rectangles in the first row are selected, and the picture is rotated according to the method of steps S22 to S26 until the data items are substantially horizontal.
3. The method for improving the accuracy of bank statement recognition according to claim 1, wherein the specific process of S3 is as follows:
S31: if the scanned picture is a bank statement with a two-dimensional table, acquiring the coordinates of all vertical lines in the picture, storing the x-coordinates of the endpoints of the vertical lines in a list, and sorting the list;
S32: grouping numbers of similar magnitude into sub-lists, all the sub-lists together forming a list of lists;
S33: averaging the numbers in each sub-list, the averages forming a list;
S34: if the scanned picture is not a bank statement with a two-dimensional table, acquiring the rectangle coordinates of the vast majority of the data item content in the picture;
S35: arranging the majority of the data items into a two-dimensional table;
S36: excluding the non-transaction data items according to the position of the data item content in the picture, and obtaining the rectangle coordinates of the key data item content.
4. The method for improving the accuracy of bank statement recognition according to claim 1, wherein in step S4 the picture is divided: a two-dimensional grid is drawn on the picture according to the abscissas and ordinates obtained in step S3, and the picture is cut along the grid to obtain the corresponding data pictures, each of which corresponds to a single data item.
CN202110174145.3A 2021-02-07 2021-02-07 Method for improving accuracy of bank pipelining recognition Pending CN113158755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110174145.3A CN113158755A (en) 2021-02-07 2021-02-07 Method for improving accuracy of bank pipelining recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110174145.3A CN113158755A (en) 2021-02-07 2021-02-07 Method for improving accuracy of bank pipelining recognition

Publications (1)

Publication Number Publication Date
CN113158755A true CN113158755A (en) 2021-07-23

Family

ID=76883082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110174145.3A Pending CN113158755A (en) 2021-02-07 2021-02-07 Method for improving accuracy of bank pipelining recognition

Country Status (1)

Country Link
CN (1) CN113158755A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989822A (en) * 2021-12-24 2022-01-28 中奥智能工业研究院(南京)有限公司 Picture table content extraction method based on computer vision and natural language processing

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488986A (en) * 2013-09-18 2014-01-01 西安理工大学 Method for segmenting and extracting characters in self-adaptation mode
CN204178382U (en) * 2014-10-16 2015-02-25 上海京颐科技股份有限公司 A kind of electronization of the paper-bill based on OCR device
CN105574486A (en) * 2015-11-25 2016-05-11 成都数联铭品科技有限公司 Image table character segmenting method
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image
CN110309746A (en) * 2019-06-21 2019-10-08 国网辽宁省电力有限公司鞍山供电公司 High-grade information security area list data information extracting method without communication interconnection
CN110458070A (en) * 2019-08-01 2019-11-15 上海眼控科技股份有限公司 Method and system based on motor vehicle annual test check table picture recognition amount of testing
CN111368744A (en) * 2020-03-05 2020-07-03 中国工商银行股份有限公司 Method and device for identifying unstructured table in picture
CN111753706A (en) * 2020-06-19 2020-10-09 西安工业大学 Complex table intersection point clustering extraction method based on image statistics
CN112102203A (en) * 2020-09-27 2020-12-18 中国建设银行股份有限公司 Image correction method, device and equipment

Similar Documents

Publication Title
US6778703B1 (en) Form recognition using reference areas
US8971620B2 (en) Detecting a label from an image
US6600834B1 (en) Handwriting information processing system with character segmentation user interface
US8515208B2 (en) Method for document to template alignment
JP5500480B2 (en) Form recognition device and form recognition method
US20160171627A1 (en) Processing electronic documents for invoice recognition
US11087163B2 (en) Neural network-based optical character recognition
JP4347677B2 (en) Form OCR program, method and apparatus
CN109255300B (en) Bill information extraction method, bill information extraction device, computer equipment and storage medium
US9679354B2 (en) Duplicate check image resolution
JP4078009B2 (en) CHARACTERISTIC RECORDING AREA DETECTION DEVICE FOR FORM, CHARACTER RECORDING AREA DETECTION METHOD FOR FORM, STORAGE MEDIUM, AND FORM FORMAT CREATION DEVICE
Caldeira et al. Industrial optical character recognition system in printing quality control of hot-rolled coils identification
CN110717492B (en) Method for correcting direction of character string in drawing based on joint features
CN112949455B (en) Value-added tax invoice recognition system and method
CN111738252B (en) Text line detection method, device and computer system in image
CN109508716B (en) Image character positioning method and device
CN111738979A (en) Automatic certificate image quality inspection method and system
US9519404B2 (en) Image segmentation for data verification
CN114463767A (en) Credit card identification method, device, computer equipment and storage medium
CN113158755A (en) Method for improving accuracy of bank pipelining recognition
JP2012190434A (en) Form defining device, form defining method, program and recording medium
Xu et al. Tolerance Information Extraction for Mechanical Engineering Drawings–A Digital Image Processing and Deep Learning-based Model
KR20180126352A (en) Recognition device based deep learning for extracting text from images
JP2005165978A (en) Business form ocr program, method and device thereof
CN114627457A (en) Ticket information identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210723