WO2022111549A1 - Document recognition method and apparatus, and readable storage medium - Google Patents

Document recognition method and apparatus, and readable storage medium

Info

Publication number
WO2022111549A1
Authority
WO
WIPO (PCT)
Prior art keywords
bill
image
original image
node
length
Prior art date
Application number
PCT/CN2021/132930
Other languages
French (fr)
Chinese (zh)
Inventor
徐青松
李青
Original Assignee
杭州睿胜软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州睿胜软件有限公司
Publication of WO2022111549A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G06V10/32 Normalisation of the pattern dimensions
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G06V30/40 Document-oriented image-based pattern recognition

Definitions

  • The invention relates to the technical field of artificial intelligence, and in particular to a bill identification method, a bill identification device, and a readable storage medium.
  • The purpose of the present invention is to provide a bill identification method, a bill identification device, and a readable storage medium, so as to solve the problem that bills are difficult to identify.
  • The present invention provides a bill identification method, including:
  • preprocessing an original image containing a bill, where the preprocessing includes: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
  • obtaining the labeling frame of the bill, and enlarging the labeling frame by a preset ratio; and
  • based on the enlarged labeling frame, segmenting the image of the bill and outputting it.
  • The bill identification method further includes:
  • presetting a plurality of the first node sizes and a plurality of the second node sizes, the first node sizes and the second node sizes being in one-to-one correspondence; when scaling the original image, the length of the original image in the first direction is scaled to the first node size with the closest value; and
  • the length of the scaled original image in the second direction is padded to the second node size corresponding to that first node size.
  • The first node size and the second node size are the same.
  • The method for padding the length of the scaled original image in the second direction to the preset second node size includes: filling a blank area on the side of the scaled original image along the second direction.
  • The method for obtaining the labeling frame of the bill includes: acquiring the position area information of the bill, and acquiring the labeling frame of the bill based on that information.
  • Before outputting the image of the bill, the bill identification method further includes:
  • adjusting the orientation of the image of the bill so that the orientation of the characters on the bill is a preset direction.
  • After acquiring the image of the bill, the bill identification method further includes:
  • identifying the edge of the bill and, based on the identification result, performing edge trimming on the image of the bill.
  • After acquiring the image of the bill, the bill identification method further includes:
  • correcting the image content of the bill, where the correction includes global correction and local correction.
  • The present invention also provides a bill identification device, comprising:
  • an image preprocessing module, used for preprocessing the original image containing the bill, where the preprocessing includes: scaling the length of the original image in the first direction to a preset first node size, and then padding the length of the scaled original image in the second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
  • a labeling frame acquisition and adjustment module, used for obtaining the labeling frame of the bill and enlarging the labeling frame by a preset ratio; and
  • an image post-processing module, used for segmenting the image of the bill based on the enlarged labeling frame of the bill, and outputting the image.
  • The bill identification device further includes a node size setting module, which is used to preset a plurality of the first node sizes and a plurality of the second node sizes, the first node sizes and the second node sizes being in one-to-one correspondence;
  • the image preprocessing module scales the length of the original image in the first direction to the first node size with the closest value, and pads the length of the scaled original image in the second direction to the second node size corresponding to that first node size.
  • The first node size and the second node size are the same.
  • The method by which the image preprocessing module pads the length of the scaled original image in the second direction to the preset second node size includes: filling a blank area on the side of the scaled original image along the second direction.
  • The image post-processing module includes an image segmentation module and an image output module; the image segmentation module is used to segment the image of the bill based on the enlarged labeling frame of the bill.
  • The image output module is used for outputting the segmented image of the bill.
  • The image post-processing module further includes an orientation adjustment module, which is used to adjust the orientation of the image of the bill so that the orientation of the characters on the bill is a preset direction.
  • The image post-processing module further includes an edge processing module, which is used to identify the edge of the bill and, based on the identification result, perform edge trimming on the image of the bill.
  • The image post-processing module further includes an image correction module, which is used to correct the image content of the bill, where the correction includes global correction and local correction.
  • the present invention also provides a readable storage medium, characterized in that, the readable storage medium stores a computer program, and when the computer program is executed, the above-mentioned bill identification method is implemented.
  • In the bill identification method, the bill identification device, and the readable storage medium provided by the present invention, the original image containing the bill is first preprocessed, and the preprocessing includes: scaling the length of the original image in the first direction to the preset first node size, and then padding the length of the scaled original image in the second direction to the preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction. That is, the original image containing the bill is adjusted to a preset size before the subsequent segmentation, and before segmentation the labeling frame of each bill is enlarged by the preset ratio.
  • Unifying original bill images of various sizes to the preset size improves the speed of subsequent processing, and the image size adjustment method adopted does not deform the image.
  • In addition, the loss of the edge area of the bill is avoided, which reduces the difficulty of identifying the bill in the picture.
  • FIG. 1 is a flowchart of a bill identification method provided by an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an exemplary original image including multiple bills provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of resizing an original image in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of forming a labeling frame of each bill in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of enlarging each labeling frame in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an image of a bill formed by segmentation in an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of performing edge trimming processing on an image of a bill in an embodiment of the present invention.
  • FIG. 8 is a block diagram of the composition of a bill identification device provided by an embodiment of the present invention.
  • P1, P2, P3, P4 - bills; A1, A2 - blank areas; Z1, Z2, Z3, Z4 - labeling frames; Z1', Z2', Z3', Z4' - enlarged labeling frames;
  • 10 - image preprocessing module; 20 - labeling frame acquisition and adjustment module; 30 - image post-processing module; 301 - segmentation module; 302 - image output module; 303 - orientation adjustment module; 304 - edge processing module; 305 - image correction module.
  • the embodiments of the present invention provide a bill identification method, a bill identification device and a readable storage medium.
  • The bill identification method of the embodiment of the present invention can be applied to the bill identification device of the embodiment of the present invention, and the bill identification device can be configured on an electronic device.
  • the electronic device may be a personal computer, a mobile terminal, etc.
  • the mobile terminal may be a hardware device with various operating systems, such as a mobile phone, a tablet computer, and the like.
  • this embodiment provides a bill identification method, and the bill identification method includes the following steps:
  • Step S11: preprocess the original image containing the bill, where the preprocessing includes: scaling the length of the original image in the first direction to a preset first node size, and then padding the length of the scaled original image in the second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
  • Step S12: obtain the labeling frame of the bill, and enlarge the labeling frame by a preset ratio;
  • Step S13: segment the image of the bill based on the enlarged labeling frame, and output it.
  • In the bill identification method provided in this embodiment, the original image containing the bill is adjusted to a preset size before the subsequent segmentation, and before segmentation the labeling frame of the bill is enlarged by a preset ratio.
  • Adjusting the image to the preset size improves the speed of subsequent processing, and the image size adjustment method adopted does not deform the image; enlarging the labeling frame avoids losing the edge regions of the bill, thus reducing the difficulty of identifying multiple bills in the picture.
  • In step S11, preferably, a plurality of the first node sizes and a plurality of the second node sizes are preset, and the first node sizes and the second node sizes are in one-to-one correspondence.
  • When scaling the original image, the length of the original image in the first direction is scaled to the first node size with the closest value, and the length of the scaled original image in the second direction is padded to the second node size corresponding to that first node size.
  • Preferably, each first node size is the same as the corresponding second node size.
  • After the original image is scaled and the blank area is supplemented, the image is not deformed and matches the preset size, so the speed of subsequent processing (such as the bill region recognition model described below) is significantly improved. Setting the first node size and the second node size to be the same makes the image square, which can further improve the processing speed of the model.
  • For example, with the node sizes preset to 800×800, 1600×1600... (other sizes are also possible), first determine which node size the length of the original image in the first direction is closest to; for instance, if the length of the original image in the first direction is 1400 or 1800, it is scaled to 1600.
  • In this way, the size adjustment of the image is kept as small as possible, so that the size of the finally output bill image is as consistent as possible with the original image, which helps guarantee a good user experience.
  • Scaling the original image refers to scaling the entire original image: not only is the length in the first direction adjusted, but the length in the second direction is adjusted along with it, which ensures that the image is not deformed.
  • If the length of the original image in the first direction is already the same as a preset first node size, no scaling is needed, and the length in the second direction is directly padded to the corresponding second node size. For example, if the length of the original image in the first direction is 800, no scaling is performed, and the length of the original image in the second direction is directly padded to 800.
  • If the length of the original image in the first direction is the same as its length in the second direction, and the preset first node size and second node size are the same, then after the length in the first direction is scaled to the preset first node size there is no need to pad the length in the second direction. For example, if the original image is 1000×1000, after the length in the first direction is scaled to 800, the length in the second direction also becomes 800, so no padding is needed.
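  • The node-size selection and uniform scaling described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the default node sizes (800 and 1600, taken from the example above), and the tie-breaking rule for a length exactly halfway between two node sizes (the smaller node wins here) are all assumptions.

```python
def choose_node_size(long_side, node_sizes=(800, 1600)):
    """Return the preset node size whose value is closest to long_side.

    Assumption: on an exact tie, the node size listed first is chosen.
    """
    return min(node_sizes, key=lambda node: abs(node - long_side))


def scaled_dimensions(width, height, node_sizes=(800, 1600)):
    """Scale the whole image uniformly so that its longer side (the
    "first direction") equals the closest node size.

    Returns (new_width, new_height, node_size). Both sides use the same
    scale factor, so the image is not deformed.
    """
    long_side = max(width, height)
    node = choose_node_size(long_side, node_sizes)
    new_width = round(width * node / long_side)
    new_height = round(height * node / long_side)
    return new_width, new_height, node


# A 1400 x 700 image is closest to the 1600 node, so it becomes 1600 x 800.
print(scaled_dimensions(1400, 700))
```

  • The shorter side is left below the node size on purpose; bringing it up to the second node size is the job of the padding step, not of the scaling step.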
  • In step S11, a blank area can be supplemented on the side of the scaled original image along the second direction, so as to pad the length of the scaled original image in the second direction to the preset second node size.
  • FIG. 2 shows an exemplary original image including four bills provided in this embodiment: bill P1, bill P2, bill P3, and bill P4. Since the length of the original image in the first direction is greater than or equal to its length in the second direction, for the original image shown in the figure the first direction can be understood as the X direction shown in FIG. 2, and the second direction can be understood as the Y direction shown in FIG. 2.
  • The length of the scaled original image in the second direction is padded by supplementing blank areas A1 and A2 on both sides of the original image along the second direction.
  • The supplementary blank areas A1 and A2 on the two sides have equal areas.
  • In other embodiments, the blank area may also be supplemented on only one side of the original image along the second direction; in that case, the area of the supplemented blank area equals the sum of the areas of the blank areas that would be supplemented on both sides.
  • The supplemented area may also be an area filled with an image, such as a grid-line area.
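  • A minimal sketch of the padding step follows, assuming the image is held as a list of pixel rows and has already been scaled so that its longer side equals the node size. The function name, the white fill value 255, and the handling of an odd number of missing pixels (the extra row or column goes to the second side) are assumptions made for illustration.

```python
def pad_to_node_size(image, node_size, fill=255):
    """Pad the shorter side of `image` (a list of rows) with blank pixels
    on both sides until the image is node_size x node_size.

    Assumes the longer side already equals node_size (i.e. the scaling
    step has been applied) and that first and second node sizes are equal.
    """
    height, width = len(image), len(image[0])
    if max(width, height) != node_size:
        raise ValueError("longer side must already equal node_size")

    if width >= height:
        # First direction is X: add blank rows above and below (A1, A2).
        missing = node_size - height
        before, after = missing // 2, missing - missing // 2
        blank_row = [fill] * width
        return ([blank_row[:] for _ in range(before)]
                + [row[:] for row in image]
                + [blank_row[:] for _ in range(after)])

    # First direction is Y: add blank columns on the left and right.
    missing = node_size - width
    before, after = missing // 2, missing - missing // 2
    return [[fill] * before + row[:] + [fill] * after for row in image]


# A 4 x 2 image padded to 4 x 4: one blank row above and one below.
padded = pad_to_node_size([[1, 2, 3, 4], [5, 6, 7, 8]], 4)
print(padded)
```

  • Padding on one side only, as in the alternative embodiment, would amount to setting `before` to 0 and `after` to `missing`.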
  • The labeling frame of the bill can be obtained from the position area information of the bill.
  • Specifically, the method for obtaining the labeling frame of the bill may include: acquiring the position area information of the bill, and acquiring the labeling frame of the bill based on that information.
  • The labeling frame obtained for each bill is shown in FIG. 4.
  • The labeling frame of the bill P1 is Z1.
  • The labeling frame of the bill P2 is Z2.
  • The labeling frame of the bill P3 is Z3.
  • The labeling frame of the bill P4 is Z4.
  • A bill region recognition model can be used to obtain the position area information of each bill.
  • The bill region recognition model may employ machine learning techniques and run, for example, on a general-purpose computing device or a special-purpose computing device.
  • The bill region recognition model can be implemented with a neural network such as a deep convolutional neural network (Deep CNN).
  • The image is input to the bill region recognition model, which identifies the boundary of each bill in the input image and marks the identified boundaries, thereby obtaining the labeling frame of each bill.
  • In step S12, the labeling frame of each bill is enlarged by the preset ratio, which ensures that the entire area of the bill is contained within the labeling frame. It should be understood that enlarging the labeling frame of each bill here means expanding the labeling frame outward along its periphery.
  • In this embodiment, each labeling frame is enlarged by 5%; in other embodiments, it can also be enlarged by 3%, 7%, 9%, and so on.
  • Different enlargement ratios can also be matched to bills of different sizes. For example, the labeling frame can be enlarged by 2% for small bills and by 6% for large bills.
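  • The enlargement of a labeling frame can be sketched as below. The text does not specify how the ratio is applied, so this sketch assumes that the width and height each grow by the ratio about the frame's center, and that the result is clamped to the image bounds; the frame representation `(x_min, y_min, x_max, y_max)` and the function name are likewise assumptions.

```python
def enlarge_box(box, ratio, image_width, image_height):
    """Expand a labeling frame outward along its periphery.

    box is (x_min, y_min, x_max, y_max). Assumption: width and height
    each grow by `ratio` (e.g. 0.05 for 5%), split evenly between the
    two opposite sides, and the frame is clamped to the image area.
    """
    x_min, y_min, x_max, y_max = box
    dx = (x_max - x_min) * ratio / 2  # growth on the left and on the right
    dy = (y_max - y_min) * ratio / 2  # growth on the top and on the bottom
    return (
        max(0.0, x_min - dx),
        max(0.0, y_min - dy),
        min(float(image_width), x_max + dx),
        min(float(image_height), y_max + dy),
    )


# Enlarge a 200 x 100 frame by 5% inside a 1600 x 1600 image.
print(enlarge_box((100, 100, 300, 200), 0.05, 1600, 1600))
```

  • A size-dependent ratio, as mentioned above, could be obtained by choosing `ratio` from the frame's area before calling the function; the thresholds separating "small" from "large" bills are not given in the text.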
  • In step S13, all the bills are segmented based on the enlarged labeling frames obtained in step S12: Z1' (enlarged from Z1), Z2' (enlarged from Z2), Z3' (enlarged from Z3), and Z4' (enlarged from Z4).
  • The image of the bill P1 obtained from the segmentation is shown in FIG. 6. In addition to the bill itself, it also includes a partial image of the bill P2; for the bill P1, this partial image is a redundant image.
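  • Segmentation by enlarged labeling frame amounts to cropping one sub-image per frame. The following is a minimal sketch under the assumptions that the image is a list of pixel rows and that each frame is an axis-aligned rectangle `(x_min, y_min, x_max, y_max)`; the function names are hypothetical.

```python
import math


def crop(image, box):
    """Cut the rectangle `box` out of `image` (a list of pixel rows).

    Fractional coordinates are rounded outward so that nothing inside
    the frame is lost, then clamped to the image bounds.
    """
    x_min, y_min, x_max, y_max = box
    left = max(0, math.floor(x_min))
    top = max(0, math.floor(y_min))
    right = min(len(image[0]), math.ceil(x_max))
    bottom = min(len(image), math.ceil(y_max))
    return [row[left:right] for row in image[top:bottom]]


def segment_bills(image, enlarged_boxes):
    """Return one cropped image per enlarged labeling frame."""
    return [crop(image, box) for box in enlarged_boxes]


# A 4 x 4 test image whose pixel value encodes its (row, column) position.
image = [[r * 10 + c for c in range(4)] for r in range(4)]
print(segment_bills(image, [(1, 1, 3, 3)]))
```

  • Because the frames are enlarged before cropping, neighbouring frames can overlap, which is how part of another bill may end up in a cropped image.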
  • the edge of each of the bills is identified; and, based on the identification result, the image of each of the bills is subjected to edge trimming processing.
  • the image of the bill P1 obtained by the edge trimming process is shown in FIG. 7 , and it can be seen from FIG. 7 that the redundant image is cut out through the edge trimming process.
  • the image of the bill can be processed by the edge detection algorithm to obtain the line drawing of the grayscale outline.
  • an image can be processed by an OpenCV-based edge detection algorithm to obtain a line drawing of the grayscale contours in the image.
  • OpenCV is an open source computer vision library.
  • Edge detection algorithms based on OpenCV include Sobel, Scharr, Canny, Laplacian, Prewitt, Marr-Hildreth, and other algorithms.
  • the Canny edge detection algorithm is used in this embodiment.
  • The Canny edge detection algorithm is a multi-stage algorithm, that is, it consists of multiple steps:
  • 1. Image noise reduction: use a Gaussian filter to smooth the image;
  • 2. Image gradient calculation: use finite differences of the first-order partial derivatives to calculate the gradient magnitude and direction;
  • 3. Non-maximum suppression: perform non-maximum suppression on the gradient magnitude;
  • 4. Threshold filtering: use a double-threshold algorithm to detect and connect edges.
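  • As an illustration of the last stage only, the sketch below applies double-threshold filtering with edge linking to a grid of gradient magnitudes that is assumed to have already passed through smoothing, gradient calculation, and non-maximum suppression. The function name, the use of 8-connectivity, and the breadth-first linking strategy are assumptions; this is not a complete Canny implementation.

```python
from collections import deque


def hysteresis_threshold(magnitude, low, high):
    """Double-threshold edge linking.

    Pixels with magnitude >= high are strong edges. Pixels with
    magnitude >= low are kept only if they are connected (8-connectivity)
    to a strong edge, directly or through other kept pixels.
    Returns a grid of booleans with the same shape as `magnitude`.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))

    # Grow edges into neighbouring weak pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc]
                        and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


# The weak pixel (50) next to the strong pixel (200) is kept;
# the isolated weak pixel (60) is discarded.
grid = [[0, 50, 200], [0, 0, 0], [60, 0, 0]]
print(hysteresis_threshold(grid, low=40, high=100))
```

  • In practice all four stages are usually delegated to a library; in OpenCV, for instance, `cv2.Canny(image, threshold1, threshold2)` takes the two thresholds as arguments.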
  • edge identification methods known to those skilled in the art may also be used, and the selection of specific edge identification methods does not constitute a limitation to the present application.
  • In step S13, further preferably, after the image of each bill is acquired, the image content of each bill is corrected, and the correction includes global correction and local correction.
  • the correction step may be performed after the edge trimming process, and in other embodiments, the correction step may also be performed before the edge trimming process.
  • the text image may be inclined, and the inclination may adversely affect the analysis of the text image (for example, character recognition, etc.) and other processing. Therefore, in this embodiment, the image content of each of the bills is corrected, so as to avoid the inclination of the text image from adversely affecting the analysis and processing of the original image.
  • The image content of any bill can be corrected using the following steps: performing global correction processing on the image of the bill to obtain an intermediate corrected image; and performing local adjustment on the intermediate corrected image to obtain a target corrected image. Performing local adjustment on the intermediate corrected image to obtain the target corrected image includes:
  • determining the lower boundaries of M character lines corresponding to the M character rows of the image of the bill; determining, based on the intermediate corrected image and the lower boundaries of the M character lines, a local adjustment reference line and M reservation coefficient groups, wherein each of the M reservation coefficient groups includes a plurality of reservation coefficients; determining, according to the lower boundaries of the M character lines, the local adjustment reference line, and the M reservation coefficient groups, M local adjustment offset groups corresponding to the M character lines, wherein each of the M local adjustment offset groups includes a plurality of local adjustment offsets; and performing local adjustment on the M character lines in the intermediate corrected image according to the M local adjustment offset groups to obtain the target corrected image.
  • The bill identification method further includes: adjusting the orientation of the image of the bill so that the orientation of the characters on the bill is a preset direction.
  • the preset direction is the positive Y direction in the plane coordinate system, so as to facilitate subsequent identification.
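  • The orientation adjustment can be sketched as a rotation in quarter turns. How the current character orientation is detected is not described here, so the sketch simply takes the required number of clockwise quarter turns as an input; that input, the function names, and the list-of-rows image representation are assumptions.

```python
def rotate_clockwise(image):
    """Rotate an image (a list of pixel rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_orientation(image, quarter_turns):
    """Rotate `image` clockwise by `quarter_turns` x 90 degrees so that
    the characters end up in the preset direction.

    quarter_turns is assumed to come from a separate orientation
    detector that is outside the scope of this sketch.
    """
    result = [row[:] for row in image]
    for _ in range(quarter_turns % 4):
        result = rotate_clockwise(result)
    return result


# One clockwise quarter turn moves the top-left pixel to the top-right.
print(adjust_orientation([[1, 2], [3, 4]], 1))
```

  • Skewed text at arbitrary angles is a different problem from this quarter-turn adjustment and belongs to the correction step.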
  • This embodiment also provides a bill identification device. As shown in FIG. 8, the bill identification device includes:
  • the image preprocessing module 10, configured to preprocess the original image containing the bill, where the preprocessing includes: scaling the length of the original image in the first direction to a preset first node size, and then padding the length of the scaled original image in the second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
  • the labeling frame acquisition and adjustment module 20, used for obtaining the labeling frame of the bill and enlarging the labeling frame of the bill by a preset ratio;
  • the image post-processing module 30, configured to segment and output the image of the bill based on the enlarged labeling frame of the bill.
  • The bill identification device further includes a node size setting module (not shown in the figure), which is used for setting the first node size and the second node size.
  • The image preprocessing module 10 scales the length of the original image in the first direction to the first node size with the closest value, and pads the length of the scaled original image in the second direction to the second node size corresponding to that first node size.
  • The method by which the image preprocessing module 10 pads the length of the scaled original image in the second direction to the preset second node size includes: filling a blank area on the side of the scaled original image along the second direction.
  • The image post-processing module 30 specifically includes a segmentation module 301 and an image output module 302.
  • The segmentation module 301 is configured to segment the image of the bill based on the enlarged labeling frame of the bill.
  • The image output module 302 is used for outputting the segmented image of the bill.
  • The image post-processing module 30 further includes an orientation adjustment module 303, which is used to adjust the orientation of the image of the bill so that the orientation of the characters on the bill is a preset direction.
  • the preset direction is the positive Y direction in the plane coordinate system, so as to facilitate subsequent identification.
  • the bill identification device may further include an edge processing module 304, and the edge processing module 304 is configured to identify the edge of the bill, and based on the identification result, perform edge trimming processing on the image of the bill.
  • the bill identification device may further include an image correction module 305; the image correction module 305 is used to correct the image content of the bill, and the correction includes global correction and local correction.
  • The correction of the image content of the bill by the image correction module 305 can be performed either after or before the edge processing module 304 performs edge trimming processing on the image of the bill.
  • Each module in the bill identification device provided in this embodiment is used to implement the corresponding step of the bill identification method provided in this embodiment. Therefore, for a specific description of the functions that each module can implement, please refer to the relevant descriptions of the corresponding steps of the bill identification method above; the details are not repeated here.
  • the bill identification device can achieve the same technical effect as the bill identification method described above, which will not be repeated here.
  • In the bill identification device, the image preprocessing module 10, the labeling frame acquisition and adjustment module 20, and the image post-processing module 30 can be combined in one device, or any one of the modules can be split into a plurality of sub-modules; alternatively, at least part of the functions of one or more of these modules can be combined with at least part of the functions of the other modules and implemented in one functional module.
  • At least one of the image preprocessing module 10, the labeling frame acquisition and adjustment module 20, and the image post-processing module 30 may be at least partially implemented as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, an application specific integrated circuit (ASIC), or any other reasonable means of integrating or packaging circuits; it may also be implemented in hardware, firmware, or software, or in an appropriate combination of the three.
  • At least one of the image preprocessing module 10, the labeling frame acquisition and adjustment module 20, and the image post-processing module 30 may be at least partially implemented as a computer program module which, when run by a computer, executes the function of the corresponding module.
  • this embodiment further provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed, the bill identification method described in this embodiment is implemented.
  • The readable storage medium can be a tangible device that can hold and store instructions for use by an instruction execution device, such as, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or Flash memory), static random access memory (SRAM), portable compact disc read only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanical encoding devices, and any suitable combination of the foregoing.
  • the computer programs described herein can be downloaded to various computing/processing devices from readable storage media, or to external computers or external storage devices over a network such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards the computer program for storage in a readable storage medium in the respective computing/processing device.
  • the computer program for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or any other program in one or more programming languages.
  • the computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • electronic circuits, such as programmable logic circuits, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), are personalized by utilizing state information of the computer program, and these electronic circuits can execute the program instructions to implement various aspects of the present invention.
  • these computer programs can also be stored in a readable storage medium, and they cause computers, programmable data processing devices and/or other devices to operate in a specific manner, so that the readable storage medium storing the computer program includes an article of manufacture comprising instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • a computer program can also be loaded onto a computer, other programmable data processing apparatus, or other equipment, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other equipment to produce a computer-implemented process, so that the computer program executing on the computer, other programmable data processing apparatus, or other equipment implements the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • in the bill identification method, the bill identification device, and the readable storage medium provided by the present invention, the original image containing bills is first preprocessed, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction; then the annotation box of each bill is obtained and enlarged by a preset ratio; finally, all the bills are segmented based on their enlarged annotation boxes, so that an image of each bill is obtained and output.

Abstract

The present invention provides a document identification method and apparatus, and a readable storage medium. The method comprises: scaling the length of an original image containing a document in a first direction to a preset first node size, and padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction; then acquiring a marking box of the document and enlarging the marking box according to a preset proportion; and finally segmenting an image of the document on the basis of the enlarged marking box and outputting the image. Unifying original document pictures of various sizes to a preset size increases the subsequent processing speed without deforming the images, and enlarging the marking box prevents the loss of document edge areas, thereby lowering the difficulty of recognizing the document in a picture.

Description

Bill identification method, device and readable storage medium
Technical field
The present invention relates to the technical field of artificial intelligence, and in particular to a bill identification method, a bill identification device and a readable storage medium.
Background
With continued economic development and rising consumption levels, bills have become both a strong safeguard of consumers' rights and interests and valid proof for reimbursement, so financial staff must process a large number of bills every day. At the same time, more and more people categorize and tally their bills through bookkeeping to keep track of their own spending.
In recent years, bill identification technology has continued to develop, but accurately identifying the bills in a picture remains difficult, especially when a single picture contains multiple bills.
Summary of the invention
The purpose of the present invention is to provide a bill identification method, a bill identification device and a readable storage medium, so as to solve the problem that bills are difficult to identify.
To solve the above technical problem, the present invention provides a bill identification method, comprising:
preprocessing an original image containing a bill, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
acquiring an annotation box of the bill, and enlarging the annotation box of the bill by a preset ratio; and
segmenting out an image of the bill based on the enlarged annotation box of the bill, and outputting the image.
Optionally, the bill identification method further comprises:
presetting a plurality of first node sizes and a plurality of second node sizes, the first node sizes corresponding one-to-one to the second node sizes; when scaling the original image, scaling the length of the original image in the first direction to the first node size closest in value; and
padding the length of the scaled original image in the second direction to the second node size corresponding to the first node size used for the scaling.
Optionally, in the bill identification method, the first node size and the second node size are the same.
Optionally, in the bill identification method, padding the length of the scaled original image in the second direction to the preset second node size comprises:
adding a blank area at a side of the scaled original image along the second direction.
Optionally, in the bill identification method, acquiring the annotation box of the bill comprises:
acquiring position area information of the bill; and
acquiring the annotation box of the bill based on the position area information of the bill.
Optionally, before the image of the bill is output, the bill identification method further comprises:
adjusting the orientation of the image of the bill so that the characters on the bill face a preset direction.
Optionally, after the image of the bill is acquired, the bill identification method further comprises:
identifying the edges of the bill; and
trimming the image of the bill based on the identification result.
Optionally, after the image of the bill is acquired, the bill identification method further comprises:
correcting the image content of the bill, the correction comprising global correction and local correction.
The present invention also provides a bill identification device, comprising:
an image preprocessing module, configured to preprocess an original image containing a bill, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
an annotation box acquisition and adjustment module, configured to acquire an annotation box of the bill and enlarge the annotation box of the bill by a preset ratio; and
an image post-processing module, configured to segment out an image of the bill based on the enlarged annotation box of the bill, and to output the image.
Optionally, the bill identification device further comprises a node size setting module, configured to preset a plurality of first node sizes and a plurality of second node sizes, the first node sizes corresponding one-to-one to the second node sizes;
when scaling the original image, the image preprocessing module scales the length of the original image in the first direction to the first node size closest in value, and pads the length of the scaled original image in the second direction to the second node size corresponding to the first node size used for the scaling.
Optionally, in the bill identification device, the first node size and the second node size are the same.
Optionally, in the bill identification device, the image preprocessing module pads the length of the scaled original image in the second direction to the preset second node size by adding a blank area at a side of the scaled original image along the second direction.
Optionally, in the bill identification device, the image post-processing module comprises an image segmentation module and an image output module; the image segmentation module is configured to segment out the image of the bill based on the enlarged annotation box of the bill, and the image output module is configured to output the segmented image of the bill.
Optionally, in the bill identification device, the image post-processing module further comprises an orientation adjustment module, configured to adjust the orientation of the image of the bill so that the characters on the bill face a preset direction.
Optionally, in the bill identification device, the image post-processing module further comprises an edge processing module, configured to identify the edges of the bill and to trim the image of the bill based on the identification result.
Optionally, in the bill identification device, the image post-processing module further comprises an image correction module, configured to correct the image content of the bill, the correction comprising global correction and local correction.
The present invention also provides a readable storage medium, characterized in that the readable storage medium stores a computer program which, when executed, implements the bill identification method described above.
In summary, in the bill identification method, bill identification device and readable storage medium provided by the present invention, the original image containing bills is first preprocessed, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction. That is, the original image containing bills is adjusted to a preset size before the subsequent segmentation, and before segmentation the annotation box of each bill is enlarged by a preset ratio. Unifying original bill pictures of various sizes to a preset size speeds up subsequent processing, and the size adjustment used does not deform the image. In addition, enlarging the annotation boxes prevents the edge areas of the bills from being lost. Together, these reduce the difficulty of identifying the bills in a picture.
Description of the drawings
FIG. 1 is a flowchart of the bill identification method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of an exemplary original image containing multiple bills, provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of resizing the original image in an embodiment of the present invention;
FIG. 4 is a schematic diagram of forming the annotation box of each bill in an embodiment of the present invention;
FIG. 5 is a schematic diagram of enlarging each annotation box in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the image of one bill obtained by segmentation in an embodiment of the present invention;
FIG. 7 is a schematic diagram of trimming the image of a bill in an embodiment of the present invention;
FIG. 8 is a block diagram of the bill identification device provided by an embodiment of the present invention.
The reference signs are as follows:
P1, P2, P3, P4 - bills; A1, A2 - blank areas; Z1, Z2, Z3, Z4 - annotation boxes; Z1', Z2', Z3', Z4' - enlarged annotation boxes;
10 - image preprocessing module; 20 - annotation box acquisition and adjustment module; 30 - image post-processing module; 301 - segmentation module; 302 - image output module; 303 - orientation adjustment module; 304 - edge processing module; 305 - image correction module.
Detailed description
The bill identification method, bill identification device and readable storage medium proposed by the present invention are described in further detail below with reference to the accompanying drawings and specific embodiments. The advantages and features of the present invention will become clearer from the following description. Note that the drawings are all in highly simplified form and are not drawn to precise scale; they serve only to illustrate the embodiments of the present invention conveniently and clearly. In addition, the structures shown in the drawings are often only part of the actual structures. In particular, the drawings emphasize different aspects and sometimes use different scales.
Unless otherwise defined, technical or scientific terms used in the present invention have the ordinary meaning understood by a person of ordinary skill in the field to which the present invention belongs. "First", "second" and similar words used in the present invention do not denote any order, quantity or importance, and are used only to distinguish different components. "Comprise", "include" and similar words mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items.
To solve the problems of the prior art, embodiments of the present invention provide a bill identification method, a bill identification device and a readable storage medium.
Note that the bill identification method of the embodiments of the present invention can be applied to the bill identification device of the embodiments of the present invention, and the bill identification device can be deployed on an electronic device. The electronic device may be a personal computer, a mobile terminal, or the like, and the mobile terminal may be a hardware device running any of various operating systems, such as a mobile phone or a tablet computer.
As shown in FIG. 1, this embodiment provides a bill identification method comprising the following steps:
S11, preprocessing an original image containing a bill, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
S12, acquiring an annotation box of the bill, and enlarging the annotation box of the bill by a preset ratio;
S13, segmenting out an image of the bill based on the enlarged annotation box of the bill, and outputting the image.
In the bill identification method provided by this embodiment, the original image containing bills is adjusted to a preset size before the subsequent segmentation, and before segmentation the annotation box of each bill is enlarged by a preset ratio. Unifying original bill pictures of various sizes to a preset size speeds up subsequent processing, and the size adjustment used does not deform the image. In addition, enlarging the annotation boxes prevents the edge areas of the bills from being lost. Together, these reduce the difficulty of identifying multiple bills in a picture.
The above steps are described in further detail below.
In step S11, preferably, a plurality of first node sizes and a plurality of second node sizes are preset, with the first node sizes corresponding one-to-one to the second node sizes. When the original image is scaled, its length in the first direction is scaled to the first node size closest in value, and the length of the scaled original image in the second direction is then padded to the second node size corresponding to that first node size. Further preferably, each first node size is the same as its corresponding second node size. Scaling the original image and adding blank areas does not deform the image, and because the resulting image matches a preset size, subsequent processing (for example, obtaining the position information of each bill with the bill area identification model described below) is significantly faster. Setting the first node size and the second node size to the same value makes the image square, which can further increase the processing speed of the model.
For example, multiple node sizes are set to 800×800, 1600×1600, and so on (other sizes are also possible). It is first determined which node size the length of the original image in the first direction is closest to. For example, an original image whose length in the first direction is 600 or 1000 is scaled to 800, and one whose length in the first direction is 1400 or 1800 is scaled to 1600.
After a user takes a picture, if the finally output bill image differs too much from what the user saw when shooting, the user experience suffers. By providing multiple node sizes, this embodiment keeps the size adjustment of the original image as small as possible, so that the size of the finally output bill image stays as close as possible to its size in the original image, which preserves a good user experience.
It should be understood that scaling the original image in step S11 means scaling the original image as a whole: not only is the length in the first direction adjusted, but the length in the second direction is adjusted along with it. Only in this way is the image guaranteed not to deform.
Note that if the length of the original image in the first direction already equals a preset first node size, no scaling is needed, and the length in the second direction is padded directly to the corresponding second node size. For example, if the length of the original image in the first direction is 800, the image is not scaled, and its length in the second direction is padded directly to 800.
Note also that if the length of the original image in the first direction equals its length in the second direction, and the preset first node size equals the second node size, then after the length in the first direction has been scaled to the preset first node size, no padding in the second direction is needed. For example, if the original image is 1000×1000, scaling its length in the first direction to 800 also brings its length in the second direction to 800, so no padding is required.
In step S11, a blank area may be added at a side of the scaled original image along the second direction, so as to pad the length of the scaled original image in the second direction to the preset second node size.
FIG. 2 shows an exemplary original image provided by this embodiment that contains four bills: bill P1, bill P2, bill P3 and bill P4. Since the length of the original image in the first direction is greater than or equal to its length in the second direction, for the original image shown in FIG. 2 the first direction can be understood as the X direction shown in FIG. 2, and the second direction as the Y direction shown in FIG. 2.
As shown in FIG. 3, in this embodiment, blank areas A1 and A2 are added on both sides of the original image along the second direction, padding the length of the scaled original image in the second direction to equal its length in the first direction; preferably, A1 and A2 have equal areas. In other embodiments, a blank area may instead be added on only one side of the original image along the second direction, in which case its area equals the sum of the areas that would otherwise be added on both sides. In still other embodiments, the added area may be filled with an image, such as a grid-line area.
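The scale-then-pad preprocessing of step S11 can be sketched as a small geometry calculation. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function and field names are hypothetical, the node sizes 800 and 1600 are the examples from the text, square targets are assumed (first node size equal to second node size), a tie between two equally close node sizes goes to the smaller one, and an odd amount of padding is split so that A2 is one pixel larger than A1.

```python
from typing import NamedTuple, Sequence


class ResizePlan(NamedTuple):
    node: int          # node size chosen for both directions (square target)
    scaled_long: int   # length in the first (longer) direction after scaling
    scaled_short: int  # length in the second direction after scaling
    pad_before: int    # blank pixels added on one side (area A1)
    pad_after: int     # blank pixels added on the other side (area A2)


def plan_resize(long_side: int, short_side: int,
                nodes: Sequence[int] = (800, 1600)) -> ResizePlan:
    """Compute the uniform scaling and the padding for one image."""
    if long_side < short_side:
        raise ValueError("long_side must be >= short_side")
    # Pick the node size closest to the longer side (assumed tie rule:
    # min() keeps the first, i.e. smaller, candidate).
    node = min(nodes, key=lambda n: abs(n - long_side))
    # One scale factor for both directions, so the image is not deformed.
    scale = node / long_side
    scaled_short = min(node, max(1, round(short_side * scale)))
    # Pad the shorter direction up to the node size, split over both sides.
    total_pad = node - scaled_short
    pad_before = total_pad // 2
    return ResizePlan(node, node, scaled_short, pad_before, total_pad - pad_before)
```

For a 600×400 image this selects the 800 node, scales the image to 800×533, and adds 133 and 134 blank pixels on the two sides. An 800×500 image is not scaled at all and only padded, and a 1000×1000 image is scaled to 800×800 with no padding, matching the special cases described above.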
In step S12, the annotation box of a bill can be obtained from the position area information of the bill. Specifically, acquiring the annotation box of the bill may comprise: acquiring the position area information of the bill, and acquiring the annotation box of the bill based on that position area information. For the original image shown in FIG. 2, the annotation boxes obtained in step S12 are shown in FIG. 4: the annotation box of bill P1 is Z1, that of bill P2 is Z2, that of bill P3 is Z3, and that of bill P4 is Z4.
A bill area identification model may be used to obtain the position area information of each bill. The bill area identification model may use machine learning techniques and may run, for example, on a general-purpose or special-purpose computing device. For example, the bill area identification model can be implemented with a neural network such as a deep convolutional neural network (DEEP-CNN). In some embodiments, an image is input to the bill area identification model, which identifies the boundary of each bill in the input image and then marks the identified boundaries, thereby obtaining the annotation box of each bill.
After the annotation box of each bill has been obtained, each annotation box is enlarged by a preset ratio, as shown in FIG. 5. Enlarging the annotation boxes prevents the edge portions of the bills from being lost and ensures that the entire area of each bill lies within its annotation box. It should be understood that enlarging an annotation box here means enlarging it outward on all sides.
In some embodiments, each annotation box is enlarged by 5%; in other embodiments, it may be enlarged by 3%, 7%, 9%, and so on. Different enlargement ratios can also be matched to bills of different sizes. For example, the annotation box of a small bill may be enlarged by 2%, and that of a large bill by 6%.
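The box enlargement can be illustrated with a short sketch. It rests on assumptions the text does not settle: the box is an axis-aligned rectangle given as (x0, y0, x1, y1) in pixels, "enlarged by 5%" is read as the width and the height each growing by 5% in total with the growth split evenly around the center, and the result is clamped to the image bounds. The function name is hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 > x0 and y1 > y0


def enlarge_box(box: Box, ratio: float, image_w: int, image_h: int) -> Box:
    """Grow a box outward on all four sides by `ratio` of its size in total."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * ratio / 2  # growth on the left and on the right
    dy = (y1 - y0) * ratio / 2  # growth on the top and on the bottom
    # floor/ceil so that rounding never shrinks the box; clamp to the image.
    return (max(0, math.floor(x0 - dx)),
            max(0, math.floor(y0 - dy)),
            min(image_w, math.ceil(x1 + dx)),
            min(image_h, math.ceil(y1 + dy)))
```

With a 5% ratio, the box (100, 200, 300, 400) in an 800×800 image becomes (95, 195, 305, 405). A box touching the image border is simply clipped at that border.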
In step S13, all the bills are segmented based on the enlarged annotation boxes obtained in step S12: Z1' from Z1, Z2' from Z2, Z3' from Z3, and Z4' from Z4. The resulting image of bill P1 is shown in FIG. 6. Besides the bill itself, it includes part of the image of bill P2, which is redundant as far as bill P1 is concerned.
In view of this, preferably, after the image of each bill has been acquired, the edges of each bill are identified, and the image of each bill is trimmed based on the identification result. The trimmed image of bill P1 is shown in FIG. 7, from which it can be seen that the trimming removes the redundant image content.
In this embodiment, the edges of any one of the bills are identified as follows:
processing the image of the bill to obtain a line drawing of the grayscale contours in the image of the bill;
merging multiple lines in the line drawing to obtain multiple reference boundary lines;
identifying the boundary region of the bill image with a boundary region model, where the boundary region model and the bill region recognition model may be the same model;
counting, for each reference boundary line, the number of its pixels that belong to the boundary region, and determining the edges of the bill from the reference boundary lines, their respective pixel counts, and the boundary region.
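The pixel-counting step above can be sketched as follows. The embodiment does not say how the counts are turned into a decision, so selecting the line with the highest count is an assumption, as are the function names, the representation of a reference boundary line as a pair of integer endpoints, and the representation of the boundary region as a boolean mask.

```python
def line_pixels(p0, p1):
    """Integer pixels on the segment p0 -> p1 (simple DDA rasterization)."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [(x0, y0)]
    return [
        (round(x0 + (x1 - x0) * t / steps), round(y0 + (y1 - y0) * t / steps))
        for t in range(steps + 1)
    ]


def count_in_region(line, region_mask):
    """Number of the line's pixels that fall inside the boundary region."""
    height = len(region_mask)
    width = len(region_mask[0]) if height else 0
    return sum(
        1
        for x, y in line_pixels(*line)
        if 0 <= x < width and 0 <= y < height and region_mask[y][x]
    )


def best_reference_line(lines, region_mask):
    """Pick the reference line with the most pixels in the boundary region.

    Choosing the maximum is a hypothetical selection rule for illustration.
    """
    return max(lines, key=lambda line: count_in_region(line, region_mask))
```

In a full implementation one such selection would presumably be made per side of the bill, with the four chosen lines together forming its edges.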
The image of the bill may be processed with an edge detection algorithm to obtain the line drawing of grayscale contours. For example, the image may be processed with an edge detection algorithm based on OpenCV, an open-source computer vision library. OpenCV-based edge detection algorithms include Sobel, Scharr, Canny, Laplacian, Prewitt and Marr-Hildreth, among others. This embodiment uses the Canny edge detection algorithm, which is a multi-stage algorithm made up of the following steps: 1. noise reduction: smoothing the image with a Gaussian filter; 2. gradient computation: computing the gradient magnitude and direction from finite differences of the first-order partial derivatives; 3. non-maximum suppression: applying non-maximum suppression to the gradient magnitude; 4. threshold filtering: detecting and linking edges with a double-threshold algorithm.
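To make steps 2 and 4 concrete, the sketch below computes a Sobel gradient magnitude and applies double-threshold linking on a grayscale image stored as a list of rows. It is a simplified illustration under stated assumptions: Gaussian smoothing (step 1) and non-maximum suppression (step 3) are omitted, the function names are hypothetical, and it does not reproduce OpenCV's implementation. In practice the whole pipeline is available as OpenCV's `cv2.Canny`.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient_magnitude(gray):
    """Step 2: first-order finite differences (Sobel) on interior pixels."""
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = math.hypot(gx, gy)
    return mag


def hysteresis(mag, low, high):
    """Step 4: keep strong pixels, plus weak pixels connected to them."""
    h, w = len(mag), len(mag[0])
    edges = [[mag[y][x] >= high for x in range(w)] for y in range(h)]
    stack = [(x, y) for y in range(h) for x in range(w) if edges[y][x]]
    while stack:
        x, y = stack.pop()
        # Visit the 8-connected neighborhood of each accepted edge pixel.
        for ny in range(max(0, y - 1), min(h, y + 2)):
            for nx in range(max(0, x - 1), min(w, x + 2)):
                if not edges[ny][nx] and mag[ny][nx] >= low:
                    edges[ny][nx] = True
                    stack.append((nx, ny))
    return edges
```

A weak pixel (between the two thresholds) is kept only if it connects to a strong one, which is what lets the double threshold link broken edge fragments without admitting isolated noise.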
It should be noted that, in other embodiments, other edge identification methods well known to those skilled in the art may also be used; the choice of a specific edge identification method does not limit the present application.
In step S13, further preferably, after the image of each bill is obtained, the image content of each bill is corrected, the correction including global correction and local correction. In this embodiment, the correction step may be performed after the trimming so as to reduce the area to be corrected; in other embodiments, the correction step may also be performed before the trimming.
Converting a paper document into a text image may leave the text image tilted, and such tilt adversely affects subsequent processing of the text image, such as analysis (for example, character recognition). This embodiment therefore corrects the image content of each bill so that tilt of the text image does not adversely affect its analysis and processing.
In this embodiment, the image content of any bill may be corrected by the following steps: performing global correction on the image of the bill to obtain an intermediate corrected image; and performing local adjustment on the intermediate corrected image to obtain a target corrected image. Performing local adjustment on the intermediate corrected image to obtain the target corrected image includes:
determining, from the intermediate corrected image, M character-row lower boundaries corresponding to the M character rows of the image of the bill; determining, based on the intermediate corrected image and the M character-row lower boundaries, a local adjustment reference line and M retention coefficient groups, where each of the M retention coefficient groups includes multiple retention coefficients; determining, from the M character-row lower boundaries, the local adjustment reference line and the M retention coefficient groups, M local adjustment offset groups corresponding to the M character rows, where each of the M local adjustment offset groups includes multiple local adjustment offsets; and locally adjusting the M character rows in the intermediate corrected image according to the M local adjustment offset groups to obtain the target corrected image.
In other embodiments, other methods well known to those skilled in the art may also be used to correct the image content of each bill, and they are not described further here.
In addition, when bills are pasted onto a sheet, they may be placed in different orientations (some upright, some upside down or sideways), and characters that lie sideways or upside down are inconvenient to tally and inspect. In view of this, preferably, before the image of any bill is output, the bill recognition method of this embodiment further includes: adjusting the orientation of the image of the bill so that the characters on the bill face a preset direction. Preferably, the preset direction is the positive Y direction of the plane coordinate system, to facilitate subsequent recognition.
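Once the orientation of the characters is known, the adjustment reduces to rotating the image by a multiple of 90 degrees. The sketch below assumes that the detected angle is supplied by some upstream step (how it is detected is not covered here), that only the four axis-aligned orientations occur, and that the image is a row-major list of rows; the function names are hypothetical.

```python
def rotate_clockwise(image):
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def to_preset_direction(image, detected_angle):
    """Rotate the image so that its characters end up upright.

    detected_angle is how far the text is rotated clockwise from upright
    (0, 90, 180 or 270); detecting it is outside this sketch.
    """
    if detected_angle % 90 != 0:
        raise ValueError("only multiples of 90 degrees are handled")
    turns = (-detected_angle // 90) % 4  # clockwise turns that undo the rotation
    for _ in range(turns):
        image = rotate_clockwise(image)
    return image
```

A bill pasted sideways (90 degrees clockwise) is thus brought upright by three further clockwise quarter turns, which is equivalent to one counter-clockwise turn.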
Based on the same idea, this embodiment also provides a bill recognition apparatus. As shown in FIG. 8, the bill recognition apparatus includes:
an image preprocessing module 10, configured to preprocess an original image containing bills, the preprocessing including: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, where the first direction is perpendicular to the second direction and the length of the original image in the first direction is not less than its length in the second direction;
an annotation box acquisition and adjustment module 20, configured to obtain the annotation box of each bill and enlarge it by a preset ratio; and
an image post-processing module 30, configured to segment out the image of each bill based on its enlarged annotation box and to output it.
The bill recognition apparatus further includes a node size setting module (not shown in the figure) for setting the first node size and the second node size, preferably for presetting multiple first node sizes and multiple second node sizes in one-to-one correspondence. When scaling the original image, the image preprocessing module 10 scales the length of the original image in the first direction to the first node size closest in value, and pads the length of the scaled original image in the second direction to the second node size that corresponds to the first node size used for scaling. The image preprocessing module 10 pads the length of the scaled original image in the second direction to the preset second node size by adding a blank area at the side of the scaled original image along the second direction.
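The scale-then-pad preprocessing can be sketched as follows. Several details are assumptions made for illustration: the node sizes are given as a mapping from each first node size to its paired second node size, each second node size is at least as large as the scaled short side, the aspect ratio is preserved when scaling, and the blank area is added on the right and bottom (the text only says "at the side"). The function names are hypothetical.

```python
def plan_resize(width, height, node_sizes):
    """Return (scaled_w, scaled_h, canvas_w, canvas_h).

    node_sizes maps each first node size to its second node size.
    The longer side of the image is treated as the first direction.
    """
    long_side, short_side = max(width, height), min(width, height)
    # Choose the first node size closest in value to the long side.
    first = min(node_sizes, key=lambda n: abs(n - long_side))
    second = node_sizes[first]
    scaled_short = round(short_side * first / long_side)
    if width >= height:
        return first, scaled_short, first, second
    return scaled_short, first, second, first


def pad_rows(image, canvas_w, canvas_h, blank=255):
    """Pad a row-major image with blank pixels on the right and bottom."""
    padded = [row + [blank] * (canvas_w - len(row)) for row in image]
    padded += [[blank] * canvas_w for _ in range(canvas_h - len(image))]
    return padded
```

Because only blank pixels are added in the second direction, the bill content keeps its aspect ratio while every input ends up at one of a small set of canvas sizes.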
Specifically, the image post-processing module 30 includes an image segmentation module 301 and an image output module 302. The image segmentation module 301 is configured to segment out the image of the bill based on the enlarged annotation box of the bill, and the image output module 302 is configured to output the segmented image of the bill.
Preferably, the image post-processing module 30 further includes an orientation adjustment module 303, configured to adjust the orientation of the image of the bill so that the characters on the bill face a preset direction. Preferably, the preset direction is the positive Y direction of the plane coordinate system, to facilitate subsequent recognition.
Further, the bill recognition apparatus may also include an edge processing module 304, configured to identify the edges of the bill and, based on the identification result, to trim the image of the bill.
Further, the bill recognition apparatus may also include an image correction module 305, configured to correct the image content of the bill, the correction including global correction and local correction. The correction of the image content of the bill by the image correction module 305 may be performed after the edge processing module 304 trims the image of the bill, or before it does so.
It should be noted that the modules of the bill recognition apparatus provided in this embodiment respectively implement the steps of the bill recognition method provided in this embodiment. For a detailed description of the functions each module can implement, refer to the description of the corresponding steps of the bill recognition method above; repeated content is omitted. In addition, the bill recognition apparatus achieves the same technical effects as the bill recognition method described above, which are likewise not repeated here.
It can be understood that the image preprocessing module 10, the annotation box acquisition and adjustment module 20 and the image post-processing module 30 of the bill recognition apparatus may be combined and implemented in a single apparatus, or any one of them may be split into multiple sub-modules; alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one functional module. According to embodiments of the present invention, at least one of the image preprocessing module 10, the annotation box acquisition and adjustment module 20 and the image post-processing module 30 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system in package or an application specific integrated circuit (ASIC), or in hardware or firmware by any other reasonable way of integrating or packaging circuits, or by an appropriate combination of software, hardware and firmware. Alternatively, at least one of the image preprocessing module 10, the annotation box acquisition and adjustment module 20 and the image post-processing module 30 may be implemented at least partially as a computer program module which, when run by a computer, performs the function of the corresponding module.
In addition, this embodiment also provides a readable storage medium storing a computer program which, when executed, implements the bill recognition method described in this embodiment.
The readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device, and any suitable combination of the foregoing. The computer program described here may be downloaded from the readable storage medium to individual computing/processing devices, or downloaded to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards it for storage in a readable storage medium in that computing/processing device. The computer program for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA) or a programmable logic array (PLA), is personalized using state information of the computer program, and the electronic circuit can execute computer-readable program instructions so as to implement various aspects of the present invention.
Aspects of the present invention are described here with reference to flowcharts and/or block diagrams of methods, systems and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by a computer program. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the programs, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer programs may also be stored in a readable storage medium and cause a computer, a programmable data processing apparatus and/or other devices to operate in a particular manner, so that the readable storage medium storing the computer program comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer program may also be loaded onto a computer, another programmable data processing apparatus or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus or other device to produce a computer-implemented process, whereby the computer program executing on the computer, other programmable data processing apparatus or other device implements the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
In summary, in the bill recognition method, the bill recognition apparatus and the readable storage medium provided by the present invention, an original image containing bills is first preprocessed, the preprocessing including: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, where the first direction is perpendicular to the second direction and the length of the original image in the first direction is not less than its length in the second direction. Next, the annotation box of each bill is obtained and enlarged by a preset ratio. Finally, all of the bills are segmented based on their enlarged annotation boxes to obtain and output the image of each bill. That is, the original image containing multiple bills is adjusted into a square before the subsequent segmentation, and the annotation box of each bill is enlarged by a preset ratio before segmentation. Unifying original bill pictures of various sizes to the same size speeds up subsequent processing, and the image resizing approach used does not deform the image; enlarging the annotation boxes also prevents the edge regions of the bills from being lost. The difficulty of recognizing multiple bills in one picture is thereby reduced.
The above is only a description of preferred embodiments of the present invention and does not limit the scope of the present invention in any way. Any changes or modifications made by those of ordinary skill in the art based on the above disclosure fall within the protection scope of the claims.

Claims (17)

  1. A bill recognition method, comprising:
    preprocessing an original image containing a bill, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
    obtaining an annotation box of the bill, and enlarging the annotation box of the bill by a preset ratio; and
    segmenting out an image of the bill based on the enlarged annotation box of the bill, and outputting the segmented image of the bill.
  2. The bill recognition method according to claim 1, further comprising:
    presetting a plurality of the first node sizes and a plurality of the second node sizes, the first node sizes and the second node sizes being in one-to-one correspondence; when scaling the original image, scaling the length of the original image in the first direction to the first node size closest in value; and
    padding the length of the scaled original image in the second direction to the second node size corresponding to the first node size used for scaling.
  3. The bill recognition method according to claim 1, wherein the first node size and the second node size are the same.
  4. The bill recognition method according to claim 1, wherein padding the length of the scaled original image in the second direction to the preset second node size comprises:
    adding a blank area at the side of the scaled original image along the second direction.
  5. The bill recognition method according to claim 1, wherein obtaining the annotation box of the bill comprises:
    obtaining location area information of the bill; and
    obtaining the annotation box of the bill based on the location area information of the bill.
  6. The bill recognition method according to claim 1, wherein, before the image of the bill is output, the bill recognition method further comprises:
    adjusting the orientation of the image of the bill so that the characters on the bill face a preset direction.
  7. The bill recognition method according to claim 1, wherein, after the image of the bill is obtained, the bill recognition method further comprises:
    identifying the edges of the bill; and
    trimming the image of the bill based on the identification result.
  8. The bill recognition method according to claim 1, wherein, after the image of the bill is obtained, the bill recognition method further comprises:
    correcting the image content of the bill, the correction comprising global correction and local correction.
  9. A bill recognition apparatus, comprising:
    an image preprocessing module, configured to preprocess an original image containing a bill, the preprocessing comprising: scaling the length of the original image in a first direction to a preset first node size, and then padding the length of the scaled original image in a second direction to a preset second node size, wherein the first direction is perpendicular to the second direction, and the length of the original image in the first direction is not less than its length in the second direction;
    an annotation box acquisition and adjustment module, configured to obtain an annotation box of the bill and enlarge the annotation box of the bill by a preset ratio; and
    an image post-processing module, configured to segment out an image of the bill based on the enlarged annotation box of the bill and to output the segmented image of the bill.
  10. The bill recognition apparatus according to claim 9, further comprising a node size setting module, configured to preset a plurality of the first node sizes and a plurality of the second node sizes, the first node sizes and the second node sizes being in one-to-one correspondence;
    wherein, when scaling the original image, the image preprocessing module scales the length of the original image in the first direction to the first node size closest in value, and pads the length of the scaled original image in the second direction to the second node size corresponding to the first node size used for scaling.
  11. The bill recognition apparatus according to claim 9, wherein the first node size and the second node size are the same.
  12. The bill recognition apparatus according to claim 9, wherein the image preprocessing module pads the length of the scaled original image in the second direction to the preset second node size by adding a blank area at the side of the scaled original image along the second direction.
  13. The bill recognition apparatus according to claim 9, wherein the image post-processing module comprises an image segmentation module and an image output module, the image segmentation module being configured to segment out the image of the bill based on the enlarged annotation box of the bill, and the image output module being configured to output the segmented image of the bill.
  14. The bill recognition apparatus according to claim 13, wherein the image post-processing module further comprises an orientation adjustment module, configured to adjust the orientation of the image of the bill so that the characters on the bill face a preset direction.
  15. The bill recognition apparatus according to claim 13, wherein the image post-processing module further comprises an edge processing module, configured to identify the edges of the bill and to trim the image of the bill based on the identification result.
  16. The bill recognition apparatus according to claim 13, wherein the image post-processing module further comprises an image correction module, configured to correct the image content of the bill, the correction comprising global correction and local correction.
  17. A readable storage medium storing a computer program, wherein the computer program, when executed, implements the bill recognition method according to any one of claims 1 to 8.
PCT/CN2021/132930 2020-11-25 2021-11-24 Document recognition method and apparatus, and readable storage medium WO2022111549A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011338701.8 2020-11-25
CN202011338701.8A CN112308036A (en) 2020-11-25 2020-11-25 Bill identification method and device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022111549A1 true WO2022111549A1 (en) 2022-06-02

Family

ID=74335645

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132930 WO2022111549A1 (en) 2020-11-25 2021-11-24 Document recognition method and apparatus, and readable storage medium

Country Status (2)

Country Link
CN (1) CN112308036A (en)
WO (1) WO2022111549A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308036A (en) * 2020-11-25 2021-02-02 杭州睿胜软件有限公司 Bill identification method and device and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060045321A1 (en) * 2004-08-24 2006-03-02 Chen-Yu Enterprises Llc Bank check and method for positioning and interpreting a digital check within a defined region
CN107798299A (en) * 2017-10-09 2018-03-13 平安科技(深圳)有限公司 Billing information recognition methods, electronic installation and readable storage medium storing program for executing
CN109740548A (en) * 2019-01-08 2019-05-10 北京易道博识科技有限公司 A kind of reimbursement bill images dividing method and system
CN110751143A (en) * 2019-09-26 2020-02-04 中电万维信息技术有限责任公司 Electronic invoice information extraction method and electronic equipment
CN111739024A (en) * 2020-08-28 2020-10-02 安翰科技(武汉)股份有限公司 Image recognition method, electronic device and readable storage medium
CN112308036A (en) * 2020-11-25 2021-02-02 杭州睿胜软件有限公司 Bill identification method and device and readable storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104715449A (en) * 2015-03-31 2015-06-17 百度在线网络技术(北京)有限公司 Method and device for generating mosaic image
CN110457973A (en) * 2018-05-07 2019-11-15 北京中海汇银财税服务有限公司 A kind of method and system of bank slip recognition
US11651206B2 (en) * 2018-06-27 2023-05-16 International Business Machines Corporation Multiscale feature representations for object recognition and detection
CN109063085B (en) * 2018-07-26 2021-07-13 创新先进技术有限公司 Thumbnail generation method and device
CN109948510B (en) * 2019-03-14 2021-06-11 北京易道博识科技有限公司 Document image instance segmentation method and device
CN110443239A (en) * 2019-06-28 2019-11-12 平安科技(深圳)有限公司 The recognition methods of character image and its device
CN110490193B (en) * 2019-07-24 2022-11-08 西安网算数据科技有限公司 Single character area detection method and bill content identification method
CN110427932B (en) * 2019-08-02 2023-05-02 杭州睿琪软件有限公司 Method and device for identifying multiple bill areas in image
CN110428414B (en) * 2019-08-02 2023-05-23 杭州睿琪软件有限公司 Method and device for identifying number of notes in image
CN111476109A (en) * 2020-03-18 2020-07-31 深圳中兴网信科技有限公司 Bill processing method, bill processing apparatus, and computer-readable storage medium
CN111931664B (en) * 2020-08-12 2024-01-12 腾讯科技(深圳)有限公司 Mixed-pasting bill image processing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112308036A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN110176027B (en) Video target tracking method, device, equipment and storage medium
CN108229303B (en) Detection recognition and training method, device, equipment and medium for detection recognition network
US11636604B2 (en) Edge detection method and device, electronic equipment, and computer-readable storage medium
CN111080660B (en) Image segmentation method, device, terminal equipment and storage medium
CN110427932B (en) Method and device for identifying multiple bill areas in image
CN111553923B (en) Image processing method, electronic equipment and computer readable storage medium
JP7198350B2 (en) CHARACTER DETECTION DEVICE, CHARACTER DETECTION METHOD AND CHARACTER DETECTION SYSTEM
CN112330696B (en) Face segmentation method, face segmentation device and computer-readable storage medium
CN110516541B (en) Text positioning method and device, computer readable storage medium and computer equipment
WO2020097909A1 (en) Text detection method and apparatus, and storage medium
CN111914698A (en) Method and system for segmenting human body in image, electronic device and storage medium
WO2023035531A1 (en) Super-resolution reconstruction method for text image and related device thereof
US20220076119A1 (en) Device and method of training a generative neural network
CN112308051B (en) Text box detection method and device, electronic equipment and computer storage medium
CN110163866A (en) A kind of image processing method, electronic equipment and computer readable storage medium
WO2022111549A1 (en) Document recognition method and apparatus, and readable storage medium
CN111310758A (en) Text detection method and device, computer equipment and storage medium
CN109035167A (en) Method, apparatus, equipment and the medium that multiple faces in image are handled
CN113592720B (en) Image scaling processing method, device, equipment and storage medium
Hao et al. LEDet: A single-shot real-time object detector based on low-light image enhancement
CN110580462B (en) Natural scene text detection method and system based on non-local network
US9886629B2 (en) Techniques for restoring content from a torn document
US11816842B2 (en) Image processing method, apparatus, electronic device, and storage medium
US11687886B2 (en) Method and device for identifying number of bills and multiple bill areas in image
CN114155540A (en) Character recognition method, device and equipment based on deep learning and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21897049

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 21897049

Country of ref document: EP

Kind code of ref document: A1