CN113903040A - Text recognition method, equipment, system and computer readable medium for shopping receipt - Google Patents

Text recognition method, equipment, system and computer readable medium for shopping receipt

Info

Publication number
CN113903040A
CN113903040A CN202111257834.7A
Authority
CN
China
Prior art keywords
text
image
text block
block
electronic image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111257834.7A
Other languages
Chinese (zh)
Inventor
于兴兴
林喆
朱亮
梅娟
曹颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sunmi Technology Group Co Ltd
Shanghai Sunmi Technology Co Ltd
Shenzhen Michelangelo Technology Co Ltd
Original Assignee
Shanghai Sunmi Technology Group Co Ltd
Shenzhen Michelangelo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sunmi Technology Group Co Ltd, Shenzhen Michelangelo Technology Co Ltd filed Critical Shanghai Sunmi Technology Group Co Ltd
Priority to CN202111257834.7A priority Critical patent/CN113903040A/en
Publication of CN113903040A publication Critical patent/CN113903040A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/166 Editing, e.g. inserting or deleting

Abstract

The application provides a character recognition method, device, system, and computer-readable medium for a shopping receipt. The method comprises the following steps: obtaining an electronic image of a shopping receipt from a printer, the electronic image containing one or more lines of text, each line of text including one or more text blocks; performing text block detection on the electronic image and obtaining the coordinates of each text block; acquiring an image of each text block based on its coordinates and performing character recognition on each text block image to obtain its text; and sorting and laying out the texts of all text blocks based on their coordinates to obtain the text of the shopping receipt with layout information. The method performs character recognition on the electronic image of the shopping receipt, preserves the layout of the receipt, and improves the reliability of the user's business analysis.

Description

Text recognition method, equipment, system and computer readable medium for shopping receipt
Technical Field
The present application relates generally to the field of text recognition, and more particularly to a text recognition method, system, device and computer readable medium for a shopping receipt.
Background
In recent years, artificial intelligence has become an important driving force for global technological and industrial change and has promoted the rise and development of business intelligence. Image text recognition for merchants' shopping receipts (receipt OCR for short) is an important application in the field of business intelligence and has attracted much attention in recent years. Receipt OCR is a technology that detects the text on a receipt and recognizes it as editable characters (including Chinese characters, English letters, Arabic numerals, special symbols, and the like). Receipt OCR can collect operating information directly and in a uniform way, and plays an important role in market operation analysis, the distribution of market conditions, merchant rent schemes, and other areas.
A common receipt OCR system usually recognizes an image of a paper receipt: the user first obtains the paper shopping receipt from the printer, photographs it to obtain a receipt picture, inputs the picture into the OCR system, and finally receives the recognized receipt text. Recognizing images of paper receipts in this way is labor-intensive and inefficient. Because accuracy suffers greatly from interference such as illumination, shooting angle, creases, and background, directly using the recognition result for business tasks such as operations analysis easily leads to wrong decisions. In addition, such systems do not consider the format and layout of the receipt, so the recognized result cannot be restored to the original text arrangement of the receipt; information is seriously damaged, which affects the reliability of subsequent business analysis.
Therefore, a low-cost, efficient, and accurate method and system for recognizing the text of shopping receipts are needed to improve the reliability of users' business analysis.
Disclosure of Invention
The technical problem to be solved by the present application is to provide a character recognition method, system, device, and computer-readable medium for a shopping receipt, so as to solve the problem that a user's business analysis is unreliable because the accuracy of character recognition on receipts is low.
In order to solve the above technical problem, the present application provides a text recognition method for a shopping receipt, comprising the following steps: obtaining an electronic image of a shopping receipt from a printer, the electronic image containing one or more lines of text, each line of text including one or more text blocks; performing text block detection on the electronic image and obtaining the coordinates of each text block; acquiring an image of each text block based on its coordinates and performing character recognition on each text block image to obtain its text; and sorting and laying out the texts of all text blocks based on their coordinates to obtain the text of the shopping receipt with layout information.
In an embodiment of the present application, the method further comprises: performing image processing on the electronic image to enhance the character information.
In an embodiment of the present application, the image processing to enhance the character information comprises any one or a combination of the following: converting the electronic image into a grayscale image; performing a binarization operation on the electronic image; performing a character-thickening operation on the electronic image; and removing the background of the electronic image that does not contain characters.
In an embodiment of the present application, the step of performing text block detection on the electronic image and obtaining the coordinates of each text block comprises: merging horizontally adjacent characters in each line of text of the electronic image into a text block; performing a vertical pixel scan on the electronic image and marking the starting and ending vertical coordinates of each line according to the foreground pixels of each line of text; performing a horizontal pixel scan on each line of text and marking the starting and ending horizontal coordinates of each text block according to the foreground pixels of each column; and obtaining the coordinates of each text block from the starting and ending vertical coordinates of each line and the starting and ending horizontal coordinates of each text block, wherein the coordinates of a text block comprise the horizontal and vertical coordinates of its upper-left corner and the horizontal and vertical coordinates of its lower-right corner.
In an embodiment of the present application, merging horizontally adjacent characters in each line of text of the electronic image into one text block is implemented by performing a morphological dilation operation on the characters of the electronic image with an n×1 template, where n is a preset dilation parameter.
In an embodiment of the present application, the step of performing character recognition on the image of each text block to obtain its text comprises: resizing and normalizing the image of each text block; and performing character recognition on the image of each text block with a character recognition model based on a convolutional recurrent neural network to obtain the text of each text block.
In an embodiment of the present application, the step of sorting and laying out the texts of all text blocks based on their coordinates and obtaining the text of the shopping receipt with layout information comprises: performing a first sorting operation on the texts of all text blocks by the upper-left horizontal coordinate of each text block; performing a second sorting operation on the texts of all text blocks produced by the first sorting operation, by the upper-left vertical coordinate of each text block; and adding layout information to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information.
In an embodiment of the present application, the step of adding layout information to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information comprises: adding a first symbol to separate the texts of text blocks on the same line, based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation; and adding a second symbol to separate the texts of text blocks on different lines, based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation.
In an embodiment of the present application, the layout information comprises: the texts of text blocks on the same line being separated by a first symbol; and the texts of text blocks on different lines being separated by a second symbol.
In order to solve the above technical problem, the present application further provides a text recognition system for a shopping receipt, comprising: a receipt image collection module for obtaining an electronic image of a shopping receipt from a printer, the electronic image containing one or more lines of text, each line of text including one or more text blocks; a receipt text detection module for performing text block detection on the electronic image and obtaining the coordinates of each text block; a receipt character recognition module for acquiring an image of each text block based on its coordinates and performing character recognition on each text block image to obtain its text; and a receipt text sorting module for sorting and laying out the texts of all text blocks based on their coordinates and obtaining the text of the shopping receipt with layout information.
In order to solve the above technical problem, the present application further provides a text recognition device for a shopping receipt, comprising: a memory for storing instructions executable by a processor; and a processor for executing the instructions to implement the method described above.
To solve the above technical problem, the present application also provides a computer-readable medium storing computer program code which, when executed by a processor, implements the method described above.
Compared with the prior art, the character recognition method, device, system, and computer-readable medium for a shopping receipt of the present application perform character recognition on the electronic image of the shopping receipt, which has high image quality and avoids the interference from illumination, angle, creases, background, and the like that arises when a paper shopping receipt is photographed; this greatly improves recognition efficiency and reduces labor cost. The method also uses coordinate detection and a sorting algorithm to preserve the layout information of the receipt image, which improves the reliability of the user's business analysis.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the principle of the application. In the drawings:
FIG. 1 is a flowchart of a text recognition method for a shopping receipt according to an embodiment of the present application;
FIG. 2 is an electronic image of a shopping receipt according to an embodiment of the present application;
FIG. 3 is an electronic image of a shopping receipt after text merging according to an embodiment of the present application;
FIG. 4 illustrates the coordinates of the text blocks obtained by pixel-scanning an electronic image according to an embodiment of the present application;
FIG. 5 shows the character recognition result of an electronic image of a shopping receipt according to an embodiment of the present application;
FIG. 6 shows a receipt text with layout information according to an embodiment of the present application;
FIG. 7 is a flowchart of a text recognition method for a shopping receipt according to another embodiment of the present application;
FIG. 8 is an exemplary block diagram of a text recognition system for a shopping receipt according to an embodiment of the present application;
FIG. 9 is a system block diagram of a text recognition device for a shopping receipt according to an embodiment of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only examples or embodiments of the application, based on which the application can also be applied to other similar scenarios without inventive effort by a person of ordinary skill in the art. Unless apparent from the context or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In general, the terms "comprise" and "include" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
The relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present application unless specifically stated otherwise. Meanwhile, it should be understood that, for convenience of description, the sizes of the parts shown in the drawings are not drawn to actual scale. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they are intended to be part of the specification. In all examples shown and discussed herein, any specific value should be interpreted as merely illustrative and not limiting; other examples of the exemplary embodiments may therefore have different values. It should be noted that like reference numbers and letters refer to like items in the following figures, so once an item is defined in one figure it need not be discussed further in subsequent figures.
Flowcharts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that these operations are not necessarily performed exactly in the order shown. On the contrary, various steps may be processed in reverse order or simultaneously, and other operations may be added to or removed from these processes.
The application provides a character recognition method for a shopping receipt. FIG. 1 is a flowchart of a text recognition method for a shopping receipt according to an embodiment of the present application. As shown in FIG. 1, the text recognition method for a shopping receipt of this embodiment includes the following steps S11-S14:
step S11: an electronic image of a shopping coupon is obtained from a printer, the electronic image containing one or more lines of text, each line of text including one or more blocks of text.
Step S12: text block detection is performed on the electronic image and coordinates of each text block are obtained.
Step S13: and acquiring an image of each text block based on the coordinates of each text block, and performing character recognition on the image of each text block to obtain the text of each text block.
Step S14: and sequencing and typesetting the texts of all the text blocks based on the coordinates of all the text blocks, and obtaining the text with format typesetting information of the shopping receipt.
The above steps S11-S14 are explained in detail below with reference to FIGS. 1-7.
In step S11, the printer is a receipt printer, i.e., a printer used to print receipts such as store and supermarket receipts and company financial invoices. After the cash register completes the purchase and payment, it generates an electronic receipt image and delivers it to the printer to produce the paper receipt. This embodiment obtains the electronic image used to print the paper shopping receipt directly from the printer. Step S11 may be performed by a receipt image collection module provided on the printer body.
FIG. 2 is an electronic image of a shopping receipt according to an embodiment of the present application. FIG. 2 is only an illustration: as shown in FIG. 2, the electronic image may include multiple lines of text information such as the sales order number, product totals, product discounts, whole-order discounts, and transaction time, and each line of text may further include several text blocks such as name, unit price, quantity, and amount; the present application does not limit the content of the text. The electronic image may be a color image or a black-and-white image, which the present application likewise does not limit.
In some embodiments, step S12 includes steps S121-S124 as follows:
step S121: and combining the characters adjacent to each other in the left and right of each line of text of the electronic image of the shopping receipt into a text block.
Step S122: the electronic image is scanned in longitudinal pixels and the start and end ordinates of each line are marked in accordance with the foreground pixels of each line of text.
Step S123: a horizontal pixel scan is performed for each line of text and the start and end abscissas for each block of text are marked according to the foreground pixels of each column.
Step S124: and obtaining the coordinates of each text block based on the starting ordinate and the ending ordinate of each line and the starting abscissa and the ending abscissa of each text block, wherein the coordinates of the text block comprise the abscissa of the upper left corner of the text block and the abscissa of the lower right corner of the text block.
In one example, steps S121 to S124 may be performed in the order S121, S122, S123, and then S124.
In step S121, merging horizontally adjacent characters in each line of text of the electronic image into a text block may be implemented by performing a morphological dilation operation on the characters of the electronic image with an n×1 template, where n is a preset dilation parameter. Morphological dilation is a basic operation of digital image processing: the text of the electronic image is convolved with the template, and the maximum pixel value within the area covered by the template is taken. Templates may have any shape and size; this application uses an n×1 template, which connects adjacent characters within each line of text without connecting the text of adjacent lines. FIG. 3 is an electronic image of a shopping receipt after text merging according to an embodiment of the present application. As shown in FIG. 3, horizontally adjacent characters in the same line of text are merged into one text block, while characters in different lines, i.e., vertically adjacent characters, are not merged into one block; the n×1 template therefore ensures correct segmentation of the text blocks.
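For illustration only (this sketch is not part of the original disclosure), the following Python/OpenCV code shows one way such a horizontal-only dilation could be realized; the file name, the threshold of 20, and the value n = 15 are assumptions. The patent writes the template as n×1; in the row-by-column convention used by NumPy, a kernel with 1 row and n columns is the one that spreads pixels horizontally only.

```python
# A minimal sketch of step S121, assuming a grayscale receipt image on disk
# ("receipt.png" is a placeholder name, not from the patent).
import cv2
import numpy as np

n = 15  # preset dilation parameter (assumed value for illustration)

gray = cv2.imread("receipt.png", cv2.IMREAD_GRAYSCALE)

# Invert-threshold so that the (dark) characters become white foreground pixels.
_, binary = cv2.threshold(gray, 20, 255, cv2.THRESH_BINARY_INV)

# Horizontal-only morphological dilation: adjacent characters in a line merge
# into a single connected text block, while separate lines stay separate.
kernel = np.ones((1, n), dtype=np.uint8)
merged = cv2.dilate(binary, kernel, iterations=1)

cv2.imwrite("receipt_merged.png", merged)
```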
In step S122, the pixels of the electronic image, which consist of foreground pixels and background pixels, may be scanned row by row from top to bottom (or from bottom to top). In this embodiment, a foreground pixel is a non-blank pixel. If the current row contains a foreground pixel and the previous row does not, the row's vertical coordinate is marked as the starting vertical coordinate y1 of the text line, and scanning continues until a row without foreground pixels is reached whose previous row does contain foreground pixels; that row's vertical coordinate is marked as the ending vertical coordinate y2 of the text line. Repeating this process yields the starting and ending vertical coordinates of all text lines.
In step S123, each text line obtained in step S122 may be scanned column by column from left to right (or from right to left). If the current column contains a foreground pixel and the previous column does not, the column's horizontal coordinate is marked as the starting horizontal coordinate x1 of a text block, and scanning continues until a column without foreground pixels is reached whose previous column does contain foreground pixels; that column's horizontal coordinate is marked as the ending horizontal coordinate x2 of the text block. Repeating this process yields the starting and ending horizontal coordinates of all text blocks.
In step S124, the coordinates of each text block are obtained from the starting and ending vertical coordinates acquired in step S122 and the starting and ending horizontal coordinates acquired in step S123, and may be represented as a quadruple (x1, y1, x2, y2), where x1 is the starting horizontal coordinate of the text block, x2 is its ending horizontal coordinate, y1 is the starting vertical coordinate of the text line containing the block, and y2 is the ending vertical coordinate of that line.
FIG. 4 illustrates the coordinates of the text blocks obtained by pixel-scanning the electronic image according to an embodiment of the present application. By pixel-scanning the electronic image of the shopping receipt, the text block coordinates shown in FIG. 4 are obtained. Each text block's coordinates form a quadruple (x1, y1, x2, y2); that is, they comprise the horizontal and vertical coordinates of the block's upper-left corner and of its lower-right corner. Because the text blocks are obtained by row-and-column scanning, y1 and y2 are identical for all blocks in the same line, and blocks within a line are distinguished by x1 and x2, so the positional layout of every text block is fully preserved.
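For illustration only, the following sketch shows one possible implementation of the projection-style scanning in steps S122-S124. It assumes `merged` is a binary image whose text-block pixels are non-zero (for example, the output of the previous sketch); the helper name `find_runs` is illustrative, not from the patent.

```python
import numpy as np

def find_runs(profile):
    """Return (start, end) index pairs of consecutive non-zero entries."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

def text_block_coordinates(merged):
    blocks = []
    # Vertical scan: rows containing foreground pixels delimit text lines (y1, y2).
    row_has_fg = merged.any(axis=1)
    for y1, y2 in find_runs(row_has_fg):
        # Horizontal scan within the line: columns with foreground delimit blocks (x1, x2).
        col_has_fg = merged[y1:y2 + 1].any(axis=0)
        for x1, x2 in find_runs(col_has_fg):
            blocks.append((x1, y1, x2, y2))
    return blocks

# Example: coords = text_block_coordinates(merged)
```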
In some embodiments, step S13 includes the following steps S131 and S132:
Step S131: the image of each text block is resized and normalized.
Step S132: character recognition is performed on the image of each text block with a character recognition model based on a convolutional recurrent neural network to obtain the text of each text block.
In step S131, the image corresponding to each text block is extracted from the electronic image of the shopping receipt according to the text block's coordinates. The width and height of each extracted text block image are then adjusted: the height can be set to a preset value and the width scaled in equal proportion. For example, the height may be set to 32 pixels and the width scaled by the same factor, so that the whole block image has a resolution of 32×w, where w is the proportionally scaled width. The height may also be set to other values; the present application does not limit this.
In some embodiments, a normalization operation may be performed on each extracted text block image: through a series of transformations, the original image is converted into a unique standard form that is invariant to affine transformations such as translation, rotation, and scaling. Image normalization may use min-max normalization, moment-based normalization, or other methods; the present application does not limit the choice. Normalization does not change the image information; it only maps pixel values from 0-255 to 0-1, which speeds up the convergence of the subsequently trained network. Normalization may also be omitted.
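For illustration only, here is a minimal sketch of the resizing and min-max normalization described for step S131, assuming `gray` is the grayscale receipt image and `box` is one coordinate quadruple from the previous step; the target height of 32 pixels follows the example in the text.

```python
import cv2
import numpy as np

def prepare_block(gray, box, target_height=32):
    x1, y1, x2, y2 = box
    crop = gray[y1:y2 + 1, x1:x2 + 1]

    # Resize: fix the height, scale the width proportionally.
    h, w = crop.shape[:2]
    new_w = max(1, round(w * target_height / h))
    resized = cv2.resize(crop, (new_w, target_height))

    # Min-max normalization: map pixel values from 0-255 to 0-1.
    return resized.astype(np.float32) / 255.0
```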
In step S132, character recognition of a text block image means detecting the text in the text block and recognizing it as editable characters (including Chinese characters, English letters, Arabic numerals, special symbols, and the like). The convolutional recurrent neural network (CRNN) character recognition model used in step S132 is an artificial intelligence model that, once trained, can recognize the characters in an image. It is a lightweight model with few parameters, so recognition results can be obtained quickly, in real time, and accurately. In other embodiments, other character recognition models may be used; the present application is not limited in this respect. FIG. 5 shows the character recognition result of an electronic image of a shopping receipt according to an embodiment of the present application. As shown in FIG. 5, the image of each text block is recognized as the text of that block; each block's text occupies a separate line and does not yet contain the layout information of the electronic image of the shopping receipt.
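The patent does not give the CRNN's internals; as background only, the sketch below shows the standard greedy CTC decoding that is typically applied to a CRNN's per-time-step output to obtain the block's text. The character set and the logits array are illustrative placeholders, not values from the patent.

```python
# A CRNN emits, for each time step along the block image's width, a score
# distribution over a character set plus a CTC "blank" symbol.
import numpy as np

CHARSET = list("0123456789.x ")  # assumed character set for illustration
BLANK = len(CHARSET)             # index of the CTC blank symbol

def greedy_ctc_decode(logits):
    """logits: array of shape (time_steps, len(CHARSET) + 1)."""
    best = logits.argmax(axis=1)
    text, prev = [], BLANK
    for idx in best:
        # Collapse repeats and drop blanks, per the CTC decoding rule.
        if idx != prev and idx != BLANK:
            text.append(CHARSET[idx])
        prev = idx
    return "".join(text)

# Example with random scores, just to show the call shape:
# print(greedy_ctc_decode(np.random.rand(20, len(CHARSET) + 1)))
```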
In some embodiments, step S14 includes the following steps:
Step S141: a first sorting operation is performed on the texts of all text blocks by the upper-left horizontal coordinate of each text block.
Step S142: a second sorting operation is performed on the texts of all text blocks produced by the first sorting operation, by the upper-left vertical coordinate of each text block.
Step S143: layout information is added to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information.
In step S141, a first sorting operation is performed on the recognized text blocks by the upper-left horizontal coordinate x1 of each text block. The first sorting operation may use any sorting algorithm, such as insertion sort, selection sort, or bubble sort; the present application does not limit the choice.
In step S142, a second sorting operation is performed on the text blocks sorted in step S141, by the upper-left vertical coordinate y1 of each text block. The second sorting operation must use a stable sorting algorithm, such as insertion sort, merge sort, or bubble sort, so that blocks sharing the same y1 keep the left-to-right order established by the first sort. If the first sorting operation also uses a stable algorithm, the second sorting operation may use the same algorithm as the first or a different one; the present application does not limit this.
In step S143, adding layout information to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information includes steps S1431 and S1432:
Step S1431: based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation, a first symbol is added to separate the texts of text blocks on the same line. The first symbol may be the tab character "\t".
Step S1432: based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation, a second symbol is added to separate the texts of text blocks on different lines. The second symbol may be the line break "\n".
FIG. 6 shows a receipt text with layout information according to an embodiment of the present application. As shown in FIG. 6, the texts of text blocks on the same line are separated by the first symbol \t, and the texts of text blocks on different lines are separated by the second symbol \n.
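For illustration only, the following sketch combines the two sorting passes and the separator insertion of step S14, assuming `blocks` is a list of (text, (x1, y1, x2, y2)) pairs from the recognition step. Python's sorted() is stable, so sorting by x1 first and then by y1 reproduces the two-pass scheme described above: blocks end up grouped by line (y1) and ordered left to right (x1) within each line.

```python
def layout_text(blocks):
    by_x = sorted(blocks, key=lambda b: b[1][0])   # first sort: upper-left x1
    by_y = sorted(by_x, key=lambda b: b[1][1])     # second (stable) sort: upper-left y1

    lines, current_y = [], None
    for text, (x1, y1, x2, y2) in by_y:
        # Blocks on the same text line share the same y1 (see FIG. 4 discussion).
        if y1 != current_y:
            lines.append([])
            current_y = y1
        lines[-1].append(text)

    # First symbol "\t" between blocks on a line, second symbol "\n" between lines.
    return "\n".join("\t".join(line) for line in lines)

# Example with made-up coordinates:
# print(layout_text([("2", (120, 10, 130, 40)), ("cola", (5, 10, 60, 40)),
#                    ("3.50", (200, 10, 250, 40)), ("total", (5, 60, 70, 90))]))
```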
The character recognition method for a shopping receipt of this embodiment performs character recognition on the electronic image of the shopping receipt, which has high image quality; it avoids the interference from illumination, angle, creases, background, and the like that arises when a paper shopping receipt is photographed, greatly improves recognition efficiency, and reduces labor cost. The method uses a lightweight convolutional recurrent neural network character recognition model with few parameters, so text can be recognized from the receipt quickly and with low resource consumption. The method also uses coordinate detection and a sorting algorithm to preserve the layout information of the receipt image, which improves the reliability of the user's business analysis.
FIG. 7 is a flowchart of a text recognition method for a shopping receipt according to another embodiment of the present application. As shown in FIG. 7, the text recognition method 70 for a shopping receipt of this embodiment includes the following steps S71-S75:
step S71: an electronic image of a shopping coupon is obtained from a printer, the electronic image containing one or more lines of text, each line of text including one or more blocks of text. For a detailed description of this step, reference may be made to step S11 in the text recognition method 10 of the shopping receipt of the foregoing embodiment.
Step S72: and performing image processing for enhancing the character information on the electronic image.
Step S73: text block detection is performed on the electronic image and coordinates of each text block are obtained. For a detailed description of this step, reference may be made to step S12 in the text recognition method 10 of the shopping receipt of the foregoing embodiment.
Step S74: and acquiring an image of each text block based on the coordinates of each text block, and performing character recognition on the image of each text block to obtain the text of each text block. For a detailed description of this step, reference may be made to step S13 in the text recognition method 10 of the shopping receipt of the foregoing embodiment.
Step S75: and sequencing and typesetting the texts of all the text blocks based on the coordinates of all the text blocks, and obtaining the text with format typesetting information of the shopping receipt. For a detailed description of this step, reference may be made to step S14 in the text recognition method 10 of the shopping receipt of the foregoing embodiment.
The above step S72 is explained in detail as follows:
In step S72, the image processing performed on the electronic image to enhance the character information includes any one or a combination of the following steps S721 to S724:
Step S721: the electronic image is converted to a grayscale image.
Step S722: a binarization operation is performed on the electronic image.
Step S723: a character-thickening operation is performed on the electronic image.
Step S724: the background of the electronic image that does not contain characters is removed.
In step S721, when the obtained electronic image is a color image, its RGB representation is converted to a grayscale image in which all channels share the same gray value. The gray value may be the average of the original R, G, and B values, or a weighted combination of them. Converting the RGB image to grayscale removes color information that carries no meaning for recognition and reduces the image dimensionality.
In step S722, a binarization operation may be performed on the grayscale image, after which every pixel in the image has a gray value of either 0 (black) or 255 (white), i.e., the image contains only black and white. The gray values of a grayscale image range over 0-255; those of a binarized image are only 0 or 255. The binarization compares each gray value with a threshold: pixels with gray values less than or equal to the threshold are set to 0 (black), and pixels above the threshold are set to 255 (white). The threshold may be 20 or another value chosen as needed; the present application does not limit it. Binarization removes gray noise around the black characters and increases the contrast between foreground and background.
In step S723, in some embodiments, a character-thickening operation may also be performed on the electronic image of the shopping receipt: specifically, a morphological dilation operation is applied to the black characters with an n×n template, where n is a preset dilation parameter, so as to thicken the characters and make the character information more prominent.
In step S724, in some embodiments, the background of the electronic image of the shopping receipt that does not contain characters may also be removed. Specifically, the character-free background along the upper and/or lower edges of the electronic image can be cropped away, which reduces the image size and the resource consumption of the character recognition system.
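For illustration only, here is a minimal OpenCV sketch of the preprocessing in steps S721-S724; the threshold of 20 follows the example value in the text, while the kernel size and file names are assumptions.

```python
import cv2
import numpy as np

n = 3  # preset dilation parameter for character thickening (assumed)

# S721: convert the color receipt image to grayscale.
gray = cv2.cvtColor(cv2.imread("receipt_color.png"), cv2.COLOR_BGR2GRAY)

# S722: binarize. Gray values <= 20 become 0 (black text), the rest 255 (white).
_, binary = cv2.threshold(gray, 20, 255, cv2.THRESH_BINARY)

# S723: thicken the black characters. Erosion of a white-background image with an
# n x n template grows the black (minimum-value) regions, i.e. dilates the text.
thickened = cv2.erode(binary, np.ones((n, n), dtype=np.uint8))

# S724: crop away character-free background at the top and bottom edges.
rows_with_text = np.where((thickened == 0).any(axis=1))[0]
cropped = thickened[rows_with_text[0]:rows_with_text[-1] + 1]

cv2.imwrite("receipt_preprocessed.png", cropped)
```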
The method for recognizing the characters of a shopping receipt in this embodiment is an improvement on the previous embodiment: by performing image processing that enhances the character information before character recognition, it reduces redundant information, improves the efficiency of subsequent text recognition, and reduces the resource consumption of the character recognition system.
FIG. 8 is an exemplary block diagram of a text recognition system for a shopping receipt according to an embodiment of the present application. The text recognition system 80 can be used to perform the text recognition method for a shopping receipt described above, so the description of the method also applies to the system and is not repeated here. Referring to FIG. 8, the text recognition system 80 for a shopping receipt of this embodiment includes the following program modules: a receipt image collection module 81, a receipt text detection module 83, a receipt character recognition module 84, and a receipt text sorting module 85.
The receipt image collection module 81 is used to obtain the electronic image of a shopping receipt from the printer. The receipt text detection module 83 is used to perform text block detection on the electronic image and obtain the coordinates of each text block. The receipt character recognition module 84 is used to acquire the image of each text block based on its coordinates and perform character recognition on each text block image to obtain its text. The receipt text sorting module 85 is used to sort and lay out the texts of all text blocks based on their coordinates and obtain the text of the shopping receipt with layout information.
In an embodiment of the present application, the text recognition system 80 for a shopping receipt may further include a receipt image processing module 82, which is used to perform image processing on the electronic image to enhance the character information.
The character recognition system for shopping receipts constructed in this way can be deployed in the cloud: the printer acquires the electronic receipt image from the cash register and sends it to the character recognition system over the network, enabling fast, real-time text recognition.
The application also provides a text recognition device for a shopping receipt, comprising: a memory for storing instructions executable by a processor; and a processor for executing the instructions to implement any of the above text recognition methods for a shopping receipt.
FIG. 9 is a system block diagram of a text recognition device for a shopping receipt according to an embodiment of the present application. The text recognition device 90 for a shopping receipt may include an internal communication bus 901, a processor 902, a read-only memory (ROM) 903, a random access memory (RAM) 904, and a communication port 905. When used on a personal computer, the text recognition device 90 may also include a hard disk 907. The internal communication bus 901 enables data communication among the components of the text recognition device 90. The processor 902 can make determinations and issue prompts; in some embodiments, the processor 902 may consist of one or more processors. The communication port 905 enables data communication between the text recognition device 90 and the outside; in some embodiments, the device may send and receive information and data over the network through the communication port 905. The text recognition device 90 may also include various forms of program storage units and data storage units, such as the hard disk 907, the read-only memory (ROM) 903, and the random access memory (RAM) 904, capable of storing various data files used for computer processing and/or communication, as well as program instructions executed by the processor 902. The processor executes these instructions to implement the main parts of the method; the results are transmitted to the user device through the communication port and displayed on the user interface.
The above-described method for recognizing the characters of a shopping receipt may be implemented as a computer program that is stored on the hard disk 907 and loaded into the processor 902 for execution, so as to implement any character recognition method for a shopping receipt of the present application.
The present application also provides a computer readable medium having stored thereon computer program code which, when executed by a processor, implements any of the methods of text recognition of a shopping receipt as described above.
When the text recognition method for a shopping receipt is implemented as a computer program, it may also be stored in a computer-readable storage medium as an article of manufacture. For example, computer-readable storage media can include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical disks (e.g., compact disk (CD), digital versatile disk (DVD)), smart cards, and flash memory devices (e.g., erasable programmable read-only memory (EPROM), card, stick, key drive). In addition, the various storage media described herein can represent one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" can include, without being limited to, wireless channels and various other media (and/or storage media) capable of storing, containing, and/or carrying code and/or instructions and/or data.
It should be understood that the above-described embodiments are illustrative only. The embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and/or other electronic units designed to perform the functions described herein, or a combination thereof.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing disclosure is by way of example only, and is not intended to limit the present application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Aspects of the present application may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." The processor may be one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, or a combination thereof. Furthermore, aspects of the present application may be embodied as a computer product, including computer-readable program code, on one or more computer-readable media. For example, computer-readable media may include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical disks (e.g., compact disk (CD), digital versatile disk (DVD)), smart cards, and flash memory devices (e.g., card, stick, key drive).
Similarly, it should be noted that, in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more embodiments. This method of disclosure, however, is not to be interpreted as requiring more features than are expressly recited in the claims. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Although the present application has been described with reference to specific embodiments, those skilled in the art will recognize that the foregoing embodiments are merely illustrative of the present application, and that various changes and substitutions of equivalents may be made without departing from the spirit of the application; therefore, all changes and modifications to the above-described embodiments that come within the spirit of the application are intended to fall within the scope of the claims of the application.

Claims (13)

1. A character recognition method for a shopping receipt, comprising the following steps:
obtaining an electronic image of a shopping receipt from a printer, the electronic image containing one or more lines of text, each line of text including one or more text blocks;
performing text block detection on the electronic image and obtaining the coordinates of each text block;
acquiring an image of each text block based on its coordinates and performing character recognition on each text block image to obtain its text; and
sorting and laying out the texts of all text blocks based on their coordinates to obtain the text of the shopping receipt with layout information.
2. The method of claim 1, further comprising:
performing image processing on the electronic image to enhance the character information.
3. The method of claim 2, wherein the image processing to enhance the character information comprises any one or a combination of:
converting the electronic image into a grayscale image;
performing a binarization operation on the electronic image;
performing a character-thickening operation on the electronic image; and
removing the background of the electronic image that does not contain characters.
4. The method of claim 1, wherein the step of performing text block detection on the electronic image and obtaining the coordinates of each text block comprises:
merging horizontally adjacent characters in each line of text of the electronic image into a text block;
performing a vertical pixel scan on the electronic image and marking the starting and ending vertical coordinates of each line according to the foreground pixels of each line of text;
performing a horizontal pixel scan on each line of text and marking the starting and ending horizontal coordinates of each text block according to the foreground pixels of each column; and
obtaining the coordinates of each text block from the starting and ending vertical coordinates of each line and the starting and ending horizontal coordinates of each text block, wherein the coordinates of a text block comprise the horizontal and vertical coordinates of its upper-left corner and the horizontal and vertical coordinates of its lower-right corner.
5. The method of claim 4, wherein merging horizontally adjacent characters in each line of text of the electronic image into a text block is implemented by performing a morphological dilation operation on the characters of the electronic image with an n×1 template, where n is a preset dilation parameter.
6. The method of claim 1, wherein the step of performing character recognition on the image of each text block to obtain its text comprises:
resizing and normalizing the image of each text block; and
performing character recognition on the image of each text block with a character recognition model based on a convolutional recurrent neural network to obtain the text of each text block.
7. The method of claim 4, wherein the step of sorting and laying out the texts of all text blocks based on their coordinates and obtaining the text of the shopping receipt with layout information comprises:
performing a first sorting operation on the texts of all text blocks by the upper-left horizontal coordinate of each text block;
performing a second sorting operation on the texts of all text blocks produced by the first sorting operation, by the upper-left vertical coordinate of each text block; and
adding layout information to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information.
8. The method of claim 7, wherein the step of adding layout information to the texts of all text blocks produced by the second sorting operation, based on the coordinates of each text block, to obtain the text of the shopping receipt with layout information comprises:
adding a first symbol to separate the texts of text blocks on the same line, based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation; and
adding a second symbol to separate the texts of text blocks on different lines, based on the coordinates of each text block among the texts of all text blocks produced by the second sorting operation.
9. The method of claim 1, wherein the layout information comprises:
the texts of text blocks on the same line being separated by a first symbol; and
the texts of text blocks on different lines being separated by a second symbol.
10. A text recognition system for a shopping receipt, comprising:
a receipt image collection module for obtaining an electronic image of a shopping receipt from a printer, the electronic image containing one or more lines of text, each line of text including one or more text blocks;
a receipt text detection module for performing text block detection on the electronic image and obtaining the coordinates of each text block;
a receipt character recognition module for acquiring an image of each text block based on its coordinates and performing character recognition on each text block image to obtain its text; and
a receipt text sorting module for sorting and laying out the texts of all text blocks based on their coordinates and obtaining the text of the shopping receipt with layout information.
11. The system of claim 10, further comprising:
a receipt image processing module for performing image processing on the electronic image to enhance the character information.
12. A text recognition device for a shopping receipt, comprising:
a memory for storing instructions executable by a processor; and
a processor for executing the instructions to implement the method of any one of claims 1-9.
13. A computer-readable medium having stored thereon computer program code which, when executed by a processor, implements the method of any of claims 1-9.
CN202111257834.7A 2021-10-27 2021-10-27 Text recognition method, equipment, system and computer readable medium for shopping receipt Pending CN113903040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111257834.7A CN113903040A (en) 2021-10-27 2021-10-27 Text recognition method, equipment, system and computer readable medium for shopping receipt

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111257834.7A CN113903040A (en) 2021-10-27 2021-10-27 Text recognition method, equipment, system and computer readable medium for shopping receipt

Publications (1)

Publication Number Publication Date
CN113903040A true CN113903040A (en) 2022-01-07

Family

ID=79026559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111257834.7A Pending CN113903040A (en) 2021-10-27 2021-10-27 Text recognition method, equipment, system and computer readable medium for shopping receipt

Country Status (1)

Country Link
CN (1) CN113903040A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination