CN112560845A - Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium - Google Patents

Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium

Info

Publication number
CN112560845A
CN112560845A (application number CN202011540674.2A)
Authority
CN
China
Prior art keywords
character
image
neural network
character recognition
standard format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011540674.2A
Other languages
Chinese (zh)
Inventor
祖春山 (Zu Chunshan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOE Technology Group Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd filed Critical BOE Technology Group Co Ltd
Priority to CN202011540674.2A
Publication of CN112560845A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The application provides a character recognition method and apparatus, an intelligent meal taking cabinet, an electronic device, and a storage medium. The character recognition method comprises the following steps: converting an image to be recognized into a standard format image; determining a character region in the standard format image; and recognizing a character string in the character region based on a lightweight character recognition neural network. Because the complexity and content of the images the neural network must process are simplified in advance, a lightweight character recognition neural network can realize the character recognition function quickly and accurately with a fast response, and its use greatly reduces the demands on the computing performance of the device.

Description

Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium
Technical Field
The present application relates to the field of optical character recognition technology, and in particular to a character recognition method and apparatus, an intelligent meal taking cabinet, an electronic device, and a storage medium.
Background
Optical Character Recognition (OCR) is a technology that recognizes the text in an image in order to extract character information that can be used directly; it is increasingly widely applied in fields such as computer vision and artificial intelligence. At present, neural networks are the mainstream tool for optical character recognition, and applying them to OCR effectively improves the accuracy and efficiency of character recognition.
However, in order to achieve high recognition accuracy and adapt to various imaging conditions, existing neural networks for optical character recognition are designed to be large and complex; they require a high-performance computer or a dedicated server, so implementation cost is high and recognition latency is poor. This makes such networks difficult to deploy in real application scenarios such as mobile or embedded devices: first, the model is too large for the available memory; second, these scenarios require low latency and fast response, which large networks struggle to provide. For example, if an optical character recognition function is added to an intelligent meal taking cabinet to recognize meal taking codes, the device must combine modest computing performance with a quick response in order to provide a good customer experience.
Therefore, a character recognition scheme that demands little computing performance from the device and responds quickly is highly desirable.
Disclosure of Invention
The embodiments of the application aim to provide a character recognition method and apparatus, an intelligent meal taking cabinet, an electronic device, and a storage medium, so as to solve the problem that current character recognition approaches cannot achieve accuracy, efficiency, and low cost at the same time.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
a first aspect of the present application provides a character recognition method, including:
converting an image to be recognized into a standard format image;
determining a character region in the standard format image;
recognizing a character string in the character region based on a lightweight character recognition neural network.
In some modified embodiments of the first aspect of the present application, the converting the image to be recognized into a standard format image includes:
and converting the image to be recognized into a standard format image which accords with a preset color mode and/or a preset image size.
In some variations of the first aspect of the present application, the determining the character region in the standard format image comprises:
and determining a character area in the standard format image by adopting a lightweight text detection neural network.
In some variations of the first aspect of the present application, the lightweight text-detection neural network comprises:
and adopting a lightweight neural network MobileNet as a gradual scale expansion network PSENet of a backbone network.
In some modified embodiments of the first aspect of the present application, the recognizing the character string in the character region based on the lightweight character recognition neural network includes:
extracting a character region image containing the character region from the standard format image through an affine transformation;
recognizing the character string in the character region from the character region image based on the lightweight character recognition neural network.
In some variations of the first aspect of the present application, the lightweight character recognition neural network comprises:
and (3) a lightweight convolutional recurrent neural network.
In some variations of the first aspect of the present application, the output layer of the lightweight convolutional recurrent neural network employs locality-sensitive hashing (LSH) encoding.
In some variations of the first aspect of the present application, the character strings recognized based on the lightweight character recognition neural network comprise a plurality of character strings;
the method then further comprises:
screening the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario.
In some modified embodiments of the first aspect of the present application, the screening of the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario includes:
matching the plurality of character strings by at least one of regular expression matching, background information matching, and confidence matching to obtain the target character string that meets the constraint conditions of the current application scenario.
A second aspect of the present application provides a character recognition apparatus, comprising:
a standard format conversion module, configured to convert an image to be recognized into a standard format image;
a character region determination module, configured to determine a character region in the standard format image;
and a character string extraction module, configured to recognize a character string in the character region based on a lightweight character recognition neural network.
In some variations of the second aspect of the present application, the standard format conversion module comprises:
and the standard format conversion unit is used for converting the image to be recognized into a standard format image which accords with a preset color mode and/or a preset image size.
In some variations of the second aspect of the present application, the character region determination module comprises:
and the character region determining unit is used for determining the character region in the standard format image by adopting a light-weight text detection neural network.
In some variations of the second aspect of the present application, the lightweight text-detecting neural network comprises:
and adopting a lightweight neural network MobileNet as a gradual scale expansion network PSENet of a backbone network.
In some modified embodiments of the second aspect of the present application, the character string extraction module includes:
a character region image extraction unit, configured to extract a character region image containing the character region from the standard format image through an affine transformation;
and a character string extraction unit, configured to recognize the character string in the character region from the character region image based on the lightweight character recognition neural network.
In some variations of the second aspect of the present application, the lightweight character recognition neural network comprises:
and (3) a lightweight convolutional recurrent neural network.
In some variations of the second aspect of the present application, the output layer of the lightweight convolutional recurrent neural network employs locality-sensitive hashing (LSH) encoding.
In some variations of the second aspect of the present application, the character strings recognized based on the lightweight character recognition neural network comprise a plurality of character strings;
the apparatus then further comprises:
a target character string screening module, configured to screen the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario.
In some variations of the second aspect of the present application, the target string filtering module includes:
and the target character string screening unit is used for matching a plurality of character strings to obtain a target character string meeting the constraint condition of the current application scene by adopting at least one mode of regular expression matching, background information matching and confidence coefficient matching.
A third aspect of the present application provides an electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, performs the method of the first aspect of the application.
A fourth aspect of the present application provides a computer-readable storage medium having computer-readable instructions stored thereon, which can be executed by a processor to implement the method of the first aspect of the present application.
A fifth aspect of the present application provides an intelligent meal taking cabinet, comprising: a cabinet body, and an image acquisition device and a main control device arranged on the cabinet body; wherein,
the image acquisition device is connected with the main control device;
the image acquisition device is configured to capture an image of a meal taking voucher to generate an image to be recognized and send the image to be recognized to the main control device;
and the main control device is configured to recognize a character string from the image to be recognized using the method of the first aspect of the application, so as to determine the meal taking code recorded on the meal taking voucher, and to control the cabinet door at the designated meal taking position in the cabinet body to open according to the meal taking code.
The embodiments of the application aim to provide a character recognition method and apparatus, an intelligent meal taking cabinet, an electronic device, and a storage medium, in which an image to be recognized is converted into a standard format image, a character region in the standard format image is determined, and the character string in the character region is recognized based on a lightweight character recognition neural network. Converting the image to be recognized into a standard format image makes the images that the character recognition neural network must handle simpler and more uniform, which effectively reduces the data processing load of the network, lowers the performance requirements on the computing device, and reduces operating cost. The scheme thus relaxes the demands on the character recognition neural network: the network no longer has to face and process complicated images of differing specifications, so a lightweight character recognition neural network suffices for character recognition. Overall, the complexity and content of the images to be processed by the neural network are simplified in advance, so the lightweight network can realize the character recognition function quickly and accurately with a fast response; and because a lightweight network is used, the requirements on the computing performance of the device drop sharply, so character recognition can run on mobile devices, embedded devices, and the like, which facilitates popularization and implementation and offers broad application prospects.
Based on the same inventive concept, the character recognition apparatus of the second aspect, the electronic device of the third aspect, the computer-readable storage medium of the fourth aspect, and the intelligent meal taking cabinet of the fifth aspect have at least the same advantages as the character recognition method provided by the first aspect of the present application.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example and not by way of limitation in the figures, in which like reference numerals refer to similar or corresponding parts:
FIG. 1 schematically illustrates a flow chart of a character recognition method provided by some embodiments of the present application;
FIG. 2 schematically illustrates a schematic diagram of a standard format image provided by some embodiments of the present application;
FIG. 3 schematically illustrates a schematic diagram of a character region image provided by some embodiments of the present application;
FIG. 4 is a schematic diagram illustrating the computation of a matrix multiplication based on One-Hot encoding according to some embodiments of the present application;
FIG. 5 is a schematic diagram illustrating a calculation of a matrix multiplication based on an LSH encoding scheme according to some embodiments of the present application;
FIG. 6 schematically illustrates a schematic diagram of a character recognition apparatus provided by some embodiments of the present application;
FIG. 7 schematically illustrates a schematic view of an electronic device provided by some embodiments of the present application;
FIG. 8 schematically illustrates a schematic diagram of a computer-readable storage medium provided by some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which this application belongs.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; and "a plurality of" generally includes at least two, without excluding the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such article or system. Without further limitation, an element preceded by the phrase "comprising a" does not exclude the presence of other like elements in the article or system that includes it.
In addition, the terms "first" and "second", etc. are used to distinguish different objects, rather than to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiment of the application provides a character recognition method and device, an intelligent meal taking cabinet, electronic equipment and a storage medium, and the following description is given by way of example with reference to the accompanying drawings.
Referring to fig. 1, which schematically illustrates a flow chart of a character recognition method provided in some embodiments of the present application, as shown in fig. 1, a character recognition method may include the following steps:
step S101: and converting the image to be identified into a standard format image.
The image to be recognized may be an image containing the characters to be recognized, captured by an image acquisition device such as a camera or a scanner; it may be in any format, such as jpg or tif, and the embodiment of the present application is not limited in this respect.
The sizes, color modes, and other properties of images captured by different image acquisition devices differ, and such diversity inevitably places higher demands on the neural network that subsequently performs character recognition: for example, a large number of diverse training samples are needed, and the network needs more layers and higher complexity, all of which hinder a lightweight character recognition network. In the embodiment of the present application, therefore, the image to be recognized is simplified by converting it into a standard format image, so that the images the character recognition network must handle are simpler and more uniform and the demands on the network are reduced. For example, the subsequent character recognition network then only needs to be trained with a small number of standard format images and can be implemented with fewer layers and a lower network depth, which helps make the network lightweight and allows a lightweight character recognition neural network to recognize the standard format images.
In practical applications, converting the image to be recognized into a standard format image may include:
converting the image to be recognized into a standard format image that conforms to a preset color mode and/or a preset image size.
The standard format image may conform to a preset color mode, to a preset image size, or to both, so as to simplify and unify the images to be recognized.
The embodiment of the present application does not limit the preset color mode or the preset image size; those skilled in the art can set them flexibly according to actual requirements. For example, the preset color mode may be RGB, CMYK, HSB, Lab, or the like; as long as a unified standard is set, the purpose of the embodiment can be achieved. In some examples, images to be recognized may be uniformly converted into 3-channel RGB images. Since RGB images are common in practice, once RGB is set as the standard format no color mode conversion is needed for images that are already in RGB mode, and only non-RGB images need to be converted, which effectively reduces the overall conversion workload of this step.
In addition, the preset image size is not limited in this embodiment. It may include a height requirement and a width requirement, so that images to be recognized are uniformly processed into images of fixed height and width; the size may be specified as concrete pixel values or as an aspect ratio, and the embodiment of the present application is not limited in this respect.
In some examples, to preserve the aspect ratio of the image content, the image may be padded when it is converted into a standard format image that conforms to the preset image size. For example, if the preset image size has a 4:3 aspect ratio and the image to be recognized is 16:9, the height of the image can be padded so that it becomes a 4:3 standard format image, as the sketch below illustrates.
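The following is a minimal sketch of such a conversion, assuming Python with OpenCV; the 640 × 480 target size and the 3-channel RGB color mode are illustrative assumptions, since the embodiments deliberately leave the preset values open.

```python
import cv2
import numpy as np

# A minimal sketch of step S101; the 640x480 (4:3) target size and the
# 3-channel RGB color mode are assumed, not fixed by the embodiments.
TARGET_W, TARGET_H = 640, 480

def to_standard_format(img: np.ndarray) -> np.ndarray:
    # Unify the color mode: force 3-channel RGB.
    if img.ndim == 2:                       # grayscale -> RGB
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    elif img.shape[2] == 4:                 # RGBA -> RGB
        img = cv2.cvtColor(img, cv2.COLOR_RGBA2RGB)

    # Resize while preserving the aspect ratio.
    h, w = img.shape[:2]
    scale = min(TARGET_W / w, TARGET_H / h)
    new_w, new_h = int(w * scale), int(h * scale)
    resized = cv2.resize(img, (new_w, new_h))

    # Pad the shorter dimension (e.g. the height of a 16:9 image) with black
    # so the output is always a fixed-size 4:3 standard format image.
    canvas = np.zeros((TARGET_H, TARGET_W, 3), dtype=np.uint8)
    canvas[:new_h, :new_w] = resized
    return canvas
```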
Step S102: determining a character area in the standard format image.
Characters may exist in only part of the image to be recognized. To avoid performing character recognition on non-character regions and to improve subsequent recognition efficiency, the character region in the standard format image must first be determined and extracted. Specifically, in some embodiments, determining the character region in the standard format image includes:
determining the character region in the standard format image by means of a lightweight text detection neural network.
Since step S101 has already standardized the image to be recognized, simplifying and unifying it, this step can use a lightweight text detection neural network to determine the character region in the standard format image.
For example, text detection neural networks provided in the prior art, such as the Progressive Scale Expansion Network (PSENet), the Pixel Aggregation Network (PANet), and the Differentiable Binarization Network (DBNet), may be adopted, and they can be made lightweight by replacing their backbone network with a lightweight neural network (e.g., MobileNet, SqueezeNet, ShuffleNet, NASNet, MnasNet, or EfficientNet); for details, reference may be made to the lightweight processing techniques for neural networks provided in the prior art, which are not limited here.
In some specific examples, the lightweight text detection neural network described above may be a progressive scale expansion network (PSENet) that adopts the lightweight neural network MobileNet as its backbone network. This lightweight text detection network combines high detection accuracy with high speed, which effectively improves the overall character recognition accuracy and response speed of the scheme; a backbone-swap sketch follows.
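The embodiments do not spell out how MobileNet is wired into PSENet; below is a hedged sketch in PyTorch under the assumption that torchvision's MobileNetV2 feature stack is split at its stride boundaries to yield the multi-scale maps a PSENet-style segmentation head consumes. The split indices and the input size are illustrative, not disclosed by the patent.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

# A hedged sketch of the backbone swap: MobileNetV2 features split at
# (assumed) stride-4/8/16/32 boundaries for a PSENet-style detector.
class MobileNetBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        feats = mobilenet_v2(weights=None).features
        self.stage1 = feats[:4]    # ~1/4 resolution
        self.stage2 = feats[4:7]   # ~1/8
        self.stage3 = feats[7:14]  # ~1/16
        self.stage4 = feats[14:]   # ~1/32

    def forward(self, x):
        c1 = self.stage1(x)
        c2 = self.stage2(c1)
        c3 = self.stage3(c2)
        c4 = self.stage4(c3)
        return c1, c2, c3, c4      # fed to the detector's FPN / PSE head

x = torch.randn(1, 3, 480, 640)    # one standard format image
for f in MobileNetBackbone()(x):
    print(f.shape)
```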
The recognized character region is generally rectangular and is specifically described by the coordinate information of the four vertices of the rectangle.
It should be noted that the number of character regions recognized in this step is not limited to one; there may be several, in which case each character region is subjected to character recognition in the subsequent step.
Step S103: recognizing a character string in the character region based on a lightweight character recognition neural network.
After the character region has been determined, the character string in it can be recognized with the lightweight character recognition neural network. Specifically, in some embodiments, recognizing the character string in the character region based on the lightweight character recognition neural network includes:
extracting a character region image containing the character region from the standard format image through an affine transformation;
recognizing the character string in the character region from the character region image based on the lightweight character recognition neural network.
Extracting the character region image containing the character region from the standard format image through an affine transformation, and only then recognizing the character string with the lightweight character recognition neural network, means that the character region images the network must face and process are simpler and more uniform. This effectively improves recognition accuracy and efficiency and ensures that a lightweight character recognition neural network can recognize the character strings in the character region images accurately and efficiently.
Specifically, this can be understood with reference to fig. 2 and fig. 3: fig. 2 schematically illustrates a standard format image provided by some embodiments of the present application, and fig. 3 schematically illustrates a character region image provided by some embodiments of the present application. As shown in fig. 2 and fig. 3, this embodiment can use an affine transformation to extract the character region image from the standard format image directly, according to the vertex coordinates of the circumscribed rectangle of the character region. For example, the vertex coordinates of the circumscribed rectangle of the source character region in the standard format image and the vertex coordinates of the target rectangle region are first set, with the top-left vertex of the target rectangle placed at (0, 0), so that after the transformation the character region image can be extracted from the standard format image directly in one pass; the affine transformation matrix is then calculated and used to extract the character region image, as the sketch below illustrates.
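A minimal sketch of this extraction with OpenCV follows; the rectangle coordinates are hypothetical, since the description fixes none.

```python
import cv2
import numpy as np

# A minimal sketch of the affine extraction step; the quad below is assumed.
def crop_character_region(std_img: np.ndarray, quad: np.ndarray) -> np.ndarray:
    # quad: 4x2 vertices of the circumscribed rectangle, ordered top-left,
    # top-right, bottom-right, bottom-left.
    w = int(np.linalg.norm(quad[1] - quad[0]))   # target width from top edge
    h = int(np.linalg.norm(quad[3] - quad[0]))   # target height from left edge

    # An affine transform is fixed by 3 point pairs: map three source corners
    # onto a target rectangle whose top-left vertex is (0, 0).
    src = quad[:3].astype(np.float32)
    dst = np.float32([[0, 0], [w, 0], [w, h]])
    m = cv2.getAffineTransform(src, dst)
    return cv2.warpAffine(std_img, m, (w, h))

std_img = np.zeros((480, 640, 3), np.uint8)      # stand-in standard format image
quad = np.float32([[120, 200], [420, 210], [418, 260], [118, 250]])  # assumed
print(crop_character_region(std_img, quad).shape)
```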
In the above embodiment, the lightweight character recognition neural network may be a lightweight Convolutional Recurrent Neural Network (CRNN), obtained by applying lightweight processing to a CRNN; the lightweight processing may be implemented by using locality-sensitive hashing (LSH) encoding in the output layer of the CRNN.
Specifically, the output layer of a conventional CRNN uses One-Hot output encoding. Since Chinese-English character sets are generally large (e.g., 10,000 characters), the output vector dimension of the One-Hot encoding must match this size, so the classification matrix W of the last fully connected layer of the CRNN has a large number of parameters and the matrix multiplication is correspondingly expensive (see fig. 4), which ultimately affects the overall size and inference speed of the CRNN model. In the embodiment of the present application, therefore, a locality-sensitive hashing (LSH) output encoding replaces the One-Hot output encoding in the output layer of the CRNN. The LSH output encoding uses a maximum-margin loss during training; during inference, the output is binarized with a threshold, Hamming distances to the LSH codes of all characters are computed, and the closest character is the prediction result. Compared with the One-Hot example, the dimension of the classification matrix W with LSH encoding is reduced from (10000 × 128) to (256 × 128), a reduction of 97.4%, and the computation of the matrix multiplication is greatly reduced accordingly (compare fig. 5 with fig. 4).
In this embodiment, an efficient locality-sensitive hashing (LSH) output encoding replaces the traditional One-Hot output encoding, so the size of the character recognition model can be greatly reduced and the inference speed improved without loss of precision.
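The following sketch illustrates the LSH decoding arithmetic with the sizes quoted above (128-dimensional features, 256-bit codes, a 10,000-character set); the random charset codes stand in for codes that would be learned during training.

```python
import numpy as np

# A hedged sketch of LSH output decoding; all sizes are taken from the
# example in the description, and the codes here are random placeholders.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))              # classification matrix W
charset_codes = rng.integers(0, 2, (10000, 256), dtype=np.uint8)

def decode(feature: np.ndarray) -> int:
    logits = W @ feature                         # 256 outputs, not 10,000
    bits = (logits > 0).astype(np.uint8)         # binarize with a threshold
    # Hamming distance to every character's LSH code; nearest code wins.
    hamming = np.count_nonzero(charset_codes != bits, axis=1)
    return int(np.argmin(hamming))

# W shrinks from 10000*128 to 256*128 parameters, i.e. by 97.44%.
print(decode(rng.standard_normal(128)))
```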
Through this implementation, the CRNN can be made lightweight. A CRNN already offers high recognition accuracy and high speed, and the lightweight CRNN inherits these advantages; recognizing the character strings in the character region images with the lightweight CRNN therefore effectively improves the accuracy and speed of character recognition while lowering the performance requirements on the computing device.
On the basis of any of the foregoing embodiments, in some modified embodiments, the character strings recognized based on the lightweight character recognition neural network comprise a plurality of character strings;
the method then further comprises:
screening the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario.
It is easy to understand that the image to be recognized may contain a plurality of character strings, not all of which need to be extracted. In this embodiment, therefore, the target character string that meets the constraint conditions of the current application scenario can be obtained by screening the plurality of character strings, so as to meet the actual requirements of the application scenario.
In some variations of the foregoing embodiments, the screening of the plurality of character strings for a target character string that meets the constraint conditions of the current application scenario includes:
matching the plurality of character strings by at least one of regular expression matching, background information matching, and confidence matching to obtain the target character string that meets the constraint conditions of the current application scenario.
For example, in the application scenario in which the intelligent meal taking cabinet recognizes the meal taking code on a meal taking voucher, each character string recognized in step S103 is first matched against a regular expression; for instance, the first character of the code may be required to be an uppercase letter and the last 4 characters to be digits, and only the character strings meeting these conditions are kept as candidate character strings. The candidates are then screened further according to the constraint conditions: for example, only one valid character string (the meal taking code) meeting the requirements exists on a given meal taking voucher, so when several candidate character strings remain, they are screened according to their surrounding environment information (background information matching) and the confidence information of the character recognition results (confidence matching).
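A sketch of this screening step follows; the pattern of one uppercase letter followed by exactly four digits is an assumption built from the example above, not a disclosed meal taking code format.

```python
import re

# A hedged sketch of regular expression plus confidence matching; the
# MEAL_CODE pattern is assumed, not taken from the patent.
MEAL_CODE = re.compile(r"^[A-Z]\d{4}$")

def screen(candidates):
    # candidates: (string, confidence) pairs from the recognition network.
    matched = [(s, c) for s, c in candidates if MEAL_CODE.fullmatch(s)]
    if not matched:
        return None
    # Confidence matching: keep the most confident remaining candidate.
    return max(matched, key=lambda sc: sc[1])[0]

print(screen([("B1234", 0.97), ("Hello", 0.88), ("C007", 0.91)]))  # B1234
```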
In the above embodiment, screening and judging the candidate character strings with regular expressions and similar means can be completed very quickly and accurately, and the screening can be customized according to actual needs; in this way the target character string required by the actual application scenario, such as the meal taking code, is obtained.
The character recognition method provided by the embodiments of the present application converts the image to be recognized into a standard format image, determines the character region in the standard format image, and recognizes the character string in the character region based on a lightweight character recognition neural network. Converting the image to be recognized into a standard format image makes the images that the character recognition neural network must handle simpler and more uniform, and determining the character region first and feeding only it to the network effectively reduces the data processing load, lowers the performance requirements on the computing device, and reduces operating cost. The scheme thus relaxes the demands on the character recognition neural network: it no longer has to face and process complicated images of differing specifications, so a lightweight character recognition neural network can realize the character recognition. Overall, the complexity and content of the images to be processed by the neural network are simplified in advance, so the lightweight network realizes the character recognition function quickly and accurately with a fast response, and the requirements on the computing performance of the device drop sharply; character recognition can therefore run on mobile devices, embedded devices, and the like, which facilitates popularization and implementation and offers broad application prospects.
In addition, to meet the requirement of real-time character recognition, the character recognition method provided by any of the above embodiments of the present application not only adopts models that are as lightweight and fast as possible when the neural networks are designed, but also applies dedicated acceleration and compression processing to the models, including model quantization, pruning, and distillation. This further improves character recognition efficiency and accuracy, so the method can be better applied to devices with low computing capability, such as mobile and embedded devices; a quantization sketch follows.
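As one concrete example of the compression techniques named above, post-training dynamic quantization can shrink a model's linear layers to int8 weights. The sketch below uses PyTorch's built-in utility on a toy stand-in for the (undisclosed) recognition network.

```python
import torch
import torch.nn as nn

# A hedged sketch of post-training dynamic quantization; the toy model is
# a stand-in for the real recognition network, which is not disclosed.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 256))

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8   # store Linear weights as int8
)
print(quantized)  # int8 weights shrink these layers roughly fourfold
```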
The above embodiments provide a character recognition method; correspondingly, the present application also provides a character recognition apparatus. The character recognition apparatus provided by the embodiment of the present application can implement the character recognition method, and may be implemented by software, hardware, or a combination of the two. For example, the apparatus may comprise integrated or separate functional modules or units that perform the corresponding steps of the above methods. Please refer to fig. 6, which schematically illustrates a character recognition apparatus provided by some embodiments of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively simply; for relevant points, reference may be made to the description of the method embodiments. The apparatus embodiments described below are merely illustrative.
As shown in fig. 6, the character recognition apparatus 10, which may be applied to a server, includes:
a standard format conversion module 101, configured to convert an image to be recognized into a standard format image;
a character region determination module 102, configured to determine a character region in the standard format image;
and a character string extraction module 103, configured to recognize a character string in the character region based on a lightweight character recognition neural network.
In some variations of the embodiments of the present application, the standard format conversion module 101 includes:
and the standard format conversion unit is used for converting the image to be recognized into a standard format image which accords with a preset color mode and/or a preset image size.
In some variations of the embodiments of the present application, the character region determining module 102 includes:
and the character region determining unit is used for determining the character region in the standard format image by adopting a light-weight text detection neural network.
In some variations of embodiments of the present application, the lightweight text-detection neural network comprises:
and adopting a lightweight neural network MobileNet as a gradual scale expansion network PSENet of a backbone network.
In some variations of the embodiments of the present application, the character string extraction module 103 includes:
a character region image extraction unit, configured to extract a character region image containing the character region from the standard format image through an affine transformation;
and a character string extraction unit, configured to recognize the character string in the character region from the character region image based on the lightweight character recognition neural network.
In some variations of embodiments of the present application, the lightweight character recognition neural network comprises:
and (3) a lightweight convolutional recurrent neural network.
In some variations of the embodiments of the present application, the output layer of the lightweight convolutional recurrent neural network employs locality-sensitive hashing (LSH) encoding.
In some variations of the embodiments of the present application, the character strings recognized based on the lightweight character recognition neural network comprise a plurality of character strings;
the apparatus 10 then further comprises:
a target character string screening module, configured to screen the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario.
In some variations of the embodiments of the present application, the target string filtering module includes:
and the target character string screening unit is used for matching a plurality of character strings to obtain a target character string meeting the constraint condition of the current application scene by adopting at least one mode of regular expression matching, background information matching and confidence coefficient matching.
The character recognition device 10 provided in the embodiment of the present application and the character recognition method provided in the foregoing embodiment of the present application have the same inventive concept and the same beneficial effects, and are not described herein again.
The embodiment of the present application further provides an electronic device corresponding to the character recognition method provided in the foregoing embodiment, where the electronic device may be any device with data processing capability to execute the character recognition method.
Please refer to fig. 7, which schematically illustrates a schematic diagram of an electronic device according to some embodiments of the present application. As shown in fig. 7, the electronic device 20 includes: the system comprises a processor 200, a memory 201, a bus 202 and a communication interface 203, wherein the processor 200, the communication interface 203 and the memory 201 are connected through the bus 202; the memory 201 stores a computer program that can be executed on the processor 200, and the processor 200 executes the character recognition method provided in any of the foregoing embodiments when executing the computer program.
The Memory 201 may include a Random Access Memory (RAM) and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 203 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
Bus 202 can be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The memory 201 is used for storing a program, and the processor 200 executes the program after receiving an execution instruction, and the character recognition method disclosed in any of the foregoing embodiments of the present application may be applied to the processor 200, or implemented by the processor 200.
The processor 200 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 200. The processor 200 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by it. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be implemented directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, or a register. The storage medium is located in the memory 201, and the processor 200 reads the information in the memory 201 and completes the steps of the method in combination with its hardware.
The electronic device provided by the embodiment of the present application and the character recognition method provided by the foregoing embodiment of the present application have the same inventive concept, and have the same beneficial effects as the method adopted, operated, or implemented by the electronic device.
Referring to fig. 8, a computer-readable storage medium is shown as an optical disc 30, on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program performs the character recognition method provided in any of the foregoing embodiments.
It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
The computer-readable storage medium provided by the above-mentioned embodiments of the present application and the character recognition method provided by the foregoing embodiments of the present application have the same beneficial effects as the method adopted, executed or implemented by the application program stored in the computer-readable storage medium.
The embodiment of the present application further provides an intelligent meal taking cabinet corresponding to the character recognition method provided by the foregoing embodiments, comprising: a cabinet body, and an image acquisition device and a main control device arranged on the cabinet body, the image acquisition device being connected with the main control device. The image acquisition device is configured to capture an image of a meal taking voucher to generate an image to be recognized and send it to the main control device; the main control device is configured to recognize the character string from the image to be recognized using the character recognition method provided by any of the above embodiments, so as to determine the meal taking code recorded on the meal taking voucher, and to control the cabinet door at the designated meal taking position in the cabinet body to open according to the meal taking code.
The intelligent meal taking cabinet provided by the above embodiment of the application arises from the same inventive concept as the character recognition method provided by the foregoing embodiments and has the same beneficial effects.
It should be noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present disclosure and shall be construed as being covered by the claims and the specification of the present application.

Claims (13)

1. A character recognition method, comprising:
converting an image to be recognized into a standard format image;
determining a character region in the standard format image;
recognizing a character string in the character region based on a lightweight character recognition neural network.
2. The method of claim 1, wherein converting the image to be recognized into a standard format image comprises:
and converting the image to be recognized into a standard format image which accords with a preset color mode and/or a preset image size.
3. The method of claim 1, wherein determining the character regions in the standard format image comprises:
and determining a character area in the standard format image by adopting a lightweight text detection neural network.
4. The method of claim 3, wherein the lightweight text-detection neural network comprises:
and adopting a lightweight neural network MobileNet as a gradual scale expansion network PSENet of a backbone network.
5. The method of claim 1, wherein the identifying the character string in the character region based on a lightweight character recognition neural network comprises:
extracting a character region image containing the character region from the standard format image through an affine transformation;
recognizing the character string in the character region from the character region image based on the lightweight character recognition neural network.
6. The method of claim 5, wherein the lightweight character recognition neural network comprises:
and (3) a lightweight convolutional recurrent neural network.
7. The method of claim 6, wherein an output layer of the lightweight convolutional recurrent neural network employs locality-sensitive hashing (LSH) encoding.
8. The method of claim 1, wherein the character strings recognized based on the lightweight character recognition neural network comprise a plurality of character strings;
the method further comprising:
screening the plurality of character strings to obtain a target character string that meets the constraint conditions of the current application scenario.
9. The method according to claim 8, wherein the screening of the plurality of character strings for the target character string meeting the constraint conditions of the current application scenario comprises:
matching the plurality of character strings by at least one of regular expression matching, background information matching, and confidence matching to obtain the target character string that meets the constraint conditions of the current application scenario.
10. A character recognition apparatus, comprising:
a standard format conversion module, configured to convert an image to be recognized into a standard format image;
a character region determination module, configured to determine a character region in the standard format image;
and a character string extraction module, configured to recognize a character string in the character region based on a lightweight character recognition neural network.
11. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method according to any one of claims 1 to 9.
12. A computer-readable storage medium having computer-readable instructions stored thereon, the computer-readable instructions being executable by a processor to implement the method of any one of claims 1 to 9.
13. An intelligent meal taking cabinet, characterized by comprising: a cabinet body, and an image acquisition device and a main control device arranged on the cabinet body; wherein,
the image acquisition device is connected with the main control device;
the image acquisition device is configured to capture an image of a meal taking voucher to generate an image to be recognized and send the image to be recognized to the main control device;
and the main control device is configured to recognize the character string from the image to be recognized using the method of any one of claims 1 to 9, so as to determine the meal taking code recorded on the meal taking voucher, and to control the cabinet door at the designated meal taking position in the cabinet body to open according to the meal taking code.
CN202011540674.2A 2020-12-23 2020-12-23 Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium Pending CN112560845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011540674.2A CN112560845A (en) 2020-12-23 2020-12-23 Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011540674.2A CN112560845A (en) 2020-12-23 2020-12-23 Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112560845A true CN112560845A (en) 2021-03-26

Family

ID=75030913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011540674.2A Pending CN112560845A (en) 2020-12-23 2020-12-23 Character recognition method and device, intelligent meal taking cabinet, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112560845A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022237058A1 (en) * 2021-05-14 2022-11-17 苏州大学 Embedded object cognitive system
CN113343967A (en) * 2021-05-27 2021-09-03 山东师范大学 Optical character rapid identification method and system
CN113920527A (en) * 2021-10-13 2022-01-11 中国平安人寿保险股份有限公司 Text recognition method and device, computer equipment and storage medium
CN113920527B (en) * 2021-10-13 2024-09-17 中国平安人寿保险股份有限公司 Text recognition method, device, computer equipment and storage medium
CN114821601A (en) * 2022-04-14 2022-07-29 北京知云再起科技有限公司 End-to-end English handwritten text detection and recognition technology based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination