CN114663894A - Bill identification method and device based on target detection, electronic equipment and medium


Info

Publication number
CN114663894A
CN114663894A
Authority
CN
China
Prior art keywords
bill
target
identification
determining
image
Prior art date
Legal status
Pending
Application number
CN202210301866.0A
Other languages
Chinese (zh)
Inventor
师燕妮
韩茂琨
刘玉宇
肖京
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210301866.0A
Publication of CN114663894A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of artificial intelligence and provides a bill identification method and device based on target detection, electronic equipment and a medium. The method comprises the following steps: inputting a bill image into a pre-trained target detection model; determining, from the bill image, the bill area of the target bill and the identification area of a reference identification; determining the offset angle of the target bill in the bill image according to the bill center coordinate of the bill area and the identification center coordinate of the identification area; determining target adjustment information according to the offset angle and a reference angle, and adjusting the bill image according to the target adjustment information to obtain a target image; and performing OCR processing on the target image to obtain the bill information. With this technical scheme, the target bill and the reference identification area are determined by the target detection model, and the target bill is rotated to the upright position according to the target adjustment information, so that the bill information can be correctly extracted by Optical Character Recognition (OCR) processing, effectively improving the success rate and accuracy of bill identification.

Description

Bill identification method and device based on target detection, electronic equipment and medium
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a bill identification method and device based on target detection, electronic equipment and a medium.
Background
With the development of electronic office work, more and more paper bills need to be entered into electronic systems. Manual entry can no longer keep up with demand, so the direction of automated systems is to extract bill information from bill images using Optical Character Recognition (OCR). Bill images are usually captured by scanning or photographing, and each type of bill has a specific layout, so the positions from which bill information is recognized are fixed. If the bill is not oriented correctly when photographed or scanned, the areas corresponding to those recognition positions will not contain the expected bill information, which reduces the accuracy of bill identification.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiments of the invention provide a bill identification method and device based on target detection, electronic equipment and a medium, which can correct the orientation of the target bill within the bill image and thereby ensure the accuracy of bill identification.
In a first aspect, an embodiment of the present invention provides a method for identifying a bill based on object detection, including:
acquiring a bill image to be identified, and inputting the bill image into a pre-trained target detection model;
determining a bill area of a target bill and an identification area of a reference identification of the target bill from the bill image through the target detection model;
determining the bill center coordinate of the bill area and the identification center coordinate of the identification area, and determining the offset angle of the target bill in the bill image according to the bill center coordinate and the identification center coordinate;
determining target adjustment information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjustment information to obtain a target image;
and performing OCR recognition processing on the target image to obtain the bill information of the target bill.
In some embodiments, the determining, by the object detection model, a document region of an object document and an identification region of a reference identification of the object document from the document image includes:
determining a plurality of bill candidate regions and a plurality of identification candidate regions from the bill image through the object detection model;
and when the intersection ratio of the bill candidate area and the identification candidate area is 1, determining the corresponding bill candidate area as the bill area of the target bill, and determining the corresponding identification candidate area as the identification area of the reference identification.
In some embodiments, prior to said determining a plurality of document candidate regions and a plurality of identification candidate regions from said document image by said object detection model, said method further comprises:
determining a bill type of the target bill from the bill image;
and determining the reference identifier of the target bill according to the bill type.
In some embodiments, the offset angle of the target bill in the bill image is determined from the bill center coordinates and the identification center coordinates by the following formula:

Angle_offset = arctan2(y_ref - y_tar, x_ref - x_tar)

where Angle_offset is the offset angle, Center(x, y)_ref = (x_ref, y_ref) is the identification center coordinate, and Center(x, y)_tar = (x_tar, y_tar) is the bill center coordinate.
In some embodiments, the determining target adjustment information according to the offset angle and a preset reference angle includes:
determining a target angle difference between the offset angle and the reference angle, and determining a target rotation angle and a target rotation direction according to the target angle difference;
and determining the target rotation angle and the target rotation direction as the target adjustment information.
In some embodiments, there are at least two reference angles, and the determining a target angle difference between the offset angle and the reference angle and determining a target rotation angle and a target rotation direction according to the target angle difference includes:
determining a plurality of candidate angle differences according to the offset angle and the plurality of reference angles;
determining the candidate angle difference with the smallest absolute value as the target angle difference;
and when the target angle difference is a positive number, determining that the target rotation direction is clockwise, and when the target angle difference is a negative number, determining that the target rotation direction is anticlockwise.
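A minimal sketch of this selection scheme, assuming four reference angles at 0, 90, 180 and 270 degrees and normalizing each candidate difference into (-180, 180] (both the reference-angle set and the normalization are illustrative assumptions, not prescribed by the source):

```python
# Illustrative reference angles; a forward-facing bill would match one of these.
REFERENCE_ANGLES = (0.0, 90.0, 180.0, 270.0)

def target_adjustment(offset_angle, reference_angles=REFERENCE_ANGLES):
    """Return (rotation angle, rotation direction) for correcting a bill.

    Each candidate angle difference is normalized into (-180, 180] so that,
    for example, an offset of 350 degrees is treated as -10 degrees relative
    to the 0-degree reference. The candidate with the smallest absolute
    value is the target angle difference; a positive difference means
    clockwise rotation, a negative one counterclockwise (a zero difference
    needs no rotation, so the direction is then irrelevant).
    """
    candidates = [((offset_angle - ref + 180.0) % 360.0) - 180.0
                  for ref in reference_angles]
    diff = min(candidates, key=abs)
    direction = "clockwise" if diff > 0 else "counterclockwise"
    return abs(diff), direction
```

For example, `target_adjustment(95.0)` yields a 5-degree clockwise rotation, while `target_adjustment(350.0)` yields a 10-degree counterclockwise rotation.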
In some embodiments, the target detection model is trained by:
acquiring a plurality of sample bill images, wherein the sample bill images are marked with marking information in advance, and the marking information comprises a marking bill area and a marking identification area;
inputting the sample bill images into the target detection model, and training the target detection model to be convergent.
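A hypothetical annotation record for one such training sample is sketched below; the field names and values are illustrative only (the source does not prescribe an annotation format), together with a sanity check that the labeled identification area lies inside the labeled bill area, the containment the model later relies on:

```python
# Hypothetical annotation for one sample bill image; field names and
# values are illustrative assumptions, not mandated by the source.
sample_annotation = {
    "image_path": "samples/bill_0001.jpg",
    "bill_type": "vat_invoice",
    "labeled_bill_region": [12, 30, 980, 620],             # x1, y1, x2, y2
    "labeled_identification_region": [420, 40, 560, 150],  # x1, y1, x2, y2
}

def is_valid(annotation):
    """Check that the labeled identification region lies entirely
    inside the labeled bill region."""
    bx1, by1, bx2, by2 = annotation["labeled_bill_region"]
    ix1, iy1, ix2, iy2 = annotation["labeled_identification_region"]
    return bx1 <= ix1 and by1 <= iy1 and ix2 <= bx2 and iy2 <= by2
```

Running such a check over the sample set before training helps catch annotation mistakes that would otherwise break the containment assumption used when matching regions later.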
In a second aspect, an embodiment of the present invention provides a bill recognition apparatus based on object detection, including:
the image acquisition unit is used for acquiring a bill image to be identified and inputting the bill image into a pre-trained target detection model;
the detection unit is used for determining a bill area of a target bill and an identification area of a reference identification of the target bill from the bill image through the target detection model;
the positioning unit is used for determining the central coordinates of the bills in the bill area and the identification central coordinates of the identification area, and determining the offset angle of the target bill in the bill image according to the central coordinates of the bills and the identification central coordinates;
the image adjusting unit is used for determining target adjusting information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjusting information to obtain a target image;
and the recognition unit is used for carrying out OCR recognition processing on the target image to obtain the bill information of the target bill.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the computer program implementing the object detection based ticket identification method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program for executing the method for identifying a bill based on object detection according to the first aspect.
In summary, the embodiment of the invention comprises the following steps: acquiring a bill image to be identified and inputting it into a pre-trained target detection model; determining, from the bill image through the target detection model, the bill area of the target bill and the identification area of the reference identification of the target bill; determining the bill center coordinate of the bill area and the identification center coordinate of the identification area, and determining the offset angle of the target bill in the bill image from these two coordinates; determining target adjustment information according to the offset angle and a preset reference angle, and adjusting the bill image accordingly to obtain a target image; and performing OCR processing on the target image to obtain the bill information of the target bill. With this technical scheme, the target bill and the reference identification area are determined by the target detection model, and the target bill is rotated to the upright position according to the target adjustment information, so that the bill information can be correctly obtained by OCR processing, effectively improving the success rate and accuracy of bill identification.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification; they illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it.
FIG. 1 is a flow chart of a method for object detection based document identification provided by one embodiment of the present invention;
FIG. 2 is a schematic illustration of a ticket image provided in accordance with another embodiment of the present invention;
FIG. 3 is a flow chart of determining a ticket region and an identification region provided by another embodiment of the present invention;
FIG. 4 is a flow chart of determining a reference identifier provided by another embodiment of the present invention;
FIG. 5 is a flow chart of determining target adjustment information provided by another embodiment of the present invention;
FIG. 6 is a schematic diagram of determining target adjustment information from a plurality of reference angles according to another embodiment of the present invention;
FIG. 7 is a flow diagram of training a target detection model according to another embodiment of the invention;
FIG. 8 is a block diagram of a document identification device based on object detection according to another embodiment of the present invention;
fig. 9 is a device diagram of an electronic apparatus according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms "first," "second," and the like in the description, in the claims, or in the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The embodiments of the application may acquire and process the related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction devices, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, computational complexity theory and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from instruction.
Computer Vision (CV) technology is the science of studying how to make machines "see": using cameras and computers, instead of human eyes, to identify, track and measure targets, and further performing graphics processing so that the processed image is better suited for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies theories and techniques for building artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technology generally includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, automatic driving and intelligent transportation, as well as common biometric technologies such as face recognition and fingerprint recognition.
It should be noted that the data in the embodiments of the present invention may be stored in a server, and the server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, web service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data, and an artificial intelligence platform.
OCR refers to the process in which an electronic device (e.g., a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of darkness and lightness, and then translates those shapes into computer text using character recognition methods.
As shown in Fig. 1, which is a flowchart of a bill identification method based on target detection according to an embodiment of the present invention, the method includes, but is not limited to, the following steps:
step S110, acquiring a bill image to be identified, and inputting the bill image into a pre-trained target detection model;
step S120, determining a bill area of the target bill and an identification area of a reference identification of the target bill from the bill image through the target detection model;
step S130, determining the central coordinates of the bills in the bill area and the identification central coordinates of the identification area, and determining the offset angle of the target bill in the bill image according to the central coordinates of the bills and the identification central coordinates;
step S140, determining target adjustment information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjustment information to obtain a target image;
and S150, performing OCR recognition processing on the target image to obtain the bill information of the target bill.
It should be noted that a bill image may be captured by photographing or scanning. Taking photographing as an example, it is difficult to ensure that the photographing frame is perfectly aligned with the bill; even with a fixed-position high-speed document camera, the operator may place the bill on the photographing area at an offset. Fig. 2 shows an example of such an offset: in the bill image 200, the target bill 211 forms a certain offset angle with the frame of the bill image 200. If OCR recognition were performed at this point using the preset layout of the target bill, recognition errors would be likely because the regions do not match or the text is inclined, which affects the accuracy of bill identification. To solve this problem, the offset of the target bill must be corrected so that regular OCR recognition can be performed on the bill image.
It should also be noted that, because the target bill has a certain offset in the bill image, not all of the bill image belongs to the target bill. In this embodiment, the bill area of the target bill is detected by the target detection model; for example, in the bill image 200 shown in Fig. 2, the bill area 212 of the target bill 211 can be determined by the target detection model, and subsequent processing is performed on the bill area 212, which effectively eliminates interference from the background image with bill identification.
The target detection itself may be implemented by a deep learning algorithm, for example by a two-stage detection algorithm such as the Region-based Convolutional Neural Network (R-CNN) or Faster R-CNN, or by a one-stage detection algorithm such as You Only Look Once (YOLO) or the Single Shot MultiBox Detector (SSD). The specific algorithm may be selected according to actual requirements, which this embodiment does not limit.
The reference identification may be determined according to the bill type. For example, most bills carry a stamp, and the position of the stamp is relatively fixed; in Fig. 2 the reference identification 231 is located in the upper middle of the target bill 211. During training of the target detection model, the stamp may be designated in advance as the reference identification and the corresponding region annotated in the sample images, so that after the target detection model receives the bill image 200 shown in Fig. 2, it can perform a second round of target detection within the bill area 212 and determine the identification area 232 of the stamp 231.
In this embodiment, after the identification area and the bill area are determined, the identification center coordinate is determined from the identification area and the bill center coordinate from the bill area. Because the position of the reference identification is usually fixed, the angle between the line segment connecting the identification center and the bill center and a vertical or horizontal line of the bill image is positively correlated with the offset angle of the target bill. For example, in Fig. 2, the center reference line 221 between the identification center coordinate and the bill center coordinate forms an angle α with the horizontal reference line 222 parallel to the upper and lower edges of the bill image 200; when the target bill is upright, α is 90 degrees. Therefore, after the offset angle is determined, the angle and direction by which the target bill must be rotated to correct it, i.e. the target adjustment information, can be determined using 90 degrees as the reference angle. With this technical scheme, wherever the target bill is placed within the bill image, the target adjustment information can be determined from the offset angle, the target bill can be rotated upright, and the bill information can be recognized from the bill image by conventional OCR.
The manner of determining the offset angle may be chosen according to actual requirements. In the example of Fig. 2, the offset angle is measured between the horizontal reference line 222 and the center reference line 221, and the reference angle is 90 degrees; if the calculated offset angle is 95 degrees, the required adjustment is therefore 5 degrees. The specific calculation depends on the choice of reference angle and reference line and is not limited here. The target adjustment information may include a rotation angle and a rotation direction, for example counterclockwise rotation by 5 degrees or clockwise rotation by 25 degrees; the specific form of the target adjustment information is likewise not limited here.
After the target image is obtained according to the target adjustment information, the target bill in the target image has been rotated upright, and the bill information can then be obtained by performing bill recognition on the target bill with conventional OCR technology; the specific recognition process is not an improvement made by this embodiment and is not described further here.
In addition, in an embodiment, referring to fig. 3, step S120 of the embodiment shown in fig. 1 further includes, but is not limited to, the following steps:
step S310, determining a plurality of bill candidate areas and a plurality of identification candidate areas from the bill image through the target detection model;
step S320, when the intersection ratio of the ticket candidate area and the identification candidate area is 1, determining the corresponding ticket candidate area as the ticket area of the target ticket, and determining the corresponding identification candidate area as the identification area of the reference identification.
It should be noted that, to improve the efficiency of bill recognition, multiple bills may be input in a single bill image, for example arranged in sequence, and the recognition system performs bill recognition on each bill to obtain multiple sets of bill information. To correct the angle of each bill, the technical scheme of this embodiment can be applied to each bill independently, i.e. the offset angle of each bill is determined separately. Because determining the offset angle depends on the fixed relative position of the reference identification and the target bill, the different target bills must first be distinguished within the bill image.
It should be noted that, for the target detection model, a plurality of bill candidate regions and a plurality of identification candidate regions can be quickly detected from the bill image, and on this basis, different target bills can be distinguished only by determining the identification candidate region associated with each bill candidate region.
It should be noted that Intersection over Union (IoU) is a concept commonly used in target detection: the overlap ratio between a generated candidate box and an original annotated box. In this embodiment the overlap is measured relative to the candidate box's own area, so that the ratio equals 1 exactly when the candidate box lies entirely within the annotated box. The ratio is calculated by the following formula:

Ratio = Area(Obj_bill ∩ Obj_ref) / Area(Obj_ref)

where Obj_bill is the bill candidate region and Obj_ref is the identification candidate region. Since the reference identification of this embodiment is located within the target bill, the identification candidate region serves as the candidate box and the bill candidate region as the annotated box; when the ratio between an identification candidate region and a bill candidate region is 1, the identification candidate region lies inside that bill candidate region, and the two belong to the same target bill. The positional relationship between the bill candidate region and the identification candidate region may also be determined in other ways, for example by checking that all pixel coordinates of the identification candidate region fall within the bill candidate region; a person skilled in the art may choose a specific method according to the actual situation, as long as it determines that the two regions belong to the same target bill.
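A minimal sketch of this containment test and the resulting region matching, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples:

```python
def containment_ratio(id_box, bill_box):
    """Overlap of the identification candidate box with a bill candidate
    box, measured relative to the identification box's own area; the
    ratio is 1.0 exactly when the identification box lies entirely
    inside the bill box."""
    ix1 = max(id_box[0], bill_box[0])
    iy1 = max(id_box[1], bill_box[1])
    ix2 = min(id_box[2], bill_box[2])
    iy2 = min(id_box[3], bill_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    id_area = (id_box[2] - id_box[0]) * (id_box[3] - id_box[1])
    return inter / id_area if id_area else 0.0

def match_regions(id_boxes, bill_boxes):
    """Pair each identification candidate with the bill candidate that
    fully contains it (ratio == 1), i.e. the two regions that belong
    to the same target bill."""
    pairs = []
    for id_box in id_boxes:
        for bill_box in bill_boxes:
            if containment_ratio(id_box, bill_box) == 1.0:
                pairs.append((id_box, bill_box))
                break
    return pairs
```

Because the ratio is taken over the identification box's area rather than the union, it behaves like the containment condition the text describes instead of penalizing a large bill region for enclosing a small stamp.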
When multiple target bills are determined from the bill image according to this overlap ratio, the operations of the embodiment shown in Fig. 1 may be performed in turn for each target bill so as to recognize the bill information of each one; this is not repeated in this embodiment.
In addition, in an embodiment, referring to fig. 4, before performing step S310 of the embodiment shown in fig. 3, the following steps are further included, but not limited to:
step S410, determining the bill type of the target bill from the bill image;
and step S420, determining the reference identifier of the target bill according to the bill type.
It should be noted that different bills have different reference marks; for example, a common invoice includes not only a stamp but also a two-dimensional code at a fixed position. A person skilled in the art may select different reference marks according to actual needs and predetermine the position of the reference mark, so that the target detection model can learn the shape of the reference mark from that position during training.
It should be noted that, in order to enable the target detection model to detect the identification area of the reference identifier from the bill image, the identification area and the reference identifier need to be marked in the training samples, so the reference identifiers corresponding to different bill types need to be determined before training. For example, two kinds of marks, bill marks and reference identifier marks, may be predefined, and the bill type of each sample determined, so that the target detection model can learn the reference identifier corresponding to each bill type during training; the specific marking process is not limited herein.
It should be noted that, in order to determine the reference identifier according to the bill type of the target bill, the bill type may be input together with the bill image, so that the target detection model can determine the bill type directly from the input information. Certainly, the target detection model may also simply perform image recognition on the bill image and determine the bill type from the typesetting manner of the bill or from keywords; a specific manner may be selected according to actual requirements, which is not limited herein.
In addition, in an embodiment, referring to fig. 4, step S130 of the embodiment shown in fig. 1 is obtained by the following formula:
Angle_offset = arctan((y_ref - y_tar) / (x_ref - x_tar))
wherein Angle_offset is the offset angle, Center(x, y)_ref = (x_ref, y_ref) is the identification center coordinate, and Center(x, y)_tar = (x_tar, y_tar) is the bill center coordinate.
It should be noted that, after the identification center coordinate and the bill center coordinate are determined, a line segment can be obtained from the two points; on this basis, a right triangle can be constructed with sides parallel or perpendicular to the edges of the bill image, so that the offset angle can be calculated according to the above trigonometric function.
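The trigonometric construction above amounts to a quadrant-aware arctangent of the segment between the two centers. A minimal sketch, assuming simple Cartesian pixel coordinates; the function and variable names are illustrative, not from the patent:

```python
import math

def offset_angle(mark_center, bill_center):
    """Offset angle, in degrees, of the line segment from the bill
    center to the identification center, measured against the horizontal
    edge of the image. atan2 handles vertical segments without a
    division by zero, unlike a plain arctan of the slope."""
    dx = mark_center[0] - bill_center[0]
    dy = mark_center[1] - bill_center[1]
    return math.degrees(math.atan2(dy, dx))

# Identification center directly above the bill center -> 90 degrees.
assert abs(offset_angle((50, 100), (50, 0)) - 90.0) < 1e-9
```

Note that in many image libraries the y axis points downward; the sign convention would then need to be flipped to match the patent's clockwise/counterclockwise discussion.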
In addition, in an embodiment, referring to fig. 5, step S140 of the embodiment shown in fig. 1 further includes, but is not limited to, the following steps:
step S510, determining a target angle difference between the offset angle and the reference angle, and determining a target rotation angle and a target rotation direction according to the target angle difference;
in step S520, the target rotation angle and the target rotation direction are determined as target adjustment information.
Note that the offset angle represents the inclination angle of the target bill in the bill image and cannot be used directly as the adjustment angle. Since bill identification requires adjusting the target bill to a reference position, for example with the frame of the bill area parallel to the frame of the bill image, the adjustment must be made with respect to the reference angle corresponding to that reference position once the offset angle is obtained. For example, taking 90 degrees as the reference angle, if the offset angle calculated in the above embodiment is 95 degrees, the target angle difference is 5 degrees and the rotation direction is clockwise; on this basis, rotating the bill image clockwise by 5 degrees yields a bill image in the positive direction.
It should be noted that the target rotation direction may be clockwise or counterclockwise, and the specific direction may be selected according to actual requirements. For example, to simplify the flow of determining the direction, the clockwise direction may be preset as the target rotation direction; alternatively, to improve rotation efficiency, the direction with the smaller rotation angle may be taken as the target rotation direction, which is not limited in this embodiment.
In addition, in an embodiment, the number of the reference angles is at least two, and referring to fig. 6, step S520 of the embodiment shown in fig. 5 further includes, but is not limited to, the following steps:
step S610, determining a plurality of candidate angle differences according to the offset angle and a plurality of reference angles;
step S620, determining the candidate angle difference with the minimum absolute value as a target angle difference;
in step S630, when the target angle difference is positive, the target rotation direction is determined to be clockwise, and when the target angle difference is negative, the target rotation direction is determined to be counterclockwise.
It should be noted that rotating the bill image to adjust the target bill to the positive direction essentially means aligning a frame of the target bill with a frame of the bill image in some corresponding manner, for example making the upper frame of the target bill parallel to the upper, lower, left, or right frame of the bill image. Once alignment in any one manner is achieved, the direction of the target bill can be corrected by adjusting the bill image. For example, if the upper frame of the target bill is parallel and close to the lower frame of the bill image, the target bill is upside down in the bill image; rotating the bill image by 180 degrees then places the target bill in the positive direction.
It should be noted that, on the above basis, a plurality of reference angles may be preset and a plurality of candidate angle differences determined. To reduce the influence of image rotation on image sharpness, this embodiment selects the candidate angle difference with the smallest absolute value as the target angle difference, so that the rotation uses the smallest angle, and the rotation direction is determined by the sign of the target angle difference: for example, a target angle difference of 5 degrees means rotating 5 degrees clockwise, and a target angle difference of -5 degrees means rotating 5 degrees counterclockwise.
To better illustrate the technical solution of this embodiment, a specific example is provided below. Let the offset angle be 95 degrees and the set of reference angles be [-90, 0, 90, 180]. The candidate angle differences are calculated according to the following formula:
Angle_offset(1×1) · α(1×4) − Angles_ref(1×4)
wherein Angle_offset(1×1) is the offset angle, Angles_ref(1×4) is the matrix of reference angles [-90, 0, 90, 180], and α(1×4) is the one-row, four-column all-ones matrix [1, 1, 1, 1]. The resulting set of candidate angle differences is [185, 95, 5, -85]. The candidate angle difference with the smallest absolute value, 5, is selected, so the target angle difference is 5 degrees, and the target bill can be corrected by rotating clockwise by 5 degrees.
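The worked example above can be verified directly. A minimal sketch, assuming the reference-angle set from the example; the function name and return format are illustrative, not from the patent:

```python
def target_rotation(offset_angle, reference_angles=(-90, 0, 90, 180)):
    """Compute candidate angle differences against each reference angle,
    pick the one with the smallest absolute value as the target angle
    difference, and derive the rotation direction from its sign."""
    diffs = [offset_angle - ref for ref in reference_angles]
    target = min(diffs, key=abs)
    direction = "clockwise" if target >= 0 else "counterclockwise"
    return abs(target), direction

# Offset angle 95 -> candidate differences [185, 95, 5, -85];
# the smallest absolute value is 5, so rotate 5 degrees clockwise.
assert target_rotation(95) == (5, "clockwise")
```

An offset angle of 85 degrees would instead give a candidate difference of -5 against the 90-degree reference, hence a 5-degree counterclockwise rotation.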
Additionally, in one embodiment, referring to FIG. 7, the training steps of the target detection model include, but are not limited to:
step S710, acquiring a plurality of sample bill images, wherein the sample bill images are marked with marking information in advance, and the marking information comprises a marking bill area and a marking identification area;
step S720, inputting the multiple sample bill images into the target detection model, and training the target detection model to be convergent.
It should be noted that the sample bill images may be labeled in the form of a COCO data set: the reference identifier and the region of the target bill are labeled in each sample bill image to obtain the labeled bill region and the labeled identification region, and the center point, width, and height of each region are labeled respectively, so that the target detection model can be trained from this labeling information. The trained target detection model can then return the bill region, the reference identifier region, and their center point coordinates from an input image.
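A COCO-style labeling of one sample bill image might look as follows. The file name, ids, and the "stamp" category are illustrative assumptions; bbox follows the COCO [x, y, width, height] convention, so the region center needed for the offset-angle step is directly recoverable:

```python
# One entry per labeled region: bill regions and reference-identifier
# regions are distinguished by category_id. Field names follow the COCO
# convention; the concrete values are hypothetical.
sample_annotation = {
    "images": [{"id": 1, "file_name": "sample_bill.jpg",
                "width": 1280, "height": 960}],
    "categories": [{"id": 1, "name": "bill"}, {"id": 2, "name": "stamp"}],
    "annotations": [
        # bbox is [x, y, width, height]; the center point used by the
        # offset-angle step is (x + w / 2, y + h / 2).
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 80, 900, 500]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [750, 120, 120, 120]},
    ],
}

bx, by, bw, bh = sample_annotation["annotations"][1]["bbox"]
assert (bx + bw / 2, by + bh / 2) == (810.0, 180.0)
```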
It should be noted that, in order to verify whether the target detection model has been trained to convergence, an unannotated test image may be input to the target detection model; when the output of the target detection model matches the actual bill region and identification region, or the difference satisfies a preset threshold, it can be determined that the target detection model has been trained to convergence.
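The convergence check described above can be sketched as a per-coordinate comparison against a preset threshold. The tolerance value and box format are illustrative assumptions, not from the patent:

```python
def converged(pred_boxes, true_boxes, tol=2.0):
    """Declare the model converged when every predicted box coordinate is
    within tol pixels of the ground truth -- the 'difference value
    satisfies a preset threshold' criterion. Boxes are assumed to be
    (x1, y1, x2, y2) tuples in matching order."""
    return all(
        all(abs(p - t) <= tol for p, t in zip(pb, tb))
        for pb, tb in zip(pred_boxes, true_boxes)
    )

# Predictions within 2 pixels of ground truth count as converged.
assert converged([(10, 10, 50, 50)], [(11, 9, 50, 51)])
assert not converged([(10, 10, 50, 50)], [(30, 10, 50, 50)])
```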
In addition, referring to fig. 8, an embodiment of the present invention provides a target detection-based ticket recognition apparatus, where the target detection-based ticket recognition apparatus 800 includes:
the image acquisition unit 810 is configured to acquire a bill image to be identified, and input the bill image into a pre-trained target detection model;
a detecting unit 820, configured to determine a bill region of the target bill and an identification region of a reference identifier of the target bill from the bill image through the target detection model;
the positioning unit 830 is configured to determine a bill center coordinate of the bill area and an identification center coordinate of the identification area, and determine an offset angle of the target bill in the bill image according to the bill center coordinate and the identification center coordinate;
the image adjusting unit 840 is used for determining target adjusting information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjusting information to obtain a target image;
and the recognition unit 850 is used for performing OCR recognition processing on the target image to obtain the bill information of the target bill.
In addition, referring to fig. 9, an embodiment of the present invention also provides an electronic device, where the electronic device 900 includes: memory 910, processor 920, and computer programs stored on memory 910 and operable on processor 920.
The processor 920 and the memory 910 may be connected by a bus or other means.
Non-transitory software programs and instructions required to implement the target detection-based ticket recognition method of the above-described embodiment are stored in the memory 910, and when executed by the processor 920, perform the target detection-based ticket recognition method of the above-described embodiment, for example, perform the above-described method steps S110 to S150 in fig. 1, method steps S310 to S320 in fig. 3, method steps S410 to S420 in fig. 4, method steps S510 to S520 in fig. 5, method steps S610 to S630 in fig. 6, and method steps S710 to S720 in fig. 7.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor or a controller, for example, by a processor in the above-mentioned embodiment of the electronic device, so as to enable the processor to execute the bill identification method based on object detection in the above-mentioned embodiment, for example, to execute the above-mentioned method steps S110 to S150 in fig. 1, method steps S310 to S320 in fig. 3, method steps S410 to S420 in fig. 4, method steps S510 to S520 in fig. 5, method steps S610 to S630 in fig. 6, and method steps S710 to S720 in fig. 7. It will be understood by those of ordinary skill in the art that all or some of the steps, means, and methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable storage media, which may include computer storage media (or non-transitory storage media) and communication storage media (or transitory storage media). The term computer storage media includes volatile and nonvolatile, removable and non-removable storage media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. 
Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other storage medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication storage media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery storage media as is well known to those of ordinary skill in the art.
The embodiments are operational with numerous general purpose or special purpose computing device environments or configurations. For example: personal computers, server computers, hand-held or portable electronic devices, tablet electronic devices, multiprocessor apparatus, microprocessor-based apparatus, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above devices or electronic devices, and the like. The application may be described in the general context of computer programs, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing electronic devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage electronic devices.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, a segment, or a portion of code, which comprises one or more programs for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based apparatus that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
It should be noted that although in the above detailed description several modules or units of the electronic device for action execution are mentioned, this division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, and may also be implemented by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing electronic device (which can be a personal computer, a server, a touch terminal, or a network electronic device, etc.) to execute the method according to the embodiments of the present application.
The electronic device of the present embodiment may include: Radio Frequency (RF) circuit, memory, input unit, display unit, sensor, audio circuit, wireless fidelity (WiFi) module, processor, and power supply. The RF circuit can be used for receiving and transmitting signals in the process of information receiving and transmitting or conversation; particularly, downlink information of the base station is received and then processed by the processor, and in addition, uplink data is transmitted to the base station. Typically, the RF circuitry includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like. The memory may be used to store software programs and modules, and the processor may execute various functional applications and data processing of the electronic device by operating the software programs and modules stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, and the like.
Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. The input unit may be used to receive input numeric or character information and generate key signal inputs related to settings and function control of the electronic device. Specifically, the input unit may include a touch panel and other input devices. The touch panel, also called a touch screen, may collect touch operations thereon or nearby (such as operations on or near the touch panel using any suitable object or accessory, such as a finger, a stylus, etc.) and drive the corresponding connection device according to a preset program. Alternatively, the touch panel may include two parts, a touch detection device and a touch controller. The touch detection device detects a touch direction, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor, and can receive and execute commands sent by the processor. In addition, the touch panel may be implemented by various types such as resistive, capacitive, infrared, and surface acoustic wave. The input unit may include other input devices in addition to the touch panel. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like. The display unit may be used to display input information or provided information and various menus of the electronic device. 
The Display unit may include a Display panel, and optionally, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel may overlay the display panel, and when the touch panel detects a touch operation thereon or thereabout, the touch panel may transmit the touch operation to the processor to determine a category of the touch event, and then the processor may provide a corresponding visual output on the display panel according to the category of the touch event. The touch panel and the display panel are two separate components to implement the input and output functions of the electronic device, but in some embodiments, the touch panel and the display panel may be integrated to implement the input and output functions of the electronic device. The electronic device may also include at least one sensor, such as a light sensor, a motion sensor, and other sensors. In particular, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or the backlight when the electronic device is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of the electronic device, vibration recognition related functions (such as pedometer, tapping) and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which may be further configured to the electronic device, detailed descriptions thereof are omitted. 
The audio circuit, speaker, microphone may provide an audio interface. The audio circuit can transmit the electric signal converted from the received audio data to the loudspeaker, and the electric signal is converted into a sound signal by the loudspeaker to be output; on the other hand, the microphone converts the collected sound signal into an electrical signal, which is received by the audio circuit and converted into audio data, which is then output to the processor for processing, and then transmitted to, for example, another electronic device via the RF circuit, or the audio data is output to the memory for further processing.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
While the preferred embodiments of the present invention have been described in detail, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.

Claims (10)

1. A bill identification method based on target detection is characterized by comprising the following steps:
acquiring a bill image to be identified, and inputting the bill image into a pre-trained target detection model;
determining a bill area of a target bill and an identification area of a reference identification of the target bill from the bill image through the target detection model;
determining the bill center coordinate of the bill area and the identification center coordinate of the identification area, and determining the offset angle of the target bill in the bill image according to the bill center coordinate and the identification center coordinate;
determining target adjustment information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjustment information to obtain a target image;
and performing OCR recognition processing on the target image to obtain the bill information of the target bill.
2. The method for bill identification based on object detection as claimed in claim 1, wherein the determining of the bill area of the object bill and the identification area of the reference identification of the object bill from the bill image by the object detection model comprises:
determining a plurality of bill candidate regions and a plurality of identification candidate regions from the bill image through the object detection model;
and when the intersection ratio of the bill candidate area and the identification candidate area is 1, determining the corresponding bill candidate area as the bill area of the target bill, and determining the corresponding identification candidate area as the identification area of the reference identification.
3. The method of claim 2, wherein prior to said determining a plurality of document candidate regions and a plurality of identification candidate regions from said document image by said object detection model, said method further comprises:
determining a bill type of the target bill from the bill image;
and determining the reference identifier of the target bill according to the bill type.
4. The method for bill identification based on target detection as claimed in claim 1 wherein, the determining the offset angle of the target bill in the bill image according to the bill center coordinate and the identification center coordinate is obtained by the following formula:
Angle_offset = arctan((y_ref - y_tar) / (x_ref - x_tar))
wherein Angle_offset is the offset angle, Center(x, y)_ref = (x_ref, y_ref) is the identification center coordinate, and Center(x, y)_tar = (x_tar, y_tar) is the bill center coordinate.
5. The bill identifying method based on target detection as claimed in claim 1, wherein the determining target adjustment information according to the offset angle and a preset reference angle comprises:
determining a target angle difference between the offset angle and the reference angle, and determining a target rotation angle and a target rotation direction according to the target angle difference;
and determining the target rotation angle and the target rotation direction as the target adjustment information.
6. The method of claim 5, wherein the number of the reference angles is at least two, and the determining the angular difference between the offset angle and the reference angle and the determining the target rotation angle and the target rotation direction according to the angular difference comprises:
determining a plurality of candidate angle differences according to the offset angle and the plurality of reference angles;
determining the candidate angle difference with the smallest absolute value as the target angle difference;
and when the target angle difference is a positive number, determining that the target rotation direction is clockwise, and when the target angle difference is a negative number, determining that the target rotation direction is anticlockwise.
7. The bill identification method based on target detection as claimed in claim 1, wherein the target detection model is trained by the following method:
acquiring a plurality of sample bill images, wherein the sample bill images are marked with marking information in advance, and the marking information comprises a marking bill area and a marking identification area;
and inputting a plurality of sample bill images to the target detection model, and training the target detection model to be convergent.
8. A bill identifying apparatus based on object detection, comprising:
the image acquisition unit is used for acquiring a bill image to be identified and inputting the bill image into a pre-trained target detection model;
the detection unit is used for determining a bill area of a target bill and an identification area of a reference identification of the target bill from the bill image through the target detection model;
the positioning unit is used for determining the central coordinates of the bills in the bill area and the identification central coordinates of the identification area, and determining the offset angle of the target bill in the bill image according to the central coordinates of the bills and the identification central coordinates;
the image adjusting unit is used for determining target adjusting information according to the offset angle and a preset reference angle, and adjusting the bill image according to the target adjusting information to obtain a target image;
and the recognition unit is used for carrying out OCR recognition processing on the target image to obtain the bill information of the target bill.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the object detection based ticket recognition method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored, characterized in that the computer program is configured to execute the object detection-based ticket recognition method according to any one of claims 1 to 7.
CN202210301866.0A 2022-03-25 2022-03-25 Bill identification method and device based on target detection, electronic equipment and medium Pending CN114663894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210301866.0A CN114663894A (en) 2022-03-25 2022-03-25 Bill identification method and device based on target detection, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN114663894A true CN114663894A (en) 2022-06-24

Family

ID=82032375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210301866.0A Pending CN114663894A (en) 2022-03-25 2022-03-25 Bill identification method and device based on target detection, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN114663894A (en)

Similar Documents

Publication Publication Date Title
US9292739B1 (en) Automated recognition of text utilizing multiple images
US9390340B2 (en) Image-based character recognition
US9436883B2 (en) Collaborative text detection and recognition
CN107273002B (en) Handwriting input answering method, terminal and computer readable storage medium
US9280716B2 (en) Apparatus for sensing user condition to assist handwritten entry and a method therefor
CN109684980B (en) Automatic scoring method and device
CN109766879A (en) Generation, character detection method, device, equipment and the medium of character machining model
CN111259889A (en) Image text recognition method and device, computer equipment and computer storage medium
CN109685870B (en) Information labeling method and device, labeling equipment and storage medium
CN112100431B (en) Evaluation method, device and equipment of OCR system and readable storage medium
CN112507806B (en) Intelligent classroom information interaction method and device and electronic equipment
CN111209377B (en) Text processing method, device, equipment and medium based on deep learning
US20230384576A1 (en) Virtual Slide Stage (VSS) Method For Viewing Whole Slide Images
CN112686197B (en) Data processing method and related device
CN106326802A (en) Two-dimensional code correction method and device and terminal device
CN115205883A (en) Data auditing method, device, equipment and storage medium based on OCR (optical character recognition) and NLP (non-line language)
US20220386071A1 (en) Road side positioning method and apparatus, device, and storage medium
CN115393872A (en) Method, device and equipment for training text classification model and storage medium
CN112995757B (en) Video clipping method and device
CN114817742B (en) Knowledge distillation-based recommendation model configuration method, device, equipment and medium
CN114663894A (en) Bill identification method and device based on target detection, electronic equipment and medium
CN111695372A (en) Click-to-read method and click-to-read data processing method
CN113535055B (en) Method, equipment and storage medium for playing point-to-read based on virtual reality
KR20220116818A (en) Mehtod and device for information extraction through deep learning-based answer sheet scanning
CN113191251A (en) Method and device for detecting stroke order, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination