CN117197798A - Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium - Google Patents

Publication number
CN117197798A
Authority
CN
China
Prior art keywords
invoice
information
image
identification code
segmented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311354182.8A
Other languages
Chinese (zh)
Inventor
邓琬耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202311354182.8A
Publication of CN117197798A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the application provides an invoice information extraction method, device and equipment and a storage medium. The method comprises the following steps: acquiring an invoice image of information to be extracted, and performing image segmentation on the invoice image to obtain a plurality of segmented images; extracting the characteristics of each segmented image to obtain corresponding characteristic extraction information; judging whether the segmented image comprises an identification code or not based on the feature extraction information; and when the segmented image comprises the identification code, analyzing the identification code to obtain an analysis result, and determining information corresponding to the invoice based on the analysis result. The embodiment of the application aims to improve the information extraction efficiency of medical invoices in the insurance field and reduce the labor cost.

Description

Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium
Technical Field
The present application relates to the technical field of financial science and technology, and in particular, to an invoice information extraction method, an invoice information extraction device, a computer device, and a computer readable storage medium.
Background
In the traditional claim settlement systems of the insurance industry, after a customer reports a case that belongs to the medical class, a medical invoice must be provided. A claims examiner usually has to enter the invoice manually in order to extract information such as the invoice code and the invoice number, check whether the content of the invoice is consistent with the insurance purchased by the customer and with the reported case, and then judge whether the invoice has been forged. This work occupies considerable manpower, and manual operation cannot avoid errors.
In addition, some companies recognize invoices by using OCR (Optical Character Recognition). Although OCR can reduce the time spent manually entering and checking invoices, it requires relatively high hardware cost and has relatively low recognition efficiency.
Disclosure of Invention
The application provides an invoice information extraction method, an invoice information extraction device, computer equipment and a computer readable storage medium, which aim to improve invoice information extraction efficiency in the insurance field and reduce labor cost.
In order to achieve the above object, the present application provides a method for extracting invoice information, the method comprising:
acquiring an invoice image of information to be extracted, and performing image segmentation on the invoice image to obtain a plurality of segmented images;
extracting the characteristics of each segmented image to obtain corresponding characteristic extraction information;
judging whether the segmented image comprises an identification code or not based on the feature extraction information;
and when the segmented image comprises the identification code, analyzing the identification code to obtain an analysis result, and determining information corresponding to the invoice based on the analysis result.
In order to achieve the above object, the present application further provides an invoice information extraction device, including:
the acquisition module is used for acquiring invoice images of information to be extracted, and carrying out image segmentation on the invoice images to obtain a plurality of segmented images;
the feature extraction module is used for carrying out feature extraction on each segmented image to obtain corresponding feature extraction information;
the judging module is used for judging whether the segmented image comprises an identification code or not based on the feature extraction information;
and the information extraction module is used for analyzing the identification code to obtain an analysis result when the segmented image comprises the identification code, and determining information corresponding to the invoice based on the analysis result.
In addition, to achieve the above object, the present application also provides a computer apparatus including a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute the computer program and implement the steps of the method for extracting information of an invoice according to any one of the embodiments of the present application when the computer program is executed.
In addition, to achieve the above object, the present application further provides a computer readable storage medium storing a computer program, where the computer program when executed by a processor causes the processor to implement the steps of the method for extracting information of an invoice according to any one of the embodiments of the present application.
According to the invoice information extraction method, the invoice information extraction device, the computer equipment and the computer readable storage medium disclosed by the embodiment of the application, the invoice image of the information to be extracted can be obtained, and the invoice image is subjected to image segmentation to obtain a plurality of segmented images. Further, feature extraction can be performed on each of the segmented images to obtain corresponding feature extraction information, and whether the segmented images include identification codes or not can be judged based on the feature extraction information. When the divided image is judged to comprise the identification code, the identification code can be analyzed to obtain an analysis result, and information corresponding to the invoice is determined based on the analysis result. The application is used for capturing important information in the invoice by carrying out image segmentation and feature extraction on the invoice image. Further, for each divided image, it may be determined whether it includes an identification code based on the feature extraction information. And further analyzing the identification code when the identification code is contained in the segmented image so as to obtain an analysis result, thereby realizing the determination of the information corresponding to the invoice based on the analysis result. By the method, invoice information can be extracted with high automation, manual intervention is not needed, the possibility of errors is reduced, and the accuracy and the processing efficiency of invoice information extraction are improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of a scenario of an invoice information extraction method according to an embodiment of the present application;
FIG. 2 is a flow chart of an invoice information extraction method according to an embodiment of the present application;
FIG. 3 is a flowchart of obtaining an image of a segmented region according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of obtaining an analysis result according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of an invoice information extraction device provided by an embodiment of the application;
FIG. 6 is a schematic block diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations. In addition, although the division of the functional modules is performed in the apparatus schematic, in some cases, the division of the modules may be different from that in the apparatus schematic.
The term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
The invoice information extraction method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. The application environment includes a terminal device 110 and a server 120, where the terminal device 110 may communicate with the server 120 through a network. Specifically, the server 120 can obtain an invoice image of information to be extracted, and perform image segmentation on the invoice image to obtain a plurality of segmented images; perform feature extraction on each segmented image to obtain corresponding feature extraction information; judge whether the segmented image comprises an identification code based on the feature extraction information; and finally, when the segmented image comprises the identification code, analyze the identification code to obtain an analysis result, determine information corresponding to the invoice based on the analysis result, and send the information corresponding to the invoice to the terminal device 110. The server 120 may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms. The terminal device 110 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
Referring to fig. 2, fig. 2 is a flow chart of an invoice information extraction method according to an embodiment of the application. As shown in fig. 2, the information extraction method of the invoice includes steps S11 to S14.
Step S11: acquire an invoice image of the information to be extracted, and perform image segmentation on the invoice image to obtain a plurality of segmented images.
It should be noted that the present application does not limit the type of invoice, which may be, for example, a medical invoice, a transportation fee invoice, a shopping invoice, and the like. The application takes a medical invoice as an example for illustration. The method provided by the application can extract information from medical invoices, improving extraction efficiency while reducing extraction cost.
Specifically, a medical invoice image of information to be extracted can be obtained in a photographing mode or a scanning mode, and image segmentation is performed on the medical invoice image to obtain a plurality of segmented images after segmentation.
It should be noted that image segmentation is a technique of dividing an image into a plurality of non-overlapping regions, each of which can be regarded as an object or region having similar characteristics or properties. Image segmentation may be used to identify and locate different objects, regions, or boundaries in an image. Common image segmentation methods include methods based on characteristics of pixel color, texture, shape and the like, convolutional neural networks in deep learning methods, semantic segmentation techniques and the like.
In the embodiment of the application, the medical invoice image of the information to be extracted can be obtained, and the medical invoice image is subjected to image segmentation to obtain a plurality of segmented images, so that the information extraction of the medical invoice image is realized based on the segmented images, and the information extraction efficiency of the medical invoice image is improved.
Step S12: perform feature extraction on each segmented image to obtain corresponding feature extraction information.
Step S13: based on the feature extraction information, it is determined whether the divided image includes an identification code.
The feature extraction information may include the number of contours, texture features, shape information, etc. of the segmented image, which is not limited in the present application.
Specifically, feature extraction operations such as edge detection, texture feature detection, local statistical features and the like can be performed on each segmented image to obtain corresponding feature extraction information. Further, the feature extraction information may be compared with feature information of a predefined identification code to determine whether the feature extraction information is feature information of the predefined identification code, thereby determining whether the segmented image includes the identification code.
The present application is not limited to the identification code, and for example, the identification code includes a two-dimensional code, a bar code, and the like.
Optionally, the feature extraction information includes a number of contours, the identification code includes a two-dimensional code, and determining whether the segmented image includes the identification code based on the feature extraction information includes: determining the number of contours corresponding to each segmented image, and judging whether the number of contours is greater than or equal to a preset threshold; if the number of contours is greater than or equal to the preset threshold, determining that the segmented image includes the two-dimensional code; if the number of contours is smaller than the preset threshold, determining that the segmented image does not include the two-dimensional code.
Specifically, contour detection can be performed on each of the segmented images to obtain the corresponding number of contours. Further, the number of contours of each segmented image may be compared with a preset threshold. The preset threshold may be set according to the characteristics of the two-dimensional code, which is not limited in the present application; for example, the preset threshold is set to 1. Thus, whether the number of contours is greater than or equal to 1 can be judged; if the number of contours is greater than or equal to 1, it is determined that the segmented image includes the two-dimensional code; if the number of contours is less than 1, it is determined that the segmented image does not include the two-dimensional code.
In the embodiment of the application, feature extraction can be performed on each segmented image to obtain corresponding feature extraction information, and whether the segmented image includes the identification code can then be judged based on the feature extraction information.
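The threshold comparison described above can be sketched in plain Python. This is only an illustration under stated assumptions: the segmented image is taken to be already binarized (1 for dark, 0 for light), and counting 4-connected dark regions stands in for a real contour detector, which the application does not specify. The function names `count_contours` and `includes_two_dimensional_code` are hypothetical.

```python
from typing import List

BinaryImage = List[List[int]]  # 1 = dark pixel, 0 = light pixel (assumed encoding)


def count_contours(binary: BinaryImage) -> int:
    """Count 4-connected dark regions; each region has one outer contour.

    Assumption: this stands in for a real contour detector.
    """
    height = len(binary)
    width = len(binary[0]) if binary else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # New unvisited dark pixel: flood-fill its whole region.
            count += 1
            seen[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def includes_two_dimensional_code(binary: BinaryImage, threshold: int = 1) -> bool:
    """Apply the rule from the text: contour count >= preset threshold."""
    return count_contours(binary) >= threshold
```

With the example threshold of 1, any segmented image that contains at least one dark region passes, so a practical threshold would probably be chosen higher to reflect the many modules of a two-dimensional code.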
Step S14: when the segmented image comprises the identification code, the identification code is analyzed to obtain an analysis result, and information corresponding to the invoice is determined based on the analysis result.
Optionally, the parsing result includes a character string, and determining information corresponding to the invoice based on the parsing result includes: determining the identifier of the character string at a target position, and determining the identifier as information corresponding to the medical invoice.
For example, the analysis result based on the identification code is the following character string: [ 01, 10, 044002300311, 55859685, 157.34, 20230429, 64514373523256009105, 34CD ]. It will be appreciated that the character string described above is made up of identifiers (numbers) at different locations, i.e. the character string corresponds to the identifiers at different locations. Thus, an identifier corresponding to the target location can be determined, and the identifier can be determined as information corresponding to the medical invoice. For example, the information includes invoice code, invoice number, amount, date of invoicing, check code, type of invoicing, etc., which the present application is not limited to.
The identifiers in the third and fourth sections of the character string may be determined as the invoice code and the invoice number of the medical invoice, respectively. That is, the invoice code is "044002300311"; the invoice number is "55859685".
On the basis of the above embodiment, after determining the information corresponding to the invoice based on the analysis result, the method further includes: querying an invoice database for an invoice record identical to the identifier; and if an identical invoice record exists, determining that the invoice is genuine.
It can be understood that if the invoice database contains an invoice record identical to the identifier, the database holds matching real information such as the invoice code, invoice number, amount, billing date, check code and billing type, so the medical invoice can be determined to be a genuine invoice. Otherwise, the medical invoice may be determined to be a false invoice.
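As a minimal sketch of reading identifiers at target positions, the example string can be split on commas and indexed. Positions are counted from zero here, so the invoice code "044002300311" sits at index 2 and the invoice number "55859685" at index 3, matching the values given in the example. The positions assigned to the amount and billing date are assumptions inferred from the look of the values, and `TARGET_POSITIONS` and `parse_invoice_string` are hypothetical names.

```python
from typing import Dict

# Zero-based target positions. Only the invoice code and invoice number are
# confirmed by the example; amount and billing_date are assumptions.
TARGET_POSITIONS: Dict[str, int] = {
    "invoice_code": 2,
    "invoice_number": 3,
    "amount": 4,
    "billing_date": 5,
}


def parse_invoice_string(raw: str) -> Dict[str, str]:
    """Split the decoded string on commas and pick the identifiers at the target positions."""
    fields = [part.strip() for part in raw.strip().strip("[]").split(",")]
    if len(fields) <= max(TARGET_POSITIONS.values()):
        raise ValueError("decoded string has too few fields")
    return {name: fields[index] for name, index in TARGET_POSITIONS.items()}


example = "[ 01, 10, 044002300311, 55859685, 157.34, 20230429, 64514373523256009105, 34CD ]"
info = parse_invoice_string(example)
print(info["invoice_code"], info["invoice_number"])
```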
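The database check can be illustrated with an in-memory SQLite table. This is a sketch under assumptions: the table name `invoices`, its columns, and the choice to match on invoice code plus invoice number are all hypothetical, since the application does not describe the database schema.

```python
import sqlite3


def make_demo_database() -> sqlite3.Connection:
    """Create an in-memory invoice database holding one known record (demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "invoice_code TEXT NOT NULL, "
        "invoice_number TEXT NOT NULL, "
        "PRIMARY KEY (invoice_code, invoice_number))"
    )
    conn.execute("INSERT INTO invoices VALUES (?, ?)", ("044002300311", "55859685"))
    return conn


def is_genuine(conn: sqlite3.Connection, invoice_code: str, invoice_number: str) -> bool:
    """Return True when a record with the same identifiers exists in the database."""
    row = conn.execute(
        "SELECT 1 FROM invoices WHERE invoice_code = ? AND invoice_number = ? LIMIT 1",
        (invoice_code, invoice_number),
    ).fetchone()
    return row is not None
```

Parameterized queries are used so that identifiers decoded from an untrusted image are never interpolated into the SQL text.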
Optionally, after determining whether the segmented image includes the identification code, further comprising: and when the segmented image does not comprise the identification code, carrying out feature extraction on the image outside the segmented image to obtain corresponding feature extraction information, and judging whether the image outside the segmented image comprises the identification code or not.
Specifically, if the segmented image does not include the identification code, feature extraction may be performed on the image outside the segmented image to obtain the corresponding feature information, so as to determine whether the image outside the segmented image includes the identification code. This prevents extraction from failing when the identification code lies outside the segmented image, and thus improves the probability of extracting the identification code.
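The fallback order can be sketched as a small search routine: segmented images are checked first, and the remaining image regions only if none of them contains the identification code. The function `find_code_image` and the `has_code` predicate are hypothetical; `has_code` stands for the feature extraction plus judgment described earlier.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def find_code_image(
    segments: Iterable[T],
    remainder: Iterable[T],
    has_code: Callable[[T], bool],
) -> Optional[T]:
    """Return the first image judged to contain an identification code.

    Segmented images are tried first; images outside them are tried only
    when no segmented image matches. Returns None when nothing matches.
    """
    for image in segments:
        if has_code(image):
            return image
    for image in remainder:
        if has_code(image):
            return image
    return None
```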
In the embodiment of the application, when the segmented image comprises the identification code, the identification code is analyzed to obtain the analysis result, and the information corresponding to the medical invoice is determined based on the analysis result, so that the automatic extraction of the medical invoice information is realized, and the extraction efficiency and accuracy of the medical invoice information are improved.
The invoice information extraction method disclosed by the embodiment of the application can acquire the medical invoice image of the information to be extracted, and carry out image segmentation on the medical invoice image to obtain a plurality of segmented images. Further, feature extraction can be performed on each of the segmented images to obtain corresponding feature extraction information, and whether the segmented images include identification codes or not can be judged based on the feature extraction information. When the divided image is judged to comprise the identification code, the identification code can be analyzed to obtain an analysis result, and information corresponding to the medical invoice is determined based on the analysis result. The application is used for capturing important information in the medical invoice by carrying out image segmentation and feature extraction on the medical invoice image. Further, for each divided image, it may be determined whether it includes an identification code based on the feature extraction information. And then when the identification code is contained in the segmented image, the identification code is analyzed to obtain an analysis result, so that the information corresponding to the medical invoice is determined based on the analysis result. By the method, the medical invoice information can be extracted with high automation, manual intervention is not needed, the possibility of errors is reduced, and the accuracy and the processing efficiency of the medical invoice information extraction are improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating a process of obtaining an image of a segmented region according to an embodiment of the application. As shown in fig. 3, obtaining an image of the segmented region may be achieved through steps S121 to S123.
Step S121: several target areas of the invoice image are determined.
Step S122: perform edge detection on the plurality of target areas to obtain a plurality of edge-detected target areas.
Step S123: perform image segmentation on the edge-detected target areas to obtain a plurality of segmented images.
The target area may be an upper left corner, a lower left corner, an upper right corner, a lower right corner, etc. of the medical invoice, which is not limited in the present application.
Specifically, the edge detection can be performed on the target area to determine the boundary of the target area, so as to obtain a plurality of edge-detected target areas. Thus, the image of the target area after edge detection can be segmented according to the boundary of the determined target area, and a segmented image can be obtained.
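Steps S121 to S123 can be sketched on a grayscale image stored as nested lists. Everything specific here is an assumption made for illustration: the four corner regions each cover half the height and width, an "edge" is any neighbouring-pixel intensity jump of at least `min_step`, and a region without edges yields no segmented image. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]           # grayscale rows, values 0-255
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), bottom/right exclusive


def corner_regions(height: int, width: int, frac: float = 0.5) -> Dict[str, Box]:
    """Step S121: choose the four corner areas as target areas (assumed layout)."""
    h, w = max(1, int(height * frac)), max(1, int(width * frac))
    return {
        "upper_left": (0, 0, h, w),
        "upper_right": (0, width - w, h, width),
        "lower_left": (height - h, 0, height, w),
        "lower_right": (height - h, width - w, height, width),
    }


def crop(img: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in img[top:bottom]]


def edge_bounding_box(region: Image, min_step: int = 64) -> Optional[Box]:
    """Step S122: mark pixels with a strong jump to the right or below, then bound them."""
    ys, xs = [], []
    for y, row in enumerate(region):
        for x, value in enumerate(row):
            right = row[x + 1] if x + 1 < len(row) else value
            below = region[y + 1][x] if y + 1 < len(region) else value
            if abs(value - right) >= min_step or abs(value - below) >= min_step:
                ys.append(y)
                xs.append(x)
    if not ys:
        return None
    return (min(ys), min(xs), max(ys) + 1, max(xs) + 1)


def segment_invoice(img: Image) -> Dict[str, Image]:
    """Step S123: cut each target area down to the boundary found by edge detection."""
    segments = {}
    for name, box in corner_regions(len(img), len(img[0])).items():
        region = crop(img, box)
        bounds = edge_bounding_box(region)
        if bounds is not None:
            segments[name] = crop(region, bounds)
    return segments
```

Because a pixel is marked when it differs from its right or lower neighbour, the resulting box keeps a one-pixel margin above and to the left of a dark block.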
It will be appreciated that by image segmentation, the medical invoice image may be divided into a plurality of segmented images, each representing a different part of the medical invoice, thereby facilitating further image processing and analysis of the segmented images.
In the embodiment of the application, a plurality of target areas of the invoice image can be determined, edge detection is then carried out on the target areas to obtain a plurality of edge-detected target areas, and image segmentation is carried out on the edge-detected target areas to obtain a plurality of segmented images. In this way, different segmented regions can be extracted from the invoice image automatically, providing a basis for subsequent information extraction.
With continued reference to fig. 4, fig. 4 is a flow chart illustrating a method for obtaining an analysis result according to an embodiment of the application. As shown in fig. 4, the analysis result may be obtained through steps S141 to S142.
Step S141: preprocess the identification code to obtain the preprocessed identification code.
The preprocessing operation comprises at least one of a graying operation, a binarization operation and a filtering operation.
Step S142: analyzing the preprocessed identification code through an identification algorithm to obtain an analysis result.
Specifically, at least one of a graying operation, a binarization operation and a filtering operation can be performed on the identification code to obtain the preprocessed identification code. The graying operation converts the identification code image into a grayscale image, so that the value of each pixel represents its brightness, thereby reducing the data dimension while retaining key information; the binarization operation converts the grayscale image into a black-and-white binary image, dividing the pixels into black and white areas by setting a threshold, so that the outline and the shape of the identification code are highlighted; the filtering operation applies a filter to smooth the image and remove noise and fine detail, improving the accuracy of the subsequent recognition algorithm. Further, the preprocessed identification code can be analyzed through a recognition algorithm to obtain an analysis result.
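The three preprocessing operations can be sketched without any imaging library. The choices below are assumptions, not details from the application: the common 0.299/0.587/0.114 luminance weights for graying, a fixed threshold of 128 for binarization, a 3x3 median filter for the filtering operation, and the order gray, binarize, filter. The function names are hypothetical.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_gray(rgb: RGBImage) -> GrayImage:
    """Graying: weighted sum of R, G, B gives one brightness value per pixel."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row] for row in rgb]


def binarize(gray: GrayImage, threshold: int = 128) -> GrayImage:
    """Binarization: pixels at or above the threshold become white, the rest black."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray]


def median_filter(img: GrayImage) -> GrayImage:
    """Filtering: 3x3 median; border pixels use only the neighbours that exist."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = sorted(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out


def preprocess(rgb: RGBImage) -> GrayImage:
    """Apply all three operations in an assumed order: gray, binarize, filter."""
    return median_filter(binarize(to_gray(rgb)))
```

A median filter is chosen here because it removes isolated specks in a binary image without blurring it into intermediate gray values; a smoothing filter applied before binarization would be an equally reasonable reading of the text.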
In the embodiment of the application, the identification code can be preprocessed, and then the preprocessed identification code is analyzed to obtain an analysis result. Thus, noise interference in the identification code can be reduced, and the success rate of analysis of the identification code can be improved.
Referring to fig. 5, fig. 5 is a schematic block diagram of an invoice information extraction device according to an embodiment of the present application. The invoice information extraction device can be configured in a server and used for executing the invoice information extraction method.
As shown in fig. 5, the invoice information extraction device 200 includes: an acquisition module 201, a feature extraction module 202, a judgment module 203 and an information extraction module 204.
The acquiring module 201 is configured to acquire an invoice image of information to be extracted, and perform image segmentation on the invoice image to obtain a plurality of segmented images;
the feature extraction module 202 is configured to perform feature extraction on each of the segmented images to obtain corresponding feature extraction information;
a judging module 203, configured to judge whether the segmented image includes an identification code based on the feature extraction information;
and the information extraction module 204 is configured to parse the identification code to obtain a parsing result when the segmented image includes the identification code, and determine information corresponding to the invoice based on the parsing result.
The acquisition module 201 is further configured to determine a plurality of target areas of the invoice image; performing edge detection on a plurality of target areas to obtain a plurality of edge-detected target areas; and carrying out image segmentation on the target areas after the edge detection to obtain a plurality of segmented images.
The judging module 203 is further configured to determine a number of contours corresponding to each of the segmented images, and judge whether the number of contours is greater than or equal to a preset threshold; if the number of contours is greater than or equal to the preset threshold, determining that the segmented image comprises the two-dimensional code; and if the number of the outlines is smaller than the preset threshold, determining that the segmented image does not comprise the two-dimensional code.
The judging module 203 is further configured to perform feature extraction on an image outside the segmented image when the segmented image does not include the identification code, obtain corresponding feature extraction information, and judge whether the image outside the segmented image includes the identification code.
The information extraction module 204 is further configured to perform a preprocessing operation on the identification code to obtain a preprocessed identification code, where the preprocessing operation includes at least one of a graying operation, a binarization operation, and a filtering operation; and analyzing the preprocessed identification code through a recognition algorithm to obtain the analysis result.
The information extraction module 204 is further configured to determine an identifier corresponding to the character string at the target location, and determine the identifier as information corresponding to the invoice.
The information extraction module 204 is further configured to query an invoice database for whether an invoice record identical to the identifier exists; and if the same invoice records exist, determining that the target invoice is true.
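One way to picture how the four modules cooperate is a small class that wires them together as interchangeable callables. This is only a structural sketch: the class name `InvoiceInfoExtractor`, the field names and the callable signatures are hypothetical, and real modules would operate on image data rather than the stand-in values used in a demonstration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Sequence


@dataclass
class InvoiceInfoExtractor:
    """Wires together stand-ins for modules 201 to 204 (hypothetical signatures)."""

    acquire: Callable[[Any], Sequence[Any]]           # 201: invoice image -> segmented images
    extract_features: Callable[[Any], Dict[str, Any]]  # 202: segmented image -> features
    has_code: Callable[[Dict[str, Any]], bool]         # 203: features -> includes a code?
    parse_code: Callable[[Any], Optional[str]]         # 204: segmented image -> decoded info

    def run(self, invoice_image: Any) -> Optional[str]:
        """Process segments in order and decode the first one judged to hold a code."""
        for segment in self.acquire(invoice_image):
            features = self.extract_features(segment)
            if self.has_code(features):
                return self.parse_code(segment)
        return None
```

Keeping each module behind a plain callable means the division of modules can differ from the schematic without changing the overall flow.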
It should be noted that, for convenience and brevity of description, specific working processes of the above-described apparatus and each module, unit may refer to corresponding processes in the foregoing method embodiments, which are not repeated herein.
The methods and apparatus of the present application are operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
By way of example, the methods, apparatus described above may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 6.
Referring to fig. 6, fig. 6 is a schematic diagram of a computer device according to an embodiment of the application. The computer device may be a server.
As shown in fig. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a volatile storage medium, a non-volatile storage medium, and an internal memory. The non-volatile storage medium may store an operating system and a computer program. The computer program comprises program instructions that, when executed, cause the processor to perform any of the invoice information extraction methods.
The processor is used to provide computing and control capabilities to support the operation of the entire computer device.
The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when executed by the processor, the computer program causes the processor to perform any of the invoice information extraction methods.
The network interface is used for network communication, such as transmitting assigned tasks. Those skilled in the art will appreciate that the architecture of the computer device described above is merely a block diagram of the structures relevant to the present application and does not limit the computer device to which the present application may be applied; a particular computer device may include more or fewer components than those shown, combine some of the components, or have a different arrangement of components.
It should be appreciated that the processor may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor or any conventional processor.
In some embodiments, the processor is configured to run a computer program stored in the memory to implement the following steps: acquiring an invoice image from which information is to be extracted, and performing image segmentation on the invoice image to obtain a plurality of segmented images; performing feature extraction on each segmented image to obtain corresponding feature extraction information; judging whether the segmented image includes an identification code based on the feature extraction information; and, when the segmented image includes the identification code, analyzing the identification code to obtain an analysis result, and determining information corresponding to the invoice based on the analysis result.
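The four steps above can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the applicant's implementation: the function name `extract_invoice_info` and the four callables (`segment`, `extract_features`, `has_code`, `parse_code`) are hypothetical placeholders for steps the text leaves unspecified, and the toy "image" is just a string.

```python
from typing import Any, Callable, Iterable, Optional


def extract_invoice_info(
    invoice_image: Any,
    segment: Callable[[Any], Iterable[Any]],
    extract_features: Callable[[Any], Any],
    has_code: Callable[[Any], bool],
    parse_code: Callable[[Any], str],
) -> Optional[str]:
    """Return the analysis result of the first segment judged to hold a code.

    All four callables are hypothetical stand-ins for the segmentation,
    feature-extraction, judging and analyzing steps described in the text.
    """
    for segmented_image in segment(invoice_image):
        features = extract_features(segmented_image)
        if has_code(features):
            return parse_code(segmented_image)
    return None  # no segment was judged to contain an identification code


# Toy usage: the "image" is a string and each "segment" is a substring.
result = extract_invoice_info(
    "header|QR:12345|footer",
    segment=lambda img: img.split("|"),
    extract_features=lambda seg: seg.startswith("QR:"),
    has_code=lambda feat: feat,
    parse_code=lambda seg: seg[3:],
)
```

Passing the steps in as callables keeps the control flow visible without committing to any particular segmentation or decoding library.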
In some embodiments, the processor is further configured to determine a plurality of target areas of the invoice image; perform edge detection on the plurality of target areas to obtain a plurality of edge-detected target areas; and perform image segmentation on the edge-detected target areas to obtain a plurality of segmented images.
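The text does not name an edge detector. As an illustration only, the sketch below swaps in a 3x3 Sobel operator on a grayscale target area stored as nested lists; the function name `sobel_edges` and the default threshold are assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def sobel_edges(gray: Image, threshold: int = 128) -> Image:
    """Return a binary edge map (1 = edge) from 3x3 Sobel gradients.

    Border pixels stay 0 because the 3x3 window does not fit there.
    The Sobel operator and the threshold are illustrative choices.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges
```

The resulting edge map could then be used to find the bounding boxes along which a target area is cut into segmented images.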
In some embodiments, the processor is further configured to determine the number of contours corresponding to each segmented image, and judge whether the number of contours is greater than or equal to a preset threshold; if the number of contours is greater than or equal to the preset threshold, determine that the segmented image includes a two-dimensional code; and if the number of contours is smaller than the preset threshold, determine that the segmented image does not include a two-dimensional code.
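A minimal sketch of this threshold test follows. It assumes the segmented image has already been reduced to a binary grid and counts 4-connected foreground regions as a stand-in for contours; the names `count_contours` and `includes_two_dimensional_code` are hypothetical, and a real contour finder may count differently.

```python
from typing import List

Binary = List[List[int]]  # 1 = dark (foreground) pixel, 0 = background


def count_contours(binary: Binary) -> int:
    """Count 4-connected foreground regions, used here as a proxy for contours."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            count += 1  # found an unvisited region; flood-fill it
            seen[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def includes_two_dimensional_code(binary: Binary, preset_threshold: int) -> bool:
    """Apply the rule from the text: count >= threshold means a code is present."""
    return count_contours(binary) >= preset_threshold
```

The intuition is that a two-dimensional code is made of many small dark modules, so its region yields far more contours than a block of text or a blank margin.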
In some embodiments, the processor is further configured to perform feature extraction on an image outside the segmented image when the segmented image does not include the identification code, obtain corresponding feature extraction information, and determine whether the image outside the segmented image includes the identification code.
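One possible reading of "the image outside the segmented image" is the part of the invoice image not covered by any segmented region. The sketch below follows that reading; the bounding-box representation, the function name `image_outside_segments`, and the white fill value are assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def image_outside_segments(image: Image, boxes: List[Box], fill: int = 255) -> Image:
    """Copy `image` with every segmented box blanked out, leaving only the rest.

    The input image is not modified. The remainder can then go through the
    same feature extraction and judging steps as the segmented images.
    """
    remainder = [row[:] for row in image]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom):
            for x in range(left, right):
                remainder[y][x] = fill
    return remainder
```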
In some embodiments, the processor is further configured to perform a preprocessing operation on the identification code to obtain a preprocessed identification code, where the preprocessing operation includes at least one of a graying operation, a binarization operation, and a filtering operation; and to analyze the preprocessed identification code through an identification algorithm to obtain the analysis result.
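A minimal sketch of the three preprocessing operations, assuming the second operation is a thresholding (binarization) step and the filter is a 3x3 median filter. The function names, the fixed threshold, and the choice of filter are illustrative; the text does not specify them.

```python
from statistics import median
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_gray(rgb: RGBImage) -> GrayImage:
    """Graying: weighted sum of R, G, B using the ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def median_filter(gray: GrayImage) -> GrayImage:
    """Filtering: 3x3 median; the window is clipped at the image border."""
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [gray[ny][nx]
                      for ny in range(max(0, y - 1), min(height, y + 2))
                      for nx in range(max(0, x - 1), min(width, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def binarize(gray: GrayImage, threshold: int = 128) -> GrayImage:
    """Binarization: 1 for dark pixels (below threshold), 0 otherwise."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]
```

A typical order would be graying, then filtering to suppress isolated noise pixels, then binarization, after which the result is handed to whatever decoder is used.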
In some embodiments, the processor is further configured to determine an identifier corresponding to the character string at a target location, and determine the identifier as information corresponding to the invoice.
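As a sketch of this step, assume the analysis result is a delimited character string and the target position is a 0-based field index. The comma delimiter, the sample string, and the position used below are hypothetical and do not reproduce any official invoice code layout.

```python
def identifier_at(parsed: str, target_position: int, delimiter: str = ",") -> str:
    """Return the field at `target_position` (0-based) of the parsed string."""
    fields = parsed.split(delimiter)
    if not 0 <= target_position < len(fields):
        raise ValueError(f"no field at position {target_position}")
    return fields[target_position].strip()


# Hypothetical analysis result; the field at index 3 is treated as the identifier.
parsed_result = "01,10,044001900111,12345678,100.00,20231018"
invoice_identifier = identifier_at(parsed_result, 3)
```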
In some embodiments, the processor is further configured to query an invoice database for an invoice record matching the identifier; and, if a matching invoice record exists, determine that the invoice is authentic.
The embodiment of the application also provides a computer readable storage medium storing a computer program, wherein the computer program comprises program instructions which, when executed, implement any of the invoice information extraction methods provided by the embodiments of the application.
The computer readable storage medium may be an internal storage unit of the computer device according to the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the computer device.
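The database lookup can be sketched with Python's built-in `sqlite3` module. The table name `invoices`, its single `identifier` column, and the function name `invoice_is_authentic` are assumptions made for illustration; the text does not describe the database schema.

```python
import sqlite3


def invoice_is_authentic(conn: sqlite3.Connection, identifier: str) -> bool:
    """Return True if a record with the same identifier exists in `invoices`."""
    row = conn.execute(
        "SELECT 1 FROM invoices WHERE identifier = ? LIMIT 1", (identifier,)
    ).fetchone()
    return row is not None


# Hypothetical in-memory database holding one known invoice record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (identifier TEXT PRIMARY KEY)")
conn.execute("INSERT INTO invoices VALUES (?)", ("12345678",))
```

Using a parameterized query (`?`) rather than string formatting keeps an identifier decoded from an untrusted image from being interpreted as SQL.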
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like.
While the application has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (10)

1. An invoice information extraction method, which is characterized by comprising the following steps:
acquiring an invoice image of information to be extracted, and performing image segmentation on the invoice image to obtain a plurality of segmented images;
extracting the characteristics of each segmented image to obtain corresponding characteristic extraction information;
judging whether the segmented image comprises an identification code or not based on the feature extraction information;
and when the segmented image comprises the identification code, analyzing the identification code to obtain an analysis result, and determining information corresponding to the invoice based on the analysis result.
2. The method of claim 1, wherein the image segmentation of the invoice image to obtain a plurality of segmented images comprises:
determining a plurality of target areas of the invoice image;
performing edge detection on a plurality of target areas to obtain a plurality of edge-detected target areas;
and carrying out image segmentation on the target areas after the edge detection to obtain a plurality of segmented images.
3. The method according to claim 1, wherein the feature extraction information includes a number of contours, the identification code includes a two-dimensional code, and the judging whether the segmented image includes the identification code based on the feature extraction information includes:
determining the number of contours corresponding to each segmented image, and judging whether the number of contours is larger than or equal to a preset threshold value;
if the number of contours is greater than or equal to the preset threshold, determining that the segmented image comprises the two-dimensional code;
and if the number of contours is smaller than the preset threshold, determining that the segmented image does not comprise the two-dimensional code.
4. The method of claim 1, further comprising, after judging whether the segmented image comprises an identification code:
when the segmented image does not comprise the identification code, performing feature extraction on the image outside the segmented image to obtain corresponding feature extraction information, and judging whether the image outside the segmented image comprises the identification code.
5. The method of claim 1, wherein the analyzing the identification code to obtain an analysis result comprises:
performing a preprocessing operation on the identification code to obtain a preprocessed identification code, wherein the preprocessing operation comprises at least one of a graying operation, a binarization operation and a filtering operation;
and analyzing the preprocessed identification code through an identification algorithm to obtain the analysis result.
6. The method of claim 5, wherein the analysis result includes a character string, and wherein the determining information corresponding to the invoice based on the analysis result includes:
determining an identifier corresponding to the character string at a target position, and determining the identifier as the information corresponding to the invoice.
7. The method of claim 6, further comprising, after determining information corresponding to the invoice based on the analysis result:
querying an invoice database for an invoice record matching the identifier;
and if a matching invoice record exists, determining that the invoice is authentic.
8. An information extraction device for an invoice, the information extraction device comprising:
the acquisition module is used for acquiring invoice images of information to be extracted, and carrying out image segmentation on the invoice images to obtain a plurality of segmented images;
the feature extraction module is used for carrying out feature extraction on each segmented image to obtain corresponding feature extraction information;
the judging module is used for judging whether the segmented image comprises an identification code or not based on the feature extraction information;
and the information extraction module is used for analyzing the identification code to obtain an analysis result when the segmented image comprises the identification code, and determining information corresponding to the invoice based on the analysis result.
9. A computer device, comprising: a memory and a processor; wherein the memory is connected to the processor for storing a program, and the processor is configured to implement the steps of the method for extracting information of an invoice according to any one of claims 1 to 7 by running the program stored in the memory.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, causes the processor to implement the steps of the method for information extraction of an invoice as claimed in any one of claims 1 to 7.
CN202311354182.8A 2023-10-18 2023-10-18 Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium Pending CN117197798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311354182.8A CN117197798A (en) 2023-10-18 2023-10-18 Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium

Publications (1)

Publication Number Publication Date
CN117197798A true CN117197798A (en) 2023-12-08

Family

ID=88983531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311354182.8A Pending CN117197798A (en) 2023-10-18 2023-10-18 Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium

Country Status (1)

Country Link
CN (1) CN117197798A (en)

Similar Documents

Publication Publication Date Title
US11380113B2 (en) Methods for mobile image capture of vehicle identification numbers in a non-document
US10943105B2 (en) Document field detection and parsing
US9042647B2 (en) Adaptive character segmentation method and system for automated license plate recognition
WO2019237549A1 (en) Verification code recognition method and apparatus, computer device, and storage medium
CN110569341B (en) Method and device for configuring chat robot, computer equipment and storage medium
CN111209827B (en) Method and system for OCR (optical character recognition) bill problem based on feature detection
CN109740417B (en) Invoice type identification method, invoice type identification device, storage medium and computer equipment
CN110780965B (en) Vision-based process automation method, equipment and readable storage medium
CN111507324A (en) Card frame identification method, device, equipment and computer storage medium
Ghandour et al. Building shadow detection based on multi-thresholding segmentation
CN117197798A (en) Invoice information extraction method, invoice information extraction device, invoice information extraction equipment and invoice information storage medium
CN114758340A (en) Intelligent identification method, device and equipment for logistics address and storage medium
CN112861843A (en) Method and device for analyzing selection frame based on feature image recognition
CN112862409A (en) Picking bill verification method and device
CN111753842A (en) Bill text region detection method and device
Chen et al. Video-based content recognition of bank cards with mobile devices
CN116311292A (en) Document image information extraction method, device, computer equipment and storage medium
CN117973410A (en) Two-dimensional code identification method and device
CN117609532A (en) Similar image retrieval method, device, equipment and medium
CN118155232A (en) Nuclear power plant document optical character recognition system and method
CN117612180A (en) Character recognition method, device, terminal equipment and medium
CN117037166A (en) Text recognition method and device based on artificial intelligence, computer equipment and medium
CN115984861A (en) Box number identification method, device, equipment and storage medium
CN113420684A (en) Report recognition method and device based on feature extraction, electronic equipment and medium
CN117437414A (en) Segmentation method, flow automation method, device, all-in-one machine and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination