WO2024012209A1 - Image recognition-based service processing method and apparatus, and storage medium - Google Patents

Image recognition-based service processing method and apparatus, and storage medium

Info

Publication number
WO2024012209A1
Authority
WO
WIPO (PCT)
Prior art keywords
verified
image
document
target
documents
Prior art date
Application number
PCT/CN2023/103596
Other languages
French (fr)
Chinese (zh)
Inventor
彭波
Original Assignee
深圳前海环融联易信息科技服务有限公司
Priority date
Filing date
Publication date
Application filed by 深圳前海环融联易信息科技服务有限公司
Publication of WO2024012209A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/1444 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Definitions

  • the present application relates to the field of computer technology, and in particular, to a business processing method and device based on image recognition, storage media, and computer equipment.
  • this application provides a business processing method and device, storage medium, and computer equipment based on image recognition, which can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
  • a business processing method based on image recognition including:
  • each of the images to be verified includes at least one document to be verified;
  • a target archive image corresponding to any of the documents to be verified is determined based on the at least one image to be verified, and archive information of the business is generated based on the target archive image.
  • a service processing device based on image recognition including:
  • a text recognition module configured to obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
  • a verification module configured to determine the target query information corresponding to each of the documents to be verified from the text information, and determine whether the document to be verified passes the verification based on the target query information;
  • an archiving module configured to determine, after any of the documents to be verified passes the verification, a target archive image corresponding to that document based on the at least one image to be verified, and to generate archive information of the business based on the target archive image.
  • a storage medium on which a computer program is stored.
  • when the program is executed by a processor, the following steps are implemented:
  • obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
  • a target archive image corresponding to any document to be verified is determined based on at least one image to be verified, and based on the target archive image, archive information of the business is generated.
  • a computer device including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor.
  • when the processor executes the program, it implements the following steps:
  • obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
  • a target archive image corresponding to any document to be verified is determined based on at least one image to be verified, and based on the target archive image, archive information of the business is generated.
  • this application provides a business processing method and device, storage medium, and computer equipment based on image recognition.
  • one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified.
  • text recognition can be performed on each image to be verified, and the text information corresponding to each image to be verified can be obtained.
  • when the image to be verified contains one document to be verified, the target query information corresponding to the document to be verified can be determined directly from the text information corresponding to the image to be verified; when the image to be verified contains multiple documents to be verified, the target query information corresponding to each document to be verified can be determined from the text information corresponding to the image to be verified. After the target query information is determined, each document to be verified can be verified based on its target query information to determine whether it passes the verification. A document to be verified that has passed the verification can subsequently be used to handle the related business.
  • documents to be verified can also be archived.
  • the target archive image corresponding to the verified document to be verified can be found from the obtained image to be verified, and then the archive information of the business can be generated based on the target archive image.
  • the embodiments of this application can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
  • Figure 1 shows a schematic flow chart of a business processing method based on image recognition provided by an embodiment of the present application
  • Figure 2 shows a schematic flow chart of another image recognition-based business processing method provided by an embodiment of the present application
  • Figure 3 shows a schematic structural diagram of a service processing device based on image recognition provided by an embodiment of the present application.
  • a business processing method based on image recognition includes:
  • Step 101 Obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each of the images to be verified, wherein each of the images to be verified includes at least one document to be verified.
  • the business processing method based on image recognition can be applied to invoice financing business scenarios.
  • one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified.
  • the document to be verified can be an invoice.
  • when the image to be verified contains one document to be verified, the image to be verified can be a picture or PDF of an invoice.
  • when the image to be verified contains multiple documents to be verified, the image to be verified can be a picture or PDF composed of multiple invoices.
  • when the image to be verified is a picture, it can be obtained by photographing a paper invoice with a camera; when the image to be verified is a PDF, it can be obtained by scanning with a scanner, or it can be downloaded directly from an electronic invoice website.
  • OCR is the abbreviation of Optical Character Recognition. It converts the text of bills, newspapers, books, manuscripts and other printed matter into image information through optical input methods such as scanning, and then uses text recognition technology to convert the image information into text that a computer can use.
  • Step 102 Determine the target query information corresponding to each of the documents to be verified from the text information, and determine whether the document to be verified passes the verification based on the target query information.
  • when the image to be verified contains one document to be verified, the target query information corresponding to the document to be verified can be determined directly from the text information corresponding to the image to be verified; when the image to be verified contains multiple documents to be verified, the target query information corresponding to each document to be verified can be determined from the text information corresponding to the image to be verified.
  • each document to be verified can be verified based on the target query information corresponding to the document to be verified, and it is determined whether the document to be verified can pass the verification.
  • the target query information can be the invoice name, company name, trade contract name, invoice number, invoice amount, invoice date, tax-included amount, tax-exclusive amount, special invoice, general invoice, seller's name, buyer's name, etc., and can be selected according to the actual situation.
  • several of the above items can be combined as the target query information. This application can directly obtain the target query information corresponding to each document to be verified from the obtained image to be verified, and automatically input the target query information into the document verification system. Users do not need to log in to the document verification system offline and manually enter the target query information multiple times for verification, which greatly simplifies the verification process.
  • Step 103 After any of the documents to be verified passes the verification, determine the target archive image corresponding to that document based on the at least one image to be verified, and generate archive information of the business based on the target archive image.
  • the document to be verified that has passed the verification can be used to handle related business later.
  • documents to be verified can also be archived. Specifically, the target archive image corresponding to the verified document to be verified can be found from the obtained image to be verified, and then the archive information of the business can be generated based on the target archive image.
  • when the images to be verified include multiple documents to be verified and not all of them have passed verification, that is, not all of them can be used for business processing, finding the usable documents manually among multiple images to be verified is cumbersome.
  • the embodiment of the present application can directly and automatically determine, from multiple images to be verified, the documents to be verified that can be used for business processing, which is simple, convenient and efficient, and is conducive to improving the user experience.
  • in existing practice, users are usually required to find the documents to be verified (i.e. invoices) that have passed verification among many images to be verified, and then upload them manually.
  • in the embodiment of the present application, this series of processes can be implemented automatically, which greatly improves archiving efficiency and accuracy.
  • one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified.
  • text recognition can be performed on each image to be verified, and the text information corresponding to each image to be verified can be obtained.
  • when the image to be verified contains one document to be verified, the target query information corresponding to the document to be verified can be determined directly from the text information corresponding to the image to be verified; when the image to be verified contains multiple documents to be verified, the target query information corresponding to each document to be verified can be determined from the text information corresponding to the image to be verified.
  • each document to be verified can be verified based on the target query information corresponding to the document to be verified, and it is determined whether the document to be verified can pass the verification. If there is a document to be verified that has passed the verification, the document to be verified that has passed the verification can be used to handle related business in the future.
  • documents to be verified can also be archived. Specifically, the target archive image corresponding to the verified document to be verified can be found from the obtained image to be verified, and then the archive information of the business can be generated based on the target archive image.
  • the embodiments of this application can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
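  • The flow of steps 101 to 103 can be sketched in Python as follows. This is only an illustrative outline, not the applicant's implementation: `process_business`, `recognize_text`, `extract_query_infos` and `verify` are hypothetical names, and the OCR engine and verification system are replaced by simple stand-ins.

```python
from typing import Callable, Iterable, List, Tuple


def process_business(
    images: Iterable[str],
    recognize_text: Callable[[str], str],
    extract_query_infos: Callable[[str], List[str]],
    verify: Callable[[str], bool],
) -> dict:
    """Run recognition, verification and archiving over all images to be verified."""
    archived: List[Tuple[str, str]] = []
    for image in images:
        text_info = recognize_text(image)                  # step 101: text recognition
        for query_info in extract_query_infos(text_info):  # step 102: one query per document
            if verify(query_info):                         # step 102: verification result
                archived.append((query_info, image))       # step 103: keep image for archiving
    return {"archived": archived}


# Stand-ins (assumptions): OCR output is looked up in a dict, each
# whitespace-separated token is one document, and "INV-2" fails verification.
fake_ocr = {"scan_1": "INV-1", "scan_2": "INV-2 INV-3"}
result = process_business(
    ["scan_1", "scan_2"],
    recognize_text=fake_ocr.get,
    extract_query_infos=str.split,
    verify=lambda number: number != "INV-2",
)
```

  • Passing the three stages in as callables keeps the sketch independent of any particular OCR engine or verification system.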
  • the method includes:
  • Step 201 Obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified.
  • each image to be verified can contain one or more documents to be verified.
  • the document to be verified can be an invoice
  • when the image to be verified contains one document to be verified, the image to be verified can be a picture or PDF of an invoice.
  • when the image to be verified contains multiple documents to be verified, the image to be verified can be a picture or PDF composed of multiple invoices.
  • when the image to be verified is a picture, it can be obtained by photographing a paper invoice with a camera; when the image to be verified is a PDF, it can be obtained by scanning with a scanner, or it can be downloaded directly from an electronic invoice website.
  • text recognition can be performed on each image to be verified, and the text information corresponding to each image to be verified can be obtained.
  • Step 202 Based on the text information, determine the number of documents to be verified contained in the image to be verified.
  • step 202 includes: identifying the number of occurrences of the first text combination in the text information, and using the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that exists in each of the documents to be verified and appears only once.
  • when an image to be verified contains multiple documents to be verified, the text information may include repeated information. Specifically, if the text information corresponding to an image to be verified contains one first text combination, it can be determined that the image to be verified contains one document to be verified; if the text information contains multiple first text combinations, it can be determined that the image to be verified contains as many documents to be verified as there are occurrences of the first text combination.
  • the first text combination may be a text combination that appears in each document to be verified and appears only once. For example, when the document to be verified is an invoice, the first text combination may be "Special Value Added Tax Invoice", "Password Area" or a similar text combination.
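  • The counting rule of step 202 can be illustrated with a short Python sketch. The function name `count_documents` and the sample OCR text are assumptions made for illustration only.

```python
def count_documents(text_info: str, first_text_combination: str) -> int:
    """Return how many documents the OCR text contains.

    Relies on the property stated above: the marker appears exactly once
    in every document, so its occurrence count equals the document count.
    """
    if not first_text_combination:
        raise ValueError("first_text_combination must be non-empty")
    return text_info.count(first_text_combination)


# Invented OCR output for an image holding two invoices.
ocr_text = (
    "Special Value Added Tax Invoice No. 12345678 Payee: A\n"
    "Special Value Added Tax Invoice No. 87654321 Payee: B\n"
)
document_count = count_documents(ocr_text, "Special Value Added Tax Invoice")
```

  • The result is only as reliable as the marker: if OCR misreads the marker in one invoice, that invoice is not counted.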
  • Step 203 When one document to be verified is included, a unique identifier is determined from the text information, the image to be verified is marked with the unique identifier, and the marked image to be verified is used as the target verification image.
  • the unique identifier corresponding to the document to be verified can be found in the text information corresponding to the image to be verified. The unique identifier can then be used to mark the image to be verified, and the marked image to be verified can be used as the target verification image.
  • the unique identifier may be an identifier that corresponds one-to-one with the document to be verified.
  • the unique identifier may be an invoice number, etc.
  • Step 204 When multiple documents to be verified are included, the image to be verified is cropped to obtain a sub-image to be verified corresponding to each document to be verified, the unique identifier corresponding to each sub-image to be verified is determined from the text information, each sub-image to be verified is marked with its unique identifier, and each marked sub-image to be verified is used as the target verification image.
  • when the image to be verified contains multiple documents to be verified, the image to be verified can be cropped to obtain multiple sub-images to be verified, where each sub-image to be verified corresponds to one document to be verified.
  • the unique identifier corresponding to each sub-image to be verified can also be determined from the text information corresponding to the image to be verified. Afterwards, the obtained unique identifier can be used to mark the corresponding sub-image to be verified, and the marked sub-image to be verified can be used as the target verification image.
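  • The marking in steps 203 and 204 might look like the following sketch. The regular expression is a hypothetical pattern that assumes the OCR text renders each invoice number as "No." followed by eight digits, and the pairing assumes identifiers appear in the text in the same order as the sub-images; neither assumption comes from the application.

```python
import re
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical pattern: invoice number rendered as "No. <8 digits>".
INVOICE_NUMBER = re.compile(r"No\.\s*(\d{8})")


@dataclass
class TargetVerificationImage:
    unique_id: str  # e.g. the invoice number
    image: object   # the whole image (step 203) or one cropped sub-image (step 204)


def mark_images(images: Sequence[object], text_info: str) -> List[TargetVerificationImage]:
    """Attach to each image the unique identifier found at the same ordinal position."""
    ids = INVOICE_NUMBER.findall(text_info)
    if len(ids) != len(images):
        raise ValueError(f"found {len(ids)} identifiers for {len(images)} images")
    return [TargetVerificationImage(uid, img) for uid, img in zip(ids, images)]


marked = mark_images(
    ["sub_image_0", "sub_image_1"],
    "Invoice No. 12345678 Payee: A Invoice No. 87654321 Payee: B",
)
```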
  • Step 205 Determine target query information corresponding to each document to be verified from the text information.
  • step 205 includes: dividing the text information into sub-text information corresponding to each of the target verification images, and extracting the target query information corresponding to the document to be verified from each piece of sub-text information.
  • the target query information of the document to be verified can be determined directly from the text information corresponding to the image to be verified.
  • the text information can be divided into sub-text information corresponding to each target verification image, and the target query information corresponding to the document to be verified can be further extracted from each sub-text information.
  • for example, when the image to be verified contains 4 documents to be verified, the text information corresponding to the image to be verified can be divided into 4 pieces of sub-text information, and the target query information corresponding to each document to be verified can then be extracted from each piece of sub-text information.
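  • A minimal sketch of the division and extraction in step 205 follows. It assumes that each document's text begins with a known starting marker and that fields can be matched with simple patterns; `split_text_info`, `extract_target_query_info` and `FIELD_PATTERNS` are hypothetical names, and the field formats are invented for the example.

```python
import re
from typing import Dict, List


def split_text_info(text_info: str, start_marker: str) -> List[str]:
    """Split OCR text into one chunk per document, each chunk starting at the marker."""
    starts = [m.start() for m in re.finditer(re.escape(start_marker), text_info)]
    bounds = starts + [len(text_info)]
    return [text_info[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical field formats; real invoices would need patterns tuned to the OCR output.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"No\.\s*(\d+)"),
    "invoice_amount": re.compile(r"Amount:\s*([\d.]+)"),
}


def extract_target_query_info(sub_text: str) -> Dict[str, str]:
    """Collect every field whose pattern matches inside one document's text."""
    info = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(sub_text)
        if match:
            info[name] = match.group(1)
    return info


text = "VAT Invoice No. 111 Amount: 10.50 Payee: A\nVAT Invoice No. 222 Amount: 7.00 Payee: B"
sub_texts = split_text_info(text, "VAT Invoice")
query_infos = [extract_target_query_info(chunk) for chunk in sub_texts]
```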
  • Step 206 Call the preset document verification interface, input the target query information into the preset document verification interface, and receive the document status information returned by the preset document verification interface, where the document status information includes a registered status and an unregistered status; when the document status information is the registered status, it is determined that the document to be verified has not passed verification; when the document status information is the unregistered status, it is determined that the document to be verified has passed verification.
  • a preset document verification interface can be called.
  • the document to be verified can be verified.
  • the preset document verification interface may be the call interface of the Zhongdeng query system. This application calls the Zhongdeng query system, which is required for the invoice financing business, directly through a third-party interface, which avoids manual cross-system queries by users.
  • the document status information returned after verification can include a registered status and an unregistered status. If the document status information is registered, the document to be verified has already been used to handle the business and can no longer be used for it; in this case it can be determined that the document to be verified has not passed the verification. If the document status information is unregistered, the document to be verified has not yet been used to handle the business and can be used for it; in this case it can be determined that the document to be verified has passed the verification.
  • in the invoice financing scenario, the document to be verified can be an invoice, and the business can be a financing business. If the document status information is registered, the invoice has already been used for financing and cannot be used for financing again; if the document status information is unregistered, the invoice has not been used for financing and can be used for it.
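  • The decision rule of step 206 is sketched below. The real verification interface is not described in enough detail to call, so `fake_interface` is a stand-in written for this example; `DocumentStatus` and `passes_verification` are likewise hypothetical names.

```python
from enum import Enum
from typing import Callable, Mapping


class DocumentStatus(Enum):
    REGISTERED = "registered"      # already used for the business
    UNREGISTERED = "unregistered"  # not yet used for the business


VerificationInterface = Callable[[Mapping[str, str]], DocumentStatus]


def passes_verification(query_info: Mapping[str, str], interface: VerificationInterface) -> bool:
    """A document passes only when the interface reports it as unregistered."""
    return interface(query_info) is DocumentStatus.UNREGISTERED


def fake_interface(query_info: Mapping[str, str]) -> DocumentStatus:
    """Stand-in for the preset document verification interface (assumption)."""
    registered_numbers = {"111"}
    if query_info["invoice_number"] in registered_numbers:
        return DocumentStatus.REGISTERED
    return DocumentStatus.UNREGISTERED


results = {
    number: passes_verification({"invoice_number": number}, fake_interface)
    for number in ("111", "222")
}
```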
  • Step 207 After any of the documents to be verified passes the verification, determine, based on the unique identifier, the target verification image corresponding to that document from the target verification images corresponding to the at least one image to be verified, use that target verification image as the target archive image, and generate the archive information of the business based on the target archive image.
  • the document to be verified can be used to handle the business.
  • the target archive image corresponding to the document to be verified can also be used for archiving.
  • the target archive image is the image corresponding to the document to be verified that needs to be archived.
  • the target verification image corresponding to the unique identifier of the verified document to be verified can be found among the target verification images corresponding to one or more images to be verified, and this target verification image can be used as the target archive image. Then, based on the target archive image, archive information corresponding to this business transaction can be generated.
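  • The lookup in step 207 amounts to selecting images by unique identifier. In the sketch below the target verification images are assumed to be held in a dictionary keyed by unique identifier, and the shape of the archive information is invented, since the application does not specify it.

```python
from typing import Dict, Iterable


def select_target_archive_images(
    target_verification_images: Dict[str, object],
    verified_ids: Iterable[str],
) -> Dict[str, object]:
    """Keep only the marked images whose unique identifier passed verification."""
    return {
        uid: target_verification_images[uid]
        for uid in verified_ids
        if uid in target_verification_images
    }


def generate_archive_info(business_id: str, archive_images: Dict[str, object]) -> dict:
    """Build a simple archive record (hypothetical structure)."""
    return {"business_id": business_id, "archived_documents": sorted(archive_images)}


marked_images = {"111": "sub_image_0", "222": "sub_image_1", "333": "sub_image_2"}
archive_images = select_target_archive_images(marked_images, ["222", "333"])
archive_info = generate_archive_info("financing-001", archive_images)
```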
  • step 204 includes: sequentially determining, from the image to be verified, the positions corresponding to the second text combination and the third text combination, wherein the second text combination is a common ending text combination of the document to be verified and the third text combination is a common starting text combination of the document to be verified; determining the target cropping position based on the position corresponding to each group of the second text combination, the position corresponding to the third text combination, and the preset segmentation ratio; and cropping the image to be verified based on the target cropping position to obtain the sub-image to be verified corresponding to each document to be verified.
  • the position of the second text combination and the position of the third text combination can be determined from the image to be verified in the order second text combination, then third text combination.
  • the second text combination can be a common ending text combination of the document to be verified.
  • for example, when the document to be verified is an invoice, the common ending text combination of each invoice can be "Payee", "Review", "Invoicer", "Seller", etc.; these text combinations are usually in the last line of the invoice.
  • the third text combination can be a common starting text combination of the document to be verified. For example, when the document to be verified is an invoice, the common starting text combination of each invoice
  • can be "Invoice code", "VAT special invoice", "VAT general invoice", etc.; these text combinations are usually in the first line of the invoice.
  • after the position corresponding to the second text combination and the position corresponding to the third text combination are determined, these two positions and the preset segmentation ratio can be used to determine the target cropping position between two documents to be verified.
  • the preset segmentation ratio can be calculated in advance. For example, since the second text combination is located on the last line of a document to be verified and the third text combination is located on the first line of the next document to be verified, an optimal segmentation ratio can be calculated for each of several pairs of adjacent documents to be verified.
  • the preset segmentation ratio is then determined from these optimal segmentation ratios, to ensure that when two consecutive documents to be verified are split according to the preset segmentation ratio,
  • both documents to be verified remain complete.
  • the image to be verified can be cropped according to the target cropping position, and then the sub-image to be verified corresponding to each document to be verified can be obtained.
  • This application uses the position of the second text combination in the image to be verified, the position of the third text combination in the image to be verified, and the preset segmentation ratio to crop an image to be verified that contains multiple documents to be verified. In this way, the sub-image to be verified corresponding to each document to be verified can be determined simply and conveniently, making it easier to accurately find the image for archiving later.
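  • One possible reading of the cropping rule is sketched below. The application does not give a formula for the preset segmentation ratio, so this example assumes the ratio is the fraction of the vertical gap between the ending text of one document and the starting text of the next at which the cut is placed, with documents stacked vertically. `target_crop_positions` and `crop_bands` are hypothetical names, and the returned bands are pixel row ranges that could be passed to any image library's crop call.

```python
from typing import List, Sequence, Tuple


def target_crop_positions(
    end_positions: Sequence[int],
    start_positions: Sequence[int],
    split_ratio: float,
) -> List[int]:
    """Place one horizontal cut inside each gap between consecutive documents.

    end_positions[i] is the y coordinate of the ending text of document i;
    start_positions[i] is the y coordinate of the starting text of document i + 1.
    """
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must lie between 0 and 1")
    return [
        round(end_y + split_ratio * (start_y - end_y))
        for end_y, start_y in zip(end_positions, start_positions)
    ]


def crop_bands(image_height: int, cuts: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn cut positions into (top, bottom) row ranges, one per document."""
    edges = [0, *cuts, image_height]
    return list(zip(edges, edges[1:]))


# Three stacked invoices in a 1500-pixel-high image (invented coordinates).
cuts = target_crop_positions([480, 980], [520, 1020], split_ratio=0.5)
bands = crop_bands(1500, cuts)
```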
  • if the image to be verified contains two documents to be verified, the segmentation can be performed by determining the positions of one set of the second text combination and the third text combination in the image to be verified; if the image to be verified contains three documents to be verified, only the positions of two sets of the second text combination and the third text combination need to be determined to perform the segmentation, and so on. Therefore, after determining the number of documents to be verified corresponding to each image to be verified, the positions of the second text combination and the third text combination can be determined from the text information of the image to be verified based on this number. Once the required number of positions of the second text combination and the third text combination have been located, no further positions need to be determined.
  • the embodiment of the present application uses the number of documents to be verified as the stopping condition:
  • when the number of determined positions of the second text combination and the third text combination reaches the number corresponding to the number of documents to be verified, position determination can be stopped, which effectively reduces the time spent on position determination and improves its efficiency.
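  • The early stop described above can be sketched as a scan that ends as soon as the expected number of positions has been found. The OCR result is assumed here to be a list of (text, y coordinate) pairs in top-to-bottom order, and the caller supplies the expected count derived from the number of documents; both are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def find_marker_positions(
    ocr_lines: Sequence[Tuple[str, int]],
    marker: str,
    expected_count: int,
) -> List[int]:
    """Collect y coordinates of lines containing the marker, stopping early."""
    positions: List[int] = []
    if expected_count <= 0:
        return positions
    for text, y in ocr_lines:
        if marker in text:
            positions.append(y)
            if len(positions) == expected_count:
                break  # enough positions located; skip the rest of the text
    return positions


ocr_lines = [
    ("Invoice code 1", 10), ("Payee: A", 480),
    ("Invoice code 2", 520), ("Payee: B", 980),
    ("Invoice code 3", 1020), ("Payee: C", 1480),
]
# Three documents need two cuts, so two ending-text positions suffice.
end_positions = find_marker_positions(ocr_lines, "Payee", expected_count=2)
```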
  • the method further includes: obtaining a list of document numbers, wherein the list of document numbers includes the document number corresponding to each of the documents to be verified; accordingly, the target query information includes the target document number; after step 205, the method further includes: identifying the target document number in the target query information, and removing the target document number from the document number list to obtain a missing document list.
  • a list of document numbers can first be acquired.
  • the document number list may include the document number corresponding to each document to be verified, where the document number and the document to be verified are also in one-to-one correspondence.
  • the document number can be the invoice number.
  • when the target query information includes the target document number,
  • the target document number can be determined from the target query information and removed from the document number list, so that the document number list no longer includes it. After the target document numbers contained in the target query information of all documents to be verified have been removed from the document number list, the missing document list is obtained.
  • the document numbers included in the missing document list can be the document numbers of the documents to be verified whose text information failed to be recognized.
  • a document number list containing the document numbers of all documents to be verified is obtained first, and the document numbers of the recognized documents to be verified are then removed from it. The document numbers remaining in the list are those of the documents to be verified that could not be recognized, or could not be recognized accurately, during the OCR process.
  • in this way, the documents to be verified that were missed can be located quickly and accurately, improving the efficiency of determining missed documents to be verified.
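  • The missing document list is a set difference between the expected document numbers and the recognized ones. A minimal sketch, with `missing_document_list` as a hypothetical name and invented numbers:

```python
from typing import Iterable, List


def missing_document_list(
    document_numbers: Iterable[str],
    recognized_numbers: Iterable[str],
) -> List[str]:
    """Return the expected document numbers that OCR did not yield, in original order."""
    recognized = set(recognized_numbers)
    return [number for number in document_numbers if number not in recognized]


expected_numbers = ["111", "222", "333", "444"]
recognized_numbers = ["222", "444"]  # target document numbers found in the query information
missing = missing_document_list(expected_numbers, recognized_numbers)
```

  • A number that OCR misreads also ends up in the list, because the misread value does not match any expected number; this matches the description of documents that cannot be recognized accurately.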
  • an embodiment of the present application provides a business processing device based on image recognition, as shown in Figure 3.
  • the device includes:
  • a text recognition module configured to obtain at least one image to be verified, perform text recognition on the at least one image to be verified, and obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
  • a verification module configured to determine the target query information corresponding to each of the documents to be verified from the text information, and determine whether the document to be verified passes the verification based on the target query information;
  • an archiving module configured to determine, after any of the documents to be verified passes the verification, a target archive image corresponding to that document based on the at least one image to be verified, and to generate archive information of the business based on the target archive image.
  • the device also includes:
  • a quantity determination module configured to determine the number of documents to be verified contained in the image to be verified based on the text information after obtaining the text information corresponding to each of the images to be verified;
  • a marking module configured to: when one document to be verified is contained, determine a unique identifier from the text information, mark the image to be verified with the unique identifier, and use the marked image to be verified as the target verification image;
  • and, when multiple documents to be verified are contained, crop the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determine from the text information the unique identifier corresponding to each sub-image to be verified, mark each sub-image to be verified with its unique identifier, and use each marked sub-image to be verified as the target verification image.
  • the archiving module is configured to: determine, based on the unique identifier, the target verification image corresponding to any of the documents to be verified from the target verification images corresponding to the at least one image to be verified, and use it as the target archive image.
  • the quantity determination module is configured to: identify the number of occurrences of the first text combination in the text information, and use the number of occurrences as the number of documents to be verified contained in the image to be verified,
  • wherein the first text combination is a text combination that exists in each of the documents to be verified and appears only once.
  • the marking module includes:
  • a first position determination unit configured to sequentially determine, from the image to be verified, the positions corresponding to the second text combination and the third text combination, wherein the second text combination is a common ending text combination of the documents to be verified, and the third text combination is a common starting text combination of the documents to be verified;
  • a second position determination unit configured to determine the target cropping position based on the positions corresponding to each group of the second text combination and the third text combination, and on the preset segmentation ratio;
  • a cropping unit configured to crop the image to be verified based on the target cropping position to obtain the sub-image to be verified corresponding to each of the documents to be verified.
  • the verification module is configured to: divide the text information into sub-text information corresponding to each of the target verification images, and extract the target query information corresponding to the document to be verified from each piece of sub-text information.
  • the verification module includes:
  • an interface calling unit configured to call a preset document verification interface, input the target query information into the preset document verification interface, and receive the document status information returned by the interface, wherein the document status information includes a registered status and an unregistered status;
  • a judgment unit configured to determine that the document to be verified has not passed verification when the document status information is in a registered state; and to determine that the document to be verified has passed verification when the document status information is in an unregistered state.
  • the device also includes:
  • a list acquisition module configured to obtain a list of document numbers before acquiring at least one image to be verified, wherein the list of document numbers includes a document number corresponding to each of the documents to be verified;
  • the target query information includes the target document number; the device further includes:
  • an elimination module configured to, after the target query information corresponding to each document to be verified is determined from the text information, identify the target document number in the target query information and eliminate that target document number from the list of document numbers to obtain a list of missing documents.
  • embodiments of the present application also provide a storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the business processing method based on image recognition shown in Figures 1 to 2 above is implemented.
  • the technical solution of the present application can be embodied in the form of a software product.
  • the software product can be stored in a computer-readable storage medium.
  • the computer-readable storage medium can be a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or a volatile storage medium.
  • the computer-readable storage medium includes a number of instructions that enable a computer device (which can be a personal computer, a server, a network device, etc.) to execute the methods described in each implementation scenario of this application.
  • embodiments of the present application also provide a computer device, which can be a personal computer, a server, a network device, etc.
  • the computer equipment includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the above-mentioned business processing method based on image recognition as shown in Figures 1 to 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a Wi-Fi module, etc.
  • the user interface may include a display screen, an input unit such as a keyboard, etc.
  • optionally, the user interface may also include a USB interface, a card reader interface, etc.
  • optionally, the network interface may include standard wired interfaces, wireless interfaces (such as Bluetooth and Wi-Fi interfaces), etc.
  • the structure of the computer device described here does not constitute a limitation on the computer device, which may include more or fewer components, combine certain components, or arrange the components differently.
  • the storage medium may also include an operating system and a network communication module.
  • An operating system is a program that manages and saves the hardware and software resources of a computer device and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to implement communication between components within the storage medium, as well as communication with other hardware and software in the physical device.
  • one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified.
  • text recognition can be performed on each image to be verified, and the text information corresponding to each image to be verified can be obtained.
  • when the image to be verified contains a single document to be verified, the target query information corresponding to that document can be determined directly from the text information corresponding to the image to be verified; when the image to be verified contains multiple documents to be verified, the target query information corresponding to each document can be determined separately from the text information corresponding to the image to be verified. After the target query information is determined, each document to be verified can be verified based on its own target query information to determine whether it passes verification. A document to be verified that has passed verification can subsequently be used to handle the related business.
  • documents to be verified can also be archived.
  • the target archive image corresponding to the verified document to be verified can be found from the obtained image to be verified, and then the archive information of the business can be generated based on the target archive image.
  • the embodiments of this application can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
  • the accompanying drawing is only a schematic diagram of a preferred implementation scenario, and the modules or processes in the accompanying drawing are not necessarily required for implementing the present application.
  • the modules of a device in an implementation scenario can be distributed within that device as described for the implementation scenario, or can be changed accordingly and located in one or more devices different from those of the implementation scenario.
  • the modules of the above implementation scenarios can be combined into one module or further split into multiple sub-modules.
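
The judgment unit and the elimination module described in the list above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `DocumentStatus`, `verify_document`, `missing_documents` and the injected `call_interface` callable are hypothetical, and the real document verification interface is not specified in the text.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, List


class DocumentStatus(Enum):
    """The two statuses the text says the verification interface returns."""
    REGISTERED = "registered"
    UNREGISTERED = "unregistered"


def passes_verification(status: DocumentStatus) -> bool:
    # Per the text: a registered document fails, an unregistered one passes.
    return status is DocumentStatus.UNREGISTERED


def verify_document(
    query_info: Dict[str, str],
    call_interface: Callable[[Dict[str, str]], DocumentStatus],
) -> bool:
    # call_interface stands in for the preset document verification
    # interface (hypothetical signature, injected so it can be faked).
    return passes_verification(call_interface(query_info))


def missing_documents(
    expected_numbers: Iterable[str], recognized_numbers: Iterable[str]
) -> List[str]:
    # Eliminate every recognized document number from the expected list;
    # whatever remains is the list of missing documents.
    recognized = set(recognized_numbers)
    return [number for number in expected_numbers if number not in recognized]
```

Passing the interface in as a callable keeps the sketch independent of any particular registry system, so a fake can be used while testing.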

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Character Input (AREA)

Abstract

The present application relates to the technical field of computers. Disclosed are an image recognition-based service processing method and apparatus, a storage medium, and a computer device. The method comprises: acquiring at least one image to be verified, and performing character recognition on said at least one image to obtain character information corresponding to each image to be verified, each image to be verified comprising at least one invoice to be verified; determining from the character information target query information corresponding to each invoice to be verified, and on the basis of the target query information, determining whether the invoice to be verified has passed verification; and when any invoice to be verified has passed verification, on the basis of the at least one image to be verified, determining a target archived image corresponding to said invoice to be verified, and generating archived information of a service on the basis of the target archived image. The present application can simplify a verification process of an invoice to be verified, thus greatly improving verification and archiving efficiency, and improving experience of a user.

Description

Business processing method and device based on image recognition, and storage medium
This application claims priority to Chinese patent application No. 202210821064.2, filed with the China Patent Office on July 13, 2022 and entitled "Business processing method and device based on image recognition, and storage medium", the entire content of which is incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to a business processing method and device based on image recognition, a storage medium, and a computer device.
Background
At present, many businesses require documents to be verified during processing. For example, when an invoice is used for financing, the invoice must be verified, and it can be used for the current financing only after verification shows that it has not previously been used for financing. In the prior art, however, verifying invoices before financing usually requires the user to log in offline to an invoice verification system such as the Zhongdeng system and enter invoice-related information multiple times, which makes the verification process cumbersome. The inventor realized that, after an invoice passes verification, the relevant information of the verified invoice must be manually uploaded to the financing system to handle the financing business, and the user must pick out the invoices used for the financing business from among many invoices and upload them for archiving. The whole process involves a heavy workload and low efficiency, and the user experience is poor.
Summary of the invention
In view of this, the present application provides a business processing method and device based on image recognition, a storage medium, and a computer device, which can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
According to one aspect of the present application, a business processing method based on image recognition is provided, including:
obtaining at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
determining, from the text information, target query information corresponding to each document to be verified, and determining whether the document to be verified passes verification based on the target query information;
after any document to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to that document, and generating archive information of the business based on the target archive image.
According to another aspect of the present application, a business processing device based on image recognition is provided, including:
a text recognition module, configured to obtain at least one image to be verified, and perform text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
a verification module, configured to determine, from the text information, target query information corresponding to each document to be verified, and determine whether the document to be verified passes verification based on the target query information;
an archiving module, configured to, after any document to be verified passes verification, determine, based on the at least one image to be verified, a target archive image corresponding to that document, and generate archive information of the business based on the target archive image.
According to yet another aspect of the present application, a storage medium is provided, on which a computer program is stored. When the program is executed by a processor, the following steps are implemented:
obtaining at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
determining, from the text information, target query information corresponding to each document to be verified, and determining whether the document to be verified passes verification based on the target query information;
after any document to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to that document, and generating archive information of the business based on the target archive image.
According to still another aspect of the present application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the program, the following steps are implemented:
obtaining at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
determining, from the text information, target query information corresponding to each document to be verified, and determining whether the document to be verified passes verification based on the target query information;
after any document to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to that document, and generating archive information of the business based on the target archive image.
With the above technical solution, the business processing method and device based on image recognition, the storage medium, and the computer device provided by this application work as follows. First, one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified. Text recognition can then be performed on each image to be verified to obtain the text information corresponding to it. When an image to be verified contains a single document to be verified, the target query information corresponding to that document can be determined directly from the text information of the image; when an image to be verified contains multiple documents to be verified, the target query information corresponding to each document can be determined separately from the text information of the image. Once the target query information is determined, each document to be verified can be verified based on its own target query information to determine whether it passes verification. A document that passes verification can subsequently be used to handle the related business. During business processing, documents to be verified can also be archived in order to complete the business-related information. Specifically, the target archive image corresponding to the verified document can be found among the obtained images to be verified, and the archive information of the business can then be generated based on that target archive image. The embodiments of this application can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
The above description is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features and advantages of the present application are more apparent and easier to understand, specific embodiments of the present application are set out below.
Description of the drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of it. In the drawings:
Figure 1 shows a schematic flow chart of a business processing method based on image recognition provided by an embodiment of the present application;
Figure 2 shows a schematic flow chart of another business processing method based on image recognition provided by an embodiment of the present application;
Figure 3 shows a schematic structural diagram of a business processing device based on image recognition provided by an embodiment of the present application.
Detailed description
The present application is described in detail below with reference to the accompanying drawings and in combination with the embodiments. It should be noted that, as long as there is no conflict, the embodiments of this application and the features in the embodiments can be combined with each other.
This embodiment provides a business processing method based on image recognition. As shown in Figure 1, the method includes:
Step 101: Obtain at least one image to be verified, and perform text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified.
The business processing method based on image recognition provided by the embodiment of this application can be applied to invoice financing scenarios. First, one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified. Here, the document to be verified can be an invoice, and the image to be verified can be a picture or PDF containing a single invoice; when multiple invoices are pasted into one PDF file or appear in the same picture, the image to be verified can be a picture or PDF containing multiple invoices. When the image to be verified is a picture, it can be obtained by photographing a paper invoice with a camera; when the image to be verified is a PDF, it can be obtained by scanning with a scanner or downloaded directly from an electronic invoice website. After one or more images to be verified are obtained, text recognition can be performed on each of them to obtain the corresponding text information. Specifically, this can be implemented with OCR technology. OCR (Optical Character Recognition) is a computer input technology that converts the text of bills, newspapers, books, manuscripts and other printed matter into image information through optical input such as scanning, and then uses text recognition technology to convert that image information into text a computer can use.
Step 102: Determine, from the text information, the target query information corresponding to each document to be verified, and determine whether the document to be verified passes verification based on the target query information.
In this embodiment, after the text information corresponding to each image to be verified is obtained, when an image to be verified contains a single document to be verified, the target query information corresponding to that document can be determined directly from the text information of the image; when an image to be verified contains multiple documents to be verified, the target query information corresponding to each document can be determined separately from the text information of the image. Once the target query information is determined, each document to be verified can be verified based on its own target query information to determine whether it passes verification. Here, when the document to be verified is an invoice, the target query information can be the invoice name, company name, trade contract name, invoice number, invoice amount, invoice date, tax-inclusive amount, tax-exclusive amount, special invoice, ordinary invoice, seller name, buyer name, and so on, selected according to the actual situation. When a more precise match is desired, several of the above items can be used together as the target query information. This application can obtain the target query information corresponding to each document to be verified directly from the obtained images to be verified and automatically input it into the document verification system. The user no longer needs to log in to the document verification system offline and manually enter the target query information multiple times, which greatly simplifies the verification process.
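
To make the extraction in step 102 concrete, the following is a minimal Python sketch of pulling a few query fields out of recognized text with regular expressions. The field labels (`Invoice No`, `Date`, `Amount`), their formats, and the function name `extract_query_info` are assumptions chosen for illustration; the source does not specify how the fields are located, and real invoices would need patterns matched to their actual layout and language.

```python
import re
from typing import Dict, Optional

# Hypothetical label patterns; real invoice text will differ.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*No\.?\s*[:：]?\s*(\d{8,20})"),
    "invoice_date": re.compile(r"Date\s*[:：]?\s*(\d{4}-\d{2}-\d{2})"),
    "invoice_amount": re.compile(r"Amount\s*[:：]?\s*([\d,]+\.\d{2})"),
}


def extract_query_info(text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each field, or None when it is absent."""
    info: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        info[name] = match.group(1) if match else None
    return info


if __name__ == "__main__":
    sample = "Invoice No: 12345678\nDate: 2022-07-13\nAmount: 1,234.50"
    print(extract_query_info(sample))
```

Returning `None` for a field that was not recognized lets the caller decide whether the remaining fields are enough to query with, which matches the text's point that one or several fields may be used.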
Step 103: After any document to be verified passes verification, determine, based on the at least one image to be verified, the target archive image corresponding to that document, and generate archive information of the business based on the target archive image.
In this embodiment, a document to be verified that has passed verification can subsequently be used to handle the related business. During business processing, documents to be verified can also be archived in order to complete the business-related information. Specifically, the target archive image corresponding to the verified document can be found among the obtained images to be verified, and the archive information of the business can then be generated based on that target archive image. When the images to be verified include multiple documents to be verified and not all of them pass verification, that is, not all of them can be used for business processing, manually finding the verified documents among multiple images to be verified, cutting them out of the images, and uploading them for archiving takes a great deal of time, is inefficient, and gives a poor user experience. The embodiment of the present application can automatically determine, from multiple images to be verified, the documents to be used for business processing, which is simple, convenient and efficient, and helps improve the user experience. For example, in an invoice financing scenario, the user usually has to find the verified documents (that is, invoices) among many images to be verified and then upload them manually, whereas in this application this series of operations can be performed automatically, which greatly improves archiving efficiency and accuracy.
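
The selection of target archive images in step 103 can be sketched in Python as below. This is only an illustration under stated assumptions: each image (or cropped sub-image) is assumed to have already been tagged with the unique identifier of the document it shows, and the names `TaggedImage`, `build_archive_info`, `business_id` and the shape of the returned record are hypothetical rather than taken from the source.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TaggedImage:
    """An image or cropped sub-image marked with a document's unique identifier."""
    document_id: str
    image_path: str


def build_archive_info(
    business_id: str,
    tagged_images: Iterable[TaggedImage],
    verified_ids: Iterable[str],
) -> Dict[str, object]:
    # Keep only the images whose document passed verification.
    verified = set(verified_ids)
    targets: List[TaggedImage] = [
        image for image in tagged_images if image.document_id in verified
    ]
    # The archive record layout is an assumption made for this sketch.
    return {
        "business_id": business_id,
        "archived_images": [
            {"document_id": image.document_id, "image_path": image.image_path}
            for image in targets
        ],
    }
```

Looking images up by identifier is what removes the manual step of searching through every uploaded image for the documents that passed.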
By applying the technical solution of this embodiment, first, one or more images to be verified can be obtained, and each image to be verified can contain one or more documents to be verified. Text recognition can then be performed on each image to be verified to obtain the text information corresponding to it. When an image to be verified contains a single document to be verified, the target query information corresponding to that document can be determined directly from the text information of the image; when an image to be verified contains multiple documents to be verified, the target query information corresponding to each document can be determined separately from the text information of the image. Once the target query information is determined, each document to be verified can be verified based on its own target query information to determine whether it passes verification. A document that passes verification can subsequently be used to handle the related business. During business processing, documents to be verified can also be archived in order to complete the business-related information. Specifically, the target archive image corresponding to the verified document can be found among the obtained images to be verified, and the archive information of the business can then be generated based on that target archive image. The embodiments of this application can simplify the verification process of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
进一步的,作为上述实施例具体实施方式的细化和扩展,为了完整说明本实施例的具体实施过程,提供了另一种基于图像识别的业务处理方法,如图2所示,该方法包括:Further, as a refinement and expansion of the specific implementation of the above embodiment, in order to completely explain the specific implementation process of this embodiment, another business processing method based on image recognition is provided. As shown in Figure 2, the method includes:
Step 201: Obtain at least one image to be verified, and perform text recognition on the at least one image to be verified to obtain the text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified.
In this embodiment, one or more images to be verified are first obtained, and each image to be verified may contain one or more documents to be verified. Here, the document to be verified may be an invoice, and the image to be verified may be a picture or PDF consisting of a single invoice. When multiple invoices are pasted into one PDF file, or multiple invoices appear in the same picture, the image to be verified may be a picture or PDF consisting of multiple invoices. When the image to be verified is a picture, it may be obtained by photographing a paper invoice with a camera; when the image to be verified is a PDF, it may be obtained by scanning with a scanner, or downloaded directly from an electronic invoice website. After one or more images to be verified are obtained, text recognition can be performed on each of them to obtain the text information corresponding to each image to be verified.
Step 202: Based on the text information, determine the number of documents to be verified contained in the image to be verified.
In the embodiment of the present application, optionally, step 202 includes: identifying, from the text information, the number of occurrences of a first text combination, and using the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that is present in each document to be verified and appears only once in it.
In this embodiment, after the text information corresponding to each image to be verified is obtained, if an image to be verified includes multiple documents to be verified, its text information may contain repeated information. Specifically, if the text information corresponding to an image to be verified contains the first text combination once, it can be determined that the image contains one document to be verified; if the text information contains the first text combination multiple times, it can be determined that the number of documents to be verified in the image equals the number of occurrences of the first text combination. Here, the first text combination may be a text combination that appears in every document to be verified and appears only once in each. For example, when the document to be verified is an invoice, the first text combination may be "增值税专用发票" (special VAT invoice), "密码区" (password area), or a similar text combination.
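The counting rule in step 202 can be illustrated with a minimal Python sketch. The function name `count_documents`, the default marker, and the whitespace stripping (to tolerate spaces that OCR may insert between characters) are assumptions made for illustration; the source only states that occurrences of the first text combination are counted.

```python
def count_documents(ocr_text: str, marker: str = "密码区") -> int:
    """Count documents in one image by counting a once-per-document marker.

    `marker` plays the role of the "first text combination"; the default
    value is one of the examples given in the text.
    """
    # Assumption: remove all whitespace so "密 码 区" still matches.
    compact = "".join(ocr_text.split())
    return compact.count(marker)


sample = "增值税专用发票 密码区 收款人 增值税专用发票 密 码 区 收款人"
document_count = count_documents(sample)
```

The marker has to be chosen so that it really appears exactly once per document; a phrase that can recur in line items would inflate the count.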
Step 203: When the image to be verified contains one document to be verified, determine a unique identifier from the text information, mark the image to be verified with the unique identifier, and use the marked image to be verified as a target verification image.
In this embodiment, if the image to be verified contains only one document to be verified, the unique identifier corresponding to that document is found in the text information corresponding to the image. The image to be verified is then marked with the unique identifier, and the marked image is used as the target verification image. The unique identifier may be an identifier in one-to-one correspondence with the document to be verified; when the document to be verified is an invoice, the unique identifier may be the invoice number or the like.
Step 204: When the image to be verified contains multiple documents to be verified, crop the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determine from the text information the unique identifier corresponding to each sub-image to be verified, mark each sub-image to be verified with its unique identifier, and use each marked sub-image to be verified as a target verification image.
In this embodiment, if the image to be verified contains multiple documents to be verified, the image can be cropped to obtain multiple sub-images to be verified, each of which corresponds to one document to be verified. In addition, the unique identifier corresponding to each sub-image to be verified can be determined from the text information corresponding to the image to be verified. Each sub-image to be verified is then marked with its unique identifier, and the marked sub-images are used as target verification images.
Step 205: Determine, from the text information, the target query information corresponding to each document to be verified.
In the embodiment of the present application, optionally, when the image to be verified contains multiple documents to be verified, step 205 includes: dividing the text information into sub-text information corresponding to each target verification image, and extracting, from each piece of sub-text information, the target query information corresponding to the respective document to be verified.
In this embodiment, when the image to be verified contains only one document to be verified, the target query information of that document can be determined directly from the text information corresponding to the image. When the image to be verified contains multiple documents to be verified, the text information can be divided into sub-text information corresponding to each target verification image, and the target query information corresponding to each document to be verified is then extracted from the respective sub-text information. For example, if the image to be verified contains four documents to be verified, the text information corresponding to the image can be divided into four pieces of sub-text information, and the target query information corresponding to each document is extracted from its own piece.
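The source does not say how the text information is divided or which fields make up the target query information. The sketch below is therefore only one possible reading: it assumes the text is split at each occurrence of a start-of-document marker ("发票代码" is used here) and that the query information is an invoice number following the label "发票号码". The names `split_sub_texts`, `extract_query_info`, and the regular expression are hypothetical.

```python
import re

START_MARKER = "发票代码"  # assumed start-of-document marker
NUMBER_PATTERN = re.compile(r"发票号码[:：]?\s*(\d+)")  # assumed field layout


def split_sub_texts(ocr_text: str, start_marker: str = START_MARKER) -> list[str]:
    """Split the OCR text of one image into one sub-text per document."""
    starts = [m.start() for m in re.finditer(re.escape(start_marker), ocr_text)]
    if not starts:
        return [ocr_text]  # no marker found: treat the text as one document
    # Each sub-text runs from one marker to the next (or to the end).
    # Any text before the first marker is dropped.
    ends = starts[1:] + [len(ocr_text)]
    return [ocr_text[s:e] for s, e in zip(starts, ends)]


def extract_query_info(sub_text: str):
    """Return the assumed query fields for one document, or None if not found."""
    match = NUMBER_PATTERN.search(sub_text)
    return {"invoice_number": match.group(1)} if match else None


text = "发票代码 011 发票号码: 12345678 收款人 发票代码 022 发票号码: 87654321 收款人"
queries = [extract_query_info(part) for part in split_sub_texts(text)]
```

Returning `None` for a sub-text with no recognizable number keeps recognition failures visible, which matters for the missing-document check described later in the text.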
Step 206: Call a preset document verification interface, input the target query information into the preset document verification interface, and receive the document status information returned by the preset document verification interface, wherein the document status information includes a registered status and an unregistered status; when the document status information is the registered status, determine that the document to be verified has not passed verification; when the document status information is the unregistered status, determine that the document to be verified has passed verification.
In this embodiment, after the target query information corresponding to each document to be verified is determined, a preset document verification interface can be called to verify the documents. After the interface is called, the target query information corresponding to each document to be verified is input into it, and the document status information it returns is received. For example, when the document to be verified is an invoice, the preset document verification interface may be the call interface of the Zhongdeng (中登) query system. This application calls the Zhongdeng query system required for invoice financing business directly through a third-party interface, so that users no longer need to query manually across systems and then manually enter the verification information into the financing system to handle the financing business, which can greatly improve verification efficiency. The document status information returned after verification may be a registered status or an unregistered status. If the status is registered, the document to be verified has already been used to handle the business and cannot be used for it again, so it is determined that the document has not passed verification; if the status is unregistered, the document to be verified has not been used to handle the business and can be used for it, so it is determined that the document has passed verification. For example, in an invoice financing scenario, the document to be verified may be an invoice and the business may be a financing business. If the document status information is registered, the invoice has already been used for financing and cannot be used for financing again; if the document status information is unregistered, the invoice has not been used for financing and can be used for it.
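The pass/fail rule of step 206 can be sketched as follows. The real verification interface is not described in the source, so it is represented here by a caller-supplied function; `DocStatus`, `passes_verification`, and `stub_query` are hypothetical names, and the stub stands in for the remote query only for illustration.

```python
from enum import Enum
from typing import Callable


class DocStatus(Enum):
    REGISTERED = "registered"      # already used for this business
    UNREGISTERED = "unregistered"  # not yet used for this business


def passes_verification(query_info: dict,
                        query_status: Callable[[dict], DocStatus]) -> bool:
    """A document passes only when the interface reports it as unregistered."""
    return query_status(query_info) is DocStatus.UNREGISTERED


# Stand-in for the preset document verification interface (assumption).
ALREADY_REGISTERED = {"12345678"}


def stub_query(query_info: dict) -> DocStatus:
    if query_info["invoice_number"] in ALREADY_REGISTERED:
        return DocStatus.REGISTERED
    return DocStatus.UNREGISTERED


results = {
    number: passes_verification({"invoice_number": number}, stub_query)
    for number in ("12345678", "87654321")
}
```

Passing the query function as a parameter keeps the decision rule testable without network access; a production caller would also need to handle timeouts and error responses, which the source does not discuss.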
Step 207: After any document to be verified passes verification, determine, based on the unique identifier, the target verification image corresponding to that document from among the target verification images corresponding to the at least one image to be verified, use it as the target archive image, and generate the archive information of the business based on the target archive image.
In this embodiment, if a document to be verified passes verification, the document can be used to handle the business, and after the business is handled, the target archive image corresponding to the document can be used for archiving. Here, the target archive image is the image corresponding to the document to be verified that needs to be archived. Specifically, since each target verification image is marked with a unique identifier, the target verification image matching the unique identifier of the verified document can be found among the target verification images corresponding to the one or more images to be verified, and that target verification image is used as the target archive image. The archive information for this business transaction can then be generated based on the target archive image. Archiving the image corresponding to a document to be verified is necessary when handling business. For example, in an invoice financing scenario, archiving the images corresponding to the invoices makes the business handling records more complete and helps in responding to unscheduled spot checks by external financial regulators.
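The lookup in step 207 amounts to selecting images by their unique identifier. In the sketch below the marked target verification images are modeled as a mapping from unique identifier to image path, and the archive information as a small dictionary; both representations, and the name `build_archive_info`, are assumptions, since the source does not define the archive format.

```python
def build_archive_info(business_id: str,
                       target_images: dict[str, str],
                       verified_ids: list[str]) -> dict:
    """Collect the target archive images for documents that passed verification.

    `target_images` maps each unique identifier (e.g. an invoice number)
    to the marked target verification image, here represented by a path.
    """
    archived = {
        doc_id: target_images[doc_id]
        for doc_id in verified_ids
        if doc_id in target_images  # skip identifiers with no marked image
    }
    return {"business_id": business_id, "archive_images": archived}


target_images = {"12345678": "scan_1_part_1.png", "87654321": "scan_1_part_2.png"}
archive_info = build_archive_info("F-001", target_images, ["87654321"])
```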
In the embodiment of the present application, optionally, "cropping the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified" in step 204 includes: determining, in sequence, the positions corresponding to a second text combination and a third text combination in the image to be verified, wherein the second text combination is a common ending text combination of the documents to be verified and the third text combination is a common starting text combination of the documents to be verified; determining a target cropping position based on the position corresponding to each group's second text combination, the position corresponding to its third text combination, and a preset division ratio; and cropping the image to be verified based on the target cropping position to obtain the sub-image to be verified corresponding to each document to be verified.
In this embodiment, when the image to be verified is cropped, the position of the second text combination and the position of the third text combination can be determined in the image in the order second text combination, then third text combination. Here, the second text combination may be a common ending text combination of the documents to be verified. For example, when the document to be verified is an invoice, the common ending text combination may be "收款人" (payee), "复核" (reviewer), "开票人" (issuer), "销售方" (seller), and so on, which usually appear on the last line of an invoice. The third text combination may be a common starting text combination of the documents to be verified. For example, when the document to be verified is an invoice, the common starting text combination may be "发票代码" (invoice code), "增值税专用发票" (special VAT invoice), "增值税普通发票" (ordinary VAT invoice), and so on, which usually appear on the first line of an invoice. Once the positions corresponding to the second and third text combinations are determined, the target cropping position between two documents to be verified can be determined from these two positions and the preset division ratio. The preset division ratio may be calculated in advance. For example, given that the second text combination is on the last line of one document and the third text combination is on the first line of the next, an optimal division ratio can be calculated for each of several different distances between two documents, and the preset division ratio is determined from these optimal ratios so that dividing two consecutive documents at this ratio leaves both documents complete. After the target cropping position is determined, the image to be verified is cropped at that position to obtain the sub-image to be verified corresponding to each document to be verified. By determining the position of the second text combination in the image to be verified, the position of the third text combination in the image to be verified, and the preset division ratio, and then cropping an image that contains multiple documents to be verified, this application can determine the sub-image corresponding to each document simply and conveniently, which makes it easier to locate the image used for archiving accurately later.
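The source does not give a formula for applying the preset division ratio. One simple reading, assumed here, is linear interpolation between the vertical position of the ending text of one document and the starting text of the next. The functions `cut_positions` and `crop_ranges`, the pixel coordinates, and the ratio value 0.5 are all hypothetical.

```python
def cut_positions(pairs: list[tuple[int, int]], ratio: float = 0.5) -> list[int]:
    """Compute one cut line per (end_y, start_y) pair.

    end_y:   y-coordinate of the ending text of the upper document
    start_y: y-coordinate of the starting text of the lower document
    ratio:   assumed preset division ratio in [0, 1]; 0 cuts at end_y,
             1 cuts at start_y
    """
    return [round(end_y + ratio * (start_y - end_y)) for end_y, start_y in pairs]


def crop_ranges(image_height: int, cuts: list[int]) -> list[tuple[int, int]]:
    """Turn cut lines into (top, bottom) row ranges, one per document."""
    edges = [0, *cuts, image_height]
    return list(zip(edges[:-1], edges[1:]))


# Three stacked invoices in a 1600-pixel-high image (made-up coordinates).
cuts = cut_positions([(480, 560), (1040, 1100)], ratio=0.5)
ranges = crop_ranges(1600, cuts)
```

Each `(top, bottom)` range can then be passed to whatever image library is in use to cut out the full-width strip for that document.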
If the image to be verified contains two documents to be verified, it can be divided once the positions of one group of second text combination and third text combination are determined in the image; if it contains three documents to be verified, it can be divided once the positions of two such groups are determined, and so on. Therefore, after the number of documents to be verified in each image is determined, the positions of the second and third text combinations can be located in the text information of the image according to that number, and the search can stop once the required number of groups, one fewer than the number of documents, has been located. In this way, the embodiment of the present application uses the number of documents to be verified as a stopping condition, which effectively reduces the time spent on position determination and improves its efficiency.
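The early-stop idea can be sketched as a single pass over OCR lines that ends as soon as enough ending/starting pairs are found. The sketch assumes that n documents need n - 1 pairs, consistent with the two- and three-document examples in the text, and that OCR output is available as `(text, y)` tuples sorted from top to bottom; the function name and marker defaults are hypothetical.

```python
def find_marker_pairs(lines: list[tuple[str, int]],
                      doc_count: int,
                      end_marker: str = "收款人",
                      start_marker: str = "发票代码") -> list[tuple[int, int]]:
    """Locate (end_y, start_y) pairs, stopping once doc_count - 1 are found."""
    pairs: list[tuple[int, int]] = []
    pending_end = None  # y of the most recent ending text not yet paired
    for text, y in lines:
        if len(pairs) == doc_count - 1:
            break  # enough boundaries found; skip the rest of the text
        if end_marker in text:
            pending_end = y
        elif start_marker in text and pending_end is not None:
            pairs.append((pending_end, y))
            pending_end = None
    return pairs


ocr_lines = [
    ("发票代码 011", 10), ("收款人 张三", 480),
    ("发票代码 022", 560), ("收款人 李四", 1000),
]
pairs = find_marker_pairs(ocr_lines, doc_count=2)
```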
In the embodiment of the present application, optionally, before "obtaining at least one image to be verified" in step 201, the method further includes: obtaining a document number list, wherein the document number list includes the document number corresponding to each document to be verified; correspondingly, the target query information includes a target document number; and after step 205, the method further includes: identifying the target document number in the target query information, and removing the target document number from the document number list to obtain a list of missing documents.
In this embodiment, before the one or more images to be verified are obtained, a document number list can first be obtained. The document number list may include the document number corresponding to each document to be verified, where document numbers and documents to be verified are in one-to-one correspondence. When the document to be verified is an invoice, the document number may be the invoice number. When the target query information includes a target document number, then once the target query information corresponding to each document to be verified has been determined from the text information, the target document number can be read from the target query information and removed from the document number list, so that the list no longer contains it. After the target document numbers in the target query information of all documents to be verified have been removed from the document number list, the remaining list is the list of missing documents, whose document numbers correspond to the documents for which text recognition failed. The list of missing documents can then be used to determine directly which documents to be verified were not recognized, so that these documents can be located and recognized again. Compared with manually searching a large number of documents for the ones that were missed and then recognizing them again, this application can greatly reduce manual work. In other words, before text recognition is performed on the images to be verified, the embodiment of the present application first obtains a document number list containing the document numbers of all documents to be verified and then removes from it the document numbers of the documents that were recognized; the document numbers that remain are those of the documents that could not be recognized, or were recognized inaccurately, during OCR. Especially when the number of documents to be verified is very large, this makes it possible to locate the missed documents quickly and accurately and improves the efficiency of identifying them.
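The missing-document list is a set difference between the expected document numbers and those recovered by OCR. A minimal sketch follows; the function name `missing_documents` and the sample numbers are illustrative only.

```python
def missing_documents(expected_numbers: list[str],
                      recognized_numbers: list[str]) -> list[str]:
    """Return the expected document numbers that OCR did not produce.

    The original order of `expected_numbers` is preserved so the result
    can be checked against the submitted list line by line.
    """
    recognized = set(recognized_numbers)
    return [number for number in expected_numbers if number not in recognized]


expected = ["10000001", "10000002", "10000003", "10000004"]
recognized = ["10000002", "10000004"]
missing = missing_documents(expected, recognized)
```

A number that OCR misread appears in this list as well, because the misread value does not match any expected number, which fits the text's remark that the list covers documents that were recognized inaccurately.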
Further, as a specific implementation of the method of Figure 1, an embodiment of the present application provides a business processing apparatus based on image recognition. As shown in Figure 3, the apparatus includes:
a text recognition module, configured to obtain at least one image to be verified and perform text recognition on the at least one image to be verified to obtain the text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
a verification module, configured to determine, from the text information, the target query information corresponding to each document to be verified, and to determine, based on the target query information, whether the document to be verified passes verification;
an archiving module, configured to, after any document to be verified passes verification, determine, based on the at least one image to be verified, the target archive image corresponding to that document, and generate the archive information of the business based on the target archive image.
Optionally, the apparatus further includes:
a quantity determination module, configured to determine, after the text information corresponding to each image to be verified is obtained and based on the text information, the number of documents to be verified contained in the image to be verified;
a marking module, configured to: when the image to be verified contains one document to be verified, determine a unique identifier from the text information, mark the image to be verified with the unique identifier, and use the marked image to be verified as a target verification image; and when the image to be verified contains multiple documents to be verified, crop the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determine from the text information the unique identifier corresponding to each sub-image to be verified, mark each sub-image to be verified with its unique identifier, and use each marked sub-image to be verified as a target verification image;
correspondingly, the archiving module is configured to: determine, based on the unique identifier, the target verification image corresponding to the document that passed verification from among the target verification images corresponding to the at least one image to be verified, and use it as the target archive image.
Optionally, the quantity determination module is configured to: identify, from the text information, the number of occurrences of the first text combination, and use the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that is present in each document to be verified and appears only once in it.
Optionally, the marking module includes:
a first position determination unit, configured to determine, in sequence, the positions corresponding to the second text combination and the third text combination in the image to be verified, wherein the second text combination is a common ending text combination of the documents to be verified and the third text combination is a common starting text combination of the documents to be verified;
a second position determination unit, configured to determine the target cropping position based on the position corresponding to each group's second text combination, the position corresponding to its third text combination, and the preset division ratio;
a cropping unit, configured to crop the image to be verified based on the target cropping position to obtain the sub-image to be verified corresponding to each document to be verified.
Optionally, when the image to be verified contains multiple documents to be verified, the verification module is configured to: divide the text information into sub-text information corresponding to each target verification image, and extract, from each piece of sub-text information, the target query information corresponding to the respective document to be verified.
Optionally, the verification module includes:
an interface calling unit, configured to call the preset document verification interface, input the target query information into the preset document verification interface, and receive the document status information returned by the preset document verification interface, wherein the document status information includes a registered status and an unregistered status;
a judgment unit, configured to determine that the document to be verified has not passed verification when the document status information is the registered status, and to determine that the document to be verified has passed verification when the document status information is the unregistered status.
Optionally, the apparatus further includes:
a list acquisition module, configured to obtain a document number list before the at least one image to be verified is obtained, wherein the document number list includes the document number corresponding to each document to be verified;
correspondingly, the target query information includes a target document number, and the apparatus further includes:
a removal module, configured to, after the target query information corresponding to each document to be verified is determined from the text information, identify the target document number in the target query information and remove the target document number from the document number list to obtain a list of missing documents.
It should be noted that, for other descriptions of the functional units of the business processing apparatus based on image recognition provided by the embodiment of the present application, reference may be made to the corresponding descriptions of the methods of Figures 1 and 2, which are not repeated here.
Based on the methods shown in Figures 1 and 2, an embodiment of the present application correspondingly provides a storage medium on which a computer program is stored. When executed by a processor, the computer program implements the business processing method based on image recognition shown in Figures 1 and 2.
Based on this understanding, the technical solution of the present application can be embodied in the form of a software product. The software product can be stored in a computer-readable storage medium, which may be a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or a volatile storage medium. The computer-readable storage medium includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the implementation scenarios of this application.
基于上述如图1至图2所示的方法,以及图3所示的虚拟装置实施例,为了实现上述目的,本申请实施例还提供了一种计算机设备,具体可以为个人计算机、服务器、网络设备等,该计算机设备包括存储介质和处理器;存储介质,用于存储计算机程序;处理器,用于执行计算机程序以实现上述如图1至图2所示的基于图像识别的业务处理方法。Based on the above methods shown in Figures 1 to 2 and the virtual device embodiment shown in Figure 3, in order to achieve the above purpose, embodiments of the present application also provide a computer device, which can be a personal computer, a server, a network Equipment, etc., the computer equipment includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the above-mentioned business processing method based on image recognition as shown in Figures 1 to 2.
可选地,该计算机设备还可以包括用户接口、网络接口、摄像头、射频(Radio Frequency,RF)电路,传感器、音频电路、WI-FI模块等等。用户接口可以包括显示屏(Display)、输入单元比如键盘(Keyboard)等,可选用户接口还可以包括USB接口、读卡器接口等。网络接口可选的可以包括标准的有线接口、无线接口(如蓝牙接口、WI-FI接口)等。Optionally, the computer device may also include a user interface, a network interface, a camera, a radio frequency (Radio Frequency, RF) circuit, a sensor, an audio circuit, a WI-FI module, etc. The user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc. The optional user interface may also include a USB interface, a card reader interface, etc. Optional network interfaces may include standard wired interfaces, wireless interfaces (such as Bluetooth interfaces, WI-FI interfaces), etc.
本领域技术人员可以理解,本实施例提供的一种计算机设备结构并不构成对该计算机设备的限定,可以包括更多或更少的部件,或者组合某些部件,或者不同的部件布置。Those skilled in the art can understand that the structure of a computer device provided in this embodiment does not constitute a limitation on the computer device, and may include more or less components, or combine certain components, or arrange different components.
存储介质中还可以包括操作系统、网络通信模块。操作系统是管理和保存计算机设备硬件和软件资源的程序,支持信息处理程序以及其它软件和/或程序的运行。网络通信模块用于实现存储介质内部各组件之间的通信,以及与该实体设备中其它硬件和软件之间通信。The storage medium may also include an operating system and a network communication module. An operating system is a program that manages and saves the hardware and software resources of a computer device and supports the operation of information processing programs and other software and/or programs. The network communication module is used to implement communication between components within the storage medium, as well as communication with other hardware and software in the physical device.
From the above description of the embodiments, those skilled in the art can clearly understand that the present application may be implemented by software together with a necessary general-purpose hardware platform, or by hardware. First, one or more images to be verified are acquired, each of which may contain one or more documents to be verified. Text recognition is then performed on each image to be verified to obtain the text information corresponding to that image. When an image to be verified contains a single document to be verified, the target query information for that document can be determined directly from the image's text information; when it contains multiple documents to be verified, the target query information for each document is determined separately from the image's text information. Each document to be verified is then verified against its target query information to determine whether it passes verification. A document that passes verification can subsequently be used to handle the related service. During service handling, the documents to be verified may also be archived in order to complete the service-related information: the target archive image corresponding to the verified document is located among the acquired images to be verified, and the archive information of the service is generated from that target archive image. The embodiments of the present application can simplify the verification of documents to be verified, greatly improve the efficiency of verification and archiving, and help improve the user experience.
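The verify-then-archive flow summarized above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the `Document` class, the `verify_and_archive` function, the status strings, and the `check_status` callback (standing in for the preset document verification interface) are all hypothetical names. The pass/fail rule follows the claims: a document already registered fails verification, and an unregistered one passes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Status values assumed for illustration; the application only names the two states.
REGISTERED = "registered"
UNREGISTERED = "unregistered"


@dataclass
class Document:
    unique_id: str    # unique identifier taken from the recognized text
    query_info: dict  # target query information for the verification interface
    image_ref: str    # image (or cropped sub-image) tagged with unique_id


def verify_and_archive(
    documents: List[Document],
    check_status: Callable[[dict], str],
) -> Dict[str, str]:
    """Return {unique_id: image_ref} for every document that passes verification."""
    archive: Dict[str, str] = {}
    for doc in documents:
        status = check_status(doc.query_info)
        # A document that is already registered fails; an unregistered one passes.
        if status == UNREGISTERED:
            archive[doc.unique_id] = doc.image_ref
    return archive
```

Passing the verification interface in as a callback keeps the sketch independent of any particular registry service, which the application does not specify.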
Those skilled in the art will understand that the accompanying drawings are merely schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement the present application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be changed accordingly and located in one or more apparatuses different from that scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in the present application are for description only and do not indicate the relative merits of the implementation scenarios. The above discloses only several specific implementation scenarios of the present application; however, the present application is not limited thereto, and any variation conceivable to those skilled in the art shall fall within the protection scope of the present application.

Claims (20)

  1. An image recognition-based service processing method, wherein the method comprises:
    acquiring at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
    determining, from the text information, target query information corresponding to each document to be verified, and determining, based on the target query information, whether the document to be verified passes verification;
    after any one of the documents to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to the document to be verified that passed verification, and generating archive information of the service based on the target archive image.
  2. The method according to claim 1, wherein after obtaining the text information corresponding to each image to be verified, the method further comprises:
    determining, based on the text information, the number of documents to be verified contained in the image to be verified;
    when one document to be verified is contained, determining a unique identifier from the text information, tagging the image to be verified with the unique identifier, and using the tagged image to be verified as a target verification image;
    when multiple documents to be verified are contained, cropping the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determining from the text information a unique identifier corresponding to each sub-image to be verified, tagging each sub-image to be verified with its unique identifier, and using each tagged sub-image to be verified as a target verification image;
    correspondingly, determining, based on the at least one image to be verified, the target archive image corresponding to the document to be verified that passed verification comprises:
    determining, based on the unique identifier and from the target verification images corresponding to the at least one image to be verified, the target verification image corresponding to the document to be verified that passed verification, as the target archive image.
  3. The method according to claim 2, wherein determining, based on the text information, the number of documents to be verified contained in the image to be verified comprises:
    identifying, from the text information, the number of occurrences of a first text combination, and using the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that is present in every document to be verified and appears only once in each.
  4. The method according to claim 2, wherein cropping the image to be verified to obtain the sub-image to be verified corresponding to each document to be verified comprises:
    sequentially determining, from the image to be verified, the positions corresponding to a second text combination and a third text combination, wherein the second text combination is a common ending text combination of the documents to be verified, and the third text combination is a common starting text combination of the documents to be verified;
    determining target cropping positions based on the position corresponding to the second text combination and the position corresponding to the third text combination in each group, and a preset split ratio;
    cropping the image to be verified based on the target cropping positions to obtain the sub-image to be verified corresponding to each document to be verified.
  5. The method according to claim 2, wherein, when the image to be verified contains multiple documents to be verified, determining, from the text information, the target query information corresponding to each document to be verified comprises:
    splitting the text information into sub-text information corresponding to each target verification image, and extracting, from each piece of sub-text information, the target query information corresponding to the respective document to be verified.
  6. The method according to claim 1, wherein determining, based on the target query information, whether the document to be verified passes verification comprises:
    calling a preset document verification interface, inputting the target query information into the preset document verification interface, and receiving document status information returned by the preset document verification interface, wherein the document status information includes a registered status and an unregistered status;
    when the document status information indicates the registered status, determining that the document to be verified fails verification;
    when the document status information indicates the unregistered status, determining that the document to be verified passes verification.
  7. The method according to claim 1, wherein before acquiring the at least one image to be verified, the method further comprises:
    acquiring a document number list, wherein the document number list includes the document number corresponding to each document to be verified;
    correspondingly, the target query information includes a target document number; and after determining, from the text information, the target query information corresponding to each document to be verified, the method further comprises:
    identifying the target document number in the target query information, and removing the target document number from the document number list to obtain a missing document list.
  8. An image recognition-based service processing apparatus, comprising:
    a text recognition module, configured to acquire at least one image to be verified and perform text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
    a verification module, configured to determine, from the text information, target query information corresponding to each document to be verified, and determine, based on the target query information, whether the document to be verified passes verification;
    an archiving module, configured to, after any one of the documents to be verified passes verification, determine, based on the at least one image to be verified, a target archive image corresponding to the document to be verified that passed verification, and generate archive information of the service based on the target archive image.
  9. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    acquiring at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
    determining, from the text information, target query information corresponding to each document to be verified, and determining, based on the target query information, whether the document to be verified passes verification;
    after any one of the documents to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to the document to be verified that passed verification, and generating archive information of the service based on the target archive image.
  10. The storage medium according to claim 9, wherein, after obtaining the text information corresponding to each image to be verified, the computer program further implements:
    determining, based on the text information, the number of documents to be verified contained in the image to be verified;
    when one document to be verified is contained, determining a unique identifier from the text information, tagging the image to be verified with the unique identifier, and using the tagged image to be verified as a target verification image;
    when multiple documents to be verified are contained, cropping the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determining from the text information a unique identifier corresponding to each sub-image to be verified, tagging each sub-image to be verified with its unique identifier, and using each tagged sub-image to be verified as a target verification image;
    correspondingly, determining, based on the at least one image to be verified, the target archive image corresponding to the document to be verified that passed verification specifically comprises:
    determining, based on the unique identifier and from the target verification images corresponding to the at least one image to be verified, the target verification image corresponding to the document to be verified that passed verification, as the target archive image.
  11. The storage medium according to claim 10, wherein determining, based on the text information, the number of documents to be verified contained in the image to be verified specifically comprises:
    identifying, from the text information, the number of occurrences of a first text combination, and using the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that is present in every document to be verified and appears only once in each.
  12. The storage medium according to claim 10, wherein cropping the image to be verified to obtain the sub-image to be verified corresponding to each document to be verified specifically comprises:
    sequentially determining, from the image to be verified, the positions corresponding to a second text combination and a third text combination, wherein the second text combination is a common ending text combination of the documents to be verified, and the third text combination is a common starting text combination of the documents to be verified;
    determining target cropping positions based on the position corresponding to the second text combination and the position corresponding to the third text combination in each group, and a preset split ratio;
    cropping the image to be verified based on the target cropping positions to obtain the sub-image to be verified corresponding to each document to be verified.
  13. The storage medium according to claim 10, wherein, when the image to be verified contains multiple documents to be verified, determining, from the text information, the target query information corresponding to each document to be verified specifically comprises:
    splitting the text information into sub-text information corresponding to each target verification image, and extracting, from each piece of sub-text information, the target query information corresponding to the respective document to be verified.
  14. The storage medium according to claim 9, wherein determining, based on the target query information, whether the document to be verified passes verification specifically comprises:
    calling a preset document verification interface, inputting the target query information into the preset document verification interface, and receiving document status information returned by the preset document verification interface, wherein the document status information includes a registered status and an unregistered status;
    when the document status information indicates the registered status, determining that the document to be verified fails verification;
    when the document status information indicates the unregistered status, determining that the document to be verified passes verification.
  15. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    acquiring at least one image to be verified, and performing text recognition on the at least one image to be verified to obtain text information corresponding to each image to be verified, wherein each image to be verified includes at least one document to be verified;
    determining, from the text information, target query information corresponding to each document to be verified, and determining, based on the target query information, whether the document to be verified passes verification;
    after any one of the documents to be verified passes verification, determining, based on the at least one image to be verified, a target archive image corresponding to the document to be verified that passed verification, and generating archive information of the service based on the target archive image.
  16. The computer device according to claim 15, wherein, after obtaining the text information corresponding to each image to be verified, the processor further implements:
    determining, based on the text information, the number of documents to be verified contained in the image to be verified;
    when one document to be verified is contained, determining a unique identifier from the text information, tagging the image to be verified with the unique identifier, and using the tagged image to be verified as a target verification image;
    when multiple documents to be verified are contained, cropping the image to be verified to obtain a sub-image to be verified corresponding to each document to be verified, determining from the text information a unique identifier corresponding to each sub-image to be verified, tagging each sub-image to be verified with its unique identifier, and using each tagged sub-image to be verified as a target verification image;
    correspondingly, determining, based on the at least one image to be verified, the target archive image corresponding to the document to be verified that passed verification specifically comprises:
    determining, based on the unique identifier and from the target verification images corresponding to the at least one image to be verified, the target verification image corresponding to the document to be verified that passed verification, as the target archive image.
  17. The computer device according to claim 16, wherein determining, based on the text information, the number of documents to be verified contained in the image to be verified specifically comprises:
    identifying, from the text information, the number of occurrences of a first text combination, and using the number of occurrences as the number of documents to be verified contained in the image to be verified, wherein the first text combination is a text combination that is present in every document to be verified and appears only once in each.
  18. The computer device according to claim 16, wherein cropping the image to be verified to obtain the sub-image to be verified corresponding to each document to be verified specifically comprises:
    sequentially determining, from the image to be verified, the positions corresponding to a second text combination and a third text combination, wherein the second text combination is a common ending text combination of the documents to be verified, and the third text combination is a common starting text combination of the documents to be verified;
    determining target cropping positions based on the position corresponding to the second text combination and the position corresponding to the third text combination in each group, and a preset split ratio;
    cropping the image to be verified based on the target cropping positions to obtain the sub-image to be verified corresponding to each document to be verified.
  19. The computer device according to claim 16, wherein, when the image to be verified contains multiple documents to be verified, determining, from the text information, the target query information corresponding to each document to be verified specifically comprises:
    splitting the text information into sub-text information corresponding to each target verification image, and extracting, from each piece of sub-text information, the target query information corresponding to the respective document to be verified.
  20. The computer device according to claim 15, wherein determining, based on the target query information, whether the document to be verified passes verification specifically comprises:
    calling a preset document verification interface, inputting the target query information into the preset document verification interface, and receiving document status information returned by the preset document verification interface, wherein the document status information includes a registered status and an unregistered status;
    when the document status information indicates the registered status, determining that the document to be verified fails verification;
    when the document status information indicates the unregistered status, determining that the document to be verified passes verification.
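
The counting and cropping steps recited in claims 3 and 4 (and their counterparts in claims 11, 12, 17 and 18) can be illustrated with a short sketch. The claims do not state how the preset split ratio is applied, so the formula below is an assumption: each cut line is placed between the ending text of one document and the starting text of the next, at the given fraction of the gap. The function names, the sample marker text, and the use of vertical pixel coordinates are likewise hypothetical.

```python
from typing import List, Sequence


def count_documents(text: str, first_combination: str) -> int:
    """Count documents as occurrences of a marker that appears once per document."""
    return text.count(first_combination)  # counts non-overlapping occurrences


def crop_positions(
    end_positions: Sequence[float],
    start_positions: Sequence[float],
    split_ratio: float = 0.5,
) -> List[float]:
    """Compute one cut line per pair of neighbouring documents.

    end_positions[i] is the vertical position of the ending text of document i;
    start_positions[i] is the vertical position of the starting text of
    document i + 1. The cut is placed split_ratio of the way across the gap
    (an assumed interpretation of the preset split ratio).
    """
    return [
        end + split_ratio * (start - end)
        for end, start in zip(end_positions, start_positions)
    ]
```

With an image holding three documents, two cut lines are produced, and slicing the image at those lines yields one sub-image per document.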
PCT/CN2023/103596 2022-07-13 2023-06-29 Image recognition-based service processing method and apparatus, and storage medium WO2024012209A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210821064.2 2022-07-13
CN202210821064.2A CN115019325A (en) 2022-07-13 2022-07-13 Service processing method and device based on image recognition and storage medium

Publications (1)

Publication Number Publication Date
WO2024012209A1 true WO2024012209A1 (en) 2024-01-18

Family

ID=83082657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/103596 WO2024012209A1 (en) 2022-07-13 2023-06-29 Image recognition-based service processing method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN115019325A (en)
WO (1) WO2024012209A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019325A (en) * 2022-07-13 2022-09-06 深圳前海环融联易信息科技服务有限公司 Service processing method and device based on image recognition and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080208738A1 (en) * 2007-02-23 2008-08-28 George Mathew Automated bill validation for electronic and telephonic transactions
CN109461247A (en) * 2018-10-29 2019-03-12 北京慧流科技有限公司 Note validating method and device, electronic equipment and storage medium
CN109472918A (en) * 2018-10-12 2019-03-15 深圳壹账通智能科技有限公司 Invoice validation method, financing checking method, device, equipment and medium
CN113191448A (en) * 2021-05-17 2021-07-30 广东电网有限责任公司 Auditing method, device and equipment based on picture identification and storage medium
CN115019325A (en) * 2022-07-13 2022-09-06 深圳前海环融联易信息科技服务有限公司 Service processing method and device based on image recognition and storage medium

Also Published As

Publication number Publication date
CN115019325A (en) 2022-09-06

Similar Documents

Publication Publication Date Title
AU2017302250B2 (en) Optical character recognition in structured documents
US10380237B2 (en) Smart optical input/output (I/O) extension for context-dependent workflows
CN112052749A (en) Archive filing method and device, electronic equipment and computer readable storage medium
US10339373B1 (en) Optical character recognition utilizing hashed templates
WO2024012209A1 (en) Image recognition-based service processing method and apparatus, and storage medium
US10057449B2 (en) Document analysis system, image forming apparatus, and analysis server
CN109993075B (en) Chat application session content storage method, system and device
CN111178836A (en) Batch archiving method, device and equipment for electronic documents and storage medium
AU2018410912B2 (en) Method and system for background removal from documents
CN108304815A (en) Data capture method, device, server and storage medium
WO2021017458A1 (en) Auxiliary processing method, device, and system for image recognition
US11611677B2 (en) Information processing apparatus that identifies related document images based on metadata and associates them based on user input, information processing system, information processing method, and storage medium
JP2014175978A (en) Information processing apparatus, control method of the same, and program
CN110059184B (en) Operation error collection and analysis method and system
US9372857B2 (en) Information processing apparatus, trail collection system, information processing method, and non-transitory computer readable medium
US20230206672A1 (en) Image processing apparatus, control method of image processing apparatus, and storage medium
CN111178365A (en) Picture character recognition method and device, electronic equipment and storage medium
CN110119743B (en) Picture identification method, server and computer readable storage medium
JP2019164509A (en) Information processing device, program, and information processing system
CN111046864A (en) Method and system for automatically extracting five elements of contract scanning piece
WO2015160988A1 (en) Smart optical input/output (i/o) extension for context-dependent workflows
US9195888B2 (en) Document registration apparatus and non-transitory computer readable medium
JP2014063457A (en) Annotation management system, and program for making computer execute the same
US20230368555A1 (en) Information processing apparatus, information processing method, and storage medium
CN116403096B (en) Intelligent financial work method and system based on OCR bill recognition

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23838702

Country of ref document: EP

Kind code of ref document: A1