CN113989817A - User-defined bill identification method, device and medium - Google Patents

User-defined bill identification method, device and medium

Info

Publication number
CN113989817A
Authority
CN
China
Prior art keywords
identification
bill
recognition
template
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111324906.5A
Other languages
Chinese (zh)
Inventor
王雪飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur General Software Co Ltd
Original Assignee
Inspur General Software Co Ltd

Landscapes

  • Image Analysis (AREA)

Abstract

The application discloses a method, equipment and medium for identifying a user-defined bill, wherein the method comprises the following steps: receiving a bill template image, and determining a reference field and an identification field area according to the bill template image; determining a corresponding designated recognition model from the recognition model library, and matching a corresponding high-frequency vocabulary so as to correct the recognition result through the high-frequency vocabulary; constructing a custom bill identification template, and classifying and storing the custom bill identification template; receiving an identification service request, and loading a custom bill identification template on a corresponding starting port according to the identification service request; and receiving a custom bill, and identifying the custom bill image through a custom bill identification template to obtain an identification result. On the basis of ensuring certain identification precision, the development time is greatly shortened, and a user can quickly obtain a required identification template.

Description

User-defined bill identification method, device and medium
Technical Field
The application relates to the field of image recognition, in particular to a method, equipment and medium for recognizing a user-defined bill.
Background
With the development of modern enterprises, more and more enterprises design customized documents for internal recording or reporting according to their own business or administrative processes. Many custom documents have to be filled in within an enterprise, and when staff summarize the collected custom documents they record them by hand, which makes the summary records prone to error and wastes human resources.
Therefore, enterprises generally adopt document identification technology in place of manual work to identify and summarize documents. However, existing document identification technology can only identify fixed types of documents and therefore struggles to meet practical requirements.
In addition, developing an identification model for a custom document on an identification model development platform takes a long time, and when an enterprise has many types of custom documents, the development platform can hardly develop the corresponding identification models quickly.
Disclosure of Invention
To solve the above problems, namely that staff who summarize documents by hand easily produce inaccurate records and waste human resources, that intelligent identification of custom documents struggles to meet actual demand, and that a development platform can hardly develop identification models for a large number of custom documents in a short time, this application proposes a user-defined bill identification method, device and medium.
In one aspect, the application provides a method for identifying a user-defined bill, including: receiving a bill template image, and determining a corresponding reference field and a corresponding recognition field area according to the bill template image; determining a corresponding designated recognition model from a recognition model library according to the reference field and the recognition field area, and matching a corresponding high-frequency vocabulary to the designated recognition model so that a recognition result can be corrected through the high-frequency vocabulary, wherein the high-frequency vocabulary consists of words whose occurrence frequency in the reference field and the recognition field area is greater than a preset threshold; constructing a custom bill identification template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the custom bill identification template in a classified manner; receiving an identification service request, and loading the custom bill identification template on a corresponding start port according to the identification service request; and receiving a custom bill image, and recognizing the custom bill image through the custom bill identification template to obtain a recognition result.
In one example, before receiving a custom ticket image and performing recognition processing on the custom ticket image through the custom ticket recognition template to obtain a recognition result, the method further includes: carrying out image optimization processing on the user-defined bill image, wherein the image optimization processing at least comprises the following steps: denoising, sharpening, adjusting brightness and smoothly zooming; carrying out first position detection on the self-defined bill image, and carrying out first preset angle rotation processing on the self-defined bill according to a detection result; text region detection is carried out on the user-defined bill image to obtain a plurality of text regions, and the user-defined bill image is cut according to the text regions to obtain a plurality of sub-images with the same number as the text regions; performing second position detection on the plurality of sub-images, and performing second preset angle rotation processing on the plurality of sub-images according to the detection result; and determining the blank areas corresponding to the sub-images respectively, and cutting the blank areas, wherein the blank areas are areas not containing texts.
In one example, receiving a custom bill image, and performing recognition processing on the custom bill image through the custom bill recognition template to obtain a recognition result, specifically including: receiving a user-defined bill image, and carrying out first identification on the user-defined bill image through the specified identification model to obtain a plurality of first identified fields; comparing the first identified fields with the reference fields to obtain a plurality of same fields, and grouping the same fields according to a preset allocation strategy to obtain a plurality of groups of same fields; respectively carrying out perspective correction processing on the multiple groups of same fields according to preset datum points in the reference fields, and carrying out integration processing on multiple perspective results to obtain a user-defined bill image after perspective processing; constructing the same coordinate system for the bill template image and the self-defined bill image after perspective processing, and determining the area to be identified in the self-defined bill image after perspective processing through the coordinates of the identification field area in the bill template image; performing second identification on the area to be identified through the specified identification model to obtain a plurality of second identified fields; and correcting the second recognized field according to the high-frequency vocabulary to obtain a recognition result.
In one example, after constructing a custom bill recognition template according to the bill template image, the reference field, the recognition field region, the designated recognition model and the high-frequency vocabulary, and storing the custom bill recognition template in a classified manner, the method further includes: receiving a deployment instruction, and reading the self-defined bill identification template according to the deployment instruction to generate an appointed identification model configuration file and an identification service configuration file, wherein the appointed identification model configuration file at least comprises parameter information of the appointed identification model, and the identification service configuration file at least comprises parameter information corresponding to the bill template image, the reference field, the identification field area and the high-frequency field respectively; obtaining a starting port of the user-defined bill identification template according to a pre-stored port starting strategy, and storing the starting port into the specified identification model configuration file and the identification service configuration file; acquiring a required video memory of the appointed identification model according to the appointed identification model configuration file, and adding a calling instruction corresponding to the video memory and the file position of the appointed identification model configuration file to a first starting script; and adding the file position of the identification service configuration file to a second starting script.
In one example, receiving an identification service request, and loading the custom ticket identification template on a corresponding start port according to the identification service request specifically includes: receiving an identification service request, and determining the corresponding self-defined bill identification template according to the identification service request; running the first start script and the second start script; distributing the resources of the video memory for the starting port, and loading the specified identification model, the bill template image, the reference field, the identification field area and the high-frequency field at the starting port.
In one example, after adding the file location of the identified service profile to the second startup script, the method further comprises: receiving a service stopping instruction, deleting the first script and the second script according to the service stopping instruction, and deleting the contents in the identification model configuration file and the identification service configuration file; and deleting the user-defined bill identification template after classified storage from a database according to the service stopping instruction, and releasing storage resources.
In one example, after adding the file location of the identified service profile to the second startup script, the method further comprises: receiving a service export instruction, and packaging the identification model configuration file, the identification service configuration file, the first start script and the second start script according to the service export instruction to obtain a packaged start file; determining a file downloading interface, and loading the packaged starting file to the file downloading interface.
In one example, the method further comprises: acquiring the occupation conditions of a memory and a video memory, acquiring the running number of the specified identification model, and acquiring the running state of the user-defined bill identification template; and constructing a dynamic chart according to the occupation condition, the running number and the running state, and sending the dynamic chart to front-end equipment for displaying through the front-end equipment.
In another aspect, this application further provides a device for identifying a user-defined bill, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to: receive a bill template image, and determine a corresponding reference field and a corresponding recognition field area according to the bill template image; determine a corresponding designated recognition model from a recognition model library according to the reference field and the recognition field area, and match a corresponding high-frequency vocabulary to the designated recognition model so that a recognition result can be corrected through the high-frequency vocabulary, wherein the high-frequency vocabulary consists of words whose occurrence frequency in the reference field and the recognition field area is greater than a preset threshold; construct a custom bill identification template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and store the custom bill identification template in a classified manner; receive an identification service request, and load the custom bill identification template on a corresponding start port according to the identification service request; and receive a custom bill image, and recognize the custom bill image through the custom bill identification template to obtain a recognition result.
In one example, the present application further provides a non-transitory computer storage medium storing computer-executable instructions configured to: receive a bill template image, and determine a corresponding reference field and a corresponding recognition field area according to the bill template image; determine a corresponding designated recognition model from a recognition model library according to the reference field and the recognition field area, and match a corresponding high-frequency vocabulary to the designated recognition model so that a recognition result can be corrected through the high-frequency vocabulary, wherein the high-frequency vocabulary consists of words whose occurrence frequency in the reference field and the recognition field area is greater than a preset threshold; construct a custom bill identification template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and store the custom bill identification template in a classified manner; receive an identification service request, and load the custom bill identification template on a corresponding start port according to the identification service request; and receive a custom bill image, and recognize the custom bill image through the custom bill identification template to obtain a recognition result.
The method, the equipment and the medium for identifying the user-defined bill can bring the following beneficial effects: the method provides a user-defined bill identification template establishing mode for users or enterprises, ensures that the users can rapidly make corresponding identification templates according to requirements, and improves the working efficiency of the users. Meanwhile, on the basis of ensuring certain identification precision, the development time is greatly shortened, and a user can quickly obtain a required identification template.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a schematic flow chart of a method for identifying a custom bill in an embodiment of the present application;
Fig. 2 is a schematic diagram of a device for identifying a custom bill in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the method for identifying a custom bill described in this application may be stored in a system or a server in the form of a program or an algorithm, and the program or algorithm may be supported by corresponding elements of the hardware terminal on which the system or server runs, such as a processor, a memory and a communication module. In the embodiments of the present application, a server is taken as an example for explanation; the server supports the program or algorithm through the hardware terminal on which it runs. Meanwhile, a user can log in to the server through a corresponding front-end device and exchange information with the server, where front-end devices include but are not limited to mobile phones, tablet computers, personal computers and other hardware devices with sufficient computing power. The user can exchange information with the server through a built-in system of the front-end device, an app, a web page or the like, and thereby realize automatic identification of custom bills.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a method for identifying a custom ticket provided in an embodiment of the present application includes:
S101: Receiving a bill template image, and determining a corresponding reference field and a corresponding recognition field area according to the bill template image.
Specifically, the server receives a bill template image. The bill template image is typically uploaded by the user to provide template support for later custom bill identification. Therefore, the bill template image and the custom bill images to be recognized later must show the same type of custom bill.
To ensure the accuracy of the subsequent process, the user needs to make sure that the uploaded bill template image is clear and complete and shows no bending or damage. After receiving the bill template image, the server preprocesses it, where the preprocessing includes but is not limited to rotation, scaling, cropping and perspective correction, so that the bill template image meets the clarity requirement.
Further, the server displays the preprocessed bill template image through the front-end device so that the user can view and operate on it. The user can draw selection boxes on the image, and the selected content comprises reference fields and recognition field areas. A reference field is fixed text content, such as a header, that appears in every custom bill of the same layout, for example a value-added tax invoice. To ensure the accuracy of the subsequent process, at least four reference fields should be selected, located as close as possible to the four corners of the image.
Meanwhile, the user needs to box-select the recognition field areas, which hold the content filled in by the person completing the bill, namely the content to be recognized.
After the user completes the selection, the server can determine the reference fields and the recognition field areas corresponding to the bill template image.
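The result of step S101 can be pictured as a small annotation record. The following is only a minimal sketch under stated assumptions: the class names, the `(left, top, right, bottom)` box convention and the `validate` helper are hypothetical and are not taken from the application; only the "at least four reference fields" rule comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed convention: (left, top, right, bottom) in template-image pixels.
Box = Tuple[int, int, int, int]


@dataclass
class ReferenceField:
    text: str  # fixed printed text, such as a header label
    box: Box


@dataclass
class RecognitionRegion:
    name: str  # hypothetical label for the filled-in area
    box: Box


@dataclass
class TemplateAnnotation:
    image_size: Tuple[int, int]  # (width, height) of the bill template image
    reference_fields: List[ReferenceField] = field(default_factory=list)
    recognition_regions: List[RecognitionRegion] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the annotation is usable."""
        problems = []
        if len(self.reference_fields) < 4:
            problems.append("at least four reference fields are required")
        if not self.recognition_regions:
            problems.append("at least one recognition field area is required")
        width, height = self.image_size
        for item in [*self.reference_fields, *self.recognition_regions]:
            left, top, right, bottom = item.box
            if not (0 <= left < right <= width and 0 <= top < bottom <= height):
                problems.append(f"box out of bounds: {item.box}")
        return problems
```

A front end could call `validate()` before submitting the selection, so that a template with too few reference fields is rejected early.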
S102: Determining a corresponding designated recognition model from a recognition model library according to the reference field and the recognition field area, and matching a corresponding high-frequency vocabulary to the designated recognition model so that a recognition result can be corrected through the high-frequency vocabulary, wherein the high-frequency vocabulary consists of words whose occurrence frequency in the reference field and the recognition field area is greater than a preset threshold.
Specifically, the recognition model library pre-stores a plurality of recognition models for recognizing different types of characters, including but not limited to Chinese character recognition, number recognition, English recognition and Russian recognition. In the embodiments of the present application, the recognition model is explained by taking an OCR (optical character recognition) model as an example; to avoid redundant description, the recognition model mentioned below is an OCR model by default.
The user can select a corresponding designated recognition model from the recognition model library according to the character content in the reference fields and the recognition field areas. Because that content may contain more than one type of character, more than one designated recognition model may be selected.
Based on the user's selection, the server determines the corresponding designated recognition model from the recognition model library according to the reference field and the recognition field area.
In addition, the user can input a high-frequency vocabulary, that is, words that occur frequently in custom bills of this layout, namely words whose occurrence frequency in the reference field and the recognition field area is greater than a preset threshold. The server matches the high-frequency vocabulary with the designated recognition model so that the recognition result can be corrected during recognition or after recognition is finished, thereby ensuring the accuracy of the recognition result.
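As an illustration of how a high-frequency vocabulary might be derived and applied, here is a minimal sketch. The application does not specify the correction algorithm; fuzzy matching with the standard-library `difflib` module, the function names and the `cutoff` value are all assumptions made for this example.

```python
import difflib
from collections import Counter
from typing import Iterable, Set


def high_frequency_words(words: Iterable[str], threshold: int) -> Set[str]:
    """Keep the words whose occurrence count is greater than the preset threshold."""
    counts = Counter(words)
    return {word for word, count in counts.items() if count > threshold}


def correct_with_vocabulary(recognized: str, vocabulary: Iterable[str],
                            cutoff: float = 0.8) -> str:
    """Replace a recognized string with its closest vocabulary entry, if one is close enough.

    Assumption: similarity is measured by difflib's ratio; text that matches
    no entry at or above the cutoff is returned unchanged.
    """
    matches = difflib.get_close_matches(recognized, list(vocabulary), n=1, cutoff=cutoff)
    return matches[0] if matches else recognized
```

For instance, `correct_with_vocabulary("lnvoice", ["invoice", "amount"])` maps a common OCR confusion back to `"invoice"`, while unrelated text is left as recognized.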
S103: Constructing a custom bill identification template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the custom bill identification template in a classified manner.
Specifically, the server constructs a custom bill identification template from the determined bill template image, reference field, recognition field area, designated recognition model and high-frequency vocabulary. The custom bill identification template is dedicated to recognizing custom bills of this layout.
The user-defined bill identification template generated by the technical scheme has the advantages of short creation time, high accuracy and simple operation of the user side, and the user can immediately create the user-defined bill identification template according to the requirement and identify the user-defined bill through the template, so that the working efficiency is greatly improved. Meanwhile, due to modularization and customization of the user-defined bill identification template, the economic cost of a user can be greatly reduced, another set of identification technology does not need to be developed for the user-defined bill, and the flexibility and timeliness of model deployment are guaranteed.
The server stores the custom bill identification template in key-value format. It should be noted that key-value is a storage format in which data is stored by category; it supports a large data volume and fast queries.
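Classified key-value storage can be sketched as follows. This is an in-memory stand-in only: the `template_id:category` key layout, the JSON encoding and the `TemplateStore` class are assumptions for illustration, and a real deployment would presumably use a key-value database instead of a Python dictionary.

```python
import json
from typing import Any, Dict, Optional


class TemplateStore:
    """In-memory stand-in for a key-value database (assumed key layout: 'id:category')."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, template_id: str, category: str, value: Any) -> None:
        # Each part of the template is stored under its own category key.
        self._data[f"{template_id}:{category}"] = json.dumps(value, ensure_ascii=False)

    def get(self, template_id: str, category: str) -> Optional[Any]:
        raw = self._data.get(f"{template_id}:{category}")
        return None if raw is None else json.loads(raw)

    def delete_template(self, template_id: str) -> int:
        """Delete every category stored for one template; return the number of keys removed."""
        keys = [key for key in self._data if key.startswith(f"{template_id}:")]
        for key in keys:
            del self._data[key]
        return len(keys)
```

Storing each component under its own key lets a later step read only what it needs, for example just the high-frequency vocabulary, and lets a whole template be removed by prefix.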
After the server stores the custom bill identification templates in a classified manner, a user can call the custom bill identification templates and perform corresponding tests, if the test results are satisfactory, a deployment instruction can be issued through the front-end equipment, and the server can deploy the custom bill identification templates according to the deployment instruction.
The deployment process specifically includes:
The server receives the deployment instruction and, according to it, reads the custom bill identification template from the databases in which its parts are stored, so as to generate a designated identification model configuration file and an identification service configuration file. The designated identification model configuration file at least comprises parameter information of the designated recognition model, and the identification service configuration file at least comprises parameter information corresponding to the bill template image, the reference field, the recognition field area and the high-frequency vocabulary respectively.
A port start strategy is built into the server. By combining the custom bill identification template with the port start strategy, the server obtains the start port of the custom bill identification template through a corresponding algorithm and stores the start port in the designated identification model configuration file and the identification service configuration file.
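The application does not describe the port start strategy itself. Purely as an assumed example, one simple strategy is to hand out the lowest port in a configured range that no deployed template is using yet; the function name and the default range below are hypothetical.

```python
from typing import Iterable


def allocate_port(in_use: Iterable[int], start: int = 9000, end: int = 9100) -> int:
    """Return the lowest port in [start, end) that is not already assigned.

    Assumption: the server keeps track of the ports it has assigned to
    deployed templates and passes them in as `in_use`.
    """
    taken = set(in_use)
    for port in range(start, end):
        if port not in taken:
            return port
    raise RuntimeError("no free port in the configured range")
```

The returned port would then be written into both configuration files so that the model and the identification service agree on where the template is served.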
Further, the server obtains the video memory required by the designated recognition model from the designated identification model configuration file, and adds a calling instruction for that video memory and the file location of the designated identification model configuration file to a first start script, so that when the first start script is run, the corresponding video memory resources are allocated to the start port and the designated recognition model is loaded at the start port.
In addition, the server adds the file location of the identification service configuration file to a second start script, so that when the second start script is run, the bill template image, the reference field, the recognition field area and the high-frequency vocabulary are loaded at the start port.
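The deployment artifacts described above (two configuration files and two start scripts) can be sketched as generated text. Everything concrete here is an assumption: the file names, the JSON layout, the `video_memory_mb` key, and the `run_model` / `run_service` commands, which are placeholders rather than real programs.

```python
import json
from typing import Any, Dict


def build_deployment_files(model_params: Dict[str, Any],
                           service_params: Dict[str, Any],
                           port: int) -> Dict[str, str]:
    """Return file name -> file content for the two config files and two start scripts."""
    # Both configuration files record the same start port.
    model_config = json.dumps({**model_params, "port": port}, ensure_ascii=False, indent=2)
    service_config = json.dumps({**service_params, "port": port}, ensure_ascii=False, indent=2)
    memory_mb = model_params["video_memory_mb"]  # assumed key for the required video memory
    first_script = (
        "#!/bin/sh\n"
        "# Placeholder command: reserve video memory and load the designated model.\n"
        f"run_model --video-memory-mb {memory_mb} --config model_config.json\n"
    )
    second_script = (
        "#!/bin/sh\n"
        "# Placeholder command: load template image, fields and vocabulary.\n"
        "run_service --config service_config.json\n"
    )
    return {
        "model_config.json": model_config,
        "service_config.json": service_config,
        "start_model.sh": first_script,
        "start_service.sh": second_script,
    }
```

Keeping the model launch and the service launch in separate scripts mirrors the split in the text: the first script deals with video memory and the model, the second with the template data.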
In addition, it should be noted that in another implementation of the present application, the server is further provided with a server configuration file, which stores information such as the version number of the designated recognition model and the proof of legitimacy for the server's resource calls. In this implementation the start port is first stored in the server configuration file. After the first start script is started, a script corresponding to the server configuration file synchronously starts a mirror script matched with the identification service configuration file. The start port can be synchronized into the identification service configuration file through the mirror script, and the identification service configuration file can likewise load the bill template image, the reference field, the recognition field area and the high-frequency vocabulary at the start port through the mirror script.
In addition, after the file location of the identification service configuration file has been added to the second start script, the user can also issue a service stopping instruction as required.
The server receives the service stopping instruction, deletes the first start script and the second start script according to it, and deletes the contents of the designated identification model configuration file and the identification service configuration file, so as to release the computing resources of the server.
Meanwhile, according to the service stopping instruction, the server can delete the custom bill identification template from the database in which it was stored by category. That is, the designated recognition model, the bill template image, the reference field, the recognition field area and the high-frequency vocabulary that make up the template are each deleted from their respective databases, thereby releasing storage resources.
In addition, after the server has added the file location of the identification service configuration file to the second start script, the user can also issue a service export instruction, so that the identification service is exported as a packaged file for the user's reference.
Specifically, the server receives the service export instruction and, according to it, packages the designated identification model configuration file, the identification service configuration file, the first start script and the second start script to obtain a packaged start file.
The server then determines the file downloading interface corresponding to the user and loads the packaged start file to that interface for the user to download.
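The packaging step can be sketched with the standard-library `zipfile` module. The application does not name an archive format, so ZIP is an assumption here, as are the function name and the choice to build the archive in memory so that it can be handed straight to a download interface.

```python
import io
import zipfile
from typing import Dict


def package_start_files(files: Dict[str, str]) -> bytes:
    """Bundle file name -> text content into a single ZIP archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for name, content in files.items():
            bundle.writestr(name, content)
    return buffer.getvalue()
```

The returned bytes are what a download endpoint would serve; unpacking them on another terminal yields the configuration files and start scripts.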
In this way, the user can use the packaged start file as an identification file package for the custom bill, apply it on other terminals or deploy it in a targeted manner, which ensures flexibility of use.
S104: Receiving an identification service request, and loading the custom bill identification template on a corresponding start port according to the identification service request.
Specifically, the server receives an identification service request and determines the corresponding custom bill identification template according to the request.
The server runs the first start script and the second start script.
According to the video memory calling instruction in the first start script, the server allocates the corresponding video memory resources to the start port, and loads the designated recognition model, the bill template image, the reference field, the recognition field area and the high-frequency vocabulary at the start port in preparation for the subsequent identification service.
S105: and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
Before the server identifies the user-defined bill image through the user-defined bill identification template, the user-defined bill image also needs to be subjected to targeted processing in order to ensure the accuracy of the identification result.
The method specifically comprises the following steps:
carrying out image optimization processing on the user-defined bill image, wherein the image optimization processing at least comprises the following steps: denoising, sharpening, adjusting brightness and smoothly zooming.
Because the user-defined bill is uploaded by the user, the image quality and the definition are difficult to ensure, and the user-defined bill has higher definition through image optimization processing in order to avoid influencing the subsequent identification process.
The custom bill uploaded by the user may be a vertical, slanted or upside-down image, and the recognition model is less accurate when it recognizes fields at such angles. To ensure accuracy, the server performs a first position detection on the custom bill image. The detection result shows whether the custom bill faces forward and upward; if the angle deviates, the server rotates the custom bill image by a first preset angle so that the custom bill faces forward and upward.
This adjustment is a coarse one, so the first preset angle is one of 0°, 90°, 180° and 270°.
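The coarse rotation by a first preset angle can be sketched as follows. This is a minimal illustration that assumes the image is held as nested lists and that the required angle has already been produced by the first position detection, whose method the application does not specify; `rotate_coarse` is a hypothetical name.

```python
from typing import List

Image = List[List[int]]  # hypothetical grayscale image as rows of pixel values


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_coarse(image: Image, angle: int) -> Image:
    """Rotate clockwise by one of the first preset angles (0, 90, 180, 270)."""
    if angle not in (0, 90, 180, 270):
        raise ValueError("angle must be 0, 90, 180 or 270")
    result = [list(row) for row in image]
    for _ in range(angle // 90):
        result = rotate_90_clockwise(result)
    return result
```

Because only quarter turns are involved, no interpolation is needed and no pixel information is lost at this stage.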
Further, the server performs text region detection on the custom bill image to determine the plurality of text regions present in it, wherein the plurality of text regions together cover all the text content of the custom bill. The server then cuts the custom bill image according to the text regions to obtain a plurality of sub-images equal in number to the text regions.
Because recognition is carried out on text regions, cutting the image makes it possible to fine-tune and recognize each text region separately, which helps ensure recognition accuracy.
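The cutting step can be sketched as below, assuming that text region detection has already returned one bounding box per text region. The box layout `(left, top, right, bottom)` with exclusive right and bottom edges, and the function names, are assumptions of this sketch.

```python
from typing import List, Tuple

Image = List[List[int]]          # hypothetical grayscale image
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the part of the image covered by the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def split_into_subimages(image: Image, boxes: List[Box]) -> List[Image]:
    """Cut one sub-image per detected text region."""
    return [crop(image, box) for box in boxes]
```

The number of sub-images returned always equals the number of boxes passed in, which mirrors the one-to-one relation between text regions and sub-images described above.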
The server then performs a second position detection on the plurality of sub-images to determine whether each sub-image faces forward and upward, and rotates the sub-images by a second preset angle according to the detection result. Because this adjustment requires higher precision, the second preset angle can be set in steps of 5°, which further improves the recognition accuracy of the sub-images.
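One way to read the 5° step is that an estimated skew of a sub-image is snapped to the nearest multiple of 5° before the rotation is applied. The sketch below shows only this snapping; the skew estimate itself is assumed to come from the second position detection, and `snap_to_step` is a hypothetical helper, not something named in the application.

```python
def snap_to_step(angle_degrees: float, step: int = 5) -> int:
    """Snap an estimated skew angle to the nearest multiple of the step."""
    return int(round(angle_degrees / step)) * step
```

For instance, an estimated skew of 7° would lead to a 5° correction, while a skew of 2° would lead to no rotation at all.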
Further, the server determines the blank area of each sub-image and cuts it away, wherein a blank area is an area that contains no text. Cutting away the blank margins ensures that the characters occupy as much of the sub-image as possible, which makes it easier for the designated recognition model to obtain a more accurate recognition result.
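Cutting away blank margins can be sketched as finding the smallest rectangle that still contains every non-background pixel. The sketch assumes a grayscale sub-image with a uniform background value of 255; both the representation and the name `trim_margins` are assumptions.

```python
from typing import List

Image = List[List[int]]  # hypothetical grayscale sub-image


def trim_margins(image: Image, background: int = 255) -> Image:
    """Crop the image to the bounding box of its non-background pixels."""
    rows = [r for r, row in enumerate(image) if any(p != background for p in row)]
    if not rows:
        return []  # the sub-image contains no text pixels at all
    cols = [
        c for c in range(len(image[0]))
        if any(row[c] != background for row in image)
    ]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Real scans rarely have a perfectly uniform background, so a practical version would compare against a threshold rather than an exact value.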
After the server performs the above-mentioned targeted processing on the custom ticket image, the server can start to recognize the custom ticket image.
Specifically, the method comprises the following steps:
the server receives the custom bill image and performs first recognition on it through the designated recognition model to obtain a plurality of first recognized fields. More specifically, the server performs the first recognition on the plurality of sub-images.
The first recognized fields may have low accuracy. The main cause is that the orientation of the custom bill image may still be imperfect: the image may already face forward without being completely upright. To solve this problem, perspective processing is applied to the custom bill image so that it is displayed completely upright.
Specifically, the server compares the plurality of first recognized fields with the reference fields to obtain a plurality of identical fields, that is, recognized fields that are the same as reference fields. At least four identical fields are required. When there are more than four, the identical fields can be grouped according to a preset allocation strategy to obtain multiple groups of identical fields. For example, five identical fields may yield two groups that are not entirely identical.
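The comparison and grouping can be sketched as follows. The preset allocation strategy is not disclosed, so the sliding window of four fields used here is purely an assumption; it happens to turn five identical fields into two overlapping groups, in line with the example above. The dictionaries mapping field text to a position are also hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def identical_fields(recognized: Dict[str, Point],
                     reference: Dict[str, Point]) -> List[str]:
    """Return the field texts that occur both in the first recognition
    result and among the reference fields of the template."""
    return sorted(text for text in recognized if text in reference)


def group_identical_fields(fields: List[str], group_size: int = 4) -> List[List[str]]:
    """Group identical fields with a sliding window (assumed strategy)."""
    if len(fields) < group_size:
        raise ValueError("at least four identical fields are required")
    return [fields[i:i + group_size] for i in range(len(fields) - group_size + 1)]
```

Each group contains four fields because four point correspondences are the minimum needed to determine a perspective transformation.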
A preset reference point is then determined in the reference field; in the embodiment of the application, a point of the reference field at the upper left corner of the bill template image is taken as the preset reference point. The bill template image is placed on the bottom layer and the custom bill image on the upper layer, and the custom bill image is stretched and translated according to the preset reference point in the reference field so that the identical fields coincide with the reference fields and the custom bill image faces forward and upward exactly as the bill template does. This process is the perspective correction processing.
Specifically, the server performs perspective correction processing on each group of identical fields according to the preset reference point in the reference field, that is, it performs as many perspective corrections as there are groups, obtaining multiple perspective results, and then integrates the perspective results to obtain the perspective-processed custom bill image.
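A perspective correction from one group of four identical fields can be sketched as estimating a 3x3 homography that maps the field positions in the custom bill image onto the corresponding reference positions in the template. The sketch assumes each field is reduced to a single point, and the element-wise averaging used to integrate several results is only one simple possibility; the application does not state how the integration is performed. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> Matrix:
    """Homography (last entry fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: Matrix, point: Point) -> Point:
    """Map a point of the custom bill image into template coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def integrate(results: Sequence[Matrix]) -> Matrix:
    """Assumed integration: element-wise mean of the per-group homographies."""
    n = len(results)
    return [[sum(h[r][c] for h in results) / n for c in range(3)] for r in range(3)]
```

Here `src` would hold the positions of one group of identical fields in the custom bill image and `dst` the positions of the same fields in the bill template image.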
Further, the server constructs the same coordinate system for the bill template image and the customized bill image after perspective processing, and determines the area to be identified in the customized bill image after perspective processing by identifying the coordinates of the field area in the bill template image.
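Locating the area to be identified can be sketched as transferring the rectangle of an identification field area from template coordinates to the perspective-processed image. The sketch assumes that the two images may differ in pixel size and that the shared coordinate system is obtained by proportional scaling; this is an interpretation, not a detail given in the application, and `region_to_identify` is a hypothetical name.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
Size = Tuple[int, int]           # (width, height) in pixels


def region_to_identify(template_size: Size, image_size: Size,
                       template_box: Box) -> Box:
    """Scale a field area of the template into the corrected bill image."""
    template_w, template_h = template_size
    image_w, image_h = image_size
    left, top, right, bottom = template_box
    return (
        round(left * image_w / template_w),
        round(top * image_h / template_h),
        round(right * image_w / template_w),
        round(bottom * image_h / template_h),
    )
```

When both images have the same size, the function returns the template rectangle unchanged.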
The designated recognition model then performs second recognition on the area to be identified to obtain a plurality of second recognized fields, and the second recognized fields are corrected according to the high-frequency vocabulary to obtain the recognition result.
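Correction against the high-frequency vocabulary can be sketched with fuzzy string matching from the Python standard library. The use of `difflib.get_close_matches`, the 0.8 similarity cutoff and the name `correct_field` are assumptions of this sketch; the application does not say how the correction is computed.

```python
from difflib import get_close_matches
from typing import List


def correct_field(recognized: str, vocabulary: List[str], cutoff: float = 0.8) -> str:
    """Replace a recognized field by the closest high-frequency word,
    or keep it unchanged when no word is similar enough."""
    matches = get_close_matches(recognized, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else recognized
```

A field whose recognition contains a single wrong character would typically be pulled back to the vocabulary entry, whereas free-form values such as amounts stay untouched because nothing in the vocabulary resembles them.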
Through the technical scheme, the accuracy of the recognition result is guaranteed to the greatest extent, the recognition efficiency is improved, and the human resources are saved.
In one embodiment, the server may further obtain occupation conditions of the memory and the video memory, obtain the running number of the specified recognition model, and obtain the running state of the custom bill recognition template.
The server constructs a dynamic chart according to the occupation condition, the running number and the running state, and sends the dynamic chart to the front-end device, which displays it.
In one embodiment, after receiving a custom ticket image and performing recognition processing on the custom ticket through the custom ticket recognition template to obtain a recognition result, the method further includes:
evaluating the precision of the recognition result to obtain an evaluation result, and correcting the recognition result according to the evaluation result to obtain a corrected recognition result;
classifying the corrected recognition result to obtain a classification result, and determining a designated recognition model corresponding to the type of recognition result according to the classification result;
packing the recognition result and the custom bill image, inputting the recognition result and the custom bill image as a training packet to the corresponding designated recognition model, and retraining the corresponding designated recognition model through the training packet to obtain an updated recognition model;
replacing the corresponding specified recognition model with the updated recognition model.
In one embodiment, as shown in fig. 2, the present application further provides a device for identifying a custom ticket, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform:
receiving a bill template image, and determining a corresponding reference field and a corresponding identification field area according to the bill template image;
determining a corresponding appointed recognition model from a recognition model library according to the reference field and the recognition field region, matching a corresponding high-frequency vocabulary for the appointed recognition model, and correcting a recognition result through the high-frequency vocabulary, wherein the high-frequency vocabulary is the vocabulary with the occurrence frequency greater than a preset threshold value in the reference field and the recognition field region;
constructing a self-defined bill recognition template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the self-defined bill recognition template in a classified manner;
receiving an identification service request, and loading the custom bill identification template on a corresponding starting port according to the identification service request;
and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
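The high-frequency vocabulary referred to in the instructions above, that is, the words whose occurrence frequency exceeds a preset threshold, can be sketched as a simple frequency count. The sketch assumes that the text of the reference fields and identification field areas has already been split into words; `high_frequency_vocabulary` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List


def high_frequency_vocabulary(words: Iterable[str], threshold: int) -> List[str]:
    """Return the words that occur more often than the preset threshold."""
    counts = Counter(words)
    return sorted(word for word, count in counts.items() if count > threshold)
```

The comparison is strict, so a word that occurs exactly as often as the threshold is not included.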
In one embodiment, the present application further provides a non-transitory computer storage medium storing computer-executable instructions configured to:
receiving a bill template image, and determining a corresponding reference field and a corresponding identification field area according to the bill template image;
determining a corresponding appointed recognition model from a recognition model library according to the reference field and the recognition field region, matching a corresponding high-frequency vocabulary for the appointed recognition model, and correcting a recognition result through the high-frequency vocabulary, wherein the high-frequency vocabulary is the vocabulary with the occurrence frequency greater than a preset threshold value in the reference field and the recognition field region;
constructing a self-defined bill recognition template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the self-defined bill recognition template in a classified manner;
receiving an identification service request, and loading the custom bill identification template on a corresponding starting port according to the identification service request;
and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the device and media embodiments, the description is relatively simple as it is substantially similar to the method embodiments, and reference may be made to some descriptions of the method embodiments for relevant points.
The device and the medium provided by the embodiment of the application correspond to the method one to one, so the device and the medium also have the similar beneficial technical effects as the corresponding method, and the beneficial technical effects of the method are explained in detail above, so the beneficial technical effects of the device and the medium are not repeated herein.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for identifying a custom bill, characterized by comprising:
receiving a bill template image, and determining a corresponding reference field and a corresponding identification field area according to the bill template image;
determining a corresponding appointed recognition model from a recognition model library according to the reference field and the recognition field region, matching a corresponding high-frequency vocabulary for the appointed recognition model, and correcting a recognition result through the high-frequency vocabulary, wherein the high-frequency vocabulary is the vocabulary with the occurrence frequency greater than a preset threshold value in the reference field and the recognition field region;
constructing a self-defined bill recognition template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the self-defined bill recognition template in a classified manner;
receiving an identification service request, and loading the custom bill identification template on a corresponding starting port according to the identification service request;
and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
2. The method for recognizing the self-defined bill according to claim 1, wherein before receiving the self-defined bill image and performing recognition processing on the self-defined bill image through the self-defined bill recognition template to obtain a recognition result, the method further comprises:
carrying out image optimization processing on the user-defined bill image, wherein the image optimization processing at least comprises the following steps: denoising, sharpening, adjusting brightness and smoothly zooming;
carrying out first position detection on the self-defined bill image, and carrying out first preset angle rotation processing on the self-defined bill according to a detection result;
text region detection is carried out on the user-defined bill image to obtain a plurality of text regions, and the user-defined bill image is cut according to the text regions to obtain a plurality of sub-images with the same number as the text regions;
performing second position detection on the plurality of sub-images, and performing second preset angle rotation processing on the plurality of sub-images according to the detection result;
and determining the blank areas corresponding to the sub-images respectively, and cutting the blank areas, wherein the blank areas are areas not containing texts.
3. The method for recognizing the self-defined bill according to claim 1, wherein the steps of receiving a self-defined bill image, and recognizing the self-defined bill image through the self-defined bill recognition template to obtain a recognition result specifically include:
receiving a user-defined bill image, and carrying out first identification on the user-defined bill image through the specified identification model to obtain a plurality of first identified fields;
comparing the first identified fields with the reference fields to obtain a plurality of same fields, and grouping the same fields according to a preset allocation strategy to obtain a plurality of groups of same fields;
respectively carrying out perspective correction processing on the multiple groups of same fields according to preset datum points in the reference fields, and carrying out integration processing on multiple perspective results to obtain a user-defined bill image after perspective processing;
constructing the same coordinate system for the bill template image and the self-defined bill image after perspective processing, and determining the area to be identified in the self-defined bill image after perspective processing through the coordinates of the identification field area in the bill template image;
performing second identification on the area to be identified through the specified identification model to obtain a plurality of second identified fields;
and correcting the second recognized field according to the high-frequency vocabulary to obtain a recognition result.
4. The method for recognizing the self-defined bill according to claim 1, wherein after the self-defined bill recognition template is constructed according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and is classified and stored, the method further comprises:
receiving a deployment instruction, and reading the self-defined bill identification template according to the deployment instruction to generate an appointed identification model configuration file and an identification service configuration file, wherein the appointed identification model configuration file at least comprises parameter information of the appointed identification model, and the identification service configuration file at least comprises parameter information corresponding to the bill template image, the reference field, the identification field area and the high-frequency field respectively;
obtaining a starting port of the user-defined bill identification template according to a pre-stored port starting strategy, and storing the starting port into the specified identification model configuration file and the identification service configuration file;
acquiring a required video memory of the appointed identification model according to the appointed identification model configuration file, and adding a calling instruction corresponding to the video memory and the file position of the appointed identification model configuration file to a first starting script;
and adding the file position of the identification service configuration file to a second starting script.
5. The method for identifying a custom ticket according to claim 4, wherein receiving an identification service request and loading the custom ticket identification template at a corresponding start port according to the identification service request specifically comprises:
receiving an identification service request, and determining the corresponding self-defined bill identification template according to the identification service request;
running the first start script and the second start script;
distributing the resources of the video memory for the starting port, and loading the specified identification model, the bill template image, the reference field, the identification field area and the high-frequency field at the starting port.
6. The method for identifying the custom ticket according to claim 4, wherein after adding the file location of the identification service profile to the second start-up script, the method further comprises:
receiving a service stopping instruction, deleting the first script and the second script according to the service stopping instruction, and deleting the contents in the identification model configuration file and the identification service configuration file;
and deleting the user-defined bill identification template after classified storage from a database according to the service stopping instruction, and releasing storage resources.
7. The method for identifying the custom ticket according to claim 4, wherein after adding the file location of the identification service profile to the second start-up script, the method further comprises:
receiving a service export instruction, and packaging the identification model configuration file, the identification service configuration file, the first start script and the second start script according to the service export instruction to obtain a packaged start file;
determining a file downloading interface, and loading the packaged starting file to the file downloading interface.
8. The method for identifying the custom ticket according to claim 1, further comprising:
acquiring the occupation conditions of a memory and a video memory, acquiring the running number of the specified identification model, and acquiring the running state of the user-defined bill identification template;
and constructing a dynamic chart according to the occupation condition, the running number and the running state, and sending the dynamic chart to front-end equipment for displaying through the front-end equipment.
9. An apparatus for identifying a custom ticket, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform:
receiving a bill template image, and determining a corresponding reference field and a corresponding identification field area according to the bill template image;
determining a corresponding appointed recognition model from a recognition model library according to the reference field and the recognition field region, matching a corresponding high-frequency vocabulary for the appointed recognition model, and correcting a recognition result through the high-frequency vocabulary, wherein the high-frequency vocabulary is the vocabulary with the occurrence frequency greater than a preset threshold value in the reference field and the recognition field region;
constructing a self-defined bill recognition template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the self-defined bill recognition template in a classified manner;
receiving an identification service request, and loading the custom bill identification template on a corresponding starting port according to the identification service request;
and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
10. A non-transitory computer storage medium storing computer-executable instructions, the computer-executable instructions configured to:
receiving a bill template image, and determining a corresponding reference field and a corresponding identification field area according to the bill template image;
determining a corresponding appointed recognition model from a recognition model library according to the reference field and the recognition field region, matching a corresponding high-frequency vocabulary for the appointed recognition model, and correcting a recognition result through the high-frequency vocabulary, wherein the high-frequency vocabulary is the vocabulary with the occurrence frequency greater than a preset threshold value in the reference field and the recognition field region;
constructing a self-defined bill recognition template according to the bill template image, the reference field, the recognition field area, the designated recognition model and the high-frequency vocabulary, and storing the self-defined bill recognition template in a classified manner;
receiving an identification service request, and loading the custom bill identification template on a corresponding starting port according to the identification service request;
and receiving a user-defined bill image, and identifying the user-defined bill image through the user-defined bill identification template to obtain an identification result.
CN202111324906.5A 2021-11-10 2021-11-10 User-defined bill identification method, device and medium Pending CN113989817A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111324906.5A CN113989817A (en) 2021-11-10 2021-11-10 User-defined bill identification method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111324906.5A CN113989817A (en) 2021-11-10 2021-11-10 User-defined bill identification method, device and medium

Publications (1)

Publication Number Publication Date
CN113989817A true CN113989817A (en) 2022-01-28

Family

ID=79747570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111324906.5A Pending CN113989817A (en) 2021-11-10 2021-11-10 User-defined bill identification method, device and medium

Country Status (1)

Country Link
CN (1) CN113989817A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315705A (en) * 2023-10-10 2023-12-29 河北神玥软件科技股份有限公司 Universal card identification method, device and system, electronic equipment and storage medium
CN117437648A (en) * 2023-12-20 2024-01-23 国网浙江省电力有限公司金华供电公司 Processing method and system for automatic sample sealing equipment for electric power material construction

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315705A (en) * 2023-10-10 2023-12-29 河北神玥软件科技股份有限公司 Universal card identification method, device and system, electronic equipment and storage medium
CN117315705B (en) * 2023-10-10 2024-04-30 河北神玥软件科技股份有限公司 Universal card identification method, device and system, electronic equipment and storage medium
CN117437648A (en) * 2023-12-20 2024-01-23 国网浙江省电力有限公司金华供电公司 Processing method and system for automatic sample sealing equipment for electric power material construction
CN117437648B (en) * 2023-12-20 2024-03-05 国网浙江省电力有限公司金华供电公司 Processing method and system for automatic sample sealing equipment for electric power material construction

Similar Documents

Publication Publication Date Title
CN113989817A (en) User-defined bill identification method, device and medium
CN110019298B (en) Data processing method and device
CN115391439B (en) Document data export method, device, electronic equipment and storage medium
CN110287125A (en) Software routine test method and device based on image recognition
CN114239097A (en) Method and device for generating process file, storage medium and electronic device
CN113010169A (en) Method and apparatus for converting UI diagram into code file
CN111984666B (en) Database access method, apparatus, computer readable storage medium and computer device
CN111857712A (en) Form processing method, device, terminal and medium
CN113641592B (en) Test sequence generation method and device
CN114359533A (en) Page number identification method based on page text and computer equipment
CN112818937A (en) Excel file identification method and device, electronic equipment and readable storage medium
CN109584091B (en) Generation method and device of insurance image file
CN113835704B (en) Layout file generation method, device, equipment and storage medium
CN112182413B (en) Intelligent recommendation method and server based on big teaching data
CN114138787A (en) Bar code identification method, equipment and medium
CN114037493A (en) Method and system for generating bidding document
CN110457332B (en) Information processing method and related equipment
CN112508656A (en) Guest-obtaining information processing method and device
CN110717131B (en) Page revising monitoring method and related system
CN110991164B (en) Legal document processing method and device
CN113098961A (en) Component uploading method, device and system, computer equipment and readable storage medium
CN113961272B (en) Personalized page display method and system
CN112612915B (en) Picture labeling method and device
CN113792137B (en) Method, system, intelligent terminal and storage medium for searching middle-stage research and development materials
CN113569538A (en) Document generation method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination