CN113239921A - Task grading and distributing method and system for OCR (optical character recognition) service - Google Patents

Info

Publication number
CN113239921A
Authority
CN
China
Prior art keywords
task
field
meaning
tasks
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110504025.5A
Other languages
Chinese (zh)
Inventor
王杰华
蒋昆
詹成富
杨洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University Huigu General Technology Co ltd
Original Assignee
Shanghai Jiaotong University Huigu General Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University Huigu General Technology Co ltd
Priority to CN202110504025.5A
Publication of CN113239921A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The invention relates to a task grading and distribution method and system for an OCR (optical character recognition) service, and belongs to the field of big data processing. A graded recognition mechanism lets the OCR recognition service process high-grade tasks first, so that high-priority recognition tasks always receive a quick response. For task distribution and callback response, the system supports multipoint task distribution to make full use of each client's OCR recognition service (which integrates the OCR product components). A task completion notification address is passed in when a recognition task is submitted, and the caller is notified at that address as soon as the recognition task completes.

Description

Task grading and distributing method and system for OCR (optical character recognition) service
Technical Field
The invention belongs to the field of big data processing, and relates to a task grading and distributing method and system of an OCR (optical character recognition) service.
Background
OCR recognition components are generally provided by different manufacturers, and the OCR recognition service is exposed through an interface. The service is responsible for receiving recognition tasks and returning recognition results. When the service is coupled directly to business systems, problems such as high concurrency, exception handling and retry handling are often encountered, so practical applications face the problem of how to use the recognition service efficiently.
Disclosure of Invention
In view of the above, the present invention provides a task ranking and distributing method and system for an OCR recognition service.
In order to achieve the purpose, the invention provides the following technical scheme:
a task grading and distribution method for an OCR recognition service, the method comprising the following steps:
S1: storing the tasks to be recognized in a pending queue according to their grades;
S2: the server obtains the tasks by grade and distributes them to the recognition clients, with higher-grade tasks processed first;
S3: after receiving a task, the recognition client interacts with the OCR service and, once the interface returns a result, calls back the completion notification.
Optionally, in S1, the system stores the access address of the material to be recognized, the task level, and the recognition completion notification address.
Optionally, in S2, the server obtains the tasks in priority order and builds a queue to be consumed, then pushes the queued tasks to a plurality of clients for recognition; after a client finishes processing, it updates the task status and calls back the notification address to complete the recognition.
Optionally, in S2, the OCR recognition service divides tasks into priorities 1 to 4 and processes high-priority tasks first during recognition, so that high-priority tasks can be responded to at any time while the recognition resources are being used; a callback notification address is received when a task is created, and the callback is used to notify the caller as soon as the recognition task completes, keeping the delay between completion and notification to a minimum so that the overall recognition flow runs smoothly.
Optionally, in S2, the server is responsible for obtaining and distributing tasks by priority, so as to ensure that every recognition task receives a correct response.
Optionally, in S2, the clients consist of a plurality of nodes that interact with the service to carry out recognition, error determination, error retry and callback notification for each task, thereby ensuring that task execution progresses.
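The graded queue of S1 and the priority-ordered batch acquisition of S2 can be sketched as follows. This is only an illustrative sketch, not the implementation of the invention: the class and method names (`GradedQueue`, `put`, `take_batch`) are hypothetical, and the text does not say which end of the 1 to 4 range is the highest grade, so the sketch assumes that 1 is the highest.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedTask:
    priority: int                         # 1-4; ASSUMPTION: 1 is the highest grade
    seq: int                              # tie-breaker, keeps FIFO order within a grade
    picture_url: str = field(compare=False)
    callback_url: str = field(compare=False)


class GradedQueue:
    """Hypothetical pending queue that releases tasks grade by grade."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def put(self, picture_url, priority, callback_url):
        # S1: store the task together with its grade and callback address.
        if priority not in (1, 2, 3, 4):
            raise ValueError("priority must be between 1 and 4")
        task = QueuedTask(priority, next(self._counter), picture_url, callback_url)
        heapq.heappush(self._heap, task)

    def take_batch(self, size):
        # S2: the server takes up to `size` tasks, highest grade first.
        batch = []
        while self._heap and len(batch) < size:
            batch.append(heapq.heappop(self._heap))
        return batch
```

A binary heap keeps insertion and removal at logarithmic cost, and the sequence counter preserves submission order among tasks of the same grade.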
Optionally, the identification task is:
the field name is id, the field type is int, the field meaning is the record id, and it is the primary key;
the field name is tp_url, the field type is Varchar(500), the field meaning is the picture access address, and it is not a primary key;
the field name is Recall_url, the field type is Varchar(500), the field meaning is the callback notification address, and it is not a primary key;
the field name is Create_date, the field type is date, the field meaning is the task creation time, and it is not a primary key;
the field name is Comp_date, the field type is date, the field meaning is the task completion time, and it is not a primary key;
the field name is Zt, the field type is int, and the field meaning is the task status: 0 means to be processed, 1 means completed (normal), and 2 means completed (abnormal); a state 3 is also listed, but its meaning is not given; it is not a primary key;
the field name is Bz, the field type is Varchar(1000), the field meaning is remark information, and it is not a primary key;
the field name is Task_server, the field type is Varchar(50), the field meaning is the number of the node processing the task, and it is not a primary key.
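For illustration, the recognition task record listed above can be modelled as a small data class. This is a sketch under stated assumptions: the Python names (`OcrTask`, `TaskStatus`, `complete`) are hypothetical, `datetime` stands in for the `date` column type, and only the three status values whose meaning is given in the text are modelled.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import IntEnum
from typing import Optional


class TaskStatus(IntEnum):
    PENDING = 0         # to be processed
    DONE_NORMAL = 1     # completed, normal
    DONE_ABNORMAL = 2   # completed, abnormal
    # The text also lists a state 3 without giving its meaning; it is left out here.


@dataclass
class OcrTask:
    id: int                               # record id (primary key)
    tp_url: str                           # picture access address
    recall_url: str                       # callback notification address
    create_date: datetime                 # task creation time
    comp_date: Optional[datetime] = None  # task completion time
    zt: TaskStatus = TaskStatus.PENDING   # task status
    bz: str = ""                          # remark information
    task_server: str = ""                 # number of the node processing the task

    def complete(self, ok, note="", when=None):
        """Record the outcome of recognition on the task row."""
        self.zt = TaskStatus.DONE_NORMAL if ok else TaskStatus.DONE_ABNORMAL
        self.bz = note
        self.comp_date = when or datetime.now()
```

Note that the listed fields do not include a column for the task grade, so none is added in this sketch.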
Optionally, in S3, the registration content of the client is:
the field name is id, the field type is int, the field meaning is the record id, and it is the primary key;
the field name is C_name, the field type is Varchar(500), the field meaning is the client name, and it is not a primary key;
the field name is C_IP, the field type is Varchar(500), the field meaning is the client IP address, and it is not a primary key;
the field name is R_datetime, the field type is datetime, the field meaning is the registration time, and it is not a primary key;
the field name is Last_checkTime, the field type is datetime, the field meaning is the last handshake time, and it is not a primary key;
the field name is Zt, the field type is int, and the field meaning is the client status: 10 indicates normal and 2 indicates missing; it is not a primary key.
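The registration record above, with its last handshake time and status field, suggests a simple heartbeat check. The following sketch shows one way it might work; the function names and the 30-second silence threshold are assumptions that do not come from the text, and the status values are used exactly as they are printed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STATUS_NORMAL = 10   # value as printed in the text
STATUS_MISSING = 2   # value as printed in the text


@dataclass
class ClientRecord:
    id: int
    c_name: str                 # client name
    c_ip: str                   # client IP address
    r_datetime: datetime        # registration time
    last_check_time: datetime   # last handshake time
    zt: int = STATUS_NORMAL     # client status


def handshake(client, now):
    """A client reports in: refresh its handshake time and mark it normal."""
    client.last_check_time = now
    client.zt = STATUS_NORMAL


def mark_missing(clients, now, max_silence=timedelta(seconds=30)):
    """Mark clients that have been silent for too long; return those newly marked."""
    missing = []
    for client in clients:
        if client.zt == STATUS_NORMAL and now - client.last_check_time > max_silence:
            client.zt = STATUS_MISSING
            missing.append(client)
    return missing
```

The server would then distribute tasks only to clients whose status is still normal.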
A computer system comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, the processor when executing the computer program implementing the method according to any one of claims 1-8.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-8.
The invention has the following beneficial effects: the system receives materials with priorities, and the recognition service responds by grade, which guarantees timely processing of high-priority tasks; a callback notification is sent after processing is finished; and when a node becomes abnormal, the task is switched to another node so that the service continues without interruption. Together these form an efficient OCR recognition system.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the server processing logic;
FIG. 3 is a flow chart of client processing logic;
FIG. 4 is an overall flow chart of OCR recognition;
FIG. 5 is a schematic diagram of the client registration and health maintenance mechanism.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for illustrative purposes only; they are schematic representations rather than depictions of an actual product and are not to be construed as limiting the invention. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product. Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
First, hierarchical task reception
As shown in FIG. 1, when a task is submitted, the system requires the caller to pass in the address of the picture to be recognized, a recognition priority (1-4), and a callback notification address. The business system provides an accessible picture address as required, and a notification is sent to the callback address after recognition is completed, which completes the overall process. The task record is shown in Table 1.
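Table 1 Recognition task table ocr_init
Field name     Field type       Field meaning                                                                                  Primary key
id             int              Record id                                                                                      Yes
tp_url         Varchar(500)     Picture access address                                                                         No
Recall_url     Varchar(500)     Callback notification address                                                                  No
Create_date    date             Task creation time                                                                             No
Comp_date      date             Task completion time                                                                           No
Zt             int              Status: 0 to be processed; 1 completed (normal); 2 completed (abnormal); 3 (meaning not given) No
Bz             Varchar(1000)    Remark information                                                                             No
Task_server    Varchar(50)      Number of the node processing the task                                                         No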
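As an illustration of the reception step, a submission carrying the three required items could be checked before it is stored. The function below is a hypothetical sketch: the name `validate_request`, the restriction to http and https addresses, and the wording of the messages are assumptions, not part of the described system.

```python
from urllib.parse import urlparse


def validate_request(picture_url, priority, callback_url):
    """Return a list of problems; an empty list means the request is acceptable."""
    problems = []
    # Both addresses must be reachable URLs (ASSUMPTION: http or https only).
    for label, value in (("picture_url", picture_url), ("callback_url", callback_url)):
        parts = urlparse(value)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"{label} must be an absolute http(s) URL")
    # The recognition priority must be an integer grade from 1 to 4.
    if isinstance(priority, bool) or not isinstance(priority, int) or not 1 <= priority <= 4:
        problems.append("priority must be an integer from 1 to 4")
    return problems
```

Rejecting malformed requests at reception keeps tasks without a usable callback address out of the pending queue.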
Second, hierarchical task acquisition and distribution
The server sorts the list of tasks to be recognized by priority, takes a batch of unprocessed tasks, and sends them to the client nodes by request; at this point each task is marked with the node that is currently processing it. At intervals, the server checks the health of the tasks being processed; when it finds that a task has timed out, it redistributes the task to a healthy client so that the task can continue, ensuring that every task is correctly executed and recognized. The server logic flow chart in FIG. 2 describes the whole server flow.
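The timeout check and redistribution described above can be sketched as follows. This is a minimal illustration under stated assumptions: the names (`InFlightTask`, `reassign_overdue`), the five-minute timeout and the round-robin choice of the next node are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class InFlightTask:
    task_id: int
    task_server: str          # node currently marked on the task
    dispatched_at: datetime   # when the task was sent to that node


def reassign_overdue(tasks, healthy_nodes, now, timeout=timedelta(minutes=5)):
    """Move timed-out tasks to a healthy node other than the current one.

    Returns a list of (task_id, old_node, new_node) tuples.
    """
    moved = []
    cursor = 0
    for task in tasks:
        if now - task.dispatched_at <= timeout:
            continue  # still within its time budget
        candidates = [n for n in healthy_nodes if n != task.task_server]
        if not candidates:
            continue  # nowhere else to send it; check again on the next pass
        target = candidates[cursor % len(candidates)]
        cursor += 1
        moved.append((task.task_id, task.task_server, target))
        task.task_server = target     # re-mark the processing node
        task.dispatched_at = now      # restart the timeout clock
    return moved
```

Restarting the clock on reassignment prevents a freshly moved task from being treated as overdue on the next check.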
Third, client task processing
After receiving a task, the client calls the recognition service to perform OCR processing on the material. If recognition returns an exception, an error or blank content, the client switches to another node and retries; if the retry returns the same erroneous result, a recognition error is returned. If the task completes normally, the client calls back the notification address and updates the status in the task table (see Table 2). The client processing flow is described in detail in FIG. 3. FIG. 4 is an overall flow chart of OCR recognition, and FIG. 5 is a schematic diagram of the client registration and health maintenance mechanism.
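Table 2 Client registration table ocr_client_list
Field name        Field type      Field meaning                      Primary key
id                int             Record id                          Yes
C_name            Varchar(500)    Client name                        No
C_IP              Varchar(500)    Client IP address                  No
R_datetime        datetime        Registration time                  No
Last_checkTime    datetime        Last handshake time                No
Zt                int             Status: 10 normal; 2 missing       No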
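The client-side handling described in this section (retry on another node after an exception, an error or a blank result, then a callback) can be sketched as below. The callables `recognize` and `notify` are hypothetical stand-ins for the vendor OCR interface and the callback request, and sending the callback on failure as well as on success is an assumption of this sketch.

```python
def process_task(picture_url, nodes, recognize, notify, callback_url):
    """Try each OCR node in turn and report the outcome to the callback address.

    recognize(node, picture_url) -> str and notify(callback_url, payload)
    are hypothetical callables supplied by the caller.
    """
    last_error = None
    for node in nodes:
        try:
            text = recognize(node, picture_url)
        except Exception as exc:             # abnormal or error response
            last_error = f"{node}: {exc}"
            continue                         # switch to the next node and retry
        if not text or not text.strip():     # blank content also counts as a failure
            last_error = f"{node}: blank result"
            continue
        payload = {"ok": True, "text": text, "node": node}
        notify(callback_url, payload)        # normal completion: call back
        return payload
    # Every node failed: return a recognition error to the caller.
    payload = {"ok": False, "error": last_error or "no nodes available"}
    notify(callback_url, payload)
    return payload
```

Passing the OCR call and the notification in as functions keeps the retry logic independent of any particular vendor interface.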
The invention relates to a task grading and distribution method and system for an OCR recognition service. The system receives materials with priorities, and the recognition service responds by grade, which guarantees timely processing of high-priority tasks. A callback notification is sent after processing is finished, and when a node becomes abnormal the task is switched to another node so that the service continues, thereby forming an efficient OCR recognition system.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. When programmed according to the task ranking and distribution method and technique of an OCR recognition service of the present invention, the present invention also includes the computer itself.
A computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (10)

1. A task grading and distribution method of an OCR recognition service is characterized in that: the method comprises the following steps:
S1: storing the tasks to be recognized in a pending queue according to their grades;
S2: the server obtains the tasks by grade and distributes them to the recognition clients, with higher-grade tasks processed first;
S3: after receiving a task, the recognition client interacts with the OCR service and, once the interface returns a result, calls back the completion notification.
2. A task ranking and distribution method for an OCR recognition service according to claim 1, characterized in that: in said S1, the system stores the access address of the material to be identified, the task level, and the identification completion notification address.
3. A task ranking and distribution method for an OCR recognition service according to claim 1, characterized in that: in S2, the server acquires the tasks according to the priority order and constructs a queue to be consumed, and pushes the queue to a plurality of clients for identification, and after the clients finish processing, the clients update the status of the tasks and call back the notification address to complete identification.
4. A task ranking and distribution method for an OCR recognition service according to claim 1, characterized in that: in S2, the OCR recognition service divides tasks into priorities 1 to 4 and processes high-priority tasks first during recognition, so that high-priority tasks can be responded to at any time while the recognition resources are being used; a callback notification address is received when a task is created, and the callback is used to notify the caller as soon as the recognition task completes, keeping the delay between completion and notification to a minimum so that the overall recognition flow runs smoothly.
5. A task ranking and distribution method for an OCR recognition service according to claim 3 wherein: in S2, the server is responsible for acquiring and distributing tasks according to priority, and ensuring that each identification task is correctly responded.
6. A task ranking and distribution method for an OCR recognition service according to claim 5, characterized in that: in S2, the clients consist of a plurality of nodes that interact with the service to carry out recognition, error determination, error retry and callback notification for each task, thereby ensuring that task execution progresses.
7. A task ranking and distribution method for an OCR recognition service according to claim 1, characterized in that: the identification task is as follows:
the field name is id, the field type is int, the field meaning is the record id, and it is the primary key;
the field name is tp_url, the field type is Varchar(500), the field meaning is the picture access address, and it is not a primary key;
the field name is Recall_url, the field type is Varchar(500), the field meaning is the callback notification address, and it is not a primary key;
the field name is Create_date, the field type is date, the field meaning is the task creation time, and it is not a primary key;
the field name is Comp_date, the field type is date, the field meaning is the task completion time, and it is not a primary key;
the field name is Zt, the field type is int, and the field meaning is the task status: 0 means to be processed, 1 means completed (normal), and 2 means completed (abnormal); a state 3 is also listed, but its meaning is not given; it is not a primary key;
the field name is Bz, the field type is Varchar(1000), the field meaning is remark information, and it is not a primary key;
the field name is Task_server, the field type is Varchar(50), the field meaning is the number of the node processing the task, and it is not a primary key.
8. A task ranking and distribution method for an OCR recognition service according to claim 1, characterized in that: in S3, the client performs registration as follows:
the field name is id, the field type is int, the field meaning is the record id, and it is the primary key;
the field name is C_name, the field type is Varchar(500), the field meaning is the client name, and it is not a primary key;
the field name is C_IP, the field type is Varchar(500), the field meaning is the client IP address, and it is not a primary key;
the field name is R_datetime, the field type is datetime, the field meaning is the registration time, and it is not a primary key;
the field name is Last_checkTime, the field type is datetime, the field meaning is the last handshake time, and it is not a primary key;
the field name is Zt, the field type is int, and the field meaning is the client status: 10 indicates normal and 2 indicates missing; it is not a primary key.
9. A computer system comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein: the processor, when executing the computer program, implements the method of any of claims 1-8.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when executed by a processor, implements the method of any one of claims 1-8.
CN202110504025.5A 2021-05-10 2021-05-10 Task grading and distributing method and system for OCR (optical character recognition) service Pending CN113239921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110504025.5A CN113239921A (en) 2021-05-10 2021-05-10 Task grading and distributing method and system for OCR (optical character recognition) service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110504025.5A CN113239921A (en) 2021-05-10 2021-05-10 Task grading and distributing method and system for OCR (optical character recognition) service

Publications (1)

Publication Number Publication Date
CN113239921A (en) 2021-08-10

Family

ID=77132870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110504025.5A Pending CN113239921A (en) 2021-05-10 2021-05-10 Task grading and distributing method and system for OCR (optical character recognition) service

Country Status (1)

Country Link
CN (1) CN113239921A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116189210A (en) * 2023-04-23 2023-05-30 福昕鲲鹏(北京)信息科技有限公司 Image OCR (optical character recognition) method, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109656733A (en) * 2018-12-27 2019-04-19 厦门商集网络科技有限责任公司 The method and apparatus of the more OCR recognition engines of intelligent scheduling
CN110096344A (en) * 2018-01-29 2019-08-06 北京京东尚科信息技术有限公司 Task management method, system, server cluster and computer-readable medium
CN110688998A (en) * 2019-09-27 2020-01-14 中国银行股份有限公司 Bill identification method and device
CN111079397A (en) * 2019-12-25 2020-04-28 中国建设银行股份有限公司 Task file generation method and device based on image recognition
CN111586658A (en) * 2020-04-30 2020-08-25 贵州电网有限责任公司 Bluetooth transmission method and system based on image recognition service in transformer substation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination