CN111079397B - Task file generation method and device based on image recognition - Google Patents


Info

Publication number
CN111079397B
Authority
CN
China
Prior art keywords
information
instruction
header
data
combination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911358133.5A
Other languages
Chinese (zh)
Other versions
CN111079397A (en)
Inventor
杨延俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN201911358133.5A
Publication of CN111079397A
Application granted
Publication of CN111079397B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention provides a task file generation method and device based on image recognition. Header combination information is obtained through image recognition, and a task file containing instruction information is then generated from the header combination information that contains instruction-related information.

Description

Task file generation method and device based on image recognition
Technical Field
The invention relates to the field of banking and escrow business, in particular to a task file generation method and device based on image recognition.
Background
One core task of the escrow clearing business is processing the payment instructions that managers send every day and completing task sorting, instruction entry, verification, payment, and related operations in the escrow system. Because managers' systems are directly connected to the bank to different degrees, a large number of payment instructions still reach the custodian bank every day through channels such as fax and mail, so the pressure of entering instructions manually is high. At the same time, the processing time windows of different services are very strict, so escrow service processing must offer strong compatibility and timeliness.
Each manager (fund, trust, financial product, trusted user, etc.) organizes its payment instructions in the format of its own system, so the formats differ. The escrow system, acting as custodian for different customer groups, receives the fax or mail instructions that customers send, and it cannot require customers to send instructions in a uniform format.
Disclosure of Invention
To solve at least one of the above problems, an embodiment of the present invention provides a task file generating method based on image recognition, including:
acquiring an image document file in a bank escrow system and header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition;
and generating a task file containing at least one instruction information according to the header combination information containing the instruction related information.
In some embodiments, generating a task file containing at least one piece of instruction information according to the header combination information includes:
judging whether the image document file includes instruction-related information according to the header combination information;
if the instruction-related information is included, converting the header combination information into instruction information in a set format;
and establishing an association between the instruction information and a task, and persisting it to obtain the task file.
In some embodiments, the image recognition is further capable of recognizing a location of each header information, further comprising:
estimating a document template style corresponding to the header combination information according to each header information and the position of the header information;
and checking the header information according to the data structure corresponding to the document template style.
In some embodiments, the verifying the header information according to the data structure corresponding to the document template style includes:
and sending the number of the document template style to the image recognition model so that the image recognition model recognizes the image document file again according to the information types corresponding to the positions in the document template style.
The embodiment of the invention also provides a task file generation method based on image recognition, which comprises the following steps:
acquiring an image document file in a bank escrow system;
and recognizing the header combination information of each form in the image document file through an image recognition model, so that an external escrow system can generate a task file containing at least one piece of instruction information according to the header combination information containing the instruction-related information.
In some embodiments, the image recognition model is a neural network model trained from historical image document files.
In certain embodiments, further comprising:
establishing the image recognition model;
the image recognition model is trained by the historical image document file marked with header combination information.
The invention also provides a task file generating device based on image recognition, which comprises:
an acquisition module, which acquires an image document file in a bank escrow system and header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition;
and the task file generation module is used for generating a task file containing at least one instruction information according to the header combination information containing the instruction related information.
In some embodiments, the task file generation module includes:
a judging unit, which judges whether the image document file includes instruction-related information according to the header combination information;
a conversion unit, which converts the header combination information into instruction information in a set format if the instruction-related information is included;
and a unit that establishes an association between the instruction information and a task and persists it to obtain the task file.
In some embodiments, the image recognition is further capable of recognizing a location of each header information, the task file generation device further comprising:
the document template style estimation module estimates a document template style corresponding to the header combination information according to each header information and the position of the header information;
and the header information verification module is used for verifying the header information according to the data structure corresponding to the document template style.
In some embodiments, the header information verification module sends the number of the document template style to the image recognition model, so that the image recognition model recognizes the image document file according to the information type corresponding to each position in the document template style.
The embodiment of the invention also provides an image recognition device, which comprises:
an image document file acquisition module, which acquires an image document file in the bank escrow system;
the identification module is used for identifying the header combination information of each form in the image document file through the image identification model, so that an external hosting system can generate a task file containing at least one instruction information according to the header combination information containing the instruction related information.
In a preferred embodiment, the image recognition model is a neural network model trained from historical image document files.
In a preferred embodiment, further comprising:
the model building module is used for building the image recognition model;
and the model training module is used for training the image recognition model through the historical image document file marked with the header combination information.
In yet another aspect, an embodiment of the present invention provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method described above when the program is executed.
In yet another aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method described above.
The beneficial effects of the invention are as follows:
The invention provides a task file generation method and device based on image recognition. Header combination information is obtained through image recognition, and a task file containing instruction information is then generated from the header combination information that contains instruction-related information.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings required for describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 shows a schematic flow chart of a task file generating method based on image recognition in an embodiment of the invention.
Fig. 2 shows a second flowchart of a task file generating method based on image recognition in an embodiment of the invention.
Fig. 3 shows a schematic flow chart of task file generation in a specific scenario of the present invention.
Fig. 4 shows the first detailed flow chart of a sub-step in an embodiment of the invention.
Fig. 5 shows the second detailed flow chart of a sub-step in an embodiment of the invention.
Fig. 6 shows the third detailed flow chart of a sub-step in an embodiment of the invention.
Fig. 7 is a schematic structural diagram of a task file generating device based on image recognition in an embodiment of the invention.
Fig. 8 is a schematic diagram showing a configuration of an image recognition apparatus in an embodiment of the present invention.
Fig. 9 shows a schematic diagram of an electronic device suitable for implementing the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a task file generation method based on image recognition, which is shown in fig. 1 and comprises the following steps:
S11: acquiring an image document file in a bank escrow system and the header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition.
In step S11, the bank escrow system is the system described in the Background that handles the escrow clearing business. It produces a large number of document files containing instruction-related information, and most of the document files received by fax or mail are image document files.
In this step, the image document file and the header combination information come from different sources. The image document file comes from the bank escrow system, and the method is executed by a task file generating device in the bank escrow system; the header combination information is recognized by an image recognition model built into an image recognition device, which may be external or internal.
In some embodiments, the image recognition device is external to the bank escrow system. For example, it may be a third-party device that cooperates with the bank escrow system; the bank escrow system acts as the user and caller and interacts with the device through a message interface.
In other embodiments, the image recognition device is a device in a bank escrow system or integrated in a task file generating device that performs the method, for example, the image recognition device and the task file generating device are the same electronic device including a processor, or the image recognition device and the task file generating device are different electronic devices, and the invention is not limited thereto.
The image recognition may be OCR, i.e., a text recognition technique. An image document file is document content stored in the form of an image, such as a fax, in which the document content is embedded in the image. The image document file includes a plurality of forms, each including headers that characterize the type of information, such as "amount" and "payer", together with the corresponding information.
The header combination information is a combination of header information. Each form includes a plurality of headers and each header corresponds to one piece of header information, so each form corresponds to one piece of header combination information, for example "amount", "payee", "payer", and "payment method".
S12: generating a task file containing at least one piece of instruction information according to the header combination information containing the instruction-related information.
Specifically, the header combination information characterizes data types, so the meaning formed by its combination of fields can be judged. Instruction information must include necessary information such as a payer, a payee, a payment method, and a payment amount, so whether the header combination information includes instruction-related information (i.e., the information necessary for a payment) can be determined by checking whether it includes these necessary items, that is, whether it can form instruction information. For example, from header combination information such as "amount", "payee", and "payment method", the following instruction information can be generated: the payer pays the amount to the payee by the payment method. Each row of data in the form is thus one piece of instruction information, and a corresponding task file containing at least one piece of instruction information is generated.
Instruction information generally has a set format, so when the header combination information is determined to include instruction-related information, it can be converted into instruction information in the set format.
Then the association between the instruction information and the task is established: the recognized header combination information is converted into system-standard instruction information, a series of data processing operations is performed (such as matching the payee account, checking the amount and date formats, and back-filling the instruction service type by rule), the association between the resulting instruction information and the task is established, and the result is persisted. A task generally includes the party that executes it, its execution time, and so on; through the association, the escrow system assigns the instruction information to the party that must process it within the specified time. The association may be stored as a persisted task file. One task file may correspond to each form, or a single task file may contain the instruction information of all forms.
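The judgment and conversion in step S12 can be illustrated with a minimal Python sketch. The required header names, the dictionary layout of the task file, and the function names below are assumptions made for illustration only; the description does not specify the set format.

```python
# Hypothetical required headers; the description names them only informally.
REQUIRED_HEADERS = {"payer", "payee", "payment method", "amount"}


def contains_instruction_info(headers):
    """Return True if the recognized header combination covers every required header."""
    normalized = {h.strip().lower() for h in headers}
    return REQUIRED_HEADERS <= normalized


def rows_to_instructions(headers, rows):
    """Convert each form row into one instruction dict holding the required fields."""
    keys = [h.strip().lower() for h in headers]
    instructions = []
    for row in rows:
        record = dict(zip(keys, row))
        instructions.append({name: record[name] for name in sorted(REQUIRED_HEADERS)})
    return instructions


def build_task_file(headers, rows, task_id):
    """Associate the instructions with a task; return None for non-instruction forms."""
    if not contains_instruction_info(headers):
        return None
    return {
        "task_id": task_id,  # assumed identifier of the associated task
        "task_type": "payment instruction",
        "instructions": rows_to_instructions(headers, rows),
    }
```

In this sketch a form whose headers lack any required item yields no task file, which corresponds to leaving the document for manual handling.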
In a preferred embodiment, the image recognition can also recognize the position of each piece of header information, and the task file generation method further includes:
S01: estimating the document template style corresponding to the header combination information according to each piece of header information and its position.
In this step, because a document file is generated according to a certain template style, the template style used by the document file can be estimated from the header combination information initially recognized by the image recognition model and the positions of the headers. For example, if style A places "amount" in the upper left corner of the document image, then recognizing the "amount" header in the upper left corner indicates style A. Further, because the amount is numeric data, recognizing a letter there indicates a recognition error, and an error is reported; alternatively, the information to be recognized is matched only against digits, which narrows the matching range and reduces the matching workload.
The template style is generally determined from several pieces of header information and their positions together. For example, the positions of headers such as the amount, the payee, and the sender are combined into a multidimensional relation, which improves the accuracy of the template style estimate.
S02: verifying the header information according to the data structure corresponding to the document template style.
Specifically, the data structure at each position in a document template style does not change. As described above, the upper left corner of style A must hold numeric data; if letter data is recognized there, the recognition is judged to be erroneous, so the data structure can be used for verification.
Further, the image recognition may be performed by the image recognition model, which is built into the image recognition device. The header information may then be verified by sending the number of the document template style to the image recognition model, so that the model recognizes the image document file again according to the information type at each position in that style.
That is, in the preferred embodiment image recognition is performed twice: the first pass mainly identifies the document template style, and the second pass, building on the first, is more accurate.
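Steps S01 and S02 can be sketched as follows. The template registry, the region names, and the simple match-count scoring are hypothetical; the description only states that several headers and their positions are combined to estimate the style and that the data type expected at each position is then used for verification.

```python
# Hypothetical registry: style -> {header: (expected region, expected data type)}
TEMPLATES = {
    "A": {"amount": ("top-left", "numeric"), "payee": ("top-right", "text")},
    "B": {"amount": ("bottom-right", "numeric"), "payee": ("top-left", "text")},
}


def estimate_style(recognized):
    """recognized maps header -> region; return the style with the most matches, or None."""
    best_style, best_score = None, 0
    for style, layout in TEMPLATES.items():
        score = sum(
            1
            for header, region in recognized.items()
            if header in layout and layout[header][0] == region
        )
        if score > best_score:
            best_style, best_score = style, score
    return best_style


def check_values(style, values):
    """Return the headers whose recognized values violate the type expected by the style."""
    errors = []
    for header, (_, dtype) in TEMPLATES[style].items():
        value = values.get(header, "")
        # A numeric field may contain digits and at most one decimal point.
        if dtype == "numeric" and not value.replace(".", "", 1).isdigit():
            errors.append(header)
    return errors
```

A header reported by `check_values` would be a candidate for the second recognition pass, in which the model is told the template style number.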
An embodiment of the second aspect of the present invention provides a task file generation method based on image recognition, executed by the image recognition device. As shown in Fig. 2, it specifically includes:
S21: acquiring an image document file in a bank escrow system;
S22: recognizing the header combination information of each form in the image document file through an image recognition model, so that an external escrow system can generate a task file containing at least one piece of instruction information according to the header combination information containing the instruction-related information.
In a preferred embodiment, the image recognition model is a neural network model trained on historical image document files. For example, the model is a convolutional neural network built and trained with the TensorFlow deep learning framework and implemented in Python; the details are not described here.
The model may be built online or offline, i.e., the steps of building and training the model may be included in the steps of the invention or performed outside them. These steps specifically include:
establishing the image recognition model;
the image recognition model is trained by the historical image document file marked with header combination information.
The following describes a specific scenario with reference to Fig. 3.
Template configuration: in a specific application, the invention provides escrow system operators with a complete set of template configuration functions, such as adding, modifying, and deleting, for maintaining template configuration information. Template mapping information can be configured along dimensions such as template type (the format of the fax or mail instruction form), service type, asset combination, and manager, so as to establish the mapping between the image instruction documents provided by clients and the templates. In general, the instruction faxes of one manager use a uniform form, with at most two or three variants, so several forms (for example, top-bottom forms and left-right forms) can be configured as templates under one manager. For cases where a specific combination under the same manager uses a different form, template configuration also supports the combination dimension.
Automatic sorting of tasks: managers send many documents by fax or mail every day, and not all of them are instruction documents; they may include other types such as inter-bank transactions, contracts, and accounting statements. Only instruction documents generate tasks for operators to claim and enter. After receiving a file, this module first sends it to the OCR image recognition system, which recognizes the form headers in advance and returns the header combination information. The module then matches this by rule against the combination information maintained in the system and automatically generates a payment-instruction task for subsequent instruction entry, replacing manual sorting.
Automatic identification of instructions: after a task has been sorted as a payment-instruction task, this module is called. It uses the task's combination information to find the template information mapped to the corresponding combination or manager, sends the fax or mail attachments attached to the task to the OCR image recognition system, and receives the information that OCR recognizes successfully. It then checks whether the operation has timed out and whether admission conditions such as manual entry are met, and pushes the information to the data processing and persistence module for data processing and persistence.
Instruction data processing and persistence: the information returned by recognition is converted into system-standard instruction information, a series of data processing operations is performed (such as matching the payee account, checking the amount and date formats, and back-filling the instruction service type by rule), the association between the resulting instruction information and the task is established, and the result is persisted.
Specifically, for template configuration, the data structure has two levels: a template information table and a template-to-combination/manager mapping table. Records in the template information table are uniquely identified by conditions such as template number, template name, file source, and file name. The mapping table is associated with the template information table through the template number as a foreign key, is uniquely identified by dimensions such as asset combination, manager, delegate, template information, and document name, and adds constraint checks and the like. At the software level, templates can be added, modified, deleted, queried, audited, and reviewed.
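The two-level structure can be sketched with Python's built-in sqlite3 module. The table names, column names, and column types are assumptions for illustration; the description names only the kinds of fields and the foreign-key relation on the template number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE template_info (
        template_no   TEXT PRIMARY KEY,
        template_name TEXT NOT NULL,
        file_source   TEXT NOT NULL,
        file_name     TEXT NOT NULL,
        UNIQUE (template_name, file_source, file_name)
    );
    CREATE TABLE template_mapping (
        id                INTEGER PRIMARY KEY,
        template_no       TEXT NOT NULL REFERENCES template_info (template_no),
        asset_combination TEXT NOT NULL,
        manager           TEXT NOT NULL,
        delegate          TEXT NOT NULL,
        document_name     TEXT NOT NULL,
        UNIQUE (template_no, asset_combination, manager, delegate, document_name)
    );
    """
)
```

With this layout, a mapping row that refers to an unknown template number is rejected by the database, which is one way to realize the constraint checks mentioned above.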
Automatic task sorting is responsible for scanning the faxes and mail attachments received by the escrow system, sending them to the OCR recognition system for pre-sorting, recognizing the form headers, judging whether each is a payment instruction attachment, and completing the automatic sorting according to the combination information. Specifically, as shown in Fig. 4, it includes:
Document scanning: faxes and mails entering the escrow system are scanned, and image documents that have not been sorted are sent to the OCR recognition system to recognize the headers (combination information).
Automatic sorting: the headers of instruction faxes are relatively fixed. If the header combination fields are recognized and returned, the combination information existing in the system is looked up in reverse according to the rules, and a task of type "payment instruction" is generated automatically for an operator to process later; otherwise no sorting is performed and the document is left for manual processing.
Transaction control: operators at the sorting post may be processing the same fax or mail documents at the same time, so transaction and concurrency control is required. If, when automatic sorting is about to commit, the document is found to have been sorted manually, manual sorting takes precedence and the database operations already performed in the transaction are rolled back. A pessimistic lock is placed on each document's information so that the same document cannot be sorted by two parties at the same time.
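The rule that a document may be sorted only once, whether automatically or manually, can be illustrated with an in-memory sketch. The class below uses a thread lock as a stand-in for a database row lock; the class and method names are hypothetical, and a real system would hold the lock inside a database transaction instead.

```python
import threading


class SortingDesk:
    """In-memory stand-in for pessimistic locking of document records."""

    def __init__(self):
        self._guard = threading.Lock()
        self._sorted_by = {}  # document id -> "auto" or "manual"

    def try_sort(self, doc_id, actor):
        """Claim a document for sorting; return False if it was already sorted."""
        with self._guard:
            if doc_id in self._sorted_by:
                # The later party loses and is expected to roll back its own work.
                return False
            self._sorted_by[doc_id] = actor
            return True

    def sorted_by(self, doc_id):
        """Return who sorted the document, or None if it is still unsorted."""
        with self._guard:
            return self._sorted_by.get(doc_id)
```

If an operator has already claimed a document, a later `try_sort(doc_id, "auto")` returns False, which corresponds to rolling back the automatic sorting transaction.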
Automatic instruction identification: after a task has been sorted, the template information mapped to the corresponding combination or manager is found through the task's combination information, the fax or mail attachment attached to the task is sent to the OCR image recognition system, the successfully recognized information is received, and a payment instruction is generated. Specifically, as shown in Fig. 5, at the software execution level it includes:
Monitoring asynchronous messages: after task sorting (automatic or manual), an asynchronous message is sent (JMS messages are used). A listener component of the automatic instruction identification module monitors and receives the asynchronous messages and starts the instruction identification flow in a new thread.
Acquiring the template: template information is obtained through the combination information in the task for use by the OCR recognition system, which improves recognition accuracy. The lookup follows the priority of the combination dimension, the delegate dimension, and the manager dimension; tasks without template information are not processed by OCR.
Controlling the identification flow: the template information obtained for the combination or manager is passed to the OCR recognition system together with the image document attached to the task for instruction recognition. The recognized information is handed to the instruction data processing and persistence module for data processing, compliance checking, and data persistence; finally the instruction is associated with the task, and flow control such as updating the task's recognition state is performed.
Exception handling: exceptions arising anywhere in the flow are captured, including business exceptions and system exceptions. Business exceptions are generally foreseeable problems in client data, such as a wrongly recognized document format, an incorrect field format, or an empty required field. System exceptions are mostly thrown by the parsing program itself, for example a failed database operation; they require the flow to be interrupted and the transaction to be rolled back.
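The template lookup by dimension priority can be sketched as below. The order combination, then delegate, then manager is assumed from the order in which the dimensions are listed; the dictionary layout and the function name are hypothetical.

```python
# Assumed priority: the most specific dimension is consulted first.
DIMENSION_PRIORITY = ("combination", "delegate", "manager")


def find_template(task, mappings):
    """Return the template number for a task, or None if no template is configured.

    task maps a dimension name to the task's value for that dimension;
    mappings maps a dimension name to a {value: template number} dict.
    """
    for dimension in DIMENSION_PRIORITY:
        value = task.get(dimension)
        template_no = mappings.get(dimension, {}).get(value)
        if template_no is not None:
            return template_no
    # No template: the task is not sent to OCR recognition.
    return None
```

For example, a task whose combination has its own template uses that template even when its manager also has one configured.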
Instruction data processing and persistence comprises data cleaning, instruction data processing, compliance checking, and instruction persistence, all performed on the recognized information. As shown in fig. 6, at the software implementation level it specifically includes the following.
Data cleaning: the known field information is cleaned to remove recognition errors caused by unclear images or excessive noise. For example, currency symbols such as $ and any characters other than digits and the decimal point are removed from amount fields; non-numeric characters are removed from date fields, which are then converted into a recognizable date format; and illegal characters such as Chinese characters are prevented from appearing in account information. The purpose of data cleaning is to eliminate interference and prepare for subsequent processing. For the information fed back by OCR recognition, preset rules are applied to the corresponding fields: after the OCR feedback is received, the system loads the preset rules and matches each field with regular expressions, retaining the data that satisfies the matching rule and deleting the remaining dirty data; if a field does not satisfy the matching rule, it is set directly to null. Table 1 shows the fields and their data cleaning rules.
Data processing: the cleaned data is processed and converted into instruction information. For example, account information in the system is retrieved using the payer and payee account numbers, the payment system number required for payment is back-filled, and the service type of the instruction is back-filled according to rules based on the account information and the payment purpose information. Data processing relies on information maintained in the system, such as accounts and combinations, to perform the necessary data conversion, and maps the processed feedback information to the instruction format used in the system for the subsequent compliance check. Table 2 shows the processing information and the corresponding processing rules.
Compliance checking: whether the payer account is a fund account under the combination is judged from the payer account information and the combination information, and if not, the instruction is rejected directly; the payment date must be greater than or equal to the current working day, otherwise the instruction is rejected; the amount field must be greater than 0; and the service type must be applicable to the asset, otherwise the instruction is rejected directly. The compliance check is performed on the instruction information produced by data processing, against the data set of the system. It mainly checks whether the instruction information produced from the OCR feedback is compliant, and a tolerance is set; if the compliance check is not passed and the tolerance is not reached, generation of the instruction is refused directly. Table 3 shows the compliance check items and their checking rules.
Data persistence: instruction information that passes the compliance check is persisted; the absence of certain information is tolerated, and such information can be completed manually afterwards.
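The data processing step described above (account lookup with back-filling, plus conversion of currency and date values) might look roughly like the sketch below. The account list, the currency mapping, the field names, and the accepted date formats are all hypothetical placeholders, not values from the patent.

```python
from datetime import datetime

# Hypothetical stand-in for the accounts maintained in the system.
ACCOUNTS = [
    {"number": "6222001", "name": "Fund A Custody", "payment_system_no": "PS-01"},
    {"number": "6222002", "name": "Fund B Custody", "payment_system_no": "PS-02"},
]
# Illustrative symbol-to-code mapping; a real system would configure this.
CURRENCY_CODES = {"$": "USD", "€": "EUR", "£": "GBP"}
DATE_FORMATS = ("%Y%m%d", "%Y-%m-%d", "%Y/%m/%d")


def to_iso_date(text):
    """Try each accepted input format; return an ISO date string or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def process(cleaned):
    """Convert cleaned OCR fields into an instruction-shaped dictionary."""
    instruction = dict(cleaned)
    instruction["currency"] = CURRENCY_CODES.get(cleaned.get("currency_symbol"))
    instruction["pay_date"] = to_iso_date(cleaned.get("pay_date", ""))
    matches = [a for a in ACCOUNTS if a["number"] == cleaned.get("payer_account")]
    if len(matches) == 1:
        # Back-fill only when exactly one account matches.
        instruction["payer_name"] = matches[0]["name"]
        instruction["payment_system_no"] = matches[0]["payment_system_no"]
    return instruction


print(process({"currency_symbol": "$", "pay_date": "2019/12/25",
               "payer_account": "6222001"}))
```

Leaving unmatched accounts without back-filled fields, rather than guessing, keeps the decision for the later compliance check or for manual completion.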
Table 1 Data cleaning rules and corresponding fields
Field              Cleaning rule
Account number     Digits, uppercase English letters, and special symbols (-, space), etc.
Account name       Chinese characters, English letters, and special symbols (-), etc.
Amount             Digits and decimal point (.)
Date               Digits or Chinese numerals zero to nine, and delimiters (year, month, day, /, -, etc.)
Special field      For example transaction units: digits segmented by vertical bars (|)
......             ......
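A minimal sketch of rule-based cleaning in the spirit of Table 1 follows. The regular expressions and field names are illustrative assumptions rather than the rules actually used by the system; each field has one pattern that strips dirty characters and one that validates what remains, and a field that fails validation becomes `None` (the null handling described above).

```python
import re

# Hypothetical rules: (pattern of characters to delete, pattern the result must match).
RULES = {
    "account_number": (re.compile(r"[^0-9A-Z\- ]"),
                       re.compile(r"[0-9A-Z][0-9A-Z\- ]*")),
    "amount": (re.compile(r"[^0-9.]"), re.compile(r"[0-9]+(\.[0-9]+)?")),
    # Assumes a zero-padded year-month-day date, e.g. 2019-12-25.
    "date": (re.compile(r"[^0-9]"), re.compile(r"[0-9]{8}")),
}


def clean_field(field, raw):
    """Strip dirty characters from one OCR field; return None if it is unusable."""
    if field not in RULES:
        return raw  # no rule configured for this field
    dirty, valid = RULES[field]
    kept = dirty.sub("", raw).strip()
    return kept if valid.fullmatch(kept) else None


print(clean_field("amount", "$1,234.50"))
print(clean_field("date", "2019年12月25日"))
print(clean_field("amount", "N/A"))
```

The three calls print `1234.50`, `20191225`, and `None` respectively.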
Table 2 comparison of processing information and processing rules
Table 3 Compliance check items and checking rules
Compliance check item                               Checking rule
Combination and payer account information check     Whether the payer is a subscription account under the combination
Payer account information and business type check   Whether the business type is operable with accounts of such attributes
Payment date check                                  Whether the payment date is greater than or equal to the current system working day
Payment amount check                                Whether the payment amount is greater than 0
Business type and asset type check                  Whether the business type is applicable to the asset type
......                                              ......
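Three of the checks in Table 3 can be sketched as below. How the tolerance is defined is not spelled out in the text, so this example assumes one possible reading: a missing field is tolerated and left for manual completion, while a field that is present but violates a rule causes the instruction to be refused. The field names and the `combination_accounts` mapping are hypothetical.

```python
from datetime import date
from decimal import Decimal


def check_compliance(instruction, combination_accounts, working_day):
    """Return (status, details); status is 'pass', 'needs_manual_completion' or 'rejected'."""
    missing, violations = [], []

    def check(field, is_ok, message):
        value = instruction.get(field)
        if value is None:
            missing.append(field)        # tolerated: left for manual completion
        elif not is_ok(value):
            violations.append(message)   # not tolerated: instruction is refused

    allowed = combination_accounts.get(instruction.get("combination"), set())
    check("payer_account", lambda v: v in allowed,
          "payer account does not belong to the combination")
    check("pay_date", lambda v: v >= working_day,
          "payment date is earlier than the current working day")
    check("amount", lambda v: v > 0, "payment amount is not greater than 0")

    if violations:
        return "rejected", violations
    if missing:
        return "needs_manual_completion", missing
    return "pass", []


accounts = {"C1": {"6222001"}}
today = date(2019, 12, 25)
ok = {"combination": "C1", "payer_account": "6222001",
      "pay_date": date(2019, 12, 26), "amount": Decimal("100.00")}
print(check_compliance(ok, accounts, today))
print(check_compliance({**ok, "amount": Decimal("0")}, accounts, today))
print(check_compliance({**ok, "amount": None}, accounts, today))
```

Collecting all violations before deciding, instead of stopping at the first one, gives the operator the full list of reasons when an instruction is refused.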
The OCR image recognition device is implemented as follows. Model training: instructions that were paid successfully in the past, together with their corresponding documents, are used as training data to improve recognition accuracy. Template customization: based on the combinations currently operated by the host (custodian) bank and the fax and mail forms sent by managers, templates of different types are analyzed and classified, for example form layouts in top-bottom, left-right, or center-left formats, so that the approximate region of each piece of image data can be designated, which greatly improves recognition accuracy.
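One way to picture region designation by template is the sketch below. The template names, field names, and region fractions are invented for illustration; the idea is only that each template stores approximate regions as fractions of the page, which are scaled to pixel boxes for a given scan before recognition.

```python
# Hypothetical templates: field -> (left, top, right, bottom) as page fractions.
TEMPLATES = {
    "top_bottom": {
        "payer_account": (0.0, 0.0, 1.0, 0.5),
        "payee_account": (0.0, 0.5, 1.0, 1.0),
    },
    "left_right": {
        "payer_account": (0.0, 0.0, 0.5, 1.0),
        "payee_account": (0.5, 0.0, 1.0, 1.0),
    },
}


def pixel_regions(template_name, width, height):
    """Scale a template's fractional regions to pixel boxes for one scanned page."""
    regions = {}
    for field, (left, top, right, bottom) in TEMPLATES[template_name].items():
        regions[field] = (
            round(left * width),
            round(top * height),
            round(right * width),
            round(bottom * height),
        )
    return regions


print(pixel_regions("left_right", 1000, 800))
```

Storing fractions rather than pixels lets the same template be reused for faxes and mail attachments scanned at different resolutions.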
It can be understood that, in this specific scenario, OCR image recognition technology is applied to instruction recognition in the escrow clearing system. The technology is thereby applied in the actual working scenario of the business, greatly reduces the tedious and repetitive manual operations that business staff perform every day, and frees up part of their productivity. Because fund-transfer instructions demand very high information accuracy and have very low fault tolerance, the system adopts a series of auxiliary measures to improve recognition accuracy and provides a series of check rules to prevent wrongly recognized instructions from flowing downstream, thereby reducing the probability of errors. Through template configuration, when a new manager or combination is set up in an actual scenario, automatic recognition can be achieved simply by configuring the corresponding template, which improves the flexibility and extensibility of the system.
Based on the same inventive concept as the above method of the present invention, an embodiment of the present invention further provides a task file generating device based on image recognition, as shown in fig. 7, including:
the acquiring module 11 acquires an image document file in the bank escrow system and header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition;
The task file generation module 12 generates a task file including at least one instruction information based on header combination information including instruction related information.
In a preferred embodiment, the task file generation module includes:
a judging unit that judges whether the image document file includes instruction-related information according to the header combination information;
a conversion unit that, if instruction-related information is included, converts the header combination information into instruction information in a set format,
and establishes an association relation between the instruction information and the task, which is persisted to obtain the task file.
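The judging and conversion units can be sketched as follows. The set of required headers, the field names, and the JSON serialization are assumptions made for this example; the sketch treats each data row under the header combination as one instruction and links every instruction to its task through a `task_id` field.

```python
import json

# Assumed set of headers that must be present for a form to carry instructions.
REQUIRED_HEADERS = {"payer_account", "payee_account", "amount", "pay_date"}


def contains_instruction_info(headers):
    """Judging unit: does the header combination include every required header?"""
    return REQUIRED_HEADERS.issubset(headers)


def build_task_file(task_id, headers, rows):
    """Conversion unit: turn each form row into an instruction linked to the task."""
    if not contains_instruction_info(headers):
        return None
    instructions = [dict(zip(headers, row), task_id=task_id) for row in rows]
    return {"task_id": task_id, "instructions": instructions}


headers = ["payer_account", "payee_account", "amount", "pay_date"]
rows = [["6222001", "6222002", "100.00", "2019-12-25"]]
task = build_task_file("T-1", headers, rows)
# Persisting could be as simple as writing this JSON text to storage.
print(json.dumps(task, ensure_ascii=False))
print(build_task_file("T-2", ["remark"], [["hello"]]))
```

A form whose headers do not include the required fields yields `None`, so no task file is produced for documents that carry no instruction-related information.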
In a preferred embodiment, the image recognition is further capable of recognizing a position of each header information, and the task file generation device further includes:
the document template style estimation module estimates a document template style corresponding to the header combination information according to each header information and the position of the header information;
and the header information verification module is used for verifying the header information according to the data structure corresponding to the document template style.
In a preferred embodiment, the header information verification module sends the number of the document template style to an image recognition model, so that the image recognition model recognizes the image document file according to the information types corresponding to the positions in the document template style.
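The estimation and verification modules might work along the lines of the sketch below. The style numbers, the expected header orders, and the use of a single x coordinate as the "position" are simplifying assumptions for illustration only: the style whose expected order best matches the recognized headers is selected, and any remaining difference is treated as a recognition error.

```python
# Hypothetical template styles: style number -> expected headers, left to right.
TEMPLATE_STYLES = {
    1: ["payer_account", "payee_account", "amount", "pay_date"],
    2: ["pay_date", "amount", "payer_account", "payee_account"],
}


def estimate_style(headers_with_x):
    """Pick the style whose header order best matches the recognized positions."""
    ordered = [name for name, _ in sorted(headers_with_x, key=lambda item: item[1])]

    def score(expected):
        return sum(1 for got, want in zip(ordered, expected) if got == want)

    best = max(TEMPLATE_STYLES, key=lambda number: score(TEMPLATE_STYLES[number]))
    return best, ordered


def verify(style_number, ordered):
    """Headers are accepted only if they equal the structure of the estimated style."""
    return ordered == TEMPLATE_STYLES[style_number]


recognized = [("amount", 300), ("payer_account", 10),
              ("payee_account", 150), ("pay_date", 450)]
style, ordered = estimate_style(recognized)
print(style, verify(style, ordered))

# One header misread by OCR: the style is still estimated, but verification fails.
misread = [("payer_account", 10), ("payee_acc0unt", 150),
           ("amount", 300), ("pay_date", 450)]
style, ordered = estimate_style(misread)
print(style, verify(style, ordered))
```

When verification fails, the estimated style number is the piece of information that could be sent back to the recognition model for a second, template-guided pass.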
The invention provides a task file generating device based on image recognition, which obtains header combination information through image recognition, and then generates a task file comprising instruction information according to the header combination information comprising instruction related information.
Based on the same inventive concept as the above method of the present invention, an embodiment of the present invention further provides an image recognition apparatus, as shown in fig. 8, including:
an image document file acquisition module 21 that acquires an image document file in the bank escrow system;
the identifying module 22 identifies header combination information of each form in the image document file through an image identifying model, so that an external hosting system can generate a task file containing at least one instruction information according to the header combination information containing the instruction related information.
In a preferred embodiment, the image recognition model is a neural network model trained from historical image document files.
In a preferred embodiment, further comprising:
the model building module is used for building the image recognition model;
and the model training module is used for training the image recognition model through the historical image document file marked with the header combination information.
The invention provides an image recognition device, which obtains header combination information through image recognition, and then generates a task file comprising instruction information according to the header combination information containing instruction related information.
The embodiment of the present invention further provides a specific implementation manner of an electronic device capable of implementing all the steps in the task file generating method based on image recognition in the foregoing embodiment, and referring to fig. 9, the electronic device specifically includes the following contents:
a processor (processor) 601, a memory (memory) 602, a communication interface (Communications Interface) 603, and a bus 604;
wherein the processor 601, the memory 602, and the communication interface 603 complete communication with each other through the bus 604; the communication interface 603 is configured to implement information transmission between the task file generating device based on image recognition and related devices such as a user terminal;
the processor 601 is configured to invoke a computer program in the memory 602, where the processor executes the computer program to implement all the steps in the task file generation method based on image recognition in the above embodiment.
As can be seen from the above description, the electronic device provided by the embodiment of the invention obtains the header combination information through image recognition, and then generates the task file including the instruction information according to the header combination information including the instruction related information.
The embodiment of the present invention also provides a computer-readable storage medium capable of implementing all the steps in the task file generation method based on image recognition in the above embodiment, the computer-readable storage medium storing thereon a computer program which, when executed by a processor, implements all the steps in the task file generation method based on image recognition in the above embodiment, for example, the processor implements the following steps when executing the computer program.
As can be seen from the above description, the computer readable storage medium provided by the embodiment of the present invention obtains header combination information through image recognition, and then generates a task file including instruction information according to the header combination information including instruction related information.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments. In particular, the description of a hardware-plus-program embodiment is relatively brief because it is substantially similar to the method embodiment; for the relevant details, see the corresponding part of the description of the method embodiment.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Although the invention provides method operational steps as described in the examples or flowcharts, more or fewer operational steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one way of performing the order of steps and does not represent a unique order of execution. When implemented by an actual device or client product, the instructions may be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment) as shown in the embodiments or figures.
The apparatus, device, module or unit described in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a car-mounted human-computer interaction device, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Although the present description provides method operational steps as described in the examples or flowcharts, more or fewer operational steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one way of performing the order of steps and does not represent a unique order of execution. When implemented in an actual device or end product, the instructions may be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment, or even in a distributed data processing environment) as illustrated by the embodiments or by the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, it is not excluded that additional identical or equivalent elements may be present in a process, method, article, or apparatus that comprises a described element.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, when implementing the embodiments of the present disclosure, the functions of each module may be implemented in the same or multiple pieces of software and/or hardware, or a module that implements the same function may be implemented by multiple sub-modules or a combination of sub-units, or the like. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It will be apparent to one of ordinary skill in the art that embodiments of the present description may be provided as a method, apparatus, or computer program product. Accordingly, the present specification embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description embodiments may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present embodiments may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments in part. In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present specification. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
The foregoing is merely an example of an embodiment of the present disclosure and is not intended to limit the embodiment of the present disclosure. Various modifications and variations of the illustrative embodiments will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the embodiments of the present specification, should be included in the scope of the claims of the embodiments of the present specification.

Claims (12)

1. The task file generation method based on image recognition is characterized by comprising the following steps of:
acquiring an image document file in a bank escrow system and header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition;
generating a task file containing at least one instruction information according to header combination information containing instruction related information;
generating a task file containing at least one instruction information according to the header combination information, wherein the task file comprises:
judging whether the image document file comprises instruction related information according to the header combination information, wherein the method specifically comprises the following steps:
judging whether the header combination information can form instruction information or not by judging the syntactic semantics formed by the field combination of the header combination information; determining whether the header combination information contains instruction related information by identifying whether the header combination information contains necessary information or not, wherein the header combination information is used for representing the data type, and each line of data in the form is instruction information;
If so, converting the header combination information into instruction information in a set format, and generating a corresponding task containing at least one instruction information;
inquiring corresponding template information mapped by a combination or manager through at least one instruction information of the task, sending an attachment of the task to an OCR image recognition system, and receiving information of successful OCR recognition to generate a payment instruction; the data structure of the template configuration comprises a template information table and a template and combination manager mapping table, wherein the template information table is uniquely identified according to template numbers, template names, file sources and file name information serving as conditions;
judging whether overtime and manually entered access conditions exist or not;
if the admission condition is met, establishing an association relation between the instruction information and the task, and obtaining the task file in a lasting manner, wherein the method specifically comprises the following steps:
performing a series of data processing operations on the instruction information, establishing an association relation between the finally obtained instruction information and a task, and performing persistence operations, wherein the data processing operations comprise account matching of a payment receiving party, checking of an amount date format and rule reverse filling of an instruction service type, the task comprises an execution main body of the task and execution time of the task, and feedback information after data processing can be mapped to an instruction format in a system for subsequent compliance checking;
Assigning instruction information to the object for processing in a specified time through the association relation;
the image recognition is further capable of recognizing a position of each header information, and the task file generation method further includes:
estimating a document template style corresponding to the header combination information according to each header information and the position of the header information, wherein each header information and each position of the header information are obtained through one-time identification of an image identification model;
according to the data structure corresponding to the document template style, the header information is checked, and the method specifically comprises the following steps:
if the data structure of the identified header information is different from the data structure of the document template style, determining that the header information is wrongly identified, wherein the data structure of the header information is obtained through secondary identification of an image identification model;
the task file generation method further comprises the following steps:
pessimistic locking is carried out on the same document information;
capturing anomalies in the whole process, including business anomalies and system anomalies, wherein the business anomalies are predictable anomalies of client data, and the system anomalies are anomalies thrown by an analysis program;
the performing a series of data processing operations on the instruction information includes:
Performing data cleaning, instruction data processing, compliance checking and instruction persistence operation according to the identified instruction information;
the data cleansing includes: data cleaning is carried out on the known field information, and identification errors caused by unclear images or excessive noise are removed; for the information fed back by OCR recognition, carrying out data cleaning on the corresponding fields by using preset rules, loading the preset rules by the system after the OCR feedback is received, matching the fields by using regular expressions, retaining the data meeting the matching rules in the data, deleting the rest dirty data, and if the fields do not meet the matching rules, directly carrying out null processing;
the instruction data processing includes: processing the cleaned data, and converting the processed data to generate instruction information; the processing information comprises combination information, account information, currency, amount date and service type; the processing rule corresponding to the combination information is that fuzzy matching is carried out against the system combinations using the identified combination information, so that the corresponding combination information is found; the processing rule corresponding to the account information is that a system account is searched for using the account number and user name information, and if a unique account is found, the other information of the account is back-filled; the processing rule corresponding to the currency is that the currency symbol is converted into a currency code; the processing rule corresponding to the amount date is that it is converted into a date format recognized by the system; the processing rule corresponding to the service type is that the preset rules of the system are used to judge the service type of the payment instruction according to the account attributes, the combination, the payer and payee account numbers, the remarks and the payment purpose fields;
The compliance check includes: whether instruction information generated by OCR feedback is compliant or not, and setting tolerance, if compliance checking is not passed and the tolerance is not reached, refusing to generate the instruction; wherein the compliance check includes determining whether the payer account is a subscription account under the combination, whether the business type is operable with an account of such attribute, whether a date of the amount is greater than or equal to a current system workday, whether a payment amount is greater than 0, and whether the business type is applicable to a corresponding asset type;
data persistence includes: and performing persistence operation on instruction information with valid compliance.
2. The task file generation method according to claim 1, wherein the verifying the header information according to the data structure corresponding to the document template style includes:
and sending the number of the document template pattern to an image recognition model so that the image recognition model recognizes the image document file again according to the information types corresponding to the positions in the document template pattern.
3. The task file generation method based on image recognition is characterized by comprising the following steps of:
acquiring an image document file in a bank escrow system;
the header combination information of each form in the image document file is identified through an image identification model, so that an external hosting system can generate a task file containing at least one instruction information according to the header combination information containing the instruction related information, and the task file specifically comprises the following steps:
judging whether the header combination information can form instruction information or not by judging the syntactic semantics formed by the field combination of the header combination information; determining whether the header combination information contains instruction related information by identifying whether the header combination information contains necessary information or not, wherein the header combination information is used for representing the data type, and each line of data in the form is instruction information;
if so, converting the header combination information into instruction information in a set format, and generating a corresponding task containing at least one instruction information;
inquiring corresponding template information mapped by a combination or manager through at least one instruction information of the task, sending an attachment of the task to an OCR image recognition system, and receiving information of successful OCR recognition to generate a payment instruction; the data structure of the template configuration comprises a template information table and a template and combination manager mapping table, wherein the template information table is uniquely identified according to template numbers, template names, file sources and file name information serving as conditions;
judging whether overtime and manually entered access conditions exist or not;
if the admission condition is met, performing a series of data processing operations on the instruction information, establishing an association relation between the finally obtained instruction information and a task, and performing persistence operations, wherein the data processing operations comprise account matching of a payment receiving party, checking of an amount date format and instruction service type rule backfilling, the task comprises an execution main body of the task and execution time of the task, and feedback information after data processing can be mapped to an instruction format in a system for subsequent compliance checking;
assigning instruction information to the object for processing in a specified time through the association relation;
the image recognition is further capable of recognizing the position of each piece of header information, and the task file generation method further comprises:
estimating a document template style corresponding to the header combination information according to each piece of header information and its position, wherein each piece of header information and its position are obtained through a first recognition pass of the image recognition model;
verifying the header information according to the data structure corresponding to the document template style, which specifically comprises:
if the data structure of the recognized header information differs from the data structure of the document template style, determining that the header information was recognized incorrectly, wherein the data structure of the header information is obtained through a second recognition pass of the image recognition model;
the task file generation method further comprises the following steps:
applying a pessimistic lock to the same document information;
capturing exceptions throughout the whole process, including business exceptions and system exceptions, wherein the business exceptions are foreseeable exceptions in client data and the system exceptions are exceptions thrown by the parsing program;
the performing a series of data processing operations on the instruction information includes:
performing data cleaning, instruction data processing, compliance checking and instruction persistence operations on the recognized instruction information;
the data cleaning includes: cleaning the known field information to remove recognition errors caused by unclear images or excessive noise; for information fed back by OCR recognition, cleaning the corresponding fields with preset rules, wherein the system loads the preset rules after receiving the OCR feedback, matches the fields with regular expressions, retains the data that satisfies the matching rules and deletes the remaining dirty data, and sets a field directly to null if it does not satisfy the matching rules;
the instruction data processing includes: processing the cleaned data and converting the processed data to generate instruction information; the processed information comprises combination information, account information, currency, amount date and service type, wherein the processing rule for the combination information is to fuzzy-match the recognized combination information against the combinations in the system to find the corresponding combination; the processing rule for the account information is to look up a system account by account number and account name and, if a unique account is found, to backfill the other information of that account; the processing rule for the currency is to convert the currency symbol into a currency code; the processing rule for the amount date is to convert it into a Date format recognized by the system; and the processing rule for the service type is to use preset rules of the system to determine the service type of the payment instruction from the account attributes, the combination, the payer and payee account numbers, the remarks and the payment purpose fields;
the compliance check includes: checking whether the instruction information generated from the OCR feedback is compliant, and setting a tolerance, wherein generation of the instruction is refused if the compliance check is not passed and the tolerance is not met; the compliance check includes determining whether the payer account is a subscription account under the combination, whether the service type can operate on an account with that attribute, whether the amount date is greater than or equal to the current system workday, whether the payment amount is greater than 0, and whether the service type is applicable to the corresponding asset type;
the data persistence includes: persisting the instruction information that passes the compliance check.
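The necessary-information test recited in claim 3 can be illustrated with a minimal Python sketch. The set of required header fields, the function names and the sample table are all assumptions made for illustration; the claims do not enumerate which fields count as necessary information, and the sketch does not attempt the syntactic and semantic evaluation of the field combination.

```python
# Hypothetical set of header fields treated as "necessary information";
# the claims do not list them.
REQUIRED_HEADERS = {"payer account", "payee account", "amount", "amount date"}


def contains_instruction_info(headers):
    """Return True if the recognized header combination covers every required field."""
    normalized = {header.strip().lower() for header in headers}
    return REQUIRED_HEADERS <= normalized


def rows_to_instructions(headers, rows):
    """Treat each table row as one piece of instruction information keyed by header."""
    if not contains_instruction_info(headers):
        return []  # the form is not instruction-related
    keys = [header.strip().lower() for header in headers]
    return [dict(zip(keys, row)) for row in rows]
```

A form whose headers lack any required field yields an empty list, which corresponds to the branch in which no task is generated.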
4. The task file generation method according to claim 3, wherein the image recognition model is a neural network model trained on historical image document files.
5. The task file generation method according to claim 4, characterized by further comprising:
establishing the image recognition model;
training the image recognition model with historical image document files labeled with header combination information.
6. A task file generation device based on image recognition, characterized by comprising:
an acquisition module that acquires an image document file in a bank escrow system and header combination information of each form in the image document file, wherein the header combination information is obtained through image recognition;
a task file generation module that generates a task file containing at least one piece of instruction information according to header combination information that contains instruction-related information;
the task file generation module comprises:
a judging unit that determines, according to the header combination information, whether the image document file includes instruction-related information;
a conversion unit that, if instruction-related information is included, converts the header combination information into instruction information in a set format and generates a corresponding task containing at least one piece of instruction information; queries, through the at least one piece of instruction information of the task, the corresponding template information mapped to a combination or manager, sends an attachment of the task to an OCR image recognition system, and receives a message that OCR recognition succeeded so as to generate a payment instruction; wherein the data structure of the template configuration comprises a template information table and a mapping table between templates and combinations or managers, and entries in the template information table are uniquely identified using the template number, template name, file source and file name information as conditions; determines whether admission conditions relating to timeout and manual entry exist; and, if the admission conditions are met, establishes an association between the instruction information and the task and persists it to obtain the task file;
wherein the judging unit determines whether the header combination information can form instruction information by evaluating the syntax and semantics formed by the field combination of the header combination information, and determines whether the header combination information contains instruction-related information by identifying whether it contains necessary information, wherein the header combination information represents data types and each row of data in the form is one piece of instruction information;
the conversion unit performs a series of data processing operations on the instruction information, establishes an association between the finally obtained instruction information and a task, and performs a persistence operation, wherein the data processing operations comprise matching of payer and payee accounts, checking of the amount date format, and rule-based backfilling of the instruction service type, the task comprises an executing subject of the task and an execution time of the task, and the feedback information after data processing can be mapped to an instruction format in the system for subsequent compliance checking; and assigns, through the association, the instruction information to the object for processing within a specified time;
the image recognition is further capable of recognizing the position of each piece of header information, and the task file generation device further comprises:
a document template style estimation module that estimates a document template style corresponding to the header combination information according to each piece of header information and its position, wherein each piece of header information and its position are obtained through a first recognition pass of the image recognition model;
a header information verification module that verifies the header information according to the data structure corresponding to the document template style;
wherein the header information verification module determines that the header information was recognized incorrectly if the data structure of the recognized header information differs from the data structure of the document template style, and the data structure of the header information is obtained through a second recognition pass of the image recognition model;
the task file generating device further includes:
a locking module that applies a pessimistic lock to the same document information;
an exception capturing module that captures exceptions throughout the whole process, including business exceptions and system exceptions, wherein the business exceptions are foreseeable exceptions in client data and the system exceptions are exceptions thrown by the parsing program;
the conversion unit performs data cleaning, instruction data processing, compliance checking and instruction persistence operations on the recognized instruction information;
the data cleaning includes: cleaning the known field information to remove recognition errors caused by unclear images or excessive noise; for information fed back by OCR recognition, cleaning the corresponding fields with preset rules, wherein the system loads the preset rules after receiving the OCR feedback, matches the fields with regular expressions, retains the data that satisfies the matching rules and deletes the remaining dirty data, and sets a field directly to null if it does not satisfy the matching rules;
the instruction data processing includes: processing the cleaned data and converting the processed data to generate instruction information; the processed information comprises combination information, account information, currency, amount date and service type, wherein the processing rule for the combination information is to fuzzy-match the recognized combination information against the combinations in the system to find the corresponding combination; the processing rule for the account information is to look up a system account by account number and account name and, if a unique account is found, to backfill the other information of that account; the processing rule for the currency is to convert the currency symbol into a currency code; the processing rule for the amount date is to convert it into a Date format recognized by the system; and the processing rule for the service type is to use preset rules of the system to determine the service type of the payment instruction from the account attributes, the combination, the payer and payee account numbers, the remarks and the payment purpose fields;
the compliance check includes: checking whether the instruction information generated from the OCR feedback is compliant, and setting a tolerance, wherein generation of the instruction is refused if the compliance check is not passed and the tolerance is not met; the compliance check includes determining whether the payer account is a subscription account under the combination, whether the service type can operate on an account with that attribute, whether the amount date is greater than or equal to the current system workday, whether the payment amount is greater than 0, and whether the service type is applicable to the corresponding asset type;
the data persistence includes: persisting the instruction information that passes the compliance check.
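The data cleaning and instruction data processing recited in claim 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names, the regular expressions, the currency table and the ISO output format are hypothetical, since the claims say only that preset rules are loaded, fields are matched with regular expressions, non-matching fields are set to null, currency symbols become currency codes, and the amount date is converted to a format the system recognizes.

```python
import re
from datetime import date

# Hypothetical preset rules: one regular expression per field name.
CLEANING_RULES = {
    "account": re.compile(r"\d{10,25}"),
    "amount": re.compile(r"\d+(?:\.\d{1,2})?"),
    "amount_date": re.compile(r"\d{4}[-/.]\d{1,2}[-/.]\d{1,2}"),
}

# Hypothetical symbol-to-code table for the currency rule.
CURRENCY_CODES = {"¥": "CNY", "$": "USD", "€": "EUR"}


def clean_field(name, raw):
    """Keep the part of `raw` that matches the field's rule, or return None."""
    rule = CLEANING_RULES.get(name)
    if rule is None:
        return raw.strip()  # no rule configured: pass the value through
    match = rule.search(raw.replace(",", ""))  # drop thousands separators first
    return match.group(0) if match else None


def normalize_date(text):
    """Convert a cleaned date string to ISO format, or None if it is not a real date."""
    year, month, day = (int(part) for part in re.split(r"[-/.]", text))
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def to_instruction(ocr_fields):
    """Clean every OCR field, then apply the currency and date conversions."""
    cleaned = {name: clean_field(name, raw) for name, raw in ocr_fields.items()}
    if cleaned.get("amount_date"):
        cleaned["amount_date"] = normalize_date(cleaned["amount_date"])
    if cleaned.get("currency"):
        cleaned["currency"] = CURRENCY_CODES.get(cleaned["currency"])
    return cleaned
```

Fuzzy matching of the combination and backfilling of account details are omitted here because they depend on system data that the claims do not describe.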
7. The task file generation device according to claim 6, wherein the header information verification module sends the number of the document template style to the image recognition model, so that the image recognition model recognizes the image document file according to the information type corresponding to each position in the document template style.
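The template estimation and header verification of claims 6 and 7 can be sketched as below. The template numbers, the column layouts and the scoring by matching positions are assumptions introduced for illustration; the claims state only that a template style is estimated from the headers and their positions, and that a mismatch between the recognized data structure and the template's data structure marks a recognition error.

```python
# Hypothetical template registry: template number -> header expected at each column position.
TEMPLATES = {
    "T001": ["amount date", "payer account", "payee account", "amount"],
    "T002": ["payee account", "amount", "currency", "remark"],
}


def estimate_template(headers_with_position):
    """Pick the template that agrees with the most (position, header) pairs."""
    def score(layout):
        return sum(
            1
            for position, header in headers_with_position
            if position < len(layout) and layout[position] == header
        )

    best = max(TEMPLATES, key=lambda number: score(TEMPLATES[number]))
    return best if score(TEMPLATES[best]) > 0 else None


def header_is_misrecognized(template_number, recognized_structure):
    """Flag an error when the second-pass structure differs from the template's structure."""
    return recognized_structure != TEMPLATES[template_number]
```

In this sketch the first recognition pass supplies the (position, header) pairs and the second pass supplies the full header list that is compared against the estimated template.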
8. An image recognition apparatus, comprising:
an image document file acquisition module that acquires an image document file in a bank escrow system;
an identification module that identifies header combination information of each form in the image document file through an image recognition model, so that an external hosting system can generate a task file containing at least one piece of instruction information according to the header combination information that contains instruction-related information, which specifically comprises:
determining whether the header combination information can form instruction information by evaluating the syntax and semantics formed by the field combination of the header combination information;
determining whether the header combination information contains instruction-related information by identifying whether it contains necessary information, wherein the header combination information represents data types and each row of data in the form is one piece of instruction information;
if so, converting the header combination information into instruction information in a set format, and generating a corresponding task containing at least one piece of instruction information;
querying, through the at least one piece of instruction information of the task, the corresponding template information mapped to a combination or manager, sending an attachment of the task to an OCR image recognition system, and receiving a message that OCR recognition succeeded so as to generate a payment instruction;
wherein the data structure of the template configuration comprises a template information table and a mapping table between templates and combinations or managers, and entries in the template information table are uniquely identified using the template number, template name, file source and file name information as conditions;
determining whether admission conditions relating to timeout and manual entry exist;
if the admission conditions are met, performing a series of data processing operations on the instruction information, establishing an association between the finally obtained instruction information and a task, and performing a persistence operation, wherein the data processing operations comprise matching of payer and payee accounts, checking of the amount date format, and rule-based backfilling of the instruction service type, the task comprises an executing subject of the task and an execution time of the task, and the feedback information after data processing can be mapped to an instruction format in the system for subsequent compliance checking;
assigning, through the association, the instruction information to the object for processing within a specified time;
the image recognition is further capable of recognizing the position of each piece of header information, and the image recognition apparatus further comprises:
a template estimating module that estimates a document template style corresponding to the header combination information according to each piece of header information and its position, wherein each piece of header information and its position are obtained through a first recognition pass of the image recognition model;
verifying the header information according to the data structure corresponding to the document template style, which specifically comprises:
if the data structure of the recognized header information differs from the data structure of the document template style, determining that the header information was recognized incorrectly, wherein the data structure of the header information is obtained through a second recognition pass of the image recognition model;
the image recognition apparatus further includes:
a locking module that applies a pessimistic lock to the same document information;
an exception capturing module that captures exceptions throughout the whole process, including business exceptions and system exceptions, wherein the business exceptions are foreseeable exceptions in client data and the system exceptions are exceptions thrown by the parsing program;
the performing a series of data processing operations on the instruction information includes:
performing data cleaning, instruction data processing, compliance checking and instruction persistence operations on the recognized instruction information;
the data cleaning includes: cleaning the known field information to remove recognition errors caused by unclear images or excessive noise; for information fed back by OCR recognition, cleaning the corresponding fields with preset rules, wherein the system loads the preset rules after receiving the OCR feedback, matches the fields with regular expressions, retains the data that satisfies the matching rules and deletes the remaining dirty data, and sets a field directly to null if it does not satisfy the matching rules;
the instruction data processing includes: processing the cleaned data and converting the processed data to generate instruction information; the processed information comprises combination information, account information, currency, amount date and service type, wherein the processing rule for the combination information is to fuzzy-match the recognized combination information against the combinations in the system to find the corresponding combination; the processing rule for the account information is to look up a system account by account number and account name and, if a unique account is found, to backfill the other information of that account; the processing rule for the currency is to convert the currency symbol into a currency code; the processing rule for the amount date is to convert it into a Date format recognized by the system; and the processing rule for the service type is to use preset rules of the system to determine the service type of the payment instruction from the account attributes, the combination, the payer and payee account numbers, the remarks and the payment purpose fields;
the compliance check includes: checking whether the instruction information generated from the OCR feedback is compliant, and setting a tolerance, wherein generation of the instruction is refused if the compliance check is not passed and the tolerance is not met; the compliance check includes determining whether the payer account is a subscription account under the combination, whether the service type can operate on an account with that attribute, whether the amount date is greater than or equal to the current system workday, whether the payment amount is greater than 0, and whether the service type is applicable to the corresponding asset type;
the data persistence includes: persisting the instruction information that passes the compliance check.
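The five compliance checks listed in claim 8 can be sketched in Python as follows. The `Instruction` fields, the reference tables and in particular the reading of "tolerance" as a maximum number of failed checks are assumptions made for illustration; the claim does not define how the tolerance is measured.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Instruction:
    combination: str
    payer_account: str
    account_attribute: str
    service_type: str
    asset_type: str
    amount: Decimal
    amount_date: date


# Hypothetical reference data; a real system would query these.
SUBSCRIPTION_ACCOUNTS = {"C001": {"6217001234567890"}}
ATTRIBUTES_BY_SERVICE = {"transfer": {"custody"}}
ASSETS_BY_SERVICE = {"transfer": {"fund"}}


def failed_checks(ins, workday):
    """Return the names of the compliance checks that the instruction fails."""
    checks = {
        "payer account subscribed under combination":
            ins.payer_account in SUBSCRIPTION_ACCOUNTS.get(ins.combination, set()),
        "service type allowed for account attribute":
            ins.account_attribute in ATTRIBUTES_BY_SERVICE.get(ins.service_type, set()),
        "amount date not before current workday": ins.amount_date >= workday,
        "payment amount greater than 0": ins.amount > 0,
        "service type applicable to asset type":
            ins.asset_type in ASSETS_BY_SERVICE.get(ins.service_type, set()),
    }
    return [name for name, passed in checks.items() if not passed]


def may_generate(ins, workday, tolerance=0):
    """Assumed reading: generate only if the number of failed checks is within the tolerance."""
    return len(failed_checks(ins, workday)) <= tolerance
```

Returning the names of the failed checks, rather than a single boolean, keeps enough detail for the business-exception reporting that the claim describes elsewhere.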
9. The image recognition apparatus of claim 8, wherein the image recognition model is a neural network model trained on historical image document files.
10. The image recognition apparatus of claim 9, further comprising:
a model building module for building the image recognition model;
a model training module for training the image recognition model with historical image document files labeled with header combination information.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1 to 5 when the program is executed.
12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
CN201911358133.5A 2019-12-25 2019-12-25 Task file generation method and device based on image recognition Active CN111079397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911358133.5A CN111079397B (en) 2019-12-25 2019-12-25 Task file generation method and device based on image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911358133.5A CN111079397B (en) 2019-12-25 2019-12-25 Task file generation method and device based on image recognition

Publications (2)

Publication Number Publication Date
CN111079397A (en) 2020-04-28
CN111079397B (en) 2024-02-20

Family

ID=70317764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911358133.5A Active CN111079397B (en) 2019-12-25 2019-12-25 Task file generation method and device based on image recognition

Country Status (1)

Country Link
CN (1) CN111079397B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639911B (en) * 2020-05-27 2023-08-08 中信银行股份有限公司 Online processing method and device for asset hosting instruction, storage medium and electronic equipment
CN111931468B (en) * 2020-07-13 2023-12-08 珠海格力电器股份有限公司 Verification report generation method and device, electronic equipment and storage medium
CN111859145B (en) * 2020-07-30 2024-02-09 中国民航信息网络股份有限公司 Information searching method and device, electronic equipment and computer storage medium
CN112507350B (en) * 2020-11-18 2023-11-17 中国工商银行股份有限公司 Authentication method and device for assisting in executing check and control service
CN112712085A (en) * 2020-12-28 2021-04-27 哈尔滨工业大学 Method for extracting date in multi-language PDF document
CN112733518A (en) * 2021-01-14 2021-04-30 卫宁健康科技集团股份有限公司 Table template generation method, device, equipment and storage medium
CN113239921A (en) * 2021-05-10 2021-08-10 上海交大慧谷通用技术有限公司 Task grading and distributing method and system for OCR (optical character recognition) service

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006146830A (en) * 2004-11-24 2006-06-08 Glory Ltd System, method, and program for processing form
CN107798534A (en) * 2017-11-24 2018-03-13 珠海市魅族科技有限公司 A kind of information recording method and device, terminal and readable storage medium storing program for executing
CN109377342A (en) * 2018-12-04 2019-02-22 金蝶软件(中国)有限公司 Bill processing method, device, computer equipment and storage medium
CN109685477A (en) * 2018-12-28 2019-04-26 北京爱康鼎科技有限公司 Accounting process systems and processing method
CN109784235A (en) * 2018-12-29 2019-05-21 广东益萃网络科技有限公司 Method for automatically inputting, device, computer equipment and the storage medium of paper form
CN109800761A (en) * 2019-01-25 2019-05-24 厦门商集网络科技有限责任公司 Method and terminal based on deep learning model creation paper document structural data
CN110457117A (en) * 2019-07-05 2019-11-15 中国平安人寿保险股份有限公司 Data processing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111079397A (en) 2020-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220916

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co.,Ltd.

GR01 Patent grant