CN113159212A - OCR recognition model training method, device and computer readable storage medium

Info

Publication number: CN113159212A
Application number: CN202110485412.9A
Authority: CN
Inventors: 邹锦富; 杨皓
Original assignee: Shanghai Yuncong Enterprise Development Co ltd
Current assignee: Shanghai Yuncong Enterprise Development Co ltd
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-07-23

Abstract

The invention relates to the technical field of machine learning, and in particular to an OCR recognition model training method, an OCR recognition model training device, and a computer-readable storage medium. It aims to solve the technical problem of how to label image samples conveniently and efficiently so that model training of an OCR recognition model can be completed quickly. To this end, the OCR recognition model training method of the embodiment of the present invention includes: acquiring a first type of image sample with label data; training a preset OCR recognition model with the first type of image sample to obtain an initial OCR recognition model; recognizing business data in a second type of image sample without label data by using the initial OCR recognition model; generating label data according to the recognition result and labeling the second type of image sample with it; and training the initial OCR recognition model with the first type of image sample and the labeled second type of image sample to obtain a final OCR recognition model. In this way, model training of the OCR recognition model can be completed quickly, and the labeling accuracy of the image samples is improved.

Description

OCR recognition model training method, device and computer readable storage medium

Technical Field

The invention relates to the technical field of machine learning, in particular to an OCR recognition model training method, an OCR recognition model training device and a computer readable storage medium.

Background

With the advent of the information age, more and more image information is presented to people. To accurately convert the text information contained in an image into information that can be edited by a computer or other device, an OCR (Optical Character Recognition) recognition model may be employed to detect a text region in the image, recognize the text information in the text region, and convert the recognized text information into editable information. When the OCR recognition model is constructed, a large number of image samples annotated with correct label data (including but not limited to the text regions in the images and the text information in those regions) are required for model training, so that the OCR recognition model has a high OCR recognition capability.

However, the number of OCR recognition application scenes keeps increasing, and images differ considerably between application scenes, for example in the format of the text information they contain, so a single OCR recognition model cannot meet the OCR recognition requirements of different application scenes at the same time. If a dedicated OCR recognition model is constructed for each application scene, a large number of image samples must be labeled when each model is constructed. To ensure labeling accuracy, often only manual labeling is used, which is time-consuming, labor-intensive, and error-prone. As a result, the image sample labeling work cannot be completed conveniently and efficiently, and a usable dedicated OCR recognition model cannot be built quickly for each application scene.

Accordingly, there is a need in the art for a new training scheme for OCR recognition models to solve the above-mentioned problems.

Disclosure of Invention

In order to overcome the above-mentioned drawbacks, the present invention is proposed to solve or at least partially solve the technical problem of how to perform label labeling on image samples conveniently and efficiently to complete model training of an OCR recognition model quickly.

In a first aspect, an OCR recognition model training method is provided, which includes:

acquiring a first type of image sample with label data;

performing model training on a preset OCR recognition model by using the first type of image sample to obtain an initial OCR recognition model;

performing OCR recognition on a second type of image sample without label data by using the initial OCR recognition model;

generating label data of the second type image sample according to the OCR recognition result, and labeling the second type image sample according to the generated label data;

and performing model training on the initial OCR recognition model by using the first type of image sample and the labeled second type of image sample to obtain a final OCR recognition model.

In one technical solution of the OCR recognition model training method, the label data of the first type image sample and the second type image sample each include a position of an image recognition area, business data recorded in each image recognition area, and a data type thereof;

the first type of image sample with the label data is obtained by the following method:

responding to a received annotation instruction, and acquiring annotation information of an image sample to be annotated specified in the annotation instruction, wherein the annotation information comprises the position of each image identification area in the image to be annotated, and business data and data types thereof recorded in each image identification area;

generating label data of the image sample to be labeled according to the labeling information and labeling the image sample to be labeled according to the generated label data to obtain a first type of image sample with the label data;

the annotation information is determined according to information which is annotated on the image sample to be annotated by a user through a visual interface.

In one technical solution of the OCR recognition model training method, the position of the image recognition area in the annotation information is determined according to the position of the area selected by the user on the visual interface in a frame selection manner on the image sample to be annotated, and the service data and the category thereof in the annotation information are determined according to the service data and the category thereof entered by the user on the visual interface for each image recognition area.

In one embodiment of the OCR recognition model training method, after the step of "performing model training on the initial OCR recognition model to obtain a final OCR recognition model", the method further includes:

generating a download path of the final OCR recognition model according to the storage position of the final OCR recognition model;

generating and displaying release information of the final OCR recognition model according to the download path;

and/or when the first-class image samples and the second-class image samples under different service scenes are used for respectively training to obtain the initial OCR recognition models corresponding to the service scenes, the step of performing model training on the initial OCR recognition models specifically comprises the following steps:

generating a model training queue according to the training completion time corresponding to each initial OCR recognition model;

sequentially carrying out model training on each initial OCR recognition model according to the training sequence corresponding to each initial OCR recognition model in the model training queue;

and/or the step of "performing model training on the initial OCR recognition model" specifically includes:

and displaying the model training progress of the initial OCR recognition model through a visual interface.

In a second aspect, an OCR recognition model training apparatus is provided, the OCR recognition model training apparatus comprising:

a sample acquisition module configured to acquire a first type of image sample with label data;

a first model training module configured to perform model training on a preset OCR recognition model by using the first type of image sample to obtain an initial OCR recognition model;

an attribute category prediction module configured to perform OCR recognition on a second type of image sample without label data by using the initial OCR recognition model;

a label labeling module configured to generate label data of the second type image sample according to the result of the OCR recognition and label the second type image sample according to the generated label data;

and a second model training module configured to perform model training on the initial OCR recognition model by using the first type of image sample and the labeled second type of image sample to obtain a final OCR recognition model.

In one technical solution of the OCR recognition model training apparatus, the label data of the first type image sample and the second type image sample each include a position of an image recognition area, business data recorded in each image recognition area, and a data type thereof;

the sample acquisition module is further configured to perform the following operations:

In one technical solution of the OCR recognition model training apparatus, the position of the image recognition area in the annotation information is determined according to the position of the area selected by the user on the visual interface in a frame selection manner on the image sample to be annotated, and the service data and the category thereof in the annotation information are determined according to the service data and the category thereof entered by the user on the visual interface for each image recognition area.

In one aspect of the OCR recognition model training apparatus, the apparatus includes a model issuing module configured to perform the following operations:

and/or the second model training module comprises a first model training unit and/or a second model training unit;

the first model training unit is configured to, when initial OCR recognition models corresponding to the service scenes are obtained by respectively training first-class image samples and second-class image samples under different service scenes, perform model training on each initial OCR recognition model by performing the following operations:

the second model training unit is configured to display a model training progress of the initial OCR recognition model through a visualization interface.

In a third aspect, a control device is provided, which comprises a processor and a storage device, wherein the storage device is adapted to store a plurality of program codes, and the program codes are adapted to be loaded and run by the processor to execute the OCR recognition model training method according to any one of the above-mentioned OCR recognition model training methods.

In a fourth aspect, a computer readable storage medium is provided, having stored therein a plurality of program codes adapted to be loaded and run by a processor to execute the OCR recognition model training method according to any one of the above-mentioned OCR recognition model training methods.

One or more technical schemes of the invention at least have one or more of the following beneficial effects:

In the technical scheme of the invention, a preset OCR recognition model can first be trained with a first type of image sample having label data to obtain an initial OCR recognition model. A second type of image sample without label data is then recognized with the initial OCR recognition model and labeled according to the recognition result, which determines the label data of the second type of image sample. Because the first type of image sample has accurate label data, the initial OCR recognition model trained on it has a relatively high OCR recognition capability. The OCR recognition result obtained by applying the initial OCR recognition model to the second type of image sample (including but not limited to the position of one or more image recognition areas in the sample, the business data recorded in each image recognition area, and its data type) is therefore relatively accurate, and so is the label data generated from that result. That is, with the initial OCR recognition model, the OCR recognition model training method of the embodiment of the present invention can label the second type of image sample automatically, and the resulting label data can have relatively high accuracy.

In practical application, to ensure the accuracy of the label data of the first type of image sample, a small number of first-type image samples can be labeled manually. The OCR recognition model training method of the embodiment of the invention then uses this small number of first-type image samples to automatically label a large number of second-type image samples, which greatly reduces the manual labeling workload while keeping the label data of the second-type image samples relatively accurate. Further, after the second type of image sample has been labeled, the initial OCR recognition model can be retrained with the first type and the second type of image samples together, which further improves its OCR recognition capability and yields the final OCR recognition model.

Drawings

The disclosure of the present invention will become more readily understood with reference to the accompanying drawings. As is readily understood by those skilled in the art: these drawings are for illustrative purposes only and are not intended to constitute a limitation on the scope of the present invention. Wherein:

FIG. 1 is a flow diagram illustrating the main steps of a method for training OCR recognition models, according to one embodiment of the present invention;

FIG. 2 is a flow chart illustrating the main steps of an OCR recognition model training method according to another embodiment of the present invention;

FIG. 3 is a flow chart illustrating the main steps of a first type of image sample acquisition method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a visual model training progress real-time monitoring interface for an OCR recognition model according to one embodiment of the present invention;

FIG. 5 is a schematic diagram of a visualization model training progress real-time monitoring interface of an OCR recognition model according to another embodiment of the invention;

FIG. 6 is a block diagram of the main structure of a model training apparatus for OCR recognition models according to an embodiment of the present invention;

FIG. 7 is a block diagram of the main structure of a model training apparatus for OCR recognition models according to another embodiment of the present invention;

List of reference numerals:

61: a sample acquisition module; 62: a first model training module; 63: an attribute category prediction module; 64: a label labeling module; 65: a second model training module; 71: a data processing module; 72: a model training module; 73: a model deployment verification module; 74: a module for configuring the model algorithm output recognition engine.

Detailed Description

Some embodiments of the invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.

In the description of the present invention, a "module" or "processor" may include hardware, software, or a combination of both. A module may comprise hardware circuitry, various suitable sensors, communication ports, and memory, may comprise software components such as program code, or may be a combination of software and hardware. The processor may be a central processing unit, microprocessor, image processor, digital signal processor, or any other suitable processor. The processor has data and/or signal processing functionality. The processor may be implemented in software, hardware, or a combination thereof. Non-transitory computer readable storage media include any suitable medium that can store program code, such as magnetic disks, hard disks, optical disks, flash memory, read-only memory, random-access memory, and the like. The term "A and/or B" denotes all possible combinations of A and B, such as A alone, B alone, or A and B. The term "at least one A or B" or "at least one of A and B" has a meaning similar to "A and/or B" and may include only A, only B, or both A and B. The singular forms "a", "an" and "the" may include the plural forms as well.

A conventional OCR recognition model is generally a generalized recognition model. As images from more and more different application scenes need to be recognized, such a model cannot recognize the images of all application scenes in detail and accurately at the same time. That is, the generalized recognition model cannot adapt to the various specific images to be recognized, and its recognition effect is poor.

In the embodiment of the invention, a preset OCR recognition model can first be trained with a first type of image sample having label data to obtain an initial OCR recognition model. A second type of image sample without label data is then recognized with the initial OCR recognition model and labeled according to the recognition result, which determines the label data of the second type of image sample. Because the first type of image sample has accurate label data, the initial OCR recognition model trained on it has a relatively high OCR recognition capability. The OCR recognition result obtained by applying the initial OCR recognition model to the second type of image sample (including but not limited to the position of one or more image recognition areas in the sample, the business data recorded in each image recognition area, and its data type) is therefore relatively accurate, and so is the label data generated from that result. That is, with the initial OCR recognition model, the OCR recognition model training method of the embodiment of the present invention can label the second type of image sample automatically, and the resulting label data can have relatively high accuracy.

In practical application, to ensure the accuracy of the label data of the first type of image sample, a small number of first-type image samples can be labeled manually. The OCR recognition model training method of the embodiment of the invention then uses this small number of first-type image samples to automatically label a large number of second-type image samples, which greatly reduces the manual labeling workload while keeping the label data of the second-type image samples relatively accurate. Further, after the second type of image sample has been labeled, the initial OCR recognition model can be retrained with the first type and the second type of image samples together, which further improves its OCR recognition capability and yields a final OCR recognition model that meets the requirements.

Referring to FIG. 1, FIG. 1 is a flow chart illustrating the main steps of an OCR recognition model training method according to an embodiment of the present invention. As shown in FIG. 1, the OCR recognition model training method in the embodiment of the present invention mainly includes the following steps:

Step S101: acquire a first type of image sample with label data.

The image recognition area refers to an image area containing information to be recognized, namely a target area for performing OCR recognition on the first type of image sample. The business data refers to data recorded in the image recognition area, and the data are target data for performing OCR recognition on the first type image sample. In one example, if the first type of image sample is a bank card image and a bank card number on the bank card image needs to be identified, an area on the bank card image containing the bank card number may be selected as an image identification area and a location of the image identification area may be obtained, and the data type of the bank card number may be set as "card number". Then, according to the above, it can be determined that the tag data of the bank card image includes "the position of the area containing the bank card number", "the bank card number", and the data category "card number".
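As a non-limiting illustration, the label data in the bank card example above could be held in a structure such as the following. This is only a minimal sketch: the class names `RegionLabel` and `LabeledSample`, the `(x_min, y_min, x_max, y_max)` pixel convention, the file path, and the card number are all assumptions made for illustration and are not specified by the embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RegionLabel:
    """One image recognition area and the data recorded in it (hypothetical layout)."""
    box: Tuple[int, int, int, int]  # assumed convention: (x_min, y_min, x_max, y_max) in pixels
    business_data: str              # the text recorded in the area
    data_category: str              # for example "card number"


@dataclass
class LabeledSample:
    """An image sample together with its label data."""
    image_path: str
    regions: List[RegionLabel] = field(default_factory=list)


# Made-up bank card sample: the path, coordinates, and number are placeholders.
bank_card = LabeledSample(
    image_path="samples/bank_card_001.jpg",
    regions=[
        RegionLabel(
            box=(40, 120, 420, 160),
            business_data="6222 0000 0000 0000",
            data_category="card number",
        )
    ],
)
```

Each `RegionLabel` carries the three items named above: the position of the area, the business data, and the data category.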

In an implementation of the embodiment of the present invention, the label data of the first type of image sample may include the position of each image recognition area in the sample, the business data recorded in each image recognition area, and its data category. To this end, the first type of image sample with label data may be obtained through steps S301 to S302 shown in FIG. 3.

Step S301: in response to a received annotation instruction, acquire annotation information of the image sample to be annotated that is specified in the annotation instruction, wherein the annotation information may include the position of each image recognition area in the image to be annotated, the business data recorded in each image recognition area, and the data category of the business data.

Step S302: generate label data of the image sample to be annotated according to the annotation information, and label the image sample according to the generated label data to obtain a first type of image sample with label data.

The annotation information can be determined from information that a user annotates on the image sample to be annotated through a visual interface. Specifically, in this embodiment, the position of the image recognition area in the annotation information is determined from the position of the area that the user frame-selects on the image sample to be annotated in the visual interface. For example, the position of the selected area may be used directly as the position of the image recognition area, or the position obtained by scaling the position of the selected area may be used instead. The business data and its data category in the annotation information are determined from the business data and data category that the user enters on the visual interface for each image recognition area.
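The scaling of a frame-selected area can be sketched as follows. The embodiment does not say how the scaling is performed; this sketch assumes that the visual interface shows a resized copy of the image and that the selection is mapped back to the pixel coordinates of the original image. The function name `selection_to_image_box` and its parameters are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed convention: (x_min, y_min, x_max, y_max)


def selection_to_image_box(
    selection: Box,
    displayed_size: Tuple[int, int],
    image_size: Tuple[int, int],
) -> Tuple[int, int, int, int]:
    """Map a box drawn on the displayed image to pixel coordinates of the original image."""
    disp_w, disp_h = displayed_size
    img_w, img_h = image_size
    # Independent horizontal and vertical scale factors (an assumption of this sketch).
    scale_x = img_w / disp_w
    scale_y = img_h / disp_h
    x0, y0, x1, y1 = selection
    return (round(x0 * scale_x), round(y0 * scale_y),
            round(x1 * scale_x), round(y1 * scale_y))
```

When the displayed size equals the image size, both scale factors are 1 and the selection is used directly, which corresponds to the first option described above.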

The user manually annotates the first type of image sample on the visual interface by selecting each position to be recognized, such as a name, a date, a number, or letters, and entering the real information of the content at that position. A sample that has been manually annotated in this way and therefore carries the real information is taken as a first type of image sample.

Step S102: perform model training on a preset OCR recognition model by using the first type of image sample to obtain an initial OCR recognition model.

After the first type of image samples has been prepared, it is used to train the preset OCR recognition model for the first time, so that the preset OCR recognition model acquires a certain OCR recognition capability. It should be noted that, in the embodiment of the present invention, a model structure that is conventional in the field of OCR technology may be adopted to construct the preset OCR recognition model, and a conventional model training method may be adopted to train it with the first type of image samples. For brevity, the model structure of the preset OCR recognition model and the applicable model training methods are not described here.

Step S103: perform OCR recognition on a second type of image sample without label data by using the initial OCR recognition model.

The trained initial OCR recognition model can recognize the second type of image samples to a certain degree, and it is used to recognize the second type of samples without label data to obtain a recognition result. As described in step S101, the label data of the first type of image sample may include the position of each image recognition area in the sample, the business data recorded in each image recognition area, and its data category. The initial OCR recognition model trained with the first type of image sample is therefore able to determine the positions of the image recognition areas in an image to be detected and to recognize the business data recorded in those areas and its data category. That is, in this embodiment, the result of performing OCR recognition on a second type of image sample without label data may include the positions of one or more image recognition areas in the sample, the business data recorded in each image recognition area, and its data category. It should be noted that "the position of the image recognition area", "the business data recorded in the image recognition area", and "the data category" have meanings in the second type of image sample similar to those in step S101, and are not repeated here for brevity.

Step S104: generate label data of the second type of image sample according to the OCR recognition result, and label the second type of image sample according to the generated label data. It should be noted that, in the embodiment of the present invention, a conventional label data generation method in the field of data processing may be adopted to generate the label data of a second type of image sample from the positions of one or more image recognition areas in the sample, the business data recorded in each image recognition area, and its data category; the method is not described here for brevity.
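Since the embodiment leaves the label data generation method open, the following is only one possible sketch. It assumes that the recognition result for one image is a list of dictionaries with the hypothetical keys `box`, `text`, and `category`; neither this input format nor the output field names come from the embodiment.

```python
from typing import Any, Dict, List


def build_label_data(recognition_result: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert the recognition output for one image sample into label data."""
    label_data = []
    for region in recognition_result:
        label_data.append({
            "position": tuple(region["box"]),           # position of the image recognition area
            "business_data": region["text"].strip(),    # recognized text, surrounding whitespace removed
            "data_category": region["category"],        # for example "date" or "card number"
        })
    return label_data
```

The output uses the same three items as the manually produced label data of the first type of image sample, so both kinds of samples can be fed to the same training procedure in step S105.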

Step S105: perform model training on the initial OCR recognition model by using the first type of image sample and the labeled second type of image sample to obtain a final OCR recognition model.

The initial OCR recognition model is trained with both the image samples carrying manually annotated real information and the image samples labeled with the recognition results of the initial OCR recognition model. A recognition result of the initial OCR recognition model may include the recognized position, category, specific content, and so on. For example, if the category of the real sample is an identity card but the recognition result is a bank card, this is an obvious category recognition error. As another example, if the position in the recognition result is the date-of-birth field of the identity card and the real position is also the date-of-birth field, the recognized position is correct.
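The overall flow of steps S102 to S105 can be sketched as below. This scheme is commonly referred to as pseudo-labeling or self-training. The sketch is model-agnostic: `train` and `recognize` are hypothetical callables supplied by the caller, because the embodiment relies on conventional model structures and training methods and does not prescribe any particular one.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Image = TypeVar("Image")
Label = TypeVar("Label")
Model = TypeVar("Model")


def train_with_pseudo_labels(
    preset_model: Model,
    labeled: Sequence[Tuple[Image, Label]],
    unlabeled: Sequence[Image],
    train: Callable[[Model, Sequence[Tuple[Image, Label]]], Model],
    recognize: Callable[[Model, Image], Label],
) -> Model:
    """Run steps S102 to S105 with caller-supplied training and recognition functions."""
    # S102: first training on the manually labeled (first type) samples.
    initial_model = train(preset_model, labeled)
    # S103 and S104: recognize each unlabeled (second type) sample and use the result as its label.
    auto_labeled = [(image, recognize(initial_model, image)) for image in unlabeled]
    # S105: train the initial model again on both kinds of samples.
    return train(initial_model, list(labeled) + auto_labeled)
```

Passing the training and recognition routines as parameters keeps the sketch independent of any specific OCR framework.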

In an implementation of the embodiment of the present invention, in addition to the ability to perform OCR recognition on an image, the final OCR recognition model obtained through training may further be configured with the ability to evaluate how difficult a sample is to acquire. For example, recognition is affected if the noise level of the sample is too high, that is, the sample is subject to too much interference, or if the sample is damaged or its surface is heavily contaminated. In such cases the OCR recognition model may also output an evaluation of the acquisition difficulty of the sample information, for example a confidence of the acquired information, and may send a prompt to the user when the confidence is too low.
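The confidence-based prompt can be sketched as follows. The threshold of 0.6, the use of the lowest per-region confidence, the function name, and the message wording are all assumptions of this sketch; the embodiment only states that a prompt may be sent when the confidence is too low.

```python
from typing import List, Optional


def low_confidence_prompt(confidences: List[float], threshold: float = 0.6) -> Optional[str]:
    """Return a prompt for the user if recognition looks unreliable, otherwise None."""
    if not confidences:
        return "No text regions were recognized; please re-acquire the image."
    lowest = min(confidences)  # judge the sample by its weakest region (an assumption)
    if lowest < threshold:
        return f"Low recognition confidence ({lowest:.2f}); please re-acquire the image."
    return None
```

For example, a sample whose regions were recognized with confidences 0.9 and 0.3 would trigger the prompt, while one with 0.9 and 0.95 would not.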

Application scenarios of the OCR recognition model in the embodiment of the present invention include, but are not limited to: card identification, bill identification, and the like. The card identification can include bank card identification, driver's license identification, identity card identification and the like. In the embodiment of the invention, the OCR recognition model special for each application scene can be obtained by training the image samples under different application scenes. Further, in an implementation manner of the embodiment of the present invention, when the initial OCR recognition models corresponding to each service scene are obtained by respectively training the first type image samples and the second type image samples in different service scenes, the step S105 may perform model training on the initial OCR recognition models according to the following steps 1 to 2 to obtain the final OCR recognition models:

Step 1: generate a model training queue according to the training completion time corresponding to each initial OCR recognition model.

Step 2: perform model training on each initial OCR recognition model in turn, following the training order of the initial OCR recognition models in the model training queue.
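Steps 1 and 2 can be sketched as below. The sketch assumes that the "training completion time" is the time at which each initial OCR recognition model finished its first training and that earlier completion means earlier retraining; the embodiment does not state the ordering direction. The names `InitialModel`, `build_training_queue`, and `run_queue` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime
from typing import Deque, Iterable, List


@dataclass
class InitialModel:
    scene: str                                  # business scene, e.g. "bank card"
    initial_training_completed_at: datetime     # when the first training finished


def build_training_queue(models: Iterable[InitialModel]) -> Deque[InitialModel]:
    """Step 1: order the initial models by the completion time of their first training."""
    return deque(sorted(models, key=lambda m: m.initial_training_completed_at))


def run_queue(queue: Deque[InitialModel]) -> List[str]:
    """Step 2: process the queue front to back and return the scenes in training order."""
    order = []
    while queue:
        model = queue.popleft()
        # A real system would run the second training stage (step S105) for this model here.
        order.append(model.scene)
    return order
```

Processing the queue one model at a time matches the sequential training described in step 2.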

In addition, in this embodiment, the model training progress of the initial OCR recognition model can be displayed through the visual interface, so that the user can follow and control the training in real time and achieve the desired training effect.

In the embodiment of the invention, the training of a plurality of initial OCR recognition models can be finished by performing queue type management and control on the model training, so that the plurality of initial OCR recognition models respectively meet the recognition requirements of different types of samples.

With the OCR recognition model training method of steps S101 to S105 above, the second type of image samples without label data can be labeled automatically, and the resulting label data can have relatively high accuracy, so the manual labeling workload is greatly reduced while the label data of the second type of image samples remains relatively accurate. After the second type of image samples has been labeled, the initial OCR recognition model can be retrained with the first type of image samples with label data and the automatically labeled second type of image samples together, which further improves the OCR recognition capability and yields an OCR recognition model that meets the requirements.

Further, in another embodiment of the OCR recognition model training method according to the present invention, in addition to steps S101 to S105 of the foregoing embodiment, the method may further include steps S206 and S207 as shown in FIG. 2.

Step S206: generating a download path of the final OCR recognition model according to the storage position of the final OCR recognition model.

Step S207: generating and displaying the release information of the final OCR recognition model according to the download path.

In the embodiment of the invention, the trained OCR recognition model is stored at a preset position and a download path is generated, so that in any scenario where it is needed a user can download the trained OCR recognition model to an electronic device or computer through the download path and recognize image samples of the specific type without additional training, which saves time.
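Steps S206 and S207 can be sketched as follows. The base URL `https://models.example.com`, the function names and the fields of the release information are all assumptions made for illustration; the text specifies only that a download path is derived from the storage position and that release information is generated from that path.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def make_download_path(storage_location, base_url="https://models.example.com"):
    # S206: derive a download path from where the final model is stored.
    # The base URL is a placeholder, not something given in the source.
    relative = PurePosixPath(storage_location).as_posix().lstrip("/")
    return f"{base_url}/{quote(relative)}"


def make_release_info(model_name, storage_location):
    # S207: build the release information that would be shown to the user.
    # The field names here are hypothetical.
    return {
        "model": model_name,
        "download_path": make_download_path(storage_location),
    }
```

For example, `make_release_info("invoice-ocr", "/models/invoice/final.bin")` returns a dictionary whose `download_path` points at the stored model file under the placeholder base URL.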

In one implementation of the embodiment of the present invention, a user may save and release the final OCR recognition model to obtain an OCR recognition model product adapted to a particular usage scenario. In a usage scenario, the user may view the training progress of the initial OCR recognition model through the visual real-time training progress monitoring interfaces shown in fig. 4 and fig. 5. In fig. 4, the abscissa of the curve represents the number of training iterations and the ordinate represents the value of the loss function used for model training; in fig. 5, the abscissa represents the number of training iterations and the ordinate represents the accuracy of the model recognition result. Through the curves in these two interfaces, the user can clearly check the training progress of the current OCR recognition model and whether the loss value and the recognition accuracy reach the required standard, and can choose to continue or stop training according to the real-time progress.
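The check that a user performs on the two curves can be expressed as a small helper. This is a sketch under assumptions: in the text the decision to continue or stop is made by the user through the interface, whereas here it is written as a function, and the thresholds `max_loss` and `min_accuracy` are hypothetical values not given in the source.

```python
def reached_standard(loss_history, accuracy_history,
                     max_loss=0.05, min_accuracy=0.95):
    """Return True when the latest loss and accuracy both meet the thresholds.

    loss_history and accuracy_history hold one value per training iteration,
    matching the curves of fig. 4 and fig. 5. The thresholds are assumptions.
    """
    if not loss_history or not accuracy_history:
        return False  # nothing recorded yet, so training should continue
    return loss_history[-1] <= max_loss and accuracy_history[-1] >= min_accuracy
```

A monitoring interface could call such a helper after each iteration to highlight that the standard has been reached, while still leaving the final decision to the user.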

It should be noted that, although the foregoing embodiments describe each step in a specific sequence, those skilled in the art will understand that, in order to achieve the effect of the present invention, different steps do not necessarily need to be executed in such a sequence, and they may be executed simultaneously (in parallel) or in other sequences, and these changes are all within the protection scope of the present invention.

Furthermore, the invention also provides an OCR recognition model training device.

Referring to fig. 6, fig. 6 is a main structural block diagram of an OCR recognition model training apparatus according to an embodiment of the present invention. As shown in fig. 6, the OCR recognition model training apparatus in the embodiment of the present invention mainly includes a sample acquisition module 61, a first model training module 62, an attribute class prediction module 63, a label labeling module 64, and a second model training module 65. In some embodiments, one or more of the sample acquisition module 61, the first model training module 62, the attribute class prediction module 63, the label labeling module 64, and the second model training module 65 may be combined into one module. In some embodiments, the sample acquisition module 61 may be configured to acquire a first type of image sample with label data. The first model training module 62 may be configured to perform model training on a preset OCR recognition model using the first type image samples to obtain an initial OCR recognition model. The attribute class prediction module 63 may be configured to perform OCR recognition on the second type image samples without label data using the initial OCR recognition model. The label labeling module 64 may be configured to generate label data of the second type image samples according to the OCR recognition result, and to label the second type image samples according to the generated label data. The second model training module 65 may be configured to perform model training on the initial OCR recognition model using the first type image samples and the labeled second type image samples to obtain a final OCR recognition model. In one embodiment, for the specific implementation functions, refer to steps S101 to S105.

In one embodiment, the tag data may include a position of each image recognition area in the first type image sample, the business data recorded in each image recognition area, and a data category thereof, and the sample acquiring module 61 may be further configured to perform the following operations:

in response to a received annotation instruction, obtaining annotation information of the image sample to be annotated specified in the annotation instruction, wherein the annotation information may comprise the position of each image recognition area in the image sample to be annotated, the business data recorded in each image recognition area, and the data category of the business data; generating label data of the image sample to be annotated according to the annotation information, and labeling the image sample to be annotated according to the generated label data to obtain a first type of image sample with label data; the annotation information may be determined according to information annotated by a user on the image sample to be annotated through a visual interface. In one embodiment, for the specific implementation functions, refer to step S101.

In one embodiment, the position of the image recognition area in the annotation information is determined according to the position of the area that the user selects by frame selection on the image sample to be annotated in the visual interface, and the business data and its category in the annotation information are determined according to the business data and category that the user enters for each image recognition area in the visual interface. In one embodiment, for the specific implementation functions, refer to step S101.
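The label data described above (area position, business data, and data category) can be represented with a small structure such as the one below. This is a hedged sketch: `RegionLabel`, `box_from_frame_selection`, `make_label_data`, the rectangular box representation and the keys `"drag"`, `"text"` and `"category"` are assumptions for illustration and are not specified in the source.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RegionLabel:
    """Label data for one image recognition area (hypothetical structure)."""
    box: tuple           # (x_min, y_min, x_max, y_max) in pixels, assumed format
    business_data: str   # text recorded in the area, as entered by the user
    category: str        # data category of the business data


def box_from_frame_selection(x0, y0, x1, y1):
    # Normalize a frame selection so it does not matter in which direction
    # the user dragged the selection rectangle.
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def make_label_data(annotations):
    # Convert the per-area annotation information into label data records.
    return [
        asdict(RegionLabel(box_from_frame_selection(*a["drag"]), a["text"], a["category"]))
        for a in annotations
    ]
```

For instance, an annotation with `"drag": (120, 40, 20, 10)`, `"text": "2021-04-30"` and `"category": "date"` yields one record whose box is normalized to `(20, 10, 120, 40)`.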

In one embodiment, the OCR recognition model training apparatus shown in fig. 6 may further include a model issuing module, and in this embodiment, the model issuing module may be configured to perform the following operations:

generating a download path of the final OCR recognition model according to the storage position of the final OCR recognition model; and generating and displaying release information of the final OCR recognition model according to the download path.

In one embodiment, the second model training module may comprise a first model training unit and/or a second model training unit.

The first model training unit may be configured to, when the initial OCR recognition models corresponding to different service scenes are respectively trained using the first type image samples and the second type image samples in those service scenes, perform model training on each initial OCR recognition model through the following operations: generating a model training queue according to the training completion time corresponding to each initial OCR recognition model; and sequentially performing model training on each initial OCR recognition model according to its training order in the model training queue. In one embodiment, for the specific implementation functions, refer to steps S106 to S107.

In one embodiment, the second model training unit may be configured to display the model training progress of the initial OCR recognition model through the visual interface. In one embodiment, for the specific implementation functions, refer to steps S106 to S107.

The OCR recognition model training apparatus is used to execute the OCR recognition model training method embodiment shown in fig. 1. The technical principles, the technical problems solved, and the technical effects produced by the two are similar. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process and related description of the OCR recognition model training apparatus may refer to the content described in the OCR recognition model training method embodiment, and are not repeated here.

Referring to fig. 7, fig. 7 is a main block diagram of an OCR recognition model training apparatus according to another embodiment of the present invention. As shown in fig. 7, the OCR recognition model training apparatus in the embodiment of the present invention mainly includes:

a data processing module 71, a model training module 72, a model deployment verification module 73, and a configuration model algorithm and output recognition engine module 74.

In some embodiments, the data processing module 71 has the same functions as part of the sample acquisition module 61 in fig. 6 and is capable of acquiring labeling data and labeling pictures; the model training module 72 has the same functions as part of the first model training module 62, the attribute class prediction module 63, and the label labeling module 64, and can complete the training of the initial OCR recognition model; the model deployment verification module 73 has the same functions as part of the second model training module 65, and can complete the retraining of the initial OCR recognition model and improve the recognition accuracy of the OCR recognition model.

In addition, the functions performed by the configuration model algorithm and output recognition engine module 74 correspond to steps S206 to S207 and are not described again here for brevity.

It will be understood by those skilled in the art that all or part of the flow of the method in the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

Furthermore, the invention also provides a control device. In an embodiment of the control device according to the present invention, the control device comprises a processor and a storage device. The storage device may be configured to store a program for executing the OCR recognition model training method of the above method embodiment, and the processor may be configured to execute the programs in the storage device, including but not limited to the program for executing the OCR recognition model training method of the above method embodiment. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The control device may be an apparatus formed of various electronic devices.

Further, the invention also provides a computer-readable storage medium. In one computer-readable storage medium embodiment according to the present invention, the computer-readable storage medium may be configured to store a program for executing the OCR recognition model training method of the above method embodiment, and the program may be loaded and executed by a processor to implement the OCR recognition model training method described above. For convenience of explanation, only the parts related to the embodiments of the present invention are shown, and specific technical details are not disclosed. The computer-readable storage medium may be a storage device formed of various electronic devices; optionally, in the embodiment of the present invention, the computer-readable storage medium is a non-transitory computer-readable storage medium.

Further, it should be understood that, since the configuration of each module is only for explaining the functional units of the apparatus of the present invention, the corresponding physical devices of the modules may be the processor itself, or a part of software, a part of hardware, or a part of a combination of software and hardware in the processor. Thus, the number of individual modules in the figures is merely illustrative.

Those skilled in the art will appreciate that the various modules in the apparatus may be adaptively split or combined. Such splitting or combining of specific modules does not cause the technical solutions to deviate from the principle of the present invention, and therefore, the technical solutions after splitting or combining will fall within the protection scope of the present invention.

So far, the technical solution of the present invention has been described with reference to the embodiments shown in the drawings, but those skilled in the art will readily understand that the protection scope of the present invention is not limited to these specific embodiments. Those skilled in the art may make equivalent changes or substitutions to the related technical features without departing from the principle of the invention, and the technical solutions after such changes or substitutions will fall within the protection scope of the invention.

Claims

1. An OCR recognition model training method, the method comprising:

acquiring a first type of image sample with label data;

2. An OCR recognition model training method according to claim 1, wherein the label data of the first type image sample and the second type image sample each include a position of an image recognition area, business data recorded in each image recognition area, and a data category thereof;

3. An OCR recognition model training method according to claim 2, wherein the position of the image recognition area in the annotation information is determined according to the position of the area selected by the user on the image sample to be annotated by means of frame selection on the visual interface, and the business data and the category thereof in the annotation information are determined according to the business data and the category thereof entered by the user for each image recognition area on the visual interface.

4. An OCR recognition model training method according to any one of claims 1 to 3, further comprising, after the step of "model training said initial OCR recognition model to obtain a final OCR recognition model":

and/or,

when the first-class image samples and the second-class image samples under different service scenes are used for respectively training to obtain the initial OCR recognition models corresponding to the service scenes, the step of performing model training on the initial OCR recognition models specifically comprises the following steps:

and/or,

the step of "performing model training on the initial OCR recognition model" specifically includes:

5. An OCR recognition model training apparatus, the apparatus comprising:

a sample acquisition module configured to acquire a first type of image sample with label data, wherein the label data comprises the position of each image recognition area in the first type of image sample, the business data recorded in each image recognition area, and the data category of the business data;

a label labeling module configured to generate label data of the second type image sample according to a result of OCR recognition, and label the second type image sample according to the generated label data, wherein the result of OCR recognition includes a position of one or more image recognition areas in the second type image sample, business data recorded in each image recognition area, and a data category thereof;

6. An OCR recognition model training apparatus as claimed in claim 5, wherein the label data of the first type image sample and the second type image sample each comprise the location of an image recognition area, the business data recorded in each image recognition area and its data category;

7. An OCR recognition model training apparatus according to claim 6, wherein the position of the image recognition area in the annotation information is determined according to the position of the area selected by the user on the image sample to be annotated by means of frame selection on the visual interface, and the business data and the category thereof in the annotation information are determined according to the business data and the category thereof entered by the user for each image recognition area on the visual interface.

8. An OCR recognition model training apparatus as claimed in any of claims 5 to 7, wherein the apparatus comprises a model issuing module configured to:

and/or,

the second model training module comprises a first model training unit and/or a second model training unit;

9. A control apparatus comprising a processor and a storage device, the storage device being adapted to store a plurality of program codes, wherein the program codes are adapted to be loaded and run by the processor to perform the OCR recognition model training method of any of claims 1 to 4.

10. A computer readable storage medium having stored therein a plurality of program codes, characterized in that the program codes are adapted to be loaded and executed by a processor to perform the OCR recognition model training method according to any one of claims 1 to 4.